#42 — NetFlow Collector (Network Flow Analysis)

PLANNED

Priority: 🟠 HIGH · Type: TYPE F · Container: rgz-netflow · Code: config/netflow/

Dependencies: #7 rgz-gateway


Description

Stateless NetFlow v5 collector that listens on port 2055/udp to receive the network flow records generated by rgz-gateway. The nftables gateway exports NetFlow records to the collector every 15 minutes or whenever a connection changes state.
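As a rough illustration of what the collector does with each datagram, the sketch below decodes a NetFlow v5 packet using only the Python standard library. The field layout follows the standard v5 export format (24-byte header, 48-byte records). The `Flow` tuple and `parse_v5` function are hypothetical names for illustration, not part of rgz-netflow, which is planned to run goflow2 or nFlow.

```python
import socket
import struct
from typing import NamedTuple

# NetFlow v5 wire format (network byte order): 24-byte header, 48-byte records.
HEADER = struct.Struct("!HHIIIIBBH")
RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")


class Flow(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    bytes: int
    packets: int


def parse_v5(datagram):
    """Decode one NetFlow v5 datagram into a list of Flow tuples."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram shorter than a NetFlow v5 header")
    version, count, *_ = HEADER.unpack_from(datagram, 0)
    if version != 5:
        raise ValueError(f"unsupported NetFlow version {version}")
    if len(datagram) < HEADER.size + count * RECORD.size:
        raise ValueError("datagram truncated")
    flows = []
    for i in range(count):
        f = RECORD.unpack_from(datagram, HEADER.size + i * RECORD.size)
        flows.append(Flow(
            src_ip=socket.inet_ntoa(f[0]),
            dst_ip=socket.inet_ntoa(f[1]),
            src_port=f[9],
            dst_port=f[10],
            protocol=f[13],   # IP protocol number (6 = TCP, 17 = UDP)
            bytes=f[6],       # dOctets
            packets=f[5],     # dPkts
        ))
    return flows
```

Rejecting datagrams whose length does not match the declared record count keeps a malformed or truncated export from producing garbage rows.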

The collector analyzes the flows to extract: top talkers (the IP addresses consuming the most bandwidth), bandwidth trending (network load over 24 hours), and traffic patterns (peak hours, applications identified by port). Raw data is stored in full detail for 30 days in InfluxDB or PostgreSQL TimescaleDB, then kept in aggregated form for 12 months.
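To make the two main aggregations concrete, here is a minimal in-memory sketch. It assumes each flow is a dict with `time` (a timezone-aware `datetime`), `src_ip`, and `bytes` keys; in the planned deployment these aggregations would more likely run as SQL in TimescaleDB, and the function names are hypothetical.

```python
from collections import Counter


def top_talkers(flows, limit=10):
    """Return the `limit` source IPs with the highest byte totals."""
    totals = Counter()
    for f in flows:
        totals[f["src_ip"]] += f["bytes"]
    return totals.most_common(limit)


def hourly_bytes(flows):
    """Sum bytes per hour, keyed by the start of each hour, in time order."""
    buckets = Counter()
    for f in flows:
        hour = f["time"].replace(minute=0, second=0, microsecond=0)
        buckets[hour] += f["bytes"]
    return dict(sorted(buckets.items()))
```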

The monthly reports (#73 bandwidth-trending) consume this data to display ARPU trends and forecast saturation. Anomaly alerts (#44) fire when a single subscriber exceeds 1 GB/day or when site traffic exceeds 80% of the reserved CIR.
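The two alert conditions can be expressed as simple predicates. The sketch below is an assumption about how they might be evaluated: it treats the CIR as a rate in bits per second and compares it with the average rate over a measurement window, which the text does not specify. Function names are hypothetical.

```python
GIB = 1024 ** 3  # 1073741824 bytes, the subscriber threshold used in the configuration


def subscriber_over_quota(bytes_24h, threshold=GIB):
    """True when a subscriber's 24-hour volume exceeds the byte threshold."""
    return bytes_24h > threshold


def site_over_cir(bytes_in_window, window_seconds, cir_bps, percent=80):
    """True when the average bit rate over the window exceeds `percent` of the CIR."""
    # Integer form of: (bytes * 8 / window_seconds) > cir_bps * percent / 100
    return bytes_in_window * 8 * 100 > cir_bps * percent * window_seconds
```

Keeping the comparison in integers avoids floating-point rounding right at the threshold.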

Internal Architecture

NetFlow v5 Dataflow:
  1. rgz-gateway (nftables + flowgen):
     └─ Export NetFlow v5 → localhost:2055/udp
     └─ Format standard: source_ip, dest_ip, source_port, dest_port, protocol,
        bytes, packets, start_time, end_time

  2. rgz-netflow container (nFlow or goflow2):
     └─ Listens on 2055/udp
     └─ Parses NetFlow v5 packets
     └─ Enriches with GeoIP, ASN, vendor lookup (optional)
     └─ Writes to TimescaleDB (flows table)

  3. TimescaleDB flows hypertable:
     └─ (time TIMESTAMPTZ, src_ip INET, dst_ip INET, src_port INT, dst_port INT,
        protocol INT, bytes BIGINT, packets BIGINT, nas_id TEXT,
        subscriber_id UUID, reseller_id UUID)
     └─ Compression + retention policy (30 days detailed, 12 months aggregated)

  4. Grafana Dashboard + API:
     └─ Top talkers: SELECT src_ip, SUM(bytes) FROM flows
        WHERE time > NOW() - INTERVAL '24 hours'
        GROUP BY src_ip ORDER BY 2 DESC LIMIT 10
     └─ Bandwidth trending: SELECT time_bucket('1h', time), SUM(bytes)
        FROM flows GROUP BY time_bucket
     └─ Port analysis: SELECT dst_port, protocol, COUNT(*) FROM flows
        GROUP BY dst_port, protocol

  5. Alerts + Reporting:
     └─ Prometheus → (subscriber_bytes_24h > 1GB) → AlertManager → SMS/email
     └─ Celery task #73: generate_bandwidth_trending() uses the aggregated flows
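One detail the pipeline above leaves implicit: NetFlow v5 records carry their start and end times as milliseconds of exporter uptime, so they must be converted to wall-clock time before they can fill the `time` column. A minimal sketch, assuming the header fields `unix_secs`, `unix_nsecs`, and `sys_uptime` have already been decoded (the function name `flow_time` is hypothetical):

```python
from datetime import datetime, timedelta, timezone


def flow_time(unix_secs, unix_nsecs, sys_uptime_ms, flow_uptime_ms):
    """Convert a record's uptime-relative timestamp to an absolute UTC datetime."""
    # Wall-clock time at which the exporter built the datagram.
    export_time = (datetime.fromtimestamp(unix_secs, tz=timezone.utc)
                   + timedelta(microseconds=unix_nsecs // 1000))
    # How long before the export the flow event happened; the uptime
    # counters are 32-bit, so the difference is taken modulo 2**32.
    age_ms = (sys_uptime_ms - flow_uptime_ms) & 0xFFFFFFFF
    return export_time - timedelta(milliseconds=age_ms)
```

For example, a record whose `last` field is 45000 in a datagram exported at uptime 60000 ended 15 seconds before the header's export time.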

Configuration

env
# .env for NetFlow Collector
NETFLOW_LISTEN_PORT=2055
NETFLOW_LISTEN_ADDR=0.0.0.0
NETFLOW_PROTOCOL=v5

# TimescaleDB flows hypertable
TIMESCALEDB_FLOWS_TABLE=flows
TIMESCALEDB_FLOWS_RETENTION_DETAIL=30  # days
TIMESCALEDB_FLOWS_RETENTION_AGGREGATE=12  # months
TIMESCALEDB_FLOWS_CHUNK_INTERVAL=1 day

# InfluxDB option (alternative storage)
INFLUXDB_BUCKET=netflow
INFLUXDB_ORG=rgz
INFLUXDB_RETENTION=30d

# Prometheus metrics (optional)
PROMETHEUS_NETFLOW_METRICS_ENABLED=true
PROMETHEUS_NETFLOW_PORT=9100

# Top talkers alert thresholds
NETFLOW_ALERT_SUBSCRIBER_24H_BYTES=1073741824  # 1 GiB (2^30 bytes)
NETFLOW_ALERT_SITE_CIR_PERCENT=80
NETFLOW_ALERT_DPI_PROTOCOL=torrent,streaming,p2p
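A service reading these variables might load them along the following lines. This is a sketch under assumptions: the `NetflowSettings` class and `load_settings` function are hypothetical, and the defaults simply mirror the sample values above. Inline comments are stripped explicitly because some env-file loaders pass `1073741824  # 1 GB` through as the literal value.

```python
import os
from dataclasses import dataclass


def _clean(value):
    """Drop a trailing inline comment and surrounding whitespace."""
    return value.split("#", 1)[0].strip()


@dataclass(frozen=True)
class NetflowSettings:
    listen_addr: str
    listen_port: int
    subscriber_24h_bytes: int
    site_cir_percent: int
    dpi_protocols: tuple


def load_settings(env=os.environ):
    """Build settings from a mapping of environment variables."""
    def get(name, default):
        return _clean(env.get(name, default))

    protocols = get("NETFLOW_ALERT_DPI_PROTOCOL", "").split(",")
    return NetflowSettings(
        listen_addr=get("NETFLOW_LISTEN_ADDR", "0.0.0.0"),
        listen_port=int(get("NETFLOW_LISTEN_PORT", "2055")),
        subscriber_24h_bytes=int(get("NETFLOW_ALERT_SUBSCRIBER_24H_BYTES", "1073741824")),
        site_cir_percent=int(get("NETFLOW_ALERT_SITE_CIR_PERCENT", "80")),
        dpi_protocols=tuple(p.strip() for p in protocols if p.strip()),
    )
```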

API Endpoints

Method | Route | Response
GET | /api/v1/netflow/top-talkers?period=24h&limit=10 | {items: [{src_ip, bytes_sent, packets, avg_latency_ms}], total}
GET | /api/v1/netflow/bandwidth-trend?from=&to=&interval=1h | {items: [{timestamp, bytes_in, bytes_out, pkts}], total}
GET | /api/v1/netflow/protocol-breakdown?nas_id=&period=7d | {items: [{protocol, dst_port, bytes%, packets%}]}
GET | /api/v1/netflow/subscriber/{subscriber_id}/usage?period=24h | {subscriber_id, bytes_total, pkts_total, top_ports: [{port, protocol, bytes}]}
POST | /api/v1/netflow/v5-ingest | NetFlow v5 UDP receiver: receives and parses incoming packets
GET | /api/v1/netflow/alerts?severity=high&days=7 | {items: [{timestamp, alert_type, subscriber_id, bytes, threshold}]}
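The `period` and `interval` query parameters in these routes use compact values such as `24h`, `7d`, and `1h`. How the API parses them is not specified here; one possible approach, assuming only minute, hour, and day suffixes are accepted, is sketched below (`parse_period` is a hypothetical helper).

```python
import re
from datetime import timedelta

# Suffix -> timedelta keyword argument (assumed set of supported units).
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_period(value):
    """Turn a string such as '24h' or '7d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", value)
    if match is None:
        raise ValueError(f"invalid period: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```

Rejecting anything that does not match the pattern keeps free-form user input out of the SQL interval arithmetic.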

Useful Commands

bash
# Check that port 2055 is listening
docker exec rgz-netflow netstat -ulpn | grep 2055
# or
ss -ulpn | grep 2055

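# Send a dummy NetFlow v5 packet (from the gateway)
docker exec rgz-gateway bash -c '
# 4 bytes for version=5 and count=1, then 68 zero bytes:
# the rest of the 24-byte header plus one empty 48-byte flow record
printf "\x00\x05\x00\x01" > /tmp/netflow_test.bin
head -c 68 /dev/zero >> /tmp/netflow_test.bin
# Simplification: for realistic traffic, use nftest or tcpdump
nc -u -w1 localhost 2055 < /tmp/netflow_test.bin
'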

# Check the flows stored in TimescaleDB
docker exec rgz-db psql -U rgz -d rgz -c "
SELECT time, src_ip, dst_ip, src_port, dst_port, protocol, bytes, packets
FROM flows
ORDER BY time DESC
LIMIT 20;
"

# Top 10 talkers over the last 24 hours
docker exec rgz-db psql -U rgz -d rgz -c "
SELECT src_ip, SUM(bytes) as total_bytes, SUM(packets) as total_pkts
FROM flows
WHERE time > NOW() - INTERVAL '24 hours'
GROUP BY src_ip
ORDER BY total_bytes DESC
LIMIT 10;
"

# Query the Grafana dashboard
curl -H "Authorization: Bearer ${GRAFANA_API_TOKEN}" \
  https://grafana-rgz.duckdns.org/api/dashboards/uid/netflow

# Export NetFlow statistics as CSV
docker exec rgz-db psql -U rgz -d rgz \
  -c "COPY (SELECT * FROM flows WHERE time > NOW() - INTERVAL '7 days') \
       TO STDOUT CSV HEADER;" > netflow_7days.csv

# Simulate a top-talker alert
docker exec rgz-api python -c "
from app.services.monitoring import check_netflow_alerts
check_netflow_alerts(period='24h', alert_bytes_threshold=1073741824)
"
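For a test packet with meaningful field values, a small Python helper can assemble the v5 header and a single record. This is a sketch only: `build_v5_packet` and `send_test_flow` are hypothetical names, the addresses and counters are made-up sample values, and the timestamps and sequence number are left at zero.

```python
import socket
import struct


def build_v5_packet(src_ip, dst_ip, src_port, dst_port, protocol, n_bytes, n_packets):
    """Build a NetFlow v5 datagram holding a single flow record (72 bytes)."""
    # Header: version=5, count=1, every other field zeroed.
    header = struct.pack("!HHIIIIBBH", 5, 1, 0, 0, 0, 0, 0, 0, 0)
    record = struct.pack(
        "!4s4s4sHHIIIIHHBBBBHHBBH",
        socket.inet_aton(src_ip), socket.inet_aton(dst_ip), bytes(4),  # src, dst, next hop
        0, 0,                 # input/output interface indexes
        n_packets, n_bytes,   # dPkts, dOctets
        0, 0,                 # first/last (exporter uptime, ms)
        src_port, dst_port,
        0, 0, protocol, 0,    # pad, TCP flags, IP protocol, ToS
        0, 0, 0, 0, 0,        # src/dst AS, src/dst mask, pad
    )
    return header + record


def send_test_flow(host="127.0.0.1", port=2055):
    """Send one sample UDP/53 flow to the collector."""
    packet = build_v5_packet("10.0.0.1", "8.8.8.8", 40000, 53, 17, 1500, 3)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))
```

Calling `send_test_flow()` and then running the TimescaleDB query above should show a row for 10.0.0.1, assuming the collector is reachable on that host and port.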

Implementation TODO

  • [ ] Create the TimescaleDB flows hypertable (time, src_ip INET, dst_ip INET, src_port, dst_port, protocol, bytes, packets, nas_id, subscriber_id UUID, reseller_id UUID)
  • [ ] Deploy goflow2 or nFlow (Golang NetFlow v5 parser) as the rgz-netflow container
  • [ ] Configure rgz-gateway: export NetFlow v5 every 15 minutes (nftables extension or sflow)
  • [ ] Implement POST /api/v1/netflow/v5-ingest (Python async UDP listener on port 2055)
  • [ ] Add TimescaleDB compression: SELECT compress_chunk(chunkname) after 1 day
  • [ ] Develop the Prometheus exporter: netflow_bytes_total, netflow_packets_total, netflow_top_talker_bytes
  • [ ] Create the Grafana dashboards: Top Talkers, Bandwidth Trend, Protocol Breakdown, Geographic Distribution
  • [ ] Implement the alerting Celery task: check subscriber/site thresholds every hour
  • [ ] Add optional DPI (Deep Packet Inspection) for application classification (Suricata rules)
  • [ ] Tests: mock NetFlow v5 packets, complex TimescaleDB queries with time_bucket
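The async UDP listener item could start from something like the following asyncio sketch. It is only an assumption about the eventual design: `FlowReceiver` and `start_listener` are hypothetical names, and the protocol merely queues raw datagrams where a real collector would parse and store them.

```python
import asyncio


class FlowReceiver(asyncio.DatagramProtocol):
    """Pushes every received datagram onto a queue for later processing."""

    def __init__(self, queue):
        self.queue = queue

    def datagram_received(self, data, addr):
        self.queue.put_nowait((data, addr))


async def start_listener(host="0.0.0.0", port=2055):
    """Bind a UDP socket and return (transport, queue of received datagrams)."""
    queue = asyncio.Queue()
    loop = asyncio.get_running_loop()
    transport, _protocol = await loop.create_datagram_endpoint(
        lambda: FlowReceiver(queue), local_addr=(host, port)
    )
    return transport, queue
```

A consumer coroutine would then `await queue.get()` in a loop and hand each datagram to the parser, which keeps slow database writes from blocking packet reception.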

Last updated: 2026-02-21

PROJET MOSAÏQUE — 81 tools, 22 containers, 500+ WiFi Zone resellers