Data Flows — 12 Main Flows
Complete detail of the 12 data flows moving through the RGZ stack.
Roles concerned
This dataflow concerns: NOC Admin · Reseller · WiFi Subscriber
Flow 1: Subscriber authentication (captive portal)
Full flow: Subscriber → Portal → RADIUS → API → DB → Session
Timeline
| Step | Duration | Service |
|---|---|---|
| DNS sinkhole | <1s | rgz-dns |
| POST OTP request | <100ms | rgz-api |
| SMS sent | 5-10s | Letexto |
| OTP verify | <150ms | rgz-api |
| RADIUS auth | <100ms | rgz-radius |
| Total | 15-20s | — |
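The OTP request/verify steps above can be sketched in Python. This is a minimal illustration, not the production code: an in-memory dict stands in for Redis (mirroring the `rgz:otp:{phone}` and `rgz:rate:{phone}` TTLs), and the function names and the per-minute request limit are assumptions.

```python
import secrets
import time

OTP_TTL = 300    # seconds, matches rgz:otp:{phone}
RATE_TTL = 60    # seconds, matches rgz:rate:{phone}
RATE_LIMIT = 3   # hypothetical: max OTP requests per rate window

_store = {}  # key -> (value, expires_at); in-memory stand-in for Redis

def _get(key):
    value, expires_at = _store.get(key, (None, 0))
    return value if time.time() < expires_at else None

def _set(key, value, ttl):
    _store[key] = (value, time.time() + ttl)

def request_otp(phone):
    """POST OTP request step: rate-limit, generate code, store with TTL."""
    count = _get(f"rgz:rate:{phone}") or 0
    if count >= RATE_LIMIT:
        return None  # caller would answer HTTP 429
    _set(f"rgz:rate:{phone}", count + 1, RATE_TTL)
    code = f"{secrets.randbelow(1_000_000):06d}"
    _set(f"rgz:otp:{phone}", code, OTP_TTL)
    return code  # in production, delivered via Letexto SMS, never returned

def verify_otp(phone, code):
    """OTP verify step: constant-time compare, single use (key deleted)."""
    expected = _get(f"rgz:otp:{phone}")
    if expected is not None and secrets.compare_digest(expected, code):
        _store.pop(f"rgz:otp:{phone}", None)
        return True
    return False
```

The single-use delete after a successful verify prevents OTP replay within the 300 s window.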
Database
```sql
-- Tables involved
subscribers (id, subscriber_ref, msisdn, status, created_at)
radius_sessions (id, subscriber_id, nas_id, mac_address, bytes_in, bytes_out, session_start, session_stop)
subscriber_devices (id, subscriber_id, mac_address, first_seen_at, last_seen_at)
```
Redis keys
rgz:otp:{phone} → {code} (TTL=300s)
rgz:session:{subscriber_id} → {token} (TTL=86400s)
rgz:rate:{phone} → counter (TTL=60s)
rgz:devices:{subscriber_id} → SET {mac1, mac2}
Flow 2: Mobile payment (KKiaPay)
Full flow: Portal → KKiaPay → Webhook → API → RADIUS CoA → CPE
Timeline
| Step | Duration | Notes |
|---|---|---|
| Request → KKiaPay | <100ms | Transaction creation |
| USSD/app payment | 30-120s | User |
| Webhook callback | <5s | After success |
| DB update | <50ms | ACID transaction |
| CoA to CPE | <100ms | RADIUS |
| Total | 30-130s | Async |
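The webhook → DB update → CoA sequence can be sketched as below. This is an illustrative sketch only: a dict stands in for the payments table, `send_coa` stands in for the RADIUS CoA client, and the event field names are assumptions, not the real KKiaPay payload. The property it demonstrates is idempotency, since payment providers typically retry webhook delivery.

```python
payments = {}  # transaction_id -> row; stand-in for the payments table
coa_sent = []  # records CoA pushes, for the example only

def send_coa(subscriber_id):
    """Stand-in for the RADIUS CoA push toward the CPE (<100ms step)."""
    coa_sent.append(subscriber_id)

def handle_webhook(event):
    """Process a KKiaPay webhook callback idempotently.

    A transaction already marked completed must not trigger a second
    CoA when the provider retries delivery.
    """
    tx = payments.setdefault(event["transaction_id"],
                             {"status": "pending",
                              "amount_fcfa": event["amount_fcfa"],
                              "subscriber_id": event["subscriber_id"]})
    if tx["status"] == "completed":
        return "duplicate"            # retry: acknowledge, no side effects
    if event["status"] != "SUCCESS":
        tx["status"] = "failed"
        return "failed"
    tx["status"] = "completed"        # the <50ms ACID update
    send_coa(tx["subscriber_id"])     # unlock/resize service on the CPE
    return "completed"
```

In production the idempotency check and the status update would run inside one database transaction.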
Redis keys
rgz:payment:{transaction_id} → {payload} (TTL=3600s)
rgz:dba:mir:{reseller_id} → {cir, mir, burst} (TTL=600s)
Amounts (FCFA)
Subscriptions:
1000 FCFA → 10 MB/day
5000 FCFA → 100 MB/day
10000 FCFA → 1 GB/month
50000 FCFA → 5 GB/month
Pricing:
50/50 split between RGZ and the reseller
KKiaPay commission: 1.5% of the amount
Reseller fee: 5% (example)
MIR (throughput): unlimited (1 GB+ plans)
Flow 3: Real-time monitoring (Prometheus + Grafana)
Full flow: API metrics → Prometheus scrape → Grafana dashboard → AlertManager
Metrics scraped
| Metric | Source | Interval | Thresholds |
|---|---|---|---|
| cpu_usage_percent | /metrics | 15s | warn=60%, crit=80% |
| memory_usage_mb | /metrics | 15s | warn=6GB, crit=7.5GB |
| api_requests_total | /metrics | 15s | - |
| api_request_duration_seconds | /metrics | 15s | warn=0.5s, crit=1s |
| db_connections_active | /metrics | 15s | warn=40, crit=50 |
| redis_connected_clients | /metrics | 15s | warn=20, crit=30 |
| radius_auth_success_rate | /metrics | 15s | warn=95%, crit=90% |
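As an illustration, the warn/crit columns of the table can be mapped to alert levels with a small helper. The threshold values come from the table; the function name, the table structure, and the higher/lower-is-bad flags are assumptions for the sketch.

```python
THRESHOLDS = {
    # metric: (warn, crit, higher_is_bad)
    "cpu_usage_percent": (60.0, 80.0, True),
    "api_request_duration_seconds": (0.5, 1.0, True),
    "db_connections_active": (40, 50, True),
    "radius_auth_success_rate": (95.0, 90.0, False),  # lower is worse
}

def alert_level(metric, value):
    """Classify a sampled value as ok / warning / critical."""
    warn, crit, higher_is_bad = THRESHOLDS[metric]
    if higher_is_bad:
        if value >= crit:
            return "critical"
        if value >= warn:
            return "warning"
    else:
        if value <= crit:
            return "critical"
        if value <= warn:
            return "warning"
    return "ok"
```

Note that for radius_auth_success_rate the comparison is inverted: 89% is critical, 92% is only a warning.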
Alerts
Rule: APIHighLatency
expr: api_request_duration_seconds{quantile="0.95"} > 0.5
for: 5m
severity: warning
Rule: DatabaseDown
expr: pg_up == 0
for: 1m
severity: critical
→ Send SMS + create incident
Flow 4: Payment reconciliation (Celery task)
Full flow: Daily 00:15 → KKiaPay API → Compare DB → Adjustments
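The "Compare DB" step could be sketched as a pure classification over the two transaction lists. The record shapes and the `classify` helper are illustrative assumptions, not the production code.

```python
from collections import Counter

def classify(db_rows, kkiapay_rows):
    """Return {transaction_id: outcome} for one day of payments."""
    kk = {r["id"]: r for r in kkiapay_rows}
    db_ids = Counter(r["id"] for r in db_rows)   # detects DB duplicates
    db = {r["id"]: r for r in db_rows}
    out = {}
    for tx_id, kk_row in kk.items():
        if tx_id not in db:
            out[tx_id] = "missing_in_db"      # insert + audit
        elif db_ids[tx_id] > 1:
            out[tx_id] = "duplicate_in_db"    # flag + manual review
        elif db[tx_id]["amount_fcfa"] != kk_row["amount_fcfa"]:
            out[tx_id] = "amount_mismatch"    # alert finance
        elif db[tx_id]["status"] != kk_row["status"]:
            out[tx_id] = "status_mismatch"    # update DB side
        else:
            out[tx_id] = "match"
    return out
```

A caller can then count non-"match" outcomes against daily revenue to decide whether to open a P1 incident.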
Reconciliation conditions
Match: kkiapay_transaction_id + amount_fcfa + timestamp match ✅
Missing in DB: KKiaPay sent it but it is absent from the DB → insert + audit
Duplicate in DB: DB has the transaction twice → flag + manual review
Amount mismatch: amounts differ → alert finance
Status mismatch: kkiapay=completed but db=pending → update
Threshold alert: mismatches > 1% of daily revenue → P1 incident
Celery task
```python
from datetime import date, timedelta

# `app` is the Celery application; `db`, `Payment` and `kkiapay_client`
# come from the service context.
@app.task
def reconcile_payments():
    """Daily 00:15 UTC - KKiaPay ↔ DB"""
    yesterday = date.today() - timedelta(days=1)
    # Completed payments updated since yesterday, on our side
    db_payments = db.query(Payment).filter(
        Payment.status == "completed",
        Payment.updated_at >= yesterday,
    ).all()
    # Same window, as seen by KKiaPay
    kkiapay_payments = kkiapay_client.get_transactions(
        from_date=yesterday
    )
    # Compare & reconcile
    # ...
    return {"processed": len(db_payments), "mismatches": 0}
```
Flow 5: Centralized logs (ELK)
Full flow: Services → Logstash pipelines → Elasticsearch → Kibana search
┌─────────────────┐
│ rgz-api │
│ stdout/stderr │
└────────┬────────┘
│ Docker logs driver
↓
┌─────────────────────────────────┐
│ Logstash (4 pipelines) │
├─────────────────────────────────┤
│ 1. api_logs (JSON from app) │
│ 2. radius_logs (syslog 514) │
│ 3. cpe_syslog (remote syslog) │
│ 4. netflow_logs (goflow2 JSON) │
└──────────┬──────────────────────┘
│ Bulk indexing
↓
┌──────────────────────────┐
│ Elasticsearch │
│ (ILM: 30d hot, 300d │
│ warm, 365d delete) │
├──────────────────────────┤
│ logstash-api-* │
│ logstash-radius-* │
│ logstash-cpe-* │
│ logstash-netflow-* │
└──────────┬───────────────┘
│
          ┌──────┴──────┐
          ↓             ↓
       Kibana        Grafana
      (Search)      (Trends)
Logstash pipelines
Pipeline 1: api_logs
Input: Docker logs (JSON)
Filter: grok + json parse
```json
{
  "timestamp": "2026-02-21T12:34:56Z",
  "level": "INFO",
  "message": "OTP sent to +229xxxxx",
  "service": "rgz-api",
  "request_id": "uuid"
}
```
Output: Elasticsearch
Index: logstash-api-2026.02.21
Pipeline 2: radius_logs
Input: Syslog port 514
Filter: grok radius syslog format
[Auth] [Success] User=subscriber_ref, NAS-ID=access_tech_connect
Output: Elasticsearch
Index: logstash-radius-2026.02.21
Pipeline 3: cpe_syslog
Input: Remote syslog (CPE devices)
Filter: Parse vendor-specific logs
LiteBeam: signal level, noise, throughput
Cambium: MAC, VLAN, authentication
Output: Elasticsearch
Index: logstash-cpe-2026.02.21
Pipeline 4: netflow_logs
Input: goflow2 JSON
Filter: Flatten NetFlow fields
{
"source_ip": "10.100.0.50",
"dest_ip": "8.8.8.8",
"bytes": 1048576,
"duration_ms": 5000,
"protocol": "TCP"
}
Output: Elasticsearch
Index: logstash-netflow-2026.02.21
ILM (Index Lifecycle Management)
Phase: hot (0-30 days)
- Active indexing
- Full replicas (future HA)
- 60GB max size → rollover
Phase: warm (30-300 days)
- Read-only
- Force merge
- Compress
Phase: cold (optional, 300+ days)
- Searchable snapshot (future)
Phase: delete (365 days)
- Purge old indices
Retention
Total: 12 months (365 days)
Intraday: first 30 days (hot)
Daily: 30-365 days (warm)
Size: ~40-60 GB (12 months)
Flows 6-12: Other main flows
Flow 6: DBA MIR recalculation (every 5 min)
Celery task (rgz.qos.dba_recalculate)
→ Get usage from SNMP/RADIUS
→ Calculate new MIR (fair share)
→ Send RADIUS CoA to CPE
→ Update Redis cache
Flow 7: SNMP polling (every 5 min)
Celery task (rgz.monitoring.snmp_poll)
→ Query CPE/gateway via SNMP
→ Parse response (RSSI, throughput, errors)
→ Insert to TimescaleDB hypertables
→ Expose /metrics for Prometheus
Flow 8: SLA probes (every 5 min)
Celery task (rgz.sla.probe)
→ ICMP ping (latency)
→ TCP http://portal/health
→ DNS lookup (recursive)
→ Store in DB
→ Compute uptime % = success_count / total_count
Flow 9: Monthly PDF invoices (D+5)
Celery task (rgz.reports.monthly_invoices)
→ Query payments for previous month
→ Generate PDF (WeasyPrint)
→ Sign with SHA-256 (immutable log #45)
→ Email to reseller
→ Archive in /invoices/
Flow 10: ARCEP reporting (quarterly)
Celery task (rgz.reports.arcep)
→ Aggregate: subscribers, sessions, revenue, incidents
→ Format CSV + PDF
→ Sign & timestamp
→ Upload to ARCEP portal
Flow 11: PostgreSQL backup (daily 03:00)
Celery task (rgz.backup.pg_dump)
→ Execute pg_dump (full backup)
→ Compress gzip
→ Encrypt (future)
→ Upload to S3/NAS
→ Retention: 30 days
→ Verify integrity
Flow 12: Incident escalation (real-time)
Event: Alert fired (Prometheus)
→ AlertManager groups alerts
→ Routes to Slack/Email/SMS
→ Celery creates incident record
→ Assign P0/P1/P2 based on severity
→ Escalate if not acknowledged in X minutes
→ Automatic rollback decision (future)
Summary table: the 12 flows
| # | Flow | Initiator | Duration | Type | Criticality |
|---|---|---|---|---|---|
| 1 | Portal auth | Subscriber | 20s | Sync | CRITICAL |
| 2 | KKiaPay payment | Subscriber | 130s | Async | CRITICAL |
| 3 | Prometheus monitoring | Timer | Continuous | Async | HIGH |
| 4 | Reconciliation | Celery daily | 5 min | Async | HIGH |
| 5 | ELK logs | Continuous | Real-time | Async | HIGH |
| 6 | DBA MIR | Celery 5 min | 30s | Async | CRITICAL |
| 7 | SNMP polling | Celery 5 min | 30s | Async | HIGH |
| 8 | SLA probes | Celery 5 min | 30s | Async | HIGH |
| 9 | PDF invoices | Celery monthly | 5 min | Async | MEDIUM |
| 10 | ARCEP report | Celery quarterly | 10 min | Async | CRITICAL |
| 11 | DB backup | Celery daily | 15 min | Async | CRITICAL |
| 12 | Incident escalation | Prometheus | <1 min | Async | CRITICAL |
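As a sketch of the Flow 12 escalation logic, the severity → P0/P1/P2 assignment and the "escalate if not acknowledged in X minutes" check might look like this. The mapping and the timeout values are assumptions for illustration, not the production policy.

```python
from datetime import datetime, timedelta

# Assumed mapping from Prometheus alert severity to incident priority
SEVERITY_TO_PRIORITY = {"critical": "P0", "warning": "P1", "info": "P2"}

# Hypothetical acknowledgement windows per priority
ACK_TIMEOUT = {"P0": timedelta(minutes=5),
               "P1": timedelta(minutes=15),
               "P2": timedelta(minutes=60)}

def priority_for(alert):
    """Assign P0/P1/P2 from the alert's severity label."""
    return SEVERITY_TO_PRIORITY.get(alert["severity"], "P2")

def must_escalate(incident, now):
    """True if the incident is unacknowledged past its priority's window."""
    if incident["acknowledged"]:
        return False
    return now - incident["created_at"] > ACK_TIMEOUT[incident["priority"]]
```

A periodic Celery beat task could run `must_escalate` over open incidents and trigger the SMS/Slack route for any that return True.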
Support
- Real-time logs: docker logs -f <service>
- Traces: Kibana → Discover → logstash-api-*
- Metrics: Grafana → dashboards
- Incidents: AlertManager UI