First Startup

Verify and validate the NOC after installation.

Prerequisites

✅ Installation completed, with init.sh having run successfully.

Step 1: Start the stack

bash

# Start all services (core + monitoring)
./scripts/ops/start.sh

# Expected output:
# ✅ Starting Docker Compose stacks...
# [+] Building docker images...
# [+] Starting services...
# ...
# ✅ Stack core started
# ✅ Stack monitoring started
# ✅ Waiting for services...

Duration: 30-60 seconds depending on hardware. The startup-time table further down gives 60-120 seconds for the complete stack to report healthy, so allow for the longer figure on a first boot.
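
Because startup time varies with hardware, polling can be more reliable than a fixed wait. The following is a minimal sketch, assuming each container defines a Docker health check; `wait_healthy` is a hypothetical helper and is not part of the `scripts/ops` tooling.

```shell
#!/usr/bin/env bash
# Sketch: poll a container's Docker health status until it reports "healthy".
# wait_healthy is a hypothetical helper, not one of the project's scripts.

wait_healthy() {
  local name="$1" timeout="${2:-120}" interval="${3:-2}" elapsed=0 status=""
  while [ "$elapsed" -lt "$timeout" ]; do
    # .State.Health.Status is "starting", "healthy" or "unhealthy"
    status="$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null || true)"
    if [ "$status" = "healthy" ]; then
      echo "$name is healthy after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "$name not healthy after ${timeout}s (last status: ${status:-unknown})" >&2
  return 1
}
```

For example, `wait_healthy rgz-elasticsearch 120` waits up to two minutes for the slowest service listed in this guide.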

Step 2: Check health

bash

# Check the state of all services
./scripts/ops/status.sh

# Expected output:
# SERVICE               STATUS    HEALTH
# rgz-api               running   healthy ✅
# rgz-db                running   healthy ✅
# rgz-redis             running   healthy ✅
# rgz-radius            running   healthy ✅
# rgz-gateway           running   healthy ✅
# rgz-dns               running   healthy ✅
# rgz-beat              running   healthy ✅
# rgz-portal            running   healthy ✅
# rgz-web               running   healthy ✅
# rgz-kea               running   healthy ✅
# rgz-ids               running   healthy ✅
# rgz-nginx             running   healthy ✅
# rgz-wireguard         running   healthy ✅
# rgz-canary            running   healthy ✅
# rgz-prometheus        running   healthy ✅
# rgz-alertmanager      running   healthy ✅
# rgz-grafana           running   healthy ✅
# rgz-elasticsearch     running   healthy ✅
# rgz-kibana            running   healthy ✅
# rgz-logstash          running   healthy ✅
# rgz-netflow           running   healthy ✅
# rgz-docs              running   healthy ✅
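
With 22 rows to read, a filter that prints only the problem services can help. This sketch assumes the script prints the three whitespace-separated columns shown above (service, status, health); `list_unhealthy` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: read a SERVICE/STATUS/HEALTH table on stdin and print the names of
# services that are not both "running" and "healthy".
# Assumes the column layout shown in the expected output.

list_unhealthy() {
  awk '$1 != "SERVICE" && NF >= 3 && ($2 != "running" || $3 != "healthy") { print $1 }'
}
```

Usage: `./scripts/ops/status.sh | list_unhealthy`. Empty output means every listed service is running and healthy.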

Unhealthy services?

bash

# Check the logs of the failing service
docker logs rgz-api --tail 50

# Wait 5-10 seconds (slow startup), then check again
sleep 10 && ./scripts/ops/status.sh

# Look for critical errors
docker compose -f docker-compose.core.yml logs --tail 100 rgz-api
docker compose -f docker-compose.monitoring.yml logs --tail 100 rgz-elasticsearch

Common causes:

Service	Unhealthy cause	Solution
rgz-elasticsearch	Heap too small	Wait 30s, check `vm.max_map_count`
rgz-kibana	Elasticsearch not ready	Wait for ES to turn green
rgz-api	DB not initialized	Check logs: `docker logs rgz-db`
rgz-grafana	Grafana DB not created	Rerun: `./scripts/ops/init.sh`
rgz-redis	Authentication	Check REDIS_PASSWORD in `.env`
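
The `vm.max_map_count` check from the table can be scripted. Elasticsearch documents a minimum of 262144 for this kernel setting, while the usual Linux default is 65530. The helper name `map_count_ok` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compare a vm.max_map_count value against the minimum Elasticsearch
# documents (262144). map_count_ok is a hypothetical helper.

map_count_ok() {
  local current="$1" required="${2:-262144}"
  [ "$current" -ge "$required" ]
}
```

On a Linux host, `map_count_ok "$(sysctl -n vm.max_map_count)" || echo "vm.max_map_count is too low"` reports the problem. `sudo sysctl -w vm.max_map_count=262144` raises the value until the next reboot.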

Step 3: Run the smoke tests

bash

./scripts/ops/smoke_test.sh

# Expected output:
# ╔════════════════════════════════════════╗
# ║   RGZ NOC — SMOKE TESTS                ║
# ╠════════════════════════════════════════╣
# ✅ API health check: OK
# ✅ Database connectivity: OK
# ✅ Redis connectivity: OK
# ✅ RADIUS (FreeRADIUS): OK
# ✅ DNS resolution: OK
# ✅ Portal accessibility: OK
# ✅ Prometheus metrics: OK
# ✅ Elasticsearch cluster: yellow/green
# ✅ HTTPS (TLS): OK (api-rgz.duckdns.org)
# ✅ All smoke tests passed! ✅
# ╚════════════════════════════════════════╝

# Duration: 30-60 seconds

If a test fails

bash

# API test
curl -s https://api-rgz.duckdns.org/health | jq .
# Expected: {"status": "ok", ...}

# DB test
docker exec rgz-db psql -U rgz_admin -d rgz_noc -c "SELECT NOW();"
# Expected: current timestamp

# Redis test
docker exec rgz-redis redis-cli ping
# Expected: PONG (if Redis requires a password, an unauthenticated ping returns NOAUTH;
# redis-cli reads the password from the REDISCLI_AUTH environment variable)

# DNS test
nslookup access-rgz.duckdns.org 127.0.0.1
# Expected: Address: <server IP>

# HTTPS test
curl -vI https://api-rgz.duckdns.org/docs
# Expected: HTTP/2 200
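
When rerunning several of these checks by hand, a small wrapper that prints one PASS/FAIL line per command keeps the results readable. This is only a sketch: `run_check` is a hypothetical helper and does not reproduce what `smoke_test.sh` does internally.

```shell
#!/usr/bin/env bash
# Sketch: run a command quietly and report PASS or FAIL with a label.
# run_check is a hypothetical helper, unrelated to smoke_test.sh internals.

run_check() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS $label"
  else
    echo "FAIL $label"
    return 1
  fi
}
```

For example: `run_check "API health" curl -fsS https://api-rgz.duckdns.org/health` or `run_check "Database" docker exec rgz-db psql -U rgz_admin -d rgz_noc -c "SELECT 1;"`.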

Step 4: Web access

Open the following URLs in a browser:

External services (via Traefik HTTPS)

URL	Service	Credentials
https://api-rgz.duckdns.org	API Swagger	None (public)
https://api-rgz.duckdns.org/docs	API Swagger UI	None
https://api-rgz.duckdns.org/redoc	API ReDoc	None
https://admin-rgz.duckdns.org	Admin dashboard	TBD
https://access-rgz.duckdns.org	Captive portal	None (public)
https://grafana-rgz.duckdns.org	Grafana dashboards	admin / (see GRAFANA_ADMIN_PASSWORD)
https://docs-rgz.duckdns.org	Documentation	None (public)
https://registre-rgz.duckdns.org	Census site	TBD

Internal services (local network, no Traefik authentication)

URL	Service	Port	Notes
http://[server_ip]:9090	Prometheus	9090	Internal access only
http://[server_ip]:9093	AlertManager	9093	Internal access only
http://[server_ip]:5601	Kibana	5601	Internal access + elastic/pass authentication
http://[server_ip]:9200	Elasticsearch API	9200	Internal access + authentication
http://[server_ip]:9000	Portainer (optional)	9000	Docker management UI
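
The two unauthenticated internal services can also be probed from the command line: Prometheus and Alertmanager both expose a `/-/healthy` endpoint. The sketch below assumes the ports from the table and that the server IP is passed as an argument; both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: probe the built-in health endpoints of Prometheus and Alertmanager.
# Ports come from the table above; function names are hypothetical.

internal_health_urls() {
  local ip="$1"
  printf 'http://%s:9090/-/healthy\n' "$ip"   # Prometheus
  printf 'http://%s:9093/-/healthy\n' "$ip"   # Alertmanager
}

probe_internal() {
  internal_health_urls "$1" | while read -r url; do
    # -w '%{http_code}' prints only the status code; 000 means no connection
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url" || true)"
    echo "$code $url"
  done
}
```

Running `probe_internal 192.0.2.10` (a placeholder address) from a machine on the local network should print `200` in front of each URL.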

First Grafana login

Open https://grafana-rgz.duckdns.org
Enter the credentials:
- User: admin
- Password: (value of GRAFANA_ADMIN_PASSWORD in .env)
Click Home (top-left corner)
Check that the dashboards exist:
- [x] Core Metrics
- [x] Network Overview
- [x] RF Monitoring
- [x] Alerts & Incidents

Test the API via Swagger

Open https://api-rgz.duckdns.org/docs
Click /health → "Try it out" → "Execute"

Expected response:

json

{
  "status": "ok",
  "timestamp": "2026-02-21T12:34:56Z",
  "version": "1.0.0"
}
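
The same response can be checked without a browser. This sketch only tests that the body contains `"status": "ok"` and ignores the other fields; `health_is_ok` is a hypothetical helper, and it uses `grep` so that it works where `jq` is not installed.

```shell
#!/usr/bin/env bash
# Sketch: succeed when the JSON read from stdin contains "status": "ok".
# health_is_ok is a hypothetical helper; it does not parse JSON properly.

health_is_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}
```

Usage: `curl -fsS https://api-rgz.duckdns.org/health | health_is_ok && echo "API OK"`. Where `jq` is available, `jq -e '.status == "ok"'` is a stricter equivalent.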

Test the captive portal

Open https://access-rgz.duckdns.org
It should display the MSISDN login form, plus banners if configured
The ACCESS logo and colors (yellow, blue, red) should be visible

Step 5: Check the logs

bash

# API logs (debug)
docker logs rgz-api --follow

# PostgreSQL logs
docker logs rgz-db --follow

# Redis logs
docker logs rgz-redis --follow

# RADIUS logs (the most important)
docker logs rgz-radius --follow

# All logs of the core stack
docker compose -f docker-compose.core.yml logs --follow

# Quit: Ctrl+C
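
Following logs live is useful while debugging, but a one-shot count of suspicious lines is quicker for a pass/fail check. This is a rough sketch: the keyword list is an assumption, and a plain keyword match also counts harmless lines such as "0 errors". `count_critical` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: count log lines on stdin that mention error, critical or fatal
# (case-insensitive). The keyword list is an assumption.

count_critical() {
  # grep -c exits with status 1 when the count is 0, so normalise the exit status
  grep -ciE 'error|critical|fatal' || true
}
```

Usage: `docker logs --since 10m rgz-api 2>&1 | count_critical` prints the number of matching lines from the last ten minutes.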

Startup time per service

Service	Time	Notes
rgz-api	5-10s	FastAPI startup
rgz-db	5-10s	PostgreSQL init
rgz-redis	2-3s	Redis startup
rgz-radius	3-5s	FreeRADIUS config
rgz-elasticsearch	20-40s	JVM startup (60s+ on first start)
rgz-kibana	10-15s	Kibana init (depends on ES)
rgz-grafana	5-10s	Grafana startup
All services	60-120s	Full stack healthy

Graceful shutdown

bash

# Stop the services
./scripts/ops/stop.sh

# Expected output:
# ✅ Stopping services...
# ✅ All services stopped gracefully

# Verify (-a also lists stopped containers)
docker compose -f docker-compose.core.yml ps -a
# STATUS: Exited (0)
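
The exit code can also be read per container. This sketch assumes `stop.sh` stops the containers without removing them; if it removes them, `docker inspect` finds nothing and the check fails. `stopped_cleanly` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if a container is in state "exited" with exit code 0.
# Assumes the container still exists after stop.sh; helper name is hypothetical.

stopped_cleanly() {
  local name="$1" state=""
  state="$(docker inspect --format '{{.State.Status}} {{.State.ExitCode}}' "$name" 2>/dev/null || true)"
  [ "$state" = "exited 0" ]
}
```

Usage: `stopped_cleanly rgz-api || echo "rgz-api did not exit cleanly"`. An exit code of 137 usually means the process was killed with SIGKILL after the stop timeout expired, rather than shutting down on its own.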

After the first boot

✅ Read Architecture: understand the stack
✅ Read Access URLs: detailed access to every service
✅ Consult Operations: backup, monitoring, logs
✅ Configure alerts in Grafana/AlertManager
✅ Test the KKiaPay webhooks (payments)
✅ Configure reseller domains + CPEs

Health dashboard (Grafana)

Open https://grafana-rgz.duckdns.org/d/core-metrics

Check the key metrics:

[ ] CPU usage < 80%
[ ] RAM available > 2 GB
[ ] Disk usage < 80%
[ ] Uptime counter > 0
[ ] API response time < 200ms
[ ] DB connections < 50
[ ] Redis hit rate > 95%
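
These thresholds can be checked in a script once a value has been read from a dashboard or from Prometheus. The sketch below only does the numeric comparison, using `awk` because shell arithmetic is integer-only. `below` is a hypothetical helper, and the metric names behind each threshold are not specified in this guide.

```shell
#!/usr/bin/env bash
# Sketch: succeed when value $1 is strictly below limit $2 (decimals allowed).
# below is a hypothetical helper.

below() {
  awk -v v="$1" -v l="$2" 'BEGIN { exit !((v + 0) < (l + 0)) }'
}
```

Usage: `below 72.5 80 && echo "CPU usage OK"`. Raw values can be fetched through the Prometheus HTTP API, for example `curl -sG http://[server_ip]:9090/api/v1/query --data-urlencode 'query=up'`, where `up` is the built-in per-target availability metric.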

Checkpoints after the first boot

[ ] All 22 services = "healthy" ✅
[ ] Smoke tests = all pass ✅
[ ] API /docs accessible ✅
[ ] Grafana login = OK ✅
[ ] Captive portal = displays ✅
[ ] Logs = no critical errors ✅
[ ] HTTPS = valid TLS (Let's Encrypt) ✅
[ ] Grafana dashboard = metrics visible ✅
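
The TLS checkpoint can be confirmed with `openssl` by printing the issuer and expiry date of the certificate the server presents. This is a sketch with hypothetical function names; it assumes the issuer line contains the text "Let's Encrypt", as it does for certificates issued by that CA.

```shell
#!/usr/bin/env bash
# Sketch: show issuer and expiry of a host's certificate and check the issuer.
# Function names are hypothetical.

cert_summary() {
  local host="$1"
  # s_client fetches the certificate chain; x509 prints the selected fields
  openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer -enddate
}

issued_by_lets_encrypt() {
  grep -q "Let's Encrypt"
}
```

Usage: `cert_summary api-rgz.duckdns.org | issued_by_lets_encrypt && echo "Let's Encrypt certificate"`. The `notAfter` line printed by `cert_summary` gives the expiry date.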

Next step: Architecture
