🧰 Operations & Runbook¶
✅ Startup Checklist¶
- Ensure AWS SQS is configured.
 - Run 
docker-compose up. - Verify:
 - DB tables created
 - ML model loaded
 - Prometheus scraping metrics
 
⚠️ Common Issues¶
- SQS Connection Failure → Check credentials and queue URL.
 - Scraper Timeout → Increase timeout/delay in 
SCRAPER_CONFIG. - DB IntegrityError → Ensure schema matches ORM definitions.
 - Model Not Loading → Verify correct 
MODEL_PATH. 
🛡️ On-Call Guide¶
- First check Grafana dashboards.
 - Look for spikes in 
SCRAPER_FAILURESor API latency. - Validate Prometheus targets are healthy.
 
🔄 Recovery Procedures¶
- Restart consumer if scraping fails continuously.
 - Replay failed messages from SQS DLQ.
 - For DB issues, rollback transaction & re-run.