Runbooks
Short, fix-it-now docs derived from real incidents in the lab. Each runbook follows the same shape:
- Symptom you would actually see.
- Quick triage to confirm.
- Fix.
- Follow-up so it does not bite again.
Kubernetes platform
- Postgres rejecting K8s pods (pg_hba.conf)
- MetalLB IP not answering ARP
- Grafana CrashLoopBackOff
- etcd database bloat and defragmentation
- Bitnami image 404 / ImagePullBackOff