Runbooks¶
Short, fix-it-now docs derived from real incidents in the lab. Each runbook follows the same shape:
- Symptom you would actually see.
- Quick triage to confirm.
- Fix.
- Follow-up so it does not bite again.
Kubernetes platform¶
- Postgres rejecting K8s pods (
pg_hba.conf) - MetalLB IP not answering ARP
- Grafana CrashLoopBackOff
- etcd database bloat and defragmentation
- Bitnami image 404 /
ImagePullBackOff
Apps¶
- Jellyfin database corruption from ungraceful shutdown
- Jellyfin transcoding and ffmpeg
- GPU passthrough D3cold recovery (jellyfin / pve2 / RTX 2080 Ti)