
Decisions

Short notes on tradeoffs taken in the lab. Each ADR follows the standard shape:

  • Context — what was true at the time.
  • Decision — what I picked.
  • Considered — what else I looked at and why I did not pick it.
  • Consequences — what this commits me to, including the parts that are not great.

Index

Active questions (not yet ADR'd)

  • Uptime Kuma on a VM, in-cluster, or off-site. A VM was provisioned but never started (incident reference). The right move is probably to run it off-site, so that it monitors the lab from an external vantage point. To be ADR'd before the next round of monitoring changes.
  • Public read-only Grafana dashboard. The operational UIs (ArgoCD, homepage, Prometheus, Alertmanager) are intentionally LAN-only. A single curated read-only Grafana dashboard exposed through NPM, with the rest of Grafana login-gated, would be the smallest blast-radius way to give an external viewer a real-time look at the lab. Open: which dashboard, what to redact, whether to put it behind Cloudflare Access for additional gating.
  • Calico vs Cilium. Today the lab runs Calico. Cilium would buy me eBPF-based observability and policy, at the cost of one more thing to debug. Deferred for now.
  • hostssl + sslmode=require for all Postgres clients. Already a follow-up from the 2026-05-03 incident.
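
For the Postgres item above, a minimal sketch of how the two halves of that follow-up could be audited. On the server side, `pg_hba.conf` rules of type `host` or `hostnossl` still accept plain TCP connections; on the client side, libpq-style connection URIs take an `sslmode` parameter. The function names and sample values here are hypothetical, not taken from the lab's repo, and the parser ignores quoting and `include` directives.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# sslmode values that refuse to fall back to an unencrypted connection.
STRICT_SSLMODES = ("require", "verify-ca", "verify-full")


def non_ssl_hba_rules(hba_text: str) -> list[str]:
    """Return pg_hba.conf rules that still permit non-TLS TCP connections.

    Hypothetical helper: a rough line-based scan, not a full parser.
    """
    offenders = []
    for raw in hba_text.splitlines():
        # Drop comments and blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # `host` matches TLS and plain TCP; `hostnossl` matches plain TCP only.
        if line.split()[0] in ("host", "hostnossl"):
            offenders.append(line)
    return offenders


def require_ssl(uri: str) -> str:
    """Return the connection URI with sslmode set to at least `require`.

    Hypothetical helper: stricter modes already present are left alone.
    """
    parts = urlsplit(uri)
    query = dict(parse_qsl(parts.query))
    if query.get("sslmode") not in STRICT_SSLMODES:
        query["sslmode"] = "require"
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Any rule returned by `non_ssl_hba_rules` is a candidate for conversion to `hostssl`. `require_ssl` only covers clients configured through a URI; clients using keyword/value strings or environment variables such as `PGSSLMODE` would need a separate check.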