Runbook: etcd / Talos control plane

Symptoms: API flaps, etcd alarms, multiple control planes NotReady, upgrades stuck.

Checks

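  1. Run talosctl health and talosctl etcd status with TALOSCONFIG set; pass -n with a control-plane node address if the default target is not a control plane.
  2. Run kubectl get nodes and confirm every control plane is Ready; check node conditions for DiskPressure or MemoryPressure.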
  3. Talos version skew: talosctl version vs node image in talos/talconfig.yaml / Image Factory schematic.
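The checks above can be sketched as a small shell helper. This is a minimal sketch, not a definitive procedure: the function names run and control_plane_checks, the DRY_RUN switch, and the node address in the usage line are assumptions made for illustration, and TALOSCONFIG is expected to point at a valid config already.

```shell
#!/bin/sh
# Sketch of the three checks. All helper names here are hypothetical.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

control_plane_checks() {
  node="$1"  # address of one control-plane node (placeholder, supply your own)

  # 1. Cluster health and etcd status, targeted at a control-plane node.
  run talosctl -n "$node" health
  run talosctl -n "$node" etcd status

  # 2. Node readiness; inspect conditions separately for disk/memory pressure.
  run kubectl get nodes -o wide

  # 3. Client and node Talos versions, to compare against talos/talconfig.yaml.
  run talosctl -n "$node" version
}
```

For example, DRY_RUN=1 control_plane_checks 10.0.0.2 prints the four commands without contacting the cluster, which is useful for reviewing them before a real run.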

Common fixes

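  • One bad control plane: confirm the remaining etcd members still hold quorum before you cordon or drain workloads; then follow the Talos maintenance docs to replace or remove the node.
  • Disk full on etcd volume: free or grow space on the host disk / system partition. Check in the machine config whether the data sits on the Talos ephemeral partition or on a user volume.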
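"Confirming quorum" comes down to simple arithmetic: an etcd cluster of n members needs floor(n/2) + 1 of them healthy. The sketch below encodes that rule; the function names etcd_quorum and safe_to_remove_one are hypothetical, and the member counts are assumed to be read by hand from talosctl etcd members or talosctl etcd status.

```shell
#!/bin/sh
# Quorum size for an etcd cluster with $1 members: floor(n/2) + 1.
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Exit 0 if quorum still holds after taking one more healthy member offline.
#   $1 = total etcd members, $2 = members currently healthy
# Conservative: quorum is computed against the current total, since the
# member list does not shrink until the member is actually removed.
safe_to_remove_one() {
  total="$1"
  healthy="$2"
  [ $(( healthy - 1 )) -ge "$(etcd_quorum "$total")" ]
}
```

With three members all healthy, safe_to_remove_one 3 3 succeeds, because two members remain and quorum is two. With one member already down, safe_to_remove_one 3 2 fails, so the broken member should be repaired or replaced before touching another control plane.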

References: Talos etcd, talos/README.md.