# Runbook: etcd / Talos control plane

**Symptoms:** Kubernetes API flapping, `etcd` alarms, multiple control-plane nodes `NotReady`, stuck upgrades.

**Checks**

1. Run `talosctl health` and `talosctl etcd status` (with `TALOSCONFIG` set; target a control-plane node with `--nodes` if needed).
2. `kubectl get nodes`: confirm every control-plane node is **Ready**, and check node conditions for disk or memory pressure.
3. Talos version skew: compare `talosctl version` output against the node image in [`talos/talconfig.yaml`](../talconfig.yaml) or the Image Factory schematic.
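The three checks above can be sketched as one shell function. This is only a sketch under assumptions, not part of the original runbook: the function name `run_checks` and its node argument are hypothetical, `TALOSCONFIG` and a working kubeconfig are assumed to be set already, and control-plane nodes are assumed to carry the `node-role.kubernetes.io/control-plane` label. The extra `talosctl etcd alarm list` call is added because alarms are listed as a symptom.

```shell
# Hypothetical helper (not from the runbook): run the checks against one
# control-plane node. Assumes TALOSCONFIG and kubeconfig are already set.
# Usage: run_checks <control-plane-node-ip>
run_checks() {
  cp_node="$1"
  if [ -z "$cp_node" ]; then
    echo "usage: run_checks <control-plane-node-ip>" >&2
    return 2
  fi
  rc=0  # becomes 1 if any check command fails; later checks still run

  echo "== 1. Talos and etcd health =="
  talosctl --nodes "$cp_node" health || rc=1
  talosctl --nodes "$cp_node" etcd status || rc=1
  talosctl --nodes "$cp_node" etcd alarm list || rc=1

  echo "== 2. Control-plane readiness and pressure conditions =="
  # Assumed label selector; READY should be True, DISK and MEM should be False.
  kubectl get nodes -l node-role.kubernetes.io/control-plane \
    -o custom-columns='NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status,DISK:.status.conditions[?(@.type=="DiskPressure")].status,MEM:.status.conditions[?(@.type=="MemoryPressure")].status' \
    || rc=1

  echo "== 3. Talos version (compare by hand with talos/talconfig.yaml) =="
  talosctl --nodes "$cp_node" version || rc=1

  return "$rc"
}
```

Calling `run_checks` with the address of one control-plane node prints each section in order and returns non-zero if any command failed, so it can gate later steps in a script.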

**Common fixes**

- One bad control-plane node: confirm etcd still has quorum before you cordon or drain workloads, then follow the Talos maintenance docs to replace or remove the node.
- Disk full on the etcd volume: free space on the host disk or system partition (Talos ephemeral vs. user volumes, depending on the machine config).
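"Confirming quorum" comes down to simple arithmetic: an etcd cluster with N members needs floor(N/2) + 1 healthy members to keep accepting writes. The sketch below encodes that rule and adds a small disk report. All function names are hypothetical; the member counts are assumed to be read by hand from `talosctl etcd members` and `talosctl etcd status`, and `/var/lib/etcd` is assumed to be the etcd data directory on the node.

```shell
# Hypothetical helpers (not from the runbook) for the fixes above.

# quorum_size <total-members>: members needed for a majority.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# has_quorum <healthy-members> <total-members>: exit 0 if a majority is healthy.
has_quorum() {
  [ "$1" -ge "$(quorum_size "$2")" ]
}

# can_lose_one <healthy-members> <total-members>: exit 0 if quorum would
# survive taking one more healthy member offline (for example, a reboot).
can_lose_one() {
  has_quorum $(( $1 - 1 )) "$2"
}

# etcd_disk_report <control-plane-node-ip>: show mounts and etcd data usage.
# Assumes TALOSCONFIG is set and /var/lib/etcd is the etcd data directory.
etcd_disk_report() {
  talosctl --nodes "$1" mounts
  talosctl --nodes "$1" usage /var/lib/etcd
}
```

For a three-member cluster with one bad node, `has_quorum 2 3` succeeds but `can_lose_one 2 3` fails: the cluster still has quorum, but no other control-plane node should be taken down until the bad one is replaced.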

**References:** [Talos etcd](https://www.talos.dev/latest/advanced/etcd-maintenance/), [`talos/README.md`](../README.md).