# Ansible — noble cluster

Automates [`talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md): optional **Talos Phase A** (genconfig → apply → bootstrap → kubeconfig), then **Phase B+** (CNI → add-ons → ingress → Argo CD → Kyverno → observability, etc.). **Argo CD** does not reconcile core charts — optional GitOps starts from an empty [`clusters/noble/apps/kustomization.yaml`](../clusters/noble/apps/kustomization.yaml).

## Order of operations

1. **From `talos/`:** run `talhelper gensecret` / `talsecret` as in [`talos/README.md`](../talos/README.md) §1 (if not already done).
2. **Talos Phase A (automated):** run [`playbooks/talos_phase_a.yml`](playbooks/talos_phase_a.yml) **or** the full pipeline [`playbooks/deploy.yml`](playbooks/deploy.yml). This runs **`talhelper genconfig -o out`**, **`talosctl apply-config`** on each node, **`talosctl bootstrap`**, and **`talosctl kubeconfig`** → **`talos/kubeconfig`**.
3. **Platform stack:** [`playbooks/noble.yml`](playbooks/noble.yml) (included at the end of **`deploy.yml`**).

Your workstation must be able to reach **node IPs on the lab LAN** (Talos API **:50000** for `talosctl`, Kubernetes **:6443** for `kubectl` / Helm). If `kubectl` cannot reach the VIP (`192.168.50.230`), use `-e 'noble_k8s_api_server_override=https://<node-ip>:6443'` on **`noble.yml`** (see `group_vars/all.yml`).

**One-shot full deploy** (after nodes are booted and reachable):

```bash
cd ansible
ansible-playbook playbooks/deploy.yml
```

## Deploy secrets (`.env`)

Copy **`.env.sample`** to **`.env`** at the repository root (`.env` is gitignored). At minimum, set **`CLOUDFLARE_DNS_API_TOKEN`** for cert-manager DNS-01. The **cert-manager** role applies it automatically during **`noble.yml`**. See **`.env.sample`** for optional placeholders (e.g. Newt/Pangolin).

## Prerequisites

- `talosctl` (matching the nodes' Talos version), `talhelper`, `helm`, `kubectl`.
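Before the one-shot deploy, a quick pre-flight on the workstation can catch missing CLIs and unreachable endpoints early. A minimal sketch, not part of the playbooks: the node IP is a placeholder for any node from `talos/talconfig.yaml`, and the port probes assume an OpenBSD-style `nc`.

```bash
# Pre-flight for deploy.yml: report missing CLIs, then probe Talos apid and the K8s API.
NODE_IP=192.168.50.20   # placeholder: any node IP from talos/talconfig.yaml
VIP=192.168.50.230      # cluster VIP (see group_vars/all.yml)

for tool in talosctl talhelper helm kubectl ansible-playbook; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done

for target in "$NODE_IP 50000" "$VIP 6443"; do
  # word-splitting $target into "host port" on purpose
  if nc -z -w 2 $target 2>/dev/null; then
    echo "reachable: $target"
  else
    echo "UNREACHABLE: $target"
  fi
done
```

Anything reported `UNREACHABLE` usually means the workstation is not yet on the lab LAN/VPN (see Prerequisites).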
- **SOPS secrets:** `sops` and `age` on the control host if you use **`clusters/noble/secrets/`** with **`age-key.txt`** (see **`clusters/noble/secrets/README.md`**).
- **Phase A:** same LAN/VPN as the nodes so **Talos :50000** and **Kubernetes :6443** are reachable (see [`talos/README.md`](../talos/README.md) §3).
- **noble.yml:** a bootstrapped cluster and **`talos/kubeconfig`** (or `KUBECONFIG`).

## Playbooks

| Playbook | Purpose |
|----------|---------|
| [`playbooks/deploy.yml`](playbooks/deploy.yml) | **Talos Phase A** then **`noble.yml`** (full automation). |
| [`playbooks/talos_phase_a.yml`](playbooks/talos_phase_a.yml) | `genconfig` → `apply-config` → `bootstrap` → `kubeconfig` only. |
| [`playbooks/noble.yml`](playbooks/noble.yml) | Helm + `kubectl` platform (after Phase A). |
| [`playbooks/post_deploy.yml`](playbooks/post_deploy.yml) | SOPS reminders and optional Argo root Application note. |
| [`playbooks/talos_bootstrap.yml`](playbooks/talos_bootstrap.yml) | **`talhelper genconfig` only** (legacy shortcut; prefer **`talos_phase_a.yml`**). |
| [`playbooks/debian_harden.yml`](playbooks/debian_harden.yml) | Baseline hardening for Debian servers (SSH/sysctl/fail2ban/unattended-upgrades). |
| [`playbooks/debian_maintenance.yml`](playbooks/debian_maintenance.yml) | Debian maintenance run (apt upgrades, autoremove/autoclean, reboot when required). |
| [`playbooks/debian_rotate_ssh_keys.yml`](playbooks/debian_rotate_ssh_keys.yml) | Rotate managed users' `authorized_keys`. |
| [`playbooks/debian_ops.yml`](playbooks/debian_ops.yml) | Convenience pipeline: harden, then maintenance for Debian servers. |
| [`playbooks/proxmox_prepare.yml`](playbooks/proxmox_prepare.yml) | Configure Proxmox community repos and disable the no-subscription UI warning. |
| [`playbooks/proxmox_upgrade.yml`](playbooks/proxmox_upgrade.yml) | Proxmox maintenance run (apt dist-upgrade, cleanup, reboot when required). |
| [`playbooks/proxmox_cluster.yml`](playbooks/proxmox_cluster.yml) | Create a Proxmox cluster on the master and join additional hosts. |
| [`playbooks/proxmox_ops.yml`](playbooks/proxmox_ops.yml) | Convenience pipeline: prepare, upgrade, then cluster Proxmox hosts. |

```bash
cd ansible
export KUBECONFIG=/absolute/path/to/home-server/talos/kubeconfig
# noble.yml only — if the VIP is unreachable from this host:
# ansible-playbook playbooks/noble.yml -e 'noble_k8s_api_server_override=https://192.168.50.20:6443'
ansible-playbook playbooks/noble.yml
ansible-playbook playbooks/post_deploy.yml
```

### Talos Phase A variables (role `talos_phase_a` defaults)

Override with `-e` when needed, e.g. **`-e noble_talos_skip_bootstrap=true`** if etcd is already initialized.

| Variable | Default | Meaning |
|----------|---------|---------|
| `noble_talos_genconfig` | `true` | Run **`talhelper genconfig -o out`** first. |
| `noble_talos_apply_mode` | `auto` | **`auto`** — a **`talosctl apply-config --dry-run`** against the first node picks maintenance (**`--insecure`**) vs joined (**`TALOSCONFIG`**) mode. **`insecure`** / **`secure`** force talos/README §2 A or B. |
| `noble_talos_skip_bootstrap` | `false` | Skip **`talosctl bootstrap`**. If etcd is **already** initialized, bootstrap is treated as a no-op (same as the **`talosctl`** "etcd data directory is not empty" case). |
| `noble_talos_apid_wait_delay` / `noble_talos_apid_wait_timeout` | `20` / `900` | Seconds to wait for **apid :50000** on the bootstrap node after **apply-config** (nodes reboot). Increase if bootstrap hits **connection refused** on `:50000`. |
| `noble_talos_nodes` | neon/argon/krypton/helium | IP + **`out/*.yaml`** filename — keep aligned with **`talos/talconfig.yaml`**. |

### Tags (partial runs)

```bash
ansible-playbook playbooks/noble.yml --tags cilium,metallb
ansible-playbook playbooks/noble.yml --skip-tags newt
ansible-playbook playbooks/noble.yml --tags velero \
  -e noble_velero_install=true -e noble_velero_s3_bucket=... \
  -e noble_velero_s3_url=...
```

### Variables — `group_vars/all.yml` and role defaults

- **`group_vars/all.yml`:** **`noble_newt_install`**, **`noble_velero_install`**, **`noble_cert_manager_require_cloudflare_secret`**, **`noble_argocd_apply_root_application`**, **`noble_k8s_api_server_override`**, **`noble_k8s_api_server_auto_fallback`**, **`noble_k8s_api_server_fallback`**, **`noble_skip_k8s_health_check`**
- **`roles/noble_platform/defaults/main.yml`:** **`noble_apply_sops_secrets`**, **`noble_sops_age_key_file`** (SOPS secrets under **`clusters/noble/secrets/`**)

## Roles

| Role | Contents |
|------|----------|
| `talos_phase_a` | Talos genconfig, apply-config, bootstrap, kubeconfig |
| `helm_repos` | `helm repo add` / `update` |
| `noble_*` | Cilium, CSI Volume Snapshot CRDs + controller, metrics-server, Longhorn, MetalLB (20m Helm wait), kube-vip, Traefik, cert-manager, Newt, Argo CD, Kyverno, platform stack, Velero (optional) |
| `noble_landing_urls` | Writes **`ansible/output/noble-lab-ui-urls.md`** — URLs, service names, and (optional) Argo/Grafana passwords from Secrets |
| `noble_post_deploy` | Post-install reminders |
| `talos_bootstrap` | Genconfig-only (used by the older playbook) |
| `debian_baseline_hardening` | Baseline Debian hardening (SSH policy, sysctl profile, fail2ban, unattended upgrades) |
| `debian_maintenance` | Routine Debian maintenance tasks (updates, cleanup, reboot-on-required) |
| `debian_ssh_key_rotation` | Declarative `authorized_keys` rotation for server users |
| `proxmox_baseline` | Proxmox repo prep (community repos) and no-subscription warning suppression |
| `proxmox_maintenance` | Proxmox package maintenance (dist-upgrade, cleanup, reboot-on-required) |
| `proxmox_cluster` | Proxmox cluster bootstrap/join automation using `pvecm` |

## Debian server ops quick start

These playbooks are separate from the Talos/noble flow and target hosts in `debian_servers`.
1. Copy `inventory/debian.example.yml` to `inventory/debian.yml` and update hosts/users.
2. Update `group_vars/debian_servers.yml` with your allowed SSH users and real public keys.
3. Run with the Debian inventory:

```bash
cd ansible
ansible-playbook -i inventory/debian.yml playbooks/debian_harden.yml
ansible-playbook -i inventory/debian.yml playbooks/debian_rotate_ssh_keys.yml
ansible-playbook -i inventory/debian.yml playbooks/debian_maintenance.yml
```

Or run the combined maintenance pipeline:

```bash
cd ansible
ansible-playbook -i inventory/debian.yml playbooks/debian_ops.yml
```

## Proxmox host + cluster quick start

These playbooks are separate from the Talos/noble flow and target hosts in `proxmox_hosts`.

1. Copy `inventory/proxmox.example.yml` to `inventory/proxmox.yml` and update hosts/users.
2. Update `group_vars/proxmox_hosts.yml` with your cluster name (`proxmox_cluster_name`), the chosen cluster master, and the root public key file paths to install.
3. First run (no SSH keys yet): use `--ask-pass` **or** set `ansible_password` (prefer Ansible Vault). Keep `ansible_ssh_common_args: "-o StrictHostKeyChecking=accept-new"` in the inventory for first-contact hosts.
4. Run prepare first to install your public keys on each host, then continue:

```bash
cd ansible
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_prepare.yml --ask-pass
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_upgrade.yml
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_cluster.yml
```

After `proxmox_prepare.yml` finishes, SSH key auth should work for root (keys from `proxmox_root_authorized_key_files`), so `--ask-pass` is usually no longer needed. If `pvecm add` still prompts for the master root password during join, set `proxmox_cluster_master_root_password` (prefer Vault) to run the join non-interactively. Changing `proxmox_cluster_name` only affects new cluster creation; it does not rename an already-created cluster.
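For orientation, a hypothetical `inventory/proxmox.yml` could look like the sketch below. Host names and IPs are invented placeholders; `inventory/proxmox.example.yml` remains the authoritative starting point, and only standard Ansible inventory keys (`ansible_host`, `ansible_user`, `ansible_ssh_common_args`) are used.

```yaml
# Hypothetical sketch only; start from inventory/proxmox.example.yml.
all:
  children:
    proxmox_hosts:
      hosts:
        pve1:                             # placeholder host name
          ansible_host: 192.168.50.11     # placeholder IP
        pve2:
          ansible_host: 192.168.50.12
      vars:
        ansible_user: root
        # Accept unknown host keys on first contact (step 3 above):
        ansible_ssh_common_args: "-o StrictHostKeyChecking=accept-new"
```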
Or run the full Proxmox pipeline:

```bash
cd ansible
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_ops.yml
```

## Migrating from Argo-managed `noble-platform`

```bash
kubectl delete application -n argocd noble-platform noble-kyverno noble-kyverno-policies --ignore-not-found
kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
```

Then run `playbooks/noble.yml` so Helm state matches git values.
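For context, an Argo CD root Application ("app of apps") typically looks roughly like the sketch below. This is not the contents of `clusters/noble/bootstrap/argocd/root-application.yaml`: the repo URL and metadata name are placeholders, and only the Application CRD fields themselves are standard; the `path` assumes the `clusters/noble/apps` entry point mentioned above.

```yaml
# Illustrative only; the real manifest lives at
# clusters/noble/bootstrap/argocd/root-application.yaml.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root                     # placeholder name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/home-server.git   # placeholder repo URL
    targetRevision: main
    path: clusters/noble/apps    # GitOps entry point (kustomization.yaml)
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```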