# Ansible — noble cluster
Automates `talos/CLUSTER-BUILD.md`: optional Talos Phase A (genconfig → apply → bootstrap → kubeconfig), then Phase B+ (CNI → add-ons → ingress → Argo CD → Kyverno → observability, etc.). Argo CD does not reconcile core charts — optional GitOps starts from an empty `clusters/noble/apps/kustomization.yaml`.
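The empty GitOps entry point can be sketched as a minimal kustomization. The path comes from this README; the exact file contents below are an assumption:

```yaml
# clusters/noble/apps/kustomization.yaml — assumed minimal form.
# Argo CD only reconciles what is listed here, so an empty resources
# list means no GitOps-managed apps until you opt in.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources: []
```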
## Order of operations
- From `talos/`: `talhelper gensecret` / `talsecret` as in `talos/README.md` §1 (if not already done).
- Talos Phase A (automated): run `playbooks/talos_phase_a.yml` or the full pipeline `playbooks/deploy.yml`. This runs `talhelper genconfig -o out`, `talosctl apply-config` on each node, `talosctl bootstrap`, and `talosctl kubeconfig` → `talos/kubeconfig`.
- Platform stack: `playbooks/noble.yml` (included at the end of `deploy.yml`).
Your workstation must be able to reach the node IPs on the lab LAN (Talos API `:50000` for `talosctl`, Kubernetes `:6443` for `kubectl`/Helm). If `kubectl` cannot reach the VIP (`192.168.50.230`), use `-e 'noble_k8s_api_server_override=https://<control-plane-ip>:6443'` on `noble.yml` (see `group_vars/all.yml`).
One-shot full deploy (after nodes are booted and reachable):

```sh
cd ansible
ansible-playbook playbooks/deploy.yml
```
## Deploy secrets (`.env`)
Copy `.env.sample` to `.env` at the repository root (`.env` is gitignored). At minimum, set `CLOUDFLARE_DNS_API_TOKEN` for cert-manager DNS-01. The cert-manager role applies it automatically during `noble.yml`. See `.env.sample` for optional placeholders (e.g. Newt/Pangolin).
## Prerequisites
- `talosctl` (matching the node Talos version), `talhelper`, `helm`, `kubectl`.
- SOPS secrets: `sops` and `age` on the control host if you use `clusters/noble/secrets/` with `age-key.txt` (see `clusters/noble/secrets/README.md`).
- Phase A: same LAN/VPN as the nodes so Talos `:50000` and Kubernetes `:6443` are reachable (see `talos/README.md` §3).
- `noble.yml`: bootstrapped cluster and `talos/kubeconfig` (or `KUBECONFIG`).
## Playbooks
| Playbook | Purpose |
|---|---|
| `playbooks/deploy.yml` | Talos Phase A then `noble.yml` (full automation). |
| `playbooks/talos_phase_a.yml` | genconfig → apply-config → bootstrap → kubeconfig only. |
| `playbooks/noble.yml` | Helm + kubectl platform (after Phase A). |
| `playbooks/post_deploy.yml` | SOPS reminders and optional Argo root Application note. |
| `playbooks/talos_bootstrap.yml` | `talhelper genconfig` only (legacy shortcut; prefer `talos_phase_a.yml`). |
| `playbooks/debian_harden.yml` | Baseline hardening for Debian servers (SSH/sysctl/fail2ban/unattended-upgrades). |
| `playbooks/debian_maintenance.yml` | Debian maintenance run (apt upgrades, autoremove/autoclean, reboot when required). |
| `playbooks/debian_rotate_ssh_keys.yml` | Rotate managed users' `authorized_keys`. |
| `playbooks/debian_ops.yml` | Convenience pipeline: harden then maintenance for Debian servers. |
| `playbooks/proxmox_prepare.yml` | Configure Proxmox community repos and disable the no-subscription UI warning. |
| `playbooks/proxmox_upgrade.yml` | Proxmox maintenance run (apt dist-upgrade, cleanup, reboot when required). |
| `playbooks/proxmox_cluster.yml` | Create a Proxmox cluster on the master and join additional hosts. |
| `playbooks/proxmox_ops.yml` | Convenience pipeline: prepare, upgrade, then cluster Proxmox hosts. |
Run the platform stack and post-deploy steps separately (after Phase A):

```sh
cd ansible
export KUBECONFIG=/absolute/path/to/home-server/talos/kubeconfig
# noble.yml only — if the VIP is unreachable from this host:
# ansible-playbook playbooks/noble.yml -e 'noble_k8s_api_server_override=https://192.168.50.20:6443'
ansible-playbook playbooks/noble.yml
ansible-playbook playbooks/post_deploy.yml
```
## Talos Phase A variables (role `talos_phase_a` defaults)
Override with `-e` when needed, e.g. `-e noble_talos_skip_bootstrap=true` if etcd is already initialized.
| Variable | Default | Meaning |
|---|---|---|
| `noble_talos_genconfig` | `true` | Run `talhelper genconfig -o out` first. |
| `noble_talos_apply_mode` | `auto` | `auto` — `talosctl apply-config --dry-run` on the first node picks maintenance (`--insecure`) vs joined (`TALOSCONFIG`). `insecure` / `secure` force `talos/README` §2 A or B. |
| `noble_talos_skip_bootstrap` | `false` | Skip `talosctl bootstrap`. If etcd is already initialized, bootstrap is treated as a no-op (same as `talosctl` "etcd data directory is not empty"). |
| `noble_talos_apid_wait_delay` / `noble_talos_apid_wait_timeout` | `20` / `900` | Seconds to wait for apid `:50000` on the bootstrap node after apply-config (nodes reboot). Increase if bootstrap hits connection refused to `:50000`. |
| `noble_talos_nodes` | neon/argon/krypton/helium | IP + `out/*.yaml` filename — align with `talos/talconfig.yaml`. |
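As a sketch, several of these defaults can be overridden together from an extra-vars file. The filename and the values below are hypothetical; the variable names come from the table above:

```yaml
# phase-a-overrides.yml — hypothetical extra-vars file.
# Pass with: ansible-playbook playbooks/talos_phase_a.yml -e @phase-a-overrides.yml
noble_talos_skip_bootstrap: true      # etcd already initialized on a previous run
noble_talos_apply_mode: insecure      # force maintenance-mode apply (talos/README §2 A)
noble_talos_apid_wait_timeout: 1800   # slow-booting nodes: wait longer for apid :50000
```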
## Tags (partial runs)
```sh
ansible-playbook playbooks/noble.yml --tags cilium,metallb
ansible-playbook playbooks/noble.yml --skip-tags newt
ansible-playbook playbooks/noble.yml --tags velero -e noble_velero_install=true -e noble_velero_s3_bucket=... -e noble_velero_s3_url=...
```
## Variables — `group_vars/all.yml` and role defaults
- `group_vars/all.yml`: `noble_newt_install`, `noble_velero_install`, `noble_cert_manager_require_cloudflare_secret`, `noble_argocd_apply_root_application`, `noble_argocd_apply_bootstrap_root_application`, `noble_k8s_api_server_override`, `noble_k8s_api_server_auto_fallback`, `noble_k8s_api_server_fallback`, `noble_skip_k8s_health_check`
- `roles/noble_platform/defaults/main.yml`: `noble_apply_sops_secrets`, `noble_sops_age_key_file` (SOPS secrets under `clusters/noble/secrets/`)
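For example, an override enabling a few of these toggles might look like the following. The variable names come from this README; the values are illustrative, not the shipped defaults:

```yaml
# Illustrative overrides for group_vars/all.yml variables (values assumed)
noble_velero_install: true
noble_newt_install: false
noble_k8s_api_server_override: "https://192.168.50.20:6443"  # bypass an unreachable VIP
noble_skip_k8s_health_check: false
```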
## Roles
| Role | Contents |
|---|---|
| `talos_phase_a` | Talos genconfig, apply-config, bootstrap, kubeconfig |
| `helm_repos` | `helm repo add` / `update` |
| `noble_*` | Cilium, CSI Volume Snapshot CRDs + controller, metrics-server, Longhorn, MetalLB (20m Helm wait), kube-vip, Traefik, cert-manager, Newt, Argo CD, Kyverno, platform stack, Velero (optional) |
| `noble_landing_urls` | Writes `ansible/output/noble-lab-ui-urls.md` — URLs, service names, and (optional) Argo/Grafana passwords from Secrets |
| `noble_post_deploy` | Post-install reminders |
| `talos_bootstrap` | Genconfig-only (used by the older playbook) |
| `debian_baseline_hardening` | Baseline Debian hardening (SSH policy, sysctl profile, fail2ban, unattended upgrades) |
| `debian_maintenance` | Routine Debian maintenance tasks (updates, cleanup, reboot-on-required) |
| `debian_ssh_key_rotation` | Declarative `authorized_keys` rotation for server users |
| `proxmox_baseline` | Proxmox repo prep (community repos) and no-subscription warning suppression |
| `proxmox_maintenance` | Proxmox package maintenance (dist-upgrade, cleanup, reboot-on-required) |
| `proxmox_cluster` | Proxmox cluster bootstrap/join automation using `pvecm` |
## Debian server ops quick start
These playbooks are separate from the Talos/noble flow and target hosts in `debian_servers`.
- Copy `inventory/debian.example.yml` to `inventory/debian.yml` and update hosts/users.
- Update `group_vars/debian_servers.yml` with your allowed SSH users and real public keys.
- Run with the Debian inventory:
```sh
cd ansible
ansible-playbook -i inventory/debian.yml playbooks/debian_harden.yml
ansible-playbook -i inventory/debian.yml playbooks/debian_rotate_ssh_keys.yml
ansible-playbook -i inventory/debian.yml playbooks/debian_maintenance.yml
```
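A minimal `inventory/debian.yml` for the commands above might look like this. The hostnames, IPs, and user are placeholders; `inventory/debian.example.yml` remains the authoritative template:

```yaml
# inventory/debian.yml — minimal sketch; all values are placeholders
debian_servers:
  hosts:
    deb1:
      ansible_host: 192.168.50.40
    deb2:
      ansible_host: 192.168.50.41
  vars:
    ansible_user: admin
```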
Or run the combined maintenance pipeline:
```sh
cd ansible
ansible-playbook -i inventory/debian.yml playbooks/debian_ops.yml
```
## Proxmox host + cluster quick start
These playbooks are separate from the Talos/noble flow and target hosts in `proxmox_hosts`.
- Copy `inventory/proxmox.example.yml` to `inventory/proxmox.yml` and update hosts/users.
- Update `group_vars/proxmox_hosts.yml` with your cluster name (`proxmox_cluster_name`), the chosen cluster master, and the root public key file paths to install.
- First run (no SSH keys yet): use `--ask-pass` or set `ansible_password` (prefer Ansible Vault). Keep `ansible_ssh_common_args: "-o StrictHostKeyChecking=accept-new"` in the inventory for first-contact hosts.
- Run prepare first to install your public keys on each host, then continue:
```sh
cd ansible
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_prepare.yml --ask-pass
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_upgrade.yml
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_cluster.yml
```
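A sketch of the `group_vars/proxmox_hosts.yml` values these playbooks read. The variable names come from this README; the values shown are assumptions:

```yaml
# group_vars/proxmox_hosts.yml — illustrative values only
proxmox_cluster_name: homelab            # used only when creating a new cluster
proxmox_root_authorized_key_files:
  - ~/.ssh/id_ed25519.pub                # installed for root by proxmox_prepare.yml
# Only needed if pvecm add still prompts for the master root password (prefer Vault):
# proxmox_cluster_master_root_password: "{{ vault_proxmox_root_pw }}"
```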
After `proxmox_prepare.yml` finishes, SSH key auth should work for root (keys from `proxmox_root_authorized_key_files`), so `--ask-pass` is usually no longer needed.

If `pvecm add` still prompts for the master root password during join, set `proxmox_cluster_master_root_password` (prefer Vault) to run the join non-interactively.

Changing `proxmox_cluster_name` only affects new cluster creation; it does not rename an already-created cluster.
Or run the full Proxmox pipeline:
```sh
cd ansible
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_ops.yml
```
## Migrating from Argo-managed `noble-platform`
```sh
kubectl delete application -n argocd noble-platform noble-kyverno noble-kyverno-policies --ignore-not-found
kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
```
Then run `playbooks/noble.yml` so the Helm state matches the git values.