# Ansible getting started: Proxmox → Talos → cluster → Argo CD

This guide walks through the **intended order** for this repository: prepare **Proxmox VE** hosts and optionally form a **Proxmox cluster**, bring up **Talos** nodes and the **Kubernetes** control plane, install the **platform stack** with Ansible, then hand ongoing **bootstrap** configuration to **Argo CD** when you are ready.

Shorter reference tables and variable lists live in [`ansible/README.md`](../ansible/README.md). Deep operational detail for Talos and the noble lab checklist are in [`talos/README.md`](../talos/README.md) and [`talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md). Argo-specific sequencing is in [`clusters/noble/bootstrap/argocd/README.md`](../clusters/noble/bootstrap/argocd/README.md).

---

## What runs where

| Layer | What automates it | Where it runs |
|--------|-------------------|----------------|
| Proxmox hosts (repos, keys, upgrades, `pvecm`) | `proxmox_*.yml` playbooks | SSH to `proxmox_hosts` in `ansible/inventory/proxmox.yml` |
| Talos machine config, bootstrap, admin kubeconfig | `playbooks/talos_phase_a.yml` (or `deploy.yml` first half) | **Localhost**; needs LAN/VPN to node IPs (`:50000`, later `:6443`) |
| CNI, storage, ingress, cert-manager, Argo CD install, observability, policy, … | `playbooks/noble.yml` (or `deploy.yml` second half) | **Localhost**; uses `kubectl` / Helm against `KUBECONFIG` |
| Post-install reminders | `playbooks/post_deploy.yml` | Localhost |

The default Ansible inventory for Talos/noble playbooks is [`ansible/inventory/localhost.yml`](../ansible/inventory/localhost.yml) (`ansible.cfg` points there). **Proxmox** playbooks use **`-i inventory/proxmox.yml`** explicitly.

---

## Prerequisites (all phases)

On the machine that runs Ansible (your workstation or a bastion):

- **Ansible** (version compatible with the playbooks in this repo).
- **SSH** access to Proxmox hosts when running Proxmox playbooks.
- For Talos and Kubernetes phases: **same L2/L3 path** to lab node IPs (and eventually the API VIP) as documented in [`talos/README.md`](../talos/README.md) §3.
- **Talos tooling:** `talosctl` (version aligned with the node image), **`talhelper`**, **`kubectl`**, **`helm`**.

Optional but common for this repo:

- **SOPS** + **age** if you use encrypted manifests under `clusters/noble/secrets/` (see `clusters/noble/secrets/README.md`).
- Repository root **`.env`** copied from [`.env.sample`](../.env.sample) for cert-manager (Cloudflare DNS-01) and other optional components.

---

## 1. Proxmox: hosts and VE cluster

These steps are **independent** of Talos and Kubernetes. They configure community repositories, routine upgrades, SSH keys for `root`, and optionally create a **Proxmox VE cluster** (`pvecm`).

### 1.1 Inventory and variables

1. Copy the example inventory:

   ```bash
   cp ansible/inventory/proxmox.example.yml ansible/inventory/proxmox.yml
   ```

2. Edit `ansible/inventory/proxmox.yml`: set `ansible_host` and `ansible_user` (typically `root`). For the first login, before key auth works, use either **`--ask-pass`** or `ansible_password` (prefer **Ansible Vault** for passwords).

3. Edit [`ansible/inventory/group_vars/proxmox_hosts.yml`](../ansible/inventory/group_vars/proxmox_hosts.yml):
   - **`proxmox_cluster_name`**: name for a **new** Proxmox cluster (changing it later does not rename an existing cluster).
   - **`proxmox_cluster_master`**: inventory host name of the first node that runs `pvecm create`; leave empty only if the default ordering matches your intent (see role defaults).
   - **`proxmox_root_authorized_key_files`**: public keys installed for `root` (after prepare, password login is usually unnecessary).
   - **`proxmox_cluster_master_root_password`**: only if `pvecm add` still needs the master’s root password for joins; store with Vault in real environments.

Repo variables for the Debian codename and subscription notices are already set in that file; adjust **`proxmox_repo_debian_codename`** if your PVE major version tracks a different Debian base.

### 1.2 Run order

From the **`ansible/`** directory, targeting Proxmox:

```bash
cd ansible
# First contact: often need --ask-pass until SSH keys are installed
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_prepare.yml --ask-pass
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_upgrade.yml
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_cluster.yml
```

- **`proxmox_prepare.yml`** runs the `proxmox_baseline` role (community repos, suppress no-subscription UI nag, install your **`authorized_keys`** for root).
- **`proxmox_upgrade.yml`** runs maintenance (dist-upgrade, cleanup, reboot when required) **serially**, one host at a time.
- **`proxmox_cluster.yml`** bootstraps or joins the Proxmox cluster **serially**.

Convenience wrapper that runs the same three steps in order:

```bash
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_ops.yml
```

After this phase you should have stable **Proxmox** hosts (and optionally a single **Proxmox cluster**) for creating the Talos VMs or bare-metal install targets. Creating those VMs or ISO boot entries is **outside** these playbooks; align disks and networks with [`talos/talconfig.yaml`](../talos/talconfig.yaml) and [`talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md) inventory.

---

## 2. Talos: secrets, generated configs, and Phase A

Talos automation assumes **`talos/talconfig.yaml`** (and secrets) describe your nodes. Ansible **does not** replace reading [`talos/README.md`](../talos/README.md): order matters (**genconfig → apply all nodes → bootstrap**), and **`--insecure`** is only for maintenance-mode APIs.
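Before starting the Talos steps, it may help to confirm that the tools listed under Prerequisites are on `PATH`. The following is a minimal sketch, not part of this repository: `missing_tools` is a hypothetical helper name, and the tool list in the usage comment is taken from the Prerequisites section.

```bash
# Hypothetical helper (not shipped with this repo).
# Prints the name of each given tool that is not found on PATH,
# one per line; prints nothing when every tool is present.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example call, using the tools named under Prerequisites:
#   missing_tools ansible-playbook talosctl talhelper kubectl helm
```

An empty result only shows that the binaries exist; it does not check that the `talosctl` version matches the node image.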
### 2.1 One-time (or when rotating machine config)

From the **`talos/`** directory:

```bash
cd talos
talhelper gensecret > talsecret.yaml
# talhelper validate talconfig talconfig.yaml  # after edits
```

Do **not** commit `talsecret.yaml`, `talos/out/`, or `talos/kubeconfig`. If you use SOPS for secrets, follow talhelper and repo docs for encrypted variants.

### 2.2 Automated Phase A (recommended)

[`ansible/playbooks/talos_phase_a.yml`](../ansible/playbooks/talos_phase_a.yml) runs the **`talos_phase_a`** role on **localhost**:

1. **`talhelper genconfig -o out`** (when `noble_talos_genconfig` is true).
2. **`talosctl apply-config`** for each entry in **`noble_talos_nodes`** (maintenance vs secure mode is auto-probed unless you force `noble_talos_apply_mode`).
3. **`talosctl bootstrap`** on the bootstrap node (unless `noble_talos_skip_bootstrap` is true or etcd is already initialized).
4. **`talosctl kubeconfig`**, writing **`talos/kubeconfig`** under the repo root (path set in the playbook).

Run from **`ansible/`**:

```bash
cd ansible
ansible-playbook playbooks/talos_phase_a.yml
```

Override IPs, machine filenames, or timing when your lab differs from [`ansible/roles/talos_phase_a/defaults/main.yml`](../ansible/roles/talos_phase_a/defaults/main.yml), for example:

```bash
ansible-playbook playbooks/talos_phase_a.yml \
  -e 'noble_talos_bootstrap_node_ip=192.168.50.20' \
  -e 'noble_talos_kubeconfig_endpoint=192.168.50.20'
```

If etcd is already bootstrapped and you only need apply/kubeconfig:

```bash
ansible-playbook playbooks/talos_phase_a.yml -e 'noble_talos_skip_bootstrap=true'
```

**Legacy:** `playbooks/talos_bootstrap.yml` only runs genconfig via `talos_bootstrap`; prefer **`talos_phase_a.yml`** for a full bring-up.

### 2.3 Sanity checks before the platform playbook

- **`talos/kubeconfig`** exists (or export **`KUBECONFIG`** to your own path).
- From the same network path you will use for Helm: **`kubectl get --raw /healthz`** returns **`ok`**. If the kubeconfig points at a VIP you cannot reach, see [`talos/README.md`](../talos/README.md) §3 and use `noble_k8s_api_server_override` on **`noble.yml`** as in [`ansible/inventory/group_vars/all.yml`](../ansible/inventory/group_vars/all.yml).

---

## 3. Kubernetes cluster creation: platform install (`noble.yml`)

Here “cluster creation” means: **empty Talos nodes are now members of a Kubernetes cluster**, and you are installing **CNI, storage, load balancing, ingress, cert-manager, GitOps, observability, policy**, and related components from this repo’s Helm/kubectl roles.

[`ansible/playbooks/noble.yml`](../ansible/playbooks/noble.yml) is the main playbook. It sets **`KUBECONFIG`** from the environment or defaults to **`$REPO_ROOT/talos/kubeconfig`**, runs an API **`/healthz`** preflight (with optional VIP fallback), then applies roles in dependency order (for example **Cilium** before **MetalLB** / **kube-vip**, **Kyverno** before **Longhorn** as documented in the playbook comments).

### 3.1 Secrets and feature flags

- **`.env`** at repo root: at minimum **`CLOUDFLARE_DNS_API_TOKEN`** when [`noble_cert_manager_require_cloudflare_secret`](../ansible/inventory/group_vars/all.yml) is true, so cert-manager can create DNS-01 issuers.
- **[`ansible/inventory/group_vars/all.yml`](../ansible/inventory/group_vars/all.yml)** toggles optional components (`noble_newt_install`, `noble_velero_install`, `noble_authentik_install`, Argo root Application flags, API server override, etc.).
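A quick local check of the `.env` requirement above can catch a missing token before the playbook runs. This is a sketch under assumptions: `env_has_key` is a hypothetical helper, and it assumes `.env` uses plain `KEY=value` lines (optionally prefixed with `export`), which is the usual convention but is not confirmed here for `.env.sample`.

```bash
# Hypothetical helper (not shipped with this repo).
# Succeeds only if FILE contains an uncommented line that sets KEY
# to a non-empty value. Assumes KEY=value syntax, optionally with a
# leading "export" and optional quotes around the value.
env_has_key() {
  file=$1 key=$2
  [ -f "$file" ] || return 1
  grep -Eq "^(export[[:space:]]+)?${key}=[\"']?[^\"'[:space:]]" "$file"
}
```

From the repo root, something like `env_has_key .env CLOUDFLARE_DNS_API_TOKEN || echo "token missing"` would flag an absent or empty token. It does not validate the token against Cloudflare.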
### 3.2 Run

```bash
cd ansible
export KUBECONFIG=/absolute/path/to/home-server/talos/kubeconfig   # optional if default path is correct
ansible-playbook playbooks/noble.yml
```

If the kubeconfig targets the API VIP but this host can only reach a control-plane IP:

```bash
ansible-playbook playbooks/noble.yml -e 'noble_k8s_api_server_override=https://192.168.50.20:6443'
```

Partial runs use **tags** (see [`ansible/README.md`](../ansible/README.md)).

### 3.3 One-shot pipeline from Talos through platform

[`ansible/playbooks/deploy.yml`](../ansible/playbooks/deploy.yml) imports **`talos_phase_a.yml`** then **`noble.yml`**. Use it when nodes are booted and reachable and you want a single command after updating `talconfig`:

```bash
cd ansible
ansible-playbook playbooks/deploy.yml
```

---

## 4. Cutover to Argo CD for deployment and config

Important mental model from [`clusters/noble/apps/README.md`](../clusters/noble/apps/README.md) and [`clusters/noble/bootstrap/argocd/README.md`](../clusters/noble/bootstrap/argocd/README.md):

- **Core platform** (CNI, storage, ingress, cert-manager, observability stack, Kyverno, etc.) is installed by **`noble.yml`** from **`clusters/noble/bootstrap/`** via Helm and kubectl. **Argo CD does not reconcile those core Helm charts by default** (those leaves live under **`argocd/app-of-apps/`** and are applied after the Ansible Helm roles).
- **`noble-bootstrap-root`** tracks **`clusters/noble/bootstrap/`** (which **kustomize-includes** **`clusters/noble/apps/`**) for GitOps alignment with bootstrap kustomize and optional add-on **`Application`** manifests. Enable **automated** sync only **after** Ansible has finished so Argo does not fight Helm mid-play.
### 4.1 What Ansible already does for Argo

At the **end** of **`noble.yml`**, after all Ansible Helm roles (**`noble_platform`**, **`noble_authentik`**, **`noble_velero`** when enabled), the play runs the **`noble_argocd`** task file **`applications_post_platform.yml`**, which applies:

- **`bootstrap-root-application.yaml`** and **`kubectl apply -k clusters/noble/bootstrap/argocd/app-of-apps`** when **`noble_argocd_apply_bootstrap_root_application`** is true.

So the **bootstrap root Application CR** and **leaf Application** registrations typically already exist on the cluster after a successful **`noble.yml`**. They are created **last** on purpose so `argocd-application-controller` does not adopt resources before Helm installs them.

### 4.2 Before you enable GitOps automation

1. **Edit Git URLs** in **`bootstrap-root-application.yaml`**: set **`repoURL`** and **`targetRevision`** to your real remote and branch.
2. **Register the repository** in Argo CD (UI, `argocd repo add`, or a repository `Secret`) if it is private.
3. Leave **`noble-bootstrap-root`** on **manual** sync until Helm and the cluster match git (see **§5** in [`clusters/noble/bootstrap/argocd/README.md`](../clusters/noble/bootstrap/argocd/README.md)).
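To double-check step 1 above, the Git source fields can be printed straight from the manifest before anything is applied. This is only a sketch: `show_git_source` is a hypothetical helper, the manifest path is passed in by the caller, and the `grep` is a plain text match rather than a YAML parser.

```bash
# Hypothetical helper (not shipped with this repo).
# Prints the repoURL and targetRevision lines found in the given
# Argo CD Application manifest so they can be reviewed by eye.
show_git_source() {
  grep -E '^[[:space:]]*(repoURL|targetRevision):' "$1"
}
```

Call it with the path to your copy of `bootstrap-root-application.yaml`. Once the Application exists on the cluster, `kubectl get application noble-bootstrap-root -n argocd -o jsonpath='{.spec.syncPolicy}'` should show whether an automated sync policy is set.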
### 4.3 Enable automated sync for `noble-bootstrap-root`

After **`noble.yml`** completes successfully and you have refreshed the app in Argo, enable automated sync (prune + self-heal) using one of the methods documented in **§5** of [`clusters/noble/bootstrap/argocd/README.md`](../clusters/noble/bootstrap/argocd/README.md), for example:

```bash
kubectl patch application noble-bootstrap-root -n argocd --type merge \
  -p '{"spec":{"syncPolicy":{"automated":{"prune":true,"selfHeal":true},"syncOptions":["CreateNamespace=true"]}}}'
```

**Leaf** `Application` objects under **`clusters/noble/bootstrap/argocd/app-of-apps/`** remain **manual** until you intentionally turn on auto-sync **per chart**. When Argo should own a release, enable that leaf and **remove** the corresponding **`helm upgrade`** from Ansible so a single controller owns the release.

### 4.4 Optional apps repo path

Add only **additive** workloads under **`clusters/noble/apps/`** as `Application` manifests (see [`clusters/noble/apps/README.md`](../clusters/noble/apps/README.md)). **`kustomization.yaml`** may start empty; that is expected.

### 4.5 Post-deploy reminders

```bash
cd ansible
ansible-playbook playbooks/post_deploy.yml
```

This prints guidance about **SOPS** key handling and points back to the Argo README for sync policy.

### 4.6 Migrating from older Argo Application names

If you previously used different Application objects (for example a monolithic `noble-platform`), delete stale Applications as described in [`ansible/README.md`](../ansible/README.md) under **Migrating from Argo-managed `noble-platform`**, then re-apply the root manifests and reconcile with **`noble.yml`** if Helm drifted.
---

## Quick reference: minimal command sequence

```bash
# 1) Proxmox (from ansible/, with proxmox inventory)
cd ansible
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_ops.yml

# 2–3) Talos + Kubernetes platform (localhost inventory default)
ansible-playbook playbooks/deploy.yml

# 4) Reminders
ansible-playbook playbooks/post_deploy.yml
```

Then finish **Argo** cutover in the UI or CLI: register repo → refresh **`noble-bootstrap-root`** → enable **AUTO-SYNC** when ready → selectively enable leaf apps and retire overlapping Ansible Helm tasks.

---

## Related documentation

| Topic | Path |
|--------|------|
| Ansible overview | [`ansible/README.md`](../ansible/README.md) |
| Talos quick start | [`talos/README.md`](../talos/README.md) |
| Noble lab checklist | [`talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md) |
| Argo bootstrap and sync policy | [`clusters/noble/bootstrap/argocd/README.md`](../clusters/noble/bootstrap/argocd/README.md) |
| Optional Argo apps dir | [`clusters/noble/apps/README.md`](../clusters/noble/apps/README.md) |
| Deploy secrets | [`.env.sample`](../.env.sample) |