# Ansible — noble cluster
Automates [`talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md): optional **Talos Phase A** (genconfig → apply → bootstrap → kubeconfig), then **Phase B+** (CNI → add-ons → ingress → Argo CD → Kyverno → observability, etc.). **Argo CD** does not reconcile core charts — optional GitOps starts from an empty [`clusters/noble/apps/kustomization.yaml`](../clusters/noble/apps/kustomization.yaml).
## Order of operations
1. **From `talos/`:** `talhelper gensecret` / `talsecret` as in [`talos/README.md`](../talos/README.md) §1 (if not already done).
2. **Talos Phase A (automated):** run [`playbooks/talos_phase_a.yml`](playbooks/talos_phase_a.yml) **or** the full pipeline [`playbooks/deploy.yml`](playbooks/deploy.yml). This runs **`talhelper genconfig -o out`**, **`talosctl apply-config`** on each node, **`talosctl bootstrap`**, and **`talosctl kubeconfig`** → **`talos/kubeconfig`**.
3. **Platform stack:** [`playbooks/noble.yml`](playbooks/noble.yml) (included at the end of **`deploy.yml`**).

Your workstation must be able to reach **node IPs on the lab LAN** (Talos API **:50000** for `talosctl`, Kubernetes **:6443** for `kubectl` / Helm). If `kubectl` cannot reach the VIP (`192.168.50.230`), use `-e 'noble_k8s_api_server_override=https://<control-plane-ip>:6443'` on **`noble.yml`** (see `group_vars/all.yml`).

**One-shot full deploy** (after nodes are booted and reachable):
```bash
cd ansible
ansible-playbook playbooks/deploy.yml
```
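For reference, Phase A is roughly the scripted equivalent of the following manual flow. This is a sketch only: the `<...>` placeholders and the per-node repetition are illustrative, not literal repo commands.

```bash
# Sketch of what Phase A automates; <node-ip> and <node>.yaml are placeholders.
talhelper genconfig -o out
# First boot (maintenance mode) uses --insecure; on already-joined nodes,
# drop it and point TALOSCONFIG at the generated talosconfig instead.
talosctl apply-config --insecure -n <node-ip> -f out/<node>.yaml   # repeat per node
talosctl bootstrap -n <control-plane-ip>
talosctl kubeconfig talos/kubeconfig -n <control-plane-ip>
```

The role's `auto` apply mode makes the insecure-vs-secure choice per node, so the playbook is normally the better entry point.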
## Deploy secrets (`.env`)
Copy **`.env.sample`** to **`.env`** at the repository root (`.env` is gitignored). At minimum, set **`CLOUDFLARE_DNS_API_TOKEN`** for cert-manager DNS-01. The **cert-manager** role applies it automatically during **`noble.yml`**. See **`.env.sample`** for optional placeholders (e.g. Newt/Pangolin).
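A minimal `.env` can therefore be a single line. The key name comes from `.env.sample`; the value shown here is a placeholder, not a real token:

```bash
# .env (repository root, gitignored)
CLOUDFLARE_DNS_API_TOKEN=<cloudflare-api-token-with-dns-edit-rights>
```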
## Prerequisites
- `talosctl` (matching the nodes' Talos version), `talhelper`, `helm`, `kubectl`.
- **Phase A:** same LAN/VPN as nodes so **Talos :50000** and **Kubernetes :6443** are reachable (see [`talos/README.md`](../talos/README.md) §3).
- **`noble.yml`:** a bootstrapped cluster and **`talos/kubeconfig`** (or `KUBECONFIG`).
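A quick preflight catches a missing binary before Ansible does. A hypothetical helper (`check_tools` is not part of the repo, just a sketch):

```bash
# Report every CLI that is missing from PATH; non-zero exit if any is absent.
check_tools() {
  local missing=0
  for tool in "$@"; do
    command -v "$tool" >/dev/null || { echo "missing: $tool"; missing=1; }
  done
  return "$missing"
}

# Example: check_tools talosctl talhelper helm kubectl
```

Version alignment still has to be checked by hand, e.g. `talosctl version` against a node.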
## Playbooks
| Playbook | Purpose |
|----------|---------|
| [`playbooks/deploy.yml`](playbooks/deploy.yml) | **Talos Phase A** then **`noble.yml`** (full automation). |
| [`playbooks/talos_phase_a.yml`](playbooks/talos_phase_a.yml) | `genconfig` → `apply-config` → `bootstrap` → `kubeconfig` only. |
| [`playbooks/noble.yml`](playbooks/noble.yml) | Helm + `kubectl` platform (after Phase A). |
| [`playbooks/post_deploy.yml`](playbooks/post_deploy.yml) | Vault / ESO reminders (`noble_apply_vault_cluster_secret_store`). |
| [`playbooks/talos_bootstrap.yml`](playbooks/talos_bootstrap.yml) | **`talhelper genconfig` only** (legacy shortcut; prefer **`talos_phase_a.yml`**). |
```bash
cd ansible
export KUBECONFIG=/absolute/path/to/home-server/talos/kubeconfig
# noble.yml only — if VIP is unreachable from this host:
# ansible-playbook playbooks/noble.yml -e 'noble_k8s_api_server_override=https://192.168.50.20:6443'
ansible-playbook playbooks/noble.yml
ansible-playbook playbooks/post_deploy.yml
```
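Before running `noble.yml`, it is worth confirming which endpoint the kubeconfig actually reaches; for example (the relative path assumes you are in `ansible/`):

```bash
# Lists the cluster nodes; a timeout here usually means the VIP
# (192.168.50.230) is unreachable and the override above is needed.
kubectl --kubeconfig ../talos/kubeconfig get nodes -o wide
```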
### Talos Phase A variables (role `talos_phase_a` defaults)
Override with `-e` when needed, e.g. **`-e noble_talos_skip_bootstrap=true`** if etcd is already initialized.

| Variable | Default | Meaning |
|----------|---------|---------|
| `noble_talos_genconfig` | `true` | Run **`talhelper genconfig -o out`** first. |
| `noble_talos_apply_mode` | `auto` | **`auto`**: a **`talosctl apply-config --dry-run`** against the first node picks maintenance mode (**`--insecure`**) vs. an already-joined node (**`TALOSCONFIG`**); **`insecure`** / **`secure`** force option A or B of [`talos/README.md`](../talos/README.md) §2. |
| `noble_talos_skip_bootstrap` | `false` | Skip **`talosctl bootstrap`**. If etcd is **already** initialized, bootstrap is treated as a no-op (the **`talosctl`** "etcd data directory is not empty" case). |
| `noble_talos_apid_wait_delay` / `noble_talos_apid_wait_timeout` | `20` / `900` | Seconds to wait for **apid :50000** on the bootstrap node after **apply-config** (nodes reboot). Increase if bootstrap hits **connection refused** to `:50000`. |
| `noble_talos_nodes` | neon/argon/krypton/helium | IP + **`out/*.yaml`** filename — align with **`talos/talconfig.yaml`**. |
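The apid wait behaves roughly like the sketch below. This is an illustration only: it assumes plain `bash` with `/dev/tcp`, while the role performs the real wait from Ansible.

```bash
# Wait for apid on <node>:50000 — sleep an initial delay (nodes are
# rebooting), then probe every 5 s until success or the timeout expires.
wait_for_apid() {
  local node=$1 delay=${2:-20} timeout=${3:-900}
  sleep "$delay"
  local deadline=$(( SECONDS + timeout ))
  until (exec 3<>"/dev/tcp/${node}/50000") 2>/dev/null; do
    (( SECONDS >= deadline )) && return 1   # gave up: apid never answered
    sleep 5
  done
}
```

Raising `noble_talos_apid_wait_timeout` is the equivalent of a larger `timeout` here.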
### Tags (partial runs)
```bash
ansible-playbook playbooks/noble.yml --tags cilium,metallb
ansible-playbook playbooks/noble.yml --skip-tags newt
ansible-playbook playbooks/noble.yml --tags velero -e noble_velero_install=true -e noble_velero_s3_bucket=... -e noble_velero_s3_url=...
```
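`ansible-playbook --list-tags` shows which tags a play exposes without executing anything, which helps when composing a partial run:

```bash
# Dry inspection only; no tasks run.
ansible-playbook playbooks/noble.yml --list-tags
```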
### Variables — `group_vars/all.yml`
- **`noble_newt_install`**, **`noble_velero_install`**, **`noble_cert_manager_require_cloudflare_secret`**, **`noble_apply_vault_cluster_secret_store`**, **`noble_k8s_api_server_override`**, **`noble_k8s_api_server_auto_fallback`**, **`noble_k8s_api_server_fallback`**, **`noble_skip_k8s_health_check`**.
## Roles
| Role | Contents |
|------|----------|
| `talos_phase_a` | Talos genconfig, apply-config, bootstrap, kubeconfig |
| `helm_repos` | `helm repo add` / `update` |
| `noble_*` | Cilium, CSI Volume Snapshot CRDs + controller, metrics-server, Longhorn, MetalLB (20m Helm wait), kube-vip, Traefik, cert-manager, Newt, Argo CD, Kyverno, platform stack, Velero (optional) |
| `noble_landing_urls` | Writes **`ansible/output/noble-lab-ui-urls.md`** — URLs, service names, and (optional) Argo/Grafana passwords from Secrets |
| `noble_post_deploy` | Post-install reminders |
| `talos_bootstrap` | Genconfig only (used by the older playbook) |
## Migrating from Argo-managed `noble-platform`
```bash
kubectl delete application -n argocd noble-platform noble-kyverno noble-kyverno-policies --ignore-not-found
kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
```
Then run `playbooks/noble.yml` so the Helm state matches the values in git.
|