diff --git a/ansible/README.md b/ansible/README.md
index 555e815..74c1adf 100644
--- a/ansible/README.md
+++ b/ansible/README.md
@@ -2,7 +2,7 @@
 
 **Narrative walkthrough (Proxmox → Talos → platform → Argo):** [`docs/ansible-getting-started.md`](../docs/ansible-getting-started.md).
 
-Automates [`talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md): optional **Talos Phase A** (genconfig → apply → bootstrap → kubeconfig), then **Phase B+** (CNI → add-ons → ingress → Argo CD → Kyverno → observability → Trivy, etc.). **Argo CD** does not reconcile core charts — optional GitOps starts from an empty [`clusters/noble/apps/kustomization.yaml`](../clusters/noble/apps/kustomization.yaml).
+Automates [`talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md): optional **Talos Phase A** (genconfig → apply → bootstrap → kubeconfig), then **Phase B+** (CNI → add-ons → ingress → Argo CD → Kyverno → observability, etc.). **Trivy Operator** is installed via Argo (**`noble-trivy-operator`** app-of-apps), not **`noble.yml`**. **Argo CD** does not reconcile core charts — optional GitOps starts from an empty [`clusters/noble/apps/kustomization.yaml`](../clusters/noble/apps/kustomization.yaml).
 
 ## Order of operations
 
@@ -75,7 +75,6 @@ Override with `-e` when needed, e.g. **`-e noble_talos_skip_bootstrap=true`** if
 
 ```bash
 ansible-playbook playbooks/noble.yml --tags cilium,metallb
-ansible-playbook playbooks/noble.yml --tags trivy
 ansible-playbook playbooks/noble.yml --skip-tags newt
 ansible-playbook playbooks/noble.yml --tags velero -e noble_velero_install=true -e noble_velero_s3_bucket=... -e noble_velero_s3_url=...
 ansible-playbook playbooks/noble.yml --tags authentik -e noble_authentik_install=true
@@ -92,7 +91,7 @@ ansible-playbook playbooks/noble.yml --tags authentik -e noble_authentik_install
 |------|----------|
 | `talos_phase_a` | Talos genconfig, apply-config, bootstrap, kubeconfig |
 | `helm_repos` | `helm repo add` / `update` |
-| `noble_*` | Cilium, CSI Volume Snapshot CRDs + controller, metrics-server, Longhorn, MetalLB (20m Helm wait), kube-vip, Traefik, cert-manager, Newt, Argo CD, Kyverno, platform stack, **Authentik** (optional OIDC), **Trivy Operator**, Velero (optional) |
+| `noble_*` | Cilium, CSI Volume Snapshot CRDs + controller, metrics-server, Longhorn, MetalLB (20m Helm wait), kube-vip, Traefik, cert-manager, Newt, Argo CD, Kyverno, platform stack, **Authentik** (optional OIDC), Velero (optional). **Trivy Operator:** Argo leaf **`noble-trivy-operator`** (see `clusters/noble/bootstrap/argocd/app-of-apps/`); role **`noble_trivy`** is not invoked by **`noble.yml`**. |
 | `noble_landing_urls` | Writes **`ansible/output/noble-lab-ui-urls.md`** — URLs, service names, and (optional) Argo/Grafana passwords from Secrets |
 | `noble_post_deploy` | Post-install reminders |
 | `talos_bootstrap` | Genconfig-only (used by older playbook) |
diff --git a/ansible/playbooks/noble.yml b/ansible/playbooks/noble.yml
index 8b04e81..bd32053 100644
--- a/ansible/playbooks/noble.yml
+++ b/ansible/playbooks/noble.yml
@@ -4,8 +4,9 @@
 # Run from repo **ansible/** directory: ansible-playbook playbooks/noble.yml
 #
 # Tags: repos, cilium, csi_snapshot, metrics, longhorn, metallb, kube_vip, traefik, cert_manager, newt,
-# argocd, kyverno, kyverno_policies, platform, authentik, trivy, velero, landing, all (default)
+# argocd, kyverno, kyverno_policies, platform, authentik, velero, landing, all (default)
 # Argo leaf **Application** CRs are applied in play **tasks:** after **noble_velero** (Ansible Helm first, then GitOps).
+# Trivy Operator is **not** installed here — sync **noble-trivy-operator** from Argo (app-of-apps) after deploy.
 - name: Noble cluster — platform stack (Ansible-managed)
   hosts: localhost
   connection: local
@@ -231,13 +232,11 @@
       tags: [platform, observability, apps]
     - role: noble_authentik
       tags: [authentik, sso, oauth, oidc]
-    - role: noble_trivy
-      tags: [trivy, security, scanning]
     - role: noble_velero
       tags: [velero, backups]
 
   tasks:
-    # Leaf Application CRs must exist only after all Ansible Helm in this play (platform, authentik, trivy, …)
+    # Leaf Application CRs must exist only after all Ansible Helm in this play (platform, authentik, velero, …)
     # so argocd-controller does not SSA resources before Helm owns them; then Argo can take over (manual → auto).
     - name: Apply Argo CD root / bootstrap / leaf Application manifests (post–Ansible Helm)
       ansible.builtin.include_role:
diff --git a/ansible/roles/helm_repos/defaults/main.yml b/ansible/roles/helm_repos/defaults/main.yml
index 1e57ce4..8349942 100644
--- a/ansible/roles/helm_repos/defaults/main.yml
+++ b/ansible/roles/helm_repos/defaults/main.yml
@@ -14,6 +14,5 @@ noble_helm_repos:
   - { name: headlamp, url: "https://kubernetes-sigs.github.io/headlamp/" }
   - { name: kyverno, url: "https://kyverno.github.io/kyverno/" }
   - { name: vmware-tanzu, url: "https://vmware-tanzu.github.io/helm-charts" }
-  - { name: aqua, url: "https://aquasecurity.github.io/helm-charts/" }
   - { name: goauthentik, url: "https://charts.goauthentik.io" }
   - { name: oauth2-proxy, url: "https://oauth2-proxy.github.io/manifests" }
diff --git a/ansible/roles/noble_argocd/tasks/applications_post_platform.yml b/ansible/roles/noble_argocd/tasks/applications_post_platform.yml
index 1729705..4636fa2 100644
--- a/ansible/roles/noble_argocd/tasks/applications_post_platform.yml
+++ b/ansible/roles/noble_argocd/tasks/applications_post_platform.yml
@@ -1,7 +1,7 @@
 ---
-# Run from **ansible/playbooks/noble.yml** *after* roles **noble_platform**, **noble_authentik**, **noble_trivy**,
-# **noble_velero** (see play **tasks:**). Leaf **Application** CRs must not be reconciled before Ansible Helm
-# finishes, or **argocd-controller** can SSA resources without Helm release metadata (e.g. Trivy ServiceAccount).
+# Run from **ansible/playbooks/noble.yml** *after* roles **noble_platform**, **noble_authentik**, **noble_velero**
+# (see play **tasks:**). Leaf **Application** CRs must not be reconciled before Ansible Helm finishes, or
+# **argocd-controller** can SSA resources without Helm release metadata (e.g. chart-owned ServiceAccounts).
 - name: Apply Argo CD root Application (app-of-apps)
   ansible.builtin.command:
     argv:
diff --git a/ansible/roles/noble_post_deploy/tasks/main.yml b/ansible/roles/noble_post_deploy/tasks/main.yml
index d1d8540..ac08539 100644
--- a/ansible/roles/noble_post_deploy/tasks/main.yml
+++ b/ansible/roles/noble_post_deploy/tasks/main.yml
@@ -9,7 +9,7 @@
 - name: Argo CD optional root Application (empty app-of-apps)
   ansible.builtin.debug:
     msg: >-
-      App-of-apps: at the **end** of **noble.yml** (after **noble_platform**, **noble_authentik**, **noble_trivy**,
+      App-of-apps: at the **end** of **noble.yml** (after **noble_platform**, **noble_authentik**,
       **noble_velero**), **noble_argocd** `applications_post_platform.yml` runs: root-application.yaml when
       noble_argocd_apply_root_application is true; bootstrap-root + **kubectl apply -k argocd/app-of-apps** when
       noble_argocd_apply_bootstrap_root_application is true (inventory/group_vars/all.yml).
diff --git a/clusters/noble/bootstrap/argocd/README.md b/clusters/noble/bootstrap/argocd/README.md
index f3a74fe..b603315 100644
--- a/clusters/noble/bootstrap/argocd/README.md
+++ b/clusters/noble/bootstrap/argocd/README.md
@@ -52,13 +52,13 @@ Use **Settings → Repositories** in the UI, or `argocd repo add` / a `Secret` o
 
 ## 4. App-of-apps (GitOps)
 
-**Ansible** (`ansible/playbooks/noble.yml`) runs **`kubectl apply -k clusters/noble/bootstrap`** from **`noble_platform`**, then Helm for the platform stack, **then** **`noble_authentik`**, **`noble_trivy`**, **`noble_velero`**, and **only then** (play **`tasks:`**) **`noble_argocd`** `applications_post_platform.yml` applies **`root-application.yaml`**, **`bootstrap-root-application.yaml`**, and **`kubectl apply -k clusters/noble/bootstrap/argocd/app-of-apps`**. That order keeps **Ansible Helm first** and lets Argo **take ownership** when you sync or enable automation (no premature SSA vs Helm).
+**Ansible** (`ansible/playbooks/noble.yml`) runs **`kubectl apply -k clusters/noble/bootstrap`** from **`noble_platform`**, then Helm for the platform stack, **then** **`noble_authentik`**, **`noble_velero`**, and **only then** (play **`tasks:`**) **`noble_argocd`** `applications_post_platform.yml` applies **`root-application.yaml`**, **`bootstrap-root-application.yaml`**, and **`kubectl apply -k clusters/noble/bootstrap/argocd/app-of-apps`**. **Trivy Operator** is **not** installed by Ansible; sync the **`noble-trivy-operator`** leaf app (or enable automation) after **`noble.yml`**. That order keeps **Ansible Helm first** and lets Argo **take ownership** when you sync or enable automation (no premature SSA vs Helm).
 
 1. Edit **`root-application.yaml`** and **`bootstrap-root-application.yaml`**: set **`repoURL`** and **`targetRevision`**. The **`resources-finalizer.argocd.argoproj.io/background`** finalizer uses Argo’s path-qualified form so **`kubectl apply`** does not warn about finalizer names.
 2. Optional add-on apps: add **`Application`** manifests under **`clusters/noble/apps/`** (see **`clusters/noble/apps/README.md`**).
 3. **Bootstrap kustomize** (namespaces, datasource, etc.): **`noble-bootstrap-root`** syncs **`clusters/noble/bootstrap`** (no **`argocd/app-of-apps/`** in that kustomization). Leaf **`Application`** manifests live under **`argocd/app-of-apps/`**; Ansible applies that directory **after** all **`noble_*`** Helm roles in **`noble.yml`** (see §4) so Argo does not SSA charts before Helm. The root app uses **manual** sync; each leaf app is **manual** until you enable automation (see **§5**).
 
-   **`ansible/playbooks/noble.yml`**: roles **`noble_argocd`** (Argo Helm only), **`noble_platform`**, **`noble_authentik`**, **`noble_trivy`**, **`noble_velero`**, then play **`tasks`** run **`applications_post_platform`** when **`noble_argocd_apply_*`** flags are set in **`ansible/inventory/group_vars/all.yml`**.
+   **`ansible/playbooks/noble.yml`**: roles **`noble_argocd`** (Argo Helm only), **`noble_platform`**, **`noble_authentik`**, **`noble_velero`**, then play **`tasks`** run **`applications_post_platform`** when **`noble_argocd_apply_*`** flags are set in **`ansible/inventory/group_vars/all.yml`**. Trivy is deployed only via Argo (**`noble-trivy-operator`**).
 
    ```bash
    kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
@@ -67,9 +67,15 @@ Use **Settings → Repositories** in the UI, or `argocd repo add` / a `Secret` o
 
 If you migrated from older GitOps **`Application`** names, delete stale **`Application`** objects on the cluster (see **`clusters/noble/apps/README.md`**) then re-apply the roots.
 
+**Trivy (`noble-trivy-operator`):** If an older install left an orphan **`ServiceMonitor`** named **`trivy-operator`** in **`monitoring`** (missing `meta.helm.sh/release-*` annotations), Helm/Argo will refuse to adopt it. Delete once, then sync **`noble-trivy-operator`**:
+
+```bash
+kubectl delete servicemonitor trivy-operator -n monitoring --ignore-not-found
+```
+
 ## 5. After Ansible: enable automated sync for **noble-bootstrap-root**
 
-Do this only after **`ansible-playbook playbooks/noble.yml`** has finished successfully (including **`noble_platform`** / **`noble_authentik`** / **`noble_trivy`** Helm and the final **`applications_post_platform`** `kubectl apply` of leaf **Application** CRs). Until then, leave **manual** sync so Argo does not fight the playbook.
+Do this only after **`ansible-playbook playbooks/noble.yml`** has finished successfully (including **`noble_platform`** / **`noble_authentik`** Helm, **`noble_velero`** if enabled, and the final **`applications_post_platform`** `kubectl apply` of leaf **Application** CRs). Until then, leave **manual** sync so Argo does not fight the playbook.
 
 **Required steps**
 
@@ -99,7 +105,7 @@ Do this only after **`ansible-playbook playbooks/noble.yml`** has finished succe
 
 5. Trigger a sync if the app does not go green immediately: **Sync** in the UI, or `argocd app sync noble-bootstrap-root`.
 
-6. **Leaf apps** (`noble-cilium`, `noble-kube-prometheus`, … under **`app-of-apps/`**) stay **manual** until you turn on **AUTO-SYNC** (or sync once) **per app** after Ansible has finished. Until then they only register intent in Argo; **Ansible** still performs the Helm installs in **`noble_*`** roles. When you are ready for Argo to own a chart, enable sync for that leaf app and **remove** the corresponding **`helm upgrade`** task from Ansible so only one controller manages the release.
+6. **Leaf apps** (`noble-cilium`, `noble-kube-prometheus`, … under **`app-of-apps/`**) stay **manual** until you turn on **AUTO-SYNC** (or sync once) **per app** after Ansible has finished. Until then they only register intent in Argo; **Ansible** still performs the Helm installs in **`noble_*`** roles for those charts (**Trivy Operator** is an exception — install/sync only via **`noble-trivy-operator`**). When you are ready for Argo to own a chart, enable sync for that leaf app and **remove** the corresponding **`helm upgrade`** task from Ansible so only one controller manages the release.
 
 If **`helm upgrade`** failed with **conflict with `argocd-controller`**, a leaf app had already reconciled: apply the updated manifests (manual leaf sync), delete the conflicting **`Application`** with **`--cascade=false`** if needed, then re-run the playbook — or finish migration to Argo-only for that chart.
 
diff --git a/clusters/noble/bootstrap/argocd/app-of-apps/trivy-operator-application.yaml b/clusters/noble/bootstrap/argocd/app-of-apps/trivy-operator-application.yaml
index 81e748d..bce2639 100644
--- a/clusters/noble/bootstrap/argocd/app-of-apps/trivy-operator-application.yaml
+++ b/clusters/noble/bootstrap/argocd/app-of-apps/trivy-operator-application.yaml
@@ -22,7 +22,7 @@ spec:
   destination:
     server: https://kubernetes.default.svc
     namespace: trivy-system
-  # Manual sync: Ansible helm runs first; enable automation after cutover (see ../README.md §5).
+  # Manual sync after **noble.yml**: install Trivy via Argo only (not Ansible). Enable automation after cutover (../README.md §5).
   syncPolicy:
     syncOptions:
       - CreateNamespace=true
diff --git a/clusters/noble/bootstrap/kustomization.yaml b/clusters/noble/bootstrap/kustomization.yaml
index f6a12d9..c29d1d9 100644
--- a/clusters/noble/bootstrap/kustomization.yaml
+++ b/clusters/noble/bootstrap/kustomization.yaml
@@ -1,7 +1,7 @@
 # Ansible **noble_platform**: `kubectl apply -k` this directory (namespaces + static YAML only).
 # Leaf Argo **Application** manifests live under **argocd/app-of-apps/** and are applied at the **end**
 # of **ansible/playbooks/noble.yml** (play **tasks:** → **noble_argocd** `applications_post_platform.yml`) so
-# **argocd-controller** does not SSA chart resources before **helm upgrade** (platform, authentik, trivy, …).
+# **argocd-controller** does not SSA chart resources before **helm upgrade** (platform, authentik, velero, …).
 #
 # **noble-bootstrap-root** syncs this same path for GitOps on namespaces/datasource/VolumeSnapshotClass.
 # Per-chart GitOps: each **noble-*** app under **argocd/app-of-apps/** (manual sync until you cut over).
diff --git a/clusters/noble/bootstrap/trivy/namespace.yaml b/clusters/noble/bootstrap/trivy/namespace.yaml
index c49863c..c430bb4 100644
--- a/clusters/noble/bootstrap/trivy/namespace.yaml
+++ b/clusters/noble/bootstrap/trivy/namespace.yaml
@@ -1,4 +1,4 @@
-# Trivy Operator — apply before Helm (Ansible **noble_trivy**).
+# Trivy Operator — namespace + PSA; applied with **noble_platform** bootstrap kustomize before Argo syncs the chart.
 # Scan jobs may use elevated capabilities; align with other operator namespaces.
 apiVersion: v1
 kind: Namespace
diff --git a/clusters/noble/bootstrap/trivy/values.yaml b/clusters/noble/bootstrap/trivy/values.yaml
index 92b25d5..aa4645c 100644
--- a/clusters/noble/bootstrap/trivy/values.yaml
+++ b/clusters/noble/bootstrap/trivy/values.yaml
@@ -1,5 +1,7 @@
 # Trivy Operator — in-cluster image vulnerability + config reports (Aqua trivy-operator Helm chart).
+# Deploy via Argo CD: **noble-trivy-operator** (`clusters/noble/bootstrap/argocd/app-of-apps/trivy-operator-application.yaml`).
 #
+# Manual Helm (if not using Argo):
 # helm repo add aqua https://aquasecurity.github.io/helm-charts/ && helm repo update
 # kubectl apply -f clusters/noble/bootstrap/trivy/namespace.yaml
 # helm upgrade --install trivy-operator aqua/trivy-operator -n trivy-system \
diff --git a/docs/ansible-getting-started.md b/docs/ansible-getting-started.md
index 7588886..e84e4bf 100644
--- a/docs/ansible-getting-started.md
+++ b/docs/ansible-getting-started.md
@@ -188,7 +188,7 @@ Important mental model from [`clusters/noble/apps/README.md`](../clusters/noble/
 
 ### 4.1 What Ansible already does for Argo
 
-At the **end** of **`noble.yml`**, after all Helm roles (including **`noble_platform`**, **`noble_authentik`**, **`noble_trivy`**, **`noble_velero`**), the play runs **`noble_argocd`** task file **`applications_post_platform.yml`**, which applies:
+At the **end** of **`noble.yml`**, after all Ansible Helm roles (**`noble_platform`**, **`noble_authentik`**, **`noble_velero`** when enabled), the play runs **`noble_argocd`** task file **`applications_post_platform.yml`**, which applies:
 
 - **`clusters/noble/bootstrap/argocd/root-application.yaml`** when **`noble_argocd_apply_root_application`** is true.
 - **`bootstrap-root-application.yaml`** and **`kubectl apply -k clusters/noble/bootstrap/argocd/app-of-apps`** when **`noble_argocd_apply_bootstrap_root_application`** is true.