Update Ansible configuration to integrate SOPS for managing secrets. Enhance README.md with SOPS usage instructions and prerequisites. Remove External Secrets Operator references and related configurations from the bootstrap process, streamlining the deployment. Adjust playbooks and roles to apply SOPS-encrypted secrets automatically, improving security and clarity in secret management.

Nikholas Pcenicni
2026-03-30 22:42:52 -04:00
parent 023ebfee5d
commit 3a6e5dff5b
44 changed files with 644 additions and 809 deletions

@@ -24,6 +24,7 @@ Copy **`.env.sample`** to **`.env`** at the repository root (`.env` is gitignore
## Prerequisites
- `talosctl` (matches node Talos version), `talhelper`, `helm`, `kubectl`.
- **SOPS secrets:** `sops` and `age` on the control host if you use **`clusters/noble/secrets/`** with **`age-key.txt`** (see **`clusters/noble/secrets/README.md`**).
- **Phase A:** same LAN/VPN as nodes so **Talos :50000** and **Kubernetes :6443** are reachable (see [`talos/README.md`](../talos/README.md) §3).
- **noble.yml:** bootstrapped cluster and **`talos/kubeconfig`** (or `KUBECONFIG`).
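
The SOPS prerequisite above can be checked before a run. This is a minimal sketch, not part of the repository: the function name `check_sops_prereqs` is hypothetical, and the key path passed to it is whatever you use for **`age-key.txt`**. Per the role tasks in this commit, the SOPS apply step is skipped when the key file is absent, so a preflight like this may help surface a missing key early.

```bash
# Hypothetical preflight helper (assumption, not in the repo): reports what is
# missing before noble.yml tries to decrypt clusters/noble/secrets/*.yaml.
check_sops_prereqs() {
  key_file="$1"
  missing=0
  # Both CLIs are listed as prerequisites on the control host.
  for tool in sops age; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      missing=1
    fi
  done
  # The age private key is gitignored, so it must be copied in by hand.
  if [ ! -f "$key_file" ]; then
    echo "missing age key file: $key_file" >&2
    missing=1
  fi
  return "$missing"
}
```

For example, `check_sops_prereqs ../age-key.txt && ansible-playbook playbooks/noble.yml`, assuming the command runs from the Ansible directory and the key sits at the repository root.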
@@ -34,7 +35,7 @@ Copy **`.env.sample`** to **`.env`** at the repository root (`.env` is gitignore
| [`playbooks/deploy.yml`](playbooks/deploy.yml) | **Talos Phase A** then **`noble.yml`** (full automation). |
| [`playbooks/talos_phase_a.yml`](playbooks/talos_phase_a.yml) | `genconfig` → `apply-config` → `bootstrap` → `kubeconfig` only. |
| [`playbooks/noble.yml`](playbooks/noble.yml) | Helm + `kubectl` platform (after Phase A). |
| [`playbooks/post_deploy.yml`](playbooks/post_deploy.yml) | Vault / ESO reminders (`noble_apply_vault_cluster_secret_store`). |
| [`playbooks/post_deploy.yml`](playbooks/post_deploy.yml) | SOPS reminders and optional Argo root Application note. |
| [`playbooks/talos_bootstrap.yml`](playbooks/talos_bootstrap.yml) | **`talhelper genconfig` only** (legacy shortcut; prefer **`talos_phase_a.yml`**). |
```bash
@@ -68,9 +69,10 @@ ansible-playbook playbooks/noble.yml --skip-tags newt
ansible-playbook playbooks/noble.yml --tags velero -e noble_velero_install=true -e noble_velero_s3_bucket=... -e noble_velero_s3_url=...
```
### Variables — `group_vars/all.yml`
### Variables — `group_vars/all.yml` and role defaults
- **`noble_newt_install`**, **`noble_velero_install`**, **`noble_cert_manager_require_cloudflare_secret`**, **`noble_apply_vault_cluster_secret_store`**, **`noble_k8s_api_server_override`**, **`noble_k8s_api_server_auto_fallback`**, **`noble_k8s_api_server_fallback`**, **`noble_skip_k8s_health_check`**.
- **`group_vars/all.yml`:** **`noble_newt_install`**, **`noble_velero_install`**, **`noble_cert_manager_require_cloudflare_secret`**, **`noble_k8s_api_server_override`**, **`noble_k8s_api_server_auto_fallback`**, **`noble_k8s_api_server_fallback`**, **`noble_skip_k8s_health_check`**
- **`roles/noble_platform/defaults/main.yml`:** **`noble_apply_sops_secrets`**, **`noble_sops_age_key_file`** (SOPS secrets under **`clusters/noble/secrets/`**)
## Roles

@@ -13,14 +13,11 @@ noble_k8s_api_server_fallback: "https://192.168.50.20:6443"
# Only if you must skip the kubectl /healthz preflight (not recommended).
noble_skip_k8s_health_check: false
# Pangolin / Newt — set true only after creating newt-pangolin-auth Secret (see clusters/noble/bootstrap/newt/README.md)
# Pangolin / Newt — set true only after newt-pangolin-auth Secret exists (SOPS: clusters/noble/secrets/ or imperative — see clusters/noble/bootstrap/newt/README.md)
noble_newt_install: false
# cert-manager needs Secret cloudflare-dns-api-token in cert-manager namespace before ClusterIssuers work
noble_cert_manager_require_cloudflare_secret: true
# post_deploy.yml — apply Vault ClusterSecretStore only after Vault is initialized and K8s auth is configured
noble_apply_vault_cluster_secret_store: false
# Velero — set **noble_velero_install: true** plus S3 bucket/URL (and credentials — see clusters/noble/bootstrap/velero/README.md)
noble_velero_install: false

@@ -1,12 +1,7 @@
---
# Manual follow-ups after **noble.yml**: Vault init/unseal, Kubernetes auth for Vault, ESO ClusterSecretStore.
# Run: ansible-playbook playbooks/post_deploy.yml
- name: Noble cluster — post-install reminders
hosts: localhost
# Manual follow-ups after **noble.yml**: SOPS key backup, optional Argo root Application.
- hosts: localhost
connection: local
gather_facts: false
vars:
noble_repo_root: "{{ playbook_dir | dirname | dirname }}"
noble_kubeconfig: "{{ lookup('env', 'KUBECONFIG') | default(noble_repo_root + '/talos/kubeconfig', true) }}"
roles:
- role: noble_post_deploy
- noble_post_deploy

@@ -8,9 +8,6 @@ noble_helm_repos:
- { name: fossorial, url: "https://charts.fossorial.io" }
- { name: argo, url: "https://argoproj.github.io/argo-helm" }
- { name: metrics-server, url: "https://kubernetes-sigs.github.io/metrics-server/" }
- { name: sealed-secrets, url: "https://bitnami-labs.github.io/sealed-secrets" }
- { name: external-secrets, url: "https://charts.external-secrets.io" }
- { name: hashicorp, url: "https://helm.releases.hashicorp.com" }
- { name: prometheus-community, url: "https://prometheus-community.github.io/helm-charts" }
- { name: grafana, url: "https://grafana.github.io/helm-charts" }
- { name: fluent, url: "https://fluent.github.io/helm-charts" }

@@ -39,11 +39,6 @@ noble_lab_ui_entries:
namespace: longhorn-system
service: longhorn-frontend
url: https://longhorn.apps.noble.lab.pcenicni.dev
- name: Vault
description: Secrets engine UI (after init/unseal)
namespace: vault
service: vault
url: https://vault.apps.noble.lab.pcenicni.dev
- name: Velero
description: Cluster backups — no web UI (velero CLI / kubectl CRDs)
namespace: velero

@@ -24,7 +24,6 @@ This file is **generated** by Ansible (`noble_landing_urls` role). Use it as a t
| **Prometheus** | — | No auth in default install (lab). |
| **Alertmanager** | — | No auth in default install (lab). |
| **Longhorn** | — | No default login unless you enable access control in the UI settings. |
| **Vault** | Token | Root token is only from **`vault operator init`** (not stored in git). See `clusters/noble/bootstrap/vault/README.md`. |
### Commands to retrieve passwords (if not filled above)
@@ -46,7 +45,7 @@ To generate this file **without** calling kubectl, run Ansible with **`-e noble_
- **Argo CD** `argocd-initial-admin-secret` disappears after you change the admin password.
- **Grafana** password is random unless you set `grafana.adminPassword` in chart values.
- **Vault** UI needs **unsealed** Vault; tokens come from your chosen auth method.
- **Prometheus / Alertmanager** UIs are unauthenticated by default — restrict when hardening (`talos/CLUSTER-BUILD.md` Phase G).
- **SOPS:** cluster secrets in git under **`clusters/noble/secrets/`** are encrypted; decrypt with **`age-key.txt`** (not in git). See **`clusters/noble/secrets/README.md`**.
- **Headlamp** token above expires after the configured duration; re-run Ansible or `kubectl create token` to refresh.
- **Velero** has **no web UI** — use **`velero`** CLI or **`kubectl -n velero get backup,schedule,backupstoragelocation`**. Metrics: **`velero`** Service in **`velero`** (Prometheus scrape). See `clusters/noble/bootstrap/velero/README.md`.

@@ -4,5 +4,6 @@ noble_platform_kubectl_request_timeout: 120s
noble_platform_kustomize_retries: 5
noble_platform_kustomize_delay: 20
# Vault: injector (vault-k8s) owns MutatingWebhookConfiguration.caBundle; Helm upgrade can SSA-conflict. Delete webhook so Helm can recreate it.
noble_vault_delete_injector_webhook_before_helm: true
# Decrypt **clusters/noble/secrets/*.yaml** with SOPS and kubectl apply (requires **sops**, **age**, and **age-key.txt**).
noble_apply_sops_secrets: true
noble_sops_age_key_file: "{{ noble_repo_root }}/age-key.txt"

@@ -1,6 +1,6 @@
---
# Mirrors former **noble-platform** Argo Application: Helm releases + plain manifests under clusters/noble/bootstrap.
- name: Apply clusters/noble/bootstrap kustomize (namespaces, Grafana Loki datasource, Vault extras)
- name: Apply clusters/noble/bootstrap kustomize (namespaces, Grafana Loki datasource)
ansible.builtin.command:
argv:
- kubectl
@@ -16,77 +16,26 @@
until: noble_platform_kustomize.rc == 0
changed_when: true
- name: Install Sealed Secrets
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- sealed-secrets
- sealed-secrets/sealed-secrets
- --namespace
- sealed-secrets
- --version
- "2.18.4"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/sealed-secrets/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Stat SOPS age private key (age-key.txt)
ansible.builtin.stat:
path: "{{ noble_sops_age_key_file }}"
register: noble_sops_age_key_stat
- name: Install External Secrets Operator
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- external-secrets
- external-secrets/external-secrets
- --namespace
- external-secrets
- --version
- "2.2.0"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/external-secrets/values.yaml"
- --wait
- name: Apply SOPS-encrypted cluster secrets (clusters/noble/secrets/*.yaml)
ansible.builtin.shell: |
set -euo pipefail
shopt -s nullglob
for f in "{{ noble_repo_root }}/clusters/noble/secrets"/*.yaml; do
sops -d "$f" | kubectl apply -f -
done
args:
executable: /bin/bash
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
# vault-k8s patches webhook CA after install; Helm 3/4 SSA then conflicts on upgrade. Removing the MWC lets Helm re-apply cleanly; injector repopulates caBundle.
- name: Delete Vault agent injector MutatingWebhookConfiguration before Helm (avoids caBundle field conflict)
ansible.builtin.command:
argv:
- kubectl
- delete
- mutatingwebhookconfiguration
- vault-agent-injector-cfg
- --ignore-not-found
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
register: noble_vault_mwc_delete
when: noble_vault_delete_injector_webhook_before_helm | default(true) | bool
changed_when: "'deleted' in (noble_vault_mwc_delete.stdout | default(''))"
- name: Install Vault
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- vault
- hashicorp/vault
- --namespace
- vault
- --version
- "0.32.0"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/vault/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
HELM_SERVER_SIDE_APPLY: "false"
SOPS_AGE_KEY_FILE: "{{ noble_sops_age_key_file }}"
when:
- noble_apply_sops_secrets | default(true) | bool
- noble_sops_age_key_stat.stat.exists
changed_when: true
- name: Install kube-prometheus-stack
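
The decrypt-and-apply task in this hunk can also be reproduced by hand, for example to debug a single secret outside Ansible. The following is a minimal sketch under stated assumptions: the function name `apply_sops_secrets` and the `DRY_RUN` switch are hypothetical and do not exist in the repository. Unlike the task's pipe, it decrypts into a variable first, so a failed `sops -d` stops before anything reaches `kubectl`.

```bash
# Hypothetical standalone helper mirroring the task's loop (assumption, not in the repo).
# DRY_RUN defaults to 1 and only prints the commands it would run.
apply_sops_secrets() {
  secrets_dir="$1"
  for f in "$secrets_dir"/*.yaml; do
    # With no matches the glob stays literal; skip it (stands in for nullglob).
    [ -e "$f" ] || continue
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "sops -d $f | kubectl apply -f -"
    else
      # Decrypt first so a failed decrypt never reaches kubectl.
      decrypted=$(sops -d "$f") || return 1
      printf '%s\n' "$decrypted" | kubectl apply -f - || return 1
    fi
  done
}
```

From the repository root, a real run might look like `SOPS_AGE_KEY_FILE=./age-key.txt KUBECONFIG=talos/kubeconfig DRY_RUN=0 apply_sops_secrets clusters/noble/secrets`; leaving `DRY_RUN` unset only lists the files that would be applied.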

@@ -1,24 +1,10 @@
---
- name: Vault — manual steps (not automated)
- name: SOPS secrets (workstation)
ansible.builtin.debug:
msg: |
1. kubectl -n vault get pods (wait for Running)
2. kubectl -n vault exec -it vault-0 -- vault operator init (once; save keys)
3. Unseal per clusters/noble/bootstrap/vault/README.md
4. ./clusters/noble/bootstrap/vault/configure-kubernetes-auth.sh
5. kubectl apply -f clusters/noble/bootstrap/external-secrets/examples/vault-cluster-secret-store.yaml
- name: Optional — apply Vault ClusterSecretStore for External Secrets
ansible.builtin.command:
argv:
- kubectl
- apply
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/external-secrets/examples/vault-cluster-secret-store.yaml"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
when: noble_apply_vault_cluster_secret_store | default(false) | bool
changed_when: true
Encrypted Kubernetes Secrets live under clusters/noble/secrets/ (Mozilla SOPS + age).
Private key: age-key.txt at repo root (gitignored). See clusters/noble/secrets/README.md
and .sops.yaml. noble.yml decrypt-applies these when age-key.txt exists.
- name: Argo CD optional root Application (empty app-of-apps)
ansible.builtin.debug: