diff --git a/.env.sample b/.env.sample
index ae677ea..96c7705 100644
--- a/.env.sample
+++ b/.env.sample
@@ -11,3 +11,9 @@ CLOUDFLARE_DNS_API_TOKEN=
 PANGOLIN_ENDPOINT=
 NEWT_ID=
 NEWT_SECRET=
+
+# Velero — when **noble_velero_install=true**, set bucket + S3 API URL and credentials (see clusters/noble/bootstrap/velero/README.md).
+NOBLE_VELERO_S3_BUCKET=
+NOBLE_VELERO_S3_URL=
+NOBLE_VELERO_AWS_ACCESS_KEY_ID=
+NOBLE_VELERO_AWS_SECRET_ACCESS_KEY=
diff --git a/ansible/README.md b/ansible/README.md
index 6118762..3fe1a0f 100644
--- a/ansible/README.md
+++ b/ansible/README.md
@@ -65,11 +66,12 @@ Override with `-e` when needed, e.g. **`-e noble_talos_skip_bootstrap=true`** if
 ```bash
 ansible-playbook playbooks/noble.yml --tags cilium,metallb
 ansible-playbook playbooks/noble.yml --skip-tags newt
+ansible-playbook playbooks/noble.yml --tags velero -e noble_velero_install=true -e noble_velero_s3_bucket=... -e noble_velero_s3_url=...
 ```
 
 ### Variables — `group_vars/all.yml`
 
-- **`noble_newt_install`**, **`noble_cert_manager_require_cloudflare_secret`**, **`noble_apply_vault_cluster_secret_store`**, **`noble_k8s_api_server_override`**, **`noble_k8s_api_server_auto_fallback`**, **`noble_k8s_api_server_fallback`**, **`noble_skip_k8s_health_check`**.
+- **`noble_newt_install`**, **`noble_velero_install`**, **`noble_cert_manager_require_cloudflare_secret`**, **`noble_apply_vault_cluster_secret_store`**, **`noble_k8s_api_server_override`**, **`noble_k8s_api_server_auto_fallback`**, **`noble_k8s_api_server_fallback`**, **`noble_skip_k8s_health_check`**.
 
 ## Roles
 
@@ -77,7 +78,7 @@ ansible-playbook playbooks/noble.yml --skip-tags newt
 |------|----------|
 | `talos_phase_a` | Talos genconfig, apply-config, bootstrap, kubeconfig |
 | `helm_repos` | `helm repo add` / `update` |
-| `noble_*` | Cilium, metrics-server, Longhorn, MetalLB (20m Helm wait), kube-vip, Traefik, cert-manager, Newt, Argo CD, Kyverno, platform stack |
+| `noble_*` | Cilium, metrics-server, Longhorn, MetalLB (20m Helm wait), kube-vip, Traefik, cert-manager, Newt, Argo CD, Kyverno, platform stack, Velero (optional) |
 | `noble_landing_urls` | Writes **`ansible/output/noble-lab-ui-urls.md`** — URLs, service names, and (optional) Argo/Grafana passwords from Secrets |
 | `noble_post_deploy` | Post-install reminders |
 | `talos_bootstrap` | Genconfig-only (used by older playbook) |
diff --git a/ansible/group_vars/all.yml b/ansible/group_vars/all.yml
index 87271b3..6dff5ef 100644
--- a/ansible/group_vars/all.yml
+++ b/ansible/group_vars/all.yml
@@ -21,3 +21,6 @@ noble_cert_manager_require_cloudflare_secret: true
 
 # post_deploy.yml — apply Vault ClusterSecretStore only after Vault is initialized and K8s auth is configured
 noble_apply_vault_cluster_secret_store: false
+
+# Velero — set **noble_velero_install: true** plus S3 bucket/URL (and credentials — see clusters/noble/bootstrap/velero/README.md)
+noble_velero_install: false
diff --git a/ansible/playbooks/noble.yml b/ansible/playbooks/noble.yml
index 1986799..78c00c5 100644
--- a/ansible/playbooks/noble.yml
+++ b/ansible/playbooks/noble.yml
@@ -4,7 +4,7 @@
 # Run from repo **ansible/** directory: ansible-playbook playbooks/noble.yml
 #
 # Tags: repos, cilium, metrics, longhorn, metallb, kube_vip, traefik, cert_manager, newt,
-#       argocd, kyverno, kyverno_policies, platform, all (default)
+#       argocd, kyverno, kyverno_policies, platform, velero, all (default)
 - name: Noble cluster — platform stack (Ansible-managed)
   hosts: localhost
   connection: local
@@ -224,5 +224,7 @@
         tags: [kyverno_policies, policy]
       - role: noble_platform
         tags: [platform, observability, apps]
+      - role: noble_velero
+        tags: [velero, backups]
       - role: noble_landing_urls
         tags: [landing, platform, observability, apps]
diff --git a/ansible/roles/helm_repos/defaults/main.yml b/ansible/roles/helm_repos/defaults/main.yml
index 90a33cb..d635baa 100644
--- a/ansible/roles/helm_repos/defaults/main.yml
+++ b/ansible/roles/helm_repos/defaults/main.yml
@@ -16,3 +16,4 @@ noble_helm_repos:
   - { name: fluent, url: "https://fluent.github.io/helm-charts" }
   - { name: headlamp, url: "https://kubernetes-sigs.github.io/headlamp/" }
   - { name: kyverno, url: "https://kyverno.github.io/kyverno/" }
+  - { name: vmware-tanzu, url: "https://vmware-tanzu.github.io/helm-charts" }
diff --git a/ansible/roles/noble_velero/defaults/main.yml b/ansible/roles/noble_velero/defaults/main.yml
new file mode 100644
index 0000000..2768040
--- /dev/null
+++ b/ansible/roles/noble_velero/defaults/main.yml
@@ -0,0 +1,13 @@
+---
+# **noble_velero_install** is in **ansible/group_vars/all.yml**. Override S3 fields via extra-vars or group_vars.
+noble_velero_chart_version: "12.0.0"
+
+noble_velero_s3_bucket: ""
+noble_velero_s3_url: ""
+noble_velero_s3_region: "us-east-1"
+noble_velero_s3_force_path_style: "true"
+noble_velero_s3_prefix: ""
+
+# Optional — if unset, Ansible expects Secret **velero/velero-cloud-credentials** (key **cloud**) to exist.
+noble_velero_aws_access_key_id: ""
+noble_velero_aws_secret_access_key: ""
diff --git a/ansible/roles/noble_velero/tasks/from_env.yml b/ansible/roles/noble_velero/tasks/from_env.yml
new file mode 100644
index 0000000..5cf552f
--- /dev/null
+++ b/ansible/roles/noble_velero/tasks/from_env.yml
@@ -0,0 +1,70 @@
+---
+# See repository **.env.sample** — copy to **.env** (gitignored).
+
+- name: Stat repository .env for Velero
+  ansible.builtin.stat:
+    path: "{{ noble_repo_root }}/.env"
+  register: noble_deploy_env_file
+  changed_when: false
+
+- name: Load NOBLE_VELERO_S3_BUCKET from .env when unset
+  ansible.builtin.shell: |
+    set -a
+    . "{{ noble_repo_root }}/.env"
+    set +a
+    echo "${NOBLE_VELERO_S3_BUCKET:-}"
+  register: noble_velero_s3_bucket_from_env
+  when:
+    - noble_deploy_env_file.stat.exists | default(false)
+    - noble_velero_s3_bucket | default('') | length == 0
+  changed_when: false
+
+- name: Apply NOBLE_VELERO_S3_BUCKET from .env
+  ansible.builtin.set_fact:
+    noble_velero_s3_bucket: "{{ noble_velero_s3_bucket_from_env.stdout | trim }}"
+  when:
+    - noble_velero_s3_bucket_from_env is defined
+    - (noble_velero_s3_bucket_from_env.stdout | default('') | trim | length) > 0
+
+- name: Load NOBLE_VELERO_S3_URL from .env when unset
+  ansible.builtin.shell: |
+    set -a
+    . "{{ noble_repo_root }}/.env"
+    set +a
+    echo "${NOBLE_VELERO_S3_URL:-}"
+  register: noble_velero_s3_url_from_env
+  when:
+    - noble_deploy_env_file.stat.exists | default(false)
+    - noble_velero_s3_url | default('') | length == 0
+  changed_when: false
+
+- name: Apply NOBLE_VELERO_S3_URL from .env
+  ansible.builtin.set_fact:
+    noble_velero_s3_url: "{{ noble_velero_s3_url_from_env.stdout | trim }}"
+  when:
+    - noble_velero_s3_url_from_env is defined
+    - (noble_velero_s3_url_from_env.stdout | default('') | trim | length) > 0
+
+- name: Create velero-cloud-credentials from .env when keys present
+  ansible.builtin.shell: |
+    set -euo pipefail
+    set -a
+    . "{{ noble_repo_root }}/.env"
+    set +a
+    if [ -z "${NOBLE_VELERO_AWS_ACCESS_KEY_ID:-}" ] || [ -z "${NOBLE_VELERO_AWS_SECRET_ACCESS_KEY:-}" ]; then
+      echo SKIP
+      exit 0
+    fi
+    CLOUD="$(printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
+      "${NOBLE_VELERO_AWS_ACCESS_KEY_ID}" "${NOBLE_VELERO_AWS_SECRET_ACCESS_KEY}")"
+    kubectl -n velero create secret generic velero-cloud-credentials \
+      --from-literal=cloud="${CLOUD}" \
+      --dry-run=client -o yaml | kubectl apply -f -
+    echo APPLIED
+  args:
+    # set -o pipefail is a bashism — force bash so the task does not fail under /bin/sh (dash)
+    executable: /bin/bash
+  environment:
+    KUBECONFIG: "{{ noble_kubeconfig }}"
+  when: noble_deploy_env_file.stat.exists | default(false)
+  no_log: true
+  register: noble_velero_secret_from_env
+  changed_when: "'APPLIED' in (noble_velero_secret_from_env.stdout | default(''))"
diff --git a/ansible/roles/noble_velero/tasks/main.yml b/ansible/roles/noble_velero/tasks/main.yml
new file mode 100644
index 0000000..65d7ed6
--- /dev/null
+++ b/ansible/roles/noble_velero/tasks/main.yml
@@ -0,0 +1,87 @@
+---
+# Velero — S3 backup target + built-in CSI snapshots (Longhorn: label VolumeSnapshotClass per README).
+- name: Apply velero namespace
+  ansible.builtin.command:
+    argv:
+      - kubectl
+      - apply
+      - -f
+      - "{{ noble_repo_root }}/clusters/noble/bootstrap/velero/namespace.yaml"
+  environment:
+    KUBECONFIG: "{{ noble_kubeconfig }}"
+  when: noble_velero_install | default(false) | bool
+  changed_when: true
+
+- name: Include Velero settings from repository .env (S3 bucket, URL, credentials)
+  ansible.builtin.include_tasks: from_env.yml
+  when: noble_velero_install | default(false) | bool
+
+- name: Require S3 bucket and endpoint for Velero
+  ansible.builtin.assert:
+    that:
+      - noble_velero_s3_bucket | default('') | length > 0
+      - noble_velero_s3_url | default('') | length > 0
+    fail_msg: >-
+      Set NOBLE_VELERO_S3_BUCKET and NOBLE_VELERO_S3_URL in .env, or noble_velero_s3_bucket / noble_velero_s3_url
+      (e.g. -e ...), or group_vars when noble_velero_install is true.
+  when: noble_velero_install | default(false) | bool
+
+- name: Create velero-cloud-credentials from Ansible vars
+  ansible.builtin.shell: |
+    set -euo pipefail
+    CLOUD="$(printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
+      "${AWS_ACCESS_KEY_ID}" "${AWS_SECRET_ACCESS_KEY}")"
+    kubectl -n velero create secret generic velero-cloud-credentials \
+      --from-literal=cloud="${CLOUD}" \
+      --dry-run=client -o yaml | kubectl apply -f -
+  args:
+    # set -o pipefail is a bashism — force bash so the task does not fail under /bin/sh (dash)
+    executable: /bin/bash
+  environment:
+    KUBECONFIG: "{{ noble_kubeconfig }}"
+    AWS_ACCESS_KEY_ID: "{{ noble_velero_aws_access_key_id }}"
+    AWS_SECRET_ACCESS_KEY: "{{ noble_velero_aws_secret_access_key }}"
+  when:
+    - noble_velero_install | default(false) | bool
+    - noble_velero_aws_access_key_id | default('') | length > 0
+    - noble_velero_aws_secret_access_key | default('') | length > 0
+  no_log: true
+  changed_when: true
+
+- name: Check velero-cloud-credentials Secret
+  ansible.builtin.command:
+    argv:
+      - kubectl
+      - -n
+      - velero
+      - get
+      - secret
+      - velero-cloud-credentials
+  environment:
+    KUBECONFIG: "{{ noble_kubeconfig }}"
+  register: noble_velero_secret_check
+  failed_when: false
+  changed_when: false
+  when: noble_velero_install | default(false) | bool
+
+- name: Require velero-cloud-credentials before Helm
+  ansible.builtin.assert:
+    that:
+      - noble_velero_secret_check.rc == 0
+    fail_msg: >-
+      Velero needs Secret velero/velero-cloud-credentials (key cloud). Set NOBLE_VELERO_AWS_ACCESS_KEY_ID and
+      NOBLE_VELERO_AWS_SECRET_ACCESS_KEY in .env, or noble_velero_aws_* extra-vars, or create the Secret manually
+      (see clusters/noble/bootstrap/velero/README.md).
+  when: noble_velero_install | default(false) | bool
+
+- name: Optional object prefix argv for Helm
+  ansible.builtin.set_fact:
+    noble_velero_helm_prefix_argv: "{{ ['--set-string', 'configuration.backupStorageLocation[0].prefix=' ~ (noble_velero_s3_prefix | default(''))] if (noble_velero_s3_prefix | default('') | length > 0) else [] }}"
+  when: noble_velero_install | default(false) | bool
+
+- name: Install Velero
+  ansible.builtin.command:
+    argv: "{{ ['helm', 'upgrade', '--install', 'velero', 'vmware-tanzu/velero', '--namespace', 'velero', '--version', noble_velero_chart_version, '-f', noble_repo_root ~ '/clusters/noble/bootstrap/velero/values.yaml', '--set-string', 'configuration.backupStorageLocation[0].bucket=' ~ noble_velero_s3_bucket, '--set-string', 'configuration.backupStorageLocation[0].config.s3Url=' ~ noble_velero_s3_url, '--set-string', 'configuration.backupStorageLocation[0].config.region=' ~ noble_velero_s3_region, '--set-string', 'configuration.backupStorageLocation[0].config.s3ForcePathStyle=' ~ noble_velero_s3_force_path_style] + (noble_velero_helm_prefix_argv | default([])) + ['--wait'] }}"
+  environment:
+    KUBECONFIG: "{{ noble_kubeconfig }}"
+  when: noble_velero_install | default(false) | bool
+  changed_when: true
diff --git a/clusters/noble/bootstrap/kustomization.yaml b/clusters/noble/bootstrap/kustomization.yaml
index c18be26..913b61a 100644
--- a/clusters/noble/bootstrap/kustomization.yaml
+++ b/clusters/noble/bootstrap/kustomization.yaml
@@ -12,6 +12,7 @@ resources:
   - external-secrets/namespace.yaml
   - vault/namespace.yaml
   - kyverno/namespace.yaml
+  - velero/namespace.yaml
   - headlamp/namespace.yaml
   - grafana-loki-datasource/loki-datasource.yaml
   - vault/unseal-cronjob.yaml
diff --git a/clusters/noble/bootstrap/velero/README.md b/clusters/noble/bootstrap/velero/README.md
new file mode 100644
index 0000000..d2507ab
--- /dev/null
+++ b/clusters/noble/bootstrap/velero/README.md
@@ -0,0 +1,123 @@
+# Velero (cluster backups)
+
+Ansible-managed core stack — **not** reconciled by Argo CD (`clusters/noble/apps` is optional GitOps only).
+
+## What you get
+
+- **vmware-tanzu/velero** Helm chart (**12.0.0** → Velero **1.18.0**) in namespace **`velero`**
+- **AWS plugin** init container for **S3-compatible** object storage (`velero/velero-plugin-for-aws:v1.14.0`)
+- **CSI snapshots** via Velero’s built-in CSI support (`EnableCSI`) and **VolumeSnapshotLocation** `velero.io/csi` (no separate CSI plugin image for Velero ≥ 1.14)
+- **Prometheus** scraping: **ServiceMonitor** labeled for **kube-prometheus** (`release: kube-prometheus`)
+
+## Prerequisites
+
+1. **Longhorn** (or another CSI driver) with a **VolumeSnapshotClass** for that driver.
+2. For **Velero** to pick a default snapshot class, **one** `VolumeSnapshotClass` per driver should carry:
+
+   ```yaml
+   metadata:
+     labels:
+       velero.io/csi-volumesnapshot-class: "true"
+   ```
+
+   Example for Longhorn: after install, confirm the driver name (often `driver.longhorn.io`) and either label Longhorn’s `VolumeSnapshotClass` or create one and label it (see [Velero CSI](https://velero.io/docs/main/csi/)).
+
+3. **S3-compatible** endpoint (MinIO, VersityGW, AWS, etc.) and a **bucket**.
+
+## Credentials Secret
+
+Velero expects **`velero/velero-cloud-credentials`**, key **`cloud`**, in **INI** form for the AWS plugin:
+
+```ini
+[default]
+aws_access_key_id=
+aws_secret_access_key=
+```
+
+Create manually:
+
+```bash
+kubectl -n velero create secret generic velero-cloud-credentials \
+  --from-literal=cloud="$(printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' "$KEY" "$SECRET")"
+```
+
+Or let **Ansible** create it from **`.env`** (`NOBLE_VELERO_AWS_ACCESS_KEY_ID`, `NOBLE_VELERO_AWS_SECRET_ACCESS_KEY`) or from extra-vars **`noble_velero_aws_access_key_id`** / **`noble_velero_aws_secret_access_key`**.
+
+## Apply (Ansible)
+
+1. Copy **`.env.sample`** → **`.env`** at the **repository root** and set at least:
+   - **`NOBLE_VELERO_S3_BUCKET`** — object bucket name
+   - **`NOBLE_VELERO_S3_URL`** — S3 API base URL (e.g. `https://minio.lan:9000` or your VersityGW/MinIO endpoint)
+   - **`NOBLE_VELERO_AWS_ACCESS_KEY_ID`** / **`NOBLE_VELERO_AWS_SECRET_ACCESS_KEY`** — credentials the AWS plugin uses (S3-compatible access key style)
+
+2. Enable the role: set **`noble_velero_install: true`** in **`ansible/group_vars/all.yml`**, **or** pass **`-e noble_velero_install=true`** on the command line.
+
+3. Run from **`ansible/`** (adjust **`KUBECONFIG`** to your cluster admin kubeconfig):
+
+```bash
+cd ansible
+export KUBECONFIG=/absolute/path/to/home-server/talos/kubeconfig
+
+# Velero only (after helm repos; skips other roles unless their tags match — use full playbook if unsure)
+ansible-playbook playbooks/noble.yml --tags repos,velero -e noble_velero_install=true
+```
+
+If **`NOBLE_VELERO_S3_BUCKET`** / **`NOBLE_VELERO_S3_URL`** are not in **`.env`**, pass them explicitly:
+
+```bash
+ansible-playbook playbooks/noble.yml --tags repos,velero -e noble_velero_install=true \
+  -e noble_velero_s3_bucket=my-bucket \
+  -e noble_velero_s3_url=https://s3.example.com:9000
+```
+
+Full platform run (includes Velero when **`noble_velero_install`** is true in **`group_vars`**):
+
+```bash
+ansible-playbook playbooks/noble.yml
+```
+
+## Install (Ansible) — details
+
+1. Set **`noble_velero_install: true`** in **`ansible/group_vars/all.yml`** (or pass **`-e noble_velero_install=true`**).
+2. Set **`noble_velero_s3_bucket`** and **`noble_velero_s3_url`** via **`.env`** (**`NOBLE_VELERO_S3_*`**) or **`group_vars`** or **`-e`**. Extra-vars override **`.env`**. Optional: **`noble_velero_s3_region`**, **`noble_velero_s3_prefix`**, **`noble_velero_s3_force_path_style`** (defaults match `values.yaml`).
+3. Run **`ansible/playbooks/noble.yml`** (Velero runs after **`noble_platform`**).
+
+Example without **`.env`** (all on the CLI — include the `repos` tag so the `vmware-tanzu` Helm repo is added first):
+
+```bash
+cd ansible
+ansible-playbook playbooks/noble.yml --tags repos,velero \
+  -e noble_velero_install=true \
+  -e noble_velero_s3_bucket=noble-velero \
+  -e noble_velero_s3_url=https://minio.lan:9000 \
+  -e noble_velero_aws_access_key_id="$KEY" \
+  -e noble_velero_aws_secret_access_key="$SECRET"
+```
+
+The **`clusters/noble/bootstrap/kustomization.yaml`** applies **`velero/namespace.yaml`** with the rest of the bootstrap namespaces (so **`velero`** exists before Helm).
+
+## Install (Helm only)
+
+From repo root (the bracketed keys are quoted so shells like zsh do not treat `[0]` as a glob):
+
+```bash
+kubectl apply -f clusters/noble/bootstrap/velero/namespace.yaml
+# Create velero-cloud-credentials (see above), then:
+helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts && helm repo update
+helm upgrade --install velero vmware-tanzu/velero -n velero --version 12.0.0 \
+  -f clusters/noble/bootstrap/velero/values.yaml \
+  --set-string 'configuration.backupStorageLocation[0].bucket=YOUR_BUCKET' \
+  --set-string 'configuration.backupStorageLocation[0].config.s3Url=https://YOUR-S3-ENDPOINT' \
+  --wait
+```
+
+Edit **`values.yaml`** defaults (bucket placeholder, `s3Url`) or override with **`--set-string`** as above.
+
+## Quick checks
+
+```bash
+kubectl -n velero get pods,backupstoragelocation,volumesnapshotlocation
+velero backup create test --wait
+```
+
+(`velero` CLI: install from [Velero releases](https://github.com/vmware-tanzu/velero/releases).)
diff --git a/clusters/noble/bootstrap/velero/namespace.yaml b/clusters/noble/bootstrap/velero/namespace.yaml
new file mode 100644
index 0000000..812313e
--- /dev/null
+++ b/clusters/noble/bootstrap/velero/namespace.yaml
@@ -0,0 +1,5 @@
+# Velero — apply before Helm (Ansible **noble_velero**).
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: velero
diff --git a/clusters/noble/bootstrap/velero/values.yaml b/clusters/noble/bootstrap/velero/values.yaml
new file mode 100644
index 0000000..9bd9249
--- /dev/null
+++ b/clusters/noble/bootstrap/velero/values.yaml
@@ -0,0 +1,57 @@
+# Velero Helm values — vmware-tanzu/velero chart (see CLUSTER-BUILD.md Phase F).
+# Install: **ansible/playbooks/noble.yml** role **noble_velero** (override S3 settings via **noble_velero_*** vars).
+# Requires Secret **velero/velero-cloud-credentials** key **cloud** (INI for AWS plugin — see README).
+#
+# Chart: vmware-tanzu/velero — pin version on install (e.g. 12.0.0 / Velero 1.18.0).
+#   helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts && helm repo update
+#   kubectl apply -f clusters/noble/bootstrap/velero/namespace.yaml
+#   helm upgrade --install velero vmware-tanzu/velero -n velero --version 12.0.0 -f clusters/noble/bootstrap/velero/values.yaml
+
+initContainers:
+  - name: velero-plugin-for-aws
+    image: velero/velero-plugin-for-aws:v1.14.0
+    imagePullPolicy: IfNotPresent
+    volumeMounts:
+      - mountPath: /target
+        name: plugins
+
+configuration:
+  features: EnableCSI
+  defaultBackupStorageLocation: default
+  defaultVolumeSnapshotLocations: velero.io/csi:default
+
+  backupStorageLocation:
+    - name: default
+      provider: aws
+      bucket: noble-velero
+      default: true
+      accessMode: ReadWrite
+      credential:
+        name: velero-cloud-credentials
+        key: cloud
+      config:
+        region: us-east-1
+        s3ForcePathStyle: "true"
+        s3Url: https://s3.CHANGE-ME.invalid
+
+  volumeSnapshotLocation:
+    - name: default
+      provider: velero.io/csi
+      config: {}
+
+credentials:
+  useSecret: true
+  existingSecret: velero-cloud-credentials
+
+snapshotsEnabled: true
+deployNodeAgent: false
+
+metrics:
+  enabled: true
+  serviceMonitor:
+    enabled: true
+    autodetect: true
+    additionalLabels:
+      release: kube-prometheus
+
+schedules: {}
diff --git a/docs/architecture.md b/docs/architecture.md
index fcac12c..4c5268a 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -233,7 +233,7 @@ The **noble** environment is a **Talos** lab cluster on **`192.168.50.0/24`** wi
 **Open questions**
 
 - **Split horizon:** Confirm whether only LAN DNS resolves `*.apps.noble.lab.pcenicni.dev` to **`192.168.50.211`** or whether public resolvers also point at that address.
-- **Velero / S3:** **TBD** until an S3-compatible backend is configured.
+- **Velero / S3:** optional **Ansible** install (**`noble_velero_install`**) from **`clusters/noble/bootstrap/velero/`** once an S3-compatible backend and credentials exist (see **`talos/CLUSTER-BUILD.md`** Phase F).
 - **Argo CD:** Confirm **`repoURL`** in `root-application.yaml` and what is actually applied on-cluster.
 
 ---
 
diff --git a/talos/CLUSTER-BUILD.md b/talos/CLUSTER-BUILD.md
index 35bd584..6dc92c3 100644
--- a/talos/CLUSTER-BUILD.md
+++ b/talos/CLUSTER-BUILD.md
@@ -4,7 +4,7 @@ This document is the **exported TODO** for the **noble** Talos cluster (4 nodes)
 
 ## Current state (2026-03-28)
 
-Lab stack is **up** on-cluster through **Phase D**–**F** and **Phase G** (Vault **CiliumNetworkPolicy**, **`talos/runbooks/`**). **Next focus:** optional **Alertmanager** receivers (Slack/PagerDuty); tighten **RBAC** (Headlamp / cluster-admin); **Cilium** policies for other namespaces as needed; enable **Mend Renovate** for PRs; Pangolin/sample Ingress; **Velero** when S3 exists.
+Lab stack is **up** on-cluster through **Phase D**–**F** and **Phase G** (Vault **CiliumNetworkPolicy**, **`talos/runbooks/`**). **Next focus:** optional **Alertmanager** receivers (Slack/PagerDuty); tighten **RBAC** (Headlamp / cluster-admin); **Cilium** policies for other namespaces as needed; enable **Mend Renovate** for PRs; Pangolin/sample Ingress; **Velero** backup/restore drill after S3 credentials are set (**`noble_velero_install`**).
 - **Talos** v1.12.6 (target) / **Kubernetes** as bundled — four nodes **Ready** unless upgrading; **`talosctl health`**; **`talos/kubeconfig`** is **local only** (gitignored — never commit; regenerate with `talosctl kubeconfig` per `talos/README.md`). **Image Factory (nocloud installer):** `factory.talos.dev/nocloud-installer/249d9135de54962744e917cfe654117000cba369f9152fbab9d055a00aa3664f:v1.12.6`
 - **Cilium** Helm **1.16.6** / app **1.16.6** (`clusters/noble/bootstrap/cilium/`, phase 1 values).
@@ -21,7 +21,8 @@ Lab stack is **up** on-cluster through **Phase D**–**F** and **Phase G** (Vaul
 - **Sealed Secrets** Helm **2.18.4** / app **0.36.1** — `clusters/noble/bootstrap/sealed-secrets/` (namespace **`sealed-secrets`**); **`kubeseal`** on client should match controller minor (**README**); back up **`sealed-secrets-key`** (see README).
 - **External Secrets Operator** Helm **2.2.0** / app **v2.2.0** — `clusters/noble/bootstrap/external-secrets/`; Vault **`ClusterSecretStore`** in **`examples/vault-cluster-secret-store.yaml`** (**`http://`** to match Vault listener — apply after Vault **Kubernetes auth**).
 - **Vault** Helm **0.32.0** / app **1.21.2** — `clusters/noble/bootstrap/vault/` — standalone **file** storage, **Longhorn** PVC; **HTTP** listener (`global.tlsDisable`); optional **CronJob** lab unseal **`unseal-cronjob.yaml`**; **not** initialized in git — run **`vault operator init`** per **`README.md`**.
-- **Still open:** **Renovate** — install **[Mend Renovate](https://github.com/apps/renovate)** (or self-host) so PRs run; optional **Alertmanager** notification channels; optional **sample Ingress + cert + Pangolin** end-to-end; **Velero** when S3 is ready; **Argo CD SSO**.
+- **Velero** Helm **12.0.0** / app **v1.18.0** — `clusters/noble/bootstrap/velero/` (**Ansible** **`noble_velero`**, not Argo); **S3-compatible** backup location + **CSI** snapshots (**`EnableCSI`**); enable with **`noble_velero_install`** per **`velero/README.md`**.
+- **Still open:** **Renovate** — install **[Mend Renovate](https://github.com/apps/renovate)** (or self-host) so PRs run; optional **Alertmanager** notification channels; optional **sample Ingress + cert + Pangolin** end-to-end; **Argo CD SSO**.
 
 ## Inventory
 
@@ -44,7 +45,7 @@ Lab stack is **up** on-cluster through **Phase D**–**F** and **Phase G** (Vaul
 | Grafana (Ingress + TLS) | **`grafana.apps.noble.lab.pcenicni.dev`** — `grafana.ingress` in `clusters/noble/bootstrap/kube-prometheus-stack/values.yaml` (**`letsencrypt-prod`**) |
 | Headlamp (Ingress + TLS) | **`headlamp.apps.noble.lab.pcenicni.dev`** — chart `ingress` in `clusters/noble/bootstrap/headlamp/` (**`letsencrypt-prod`**, **`ingressClassName: traefik`**) |
 | Public DNS (Pangolin) | **Newt** tunnel + **CNAME** at registrar + **Integration API** — `clusters/noble/bootstrap/newt/` |
-| Velero | S3-compatible URL — configure later |
+| Velero | S3-compatible endpoint + bucket — **`clusters/noble/bootstrap/velero/`**, **`ansible/playbooks/noble.yml`** (**`noble_velero_install`**) |
 
 ## Versions
 
@@ -67,6 +68,7 @@ Lab stack is **up** on-cluster through **Phase D**–**F** and **Phase G** (Vaul
 - Vault: **0.32.0** (Helm chart `hashicorp/vault`; app **1.21.2**)
 - Kyverno: **3.7.1** (Helm chart `kyverno/kyverno`; app **v1.17.1**); **kyverno-policies** **3.7.1** — **baseline** PSS, **Audit** (`clusters/noble/bootstrap/kyverno/`)
 - Headlamp: **0.40.1** (Helm chart `headlamp/headlamp`; app matches chart — see [Artifact Hub](https://artifacthub.io/packages/helm/headlamp/headlamp))
+- Velero: **12.0.0** (Helm chart `vmware-tanzu/velero`; app **v1.18.0**) — **`clusters/noble/bootstrap/velero/`**; AWS plugin **v1.14.0**; Ansible **`noble_velero`**
 - Renovate: **hosted** (Mend **Renovate** GitHub/GitLab app — no cluster chart) **or** **self-hosted** — pin chart when added ([Helm charts](https://docs.renovatebot.com/helm-charts/), OCI `ghcr.io/renovatebot/charts/renovate`); pair **`renovate.json`** with this repo’s Helm paths under **`clusters/noble/`**
 
 ## Repo paths (this workspace)
 
@@ -97,6 +99,7 @@ Lab stack is **up** on-cluster through **Phase D**–**F** and **Phase G** (Vaul
 | Vault (Helm + optional unseal CronJob) | `clusters/noble/bootstrap/vault/` — `values.yaml`, `namespace.yaml`, `unseal-cronjob.yaml`, `cilium-network-policy.yaml`, `configure-kubernetes-auth.sh`, `README.md` |
 | Kyverno + PSS baseline policies | `clusters/noble/bootstrap/kyverno/` — `values.yaml`, `policies-values.yaml`, `namespace.yaml`, `README.md` |
 | Headlamp (Helm + Ingress) | `clusters/noble/bootstrap/headlamp/` — `values.yaml`, `namespace.yaml`, `README.md` |
+| Velero (Helm + S3 BSL; CSI snapshots) | `clusters/noble/bootstrap/velero/` — `values.yaml`, `namespace.yaml`, `README.md`; **`ansible/roles/noble_velero`** |
 | Renovate (repo config + optional self-hosted Helm) | **`renovate.json`** at repo root; optional self-hosted chart under **`clusters/noble/apps/`** (Argo) + token Secret (**Sealed Secrets** / **ESO** after **Phase E**) |
 
 **Git vs cluster:** manifests and `talconfig` live in git; **`talhelper genconfig -o out`**, bootstrap, Helm, and `kubectl` run on your LAN. See **`talos/README.md`** for workstation reachability (lab LAN/VPN), **`talosctl kubeconfig`** vs Kubernetes `server:` (VIP vs node IP), and **`--insecure`** only in maintenance.
 
@@ -111,6 +114,7 @@ Lab stack is **up** on-cluster through **Phase D**–**F** and **Phase G** (Vaul
 6. **Vault:** **Longhorn** default **StorageClass** before **`clusters/noble/bootstrap/vault/`** Helm (PVC **`data-vault-0`**); **External Secrets** **`ClusterSecretStore`** after Vault is initialized, unsealed, and **Kubernetes auth** is configured.
 7. **Headlamp:** **Traefik** + **cert-manager** (**`letsencrypt-prod`**) before exposing **`headlamp.apps.noble.lab.pcenicni.dev`**; treat as **cluster-admin** UI — protect with network policy / SSO when hardening (**Phase G**).
 8. **Renovate:** **Git remote** + platform access (**hosted app** needs org/repo install; **self-hosted** needs **`RENOVATE_TOKEN`** and chart **`renovate.config`**). If the bot runs **in-cluster**, add the token **after** **Sealed Secrets** / **Vault** (**Phase E**) — no ingress required for the bot itself.
+9. **Velero:** **S3-compatible** endpoint + bucket + **`velero/velero-cloud-credentials`** before **`ansible/playbooks/noble.yml`** with **`noble_velero_install: true`**; for **CSI** volume snapshots, label a **VolumeSnapshotClass** per **`clusters/noble/bootstrap/velero/README.md`** (e.g. Longhorn).
 
 ## Prerequisites (before phases)
 
@@ -170,7 +174,7 @@ Lab stack is **up** on-cluster through **Phase D**–**F** and **Phase G** (Vaul
 ## Phase F — Policy + backups
 
 - [x] **Kyverno** baseline policies — `clusters/noble/bootstrap/kyverno/` (Helm **kyverno** **3.7.1** + **kyverno-policies** **3.7.1**, **baseline** / **Audit** — see **`README.md`**)
-- [ ] **Velero** when S3 is ready; backup/restore drill
+- [ ] **Velero** — manifests + Ansible **`noble_velero`** (`clusters/noble/bootstrap/velero/`); enable with **`noble_velero_install: true`** + S3 bucket/URL + **`velero/velero-cloud-credentials`** (see **`velero/README.md`**); optional backup/restore drill
 
 ## Phase G — Hardening
 
@@ -197,6 +201,7 @@ Lab stack is **up** on-cluster through **Phase D**–**F** and **Phase G** (Vaul
 - [x] **`external-secrets`** — controller + webhook + cert-controller **Running** in **`external-secrets`**; apply **`ClusterSecretStore`** after Vault **Kubernetes auth**
 - [x] **`vault`** — **StatefulSet** **Running**, **`data-vault-0`** PVC **Bound** on **longhorn**; **`vault operator init`** + unseal per **`apps/vault/README.md`**
 - [x] **`kyverno`** — admission / background / cleanup / reports controllers **Running** in **`kyverno`**; **ClusterPolicies** for **PSS baseline** **Ready** (**Audit**)
+- [ ] **`velero`** — when enabled: Deployment **Running** in **`velero`**; **`BackupStorageLocation`** / **`VolumeSnapshotLocation`** **Available**; test backup per **`velero/README.md`**
 - [x] **Phase G (partial)** — Vault **`CiliumNetworkPolicy`**; **`talos/runbooks/`** (incl. **RBAC**); **Headlamp**/**Argo CD** RBAC tightened — **Alertmanager** receivers still optional
 
 ---
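
Follow-up note (not part of the patch): once the Phase F backup/restore drill passes, the `schedules: {}` stub in `clusters/noble/bootstrap/velero/values.yaml` can hold a recurring backup. A minimal sketch in the vmware-tanzu chart's `schedules` values format; the schedule name, cron expression, TTL, and namespace selection below are illustrative assumptions, not values from this change:

```yaml
# Hypothetical addition to clusters/noble/bootstrap/velero/values.yaml —
# one chart-managed Velero Schedule; adjust name/cron/ttl to taste.
schedules:
  daily-cluster:
    disabled: false
    schedule: "0 3 * * *"      # every day at 03:00
    template:
      ttl: "240h"              # keep backups for 10 days
      includedNamespaces:
        - "*"                  # whole cluster; narrow per-namespace if preferred
```

Verify afterwards with `kubectl -n velero get schedule` and confirm the resulting backups reach `Completed` in `velero backup get`.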