
Noble lab — Talos cluster build checklist

This document is the exported TODO for the noble Talos cluster (4 nodes). Commands and troubleshooting live in README.md.

Current state (2026-03-28)

Lab stack is up on-cluster through Phases D–F and Phase G (talos/runbooks/, SOPS-encrypted secrets in clusters/noble/secrets/). Next focus: optional Alertmanager receivers (Slack/PagerDuty); tighten RBAC (Headlamp / cluster-admin); Cilium policies for other namespaces as needed; enable Mend Renovate for PRs; Pangolin/sample Ingress; Velero backup/restore drill after S3 credentials are set (noble_velero_install).

  • Talos v1.12.6 (target) / Kubernetes as bundled — four nodes Ready unless upgrading; talosctl health; talos/kubeconfig is local only (gitignored — never commit; regenerate with talosctl kubeconfig per talos/README.md). Image Factory (nocloud installer): factory.talos.dev/nocloud-installer/249d9135de54962744e917cfe654117000cba369f9152fbab9d055a00aa3664f:v1.12.6
  • Cilium Helm 1.16.6 / app 1.16.6 (clusters/noble/bootstrap/cilium/, phase 1 values).
  • CSI Volume Snapshot — external-snapshotter v8.5.0 CRDs + registry.k8s.io/sig-storage/snapshot-controller (clusters/noble/bootstrap/csi-snapshot-controller/, Ansible noble_csi_snapshot_controller).
  • MetalLB Helm 0.15.3 / app v0.15.3; IPAddressPool noble-l2 + L2Advertisement — pool 192.168.50.210–192.168.50.229.
  • kube-vip DaemonSet 3/3 on control planes; VIP 192.168.50.230 on ens18 (vip_subnet /32 required — bare 32 breaks parsing). Verified from workstation: kubectl config set-cluster noble --server=https://192.168.50.230:6443 then kubectl get --raw /healthz → ok (talos/kubeconfig; see talos/README.md).
  • metrics-server Helm 3.13.0 / app v0.8.0 — clusters/noble/bootstrap/metrics-server/values.yaml (--kubelet-insecure-tls for Talos); kubectl top nodes works.
  • Longhorn Helm 1.11.1 / app v1.11.1 — clusters/noble/bootstrap/longhorn/ (PSA privileged namespace, defaultDataPath /var/mnt/longhorn, preUpgradeChecker enabled); StorageClass longhorn (default); nodes.longhorn.io all Ready; test PVC Bound on longhorn.
  • Traefik Helm 39.0.6 / app v3.6.11 — clusters/noble/bootstrap/traefik/; Service LoadBalancer EXTERNAL-IP 192.168.50.211; IngressClass traefik (default). Point *.apps.noble.lab.pcenicni.dev at 192.168.50.211. MetalLB pool verification was done before replacing the temporary nginx test with Traefik.
  • cert-manager Helm v1.20.0 / app v1.20.0 — clusters/noble/bootstrap/cert-manager/; ClusterIssuer letsencrypt-staging and letsencrypt-prod (DNS-01 via Cloudflare for pcenicni.dev, Secret cloudflare-dns-api-token in cert-manager); ACME email certificates@noble.lab.pcenicni.dev (edit in manifests if you want a different mailbox).
  • Newt Helm 1.2.0 / app 1.10.1 — clusters/noble/bootstrap/newt/ (fossorial/newt); Pangolin site tunnel — newt-pangolin-auth Secret (PANGOLIN_ENDPOINT, NEWT_ID, NEWT_SECRET). Store credentials in git with SOPS (clusters/noble/secrets/newt-pangolin-auth.secret.yaml, age-key.txt, .sops.yaml) — see clusters/noble/secrets/README.md. Public DNS is not automated with ExternalDNS: CNAME records at your DNS host per Pangolin's domain instructions, plus the Integration API for HTTP resources/targets — see clusters/noble/bootstrap/newt/README.md. LAN access to Traefik can still use *.apps.noble.lab.pcenicni.dev → 192.168.50.211 (split horizon / local resolver).
  • Argo CD Helm 9.4.17 / app v3.3.6 — clusters/noble/bootstrap/argocd/; argocd-server LoadBalancer 192.168.50.210; noble-root → clusters/noble/apps/; noble-bootstrap-root → clusters/noble/bootstrap (manual sync until argocd/README.md §5 after noble.yml). Edit repoURL in both root Application files before applying.
  • kube-prometheus-stack — Helm chart 82.15.1 — clusters/noble/bootstrap/kube-prometheus-stack/ (namespace monitoring, PSA privileged — node-exporter needs host mounts); Longhorn PVCs for Prometheus, Grafana, Alertmanager; node-exporter DaemonSet 4/4. Grafana Ingress: https://grafana.apps.noble.lab.pcenicni.dev (Traefik ingressClassName: traefik, cert-manager.io/cluster-issuer: letsencrypt-prod). Loki datasource in Grafana: ConfigMap clusters/noble/bootstrap/grafana-loki-datasource/loki-datasource.yaml (sidecar label grafana_datasource: "1") — not via grafana.additionalDataSources in the chart. helm upgrade --install with --wait is silent until done — use --timeout 30m; Grafana admin: Secret kube-prometheus-grafana, keys admin-user / admin-password.
  • Loki + Fluent Bit — grafana/loki 6.55.0 SingleBinary + filesystem on Longhorn (clusters/noble/bootstrap/loki/); loki.auth_enabled: false; chunksCache.enabled: false (no memcached chunk cache). fluent/fluent-bit 0.56.0 → loki-gateway.loki.svc:80 (clusters/noble/bootstrap/fluent-bit/); logging PSA privileged. Grafana Explore: kubectl apply -f clusters/noble/bootstrap/grafana-loki-datasource/loki-datasource.yaml then Explore → Loki (e.g. {job="fluent-bit"}).
  • SOPS — cluster Secret manifests under clusters/noble/secrets/ encrypted with age (see .sops.yaml, age-key.txt gitignored); noble.yml decrypt-applies when the private key is present.
  • Velero Helm 12.0.0 / app v1.18.0 — clusters/noble/bootstrap/velero/ (Ansible noble_velero, not Argo); S3-compatible backup location + CSI snapshots (EnableCSI); enable with noble_velero_install per velero/README.md.
  • Still open: Renovate — install Mend Renovate (or self-host) so PRs run; optional Alertmanager notification channels; optional sample Ingress + cert + Pangolin end-to-end; Argo CD SSO.

Inventory

| Host | Role | IP |
| --- | --- | --- |
| helium | worker | 192.168.50.10 |
| neon | control-plane + worker | 192.168.50.20 |
| argon | control-plane + worker | 192.168.50.30 |
| krypton | control-plane + worker | 192.168.50.40 |

Network reservations

| Use | Value |
| --- | --- |
| Kubernetes API VIP (kube-vip) | 192.168.50.230 (see talos/README.md; align with talos/talconfig.yaml additionalApiServerCertSans) |
| MetalLB L2 pool | 192.168.50.210–192.168.50.229 |
| Argo CD LoadBalancer | Pick one IP in the MetalLB pool (e.g. 192.168.50.210) |
| Traefik (apps ingress) | 192.168.50.211 — metallb.io/loadBalancerIPs in clusters/noble/bootstrap/traefik/values.yaml |
| Apps ingress (LAN / split horizon) | *.apps.noble.lab.pcenicni.dev → Traefik LB |
| Grafana (Ingress + TLS) | grafana.apps.noble.lab.pcenicni.dev — grafana.ingress in clusters/noble/bootstrap/kube-prometheus-stack/values.yaml (letsencrypt-prod) |
| Headlamp (Ingress + TLS) | headlamp.apps.noble.lab.pcenicni.dev — chart ingress in clusters/noble/bootstrap/headlamp/ (letsencrypt-prod, ingressClassName: traefik) |
| Public DNS (Pangolin) | Newt tunnel + CNAME at registrar + Integration API — clusters/noble/bootstrap/newt/ |
| Velero | S3-compatible endpoint + bucket — clusters/noble/bootstrap/velero/, ansible/playbooks/noble.yml (noble_velero_install) |
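The MetalLB reservation above maps onto two small manifests in clusters/noble/bootstrap/metallb/. A sketch of what ip-address-pool.yaml plausibly contains — the pool name noble-l2 and the address range are from this document; the field layout is standard MetalLB v1beta1:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: noble-l2
  namespace: metallb-system
spec:
  addresses:
    - 192.168.50.210-192.168.50.229   # keep outside the DHCP lease range
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: noble-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - noble-l2                        # announce only this pool via L2/ARP
```

Without the L2Advertisement, pool IPs are assigned to Services but never announced on the LAN.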

Versions

  • Talos: v1.12.6 — align talosctl client with node image
  • Talos Image Factory (iscsi-tools + util-linux-tools): factory.talos.dev/nocloud-installer/249d9135de54962744e917cfe654117000cba369f9152fbab9d055a00aa3664f:v1.12.6 — same schematic must appear in machine.install.image after talhelper genconfig (bare metal may use metal-installer/ instead of nocloud-installer/)
  • Kubernetes: 1.35.2 on current nodes (bundled with Talos; not pinned in repo)
  • Cilium: 1.16.6 (Helm chart; see clusters/noble/bootstrap/cilium/README.md)
  • MetalLB: 0.15.3 (Helm chart; app v0.15.3)
  • metrics-server: 3.13.0 (Helm chart; app v0.8.0)
  • Longhorn: 1.11.1 (Helm chart; app v1.11.1)
  • Traefik: 39.0.6 (Helm chart; app v3.6.11)
  • cert-manager: v1.20.0 (Helm chart; app v1.20.0)
  • Newt (Fossorial): 1.2.0 (Helm chart; app 1.10.1)
  • Argo CD: 9.4.17 (Helm chart argo/argo-cd; app v3.3.6)
  • kube-prometheus-stack: 82.15.1 (Helm chart prometheus-community/kube-prometheus-stack; app v0.89.x bundle)
  • Loki: 6.55.0 (Helm chart grafana/loki; app 3.6.7)
  • Fluent Bit: 0.56.0 (Helm chart fluent/fluent-bit; app 4.2.3)
  • Kyverno: 3.7.1 (Helm chart kyverno/kyverno; app v1.17.1); kyverno-policies 3.7.1 — baseline PSS, Audit (clusters/noble/bootstrap/kyverno/)
  • Headlamp: 0.40.1 (Helm chart headlamp/headlamp; app matches chart — see Artifact Hub)
  • Velero: 12.0.0 (Helm chart vmware-tanzu/velero; app v1.18.0) — clusters/noble/bootstrap/velero/; AWS plugin v1.14.0; Ansible noble_velero
  • Renovate: hosted (Mend Renovate GitHub/GitLab app — no cluster chart) or self-hosted — pin the chart when added (Helm chart via OCI: ghcr.io/renovatebot/charts/renovate); pair renovate.json with this repo's Helm paths under clusters/noble/

Repo paths (this workspace)

| Artifact | Path |
| --- | --- |
| This checklist | talos/CLUSTER-BUILD.md |
| Operational runbooks (API VIP, etcd, Longhorn, SOPS) | talos/runbooks/ |
| Talos quick start + networking + kubeconfig | talos/README.md |
| talhelper source (active) | talos/talconfig.yaml — may be wipe-phase (no Longhorn volume) during disk recovery |
| Longhorn volume restore | talos/talconfig.with-longhorn.yaml — copy to talconfig.yaml after GPT wipe (see talos/README.md §5) |
| Longhorn GPT wipe automation | talos/scripts/longhorn-gpt-recovery.sh |
| kube-vip (kustomize) | clusters/noble/bootstrap/kube-vip/ (vip_interface e.g. ens18) |
| Cilium (Helm values) | clusters/noble/bootstrap/cilium/values.yaml (phase 1), optional values-kpr.yaml, README.md |
| CSI Volume Snapshot (CRDs + controller) | clusters/noble/bootstrap/csi-snapshot-controller/crd/, controller/ kustomize; ansible/roles/noble_csi_snapshot_controller |
| MetalLB | clusters/noble/bootstrap/metallb/namespace.yaml (PSA privileged), ip-address-pool.yaml, kustomization.yaml, README.md |
| Longhorn | clusters/noble/bootstrap/longhorn/values.yaml, namespace.yaml (PSA privileged), kustomization.yaml |
| metrics-server (Helm values) | clusters/noble/bootstrap/metrics-server/values.yaml |
| Traefik (Helm values) | clusters/noble/bootstrap/traefik/values.yaml, namespace.yaml, README.md |
| cert-manager (Helm + ClusterIssuers) | clusters/noble/bootstrap/cert-manager/values.yaml, namespace.yaml, kustomization.yaml, README.md |
| Newt / Pangolin tunnel (Helm) | clusters/noble/bootstrap/newt/values.yaml, namespace.yaml, README.md |
| Argo CD (Helm) + app-of-apps | clusters/noble/bootstrap/argocd/values.yaml, root-application.yaml, bootstrap-root-application.yaml, app-of-apps/, README.md; noble-root syncs clusters/noble/apps/; noble-bootstrap-root syncs clusters/noble/bootstrap (enable automation after noble.yml) |
| kube-prometheus-stack (Helm values) | clusters/noble/bootstrap/kube-prometheus-stack/values.yaml, namespace.yaml |
| Grafana Loki datasource (ConfigMap; no chart change) | clusters/noble/bootstrap/grafana-loki-datasource/loki-datasource.yaml |
| Loki (Helm values) | clusters/noble/bootstrap/loki/values.yaml, namespace.yaml |
| Fluent Bit → Loki (Helm values) | clusters/noble/bootstrap/fluent-bit/values.yaml, namespace.yaml |
| SOPS-encrypted cluster Secrets | clusters/noble/secrets/README.md, *.secret.yaml; .sops.yaml, age-key.txt (gitignored) at repo root |
| Kyverno + PSS baseline policies | clusters/noble/bootstrap/kyverno/values.yaml, policies-values.yaml, namespace.yaml, README.md |
| Headlamp (Helm + Ingress) | clusters/noble/bootstrap/headlamp/values.yaml, namespace.yaml, README.md |
| Velero (Helm + S3 BSL; CSI snapshots) | clusters/noble/bootstrap/velero/values.yaml, namespace.yaml, README.md; ansible/roles/noble_velero |
| Renovate (repo config + optional self-hosted Helm) | renovate.json at repo root; optional self-hosted chart under clusters/noble/apps/ (Argo) + token Secret (SOPS under clusters/noble/secrets/ or imperative kubectl create secret) |

Git vs cluster: manifests and talconfig live in git; talhelper genconfig -o out, bootstrap, Helm, and kubectl run on your LAN. See talos/README.md for workstation reachability (lab LAN/VPN), talosctl kubeconfig vs Kubernetes server: (VIP vs node IP), and --insecure only in maintenance.

Ordering (do not skip)

  1. Talos installed; Cilium (or the chosen CNI) before most workloads — with cni: none, nodes stay NotReady with the network-unavailable taint until the CNI is up.
  2. MetalLB Helm chart (CRDs + controller) before kubectl apply -k on the pool manifests.
  3. Apply clusters/noble/bootstrap/metallb/namespace.yaml (or merge its labels onto metallb-system) before speaker starts so Pod Security does not block it (see bootstrap/metallb/README.md).
  4. CSI Volume snapshots: kubernetes-csi/external-snapshotter CRDs + snapshot-controller (clusters/noble/bootstrap/csi-snapshot-controller/) before relying on Longhorn / Velero volume snapshots.
  5. Longhorn: Talos user volume + extensions in talconfig.with-longhorn.yaml (when restored); Helm defaultDataPath in clusters/noble/bootstrap/longhorn/values.yaml.
  6. Loki → Fluent Bit → Grafana datasource: deploy Loki (loki-gateway Service) before Fluent Bit; apply clusters/noble/bootstrap/grafana-loki-datasource/loki-datasource.yaml after Loki (sidecar picks up the ConfigMap — no kube-prometheus values change for Loki).
  7. Headlamp: Traefik + cert-manager (letsencrypt-prod) before exposing headlamp.apps.noble.lab.pcenicni.dev; treat as cluster-admin UI — protect with network policy / SSO when hardening (Phase G).
  8. Renovate: Git remote + platform access (the hosted app needs an org/repo install; self-hosted needs RENOVATE_TOKEN and the chart's renovate.config value). If the bot runs in-cluster, store the token with SOPS or an imperative Secret — no ingress is required for the bot itself.
  9. Velero: S3-compatible endpoint + bucket + velero/velero-cloud-credentials before ansible/playbooks/noble.yml with noble_velero_install: true; for CSI volume snapshots, label a VolumeSnapshotClass per clusters/noble/bootstrap/velero/README.md (e.g. Longhorn).
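For step 9, Velero's CSI integration discovers the snapshot class by label. A hedged sketch of the VolumeSnapshotClass described in clusters/noble/bootstrap/velero/README.md — the object name is hypothetical; driver.longhorn.io and the velero.io/csi-volumesnapshot-class label are the standard Longhorn and Velero values:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-snapshot            # hypothetical name
  labels:
    velero.io/csi-volumesnapshot-class: "true"   # Velero selects the class via this label
driver: driver.longhorn.io           # Longhorn CSI driver
deletionPolicy: Delete
```

This requires the external-snapshotter CRDs + snapshot-controller from step 4 to already be in place.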

Prerequisites (before phases)

  • talos/talconfig.yaml checked in (VIP, API SANs, cni: none, iscsi-tools / util-linux-tools in schematic) — run talhelper validate talconfig talconfig.yaml after edits
  • Workstation on a routable path to node IPs or VIP (same LAN / VPN); talos/README.md §3 if kubectl hits wrong server: or network is unreachable
  • talosctl client matches node Talos version; talhelper for genconfig
  • Node static IPs (helium, neon, argon, krypton)
  • DHCP does not lease 192.168.50.210–229, .230, or node IPs
  • DNS for API and apps as in talos/README.md
  • Git remote ready for Argo CD (argo-cd)
  • talos/kubeconfig from talosctl kubeconfig — root repo kubeconfig is a stub until populated
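The prerequisites above map onto a talconfig.yaml roughly like the following — a trimmed sketch, not the checked-in file; hostnames, IPs, the VIP, and cni: none come from this document, while the key names follow talhelper's schema:

```yaml
clusterName: noble
talosVersion: v1.12.6
endpoint: https://192.168.50.230:6443   # API VIP (kube-vip)
additionalApiServerCertSans:
  - 192.168.50.230                      # VIP must be in the API server cert
cniConfig:
  name: none                            # Cilium is installed separately in Phase B
nodes:
  - hostname: neon
    ipAddress: 192.168.50.20
    controlPlane: true
  - hostname: helium
    ipAddress: 192.168.50.10
    controlPlane: false
```

Run talhelper validate talconfig talconfig.yaml after any edit, then regenerate with talhelper genconfig -o out.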

Phase A — Talos bootstrap + API VIP

  • Optional: Ansible runs the same steps — ansible/playbooks/talos_phase_a.yml (genconfig → apply → bootstrap → kubeconfig) or ansible/playbooks/deploy.yml (Phase A + noble.yml); see ansible/README.md.
  • talhelper gensecret → talhelper genconfig -o out (re-run genconfig after every talconfig edit)
  • apply-config all nodes (talos/README.md §2 — no --insecure after nodes join; use TALOSCONFIG)
  • talosctl bootstrap once; other control planes and workers join
  • talosctl kubeconfig → working kubectl (talos/README.md §3 — override server: if VIP not reachable from workstation)
  • kube-vip manifests in clusters/noble/bootstrap/kube-vip
  • kube-vip healthy; vip_interface matches uplink (talosctl get links); VIP reachable where needed
  • talosctl health (e.g. talosctl health -n 192.168.50.20 with TALOSCONFIG set)
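The kube-vip settings called out above (vip_interface, the /32 vip_subnet) live as container env vars in the DaemonSet under clusters/noble/bootstrap/kube-vip/. A fragment sketch — env var names follow kube-vip's ARP/control-plane convention; the values are from this checklist:

```yaml
# container env fragment of the kube-vip DaemonSet (sketch)
env:
  - name: vip_interface
    value: ens18              # must match the uplink (talosctl get links)
  - name: address
    value: 192.168.50.230     # API VIP
  - name: vip_subnet
    value: "/32"              # leading slash required — bare "32" breaks parsing
  - name: cp_enable
    value: "true"             # control-plane VIP mode
  - name: vip_arp
    value: "true"             # L2/ARP announcement
```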

Phase B — Core platform

Install order: Cilium → Volume Snapshot CRDs + snapshot-controller (clusters/noble/bootstrap/csi-snapshot-controller/, Ansible noble_csi_snapshot_controller) → metrics-server → Longhorn (Talos disk + Helm) → MetalLB (Helm → pool manifests) → ingress / certs / DNS as planned.

  • Cilium (Helm 1.16.6) — required before MetalLB if cni: none (clusters/noble/bootstrap/cilium/)
  • CSI Volume Snapshot — CRDs + snapshot-controller in kube-system (clusters/noble/bootstrap/csi-snapshot-controller/); Ansible noble_csi_snapshot_controller; verify kubectl api-resources | grep VolumeSnapshot
  • metrics-server — Helm 3.13.0; values in clusters/noble/bootstrap/metrics-server/values.yaml; verify kubectl top nodes
  • Longhorn — Talos: user volume + kubelet mounts + extensions (talos/README.md §5); Helm 1.11.1; kubectl apply -k clusters/noble/bootstrap/longhorn; verify nodes.longhorn.io and test PVC Bound
  • MetalLB — chart installed; pool + L2 from clusters/noble/bootstrap/metallb/ applied (192.168.50.210–229)
  • Service LoadBalancer / pool check — MetalLB assigns from .210–.229 (validated before Traefik; temporary nginx test removed in favor of Traefik)
  • Traefik LoadBalancer for *.apps.noble.lab.pcenicni.dev — clusters/noble/bootstrap/traefik/; 192.168.50.211
  • cert-manager + ClusterIssuer (letsencrypt-staging / letsencrypt-prod) — clusters/noble/bootstrap/cert-manager/
  • Newt (Pangolin tunnel; replaces ExternalDNS for public DNS) — clusters/noble/bootstrap/newt/ — newt-pangolin-auth Secret; CNAME + Integration API per newt/README.md
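The cert-manager step above implies ClusterIssuers shaped roughly like this — a sketch: the issuer name, Cloudflare Secret name, zone, and ACME email come from this document; the Secret key api-token is an assumption, so check the checked-in manifest:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: certificates@noble.lab.pcenicni.dev
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging          # ACME account key Secret
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-dns-api-token   # Secret in the cert-manager namespace
              key: api-token                   # assumed key name
        selector:
          dnsZones:
            - pcenicni.dev
```

letsencrypt-prod is the same shape with the production ACME directory URL.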

Phase C — GitOps

  • Argo CD bootstrap — clusters/noble/bootstrap/argocd/ (helm upgrade --install argocd …) — also covered by ansible/playbooks/noble.yml (role noble_argocd)
  • Argo CD server LoadBalancer — 192.168.50.210 (see values.yaml)
  • App-of-apps — optional; clusters/noble/apps/kustomization.yaml is empty (core stack is Ansible-managed from clusters/noble/bootstrap/, not Argo). Set repoURL in root-application.yaml and add Application manifests only for optional GitOps workloads — see clusters/noble/apps/README.md
  • Renovate — renovate.json at repo root (Renovate Kubernetes manager for clusters/noble/**/*.yaml image pins; grouped minor/patch PRs). Activate PRs: install Mend Renovate on the Git repo (Option A), or self-host the chart (Option B) with a token from SOPS or a one-off Secret. Helm chart versions pinned only in comments still need manual bumps or extra regex customManagers — extend renovate.json as needed.
  • SSO — later
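The app-of-apps entry point is an Application roughly like the following — a sketch of root-application.yaml, not the checked-in file; repoURL is a placeholder you must edit before applying, per the checklist:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: noble-root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.invalid/home-server.git  # placeholder — edit before applying
    targetRevision: main
    path: clusters/noble/apps        # noble-root syncs this directory
  destination:
    server: https://kubernetes.default.svc
  # syncPolicy.automated is enabled later, after noble.yml (argocd/README.md §5)
```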

Phase D — Observability

  • kube-prometheus-stack — kubectl apply -f clusters/noble/bootstrap/kube-prometheus-stack/namespace.yaml, then helm upgrade --install as in clusters/noble/bootstrap/kube-prometheus-stack/values.yaml (chart 82.15.1); PVCs on longhorn; --wait --timeout 30m recommended; verify kubectl -n monitoring get pods,pvc
  • Loki + Fluent Bit + Grafana Loki datasource — order: kubectl apply -f clusters/noble/bootstrap/loki/namespace.yaml → helm upgrade --install loki grafana/loki --version 6.55.0 -n loki -f clusters/noble/bootstrap/loki/values.yaml → kubectl apply -f clusters/noble/bootstrap/fluent-bit/namespace.yaml → helm upgrade --install fluent-bit fluent/fluent-bit --version 0.56.0 -n logging -f clusters/noble/bootstrap/fluent-bit/values.yaml → kubectl apply -f clusters/noble/bootstrap/grafana-loki-datasource/loki-datasource.yaml. Verify Explore → Loki in Grafana; kubectl -n loki get pods,pvc; kubectl -n logging get pods
  • Headlamp — Kubernetes web UI (Headlamp); helm repo add headlamp https://kubernetes-sigs.github.io/headlamp/ → kubectl apply -f clusters/noble/bootstrap/headlamp/namespace.yaml → helm upgrade --install headlamp headlamp/headlamp --version 0.40.1 -n headlamp -f clusters/noble/bootstrap/headlamp/values.yaml; Ingress https://headlamp.apps.noble.lab.pcenicni.dev (ingressClassName: traefik, cert-manager.io/cluster-issuer: letsencrypt-prod). values.yaml: config.sessionTTL: null works around the chart 0.40.1 / binary mismatch (headlamp#4883). RBAC: chart defaults are permissive — tighten before LAN-wide exposure; align with Phase G hardening.
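The sidecar pattern used for the Loki datasource looks like this — a sketch of clusters/noble/bootstrap/grafana-loki-datasource/loki-datasource.yaml; the grafana_datasource label and the gateway URL are from this document, while the ConfigMap name and namespace are assumptions:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-datasource           # assumed name
  namespace: monitoring           # assumed — must be a namespace the Grafana sidecar watches
  labels:
    grafana_datasource: "1"       # sidecar selector — this label triggers the import
data:
  loki-datasource.yaml: |
    apiVersion: 1
    datasources:
      - name: Loki
        type: loki
        access: proxy
        url: http://loki-gateway.loki.svc:80
```

Because the sidecar provisions it, no grafana.additionalDataSources change (and no Helm upgrade) is needed.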

Phase E — Secrets

  • SOPS — encrypt Secret YAML under clusters/noble/secrets/ with age (see .sops.yaml, clusters/noble/secrets/README.md); keep age-key.txt private (gitignored). ansible/playbooks/noble.yml decrypt-applies *.yaml when age-key.txt exists.
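The SOPS setup above typically means a repo-root .sops.yaml along these lines — a sketch; the age recipient is a placeholder, and path_regex matches the *.secret.yaml convention in clusters/noble/secrets/:

```yaml
creation_rules:
  - path_regex: clusters/noble/secrets/.*\.secret\.yaml$
    encrypted_regex: ^(data|stringData)$   # encrypt only Secret payloads; metadata stays reviewable
    age: age1examplepublicrecipient        # placeholder public key; private half lives in age-key.txt (gitignored)
```

With this in place, sops -e -i clusters/noble/secrets/foo.secret.yaml encrypts in git while noble.yml can decrypt-apply when age-key.txt is present.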

Phase F — Policy + backups

  • Kyverno baseline policies — clusters/noble/bootstrap/kyverno/ (Helm kyverno 3.7.1 + kyverno-policies 3.7.1, baseline / Audit — see README.md)
  • Velero — manifests + Ansible noble_velero (clusters/noble/bootstrap/velero/); enable with noble_velero_install: true + S3 bucket/URL + velero/velero-cloud-credentials (see velero/README.md); optional backup/restore drill
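The Velero configuration gated behind noble_velero_install comes down to a values fragment roughly like this — a sketch for chart 12.x; the bucket and endpoint are placeholders, while EnableCSI and the velero-cloud-credentials Secret name come from this document:

```yaml
configuration:
  features: EnableCSI                     # CSI volume snapshots via the snapshot-controller
  backupStorageLocation:
    - name: default
      provider: aws                       # S3-compatible store through the AWS plugin
      bucket: noble-velero                # placeholder bucket name
      config:
        region: us-east-1
        s3Url: https://s3.example.invalid # placeholder S3-compatible endpoint
        s3ForcePathStyle: "true"          # most non-AWS S3 endpoints need path-style
credentials:
  existingSecret: velero-cloud-credentials  # velero/velero-cloud-credentials per velero/README.md
```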

Phase G — Hardening

  • Runbooks — talos/runbooks/ (API VIP / kube-vip, etcd on Talos, Longhorn, SOPS)
  • RBAC — Headlamp ClusterRoleBinding uses the built-in edit role (not cluster-admin); Argo CD policy.default: role:readonly with g, admin, role:admin — see clusters/noble/bootstrap/headlamp/values.yaml, clusters/noble/bootstrap/argocd/values.yaml, talos/runbooks/rbac.md
  • Alertmanager — add slack_configs, pagerduty_configs, or other receivers under kube-prometheus-stack alertmanager.config (chart defaults route to the null receiver)
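A Slack receiver under alertmanager.config would look roughly like this — a sketch; the webhook URL and channel are placeholders, and the null route for the always-firing Watchdog alert mirrors the chart default:

```yaml
alertmanager:
  config:
    route:
      receiver: slack
      routes:
        - receiver: "null"               # keep Watchdog out of the real channel
          matchers:
            - alertname = "Watchdog"
    receivers:
      - name: "null"
      - name: slack
        slack_configs:
          - api_url: https://hooks.slack.com/services/PLACEHOLDER  # placeholder — store via SOPS, not plaintext
            channel: "#noble-alerts"                               # placeholder channel
            send_resolved: true
```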

Quick validation

  • kubectl get nodes — all Ready
  • API via VIP :6443 — kubectl get --raw /healthz → ok with kubeconfig server: https://192.168.50.230:6443
  • Ingress LoadBalancer in pool .210–.229 (Traefik — 192.168.50.211)
  • Argo CD UI — argocd-server LoadBalancer 192.168.50.210 (initial admin password from argocd-initial-admin-secret)
  • Renovate — renovate.json committed; enable Mend Renovate or a self-hosted bot for PRs
  • Sample Ingress + cert (cert-manager ready) + Pangolin resource + CNAME
  • PVC Bound on Longhorn (storageClassName: longhorn); Prometheus/Loki durable when configured
  • monitoring — kube-prometheus-stack core workloads Running (Prometheus, Grafana, Alertmanager, operator, kube-state-metrics, node-exporter); PVCs Bound on longhorn
  • loki — Loki SingleBinary + gateway Running; loki PVC Bound on longhorn (no chunks-cache by design)
  • logging — Fluent Bit DaemonSet Running on all nodes (logs → Loki)
  • Grafana — Loki datasource from the grafana-loki-datasource ConfigMap (Explore works after apply + sidecar sync)
  • Headlamp — Deployment Running in headlamp; UI at https://headlamp.apps.noble.lab.pcenicni.dev (TLS via letsencrypt-prod)
  • SOPS secrets — clusters/noble/secrets/*.yaml encrypted in git; noble.yml applies decrypted manifests when age-key.txt is present
  • kyverno — admission / background / cleanup / reports controllers Running in kyverno; ClusterPolicies for PSS baseline Ready (Audit)
  • velero — when enabled: Deployment Running in velero; BackupStorageLocation / VolumeSnapshotLocation Available; test backup per velero/README.md
  • Phase G (partial) — talos/runbooks/ (incl. RBAC); Headlamp/Argo CD RBAC tightened — Alertmanager receivers still optional

Keep in sync with talos/README.md and manifests under clusters/noble/.