Refactor the noble cluster configuration: remove the deprecated Argo CD application-management files and move to a streamlined Ansible-driven installation flow. Update kustomization.yaml files to reflect the new structure and make resource ownership explicit. Add namespaces and configurations for cert-manager, external-secrets, and logging components, and add a detailed README.md for each component to guide users through setup and management of the noble lab environment.
@@ -1,17 +0,0 @@
# Argo CD — app-of-apps children (optional GitOps only)

**Core platform is Ansible-managed** — see repository **`ansible/README.md`** and **`ansible/playbooks/noble.yml`**.

This directory’s **`kustomization.yaml`** has **`resources: []`** so **`noble-root`** (if applied) does not reconcile Helm charts or cluster add-ons. **Add `Application` manifests here only** for apps you want Argo to manage (for example, sample workloads or third-party charts not covered by the bootstrap playbook).

| Previous (removed) | Now |
|--------------------|-----|
| **`noble-kyverno`**, **`noble-kyverno-policies`**, **`noble-platform`** | Installed by Ansible roles **`noble_kyverno`**, **`noble_kyverno_policies`**, **`noble_platform`** |

If you previously synced **`noble-root`** with the old child manifests, delete stale Applications on the cluster:

```bash
kubectl delete application -n argocd noble-platform noble-kyverno noble-kyverno-policies --ignore-not-found
```

Then re-apply **`root-application.yaml`** so Argo matches this repo.
@@ -1,6 +0,0 @@
# Intentionally empty: core platform (CNI, ingress, storage, observability, policy, etc.) is
# installed by ansible/playbooks/noble.yml — not by Argo CD. Add optional Application
# manifests here only for workloads you want GitOps-managed.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources: []
clusters/noble/bootstrap/cert-manager/README.md — new file (53 lines)
@@ -0,0 +1,53 @@
# cert-manager — noble

**Prerequisites:** **Traefik** (ingress class **`traefik`**), DNS for **`*.apps.noble.lab.pcenicni.dev`** → Traefik LB for app traffic.

**ACME (Let’s Encrypt)** uses **DNS-01** via **Cloudflare** for zone **`pcenicni.dev`**. Create an API token with **Zone → DNS → Edit** and **Zone → Zone → Read** (or use the “Edit zone DNS” template), then:

**Option A — Ansible:** copy **`.env.sample`** to **`.env`** in the repo root, set **`CLOUDFLARE_DNS_API_TOKEN`**, and run **`ansible/playbooks/noble.yml`** (or **`deploy.yml`**). The **cert-manager** role creates **`cloudflare-dns-api-token`** from `.env` after the chart installs.

**Option B — kubectl:**

```bash
kubectl -n cert-manager create secret generic cloudflare-dns-api-token \
  --from-literal=api-token='YOUR_CLOUDFLARE_API_TOKEN' \
  --dry-run=client -o yaml | kubectl apply -f -
```

Without this Secret, the ClusterIssuers cannot complete certificate orders.

1. Create the namespace:

```bash
kubectl apply -f clusters/noble/apps/cert-manager/namespace.yaml
```

2. Install the chart (CRDs included via `values.yaml`):

```bash
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v1.20.0 \
  -f clusters/noble/apps/cert-manager/values.yaml \
  --wait
```

3. Optionally edit **`spec.acme.email`** in both ClusterIssuer manifests (default **`certificates@noble.lab.pcenicni.dev`**) — Let’s Encrypt uses this address for expiry and account notices. Do **not** use **`example.com`** (ACME rejects it).

4. Apply the ClusterIssuers (staging then prod, or both):

```bash
kubectl apply -k clusters/noble/apps/cert-manager
```

5. Confirm:

```bash
kubectl get clusterissuer
```

Use **`cert-manager.io/cluster-issuer: letsencrypt-staging`** on Ingresses while testing; switch to **`letsencrypt-prod`** when ready.

**HTTP-01** is not configured: if the hostname is **proxied** (orange cloud) in Cloudflare, Let’s Encrypt may hit Cloudflare’s edge and get a **404** for `/.well-known/acme-challenge/`. DNS-01 avoids that.
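The annotation above can be exercised with a minimal Ingress. This is a sketch, not a manifest from this repo: the name `demo`, its Service, and the TLS Secret name are illustrative assumptions.

```yaml
# Hypothetical app Ingress — `demo` and `demo-apps-noble-tls` are example names only.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo
  namespace: default
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-staging  # switch to letsencrypt-prod when ready
spec:
  ingressClassName: traefik
  rules:
    - host: demo.apps.noble.lab.pcenicni.dev
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo
                port:
                  number: 80
  tls:
    - hosts:
        - demo.apps.noble.lab.pcenicni.dev
      secretName: demo-apps-noble-tls  # cert-manager stores the issued certificate here
```

cert-manager watches the annotation, creates a Certificate for the `tls` hosts, and fills the named Secret once the DNS-01 order completes.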
@@ -0,0 +1,23 @@
# Let's Encrypt production — trusted certificates; respect rate limits.
# Prefer a real mailbox for expiry notices; this domain is accepted by LE (edit if needed).
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: certificates@noble.lab.pcenicni.dev
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      # DNS-01 — works even when public HTTP to Traefik is broken (e.g. a hostname proxied through
      # Cloudflare returns 404 for /.well-known/acme-challenge). Requires Secret
      # cloudflare-dns-api-token in the cert-manager namespace.
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-dns-api-token
              key: api-token
        selector:
          dnsZones:
            - pcenicni.dev
@@ -0,0 +1,21 @@
# Let's Encrypt staging — use for tests (untrusted issuer in browsers).
# Prefer a real mailbox for expiry notices; this domain is accepted by LE (edit if needed).
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: certificates@noble.lab.pcenicni.dev
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-dns-api-token
              key: api-token
        selector:
          dnsZones:
            - pcenicni.dev
clusters/noble/bootstrap/cert-manager/kustomization.yaml — new file (5 lines)
@@ -0,0 +1,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - clusterissuer-letsencrypt-staging.yaml
  - clusterissuer-letsencrypt-prod.yaml
clusters/noble/bootstrap/cert-manager/namespace.yaml — new file (9 lines)
@@ -0,0 +1,9 @@
# cert-manager controller + webhook — noble lab
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/audit: baseline
    pod-security.kubernetes.io/warn: baseline
clusters/noble/bootstrap/cert-manager/values.yaml — new file (14 lines)
@@ -0,0 +1,14 @@
# cert-manager — noble lab
#
# Chart: jetstack/cert-manager — pin the version on the helm command (e.g. v1.20.0).
#
#   kubectl apply -f clusters/noble/apps/cert-manager/namespace.yaml
#   helm repo add jetstack https://charts.jetstack.io
#   helm repo update
#   helm upgrade --install cert-manager jetstack/cert-manager -n cert-manager \
#     --version v1.20.0 -f clusters/noble/apps/cert-manager/values.yaml --wait
#
#   kubectl apply -k clusters/noble/apps/cert-manager

crds:
  enabled: true
clusters/noble/bootstrap/cilium/README.md — new file (34 lines)
@@ -0,0 +1,34 @@
# Cilium — noble (Talos)

Talos uses **`cluster.network.cni.name: none`**; you must install Cilium (or another CNI) before nodes become **Ready** and before **MetalLB** / most workloads. See `talos/CLUSTER-BUILD.md` for ordering.

## 1. Install (phase 1 — required)

Uses **`values.yaml`**: IPAM **kubernetes**; **`k8sServiceHost` / `k8sServicePort`** pointing at **KubePrism** (`127.0.0.1:7445`, the Talos default); Talos cgroup paths; **drop `SYS_MODULE`** from agent capabilities; **`bpf.masquerade: false`** ([Talos Cilium](https://www.talos.dev/latest/kubernetes-guides/network/deploying-cilium/), [KubePrism](https://www.talos.dev/latest/kubernetes-guides/configuration/kubeprism/)). Without this, host-network CNI clients may **`dial tcp <VIP>:6443`** and fail if the VIP path is unhealthy.

From the **repository root**:

```bash
helm repo add cilium https://helm.cilium.io/
helm repo update
helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --version 1.16.6 \
  -f clusters/noble/apps/cilium/values.yaml \
  --wait
```

Verify:

```bash
kubectl -n kube-system rollout status ds/cilium
kubectl get nodes
```

When nodes are **Ready**, continue with **MetalLB** (`clusters/noble/apps/metallb/README.md`) and other Phase B items. **kube-vip** for the Kubernetes API VIP is separate (L2 ARP); it can run after the API is reachable.

## 2. Optional: kube-proxy replacement (phase 2)

To replace **`kube-proxy`** with Cilium entirely, use **`values-kpr.yaml`** and set **`cluster.proxy.disabled: true`** in Talos on every node (see the comments inside `values-kpr.yaml`). Follow the upstream [Deploy Cilium CNI](https://www.talos.dev/latest/kubernetes-guides/network/deploying-cilium/) section *without kube-proxy*.

Do **not** skip phase 1 unless you already know your cluster matches the “bootstrap window” flow from the Talos docs.
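For a deeper health view than `kubectl get nodes`, the agent’s own status command can be run inside any Cilium pod. A sketch — it assumes the DaemonSet name `cilium` and the agent container name `cilium-agent` as created by the chart install above:

```bash
# Exec into one agent pod via the DaemonSet and ask Cilium for its own health summary.
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium status --brief

# Per-node connectivity detail, if you want more than the one-line summary.
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium-health status
```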
clusters/noble/bootstrap/cilium/values-kpr.yaml — new file (49 lines)
@@ -0,0 +1,49 @@
# Optional phase 2: kube-proxy replacement via Cilium + KubePrism (KubePrism on each Talos node
# forwards 127.0.0.1:7445 → healthy kube-apiservers on :6443).
# Prerequisites:
#   1. Phase 1 Cilium installed and healthy; nodes Ready.
#   2. Add to the Talos machine config on ALL nodes:
#        cluster:
#          proxy:
#            disabled: true
#      (keep cluster.network.cni.name: none). Regenerate, apply-config, and reboot as needed.
#   3. Remove legacy kube-proxy objects if still present:
#        kubectl delete ds -n kube-system kube-proxy --ignore-not-found
#        kubectl delete cm -n kube-system kube-proxy --ignore-not-found
#   4. helm upgrade cilium ... -f values-kpr.yaml
#
# Ref: https://www.talos.dev/latest/kubernetes-guides/network/deploying-cilium/

ipam:
  mode: kubernetes

kubeProxyReplacement: "true"

k8sServiceHost: localhost
k8sServicePort: "7445"

securityContext:
  capabilities:
    ciliumAgent:
      - CHOWN
      - KILL
      - NET_ADMIN
      - NET_RAW
      - IPC_LOCK
      - SYS_ADMIN
      - SYS_RESOURCE
      - DAC_OVERRIDE
      - FOWNER
      - SETGID
      - SETUID
    cleanCiliumState:
      - NET_ADMIN
      - SYS_ADMIN
      - SYS_RESOURCE

cgroup:
  autoMount:
    enabled: false
  hostRoot: /sys/fs/cgroup

bpf:
  masquerade: false
clusters/noble/bootstrap/cilium/values.yaml — new file (44 lines)
@@ -0,0 +1,44 @@
# Cilium on Talos — phase 1: bring up CNI while kube-proxy still runs.
# See README.md for install order (before MetalLB scheduling) and optional kube-proxy replacement.
#
# Chart: cilium/cilium — pin version in helm command (e.g. 1.16.6).
# Ref: https://www.talos.dev/latest/kubernetes-guides/network/deploying-cilium/

ipam:
  mode: kubernetes

kubeProxyReplacement: "false"

# Host-network components cannot use kubernetes.default ClusterIP; Talos KubePrism (enabled by default)
# on 127.0.0.1:7445 proxies to healthy apiservers and avoids flaky dials to cluster.controlPlane.endpoint (VIP).
# Ref: https://www.talos.dev/latest/kubernetes-guides/configuration/kubeprism/
k8sServiceHost: "127.0.0.1"
k8sServicePort: "7445"

securityContext:
  capabilities:
    ciliumAgent:
      - CHOWN
      - KILL
      - NET_ADMIN
      - NET_RAW
      - IPC_LOCK
      - SYS_ADMIN
      - SYS_RESOURCE
      - DAC_OVERRIDE
      - FOWNER
      - SETGID
      - SETUID
    cleanCiliumState:
      - NET_ADMIN
      - SYS_ADMIN
      - SYS_RESOURCE

cgroup:
  autoMount:
    enabled: false
  hostRoot: /sys/fs/cgroup

# Workaround: Talos host DNS forwarding + bpf masquerade can break CoreDNS; see Talos Cilium guide "Known issues".
bpf:
  masquerade: false
clusters/noble/bootstrap/external-secrets/README.md — new file (60 lines)
@@ -0,0 +1,60 @@
# External Secrets Operator (noble)

Syncs secrets from external systems into Kubernetes **Secret** objects via the **ExternalSecret** / **ClusterExternalSecret** CRDs.

- **Chart:** `external-secrets/external-secrets` **2.2.0** (app **v2.2.0**)
- **Namespace:** `external-secrets`
- **Helm release name:** `external-secrets` (matches the operator **ServiceAccount** name `external-secrets`)

## Install

```bash
helm repo add external-secrets https://charts.external-secrets.io
helm repo update
kubectl apply -f clusters/noble/apps/external-secrets/namespace.yaml
helm upgrade --install external-secrets external-secrets/external-secrets -n external-secrets \
  --version 2.2.0 -f clusters/noble/apps/external-secrets/values.yaml --wait
```

Verify:

```bash
kubectl -n external-secrets get deploy,pods
kubectl get crd | grep external-secrets
```

## Vault `ClusterSecretStore` (after Vault is deployed)

The checklist expects a **Vault**-backed store. Install Vault first (`talos/CLUSTER-BUILD.md` Phase E — Vault on Longhorn + auto-unseal), then:

1. Enable the **KV v2** secrets engine and **Kubernetes** auth in Vault; create a **role** (e.g. `external-secrets`) that maps the cluster’s **`external-secrets` / `external-secrets`** service account to a policy that can read the paths you need.
2. Copy **`examples/vault-cluster-secret-store.yaml`** and set **`spec.provider.vault.server`** to your Vault URL. This repo’s Vault Helm values use **HTTP** on port **8200** (`global.tlsDisable: true`): **`http://vault.vault.svc.cluster.local:8200`**. Use **`https://`** if you enable TLS on the Vault listener.
3. If Vault uses a **private TLS CA**, configure **`caProvider`** or **`caBundle`** on the Vault provider — see [HashiCorp Vault provider](https://external-secrets.io/latest/provider/hashicorp-vault/). Do not commit private CA material to public git unless intended.
4. Apply: **`kubectl apply -f …/vault-cluster-secret-store.yaml`**
5. Confirm the store is ready: **`kubectl describe clustersecretstore vault`**

Example **ExternalSecret** (after the store is healthy):

```yaml
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: demo
  namespace: default
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault
    kind: ClusterSecretStore
  target:
    name: demo-synced
  data:
    - secretKey: password
      remoteRef:
        key: secret/data/myapp
        property: password
```

## Upgrades

Pin the chart version in the `values.yaml` header comments; run the same **`helm upgrade --install`** with the new **`--version`** after reviewing the [release notes](https://github.com/external-secrets/external-secrets/releases).
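The Vault side of step 1 might look like the following sketch, run with the `vault` CLI against your Vault server. The policy name `eso-read`, the API host, and the TTL are illustrative assumptions; the role and service-account names match this repo’s install.

```bash
# Enable Kubernetes auth and point it at this cluster's API (adjust for your cluster).
vault auth enable kubernetes
vault write auth/kubernetes/config \
  kubernetes_host="https://kubernetes.default.svc"

# Policy letting ESO read the KV v2 mount "secret" (policy name is an example).
vault policy write eso-read - <<'EOF'
path "secret/data/*" {
  capabilities = ["read"]
}
EOF

# Role binding the external-secrets ServiceAccount to that policy.
vault write auth/kubernetes/role/external-secrets \
  bound_service_account_names=external-secrets \
  bound_service_account_namespaces=external-secrets \
  policies=eso-read \
  ttl=1h
```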
@@ -0,0 +1,31 @@
# ClusterSecretStore for HashiCorp Vault (KV v2) using Kubernetes auth.
#
# Do not apply until Vault is running, reachable from the cluster, and configured with:
#   - Kubernetes auth at mountPath (default: kubernetes)
#   - A role (below: external-secrets) bound to this service account:
#       name: external-secrets
#       namespace: external-secrets
#   - A policy allowing read on the KV path used below (e.g. secret/data/* for path "secret")
#
# Adjust server, mountPath, role, and path to match your Vault deployment. If Vault uses TLS
# with a private CA, set provider.vault.caProvider or caBundle (see README).
#
#   kubectl apply -f clusters/noble/apps/external-secrets/examples/vault-cluster-secret-store.yaml
---
apiVersion: external-secrets.io/v1
kind: ClusterSecretStore
metadata:
  name: vault
spec:
  provider:
    vault:
      server: "http://vault.vault.svc.cluster.local:8200"
      path: secret
      version: v2
      auth:
        kubernetes:
          mountPath: kubernetes
          role: external-secrets
          serviceAccountRef:
            name: external-secrets
            namespace: external-secrets
clusters/noble/bootstrap/external-secrets/namespace.yaml — new file (5 lines)
@@ -0,0 +1,5 @@
# External Secrets Operator — apply before Helm.
apiVersion: v1
kind: Namespace
metadata:
  name: external-secrets
clusters/noble/bootstrap/external-secrets/values.yaml — new file (10 lines)
@@ -0,0 +1,10 @@
# External Secrets Operator — noble
#
#   helm repo add external-secrets https://charts.external-secrets.io
#   helm repo update
#   kubectl apply -f clusters/noble/apps/external-secrets/namespace.yaml
#   helm upgrade --install external-secrets external-secrets/external-secrets -n external-secrets \
#     --version 2.2.0 -f clusters/noble/apps/external-secrets/values.yaml --wait
#
# CRDs are installed by the chart (installCRDs: true). Vault ClusterSecretStore: see README + examples/.
commonLabels: {}
clusters/noble/bootstrap/fluent-bit/namespace.yaml — new file (10 lines)
@@ -0,0 +1,10 @@
# Fluent Bit (tail container logs → Loki) — apply before Helm.
# HostPath mounts under /var/log require PSA privileged (same idea as monitoring/node-exporter).
apiVersion: v1
kind: Namespace
metadata:
  name: logging
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
clusters/noble/bootstrap/fluent-bit/values.yaml — new file (40 lines)
@@ -0,0 +1,40 @@
# Fluent Bit — noble lab (DaemonSet; ship Kubernetes container logs to the Loki gateway).
#
# Chart: fluent/fluent-bit — pin version on install (e.g. 0.56.0).
# Install **after** Loki so `loki-gateway.loki.svc` exists.
#
# Talos: only **tail** /var/log/containers (no host **systemd** input — the journal layout differs from typical Linux).
#
#   kubectl apply -f clusters/noble/apps/fluent-bit/namespace.yaml
#   helm repo add fluent https://fluent.github.io/helm-charts
#   helm repo update
#   helm upgrade --install fluent-bit fluent/fluent-bit -n logging \
#     --version 0.56.0 -f clusters/noble/apps/fluent-bit/values.yaml --wait --timeout 15m

config:
  inputs: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        multiline.parser  docker, cri
        Tag               kube.*
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On

  filters: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Merge_Log           On
        Keep_Log            Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On

  outputs: |
    [OUTPUT]
        Name   loki
        Match  kube.*
        Host   loki-gateway.loki.svc.cluster.local
        Port   80
        tls    Off
        labels job=fluent-bit
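After install, delivery can be sanity-checked from the shipper side; the `job=fluent-bit` label set in the OUTPUT section is what to query for in Loki. A sketch:

```bash
# Confirm the DaemonSet is up, then scan its own output for Loki delivery errors.
kubectl -n logging rollout status ds/fluent-bit
kubectl -n logging logs ds/fluent-bit --tail=50 | grep -iE 'error|retry' || echo "no delivery errors in recent logs"
```

On the query side, the LogQL expression `{job="fluent-bit"}` in Grafana Explore (Loki datasource) should return recent container log lines.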
@@ -0,0 +1,27 @@
# Extra Grafana datasource — apply to **monitoring** (same namespace as the kube-prometheus Grafana).
# The Grafana sidecar watches ConfigMaps labeled **grafana_datasource: "1"** and loads YAML keys as files.
# Does not require editing the kube-prometheus-stack Helm release.
#
#   kubectl apply -f clusters/noble/apps/grafana-loki-datasource/loki-datasource.yaml
#
# Remove with: kubectl delete -f clusters/noble/apps/grafana-loki-datasource/loki-datasource.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasource-loki
  namespace: monitoring
  labels:
    grafana_datasource: "1"
data:
  loki.yaml: |
    apiVersion: 1
    datasources:
      - name: Loki
        type: loki
        uid: loki
        access: proxy
        url: http://loki-gateway.loki.svc.cluster.local:80
        isDefault: false
        editable: false
        jsonData:
          maxLines: 1000
clusters/noble/bootstrap/headlamp/README.md — new file (35 lines)
@@ -0,0 +1,35 @@
# Headlamp (noble)

[Headlamp](https://headlamp.dev/) web UI for the cluster. Exposed at **`https://headlamp.apps.noble.lab.pcenicni.dev`** via **Traefik** + **cert-manager** (`letsencrypt-prod`), following the same pattern as Grafana.

- **Chart:** `headlamp/headlamp` **0.40.1** (`config.sessionTTL: null` avoids a chart/binary mismatch — [issue #4883](https://github.com/kubernetes-sigs/headlamp/issues/4883))
- **Namespace:** `headlamp`

## Install

```bash
helm repo add headlamp https://kubernetes-sigs.github.io/headlamp/
helm repo update
kubectl apply -f clusters/noble/apps/headlamp/namespace.yaml
helm upgrade --install headlamp headlamp/headlamp -n headlamp \
  --version 0.40.1 -f clusters/noble/apps/headlamp/values.yaml --wait --timeout 10m
```

Sign-in uses a **ServiceAccount token** (per the Headlamp docs: create a limited SA for day-to-day use). This repo binds the Headlamp workload SA to the built-in **`edit`** ClusterRole (**`clusterRoleBinding.clusterRoleName: edit`** in **`values.yaml`**) — not **`cluster-admin`**. For cluster-scoped admin work, use **`kubectl`** with your admin kubeconfig. Optional **OIDC** in **`config.oidc`** replaces token login for SSO.

## Sign-in token (ServiceAccount `headlamp`)

Use a short-lived token (Kubernetes **1.24+**; requires permission to create **TokenRequests**):

```bash
export KUBECONFIG=/path/to/talos/kubeconfig   # or your admin kubeconfig
kubectl -n headlamp create token headlamp --duration=48h
```

Paste the printed JWT into Headlamp’s token field at **`https://headlamp.apps.noble.lab.pcenicni.dev`**.

For another duration (the cluster’s TokenRequest / admission limits may cap it):

```bash
kubectl -n headlamp create token headlamp --duration=8760h
```
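Because the server may cap the requested duration, it can be worth checking what expiry the token actually carries. A small sketch (not from this repo) that decodes the `exp` claim from a JWT using only coreutils — base64url payloads may lack `=` padding, so it pads before decoding:

```shell
# Print the `exp` (Unix time) claim of a JWT's payload segment.
token_expiry() {
  # Second dot-separated segment, converted from base64url to base64.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore '=' padding to a multiple of 4.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d | sed -n 's/.*"exp":\([0-9]*\).*/\1/p'
}

# Example with a token from `kubectl -n headlamp create token headlamp`:
#   date -d "@$(token_expiry "$TOKEN")"   # GNU date; prints the expiry timestamp
```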
clusters/noble/bootstrap/headlamp/namespace.yaml — new file (10 lines)
@@ -0,0 +1,10 @@
# Headlamp — apply before Helm.
# Chart pods do not satisfy PSA "restricted" (see install warnings); align with other UIs.
apiVersion: v1
kind: Namespace
metadata:
  name: headlamp
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
clusters/noble/bootstrap/headlamp/values.yaml — new file (37 lines)
@@ -0,0 +1,37 @@
# Headlamp — noble (Kubernetes web UI)
#
#   helm repo add headlamp https://kubernetes-sigs.github.io/headlamp/
#   helm repo update
#   kubectl apply -f clusters/noble/apps/headlamp/namespace.yaml
#   helm upgrade --install headlamp headlamp/headlamp -n headlamp \
#     --version 0.40.1 -f clusters/noble/apps/headlamp/values.yaml --wait --timeout 10m
#
# DNS: headlamp.apps.noble.lab.pcenicni.dev → Traefik LB (see talos/CLUSTER-BUILD.md).
# Default chart RBAC is broad — restrict for production (Phase G).
# Bind Headlamp’s ServiceAccount to the built-in **edit** ClusterRole (not **cluster-admin**).
# For break-glass cluster-admin, use kubectl with your admin kubeconfig — not Headlamp.
# If changing **clusterRoleName** on an existing install, Kubernetes forbids mutating **roleRef**:
#   kubectl delete clusterrolebinding headlamp-admin
#   helm upgrade … (same command as in the header comments)
clusterRoleBinding:
  clusterRoleName: edit

# Chart 0.40.1 passes -session-ttl but the v0.40.1 binary does not define it — omit the flag:
# https://github.com/kubernetes-sigs/headlamp/issues/4883
config:
  sessionTTL: null

ingress:
  enabled: true
  ingressClassName: traefik
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: headlamp.apps.noble.lab.pcenicni.dev
      paths:
        - path: /
          type: Prefix
  tls:
    - secretName: headlamp-apps-noble-tls
      hosts:
        - headlamp.apps.noble.lab.pcenicni.dev
@@ -0,0 +1,11 @@
# kube-prometheus-stack — apply before Helm (omit --create-namespace on install).
# prometheus-node-exporter uses hostNetwork, hostPID, and hostPath (/proc, /sys, /) — incompatible
# with PSA "baseline"; use "privileged" (same idea as longhorn-system / metallb-system).
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
clusters/noble/bootstrap/kube-prometheus-stack/values.yaml — new file (112 lines)
@@ -0,0 +1,112 @@
# kube-prometheus-stack — noble lab (Prometheus Operator + Grafana + Alertmanager + exporters)
#
# Chart: prometheus-community/kube-prometheus-stack — pin version on install (e.g. 82.15.1).
#
# Install (use one terminal; chain with && so `helm upgrade` always runs after `helm repo update`):
#
#   kubectl apply -f clusters/noble/apps/kube-prometheus-stack/namespace.yaml
#   helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
#   helm repo update && helm upgrade --install kube-prometheus prometheus-community/kube-prometheus-stack -n monitoring \
#     --version 82.15.1 -f clusters/noble/apps/kube-prometheus-stack/values.yaml --wait --timeout 30m
#
# Why it looks "stalled": with --wait, Helm prints almost nothing until the release finishes (can be many minutes).
# Do not use --timeout 5m for a first install — Longhorn PVCs + StatefulSets often need 15–30m. To watch progress,
# open a second terminal: kubectl -n monitoring get pods,sts,ds -w
# To apply manifest changes without blocking: omit --wait, then kubectl -n monitoring get pods -w
#
# Grafana admin password: Secret `kube-prometheus-grafana`, keys `admin-user` / `admin-password`, unless you set grafana.adminPassword.

# Use cert-manager for admission webhook TLS instead of Helm pre-hook Jobs (patch/create Secret).
# Those Jobs are validated by Kyverno before `kyverno-svc` exists during a single Argo sync, which fails.
# Requires cert-manager CRDs (bootstrap before this chart).
prometheusOperator:
  admissionWebhooks:
    certManager:
      enabled: true

# --- Longhorn-backed persistence (default chart storage is emptyDir) ---
alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: longhorn
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi
  ingress:
    enabled: true
    ingressClassName: traefik
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    hosts:
      - alertmanager.apps.noble.lab.pcenicni.dev
    paths:
      - /
    pathType: Prefix
    tls:
      - secretName: alertmanager-apps-noble-tls
        hosts:
          - alertmanager.apps.noble.lab.pcenicni.dev

prometheus:
  prometheusSpec:
    retention: 15d
    retentionSize: 25GB
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: longhorn
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 30Gi
  ingress:
    enabled: true
    ingressClassName: traefik
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    hosts:
      - prometheus.apps.noble.lab.pcenicni.dev
    paths:
      - /
    pathType: Prefix
    tls:
      - secretName: prometheus-apps-noble-tls
        hosts:
          - prometheus.apps.noble.lab.pcenicni.dev

grafana:
  persistence:
    enabled: true
    type: sts
    storageClassName: longhorn
    accessModes:
      - ReadWriteOnce
    size: 10Gi

  # HTTPS via Traefik + cert-manager (ClusterIssuer letsencrypt-prod; same pattern as other *.apps.noble.lab.pcenicni.dev hosts).
  # DNS: grafana.apps.noble.lab.pcenicni.dev → Traefik LoadBalancer (192.168.50.211) — see clusters/noble/apps/traefik/values.yaml
  ingress:
    enabled: true
    ingressClassName: traefik
    path: /
    pathType: Prefix
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    hosts:
      - grafana.apps.noble.lab.pcenicni.dev
    tls:
      - secretName: grafana-apps-noble-tls
        hosts:
          - grafana.apps.noble.lab.pcenicni.dev

  grafana.ini:
    server:
      domain: grafana.apps.noble.lab.pcenicni.dev
      root_url: https://grafana.apps.noble.lab.pcenicni.dev/
      # Traefik sets X-Forwarded-*; required for correct redirects and cookies behind the ingress.
      use_proxy_headers: true

# Loki datasource: apply `clusters/noble/apps/grafana-loki-datasource/loki-datasource.yaml` (sidecar ConfigMap) instead of additionalDataSources here.
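To log in the first time, the generated admin credentials can be read back from the Secret named in the header comment above. A sketch, assuming the release name `kube-prometheus`:

```bash
# Grafana admin credentials generated by the chart (Secret keys per the header comment).
kubectl -n monitoring get secret kube-prometheus-grafana \
  -o jsonpath='{.data.admin-user}' | base64 -d; echo
kubectl -n monitoring get secret kube-prometheus-grafana \
  -o jsonpath='{.data.admin-password}' | base64 -d; echo
```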
clusters/noble/bootstrap/kube-vip/kustomization.yaml — new file (5 lines)
@@ -0,0 +1,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - vip-rbac.yaml
  - vip-daemonset.yaml
clusters/noble/bootstrap/kube-vip/vip-daemonset.yaml — new file (83 lines)
@@ -0,0 +1,83 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-vip-ds
  namespace: kube-system
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 0
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-vip-ds
  template:
    metadata:
      labels:
        app.kubernetes.io/name: kube-vip-ds
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      priorityClassName: system-node-critical
      terminationGracePeriodSeconds: 90
      serviceAccountName: kube-vip
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
        - operator: Exists
          effect: NoExecute
      containers:
        - name: kube-vip
          image: ghcr.io/kube-vip/kube-vip:v0.8.3
          imagePullPolicy: IfNotPresent
          args:
            - manager
          env:
            # Leader-election identity must be the Kubernetes node name (the hostNetwork
            # hostname is not always the same; without this, no leader → no VIP).
            - name: vip_nodename
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: vip_arp
              value: "true"
            - name: address
              value: "192.168.50.230"
            - name: port
              value: "6443"
            # Physical uplink from `talosctl -n <cp-ip> get links` (this cluster: ens18).
            - name: vip_interface
              value: "ens18"
            # Must include "/" — kube-vip does netlink.ParseAddr(address + subnet); "32" breaks (192.168.50.x32).
            - name: vip_subnet
              value: "/32"
            - name: vip_leaderelection
              value: "true"
            - name: cp_enable
              value: "true"
            - name: cp_namespace
              value: "kube-system"
            # Control-plane VIP only until stable: with svc_enable=true the services leader-election
            # path calls log.Fatal on many failures / leadership moves → CrashLoopBackOff on all CP nodes.
            # Re-enable "true" after pods are 1/1; if they loop again, capture: kubectl logs -n kube-system -l app.kubernetes.io/name=kube-vip-ds --previous --tail=100
|
||||
- name: svc_enable
|
||||
value: "false"
|
||||
- name: vip_leaseduration
|
||||
value: "15"
|
||||
- name: vip_renewdeadline
|
||||
value: "10"
|
||||
- name: vip_retryperiod
|
||||
value: "2"
|
||||
securityContext:
|
||||
capabilities:
|
||||
add:
|
||||
- NET_ADMIN
|
||||
- NET_RAW
|
||||
- SYS_TIME
|
||||
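The `vip_subnet` comment in the DaemonSet above is worth spelling out: kube-vip concatenates the address and subnet strings before parsing, so omitting the slash silently produces a non-CIDR string. A minimal shell illustration of the two concatenations:

```shell
# What kube-vip ends up parsing is roughly address + subnet joined as one string.
address='192.168.50.230'
echo "${address}/32"   # vip_subnet "/32" → 192.168.50.230/32 (valid CIDR)
echo "${address}32"    # vip_subnet "32"  → 192.168.50.23032 (not parseable)
```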
clusters/noble/bootstrap/kube-vip/vip-rbac.yaml (new file, 39 lines)
@@ -0,0 +1,39 @@

apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-vip
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-vip-role
rules:
  - apiGroups: [""]
    resources: ["services", "services/status", "endpoints"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch", "update"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  - apiGroups: ["discovery.k8s.io"]
    resources: ["endpointslices"]
    verbs: ["get", "list", "watch", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-vip-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-vip-role
subjects:
  - kind: ServiceAccount
    name: kube-vip
    namespace: kube-system
clusters/noble/bootstrap/kustomization.yaml (new file, 17 lines)
@@ -0,0 +1,17 @@

# Plain Kustomize only (namespaces + extra YAML). Helm installs are driven by **ansible/playbooks/noble.yml**
# (role **noble_platform**) — avoids **kustomize --enable-helm** in-repo.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - kube-prometheus-stack/namespace.yaml
  - loki/namespace.yaml
  - fluent-bit/namespace.yaml
  - sealed-secrets/namespace.yaml
  - external-secrets/namespace.yaml
  - vault/namespace.yaml
  - kyverno/namespace.yaml
  - headlamp/namespace.yaml
  - grafana-loki-datasource/loki-datasource.yaml
  - vault/unseal-cronjob.yaml
  - vault/cilium-network-policy.yaml
clusters/noble/bootstrap/kyverno/README.md (new file, 31 lines)
@@ -0,0 +1,31 @@

# Kyverno (noble)

Admission policies using [Kyverno](https://kyverno.io/). The main chart installs controllers and CRDs; **`kyverno-policies`** installs **Pod Security Standard** rules matching the **`baseline`** profile in **`Audit`** mode (violations are visible in policy reports; workloads are not denied).

- **Charts:** `kyverno/kyverno` **3.7.1** (app **v1.17.1**), `kyverno/kyverno-policies` **3.7.1**
- **Namespace:** `kyverno`

## Install

```bash
helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update
kubectl apply -f clusters/noble/apps/kyverno/namespace.yaml
helm upgrade --install kyverno kyverno/kyverno -n kyverno \
  --version 3.7.1 -f clusters/noble/apps/kyverno/values.yaml --wait --timeout 15m
helm upgrade --install kyverno-policies kyverno/kyverno-policies -n kyverno \
  --version 3.7.1 -f clusters/noble/apps/kyverno/policies-values.yaml --wait --timeout 10m
```

Verify:

```bash
kubectl -n kyverno get pods
kubectl get clusterpolicy | head
```

## Notes

- **`validationFailureAction: Audit`** in `policies-values.yaml` avoids breaking namespaces that need **privileged** behavior (Longhorn, monitoring node-exporter, etc.). Switch specific policies or namespaces to **`Enforce`** when you are ready.
- To use **`restricted`** instead of **`baseline`**, change **`podSecurityStandard`** in `policies-values.yaml` and reconcile expectations for host mounts and capabilities.
- Upgrade: bump **`--version`** on both charts together; read [Kyverno release notes](https://github.com/kyverno/kyverno/releases) for breaking changes.
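When moving a single policy to Enforce while the rest stay in Audit, recent versions of the kyverno-policies chart expose a per-policy override map. The key name below is an assumption from the upstream chart and should be verified with `helm show values kyverno/kyverno-policies --version 3.7.1` before relying on it:

```yaml
# policies-values.yaml sketch (assumed chart option — verify against chart values):
# keep the global Audit default, enforce only the highest-signal policy.
validationFailureAction: Audit
validationFailureActionByPolicy:
  disallow-privileged-containers: Enforce
```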
clusters/noble/bootstrap/kyverno/namespace.yaml (new file, 5 lines)
@@ -0,0 +1,5 @@

# Kyverno — apply before Helm.
apiVersion: v1
kind: Namespace
metadata:
  name: kyverno
clusters/noble/bootstrap/kyverno/policies-values.yaml (new file, 62 lines)
@@ -0,0 +1,62 @@

# kyverno/kyverno-policies — Pod Security Standards as Kyverno ClusterPolicies
#
#   helm upgrade --install kyverno-policies kyverno/kyverno-policies -n kyverno \
#     --version 3.7.1 -f clusters/noble/apps/kyverno/policies-values.yaml --wait --timeout 10m
#
# Default profile is baseline; validationFailureAction is Audit so existing privileged
# workloads are not blocked. Kyverno still emits PolicyReports for matches — Headlamp
# surfaces those as “policy violations”. Exclude namespaces that intentionally run
# outside baseline (see namespace PSA labels under clusters/noble/apps/*/namespace.yaml)
# plus core Kubernetes namespaces and every Ansible-managed app namespace on noble.
#
# After widening excludes, Kyverno does not always prune old PolicyReport rows; refresh:
#   kubectl delete clusterpolicyreport --all
#   kubectl delete policyreport -A --all
# (Reports are recreated on the next background scan.)
#
# Exclude blocks omit `kinds` so the same namespace skip applies to autogen rules for
# Deployments, DaemonSets, etc. (see kyverno/kyverno#4306).
#
policyKind: ClusterPolicy
policyType: ClusterPolicy
podSecurityStandard: baseline
podSecuritySeverity: medium
validationFailureAction: Audit
failurePolicy: Fail
validationAllowExistingViolations: true

# All platform namespaces on noble (ansible/playbooks/noble.yml + clusters/noble/apps).
x-kyverno-exclude-infra: &kyverno_exclude_infra
  any:
    - resources:
        namespaces:
          - kube-system
          - kube-public
          - kube-node-lease
          - argocd
          - cert-manager
          - external-secrets
          - headlamp
          - kyverno
          - logging
          - loki
          - longhorn-system
          - metallb-system
          - monitoring
          - newt
          - sealed-secrets
          - traefik
          - vault

policyExclude:
  disallow-capabilities: *kyverno_exclude_infra
  disallow-host-namespaces: *kyverno_exclude_infra
  disallow-host-path: *kyverno_exclude_infra
  disallow-host-ports: *kyverno_exclude_infra
  disallow-host-process: *kyverno_exclude_infra
  disallow-privileged-containers: *kyverno_exclude_infra
  disallow-proc-mount: *kyverno_exclude_infra
  disallow-selinux: *kyverno_exclude_infra
  restrict-apparmor-profiles: *kyverno_exclude_infra
  restrict-seccomp: *kyverno_exclude_infra
  restrict-sysctls: *kyverno_exclude_infra
clusters/noble/bootstrap/kyverno/values.yaml (new file, 22 lines)
@@ -0,0 +1,22 @@

# Kyverno — noble (policy engine)
#
#   helm repo add kyverno https://kyverno.github.io/kyverno/
#   helm repo update
#   kubectl apply -f clusters/noble/apps/kyverno/namespace.yaml
#   helm upgrade --install kyverno kyverno/kyverno -n kyverno \
#     --version 3.7.1 -f clusters/noble/apps/kyverno/values.yaml --wait --timeout 15m
#
# Baseline Pod Security policies (separate chart): see policies-values.yaml + README.md
#
# Raise Kubernetes client QPS/burst so under API/etcd load Kyverno does not hit
# "client rate limiter Wait" / flaky kyverno-health lease (defaults are very low).
# Two replicas: webhook Service keeps endpoints during rolling restarts (avoids
# apiserver "connection refused" to kyverno-svc:443 while a single pod cycles).
admissionController:
  replicas: 2
  # Insulate Kyverno API traffic via APF (helps when etcd/apiserver are busy).
  apiPriorityAndFairness: true
  container:
    extraArgs:
      clientRateLimitQPS: 30
      clientRateLimitBurst: 60
clusters/noble/bootstrap/loki/namespace.yaml (new file, 9 lines)
@@ -0,0 +1,9 @@

# Loki (SingleBinary + filesystem on Longhorn) — apply before Helm.
apiVersion: v1
kind: Namespace
metadata:
  name: loki
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/audit: baseline
    pod-security.kubernetes.io/warn: baseline
clusters/noble/bootstrap/loki/values.yaml (new file, 78 lines)
@@ -0,0 +1,78 @@

# Grafana Loki — noble lab (SingleBinary, filesystem on Longhorn; no MinIO/S3).
#
# Chart: grafana/loki — pin version on install (e.g. 6.55.0).
#
#   kubectl apply -f clusters/noble/apps/loki/namespace.yaml
#   helm repo add grafana https://grafana.github.io/helm-charts
#   helm repo update
#   helm upgrade --install loki grafana/loki -n loki \
#     --version 6.55.0 -f clusters/noble/apps/loki/values.yaml --wait --timeout 30m
#
# Query/push URL for Grafana + Fluent Bit: http://loki-gateway.loki.svc.cluster.local:80

deploymentMode: SingleBinary

loki:
  # Single-tenant lab: chart default auth_enabled: true requires X-Scope-OrgID on every query/push (Grafana + Fluent Bit break).
  auth_enabled: false
  commonConfig:
    replication_factor: 1
  storage:
    type: filesystem
  schemaConfig:
    configs:
      - from: "2024-04-01"
        store: tsdb
        object_store: filesystem
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  pattern_ingester:
    enabled: false
  limits_config:
    allow_structured_metadata: true
    volume_enabled: true

singleBinary:
  replicas: 1
  persistence:
    enabled: true
    storageClass: longhorn
    size: 30Gi

backend:
  replicas: 0
read:
  replicas: 0
write:
  replicas: 0
ingester:
  replicas: 0
querier:
  replicas: 0
queryFrontend:
  replicas: 0
queryScheduler:
  replicas: 0
distributor:
  replicas: 0
compactor:
  replicas: 0
indexGateway:
  replicas: 0
bloomCompactor:
  replicas: 0
bloomGateway:
  replicas: 0

minio:
  enabled: false

gateway:
  enabled: true

# Memcached chunk cache: chart default is ~8Gi RAM requests; even 512Mi can stay Pending on small clusters (affinity).
# Homelab: disable — Loki works without it; queries may be slightly slower under load.
chunksCache:
  enabled: false
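Fluent Bit (whose namespace is created in the bootstrap kustomization) would push to the gateway URL noted at the top of this values file. A minimal output stanza, assuming Fluent Bit's classic configuration format and its built-in `loki` output plugin; the `labels` value is illustrative, not from this repo:

```
[OUTPUT]
    name   loki
    match  *
    host   loki-gateway.loki.svc.cluster.local
    port   80
    labels job=fluent-bit
```

With `auth_enabled: false` above, no `tenant_id` / X-Scope-OrgID setting is needed on the shipper side.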
clusters/noble/bootstrap/longhorn/kustomization.yaml (new file, 4 lines)
@@ -0,0 +1,4 @@

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
clusters/noble/bootstrap/longhorn/namespace.yaml (new file, 10 lines)
@@ -0,0 +1,10 @@

# Longhorn Manager uses hostPath + privileged; incompatible with Pod Security "baseline".
# Apply before or after Helm — merges labels onto existing longhorn-system.
apiVersion: v1
kind: Namespace
metadata:
  name: longhorn-system
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
clusters/noble/bootstrap/longhorn/values.yaml (new file, 34 lines)
@@ -0,0 +1,34 @@

# Longhorn Helm values — use with Talos user volume + kubelet mounts (see talos/talconfig.yaml).
# 1) PSA: `kubectl apply -k clusters/noble/apps/longhorn` (privileged namespace) before or after Helm.
# 2) Talos: bind `/var/lib/longhorn` → `/var/mnt/longhorn` in kubelet extraMounts — chart hostPath is fixed to /var/lib/longhorn.
# Example (run from home-server repo root so -f path resolves):
#   kubectl apply -k clusters/noble/apps/longhorn
#   helm repo add longhorn https://charts.longhorn.io && helm repo update
#   helm upgrade --install longhorn longhorn/longhorn -n longhorn-system --create-namespace \
#     -f clusters/noble/apps/longhorn/values.yaml
# "helm upgrade --install" needs two arguments: RELEASE_NAME and CHART (e.g. longhorn longhorn/longhorn).
#
# If you already installed Longhorn without this file: fix Default Settings in the UI or edit each
# node's disk path to /var/mnt/longhorn; wrong path → "wrong format" (root fs / overlay).

defaultSettings:
  defaultDataPath: /var/mnt/longhorn
  # Default 30% reserved often makes small data disks look "full" to the scheduler.
  storageReservedPercentageForDefaultDisk: "10"

# Longhorn UI — same *.apps.noble.lab.pcenicni.dev pattern as Grafana / Headlamp (Traefik LB → cert-manager TLS).
ingress:
  enabled: true
  ingressClassName: traefik
  host: longhorn.apps.noble.lab.pcenicni.dev
  path: /
  pathType: Prefix
  tls: true
  tlsSecret: longhorn-apps-noble-tls
  secureBackends: false
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod

# Pre-upgrade Job: keep enabled for normal Helm upgrades (disable only if GitOps sync fights the Job).
preUpgradeChecker:
  jobEnabled: true
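The effect of `storageReservedPercentageForDefaultDisk` is easiest to see with numbers; the disk size below is hypothetical, not taken from this cluster:

```shell
disk_gi=200                                        # hypothetical data disk size
default_free=$(( disk_gi - disk_gi * 30 / 100 ))   # chart default: 30% reserved
tuned_free=$(( disk_gi - disk_gi * 10 / 100 ))     # this values.yaml: 10% reserved
echo "schedulable at 30% reserved: ${default_free}Gi"   # 140Gi
echo "schedulable at 10% reserved: ${tuned_free}Gi"     # 180Gi
```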
clusters/noble/bootstrap/metallb/README.md (new file, 52 lines)
@@ -0,0 +1,52 @@

# MetalLB (layer 2) — noble

**Prerequisite (Talos + `cni: none`):** install **Cilium** (or your CNI) **before** MetalLB.

Until the CNI is up, nodes stay **`NotReady`** and carry taints such as **`node.kubernetes.io/network-unavailable`** (and **`not-ready`**). The scheduler then reports **`0/N nodes are available: N node(s) had untolerated taint(s)`** and MetalLB stays **`Pending`** — its chart does not tolerate those taints, by design. **Install Cilium first** (`talos/CLUSTER-BUILD.md` Phase B); when nodes are **`Ready`**, reinstall or rollout MetalLB if needed.

**Order:** namespace (Pod Security) → **Helm** (CRDs + controller) → **kustomize** (pool + L2).

If `kubectl apply -k` fails with **`no matches for kind "IPAddressPool"`** / **`ensure CRDs are installed first`**, Helm is not installed yet.

**Pod Security warnings** (`would violate PodSecurity "restricted"`): MetalLB’s speaker/FRR use `hostNetwork`, `NET_ADMIN`, etc. That is expected unless `metallb-system` is labeled **privileged**. Apply `namespace.yaml` **before** Helm so the namespace is created with the right labels (omit `--create-namespace` on Helm), or patch an existing namespace:

```bash
kubectl apply -f clusters/noble/apps/metallb/namespace.yaml
```

If you already ran Helm with `--create-namespace`, either `kubectl apply -f namespace.yaml` (merges labels) or:

```bash
kubectl label namespace metallb-system \
  pod-security.kubernetes.io/enforce=privileged \
  pod-security.kubernetes.io/audit=privileged \
  pod-security.kubernetes.io/warn=privileged --overwrite
```

Then restart MetalLB pods if they were failing (`kubectl get pods -n metallb-system`; delete stuck pods or `kubectl rollout restart` each `Deployment` / `DaemonSet` in that namespace).

1. Install the MetalLB chart (CRDs + controller). If you applied `namespace.yaml` above, **skip** `--create-namespace`:

   ```bash
   helm repo add metallb https://metallb.github.io/metallb
   helm repo update
   helm upgrade --install metallb metallb/metallb \
     --namespace metallb-system \
     --wait --timeout 20m
   ```

2. Apply this folder’s pool and L2 advertisement:

   ```bash
   kubectl apply -k clusters/noble/apps/metallb
   ```

3. Confirm a `Service` `type: LoadBalancer` receives an address in `192.168.50.210`–`192.168.50.229` (e.g. **`kubectl get svc -n traefik traefik`** after installing **Traefik** in `clusters/noble/apps/traefik/`).

Reserve **one** IP in that range for Argo CD (e.g. `192.168.50.210`) via `spec.loadBalancerIP` or chart values when you expose the server. Traefik pins **`192.168.50.211`** in **`clusters/noble/apps/traefik/values.yaml`**.

## `Pending` MetalLB pods

1. `kubectl get nodes` — every node **`Ready`**? If **`NotReady`** or **`NetworkUnavailable`**, finish **CNI** install first.
2. `kubectl describe pod -n metallb-system <pod-name>` — read **Events** at the bottom (`0/N nodes are available: …`).
3. L2 speaker uses the node’s uplink; kube-vip in this repo expects **`ens18`** on control planes (`clusters/noble/apps/kube-vip/vip-daemonset.yaml`). If your NIC name differs, change `vip_interface` there.
clusters/noble/bootstrap/metallb/ip-address-pool.yaml (new file, 19 lines)
@@ -0,0 +1,19 @@

# Apply after MetalLB controller is installed (Helm chart or manifest).
# Namespace must match where MetalLB expects pools (commonly metallb-system).
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: noble-l2
  namespace: metallb-system
spec:
  addresses:
    - 192.168.50.210-192.168.50.229
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: noble-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - noble-l2
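A quick sanity check on the pool above: the range spans 20 addresses, which is the budget for every `LoadBalancer` Service in the cluster (Traefik pins .211, and the README suggests reserving one more for Argo CD):

```shell
first=210; last=229   # last octets of the pool bounds above
echo "pool size: $(( last - first + 1 )) addresses"   # pool size: 20 addresses
```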
clusters/noble/bootstrap/metallb/kustomization.yaml (new file, 4 lines)
@@ -0,0 +1,4 @@

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ip-address-pool.yaml
clusters/noble/bootstrap/metallb/namespace.yaml (new file, 11 lines)
@@ -0,0 +1,11 @@

# Apply before Helm if you do not use --create-namespace, or use this to fix PSA after the fact:
#   kubectl apply -f clusters/noble/apps/metallb/namespace.yaml
# MetalLB speaker needs hostNetwork + NET_ADMIN; incompatible with Pod Security "restricted".
apiVersion: v1
kind: Namespace
metadata:
  name: metallb-system
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
clusters/noble/bootstrap/metrics-server/values.yaml (new file, 10 lines)
@@ -0,0 +1,10 @@

# metrics-server — noble (Talos)
# Kubelet serving certs are not validated by default; see Talos docs:
# https://www.talos.dev/latest/kubernetes-guides/configuration/deploy-metrics-server/
#
#   helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
#   helm upgrade --install metrics-server metrics-server/metrics-server -n kube-system \
#     --version 3.13.0 -f clusters/noble/apps/metrics-server/values.yaml --wait

args:
  - --kubelet-insecure-tls
clusters/noble/bootstrap/newt/README.md (new file, 96 lines)
@@ -0,0 +1,96 @@

# Newt (Pangolin) — noble

This is the **primary** automation path for **public** hostnames to workloads in this cluster (it **replaces** in-cluster ExternalDNS). [Newt](https://github.com/fosrl/newt) is the on-prem agent that connects your cluster to a **Pangolin** site (WireGuard tunnel). The [Fossorial Helm chart](https://github.com/fosrl/helm-charts) deploys one or more instances.

**Secrets:** Never commit endpoint, Newt ID, or Newt secret. If credentials were pasted into chat or CI logs, **rotate them** in Pangolin and recreate the Kubernetes Secret.

## 1. Create the Secret

Keys must match `values.yaml` (`PANGOLIN_ENDPOINT`, `NEWT_ID`, `NEWT_SECRET`).

### Option A — Sealed Secret (safe for GitOps)

With the [Sealed Secrets](https://github.com/bitnami-labs/sealed-secrets) controller installed (`clusters/noble/apps/sealed-secrets/`), generate a `SealedSecret` from your workstation (rotate credentials in Pangolin first if they were exposed):

```bash
chmod +x clusters/noble/apps/sealed-secrets/examples/kubeseal-newt-pangolin-auth.sh
export PANGOLIN_ENDPOINT='https://pangolin.pcenicni.dev'
export NEWT_ID='YOUR_NEWT_ID'
export NEWT_SECRET='YOUR_NEWT_SECRET'
./clusters/noble/apps/sealed-secrets/examples/kubeseal-newt-pangolin-auth.sh > newt-pangolin-auth.sealedsecret.yaml
kubectl apply -f newt-pangolin-auth.sealedsecret.yaml
```

Commit only the `.sealedsecret.yaml` file, not plain `Secret` YAML.

### Option B — Imperative Secret (not in git)

```bash
kubectl apply -f clusters/noble/apps/newt/namespace.yaml

kubectl -n newt create secret generic newt-pangolin-auth \
  --from-literal=PANGOLIN_ENDPOINT='https://pangolin.pcenicni.dev' \
  --from-literal=NEWT_ID='YOUR_NEWT_ID' \
  --from-literal=NEWT_SECRET='YOUR_NEWT_SECRET'
```

Use the Pangolin UI or [Integration API](https://docs.pangolin.net/manage/common-api-routes) (`pick-site-defaults` + `create site`) to obtain a Newt ID and secret for a new site if you are not reusing an existing pair.

## 2. Install the chart

```bash
helm repo add fossorial https://charts.fossorial.io
helm repo update
helm upgrade --install newt fossorial/newt \
  --namespace newt \
  --version 1.2.0 \
  -f clusters/noble/apps/newt/values.yaml \
  --wait
```

## 3. DNS: CNAME at your DNS host + Pangolin API for routes

Pangolin does not replace your public DNS provider. Typical flow:

1. **Link a domain** in Pangolin (organization **Domains**). For **CNAME**-style domains, Pangolin shows the hostname you must **CNAME** to at Cloudflare / your registrar (see [Domains](https://docs.pangolin.net/manage/common-api-routes#list-domains)).
2. **Create public HTTP resources** (and **targets** to your Newt **site**) via the [Integration API](https://docs.pangolin.net/manage/integration-api) — same flows as the UI. Swagger: `https://<your-api-host>/v1/docs` (self-hosted: enable `enable_integration_api` and route `api.example.com` → integration port per [docs](https://docs.pangolin.net/self-host/advanced/integration-api)).

Minimal patterns (Bearer token = org or root API key):

```bash
export API_BASE='https://api.example.com/v1' # your Pangolin Integration API base
export ORG_ID='your-org-id'
export TOKEN='your-integration-api-key'

# Domains already linked to the org (use domainId when creating a resource)
curl -sS -H "Authorization: Bearer ${TOKEN}" \
  "${API_BASE}/org/${ORG_ID}/domains"

# Create an HTTP resource on a domain (FQDN = subdomain + base domain for NS/wildcard domains)
curl -sS -X PUT -H "Authorization: Bearer ${TOKEN}" -H 'Content-Type: application/json' \
  "${API_BASE}/org/${ORG_ID}/resource" \
  -d '{
    "name": "Example app",
    "http": true,
    "domainId": "YOUR_DOMAIN_ID",
    "protocol": "tcp",
    "subdomain": "my-app"
  }'

# Point the resource at your Newt site backend (siteId from list sites / create site; ip:port inside the tunnel)
curl -sS -X PUT -H "Authorization: Bearer ${TOKEN}" -H 'Content-Type: application/json' \
  "${API_BASE}/resource/RESOURCE_ID/target" \
  -d '{
    "siteId": YOUR_SITE_ID,
    "ip": "10.x.x.x",
    "port": 443,
    "method": "http"
  }'
```

Exact JSON fields and IDs differ by domain type (**ns** vs **cname** vs **wildcard**); see [Common API routes](https://docs.pangolin.net/manage/common-api-routes) and Swagger.

## LAN vs internet

- **LAN / VPN:** point **`*.apps.noble.lab.pcenicni.dev`** at the Traefik **LoadBalancer** (**`192.168.50.211`**) with local or split-horizon DNS if you want direct in-lab access.
- **Internet-facing:** use Pangolin **resources** + **targets** to the Newt **site**; public names rely on **CNAME** records at your DNS provider per Pangolin’s domain setup, not on ExternalDNS in the cluster.
clusters/noble/bootstrap/newt/namespace.yaml (new file, 9 lines)
@@ -0,0 +1,9 @@

# Newt (Pangolin site tunnel client) — noble lab
apiVersion: v1
kind: Namespace
metadata:
  name: newt
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/audit: baseline
    pod-security.kubernetes.io/warn: baseline
clusters/noble/bootstrap/newt/values.yaml (new file, 26 lines)
@@ -0,0 +1,26 @@

# Newt — noble lab (Fossorial Helm chart)
#
# Credentials MUST come from a Secret — do not put endpoint/id/secret in git.
#
#   kubectl apply -f clusters/noble/apps/newt/namespace.yaml
#   kubectl -n newt create secret generic newt-pangolin-auth \
#     --from-literal=PANGOLIN_ENDPOINT='https://pangolin.example.com' \
#     --from-literal=NEWT_ID='...' \
#     --from-literal=NEWT_SECRET='...'
#
#   helm repo add fossorial https://charts.fossorial.io
#   helm upgrade --install newt fossorial/newt -n newt \
#     --version 1.2.0 -f clusters/noble/apps/newt/values.yaml --wait
#
# See README.md for Pangolin Integration API (domains + HTTP resources + CNAME).

newtInstances:
  - name: main-tunnel
    enabled: true
    replicas: 1
    auth:
      existingSecretName: newt-pangolin-auth
      keys:
        endpointKey: PANGOLIN_ENDPOINT
        idKey: NEWT_ID
        secretKey: NEWT_SECRET
clusters/noble/bootstrap/sealed-secrets/README.md (new file, 50 lines)
@@ -0,0 +1,50 @@

# Sealed Secrets (noble)

Encrypts `Secret` manifests so they can live in git; the controller decrypts **SealedSecret** resources into **Secret**s in-cluster.

- **Chart:** `sealed-secrets/sealed-secrets` **2.18.4** (app **0.36.1**)
- **Namespace:** `sealed-secrets`

## Install

```bash
helm repo add sealed-secrets https://bitnami-labs.github.io/sealed-secrets
helm repo update
kubectl apply -f clusters/noble/apps/sealed-secrets/namespace.yaml
helm upgrade --install sealed-secrets sealed-secrets/sealed-secrets -n sealed-secrets \
  --version 2.18.4 -f clusters/noble/apps/sealed-secrets/values.yaml --wait
```

## Workstation: `kubeseal`

Install a **kubeseal** build compatible with the controller (match **app** minor, e.g. **0.36.x** for **0.36.1**). Examples:

- **Homebrew:** `brew install kubeseal` (check `kubeseal --version` against the chart’s `image.tag` in `helm show values`).
- **GitHub releases:** [bitnami-labs/sealed-secrets](https://github.com/bitnami-labs/sealed-secrets/releases)

Fetch the cluster’s public seal cert (once per kube context):

```bash
kubeseal --fetch-cert > /tmp/noble-sealed-secrets.pem
```

Create a sealed secret from a normal secret manifest:

```bash
kubectl create secret generic example --from-literal=foo=bar --dry-run=client -o yaml \
  | kubeseal --cert /tmp/noble-sealed-secrets.pem -o yaml > example-sealedsecret.yaml
```

Commit `example-sealedsecret.yaml`; apply it with `kubectl apply -f`. The controller creates the **Secret** in the same namespace as the **SealedSecret**.

**Noble example:** `examples/kubeseal-newt-pangolin-auth.sh` (Newt / Pangolin tunnel credentials).

## Backup the sealing key

If the controller’s private key is lost, existing sealed files cannot be decrypted on a new cluster. Back up the key secret after install:

```bash
kubectl get secret -n sealed-secrets -l sealedsecrets.bitnami.com/sealed-secrets-key=active -o yaml > sealed-secrets-key-backup.yaml
```

Store `sealed-secrets-key-backup.yaml` in a safe offline location (not in public git).
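For orientation, the output of a `kubectl create secret … | kubeseal` pipeline like the one above is a manifest of roughly this shape; the API group and kind are from the upstream Sealed Secrets CRD, and the ciphertext is a placeholder:

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: example
  namespace: default
spec:
  encryptedData:
    foo: AgB0eXBlLi4u   # placeholder; real values are long base64 blobs
  template:
    metadata:
      name: example
      namespace: default
```

Because the ciphertext is bound to name and namespace by default, a sealed file applied to a different namespace will not decrypt.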
clusters/noble/bootstrap/sealed-secrets/examples/kubeseal-newt-pangolin-auth.sh (new file, 19 lines)
@@ -0,0 +1,19 @@

#!/usr/bin/env bash
# Emit a SealedSecret for newt-pangolin-auth (namespace newt).
# Prerequisites: sealed-secrets controller running; kubeseal client (same minor as controller).
# Rotate Pangolin/Newt credentials in the UI first if they were exposed, then set env vars and run:
#
#   export PANGOLIN_ENDPOINT='https://pangolin.example.com'
#   export NEWT_ID='...'
#   export NEWT_SECRET='...'
#   ./kubeseal-newt-pangolin-auth.sh > newt-pangolin-auth.sealedsecret.yaml
#   kubectl apply -f newt-pangolin-auth.sealedsecret.yaml
#
set -euo pipefail
kubectl apply -f "$(dirname "$0")/../../newt/namespace.yaml" >/dev/null 2>&1 || true
kubectl -n newt create secret generic newt-pangolin-auth \
  --dry-run=client \
  --from-literal=PANGOLIN_ENDPOINT="${PANGOLIN_ENDPOINT:?}" \
  --from-literal=NEWT_ID="${NEWT_ID:?}" \
  --from-literal=NEWT_SECRET="${NEWT_SECRET:?}" \
  -o yaml | kubeseal -o yaml
clusters/noble/bootstrap/sealed-secrets/namespace.yaml (new file, 5 lines)
@@ -0,0 +1,5 @@
# Sealed Secrets controller — apply before Helm.
apiVersion: v1
kind: Namespace
metadata:
  name: sealed-secrets
clusters/noble/bootstrap/sealed-secrets/values.yaml (new file, 18 lines)
@@ -0,0 +1,18 @@
# Sealed Secrets — noble (Git-encrypted Secret workflow)
#
# helm repo add sealed-secrets https://bitnami-labs.github.io/sealed-secrets
# helm repo update
# kubectl apply -f clusters/noble/apps/sealed-secrets/namespace.yaml
# helm upgrade --install sealed-secrets sealed-secrets/sealed-secrets -n sealed-secrets \
#   --version 2.18.4 -f clusters/noble/apps/sealed-secrets/values.yaml --wait
#
# Client: install kubeseal (same minor as controller — see README).
# Defaults are sufficient for the lab; override here if you need key renewal, resources, etc.
#
# GitOps pattern: create Secrets only via SealedSecret (or External Secrets + Vault).
# Example (Newt): clusters/noble/apps/sealed-secrets/examples/kubeseal-newt-pangolin-auth.sh
# Backup the controller's sealing key: kubectl get secret -n sealed-secrets -l sealedsecrets.bitnami.com/sealed-secrets-key=active -o yaml
#
# Talos cluster secrets (bootstrap token, cluster secret, certs) belong in talhelper talsecret /
# SOPS — not Sealed Secrets. See talos/README.md.
commonLabels: {}
clusters/noble/bootstrap/traefik/README.md (new file, 33 lines)
@@ -0,0 +1,33 @@
# Traefik — noble

**Prerequisites:** **Cilium**, **MetalLB** (pool + L2), nodes **Ready**.

1. Create the namespace (Pod Security **baseline** — Traefik needs more than **restricted**):

   ```bash
   kubectl apply -f clusters/noble/apps/traefik/namespace.yaml
   ```

2. Install the chart (**do not** use `--create-namespace` if the namespace already exists):

   ```bash
   helm repo add traefik https://traefik.github.io/charts
   helm repo update
   helm upgrade --install traefik traefik/traefik \
     --namespace traefik \
     --version 39.0.6 \
     -f clusters/noble/apps/traefik/values.yaml \
     --wait
   ```

3. Confirm the Service has a pool address. On the **LAN**, **`*.apps.noble.lab.pcenicni.dev`** can resolve to this IP (split horizon / local DNS). **Public** names go through **Pangolin + Newt** (CNAME + API), not ExternalDNS — see **`clusters/noble/apps/newt/README.md`**.

   ```bash
   kubectl get svc -n traefik traefik
   ```

   Values pin **`192.168.50.211`** via **`metallb.io/loadBalancerIPs`**; **`192.168.50.210`** stays free for Argo CD.

4. Create **Ingress** resources with **`ingressClassName: traefik`** (or rely on the default class). **TLS:** add the **`cert-manager.io/cluster-issuer: letsencrypt-staging`** (or **`letsencrypt-prod`**) annotation and **`tls`** hosts — see **`clusters/noble/apps/cert-manager/README.md`**.

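   A minimal sketch of such an Ingress (the `whoami` app name, hostname, and TLS secret name are illustrative placeholders, not resources this repo ships):

   ```yaml
   apiVersion: networking.k8s.io/v1
   kind: Ingress
   metadata:
     name: whoami                    # hypothetical sample app
     namespace: default
     annotations:
       cert-manager.io/cluster-issuer: letsencrypt-staging
   spec:
     ingressClassName: traefik
     rules:
       - host: whoami.apps.noble.lab.pcenicni.dev
         http:
           paths:
             - path: /
               pathType: Prefix
               backend:
                 service:
                   name: whoami
                   port:
                     number: 80
     tls:
       - hosts:
           - whoami.apps.noble.lab.pcenicni.dev
         secretName: whoami-tls
   ```
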
5. **Public DNS:** use **Newt** + Pangolin (**CNAME** at your DNS host + **Integration API** for resources/targets) — **`clusters/noble/apps/newt/README.md`**.
clusters/noble/bootstrap/traefik/namespace.yaml (new file, 10 lines)
@@ -0,0 +1,10 @@
# Traefik controller — apply before Helm (omit --create-namespace on install).
# Ingress controller needs capabilities beyond "restricted"; use baseline.
apiVersion: v1
kind: Namespace
metadata:
  name: traefik
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/audit: baseline
    pod-security.kubernetes.io/warn: baseline
clusters/noble/bootstrap/traefik/values.yaml (new file, 29 lines)
@@ -0,0 +1,29 @@
# Traefik ingress controller — noble lab
#
# Chart: traefik/traefik — pin the version on the helm command (e.g. 39.0.6).
# DNS: point *.apps.noble.lab.pcenicni.dev to the LoadBalancer IP below.
#
# kubectl apply -f clusters/noble/apps/traefik/namespace.yaml
# helm repo add traefik https://traefik.github.io/charts
# helm upgrade --install traefik traefik/traefik -n traefik \
#   --version 39.0.6 -f clusters/noble/apps/traefik/values.yaml --wait

service:
  type: LoadBalancer
  annotations:
    metallb.io/loadBalancerIPs: 192.168.50.211

ingressClass:
  enabled: true
  isDefaultClass: true
  name: traefik

# Ingress-only; Gateway API objects from the chart are not needed here.
gateway:
  enabled: false

gatewayClass:
  enabled: false

deployment:
  replicas: 1
clusters/noble/bootstrap/vault/README.md (new file, 162 lines)
@@ -0,0 +1,162 @@
# HashiCorp Vault (noble)

Standalone Vault with **file** storage on a **Longhorn** PVC (`server.dataStorage`). The listener uses **HTTP** (`global.tlsDisable: true`) for in-cluster use; add TLS at the listener when exposing Vault outside the cluster.

- **Chart:** `hashicorp/vault` **0.32.0** (Vault **1.21.2**)
- **Namespace:** `vault`

## Install

```bash
helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo update
kubectl apply -f clusters/noble/apps/vault/namespace.yaml
helm upgrade --install vault hashicorp/vault -n vault \
  --version 0.32.0 -f clusters/noble/apps/vault/values.yaml --wait --timeout 15m
```

Verify:

```bash
kubectl -n vault get pods,pvc,svc
kubectl -n vault exec -i sts/vault -- vault status
```

## Cilium network policy (Phase G)

After **Cilium** is up, optionally restrict HTTP access to the Vault server pods (**TCP 8200**) to **`external-secrets`**, **`traefik`**, and same-namespace clients:

```bash
kubectl apply -f clusters/noble/apps/vault/cilium-network-policy.yaml
```

If you add workloads in other namespaces that call Vault, extend **`ingress`** in that manifest.

## Initialize and unseal (first time)

From a workstation with `kubectl` (or via `kubectl exec` into any pod with the `vault` CLI):

```bash
kubectl -n vault exec -i sts/vault -- vault operator init -key-shares=1 -key-threshold=1
```

**Lab-only:** `-key-shares=1 -key-threshold=1` keeps a single unseal key. For stronger Shamir splits, use more shares and store them safely.

Save the **Unseal Key** and **Root Token** offline. Then unseal once:

```bash
kubectl -n vault exec -i sts/vault -- vault operator unseal
# paste unseal key
```

Or create the Secret used by the optional CronJob and apply it:

```bash
kubectl -n vault create secret generic vault-unseal-key --from-literal=key='YOUR_UNSEAL_KEY'
kubectl apply -f clusters/noble/apps/vault/unseal-cronjob.yaml
```

The CronJob runs every minute and unseals Vault if it is sealed and the Secret is present.

## Auto-unseal note

Vault **OSS** auto-unseal uses cloud KMS (AWS, GCP, Azure, OCI), **Transit** (another Vault), etc. There is no first-class "Kubernetes Secret" seal. This repo uses an optional **CronJob** as a **lab** substitute. Production clusters should use a supported seal backend.

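For reference, a supported seal backend is configured via a `seal` stanza in the server config; a sketch of the Transit variant (the address, token, and key name are placeholders for a second Vault that this lab does not run):

```hcl
# Auto-unseal via the Transit secrets engine of another Vault (illustrative only)
seal "transit" {
  address    = "https://vault-central.example.com:8200"
  token      = "s.xxxxxxxx"   # token permitted to use the transit key
  key_name   = "autounseal"
  mount_path = "transit/"
}
```

With a stanza like this in `server.standalone.config`, the CronJob below would be unnecessary.
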
## Kubernetes auth (External Secrets / ClusterSecretStore)

**One-shot:** from the repo root, `export KUBECONFIG=talos/kubeconfig` and `export VAULT_TOKEN=…`, then run **`./clusters/noble/apps/vault/configure-kubernetes-auth.sh`** (idempotent). Next, apply **`clusters/noble/apps/external-secrets/examples/vault-cluster-secret-store.yaml`** with `kubectl apply -f` on its own line (a pasted `# …` comment on the same line can be parsed as extra `kubectl` arguments and break `apply`). After a few seconds, **`kubectl get clustersecretstore vault`** should show **READY=True**.

Run these **from your workstation** (needs `kubectl`; no local `vault` binary required). Use a **short-lived admin token** or the root token **only in your shell** — do not paste tokens into logs or chat.

**1. Enable the auth method** (skip if already done):

```bash
kubectl -n vault exec -it sts/vault -- sh -c '
  export VAULT_ADDR=http://127.0.0.1:8200
  export VAULT_TOKEN="YOUR_ROOT_OR_ADMIN_TOKEN"
  vault auth enable kubernetes
'
```

**2. Configure `auth/kubernetes`** — the API **issuer** must match the `iss` claim on service account JWTs. With **kube-vip** / a custom API URL, discover it from the cluster (do not assume `kubernetes.default`):

```bash
ISSUER=$(kubectl get --raw /.well-known/openid-configuration | jq -r .issuer)
REVIEWER=$(kubectl -n vault create token vault --duration=8760h)
CA_B64=$(kubectl config view --raw --minify -o jsonpath='{.clusters[0].cluster.certificate-authority-data}')
```

Then apply the config **inside** the Vault pod (environment variables are passed in with `env` so quoting stays correct):

```bash
export VAULT_TOKEN="YOUR_ROOT_OR_ADMIN_TOKEN"
export ISSUER REVIEWER CA_B64
kubectl -n vault exec -i sts/vault -- env \
  VAULT_ADDR=http://127.0.0.1:8200 \
  VAULT_TOKEN="$VAULT_TOKEN" \
  CA_B64="$CA_B64" \
  REVIEWER="$REVIEWER" \
  ISSUER="$ISSUER" \
  sh -ec '
echo "$CA_B64" | base64 -d > /tmp/k8s-ca.crt
vault write auth/kubernetes/config \
  kubernetes_host="https://kubernetes.default.svc:443" \
  kubernetes_ca_cert=@/tmp/k8s-ca.crt \
  token_reviewer_jwt="$REVIEWER" \
  issuer="$ISSUER"
'
```

**3. KV v2** at path `secret` (skip if already enabled):

```bash
kubectl -n vault exec -it sts/vault -- sh -c '
  export VAULT_ADDR=http://127.0.0.1:8200
  export VAULT_TOKEN="YOUR_ROOT_OR_ADMIN_TOKEN"
  vault secrets enable -path=secret kv-v2
'
```

**4. Policy + role** for the External Secrets operator SA (`external-secrets` / `external-secrets`):

```bash
kubectl -n vault exec -it sts/vault -- sh -c '
export VAULT_ADDR=http://127.0.0.1:8200
export VAULT_TOKEN="YOUR_ROOT_OR_ADMIN_TOKEN"
vault policy write external-secrets - <<EOF
path "secret/data/*" {
  capabilities = ["read", "list"]
}
path "secret/metadata/*" {
  capabilities = ["read", "list"]
}
EOF
vault write auth/kubernetes/role/external-secrets \
  bound_service_account_names=external-secrets \
  bound_service_account_namespaces=external-secrets \
  policies=external-secrets \
  ttl=24h
'
```

**5. Apply** **`clusters/noble/apps/external-secrets/examples/vault-cluster-secret-store.yaml`** if you have not already, then verify:

```bash
kubectl describe clustersecretstore vault
```

See also [Kubernetes auth](https://developer.hashicorp.com/vault/docs/auth/kubernetes#configuration).

## TLS and External Secrets

`values.yaml` disables TLS on the Vault listener. The **`ClusterSecretStore`** example uses **`http://vault.vault.svc.cluster.local:8200`**. If you enable TLS on the listener, switch the URL to **`https://`** and configure **`caBundle`** or **`caProvider`** on the store.

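The store example follows the External Secrets Operator `vault` provider shape; a minimal sketch (field values assumed from this README — the committed example file is authoritative):

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault
spec:
  provider:
    vault:
      server: http://vault.vault.svc.cluster.local:8200
      path: secret              # KV v2 mount enabled in step 3
      version: v2
      auth:
        kubernetes:
          mountPath: kubernetes
          role: external-secrets
          serviceAccountRef:
            name: external-secrets
            namespace: external-secrets
```
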
## UI

Port-forward:

```bash
kubectl -n vault port-forward svc/vault-ui 8200:8200
```

Open `http://127.0.0.1:8200` and log in with the root token (rotate it for production workflows).
clusters/noble/bootstrap/vault/cilium-network-policy.yaml (new file, 40 lines)
@@ -0,0 +1,40 @@
# CiliumNetworkPolicy — restrict who may reach the Vault HTTP listener (8200).
# Apply after Cilium is healthy: kubectl apply -f clusters/noble/apps/vault/cilium-network-policy.yaml
#
# Ingress-only policy: egress from Vault is unchanged (Kubernetes auth needs API + DNS).
# Extend ingress rules if other namespaces must call Vault (e.g. app workloads).
#
# Ref: https://docs.cilium.io/en/stable/security/policy/language/
---
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: vault-http-ingress
  namespace: vault
spec:
  endpointSelector:
    matchLabels:
      app.kubernetes.io/name: vault
      component: server
  ingress:
    - fromEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": external-secrets
      toPorts:
        - ports:
            - port: "8200"
              protocol: TCP
    - fromEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": traefik
      toPorts:
        - ports:
            - port: "8200"
              protocol: TCP
    - fromEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": vault
      toPorts:
        - ports:
            - port: "8200"
              protocol: TCP
clusters/noble/bootstrap/vault/configure-kubernetes-auth.sh (new executable file, 77 lines)
@@ -0,0 +1,77 @@
#!/usr/bin/env bash
# Configure Vault Kubernetes auth + KV v2 + policy/role for External Secrets Operator.
# Requires: kubectl (cluster access) and jq (OpenID issuer discovery); Vault reachable via sts/vault.
#
# Usage (from repo root):
#   export KUBECONFIG=talos/kubeconfig   # or your path
#   export VAULT_TOKEN='…'               # root or admin token — never commit
#   ./clusters/noble/apps/vault/configure-kubernetes-auth.sh
#
# Then:   kubectl apply -f clusters/noble/apps/external-secrets/examples/vault-cluster-secret-store.yaml
# Verify: kubectl describe clustersecretstore vault

set -euo pipefail

: "${VAULT_TOKEN:?Set VAULT_TOKEN to your Vault root or admin token}"

ISSUER=$(kubectl get --raw /.well-known/openid-configuration | jq -r .issuer)
REVIEWER=$(kubectl -n vault create token vault --duration=8760h)
CA_B64=$(kubectl config view --raw --minify -o jsonpath='{.clusters[0].cluster.certificate-authority-data}')

kubectl -n vault exec -i sts/vault -- env \
  VAULT_ADDR=http://127.0.0.1:8200 \
  VAULT_TOKEN="$VAULT_TOKEN" \
  sh -ec '
    vault auth list >/tmp/vauth.txt
    grep -q "^kubernetes/" /tmp/vauth.txt || vault auth enable kubernetes
  '

kubectl -n vault exec -i sts/vault -- env \
  VAULT_ADDR=http://127.0.0.1:8200 \
  VAULT_TOKEN="$VAULT_TOKEN" \
  CA_B64="$CA_B64" \
  REVIEWER="$REVIEWER" \
  ISSUER="$ISSUER" \
  sh -ec '
    echo "$CA_B64" | base64 -d > /tmp/k8s-ca.crt
    vault write auth/kubernetes/config \
      kubernetes_host="https://kubernetes.default.svc:443" \
      kubernetes_ca_cert=@/tmp/k8s-ca.crt \
      token_reviewer_jwt="$REVIEWER" \
      issuer="$ISSUER"
  '

kubectl -n vault exec -i sts/vault -- env \
  VAULT_ADDR=http://127.0.0.1:8200 \
  VAULT_TOKEN="$VAULT_TOKEN" \
  sh -ec '
    vault secrets list >/tmp/vsec.txt
    grep -q "^secret/" /tmp/vsec.txt || vault secrets enable -path=secret kv-v2
  '

kubectl -n vault exec -i sts/vault -- env \
  VAULT_ADDR=http://127.0.0.1:8200 \
  VAULT_TOKEN="$VAULT_TOKEN" \
  sh -ec '
vault policy write external-secrets - <<EOF
path "secret/data/*" {
  capabilities = ["read", "list"]
}
path "secret/metadata/*" {
  capabilities = ["read", "list"]
}
EOF
vault write auth/kubernetes/role/external-secrets \
  bound_service_account_names=external-secrets \
  bound_service_account_namespaces=external-secrets \
  policies=external-secrets \
  ttl=24h
'

echo "Done. Issuer used: $ISSUER"
echo ""
echo "Next (each command on its own line — do not paste # comments after kubectl):"
echo "  kubectl apply -f clusters/noble/apps/external-secrets/examples/vault-cluster-secret-store.yaml"
echo "  kubectl get clustersecretstore vault"
clusters/noble/bootstrap/vault/namespace.yaml (new file, 5 lines)
@@ -0,0 +1,5 @@
# HashiCorp Vault — apply before Helm.
apiVersion: v1
kind: Namespace
metadata:
  name: vault
clusters/noble/bootstrap/vault/unseal-cronjob.yaml (new file, 63 lines)
@@ -0,0 +1,63 @@
# Optional lab auto-unseal: applies after Vault is initialized and Secret `vault-unseal-key` exists.
#
# 1) vault operator init -key-shares=1 -key-threshold=1   (lab only — single key)
# 2) kubectl -n vault create secret generic vault-unseal-key --from-literal=key='YOUR_UNSEAL_KEY'
# 3) kubectl apply -f clusters/noble/apps/vault/unseal-cronjob.yaml
#
# OSS Vault has no Kubernetes-Secret seal (auto-unseal uses cloud KMS or Transit); this CronJob
# runs vault operator unseal when the server is sealed.
# Protect the Secret with RBAC; prefer a supported auto-unseal seal for real environments.
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: vault-auto-unseal
  namespace: vault
spec:
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 3
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          securityContext:
            runAsNonRoot: true
            runAsUser: 100
            runAsGroup: 1000
            seccompProfile:
              type: RuntimeDefault
          containers:
            - name: unseal
              image: hashicorp/vault:1.21.2
              imagePullPolicy: IfNotPresent
              securityContext:
                allowPrivilegeEscalation: false
                capabilities:
                  drop:
                    - ALL
              env:
                - name: VAULT_ADDR
                  value: http://vault.vault.svc:8200
              command:
                - /bin/sh
                - -ec
                # vault status -format=json pretty-prints with a space after the colon,
                # so match '"key": value' with an optional-space pattern.
                - |
                  test -f /secrets/key || exit 0
                  status="$(vault status -format=json 2>/dev/null || true)"
                  echo "$status" | grep -Eq '"initialized": *true' || exit 0
                  echo "$status" | grep -Eq '"sealed": *false' && exit 0
                  vault operator unseal "$(cat /secrets/key)"
              volumeMounts:
                - name: unseal
                  mountPath: /secrets
                  readOnly: true
          volumes:
            - name: unseal
              secret:
                secretName: vault-unseal-key
                optional: true
                items:
                  - key: key
                    path: key
clusters/noble/bootstrap/vault/values.yaml (new file, 62 lines)
@@ -0,0 +1,62 @@
# HashiCorp Vault — noble (standalone, file storage on Longhorn; TLS disabled on listener for in-cluster HTTP).
#
# helm repo add hashicorp https://helm.releases.hashicorp.com
# helm repo update
# kubectl apply -f clusters/noble/apps/vault/namespace.yaml
# helm upgrade --install vault hashicorp/vault -n vault \
#   --version 0.32.0 -f clusters/noble/apps/vault/values.yaml --wait --timeout 15m
#
# Post-install: initialize, store the unseal key in a Secret, apply the optional unseal CronJob — see README.md
#
global:
  tlsDisable: true

injector:
  enabled: true

server:
  enabled: true
  dataStorage:
    enabled: true
    size: 10Gi
    storageClass: longhorn
    accessMode: ReadWriteOnce
  ha:
    enabled: false
  standalone:
    enabled: true
    config: |
      ui = true

      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }

      storage "file" {
        path = "/vault/data"
      }

  # Allow pod Ready before init/unseal so Helm --wait succeeds (see Vault /v1/sys/health docs).
  readinessProbe:
    enabled: true
    path: "/v1/sys/health?uninitcode=204&sealedcode=204&standbyok=true"
    port: 8200

  # LAN: TLS terminates at Traefik + cert-manager; the listener stays HTTP (global.tlsDisable).
  ingress:
    enabled: true
    ingressClassName: traefik
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    hosts:
      - host: vault.apps.noble.lab.pcenicni.dev
        paths: []
    tls:
      - secretName: vault-apps-noble-tls
        hosts:
          - vault.apps.noble.lab.pcenicni.dev

ui:
  enabled: true