Enable pre-upgrade job for Longhorn in values.yaml, update MetalLB README for clarity on LoadBalancer IP assignment, and enhance Talos configuration with node IP validation for VIPs. Update cluster build documentation to reflect new application versions and configurations.
This commit is contained in:
@@ -9,7 +9,12 @@ This document is the **exported TODO** for the **noble** Talos cluster (4 nodes)
|
||||
- **MetalLB** Helm **0.15.3** / app **v0.15.3**; **IPAddressPool** `noble-l2` + **L2Advertisement** — pool **`192.168.50.210`–`192.168.50.229`**.
|
||||
- **kube-vip** DaemonSet **3/3** on control planes; VIP **`192.168.50.230`** on **`ens18`** (`vip_subnet` **`/32`** required — bare **`32`** breaks parsing). **Verified from workstation:** `kubectl config set-cluster noble --server=https://192.168.50.230:6443` then **`kubectl get --raw /healthz`** → **`ok`** (`talos/kubeconfig`; see `talos/README.md`).
|
||||
- **metrics-server** Helm **3.13.0** / app **v0.8.0** — `clusters/noble/apps/metrics-server/values.yaml` (`--kubelet-insecure-tls` for Talos); **`kubectl top nodes`** works.
|
||||
- **Still open:** Longhorn, Traefik, cert-manager, Argo CD, observability — checklist below.
|
||||
- **Longhorn** Helm **1.11.1** / app **v1.11.1** — `clusters/noble/apps/longhorn/` (PSA **privileged** namespace, `defaultDataPath` `/var/mnt/longhorn`, `preUpgradeChecker` enabled); **StorageClass** `longhorn` (default); **`nodes.longhorn.io`** all **Ready**; test **PVC** `Bound` on `longhorn`.
|
||||
- **Traefik** Helm **39.0.6** / app **v3.6.11** — `clusters/noble/apps/traefik/`; **`Service`** **`LoadBalancer`** **`EXTERNAL-IP` `192.168.50.211`**; **`IngressClass`** **`traefik`** (default). Point **`*.apps.noble.lab.pcenicni.dev`** at **`192.168.50.211`**. MetalLB pool verification was done before replacing the temporary nginx test with Traefik.
|
||||
- **cert-manager** Helm **v1.20.0** / app **v1.20.0** — `clusters/noble/apps/cert-manager/`; **`ClusterIssuer`** **`letsencrypt-staging`** and **`letsencrypt-prod`** (HTTP-01, ingress class **`traefik`**); ACME email **`certificates@noble.lab.pcenicni.dev`** (edit in manifests if you want a different mailbox).
|
||||
- **Newt** Helm **1.2.0** / app **1.10.1** — `clusters/noble/apps/newt/` (**fossorial/newt**); Pangolin site tunnel — **`newt-pangolin-auth`** Secret (**`PANGOLIN_ENDPOINT`**, **`NEWT_ID`**, **`NEWT_SECRET`**). **Public DNS** is **not** automated with ExternalDNS: **CNAME** records at your DNS host per Pangolin’s domain instructions, plus **Integration API** for HTTP resources/targets — see **`clusters/noble/apps/newt/README.md`**. LAN access to Traefik can still use **`*.apps.noble.lab.pcenicni.dev`** → **`192.168.50.211`** (split horizon / local resolver).
|
||||
- **Argo CD** Helm **9.4.17** / app **v3.3.6** — `clusters/noble/bootstrap/argocd/`; **`argocd-server`** **`LoadBalancer`** **`192.168.50.210`**; app-of-apps scaffold under **`bootstrap/argocd/apps/`** (edit **`root-application.yaml`** `repoURL` before applying).
|
||||
- **Still open:** observability — checklist below.
|
||||
|
||||
## Inventory
|
||||
|
||||
@@ -27,8 +32,9 @@ This document is the **exported TODO** for the **noble** Talos cluster (4 nodes)
|
||||
| Kubernetes API VIP (kube-vip) | `192.168.50.230` (see `talos/README.md`; align with `talos/talconfig.yaml` `additionalApiServerCertSans`) |
|
||||
| MetalLB L2 pool | `192.168.50.210`–`192.168.50.229` |
|
||||
| Argo CD `LoadBalancer` | **Pick one IP** in the MetalLB pool (e.g. `192.168.50.210`) |
|
||||
| Apps ingress DNS | `*.apps.noble.lab.pcenicni.dev` |
|
||||
| ExternalDNS | Pangolin (map to supported ExternalDNS provider when documented) |
|
||||
| Traefik (apps ingress) | `192.168.50.211` — **`metallb.io/loadBalancerIPs`** in `clusters/noble/apps/traefik/values.yaml` |
|
||||
| Apps ingress (LAN / split horizon) | `*.apps.noble.lab.pcenicni.dev` → Traefik LB |
|
||||
| Public DNS (Pangolin) | **Newt** tunnel + **CNAME** at registrar + **Integration API** — `clusters/noble/apps/newt/` |
|
||||
| Velero | S3-compatible URL — configure later |
|
||||
|
||||
## Versions
|
||||
@@ -39,6 +45,11 @@ This document is the **exported TODO** for the **noble** Talos cluster (4 nodes)
|
||||
- Cilium: **1.16.6** (Helm chart; see `clusters/noble/apps/cilium/README.md`)
|
||||
- MetalLB: **0.15.3** (Helm chart; app **v0.15.3**)
|
||||
- metrics-server: **3.13.0** (Helm chart; app **v0.8.0**)
|
||||
- Longhorn: **1.11.1** (Helm chart; app **v1.11.1**)
|
||||
- Traefik: **39.0.6** (Helm chart; app **v3.6.11**)
|
||||
- cert-manager: **v1.20.0** (Helm chart; app **v1.20.0**)
|
||||
- Newt (Fossorial): **1.2.0** (Helm chart; app **1.10.1**)
|
||||
- Argo CD: **9.4.17** (Helm chart `argo/argo-cd`; app **v3.3.6**)
|
||||
|
||||
## Repo paths (this workspace)
|
||||
|
||||
@@ -52,8 +63,12 @@ This document is the **exported TODO** for the **noble** Talos cluster (4 nodes)
|
||||
| kube-vip (kustomize) | `clusters/noble/apps/kube-vip/` (`vip_interface` e.g. `ens18`) |
|
||||
| Cilium (Helm values) | `clusters/noble/apps/cilium/` — `values.yaml` (phase 1), optional `values-kpr.yaml`, `README.md` |
|
||||
| MetalLB | `clusters/noble/apps/metallb/` — `namespace.yaml` (PSA **privileged**), `ip-address-pool.yaml`, `kustomization.yaml`, `README.md` |
|
||||
| Longhorn Helm values | `clusters/noble/apps/longhorn/values.yaml` |
|
||||
| Longhorn | `clusters/noble/apps/longhorn/` — `values.yaml`, `namespace.yaml` (PSA **privileged**), `kustomization.yaml` |
|
||||
| metrics-server (Helm values) | `clusters/noble/apps/metrics-server/values.yaml` |
|
||||
| Traefik (Helm values) | `clusters/noble/apps/traefik/` — `values.yaml`, `namespace.yaml`, `README.md` |
|
||||
| cert-manager (Helm + ClusterIssuers) | `clusters/noble/apps/cert-manager/` — `values.yaml`, `namespace.yaml`, `kustomization.yaml`, `README.md` |
|
||||
| Newt / Pangolin tunnel (Helm) | `clusters/noble/apps/newt/` — `values.yaml`, `namespace.yaml`, `README.md` |
|
||||
| Argo CD (bootstrap + app-of-apps) | `clusters/noble/bootstrap/argocd/` — `values.yaml`, `root-application.yaml`, `apps/`, `README.md` |
|
||||
|
||||
**Git vs cluster:** manifests and `talconfig` live in git; **`talhelper genconfig -o out`**, bootstrap, Helm, and `kubectl` run on your LAN. See **`talos/README.md`** for workstation reachability (lab LAN/VPN), **`talosctl kubeconfig`** vs Kubernetes `server:` (VIP vs node IP), and **`--insecure`** only in maintenance.
|
||||
|
||||
@@ -91,17 +106,18 @@ This document is the **exported TODO** for the **noble** Talos cluster (4 nodes)
|
||||
|
||||
- [x] **Cilium** (Helm **1.16.6**) — **required** before MetalLB if `cni: none` (`clusters/noble/apps/cilium/`)
|
||||
- [x] **metrics-server** — Helm **3.13.0**; values in `clusters/noble/apps/metrics-server/values.yaml`; verify `kubectl top nodes`
|
||||
- [ ] **Longhorn** — Talos: `talconfig.with-longhorn.yaml` + `talos/README.md` §5; Helm: `clusters/noble/apps/longhorn/values.yaml` (`defaultDataPath` `/var/mnt/longhorn`)
|
||||
- [x] **Longhorn** — Talos: user volume + kubelet mounts + extensions (`talos/README.md` §5); Helm **1.11.1**; `kubectl apply -k clusters/noble/apps/longhorn`; verify **`nodes.longhorn.io`** and test PVC **`Bound`**
|
||||
- [x] **MetalLB** — chart installed; **pool + L2** from `clusters/noble/apps/metallb/` applied (`192.168.50.210`–`229`)
|
||||
- [ ] **`Service` `LoadBalancer`** test — assign an IP from `210`–`229` (e.g. dummy `LoadBalancer` or Traefik)
|
||||
- [ ] **Traefik** `LoadBalancer` for `*.apps.noble.lab.pcenicni.dev`
|
||||
- [ ] **cert-manager** + ClusterIssuer (staging → prod)
|
||||
- [ ] **ExternalDNS** (Pangolin-compatible provider)
|
||||
- [x] **`Service` `LoadBalancer`** / pool check — MetalLB assigns from `210`–`229` (validated before Traefik; temporary nginx test removed in favor of Traefik)
|
||||
- [x] **Traefik** `LoadBalancer` for `*.apps.noble.lab.pcenicni.dev` — `clusters/noble/apps/traefik/`; **`192.168.50.211`**
|
||||
- [x] **cert-manager** + ClusterIssuer (**`letsencrypt-staging`** / **`letsencrypt-prod`**) — `clusters/noble/apps/cert-manager/`
|
||||
- [x] **Newt** (Pangolin tunnel; replaces ExternalDNS for public DNS) — `clusters/noble/apps/newt/` — **`newt-pangolin-auth`**; CNAME + **Integration API** per **`newt/README.md`**
|
||||
|
||||
## Phase C — GitOps
|
||||
|
||||
- [ ] **Argo CD** bootstrap (`clusters/noble/bootstrap/argocd`, root app) — path TBD when added
|
||||
- [ ] Argo CD server **LoadBalancer** with dedicated pool IP
|
||||
- [x] **Argo CD** bootstrap — `clusters/noble/bootstrap/argocd/` (`helm upgrade --install argocd …`)
|
||||
- [x] Argo CD server **LoadBalancer** — **`192.168.50.210`** (see `values.yaml`)
|
||||
- [ ] **App-of-apps** — set **`repoURL`** in **`root-application.yaml`**, add **`Application`** manifests under **`bootstrap/argocd/apps/`**, apply **`root-application.yaml`**
|
||||
- [ ] SSO — later
|
||||
|
||||
## Phase D — Observability
|
||||
@@ -129,9 +145,10 @@ This document is the **exported TODO** for the **noble** Talos cluster (4 nodes)
|
||||
|
||||
- [x] `kubectl get nodes` — all **Ready**
|
||||
- [x] API via VIP `:6443` — **`kubectl get --raw /healthz`** → **`ok`** with kubeconfig **`server:`** `https://192.168.50.230:6443`
|
||||
- [ ] Test `LoadBalancer` receives IP from `210`–`229`
|
||||
- [ ] Sample Ingress + cert + ExternalDNS record
|
||||
- [ ] PVC bound; Prometheus/Loki durable if configured
|
||||
- [x] Ingress **`LoadBalancer`** in pool `210`–`229` (**Traefik** → **`192.168.50.211`**)
|
||||
- [x] **Argo CD** UI — **`argocd-server`** **`LoadBalancer`** **`192.168.50.210`** (initial **`admin`** password from **`argocd-initial-admin-secret`**)
|
||||
- [ ] Sample Ingress + cert (cert-manager ready) + Pangolin resource + CNAME
|
||||
- [x] PVC **`Bound`** on **Longhorn** (`storageClassName: longhorn`); Prometheus/Loki durable when configured
|
||||
|
||||
---
|
||||
|
||||
|
||||
Reference in New Issue
Block a user