Refactor noble cluster configurations by removing deprecated Argo CD application management files and transitioning to a streamlined Ansible-driven installation approach. Update kustomization.yaml files to reflect the new structure, ensuring clarity on resource management. Introduce new namespaces and configurations for cert-manager, external-secrets, and logging components, enhancing the overall deployment process. Add detailed README.md documentation for each component to guide users through the setup and management of the noble lab environment.
**`clusters/noble/bootstrap/cilium/README.md`** (new file, 34 lines)
# Cilium — noble (Talos)

Talos is configured with **`cluster.network.cni.name: none`**, so no CNI ships by default; you must install Cilium (or another CNI) before nodes become **Ready** and before **MetalLB** or most workloads can run. See `talos/CLUSTER-BUILD.md` for the ordering.
## 1. Install (phase 1 — required)

Uses **`values.yaml`**: IPAM mode **kubernetes**, **`k8sServiceHost` / `k8sServicePort`** pointed at **KubePrism** (`127.0.0.1:7445`, the Talos default), Talos cgroup paths, **`SYS_MODULE` dropped** from the agent capabilities, and **`bpf.masquerade: false`** ([Talos Cilium](https://www.talos.dev/latest/kubernetes-guides/network/deploying-cilium/), [KubePrism](https://www.talos.dev/latest/kubernetes-guides/configuration/kubeprism/)). Without this, host-network CNI clients may **`dial tcp <VIP>:6443`** and fail whenever the VIP path is unhealthy.

From the **repository root**:
```bash
helm repo add cilium https://helm.cilium.io/
helm repo update
helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --version 1.16.6 \
  -f clusters/noble/bootstrap/cilium/values.yaml \
  --wait
```
Verify:
```bash
kubectl -n kube-system rollout status ds/cilium
kubectl get nodes
```
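Two further spot checks can help here (a sketch; `k8s-app=cilium` is the chart's default agent label, and the PodCIDR column is only meaningful because `values.yaml` sets `ipam.mode: kubernetes`):

```shell
# All Cilium agent pods should be Running and Ready.
kubectl -n kube-system get pods -l k8s-app=cilium -o wide

# With ipam.mode: kubernetes, every node should have a PodCIDR allocated.
kubectl get nodes -o custom-columns=NODE:.metadata.name,PODCIDR:.spec.podCIDR
```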
When the nodes are **Ready**, continue with **MetalLB** (`clusters/noble/apps/metallb/README.md`) and the other Phase B items. **kube-vip** for the Kubernetes API VIP is separate (L2 ARP); it can run once the API is reachable.

## 2. Optional: kube-proxy replacement (phase 2)
To replace **`kube-proxy`** entirely with Cilium, use **`values-kpr.yaml`** and set **`cluster.proxy.disabled: true`** in the Talos machine config on every node (see the comments inside `values-kpr.yaml`). Follow the *without kube-proxy* variant of the upstream [Deploy Cilium CNI](https://www.talos.dev/latest/kubernetes-guides/network/deploying-cilium/) guide.

Do **not** skip phase 1 unless you already know your cluster matches the “bootstrap window” flow from the Talos docs.
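The Talos side of this switch can be expressed as a small machine-config patch (a sketch; the file name `kpr-patch.yaml` is illustrative, and your regenerate/apply flow may differ):

```yaml
# kpr-patch.yaml — disable kube-proxy; CNI stays externally managed.
cluster:
  proxy:
    disabled: true
  network:
    cni:
      name: none
```

Applied per node with something like `talosctl patch machineconfig --patch @kpr-patch.yaml -n <node>`, followed by a reboot where required.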
**`clusters/noble/bootstrap/cilium/values-kpr.yaml`** (new file, 49 lines)
```yaml
# Optional phase 2: kube-proxy replacement via Cilium + KubePrism (Talos apid forwards :7445 → :6443).
# Prerequisites:
#   1. Phase 1 Cilium installed and healthy; nodes Ready.
#   2. Add to the Talos machine config on ALL nodes:
#        cluster:
#          proxy:
#            disabled: true
#      (keep cluster.network.cni.name: none). Regenerate, apply-config, reboot as needed.
#   3. Remove legacy kube-proxy objects if still present:
#        kubectl delete ds -n kube-system kube-proxy --ignore-not-found
#        kubectl delete cm -n kube-system kube-proxy --ignore-not-found
#   4. helm upgrade cilium ... -f values-kpr.yaml
#
# Ref: https://www.talos.dev/latest/kubernetes-guides/network/deploying-cilium/

ipam:
  mode: kubernetes

kubeProxyReplacement: "true"

k8sServiceHost: localhost
k8sServicePort: "7445"

securityContext:
  capabilities:
    ciliumAgent:
      - CHOWN
      - KILL
      - NET_ADMIN
      - NET_RAW
      - IPC_LOCK
      - SYS_ADMIN
      - SYS_RESOURCE
      - DAC_OVERRIDE
      - FOWNER
      - SETGID
      - SETUID
    cleanCiliumState:
      - NET_ADMIN
      - SYS_ADMIN
      - SYS_RESOURCE

cgroup:
  autoMount:
    enabled: false
  hostRoot: /sys/fs/cgroup

bpf:
  masquerade: false
```
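The elided helm command in step 4 could be expanded as follows; this is one possible reading, not the file's prescription, assuming the phase-1 flags are reused and relying on helm merging `-f` files left to right (later files override earlier ones):

```shell
helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --version 1.16.6 \
  -f clusters/noble/bootstrap/cilium/values.yaml \
  -f clusters/noble/bootstrap/cilium/values-kpr.yaml \
  --wait
```

Layering both files keeps the shared settings (capabilities, cgroup paths, `bpf.masquerade`) in one place while `values-kpr.yaml` flips only the kube-proxy-related keys.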
**`clusters/noble/bootstrap/cilium/values.yaml`** (new file, 44 lines)
```yaml
# Cilium on Talos — phase 1: bring up the CNI while kube-proxy still runs.
# See README.md for install order (before MetalLB scheduling) and the optional kube-proxy replacement.
#
# Chart: cilium/cilium — pin the version in the helm command (e.g. 1.16.6).
# Ref: https://www.talos.dev/latest/kubernetes-guides/network/deploying-cilium/

ipam:
  mode: kubernetes

kubeProxyReplacement: "false"

# Host-network components cannot use the kubernetes.default ClusterIP; Talos KubePrism (enabled by default)
# on 127.0.0.1:7445 proxies to healthy apiservers and avoids flaky dials to cluster.controlPlane.endpoint (the VIP).
# Ref: https://www.talos.dev/latest/kubernetes-guides/configuration/kubeprism/
k8sServiceHost: "127.0.0.1"
k8sServicePort: "7445"

securityContext:
  capabilities:
    ciliumAgent:
      - CHOWN
      - KILL
      - NET_ADMIN
      - NET_RAW
      - IPC_LOCK
      - SYS_ADMIN
      - SYS_RESOURCE
      - DAC_OVERRIDE
      - FOWNER
      - SETGID
      - SETUID
    cleanCiliumState:
      - NET_ADMIN
      - SYS_ADMIN
      - SYS_RESOURCE

cgroup:
  autoMount:
    enabled: false
  hostRoot: /sys/fs/cgroup

# Workaround: Talos host DNS forwarding + bpf masquerade can break CoreDNS; see the Talos Cilium guide's "Known issues".
bpf:
  masquerade: false
```
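To confirm which values actually landed after either phase (a quick check, assuming the release name `cilium` used above; `enable-bpf-masquerade` is the agent ConfigMap key that `bpf.masquerade` renders to):

```shell
# Show the user-supplied values for the release.
helm -n kube-system get values cilium

# Spot-check that bpf.masquerade stayed false in the live agent config.
kubectl -n kube-system get cm cilium-config \
  -o jsonpath='{.data.enable-bpf-masquerade}'
```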