Update Longhorn runbook documentation for clarity and compliance. Adjusted section references for consistency and added details on security and compliance measures regarding RBAC and namespace management.
20
clusters/noble/bootstrap/longhorn/README.md
Normal file
@@ -0,0 +1,20 @@
# Longhorn on noble — install notes
Helm values, namespace PSA, and (when Authentik is enabled) ForwardAuth overlays live in this directory. Install flow is covered in [`ansible/roles/noble_longhorn`](../../../../ansible/roles/noble_longhorn/) and [`talos/runbooks/longhorn.md`](../../../../talos/runbooks/longhorn.md).
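
For orientation, the "namespace PSA" piece mentioned above usually comes down to a Pod Security Admission label on the namespace. The manifest below is only a sketch under that assumption: the label key is the standard Kubernetes one, but the actual contents of [`namespace.yaml`](./namespace.yaml) are not reproduced here and may set further labels such as audit or warn levels.

```yaml
# Hypothetical sketch, not a copy of this repo's namespace.yaml.
apiVersion: v1
kind: Namespace
metadata:
  name: longhorn-system
  labels:
    # Pod Security Admission: permit privileged pods in this namespace only.
    pod-security.kubernetes.io/enforce: privileged
```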
## RBAC, Trivy KSV, and accepted risk
The upstream Longhorn chart ships a **`longhorn-role` ClusterRole** with broad permissions: wildcard verbs on several API groups, **list/watch on Secrets** (policy tools treat cluster-scoped secret reads as high risk), **create/patch/delete** on mutating/validating **WebhookConfiguration** objects, and **delete/deletecollection** on **Pods**. Trivy’s built-in Kubernetes checks (for example **AVD-KSV-0041**, **0045**, **0048**, **0114**) flag that role. **This is expected** for a storage controller that installs CRDs, runs CSI-style components, and manages workload pods; shrinking that role without upstream support is likely to **break Longhorn**.

The chart also includes a **support-bundle** flow that binds a dedicated service account to **`cluster-admin`**. Treat that as **high privilege**: limit who can create or use support-bundle workloads in **`longhorn-system`**, and disable or avoid the feature if you do not need vendor diagnostics.
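
One possible reading of "limit who can create or use support-bundle workloads" is to give most users nothing more than a read-only Role in **`longhorn-system`**. The sketch below assumes that approach; the Role name `longhorn-readonly` and the group `storage-viewers` are hypothetical and do not come from this repo. Because Kubernetes RBAC is additive, it only helps if those users hold no broader binding elsewhere.

```yaml
# Hypothetical read-only access for non-operators in longhorn-system.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: longhorn-readonly          # hypothetical name
  namespace: longhorn-system
rules:
  # Read-only on pods and events; no create/delete, and no pods/exec subresource.
  - apiGroups: [""]
    resources: ["pods", "events"]
    verbs: ["get", "list", "watch"]
  # Jobs can be inspected but not created.
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: longhorn-readonly          # hypothetical name
  namespace: longhorn-system
subjects:
  - kind: Group
    name: storage-viewers          # hypothetical group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: longhorn-readonly
  apiGroup: rbac.authorization.k8s.io
```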
### Mitigations we rely on instead of forking RBAC
| Area | What we do |
| --- | --- |
| **Pod Security Admission** | **`longhorn-system`** is labeled **privileged** in [`namespace.yaml`](./namespace.yaml) because Longhorn requires hostPath and privileged pods; other namespaces stay on stricter defaults where configured. |
| **UI access** | Longhorn UI is exposed through **Traefik** with **oauth2-proxy** ForwardAuth to **Authentik** when the Authentik role is applied (see [`values-authentik-forwardauth.yaml`](./values-authentik-forwardauth.yaml) and [`ansible/roles/noble_authentik/README.md`](../../../../ansible/roles/noble_authentik/README.md)). |
| **Network segmentation** | Cluster CNI is **Cilium**. Add **NetworkPolicy** (or Cilium **CiliumNetworkPolicy**) for **`longhorn-system`** and workloads that talk to the Longhorn API if you need tighter east-west boundaries; this repo does not ship a default deny for Longhorn. |
| **Support bundles** | Restrict **`longhorn-system`** RBAC (who can create Jobs/Pods, impersonate, or exec) and Longhorn UI/API access so only trusted operators can trigger vendor support tooling. |

**Trivy Operator:** workload scans skip **`longhorn-system`** via **`excludeNamespaces`** in [`clusters/noble/apps/trivy/values.yaml`](../../apps/trivy/values.yaml). **ClusterRole** config audits are cluster-scoped, so findings on **`longhorn-role`** can still appear; treat them as a **documented vendor baseline** unless you narrow operator config (for example dropping **ClusterRole** from config-audit kinds), which affects the whole cluster, not only Longhorn.
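
For the network segmentation row in the table above, a narrow starting point might be a policy that admits traffic to the Longhorn UI pods only from the ingress controller's namespace. This is a sketch under several assumptions that should be checked against the running cluster: the `app: longhorn-ui` pod label, port `8000`, and a Traefik namespace named `traefik` are not confirmed by this repo, and the policy name is hypothetical. It selects only the UI pods, so other Longhorn components are left unrestricted.

```yaml
# Hypothetical sketch: restrict ingress to the Longhorn UI pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: longhorn-ui-from-traefik   # hypothetical name
  namespace: longhorn-system
spec:
  podSelector:
    matchLabels:
      app: longhorn-ui             # assumed UI pod label; verify with kubectl
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Allow only pods in the ingress controller's namespace.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: traefik   # assumed namespace name
      ports:
        - protocol: TCP
          port: 8000               # assumed UI container port
```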
@@ -5,7 +5,7 @@
**Checks**

1. `kubectl -n longhorn-system get pods` and `kubectl get nodes.longhorn.io -o wide`.
2. Talos user disk + extensions for Longhorn (see [`talos/README.md`](../README.md) section 5 and `talconfig.with-longhorn.yaml`).
3. `kubectl get sc`: confirm **longhorn** is the default as expected; for PVC events, run `kubectl describe pvc -n <ns> <name>`.

**Common fixes**
@@ -13,4 +13,6 @@
- Node disk pressure / mount missing: fix Talos machine config, reboot node per Talos docs.
- Recovery / GPT wipe scripts: [`talos/scripts/longhorn-gpt-recovery.sh`](../scripts/longhorn-gpt-recovery.sh) and CLUSTER-BUILD notes.

**Security / compliance (Trivy KSV on `longhorn-role`):** Upstream Longhorn RBAC is expected to fail strict built-in checks; we accept that for a storage controller and mitigate with PSA on the namespace, OIDC/ForwardAuth for the UI, network policy where you add it, and tight control over support-bundle use. See [`clusters/noble/bootstrap/longhorn/README.md`](../../clusters/noble/bootstrap/longhorn/README.md).

**References:** [`clusters/noble/bootstrap/longhorn/`](../../clusters/noble/bootstrap/longhorn/), [`clusters/noble/bootstrap/longhorn/README.md`](../../clusters/noble/bootstrap/longhorn/README.md) (RBAC posture), [Longhorn docs](https://longhorn.io/docs/).