Enhance documentation and configuration for Velero integration. Update README.md to clarify Velero's lack of web UI and usage instructions for CLI. Add CSI Volume Snapshot support in playbooks and roles, and include Velero service details in noble_landing_urls. Adjust kustomization.yaml to include VolumeSnapshotClass configuration, ensuring proper setup for backups. Improve overall clarity in related documentation.

Nikholas Pcenicni
2026-03-28 19:34:43 -04:00
parent 33a10dc7e9
commit 544f75b0ee
15 changed files with 128 additions and 22 deletions


@@ -1,6 +1,7 @@
# Homepage — [gethomepage/homepage](https://github.com/gethomepage/homepage) via [jameswynn/homepage](https://github.com/jameswynn/helm-charts) Helm chart.
# Ingress: Traefik + cert-manager (same pattern as `clusters/noble/bootstrap/headlamp/values.yaml`).
# Service links match **`ansible/roles/noble_landing_urls/defaults/main.yml`** (`noble_lab_ui_entries`).
# **Velero** has no in-cluster web UI — tile links to upstream docs (no **siteMonitor**).
#
# **`siteMonitor`** runs **server-side** in the Homepage pod (see `gethomepage/homepage` `siteMonitor.js`).
# Public FQDNs like **`*.apps.noble.lab.pcenicni.dev`** often do **not** resolve inside the cluster
@@ -84,6 +85,10 @@ config:
# Unauthenticated health (HEAD/GET) — not the redirecting UI root
siteMonitor: http://vault.vault.svc.cluster.local:8200/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204
description: Secrets engine UI (after init/unseal)
- Velero:
icon: mdi-backup-restore
href: https://velero.io/docs/
description: Cluster backups — no in-cluster web UI; use velero CLI or kubectl (docs)
widgets:
- datetime:
text_size: xl


@@ -0,0 +1,16 @@
# CSI Volume Snapshot (external-snapshotter)
Installs the **Volume Snapshot** CRDs and the **snapshot-controller** so CSI drivers (e.g. **Longhorn**) and **Velero** can use `VolumeSnapshot` / `VolumeSnapshotContent` / `VolumeSnapshotClass`.
- Upstream: [kubernetes-csi/external-snapshotter](https://github.com/kubernetes-csi/external-snapshotter) **v8.5.0**
- **Not** the per-driver **csi-snapshotter** sidecar — Longhorn ships that with its CSI components.
**Order:** apply this **before** anything relies on volume snapshots (e.g. before, or together with, **Longhorn**; **Ansible** runs this after **Cilium** and before **metrics-server** / **Longhorn**).
```bash
kubectl apply -k clusters/noble/bootstrap/csi-snapshot-controller/crd
kubectl apply -k clusters/noble/bootstrap/csi-snapshot-controller/controller
kubectl -n kube-system rollout status deploy/snapshot-controller --timeout=120s
```
After this, create or label a **VolumeSnapshotClass** for Longhorn (`velero.io/csi-volumesnapshot-class: "true"`) per `clusters/noble/bootstrap/velero/README.md`.
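A quick way to verify the CRDs and then label an existing class (a sketch — the class name `longhorn-snapshot-vsc` is an assumption; use whatever `kubectl get volumesnapshotclass` reports):
```bash
# Confirm the snapshot.storage.k8s.io CRDs exist
kubectl get crd | grep snapshot.storage.k8s.io

# Label an existing Longhorn VolumeSnapshotClass so Velero treats it as the default for that driver
kubectl label volumesnapshotclass longhorn-snapshot-vsc \
  velero.io/csi-volumesnapshot-class="true" --overwrite
```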


@@ -0,0 +1,8 @@
# Snapshot controller — **kube-system** (upstream default).
# Image tag should match the external-snapshotter release family (see setup-snapshot-controller.yaml in that tag).
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: kube-system
resources:
- https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.5.0/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
- https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.5.0/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
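# Sketch (not part of this kustomization): if the snapshot-controller image ever needs to be
# pinned or mirrored, a kustomize `images` override could be added here — the image name is
# assumed from the upstream setup-snapshot-controller.yaml at this tag:
#
#   images:
#     - name: registry.k8s.io/sig-storage/snapshot-controller
#       newTag: v8.5.0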


@@ -0,0 +1,9 @@
# kubernetes-csi/external-snapshotter — Volume Snapshot GA CRDs only (no VolumeGroupSnapshot).
# Pin **ref** when bumping; keep in sync with the **controller** kustomization (same release tag).
# https://github.com/kubernetes-csi/external-snapshotter/tree/v8.5.0/client/config/crd
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.5.0/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
- https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.5.0/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
- https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v8.5.0/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
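# Quick post-apply check (sketch):
#   kubectl get crd volumesnapshotclasses.snapshot.storage.k8s.io \
#     volumesnapshotcontents.snapshot.storage.k8s.io volumesnapshots.snapshot.storage.k8s.io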


@@ -13,6 +13,7 @@ resources:
- vault/namespace.yaml
- kyverno/namespace.yaml
- velero/namespace.yaml
- velero/longhorn-volumesnapshotclass.yaml
- headlamp/namespace.yaml
- grafana-loki-datasource/loki-datasource.yaml
- vault/unseal-cronjob.yaml


@@ -4,25 +4,20 @@ Ansible-managed core stack — **not** reconciled by Argo CD (`clusters/noble/ap
## What you get
- **No web UI** — Velero is operated with the **`velero`** CLI and **`kubectl`** (Backup, Schedule, Restore CRDs); a short CLI sketch follows this list. Metrics are exposed for Prometheus; there is no first-party dashboard in this chart.
- **vmware-tanzu/velero** Helm chart (**12.0.0** → Velero **1.18.0**) in namespace **`velero`**
- **AWS plugin** init container for **S3-compatible** object storage (`velero/velero-plugin-for-aws:v1.14.0`)
- **CSI snapshots** via Velero's built-in CSI support (`EnableCSI`) and **VolumeSnapshotLocation** `velero.io/csi` (no separate CSI plugin image for Velero ≥ 1.14)
- **Prometheus** scraping: **ServiceMonitor** labeled for **kube-prometheus** (`release: kube-prometheus`)
- **Schedule** **`velero-daily-noble`**: cron **`0 3 * * *`** (daily at 03:00 in the Velero pod's timezone, usually **UTC**), **720h** TTL per backup (~30 days). Edit **`values.yaml`** `schedules` to change time or retention.
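Day-to-day operation goes through the CLI; a minimal sketch (backup and namespace names below are illustrative, not defined by this repo):
```bash
# Inspect the chart-managed schedule and its backups
velero schedule get
velero backup get

# Ad-hoc backup of one namespace, then inspect it
velero backup create manual-test --include-namespaces demo --wait
velero backup describe manual-test --details

# Restore from an existing backup
velero restore create --from-backup manual-test
```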
## Prerequisites
1. **Volume Snapshot APIs** installed cluster-wide — **`clusters/noble/bootstrap/csi-snapshot-controller/`** (Ansible **`noble_csi_snapshot_controller`**, after **Cilium**). Without the **`snapshot.storage.k8s.io`** CRDs and **`kube-system/snapshot-controller`**, Velero logs errors like `no matches for kind "VolumeSnapshot"`.
2. **Longhorn** (or another CSI driver) with a **VolumeSnapshotClass** for that driver.
3. For **Longhorn**, this repo applies **`velero/longhorn-volumesnapshotclass.yaml`** (`VolumeSnapshotClass` **`longhorn-velero`**, driver **`driver.longhorn.io`**, Velero label). It is included in **`clusters/noble/bootstrap/kustomization.yaml`** (same apply as other bootstrap YAML). For non-Longhorn drivers, add a class labeled **`velero.io/csi-volumesnapshot-class: "true"`** (see [Velero CSI](https://velero.io/docs/main/csi/)); a verification sketch follows this list.
4. **S3-compatible** endpoint (MinIO, VersityGW, AWS, etc.) and a **bucket**.
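To confirm Velero will find a labeled snapshot class for the Longhorn driver (a sketch, assuming the bootstrap kustomization has been applied):
```bash
# The longhorn-velero class should carry velero.io/csi-volumesnapshot-class=true
kubectl get volumesnapshotclass --show-labels
kubectl get volumesnapshotclass -l velero.io/csi-volumesnapshot-class=true
```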
## Credentials Secret


@@ -0,0 +1,11 @@
# Default Longhorn VolumeSnapshotClass for Velero CSI — only one class per driver should carry
# **velero.io/csi-volumesnapshot-class: "true"** (see velero/README.md).
# Apply after **Longhorn** CSI is running (`driver.longhorn.io`).
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-velero
  labels:
    velero.io/csi-volumesnapshot-class: "true"
driver: driver.longhorn.io
deletionPolicy: Delete


@@ -54,4 +54,12 @@ metrics:
    additionalLabels:
      release: kube-prometheus
# Daily full-cluster backup at 03:00 — cron is evaluated in the Velero pod (typically **UTC**; set TZ on the
# Deployment if you need local wall clock). Apply with `helm upgrade --install` (sketch after this file).
schedules:
  daily-noble:
    disabled: false
    schedule: "0 3 * * *"
    template:
      ttl: 720h
      storageLocation: default
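# Sketch of applying this file (release name, repo alias, and values path are assumptions based on
# this repo's layout; the chart version matches velero/README.md):
#   helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
#   helm upgrade --install velero vmware-tanzu/velero --version 12.0.0 \
#     --namespace velero --create-namespace -f clusters/noble/bootstrap/velero/values.yaml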