Update Ansible configuration to integrate SOPS for managing secrets. Enhance README.md with SOPS usage instructions and prerequisites. Remove External Secrets Operator references and related configurations from the bootstrap process, streamlining the deployment. Adjust playbooks and roles to apply SOPS-encrypted secrets automatically, improving security and clarity in secret management.

Nikholas Pcenicni
2026-03-30 22:42:52 -04:00
parent 023ebfee5d
commit 3a6e5dff5b
44 changed files with 644 additions and 809 deletions

docs/Racks.md · new file · 169 additions

@@ -0,0 +1,169 @@
# Physical racks — Noble lab (10")
This page gives the **logical rack layout** for the **noble** Talos lab: **three 10" (half-width) racks**, how **rack units (U)** are allocated, and the **Ethernet** paths on **`192.168.50.0/24`**. Node names and IPs match [`talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md) and [`docs/architecture.md`](architecture.md).
## Legend
| Symbol | Meaning |
|--------|---------|
| `█` / filled cell | Equipment occupying that **1U** |
| `░` | Reserved / future use |
| `·` | Empty |
| `━━` | Copper to LAN switch |
**Rack unit numbering:** **U increases upward** (U1 = bottom of rack, like ANSI/EIA). **Slot** in the diagrams is **top → bottom** reading order for a quick visual scan.
### Three racks at a glance
Read **top → bottom** (first row = top of rack).
| Primary (10") | Storage B (10") | Rack C (10") |
|-----------------|-----------------|--------------|
| Fiber ONT | Mac Mini | *empty* |
| UniFi Fiber Gateway | NAS | *empty* |
| Patch panel | JBOD | *empty* |
| 2.5 GbE ×8 PoE switch | *empty* | *empty* |
| Raspberry Pi cluster | *empty* | *empty* |
| **helium** (Talos) | *empty* | *empty* |
| **neon** (Talos) | *empty* | *empty* |
| **argon** (Talos) | *empty* | *empty* |
| **krypton** (Talos) | *empty* | *empty* |
**Connectivity:** Primary-rack gear shares **one L2** (`192.168.50.0/24`). Storage B and Rack C join the same L2 once cabled (e.g. **Ethernet** back to the PoE switch, or a **VPN** / flat-LAN extension, depending on your design).
---
## Rack A — LAN aggregation (10" × 12U)
Dedicated to **Layer-2 access** and cable home runs. All cluster nodes plug into this switch (or into a downstream switch that uplinks here).
```
TOP OF RACK
┌────────────────────────────────────────┐
│ Slot 1 ········· empty ·············· │ 12U
│ Slot 2 ········· empty ·············· │ 11U
│ Slot 3 ········· empty ·············· │ 10U
│ Slot 4 ········· empty ·············· │ 9U
│ Slot 5 ········· empty ·············· │ 8U
│ Slot 6 ········· empty ·············· │ 7U
│ Slot 7 ░░░░░░░ optional PDU ░░░░░░░░ │ 6U
│ Slot 8 █████ 1U cable manager ██████ │ 5U
│ Slot 9 █████ 1U patch panel █████████ │ 4U
│ Slot10 ███ 8-port managed switch ████ │ 3U ← LAN L2 spine
│ Slot11 ········· empty ·············· │ 2U
│ Slot12 ········· empty ·············· │ 1U
└────────────────────────────────────────┘
BOTTOM
```
**Network role:** Every node NIC → **switch access port** → same **VLAN / flat LAN** as documented; the **kube-vip** VIP **`192.168.50.230`**, the **MetalLB** pool **`192.168.50.210`–`192.168.50.229`**, and **Traefik** at **`192.168.50.211`** are **logical** addresses carried on node IPs (no extra hardware).
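The pool and advertisement above are plain MetalLB resources. As a rough sketch (resource names and namespace are assumptions; the real manifests live in the cluster's GitOps tree):

```yaml
# Hypothetical names; shows only the documented pool and L2 mode.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lab-pool              # assumed name
  namespace: metallb-system
spec:
  addresses:
    - 192.168.50.210-192.168.50.229
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lab-l2                # assumed name
  namespace: metallb-system
spec:
  ipAddressPools:
    - lab-pool
```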
---
## Rack B — Control planes (10" × 12U)
Three **Talos control-plane** nodes (**scheduling allowed** on CPs per `talconfig.yaml`).
```
TOP OF RACK
┌────────────────────────────────────────┐
│ Slot 1 ········· empty ·············· │ 12U
│ Slot 2 ········· empty ·············· │ 11U
│ Slot 3 ········· empty ·············· │ 10U
│ Slot 4 ········· empty ·············· │ 9U
│ Slot 5 ········· empty ·············· │ 8U
│ Slot 6 ········· empty ·············· │ 7U
│ Slot 7 ········· empty ·············· │ 6U
│ Slot 8 █ neon control-plane .20 ████ │ 5U
│ Slot 9 █ argon control-plane .30 ███ │ 4U
│ Slot10 █ krypton control-plane .40 ██ │ 3U (kube-vip VIP .230)
│ Slot11 ········· empty ·············· │ 2U
│ Slot12 ········· empty ·············· │ 1U
└────────────────────────────────────────┘
BOTTOM
```
---
## Rack C — Worker (10" × 12U)
Single **worker** node; **Longhorn** data disk is **local** to each node (see `talconfig.yaml`); no separate NAS in this diagram.
```
TOP OF RACK
┌────────────────────────────────────────┐
│ Slot 1 ········· empty ·············· │ 12U
│ Slot 2 ········· empty ·············· │ 11U
│ Slot 3 ········· empty ·············· │ 10U
│ Slot 4 ········· empty ·············· │ 9U
│ Slot 5 ········· empty ·············· │ 8U
│ Slot 6 ········· empty ·············· │ 7U
│ Slot 7 ░░░░░░ spare / future ░░░░░░░░ │ 6U
│ Slot 8 ········· empty ·············· │ 5U
│ Slot 9 ········· empty ·············· │ 4U
│ Slot10 ███ helium worker .10 ████████ │ 3U
│ Slot11 ········· empty ·············· │ 2U
│ Slot12 ········· empty ·············· │ 1U
└────────────────────────────────────────┘
BOTTOM
```
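For reference, a Longhorn `StorageClass` that places replicas on the nodes' local disks might look like the sketch below (the class name and replica count are assumptions; the real settings live with the cluster manifests):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn              # assumed name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "2"       # assumed; replicas land on nodes' local Longhorn disks
  staleReplicaTimeout: "2880"
```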
---
## Space summary
| System | Rack | Approx. U | IP | Role |
|--------|------|-----------|-----|------|
| LAN switch | A | 1U | — | All nodes on `192.168.50.0/24` |
| Patch / cable mgmt | A | 2× 1U | — | Physical plant |
| **neon** | B | 1U | `192.168.50.20` | control-plane + schedulable |
| **argon** | B | 1U | `192.168.50.30` | control-plane + schedulable |
| **krypton** | B | 1U | `192.168.50.40` | control-plane + schedulable |
| **helium** | C | 1U | `192.168.50.10` | worker |
Adjust the **empty vs. future** rows if your chassis are **2U** or sit on **shelves**, scaling the `█` blocks accordingly.
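The address plan in the table (plus the VIP and MetalLB pool) can be sanity-checked with a short script; a sketch using only the IPs documented on this page:

```python
# Sanity-check the addressing plan: every node IP, the kube-vip VIP, and the
# MetalLB pool must sit inside 192.168.50.0/24, and the pool must not collide
# with any node address or the VIP.
import ipaddress

lan = ipaddress.ip_network("192.168.50.0/24")
nodes = {
    "helium": ipaddress.ip_address("192.168.50.10"),
    "neon": ipaddress.ip_address("192.168.50.20"),
    "argon": ipaddress.ip_address("192.168.50.30"),
    "krypton": ipaddress.ip_address("192.168.50.40"),
}
vip = ipaddress.ip_address("192.168.50.230")
# MetalLB pool .210-.229 inclusive = 20 addresses
pool = [ipaddress.ip_address("192.168.50.210") + i for i in range(20)]

assert all(ip in lan for ip in [*nodes.values(), vip, *pool])
assert vip not in pool and not set(nodes.values()) & set(pool)
print("address plan OK:", len(pool), "MetalLB IPs,", len(nodes), "nodes")
```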
---
## Network connections
All cluster nodes are on **one flat LAN**. **kube-vip** floats **`192.168.50.230:6443`** across the three control-plane hosts on **`ens18`** (see cluster bootstrap docs).
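Expressed as kube-vip settings, the paragraph above corresponds to roughly this container-env fragment (a sketch; the actual kube-vip manifest is maintained elsewhere):

```yaml
# Fragment of the kube-vip container env implied by the text.
env:
  - name: vip_interface
    value: ens18
  - name: address
    value: 192.168.50.230
  - name: port
    value: "6443"
  - name: vip_arp              # L2/ARP mode on the flat LAN
    value: "true"
  - name: cp_enable            # control-plane VIP mode
    value: "true"
```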
```mermaid
flowchart TB
subgraph RACK_A["Rack A — 10\""]
SW["Managed switch<br/>192.168.50.0/24 L2"]
PP["Patch / cable mgmt"]
SW --- PP
end
subgraph RACK_B["Rack B — 10\""]
N["neon :20"]
A["argon :30"]
K["krypton :40"]
end
subgraph RACK_C["Rack C — 10\""]
H["helium :10"]
end
subgraph LOGICAL["Logical (any node holding VIP)"]
VIP["API VIP 192.168.50.230<br/>kube-vip → apiserver :6443"]
end
WAN["Internet / other LANs"] -.->|"router (out of scope)"| SW
SW <-->|"Ethernet"| N
SW <-->|"Ethernet"| A
SW <-->|"Ethernet"| K
SW <-->|"Ethernet"| H
N --- VIP
A --- VIP
K --- VIP
WK["Workstation / CI<br/>kubectl, browser"] -->|"HTTPS :6443"| VIP
WK -->|"L2 (MetalLB .210/.211, any node)"| SW
```
**Ingress path (same LAN):** clients → **`192.168.50.211`** (Traefik) or **`192.168.50.210`** (Argo CD) via **MetalLB** — still **through the same switch** to whichever node advertises the service.
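Pinning Traefik to `.211` is done with a MetalLB annotation on its `LoadBalancer` Service; a hypothetical sketch (Service name, namespace, selector, and ports are assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: traefik                       # assumed
  namespace: traefik                  # assumed
  annotations:
    metallb.universe.tf/loadBalancerIPs: 192.168.50.211
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: traefik   # assumed label
  ports:
    - name: websecure
      port: 443
      targetPort: websecure           # assumed entrypoint name
```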
---
## Related docs
- Cluster topology and services: [`architecture.md`](architecture.md)
- Build state and versions: [`../talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md)