Compare commits

..

2 Commits

194 changed files with 299 additions and 7289 deletions

.DS_Store vendored

Binary file not shown.


@@ -1,19 +0,0 @@
# Copy to **.env** in this repository root (`.env` is gitignored).
# Ansible **noble_cert_manager** role sources `.env` after cert-manager Helm install and creates
# **cert-manager/cloudflare-dns-api-token** when **CLOUDFLARE_DNS_API_TOKEN** is set.
#
# Cloudflare: Zone → DNS → Edit + Zone → Read for **pcenicni.dev** (see clusters/noble/bootstrap/cert-manager/README.md).
CLOUDFLARE_DNS_API_TOKEN=
# --- Optional: other deploy-time values (documented for manual use or future automation) ---
# Pangolin / Newt — with **noble_newt_install=true**, Ansible creates **newt/newt-pangolin-auth** when all are set (see clusters/noble/bootstrap/newt/README.md).
PANGOLIN_ENDPOINT=
NEWT_ID=
NEWT_SECRET=
# Velero — when **noble_velero_install=true**, set bucket + S3 API URL and credentials (see clusters/noble/bootstrap/velero/README.md).
NOBLE_VELERO_S3_BUCKET=
NOBLE_VELERO_S3_URL=
NOBLE_VELERO_AWS_ACCESS_KEY_ID=
NOBLE_VELERO_AWS_SECRET_ACCESS_KEY=
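
A minimal sketch of how these values are consumed (the real logic lives in the Ansible roles; the `kubectl` invocation is illustrative, and the token value here is a placeholder):

```bash
# Create a throwaway .env to demonstrate the pattern (placeholder value).
tmp=$(mktemp -d)
printf 'CLOUDFLARE_DNS_API_TOKEN=example-token\n' > "$tmp/.env"

# Source it with auto-export so child processes (kubectl, helm) see the variables.
set -a
. "$tmp/.env"
set +a

# The cert-manager role then creates the Secret, roughly:
# kubectl -n cert-manager create secret generic cloudflare-dns-api-token \
#   --from-literal=api-token="$CLOUDFLARE_DNS_API_TOKEN"
echo "token set: ${CLOUDFLARE_DNS_API_TOKEN:+yes}"
```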

.gitignore vendored

@@ -1,11 +0,0 @@
ansible/inventory/hosts.ini
# Talos generated
talos/out/
talos/kubeconfig
# Local secrets
age-key.txt
.env
# Generated by ansible noble_landing_urls
ansible/output/noble-lab-ui-urls.md


@@ -1,7 +0,0 @@
# Mozilla SOPS — encrypt/decrypt Kubernetes Secret manifests under clusters/noble/secrets/
# Generate a key: age-keygen -o age-key.txt (age-key.txt is gitignored)
# Add the printed public key below (one recipient per line is supported).
creation_rules:
  - path_regex: clusters/noble/secrets/.*\.yaml$
    age: >-
      age1juym5p3ez3dkt0dxlznydgfgqvaujfnyk9hpdsssf50hsxeh3p4sjpf3gn
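
For reference, a Secret under `clusters/noble/secrets/` looks roughly like this after `sops -e -i <file>` (illustrative and abbreviated; exact metadata fields vary by sops version):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: example        # keys and metadata stay readable
  namespace: default
stringData:
  token: ENC[AES256_GCM,data:...,iv:...,tag:...,type:str]   # values are encrypted
sops:
  age:
    - recipient: age1juym5p3ez3dkt0dxlznydgfgqvaujfnyk9hpdsssf50hsxeh3p4sjpf3gn
      enc: |
        -----BEGIN AGE ENCRYPTED FILE-----
        ...
        -----END AGE ENCRYPTED FILE-----
  version: ...
```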


@@ -180,12 +180,6 @@ Shared services used across multiple applications.
**Configuration:** Requires Pangolin endpoint URL, Newt ID, and Newt secret.
### versitygw/ (`komodo/s3/versitygw/`)
- **[Versity S3 Gateway](https://github.com/versity/versitygw)** — S3 API on port **10000** by default; optional **WebUI** on **8080** (not the same listener—enable `VERSITYGW_WEBUI_PORT` / `VGW_WEBUI_GATEWAYS` per `.env.sample`). Behind **Pangolin**, expose the API and WebUI separately (or you will see **404** browsing the API URL).
**Configuration:** Set either `ROOT_ACCESS_KEY` / `ROOT_SECRET_KEY` or `ROOT_ACCESS_KEY_ID` / `ROOT_SECRET_ACCESS_KEY`. Optional `VERSITYGW_PORT`. Compose uses `${VAR}` interpolation so credentials work with Komodo's `docker compose --env-file <run_directory>/.env` (avoid `env_file:` in the service when `run_directory` is not the same folder as `compose.yaml`, or the written `.env` will not be found).
---
## 📊 Monitoring (`komodo/monitor/`)

ansible/.gitignore vendored

@@ -1 +0,0 @@
.ansible-tmp/


@@ -1,160 +1,84 @@
# Ansible — noble cluster
# Home Server Ansible Configuration
Automates [`talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md): optional **Talos Phase A** (genconfig → apply → bootstrap → kubeconfig), then **Phase B+** (CNI → add-ons → ingress → Argo CD → Kyverno → observability, etc.). **Argo CD** does not reconcile core charts — optional GitOps starts from an empty [`clusters/noble/apps/kustomization.yaml`](../clusters/noble/apps/kustomization.yaml).
This directory contains Ansible playbooks for managing the Proxmox home server environment.
## Order of operations
## Directory Structure
1. **From `talos/`:** `talhelper gensecret` / `talsecret` as in [`talos/README.md`](../talos/README.md) §1 (if not already done).
2. **Talos Phase A (automated):** run [`playbooks/talos_phase_a.yml`](playbooks/talos_phase_a.yml) **or** the full pipeline [`playbooks/deploy.yml`](playbooks/deploy.yml). This runs **`talhelper genconfig -o out`**, **`talosctl apply-config`** on each node, **`talosctl bootstrap`**, and **`talosctl kubeconfig`** → **`talos/kubeconfig`**.
3. **Platform stack:** [`playbooks/noble.yml`](playbooks/noble.yml) (included at the end of **`deploy.yml`**).
- `inventory/`: Contains the inventory file `hosts.ini` where you define your servers.
- `playbooks/`: Contains the actual Ansible playbooks.
- `ansible.cfg`: Local Ansible configuration.
- `requirements.yml`: List of Ansible collections required.
Your workstation must be able to reach **node IPs on the lab LAN** (Talos API **:50000** for `talosctl`, Kubernetes **:6443** for `kubectl` / Helm). If `kubectl` cannot reach the VIP (`192.168.50.230`), use `-e 'noble_k8s_api_server_override=https://<control-plane-ip>:6443'` on **`noble.yml`** (see `group_vars/all.yml`).
## Setup
**One-shot full deploy** (after nodes are booted and reachable):
1. **Install Requirements**:
```bash
ansible-galaxy install -r requirements.yml
```
2. **Configure Inventory**:
Edit `inventory/hosts.ini` and update the following:
- `ansible_host`: The IP address of your Proxmox node.
- `ansible_user`: The SSH user (usually root).
- `proxmox_api_*`: Variables if you plan to use API-based modules in the future.
*Note: Ensure you have SSH key access to your Proxmox node for passwordless login, or uncomment `ansible_ssh_pass`.*
## Available Playbooks
### Create Ubuntu Cloud Template (`playbooks/create_ubuntu_template.yml`)
This playbook downloads a generic Ubuntu 22.04 Cloud Image and converts it into a Proxmox VM Template.
**Usage:**
```bash
cd ansible
ansible-playbook playbooks/deploy.yml
# Run the playbook
ansible-playbook playbooks/create_ubuntu_template.yml
```
## Deploy secrets (`.env`)
**Variables:**
You can override variables at runtime or by editing the playbook:
Copy **`.env.sample`** to **`.env`** at the repository root (`.env` is gitignored). At minimum set **`CLOUDFLARE_DNS_API_TOKEN`** for cert-manager DNS-01. The **cert-manager** role applies it automatically during **`noble.yml`**. See **`.env.sample`** for optional placeholders (e.g. Newt/Pangolin).
## Prerequisites
- `talosctl` (matches node Talos version), `talhelper`, `helm`, `kubectl`.
- **SOPS secrets:** `sops` and `age` on the control host if you use **`clusters/noble/secrets/`** with **`age-key.txt`** (see **`clusters/noble/secrets/README.md`**).
- **Phase A:** same LAN/VPN as nodes so **Talos :50000** and **Kubernetes :6443** are reachable (see [`talos/README.md`](../talos/README.md) §3).
- **noble.yml:** bootstrapped cluster and **`talos/kubeconfig`** (or `KUBECONFIG`).
## Playbooks
| Playbook | Purpose |
|----------|---------|
| [`playbooks/deploy.yml`](playbooks/deploy.yml) | **Talos Phase A** then **`noble.yml`** (full automation). |
| [`playbooks/talos_phase_a.yml`](playbooks/talos_phase_a.yml) | `genconfig` → `apply-config` → `bootstrap` → `kubeconfig` only. |
| [`playbooks/noble.yml`](playbooks/noble.yml) | Helm + `kubectl` platform (after Phase A). |
| [`playbooks/post_deploy.yml`](playbooks/post_deploy.yml) | SOPS reminders and optional Argo root Application note. |
| [`playbooks/talos_bootstrap.yml`](playbooks/talos_bootstrap.yml) | **`talhelper genconfig` only** (legacy shortcut; prefer **`talos_phase_a.yml`**). |
| [`playbooks/debian_harden.yml`](playbooks/debian_harden.yml) | Baseline hardening for Debian servers (SSH/sysctl/fail2ban/unattended-upgrades). |
| [`playbooks/debian_maintenance.yml`](playbooks/debian_maintenance.yml) | Debian maintenance run (apt upgrades, autoremove/autoclean, reboot when required). |
| [`playbooks/debian_rotate_ssh_keys.yml`](playbooks/debian_rotate_ssh_keys.yml) | Rotate managed users' `authorized_keys`. |
| [`playbooks/debian_ops.yml`](playbooks/debian_ops.yml) | Convenience pipeline: harden then maintenance for Debian servers. |
| [`playbooks/proxmox_prepare.yml`](playbooks/proxmox_prepare.yml) | Configure Proxmox community repos and disable no-subscription UI warning. |
| [`playbooks/proxmox_upgrade.yml`](playbooks/proxmox_upgrade.yml) | Proxmox maintenance run (apt dist-upgrade, cleanup, reboot when required). |
| [`playbooks/proxmox_cluster.yml`](playbooks/proxmox_cluster.yml) | Create a Proxmox cluster on the master and join additional hosts. |
| [`playbooks/proxmox_ops.yml`](playbooks/proxmox_ops.yml) | Convenience pipeline: prepare, upgrade, then cluster Proxmox hosts. |
- `template_id`: Default `9000`
- `template_name`: Default `ubuntu-2204-cloud`
- `storage_pool`: Default `local-lvm`
Example overriding variables:
```bash
cd ansible
export KUBECONFIG=/absolute/path/to/home-server/talos/kubeconfig
# noble.yml only — if VIP is unreachable from this host:
# ansible-playbook playbooks/noble.yml -e 'noble_k8s_api_server_override=https://192.168.50.20:6443'
ansible-playbook playbooks/noble.yml
ansible-playbook playbooks/post_deploy.yml
ansible-playbook playbooks/create_ubuntu_template.yml -e "template_id=9001 template_name=my-custom-template"
```
### Talos Phase A variables (role `talos_phase_a` defaults)
### Manage VM Playbook (`playbooks/manage_vm.yml`)
Override with `-e` when needed, e.g. **`-e noble_talos_skip_bootstrap=true`** if etcd is already initialized.
This unified playbook allows you to manage VMs (create from template, delete, backup, create template) across your Proxmox hosts.
| Variable | Default | Meaning |
|----------|---------|---------|
| `noble_talos_genconfig` | `true` | Run **`talhelper genconfig -o out`** first. |
| `noble_talos_apply_mode` | `auto` | **`auto`** — **`talosctl apply-config --dry-run`** on the first node picks maintenance (**`--insecure`**) vs joined (**`TALOSCONFIG`**). **`insecure`** / **`secure`** force talos/README §2 A or B. |
| `noble_talos_skip_bootstrap` | `false` | Skip **`talosctl bootstrap`**. If etcd is **already** initialized, bootstrap is treated as a no-op (same as **`talosctl`** “etcd data directory is not empty”). |
| `noble_talos_apid_wait_delay` / `noble_talos_apid_wait_timeout` | `20` / `900` | Seconds to wait for **apid :50000** on the bootstrap node after **apply-config** (nodes reboot). Increase if bootstrap hits **connection refused** to `:50000`. |
| `noble_talos_nodes` | neon/argon/krypton/helium | IP + **`out/*.yaml`** filename — align with **`talos/talconfig.yaml`**. |
**Usage:**
### Tags (partial runs)
The playbook target defaults to the `proxmox` group, but you should usually specify a single host via the `target_host` variable or a `-l` limit.
```bash
ansible-playbook playbooks/noble.yml --tags cilium,metallb
ansible-playbook playbooks/noble.yml --skip-tags newt
ansible-playbook playbooks/noble.yml --tags velero -e noble_velero_install=true -e noble_velero_s3_bucket=... -e noble_velero_s3_url=...
```
1. **Create a New Template**:
```bash
ansible-playbook playbooks/manage_vm.yml -e "proxmox_action=create_template vmid=9003 template_name=my-ubuntu-template"
```
### Variables — `group_vars/all.yml` and role defaults
2. **Create a VM from Template**:
```bash
ansible-playbook playbooks/manage_vm.yml -e "proxmox_action=create_vm vmid=9002 new_vmid=105 new_vm_name=my-new-vm"
```
- **`group_vars/all.yml`:** **`noble_newt_install`**, **`noble_velero_install`**, **`noble_cert_manager_require_cloudflare_secret`**, **`noble_argocd_apply_root_application`**, **`noble_argocd_apply_bootstrap_root_application`**, **`noble_k8s_api_server_override`**, **`noble_k8s_api_server_auto_fallback`**, **`noble_k8s_api_server_fallback`**, **`noble_skip_k8s_health_check`**
- **`roles/noble_platform/defaults/main.yml`:** **`noble_apply_sops_secrets`**, **`noble_sops_age_key_file`** (SOPS secrets under **`clusters/noble/secrets/`**)
3. **Delete a VM**:
```bash
ansible-playbook playbooks/manage_vm.yml -e "proxmox_action=delete_vm vmid=105"
```
## Roles
4. **Backup a VM**:
```bash
ansible-playbook playbooks/manage_vm.yml -e "proxmox_action=backup_vm vmid=105"
```
| Role | Contents |
|------|----------|
| `talos_phase_a` | Talos genconfig, apply-config, bootstrap, kubeconfig |
| `helm_repos` | `helm repo add` / `update` |
| `noble_*` | Cilium, CSI Volume Snapshot CRDs + controller, metrics-server, Longhorn, MetalLB (20m Helm wait), kube-vip, Traefik, cert-manager, Newt, Argo CD, Kyverno, platform stack, Velero (optional) |
| `noble_landing_urls` | Writes **`ansible/output/noble-lab-ui-urls.md`** — URLs, service names, and (optional) Argo/Grafana passwords from Secrets |
| `noble_post_deploy` | Post-install reminders |
| `talos_bootstrap` | Genconfig-only (used by older playbook) |
| `debian_baseline_hardening` | Baseline Debian hardening (SSH policy, sysctl profile, fail2ban, unattended upgrades) |
| `debian_maintenance` | Routine Debian maintenance tasks (updates, cleanup, reboot-on-required) |
| `debian_ssh_key_rotation` | Declarative `authorized_keys` rotation for server users |
| `proxmox_baseline` | Proxmox repo prep (community repos) and no-subscription warning suppression |
| `proxmox_maintenance` | Proxmox package maintenance (dist-upgrade, cleanup, reboot-on-required) |
| `proxmox_cluster` | Proxmox cluster bootstrap/join automation using `pvecm` |
**Variables:**
- `proxmox_action`: One of `create_template`, `create_vm`, `delete_vm`, `backup_vm` (Default: `create_vm`)
- `target_host`: The host to run on (Default: `proxmox` group). Example: `-e "target_host=mercury"`
## Debian server ops quick start
These playbooks are separate from the Talos/noble flow and target hosts in `debian_servers`.
1. Copy `inventory/debian.example.yml` to `inventory/debian.yml` and update hosts/users.
2. Update `group_vars/debian_servers.yml` with your allowed SSH users and real public keys.
3. Run with the Debian inventory:
```bash
cd ansible
ansible-playbook -i inventory/debian.yml playbooks/debian_harden.yml
ansible-playbook -i inventory/debian.yml playbooks/debian_rotate_ssh_keys.yml
ansible-playbook -i inventory/debian.yml playbooks/debian_maintenance.yml
```
Or run the combined maintenance pipeline:
```bash
cd ansible
ansible-playbook -i inventory/debian.yml playbooks/debian_ops.yml
```
## Proxmox host + cluster quick start
These playbooks are separate from the Talos/noble flow and target hosts in `proxmox_hosts`.
1. Copy `inventory/proxmox.example.yml` to `inventory/proxmox.yml` and update hosts/users.
2. Update `group_vars/proxmox_hosts.yml` with your cluster name (`proxmox_cluster_name`), chosen cluster master, and root public key file paths to install.
3. First run (no SSH keys yet): use `--ask-pass` **or** set `ansible_password` (prefer Ansible Vault). Keep `ansible_ssh_common_args: "-o StrictHostKeyChecking=accept-new"` in inventory for first-contact hosts.
4. Run prepare first to install your public keys on each host, then continue:
```bash
cd ansible
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_prepare.yml --ask-pass
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_upgrade.yml
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_cluster.yml
```
After `proxmox_prepare.yml` finishes, SSH key auth should work for root (keys from `proxmox_root_authorized_key_files`), so `--ask-pass` is usually no longer needed.
If `pvecm add` still prompts for the master root password during join, set `proxmox_cluster_master_root_password` (prefer Vault) to run join non-interactively.
Changing `proxmox_cluster_name` only affects new cluster creation; it does not rename an already-created cluster.
Or run the full Proxmox pipeline:
```bash
cd ansible
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_ops.yml
```
## Migrating from Argo-managed `noble-platform`
```bash
kubectl delete application -n argocd noble-platform noble-kyverno noble-kyverno-policies --ignore-not-found
kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
```
Then run `playbooks/noble.yml` so Helm state matches git values.
*See `roles/proxmox_vm/defaults/main.yml` for all available configuration options.*


@@ -1,10 +1,5 @@
[defaults]
inventory = inventory/localhost.yml
roles_path = roles
inventory = inventory/hosts.ini
host_key_checking = False
retry_files_enabled = False
stdout_callback = default
callback_result_format = yaml
local_tmp = .ansible-tmp
[privilege_escalation]
become = False
interpreter_python = auto_silent


@@ -1,28 +0,0 @@
---
# noble_repo_root / noble_kubeconfig are set in playbooks (use **playbook_dir** magic var).
# When kubeconfig points at the API VIP but this workstation cannot reach the lab LAN (VPN off, etc.),
# set a reachable control-plane URL — same as: kubectl config set-cluster noble --server=https://<cp-ip>:6443
# Example: ansible-playbook playbooks/noble.yml -e 'noble_k8s_api_server_override=https://192.168.50.20:6443'
noble_k8s_api_server_override: ""
# When /healthz fails with **network unreachable** to the VIP and **override** is empty, retry using this URL (neon).
noble_k8s_api_server_auto_fallback: true
noble_k8s_api_server_fallback: "https://192.168.50.20:6443"
# Only if you must skip the kubectl /healthz preflight (not recommended).
noble_skip_k8s_health_check: false
# Pangolin / Newt — set true only after newt-pangolin-auth Secret exists (SOPS: clusters/noble/secrets/ or imperative — see clusters/noble/bootstrap/newt/README.md)
noble_newt_install: false
# cert-manager needs Secret cloudflare-dns-api-token in cert-manager namespace before ClusterIssuers work
noble_cert_manager_require_cloudflare_secret: true
# Velero — set **noble_velero_install: true** plus S3 bucket/URL (and credentials — see clusters/noble/bootstrap/velero/README.md)
noble_velero_install: false
# Argo CD — apply app-of-apps root Application (clusters/noble/bootstrap/argocd/root-application.yaml). Set false to skip.
noble_argocd_apply_root_application: true
# Bootstrap kustomize in Argo (**noble-bootstrap-root** → **clusters/noble/bootstrap**). Applied with manual sync; enable automation after **noble.yml** (see **clusters/noble/bootstrap/argocd/README.md** §5).
noble_argocd_apply_bootstrap_root_application: true
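
The auto-fallback decision amounts to matching the `kubectl` error text. A standalone sketch of the same classification (the real check is a Jinja `search` condition in `playbooks/noble.yml`; the error string below is a typical example):

```bash
fallback="https://192.168.50.20:6443"
# Typical kubectl wording when the VIP is unreachable from off-LAN:
err="The connection to the server 192.168.50.230:6443 was refused"
if printf '%s' "$err" | grep -qiE 'network is unreachable|no route to host|connection refused|was refused'; then
  echo "retrying against $fallback"
fi
```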


@@ -1,12 +0,0 @@
---
# Hardened SSH settings
debian_baseline_ssh_allow_users:
  - admin
# Example key rotation entries. Replace with your real users and keys.
debian_ssh_rotation_users:
  - name: admin
    home: /home/admin
    state: present
    keys:
      - "ssh-ed25519 AAAAEXAMPLE_REPLACE_ME admin@workstation"


@@ -1,37 +0,0 @@
---
# Proxmox repositories
proxmox_repo_debian_codename: trixie
proxmox_repo_disable_enterprise: true
proxmox_repo_disable_ceph_enterprise: true
proxmox_repo_enable_pve_no_subscription: true
proxmox_repo_enable_ceph_no_subscription: true
# Suppress "No valid subscription" warning in UI
proxmox_no_subscription_notice_disable: true
# Public keys to install for root on each Proxmox host.
proxmox_root_authorized_key_files:
  - "{{ lookup('env', 'HOME') }}/.ssh/id_ed25519.pub"
  - "{{ lookup('env', 'HOME') }}/.ssh/ansible.pub"
# Package upgrade/reboot policy
proxmox_upgrade_apt_cache_valid_time: 3600
proxmox_upgrade_autoremove: true
proxmox_upgrade_autoclean: true
proxmox_upgrade_reboot_if_required: true
proxmox_upgrade_reboot_timeout: 1800
# Cluster settings
proxmox_cluster_enabled: true
proxmox_cluster_name: atomic-hub
# Bootstrap host name from inventory (first host by default if empty)
proxmox_cluster_master: ""
# Optional explicit IP/FQDN for joining; leave empty to use ansible_host of master
proxmox_cluster_master_ip: ""
proxmox_cluster_force: false
# Optional: use only for first cluster joins when inter-node SSH trust is not established.
# Prefer storing with Ansible Vault if you set this.
proxmox_cluster_master_root_password: "Hemroid8"


@@ -1,11 +0,0 @@
---
all:
  children:
    debian_servers:
      hosts:
        debian-01:
          ansible_host: 192.168.50.101
          ansible_user: admin
        debian-02:
          ansible_host: 192.168.50.102
          ansible_user: admin


@@ -0,0 +1,14 @@
[proxmox]
# Replace mercury with your Proxmox node hostname or IP
mercury ansible_host=192.168.50.100 ansible_user=root
[proxmox:vars]
# If using password auth (ssh key recommended though):
# ansible_ssh_pass=yourpassword
# Connection variables for the proxmox modules (api)
proxmox_api_user=root@pam
proxmox_api_password=Hemroid8
proxmox_api_host=192.168.50.100
# proxmox_api_token_id=
# proxmox_api_token_secret=


@@ -1,6 +0,0 @@
---
all:
  hosts:
    localhost:
      ansible_connection: local
      ansible_python_interpreter: "{{ ansible_playbook_python }}"


@@ -1,24 +0,0 @@
---
all:
  children:
    proxmox_hosts:
      vars:
        ansible_ssh_common_args: "-o StrictHostKeyChecking=accept-new"
      hosts:
        helium:
          ansible_host: 192.168.1.100
          ansible_user: root
          # First run without SSH keys:
          # ansible_password: "{{ vault_proxmox_root_password }}"
        neon:
          ansible_host: 192.168.1.90
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"
        argon:
          ansible_host: 192.168.1.80
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"
        krypton:
          ansible_host: 192.168.1.70
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"


@@ -1,24 +0,0 @@
---
all:
  children:
    proxmox_hosts:
      vars:
        ansible_ssh_common_args: "-o StrictHostKeyChecking=accept-new"
      hosts:
        helium:
          ansible_host: 192.168.1.100
          ansible_user: root
          # First run without SSH keys:
          # ansible_password: "{{ vault_proxmox_root_password }}"
        neon:
          ansible_host: 192.168.1.90
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"
        argon:
          ansible_host: 192.168.1.80
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"
        krypton:
          ansible_host: 192.168.1.70
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"


@@ -0,0 +1,72 @@
---
- name: Create Ubuntu Cloud-Init Template
  hosts: proxmox
  become: yes
  vars:
    template_id: 9000
    template_name: ubuntu-2204-cloud
    # URL for Ubuntu 22.04 Cloud Image (Jammy)
    image_url: "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
    image_name: "ubuntu-22.04-server-cloudimg-amd64.img"
    storage_pool: "local-lvm"
    memory: 2048
    cores: 2
  tasks:
    - name: Check if template already exists
      command: "qm status {{ template_id }}"
      register: vm_status
      failed_when: false
      changed_when: false
    - name: Fail if template ID exists
      fail:
        msg: "VM ID {{ template_id }} already exists. Please choose a different ID or delete the existing VM."
      when: vm_status.rc == 0
    - name: Download Ubuntu Cloud Image
      get_url:
        url: "{{ image_url }}"
        dest: "/tmp/{{ image_name }}"
        mode: '0644'
    - name: Install libguestfs-tools (optional; only needed if customizing the image with virt-customize)
      apt:
        name: libguestfs-tools
        state: present
      ignore_errors: yes
    - name: Create VM with hardware config
      command: >
        qm create {{ template_id }}
        --name "{{ template_name }}"
        --memory {{ memory }}
        --cores {{ cores }}
        --net0 virtio,bridge=vmbr0
        --scsihw virtio-scsi-pci
        --ostype l26
        --serial0 socket --vga serial0
    - name: Import Disk
      command: "qm importdisk {{ template_id }} /tmp/{{ image_name }} {{ storage_pool }}"
    - name: Attach Disk to SCSI
      command: "qm set {{ template_id }} --scsi0 {{ storage_pool }}:vm-{{ template_id }}-disk-0"
    - name: Add Cloud-Init Drive
      command: "qm set {{ template_id }} --ide2 {{ storage_pool }}:cloudinit"
    - name: Set Boot Order
      command: "qm set {{ template_id }} --boot c --bootdisk scsi0"
    - name: Resize Disk (Optional, e.g. 10G)
      command: "qm resize {{ template_id }} scsi0 10G"
      ignore_errors: yes
    - name: Convert to Template
      command: "qm template {{ template_id }}"
    - name: Remove Downloaded Image
      file:
        path: "/tmp/{{ image_name }}"
        state: absent


@@ -1,8 +0,0 @@
---
- name: Debian server baseline hardening
  hosts: debian_servers
  become: true
  gather_facts: true
  roles:
    - role: debian_baseline_hardening
      tags: [hardening, baseline]


@@ -1,8 +0,0 @@
---
- name: Debian maintenance (updates + reboot handling)
  hosts: debian_servers
  become: true
  gather_facts: true
  roles:
    - role: debian_maintenance
      tags: [maintenance, updates]


@@ -1,3 +0,0 @@
---
- import_playbook: debian_harden.yml
- import_playbook: debian_maintenance.yml


@@ -1,8 +0,0 @@
---
- name: Debian SSH key rotation
  hosts: debian_servers
  become: true
  gather_facts: false
  roles:
    - role: debian_ssh_key_rotation
      tags: [ssh, ssh_keys, rotation]


@@ -1,5 +0,0 @@
---
# Full bring-up: Talos Phase A then platform stack.
# Run from **ansible/**: ansible-playbook playbooks/deploy.yml
- import_playbook: talos_phase_a.yml
- import_playbook: noble.yml


@@ -0,0 +1,6 @@
---
- name: Manage Proxmox VMs
  hosts: "{{ target_host | default('proxmox') }}"
  become: yes
  roles:
    - proxmox_vm


@@ -1,232 +0,0 @@
---
# Full platform install — **after** Talos bootstrap (`talosctl bootstrap` + working kubeconfig).
# Do not run until `kubectl get --raw /healthz` returns ok (see talos/README.md §3, CLUSTER-BUILD Phase A).
# Run from repo **ansible/** directory: ansible-playbook playbooks/noble.yml
#
# Tags: repos, cilium, csi_snapshot, metrics, longhorn, metallb, kube_vip, traefik, cert_manager, newt,
# argocd, kyverno, kyverno_policies, platform, velero, all (default)
- name: Noble cluster — platform stack (Ansible-managed)
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    noble_repo_root: "{{ playbook_dir | dirname | dirname }}"
    noble_kubeconfig: "{{ lookup('env', 'KUBECONFIG') | default(noble_repo_root + '/talos/kubeconfig', true) }}"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  pre_tasks:
    # Helm/kubectl use $KUBECONFIG; a missing file yields "connection refused" to localhost:8080.
    - name: Stat kubeconfig path from KUBECONFIG or default
      ansible.builtin.stat:
        path: "{{ noble_kubeconfig }}"
      register: noble_kubeconfig_stat
      tags: [always]
    - name: Fall back to repo talos/kubeconfig when $KUBECONFIG is unset or not a file
      ansible.builtin.set_fact:
        noble_kubeconfig: "{{ noble_repo_root }}/talos/kubeconfig"
      when: not noble_kubeconfig_stat.stat.exists | default(false)
      tags: [always]
    - name: Stat kubeconfig after fallback
      ansible.builtin.stat:
        path: "{{ noble_kubeconfig }}"
      register: noble_kubeconfig_stat2
      tags: [always]
    - name: Require a real kubeconfig file
      ansible.builtin.assert:
        that:
          - noble_kubeconfig_stat2.stat.exists | default(false)
          - noble_kubeconfig_stat2.stat.isreg | default(false)
        fail_msg: >-
          No kubeconfig file at {{ noble_kubeconfig }}.
          Fix: export KUBECONFIG=/actual/path/from/talosctl-kubeconfig (see talos/README.md),
          or copy the admin kubeconfig to {{ noble_repo_root }}/talos/kubeconfig.
          Do not use documentation placeholders as the path.
      tags: [always]
    - name: Ensure temp dir for kubeconfig API override
      ansible.builtin.file:
        path: "{{ noble_repo_root }}/ansible/.ansible-tmp"
        state: directory
        mode: "0700"
      when: noble_k8s_api_server_override | default('') | length > 0
      tags: [always]
    - name: Copy kubeconfig for API server override (original file unchanged)
      ansible.builtin.copy:
        src: "{{ noble_kubeconfig }}"
        dest: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.patched"
        mode: "0600"
      when: noble_k8s_api_server_override | default('') | length > 0
      tags: [always]
    - name: Resolve current cluster name (for set-cluster)
      ansible.builtin.command:
        argv:
          - kubectl
          - config
          - view
          - --minify
          - -o
          - jsonpath={.clusters[0].name}
      environment:
        KUBECONFIG: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.patched"
      register: noble_k8s_cluster_name
      changed_when: false
      when: noble_k8s_api_server_override | default('') | length > 0
      tags: [always]
    - name: Point patched kubeconfig at reachable apiserver
      ansible.builtin.command:
        argv:
          - kubectl
          - config
          - set-cluster
          - "{{ noble_k8s_cluster_name.stdout }}"
          - --server={{ noble_k8s_api_server_override }}
          - --kubeconfig={{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.patched
      when: noble_k8s_api_server_override | default('') | length > 0
      changed_when: true
      tags: [always]
    - name: Use patched kubeconfig for this play
      ansible.builtin.set_fact:
        noble_kubeconfig: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.patched"
      when: noble_k8s_api_server_override | default('') | length > 0
      tags: [always]
    - name: Verify Kubernetes API is reachable from this host
      ansible.builtin.command:
        argv:
          - kubectl
          - get
          - --raw
          - /healthz
          - --request-timeout=15s
      environment:
        KUBECONFIG: "{{ noble_kubeconfig }}"
      register: noble_k8s_health_first
      failed_when: false
      changed_when: false
      tags: [always]
    # talosctl kubeconfig often sets server to the VIP; off-LAN you can reach a control-plane IP but not 192.168.50.230.
    # kubectl stderr is often "The connection to the server ... was refused" (no substring "connection refused").
    - name: Auto-fallback API server when VIP is unreachable (temp kubeconfig)
      tags: [always]
      when:
        - noble_k8s_api_server_auto_fallback | default(true) | bool
        - noble_k8s_api_server_override | default('') | length == 0
        - not (noble_skip_k8s_health_check | default(false) | bool)
        - (noble_k8s_health_first.rc | default(1)) != 0 or (noble_k8s_health_first.stdout | default('') | trim) != 'ok'
        - (((noble_k8s_health_first.stderr | default('')) ~ (noble_k8s_health_first.stdout | default(''))) | lower) is search('network is unreachable|no route to host|connection refused|was refused', multiline=False)
      block:
        - name: Ensure temp dir for kubeconfig auto-fallback
          ansible.builtin.file:
            path: "{{ noble_repo_root }}/ansible/.ansible-tmp"
            state: directory
            mode: "0700"
        - name: Copy kubeconfig for API auto-fallback
          ansible.builtin.copy:
            src: "{{ noble_kubeconfig }}"
            dest: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.auto-fallback"
            mode: "0600"
        - name: Resolve cluster name for kubectl set-cluster
          ansible.builtin.command:
            argv:
              - kubectl
              - config
              - view
              - --minify
              - -o
              - jsonpath={.clusters[0].name}
          environment:
            KUBECONFIG: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.auto-fallback"
          register: noble_k8s_cluster_fb
          changed_when: false
        - name: Point temp kubeconfig at fallback apiserver
          ansible.builtin.command:
            argv:
              - kubectl
              - config
              - set-cluster
              - "{{ noble_k8s_cluster_fb.stdout }}"
              - --server={{ noble_k8s_api_server_fallback | default('https://192.168.50.20:6443', true) }}
              - --kubeconfig={{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.auto-fallback
          changed_when: true
        - name: Use kubeconfig with fallback API server for this play
          ansible.builtin.set_fact:
            noble_kubeconfig: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.auto-fallback"
        - name: Re-verify Kubernetes API after auto-fallback
          ansible.builtin.command:
            argv:
              - kubectl
              - get
              - --raw
              - /healthz
              - --request-timeout=15s
          environment:
            KUBECONFIG: "{{ noble_kubeconfig }}"
          register: noble_k8s_health_after_fallback
          failed_when: false
          changed_when: false
        - name: Mark that API was re-checked after kubeconfig fallback
          ansible.builtin.set_fact:
            noble_k8s_api_fallback_used: true
    - name: Normalize API health result for preflight (scalars; avoids dict merge / set_fact stringification)
      ansible.builtin.set_fact:
        noble_k8s_health_rc: "{{ noble_k8s_health_after_fallback.rc | default(1) if (noble_k8s_api_fallback_used | default(false) | bool) else (noble_k8s_health_first.rc | default(1)) }}"
        noble_k8s_health_stdout: "{{ noble_k8s_health_after_fallback.stdout | default('') if (noble_k8s_api_fallback_used | default(false) | bool) else (noble_k8s_health_first.stdout | default('')) }}"
        noble_k8s_health_stderr: "{{ noble_k8s_health_after_fallback.stderr | default('') if (noble_k8s_api_fallback_used | default(false) | bool) else (noble_k8s_health_first.stderr | default('')) }}"
      tags: [always]
    - name: Fail when API check did not return ok
      ansible.builtin.fail:
        msg: "{{ lookup('template', 'templates/api_health_hint.j2') }}"
      when:
        - not (noble_skip_k8s_health_check | default(false) | bool)
        - (noble_k8s_health_rc | int) != 0 or (noble_k8s_health_stdout | default('') | trim) != 'ok'
      tags: [always]
  roles:
    - role: helm_repos
      tags: [repos, helm]
    - role: noble_cilium
      tags: [cilium, cni]
    - role: noble_csi_snapshot_controller
      tags: [csi_snapshot, snapshot, storage]
    - role: noble_metrics_server
      tags: [metrics, metrics_server]
    - role: noble_longhorn
      tags: [longhorn, storage]
    - role: noble_metallb
      tags: [metallb, lb]
    - role: noble_kube_vip
      tags: [kube_vip, vip]
    - role: noble_traefik
      tags: [traefik, ingress]
    - role: noble_cert_manager
      tags: [cert_manager, certs]
    - role: noble_newt
      tags: [newt]
    - role: noble_argocd
      tags: [argocd, gitops]
    - role: noble_kyverno
      tags: [kyverno, policy]
    - role: noble_kyverno_policies
      tags: [kyverno_policies, policy]
    - role: noble_platform
      tags: [platform, observability, apps]
    - role: noble_velero
      tags: [velero, backups]
    - role: noble_landing_urls
      tags: [landing, platform, observability, apps]


@@ -1,7 +0,0 @@
---
# Manual follow-ups after **noble.yml**: SOPS key backup, optional Argo root Application.
- hosts: localhost
connection: local
gather_facts: false
roles:
- noble_post_deploy


@@ -1,9 +0,0 @@
---
- name: Proxmox cluster bootstrap/join
hosts: proxmox_hosts
become: true
gather_facts: false
serial: 1
roles:
- role: proxmox_cluster
tags: [proxmox, cluster]


@@ -1,4 +0,0 @@
---
- import_playbook: proxmox_prepare.yml
- import_playbook: proxmox_upgrade.yml
- import_playbook: proxmox_cluster.yml


@@ -1,8 +0,0 @@
---
- name: Proxmox host preparation (community repos + no-subscription notice)
hosts: proxmox_hosts
become: true
gather_facts: true
roles:
- role: proxmox_baseline
tags: [proxmox, prepare, repos, ui]


@@ -1,9 +0,0 @@
---
- name: Proxmox host maintenance (upgrade to latest)
hosts: proxmox_hosts
become: true
gather_facts: true
serial: 1
roles:
- role: proxmox_maintenance
tags: [proxmox, maintenance, updates]


@@ -1,11 +0,0 @@
---
# Genconfig only — for full Talos Phase A (apply, bootstrap, kubeconfig) use **playbooks/talos_phase_a.yml**
# or **playbooks/deploy.yml**. Run: ansible-playbook playbooks/talos_bootstrap.yml -e noble_talos_genconfig=true
- name: Talos — optional genconfig helper
hosts: localhost
connection: local
gather_facts: false
vars:
noble_repo_root: "{{ playbook_dir | dirname | dirname }}"
roles:
- role: talos_bootstrap


@@ -1,15 +0,0 @@
---
# Talos Phase A — **talhelper genconfig** → **apply-config** (all nodes) → **bootstrap** → **kubeconfig**.
# Requires: **talosctl**, **talhelper**, reachable node IPs (same LAN as nodes for Talos API :50000).
# See **talos/README.md** §1–§3. Then run **playbooks/noble.yml** or **deploy.yml**.
- name: Talos — genconfig, apply, bootstrap, kubeconfig
hosts: localhost
connection: local
gather_facts: false
vars:
noble_repo_root: "{{ playbook_dir | dirname | dirname }}"
noble_talos_dir: "{{ noble_repo_root }}/talos"
noble_talos_kubeconfig_out: "{{ noble_repo_root }}/talos/kubeconfig"
roles:
- role: talos_phase_a
tags: [talos, phase_a]


@@ -1,22 +0,0 @@
{# Error output for noble.yml API preflight when kubectl /healthz fails #}
Cannot use the Kubernetes API from this host (kubectl get --raw /healthz).
rc={{ noble_k8s_health_rc | default('n/a') }}
stderr: {{ noble_k8s_health_stderr | default('') | trim }}
{% set err = (noble_k8s_health_stderr | default('')) | lower %}
{% if 'connection refused' in err %}
Connection refused: the TCP path to that host works, but nothing is accepting HTTPS on port 6443 there.
• **Not bootstrapped yet?** Finish Talos first: `talosctl bootstrap` (once on a control plane), then `talosctl kubeconfig`, then confirm `kubectl get nodes`. See talos/README.md §2–§3 and CLUSTER-BUILD.md Phase A. **Do not run this playbook before the Kubernetes API exists.**
• If bootstrap is done: try another control-plane IP (CLUSTER-BUILD inventory: neon 192.168.50.20, argon .30, krypton .40), or the VIP if kube-vip is up and you are on the LAN:
-e 'noble_k8s_api_server_override=https://192.168.50.230:6443'
• Do not point the API URL at a worker-only node.
• Sanity-check from a working client: `talosctl health` / `kubectl get nodes`.
{% elif 'network is unreachable' in err or 'no route to host' in err %}
Network unreachable / no route: this machine cannot route to the API IP. Join the lab LAN or VPN, or set a reachable API server URL (talos/README.md §3).
{% else %}
If kubeconfig used the VIP from off-LAN, try a reachable control-plane IP, e.g.:
-e 'noble_k8s_api_server_override=https://192.168.50.20:6443'
See talos/README.md §3.
{% endif %}
To skip this check (not recommended): -e noble_skip_k8s_health_check=true

ansible/requirements.yml (new file)

@@ -0,0 +1,2 @@
collections:
- name: community.general


@@ -1,39 +0,0 @@
---
# Update apt metadata only when stale (seconds)
debian_baseline_apt_cache_valid_time: 3600
# Core host hardening packages
debian_baseline_packages:
- unattended-upgrades
- apt-listchanges
- fail2ban
- needrestart
- sudo
- ca-certificates
# SSH hardening controls
debian_baseline_ssh_permit_root_login: "no"
debian_baseline_ssh_password_authentication: "no"
debian_baseline_ssh_pubkey_authentication: "yes"
debian_baseline_ssh_x11_forwarding: "no"
debian_baseline_ssh_max_auth_tries: 3
debian_baseline_ssh_client_alive_interval: 300
debian_baseline_ssh_client_alive_count_max: 2
debian_baseline_ssh_allow_users: []
# unattended-upgrades controls
debian_baseline_enable_unattended_upgrades: true
debian_baseline_unattended_auto_upgrade: "1"
debian_baseline_unattended_update_lists: "1"
# Kernel and network hardening sysctls
debian_baseline_sysctl_settings:
net.ipv4.conf.all.accept_redirects: "0"
net.ipv4.conf.default.accept_redirects: "0"
net.ipv4.conf.all.send_redirects: "0"
net.ipv4.conf.default.send_redirects: "0"
net.ipv4.conf.all.log_martians: "1"
net.ipv4.conf.default.log_martians: "1"
net.ipv4.tcp_syncookies: "1"
net.ipv6.conf.all.accept_redirects: "0"
net.ipv6.conf.default.accept_redirects: "0"


@@ -1,12 +0,0 @@
---
- name: Restart ssh
ansible.builtin.service:
name: ssh
state: restarted
- name: Reload sysctl
ansible.builtin.command:
argv:
- sysctl
- --system
changed_when: true


@@ -1,52 +0,0 @@
---
- name: Refresh apt cache
ansible.builtin.apt:
update_cache: true
cache_valid_time: "{{ debian_baseline_apt_cache_valid_time }}"
- name: Install baseline hardening packages
ansible.builtin.apt:
name: "{{ debian_baseline_packages }}"
state: present
- name: Configure unattended-upgrades auto settings
ansible.builtin.copy:
dest: /etc/apt/apt.conf.d/20auto-upgrades
mode: "0644"
content: |
APT::Periodic::Update-Package-Lists "{{ debian_baseline_unattended_update_lists }}";
APT::Periodic::Unattended-Upgrade "{{ debian_baseline_unattended_auto_upgrade }}";
when: debian_baseline_enable_unattended_upgrades | bool
- name: Configure SSH hardening options
ansible.builtin.copy:
dest: /etc/ssh/sshd_config.d/99-hardening.conf
mode: "0644"
content: |
PermitRootLogin {{ debian_baseline_ssh_permit_root_login }}
PasswordAuthentication {{ debian_baseline_ssh_password_authentication }}
PubkeyAuthentication {{ debian_baseline_ssh_pubkey_authentication }}
X11Forwarding {{ debian_baseline_ssh_x11_forwarding }}
MaxAuthTries {{ debian_baseline_ssh_max_auth_tries }}
ClientAliveInterval {{ debian_baseline_ssh_client_alive_interval }}
ClientAliveCountMax {{ debian_baseline_ssh_client_alive_count_max }}
{% if debian_baseline_ssh_allow_users | length > 0 %}
AllowUsers {{ debian_baseline_ssh_allow_users | join(' ') }}
{% endif %}
notify: Restart ssh
- name: Configure baseline sysctls
ansible.builtin.copy:
dest: /etc/sysctl.d/99-hardening.conf
mode: "0644"
content: |
{% for key, value in debian_baseline_sysctl_settings.items() %}
{{ key }} = {{ value }}
{% endfor %}
notify: Reload sysctl
- name: Ensure fail2ban service is enabled
ansible.builtin.service:
name: fail2ban
enabled: true
state: started


@@ -1,7 +0,0 @@
---
debian_maintenance_apt_cache_valid_time: 3600
debian_maintenance_upgrade_type: dist
debian_maintenance_autoremove: true
debian_maintenance_autoclean: true
debian_maintenance_reboot_if_required: true
debian_maintenance_reboot_timeout: 1800


@@ -1,30 +0,0 @@
---
- name: Refresh apt cache
ansible.builtin.apt:
update_cache: true
cache_valid_time: "{{ debian_maintenance_apt_cache_valid_time }}"
- name: Upgrade Debian packages
ansible.builtin.apt:
upgrade: "{{ debian_maintenance_upgrade_type }}"
- name: Remove orphaned packages
ansible.builtin.apt:
autoremove: "{{ debian_maintenance_autoremove }}"
- name: Clean apt package cache
ansible.builtin.apt:
autoclean: "{{ debian_maintenance_autoclean }}"
- name: Check if reboot is required
ansible.builtin.stat:
path: /var/run/reboot-required
register: debian_maintenance_reboot_required_file
- name: Reboot when required by package updates
ansible.builtin.reboot:
reboot_timeout: "{{ debian_maintenance_reboot_timeout }}"
msg: "Reboot initiated by Ansible maintenance playbook"
when:
- debian_maintenance_reboot_if_required | bool
- debian_maintenance_reboot_required_file.stat.exists | default(false)


@@ -1,10 +0,0 @@
---
# List of users to manage keys for.
# Example:
# debian_ssh_rotation_users:
# - name: deploy
# home: /home/deploy
# state: present
# keys:
# - "ssh-ed25519 AAAA... deploy@laptop"
debian_ssh_rotation_users: []


@@ -1,50 +0,0 @@
---
- name: Validate SSH key rotation inputs
ansible.builtin.assert:
that:
- item.name is defined
- item.home is defined
- (item.state | default('present')) in ['present', 'absent']
# item['keys'] (not item.keys): dotted access resolves to the dict.keys() method in Jinja
- (item.state | default('present')) == 'absent' or (item['keys'] is defined and item['keys'] | length > 0)
fail_msg: >-
Each entry in debian_ssh_rotation_users must include name, home, and either:
state=absent, or keys with at least one SSH public key.
loop: "{{ debian_ssh_rotation_users }}"
loop_control:
label: "{{ item.name | default('unknown') }}"
- name: Ensure ~/.ssh exists for managed users
ansible.builtin.file:
path: "{{ item.home }}/.ssh"
state: directory
owner: "{{ item.name }}"
group: "{{ item.name }}"
mode: "0700"
loop: "{{ debian_ssh_rotation_users }}"
loop_control:
label: "{{ item.name }}"
when: (item.state | default('present')) == 'present'
- name: Rotate authorized_keys for managed users
ansible.builtin.copy:
dest: "{{ item.home }}/.ssh/authorized_keys"
owner: "{{ item.name }}"
group: "{{ item.name }}"
mode: "0600"
content: |
{% for key in item['keys'] %}
{{ key }}
{% endfor %}
loop: "{{ debian_ssh_rotation_users }}"
loop_control:
label: "{{ item.name }}"
when: (item.state | default('present')) == 'present'
- name: Remove authorized_keys for users marked absent
ansible.builtin.file:
path: "{{ item.home }}/.ssh/authorized_keys"
state: absent
loop: "{{ debian_ssh_rotation_users }}"
loop_control:
label: "{{ item.name }}"
when: (item.state | default('present')) == 'absent'


@@ -1,16 +0,0 @@
---
noble_helm_repos:
- { name: cilium, url: "https://helm.cilium.io/" }
- { name: metallb, url: "https://metallb.github.io/metallb" }
- { name: longhorn, url: "https://charts.longhorn.io" }
- { name: traefik, url: "https://traefik.github.io/charts" }
- { name: jetstack, url: "https://charts.jetstack.io" }
- { name: fossorial, url: "https://charts.fossorial.io" }
- { name: argo, url: "https://argoproj.github.io/argo-helm" }
- { name: metrics-server, url: "https://kubernetes-sigs.github.io/metrics-server/" }
- { name: prometheus-community, url: "https://prometheus-community.github.io/helm-charts" }
- { name: grafana, url: "https://grafana.github.io/helm-charts" }
- { name: fluent, url: "https://fluent.github.io/helm-charts" }
- { name: headlamp, url: "https://kubernetes-sigs.github.io/headlamp/" }
- { name: kyverno, url: "https://kyverno.github.io/kyverno/" }
- { name: vmware-tanzu, url: "https://vmware-tanzu.github.io/helm-charts" }


@@ -1,16 +0,0 @@
---
- name: Add Helm repositories
ansible.builtin.command:
cmd: "helm repo add {{ item.name }} {{ item.url }}"
loop: "{{ noble_helm_repos }}"
loop_control:
label: "{{ item.name }}"
register: helm_repo_add
changed_when: helm_repo_add.rc == 0
failed_when: >-
helm_repo_add.rc != 0 and
('already exists' not in (helm_repo_add.stderr | default('')))
- name: Update Helm repository metadata
ansible.builtin.command: helm repo update
changed_when: true


@@ -1,6 +0,0 @@
---
# When true, applies clusters/noble/bootstrap/argocd/root-application.yaml (app-of-apps).
# Edit spec.source.repoURL in that file if your Git remote differs.
noble_argocd_apply_root_application: false
# When true, applies clusters/noble/bootstrap/argocd/bootstrap-root-application.yaml (noble-bootstrap-root; manual sync until README §5).
noble_argocd_apply_bootstrap_root_application: true


@@ -1,46 +0,0 @@
---
- name: Install Argo CD
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- argocd
- argo/argo-cd
- --namespace
- argocd
- --create-namespace
- --version
- "9.4.17"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/argocd/values.yaml"
- --wait
- --timeout
- 15m
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Apply Argo CD root Application (app-of-apps)
ansible.builtin.command:
argv:
- kubectl
- apply
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/argocd/root-application.yaml"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
when: noble_argocd_apply_root_application | default(false) | bool
changed_when: true
- name: Apply Argo CD bootstrap app-of-apps Application
ansible.builtin.command:
argv:
- kubectl
- apply
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/argocd/bootstrap-root-application.yaml"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
when: noble_argocd_apply_bootstrap_root_application | default(false) | bool
changed_when: true
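Both apply tasks are gated by role defaults, so the usual cut-over to full GitOps is a small variable override (sketch; assumes `spec.source.repoURL` in root-application.yaml already points at your remote):

```yaml
# e.g. in group_vars/all.yml, or passed with -e / -e @overrides.yml
noble_argocd_apply_root_application: true
noble_argocd_apply_bootstrap_root_application: false
```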


@@ -1,3 +0,0 @@
---
# Warn when **cloudflare-dns-api-token** is missing after apply (also set in **group_vars/all.yml** when loaded).
noble_cert_manager_require_cloudflare_secret: true


@@ -1,28 +0,0 @@
---
# See repository **.env.sample** — copy to **.env** (gitignored).
- name: Stat repository .env for deploy secrets
ansible.builtin.stat:
path: "{{ noble_repo_root }}/.env"
register: noble_deploy_env_file
changed_when: false
- name: Create cert-manager Cloudflare DNS secret from .env
ansible.builtin.shell: |
set -euo pipefail
set -a
. "{{ noble_repo_root }}/.env"
set +a
if [ -z "${CLOUDFLARE_DNS_API_TOKEN:-}" ]; then
echo NO_TOKEN
exit 0
fi
kubectl -n cert-manager create secret generic cloudflare-dns-api-token \
--from-literal=api-token="${CLOUDFLARE_DNS_API_TOKEN}" \
--dry-run=client -o yaml | kubectl apply -f -
echo APPLIED
args:
executable: /bin/bash
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
when: noble_deploy_env_file.stat.exists | default(false)
no_log: true
register: noble_cf_secret_from_env
changed_when: "'APPLIED' in (noble_cf_secret_from_env.stdout | default(''))"


@@ -1,68 +0,0 @@
---
- name: Create cert-manager namespace
ansible.builtin.command:
argv:
- kubectl
- apply
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/cert-manager/namespace.yaml"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Install cert-manager
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- cert-manager
- jetstack/cert-manager
- --namespace
- cert-manager
- --version
- v1.20.0
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/cert-manager/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Apply secrets from repository .env (optional)
ansible.builtin.include_tasks: from_env.yml
- name: Check Cloudflare DNS API token Secret (required for ClusterIssuers)
ansible.builtin.command:
argv:
- kubectl
- -n
- cert-manager
- get
- secret
- cloudflare-dns-api-token
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
register: noble_cf_secret
failed_when: false
changed_when: false
- name: Warn when Cloudflare Secret is missing
ansible.builtin.debug:
msg: >-
Secret cert-manager/cloudflare-dns-api-token not found.
Create it per clusters/noble/bootstrap/cert-manager/README.md before ClusterIssuers can succeed.
when:
- noble_cert_manager_require_cloudflare_secret | default(true) | bool
- noble_cf_secret.rc != 0
- name: Apply ClusterIssuers (staging + prod)
ansible.builtin.command:
argv:
- kubectl
- apply
- -k
- "{{ noble_repo_root }}/clusters/noble/bootstrap/cert-manager"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true


@@ -1,25 +0,0 @@
---
- name: Install Cilium (required CNI for Talos cni:none)
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- cilium
- cilium/cilium
- --namespace
- kube-system
- --version
- "1.16.6"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/cilium/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Wait for Cilium DaemonSet
ansible.builtin.command: kubectl -n kube-system rollout status ds/cilium --timeout=300s
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: false


@@ -1,2 +0,0 @@
---
noble_csi_snapshot_kubectl_timeout: 120s


@@ -1,39 +0,0 @@
---
# Volume Snapshot CRDs + snapshot-controller (Velero CSI / Longhorn snapshots).
- name: Apply Volume Snapshot CRDs (snapshot.storage.k8s.io)
ansible.builtin.command:
argv:
- kubectl
- apply
- "--request-timeout={{ noble_csi_snapshot_kubectl_timeout | default('120s') }}"
- -k
- "{{ noble_repo_root }}/clusters/noble/bootstrap/csi-snapshot-controller/crd"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Apply snapshot-controller in kube-system
ansible.builtin.command:
argv:
- kubectl
- apply
- "--request-timeout={{ noble_csi_snapshot_kubectl_timeout | default('120s') }}"
- -k
- "{{ noble_repo_root }}/clusters/noble/bootstrap/csi-snapshot-controller/controller"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Wait for snapshot-controller Deployment
ansible.builtin.command:
argv:
- kubectl
- -n
- kube-system
- rollout
- status
- deploy/snapshot-controller
- --timeout=120s
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: false


@@ -1,11 +0,0 @@
---
- name: Apply kube-vip (Kubernetes API VIP)
ansible.builtin.command:
argv:
- kubectl
- apply
- -k
- "{{ noble_repo_root }}/clusters/noble/bootstrap/kube-vip"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true


@@ -1,32 +0,0 @@
---
- name: Create Kyverno namespace
ansible.builtin.command:
argv:
- kubectl
- apply
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/kyverno/namespace.yaml"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Install Kyverno operator
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- kyverno
- kyverno/kyverno
- -n
- kyverno
- --version
- "3.7.1"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/kyverno/values.yaml"
- --wait
- --timeout
- 15m
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true


@@ -1,21 +0,0 @@
---
- name: Install Kyverno policy chart (PSS baseline, Audit)
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- kyverno-policies
- kyverno/kyverno-policies
- -n
- kyverno
- --version
- "3.7.1"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/kyverno/policies-values.yaml"
- --wait
- --timeout
- 10m
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true


@@ -1,51 +0,0 @@
---
# Regenerated when **noble_landing_urls** runs (after platform stack). Paths match Traefik + cert-manager Ingresses.
noble_landing_urls_dest: "{{ noble_repo_root }}/ansible/output/noble-lab-ui-urls.md"
# When true, run kubectl to fill Argo CD / Grafana secrets and a bounded Headlamp SA token in the markdown (requires working kubeconfig).
noble_landing_urls_fetch_credentials: true
# Headlamp: bounded token for UI sign-in (`kubectl create token`); cluster may cap max duration.
noble_landing_urls_headlamp_token_duration: 48h
noble_lab_ui_entries:
- name: Argo CD
description: GitOps UI (sync, apps, repos)
namespace: argocd
service: argocd-server
url: https://argo.apps.noble.lab.pcenicni.dev
- name: Grafana
description: Dashboards, Loki explore (logs)
namespace: monitoring
service: kube-prometheus-grafana
url: https://grafana.apps.noble.lab.pcenicni.dev
- name: Prometheus
description: Prometheus UI (queries, targets) — lab; protect in production
namespace: monitoring
service: kube-prometheus-kube-prome-prometheus
url: https://prometheus.apps.noble.lab.pcenicni.dev
- name: Alertmanager
description: Alertmanager UI (silences, status)
namespace: monitoring
service: kube-prometheus-kube-prome-alertmanager
url: https://alertmanager.apps.noble.lab.pcenicni.dev
- name: Headlamp
description: Kubernetes UI (cluster resources)
namespace: headlamp
service: headlamp
url: https://headlamp.apps.noble.lab.pcenicni.dev
- name: Longhorn
description: Storage volumes, nodes, backups
namespace: longhorn-system
service: longhorn-frontend
url: https://longhorn.apps.noble.lab.pcenicni.dev
- name: Velero
description: Cluster backups — no web UI (velero CLI / kubectl CRDs)
namespace: velero
service: velero
url: ""
- name: Homepage
description: App dashboard (links to lab UIs)
namespace: homepage
service: homepage
url: https://homepage.apps.noble.lab.pcenicni.dev


@@ -1,72 +0,0 @@
---
# Populates template variables from Secrets + Headlamp token (no_log on kubectl to avoid leaking into Ansible stdout).
- name: Fetch Argo CD initial admin password (base64)
ansible.builtin.command:
argv:
- kubectl
- -n
- argocd
- get
- secret
- argocd-initial-admin-secret
- -o
- jsonpath={.data.password}
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
register: noble_fetch_argocd_pw_b64
failed_when: false
changed_when: false
no_log: true
- name: Fetch Grafana admin user (base64)
ansible.builtin.command:
argv:
- kubectl
- -n
- monitoring
- get
- secret
- kube-prometheus-grafana
- -o
- jsonpath={.data.admin-user}
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
register: noble_fetch_grafana_user_b64
failed_when: false
changed_when: false
no_log: true
- name: Fetch Grafana admin password (base64)
ansible.builtin.command:
argv:
- kubectl
- -n
- monitoring
- get
- secret
- kube-prometheus-grafana
- -o
- jsonpath={.data.admin-password}
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
register: noble_fetch_grafana_pw_b64
failed_when: false
changed_when: false
no_log: true
- name: Create Headlamp ServiceAccount token (for UI sign-in)
ansible.builtin.command:
argv:
- kubectl
- -n
- headlamp
- create
- token
- headlamp
- "--duration={{ noble_landing_urls_headlamp_token_duration | default('48h') }}"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
register: noble_fetch_headlamp_token
failed_when: false
changed_when: false
no_log: true


@@ -1,20 +0,0 @@
---
- name: Ensure output directory for generated landing page
ansible.builtin.file:
path: "{{ noble_repo_root }}/ansible/output"
state: directory
mode: "0755"
- name: Fetch initial credentials from cluster Secrets (optional)
ansible.builtin.include_tasks: fetch_credentials.yml
when: noble_landing_urls_fetch_credentials | default(true) | bool
- name: Write noble lab UI URLs (markdown landing page)
ansible.builtin.template:
src: noble-lab-ui-urls.md.j2
dest: "{{ noble_landing_urls_dest }}"
mode: "0644"
- name: Show landing page path
ansible.builtin.debug:
msg: "Noble lab UI list written to {{ noble_landing_urls_dest }}"


@@ -1,51 +0,0 @@
# Noble lab — web UIs (LAN)
> **Sensitive:** This file may include **passwords read from Kubernetes Secrets** when credential fetch ran. It is **gitignored** — do not commit or share.
**DNS:** point **`*.apps.noble.lab.pcenicni.dev`** at the Traefik **LoadBalancer** (MetalLB **`192.168.50.211`** by default — see `clusters/noble/bootstrap/traefik/values.yaml`).
**TLS:** **cert-manager** + **`letsencrypt-prod`** on each Ingress (public **DNS-01** for **`pcenicni.dev`**).
This file is **generated** by Ansible (`noble_landing_urls` role). Use it as a temporary landing page to find services after deploy.
| UI | What | Kubernetes service | Namespace | URL |
|----|------|----------------------|-----------|-----|
{% for e in noble_lab_ui_entries %}
| {{ e.name }} | {{ e.description }} | `{{ e.service }}` | `{{ e.namespace }}` | {% if e.url | default('') | length > 0 %}[{{ e.url }}]({{ e.url }}){% else %}—{% endif %} |
{% endfor %}
## Initial access (logins)
| App | Username / identity | Password / secret |
|-----|---------------------|-------------------|
| **Argo CD** | `admin` | {% if (noble_fetch_argocd_pw_b64 is defined) and (noble_fetch_argocd_pw_b64.rc | default(1) == 0) and (noble_fetch_argocd_pw_b64.stdout | default('') | length > 0) %}`{{ noble_fetch_argocd_pw_b64.stdout | b64decode }}`{% else %}*(not fetched — use commands below)*{% endif %} |
| **Grafana** | {% if (noble_fetch_grafana_user_b64 is defined) and (noble_fetch_grafana_user_b64.rc | default(1) == 0) and (noble_fetch_grafana_user_b64.stdout | default('') | length > 0) %}`{{ noble_fetch_grafana_user_b64.stdout | b64decode }}`{% else %}*(from Secret — use commands below)*{% endif %} | {% if (noble_fetch_grafana_pw_b64 is defined) and (noble_fetch_grafana_pw_b64.rc | default(1) == 0) and (noble_fetch_grafana_pw_b64.stdout | default('') | length > 0) %}`{{ noble_fetch_grafana_pw_b64.stdout | b64decode }}`{% else %}*(not fetched — use commands below)*{% endif %} |
| **Headlamp** | ServiceAccount **`headlamp`** | {% if (noble_fetch_headlamp_token is defined) and (noble_fetch_headlamp_token.rc | default(1) == 0) and (noble_fetch_headlamp_token.stdout | default('') | trim | length > 0) %}Token ({{ noble_landing_urls_headlamp_token_duration | default('48h') }}): `{{ noble_fetch_headlamp_token.stdout | trim }}`{% else %}*(not generated — use command below)*{% endif %} |
| **Prometheus** | — | No auth in default install (lab). |
| **Alertmanager** | — | No auth in default install (lab). |
| **Longhorn** | — | No default login unless you enable access control in the UI settings. |
### Commands to retrieve passwords (if not filled above)
```bash
# Argo CD initial admin (Secret removed after you change password)
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath='{.data.password}' | base64 -d
echo
# Grafana admin user / password
kubectl -n monitoring get secret kube-prometheus-grafana -o jsonpath='{.data.admin-user}' | base64 -d
echo
kubectl -n monitoring get secret kube-prometheus-grafana -o jsonpath='{.data.admin-password}' | base64 -d
echo
```
To generate this file **without** calling kubectl, run Ansible with **`-e noble_landing_urls_fetch_credentials=false`**.
## Notes
- **Argo CD** `argocd-initial-admin-secret` disappears after you change the admin password.
- **Grafana** password is random unless you set `grafana.adminPassword` in chart values.
- **Prometheus / Alertmanager** UIs are unauthenticated by default — restrict when hardening (`talos/CLUSTER-BUILD.md` Phase G).
- **SOPS:** cluster secrets in git under **`clusters/noble/secrets/`** are encrypted; decrypt with **`age-key.txt`** (not in git). See **`clusters/noble/secrets/README.md`**.
- **Headlamp** token above expires after the configured duration; re-run Ansible or `kubectl create token` to refresh.
- **Velero** has **no web UI** — use **`velero`** CLI or **`kubectl -n velero get backup,schedule,backupstoragelocation`**. Metrics: **`velero`** Service in **`velero`** (Prometheus scrape). See `clusters/noble/bootstrap/velero/README.md`.
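As an aside on the credential cells above: the template fills them with Ansible's `b64decode` filter, which is the same transform as plain `base64 -d` outside Ansible (dummy value shown here, never a real secret):

```shell
# Dummy round-trip only — real values come from the cluster Secrets listed above.
encoded=$(printf 'admin' | base64)
printf 'decoded: %s\n' "$(printf '%s' "$encoded" | base64 -d)"
```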


@@ -1,29 +0,0 @@
---
- name: Apply Longhorn namespace (PSA) from kustomization
ansible.builtin.command:
argv:
- kubectl
- apply
- -k
- "{{ noble_repo_root }}/clusters/noble/bootstrap/longhorn"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Install Longhorn chart
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- longhorn
- longhorn/longhorn
- -n
- longhorn-system
- --create-namespace
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/longhorn/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true


@@ -1,3 +0,0 @@
---
# Helm **--wait** default is often too short when images pull slowly or nodes are busy.
noble_helm_metallb_wait_timeout: 20m


@@ -1,39 +0,0 @@
---
- name: Apply MetalLB namespace (Pod Security labels)
ansible.builtin.command:
argv:
- kubectl
- apply
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/metallb/namespace.yaml"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Install MetalLB chart
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- metallb
- metallb/metallb
- --namespace
- metallb-system
- --wait
- --timeout
- "{{ noble_helm_metallb_wait_timeout }}"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Apply IPAddressPool and L2Advertisement
ansible.builtin.command:
argv:
- kubectl
- apply
- -k
- "{{ noble_repo_root }}/clusters/noble/bootstrap/metallb"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true


@@ -1,19 +0,0 @@
---
- name: Install metrics-server
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- metrics-server
- metrics-server/metrics-server
- -n
- kube-system
- --version
- "3.13.0"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/metrics-server/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true


@@ -1,3 +0,0 @@
---
# Deploys Newt when true; the newt-pangolin-auth Secret is created from repo .env when its vars are set (see role / cluster docs). Set false to skip.
noble_newt_install: true


@@ -1,30 +0,0 @@
---
# See repository **.env.sample** — copy to **.env** (gitignored).
- name: Stat repository .env for deploy secrets
ansible.builtin.stat:
path: "{{ noble_repo_root }}/.env"
register: noble_deploy_env_file
changed_when: false
- name: Create newt-pangolin-auth Secret from .env
ansible.builtin.shell: |
set -euo pipefail
set -a
. "{{ noble_repo_root }}/.env"
set +a
if [ -z "${PANGOLIN_ENDPOINT:-}" ] || [ -z "${NEWT_ID:-}" ] || [ -z "${NEWT_SECRET:-}" ]; then
echo NO_VARS
exit 0
fi
kubectl -n newt create secret generic newt-pangolin-auth \
--from-literal=PANGOLIN_ENDPOINT="${PANGOLIN_ENDPOINT}" \
--from-literal=NEWT_ID="${NEWT_ID}" \
--from-literal=NEWT_SECRET="${NEWT_SECRET}" \
--dry-run=client -o yaml | kubectl apply -f -
echo APPLIED
args:
executable: /bin/bash
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
when: noble_deploy_env_file.stat.exists | default(false)
no_log: true
register: noble_newt_secret_from_env
changed_when: "'APPLIED' in (noble_newt_secret_from_env.stdout | default(''))"


@@ -1,41 +0,0 @@
---
- name: Skip Newt when not enabled
ansible.builtin.debug:
msg: "noble_newt_install is false — set PANGOLIN_ENDPOINT, NEWT_ID, NEWT_SECRET in repo .env (or create the Secret manually) and set noble_newt_install=true to deploy Newt."
when: not (noble_newt_install | bool)
- name: Create Newt namespace
ansible.builtin.command:
argv:
- kubectl
- apply
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/newt/namespace.yaml"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
when: noble_newt_install | bool
changed_when: true
- name: Apply Newt Pangolin auth Secret from repository .env (optional)
ansible.builtin.include_tasks: from_env.yml
when: noble_newt_install | bool
- name: Install Newt chart
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- newt
- fossorial/newt
- --namespace
- newt
- --version
- "1.2.0"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/newt/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
when: noble_newt_install | bool
changed_when: true


@@ -1,9 +0,0 @@
---
# kubectl apply -k can hit transient etcd timeouts under load; retries + longer API deadline help.
noble_platform_kubectl_request_timeout: 120s
noble_platform_kustomize_retries: 5
noble_platform_kustomize_delay: 20
# Decrypt **clusters/noble/secrets/*.yaml** with SOPS and kubectl apply (requires **sops**, **age**, and **age-key.txt**).
noble_apply_sops_secrets: true
noble_sops_age_key_file: "{{ noble_repo_root }}/age-key.txt"


@@ -1,117 +0,0 @@
---
# Mirrors former **noble-platform** Argo Application: Helm releases + plain manifests under clusters/noble/bootstrap.
- name: Apply clusters/noble/bootstrap kustomize (namespaces, Grafana Loki datasource)
ansible.builtin.command:
argv:
- kubectl
- apply
- "--request-timeout={{ noble_platform_kubectl_request_timeout }}"
- -k
- "{{ noble_repo_root }}/clusters/noble/bootstrap"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
register: noble_platform_kustomize
retries: "{{ noble_platform_kustomize_retries | int }}"
delay: "{{ noble_platform_kustomize_delay | int }}"
until: noble_platform_kustomize.rc == 0
changed_when: true
- name: Stat SOPS age private key (age-key.txt)
ansible.builtin.stat:
path: "{{ noble_sops_age_key_file }}"
register: noble_sops_age_key_stat
- name: Apply SOPS-encrypted cluster secrets (clusters/noble/secrets/*.yaml)
ansible.builtin.shell: |
set -euo pipefail
shopt -s nullglob
for f in "{{ noble_repo_root }}/clusters/noble/secrets"/*.yaml; do
sops -d "$f" | kubectl apply -f -
done
args:
executable: /bin/bash
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
SOPS_AGE_KEY_FILE: "{{ noble_sops_age_key_file }}"
when:
- noble_apply_sops_secrets | default(true) | bool
- noble_sops_age_key_stat.stat.exists
changed_when: true
- name: Install kube-prometheus-stack
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- kube-prometheus
- prometheus-community/kube-prometheus-stack
- -n
- monitoring
- --version
- "82.15.1"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/kube-prometheus-stack/values.yaml"
- --wait
- --timeout
- 30m
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Install Loki
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- loki
- grafana/loki
- -n
- loki
- --version
- "6.55.0"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/loki/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Install Fluent Bit
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- fluent-bit
- fluent/fluent-bit
- -n
- logging
- --version
- "0.56.0"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/fluent-bit/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Install Headlamp
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- headlamp
- headlamp/headlamp
- --version
- "0.40.1"
- -n
- headlamp
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/headlamp/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
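The SOPS decrypt-apply loop above sets bash `nullglob` so an empty `clusters/noble/secrets/` directory is a clean no-op; without it, the unmatched glob would be passed to `sops` as a literal (nonexistent) path. A minimal runnable demonstration of the difference:

```shell
# With nullglob, an unmatched glob expands to nothing and the loop body never runs;
# without it, the literal pattern "*.yaml" survives as a loop argument.
workdir="$(mktemp -d)"
shopt -s nullglob
count=0
for f in "$workdir"/*.yaml; do count=$((count + 1)); done
echo "nullglob iterations: $count"
shopt -u nullglob
for f in "$workdir"/*.yaml; do echo "literal argument: $f"; done
rmdir "$workdir"
```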


@@ -1,15 +0,0 @@
---
- name: SOPS secrets (workstation)
ansible.builtin.debug:
msg: |
Encrypted Kubernetes Secrets live under clusters/noble/secrets/ (Mozilla SOPS + age).
Private key: age-key.txt at repo root (gitignored). See clusters/noble/secrets/README.md
and .sops.yaml. noble.yml decrypt-applies these when age-key.txt exists.
- name: Argo CD optional root Application (empty app-of-apps)
ansible.builtin.debug:
msg: >-
App-of-apps: noble.yml applies root-application.yaml when noble_argocd_apply_root_application is true;
bootstrap-root-application.yaml when noble_argocd_apply_bootstrap_root_application is true (group_vars/all.yml).
noble-bootstrap-root uses manual sync until you enable automation after the playbook —
clusters/noble/bootstrap/argocd/README.md §5. See clusters/noble/apps/README.md and that README.


@@ -1,30 +0,0 @@
---
- name: Create Traefik namespace
ansible.builtin.command:
argv:
- kubectl
- apply
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/traefik/namespace.yaml"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true
- name: Install Traefik
ansible.builtin.command:
argv:
- helm
- upgrade
- --install
- traefik
- traefik/traefik
- --namespace
- traefik
- --version
- "39.0.6"
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/traefik/values.yaml"
- --wait
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
changed_when: true


@@ -1,13 +0,0 @@
---
# **noble_velero_install** is in **ansible/group_vars/all.yml**. Override S3 fields via extra-vars or group_vars.
noble_velero_chart_version: "12.0.0"
noble_velero_s3_bucket: ""
noble_velero_s3_url: ""
noble_velero_s3_region: "us-east-1"
noble_velero_s3_force_path_style: "true"
noble_velero_s3_prefix: ""
# Optional — if unset, Ansible expects Secret **velero/velero-cloud-credentials** (key **cloud**) to exist.
noble_velero_aws_access_key_id: ""
noble_velero_aws_secret_access_key: ""


@@ -1,68 +0,0 @@
---
# See repository **.env.sample** — copy to **.env** (gitignored).
- name: Stat repository .env for Velero
ansible.builtin.stat:
path: "{{ noble_repo_root }}/.env"
register: noble_deploy_env_file
changed_when: false
- name: Load NOBLE_VELERO_S3_BUCKET from .env when unset
ansible.builtin.shell: |
set -a
. "{{ noble_repo_root }}/.env"
set +a
echo "${NOBLE_VELERO_S3_BUCKET:-}"
register: noble_velero_s3_bucket_from_env
when:
- noble_deploy_env_file.stat.exists | default(false)
- noble_velero_s3_bucket | default('') | length == 0
changed_when: false
- name: Apply NOBLE_VELERO_S3_BUCKET from .env
ansible.builtin.set_fact:
noble_velero_s3_bucket: "{{ noble_velero_s3_bucket_from_env.stdout | trim }}"
when:
- noble_velero_s3_bucket_from_env is defined
- (noble_velero_s3_bucket_from_env.stdout | default('') | trim | length) > 0
- name: Load NOBLE_VELERO_S3_URL from .env when unset
ansible.builtin.shell: |
set -a
. "{{ noble_repo_root }}/.env"
set +a
echo "${NOBLE_VELERO_S3_URL:-}"
register: noble_velero_s3_url_from_env
when:
- noble_deploy_env_file.stat.exists | default(false)
- noble_velero_s3_url | default('') | length == 0
changed_when: false
- name: Apply NOBLE_VELERO_S3_URL from .env
ansible.builtin.set_fact:
noble_velero_s3_url: "{{ noble_velero_s3_url_from_env.stdout | trim }}"
when:
- noble_velero_s3_url_from_env is defined
- (noble_velero_s3_url_from_env.stdout | default('') | trim | length) > 0
- name: Create velero-cloud-credentials from .env when keys present
ansible.builtin.shell: |
set -euo pipefail
set -a
. "{{ noble_repo_root }}/.env"
set +a
if [ -z "${NOBLE_VELERO_AWS_ACCESS_KEY_ID:-}" ] || [ -z "${NOBLE_VELERO_AWS_SECRET_ACCESS_KEY:-}" ]; then
echo SKIP
exit 0
fi
CLOUD="$(printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
"${NOBLE_VELERO_AWS_ACCESS_KEY_ID}" "${NOBLE_VELERO_AWS_SECRET_ACCESS_KEY}")"
kubectl -n velero create secret generic velero-cloud-credentials \
--from-literal=cloud="${CLOUD}" \
--dry-run=client -o yaml | kubectl apply -f -
echo APPLIED
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
when: noble_deploy_env_file.stat.exists | default(false)
no_log: true
register: noble_velero_secret_from_env
changed_when: "'APPLIED' in (noble_velero_secret_from_env.stdout | default(''))"
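The `set -a; . .env; set +a` idiom used throughout these tasks exports every variable the sourced file defines so later commands inherit them. A self-contained sketch that builds the same INI-style `cloud` payload the role feeds to `kubectl` — the file and key values here are placeholders, not real credentials:

```shell
# Source a throwaway .env with auto-export, then render the Velero credentials body.
envfile="$(mktemp)"
printf '%s\n' \
  'NOBLE_VELERO_AWS_ACCESS_KEY_ID=AKIAEXAMPLE' \
  'NOBLE_VELERO_AWS_SECRET_ACCESS_KEY=example-secret' > "$envfile"
set -a          # every assignment from the sourced file becomes an exported variable
. "$envfile"
set +a
CLOUD="$(printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
  "$NOBLE_VELERO_AWS_ACCESS_KEY_ID" "$NOBLE_VELERO_AWS_SECRET_ACCESS_KEY")"
echo "$CLOUD"
rm -f "$envfile"
```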


@@ -1,85 +0,0 @@
---
# Velero — S3 backup target + built-in CSI snapshots (Longhorn: label VolumeSnapshotClass per README).
- name: Apply velero namespace
ansible.builtin.command:
argv:
- kubectl
- apply
- -f
- "{{ noble_repo_root }}/clusters/noble/bootstrap/velero/namespace.yaml"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
when: noble_velero_install | default(false) | bool
changed_when: true
- name: Include Velero settings from repository .env (S3 bucket, URL, credentials)
ansible.builtin.include_tasks: from_env.yml
when: noble_velero_install | default(false) | bool
- name: Require S3 bucket and endpoint for Velero
ansible.builtin.assert:
that:
- noble_velero_s3_bucket | default('') | length > 0
- noble_velero_s3_url | default('') | length > 0
fail_msg: >-
Set NOBLE_VELERO_S3_BUCKET and NOBLE_VELERO_S3_URL in .env, or noble_velero_s3_bucket / noble_velero_s3_url
(e.g. -e ...), or group_vars when noble_velero_install is true.
when: noble_velero_install | default(false) | bool
- name: Create velero-cloud-credentials from Ansible vars
ansible.builtin.shell: |
set -euo pipefail
CLOUD="$(printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
"${AWS_ACCESS_KEY_ID}" "${AWS_SECRET_ACCESS_KEY}")"
kubectl -n velero create secret generic velero-cloud-credentials \
--from-literal=cloud="${CLOUD}" \
--dry-run=client -o yaml | kubectl apply -f -
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
AWS_ACCESS_KEY_ID: "{{ noble_velero_aws_access_key_id }}"
AWS_SECRET_ACCESS_KEY: "{{ noble_velero_aws_secret_access_key }}"
when:
- noble_velero_install | default(false) | bool
- noble_velero_aws_access_key_id | default('') | length > 0
- noble_velero_aws_secret_access_key | default('') | length > 0
no_log: true
changed_when: true
- name: Check velero-cloud-credentials Secret
ansible.builtin.command:
argv:
- kubectl
- -n
- velero
- get
- secret
- velero-cloud-credentials
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
register: noble_velero_secret_check
failed_when: false
changed_when: false
when: noble_velero_install | default(false) | bool
- name: Require velero-cloud-credentials before Helm
ansible.builtin.assert:
that:
- noble_velero_secret_check.rc == 0
fail_msg: >-
Velero needs Secret velero/velero-cloud-credentials (key cloud). Set NOBLE_VELERO_AWS_ACCESS_KEY_ID and
NOBLE_VELERO_AWS_SECRET_ACCESS_KEY in .env, or noble_velero_aws_* extra-vars, or create the Secret manually
(see clusters/noble/bootstrap/velero/README.md).
when: noble_velero_install | default(false) | bool
- name: Optional object prefix argv for Helm
ansible.builtin.set_fact:
noble_velero_helm_prefix_argv: "{{ ['--set-string', 'configuration.backupStorageLocation[0].prefix=' ~ (noble_velero_s3_prefix | default(''))] if (noble_velero_s3_prefix | default('') | length > 0) else [] }}"
when: noble_velero_install | default(false) | bool
- name: Install Velero
ansible.builtin.command:
argv: "{{ ['helm', 'upgrade', '--install', 'velero', 'vmware-tanzu/velero', '--namespace', 'velero', '--version', noble_velero_chart_version, '-f', noble_repo_root ~ '/clusters/noble/bootstrap/velero/values.yaml', '--set-string', 'configuration.backupStorageLocation[0].bucket=' ~ noble_velero_s3_bucket, '--set-string', 'configuration.backupStorageLocation[0].config.s3Url=' ~ noble_velero_s3_url, '--set-string', 'configuration.backupStorageLocation[0].config.region=' ~ noble_velero_s3_region, '--set-string', 'configuration.backupStorageLocation[0].config.s3ForcePathStyle=' ~ noble_velero_s3_force_path_style] + (noble_velero_helm_prefix_argv | default([])) + ['--wait'] }}"
environment:
KUBECONFIG: "{{ noble_kubeconfig }}"
when: noble_velero_install | default(false) | bool
changed_when: true
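The Install Velero task builds its `helm` argv as a Jinja list so the optional prefix flags are appended only when `noble_velero_s3_prefix` is set. The same shape in plain bash arrays (values are placeholders; `helm` is never actually invoked here):

```shell
# Conditionally extend an argument vector, mirroring noble_velero_helm_prefix_argv.
bucket="noble-backups"
prefix=""   # empty → the prefix flags are omitted entirely
args=(helm upgrade --install velero vmware-tanzu/velero
      --set-string "configuration.backupStorageLocation[0].bucket=${bucket}")
if [ -n "$prefix" ]; then
  args+=(--set-string "configuration.backupStorageLocation[0].prefix=${prefix}")
fi
args+=(--wait)
echo "${args[@]}"
```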


@@ -1,14 +0,0 @@
---
proxmox_repo_debian_codename: "{{ ansible_facts['distribution_release'] | default('bookworm') }}"
proxmox_repo_disable_enterprise: true
proxmox_repo_disable_ceph_enterprise: true
proxmox_repo_enable_pve_no_subscription: true
proxmox_repo_enable_ceph_no_subscription: false
proxmox_no_subscription_notice_disable: true
proxmox_widget_toolkit_file: /usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js
# Bootstrap root SSH keys from the control machine so subsequent runs can use key auth.
proxmox_root_authorized_key_files:
- "{{ lookup('env', 'HOME') }}/.ssh/id_ed25519.pub"
- "{{ lookup('env', 'HOME') }}/.ssh/ansible.pub"


@@ -1,5 +0,0 @@
---
- name: Restart pveproxy
ansible.builtin.service:
name: pveproxy
state: restarted


@@ -1,100 +0,0 @@
---
- name: Check configured local public key files
ansible.builtin.stat:
path: "{{ item }}"
register: proxmox_root_pubkey_stats
loop: "{{ proxmox_root_authorized_key_files }}"
delegate_to: localhost
become: false
- name: Fail when a configured local public key file is missing
ansible.builtin.fail:
msg: "Configured key file does not exist on the control host: {{ item.item }}"
when: not item.stat.exists
loop: "{{ proxmox_root_pubkey_stats.results }}"
delegate_to: localhost
become: false
- name: Ensure root authorized_keys contains configured public keys
ansible.posix.authorized_key:
user: root
state: present
key: "{{ lookup('ansible.builtin.file', item) }}"
manage_dir: true
loop: "{{ proxmox_root_authorized_key_files }}"
- name: Remove enterprise repository lines from /etc/apt/sources.list
ansible.builtin.lineinfile:
path: /etc/apt/sources.list
regexp: ".*enterprise\\.proxmox\\.com.*"
state: absent
when:
- proxmox_repo_disable_enterprise | bool or proxmox_repo_disable_ceph_enterprise | bool
failed_when: false
- name: Find apt source files that contain Proxmox enterprise repositories
ansible.builtin.find:
paths: /etc/apt/sources.list.d
file_type: file
patterns:
- "*.list"
- "*.sources"
contains: "enterprise\\.proxmox\\.com"
use_regex: true
register: proxmox_enterprise_repo_files
when:
- proxmox_repo_disable_enterprise | bool or proxmox_repo_disable_ceph_enterprise | bool
- name: Remove enterprise repository lines from apt source files
ansible.builtin.lineinfile:
path: "{{ item.path }}"
regexp: ".*enterprise\\.proxmox\\.com.*"
state: absent
loop: "{{ proxmox_enterprise_repo_files.files | default([]) }}"
when:
- proxmox_repo_disable_enterprise | bool or proxmox_repo_disable_ceph_enterprise | bool
- name: Find apt source files that already contain pve-no-subscription
ansible.builtin.find:
paths: /etc/apt/sources.list.d
file_type: file
patterns:
- "*.list"
- "*.sources"
contains: "pve-no-subscription"
use_regex: false
register: proxmox_no_sub_repo_files
when: proxmox_repo_enable_pve_no_subscription | bool
- name: Ensure Proxmox no-subscription repository is configured when absent
ansible.builtin.copy:
dest: /etc/apt/sources.list.d/pve-no-subscription.list
content: "deb http://download.proxmox.com/debian/pve {{ proxmox_repo_debian_codename }} pve-no-subscription\n"
mode: "0644"
when:
- proxmox_repo_enable_pve_no_subscription | bool
- (proxmox_no_sub_repo_files.matched | default(0) | int) == 0
- name: Remove duplicate pve-no-subscription.list when another source already provides it
ansible.builtin.file:
path: /etc/apt/sources.list.d/pve-no-subscription.list
state: absent
when:
- proxmox_repo_enable_pve_no_subscription | bool
- (proxmox_no_sub_repo_files.files | default([]) | map(attribute='path') | list | select('ne', '/etc/apt/sources.list.d/pve-no-subscription.list') | list | length) > 0
- name: Ensure Ceph no-subscription repository is configured
ansible.builtin.copy:
dest: /etc/apt/sources.list.d/ceph-no-subscription.list
content: "deb http://download.proxmox.com/debian/ceph-{{ proxmox_repo_debian_codename }} {{ proxmox_repo_debian_codename }} no-subscription\n"
mode: "0644"
when: proxmox_repo_enable_ceph_no_subscription | bool
- name: Disable no-subscription pop-up in Proxmox UI
ansible.builtin.replace:
path: "{{ proxmox_widget_toolkit_file }}"
regexp: "if \\(data\\.status !== 'Active'\\)"
replace: "if (false)"
backup: true
when: proxmox_no_subscription_notice_disable | bool
notify: Restart pveproxy
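The pop-up patch above is a plain text substitution inside `proxmoxlib.js`. A dry run of the same substitution against a stand-in line, so the effect can be checked without a Proxmox host (the sample line mimics the widget-toolkit source):

```shell
# Apply the role's replace regexp to a stand-in line instead of the real
# /usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js.
line="if (data.status !== 'Active') {"
patched="$(printf '%s\n' "$line" | sed "s/if (data\.status !== 'Active')/if (false)/")"
echo "$patched"
```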


@@ -1,7 +0,0 @@
---
proxmox_cluster_enabled: true
proxmox_cluster_name: pve-cluster
proxmox_cluster_master: ""
proxmox_cluster_master_ip: ""
proxmox_cluster_force: false
proxmox_cluster_master_root_password: ""


@@ -1,63 +0,0 @@
---
- name: Skip cluster role when disabled
ansible.builtin.meta: end_host
when: not (proxmox_cluster_enabled | bool)
- name: Check whether corosync cluster config exists
ansible.builtin.stat:
path: /etc/pve/corosync.conf
register: proxmox_cluster_corosync_conf
- name: Set effective Proxmox cluster master
ansible.builtin.set_fact:
proxmox_cluster_master_effective: "{{ proxmox_cluster_master | default(groups['proxmox_hosts'][0], true) }}"
- name: Set effective Proxmox cluster master IP
ansible.builtin.set_fact:
proxmox_cluster_master_ip_effective: >-
{{
proxmox_cluster_master_ip
| default(hostvars[proxmox_cluster_master_effective].ansible_host
| default(proxmox_cluster_master_effective), true)
}}
- name: Create cluster on designated master
ansible.builtin.command:
cmd: "pvecm create {{ proxmox_cluster_name }}"
when:
- inventory_hostname == proxmox_cluster_master_effective
- not proxmox_cluster_corosync_conf.stat.exists
- name: Ensure python3-pexpect is installed for password-based cluster join
ansible.builtin.apt:
name: python3-pexpect
state: present
update_cache: true
when:
- inventory_hostname != proxmox_cluster_master_effective
- not proxmox_cluster_corosync_conf.stat.exists
- proxmox_cluster_master_root_password | length > 0
- name: Join node to existing cluster (password provided)
ansible.builtin.expect:
command: >-
pvecm add {{ proxmox_cluster_master_ip_effective }}
{% if proxmox_cluster_force | bool %}--force{% endif %}
responses:
"Please enter superuser \\(root\\) password for '.*':": "{{ proxmox_cluster_master_root_password }}"
"password:": "{{ proxmox_cluster_master_root_password }}"
no_log: true
when:
- inventory_hostname != proxmox_cluster_master_effective
- not proxmox_cluster_corosync_conf.stat.exists
- proxmox_cluster_master_root_password | length > 0
- name: Join node to existing cluster (SSH trust/no prompt)
ansible.builtin.command:
cmd: >-
pvecm add {{ proxmox_cluster_master_ip_effective }}
{% if proxmox_cluster_force | bool %}--force{% endif %}
when:
- inventory_hostname != proxmox_cluster_master_effective
- not proxmox_cluster_corosync_conf.stat.exists
- proxmox_cluster_master_root_password | length == 0


@@ -1,6 +0,0 @@
---
proxmox_upgrade_apt_cache_valid_time: 3600
proxmox_upgrade_autoremove: true
proxmox_upgrade_autoclean: true
proxmox_upgrade_reboot_if_required: true
proxmox_upgrade_reboot_timeout: 1800


@@ -1,30 +0,0 @@
---
- name: Refresh apt cache
ansible.builtin.apt:
update_cache: true
cache_valid_time: "{{ proxmox_upgrade_apt_cache_valid_time }}"
- name: Upgrade Proxmox host packages
ansible.builtin.apt:
upgrade: dist
- name: Remove orphaned packages
ansible.builtin.apt:
autoremove: "{{ proxmox_upgrade_autoremove }}"
- name: Clean apt package cache
ansible.builtin.apt:
autoclean: "{{ proxmox_upgrade_autoclean }}"
- name: Check if reboot is required
ansible.builtin.stat:
path: /var/run/reboot-required
register: proxmox_reboot_required_file
- name: Reboot when required by package upgrades
ansible.builtin.reboot:
reboot_timeout: "{{ proxmox_upgrade_reboot_timeout }}"
msg: "Reboot initiated by Ansible Proxmox maintenance playbook"
when:
- proxmox_upgrade_reboot_if_required | bool
- proxmox_reboot_required_file.stat.exists | default(false)
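The reboot gate keys off Debian's flag file, which package hooks write when an upgrade requires a restart. The manual check the role automates:

```shell
# Debian/Proxmox writes this flag file when an upgraded package wants a reboot.
if [ -f /var/run/reboot-required ]; then
  echo "reboot required"
else
  echo "no reboot required"
fi
```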


@@ -0,0 +1,26 @@
---
# Defaults for proxmox_vm role
# Action to perform: create_template, create_vm, delete_vm, backup_vm
proxmox_action: create_vm
# Common settings
storage_pool: local-lvm
vmid: 9000
# Template Creation settings
template_name: ubuntu-cloud-template
image_url: "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
image_name: "ubuntu-22.04-server-cloudimg-amd64.img"
memory: 2048
cores: 2
# Create VM settings (cloning)
new_vm_name: new-vm
target_node: "{{ inventory_hostname }}" # For cloning, usually same node
clone_full: true # Full clone (independent) vs Linked clone
# Backup settings
backup_mode: snapshot # snapshot, suspend, stop
backup_compress: zstd
backup_storage: local


@@ -0,0 +1,7 @@
---
- name: Create VM backup
  ansible.builtin.command: >
    vzdump {{ vmid }}
    --mode {{ backup_mode }}
    --compress {{ backup_compress }}
    --storage {{ backup_storage }}
  changed_when: true


@@ -0,0 +1,58 @@
---
- name: Check if template already exists
  ansible.builtin.command: "qm status {{ vmid }}"
  register: vm_status
  failed_when: false
  changed_when: false
- name: Fail if template ID exists
  ansible.builtin.fail:
    msg: "VM ID {{ vmid }} already exists. Choose a different ID or delete the existing VM."
  when: vm_status.rc == 0
- name: Download cloud image
  ansible.builtin.get_url:
    url: "{{ image_url }}"
    dest: "/tmp/{{ image_name }}"
    mode: '0644'
- name: Install libguestfs-tools (optional image-customization tooling)
  ansible.builtin.apt:
    name: libguestfs-tools
    state: present
  failed_when: false
- name: Create VM with hardware config
  ansible.builtin.command: >
    qm create {{ vmid }}
    --name "{{ template_name }}"
    --memory {{ memory }}
    --cores {{ cores }}
    --net0 virtio,bridge=vmbr0
    --scsihw virtio-scsi-pci
    --ostype l26
    --serial0 socket --vga serial0
  changed_when: true
- name: Import disk into storage pool
  ansible.builtin.command: "qm importdisk {{ vmid }} /tmp/{{ image_name }} {{ storage_pool }}"
  changed_when: true
- name: Attach disk to SCSI
  ansible.builtin.command: "qm set {{ vmid }} --scsi0 {{ storage_pool }}:vm-{{ vmid }}-disk-0"
  changed_when: true
- name: Add cloud-init drive
  ansible.builtin.command: "qm set {{ vmid }} --ide2 {{ storage_pool }}:cloudinit"
  changed_when: true
- name: Set boot order
  ansible.builtin.command: "qm set {{ vmid }} --boot c --bootdisk scsi0"
  changed_when: true
- name: Resize disk (default 10G; ignore failure if already that size)
  ansible.builtin.command: "qm resize {{ vmid }} scsi0 10G"
  failed_when: false
  changed_when: true
- name: Convert to template
  ansible.builtin.command: "qm template {{ vmid }}"
  changed_when: true
- name: Remove downloaded image
  ansible.builtin.file:
    path: "/tmp/{{ image_name }}"
    state: absent
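A hedged example of driving this role from a play — the playbook path, host group, and values are illustrative, not defined by this repo:

```yaml
# playbooks/proxmox-vm.yml (hypothetical) — main.yml dispatches on proxmox_action.
- hosts: proxmox_hosts
  become: true
  roles:
    - role: proxmox_vm
      vars:
        proxmox_action: create_template   # or create_vm / delete_vm / backup_vm
        vmid: 9000
        storage_pool: local-lvm
```

Note that `create_vm` additionally requires `new_vmid`, which has no default in the role.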


@@ -0,0 +1,11 @@
---
# new_vmid has no default — pass it explicitly (e.g. -e new_vmid=101).
- name: Clone VM from template
  ansible.builtin.command: >
    qm clone {{ vmid }} {{ new_vmid }}
    --name "{{ new_vm_name }}"
    --full {{ 1 if clone_full | bool else 0 }}
  register: clone_result
  changed_when: true
- name: Start VM (optional)
  ansible.builtin.command: "qm start {{ new_vmid }}"
  when: start_after_create | default(false) | bool
  changed_when: true


@@ -0,0 +1,7 @@
---
- name: Stop VM (force stop; ignore failure when already stopped)
  ansible.builtin.command: "qm stop {{ vmid }}"
  failed_when: false
  changed_when: true
- name: Destroy VM and purge it from backup jobs and HA config
  ansible.builtin.command: "qm destroy {{ vmid }} --purge"
  changed_when: true


@@ -0,0 +1,3 @@
---
- name: Dispatch task file based on proxmox_action
  ansible.builtin.include_tasks: "{{ proxmox_action }}.yml"


@@ -1,3 +0,0 @@
---
# Set **true** to run `talhelper genconfig -o out` under **talos/** (requires talhelper + talconfig).
noble_talos_genconfig: false

View File

@@ -1,36 +0,0 @@
---
- name: Generate Talos machine configs (talhelper genconfig)
when: noble_talos_genconfig | bool
block:
- name: Validate talconfig
ansible.builtin.command:
argv:
- talhelper
- validate
- talconfig
- talconfig.yaml
args:
chdir: "{{ noble_repo_root }}/talos"
changed_when: false
- name: Generate Talos configs (out/)
ansible.builtin.command:
argv:
- talhelper
- genconfig
- -o
- out
args:
chdir: "{{ noble_repo_root }}/talos"
changed_when: true
- name: Post genconfig — next steps
ansible.builtin.debug:
msg: >-
Configs are in talos/out/. Apply to nodes, bootstrap, and kubeconfig per talos/README.md
before running playbooks/noble.yml.
- name: Skip when noble_talos_genconfig is false
ansible.builtin.debug:
msg: "No-op: pass -e noble_talos_genconfig=true to run talhelper genconfig."
when: not (noble_talos_genconfig | bool)


@@ -1,38 +0,0 @@
---
# **noble_repo_root** and **noble_talos_dir** are set by **playbooks/talos_phase_a.yml** (repo root and **talos/**).
# Run **talhelper genconfig -o out** before apply (needs talhelper + talsecret per talos/README.md §1).
noble_talos_genconfig: true
# **auto** — probe nodes (maintenance vs joined TLS); **insecure** — always **--insecure**; **secure** — always **TALOSCONFIG** (Phase A already done / talos/README §2 B).
noble_talos_apply_mode: auto
# Skip if cluster is already bootstrapped (re-run playbook safely).
noble_talos_skip_bootstrap: false
# After **apply-config**, nodes often reboot — wait for Talos **apid** (:50000) before **bootstrap** / **kubeconfig**.
noble_talos_wait_for_apid: true
noble_talos_apid_wait_delay: 20
noble_talos_apid_wait_timeout: 900
# **talosctl bootstrap -n** — first control plane (neon).
noble_talos_bootstrap_node_ip: "192.168.50.20"
# **talosctl kubeconfig -n** (node that answers Talos/K8s for cert fetch).
noble_talos_kubeconfig_node: "192.168.50.20"
# **talosctl kubeconfig -e** — Talos endpoint (node IP before VIP is reachable; VIP when LAN works).
noble_talos_kubeconfig_endpoint: "192.168.50.20"
# After kubeconfig, patch **kubectl** server if VIP in file is unreachable (**group_vars** / same as noble.yml).
# noble_k8s_api_server_override: ""
# Must match **cluster.name** / kubeconfig cluster entry (often **noble**).
noble_talos_kubectl_cluster_name: noble
# Inventory: IP + filename under **talos/out/** — align with **talos/talconfig.yaml**.
noble_talos_nodes:
- { ip: "192.168.50.20", machine: "noble-neon.yaml" }
- { ip: "192.168.50.30", machine: "noble-argon.yaml" }
- { ip: "192.168.50.40", machine: "noble-krypton.yaml" }
- { ip: "192.168.50.10", machine: "noble-helium.yaml" }


@@ -1,209 +0,0 @@
---
# Order matches talos/README.md: genconfig → apply all nodes → bootstrap → kubeconfig.
- name: Validate talconfig and generate **out/** (talhelper genconfig)
when: noble_talos_genconfig | bool
block:
- name: talhelper validate
ansible.builtin.command:
argv:
- talhelper
- validate
- talconfig
- talconfig.yaml
args:
chdir: "{{ noble_talos_dir }}"
changed_when: false
- name: talhelper genconfig -o out
ansible.builtin.command:
argv:
- talhelper
- genconfig
- -o
- out
args:
chdir: "{{ noble_talos_dir }}"
changed_when: true
- name: Stat talos/out/talosconfig
ansible.builtin.stat:
path: "{{ noble_talos_dir }}/out/talosconfig"
register: noble_talos_talosconfig
- name: Require talos/out/talosconfig
ansible.builtin.assert:
that:
- noble_talos_talosconfig.stat.exists | default(false)
fail_msg: >-
Missing {{ noble_talos_dir }}/out/talosconfig. Run **talhelper genconfig -o out** in **talos/** (talsecret per talos/README.md §1),
or set **noble_talos_genconfig=true** on this playbook.
# Maintenance API (**--insecure**) vs joined cluster (**tls: certificate required**) — talos/README §2 A vs B.
- name: Set apply path from noble_talos_apply_mode (manual)
ansible.builtin.set_fact:
noble_talos_apply_insecure: "{{ noble_talos_apply_mode == 'insecure' }}"
when: noble_talos_apply_mode | default('auto') in ['insecure', 'secure']
- name: Probe Talos API — apply-config dry-run (insecure / maintenance)
ansible.builtin.command:
argv:
- talosctl
- apply-config
- --insecure
- -n
- "{{ noble_talos_nodes[0].ip }}"
- -f
- "{{ noble_talos_dir }}/out/{{ noble_talos_nodes[0].machine }}"
- --dry-run
register: noble_talos_probe_insecure
failed_when: false
changed_when: false
when: noble_talos_apply_mode | default('auto') == 'auto'
- name: Probe Talos API — apply-config dry-run (TLS / joined)
ansible.builtin.command:
argv:
- talosctl
- apply-config
- -n
- "{{ noble_talos_nodes[0].ip }}"
- -f
- "{{ noble_talos_dir }}/out/{{ noble_talos_nodes[0].machine }}"
- --dry-run
environment:
TALOSCONFIG: "{{ noble_talos_dir }}/out/talosconfig"
register: noble_talos_probe_secure
failed_when: false
changed_when: false
when:
- noble_talos_apply_mode | default('auto') == 'auto'
- noble_talos_probe_insecure.rc != 0
- name: Resolve apply mode — maintenance (insecure)
ansible.builtin.set_fact:
noble_talos_apply_insecure: true
when:
- noble_talos_apply_mode | default('auto') == 'auto'
- noble_talos_probe_insecure.rc == 0
- name: Resolve apply mode — joined (TALOSCONFIG, no insecure)
ansible.builtin.set_fact:
noble_talos_apply_insecure: false
when:
- noble_talos_apply_mode | default('auto') == 'auto'
- noble_talos_probe_insecure.rc != 0
- noble_talos_probe_secure.rc == 0
- name: Fail when Talos API mode cannot be determined
ansible.builtin.fail:
msg: >-
Cannot run **talosctl apply-config --dry-run** on {{ noble_talos_nodes[0].ip }}.
Insecure: rc={{ noble_talos_probe_insecure.rc }} {{ noble_talos_probe_insecure.stderr | default('') }}.
TLS: rc={{ noble_talos_probe_secure.rc | default('n/a') }} {{ noble_talos_probe_secure.stderr | default('') }}.
Check LAN to :50000, node power, and that **out/talosconfig** matches these nodes.
Override: **-e noble_talos_apply_mode=secure** (joined) or **insecure** (maintenance ISO).
when:
- noble_talos_apply_mode | default('auto') == 'auto'
- noble_talos_probe_insecure.rc != 0
- noble_talos_probe_secure is not defined or noble_talos_probe_secure.rc != 0
- name: Show resolved Talos apply-config mode
ansible.builtin.debug:
msg: >-
apply-config: {{ 'maintenance (--insecure)' if noble_talos_apply_insecure | bool else 'joined (TALOSCONFIG)' }}
(noble_talos_apply_mode={{ noble_talos_apply_mode | default('auto') }})
- name: Apply machine config to each node (first install — insecure)
ansible.builtin.command:
argv:
- talosctl
- apply-config
- --insecure
- -n
- "{{ item.ip }}"
- --file
- "{{ noble_talos_dir }}/out/{{ item.machine }}"
loop: "{{ noble_talos_nodes }}"
loop_control:
label: "{{ item.ip }}"
when: noble_talos_apply_insecure | bool
changed_when: true
- name: Apply machine config to each node (cluster already has TLS — no insecure)
ansible.builtin.command:
argv:
- talosctl
- apply-config
- -n
- "{{ item.ip }}"
- --file
- "{{ noble_talos_dir }}/out/{{ item.machine }}"
environment:
TALOSCONFIG: "{{ noble_talos_dir }}/out/talosconfig"
loop: "{{ noble_talos_nodes }}"
loop_control:
label: "{{ item.ip }}"
when: not (noble_talos_apply_insecure | bool)
changed_when: true
# apply-config triggers reboots; apid on :50000 must accept connections before talosctl bootstrap / kubeconfig.
- name: Wait for Talos machine API (apid) on bootstrap node
ansible.builtin.wait_for:
host: "{{ noble_talos_bootstrap_node_ip }}"
port: 50000
delay: "{{ noble_talos_apid_wait_delay | int }}"
timeout: "{{ noble_talos_apid_wait_timeout | int }}"
state: started
when: noble_talos_wait_for_apid | default(true) | bool
- name: Bootstrap cluster (once per cluster)
ansible.builtin.command:
argv:
- talosctl
- bootstrap
- -n
- "{{ noble_talos_bootstrap_node_ip }}"
environment:
TALOSCONFIG: "{{ noble_talos_dir }}/out/talosconfig"
register: noble_talos_bootstrap_cmd
when: not (noble_talos_skip_bootstrap | bool)
changed_when: noble_talos_bootstrap_cmd.rc == 0
failed_when: >-
noble_talos_bootstrap_cmd.rc != 0 and
('etcd data directory is not empty' not in (noble_talos_bootstrap_cmd.stderr | default('')))
- name: Write Kubernetes admin kubeconfig
ansible.builtin.command:
argv:
- talosctl
- kubeconfig
- "{{ noble_talos_kubeconfig_out }}"
- --force
- -n
- "{{ noble_talos_kubeconfig_node }}"
- -e
- "{{ noble_talos_kubeconfig_endpoint }}"
- --merge=false
environment:
TALOSCONFIG: "{{ noble_talos_dir }}/out/talosconfig"
changed_when: true
- name: Optional — set kubectl cluster server to reachable API (VIP unreachable from this host)
ansible.builtin.command:
argv:
- kubectl
- config
- set-cluster
- "{{ noble_talos_kubectl_cluster_name }}"
- --server={{ noble_k8s_api_server_override }}
- --kubeconfig={{ noble_talos_kubeconfig_out }}
when: noble_k8s_api_server_override | default('') | length > 0
changed_when: true
- name: Next — platform stack
ansible.builtin.debug:
msg: >-
Kubeconfig written to {{ noble_talos_kubeconfig_out }}.
Export KUBECONFIG={{ noble_talos_kubeconfig_out }} and run: ansible-playbook playbooks/noble.yml
(or: ansible-playbook playbooks/deploy.yml for the full pipeline).

Binary file not shown.



@@ -1,7 +0,0 @@
# Argo CD — optional applications (non-bootstrap)
**Base cluster configuration** (CNI, MetalLB, ingress, cert-manager, storage, observability stack, policy, SOPS secrets path, etc.) is installed by **`ansible/playbooks/noble.yml`** from **`clusters/noble/bootstrap/`** — not from here.
**`noble-root`** (`clusters/noble/bootstrap/argocd/root-application.yaml`) points at **`clusters/noble/apps`**. Add **`Application`** manifests (and optional **`AppProject`** definitions) under this directory only for workloads that are additive and do not subsume the core platform.
Bootstrap kustomize (namespaces, static YAML, leaf **`Application`**s) lives in **`clusters/noble/bootstrap/`** and is tracked by **`noble-bootstrap-root`** — enable automated sync for that app only after **`noble.yml`** completes (**`clusters/noble/bootstrap/argocd/README.md`** §5). Put Helm **`Application`** migrations under **`clusters/noble/bootstrap/argocd/app-of-apps/`**.


@@ -1,32 +0,0 @@
# Argo CD — optional [Homepage](https://gethomepage.dev/) dashboard (Helm from [jameswynn.github.io/helm-charts](https://jameswynn.github.io/helm-charts/)).
# Values: **`./values.yaml`** (multi-source **`$values`** ref).
#
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: homepage
namespace: argocd
finalizers:
- resources-finalizer.argocd.argoproj.io/background
spec:
project: default
sources:
- repoURL: https://jameswynn.github.io/helm-charts
chart: homepage
targetRevision: 2.1.0
helm:
releaseName: homepage
valueFiles:
- $values/clusters/noble/apps/homepage/values.yaml
- repoURL: https://gitea.pcenicni.ca/gsdavidp/home-server.git
targetRevision: HEAD
ref: values
destination:
server: https://kubernetes.default.svc
namespace: homepage
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true


@@ -1,122 +0,0 @@
# Homepage — [gethomepage/homepage](https://github.com/gethomepage/homepage) via [jameswynn/homepage](https://github.com/jameswynn/helm-charts) Helm chart.
# Ingress: Traefik + cert-manager (same pattern as `clusters/noble/bootstrap/headlamp/values.yaml`).
# Service links match **`ansible/roles/noble_landing_urls/defaults/main.yml`** (`noble_lab_ui_entries`).
# **Velero** has no in-cluster web UI — tile links to upstream docs (no **siteMonitor**).
#
# **`siteMonitor`** runs **server-side** in the Homepage pod (see `gethomepage/homepage` `siteMonitor.js`).
# Public FQDNs like **`*.apps.noble.lab.pcenicni.dev`** often do **not** resolve inside the cluster
# (split-horizon / LAN DNS only) → `ENOTFOUND` / HTTP **500** in the monitor. Use **in-cluster Service**
# URLs for **`siteMonitor`** only; **`href`** stays the human-facing ingress URL.
#
# **Prometheus widget** also resolves from the pod — use the real **Service** name (Helm may truncate to
# 63 chars — this repo's generated UI list uses **`kube-prometheus-kube-prome-prometheus`**).
# Verify: `kubectl -n monitoring get svc | grep -E 'prometheus|alertmanager|grafana'`.
#
image:
repository: ghcr.io/gethomepage/homepage
tag: v1.2.0
enableRbac: true
serviceAccount:
create: true
ingress:
main:
enabled: true
ingressClassName: traefik
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
hosts:
- host: homepage.apps.noble.lab.pcenicni.dev
paths:
- path: /
pathType: Prefix
tls:
- hosts:
- homepage.apps.noble.lab.pcenicni.dev
secretName: homepage-apps-noble-tls
env:
- name: HOMEPAGE_ALLOWED_HOSTS
value: homepage.apps.noble.lab.pcenicni.dev
config:
bookmarks: []
services:
- Noble Lab:
- Argo CD:
icon: si-argocd
href: https://argo.apps.noble.lab.pcenicni.dev
siteMonitor: http://argocd-server.argocd.svc.cluster.local:80
description: GitOps UI (sync, apps, repos)
- Grafana:
icon: si-grafana
href: https://grafana.apps.noble.lab.pcenicni.dev
siteMonitor: http://kube-prometheus-grafana.monitoring.svc.cluster.local:80
description: Dashboards, Loki explore (logs)
- Prometheus:
icon: si-prometheus
href: https://prometheus.apps.noble.lab.pcenicni.dev
siteMonitor: http://kube-prometheus-kube-prome-prometheus.monitoring.svc.cluster.local:9090
description: Prometheus UI (queries, targets) — lab; protect in production
widget:
type: prometheus
url: http://kube-prometheus-kube-prome-prometheus.monitoring.svc.cluster.local:9090
fields: ["targets_up", "targets_down", "targets_total"]
- Alertmanager:
icon: alertmanager.png
href: https://alertmanager.apps.noble.lab.pcenicni.dev
siteMonitor: http://kube-prometheus-kube-prome-alertmanager.monitoring.svc.cluster.local:9093
description: Alertmanager UI (silences, status)
- Headlamp:
icon: mdi-kubernetes
href: https://headlamp.apps.noble.lab.pcenicni.dev
siteMonitor: http://headlamp.headlamp.svc.cluster.local:80
description: Kubernetes UI (cluster resources)
- Longhorn:
icon: longhorn.png
href: https://longhorn.apps.noble.lab.pcenicni.dev
siteMonitor: http://longhorn-frontend.longhorn-system.svc.cluster.local:80
description: Storage volumes, nodes, backups
- Velero:
icon: mdi-backup-restore
href: https://velero.io/docs/
description: Cluster backups — no in-cluster web UI; use velero CLI or kubectl (docs)
widgets:
- datetime:
text_size: xl
format:
dateStyle: medium
timeStyle: short
- kubernetes:
cluster:
show: true
cpu: true
memory: true
showLabel: true
label: Cluster
nodes:
show: true
cpu: true
memory: true
showLabel: true
- search:
provider: duckduckgo
target: _blank
kubernetes:
mode: cluster
settingsString: |
title: Noble Lab
description: Homelab services — in-cluster uptime checks, cluster resources, Prometheus targets
theme: dark
color: slate
headerStyle: boxedWidgets
statusStyle: dot
iconStyle: theme
fullWidth: true
useEqualHeights: true
layout:
Noble Lab:
style: row
columns: 4


@@ -1,7 +0,0 @@
# Argo CD **noble-root** syncs this directory. Add **Application** / **AppProject** manifests only for
# optional workloads that do not replace Ansible bootstrap (CNI, ingress, storage, core observability, etc.).
# Helm value files for those apps can live in subdirectories here (for example **./homepage/values.yaml**).
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- homepage/application.yaml


@@ -1,106 +0,0 @@
# Argo CD — noble (bootstrap)
**Prerequisites:** cluster **Ready**, **Traefik** + **cert-manager**; DNS **`argo.apps.noble.lab.pcenicni.dev`** → Traefik **`192.168.50.211`** (see **`values.yaml`**).
## 1. Install
```bash
helm repo add argo https://argoproj.github.io/argo-helm
helm repo update
helm upgrade --install argocd argo/argo-cd \
--namespace argocd \
--create-namespace \
--version 9.4.17 \
-f clusters/noble/bootstrap/argocd/values.yaml \
--wait
```
**RBAC:** `values.yaml` sets **`policy.default: role:readonly`** and **`g, admin, role:admin`** so the local **`admin`** user keeps full access while future OIDC users default to read-only until you add **`policy.csv`** mappings.
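
When you do add OIDC mappings, a `policy.csv` grant is one line per group. A sketch of the values fragment (the group name `platform-team` is a placeholder, not something this repo defines):

```yaml
configs:
  rbac:
    policy.default: role:readonly
    policy.csv: |
      g, admin, role:admin
      # Hypothetical OIDC group grant — replace platform-team with a real group claim.
      g, platform-team, role:admin
```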
## 2. UI / CLI address
**HTTPS:** `https://argo.apps.noble.lab.pcenicni.dev` (Ingress via Traefik; cert from **`values.yaml`**).
```bash
kubectl get ingress -n argocd
```
Log in as **`admin`**; initial password:
```bash
kubectl -n argocd get secret argocd-initial-admin-secret \
-o jsonpath='{.data.password}' | base64 -d
echo
```
Change the password in the UI or via `argocd account update-password`.
### TLS: changing ClusterIssuer (e.g. staging → prod)
If **`helm upgrade --wait`** fails with *Secret was previously issued by `letsencrypt-staging`* (or another issuer), cert-manager will not replace the TLS Secret in place. Remove the old cert material once, then upgrade again:
```bash
kubectl -n argocd delete certificate argocd-server --ignore-not-found
kubectl -n argocd delete secret argocd-server-tls --ignore-not-found
helm upgrade --install argocd argo/argo-cd -n argocd --create-namespace \
--version 9.4.17 -f clusters/noble/bootstrap/argocd/values.yaml --wait
```
## 3. Register this repo (if private)
Use **Settings → Repositories** in the UI, or `argocd repo add` / a `Secret` of type `repository`.
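A declarative alternative is a `Secret` labeled `argocd.argoproj.io/secret-type: repository` in the `argocd` namespace. A minimal sketch (the credential values are placeholders; use a Gitea access token):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: home-server-repo
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository
stringData:
  type: git
  url: https://gitea.pcenicni.ca/gsdavidp/home-server.git
  username: git        # placeholder
  password: "<token>"  # placeholder
```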
## 4. App-of-apps (GitOps)
**Ansible** (`ansible/playbooks/noble.yml`) performs the **initial** install: Helm releases and **`kubectl apply -k clusters/noble/bootstrap`**. **Argo** then tracks the same git paths for ongoing reconciliation.
1. Edit **`root-application.yaml`** and **`bootstrap-root-application.yaml`**: set **`repoURL`** and **`targetRevision`**. The **`resources-finalizer.argocd.argoproj.io/background`** finalizer uses Argo's path-qualified form so **`kubectl apply`** does not warn about finalizer names.
2. Optional add-on apps: add **`Application`** manifests under **`clusters/noble/apps/`** (see **`clusters/noble/apps/README.md`**).
3. **Bootstrap kustomize** (namespaces, datasource, leaf **`Application`**s under **`argocd/app-of-apps/`**, etc.): **`noble-bootstrap-root`** syncs **`clusters/noble/bootstrap`**. It is created with **manual** sync only so Argo does not apply changes while **`noble.yml`** is still running.
**`ansible/playbooks/noble.yml`** (role **`noble_argocd`**) applies both roots when **`noble_argocd_apply_root_application`** / **`noble_argocd_apply_bootstrap_root_application`** are true in **`ansible/group_vars/all.yml`**.
```bash
kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
kubectl apply -f clusters/noble/bootstrap/argocd/bootstrap-root-application.yaml
```
If you migrated from older GitOps **`Application`** names, delete stale **`Application`** objects on the cluster (see **`clusters/noble/apps/README.md`**) then re-apply the roots.
## 5. After Ansible: enable automated sync for **noble-bootstrap-root**
Do this only after **`ansible-playbook playbooks/noble.yml`** has finished successfully (including **`noble_platform`** `kubectl apply -k` and any Helm stages you rely on). Until then, leave **manual** sync so Argo does not fight the playbook.
**Required steps**
1. Optional: confirm the cluster matches the git kustomize output with `kubectl kustomize clusters/noble/bootstrap | kubectl diff -f -`, or inspect resources in the UI.
2. Register the git repo in Argo if you have not already (**§3**).
3. **Refresh** the app so Argo compares **`clusters/noble/bootstrap`** to the cluster: Argo UI → **noble-bootstrap-root** → **Refresh**, or:
```bash
argocd app get noble-bootstrap-root --refresh
```
4. **Enable automated sync** (prune + self-heal), preserving **`CreateNamespace`**, using any one of:
**kubectl**
```bash
kubectl patch application noble-bootstrap-root -n argocd --type merge -p '{"spec":{"syncPolicy":{"automated":{"prune":true,"selfHeal":true},"syncOptions":["CreateNamespace=true"]}}}'
```
**argocd** CLI (logged in)
```bash
argocd app set noble-bootstrap-root --sync-policy automated --auto-prune --self-heal
```
**UI:** open **noble-bootstrap-root** → **App Details** → enable **AUTO-SYNC** (and **Prune** / **Self Heal** if shown).
5. Trigger a sync if the app does not go green immediately: **Sync** in the UI, or `argocd app sync noble-bootstrap-root`.
After this, **git** is the source of truth for everything under **`clusters/noble/bootstrap/kustomization.yaml`** (including **`argocd/app-of-apps/`**). Helm-managed platform components remain whatever Ansible last installed until you model them as Argo **`Application`**s under **`app-of-apps/`** and stop installing them from Ansible.
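As a sketch of that migration, a leaf **`Application`** under **`app-of-apps/`** would pin the same chart version Ansible installs (cert-manager shown purely as an illustrative example; keep sync manual until Ansible stops managing the release):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cert-manager
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://charts.jetstack.io
    chart: cert-manager
    targetRevision: v1.20.0   # match the version the playbook installs
  destination:
    server: https://kubernetes.default.svc
    namespace: cert-manager
  # No automated syncPolicy yet: Ansible still owns this release.
```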
## Versions
Pinned in **`values.yaml`** comments (chart **9.4.17** / Argo CD **v3.3.6** at time of writing). Bump **`--version`** when upgrading.


@@ -1,35 +0,0 @@
# App-of-apps root — apply after Argo CD is running (optional).
#
# 1. Set spec.source.repoURL (and targetRevision — **HEAD** tracks the remote default branch) to this repo.
# 2. kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
#
# **clusters/noble/apps** holds optional **Application** manifests. Core platform Helm + kustomize is
# installed by **ansible/playbooks/noble.yml** from **clusters/noble/bootstrap/**. **bootstrap-root-application.yaml**
# registers **noble-bootstrap-root** for the same kustomize tree (**manual** sync until you enable
# automation after the playbook — see **README.md** §5).
#
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: noble-root
namespace: argocd
# Path suffix satisfies Kubernetes domain-qualified finalizer guidance (avoids kubectl warning).
# Background cascade: Application deletes after resources are removed asynchronously.
# See: https://argo-cd.readthedocs.io/en/stable/user-guide/app_deletion/#about-the-deletion-finalizer
finalizers:
- resources-finalizer.argocd.argoproj.io/background
spec:
project: default
source:
repoURL: https://gitea.pcenicni.ca/gsdavidp/home-server.git
targetRevision: HEAD
path: clusters/noble/apps
destination:
server: https://kubernetes.default.svc
namespace: argocd
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true


@@ -1,51 +0,0 @@
# Argo CD — noble lab (GitOps)
#
# Chart: argo/argo-cd — pin version on the helm command (e.g. 9.4.17).
# UI/API: **Ingress** via **Traefik** at **argo.apps.noble.lab.pcenicni.dev** (TLS: cert-manager
# ClusterIssuer + **`server.insecure`** so TLS terminates at Traefik).
# DNS: **`argo.apps.noble.lab.pcenicni.dev`** → Traefik LB **192.168.50.211** (same wildcard as apps).
#
# helm repo add argo https://argoproj.github.io/argo-helm
# helm upgrade --install argocd argo/argo-cd -n argocd --create-namespace \
# --version 9.4.17 -f clusters/noble/bootstrap/argocd/values.yaml --wait
#
# Initial admin password: kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath='{.data.password}' | base64 -d
#
# Optional: kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
global:
domain: argo.apps.noble.lab.pcenicni.dev
configs:
params:
# TLS terminates at Traefik / cert-manager; Argo CD serves HTTP behind the Ingress.
server.insecure: true
# RBAC: default authenticated users to read-only; keep local **admin** as full admin.
# Ref: https://argo-cd.readthedocs.io/en/stable/operator-manual/rbac/
rbac:
policy.default: role:readonly
policy.csv: |
g, admin, role:admin
server:
certificate:
enabled: true
domain: argo.apps.noble.lab.pcenicni.dev
# If you change issuer.name, delete Certificate/Secret once so cert-manager can re-issue (see README.md).
issuer:
group: cert-manager.io
kind: ClusterIssuer
name: letsencrypt-prod
ingress:
enabled: true
ingressClassName: traefik
hostname: argo.apps.noble.lab.pcenicni.dev
tls: true
# Traefik terminates TLS; Argo serves HTTP/2 cleartext (insecure). Without h2c, UI/API can 404 or fail gRPC.
annotations:
traefik.ingress.kubernetes.io/service.serversscheme: h2c
service:
type: ClusterIP


@@ -1,53 +0,0 @@
# cert-manager — noble
**Prerequisites:** **Traefik** (ingress class **`traefik`**), DNS for **`*.apps.noble.lab.pcenicni.dev`** → Traefik LB for app traffic.
**ACME (Let's Encrypt)** uses **DNS-01** via **Cloudflare** for zone **`pcenicni.dev`**. Create an API token with **Zone → DNS → Edit** and **Zone → Zone → Read** (or use the “Edit zone DNS” template), then:
**Option A — Ansible:** copy **`.env.sample`** to **`.env`** in the repo root, set **`CLOUDFLARE_DNS_API_TOKEN`**, run **`ansible/playbooks/noble.yml`** (or **`deploy.yml`**). The **cert-manager** role creates **cloudflare-dns-api-token** from `.env` after the chart installs.
**Option B — kubectl:**
```bash
kubectl -n cert-manager create secret generic cloudflare-dns-api-token \
--from-literal=api-token='YOUR_CLOUDFLARE_API_TOKEN' \
--dry-run=client -o yaml | kubectl apply -f -
```
Without this Secret, **`ClusterIssuer`** will not complete certificate orders.
1. Create the namespace:
```bash
kubectl apply -f clusters/noble/bootstrap/cert-manager/namespace.yaml
```
2. Install the chart (CRDs included via `values.yaml`):
```bash
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--version v1.20.0 \
-f clusters/noble/bootstrap/cert-manager/values.yaml \
--wait
```
3. Optionally edit **`spec.acme.email`** in both ClusterIssuer manifests (default **`certificates@noble.lab.pcenicni.dev`**) — Let's Encrypt uses this for expiry and account notices. Do **not** use **`example.com`** (ACME rejects it).
4. Apply ClusterIssuers (staging then prod, or both):
```bash
kubectl apply -k clusters/noble/bootstrap/cert-manager
```
5. Confirm:
```bash
kubectl get clusterissuer
```
Use **`cert-manager.io/cluster-issuer: letsencrypt-staging`** on Ingresses while testing; switch to **`letsencrypt-prod`** when ready.
**HTTP-01** is not configured: if the hostname is **proxied** (orange cloud) in Cloudflare, Let's Encrypt may hit Cloudflare's edge and get **404** for `/.well-known/acme-challenge/`. DNS-01 avoids that.
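To exercise DNS-01 issuance end to end before pointing real Ingresses at **`letsencrypt-prod`**, a throwaway `Certificate` against the staging issuer works. A sketch (the hostname is an example under the lab wildcard, not a name this repo reserves):

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: issuer-smoke-test
  namespace: default
spec:
  secretName: issuer-smoke-test-tls
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-staging
  dnsNames:
    - smoke-test.apps.noble.lab.pcenicni.dev
```

`kubectl describe certificate issuer-smoke-test` shows order progress; delete the Certificate and its Secret when done.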


@@ -1,23 +0,0 @@
# Let's Encrypt production — trusted certificates; respect rate limits.
# Prefer a real mailbox for expiry notices; this domain is accepted by LE (edit if needed).
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
email: certificates@noble.lab.pcenicni.dev
server: https://acme-v02.api.letsencrypt.org/directory
privateKeySecretRef:
name: letsencrypt-prod-account-key
solvers:
# DNS-01 — works when public HTTP to Traefik is wrong (e.g. hostname proxied through Cloudflare
# returns 404 for /.well-known/acme-challenge). Requires Secret cloudflare-dns-api-token in cert-manager.
- dns01:
cloudflare:
apiTokenSecretRef:
name: cloudflare-dns-api-token
key: api-token
selector:
dnsZones:
- pcenicni.dev


@@ -1,21 +0,0 @@
# Let's Encrypt staging — use for tests (untrusted issuer in browsers).
# Prefer a real mailbox for expiry notices; this domain is accepted by LE (edit if needed).
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-staging
spec:
acme:
email: certificates@noble.lab.pcenicni.dev
server: https://acme-staging-v02.api.letsencrypt.org/directory
privateKeySecretRef:
name: letsencrypt-staging-account-key
solvers:
- dns01:
cloudflare:
apiTokenSecretRef:
name: cloudflare-dns-api-token
key: api-token
selector:
dnsZones:
- pcenicni.dev


@@ -1,5 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- clusterissuer-letsencrypt-staging.yaml
- clusterissuer-letsencrypt-prod.yaml

Some files were not shown because too many files have changed in this diff.