Compare commits
2 commits: main...ed2df96d10

| Author | SHA1 | Date |
|---|---|---|
|  | ed2df96d10 |  |
|  | f6c44024a2 |  |
.env.sample (19 lines)
@@ -1,19 +0,0 @@
# Copy to **.env** in this repository root (`.env` is gitignored).
# Ansible **noble_cert_manager** role sources `.env` after cert-manager Helm install and creates
# **cert-manager/cloudflare-dns-api-token** when **CLOUDFLARE_DNS_API_TOKEN** is set.
#
# Cloudflare: Zone → DNS → Edit + Zone → Read for **pcenicni.dev** (see clusters/noble/bootstrap/cert-manager/README.md).
CLOUDFLARE_DNS_API_TOKEN=

# --- Optional: other deploy-time values (documented for manual use or future automation) ---

# Pangolin / Newt — with **noble_newt_install=true**, Ansible creates **newt/newt-pangolin-auth** when all are set (see clusters/noble/bootstrap/newt/README.md).
PANGOLIN_ENDPOINT=
NEWT_ID=
NEWT_SECRET=

# Velero — when **noble_velero_install=true**, set bucket + S3 API URL and credentials (see clusters/noble/bootstrap/velero/README.md).
NOBLE_VELERO_S3_BUCKET=
NOBLE_VELERO_S3_URL=
NOBLE_VELERO_AWS_ACCESS_KEY_ID=
NOBLE_VELERO_AWS_SECRET_ACCESS_KEY=
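As a hedged sketch of that flow (the sample file is recreated inline here so the snippet is self-contained; `example-token` is a placeholder, not a real credential):

```shell
# Work in a throwaway directory and recreate a one-line stand-in for .env.sample.
workdir=$(mktemp -d)
cd "$workdir"
printf 'CLOUDFLARE_DNS_API_TOKEN=\n' > .env.sample

# Copy to .env (gitignored in the repo) and fill in the token.
cp .env.sample .env
sed -i 's/^CLOUDFLARE_DNS_API_TOKEN=$/CLOUDFLARE_DNS_API_TOKEN=example-token/' .env

# Source it the way the noble_cert_manager role sources .env, then check the value landed.
. ./.env
[ -n "$CLOUDFLARE_DNS_API_TOKEN" ] && echo "token set"
```

If the final check prints nothing, the role would create an empty `cloudflare-dns-api-token` Secret, so verifying before running `noble.yml` is cheap insurance.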
.gitignore (vendored, 11 lines)
@@ -1,11 +0,0 @@
ansible/inventory/hosts.ini
# Talos generated
talos/out/
talos/kubeconfig

# Local secrets
age-key.txt
.env

# Generated by ansible noble_landing_urls
ansible/output/noble-lab-ui-urls.md
@@ -1,7 +0,0 @@
# Mozilla SOPS — encrypt/decrypt Kubernetes Secret manifests under clusters/noble/secrets/
# Generate a key: age-keygen -o age-key.txt (age-key.txt is gitignored)
# Add the printed public key below (one recipient per line is supported).
creation_rules:
  - path_regex: clusters/noble/secrets/.*\.yaml$
    age: >-
      age1juym5p3ez3dkt0dxlznydgfgqvaujfnyk9hpdsssf50hsxeh3p4sjpf3gn
@@ -180,12 +180,6 @@ Shared services used across multiple applications.

**Configuration:** Requires Pangolin endpoint URL, Newt ID, and Newt secret.

### versitygw/ (`komodo/s3/versitygw/`)

- **[Versity S3 Gateway](https://github.com/versity/versitygw)** — S3 API on port **10000** by default; optional **WebUI** on **8080** (not the same listener—enable `VERSITYGW_WEBUI_PORT` / `VGW_WEBUI_GATEWAYS` per `.env.sample`). Behind **Pangolin**, expose the API and WebUI separately (or you will see **404** browsing the API URL).

**Configuration:** Set either `ROOT_ACCESS_KEY` / `ROOT_SECRET_KEY` or `ROOT_ACCESS_KEY_ID` / `ROOT_SECRET_ACCESS_KEY`. Optional `VERSITYGW_PORT`. Compose uses `${VAR}` interpolation so credentials work with Komodo’s `docker compose --env-file <run_directory>/.env` (avoid `env_file:` in the service when `run_directory` is not the same folder as `compose.yaml`, or the written `.env` will not be found).
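A hypothetical compose fragment illustrating that interpolation pattern (the image tag, variable names, and port mapping here are assumptions for illustration, not the repo's actual compose.yaml):

```yaml
services:
  versitygw:
    image: versity/versitygw:latest
    # ${VAR} values are interpolated by `docker compose --env-file <run_directory>/.env`,
    # so the service needs no env_file: entry (which would resolve relative to compose.yaml).
    environment:
      ROOT_ACCESS_KEY: "${ROOT_ACCESS_KEY}"
      ROOT_SECRET_KEY: "${ROOT_SECRET_KEY}"
    ports:
      - "${VERSITYGW_PORT:-10000}:10000"
```

Interpolation happens at `docker compose` invocation time, which is why the credentials follow the env-file Komodo writes rather than a file path baked into the service.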
---

## 📊 Monitoring (`komodo/monitor/`)
ansible/.gitignore (vendored, 1 line)
@@ -1 +0,0 @@
.ansible-tmp/
@@ -1,160 +1,84 @@
# Ansible — noble cluster
# Home Server Ansible Configuration

Automates [`talos/CLUSTER-BUILD.md`](../talos/CLUSTER-BUILD.md): optional **Talos Phase A** (genconfig → apply → bootstrap → kubeconfig), then **Phase B+** (CNI → add-ons → ingress → Argo CD → Kyverno → observability, etc.). **Argo CD** does not reconcile core charts — optional GitOps starts from an empty [`clusters/noble/apps/kustomization.yaml`](../clusters/noble/apps/kustomization.yaml).
This directory contains Ansible playbooks for managing the Proxmox home server environment.

## Order of operations
## Directory Structure

1. **From `talos/`:** `talhelper gensecret` / `talsecret` as in [`talos/README.md`](../talos/README.md) §1 (if not already done).
2. **Talos Phase A (automated):** run [`playbooks/talos_phase_a.yml`](playbooks/talos_phase_a.yml) **or** the full pipeline [`playbooks/deploy.yml`](playbooks/deploy.yml). This runs **`talhelper genconfig -o out`**, **`talosctl apply-config`** on each node, **`talosctl bootstrap`**, and **`talosctl kubeconfig`** → **`talos/kubeconfig`**.
3. **Platform stack:** [`playbooks/noble.yml`](playbooks/noble.yml) (included at the end of **`deploy.yml`**).
- `inventory/`: Contains the inventory file `hosts.ini` where you define your servers.
- `playbooks/`: Contains the actual Ansible playbooks.
- `ansible.cfg`: Local Ansible configuration.
- `requirements.yml`: List of Ansible collections required.

Your workstation must be able to reach **node IPs on the lab LAN** (Talos API **:50000** for `talosctl`, Kubernetes **:6443** for `kubectl` / Helm). If `kubectl` cannot reach the VIP (`192.168.50.230`), use `-e 'noble_k8s_api_server_override=https://<control-plane-ip>:6443'` on **`noble.yml`** (see `group_vars/all.yml`).

**One-shot full deploy** (after nodes are booted and reachable):
## Setup

1. **Install Requirements**:
```bash
cd ansible
ansible-playbook playbooks/deploy.yml
ansible-galaxy install -r requirements.yml
```

## Deploy secrets (`.env`)
2. **Configure Inventory**:
Edit `inventory/hosts.ini` and update the following:
- `ansible_host`: The IP address of your Proxmox node.
- `ansible_user`: The SSH user (usually root).
- `proxmox_api_*`: Variables if you plan to use API-based modules in the future.

Copy **`.env.sample`** to **`.env`** at the repository root (`.env` is gitignored). At minimum set **`CLOUDFLARE_DNS_API_TOKEN`** for cert-manager DNS-01. The **cert-manager** role applies it automatically during **`noble.yml`**. See **`.env.sample`** for optional placeholders (e.g. Newt/Pangolin).
*Note: Ensure you have SSH key access to your Proxmox node for passwordless login, or uncomment `ansible_ssh_pass`.*

## Prerequisites
## Available Playbooks

- `talosctl` (matches node Talos version), `talhelper`, `helm`, `kubectl`.
- **SOPS secrets:** `sops` and `age` on the control host if you use **`clusters/noble/secrets/`** with **`age-key.txt`** (see **`clusters/noble/secrets/README.md`**).
- **Phase A:** same LAN/VPN as nodes so **Talos :50000** and **Kubernetes :6443** are reachable (see [`talos/README.md`](../talos/README.md) §3).
- **noble.yml:** bootstrapped cluster and **`talos/kubeconfig`** (or `KUBECONFIG`).
### Create Ubuntu Cloud Template (`playbooks/create_ubuntu_template.yml`)

## Playbooks
This playbook downloads a generic Ubuntu 22.04 Cloud Image and converts it into a Proxmox VM Template.

| Playbook | Purpose |
|----------|---------|
| [`playbooks/deploy.yml`](playbooks/deploy.yml) | **Talos Phase A** then **`noble.yml`** (full automation). |
| [`playbooks/talos_phase_a.yml`](playbooks/talos_phase_a.yml) | `genconfig` → `apply-config` → `bootstrap` → `kubeconfig` only. |
| [`playbooks/noble.yml`](playbooks/noble.yml) | Helm + `kubectl` platform (after Phase A). |
| [`playbooks/post_deploy.yml`](playbooks/post_deploy.yml) | SOPS reminders and optional Argo root Application note. |
| [`playbooks/talos_bootstrap.yml`](playbooks/talos_bootstrap.yml) | **`talhelper genconfig` only** (legacy shortcut; prefer **`talos_phase_a.yml`**). |
| [`playbooks/debian_harden.yml`](playbooks/debian_harden.yml) | Baseline hardening for Debian servers (SSH/sysctl/fail2ban/unattended-upgrades). |
| [`playbooks/debian_maintenance.yml`](playbooks/debian_maintenance.yml) | Debian maintenance run (apt upgrades, autoremove/autoclean, reboot when required). |
| [`playbooks/debian_rotate_ssh_keys.yml`](playbooks/debian_rotate_ssh_keys.yml) | Rotate managed users' `authorized_keys`. |
| [`playbooks/debian_ops.yml`](playbooks/debian_ops.yml) | Convenience pipeline: harden then maintenance for Debian servers. |
| [`playbooks/proxmox_prepare.yml`](playbooks/proxmox_prepare.yml) | Configure Proxmox community repos and disable no-subscription UI warning. |
| [`playbooks/proxmox_upgrade.yml`](playbooks/proxmox_upgrade.yml) | Proxmox maintenance run (apt dist-upgrade, cleanup, reboot when required). |
| [`playbooks/proxmox_cluster.yml`](playbooks/proxmox_cluster.yml) | Create a Proxmox cluster on the master and join additional hosts. |
| [`playbooks/proxmox_ops.yml`](playbooks/proxmox_ops.yml) | Convenience pipeline: prepare, upgrade, then cluster Proxmox hosts. |

**Usage:**

```bash
cd ansible
export KUBECONFIG=/absolute/path/to/home-server/talos/kubeconfig

# noble.yml only — if VIP is unreachable from this host:
# ansible-playbook playbooks/noble.yml -e 'noble_k8s_api_server_override=https://192.168.50.20:6443'

ansible-playbook playbooks/noble.yml
ansible-playbook playbooks/post_deploy.yml
# Run the playbook
ansible-playbook playbooks/create_ubuntu_template.yml
```

### Talos Phase A variables (role `talos_phase_a` defaults)
**Variables:**
You can override variables at runtime or by editing the playbook:

Override with `-e` when needed, e.g. **`-e noble_talos_skip_bootstrap=true`** if etcd is already initialized.

| Variable | Default | Meaning |
|----------|---------|---------|
| `noble_talos_genconfig` | `true` | Run **`talhelper genconfig -o out`** first. |
| `noble_talos_apply_mode` | `auto` | **`auto`** — **`talosctl apply-config --dry-run`** on the first node picks maintenance (**`--insecure`**) vs joined (**`TALOSCONFIG`**). **`insecure`** / **`secure`** force talos/README §2 A or B. |
| `noble_talos_skip_bootstrap` | `false` | Skip **`talosctl bootstrap`**. If etcd is **already** initialized, bootstrap is treated as a no-op (same as **`talosctl`** “etcd data directory is not empty”). |
| `noble_talos_apid_wait_delay` / `noble_talos_apid_wait_timeout` | `20` / `900` | Seconds to wait for **apid :50000** on the bootstrap node after **apply-config** (nodes reboot). Increase if bootstrap hits **connection refused** to `:50000`. |
| `noble_talos_nodes` | neon/argon/krypton/helium | IP + **`out/*.yaml`** filename — align with **`talos/talconfig.yaml`**. |

### Tags (partial runs)
- `template_id`: Default `9000`
- `template_name`: Default `ubuntu-2204-cloud`
- `storage_pool`: Default `local-lvm`

Example overriding variables:
```bash
ansible-playbook playbooks/noble.yml --tags cilium,metallb
ansible-playbook playbooks/noble.yml --skip-tags newt
ansible-playbook playbooks/noble.yml --tags velero -e noble_velero_install=true -e noble_velero_s3_bucket=... -e noble_velero_s3_url=...
ansible-playbook playbooks/create_ubuntu_template.yml -e "template_id=9001 template_name=my-custom-template"
```

### Variables — `group_vars/all.yml` and role defaults
### Manage VM Playbook (`playbooks/manage_vm.yml`)

- **`group_vars/all.yml`:** **`noble_newt_install`**, **`noble_velero_install`**, **`noble_cert_manager_require_cloudflare_secret`**, **`noble_argocd_apply_root_application`**, **`noble_argocd_apply_bootstrap_root_application`**, **`noble_k8s_api_server_override`**, **`noble_k8s_api_server_auto_fallback`**, **`noble_k8s_api_server_fallback`**, **`noble_skip_k8s_health_check`**
- **`roles/noble_platform/defaults/main.yml`:** **`noble_apply_sops_secrets`**, **`noble_sops_age_key_file`** (SOPS secrets under **`clusters/noble/secrets/`**)
This unified playbook allows you to manage VMs (create from template, delete, backup, create template) across your Proxmox hosts.

## Roles
**Usage:**

| Role | Contents |
|------|----------|
| `talos_phase_a` | Talos genconfig, apply-config, bootstrap, kubeconfig |
| `helm_repos` | `helm repo add` / `update` |
| `noble_*` | Cilium, CSI Volume Snapshot CRDs + controller, metrics-server, Longhorn, MetalLB (20m Helm wait), kube-vip, Traefik, cert-manager, Newt, Argo CD, Kyverno, platform stack, Velero (optional) |
| `noble_landing_urls` | Writes **`ansible/output/noble-lab-ui-urls.md`** — URLs, service names, and (optional) Argo/Grafana passwords from Secrets |
| `noble_post_deploy` | Post-install reminders |
| `talos_bootstrap` | Genconfig-only (used by older playbook) |
| `debian_baseline_hardening` | Baseline Debian hardening (SSH policy, sysctl profile, fail2ban, unattended upgrades) |
| `debian_maintenance` | Routine Debian maintenance tasks (updates, cleanup, reboot-on-required) |
| `debian_ssh_key_rotation` | Declarative `authorized_keys` rotation for server users |
| `proxmox_baseline` | Proxmox repo prep (community repos) and no-subscription warning suppression |
| `proxmox_maintenance` | Proxmox package maintenance (dist-upgrade, cleanup, reboot-on-required) |
| `proxmox_cluster` | Proxmox cluster bootstrap/join automation using `pvecm` |

## Debian server ops quick start

These playbooks are separate from the Talos/noble flow and target hosts in `debian_servers`.

1. Copy `inventory/debian.example.yml` to `inventory/debian.yml` and update hosts/users.
2. Update `group_vars/debian_servers.yml` with your allowed SSH users and real public keys.
3. Run with the Debian inventory:
The playbook target defaults to the `proxmox` group, but you should usually limit it to a specific host via the `target_host` variable or a `-l` limit.

1. **Create a New Template**:
```bash
cd ansible
ansible-playbook -i inventory/debian.yml playbooks/debian_harden.yml
ansible-playbook -i inventory/debian.yml playbooks/debian_rotate_ssh_keys.yml
ansible-playbook -i inventory/debian.yml playbooks/debian_maintenance.yml
ansible-playbook playbooks/manage_vm.yml -e "proxmox_action=create_template vmid=9003 template_name=my-ubuntu-template"
```

Or run the combined maintenance pipeline:

2. **Create a VM from Template**:
```bash
cd ansible
ansible-playbook -i inventory/debian.yml playbooks/debian_ops.yml
ansible-playbook playbooks/manage_vm.yml -e "proxmox_action=create_vm vmid=9002 new_vmid=105 new_vm_name=my-new-vm"
```

## Proxmox host + cluster quick start

These playbooks are separate from the Talos/noble flow and target hosts in `proxmox_hosts`.

1. Copy `inventory/proxmox.example.yml` to `inventory/proxmox.yml` and update hosts/users.
2. Update `group_vars/proxmox_hosts.yml` with your cluster name (`proxmox_cluster_name`), chosen cluster master, and root public key file paths to install.
3. First run (no SSH keys yet): use `--ask-pass` **or** set `ansible_password` (prefer Ansible Vault). Keep `ansible_ssh_common_args: "-o StrictHostKeyChecking=accept-new"` in inventory for first-contact hosts.
4. Run prepare first to install your public keys on each host, then continue:

3. **Delete a VM**:
```bash
cd ansible
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_prepare.yml --ask-pass
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_upgrade.yml
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_cluster.yml
ansible-playbook playbooks/manage_vm.yml -e "proxmox_action=delete_vm vmid=105"
```

After `proxmox_prepare.yml` finishes, SSH key auth should work for root (keys from `proxmox_root_authorized_key_files`), so `--ask-pass` is usually no longer needed.

If `pvecm add` still prompts for the master root password during join, set `proxmox_cluster_master_root_password` (prefer Vault) to run the join non-interactively.

Changing `proxmox_cluster_name` only affects new cluster creation; it does not rename an already-created cluster.

Or run the full Proxmox pipeline:

4. **Backup a VM**:
```bash
cd ansible
ansible-playbook -i inventory/proxmox.yml playbooks/proxmox_ops.yml
ansible-playbook playbooks/manage_vm.yml -e "proxmox_action=backup_vm vmid=105"
```

## Migrating from Argo-managed `noble-platform`
**Variables:**
- `proxmox_action`: One of `create_template`, `create_vm`, `delete_vm`, `backup_vm` (Default: `create_vm`)
- `target_host`: The host to run on (Default: `proxmox` group). Example: `-e "target_host=mercury"`

```bash
kubectl delete application -n argocd noble-platform noble-kyverno noble-kyverno-policies --ignore-not-found
kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
```

Then run `playbooks/noble.yml` so Helm state matches git values.
*See `roles/proxmox_vm/defaults/main.yml` for all available configuration options.*

@@ -1,10 +1,5 @@
[defaults]
inventory = inventory/localhost.yml
roles_path = roles
inventory = inventory/hosts.ini
host_key_checking = False
retry_files_enabled = False
stdout_callback = default
callback_result_format = yaml
local_tmp = .ansible-tmp

[privilege_escalation]
become = False
interpreter_python = auto_silent

@@ -1,28 +0,0 @@
---
# noble_repo_root / noble_kubeconfig are set in playbooks (use **playbook_dir** magic var).

# When kubeconfig points at the API VIP but this workstation cannot reach the lab LAN (VPN off, etc.),
# set a reachable control-plane URL — same as: kubectl config set-cluster noble --server=https://<cp-ip>:6443
# Example: ansible-playbook playbooks/noble.yml -e 'noble_k8s_api_server_override=https://192.168.50.20:6443'
noble_k8s_api_server_override: ""

# When /healthz fails with **network unreachable** to the VIP and **override** is empty, retry using this URL (neon).
noble_k8s_api_server_auto_fallback: true
noble_k8s_api_server_fallback: "https://192.168.50.20:6443"

# Only if you must skip the kubectl /healthz preflight (not recommended).
noble_skip_k8s_health_check: false

# Pangolin / Newt — set true only after newt-pangolin-auth Secret exists (SOPS: clusters/noble/secrets/ or imperative — see clusters/noble/bootstrap/newt/README.md)
noble_newt_install: false

# cert-manager needs Secret cloudflare-dns-api-token in cert-manager namespace before ClusterIssuers work
noble_cert_manager_require_cloudflare_secret: true

# Velero — set **noble_velero_install: true** plus S3 bucket/URL (and credentials — see clusters/noble/bootstrap/velero/README.md)
noble_velero_install: false

# Argo CD — apply app-of-apps root Application (clusters/noble/bootstrap/argocd/root-application.yaml). Set false to skip.
noble_argocd_apply_root_application: true
# Bootstrap kustomize in Argo (**noble-bootstrap-root** → **clusters/noble/bootstrap**). Applied with manual sync; enable automation after **noble.yml** (see **clusters/noble/bootstrap/argocd/README.md** §5).
noble_argocd_apply_bootstrap_root_application: true
@@ -1,12 +0,0 @@
---
# Hardened SSH settings
debian_baseline_ssh_allow_users:
  - admin

# Example key rotation entries. Replace with your real users and keys.
debian_ssh_rotation_users:
  - name: admin
    home: /home/admin
    state: present
    keys:
      - "ssh-ed25519 AAAAEXAMPLE_REPLACE_ME admin@workstation"
@@ -1,37 +0,0 @@
---
# Proxmox repositories
proxmox_repo_debian_codename: trixie
proxmox_repo_disable_enterprise: true
proxmox_repo_disable_ceph_enterprise: true
proxmox_repo_enable_pve_no_subscription: true
proxmox_repo_enable_ceph_no_subscription: true

# Suppress "No valid subscription" warning in UI
proxmox_no_subscription_notice_disable: true

# Public keys to install for root on each Proxmox host.
proxmox_root_authorized_key_files:
  - "{{ lookup('env', 'HOME') }}/.ssh/id_ed25519.pub"
  - "{{ lookup('env', 'HOME') }}/.ssh/ansible.pub"

# Package upgrade/reboot policy
proxmox_upgrade_apt_cache_valid_time: 3600
proxmox_upgrade_autoremove: true
proxmox_upgrade_autoclean: true
proxmox_upgrade_reboot_if_required: true
proxmox_upgrade_reboot_timeout: 1800

# Cluster settings
proxmox_cluster_enabled: true
proxmox_cluster_name: atomic-hub

# Bootstrap host name from inventory (first host by default if empty)
proxmox_cluster_master: ""

# Optional explicit IP/FQDN for joining; leave empty to use ansible_host of master
proxmox_cluster_master_ip: ""
proxmox_cluster_force: false

# Optional: use only for first cluster joins when inter-node SSH trust is not established.
# Prefer storing with Ansible Vault if you set this.
proxmox_cluster_master_root_password: "Hemroid8"
@@ -1,11 +0,0 @@
---
all:
  children:
    debian_servers:
      hosts:
        debian-01:
          ansible_host: 192.168.50.101
          ansible_user: admin
        debian-02:
          ansible_host: 192.168.50.102
          ansible_user: admin
ansible/inventory/hosts.ini (new file, 14 lines)
@@ -0,0 +1,14 @@
[proxmox]
# Replace pve1 with your proxmox node hostname or IP
mercury ansible_host=192.168.50.100 ansible_user=root

[proxmox:vars]
# If using password auth (ssh key recommended though):
# ansible_ssh_pass=yourpassword

# Connection variables for the proxmox modules (api)
proxmox_api_user=root@pam
proxmox_api_password=Hemroid8
proxmox_api_host=192.168.50.100
# proxmox_api_token_id=
# proxmox_api_token_secret=
@@ -1,6 +0,0 @@
---
all:
  hosts:
    localhost:
      ansible_connection: local
      ansible_python_interpreter: "{{ ansible_playbook_python }}"
@@ -1,24 +0,0 @@
---
all:
  children:
    proxmox_hosts:
      vars:
        ansible_ssh_common_args: "-o StrictHostKeyChecking=accept-new"
      hosts:
        helium:
          ansible_host: 192.168.1.100
          ansible_user: root
          # First run without SSH keys:
          # ansible_password: "{{ vault_proxmox_root_password }}"
        neon:
          ansible_host: 192.168.1.90
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"
        argon:
          ansible_host: 192.168.1.80
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"
        krypton:
          ansible_host: 192.168.1.70
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"
@@ -1,24 +0,0 @@
---
all:
  children:
    proxmox_hosts:
      vars:
        ansible_ssh_common_args: "-o StrictHostKeyChecking=accept-new"
      hosts:
        helium:
          ansible_host: 192.168.1.100
          ansible_user: root
          # First run without SSH keys:
          # ansible_password: "{{ vault_proxmox_root_password }}"
        neon:
          ansible_host: 192.168.1.90
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"
        argon:
          ansible_host: 192.168.1.80
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"
        krypton:
          ansible_host: 192.168.1.70
          ansible_user: root
          # ansible_password: "{{ vault_proxmox_root_password }}"
ansible/playbooks/create_ubuntu_template.yml (new file, 72 lines)
@@ -0,0 +1,72 @@
---
- name: Create Ubuntu Cloud-Init Template
  hosts: proxmox
  become: yes
  vars:
    template_id: 9000
    template_name: ubuntu-2204-cloud
    # URL for Ubuntu 22.04 Cloud Image (Jammy)
    image_url: "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
    image_name: "ubuntu-22.04-server-cloudimg-amd64.img"
    storage_pool: "local-lvm"
    memory: 2048
    cores: 2

  tasks:
    - name: Check if template already exists
      command: "qm status {{ template_id }}"
      register: vm_status
      failed_when: false
      changed_when: false

    - name: Fail if template ID exists
      fail:
        msg: "VM ID {{ template_id }} already exists. Please choose a different ID or delete the existing VM."
      when: vm_status.rc == 0

    - name: Download Ubuntu Cloud Image
      get_url:
        url: "{{ image_url }}"
        dest: "/tmp/{{ image_name }}"
        mode: '0644'

    - name: Install libguestfs-tools (required for virt-customize if needed, optional)
      apt:
        name: libguestfs-tools
        state: present
      ignore_errors: yes

    - name: Create VM with hardware config
      command: >
        qm create {{ template_id }}
        --name "{{ template_name }}"
        --memory {{ memory }}
        --cores {{ cores }}
        --net0 virtio,bridge=vmbr0
        --scsihw virtio-scsi-pci
        --ostype l26
        --serial0 socket --vga serial0

    - name: Import Disk
      command: "qm importdisk {{ template_id }} /tmp/{{ image_name }} {{ storage_pool }}"

    - name: Attach Disk to SCSI
      command: "qm set {{ template_id }} --scsi0 {{ storage_pool }}:vm-{{ template_id }}-disk-0"

    - name: Add Cloud-Init Drive
      command: "qm set {{ template_id }} --ide2 {{ storage_pool }}:cloudinit"

    - name: Set Boot Order
      command: "qm set {{ template_id }} --boot c --bootdisk scsi0"

    - name: Resize Disk (Optional, e.g. 10G)
      command: "qm resize {{ template_id }} scsi0 10G"
      ignore_errors: yes

    - name: Convert to Template
      command: "qm template {{ template_id }}"

    - name: Remove Downloaded Image
      file:
        path: "/tmp/{{ image_name }}"
        state: absent
@@ -1,8 +0,0 @@
---
- name: Debian server baseline hardening
  hosts: debian_servers
  become: true
  gather_facts: true
  roles:
    - role: debian_baseline_hardening
      tags: [hardening, baseline]
@@ -1,8 +0,0 @@
---
- name: Debian maintenance (updates + reboot handling)
  hosts: debian_servers
  become: true
  gather_facts: true
  roles:
    - role: debian_maintenance
      tags: [maintenance, updates]
@@ -1,3 +0,0 @@
---
- import_playbook: debian_harden.yml
- import_playbook: debian_maintenance.yml
@@ -1,8 +0,0 @@
---
- name: Debian SSH key rotation
  hosts: debian_servers
  become: true
  gather_facts: false
  roles:
    - role: debian_ssh_key_rotation
      tags: [ssh, ssh_keys, rotation]
@@ -1,5 +0,0 @@
---
# Full bring-up: Talos Phase A then platform stack.
# Run from **ansible/**: ansible-playbook playbooks/deploy.yml
- import_playbook: talos_phase_a.yml
- import_playbook: noble.yml
ansible/playbooks/manage_vm.yml (new file, 6 lines)
@@ -0,0 +1,6 @@
---
- name: Manage Proxmox VMs
  hosts: "{{ target_host | default('proxmox') }}"
  become: yes
  roles:
    - proxmox_vm
@@ -1,232 +0,0 @@
---
# Full platform install — **after** Talos bootstrap (`talosctl bootstrap` + working kubeconfig).
# Do not run until `kubectl get --raw /healthz` returns ok (see talos/README.md §3, CLUSTER-BUILD Phase A).
# Run from repo **ansible/** directory: ansible-playbook playbooks/noble.yml
#
# Tags: repos, cilium, csi_snapshot, metrics, longhorn, metallb, kube_vip, traefik, cert_manager, newt,
#       argocd, kyverno, kyverno_policies, platform, velero, all (default)
- name: Noble cluster — platform stack (Ansible-managed)
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    noble_repo_root: "{{ playbook_dir | dirname | dirname }}"
    noble_kubeconfig: "{{ lookup('env', 'KUBECONFIG') | default(noble_repo_root + '/talos/kubeconfig', true) }}"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  pre_tasks:
    # Helm/kubectl use $KUBECONFIG; a missing file yields "connection refused" to localhost:8080.
    - name: Stat kubeconfig path from KUBECONFIG or default
      ansible.builtin.stat:
        path: "{{ noble_kubeconfig }}"
      register: noble_kubeconfig_stat
      tags: [always]

    - name: Fall back to repo talos/kubeconfig when $KUBECONFIG is unset or not a file
      ansible.builtin.set_fact:
        noble_kubeconfig: "{{ noble_repo_root }}/talos/kubeconfig"
      when: not noble_kubeconfig_stat.stat.exists | default(false)
      tags: [always]

    - name: Stat kubeconfig after fallback
      ansible.builtin.stat:
        path: "{{ noble_kubeconfig }}"
      register: noble_kubeconfig_stat2
      tags: [always]

    - name: Require a real kubeconfig file
      ansible.builtin.assert:
        that:
          - noble_kubeconfig_stat2.stat.exists | default(false)
          - noble_kubeconfig_stat2.stat.isreg | default(false)
        fail_msg: >-
          No kubeconfig file at {{ noble_kubeconfig }}.
          Fix: export KUBECONFIG=/actual/path/from/talosctl-kubeconfig (see talos/README.md),
          or copy the admin kubeconfig to {{ noble_repo_root }}/talos/kubeconfig.
          Do not use documentation placeholders as the path.
      tags: [always]

    - name: Ensure temp dir for kubeconfig API override
      ansible.builtin.file:
        path: "{{ noble_repo_root }}/ansible/.ansible-tmp"
        state: directory
        mode: "0700"
      when: noble_k8s_api_server_override | default('') | length > 0
      tags: [always]

    - name: Copy kubeconfig for API server override (original file unchanged)
      ansible.builtin.copy:
        src: "{{ noble_kubeconfig }}"
        dest: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.patched"
        mode: "0600"
      when: noble_k8s_api_server_override | default('') | length > 0
      tags: [always]

    - name: Resolve current cluster name (for set-cluster)
      ansible.builtin.command:
        argv:
          - kubectl
          - config
          - view
          - --minify
          - -o
          - jsonpath={.clusters[0].name}
      environment:
        KUBECONFIG: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.patched"
      register: noble_k8s_cluster_name
      changed_when: false
      when: noble_k8s_api_server_override | default('') | length > 0
      tags: [always]

    - name: Point patched kubeconfig at reachable apiserver
      ansible.builtin.command:
        argv:
          - kubectl
          - config
          - set-cluster
          - "{{ noble_k8s_cluster_name.stdout }}"
          - --server={{ noble_k8s_api_server_override }}
          - --kubeconfig={{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.patched
      when: noble_k8s_api_server_override | default('') | length > 0
      changed_when: true
      tags: [always]

    - name: Use patched kubeconfig for this play
      ansible.builtin.set_fact:
        noble_kubeconfig: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.patched"
      when: noble_k8s_api_server_override | default('') | length > 0
      tags: [always]

- name: Verify Kubernetes API is reachable from this host
|
||||
ansible.builtin.command:
|
||||
argv:
|
||||
- kubectl
|
||||
- get
|
||||
- --raw
|
||||
- /healthz
|
||||
- --request-timeout=15s
|
||||
environment:
|
||||
KUBECONFIG: "{{ noble_kubeconfig }}"
|
||||
register: noble_k8s_health_first
|
||||
failed_when: false
|
||||
changed_when: false
|
||||
tags: [always]
|
||||
|
||||
# talosctl kubeconfig often sets server to the VIP; off-LAN you can reach a control-plane IP but not 192.168.50.230.
|
||||
# kubectl stderr is often "The connection to the server ... was refused" (no substring "connection refused").
|
||||
- name: Auto-fallback API server when VIP is unreachable (temp kubeconfig)
|
||||
tags: [always]
|
||||
when:
|
||||
- noble_k8s_api_server_auto_fallback | default(true) | bool
|
||||
- noble_k8s_api_server_override | default('') | length == 0
|
||||
- not (noble_skip_k8s_health_check | default(false) | bool)
|
||||
- (noble_k8s_health_first.rc | default(1)) != 0 or (noble_k8s_health_first.stdout | default('') | trim) != 'ok'
|
||||
- (((noble_k8s_health_first.stderr | default('')) ~ (noble_k8s_health_first.stdout | default(''))) | lower) is search('network is unreachable|no route to host|connection refused|was refused', multiline=False)
|
||||
block:
|
||||
- name: Ensure temp dir for kubeconfig auto-fallback
|
||||
ansible.builtin.file:
|
||||
path: "{{ noble_repo_root }}/ansible/.ansible-tmp"
|
||||
state: directory
|
||||
mode: "0700"
|
||||
|
||||
- name: Copy kubeconfig for API auto-fallback
|
||||
ansible.builtin.copy:
|
||||
src: "{{ noble_kubeconfig }}"
|
||||
dest: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.auto-fallback"
|
||||
mode: "0600"
|
||||
|
||||
- name: Resolve cluster name for kubectl set-cluster
|
||||
ansible.builtin.command:
|
||||
argv:
|
||||
- kubectl
|
||||
- config
|
||||
- view
|
||||
- --minify
|
||||
- -o
|
||||
- jsonpath={.clusters[0].name}
|
||||
environment:
|
||||
KUBECONFIG: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.auto-fallback"
|
||||
register: noble_k8s_cluster_fb
|
||||
changed_when: false
|
||||
|
||||
- name: Point temp kubeconfig at fallback apiserver
|
||||
ansible.builtin.command:
|
||||
argv:
|
||||
- kubectl
|
||||
- config
|
||||
- set-cluster
|
||||
- "{{ noble_k8s_cluster_fb.stdout }}"
|
||||
- --server={{ noble_k8s_api_server_fallback | default('https://192.168.50.20:6443', true) }}
|
||||
- --kubeconfig={{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.auto-fallback
|
||||
changed_when: true
|
||||
|
||||
- name: Use kubeconfig with fallback API server for this play
|
||||
ansible.builtin.set_fact:
|
||||
noble_kubeconfig: "{{ noble_repo_root }}/ansible/.ansible-tmp/kubeconfig.auto-fallback"
|
||||
|
||||
- name: Re-verify Kubernetes API after auto-fallback
|
||||
ansible.builtin.command:
|
||||
argv:
|
||||
- kubectl
|
||||
- get
|
||||
- --raw
|
||||
- /healthz
|
||||
- --request-timeout=15s
|
||||
environment:
|
||||
KUBECONFIG: "{{ noble_kubeconfig }}"
|
||||
register: noble_k8s_health_after_fallback
|
||||
failed_when: false
|
||||
changed_when: false
|
||||
|
||||
- name: Mark that API was re-checked after kubeconfig fallback
|
||||
ansible.builtin.set_fact:
|
||||
noble_k8s_api_fallback_used: true
|
||||
|
||||
- name: Normalize API health result for preflight (scalars; avoids dict merge / set_fact stringification)
|
||||
ansible.builtin.set_fact:
|
||||
noble_k8s_health_rc: "{{ noble_k8s_health_after_fallback.rc | default(1) if (noble_k8s_api_fallback_used | default(false) | bool) else (noble_k8s_health_first.rc | default(1)) }}"
|
||||
noble_k8s_health_stdout: "{{ noble_k8s_health_after_fallback.stdout | default('') if (noble_k8s_api_fallback_used | default(false) | bool) else (noble_k8s_health_first.stdout | default('')) }}"
|
||||
noble_k8s_health_stderr: "{{ noble_k8s_health_after_fallback.stderr | default('') if (noble_k8s_api_fallback_used | default(false) | bool) else (noble_k8s_health_first.stderr | default('')) }}"
|
||||
tags: [always]
|
||||
|
||||
- name: Fail when API check did not return ok
|
||||
ansible.builtin.fail:
|
||||
msg: "{{ lookup('template', 'templates/api_health_hint.j2') }}"
|
||||
when:
|
||||
- not (noble_skip_k8s_health_check | default(false) | bool)
|
||||
- (noble_k8s_health_rc | int) != 0 or (noble_k8s_health_stdout | default('') | trim) != 'ok'
|
||||
tags: [always]
|
||||
|
||||
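The auto-fallback block only fires when the health check's output matches one of the known "can't reach it" substrings. A minimal sketch of that match, mirroring the `is search(...)` expression above (the sample error messages are invented, not captured from a real cluster):

```shell
# Same alternation the playbook's `is search(...)` condition uses.
pattern='network is unreachable|no route to host|connection refused|was refused'

# Lowercase the message first, as the playbook does with `| lower`.
matches() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | grep -Eq "$pattern"
}

matches 'The connection to the server 192.168.50.230:6443 was refused' && echo fallback
matches 'dial tcp 192.168.50.230:6443: no route to host' && echo fallback
matches 'Unauthorized' || echo no-fallback
```

Note that "was refused" is listed separately because kubectl's wording ("The connection to the server ... was refused") does not contain the literal substring "connection refused".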
  roles:
    - role: helm_repos
      tags: [repos, helm]
    - role: noble_cilium
      tags: [cilium, cni]
    - role: noble_csi_snapshot_controller
      tags: [csi_snapshot, snapshot, storage]
    - role: noble_metrics_server
      tags: [metrics, metrics_server]
    - role: noble_longhorn
      tags: [longhorn, storage]
    - role: noble_metallb
      tags: [metallb, lb]
    - role: noble_kube_vip
      tags: [kube_vip, vip]
    - role: noble_traefik
      tags: [traefik, ingress]
    - role: noble_cert_manager
      tags: [cert_manager, certs]
    - role: noble_newt
      tags: [newt]
    - role: noble_argocd
      tags: [argocd, gitops]
    - role: noble_kyverno
      tags: [kyverno, policy]
    - role: noble_kyverno_policies
      tags: [kyverno_policies, policy]
    - role: noble_platform
      tags: [platform, observability, apps]
    - role: noble_velero
      tags: [velero, backups]
    - role: noble_landing_urls
      tags: [landing, platform, observability, apps]
@@ -1,7 +0,0 @@
---
# Manual follow-ups after **noble.yml**: SOPS key backup, optional Argo root Application.
- hosts: localhost
  connection: local
  gather_facts: false
  roles:
    - noble_post_deploy
@@ -1,9 +0,0 @@
---
- name: Proxmox cluster bootstrap/join
  hosts: proxmox_hosts
  become: true
  gather_facts: false
  serial: 1
  roles:
    - role: proxmox_cluster
      tags: [proxmox, cluster]
@@ -1,4 +0,0 @@
---
- import_playbook: proxmox_prepare.yml
- import_playbook: proxmox_upgrade.yml
- import_playbook: proxmox_cluster.yml
@@ -1,8 +0,0 @@
---
- name: Proxmox host preparation (community repos + no-subscription notice)
  hosts: proxmox_hosts
  become: true
  gather_facts: true
  roles:
    - role: proxmox_baseline
      tags: [proxmox, prepare, repos, ui]
@@ -1,9 +0,0 @@
---
- name: Proxmox host maintenance (upgrade to latest)
  hosts: proxmox_hosts
  become: true
  gather_facts: true
  serial: 1
  roles:
    - role: proxmox_maintenance
      tags: [proxmox, maintenance, updates]
@@ -1,11 +0,0 @@
---
# Genconfig only — for full Talos Phase A (apply, bootstrap, kubeconfig) use **playbooks/talos_phase_a.yml**
# or **playbooks/deploy.yml**. Run: ansible-playbook playbooks/talos_bootstrap.yml -e noble_talos_genconfig=true
- name: Talos — optional genconfig helper
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    noble_repo_root: "{{ playbook_dir | dirname | dirname }}"
  roles:
    - role: talos_bootstrap
@@ -1,15 +0,0 @@
---
# Talos Phase A — **talhelper genconfig** → **apply-config** (all nodes) → **bootstrap** → **kubeconfig**.
# Requires: **talosctl**, **talhelper**, reachable node IPs (same LAN as nodes for Talos API :50000).
# See **talos/README.md** §1–§3. Then run **playbooks/noble.yml** or **deploy.yml**.
- name: Talos — genconfig, apply, bootstrap, kubeconfig
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    noble_repo_root: "{{ playbook_dir | dirname | dirname }}"
    noble_talos_dir: "{{ noble_repo_root }}/talos"
    noble_talos_kubeconfig_out: "{{ noble_repo_root }}/talos/kubeconfig"
  roles:
    - role: talos_phase_a
      tags: [talos, phase_a]
@@ -1,22 +0,0 @@
{# Error output for noble.yml API preflight when kubectl /healthz fails #}
Cannot use the Kubernetes API from this host (kubectl get --raw /healthz).
rc={{ noble_k8s_health_rc | default('n/a') }}
stderr: {{ noble_k8s_health_stderr | default('') | trim }}

{% set err = (noble_k8s_health_stderr | default('')) | lower %}
{% if 'connection refused' in err %}
Connection refused: the TCP path to that host works, but nothing is accepting HTTPS on port 6443 there.
• **Not bootstrapped yet?** Finish Talos first: `talosctl bootstrap` (once, on one control plane), then `talosctl kubeconfig`, then confirm `kubectl get nodes`. See talos/README.md §2–§3 and CLUSTER-BUILD.md Phase A. **Do not run this playbook before the Kubernetes API exists.**
• If bootstrap is done: try another control-plane IP (CLUSTER-BUILD inventory: neon 192.168.50.20, argon .30, krypton .40), or the VIP if kube-vip is up and you are on the LAN:
  -e 'noble_k8s_api_server_override=https://192.168.50.230:6443'
• Do not point the API URL at a worker-only node.
• Cross-check with `talosctl health` / `kubectl get nodes` from a known-working client.
{% elif 'network is unreachable' in err or 'no route to host' in err %}
Network unreachable / no route: this machine cannot route to the API IP. Join the lab LAN or VPN, or set a reachable API server URL (talos/README.md §3).
{% else %}
If the kubeconfig uses the VIP and you are off-LAN, try a reachable control-plane IP, e.g.:
  -e 'noble_k8s_api_server_override=https://192.168.50.20:6443'
See talos/README.md §3.
{% endif %}

To skip this check (not recommended): -e noble_skip_k8s_health_check=true
2 ansible/requirements.yml (new file)
@@ -0,0 +1,2 @@
collections:
  - name: community.general
@@ -1,39 +0,0 @@
---
# Update apt metadata only when stale (seconds)
debian_baseline_apt_cache_valid_time: 3600

# Core host hardening packages
debian_baseline_packages:
  - unattended-upgrades
  - apt-listchanges
  - fail2ban
  - needrestart
  - sudo
  - ca-certificates

# SSH hardening controls
debian_baseline_ssh_permit_root_login: "no"
debian_baseline_ssh_password_authentication: "no"
debian_baseline_ssh_pubkey_authentication: "yes"
debian_baseline_ssh_x11_forwarding: "no"
debian_baseline_ssh_max_auth_tries: 3
debian_baseline_ssh_client_alive_interval: 300
debian_baseline_ssh_client_alive_count_max: 2
debian_baseline_ssh_allow_users: []

# unattended-upgrades controls
debian_baseline_enable_unattended_upgrades: true
debian_baseline_unattended_auto_upgrade: "1"
debian_baseline_unattended_update_lists: "1"

# Kernel and network hardening sysctls
debian_baseline_sysctl_settings:
  net.ipv4.conf.all.accept_redirects: "0"
  net.ipv4.conf.default.accept_redirects: "0"
  net.ipv4.conf.all.send_redirects: "0"
  net.ipv4.conf.default.send_redirects: "0"
  net.ipv4.conf.all.log_martians: "1"
  net.ipv4.conf.default.log_martians: "1"
  net.ipv4.tcp_syncookies: "1"
  net.ipv6.conf.all.accept_redirects: "0"
  net.ipv6.conf.default.accept_redirects: "0"
@@ -1,12 +0,0 @@
---
- name: Restart ssh
  ansible.builtin.service:
    name: ssh
    state: restarted

- name: Reload sysctl
  ansible.builtin.command:
    argv:
      - sysctl
      - --system
  changed_when: true
@@ -1,52 +0,0 @@
---
- name: Refresh apt cache
  ansible.builtin.apt:
    update_cache: true
    cache_valid_time: "{{ debian_baseline_apt_cache_valid_time }}"

- name: Install baseline hardening packages
  ansible.builtin.apt:
    name: "{{ debian_baseline_packages }}"
    state: present

- name: Configure unattended-upgrades auto settings
  ansible.builtin.copy:
    dest: /etc/apt/apt.conf.d/20auto-upgrades
    mode: "0644"
    content: |
      APT::Periodic::Update-Package-Lists "{{ debian_baseline_unattended_update_lists }}";
      APT::Periodic::Unattended-Upgrade "{{ debian_baseline_unattended_auto_upgrade }}";
  when: debian_baseline_enable_unattended_upgrades | bool

- name: Configure SSH hardening options
  ansible.builtin.copy:
    dest: /etc/ssh/sshd_config.d/99-hardening.conf
    mode: "0644"
    content: |
      PermitRootLogin {{ debian_baseline_ssh_permit_root_login }}
      PasswordAuthentication {{ debian_baseline_ssh_password_authentication }}
      PubkeyAuthentication {{ debian_baseline_ssh_pubkey_authentication }}
      X11Forwarding {{ debian_baseline_ssh_x11_forwarding }}
      MaxAuthTries {{ debian_baseline_ssh_max_auth_tries }}
      ClientAliveInterval {{ debian_baseline_ssh_client_alive_interval }}
      ClientAliveCountMax {{ debian_baseline_ssh_client_alive_count_max }}
      {% if debian_baseline_ssh_allow_users | length > 0 %}
      AllowUsers {{ debian_baseline_ssh_allow_users | join(' ') }}
      {% endif %}
  notify: Restart ssh

- name: Configure baseline sysctls
  ansible.builtin.copy:
    dest: /etc/sysctl.d/99-hardening.conf
    mode: "0644"
    content: |
      {% for key, value in debian_baseline_sysctl_settings.items() %}
      {{ key }} = {{ value }}
      {% endfor %}
  notify: Reload sysctl

- name: Ensure fail2ban service is enabled
  ansible.builtin.service:
    name: fail2ban
    enabled: true
    state: started
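The sysctl task's Jinja loop expands each dict entry to a plain `key = value` line before `sysctl --system` picks the file up. A throwaway shell equivalent (a temp file stands in for /etc/sysctl.d/99-hardening.conf):

```shell
# Render key/value pairs the way the task's Jinja loop does,
# writing to a temp file instead of /etc/sysctl.d/99-hardening.conf.
out=$(mktemp)
while IFS='=' read -r key value; do
  printf '%s = %s\n' "$key" "$value"
done > "$out" <<'EOF'
net.ipv4.tcp_syncookies=1
net.ipv4.conf.all.accept_redirects=0
EOF
cat "$out"
```

On a real host the handler then applies the file with `sysctl --system`, which re-reads every drop-in under /etc/sysctl.d.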
@@ -1,7 +0,0 @@
---
debian_maintenance_apt_cache_valid_time: 3600
debian_maintenance_upgrade_type: dist
debian_maintenance_autoremove: true
debian_maintenance_autoclean: true
debian_maintenance_reboot_if_required: true
debian_maintenance_reboot_timeout: 1800
@@ -1,30 +0,0 @@
---
- name: Refresh apt cache
  ansible.builtin.apt:
    update_cache: true
    cache_valid_time: "{{ debian_maintenance_apt_cache_valid_time }}"

- name: Upgrade Debian packages
  ansible.builtin.apt:
    upgrade: "{{ debian_maintenance_upgrade_type }}"

- name: Remove orphaned packages
  ansible.builtin.apt:
    autoremove: "{{ debian_maintenance_autoremove }}"

- name: Clean apt package cache
  ansible.builtin.apt:
    autoclean: "{{ debian_maintenance_autoclean }}"

- name: Check if reboot is required
  ansible.builtin.stat:
    path: /var/run/reboot-required
  register: debian_maintenance_reboot_required_file

- name: Reboot when required by package updates
  ansible.builtin.reboot:
    reboot_timeout: "{{ debian_maintenance_reboot_timeout }}"
    msg: "Reboot initiated by Ansible maintenance playbook"
  when:
    - debian_maintenance_reboot_if_required | bool
    - debian_maintenance_reboot_required_file.stat.exists | default(false)
@@ -1,10 +0,0 @@
---
# List of users to manage keys for.
# Example:
# debian_ssh_rotation_users:
#   - name: deploy
#     home: /home/deploy
#     state: present
#     keys:
#       - "ssh-ed25519 AAAA... deploy@laptop"
debian_ssh_rotation_users: []
@@ -1,50 +0,0 @@
---
- name: Validate SSH key rotation inputs
  ansible.builtin.assert:
    that:
      - item.name is defined
      - item.home is defined
      - (item.state | default('present')) in ['present', 'absent']
      - (item.state | default('present')) == 'absent' or (item.keys is defined and item.keys | length > 0)
    fail_msg: >-
      Each entry in debian_ssh_rotation_users must include name, home, and either
      state=absent, or keys with at least one SSH public key.
  loop: "{{ debian_ssh_rotation_users }}"
  loop_control:
    label: "{{ item.name | default('unknown') }}"

- name: Ensure ~/.ssh exists for managed users
  ansible.builtin.file:
    path: "{{ item.home }}/.ssh"
    state: directory
    owner: "{{ item.name }}"
    group: "{{ item.name }}"
    mode: "0700"
  loop: "{{ debian_ssh_rotation_users }}"
  loop_control:
    label: "{{ item.name }}"
  when: (item.state | default('present')) == 'present'

- name: Rotate authorized_keys for managed users
  ansible.builtin.copy:
    dest: "{{ item.home }}/.ssh/authorized_keys"
    owner: "{{ item.name }}"
    group: "{{ item.name }}"
    mode: "0600"
    content: |
      {% for key in item.keys %}
      {{ key }}
      {% endfor %}
  loop: "{{ debian_ssh_rotation_users }}"
  loop_control:
    label: "{{ item.name }}"
  when: (item.state | default('present')) == 'present'

- name: Remove authorized_keys for users marked absent
  ansible.builtin.file:
    path: "{{ item.home }}/.ssh/authorized_keys"
    state: absent
  loop: "{{ debian_ssh_rotation_users }}"
  loop_control:
    label: "{{ item.name }}"
  when: (item.state | default('present')) == 'absent'
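The rotation semantics above hinge on `ansible.builtin.copy` replacing authorized_keys wholesale from the declared list, so any key removed from the list disappears on the next run. A local sketch of the resulting file state (directory and key material below are placeholders):

```shell
# Simulate one rotation pass: write authorized_keys from a declared list.
# $home stands in for the managed user's home directory.
home=$(mktemp -d)
mkdir -p "$home/.ssh"
chmod 700 "$home/.ssh"
printf '%s\n' \
  'ssh-ed25519 AAAAexample1 deploy@laptop' \
  'ssh-ed25519 AAAAexample2 deploy@desktop' \
  > "$home/.ssh/authorized_keys"
chmod 600 "$home/.ssh/authorized_keys"
# One line per declared key; a shorter list on the next run shrinks the file.
wc -l < "$home/.ssh/authorized_keys"
```

This is why the role calls it "rotation" rather than "authorized_key management": there is no additive merge, the declared list is the whole truth.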
@@ -1,16 +0,0 @@
---
noble_helm_repos:
  - { name: cilium, url: "https://helm.cilium.io/" }
  - { name: metallb, url: "https://metallb.github.io/metallb" }
  - { name: longhorn, url: "https://charts.longhorn.io" }
  - { name: traefik, url: "https://traefik.github.io/charts" }
  - { name: jetstack, url: "https://charts.jetstack.io" }
  - { name: fossorial, url: "https://charts.fossorial.io" }
  - { name: argo, url: "https://argoproj.github.io/argo-helm" }
  - { name: metrics-server, url: "https://kubernetes-sigs.github.io/metrics-server/" }
  - { name: prometheus-community, url: "https://prometheus-community.github.io/helm-charts" }
  - { name: grafana, url: "https://grafana.github.io/helm-charts" }
  - { name: fluent, url: "https://fluent.github.io/helm-charts" }
  - { name: headlamp, url: "https://kubernetes-sigs.github.io/headlamp/" }
  - { name: kyverno, url: "https://kyverno.github.io/kyverno/" }
  - { name: vmware-tanzu, url: "https://vmware-tanzu.github.io/helm-charts" }
@@ -1,16 +0,0 @@
---
- name: Add Helm repositories
  ansible.builtin.command:
    cmd: "helm repo add {{ item.name }} {{ item.url }}"
  loop: "{{ noble_helm_repos }}"
  loop_control:
    label: "{{ item.name }}"
  register: helm_repo_add
  changed_when: helm_repo_add.rc == 0
  failed_when: >-
    helm_repo_add.rc != 0 and
    ('already exists' not in (helm_repo_add.stderr | default('')))

- name: helm repo update
  ansible.builtin.command: helm repo update
  changed_when: true
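The `failed_when` expression above makes `helm repo add` idempotent: a nonzero exit is only fatal when stderr lacks "already exists". A rough shell equivalent of that tolerance (helm is stubbed here so the sketch runs without a Helm install; the stub's message imitates Helm's wording and is not a captured log):

```shell
# Stub standing in for the real helm binary: always reports a duplicate.
helm() { echo 'Error: repository name (cilium) already exists' >&2; return 1; }

# Succeed on rc==0, or on rc!=0 when stderr mentions "already exists".
add_repo() {
  err=$(helm repo add "$1" "$2" 2>&1) && return 0
  case "$err" in
    *"already exists"*) return 0 ;;  # tolerated: repo is already configured
    *) echo "$err" >&2; return 1 ;;
  esac
}

add_repo cilium https://helm.cilium.io/ && echo ok
```

An alternative would be `helm repo add --force-update`, which exits 0 for an existing repo, at the cost of silently rewriting a repo whose URL changed.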
@@ -1,6 +0,0 @@
---
# When true, applies clusters/noble/bootstrap/argocd/root-application.yaml (app-of-apps).
# Edit spec.source.repoURL in that file if your Git remote differs.
noble_argocd_apply_root_application: false
# When true, applies clusters/noble/bootstrap/argocd/bootstrap-root-application.yaml (noble-bootstrap-root; manual sync until README §5).
noble_argocd_apply_bootstrap_root_application: true
@@ -1,46 +0,0 @@
---
- name: Install Argo CD
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - argocd
      - argo/argo-cd
      - --namespace
      - argocd
      - --create-namespace
      - --version
      - "9.4.17"
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/argocd/values.yaml"
      - --wait
      - --timeout
      - 15m
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Apply Argo CD root Application (app-of-apps)
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/argocd/root-application.yaml"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  when: noble_argocd_apply_root_application | default(false) | bool
  changed_when: true

- name: Apply Argo CD bootstrap app-of-apps Application
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/argocd/bootstrap-root-application.yaml"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  when: noble_argocd_apply_bootstrap_root_application | default(false) | bool
  changed_when: true
@@ -1,3 +0,0 @@
---
# Warn when **cloudflare-dns-api-token** is missing after apply (also set in **group_vars/all.yml** when loaded).
noble_cert_manager_require_cloudflare_secret: true
@@ -1,28 +0,0 @@
---
# See repository **.env.sample** — copy to **.env** (gitignored).
- name: Stat repository .env for deploy secrets
  ansible.builtin.stat:
    path: "{{ noble_repo_root }}/.env"
  register: noble_deploy_env_file
  changed_when: false

- name: Create cert-manager Cloudflare DNS secret from .env
  ansible.builtin.shell: |
    set -euo pipefail
    set -a
    . "{{ noble_repo_root }}/.env"
    set +a
    if [ -z "${CLOUDFLARE_DNS_API_TOKEN:-}" ]; then
      echo NO_TOKEN
      exit 0
    fi
    kubectl -n cert-manager create secret generic cloudflare-dns-api-token \
      --from-literal=api-token="${CLOUDFLARE_DNS_API_TOKEN}" \
      --dry-run=client -o yaml | kubectl apply -f -
    echo APPLIED
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  when: noble_deploy_env_file.stat.exists | default(false)
  no_log: true
  register: noble_cf_secret_from_env
  changed_when: "'APPLIED' in (noble_cf_secret_from_env.stdout | default(''))"
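The `set -a` / `set +a` bracket in the task above exports every variable assigned while sourcing `.env`, which is what lets kubectl (a child process) read the token without an explicit `export` per line. A throwaway demonstration (file name and value are placeholders, not the real token):

```shell
# set -a marks all subsequently assigned variables for export;
# set +a turns that back off after sourcing.
envfile=$(mktemp)
echo 'CLOUDFLARE_DNS_API_TOKEN=example-token' > "$envfile"
set -a
. "$envfile"
set +a
# A child process sees the value only because of the set -a bracket.
sh -c 'echo "${CLOUDFLARE_DNS_API_TOKEN:-unset}"'
```

Without the bracket, the sourced assignment would be a plain shell variable and the `sh -c` child would print `unset`.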
@@ -1,68 +0,0 @@
---
- name: Create cert-manager namespace
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/cert-manager/namespace.yaml"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Install cert-manager
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - cert-manager
      - jetstack/cert-manager
      - --namespace
      - cert-manager
      - --version
      - v1.20.0
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/cert-manager/values.yaml"
      - --wait
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Apply secrets from repository .env (optional)
  ansible.builtin.include_tasks: from_env.yml

- name: Check Cloudflare DNS API token Secret (required for ClusterIssuers)
  ansible.builtin.command:
    argv:
      - kubectl
      - -n
      - cert-manager
      - get
      - secret
      - cloudflare-dns-api-token
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  register: noble_cf_secret
  failed_when: false
  changed_when: false

- name: Warn when Cloudflare Secret is missing
  ansible.builtin.debug:
    msg: >-
      Secret cert-manager/cloudflare-dns-api-token not found.
      Create it per clusters/noble/bootstrap/cert-manager/README.md before ClusterIssuers can succeed.
  when:
    - noble_cert_manager_require_cloudflare_secret | default(true) | bool
    - noble_cf_secret.rc != 0

- name: Apply ClusterIssuers (staging + prod)
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -k
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/cert-manager"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true
@@ -1,25 +0,0 @@
---
- name: Install Cilium (required CNI for Talos cni:none)
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - cilium
      - cilium/cilium
      - --namespace
      - kube-system
      - --version
      - "1.16.6"
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/cilium/values.yaml"
      - --wait
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Wait for Cilium DaemonSet
  ansible.builtin.command: kubectl -n kube-system rollout status ds/cilium --timeout=300s
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: false
@@ -1,2 +0,0 @@
---
noble_csi_snapshot_kubectl_timeout: 120s
@@ -1,39 +0,0 @@
---
# Volume Snapshot CRDs + snapshot-controller (Velero CSI / Longhorn snapshots).
- name: Apply Volume Snapshot CRDs (snapshot.storage.k8s.io)
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - "--request-timeout={{ noble_csi_snapshot_kubectl_timeout | default('120s') }}"
      - -k
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/csi-snapshot-controller/crd"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Apply snapshot-controller in kube-system
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - "--request-timeout={{ noble_csi_snapshot_kubectl_timeout | default('120s') }}"
      - -k
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/csi-snapshot-controller/controller"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Wait for snapshot-controller Deployment
  ansible.builtin.command:
    argv:
      - kubectl
      - -n
      - kube-system
      - rollout
      - status
      - deploy/snapshot-controller
      - --timeout=120s
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: false
@@ -1,11 +0,0 @@
---
- name: Apply kube-vip (Kubernetes API VIP)
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -k
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/kube-vip"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true
@@ -1,32 +0,0 @@
---
- name: Create Kyverno namespace
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/kyverno/namespace.yaml"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Install Kyverno operator
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - kyverno
      - kyverno/kyverno
      - -n
      - kyverno
      - --version
      - "3.7.1"
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/kyverno/values.yaml"
      - --wait
      - --timeout
      - 15m
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true
@@ -1,21 +0,0 @@
---
- name: Install Kyverno policy chart (PSS baseline, Audit)
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - kyverno-policies
      - kyverno/kyverno-policies
      - -n
      - kyverno
      - --version
      - "3.7.1"
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/kyverno/policies-values.yaml"
      - --wait
      - --timeout
      - 10m
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true
@@ -1,51 +0,0 @@
---
# Regenerated when **noble_landing_urls** runs (after platform stack). Paths match Traefik + cert-manager Ingresses.
noble_landing_urls_dest: "{{ noble_repo_root }}/ansible/output/noble-lab-ui-urls.md"

# When true, run kubectl to fill Argo CD / Grafana secrets and a bounded Headlamp SA token in the markdown (requires working kubeconfig).
noble_landing_urls_fetch_credentials: true

# Headlamp: bounded token for UI sign-in (`kubectl create token`); cluster may cap max duration.
noble_landing_urls_headlamp_token_duration: 48h

noble_lab_ui_entries:
  - name: Argo CD
    description: GitOps UI (sync, apps, repos)
    namespace: argocd
    service: argocd-server
    url: https://argo.apps.noble.lab.pcenicni.dev
  - name: Grafana
    description: Dashboards, Loki explore (logs)
    namespace: monitoring
    service: kube-prometheus-grafana
    url: https://grafana.apps.noble.lab.pcenicni.dev
  - name: Prometheus
    description: Prometheus UI (queries, targets) — lab; protect in production
    namespace: monitoring
    service: kube-prometheus-kube-prome-prometheus
    url: https://prometheus.apps.noble.lab.pcenicni.dev
  - name: Alertmanager
    description: Alertmanager UI (silences, status)
    namespace: monitoring
    service: kube-prometheus-kube-prome-alertmanager
    url: https://alertmanager.apps.noble.lab.pcenicni.dev
  - name: Headlamp
    description: Kubernetes UI (cluster resources)
    namespace: headlamp
    service: headlamp
    url: https://headlamp.apps.noble.lab.pcenicni.dev
  - name: Longhorn
    description: Storage volumes, nodes, backups
    namespace: longhorn-system
    service: longhorn-frontend
    url: https://longhorn.apps.noble.lab.pcenicni.dev
  - name: Velero
    description: Cluster backups — no web UI (velero CLI / kubectl CRDs)
    namespace: velero
    service: velero
    url: ""
  - name: Homepage
    description: App dashboard (links to lab UIs)
    namespace: homepage
    service: homepage
    url: https://homepage.apps.noble.lab.pcenicni.dev
@@ -1,72 +0,0 @@
---
# Populates template variables from Secrets + Headlamp token (no_log on kubectl to avoid leaking into Ansible stdout).
- name: Fetch Argo CD initial admin password (base64)
  ansible.builtin.command:
    argv:
      - kubectl
      - -n
      - argocd
      - get
      - secret
      - argocd-initial-admin-secret
      - -o
      - jsonpath={.data.password}
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  register: noble_fetch_argocd_pw_b64
  failed_when: false
  changed_when: false
  no_log: true

- name: Fetch Grafana admin user (base64)
  ansible.builtin.command:
    argv:
      - kubectl
      - -n
      - monitoring
      - get
      - secret
      - kube-prometheus-grafana
      - -o
      - jsonpath={.data.admin-user}
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  register: noble_fetch_grafana_user_b64
  failed_when: false
  changed_when: false
  no_log: true

- name: Fetch Grafana admin password (base64)
  ansible.builtin.command:
    argv:
      - kubectl
      - -n
      - monitoring
      - get
      - secret
      - kube-prometheus-grafana
      - -o
      - jsonpath={.data.admin-password}
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  register: noble_fetch_grafana_pw_b64
  failed_when: false
  changed_when: false
  no_log: true

- name: Create Headlamp ServiceAccount token (for UI sign-in)
  ansible.builtin.command:
    argv:
      - kubectl
      - -n
      - headlamp
      - create
      - token
      - headlamp
      - "--duration={{ noble_landing_urls_headlamp_token_duration | default('48h') }}"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  register: noble_fetch_headlamp_token
  failed_when: false
  changed_when: false
  no_log: true
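Each fetch above registers the raw base64 string from kubectl's jsonpath; the landing-page template later runs it through Jinja's `b64decode`. A minimal shell sketch of that decode step (the value here is illustrative, not a real secret):

```shell
# Illustrative base64 standing in for kubectl's jsonpath output.
b64='cGFzc3dvcmQ='
# Decode the same way the generated landing page's commands do.
printf '%s' "$b64" | base64 -d   # → password
```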
@@ -1,20 +0,0 @@
---
- name: Ensure output directory for generated landing page
  ansible.builtin.file:
    path: "{{ noble_repo_root }}/ansible/output"
    state: directory
    mode: "0755"

- name: Fetch initial credentials from cluster Secrets (optional)
  ansible.builtin.include_tasks: fetch_credentials.yml
  when: noble_landing_urls_fetch_credentials | default(true) | bool

- name: Write noble lab UI URLs (markdown landing page)
  ansible.builtin.template:
    src: noble-lab-ui-urls.md.j2
    dest: "{{ noble_landing_urls_dest }}"
    mode: "0644"

- name: Show landing page path
  ansible.builtin.debug:
    msg: "Noble lab UI list written to {{ noble_landing_urls_dest }}"
@@ -1,51 +0,0 @@
# Noble lab — web UIs (LAN)

> **Sensitive:** This file may include **passwords read from Kubernetes Secrets** when credential fetch ran. It is **gitignored** — do not commit or share.

**DNS:** point **`*.apps.noble.lab.pcenicni.dev`** at the Traefik **LoadBalancer** (MetalLB **`192.168.50.211`** by default — see `clusters/noble/bootstrap/traefik/values.yaml`).

**TLS:** **cert-manager** + **`letsencrypt-prod`** on each Ingress (public **DNS-01** for **`pcenicni.dev`**).

This file is **generated** by Ansible (`noble_landing_urls` role). Use it as a temporary landing page to find services after deploy.

| UI | What | Kubernetes service | Namespace | URL |
|----|------|--------------------|-----------|-----|
{% for e in noble_lab_ui_entries %}
| {{ e.name }} | {{ e.description }} | `{{ e.service }}` | `{{ e.namespace }}` | {% if e.url | default('') | length > 0 %}[{{ e.url }}]({{ e.url }}){% else %}—{% endif %} |
{% endfor %}

## Initial access (logins)

| App | Username / identity | Password / secret |
|-----|---------------------|-------------------|
| **Argo CD** | `admin` | {% if (noble_fetch_argocd_pw_b64 is defined) and (noble_fetch_argocd_pw_b64.rc | default(1) == 0) and (noble_fetch_argocd_pw_b64.stdout | default('') | length > 0) %}`{{ noble_fetch_argocd_pw_b64.stdout | b64decode }}`{% else %}*(not fetched — use commands below)*{% endif %} |
| **Grafana** | {% if (noble_fetch_grafana_user_b64 is defined) and (noble_fetch_grafana_user_b64.rc | default(1) == 0) and (noble_fetch_grafana_user_b64.stdout | default('') | length > 0) %}`{{ noble_fetch_grafana_user_b64.stdout | b64decode }}`{% else %}*(from Secret — use commands below)*{% endif %} | {% if (noble_fetch_grafana_pw_b64 is defined) and (noble_fetch_grafana_pw_b64.rc | default(1) == 0) and (noble_fetch_grafana_pw_b64.stdout | default('') | length > 0) %}`{{ noble_fetch_grafana_pw_b64.stdout | b64decode }}`{% else %}*(not fetched — use commands below)*{% endif %} |
| **Headlamp** | ServiceAccount **`headlamp`** | {% if (noble_fetch_headlamp_token is defined) and (noble_fetch_headlamp_token.rc | default(1) == 0) and (noble_fetch_headlamp_token.stdout | default('') | trim | length > 0) %}Token ({{ noble_landing_urls_headlamp_token_duration | default('48h') }}): `{{ noble_fetch_headlamp_token.stdout | trim }}`{% else %}*(not generated — use command below)*{% endif %} |
| **Prometheus** | — | No auth in default install (lab). |
| **Alertmanager** | — | No auth in default install (lab). |
| **Longhorn** | — | No default login unless you enable access control in the UI settings. |

### Commands to retrieve passwords (if not filled above)

```bash
# Argo CD initial admin (Secret removed after you change password)
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath='{.data.password}' | base64 -d
echo

# Grafana admin user / password
kubectl -n monitoring get secret kube-prometheus-grafana -o jsonpath='{.data.admin-user}' | base64 -d
echo
kubectl -n monitoring get secret kube-prometheus-grafana -o jsonpath='{.data.admin-password}' | base64 -d
echo
```

To generate this file **without** calling kubectl, run Ansible with **`-e noble_landing_urls_fetch_credentials=false`**.

## Notes

- **Argo CD** `argocd-initial-admin-secret` disappears after you change the admin password.
- **Grafana** password is random unless you set `grafana.adminPassword` in chart values.
- **Prometheus / Alertmanager** UIs are unauthenticated by default — restrict when hardening (`talos/CLUSTER-BUILD.md` Phase G).
- **SOPS:** cluster secrets in git under **`clusters/noble/secrets/`** are encrypted; decrypt with **`age-key.txt`** (not in git). See **`clusters/noble/secrets/README.md`**.
- **Headlamp** token above expires after the configured duration; re-run Ansible or `kubectl create token` to refresh.
- **Velero** has **no web UI** — use **`velero`** CLI or **`kubectl -n velero get backup,schedule,backupstoragelocation`**. Metrics: **`velero`** Service in **`velero`** (Prometheus scrape). See `clusters/noble/bootstrap/velero/README.md`.
@@ -1,29 +0,0 @@
---
- name: Apply Longhorn namespace (PSA) from kustomization
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -k
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/longhorn"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Install Longhorn chart
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - longhorn
      - longhorn/longhorn
      - -n
      - longhorn-system
      - --create-namespace
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/longhorn/values.yaml"
      - --wait
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true
@@ -1,3 +0,0 @@
---
# Helm **--wait** default is often too short when images pull slowly or nodes are busy.
noble_helm_metallb_wait_timeout: 20m
@@ -1,39 +0,0 @@
---
- name: Apply MetalLB namespace (Pod Security labels)
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/metallb/namespace.yaml"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Install MetalLB chart
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - metallb
      - metallb/metallb
      - --namespace
      - metallb-system
      - --wait
      - --timeout
      - "{{ noble_helm_metallb_wait_timeout }}"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Apply IPAddressPool and L2Advertisement
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -k
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/metallb"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true
@@ -1,19 +0,0 @@
---
- name: Install metrics-server
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - metrics-server
      - metrics-server/metrics-server
      - -n
      - kube-system
      - --version
      - "3.13.0"
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/metrics-server/values.yaml"
      - --wait
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true
@@ -1,3 +0,0 @@
---
# Set true after creating the newt-pangolin-auth Secret (see role / cluster docs).
noble_newt_install: true
@@ -1,30 +0,0 @@
---
# See repository **.env.sample** — copy to **.env** (gitignored).
- name: Stat repository .env for deploy secrets
  ansible.builtin.stat:
    path: "{{ noble_repo_root }}/.env"
  register: noble_deploy_env_file
  changed_when: false

- name: Create newt-pangolin-auth Secret from .env
  ansible.builtin.shell: |
    set -euo pipefail
    set -a
    . "{{ noble_repo_root }}/.env"
    set +a
    if [ -z "${PANGOLIN_ENDPOINT:-}" ] || [ -z "${NEWT_ID:-}" ] || [ -z "${NEWT_SECRET:-}" ]; then
      echo NO_VARS
      exit 0
    fi
    kubectl -n newt create secret generic newt-pangolin-auth \
      --from-literal=PANGOLIN_ENDPOINT="${PANGOLIN_ENDPOINT}" \
      --from-literal=NEWT_ID="${NEWT_ID}" \
      --from-literal=NEWT_SECRET="${NEWT_SECRET}" \
      --dry-run=client -o yaml | kubectl apply -f -
    echo APPLIED
  args:
    executable: /bin/bash  # "set -o pipefail" is a bashism; the default /bin/sh may reject it
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  when: noble_deploy_env_file.stat.exists | default(false)
  no_log: true
  register: noble_newt_secret_from_env
  changed_when: "'APPLIED' in (noble_newt_secret_from_env.stdout | default(''))"
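The `set -a` / `set +a` bracket in the task above is what exports every variable assigned while sourcing `.env`, so the subsequent kubectl invocations can read them. A standalone sketch of the pattern (the temp file and variable name are illustrative):

```shell
envfile=$(mktemp)
printf 'PANGOLIN_ENDPOINT=https://pangolin.example\n' > "$envfile"
set -a              # auto-export every assignment from here on
. "$envfile"
set +a              # stop auto-exporting
# A child process sees the variable because allexport was on when it was set.
sh -c 'printf "%s" "$PANGOLIN_ENDPOINT"'
rm -f "$envfile"
```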
@@ -1,41 +0,0 @@
---
- name: Skip Newt when not enabled
  ansible.builtin.debug:
    msg: "noble_newt_install is false — set PANGOLIN_ENDPOINT, NEWT_ID, NEWT_SECRET in repo .env (or create the Secret manually) and set noble_newt_install=true to deploy Newt."
  when: not (noble_newt_install | bool)

- name: Create Newt namespace
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/newt/namespace.yaml"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  when: noble_newt_install | bool
  changed_when: true

- name: Apply Newt Pangolin auth Secret from repository .env (optional)
  ansible.builtin.include_tasks: from_env.yml
  when: noble_newt_install | bool

- name: Install Newt chart
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - newt
      - fossorial/newt
      - --namespace
      - newt
      - --version
      - "1.2.0"
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/newt/values.yaml"
      - --wait
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  when: noble_newt_install | bool
  changed_when: true
@@ -1,9 +0,0 @@
---
# kubectl apply -k can hit transient etcd timeouts under load; retries + longer API deadline help.
noble_platform_kubectl_request_timeout: 120s
noble_platform_kustomize_retries: 5
noble_platform_kustomize_delay: 20

# Decrypt **clusters/noble/secrets/*.yaml** with SOPS and kubectl apply (requires **sops**, **age**, and **age-key.txt**).
noble_apply_sops_secrets: true
noble_sops_age_key_file: "{{ noble_repo_root }}/age-key.txt"
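The retry/delay defaults above drive an Ansible `retries`/`delay`/`until` loop around `kubectl apply -k`. A rough shell equivalent of that behaviour, with the delay shortened to zero so the sketch runs instantly (the "command" is a stand-in that succeeds on the third try):

```shell
retries=5
delay=0            # the role default is 20 seconds between attempts
attempt=0
while :; do
  attempt=$((attempt + 1))
  # Stand-in for "kubectl apply -k": pretend it succeeds on attempt 3.
  if [ "$attempt" -ge 3 ]; then
    echo "succeeded on attempt $attempt"
    break
  fi
  if [ "$attempt" -ge "$retries" ]; then
    echo "giving up after $retries attempts"
    break
  fi
  sleep "$delay"
done
```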
@@ -1,117 +0,0 @@
---
# Mirrors former **noble-platform** Argo Application: Helm releases + plain manifests under clusters/noble/bootstrap.
- name: Apply clusters/noble/bootstrap kustomize (namespaces, Grafana Loki datasource)
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - "--request-timeout={{ noble_platform_kubectl_request_timeout }}"
      - -k
      - "{{ noble_repo_root }}/clusters/noble/bootstrap"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  register: noble_platform_kustomize
  retries: "{{ noble_platform_kustomize_retries | int }}"
  delay: "{{ noble_platform_kustomize_delay | int }}"
  until: noble_platform_kustomize.rc == 0
  changed_when: true

- name: Stat SOPS age private key (age-key.txt)
  ansible.builtin.stat:
    path: "{{ noble_sops_age_key_file }}"
  register: noble_sops_age_key_stat

- name: Apply SOPS-encrypted cluster secrets (clusters/noble/secrets/*.yaml)
  ansible.builtin.shell: |
    set -euo pipefail
    shopt -s nullglob
    for f in "{{ noble_repo_root }}/clusters/noble/secrets"/*.yaml; do
      sops -d "$f" | kubectl apply -f -
    done
  args:
    executable: /bin/bash
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
    SOPS_AGE_KEY_FILE: "{{ noble_sops_age_key_file }}"
  when:
    - noble_apply_sops_secrets | default(true) | bool
    - noble_sops_age_key_stat.stat.exists
  changed_when: true

- name: Install kube-prometheus-stack
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - kube-prometheus
      - prometheus-community/kube-prometheus-stack
      - -n
      - monitoring
      - --version
      - "82.15.1"
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/kube-prometheus-stack/values.yaml"
      - --wait
      - --timeout
      - 30m
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Install Loki
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - loki
      - grafana/loki
      - -n
      - loki
      - --version
      - "6.55.0"
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/loki/values.yaml"
      - --wait
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Install Fluent Bit
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - fluent-bit
      - fluent/fluent-bit
      - -n
      - logging
      - --version
      - "0.56.0"
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/fluent-bit/values.yaml"
      - --wait
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Install Headlamp
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - headlamp
      - headlamp/headlamp
      - --version
      - "0.40.1"
      - -n
      - headlamp
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/headlamp/values.yaml"
      - --wait
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true
@@ -1,15 +0,0 @@
---
- name: SOPS secrets (workstation)
  ansible.builtin.debug:
    msg: |
      Encrypted Kubernetes Secrets live under clusters/noble/secrets/ (Mozilla SOPS + age).
      Private key: age-key.txt at repo root (gitignored). See clusters/noble/secrets/README.md
      and .sops.yaml. noble.yml decrypt-applies these when age-key.txt exists.

- name: Argo CD optional root Application (empty app-of-apps)
  ansible.builtin.debug:
    msg: >-
      App-of-apps: noble.yml applies root-application.yaml when noble_argocd_apply_root_application is true,
      and bootstrap-root-application.yaml when noble_argocd_apply_bootstrap_root_application is true (group_vars/all.yml).
      noble-bootstrap-root uses manual sync until you enable automation after the playbook — see
      clusters/noble/apps/README.md and clusters/noble/bootstrap/argocd/README.md §5.
@@ -1,30 +0,0 @@
---
- name: Create Traefik namespace
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/traefik/namespace.yaml"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true

- name: Install Traefik
  ansible.builtin.command:
    argv:
      - helm
      - upgrade
      - --install
      - traefik
      - traefik/traefik
      - --namespace
      - traefik
      - --version
      - "39.0.6"
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/traefik/values.yaml"
      - --wait
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  changed_when: true
@@ -1,13 +0,0 @@
---
# **noble_velero_install** is in **ansible/group_vars/all.yml**. Override S3 fields via extra-vars or group_vars.
noble_velero_chart_version: "12.0.0"

noble_velero_s3_bucket: ""
noble_velero_s3_url: ""
noble_velero_s3_region: "us-east-1"
noble_velero_s3_force_path_style: "true"
noble_velero_s3_prefix: ""

# Optional — if unset, Ansible expects Secret **velero/velero-cloud-credentials** (key **cloud**) to exist.
noble_velero_aws_access_key_id: ""
noble_velero_aws_secret_access_key: ""
@@ -1,68 +0,0 @@
---
# See repository **.env.sample** — copy to **.env** (gitignored).
- name: Stat repository .env for Velero
  ansible.builtin.stat:
    path: "{{ noble_repo_root }}/.env"
  register: noble_deploy_env_file
  changed_when: false

- name: Load NOBLE_VELERO_S3_BUCKET from .env when unset
  ansible.builtin.shell: |
    set -a
    . "{{ noble_repo_root }}/.env"
    set +a
    echo "${NOBLE_VELERO_S3_BUCKET:-}"
  register: noble_velero_s3_bucket_from_env
  when:
    - noble_deploy_env_file.stat.exists | default(false)
    - noble_velero_s3_bucket | default('') | length == 0
  changed_when: false

- name: Apply NOBLE_VELERO_S3_BUCKET from .env
  ansible.builtin.set_fact:
    noble_velero_s3_bucket: "{{ noble_velero_s3_bucket_from_env.stdout | trim }}"
  when:
    - noble_velero_s3_bucket_from_env is defined
    - (noble_velero_s3_bucket_from_env.stdout | default('') | trim | length) > 0

- name: Load NOBLE_VELERO_S3_URL from .env when unset
  ansible.builtin.shell: |
    set -a
    . "{{ noble_repo_root }}/.env"
    set +a
    echo "${NOBLE_VELERO_S3_URL:-}"
  register: noble_velero_s3_url_from_env
  when:
    - noble_deploy_env_file.stat.exists | default(false)
    - noble_velero_s3_url | default('') | length == 0
  changed_when: false

- name: Apply NOBLE_VELERO_S3_URL from .env
  ansible.builtin.set_fact:
    noble_velero_s3_url: "{{ noble_velero_s3_url_from_env.stdout | trim }}"
  when:
    - noble_velero_s3_url_from_env is defined
    - (noble_velero_s3_url_from_env.stdout | default('') | trim | length) > 0

- name: Create velero-cloud-credentials from .env when keys present
  ansible.builtin.shell: |
    set -euo pipefail
    set -a
    . "{{ noble_repo_root }}/.env"
    set +a
    if [ -z "${NOBLE_VELERO_AWS_ACCESS_KEY_ID:-}" ] || [ -z "${NOBLE_VELERO_AWS_SECRET_ACCESS_KEY:-}" ]; then
      echo SKIP
      exit 0
    fi
    CLOUD="$(printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
      "${NOBLE_VELERO_AWS_ACCESS_KEY_ID}" "${NOBLE_VELERO_AWS_SECRET_ACCESS_KEY}")"
    kubectl -n velero create secret generic velero-cloud-credentials \
      --from-literal=cloud="${CLOUD}" \
      --dry-run=client -o yaml | kubectl apply -f -
    echo APPLIED
  args:
    executable: /bin/bash  # "set -o pipefail" is a bashism; the default /bin/sh may reject it
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  when: noble_deploy_env_file.stat.exists | default(false)
  no_log: true
  register: noble_velero_secret_from_env
  changed_when: "'APPLIED' in (noble_velero_secret_from_env.stdout | default(''))"
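The `printf` in the task above composes the INI-style `cloud` file that ends up in the `velero-cloud-credentials` Secret. A sketch with placeholder keys showing the exact bytes produced (the key values here are illustrative, not real credentials):

```shell
# Placeholder credentials — real values come from .env.
CLOUD="$(printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
  AKIAEXAMPLE wJalrExampleSecret)"
printf '%s' "$CLOUD"
# First line is "[default]", then one key=value per line.
```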
@@ -1,85 +0,0 @@
---
# Velero — S3 backup target + built-in CSI snapshots (Longhorn: label VolumeSnapshotClass per README).
- name: Apply velero namespace
  ansible.builtin.command:
    argv:
      - kubectl
      - apply
      - -f
      - "{{ noble_repo_root }}/clusters/noble/bootstrap/velero/namespace.yaml"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  when: noble_velero_install | default(false) | bool
  changed_when: true

- name: Include Velero settings from repository .env (S3 bucket, URL, credentials)
  ansible.builtin.include_tasks: from_env.yml
  when: noble_velero_install | default(false) | bool

- name: Require S3 bucket and endpoint for Velero
  ansible.builtin.assert:
    that:
      - noble_velero_s3_bucket | default('') | length > 0
      - noble_velero_s3_url | default('') | length > 0
    fail_msg: >-
      Set NOBLE_VELERO_S3_BUCKET and NOBLE_VELERO_S3_URL in .env, or noble_velero_s3_bucket / noble_velero_s3_url
      (e.g. -e ...), or group_vars when noble_velero_install is true.
  when: noble_velero_install | default(false) | bool

- name: Create velero-cloud-credentials from Ansible vars
  ansible.builtin.shell: |
    set -euo pipefail
    CLOUD="$(printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
      "${AWS_ACCESS_KEY_ID}" "${AWS_SECRET_ACCESS_KEY}")"
    kubectl -n velero create secret generic velero-cloud-credentials \
      --from-literal=cloud="${CLOUD}" \
      --dry-run=client -o yaml | kubectl apply -f -
  args:
    executable: /bin/bash  # "set -o pipefail" is a bashism; the default /bin/sh may reject it
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
    AWS_ACCESS_KEY_ID: "{{ noble_velero_aws_access_key_id }}"
    AWS_SECRET_ACCESS_KEY: "{{ noble_velero_aws_secret_access_key }}"
  when:
    - noble_velero_install | default(false) | bool
    - noble_velero_aws_access_key_id | default('') | length > 0
    - noble_velero_aws_secret_access_key | default('') | length > 0
  no_log: true
  changed_when: true

- name: Check velero-cloud-credentials Secret
  ansible.builtin.command:
    argv:
      - kubectl
      - -n
      - velero
      - get
      - secret
      - velero-cloud-credentials
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  register: noble_velero_secret_check
  failed_when: false
  changed_when: false
  when: noble_velero_install | default(false) | bool

- name: Require velero-cloud-credentials before Helm
  ansible.builtin.assert:
    that:
      - noble_velero_secret_check.rc == 0
    fail_msg: >-
      Velero needs Secret velero/velero-cloud-credentials (key cloud). Set NOBLE_VELERO_AWS_ACCESS_KEY_ID and
      NOBLE_VELERO_AWS_SECRET_ACCESS_KEY in .env, or noble_velero_aws_* extra-vars, or create the Secret manually
      (see clusters/noble/bootstrap/velero/README.md).
  when: noble_velero_install | default(false) | bool

- name: Optional object prefix argv for Helm
  ansible.builtin.set_fact:
    noble_velero_helm_prefix_argv: "{{ ['--set-string', 'configuration.backupStorageLocation[0].prefix=' ~ (noble_velero_s3_prefix | default(''))] if (noble_velero_s3_prefix | default('') | length > 0) else [] }}"
  when: noble_velero_install | default(false) | bool

- name: Install Velero
  ansible.builtin.command:
    argv: "{{ ['helm', 'upgrade', '--install', 'velero', 'vmware-tanzu/velero', '--namespace', 'velero', '--version', noble_velero_chart_version, '-f', noble_repo_root ~ '/clusters/noble/bootstrap/velero/values.yaml', '--set-string', 'configuration.backupStorageLocation[0].bucket=' ~ noble_velero_s3_bucket, '--set-string', 'configuration.backupStorageLocation[0].config.s3Url=' ~ noble_velero_s3_url, '--set-string', 'configuration.backupStorageLocation[0].config.region=' ~ noble_velero_s3_region, '--set-string', 'configuration.backupStorageLocation[0].config.s3ForcePathStyle=' ~ noble_velero_s3_force_path_style] + (noble_velero_helm_prefix_argv | default([])) + ['--wait'] }}"
  environment:
    KUBECONFIG: "{{ noble_kubeconfig }}"
  when: noble_velero_install | default(false) | bool
  changed_when: true
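The Install Velero task builds its argv as a Jinja list so the optional prefix flag can be appended conditionally. An illustrative shell version of the same assembly using `set --` (flag values and counts are for the sketch only):

```shell
bucket=backups url=https://s3.example.lan prefix=""
# Base argv, mirroring the Jinja list (values are illustrative).
set -- helm upgrade --install velero vmware-tanzu/velero \
  --set-string "configuration.backupStorageLocation[0].bucket=$bucket" \
  --set-string "configuration.backupStorageLocation[0].config.s3Url=$url"
# Append the prefix flag only when a prefix is configured, then --wait last.
if [ -n "$prefix" ]; then
  set -- "$@" --set-string "configuration.backupStorageLocation[0].prefix=$prefix"
fi
set -- "$@" --wait
echo "$#"   # → 10 (no prefix flag added when $prefix is empty)
```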
@@ -1,14 +0,0 @@
---
proxmox_repo_debian_codename: "{{ ansible_facts['distribution_release'] | default('bookworm') }}"
proxmox_repo_disable_enterprise: true
proxmox_repo_disable_ceph_enterprise: true
proxmox_repo_enable_pve_no_subscription: true
proxmox_repo_enable_ceph_no_subscription: false

proxmox_no_subscription_notice_disable: true
proxmox_widget_toolkit_file: /usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js

# Bootstrap root SSH keys from the control machine so subsequent runs can use key auth.
proxmox_root_authorized_key_files:
  - "{{ lookup('env', 'HOME') }}/.ssh/id_ed25519.pub"
  - "{{ lookup('env', 'HOME') }}/.ssh/ansible.pub"
@@ -1,5 +0,0 @@
---
- name: Restart pveproxy
  ansible.builtin.service:
    name: pveproxy
    state: restarted
@@ -1,100 +0,0 @@
---
- name: Check configured local public key files
  ansible.builtin.stat:
    path: "{{ item }}"
  register: proxmox_root_pubkey_stats
  loop: "{{ proxmox_root_authorized_key_files }}"
  delegate_to: localhost
  become: false

- name: Fail when a configured local public key file is missing
  ansible.builtin.fail:
    msg: "Configured key file does not exist on the control host: {{ item.item }}"
  when: not item.stat.exists
  loop: "{{ proxmox_root_pubkey_stats.results }}"
  delegate_to: localhost
  become: false

- name: Ensure root authorized_keys contains configured public keys
  ansible.posix.authorized_key:
    user: root
    state: present
    key: "{{ lookup('ansible.builtin.file', item) }}"
    manage_dir: true
  loop: "{{ proxmox_root_authorized_key_files }}"

- name: Remove enterprise repository lines from /etc/apt/sources.list
  ansible.builtin.lineinfile:
    path: /etc/apt/sources.list
    regexp: ".*enterprise\\.proxmox\\.com.*"
    state: absent
  when:
    - proxmox_repo_disable_enterprise | bool or proxmox_repo_disable_ceph_enterprise | bool
  failed_when: false

- name: Find apt source files that contain Proxmox enterprise repositories
  ansible.builtin.find:
    paths: /etc/apt/sources.list.d
    file_type: file
    patterns:
      - "*.list"
      - "*.sources"
    contains: "enterprise\\.proxmox\\.com"
    use_regex: false  # patterns are shell globs; "contains" is a regex regardless of this flag
  register: proxmox_enterprise_repo_files
  when:
    - proxmox_repo_disable_enterprise | bool or proxmox_repo_disable_ceph_enterprise | bool

- name: Remove enterprise repository lines from apt source files
  ansible.builtin.lineinfile:
    path: "{{ item.path }}"
    regexp: ".*enterprise\\.proxmox\\.com.*"
    state: absent
  loop: "{{ proxmox_enterprise_repo_files.files | default([]) }}"
  when:
    - proxmox_repo_disable_enterprise | bool or proxmox_repo_disable_ceph_enterprise | bool

- name: Find apt source files that already contain pve-no-subscription
  ansible.builtin.find:
    paths: /etc/apt/sources.list.d
    file_type: file
    patterns:
      - "*.list"
      - "*.sources"
    contains: "pve-no-subscription"
    use_regex: false
  register: proxmox_no_sub_repo_files
  when: proxmox_repo_enable_pve_no_subscription | bool

- name: Ensure Proxmox no-subscription repository is configured when absent
  ansible.builtin.copy:
    dest: /etc/apt/sources.list.d/pve-no-subscription.list
    content: "deb http://download.proxmox.com/debian/pve {{ proxmox_repo_debian_codename }} pve-no-subscription\n"
    mode: "0644"
  when:
    - proxmox_repo_enable_pve_no_subscription | bool
    - (proxmox_no_sub_repo_files.matched | default(0) | int) == 0

- name: Remove duplicate pve-no-subscription.list when another source already provides it
  ansible.builtin.file:
    path: /etc/apt/sources.list.d/pve-no-subscription.list
    state: absent
  when:
    - proxmox_repo_enable_pve_no_subscription | bool
    - (proxmox_no_sub_repo_files.files | default([]) | map(attribute='path') | list | select('ne', '/etc/apt/sources.list.d/pve-no-subscription.list') | list | length) > 0

- name: Ensure Ceph no-subscription repository is configured
  ansible.builtin.copy:
    dest: /etc/apt/sources.list.d/ceph-no-subscription.list
    content: "deb http://download.proxmox.com/debian/ceph-{{ proxmox_repo_debian_codename }} {{ proxmox_repo_debian_codename }} no-subscription\n"
    mode: "0644"
  when: proxmox_repo_enable_ceph_no_subscription | bool

- name: Disable no-subscription pop-up in Proxmox UI
  ansible.builtin.replace:
    path: "{{ proxmox_widget_toolkit_file }}"
    regexp: "if \\(data\\.status !== 'Active'\\)"
    replace: "if (false)"
    backup: true
  when: proxmox_no_subscription_notice_disable | bool
  notify: Restart pveproxy
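The pop-up task above rewrites one condition in the widget-toolkit JavaScript. A rough `sed` equivalent against an illustrative stand-in file (not the real `proxmoxlib.js`, and `showNag()` is a made-up function name for the sketch):

```shell
# Illustrative file standing in for proxmoxlib.js.
f=$(mktemp)
printf "if (data.status !== 'Active') { showNag(); }\n" > "$f"
# Same substitution the replace task performs: force the check to false.
sed -i "s/if (data\.status !== 'Active')/if (false)/" "$f"
cat "$f"   # → if (false) { showNag(); }
rm -f "$f"
```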
@@ -1,7 +0,0 @@
---
proxmox_cluster_enabled: true
proxmox_cluster_name: pve-cluster
proxmox_cluster_master: ""
proxmox_cluster_master_ip: ""
proxmox_cluster_force: false
proxmox_cluster_master_root_password: ""
@@ -1,63 +0,0 @@
|
||||
---
|
||||
- name: Skip cluster role when disabled
|
||||
ansible.builtin.meta: end_host
|
||||
when: not (proxmox_cluster_enabled | bool)
|
||||
|
||||
- name: Check whether corosync cluster config exists
|
||||
ansible.builtin.stat:
|
||||
path: /etc/pve/corosync.conf
|
||||
register: proxmox_cluster_corosync_conf
|
||||
|
||||
- name: Set effective Proxmox cluster master
  ansible.builtin.set_fact:
    proxmox_cluster_master_effective: "{{ proxmox_cluster_master | default(groups['proxmox_hosts'][0], true) }}"

- name: Set effective Proxmox cluster master IP
  ansible.builtin.set_fact:
    proxmox_cluster_master_ip_effective: >-
      {{
        proxmox_cluster_master_ip
        | default(hostvars[proxmox_cluster_master_effective].ansible_host
        | default(proxmox_cluster_master_effective), true)
      }}

- name: Create cluster on designated master
  ansible.builtin.command:
    cmd: "pvecm create {{ proxmox_cluster_name }}"
  when:
    - inventory_hostname == proxmox_cluster_master_effective
    - not proxmox_cluster_corosync_conf.stat.exists

- name: Ensure python3-pexpect is installed for password-based cluster join
  ansible.builtin.apt:
    name: python3-pexpect
    state: present
    update_cache: true
  when:
    - inventory_hostname != proxmox_cluster_master_effective
    - not proxmox_cluster_corosync_conf.stat.exists
    - proxmox_cluster_master_root_password | length > 0

- name: Join node to existing cluster (password provided)
  ansible.builtin.expect:
    command: >-
      pvecm add {{ proxmox_cluster_master_ip_effective }}
      {% if proxmox_cluster_force | bool %}--force{% endif %}
    responses:
      "Please enter superuser \\(root\\) password for '.*':": "{{ proxmox_cluster_master_root_password }}"
      "password:": "{{ proxmox_cluster_master_root_password }}"
  no_log: true
  when:
    - inventory_hostname != proxmox_cluster_master_effective
    - not proxmox_cluster_corosync_conf.stat.exists
    - proxmox_cluster_master_root_password | length > 0

- name: Join node to existing cluster (SSH trust/no prompt)
  ansible.builtin.command:
    cmd: >-
      pvecm add {{ proxmox_cluster_master_ip_effective }}
      {% if proxmox_cluster_force | bool %}--force{% endif %}
  when:
    - inventory_hostname != proxmox_cluster_master_effective
    - not proxmox_cluster_corosync_conf.stat.exists
    - proxmox_cluster_master_root_password | length == 0
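A follow-up check one might add after the join tasks (a sketch, not part of the role; it assumes `pvecm status` prints a `Quorate: Yes` line once corosync settles):

```yaml
# Sketch only — verifies the node reached quorum after pvecm add.
- name: Verify the node reached quorum after joining
  ansible.builtin.command:
    cmd: pvecm status
  register: pvecm_status
  changed_when: false
  failed_when: pvecm_status.stdout is not search('Quorate:\s+Yes')
```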
@@ -1,6 +0,0 @@
---
proxmox_upgrade_apt_cache_valid_time: 3600
proxmox_upgrade_autoremove: true
proxmox_upgrade_autoclean: true
proxmox_upgrade_reboot_if_required: true
proxmox_upgrade_reboot_timeout: 1800
@@ -1,30 +0,0 @@
---
- name: Refresh apt cache
  ansible.builtin.apt:
    update_cache: true
    cache_valid_time: "{{ proxmox_upgrade_apt_cache_valid_time }}"

- name: Upgrade Proxmox host packages
  ansible.builtin.apt:
    upgrade: dist

- name: Remove orphaned packages
  ansible.builtin.apt:
    autoremove: "{{ proxmox_upgrade_autoremove }}"

- name: Clean apt package cache
  ansible.builtin.apt:
    autoclean: "{{ proxmox_upgrade_autoclean }}"

- name: Check if reboot is required
  ansible.builtin.stat:
    path: /var/run/reboot-required
  register: proxmox_reboot_required_file

- name: Reboot when required by package upgrades
  ansible.builtin.reboot:
    reboot_timeout: "{{ proxmox_upgrade_reboot_timeout }}"
    msg: "Reboot initiated by Ansible Proxmox maintenance playbook"
  when:
    - proxmox_upgrade_reboot_if_required | bool
    - proxmox_reboot_required_file.stat.exists | default(false)
26 ansible/roles/proxmox_vm/defaults/main.yml Normal file
@@ -0,0 +1,26 @@
---
# Defaults for the proxmox_vm role.

# Action to perform: create_template, create_vm, delete_vm, backup_vm
proxmox_action: create_vm

# Common settings
storage_pool: local-lvm
vmid: 9000

# Template-creation settings
template_name: ubuntu-cloud-template
image_url: "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
image_name: "ubuntu-22.04-server-cloudimg-amd64.img"
memory: 2048
cores: 2

# VM-creation settings (cloning)
new_vm_name: new-vm
target_node: "{{ inventory_hostname }}"  # for cloning, usually the same node
clone_full: true  # full clone (independent) vs. linked clone

# Backup settings
backup_mode: snapshot  # snapshot, suspend, or stop
backup_compress: zstd
backup_storage: local
7 ansible/roles/proxmox_vm/tasks/backup_vm.yml Normal file
@@ -0,0 +1,7 @@
---
- name: Create VM backup with vzdump
  ansible.builtin.command: >
    vzdump {{ vmid }}
    --mode {{ backup_mode }}
    --compress {{ backup_compress }}
    --storage {{ backup_storage }}
58 ansible/roles/proxmox_vm/tasks/create_template.yml Normal file
@@ -0,0 +1,58 @@
---
- name: Check if template already exists
  ansible.builtin.command: "qm status {{ vmid }}"
  register: vm_status
  failed_when: false
  changed_when: false

- name: Fail if template ID exists
  ansible.builtin.fail:
    msg: "VM ID {{ vmid }} already exists. Choose a different ID or delete the existing VM."
  when: vm_status.rc == 0

- name: Download cloud image
  ansible.builtin.get_url:
    url: "{{ image_url }}"
    dest: "/tmp/{{ image_name }}"
    mode: '0644'

- name: Install libguestfs-tools
  ansible.builtin.apt:
    name: libguestfs-tools
    state: present
  ignore_errors: true

- name: Create VM with hardware config
  ansible.builtin.command: >
    qm create {{ vmid }}
    --name "{{ template_name }}"
    --memory {{ memory }}
    --cores {{ cores }}
    --net0 virtio,bridge=vmbr0
    --scsihw virtio-scsi-pci
    --ostype l26
    --serial0 socket --vga serial0

- name: Import disk
  ansible.builtin.command: "qm importdisk {{ vmid }} /tmp/{{ image_name }} {{ storage_pool }}"

- name: Attach disk to SCSI
  ansible.builtin.command: "qm set {{ vmid }} --scsi0 {{ storage_pool }}:vm-{{ vmid }}-disk-0"

- name: Add cloud-init drive
  ansible.builtin.command: "qm set {{ vmid }} --ide2 {{ storage_pool }}:cloudinit"

- name: Set boot order
  ansible.builtin.command: "qm set {{ vmid }} --boot c --bootdisk scsi0"

- name: Resize disk (default 10G)
  ansible.builtin.command: "qm resize {{ vmid }} scsi0 10G"
  ignore_errors: true

- name: Convert to template
  ansible.builtin.command: "qm template {{ vmid }}"

- name: Remove downloaded image
  ansible.builtin.file:
    path: "/tmp/{{ image_name }}"
    state: absent
11 ansible/roles/proxmox_vm/tasks/create_vm.yml Normal file
@@ -0,0 +1,11 @@
---
- name: Clone VM from template
  ansible.builtin.command: >
    qm clone {{ vmid }} {{ new_vmid }}
    --name "{{ new_vm_name }}"
    --full {{ 1 if clone_full | bool else 0 }}
  register: clone_result

- name: Start VM (optional)
  ansible.builtin.command: "qm start {{ new_vmid }}"
  when: start_after_create | default(false) | bool
7 ansible/roles/proxmox_vm/tasks/delete_vm.yml Normal file
@@ -0,0 +1,7 @@
---
- name: Stop VM (force stop)
  ansible.builtin.command: "qm stop {{ vmid }}"
  ignore_errors: true

- name: Destroy VM
  ansible.builtin.command: "qm destroy {{ vmid }} --purge"
3 ansible/roles/proxmox_vm/tasks/main.yml Normal file
@@ -0,0 +1,3 @@
---
- name: Dispatch task based on action
  ansible.builtin.include_tasks: "{{ proxmox_action }}.yml"
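The dispatch pattern above means the role is driven entirely by `proxmox_action`. A minimal wrapper playbook sketch (the playbook itself and the `proxmox_hosts` group are assumptions, not files in this diff):

```yaml
# Hypothetical wrapper playbook — runs one proxmox_vm action per invocation.
- hosts: proxmox_hosts
  become: true
  roles:
    - role: proxmox_vm
      vars:
        proxmox_action: create_template  # create_template | create_vm | delete_vm | backup_vm
        vmid: 9000
```

The same playbook can then back up a VM ad hoc, e.g. `ansible-playbook <playbook>.yml -e proxmox_action=backup_vm -e vmid=101`.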
@@ -1,3 +0,0 @@
---
# Set **true** to run `talhelper genconfig -o out` under **talos/** (requires talhelper + talconfig).
noble_talos_genconfig: false
@@ -1,36 +0,0 @@
---
- name: Generate Talos machine configs (talhelper genconfig)
  when: noble_talos_genconfig | bool
  block:
    - name: Validate talconfig
      ansible.builtin.command:
        argv:
          - talhelper
          - validate
          - talconfig
          - talconfig.yaml
      args:
        chdir: "{{ noble_repo_root }}/talos"
      changed_when: false

    - name: Generate Talos configs (out/)
      ansible.builtin.command:
        argv:
          - talhelper
          - genconfig
          - -o
          - out
      args:
        chdir: "{{ noble_repo_root }}/talos"
      changed_when: true

    - name: Post genconfig — next steps
      ansible.builtin.debug:
        msg: >-
          Configs are in talos/out/. Apply to nodes, bootstrap, and fetch the kubeconfig per talos/README.md
          before running playbooks/noble.yml.

- name: Skip when noble_talos_genconfig is false
  ansible.builtin.debug:
    msg: "No-op: pass -e noble_talos_genconfig=true to run talhelper genconfig."
  when: not (noble_talos_genconfig | bool)
@@ -1,38 +0,0 @@
---
# **noble_repo_root** and **noble_talos_dir** are set by **playbooks/talos_phase_a.yml** (repo root and **talos/**).

# Run **talhelper genconfig -o out** before apply (needs talhelper + talsecret per talos/README.md §1).
noble_talos_genconfig: true

# **auto** — probe nodes (maintenance vs joined TLS); **insecure** — always **--insecure**; **secure** — always **TALOSCONFIG** (Phase A already done / talos/README §2 B).
noble_talos_apply_mode: auto

# Skip if the cluster is already bootstrapped (re-run the playbook safely).
noble_talos_skip_bootstrap: false

# After **apply-config**, nodes often reboot — wait for Talos **apid** (:50000) before **bootstrap** / **kubeconfig**.
noble_talos_wait_for_apid: true
noble_talos_apid_wait_delay: 20
noble_talos_apid_wait_timeout: 900

# **talosctl bootstrap -n** — first control plane (neon).
noble_talos_bootstrap_node_ip: "192.168.50.20"

# **talosctl kubeconfig -n** (node that answers Talos/K8s for cert fetch).
noble_talos_kubeconfig_node: "192.168.50.20"

# **talosctl kubeconfig -e** — Talos endpoint (node IP before the VIP is reachable; VIP once LAN routing works).
noble_talos_kubeconfig_endpoint: "192.168.50.20"

# After kubeconfig, patch the **kubectl** server if the VIP in the file is unreachable (**group_vars** / same as noble.yml).
# noble_k8s_api_server_override: ""

# Must match **cluster.name** / the kubeconfig cluster entry (often **noble**).
noble_talos_kubectl_cluster_name: noble

# Inventory: IP + filename under **talos/out/** — align with **talos/talconfig.yaml**.
noble_talos_nodes:
  - { ip: "192.168.50.20", machine: "noble-neon.yaml" }
  - { ip: "192.168.50.30", machine: "noble-argon.yaml" }
  - { ip: "192.168.50.40", machine: "noble-krypton.yaml" }
  - { ip: "192.168.50.10", machine: "noble-helium.yaml" }
@@ -1,209 +0,0 @@
---
# Order matches talos/README.md: genconfig → apply all nodes → bootstrap → kubeconfig.

- name: Validate talconfig and generate **out/** (talhelper genconfig)
  when: noble_talos_genconfig | bool
  block:
    - name: talhelper validate
      ansible.builtin.command:
        argv:
          - talhelper
          - validate
          - talconfig
          - talconfig.yaml
      args:
        chdir: "{{ noble_talos_dir }}"
      changed_when: false

    - name: talhelper genconfig -o out
      ansible.builtin.command:
        argv:
          - talhelper
          - genconfig
          - -o
          - out
      args:
        chdir: "{{ noble_talos_dir }}"
      changed_when: true

- name: Stat talos/out/talosconfig
  ansible.builtin.stat:
    path: "{{ noble_talos_dir }}/out/talosconfig"
  register: noble_talos_talosconfig

- name: Require talos/out/talosconfig
  ansible.builtin.assert:
    that:
      - noble_talos_talosconfig.stat.exists | default(false)
    fail_msg: >-
      Missing {{ noble_talos_dir }}/out/talosconfig. Run **talhelper genconfig -o out** in **talos/** (talsecret per talos/README.md §1),
      or set **noble_talos_genconfig=true** on this playbook.

# Maintenance API (**--insecure**) vs joined cluster (**tls: certificate required**) — talos/README §2 A vs B.
- name: Set apply path from noble_talos_apply_mode (manual)
  ansible.builtin.set_fact:
    noble_talos_apply_insecure: "{{ noble_talos_apply_mode == 'insecure' }}"
  when: noble_talos_apply_mode | default('auto') in ['insecure', 'secure']

- name: Probe Talos API — apply-config dry-run (insecure / maintenance)
  ansible.builtin.command:
    argv:
      - talosctl
      - apply-config
      - --insecure
      - -n
      - "{{ noble_talos_nodes[0].ip }}"
      - -f
      - "{{ noble_talos_dir }}/out/{{ noble_talos_nodes[0].machine }}"
      - --dry-run
  register: noble_talos_probe_insecure
  failed_when: false
  changed_when: false
  when: noble_talos_apply_mode | default('auto') == 'auto'

- name: Probe Talos API — apply-config dry-run (TLS / joined)
  ansible.builtin.command:
    argv:
      - talosctl
      - apply-config
      - -n
      - "{{ noble_talos_nodes[0].ip }}"
      - -f
      - "{{ noble_talos_dir }}/out/{{ noble_talos_nodes[0].machine }}"
      - --dry-run
  environment:
    TALOSCONFIG: "{{ noble_talos_dir }}/out/talosconfig"
  register: noble_talos_probe_secure
  failed_when: false
  changed_when: false
  when:
    - noble_talos_apply_mode | default('auto') == 'auto'
    - noble_talos_probe_insecure.rc != 0

- name: Resolve apply mode — maintenance (insecure)
  ansible.builtin.set_fact:
    noble_talos_apply_insecure: true
  when:
    - noble_talos_apply_mode | default('auto') == 'auto'
    - noble_talos_probe_insecure.rc == 0

- name: Resolve apply mode — joined (TALOSCONFIG, no insecure)
  ansible.builtin.set_fact:
    noble_talos_apply_insecure: false
  when:
    - noble_talos_apply_mode | default('auto') == 'auto'
    - noble_talos_probe_insecure.rc != 0
    - noble_talos_probe_secure.rc == 0

- name: Fail when Talos API mode cannot be determined
  ansible.builtin.fail:
    msg: >-
      Cannot run **talosctl apply-config --dry-run** on {{ noble_talos_nodes[0].ip }}.
      Insecure: rc={{ noble_talos_probe_insecure.rc }} {{ noble_talos_probe_insecure.stderr | default('') }}.
      TLS: rc={{ noble_talos_probe_secure.rc | default('n/a') }} {{ noble_talos_probe_secure.stderr | default('') }}.
      Check LAN to :50000, node power, and that **out/talosconfig** matches these nodes.
      Override: **-e noble_talos_apply_mode=secure** (joined) or **insecure** (maintenance ISO).
  when:
    - noble_talos_apply_mode | default('auto') == 'auto'
    - noble_talos_probe_insecure.rc != 0
    - noble_talos_probe_secure is not defined or noble_talos_probe_secure.rc != 0

- name: Show resolved Talos apply-config mode
  ansible.builtin.debug:
    msg: >-
      apply-config: {{ 'maintenance (--insecure)' if noble_talos_apply_insecure | bool else 'joined (TALOSCONFIG)' }}
      (noble_talos_apply_mode={{ noble_talos_apply_mode | default('auto') }})

- name: Apply machine config to each node (first install — insecure)
  ansible.builtin.command:
    argv:
      - talosctl
      - apply-config
      - --insecure
      - -n
      - "{{ item.ip }}"
      - --file
      - "{{ noble_talos_dir }}/out/{{ item.machine }}"
  loop: "{{ noble_talos_nodes }}"
  loop_control:
    label: "{{ item.ip }}"
  when: noble_talos_apply_insecure | bool
  changed_when: true

- name: Apply machine config to each node (cluster already has TLS — no insecure)
  ansible.builtin.command:
    argv:
      - talosctl
      - apply-config
      - -n
      - "{{ item.ip }}"
      - --file
      - "{{ noble_talos_dir }}/out/{{ item.machine }}"
  environment:
    TALOSCONFIG: "{{ noble_talos_dir }}/out/talosconfig"
  loop: "{{ noble_talos_nodes }}"
  loop_control:
    label: "{{ item.ip }}"
  when: not (noble_talos_apply_insecure | bool)
  changed_when: true

# apply-config triggers reboots; apid on :50000 must accept connections before talosctl bootstrap / kubeconfig.
- name: Wait for Talos machine API (apid) on bootstrap node
  ansible.builtin.wait_for:
    host: "{{ noble_talos_bootstrap_node_ip }}"
    port: 50000
    delay: "{{ noble_talos_apid_wait_delay | int }}"
    timeout: "{{ noble_talos_apid_wait_timeout | int }}"
    state: started
  when: noble_talos_wait_for_apid | default(true) | bool

- name: Bootstrap cluster (once per cluster)
  ansible.builtin.command:
    argv:
      - talosctl
      - bootstrap
      - -n
      - "{{ noble_talos_bootstrap_node_ip }}"
  environment:
    TALOSCONFIG: "{{ noble_talos_dir }}/out/talosconfig"
  register: noble_talos_bootstrap_cmd
  when: not (noble_talos_skip_bootstrap | bool)
  changed_when: noble_talos_bootstrap_cmd.rc == 0
  failed_when: >-
    noble_talos_bootstrap_cmd.rc != 0 and
    ('etcd data directory is not empty' not in (noble_talos_bootstrap_cmd.stderr | default('')))

- name: Write Kubernetes admin kubeconfig
  ansible.builtin.command:
    argv:
      - talosctl
      - kubeconfig
      - "{{ noble_talos_kubeconfig_out }}"
      - --force
      - -n
      - "{{ noble_talos_kubeconfig_node }}"
      - -e
      - "{{ noble_talos_kubeconfig_endpoint }}"
      - --merge=false
  environment:
    TALOSCONFIG: "{{ noble_talos_dir }}/out/talosconfig"
  changed_when: true

- name: Optional — set kubectl cluster server to reachable API (VIP unreachable from this host)
  ansible.builtin.command:
    argv:
      - kubectl
      - config
      - set-cluster
      - "{{ noble_talos_kubectl_cluster_name }}"
      - --server={{ noble_k8s_api_server_override }}
      - --kubeconfig={{ noble_talos_kubeconfig_out }}
  when: noble_k8s_api_server_override | default('') | length > 0
  changed_when: true

- name: Next — platform stack
  ansible.builtin.debug:
    msg: >-
      Kubeconfig written to {{ noble_talos_kubeconfig_out }}.
      Export KUBECONFIG={{ noble_talos_kubeconfig_out }} and run: ansible-playbook playbooks/noble.yml
      (or: ansible-playbook playbooks/deploy.yml for the full pipeline).
Binary file not shown.
Before Width: | Height: | Size: 277 KiB
@@ -1,7 +0,0 @@
# Argo CD — optional applications (non-bootstrap)

**Base cluster configuration** (CNI, MetalLB, ingress, cert-manager, storage, observability stack, policy, SOPS secrets path, etc.) is installed by **`ansible/playbooks/noble.yml`** from **`clusters/noble/bootstrap/`** — not from here.

**`noble-root`** (`clusters/noble/bootstrap/argocd/root-application.yaml`) points at **`clusters/noble/apps`**. Add **`Application`** manifests (and optional **`AppProject`** definitions) under this directory only for workloads that are additive and do not subsume the core platform.

Bootstrap kustomize (namespaces, static YAML, leaf **`Application`**s) lives in **`clusters/noble/bootstrap/`** and is tracked by **`noble-bootstrap-root`** — enable automated sync for that app only after **`noble.yml`** completes (**`clusters/noble/bootstrap/argocd/README.md`** §5). Put Helm **`Application`** migrations under **`clusters/noble/bootstrap/argocd/app-of-apps/`**.
@@ -1,32 +0,0 @@
# Argo CD — optional [Homepage](https://gethomepage.dev/) dashboard (Helm from [jameswynn.github.io/helm-charts](https://jameswynn.github.io/helm-charts/)).
# Values: **`./values.yaml`** (multi-source **`$values`** ref).
#
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: homepage
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io/background
spec:
  project: default
  sources:
    - repoURL: https://jameswynn.github.io/helm-charts
      chart: homepage
      targetRevision: 2.1.0
      helm:
        releaseName: homepage
        valueFiles:
          - $values/clusters/noble/apps/homepage/values.yaml
    - repoURL: https://gitea.pcenicni.ca/gsdavidp/home-server.git
      targetRevision: HEAD
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: homepage
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
@@ -1,122 +0,0 @@
# Homepage — [gethomepage/homepage](https://github.com/gethomepage/homepage) via the [jameswynn/homepage](https://github.com/jameswynn/helm-charts) Helm chart.
# Ingress: Traefik + cert-manager (same pattern as `clusters/noble/bootstrap/headlamp/values.yaml`).
# Service links match **`ansible/roles/noble_landing_urls/defaults/main.yml`** (`noble_lab_ui_entries`).
# **Velero** has no in-cluster web UI — its tile links to the upstream docs (no **siteMonitor**).
#
# **`siteMonitor`** runs **server-side** in the Homepage pod (see `gethomepage/homepage` `siteMonitor.js`).
# Public FQDNs like **`*.apps.noble.lab.pcenicni.dev`** often do **not** resolve inside the cluster
# (split-horizon / LAN DNS only) → `ENOTFOUND` / HTTP **500** in the monitor. Use **in-cluster Service**
# URLs for **`siteMonitor`** only; **`href`** stays the human-facing ingress URL.
#
# The **Prometheus widget** also resolves from the pod — use the real **Service** name (Helm may truncate to
# 63 chars — this repo's generated UI list uses **`kube-prometheus-kube-prome-prometheus`**).
# Verify: `kubectl -n monitoring get svc | grep -E 'prometheus|alertmanager|grafana'`.
#
image:
  repository: ghcr.io/gethomepage/homepage
  tag: v1.2.0

enableRbac: true

serviceAccount:
  create: true

ingress:
  main:
    enabled: true
    ingressClassName: traefik
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    hosts:
      - host: homepage.apps.noble.lab.pcenicni.dev
        paths:
          - path: /
            pathType: Prefix
    tls:
      - hosts:
          - homepage.apps.noble.lab.pcenicni.dev
        secretName: homepage-apps-noble-tls

env:
  - name: HOMEPAGE_ALLOWED_HOSTS
    value: homepage.apps.noble.lab.pcenicni.dev

config:
  bookmarks: []
  services:
    - Noble Lab:
        - Argo CD:
            icon: si-argocd
            href: https://argo.apps.noble.lab.pcenicni.dev
            siteMonitor: http://argocd-server.argocd.svc.cluster.local:80
            description: GitOps UI (sync, apps, repos)
        - Grafana:
            icon: si-grafana
            href: https://grafana.apps.noble.lab.pcenicni.dev
            siteMonitor: http://kube-prometheus-grafana.monitoring.svc.cluster.local:80
            description: Dashboards, Loki explore (logs)
        - Prometheus:
            icon: si-prometheus
            href: https://prometheus.apps.noble.lab.pcenicni.dev
            siteMonitor: http://kube-prometheus-kube-prome-prometheus.monitoring.svc.cluster.local:9090
            description: Prometheus UI (queries, targets) — lab; protect in production
            widget:
              type: prometheus
              url: http://kube-prometheus-kube-prome-prometheus.monitoring.svc.cluster.local:9090
              fields: ["targets_up", "targets_down", "targets_total"]
        - Alertmanager:
            icon: alertmanager.png
            href: https://alertmanager.apps.noble.lab.pcenicni.dev
            siteMonitor: http://kube-prometheus-kube-prome-alertmanager.monitoring.svc.cluster.local:9093
            description: Alertmanager UI (silences, status)
        - Headlamp:
            icon: mdi-kubernetes
            href: https://headlamp.apps.noble.lab.pcenicni.dev
            siteMonitor: http://headlamp.headlamp.svc.cluster.local:80
            description: Kubernetes UI (cluster resources)
        - Longhorn:
            icon: longhorn.png
            href: https://longhorn.apps.noble.lab.pcenicni.dev
            siteMonitor: http://longhorn-frontend.longhorn-system.svc.cluster.local:80
            description: Storage volumes, nodes, backups
        - Velero:
            icon: mdi-backup-restore
            href: https://velero.io/docs/
            description: Cluster backups — no in-cluster web UI; use the velero CLI or kubectl (docs)
  widgets:
    - datetime:
        text_size: xl
        format:
          dateStyle: medium
          timeStyle: short
    - kubernetes:
        cluster:
          show: true
          cpu: true
          memory: true
          showLabel: true
          label: Cluster
        nodes:
          show: true
          cpu: true
          memory: true
          showLabel: true
    - search:
        provider: duckduckgo
        target: _blank
  kubernetes:
    mode: cluster
  settingsString: |
    title: Noble Lab
    description: Homelab services — in-cluster uptime checks, cluster resources, Prometheus targets
    theme: dark
    color: slate
    headerStyle: boxedWidgets
    statusStyle: dot
    iconStyle: theme
    fullWidth: true
    useEqualHeights: true
    layout:
      Noble Lab:
        style: row
        columns: 4
@@ -1,7 +0,0 @@
# Argo CD **noble-root** syncs this directory. Add **Application** / **AppProject** manifests only for
# optional workloads that do not replace the Ansible bootstrap (CNI, ingress, storage, core observability, etc.).
# Helm value files for those apps can live in subdirectories here (for example **./homepage/values.yaml**).
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - homepage/application.yaml
@@ -1,106 +0,0 @@
# Argo CD — noble (bootstrap)

**Prerequisites:** cluster **Ready**, **Traefik** + **cert-manager** installed; DNS **`argo.apps.noble.lab.pcenicni.dev`** → Traefik **`192.168.50.211`** (see **`values.yaml`**).

## 1. Install

```bash
helm repo add argo https://argoproj.github.io/argo-helm
helm repo update
helm upgrade --install argocd argo/argo-cd \
  --namespace argocd \
  --create-namespace \
  --version 9.4.17 \
  -f clusters/noble/bootstrap/argocd/values.yaml \
  --wait
```

**RBAC:** `values.yaml` sets **`policy.default: role:readonly`** and **`g, admin, role:admin`**, so the local **`admin`** user keeps full access while future OIDC users default to read-only until you add **`policy.csv`** mappings.

## 2. UI / CLI address

**HTTPS:** `https://argo.apps.noble.lab.pcenicni.dev` (Ingress via Traefik; cert from **`values.yaml`**).

```bash
kubectl get ingress -n argocd
```

Log in as **`admin`**; initial password:

```bash
kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath='{.data.password}' | base64 -d
echo
```

Change the password in the UI or via `argocd account update-password`.

### TLS: changing ClusterIssuer (e.g. staging → prod)

If **`helm upgrade --wait`** fails with *Secret was previously issued by `letsencrypt-staging`* (or another issuer), cert-manager will not replace the TLS Secret in place. Remove the old cert material once, then upgrade again:

```bash
kubectl -n argocd delete certificate argocd-server --ignore-not-found
kubectl -n argocd delete secret argocd-server-tls --ignore-not-found
helm upgrade --install argocd argo/argo-cd -n argocd --create-namespace \
  --version 9.4.17 -f clusters/noble/bootstrap/argocd/values.yaml --wait
```

## 3. Register this repo (if private)

Use **Settings → Repositories** in the UI, or `argocd repo add` / a `Secret` of type `repository`.
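The declarative form of the repository credential is a labeled `Secret` in the `argocd` namespace; a sketch with placeholder credentials (the Secret name is arbitrary):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: home-server-repo            # arbitrary name
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository
stringData:
  type: git
  url: https://gitea.pcenicni.ca/gsdavidp/home-server.git
  username: <gitea-username>        # placeholder
  password: <gitea-access-token>    # placeholder — keep out of git, or SOPS-encrypt
```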
## 4. App-of-apps (GitOps)

**Ansible** (`ansible/playbooks/noble.yml`) performs the **initial** install: Helm releases and **`kubectl apply -k clusters/noble/bootstrap`**. **Argo** then tracks the same git paths for ongoing reconciliation.

1. Edit **`root-application.yaml`** and **`bootstrap-root-application.yaml`**: set **`repoURL`** and **`targetRevision`**. The **`resources-finalizer.argocd.argoproj.io/background`** finalizer uses Argo's path-qualified form so **`kubectl apply`** does not warn about finalizer names.
2. Optional add-on apps: add **`Application`** manifests under **`clusters/noble/apps/`** (see **`clusters/noble/apps/README.md`**).
3. **Bootstrap kustomize** (namespaces, datasource, leaf **`Application`**s under **`argocd/app-of-apps/`**, etc.): **`noble-bootstrap-root`** syncs **`clusters/noble/bootstrap`**. It is created with **manual** sync only, so Argo does not apply changes while **`noble.yml`** is still running.

**`ansible/playbooks/noble.yml`** (role **`noble_argocd`**) applies both roots when **`noble_argocd_apply_root_application`** / **`noble_argocd_apply_bootstrap_root_application`** are true in **`ansible/group_vars/all.yml`**.

```bash
kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
kubectl apply -f clusters/noble/bootstrap/argocd/bootstrap-root-application.yaml
```

If you migrated from older GitOps **`Application`** names, delete the stale **`Application`** objects on the cluster (see **`clusters/noble/apps/README.md`**), then re-apply the roots.

## 5. After Ansible: enable automated sync for **noble-bootstrap-root**

Do this only after **`ansible-playbook playbooks/noble.yml`** has finished successfully (including the **`noble_platform`** `kubectl apply -k` and any Helm stages you rely on). Until then, leave sync **manual** so Argo does not fight the playbook.

**Required steps**

1. Optionally confirm the cluster matches git for the kustomize output: `kubectl kustomize clusters/noble/bootstrap | kubectl diff -f -`, or inspect resources in the UI.
2. Register the git repo in Argo if you have not already (**§3**).
3. **Refresh** the app so Argo compares **`clusters/noble/bootstrap`** to the cluster: Argo UI → **noble-bootstrap-root** → **Refresh**, or:

   ```bash
   argocd app get noble-bootstrap-root --refresh
   ```

4. **Enable automated sync** (prune + self-heal), preserving **`CreateNamespace`**, using any one of:

   **kubectl**

   ```bash
   kubectl patch application noble-bootstrap-root -n argocd --type merge -p '{"spec":{"syncPolicy":{"automated":{"prune":true,"selfHeal":true},"syncOptions":["CreateNamespace=true"]}}}'
   ```

   **argocd** CLI (logged in)

   ```bash
   argocd app set noble-bootstrap-root --sync-policy automated --auto-prune --self-heal
   ```

   **UI:** open **noble-bootstrap-root** → **App Details** → enable **AUTO-SYNC** (and **Prune** / **Self Heal** if shown).

5. Trigger a sync if the app does not go green immediately: **Sync** in the UI, or `argocd app sync noble-bootstrap-root`.

After this, **git** is the source of truth for everything under **`clusters/noble/bootstrap/kustomization.yaml`** (including **`argocd/app-of-apps/`**). Helm-managed platform components remain whatever Ansible last installed until you model them as Argo **`Application`**s under **`app-of-apps/`** and stop installing them from Ansible.
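Migrating a Helm-managed platform component into `app-of-apps/` follows the same multi-source `Application` shape as the existing leaves. A hedged sketch for Headlamp (the chart repo URL and `targetRevision` are assumptions to verify; the values path points at the file this repo already tracks):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: headlamp
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io/background
spec:
  project: default
  sources:
    - repoURL: https://headlamp-k8s.github.io/headlamp/   # assumed chart repo — verify
      chart: headlamp
      targetRevision: 0.26.0                              # placeholder version
      helm:
        valueFiles:
          - $values/clusters/noble/bootstrap/headlamp/values.yaml
    - repoURL: https://gitea.pcenicni.ca/gsdavidp/home-server.git
      targetRevision: HEAD
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: headlamp
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```

Once the `Application` is synced and healthy, drop the corresponding Helm stage from the Ansible playbook so only one tool owns the release.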

## Versions

Pinned in **`values.yaml`** comments (chart **9.4.17** / Argo CD **v3.3.6** at time of writing). Bump **`--version`** when upgrading.
@@ -1,35 +0,0 @@
# App-of-apps root — apply after Argo CD is running (optional).
#
# 1. Set spec.source.repoURL (and targetRevision — **HEAD** tracks the remote default branch) to this repo.
# 2. kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml
#
# **clusters/noble/apps** holds optional **Application** manifests. Core platform Helm + kustomize is
# installed by **ansible/playbooks/noble.yml** from **clusters/noble/bootstrap/**. **bootstrap-root-application.yaml**
# registers **noble-bootstrap-root** for the same kustomize tree (**manual** sync until you enable
# automation after the playbook — see **README.md** §5).
#
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: noble-root
  namespace: argocd
  # Path suffix satisfies Kubernetes’ domain-qualified finalizer guidance (avoids a kubectl warning).
  # Background cascade: the Application deletes after its resources are removed asynchronously.
  # See: https://argo-cd.readthedocs.io/en/stable/user-guide/app_deletion/#about-the-deletion-finalizer
  finalizers:
    - resources-finalizer.argocd.argoproj.io/background
spec:
  project: default
  source:
    repoURL: https://gitea.pcenicni.ca/gsdavidp/home-server.git
    targetRevision: HEAD
    path: clusters/noble/apps
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
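The root Application syncs whatever `Application` manifests live under **clusters/noble/apps**. As a sketch of what a child there might look like — the `whoami` app name, repo sub-path, and target namespace below are hypothetical, not taken from this repo:

```yaml
# Hypothetical child Application — the root app above would discover and sync
# this from clusters/noble/apps/. Names and paths are illustrative only.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: whoami                           # hypothetical app name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitea.pcenicni.ca/gsdavidp/home-server.git
    targetRevision: HEAD
    path: clusters/noble/apps/whoami     # hypothetical path
  destination:
    server: https://kubernetes.default.svc
    namespace: whoami
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
```

Because the root app has `prune: true`, deleting this file from git would also remove the child Application from the cluster.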
@@ -1,51 +0,0 @@
# Argo CD — noble lab (GitOps)
#
# Chart: argo/argo-cd — pin version on the helm command (e.g. 9.4.17).
# UI/API: **Ingress** via **Traefik** at **argo.apps.noble.lab.pcenicni.dev** (TLS: cert-manager
# ClusterIssuer + **`server.insecure`** so TLS terminates at Traefik).
# DNS: **`argo.apps.noble.lab.pcenicni.dev`** → Traefik LB **192.168.50.211** (same wildcard as apps).
#
# helm repo add argo https://argoproj.github.io/argo-helm
# helm upgrade --install argocd argo/argo-cd -n argocd --create-namespace \
#   --version 9.4.17 -f clusters/noble/bootstrap/argocd/values.yaml --wait
#
# Initial admin password: kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath='{.data.password}' | base64 -d
#
# Optional: kubectl apply -f clusters/noble/bootstrap/argocd/root-application.yaml

global:
  domain: argo.apps.noble.lab.pcenicni.dev

configs:
  params:
    # TLS terminates at Traefik / cert-manager; Argo CD serves HTTP behind the Ingress.
    server.insecure: true

  # RBAC: default authenticated users to read-only; keep local **admin** as full admin.
  # Ref: https://argo-cd.readthedocs.io/en/stable/operator-manual/rbac/
  rbac:
    policy.default: role:readonly
    policy.csv: |
      g, admin, role:admin

server:
  certificate:
    enabled: true
    domain: argo.apps.noble.lab.pcenicni.dev
    # If you change issuer.name, delete the Certificate/Secret once so cert-manager can re-issue (see README.md).
    issuer:
      group: cert-manager.io
      kind: ClusterIssuer
      name: letsencrypt-prod

  ingress:
    enabled: true
    ingressClassName: traefik
    hostname: argo.apps.noble.lab.pcenicni.dev
    tls: true
    # Traefik terminates TLS; Argo serves HTTP/2 cleartext (insecure). Without h2c, UI/API can 404 or fail gRPC.
    annotations:
      traefik.ingress.kubernetes.io/service.serversscheme: h2c

  service:
    type: ClusterIP
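The initial-admin-password one-liner in the header comment just base64-decodes the Secret's `data.password` field. The decode step can be sanity-checked locally without a cluster — the encoded string below is an arbitrary example, not a real credential:

```shell
# kubectl's jsonpath returns the base64-encoded Secret value; base64 -d recovers the plaintext.
echo 'cGFzc3dvcmQ=' | base64 -d
# → password
```

After first login, the usual practice is to change the admin password and delete `argocd-initial-admin-secret`.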
@@ -1,53 +0,0 @@
# cert-manager — noble

**Prerequisites:** **Traefik** (ingress class **`traefik`**), DNS for **`*.apps.noble.lab.pcenicni.dev`** → Traefik LB for app traffic.

**ACME (Let’s Encrypt)** uses **DNS-01** via **Cloudflare** for zone **`pcenicni.dev`**. Create an API token with **Zone → DNS → Edit** and **Zone → Zone → Read** (or use the “Edit zone DNS” template), then:

**Option A — Ansible:** copy **`.env.sample`** to **`.env`** in the repo root, set **`CLOUDFLARE_DNS_API_TOKEN`**, and run **`ansible/playbooks/noble.yml`** (or **`deploy.yml`**). The **cert-manager** role creates **cloudflare-dns-api-token** from `.env` after the chart installs.

**Option B — kubectl:**

```bash
kubectl -n cert-manager create secret generic cloudflare-dns-api-token \
  --from-literal=api-token='YOUR_CLOUDFLARE_API_TOKEN' \
  --dry-run=client -o yaml | kubectl apply -f -
```

Without this Secret, the **`ClusterIssuer`**s will not complete certificate orders.

1. Create the namespace:

   ```bash
   kubectl apply -f clusters/noble/bootstrap/cert-manager/namespace.yaml
   ```

2. Install the chart (CRDs included via `values.yaml`):

   ```bash
   helm repo add jetstack https://charts.jetstack.io
   helm repo update
   helm upgrade --install cert-manager jetstack/cert-manager \
     --namespace cert-manager \
     --version v1.20.0 \
     -f clusters/noble/bootstrap/cert-manager/values.yaml \
     --wait
   ```

3. Optionally edit **`spec.acme.email`** in both ClusterIssuer manifests (default **`certificates@noble.lab.pcenicni.dev`**) — Let’s Encrypt uses this for expiry and account notices. Do **not** use **`example.com`** (ACME rejects it).

4. Apply the ClusterIssuers (staging then prod, or both):

   ```bash
   kubectl apply -k clusters/noble/bootstrap/cert-manager
   ```

5. Confirm:

   ```bash
   kubectl get clusterissuer
   ```

Use **`cert-manager.io/cluster-issuer: letsencrypt-staging`** on Ingresses while testing; switch to **`letsencrypt-prod`** when ready.

**HTTP-01** is not configured: if the hostname is **proxied** (orange cloud) in Cloudflare, Let’s Encrypt may hit Cloudflare’s edge and get a **404** for `/.well-known/acme-challenge/`. DNS-01 avoids that.
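The staging-then-prod annotation switch can be sketched as a minimal Ingress. This is a hedged example: the `whoami` Service name, port, and hostname are hypothetical, chosen only to sit under the existing `*.apps.noble.lab.pcenicni.dev` wildcard:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami                                   # hypothetical app
  annotations:
    # Start with staging while testing; switch to letsencrypt-prod when the chain looks right.
    cert-manager.io/cluster-issuer: letsencrypt-staging
spec:
  ingressClassName: traefik
  rules:
    - host: whoami.apps.noble.lab.pcenicni.dev   # hypothetical hostname under the wildcard
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: whoami                     # hypothetical Service
                port:
                  number: 80
  tls:
    - hosts:
        - whoami.apps.noble.lab.pcenicni.dev
      secretName: whoami-tls                     # cert-manager stores the issued keypair here
```

Moving to prod later means changing the annotation and deleting the old `whoami-tls` Secret so cert-manager re-issues against the new issuer.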
@@ -1,23 +0,0 @@
# Let's Encrypt production — trusted certificates; respect rate limits.
# Prefer a real mailbox for expiry notices; this domain is accepted by LE (edit if needed).
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: certificates@noble.lab.pcenicni.dev
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      # DNS-01 — works when public HTTP to Traefik is wrong (e.g. a hostname proxied through Cloudflare
      # returns 404 for /.well-known/acme-challenge). Requires Secret cloudflare-dns-api-token in cert-manager.
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-dns-api-token
              key: api-token
        selector:
          dnsZones:
            - pcenicni.dev
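Besides the Ingress-annotation route, a certificate can also be requested from this issuer explicitly with a `Certificate` resource. A minimal sketch, assuming a hypothetical `example` host under the same wildcard:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-tls          # hypothetical
  namespace: default
spec:
  secretName: example-tls    # cert-manager writes the issued keypair here
  dnsNames:
    - example.apps.noble.lab.pcenicni.dev   # hypothetical host under the wildcard
  issuerRef:
    group: cert-manager.io
    kind: ClusterIssuer
    name: letsencrypt-prod
```

This is useful when the certificate is consumed by something other than an Ingress (e.g. mounted into a pod).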
@@ -1,21 +0,0 @@
# Let's Encrypt staging — use for tests (untrusted issuer in browsers).
# Prefer a real mailbox for expiry notices; this domain is accepted by LE (edit if needed).
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: certificates@noble.lab.pcenicni.dev
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-dns-api-token
              key: api-token
        selector:
          dnsZones:
            - pcenicni.dev
@@ -1,5 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - clusterissuer-letsencrypt-staging.yaml
  - clusterissuer-letsencrypt-prod.yaml
Some files were not shown because too many files have changed in this diff.