110 lines
20 KiB
Markdown
110 lines
20 KiB
Markdown
# noble_authentik — Authentik + OIDC for the noble stack
|
||
|
||
Installs **Authentik** (Helm `goauthentik/authentik`) as the cluster IdP, **oauth2-proxy** as an **OIDC** client to Authentik for Traefik **ForwardAuth** (Prometheus, Alertmanager, Longhorn UI), and re-applies Helm values so **Argo CD**, **Grafana**, and **Headlamp** use **native OIDC** to Authentik (not HTTP BasicAuth).
|
||
|
||
## Enable
|
||
|
||
1. Copy repository **`.env.sample`** to **`.env`** and set all **required** **`NOBLE_AUTHENTIK_*`** values (see comments there; SMTP keys are optional).
|
||
2. Set **`noble_authentik_install: true`** in **`ansible/inventory/group_vars/all.yml`** (or pass **`-e noble_authentik_install=true`**).
|
||
3. Run **`ansible-playbook playbooks/noble.yml --tags authentik`** (or a full **`noble.yml`**) from **`ansible/`** with a working **`KUBECONFIG`**.
|
||
|
||
`noble_authentik` runs **after** **`noble_platform`** so Grafana / Headlamp / Prometheus exist before SSO Helm upgrades.
|
||
|
||
## Variables
|
||
|
||
See **`defaults/main.yml`**. Hostnames default to **`auth.apps.noble.lab.pcenicni.dev`** and **`oauth2.apps.noble.lab.pcenicni.dev`**. **`noble_authentik_ensure_admin_ui_access`** (default **true**) re-applies **authentik Admins** superuser membership via the worker on each **`--tags authentik`** run so the admin UI keeps working under **2026+** RBAC.
|
||
|
||
### S3 media (avatars, flows, uploads)
|
||
|
||
Authentik stores file-backed data in **S3** (not a shared PVC on **`authentik-worker`**). Set **`NOBLE_AUTHENTIK_MEDIA_S3_BUCKET`** in **`.env`** to a **dedicated** bucket name (do **not** reuse the Velero backup bucket). **`NOBLE_VELERO_S3_URL`**, **`NOBLE_VELERO_AWS_ACCESS_KEY_ID`**, and **`NOBLE_VELERO_AWS_SECRET_ACCESS_KEY`** are reused automatically when the Authentik-specific S3 variables are unset; override with **`NOBLE_AUTHENTIK_S3_URL`** / **`NOBLE_AUTHENTIK_S3_ACCESS_KEY`** / **`NOBLE_AUTHENTIK_S3_SECRET_KEY`** if needed. Optional: **`NOBLE_AUTHENTIK_S3_REGION`** (defaults to **`us-east-1`** in Ansible), **`NOBLE_AUTHENTIK_S3_ADDRESSING_STYLE`** (**`path`** vs **`virtual`** for some gateways). Create the bucket and grant the same credentials **read/write** to that bucket only. For browser uploads and public assets, follow [Authentik — S3 storage](https://docs.goauthentik.io/sys-mgmt/ops/storage-s3/) (CORS and policies). If you previously used a PVC for **`/data`**, sync into the new bucket (for example **`aws s3 sync`** from a volume snapshot or old mount) before relying on S3-only.
|
||
|
||
### Outbound email (SMTP)
|
||
|
||
Optional. Set **`NOBLE_AUTHENTIK_SMTP_HOST`** and **`NOBLE_AUTHENTIK_SMTP_FROM`** in repository **`.env`**; Ansible adds **`AUTHENTIK_EMAIL__HOST`**, **`AUTHENTIK_EMAIL__FROM`**, and related variables to Helm **`global.env`** (see [Authentik configuration — email](https://docs.goauthentik.io/install-config/configuration/#email-settings)). Omit **`NOBLE_AUTHENTIK_SMTP_HOST`** to skip SMTP env vars entirely. Optional overrides: **`NOBLE_AUTHENTIK_SMTP_PORT`** (default **587** in **`defaults/main.yml`**), **`NOBLE_AUTHENTIK_SMTP_USERNAME`**, **`NOBLE_AUTHENTIK_SMTP_PASSWORD`**, **`NOBLE_AUTHENTIK_SMTP_USE_TLS`** / **`USE_SSL`** / **`TIMEOUT`**. Re-run **`ansible-playbook playbooks/noble.yml --tags authentik`** after changes.
|
||
|
||
### Extra public hostname (Pangolin + Newt, same Authentik)
|
||
|
||
To expose the **same** Authentik instance on an **internet-facing** FQDN (while keeping the lab name on Traefik), set **`noble_authentik_ingress_extra_hosts`** in **`ansible/inventory/group_vars/all.yml`** (or **`-e`**) to a list of extra FQDNs, for example **`auth.example.com`**. Re-run **`ansible-playbook playbooks/noble.yml --tags authentik`**. Ansible extends **`server.ingress.hosts`** and **`tls[0].hosts`** so **cert-manager** issues one certificate with SANs for the primary **`noble_authentik_host`** plus those names (DNS must resolve for your issuer — often **Cloudflare** for public names, split horizon for lab).
|
||
|
||
Then in **Pangolin**: link the domain, create an **HTTP** resource for that hostname, and set the **target** to your **Newt** site with **`ip:port`** pointing at the cluster **Traefik** HTTPS entry (same pattern as **`clusters/noble/bootstrap/newt/README.md`** — typically the MetalLB / LAN VIP and **443**). One Newt tunnel can front many hostnames.
|
||
|
||
### Split routing, two Brands, and optional blueprints
|
||
|
||
This role supports a **single Authentik deployment** with **two hostnames** (lab + public) and **different Brands** per **`Host`**, without Authentik’s separate-database **Tenancy** feature (see [Tenancy](https://docs.goauthentik.io/sys-mgmt/tenancy) — alpha / licensing). [Brands](https://docs.goauthentik.io/brands/) choose default **authentication** (and related) **flows** and branding for each FQDN.
|
||
|
||
**Split routing (recommended):**
|
||
|
||
- **Lab / operator URL** — **`noble_authentik_host`** (default **`auth.apps.noble.lab.pcenicni.dev`**): keep DNS **internal-only** (split horizon, VPN, or LAN DNS). Do **not** publish this hostname as a Pangolin HTTP resource toward the internet unless you intentionally want it reachable off-LAN.
|
||
- **Public URL** — entries in **`noble_authentik_ingress_extra_hosts`**: use Pangolin (or another edge) only for these names so casual users never need the lab FQDN.
|
||
|
||
Network isolation is enforced at **DNS and the tunnel**, not inside Authentik. Optionally add firewall / Traefik entrypoint rules for defense in depth.
|
||
|
||
**Two-Brand model:**
|
||
|
||
- **Lab Brand** — domain equals **`noble_authentik_host`**: use a **restricted authentication flow** so only operator groups (defaults: **`noble-admins`**, **`authentik Admins`**) can complete sign-in on that hostname. Your bootstrap / break-glass account must remain in one of those groups (see **`noble_authentik_ensure_admin_ui_access`** and **`configure_authentik.py`** group membership).
|
||
- **Public Brand(s)** — one Brand per FQDN in **`noble_authentik_ingress_extra_hosts`**: use the stock **`default-authentication-flow`** (or replace with your own flow slug via a forked blueprint later). Assign general users to **`noble_authentik_blueprint_public_groups`** (defaults **`noble-public-users`**, **`noble-public-admins`**) for app policies and OAuth claims; **`noble-admins`** / **`noble-editors`** remain for cluster / Argo / Grafana as today.
|
||
|
||
**OAuth note:** Redirect URIs and **`iss`** must stay consistent with the hostname clients use (internal issuer for in-cluster apps vs public issuer is a deliberate choice — avoid mixing both for the same app).
|
||
|
||
**Mounted blueprints (optional):** set **`noble_authentik_blueprints_enabled: true`** in **`group_vars`** (or **`-e`**). On each **`--tags authentik`** run, Ansible renders Jinja templates under **`templates/blueprints/`** into a ConfigMap **`noble_authentik_blueprints_configmap_name`** (default **`authentik-noble-blueprints`**) and sets Helm **`blueprints.configMaps`** so **authentik-worker** loads them from **`/blueprints/mounted/cm-authentik-noble-blueprints/`** (see [Blueprints](https://docs.goauthentik.io/customize/blueprints/)). Files (apply in lexical order):
|
||
|
||
| Key | Purpose |
|
||
| --- | --- |
|
||
| **`10-noble-public-groups.yaml.j2`** | Ensures **`noble_authentik_blueprint_public_groups`** exist. |
|
||
| **`20-noble-lab-operator-authentication-flow.yaml.j2`** | Flow **`noble_authentik_blueprint_lab_flow_slug`** + expression policy **`noble_authentik_blueprint_operator_policy_name`** (allowed groups **`noble_authentik_blueprint_lab_operator_groups`**). |
|
||
| **`30-noble-brands-domain-split.yaml.j2`** | Brand for **`noble_authentik_host`** → lab flow; one Brand per **`noble_authentik_ingress_extra_hosts`** → default authentication. |
|
||
|
||
Tune titles via **`noble_authentik_blueprint_lab_brand_title`** and **`noble_authentik_blueprint_public_brand_title_prefix`**. After the worker applies blueprints, confirm **System → Brands** and **Flows** in the admin UI; fix any **`!Find`** failures if upstream default stage **names** change between Authentik versions.
|
||
|
||
**Confirming blueprints on the cluster:** the Ansible task **Install Authentik (Helm)** uses **`changed_when: true`**, so a **“changed”** line there does **not** prove Helm mutated the release. When **`noble_authentik_blueprints_enabled`** is true, the role asserts the **worker** Deployment has a volumeMount named **`blueprints-cm-<noble_authentik_blueprints_configmap_name>`** (default **`blueprints-cm-authentik-noble-blueprints`**). You can also run:
|
||
|
||
```bash
|
||
kubectl -n authentik get configmap authentik-noble-blueprints -o yaml
|
||
helm get values authentik -n authentik -o yaml | grep -A2 blueprints
|
||
kubectl -n authentik get deploy -l app.kubernetes.io/component=worker -o yaml | grep blueprints-cm
|
||
```
|
||
|
||
Mounted files are applied asynchronously by **authentik-worker**; check **System → Blueprints** (or **Customization → Blueprints** depending on version) for instances sourced from **`/blueprints/mounted/cm-authentik-noble-blueprints/`**, and **`kubectl logs -n authentik deploy/authentik-worker`** if a blueprint shows **Error** / failed apply.
|
||
|
||
### “Secondary tenant” (separate PostgreSQL schema — alpha)
|
||
|
||
Authentik **tenancy** (multiple isolated tenants in one deployment, **`AUTHENTIK_TENANTS__ENABLED`**) is **alpha**, requires **per-tenant Enterprise licensing**, **`AUTHENTIK_TENANTS__API_KEY`**, and **`AUTHENTIK_OUTPOSTS__DISABLE_EMBEDDED_OUTPOST=true`** (embedded outposts are unsupported with tenancy). It is **not** wired in this repo by default. See [Tenancy](https://docs.goauthentik.io/sys-mgmt/tenancy). For most homelabs, **one tenant** plus **`noble_authentik_ingress_extra_hosts`** is the right split.
|
||
|
||
## IdP configuration
|
||
|
||
When **`noble_authentik_configure_idp`** is true, Ansible creates/updates OAuth2 providers and applications for **argocd**, **grafana**, **headlamp**, and **oauth2-proxy** using either the **worker ORM path** (default **`noble_authentik_oidc_provision_via: worker`**: **`kubectl exec`** + **`ak shell`** + **`files/worker_upsert_oauth_oidc.py`**, which avoids **2026+** REST **403** on **`GET …/providers/oauth2/**`) or the **REST-only path** (**`noble_authentik_oidc_provision_via: rest`**: **`files/configure_authentik.py`** needs a token that can list/patch OAuth2 providers). With the worker path and a bootstrap email, it also runs **`files/worker_add_bootstrap_user_groups.py`** so **`User.groups.add`** does not depend on **`GET …/core/users/**`. It then runs **`configure_authentik.py`** with **`AUTHENTIK_SKIP_OIDC_REST`** / **`AUTHENTIK_SKIP_USER_GROUP_REST`** when those worker steps ran, so the script only calls **`ensure_group`** over the API (skipped when **`AUTHENTIK_NOBLE_*_GROUP_PK`** are set).
|
||
|
||
## RBAC notes
|
||
|
||
- **Argo CD:** `noble-admins` group → `role:admin` (see **`clusters/noble/bootstrap/argocd/values-authentik-oidc.yaml`**).
|
||
- **Grafana:** `noble-admins` → Admin, `noble-editors` → Editor (see **`values-authentik-oidc.yaml`**).
|
||
|
||
## Troubleshooting
|
||
|
||
- **Blueprints from Ansible fail to apply** (worker logs / **System → Blueprints**): confirm the ConfigMap exists (**`kubectl -n authentik get cm authentik-noble-blueprints`** unless you changed **`noble_authentik_blueprints_configmap_name`**), that Helm mounts it (**`blueprints.configMaps`** in the rendered extra values), and that every **`!Find`** in **`templates/blueprints/20-*.j2`** still matches your Authentik version’s default stage **names**. Re-run **`--tags authentik`** after editing templates.
|
||
- **oauth2-proxy shows 500** on **`oauth2.apps…/oauth2/callback`** (logs: `email in id_token (...) isn't verified`): Authentik’s id_token often lacks **`email_verified: true`** for bootstrap users. **`clusters/noble/bootstrap/oauth2-proxy/values.yaml`** sets **`insecure-oidc-allow-unverified-email`** for the lab; otherwise verify the user’s email in Authentik, then **`helm upgrade oauth2-proxy`** (or **`--tags authentik`**).
|
||
- Re-run **`configure_authentik.py`** only by executing **`noble.yml`** with **`--tags authentik`** after fixing `.env`.
|
||
- If Authentik API calls fail, check flows exist (slug **`default-provider-authorization-implicit-consent`**) and TLS reaches **`AUTHENTIK_API_BASE`**.
|
||
- **`GET …/flows/instances/…` → HTTP 403** with **`Token invalid/expired`**: the bootstrap API token is not accepted yet (common right after install: worker still creating it) or **`NOBLE_AUTHENTIK_BOOTSTRAP_TOKEN`** in `.env` does not match the value Helm applied. Re-run **`--tags authentik`** (the role waits for **`GET …/core/applications/`** to return **200** with your token). If you rotated the token in `.env` only, run the play again so Helm picks up the new value, or mint a new API token for **`akadmin`** in the admin UI.
|
||
- **`GET …/flows/instances/…` → HTTP 403** with **permission** errors (Authentik **2026+** RBAC): the bootstrap API token often cannot **view flows**. The role reads flow UUIDs from the **worker** database (`kubectl exec` + **`ak shell`**) when **`noble_authentik_oauth_authorization_flow_pk`** / **`noble_authentik_oauth_invalidation_flow_pk`** are unset. The same pattern applies to **`/crypto/certificatekeypairs/`**, **`/propertymappings/…`**, **`/core/groups/`**, and the matching **`noble_authentik_*`** inventory variables. If a lookup fails, fix **`akadmin`** / **authentik Admins** / token, or set the UUID variables manually (see below).
|
||
- **`GET …/crypto/certificatekeypairs/…` → HTTP 403** (permission): same RBAC issue as flows. When **`noble_authentik_oauth_signing_key_pk`** is unset, the role resolves the first **CertificateKeyPair** UUID from the **worker** DB. You can also set **`noble_authentik_oauth_signing_key_pk`** manually (Admin → **System** → **Certificates**).
|
||
- **`GET …/propertymappings/…` → HTTP 403** (permission): when **`noble_authentik_oauth_scope_mapping_pks`** is unset, the role resolves **ScopeMapping** UUIDs from the **worker** DB: **openid**, **email**, **profile**, **offline_access**, and **groups** only if a separate **`groups`** mapping exists (Authentik **2026.x** defaults put **groups** inside **profile** only).
|
||
- **`GET …/core/groups/…` → HTTP 403** (permission): when **`noble_authentik_group_pk_noble_admins`** and **`noble_authentik_group_pk_noble_editors`** are unset, the role runs **`resolve_noble_group_pks.py`** in the worker (**`get_or_create`** for **noble-admins** / **noble-editors**), then passes **`AUTHENTIK_NOBLE_*_GROUP_PK`** into **`configure_authentik.py`** so it skips group list/create via REST.
|
||
- **`GET …/providers/oauth2/…` → HTTP 403** (permission): bootstrap tokens often cannot list OAuth2 providers. With the default **`noble_authentik_oidc_provision_via: worker`**, the role upserts providers and applications in **`authentik-worker`** via Django ORM (**`worker_upsert_oauth_oidc.py`**) instead of **`configure_authentik.py`** REST. Set **`noble_authentik_oidc_provision_via: rest`** only if your API token has **view_oauth2provider** / provider edit permissions (e.g. a full **akadmin** token from the UI).
|
||
- **`GET …/core/users/…` → HTTP 403** when adding the bootstrap user to **noble-admins** / **noble-editors**: with **`noble_authentik_oidc_provision_via: worker`** and a non-empty bootstrap email, the role runs **`worker_add_bootstrap_user_groups.py`** in the worker (ORM **`User.groups.add`**) and sets **`AUTHENTIK_SKIP_USER_GROUP_REST`** so **`configure_authentik.py`** does not call the users API for membership.
|
||
- **Manual flow / signing / scope / group UUIDs (optional):** set **`noble_authentik_oauth_authorization_flow_pk`** and **`noble_authentik_oauth_invalidation_flow_pk`** (both together), optionally **`noble_authentik_oauth_signing_key_pk`**, **`noble_authentik_oauth_scope_mapping_pks`**, **`noble_authentik_group_pk_noble_admins`**, and **`noble_authentik_group_pk_noble_editors`**, from the admin UI or `-e` / `group_vars`; **`configure_authentik.py`** then skips the matching REST discovery calls.
|
||
- **`/if/admin/` redirects to `/if/user/`** (lost admin panel): in **2026.x**, **`canAccessAdmin`** follows **`isSuperuser`**, which is true only when the user belongs to a group with the **superuser** flag (**`authentik Admins`** by default). **`noble_authentik_ensure_admin_ui_access`** (default **true**) makes **`--tags authentik`** run **`files/worker_ensure_authentik_admin_access.py`** in **authentik-worker** (adds **akadmin** or the **bootstrap email** user to **authentik Admins** and forces **`is_superuser`** on that group). **Log out** of Authentik (private window is fine) and sign in again. Set **`noble_authentik_ensure_admin_ui_access: false`** to skip. Without Ansible, you can fix it in **Directory → Groups → authentik Admins** (superuser flag + membership) or run **`ak shell`** with the same logic as that script.
|
||
- **Grafana / Headlamp / ForwardAuth “Unauthorized” or Authentik “Not found”** (Authentik **2026.x**): OAuth endpoints are no longer under **`/application/o/<app>/oauth2/...`**. Use **issuer discovery** (Grafana **`server_url`** at **`…/application/o/<slug>/`**; oauth2-proxy **`oidc-issuer-url`**; Headlamp **`-oidc-idp-issuer-url`**). Re-apply **Traefik** (**`allowCrossNamespace`** so Ingresses can use Middleware in **`oauth2-proxy`**), **kube-prometheus-stack**, and **Headlamp** after updating values (e.g. **`ansible-playbook playbooks/noble.yml --tags authentik`**).
|
||
- **Headlamp OIDC authorize fails / `invalid_scope`**: Authentik often has no separate **`groups`** ScopeMapping (groups live under **`profile`**). Default **`noble_authentik_headlamp_oidc_scopes`** omits **`groups`**; add a **`groups`** mapping to the provider in Authentik and set **`noble_authentik_headlamp_oidc_scopes`** to include **`groups`** if you need that scope by name.
|
||
- **Headlamp OIDC: Authentik flashes then back at login / page refresh**: Headlamp **does** support normal browser OAuth (redirect to Authentik and return to **`/oidc-callback`**). If the callback fails, the UI looks like it “drops” auth. Common causes: **`X-Forwarded-Proto`** not reaching Headlamp (callback built as **`http`** — see [Headlamp OIDC docs](https://headlamp.dev/docs/latest/installation/in-cluster/oidc)); **Traefik ForwardAuth** on the same Ingress (do not combine with native OIDC); **PKCE** state issues — this role defaults **`noble_authentik_headlamp_oidc_use_pkce: false`** for Authentik confidential clients (set **`true`** in **`group_vars`** if you need PKCE).
|
||
- **Headlamp UI: `/me` works but `/clusters/main/version` (and other K8s calls) return 401**: Headlamp forwards your **OIDC id_token** to **kube-apiserver**. The API server must be configured with **OIDC** flags for the same **issuer** and **`oidc-client-id`** as Headlamp (see **`talos/talconfig.yaml`** patch and [Kubernetes OIDC authentication](https://kubernetes.io/docs/reference/access-authn-authz/authentication/#openid-connect-tokens)). Apply regenerated Talos configs to control plane nodes, then **`kubectl apply -k clusters/noble/bootstrap/headlamp`** (or **`--tags authentik`**) for **`oidc-noble-admins-clusterrolebinding.yaml`**. Ensure your user is in Authentik group **`noble-admins`** and the id_token includes a **`groups`** claim if you rely on that binding.
|
||
- **Headlamp + Traefik ForwardAuth (`oauth2-proxy-forward-auth`)**: Do **not** put ForwardAuth on the **Headlamp** Ingress while using **native Headlamp OIDC**. Auth runs on **`/oidc-callback`** before Headlamp can finish the code exchange; ForwardAuth returns **401** and breaks login. Use **either** native OIDC (this repo’s **`values-authentik-oidc.yaml`**) **or** terminate auth at oauth2-proxy only (no **`config.oidc`**), not both.
|
||
|
||
### Fix admin access manually (worker shell, no Ansible)
|
||
|
||
```bash
|
||
kubectl exec -it deploy/authentik-worker -n authentik -- ak shell -c "from authentik.core.models import User, Group; u=User.objects.get(username='akadmin'); adm,_=Group.objects.get_or_create(name='authentik Admins', defaults={'is_superuser': True}); adm.is_superuser=True; adm.save(update_fields=['is_superuser']); u.groups.add(adm); u=User.objects.get(pk=u.pk); print('is_superuser', u.is_superuser)"
|
||
```
|
||
|
||
Then **log out** of Authentik and sign in again.
|