# noble_authentik — Authentik + OIDC for the noble stack
Installs **Authentik** (Helm `goauthentik/authentik`) as the cluster IdP, **oauth2-proxy** as an **OIDC** client to Authentik for Traefik **ForwardAuth** (Prometheus, Alertmanager, Longhorn UI), and re-applies Helm values so **Argo CD**, **Grafana**, and **Headlamp** use **native OIDC** to Authentik (not HTTP BasicAuth).
## Enable
1. Copy repository **`.env.sample`** to **`.env`** and set all **required** **`NOBLE_AUTHENTIK_*`** values (see comments there; SMTP keys are optional).
2. Set **`noble_authentik_install: true`** in **`ansible/inventory/group_vars/all.yml`** (or pass **`-e noble_authentik_install=true`**).
3. Run **`ansible-playbook playbooks/noble.yml --tags authentik`** (or a full **`noble.yml`**) from **`ansible/`** with a working **`KUBECONFIG`**.
`noble_authentik` runs **after** **`noble_platform`** so Grafana / Headlamp / Prometheus exist before SSO Helm upgrades.
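Before step 3, it can help to confirm that the variables you expect are set in **`.env`**. The helper below is a minimal sketch, not part of the role: `missing_env_vars` is a hypothetical name, and the two variable names in the example are only ones this README mentions; the authoritative required list is in **`.env.sample`**.

```bash
# Hypothetical pre-flight helper (not part of the role): print each of the
# given variable names that has no non-empty assignment in an env file.
missing_env_vars() {
  env_file=$1
  shift
  for name in "$@"; do
    # Matches "NAME=value" with at least one character after "=".
    # Commented-out lines do not count. A quoted empty value ("") is not detected.
    if ! grep -Eq "^[[:space:]]*${name}=.+" "$env_file"; then
      printf '%s\n' "$name"
    fi
  done
}

# Example, from the repository root:
#   missing_env_vars .env NOBLE_AUTHENTIK_BOOTSTRAP_TOKEN NOBLE_AUTHENTIK_MEDIA_S3_BUCKET
```

An empty result means every listed name has a value; any printed name still needs to be filled in before running the play.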
## Variables
See **`defaults/main.yml`**. Hostnames default to **`auth.apps.noble.lab.pcenicni.dev`** and **`oauth2.apps.noble.lab.pcenicni.dev`**. **`noble_authentik_ensure_admin_ui_access`** (default **true**) re-applies **authentik Admins** superuser membership via the worker on each **`--tags authentik`** run so the admin UI keeps working under **2026+** RBAC.
### S3 media (avatars, flows, uploads)
Authentik stores file-backed data in **S3** (not a shared PVC on **`authentik-worker`**). Set **`NOBLE_AUTHENTIK_MEDIA_S3_BUCKET`** in **`.env`** to a **dedicated** bucket name (do **not** reuse the Velero backup bucket). **`NOBLE_VELERO_S3_URL`**, **`NOBLE_VELERO_AWS_ACCESS_KEY_ID`**, and **`NOBLE_VELERO_AWS_SECRET_ACCESS_KEY`** are reused automatically when the Authentik-specific S3 variables are unset; override with **`NOBLE_AUTHENTIK_S3_URL`** / **`NOBLE_AUTHENTIK_S3_ACCESS_KEY`** / **`NOBLE_AUTHENTIK_S3_SECRET_KEY`** if needed. Optional: **`NOBLE_AUTHENTIK_S3_REGION`** (defaults to **`us-east-1`** in Ansible), **`NOBLE_AUTHENTIK_S3_ADDRESSING_STYLE`** (**`path`** vs **`virtual`** for some gateways). Create the bucket and grant the same credentials **read/write** to that bucket only. For browser uploads and public assets, follow [Authentik — S3 storage](https://docs.goauthentik.io/sys-mgmt/ops/storage-s3/) (CORS and policies). If you previously used a PVC for **`/data`**, sync into the new bucket (for example **`aws s3 sync`** from a volume snapshot or old mount) before relying on S3-only.
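The credential fallback described above can be pictured as the shell analogue below. This is a sketch only: the role resolves these values in Ansible, `resolve_authentik_s3` is an illustrative name, and treating an empty value the same as an unset one is an assumption.

```bash
# Shell analogue of the fallback: prefer the Authentik-specific value,
# otherwise reuse the Velero one. Secrets are resolved but never printed.
resolve_authentik_s3() {
  s3_url=${NOBLE_AUTHENTIK_S3_URL:-${NOBLE_VELERO_S3_URL:-}}
  s3_access_key=${NOBLE_AUTHENTIK_S3_ACCESS_KEY:-${NOBLE_VELERO_AWS_ACCESS_KEY_ID:-}}
  s3_secret_key=${NOBLE_AUTHENTIK_S3_SECRET_KEY:-${NOBLE_VELERO_AWS_SECRET_ACCESS_KEY:-}}
  s3_region=${NOBLE_AUTHENTIK_S3_REGION:-us-east-1}
  printf 'url=%s\nregion=%s\n' "$s3_url" "$s3_region"
}
```

Note that the bucket name has no fallback: **`NOBLE_AUTHENTIK_MEDIA_S3_BUCKET`** must always be set explicitly.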
### Outbound email (SMTP)
Optional. Set **`NOBLE_AUTHENTIK_SMTP_HOST`** and **`NOBLE_AUTHENTIK_SMTP_FROM`** in repository **`.env`**; Ansible adds **`AUTHENTIK_EMAIL__HOST`**, **`AUTHENTIK_EMAIL__FROM`**, and related variables to Helm **`global.env`** (see [Authentik configuration — email](https://docs.goauthentik.io/install-config/configuration/#email-settings)). Omit **`NOBLE_AUTHENTIK_SMTP_HOST`** to skip SMTP env vars entirely. Optional overrides: **`NOBLE_AUTHENTIK_SMTP_PORT`** (default **587** in **`defaults/main.yml`**), **`NOBLE_AUTHENTIK_SMTP_USERNAME`**, **`NOBLE_AUTHENTIK_SMTP_PASSWORD`**, **`NOBLE_AUTHENTIK_SMTP_USE_TLS`** / **`USE_SSL`** / **`TIMEOUT`**. Re-run **`ansible-playbook playbooks/noble.yml --tags authentik`** after changes.
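As an illustration, a **`.env`** fragment for SMTP might look like the following. All values are placeholders (the `example.com` names and the password are assumptions); only the variable names come from this README.

```bash
# Placeholder values only; replace them with your mail provider's settings.
NOBLE_AUTHENTIK_SMTP_HOST=smtp.example.com
NOBLE_AUTHENTIK_SMTP_FROM=authentik@example.com
NOBLE_AUTHENTIK_SMTP_PORT=587
NOBLE_AUTHENTIK_SMTP_USERNAME=authentik@example.com
NOBLE_AUTHENTIK_SMTP_PASSWORD=change-me
```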
### Extra public hostname (Pangolin + Newt, same Authentik)
To expose the **same** Authentik instance on an **internet-facing** FQDN (while keeping the lab name on Traefik), set **`noble_authentik_ingress_extra_hosts`** in **`ansible/inventory/group_vars/all.yml`** (or **`-e`**) to a list of extra FQDNs, for example **`auth.example.com`**. Re-run **`ansible-playbook playbooks/noble.yml --tags authentik`**. Ansible extends **`server.ingress.hosts`** and **`tls[0].hosts`** so **cert-manager** issues one certificate with SANs for the primary **`noble_authentik_host`** plus those names (DNS must resolve for your issuer — often **Cloudflare** for public names, split horizon for lab).
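When passing the list with **`-e`** rather than **`group_vars`**, the value has to reach Ansible as a list, which JSON extra vars allow. The helper below is a sketch for building that string; `extra_hosts_json` is a hypothetical name and `auth.example.com` is a placeholder.

```bash
# Illustrative helper (not part of the role): build the JSON extra-vars
# string for one or more extra hostnames.
extra_hosts_json() {
  json=
  for host in "$@"; do
    # Append a comma separator only when the list already has an entry.
    json="${json:+$json,}\"$host\""
  done
  printf '{"noble_authentik_ingress_extra_hosts":[%s]}\n' "$json"
}

# Example, from ansible/ with a working KUBECONFIG:
#   ansible-playbook playbooks/noble.yml --tags authentik \
#     -e "$(extra_hosts_json auth.example.com)"
```

For anything permanent, setting the list in **`ansible/inventory/group_vars/all.yml`** is easier to keep consistent across runs.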
Then in **Pangolin**: link the domain, create an **HTTP** resource for that hostname, and set the **target** to your **Newt** site with **`ip:port`** pointing at the cluster **Traefik** HTTPS entry (same pattern as **`clusters/noble/bootstrap/newt/README.md`** — typically the MetalLB / LAN VIP and **443**). One Newt tunnel can front many hostnames.
### Split routing, two Brands, and optional blueprints
This role supports a **single Authentik deployment** with **two hostnames** (lab + public) and **different Brands** per **`Host`**, without Authentik's separate-database **Tenancy** feature (see [Tenancy](https://docs.goauthentik.io/sys-mgmt/tenancy); alpha, with licensing requirements). [Brands](https://docs.goauthentik.io/brands/) choose default **authentication** (and related) **flows** and branding for each FQDN.
**Split routing (recommended):**
- **Lab / operator URL** — **`noble_authentik_host`** (default **`auth.apps.noble.lab.pcenicni.dev`**): keep DNS **internal-only** (split horizon, VPN, or LAN DNS). Do **not** publish this hostname as a Pangolin HTTP resource toward the internet unless you intentionally want it reachable off-LAN.
- **Public URL** — entries in **`noble_authentik_ingress_extra_hosts`**: use Pangolin (or another edge) only for these names so casual users never need the lab FQDN.
Network isolation is enforced at **DNS and the tunnel**, not inside Authentik. Optionally add firewall / Traefik entrypoint rules for defense in depth.
**Two-Brand model:**
- **Lab Brand** — domain equals **`noble_authentik_host`**: use a **restricted authentication flow** so only operator groups (defaults: **`noble-admins`**, **`authentik Admins`**) can complete sign-in on that hostname. Your bootstrap / break-glass account must remain in one of those groups (see **`noble_authentik_ensure_admin_ui_access`** and **`configure_authentik.py`** group membership).
- **Public Brand(s)** — one Brand per FQDN in **`noble_authentik_ingress_extra_hosts`**: use the stock **`default-authentication-flow`** (or replace with your own flow slug via a forked blueprint later). Assign people to **`noble_authentik_blueprint_public_groups`** and/or **`noble_authentik_blueprint_nikflix_groups`** (defaults include **`nikflix-users`** / **`nikflix-admins`** for the Nikflix hostname); **`noble-admins`** / **`noble-editors`** remain for cluster / Argo / Grafana as today.
**OAuth note:** Redirect URIs and **`iss`** must stay consistent with the hostname clients use (internal issuer for in-cluster apps vs public issuer is a deliberate choice — avoid mixing both for the same app).
**Mounted blueprints (optional):** set **`noble_authentik_blueprints_enabled: true`** in **`group_vars`** (or **`-e`**). On each **`--tags authentik`** run, Ansible renders Jinja templates under **`templates/blueprints/`** into a ConfigMap **`noble_authentik_blueprints_configmap_name`** (default **`authentik-noble-blueprints`**) and sets Helm **`blueprints.configMaps`** so **authentik-worker** loads them from **`/blueprints/mounted/cm-authentik-noble-blueprints/`** (see [Blueprints](https://docs.goauthentik.io/customize/blueprints/)). Files (apply in lexical order):
| Key | Purpose |
| --- | --- |
| **`10-noble-public-groups.yaml.j2`** | **`noble_authentik_blueprint_public_groups`**, **`noble_authentik_blueprint_extra_directory_groups`**, and **`noble_authentik_blueprint_nikflix_groups`** → **Group** objects (see **Blueprint: directory groups**). |
| **`20-noble-lab-operator-authentication-flow.yaml.j2`** | Flow **`noble_authentik_blueprint_lab_flow_slug`** + expression policy **`noble_authentik_blueprint_operator_policy_name`** (allowed groups **`noble_authentik_blueprint_lab_operator_groups`**). |
| **`30-noble-brands-domain-split.yaml.j2`** | Brand for **`noble_authentik_host`** → lab flow; one Brand per **`noble_authentik_ingress_extra_hosts`** → default authentication. |
Tune titles via **`noble_authentik_blueprint_lab_brand_title`** and **`noble_authentik_blueprint_public_brand_title_prefix`**. After the worker applies blueprints, confirm **System → Brands** and **Flows** in the admin UI; fix any **`!Find`** failures if upstream default stage **names** change between Authentik versions.
#### Blueprint: directory groups
Three inventory lists are concatenated **in this order** into **`10-noble-public-groups.yaml.j2`**:
1. **`noble_authentik_blueprint_public_groups`** — generic “public hostname” audience (defaults **`noble-public-users`** / **`noble-public-admins`**).
2. **`noble_authentik_blueprint_extra_directory_groups`** — any other groups (empty by default).
3. **`noble_authentik_blueprint_nikflix_groups`** — Nikflix-facing groups (defaults **`nikflix-users`** / **`nikflix-admins`** with **`noble.ak/brand: nikflix`**). Listed last so **`parents`** can reference groups from (1) or (2) if you choose.
Each item may be:
| Form | Example |
| --- | --- |
| **String** | **`my-app-operators`** — creates a group with that **name** only. |
| **Mapping** | **`name`** (required), optional **`is_superuser`**: **`true`** (use sparingly), **`attributes`**: dict (JSON on the group; useful in expression policies), **`parents`**: list of **existing** group **names** (resolved with **`!Find`**). |
Order matters for **`parents`**: every parent must already exist when the child row is applied — list parents **above** children in the merged list, or reference groups Authentik already created (for example **`noble-public-users`** before **`noble-public-admins`** with **`parents: [noble-public-users]`**). See [Group properties and attributes](https://docs.goauthentik.io/users-sources/groups/group_ref/).
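The ordering rule can be checked before a run with a small script. The sketch below is not part of the role: `check_parent_order` and its input format (one group per line in merged order, written as `name` or `name:parent1,parent2`) are assumptions made for illustration, and group names containing `:` or `,` are not supported.

```bash
# Illustrative check: every parent must appear on an earlier line.
# Reads "name" or "name:parent1,parent2" lines from stdin, in merged order.
check_parent_order() {
  awk -F: '
    {
      n = split($2, parents, ",")
      for (i = 1; i <= n; i++) {
        if (parents[i] != "" && !(parents[i] in seen)) {
          printf "%s: parent \"%s\" is not listed earlier\n", $1, parents[i]
          bad = 1
        }
      }
      seen[$1] = 1
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Example:
#   printf 'noble-public-users\nnoble-public-admins:noble-public-users\n' | check_parent_order
```

The check only knows about the lines it is given, so a parent that already exists in Authentik but is not in the merged list is reported even though the blueprint would apply.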
##### Audience groups vs per-service groups
For Nikflix (and similar brands), prefer **one broad “users” group and a small “admins” group** (`nikflix-users` / `nikflix-admins`), then bind **OAuth providers**, **policies**, and **app access** to those groups. Add **per-service** groups (for example **`nikflix-media-readonly`**) only when a service truly needs a **different** membership set than the rest of the brand; every extra group is another object to keep in sync with enrollment and IdP claims. Optional pattern: make a service group a **child** of **`nikflix-users`** via **`parents`** so members inherit the parent for generic “logged in to Nikflix” checks.
**Confirming blueprints on the cluster:** the Ansible task **Install Authentik (Helm)** uses **`changed_when: true`**, so a **“changed”** line there does **not** prove Helm mutated the release. When **`noble_authentik_blueprints_enabled`** is true, the role asserts the **worker** Deployment has a volumeMount named **`blueprints-cm-<noble_authentik_blueprints_configmap_name>`** (default **`blueprints-cm-authentik-noble-blueprints`**). You can also run:
```bash
kubectl -n authentik get configmap authentik-noble-blueprints -o yaml
helm get values authentik -n authentik -o yaml | grep -A2 blueprints
kubectl -n authentik get deploy -l app.kubernetes.io/component=worker -o yaml | grep blueprints-cm
```
Mounted files are applied asynchronously by **authentik-worker**; check **System → Blueprints** (or **Customization → Blueprints** depending on version) for instances sourced from **`/blueprints/mounted/cm-authentik-noble-blueprints/`**, and **`kubectl logs -n authentik deploy/authentik-worker`** if a blueprint shows **Error** / failed apply.
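If you changed **`noble_authentik_blueprints_configmap_name`**, the names used in the checks above change with it. The sketch below derives them from the ConfigMap name; `blueprint_mount_names` is a hypothetical helper, and the mount-path pattern is inferred from the default path quoted above rather than confirmed for every chart version.

```bash
# Illustrative helper: derive the expected volumeMount name and mount path
# from the blueprints ConfigMap name (defaults to the role's default name).
blueprint_mount_names() {
  cm_name=${1:-authentik-noble-blueprints}
  printf 'volumeMount=blueprints-cm-%s\n' "$cm_name"
  printf 'path=/blueprints/mounted/cm-%s/\n' "$cm_name"
}

# Example:
#   blueprint_mount_names my-blueprints
```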
### “Secondary tenant” (separate PostgreSQL schema — alpha)
Authentik **tenancy** (multiple isolated tenants in one deployment, **`AUTHENTIK_TENANTS__ENABLED`**) is **alpha**, requires **per-tenant Enterprise licensing**, **`AUTHENTIK_TENANTS__API_KEY`**, and **`AUTHENTIK_OUTPOSTS__DISABLE_EMBEDDED_OUTPOST=true`** (embedded outposts are unsupported with tenancy). It is **not** wired in this repo by default. See [Tenancy](https://docs.goauthentik.io/sys-mgmt/tenancy). For most homelabs, **one tenant** plus **`noble_authentik_ingress_extra_hosts`** is the right split.
## IdP configuration
When **`noble_authentik_configure_idp`** is true, Ansible creates/updates OAuth2 providers and applications for **argocd**, **grafana**, **headlamp**, and **oauth2-proxy** using one of two paths:
- **Worker ORM path** (default, **`noble_authentik_oidc_provision_via: worker`**): **`kubectl exec`** + **`ak shell`** + **`files/worker_upsert_oauth_oidc.py`**, which avoids **2026+** REST **403** on **`GET …/providers/oauth2/`**.
- **REST-only path** (**`noble_authentik_oidc_provision_via: rest`**): **`files/configure_authentik.py`**, which needs a token that can list/patch OAuth2 providers.

With the worker path and a bootstrap email, the role also runs **`files/worker_add_bootstrap_user_groups.py`** so **`User.groups.add`** does not depend on **`GET …/core/users/`**. It then runs **`configure_authentik.py`** with **`AUTHENTIK_SKIP_OIDC_REST`** / **`AUTHENTIK_SKIP_USER_GROUP_REST`** when those worker steps ran, so the script only calls **`ensure_group`** over the API (skipped when **`AUTHENTIK_NOBLE_*_GROUP_PK`** are set).
## RBAC notes
- **Argo CD:** `noble-admins` group → `role:admin` (see **`clusters/noble/bootstrap/argocd/values-authentik-oidc.yaml`**).
- **Grafana:** `noble-admins` → Admin, `noble-editors` → Editor (see **`values-authentik-oidc.yaml`**).
## Troubleshooting
- **Blueprints from Ansible fail to apply** (worker logs / **System → Blueprints**): confirm the ConfigMap exists (**`kubectl -n authentik get cm authentik-noble-blueprints`** unless you changed **`noble_authentik_blueprints_configmap_name`**), that Helm mounts it (**`blueprints.configMaps`** in the rendered extra values), and that every **`!Find`** in **`templates/blueprints/20-*.j2`** still matches your Authentik version's default stage **names**. For **`10-noble-public-groups.yaml.j2`**, **`parents`** must reference groups that appear **earlier** in the merged list (or already exist in Authentik). Re-run **`--tags authentik`** after editing templates.
- **oauth2-proxy shows 500** on **`oauth2.apps…/oauth2/callback`** (logs: `email in id_token (...) isn't verified`): Authentik's id_token often lacks **`email_verified: true`** for bootstrap users. **`clusters/noble/bootstrap/oauth2-proxy/values.yaml`** sets **`insecure-oidc-allow-unverified-email`** for the lab; otherwise verify the user's email in Authentik, then **`helm upgrade oauth2-proxy`** (or **`--tags authentik`**).
- Do not run **`configure_authentik.py`** directly; after fixing `.env`, re-run it by executing **`noble.yml`** with **`--tags authentik`**.
- If Authentik API calls fail, check that the flows exist (slug **`default-provider-authorization-implicit-consent`**) and that TLS connections reach **`AUTHENTIK_API_BASE`**.
- **`GET …/flows/instances/…` → HTTP 403** with **`Token invalid/expired`**: the bootstrap API token is not accepted yet (common right after install: worker still creating it) or **`NOBLE_AUTHENTIK_BOOTSTRAP_TOKEN`** in `.env` does not match the value Helm applied. Re-run **`--tags authentik`** (the role waits for **`GET …/core/applications/`** to return **200** with your token). If you rotated the token in `.env` only, run the play again so Helm picks up the new value, or mint a new API token for **`akadmin`** in the admin UI.
- **`GET …/flows/instances/…` → HTTP 403** with **permission** errors (Authentik **2026+** RBAC): the bootstrap API token often cannot **view flows**. The role reads flow UUIDs from the **worker** database (`kubectl exec` + **`ak shell`**) when **`noble_authentik_oauth_authorization_flow_pk`** / **`noble_authentik_oauth_invalidation_flow_pk`** are unset. The same pattern applies to **`/crypto/certificatekeypairs/`**, **`/propertymappings/…`**, **`/core/groups/`**, and the matching **`noble_authentik_*`** inventory variables. If a lookup fails, fix **`akadmin`** / **authentik Admins** / token, or set the UUID variables manually (see below).
- **`GET …/crypto/certificatekeypairs/…` → HTTP 403** (permission): same RBAC issue as flows. When **`noble_authentik_oauth_signing_key_pk`** is unset, the role resolves the first **CertificateKeyPair** UUID from the **worker** DB. You can also set **`noble_authentik_oauth_signing_key_pk`** manually (Admin → **System** → **Certificates**).
- **`GET …/propertymappings/…` → HTTP 403** (permission): when **`noble_authentik_oauth_scope_mapping_pks`** is unset, the role resolves **ScopeMapping** UUIDs from the **worker** DB: **openid**, **email**, **profile**, **offline_access**, and **groups** only if a separate **`groups`** mapping exists (Authentik **2026.x** defaults put **groups** inside **profile** only).
- **`GET …/core/groups/…` → HTTP 403** (permission): when **`noble_authentik_group_pk_noble_admins`** and **`noble_authentik_group_pk_noble_editors`** are unset, the role runs **`resolve_noble_group_pks.py`** in the worker (**`get_or_create`** for **noble-admins** / **noble-editors**), then passes **`AUTHENTIK_NOBLE_*_GROUP_PK`** into **`configure_authentik.py`** so it skips group list/create via REST.
- **`GET …/providers/oauth2/…` → HTTP 403** (permission): bootstrap tokens often cannot list OAuth2 providers. With the default **`noble_authentik_oidc_provision_via: worker`**, the role upserts providers and applications in **`authentik-worker`** via Django ORM (**`worker_upsert_oauth_oidc.py`**) instead of **`configure_authentik.py`** REST. Set **`noble_authentik_oidc_provision_via: rest`** only if your API token has **view_oauth2provider** / provider edit permissions (e.g. a full **akadmin** token from the UI).
- **`GET …/core/users/…` → HTTP 403** when adding the bootstrap user to **noble-admins** / **noble-editors**: with **`noble_authentik_oidc_provision_via: worker`** and a non-empty bootstrap email, the role runs **`worker_add_bootstrap_user_groups.py`** in the worker (ORM **`User.groups.add`**) and sets **`AUTHENTIK_SKIP_USER_GROUP_REST`** so **`configure_authentik.py`** does not call the users API for membership.
- **Manual flow / signing / scope / group UUIDs (optional):** set **`noble_authentik_oauth_authorization_flow_pk`** and **`noble_authentik_oauth_invalidation_flow_pk`** (both together), optionally **`noble_authentik_oauth_signing_key_pk`**, **`noble_authentik_oauth_scope_mapping_pks`**, **`noble_authentik_group_pk_noble_admins`**, and **`noble_authentik_group_pk_noble_editors`**, from the admin UI or `-e` / `group_vars`; **`configure_authentik.py`** then skips the matching REST discovery calls.
- **`/if/admin/` redirects to `/if/user/`** (lost admin panel): in **2026.x**, **`canAccessAdmin`** follows **`isSuperuser`**, which is true only when the user belongs to a group with the **superuser** flag (**`authentik Admins`** by default). **`noble_authentik_ensure_admin_ui_access`** (default **true**) makes **`--tags authentik`** run **`files/worker_ensure_authentik_admin_access.py`** in **authentik-worker** (adds **akadmin** or the **bootstrap email** user to **authentik Admins** and forces **`is_superuser`** on that group). **Log out** of Authentik (private window is fine) and sign in again. Set **`noble_authentik_ensure_admin_ui_access: false`** to skip. Without Ansible, you can fix it in **Directory → Groups → authentik Admins** (superuser flag + membership) or run **`ak shell`** with the same logic as that script.
- **Grafana / Headlamp / ForwardAuth “Unauthorized” or Authentik “Not found”** (Authentik **2026.x**): OAuth endpoints are no longer under **`/application/o/<app>/oauth2/...`**. Use **issuer discovery** (Grafana **`server_url`** at **`…/application/o/<slug>/`**; oauth2-proxy **`oidc-issuer-url`**; Headlamp **`-oidc-idp-issuer-url`**). Re-apply **Traefik** (**`allowCrossNamespace`** so Ingresses can use Middleware in **`oauth2-proxy`**), **kube-prometheus-stack**, and **Headlamp** after updating values (e.g. **`ansible-playbook playbooks/noble.yml --tags authentik`**).
- **Headlamp OIDC authorize fails / `invalid_scope`**: Authentik often has no separate **`groups`** ScopeMapping (groups live under **`profile`**). Default **`noble_authentik_headlamp_oidc_scopes`** omits **`groups`**; add a **`groups`** mapping to the provider in Authentik and set **`noble_authentik_headlamp_oidc_scopes`** to include **`groups`** if you need that scope by name.
- **Headlamp OIDC: Authentik flashes then back at login / page refresh**: Headlamp **does** support normal browser OAuth (redirect to Authentik and return to **`/oidc-callback`**). If the callback fails, the UI looks like it “drops” auth. Common causes: **`X-Forwarded-Proto`** not reaching Headlamp (callback built as **`http`** — see [Headlamp OIDC docs](https://headlamp.dev/docs/latest/installation/in-cluster/oidc)); **Traefik ForwardAuth** on the same Ingress (do not combine with native OIDC); **PKCE** state issues — this role defaults **`noble_authentik_headlamp_oidc_use_pkce: false`** for Authentik confidential clients (set **`true`** in **`group_vars`** if you need PKCE).
- **Headlamp UI: `/me` works but `/clusters/main/version` (and other K8s calls) return 401**: Headlamp forwards your **OIDC id_token** to **kube-apiserver**. The API server must be configured with **OIDC** flags for the same **issuer** and **`oidc-client-id`** as Headlamp (see **`talos/talconfig.yaml`** patch and [Kubernetes OIDC authentication](https://kubernetes.io/docs/reference/access-authn-authz/authentication/#openid-connect-tokens)). Apply regenerated Talos configs to control plane nodes, then **`kubectl apply -k clusters/noble/bootstrap/headlamp`** (or **`--tags authentik`**) for **`oidc-noble-admins-clusterrolebinding.yaml`**. Ensure your user is in Authentik group **`noble-admins`** and the id_token includes a **`groups`** claim if you rely on that binding.
- **Headlamp + Traefik ForwardAuth (`oauth2-proxy-forward-auth`)**: Do **not** put ForwardAuth on the **Headlamp** Ingress while using **native Headlamp OIDC**. Auth runs on **`/oidc-callback`** before Headlamp can finish the code exchange; ForwardAuth returns **401** and breaks login. Use **either** native OIDC (this repo's **`values-authentik-oidc.yaml`**) **or** terminate auth at oauth2-proxy only (no **`config.oidc`**), not both.
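
Two of the checks in the list above (issuer discovery and bootstrap token acceptance) can be probed read-only from a workstation. The functions below are a sketch, not part of the role: the function names are hypothetical, the application slugs are assumed to match the application names (**argocd**, **grafana**, **headlamp**, **oauth2-proxy**), and `curl` and `jq` must be installed.

```bash
# Build the OIDC discovery URL for an Authentik host and application slug.
oidc_discovery_url() {
  printf 'https://%s/application/o/%s/.well-known/openid-configuration\n' "$1" "$2"
}

# Print the issuer advertised for a slug (needs network access, curl, jq).
# The value should match the issuer URL configured in the client.
oidc_issuer() {
  curl -fsS "$(oidc_discovery_url "$1" "$2")" | jq -r '.issuer'
}

# Print the HTTP status of GET /api/v3/core/applications/ for a bearer token.
# 200 means the token is accepted; 403 points at the token or RBAC cases above.
api_token_status() {
  curl -sS -o /dev/null -w '%{http_code}\n' \
    -H "Authorization: Bearer $2" \
    "https://$1/api/v3/core/applications/"
}

# Examples (the slug is an assumption):
#   oidc_issuer auth.apps.noble.lab.pcenicni.dev grafana
#   api_token_status auth.apps.noble.lab.pcenicni.dev "$NOBLE_AUTHENTIK_BOOTSTRAP_TOKEN"
```

Use the hostname the client itself uses (lab or public) so the reported issuer can be compared directly with that client's configuration.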
### Fix admin access manually (worker shell, no Ansible)
```bash
kubectl exec -it deploy/authentik-worker -n authentik -- ak shell -c "from authentik.core.models import User, Group; u=User.objects.get(username='akadmin'); adm,_=Group.objects.get_or_create(name='authentik Admins', defaults={'is_superuser': True}); adm.is_superuser=True; adm.save(update_fields=['is_superuser']); u.groups.add(adm); u=User.objects.get(pk=u.pk); print('is_superuser', u.is_superuser)"
```
Then **log out** of Authentik and sign in again.