Files
home-server/ansible/roles/noble_authentik/README.md

65 lines
13 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# noble_authentik — Authentik + OIDC for the noble stack
Installs **Authentik** (Helm `goauthentik/authentik`) as the cluster IdP, **oauth2-proxy** as an **OIDC** client to Authentik for Traefik **ForwardAuth** (Prometheus, Alertmanager, Longhorn UI), and re-applies Helm values so **Argo CD**, **Grafana**, and **Headlamp** use **native OIDC** to Authentik (not HTTP BasicAuth).
## Enable
1. Copy repository **`.env.sample`** to **`.env`** and set every **`NOBLE_AUTHENTIK_*`** variable (see comments there).
2. Set **`noble_authentik_install: true`** in **`ansible/inventory/group_vars/all.yml`** (or pass **`-e noble_authentik_install=true`**).
3. Run **`ansible-playbook playbooks/noble.yml --tags authentik`** (or a full **`noble.yml`**) from **`ansible/`** with a working **`KUBECONFIG`**.
`noble_authentik` runs **after** **`noble_platform`** so Grafana / Headlamp / Prometheus exist before SSO Helm upgrades.
## Variables
See **`defaults/main.yml`**. Hostnames default to **`auth.apps.noble.lab.pcenicni.dev`** and **`oauth2.apps.noble.lab.pcenicni.dev`**. **`noble_authentik_ensure_admin_ui_access`** (default **true**) re-applies **authentik Admins** superuser membership via the worker on each **`--tags authentik`** run so the admin UI keeps working under **2026+** RBAC.
### Extra public hostname (Pangolin + Newt, same Authentik)
To expose the **same** Authentik instance on an **internet-facing** FQDN (while keeping the lab name on Traefik), set **`noble_authentik_ingress_extra_hosts`** in **`ansible/inventory/group_vars/all.yml`** (or **`-e`**) to a list of extra FQDNs, for example **`auth.example.com`**. Re-run **`ansible-playbook playbooks/noble.yml --tags authentik`**. Ansible extends **`server.ingress.hosts`** and **`tls[0].hosts`** so **cert-manager** issues one certificate with SANs for the primary **`noble_authentik_host`** plus those names (DNS must resolve for your issuer — often **Cloudflare** for public names, split horizon for lab).
Then in **Pangolin**: link the domain, create an **HTTP** resource for that hostname, and set the **target** to your **Newt** site with **`ip:port`** pointing at the cluster **Traefik** HTTPS entry (same pattern as **`clusters/noble/bootstrap/newt/README.md`** — typically the MetalLB / LAN VIP and **443**). One Newt tunnel can front many hostnames.
In **Authentik**, add a **Brand** (or equivalent) for the new hostname if you want different titles/favicon; OAuth **redirect URIs** for each app must include issuer URLs that match what browsers use (often you keep **internal** issuer URLs in cluster apps and use the public URL only for human login, or align all apps to the public issuer — pick one strategy to avoid mixed **`iss`** / callback mismatches).
### “Secondary tenant” (separate PostgreSQL schema — alpha)
Authentik **tenancy** (multiple isolated tenants in one deployment, **`AUTHENTIK_TENANTS__ENABLED`**) is **alpha**, requires **per-tenant Enterprise licensing**, **`AUTHENTIK_TENANTS__API_KEY`**, and **`AUTHENTIK_OUTPOSTS__DISABLE_EMBEDDED_OUTPOST=true`** (embedded outposts are unsupported with tenancy). It is **not** wired in this repo by default. See [Tenancy](https://docs.goauthentik.io/sys-mgmt/tenancy). For most homelabs, **one tenant** plus **`noble_authentik_ingress_extra_hosts`** is the right split.
## IdP configuration
When **`noble_authentik_configure_idp`** is true, Ansible creates/updates OAuth2 providers and applications for **argocd**, **grafana**, **headlamp**, and **oauth2-proxy** using either the **worker ORM path** (default **`noble_authentik_oidc_provision_via: worker`**: **`kubectl exec`** + **`ak shell`** + **`files/worker_upsert_oauth_oidc.py`**, which avoids **2026+** REST **403** on **`GET …/providers/oauth2/**`) or the **REST-only path** (**`noble_authentik_oidc_provision_via: rest`**: **`files/configure_authentik.py`** needs a token that can list/patch OAuth2 providers). With the worker path and a bootstrap email, it also runs **`files/worker_add_bootstrap_user_groups.py`** so **`User.groups.add`** does not depend on **`GET …/core/users/**`. It then runs **`configure_authentik.py`** with **`AUTHENTIK_SKIP_OIDC_REST`** / **`AUTHENTIK_SKIP_USER_GROUP_REST`** when those worker steps ran, so the script only calls **`ensure_group`** over the API (skipped when **`AUTHENTIK_NOBLE_*_GROUP_PK`** are set).
## RBAC notes
- **Argo CD:** `noble-admins` group → `role:admin` (see **`clusters/noble/bootstrap/argocd/values-authentik-oidc.yaml`**).
- **Grafana:** `noble-admins` → Admin, `noble-editors` → Editor (see **`values-authentik-oidc.yaml`**).
## Troubleshooting
- **oauth2-proxy shows 500** on **`oauth2.apps…/oauth2/callback`** (logs: `email in id_token (...) isn't verified`): Authentiks id_token often lacks **`email_verified: true`** for bootstrap users. **`clusters/noble/bootstrap/oauth2-proxy/values.yaml`** sets **`insecure-oidc-allow-unverified-email`** for the lab; otherwise verify the users email in Authentik, then **`helm upgrade oauth2-proxy`** (or **`--tags authentik`**).
- Re-run **`configure_authentik.py`** only by executing **`noble.yml`** with **`--tags authentik`** after fixing `.env`.
- If Authentik API calls fail, check flows exist (slug **`default-provider-authorization-implicit-consent`**) and TLS reaches **`AUTHENTIK_API_BASE`**.
- **`GET …/flows/instances/…` → HTTP 403** with **`Token invalid/expired`**: the bootstrap API token is not accepted yet (common right after install: worker still creating it) or **`NOBLE_AUTHENTIK_BOOTSTRAP_TOKEN`** in `.env` does not match the value Helm applied. Re-run **`--tags authentik`** (the role waits for **`GET …/core/applications/`** to return **200** with your token). If you rotated the token in `.env` only, run the play again so Helm picks up the new value, or mint a new API token for **`akadmin`** in the admin UI.
- **`GET …/flows/instances/…` → HTTP 403** with **permission** errors (Authentik **2026+** RBAC): the bootstrap API token often cannot **view flows**. The role reads flow UUIDs from the **worker** database (`kubectl exec` + **`ak shell`**) when **`noble_authentik_oauth_authorization_flow_pk`** / **`noble_authentik_oauth_invalidation_flow_pk`** are unset. The same pattern applies to **`/crypto/certificatekeypairs/`**, **`/propertymappings/…`**, **`/core/groups/`**, and the matching **`noble_authentik_*`** inventory variables. If a lookup fails, fix **`akadmin`** / **authentik Admins** / token, or set the UUID variables manually (see below).
- **`GET …/crypto/certificatekeypairs/…` → HTTP 403** (permission): same RBAC issue as flows. When **`noble_authentik_oauth_signing_key_pk`** is unset, the role resolves the first **CertificateKeyPair** UUID from the **worker** DB. You can also set **`noble_authentik_oauth_signing_key_pk`** manually (Admin → **System****Certificates**).
- **`GET …/propertymappings/…` → HTTP 403** (permission): when **`noble_authentik_oauth_scope_mapping_pks`** is unset, the role resolves **ScopeMapping** UUIDs from the **worker** DB: **openid**, **email**, **profile**, **offline_access**, and **groups** only if a separate **`groups`** mapping exists (Authentik **2026.x** defaults put **groups** inside **profile** only).
- **`GET …/core/groups/…` → HTTP 403** (permission): when **`noble_authentik_group_pk_noble_admins`** and **`noble_authentik_group_pk_noble_editors`** are unset, the role runs **`resolve_noble_group_pks.py`** in the worker (**`get_or_create`** for **noble-admins** / **noble-editors**), then passes **`AUTHENTIK_NOBLE_*_GROUP_PK`** into **`configure_authentik.py`** so it skips group list/create via REST.
- **`GET …/providers/oauth2/…` → HTTP 403** (permission): bootstrap tokens often cannot list OAuth2 providers. With the default **`noble_authentik_oidc_provision_via: worker`**, the role upserts providers and applications in **`authentik-worker`** via Django ORM (**`worker_upsert_oauth_oidc.py`**) instead of **`configure_authentik.py`** REST. Set **`noble_authentik_oidc_provision_via: rest`** only if your API token has **view_oauth2provider** / provider edit permissions (e.g. a full **akadmin** token from the UI).
- **`GET …/core/users/…` → HTTP 403** when adding the bootstrap user to **noble-admins** / **noble-editors**: with **`noble_authentik_oidc_provision_via: worker`** and a non-empty bootstrap email, the role runs **`worker_add_bootstrap_user_groups.py`** in the worker (ORM **`User.groups.add`**) and sets **`AUTHENTIK_SKIP_USER_GROUP_REST`** so **`configure_authentik.py`** does not call the users API for membership.
- **Manual flow / signing / scope / group UUIDs (optional):** set **`noble_authentik_oauth_authorization_flow_pk`** and **`noble_authentik_oauth_invalidation_flow_pk`** (both together), optionally **`noble_authentik_oauth_signing_key_pk`**, **`noble_authentik_oauth_scope_mapping_pks`**, **`noble_authentik_group_pk_noble_admins`**, and **`noble_authentik_group_pk_noble_editors`**, from the admin UI or `-e` / `group_vars`; **`configure_authentik.py`** then skips the matching REST discovery calls.
- **`/if/admin/` redirects to `/if/user/`** (lost admin panel): in **2026.x**, **`canAccessAdmin`** follows **`isSuperuser`**, which is true only when the user belongs to a group with the **superuser** flag (**`authentik Admins`** by default). **`noble_authentik_ensure_admin_ui_access`** (default **true**) makes **`--tags authentik`** run **`files/worker_ensure_authentik_admin_access.py`** in **authentik-worker** (adds **akadmin** or the **bootstrap email** user to **authentik Admins** and forces **`is_superuser`** on that group). **Log out** of Authentik (private window is fine) and sign in again. Set **`noble_authentik_ensure_admin_ui_access: false`** to skip. Without Ansible, you can fix it in **Directory → Groups → authentik Admins** (superuser flag + membership) or run **`ak shell`** with the same logic as that script.
- **Grafana / Headlamp / ForwardAuth “Unauthorized” or Authentik “Not found”** (Authentik **2026.x**): OAuth endpoints are no longer under **`/application/o/<app>/oauth2/...`**. Use **issuer discovery** (Grafana **`server_url`** at **`…/application/o/<slug>/`**; oauth2-proxy **`oidc-issuer-url`**; Headlamp **`-oidc-idp-issuer-url`**). Re-apply **Traefik** (**`allowCrossNamespace`** so Ingresses can use Middleware in **`oauth2-proxy`**), **kube-prometheus-stack**, and **Headlamp** after updating values (e.g. **`ansible-playbook playbooks/noble.yml --tags authentik`**).
- **Headlamp OIDC authorize fails / `invalid_scope`**: Authentik often has no separate **`groups`** ScopeMapping (groups live under **`profile`**). Default **`noble_authentik_headlamp_oidc_scopes`** omits **`groups`**; add a **`groups`** mapping to the provider in Authentik and set **`noble_authentik_headlamp_oidc_scopes`** to include **`groups`** if you need that scope by name.
- **Headlamp OIDC: Authentik flashes then back at login / page refresh**: Headlamp **does** support normal browser OAuth (redirect to Authentik and return to **`/oidc-callback`**). If the callback fails, the UI looks like it “drops” auth. Common causes: **`X-Forwarded-Proto`** not reaching Headlamp (callback built as **`http`** — see [Headlamp OIDC docs](https://headlamp.dev/docs/latest/installation/in-cluster/oidc)); **Traefik ForwardAuth** on the same Ingress (do not combine with native OIDC); **PKCE** state issues — this role defaults **`noble_authentik_headlamp_oidc_use_pkce: false`** for Authentik confidential clients (set **`true`** in **`group_vars`** if you need PKCE).
- **Headlamp UI: `/me` works but `/clusters/main/version` (and other K8s calls) return 401**: Headlamp forwards your **OIDC id_token** to **kube-apiserver**. The API server must be configured with **OIDC** flags for the same **issuer** and **`oidc-client-id`** as Headlamp (see **`talos/talconfig.yaml`** patch and [Kubernetes OIDC authentication](https://kubernetes.io/docs/reference/access-authn-authz/authentication/#openid-connect-tokens)). Apply regenerated Talos configs to control plane nodes, then **`kubectl apply -k clusters/noble/bootstrap/headlamp`** (or **`--tags authentik`**) for **`oidc-noble-admins-clusterrolebinding.yaml`**. Ensure your user is in Authentik group **`noble-admins`** and the id_token includes a **`groups`** claim if you rely on that binding.
- **Headlamp + Traefik ForwardAuth (`oauth2-proxy-forward-auth`)**: Do **not** put ForwardAuth on the **Headlamp** Ingress while using **native Headlamp OIDC**. Auth runs on **`/oidc-callback`** before Headlamp can finish the code exchange; ForwardAuth returns **401** and breaks login. Use **either** native OIDC (this repos **`values-authentik-oidc.yaml`**) **or** terminate auth at oauth2-proxy only (no **`config.oidc`**), not both.
### Fix admin access manually (worker shell, no Ansible)
```bash
kubectl exec -it deploy/authentik-worker -n authentik -- ak shell -c "from authentik.core.models import User, Group; u=User.objects.get(username='akadmin'); adm,_=Group.objects.get_or_create(name='authentik Admins', defaults={'is_superuser': True}); adm.is_superuser=True; adm.save(update_fields=['is_superuser']); u.groups.add(adm); u=User.objects.get(pk=u.pk); print('is_superuser', u.is_superuser)"
```
Then **log out** of Authentik and sign in again.