noble_authentik — Authentik + OIDC for the noble stack

Installs Authentik (Helm goauthentik/authentik) as the cluster IdP, oauth2-proxy as an OIDC client to Authentik for Traefik ForwardAuth (Prometheus, Alertmanager, Longhorn UI), and re-applies Helm values so Argo CD, Grafana, and Headlamp use native OIDC to Authentik (not HTTP BasicAuth).

Enable

  1. Copy repository .env.sample to .env and set every NOBLE_AUTHENTIK_* variable (see comments there).
  2. Set noble_authentik_install: true in ansible/inventory/group_vars/all.yml (or pass -e noble_authentik_install=true).
  3. Run ansible-playbook playbooks/noble.yml --tags authentik (or a full noble.yml) from ansible/ with a working KUBECONFIG.
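
Before step 3, it can help to confirm that no NOBLE_AUTHENTIK_* variable was left empty in .env. The helper below is a minimal sketch and not part of this role: the function name is hypothetical, it assumes plain KEY=value lines (optionally prefixed with export), and NOBLE_AUTHENTIK_EXAMPLE_SECRET in the usage note is a made-up name used only for illustration.

```python
def missing_authentik_vars(env_text: str, prefix: str = "NOBLE_AUTHENTIK_") -> list[str]:
    """Return the names of prefixed variables whose value is empty in .env-style text."""
    missing = []
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        name = name.strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        # Treat quoted-empty values ('' or "") as empty too.
        value = value.strip().strip("'\"")
        if name.startswith(prefix) and not value:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the repository root, next to .env.
    with open(".env", encoding="utf-8") as handle:
        for name in missing_authentik_vars(handle.read()):
            print(f"unset: {name}")
```

For example, text containing NOBLE_AUTHENTIK_BOOTSTRAP_TOKEN=abc and NOBLE_AUTHENTIK_EXAMPLE_SECRET= would report only the second name.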

noble_authentik runs after noble_platform so Grafana / Headlamp / Prometheus exist before SSO Helm upgrades.

Variables

See defaults/main.yml. Hostnames default to auth.apps.noble.lab.pcenicni.dev and oauth2.apps.noble.lab.pcenicni.dev. noble_authentik_ensure_admin_ui_access (default true) re-applies authentik Admins superuser membership via the worker on each --tags authentik run so the admin UI keeps working under 2026+ RBAC.

Extra public hostname (Pangolin + Newt, same Authentik)

To expose the same Authentik instance on an internet-facing FQDN (while keeping the lab name on Traefik), set noble_authentik_ingress_extra_hosts in ansible/inventory/group_vars/all.yml (or -e) to a list of extra FQDNs, for example auth.example.com. Re-run ansible-playbook playbooks/noble.yml --tags authentik. Ansible extends server.ingress.hosts and tls[0].hosts so cert-manager issues one certificate with SANs for the primary noble_authentik_host plus those names (DNS must resolve for your issuer — often Cloudflare for public names, split horizon for lab).
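
The host merge described above can be pictured roughly as follows. This is an illustrative sketch, not the role's actual task: the function name is hypothetical, the secretName value authentik-tls is an assumption, and the real work happens in Ansible when it renders the Helm values.

```python
def extend_ingress_hosts(primary_host: str, extra_hosts: list[str],
                         tls_secret: str = "authentik-tls") -> dict:
    """Build a server.ingress fragment with one TLS entry covering every hostname."""
    hosts = [primary_host]
    for host in extra_hosts:
        # Keep order stable and skip blanks and duplicates.
        if host and host not in hosts:
            hosts.append(host)
    return {
        "server": {
            "ingress": {
                "hosts": hosts,
                # A single tls entry means a single certificate with all names as SANs.
                "tls": [{"secretName": tls_secret, "hosts": list(hosts)}],
            }
        }
    }
```

Calling extend_ingress_hosts("auth.apps.noble.lab.pcenicni.dev", ["auth.example.com"]) yields both names under hosts and under tls[0].hosts, which is what lets cert-manager issue one certificate for the pair.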

Then in Pangolin: link the domain, create an HTTP resource for that hostname, and set the target to your Newt site with ip:port pointing at the cluster Traefik HTTPS entry (same pattern as clusters/noble/bootstrap/newt/README.md — typically the MetalLB / LAN VIP and 443). One Newt tunnel can front many hostnames.

In Authentik, add a Brand (or equivalent) for the new hostname if you want different titles/favicon; OAuth redirect URIs for each app must include issuer URLs that match what browsers use (often you keep internal issuer URLs in cluster apps and use the public URL only for human login, or align all apps to the public issuer — pick one strategy to avoid mixed iss / callback mismatches).

“Secondary tenant” (separate PostgreSQL schema — alpha)

Authentik tenancy (multiple isolated tenants in one deployment, AUTHENTIK_TENANTS__ENABLED) is alpha, requires per-tenant Enterprise licensing, AUTHENTIK_TENANTS__API_KEY, and AUTHENTIK_OUTPOSTS__DISABLE_EMBEDDED_OUTPOST=true (embedded outposts are unsupported with tenancy). It is not wired in this repo by default. See Tenancy. For most homelabs, one tenant plus noble_authentik_ingress_extra_hosts is the right split.

IdP configuration

When noble_authentik_configure_idp is true, Ansible creates or updates OAuth2 providers and applications for argocd, grafana, headlamp, and oauth2-proxy using one of two paths:

  • Worker ORM path (default, noble_authentik_oidc_provision_via: worker): kubectl exec + ak shell + files/worker_upsert_oauth_oidc.py. This avoids the 2026+ REST 403 on GET …/providers/oauth2/.
  • REST-only path (noble_authentik_oidc_provision_via: rest): files/configure_authentik.py, which needs a token that can list and patch OAuth2 providers.

With the worker path and a bootstrap email, the role also runs files/worker_add_bootstrap_user_groups.py so User.groups.add does not depend on GET …/core/users/. It then runs configure_authentik.py with AUTHENTIK_SKIP_OIDC_REST / AUTHENTIK_SKIP_USER_GROUP_REST set when those worker steps ran, so the script only calls ensure_group over the API (skipped when AUTHENTIK_NOBLE_*_GROUP_PK are set).
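
The branching between the two provisioning paths can be summarized in a few lines. The sketch below only models the decision described above; the function name is hypothetical and the flag value "1" is an assumption, since the role's tasks define how the skip variables are actually set.

```python
def plan_idp_steps(provision_via: str = "worker", bootstrap_email: str = "") -> dict:
    """Return the scripts to run, in order, and the skip flags for configure_authentik.py."""
    if provision_via not in ("worker", "rest"):
        raise ValueError(f"unknown noble_authentik_oidc_provision_via: {provision_via!r}")

    steps: list[str] = []
    env: dict[str, str] = {}

    if provision_via == "worker":
        # Providers and applications are upserted through the Django ORM in the worker.
        steps.append("worker_upsert_oauth_oidc.py")
        env["AUTHENTIK_SKIP_OIDC_REST"] = "1"  # assumed flag value
        if bootstrap_email:
            # Group membership is also set via the ORM, avoiding GET .../core/users/.
            steps.append("worker_add_bootstrap_user_groups.py")
            env["AUTHENTIK_SKIP_USER_GROUP_REST"] = "1"  # assumed flag value

    # configure_authentik.py always runs last; the flags narrow what it does over REST.
    steps.append("configure_authentik.py")
    return {"steps": steps, "env": env}
```

With the defaults and a bootstrap email, all three scripts run and both skip flags are set; with provision_via="rest", only configure_authentik.py runs and no flags are set.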

RBAC notes

  • Argo CD: noble-admins group → role:admin (see clusters/noble/bootstrap/argocd/values-authentik-oidc.yaml).
  • Grafana: noble-admins → Admin, noble-editors → Editor (see values-authentik-oidc.yaml).
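
As a rough illustration of the Grafana mapping above, the sketch below resolves a role from a user's group list. It is not how Grafana evaluates the mapping (that lives in values-authentik-oidc.yaml); the function name is hypothetical, and returning None for users in neither group is an assumption, since this README does not state the fallback.

```python
from typing import Optional

# Ordered from most to least privileged so the first match wins.
GRAFANA_ROLE_BY_GROUP = [
    ("noble-admins", "Admin"),
    ("noble-editors", "Editor"),
]


def grafana_role(groups: list[str]) -> Optional[str]:
    """Return the highest Grafana role granted by the user's Authentik groups."""
    for group, role in GRAFANA_ROLE_BY_GROUP:
        if group in groups:
            return role
    return None  # assumed: no mapped group, no role decided here
```

A user in both groups resolves to Admin because noble-admins is checked first.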

Troubleshooting

  • oauth2-proxy shows 500 on oauth2.apps…/oauth2/callback (logs: email in id_token (...) isn't verified): Authentik's id_token often lacks email_verified: true for bootstrap users. clusters/noble/bootstrap/oauth2-proxy/values.yaml sets insecure-oidc-allow-unverified-email for the lab; otherwise verify the user's email in Authentik, then helm upgrade oauth2-proxy (or --tags authentik).
  • Re-run configure_authentik.py only by executing noble.yml with --tags authentik after fixing .env.
  • If Authentik API calls fail, check flows exist (slug default-provider-authorization-implicit-consent) and TLS reaches AUTHENTIK_API_BASE.
  • GET …/flows/instances/… → HTTP 403 with Token invalid/expired: the bootstrap API token is not accepted yet (common right after install: worker still creating it) or NOBLE_AUTHENTIK_BOOTSTRAP_TOKEN in .env does not match the value Helm applied. Re-run --tags authentik (the role waits for GET …/core/applications/ to return 200 with your token). If you rotated the token in .env only, run the play again so Helm picks up the new value, or mint a new API token for akadmin in the admin UI.
  • GET …/flows/instances/… → HTTP 403 with permission errors (Authentik 2026+ RBAC): the bootstrap API token often cannot view flows. The role reads flow UUIDs from the worker database (kubectl exec + ak shell) when noble_authentik_oauth_authorization_flow_pk / noble_authentik_oauth_invalidation_flow_pk are unset. The same pattern applies to /crypto/certificatekeypairs/, /propertymappings/…, /core/groups/, and the matching noble_authentik_* inventory variables. If a lookup fails, fix akadmin / authentik Admins / token, or set the UUID variables manually (see below).
  • GET …/crypto/certificatekeypairs/… → HTTP 403 (permission): same RBAC issue as flows. When noble_authentik_oauth_signing_key_pk is unset, the role resolves the first CertificateKeyPair UUID from the worker DB. You can also set noble_authentik_oauth_signing_key_pk manually (Admin → System → Certificates).
  • GET …/propertymappings/… → HTTP 403 (permission): when noble_authentik_oauth_scope_mapping_pks is unset, the role resolves ScopeMapping UUIDs from the worker DB: openid, email, profile, offline_access, and groups only if a separate groups mapping exists (Authentik 2026.x defaults put groups inside profile only).
  • GET …/core/groups/… → HTTP 403 (permission): when noble_authentik_group_pk_noble_admins and noble_authentik_group_pk_noble_editors are unset, the role runs resolve_noble_group_pks.py in the worker (get_or_create for noble-admins / noble-editors), then passes AUTHENTIK_NOBLE_*_GROUP_PK into configure_authentik.py so it skips group list/create via REST.
  • GET …/providers/oauth2/… → HTTP 403 (permission): bootstrap tokens often cannot list OAuth2 providers. With the default noble_authentik_oidc_provision_via: worker, the role upserts providers and applications in authentik-worker via Django ORM (worker_upsert_oauth_oidc.py) instead of configure_authentik.py REST. Set noble_authentik_oidc_provision_via: rest only if your API token has view_oauth2provider / provider edit permissions (e.g. a full akadmin token from the UI).
  • GET …/core/users/… → HTTP 403 when adding the bootstrap user to noble-admins / noble-editors: with noble_authentik_oidc_provision_via: worker and a non-empty bootstrap email, the role runs worker_add_bootstrap_user_groups.py in the worker (ORM User.groups.add) and sets AUTHENTIK_SKIP_USER_GROUP_REST so configure_authentik.py does not call the users API for membership.
  • Manual flow / signing / scope / group UUIDs (optional): set noble_authentik_oauth_authorization_flow_pk and noble_authentik_oauth_invalidation_flow_pk (both together), optionally noble_authentik_oauth_signing_key_pk, noble_authentik_oauth_scope_mapping_pks, noble_authentik_group_pk_noble_admins, and noble_authentik_group_pk_noble_editors, from the admin UI or -e / group_vars; configure_authentik.py then skips the matching REST discovery calls.
  • /if/admin/ redirects to /if/user/ (lost admin panel): in 2026.x, canAccessAdmin follows isSuperuser, which is true only when the user belongs to a group with the superuser flag (authentik Admins by default). noble_authentik_ensure_admin_ui_access (default true) makes --tags authentik run files/worker_ensure_authentik_admin_access.py in authentik-worker (adds akadmin or the bootstrap email user to authentik Admins and forces is_superuser on that group). Log out of Authentik (private window is fine) and sign in again. Set noble_authentik_ensure_admin_ui_access: false to skip. Without Ansible, you can fix it in Directory → Groups → authentik Admins (superuser flag + membership) or run ak shell with the same logic as that script.
  • Grafana / Headlamp / ForwardAuth “Unauthorized” or Authentik “Not found” (Authentik 2026.x): OAuth endpoints are no longer under /application/o/<app>/oauth2/.... Use issuer discovery (Grafana server_url at …/application/o/<slug>/; oauth2-proxy oidc-issuer-url; Headlamp -oidc-idp-issuer-url). Re-apply Traefik (allowCrossNamespace so Ingresses can use Middleware in oauth2-proxy), kube-prometheus-stack, and Headlamp after updating values (e.g. ansible-playbook playbooks/noble.yml --tags authentik).
  • Headlamp OIDC authorize fails / invalid_scope: Authentik often has no separate groups ScopeMapping (groups live under profile). Default noble_authentik_headlamp_oidc_scopes omits groups; add a groups mapping to the provider in Authentik and set noble_authentik_headlamp_oidc_scopes to include groups if you need that scope by name.
  • Headlamp OIDC: Authentik flashes briefly, then you land back at the login page (or the page refreshes): Headlamp does support normal browser OAuth (redirect to Authentik and return to /oidc-callback). If the callback fails, the UI looks like it “drops” auth. Common causes: X-Forwarded-Proto not reaching Headlamp, so the callback is built as http (see Headlamp OIDC docs); Traefik ForwardAuth on the same Ingress (do not combine with native OIDC); PKCE state issues. This role defaults noble_authentik_headlamp_oidc_use_pkce: false for Authentik confidential clients (set true in group_vars if you need PKCE).
  • Headlamp UI: /me works but /clusters/main/version (and other K8s calls) return 401: Headlamp forwards your OIDC id_token to kube-apiserver. The API server must be configured with OIDC flags for the same issuer and oidc-client-id as Headlamp (see talos/talconfig.yaml patch and Kubernetes OIDC authentication). Apply regenerated Talos configs to control plane nodes, then kubectl apply -k clusters/noble/bootstrap/headlamp (or --tags authentik) for oidc-noble-admins-clusterrolebinding.yaml. Ensure your user is in Authentik group noble-admins and the id_token includes a groups claim if you rely on that binding.
  • Headlamp + Traefik ForwardAuth (oauth2-proxy-forward-auth): Do not put ForwardAuth on the Headlamp Ingress while using native Headlamp OIDC. Auth runs on /oidc-callback before Headlamp can finish the code exchange; ForwardAuth returns 401 and breaks login. Use either native OIDC (this repo's values-authentik-oidc.yaml) or terminate auth at oauth2-proxy only (no config.oidc), not both.
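
Several items above (the oauth2-proxy email_verified error and the Headlamp 401s that depend on a groups claim) come down to what the id_token actually contains. The sketch below decodes a token's payload for inspection. It is a debugging aid only: it does not verify the signature, both function names are hypothetical, and it assumes the claim names email_verified and groups mentioned in this list.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # base64url payloads usually omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def diagnose_id_token(claims: dict, required_group: str = "noble-admins") -> list[str]:
    """List the claim problems that match the troubleshooting items above."""
    problems = []
    if claims.get("email_verified") is not True:
        problems.append("email_verified is not true (oauth2-proxy may reject the login)")
    if required_group not in (claims.get("groups") or []):
        problems.append(f"groups claim does not include {required_group}")
    return problems
```

Paste an id_token captured from the browser or from logs into decode_jwt_claims and pass the result to diagnose_id_token; an empty list means neither of the two claim problems applies. Treat tokens as secrets and avoid pasting them into shared tools.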

Fix admin access manually (worker shell, no Ansible)

kubectl exec -it deploy/authentik-worker -n authentik -- ak shell -c "from authentik.core.models import User, Group; u=User.objects.get(username='akadmin'); adm,_=Group.objects.get_or_create(name='authentik Admins', defaults={'is_superuser': True}); adm.is_superuser=True; adm.save(update_fields=['is_superuser']); u.groups.add(adm); u=User.objects.get(pk=u.pk); print('is_superuser', u.is_superuser)"

Then log out of Authentik and sign in again.