diff --git a/ansible/roles/noble_authentik/README.md b/ansible/roles/noble_authentik/README.md index b62df15..f8e32e3 100644 --- a/ansible/roles/noble_authentik/README.md +++ b/ansible/roles/noble_authentik/README.md @@ -37,6 +37,11 @@ When **`noble_authentik_configure_idp`** is true, Ansible creates/updates OAuth2 - **`GET …/core/users/…` → HTTP 403** when adding the bootstrap user to **noble-admins** / **noble-editors**: with **`noble_authentik_oidc_provision_via: worker`** and a non-empty bootstrap email, the role runs **`worker_add_bootstrap_user_groups.py`** in the worker (ORM **`User.groups.add`**) and sets **`AUTHENTIK_SKIP_USER_GROUP_REST`** so **`configure_authentik.py`** does not call the users API for membership. - **Manual flow / signing / scope / group UUIDs (optional):** set **`noble_authentik_oauth_authorization_flow_pk`** and **`noble_authentik_oauth_invalidation_flow_pk`** (both together), optionally **`noble_authentik_oauth_signing_key_pk`**, **`noble_authentik_oauth_scope_mapping_pks`**, **`noble_authentik_group_pk_noble_admins`**, and **`noble_authentik_group_pk_noble_editors`**, from the admin UI or `-e` / `group_vars`; **`configure_authentik.py`** then skips the matching REST discovery calls. - **`/if/admin/` redirects to `/if/user/`** (lost admin panel): in **2026.x**, **`canAccessAdmin`** follows **`isSuperuser`**, which is true only when the user belongs to a group with the **superuser** flag (**`authentik Admins`** by default). **`noble_authentik_ensure_admin_ui_access`** (default **true**) makes **`--tags authentik`** run **`files/worker_ensure_authentik_admin_access.py`** in **authentik-worker** (adds **akadmin** or the **bootstrap email** user to **authentik Admins** and forces **`is_superuser`** on that group). **Log out** of Authentik (private window is fine) and sign in again. Set **`noble_authentik_ensure_admin_ui_access: false`** to skip. 
Without Ansible, you can fix it in **Directory → Groups → authentik Admins** (superuser flag + membership) or run **`ak shell`** with the same logic as that script. +- **Grafana / Headlamp / ForwardAuth “Unauthorized” or Authentik “Not found”** (Authentik **2026.x**): OAuth endpoints are no longer under **`/application/o//oauth2/...`**. Use **issuer discovery** (Grafana **`server_url`** at **`…/application/o//`**; oauth2-proxy **`oidc-issuer-url`**; Headlamp **`-oidc-idp-issuer-url`**). Re-apply **Traefik** (**`allowCrossNamespace`** so Ingresses can use Middleware in **`oauth2-proxy`**), **kube-prometheus-stack**, and **Headlamp** after updating values (e.g. **`ansible-playbook playbooks/noble.yml --tags authentik`**). +- **Headlamp OIDC authorize fails / `invalid_scope`**: Authentik often has no separate **`groups`** ScopeMapping (groups live under **`profile`**). Default **`noble_authentik_headlamp_oidc_scopes`** omits **`groups`**; add a **`groups`** mapping to the provider in Authentik and set **`noble_authentik_headlamp_oidc_scopes`** to include **`groups`** if you need that scope by name. +- **Headlamp OIDC: Authentik flashes then back at login / page refresh**: Headlamp **does** support normal browser OAuth (redirect to Authentik and return to **`/oidc-callback`**). If the callback fails, the UI looks like it “drops” auth. Common causes: **`X-Forwarded-Proto`** not reaching Headlamp (callback built as **`http`** — see [Headlamp OIDC docs](https://headlamp.dev/docs/latest/installation/in-cluster/oidc)); **Traefik ForwardAuth** on the same Ingress (do not combine with native OIDC); **PKCE** state issues — this role defaults **`noble_authentik_headlamp_oidc_use_pkce: false`** for Authentik confidential clients (set **`true`** in **`group_vars`** if you need PKCE). +- **Headlamp UI: `/me` works but `/clusters/main/version` (and other K8s calls) return 401**: Headlamp forwards your **OIDC id_token** to **kube-apiserver**. 
The API server must be configured with **OIDC** flags for the same **issuer** and **`oidc-client-id`** as Headlamp (see **`talos/talconfig.yaml`** patch and [Kubernetes OIDC authentication](https://kubernetes.io/docs/reference/access-authn-authz/authentication/#openid-connect-tokens)). Apply regenerated Talos configs to control plane nodes, then **`kubectl apply -k clusters/noble/bootstrap/headlamp`** (or **`--tags authentik`**) for **`oidc-noble-admins-clusterrolebinding.yaml`**. Ensure your user is in Authentik group **`noble-admins`** and the id_token includes a **`groups`** claim if you rely on that binding. +- **Headlamp + Traefik ForwardAuth (`oauth2-proxy-forward-auth`)**: Do **not** put ForwardAuth on the **Headlamp** Ingress while using **native Headlamp OIDC**. Auth runs on **`/oidc-callback`** before Headlamp can finish the code exchange; ForwardAuth returns **401** and breaks login. Use **either** native OIDC (this repo’s **`values-authentik-oidc.yaml`**) **or** terminate auth at oauth2-proxy only (no **`config.oidc`**), not both. ### Fix admin access manually (worker shell, no Ansible) @@ -45,4 +50,3 @@ kubectl exec -it deploy/authentik-worker -n authentik -- ak shell -c "from authe ``` Then **log out** of Authentik and sign in again. -- **Grafana / Headlamp / ForwardAuth “Unauthorized” or Authentik “Not found”** (Authentik **2026.x**): OAuth endpoints are no longer under **`/application/o//oauth2/...`**. Use **issuer discovery** (Grafana **`server_url`** at **`…/application/o//`**; oauth2-proxy **`oidc-issuer-url`**; Headlamp **`-oidc-idp-issuer-url`**). Re-apply **Traefik** (**`allowCrossNamespace`** so Ingresses can use Middleware in **`oauth2-proxy`**), **kube-prometheus-stack**, and **Headlamp** after updating values (e.g. **`ansible-playbook playbooks/noble.yml --tags authentik`**). 
diff --git a/ansible/roles/noble_authentik/defaults/main.yml b/ansible/roles/noble_authentik/defaults/main.yml index baa32df..785daa7 100644 --- a/ansible/roles/noble_authentik/defaults/main.yml +++ b/ansible/roles/noble_authentik/defaults/main.yml @@ -29,6 +29,15 @@ noble_authentik_client_id_grafana: grafana noble_authentik_client_id_headlamp: headlamp noble_authentik_client_id_oauth2_proxy: oauth2-proxy +# Headlamp **OIDC_SCOPES** for Secret **headlamp-oidc**. Omit **groups** unless the Authentik OAuth2 provider +# includes a separate **groups** ScopeMapping (2026.x defaults often embed groups in **profile** only; requesting +# **groups** then yields **invalid_scope** on authorize). Override if your IdP exposes **groups** explicitly. +noble_authentik_headlamp_oidc_scopes: "openid profile email offline_access" +# PKCE for Headlamp OIDC. **false** is the default for Authentik **confidential** clients: auth still uses the +# standard browser OAuth code flow; PKCE is optional and some users see the callback “flash” then login reset +# when PKCE state/cookies do not survive the redirect. Set **true** if you require PKCE. +noble_authentik_headlamp_oidc_use_pkce: false + # Secrets / bootstrap — prefer **lookup('env', ...)** set via repository **.env** (see from_env.yml). 
noble_authentik_secret_key: "" noble_authentik_postgresql_password: "" diff --git a/ansible/roles/noble_authentik/tasks/main.yml b/ansible/roles/noble_authentik/tasks/main.yml index 2a272e2..3db453f 100644 --- a/ansible/roles/noble_authentik/tasks/main.yml +++ b/ansible/roles/noble_authentik/tasks/main.yml @@ -444,9 +444,9 @@ --from-literal=OIDC_CLIENT_ID="{{ noble_authentik_client_id_headlamp }}" \ --from-literal=OIDC_CLIENT_SECRET="${HEADLAMP_OIDC_SECRET}" \ --from-literal=OIDC_ISSUER_URL="{{ noble_authentik_public_url }}/application/o/headlamp/" \ - --from-literal=OIDC_SCOPES="openid profile email groups offline_access" \ + --from-literal=OIDC_SCOPES="{{ noble_authentik_headlamp_oidc_scopes }}" \ --from-literal=OIDC_CALLBACK_URL="https://headlamp.apps.noble.lab.pcenicni.dev/oidc-callback" \ - --from-literal=OIDC_USE_PKCE="true" \ + --from-literal=OIDC_USE_PKCE="{{ 'true' if (noble_authentik_headlamp_oidc_use_pkce | bool) else 'false' }}" \ --dry-run=client -o yaml | kubectl apply -f - environment: KUBECONFIG: "{{ noble_kubeconfig }}" @@ -566,6 +566,7 @@ - "{{ noble_repo_root }}/clusters/noble/bootstrap/headlamp/values.yaml" - -f - "{{ noble_repo_root }}/clusters/noble/bootstrap/headlamp/values-authentik-oidc.yaml" + - --set=config.oidc.usePKCE={{ 'true' if (noble_authentik_headlamp_oidc_use_pkce | bool) else 'false' }} - --force-conflicts - --wait - --timeout @@ -574,6 +575,17 @@ KUBECONFIG: "{{ noble_kubeconfig }}" changed_when: true + - name: Apply Headlamp static manifests (metrics RBAC + OIDC group binding) + ansible.builtin.command: + argv: + - kubectl + - apply + - -k + - "{{ noble_repo_root }}/clusters/noble/bootstrap/headlamp" + environment: + KUBECONFIG: "{{ noble_kubeconfig }}" + changed_when: true + - name: Helm upgrade Longhorn with ForwardAuth (oauth2-proxy OIDC) ansible.builtin.command: argv: diff --git a/ansible/roles/noble_platform/tasks/main.yml b/ansible/roles/noble_platform/tasks/main.yml index 0027dd4..3770033 100644 --- 
a/ansible/roles/noble_platform/tasks/main.yml +++ b/ansible/roles/noble_platform/tasks/main.yml @@ -208,6 +208,17 @@ KUBECONFIG: "{{ noble_kubeconfig }}" changed_when: true +- name: Apply Headlamp static manifests (metrics RBAC + OIDC group binding when used) + ansible.builtin.command: + argv: + - kubectl + - apply + - -k + - "{{ noble_repo_root }}/clusters/noble/bootstrap/headlamp" + environment: + KUBECONFIG: "{{ noble_kubeconfig }}" + changed_when: true + - name: Argo CD — apply Application manifests after platform Helm ansible.builtin.include_role: name: noble_argocd diff --git a/clusters/noble/bootstrap/headlamp/README.md b/clusters/noble/bootstrap/headlamp/README.md index ada6228..a0b977f 100644 --- a/clusters/noble/bootstrap/headlamp/README.md +++ b/clusters/noble/bootstrap/headlamp/README.md @@ -15,7 +15,7 @@ helm upgrade --install headlamp headlamp/headlamp -n headlamp \ --version 0.40.1 -f clusters/noble/bootstrap/headlamp/values.yaml --wait --timeout 10m ``` -Sign-in uses a **ServiceAccount token** (Headlamp docs: create a limited SA for day-to-day use). This repo binds the Headlamp workload SA to the built-in **`edit`** ClusterRole (**`clusterRoleBinding.clusterRoleName: edit`** in **`values.yaml`**) — not **`cluster-admin`**. For cluster-scoped admin work, use **`kubectl`** with your admin kubeconfig. Optional **OIDC** in **`config.oidc`** replaces token login for SSO. +Sign-in uses a **ServiceAccount token** (Headlamp docs: create a limited SA for day-to-day use). This repo binds the Headlamp workload SA to the built-in **`edit`** ClusterRole (**`clusterRoleBinding.clusterRoleName: edit`** in **`values.yaml`**) — not **`cluster-admin`**. For cluster-scoped admin work, use **`kubectl`** with your admin kubeconfig. Optional **OIDC** in **`config.oidc`** replaces token login for SSO. 
**In-cluster OIDC requires kube-apiserver OIDC** (same Authentik app issuer + **`oidc-client-id: headlamp`**) or proxied K8s calls return **401** while **`/me`** still returns 200 — see **`talos/talconfig.yaml`**, **`oidc-noble-admins-clusterrolebinding.yaml`**, and **`ansible/roles/noble_authentik/README.md`** troubleshooting. ## Sign-in token (ServiceAccount `headlamp`) diff --git a/clusters/noble/bootstrap/headlamp/kustomization.yaml b/clusters/noble/bootstrap/headlamp/kustomization.yaml index 94478d2..07019bf 100644 --- a/clusters/noble/bootstrap/headlamp/kustomization.yaml +++ b/clusters/noble/bootstrap/headlamp/kustomization.yaml @@ -4,3 +4,4 @@ kind: Kustomization # Do not include it here — two Applications owning the same Namespace causes SharedResourceWarning. resources: - metrics-clusterrolebinding.yaml + - oidc-noble-admins-clusterrolebinding.yaml diff --git a/clusters/noble/bootstrap/headlamp/oidc-noble-admins-clusterrolebinding.yaml b/clusters/noble/bootstrap/headlamp/oidc-noble-admins-clusterrolebinding.yaml new file mode 100644 index 0000000..ac196a0 --- /dev/null +++ b/clusters/noble/bootstrap/headlamp/oidc-noble-admins-clusterrolebinding.yaml @@ -0,0 +1,19 @@ +# OIDC users in Authentik group **noble-admins** (claim **groups**) get the same cluster access as the Headlamp +# ServiceAccount binding (**edit**). Requires kube-apiserver **oidc-*** extraArgs (see **talos/talconfig.yaml**). +# If your IdP omits **groups** from the id_token, add a **groups** scope/mapping in Authentik or bind **User** subjects instead. 
+--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: headlamp-oidc-noble-admins + labels: + app.kubernetes.io/name: headlamp + app.kubernetes.io/component: oidc-rbac +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: edit +subjects: + - apiGroup: rbac.authorization.k8s.io + kind: Group + name: noble-admins diff --git a/clusters/noble/bootstrap/headlamp/values-authentik-oidc.yaml b/clusters/noble/bootstrap/headlamp/values-authentik-oidc.yaml index 5b84ecf..1e49657 100644 --- a/clusters/noble/bootstrap/headlamp/values-authentik-oidc.yaml +++ b/clusters/noble/bootstrap/headlamp/values-authentik-oidc.yaml @@ -1,7 +1,9 @@ # OIDC with Authentik — credentials live in Secret **headlamp-oidc** (envFrom), created by **noble_authentik**. +# **OIDC_SCOPES** in that Secret must match scopes the Authentik provider exposes (see **noble_authentik_headlamp_oidc_scopes**). # # With **externalSecret**, the Headlamp chart only adds **-oidc-callback-url** / **-oidc-use-pkce** args when these # values are set here (or under **env:**). The Secret alone is not enough — without them, login can fail or Authentik returns errors. +# **usePKCE** defaults **false** for Authentik confidential clients (Ansible **noble_authentik_headlamp_oidc_use_pkce** also passes **--set** on **--tags authentik**). config: oidc: @@ -11,4 +13,4 @@ config: enabled: true name: headlamp-oidc callbackURL: "https://headlamp.apps.noble.lab.pcenicni.dev/oidc-callback" - usePKCE: true + usePKCE: false diff --git a/talos/runbooks/rbac.md b/talos/runbooks/rbac.md index 9f7bd18..7122c5e 100644 --- a/talos/runbooks/rbac.md +++ b/talos/runbooks/rbac.md @@ -2,6 +2,19 @@ **Headlamp** (`clusters/noble/bootstrap/headlamp/values.yaml`): the chart’s **ClusterRoleBinding** uses the built-in **`edit`** ClusterRole — not **`cluster-admin`**. Break-glass changes use **`kubectl`** with an admin kubeconfig. 
+**Headlamp OIDC + kube-apiserver (401 on `/clusters/main/version`, 200 on `/me`)** +Headlamp sends your **IdP JWT** to the Kubernetes API. **`/me`** is answered by Headlamp; **`/clusters/.../version`** is proxied to **kube-apiserver**. **401** there means **authentication failed** at the API server (RBAC would normally be **403** after a successful auth). You must: + +1. **Roll out Talos control-plane config** that sets **`cluster.apiServer.extraArgs`** for the same Authentik app as Headlamp — see the second **`patches`** entry in **`talos/talconfig.yaml`** (`oidc-issuer-url`, `oidc-client-id: headlamp`, `oidc-username-claim`, `oidc-groups-claim`). After edits: **`talhelper genconfig -o out`**, then **`talosctl apply-config`** on each control plane (rolling). +2. **Ensure control planes can reach** `https://auth.apps.noble.lab.pcenicni.dev/...` (JWKS / discovery). If that URL is unreachable from nodes, OIDC validation fails. +3. **Apply cluster RBAC for OIDC groups**: **`kubectl apply -k clusters/noble/bootstrap/headlamp`** (includes **`oidc-noble-admins-clusterrolebinding.yaml`**). Your user must be in Authentik group **`noble-admins`** and the id_token should carry a **`groups`** claim if you rely on that binding. + +Quick discovery check (any machine with DNS to Authentik): + +```bash +curl -fsS "https://auth.apps.noble.lab.pcenicni.dev/application/o/headlamp/.well-known/openid-configuration" | head -c 400; echo +``` + **Argo CD** (`clusters/noble/bootstrap/argocd/values.yaml`): **`policy.default: role:readonly`** — new OIDC/Git users get read-only unless you add **`g, , role:admin`** (or another role) in **`configs.rbac.policy.csv`**. Local user **`admin`** stays **`role:admin`** via **`g, admin, role:admin`**. 
**Audits** diff --git a/talos/talconfig.with-longhorn.yaml b/talos/talconfig.with-longhorn.yaml index e14d616..2f93693 100644 --- a/talos/talconfig.with-longhorn.yaml +++ b/talos/talconfig.with-longhorn.yaml @@ -94,3 +94,14 @@ patches: - bind - rshared - rw + # Headlamp OIDC: the UI sends the IdP JWT to kube-apiserver. Without these flags, /clusters/*/version etc. return 401. + # Must match Authentik **headlamp** provider (same issuer + client_id as Headlamp Helm / **headlamp-oidc** Secret). + # After **talhelper genconfig**, apply control plane configs (rolling). Then ensure RBAC (e.g. **oidc-noble-admins-clusterrolebinding.yaml**). + - |- + cluster: + apiServer: + extraArgs: + oidc-issuer-url: https://auth.apps.noble.lab.pcenicni.dev/application/o/headlamp/ + oidc-client-id: headlamp + oidc-username-claim: email + oidc-groups-claim: groups diff --git a/talos/talconfig.yaml b/talos/talconfig.yaml index e14d616..2f93693 100644 --- a/talos/talconfig.yaml +++ b/talos/talconfig.yaml @@ -94,3 +94,14 @@ patches: - bind - rshared - rw + # Headlamp OIDC: the UI sends the IdP JWT to kube-apiserver. Without these flags, /clusters/*/version etc. return 401. + # Must match Authentik **headlamp** provider (same issuer + client_id as Headlamp Helm / **headlamp-oidc** Secret). + # After **talhelper genconfig**, apply control plane configs (rolling). Then ensure RBAC (e.g. **oidc-noble-admins-clusterrolebinding.yaml**). + - |- + cluster: + apiServer: + extraArgs: + oidc-issuer-url: https://auth.apps.noble.lab.pcenicni.dev/application/o/headlamp/ + oidc-client-id: headlamp + oidc-username-claim: email + oidc-groups-claim: groups
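
Reviewer note on the 401 troubleshooting added above: the runbook says kube-apiserver rejects the token unless the issuer and client ID match the `oidc-*` extraArgs, and the group binding only applies if the id_token carries a `groups` claim. A minimal sketch of how those three conditions could be checked offline against a token copied from the browser is shown below. It is not part of this change set. The function names (`decode_jwt_payload`, `check_claims`, `make_unsigned_token`) are hypothetical, the sample token is fabricated for illustration, and the signature is deliberately not verified, so this is a debugging aid only.

```python
import base64
import json


def decode_jwt_payload(token):
    """Return the claims of a JWT WITHOUT verifying the signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def check_claims(claims, issuer, client_id, group):
    """Compare claims with the values set in the apiServer extraArgs patch."""
    problems = []
    if claims.get("iss") != issuer:
        problems.append("iss does not match oidc-issuer-url")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if client_id not in audiences:
        problems.append("aud does not contain oidc-client-id")
    if "email" not in claims:
        problems.append("missing email (oidc-username-claim)")
    if group not in claims.get("groups", []):
        problems.append("groups claim missing or lacks " + group)
    return problems


def make_unsigned_token(claims):
    """Build a fake, unsigned token so the sketch can run without a real IdP."""
    def b64(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return ".".join([b64({"alg": "none"}), b64(claims), "sig"])


if __name__ == "__main__":
    # Values taken from the talconfig.yaml patch in this diff.
    issuer = "https://auth.apps.noble.lab.pcenicni.dev/application/o/headlamp/"
    sample = make_unsigned_token({
        "iss": issuer,
        "aud": "headlamp",
        "email": "user@example.com",  # illustrative value
        "groups": ["noble-admins"],
    })
    for problem in check_claims(decode_jwt_payload(sample), issuer, "headlamp", "noble-admins"):
        print(problem)
```

An empty result only shows that the claims line up with the configured flags. The API server still has to reach the issuer's discovery and JWKS endpoints and validate the signature, as step 2 of the runbook describes.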