ADR 0009: Authentik blueprints as the source of truth for per-app OIDC config

Status: Accepted
Date: 2026-04-30

Context

Every SCALR app needs its own OIDC client at the Authentik server: a Provider with client_id=<slug>, an Application bound to it, redirect URIs covering both the gateway hostname (<slug>.localhost) and the direct-port URL, and a group binding so users can sign in. Until now this was a manual click-through in the Authentik UI — documented at docs/content/guides/configuring-authentik-oidc.md, referenced from every app's AGENT_INSTRUCTIONS.md and from make new-app's output.

The manual step is the most persistent source of friction in scaffolding a new app. It also produces inconsistent state. Across the seven existing apps, two had Provider names like Calculator (matching the display name), one had template-app-provider (the convention the docs originally suggested), and one app's redirect URIs had a typo (http://localhost: 5204, with a stray space before the port) that broke OIDC discovery silently.

We needed a way to:

  1. Scaffold a new app's OIDC config without UI clicks.
  2. Keep config consistent across apps.
  3. Make config diff-reviewable in PRs.
  4. Heal drift — if someone tweaks a Provider in the UI, our intent is recoverable from the repo.

Candidates

  • Authentik REST API from the scaffold script. The admin UI is built on the API; everything is exposed. Works, but introduces a runtime dependency: Authentik must be reachable at scaffold time, the scaffold script needs a bootstrap token, error handling has to deal with network failures, and the call is imperative (one-shot). No drift correction.
  • Authentik blueprints. Declarative YAML files Authentik watches and reconciles. Already used in this repo for the test user (services/auth/blueprints/scalr-test-user.yaml). Idempotent, drift-correcting, and the file IS the source of truth.
  • Terraform (goauthentik/terraform-provider-authentik). Production-grade, third-party, well-maintained. Brings a whole tool and concept (state files, plan/apply, lock files) for one use case.

Decision

Use Authentik blueprints. One YAML per app at services/auth/blueprints/apps/<slug>.yaml. The scaffold script generates the file at app-creation time. Existing apps got blueprints generated in the same change.
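The file-generation step could look roughly like the sketch below. It is not the code in infra/scripts/new-app.py; the helper names (blueprint_path, write_blueprint), the slug pattern, and the write-then-rename approach are assumptions made for illustration. Only the services/auth/blueprints/apps/<slug>.yaml location comes from this ADR.

```python
import re
from pathlib import Path

# Location taken from this ADR; everything else here is a hypothetical sketch.
BLUEPRINT_DIR = Path("services/auth/blueprints/apps")
# Assumed slug convention: lowercase alphanumerics separated by single hyphens.
SLUG_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def blueprint_path(repo_root: Path, slug: str) -> Path:
    """Return the per-app blueprint path, rejecting slugs that could escape the directory."""
    if not SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid slug: {slug!r}")
    return repo_root / BLUEPRINT_DIR / f"{slug}.yaml"


def write_blueprint(repo_root: Path, slug: str, body: str) -> Path:
    """Write the blueprint via a temp file and rename, so a watcher never sees a half-written .yaml."""
    target = blueprint_path(repo_root, slug)
    target.parent.mkdir(parents=True, exist_ok=True)
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(body, encoding="utf-8")
    tmp.replace(target)  # replaces the destination in one step on the same filesystem
    return target
```

Writing to a .tmp sibling first is a design choice for this sketch: the reconciler only ever sees a complete YAML file, and the script can exit as soon as the rename returns.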

The blueprint identifies its Provider by client_id (which is unique), so it reconciles cleanly with Providers that pre-exist with different display names — including the typo-laden ones from manual setup.
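A toy model of that matching behaviour, assuming a simplified in-memory store: an entry is looked up by its identifiers (here client_id), and its attrs are then written over whatever is already there. This is an illustration of the idea, not Authentik's importer code, and apply_entry is a hypothetical name.

```python
from typing import Any


def apply_entry(
    store: list[dict[str, Any]],
    identifiers: dict[str, Any],
    attrs: dict[str, Any],
) -> str:
    """Upsert one entry; return 'created', 'updated', or 'unchanged'."""
    for obj in store:
        # Match on identifiers only, so a differing display name does not matter.
        if all(obj.get(k) == v for k, v in identifiers.items()):
            changed = {k: v for k, v in attrs.items() if obj.get(k) != v}
            obj.update(changed)
            return "updated" if changed else "unchanged"
    store.append({**identifiers, **attrs})
    return "created"


# A Provider left over from manual setup, with the old naming convention.
providers = [{"client_id": "calculator", "name": "template-app-provider"}]
desired = {"name": "Calculator", "client_type": "public"}

print(apply_entry(providers, {"client_id": "calculator"}, desired))  # updated
print(apply_entry(providers, {"client_id": "calculator"}, desired))  # unchanged
```

The second call returning "unchanged" is the idempotency property: re-applying an entry whose state already matches does nothing.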

Each blueprint creates three entries:

  • OAuth2/OIDC Provider: client_type=public, redirect URIs for both localhost:<port> and <slug>.localhost (each with and without trailing slash), default authorization + invalidation flows, default OIDC scope mappings (openid, email, profile).
  • Application: slug=<slug>, meta_launch_url=http://<slug>.localhost, bound to the Provider above by !KeyOf reference.
  • Policy binding to the scalr-users group (provisioned by the test-user blueprint), so anyone in that group can sign in.
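The redirect-URI set described for the Provider can be derived mechanically from the slug and port. A minimal sketch, assuming plain http and no callback path (the ADR does not say whether the real URIs carry one); redirect_uris is a hypothetical helper name, and calculator/5204 are used only as example inputs.

```python
def redirect_uris(slug: str, port: int) -> list[str]:
    """Build the four redirect URIs: direct port and gateway host, each with and without a trailing slash."""
    bases = [f"http://localhost:{port}", f"http://{slug}.localhost"]
    uris: list[str] = []
    for base in bases:
        uris.extend([base, base + "/"])
    # Guard against the kind of whitespace typo that slipped in during manual setup.
    for uri in uris:
        if any(ch.isspace() for ch in uri):
            raise ValueError(f"whitespace in redirect URI: {uri!r}")
    return uris


for uri in redirect_uris("calculator", 5204):
    print(uri)
```

Generating the list in one place means a typo cannot be introduced per app by hand.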

services/auth/blueprints/ is bind-mounted into both the server and worker containers as /blueprints/scalr/. Authentik scans recursively, so the apps/ subdirectory is picked up automatically.

Why blueprints beat the alternatives

  1. Idempotent. Re-applying does nothing if the entry's current state already matches the YAML. Editing the YAML and waiting for the next reconcile pushes the change. No "did the API call succeed" question.
  2. Drift-correcting. Manual UI changes are reverted on the next reconcile. The repo IS the truth. Caught the calculator redirect-URI typo automatically when the blueprint reconciled.
  3. No new infrastructure. The mount and reconciler already existed for the test-user blueprint. We just added more files.
  4. No new tools. Terraform would have meant introducing state management, a lock file, an apply step. Blueprints are config + reconciler — same model as Kubernetes manifests, well understood.
  5. Async without a wait step. The scaffold script writes the file and exits. Authentik's worker picks it up within ~60s, or you can docker compose restart worker for an immediate apply. The agent doesn't need to poll Authentik or block on it.

Consequences

Positive:

  • New app onboarding is one command (make new-app or python3 infra/scripts/new-app.py) plus one optional worker restart for instant pickup.
  • OIDC config is in PR diffs, so changes to redirect URIs / launch URLs / group bindings are reviewable.
  • The Authentik UI step in docs/content/guides/configuring-authentik-oidc.md is now a fallback, not the primary flow.

Negative:

  • Async failure mode. If the blueprint has a YAML error or references a missing flow, the scaffold script returns success and the broken state surfaces only when sign-in fails. Mitigated: the scaffold script's banner tells you to check docker compose logs worker | grep -i blueprint, and an actual failure shows up as status: error on the BlueprintInstance API.
  • No automatic teardown. Deleting an app's blueprint file removes the BlueprintInstance record but the Provider/Application it created persist (Authentik blueprints don't aggressively track entity ownership). Manual cleanup needed via UI or API on app removal. Acceptable trade for the simplicity gain, since apps are rarely deleted.
  • Provider names get standardised. Existing apps had varied Provider display names (Calculator, template-app-provider, etc.) from manual setup. The blueprint sets name: <Display Name> (matching the manifest's name field). On first reconcile, names converge to that convention. Cosmetic only: client_id (the OIDC-relevant field) doesn't change.
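For the async failure mode, a check against the BlueprintInstance API might look like the sketch below. The endpoint path /api/v3/managed/blueprints/, the Bearer-token header, and the paginated "results" shape are assumptions about the Authentik API that should be verified against the deployed version; the function names and sample paths are hypothetical. The filtering logic is kept separate from the network call so it can be exercised offline.

```python
import json
import urllib.request
from typing import Any


def fetch_blueprint_instances(base_url: str, token: str) -> dict[str, Any]:
    """Fetch blueprint instances. Endpoint path and auth scheme are assumptions, see above."""
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v3/managed/blueprints/",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def failed_instances(payload: dict[str, Any]) -> list[str]:
    """Return the path (or name) of every instance whose status is 'error'."""
    return [
        item.get("path") or item.get("name", "?")
        for item in payload.get("results", [])
        if item.get("status") == "error"
    ]


# Offline example with a hand-written payload (the paths are made up).
sample = {
    "results": [
        {"name": "calculator", "path": "scalr/apps/calculator.yaml", "status": "successful"},
        {"name": "notes", "path": "scalr/apps/notes.yaml", "status": "error"},
    ]
}
print(failed_instances(sample))  # ['scalr/apps/notes.yaml']
```

A check like this could run in CI or at the end of the scaffold script after a short delay, as a complement to grepping the worker logs.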

What's NOT in the blueprint

Out of scope for the per-app blueprint, intentionally:

  • The signing certificate. Uses Authentik's default self-signed cert in dev. Prod will need a real cert; the blueprint references it by name (authentik Self-signed Certificate) so changing the cert in prod is a one-line edit.
  • Custom scope mappings (e.g., groups claim). Apps work today with the three default scope mappings (openid, email, profile). When/if we need group-based entitlement to actually flow into JWTs, we'll add a custom scope mapping in a separate blueprint and reference it from each app's property_mappings list.
  • Per-app group bindings. All apps bind to scalr-users (the broad-access group). Tightening any one app's access is a manual addition for now — fine for v0.

Migration path

If we ever decide blueprints are wrong: each blueprint maps cleanly to a few terraform-provider-authentik resources. Convert the YAML to HCL, run terraform import against existing entities, done. The Provider/Application/Binding entities themselves stay the same — only the management tool changes.