From a548665edb18c41631ec3ea292302bc197907a75 Mon Sep 17 00:00:00 2001
From: Lukasz Raczylo
Date: Mon, 18 May 2026 17:35:37 +0100
Subject: [PATCH] feat: opt-in M2M bearer-token authentication (supersedes #93) (#140)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* docs: bearer-token auth design spec

* docs: harden bearer-auth spec with security review findings

* feat(bearer): opt-in M2M bearer-token authentication

Adds an opt-in Authorization: Bearer path for machine-to-machine clients.
Replaces and supersedes the broken approach in PR #93 (synthetic-session
that omitted user_identifier and skipped ID-token rejection /
replay-protection-semantics / kid-pinning / etc.).

Design

Two auth entrypoints feed one shared post-auth pipeline:

    cookie path ─┐
                 ├── forwardAuthorized(rw, req, *principal)
    bearer path ─┘   (roles/groups, header injection, security headers,
                      cookie strip, forward)

buildPrincipalFromSession and buildPrincipalFromBearerToken produce the
same `principal` value type. forwardAuthorized is session-agnostic and
runs the existing post-auth work; processAuthorizedRequest now wraps it
with the session-specific concerns (backchannel-logout, dirty/Save).

The cookie path's behaviour is byte-identical to before this PR; the
existing test suite passes unmodified.

Security hardening baked into the bearer path

- Audience MANDATORY. Startup fails when EnableBearerAuth=true and
  Audience is empty.
- BearerIdentifierClaim defaults to "sub"; "email" is rejected at startup
  to avoid the unverified-email spoofing footgun. Cookie path's
  UserIdentifierClaim is unaffected and still defaults to "email".
- ID tokens explicitly rejected via the existing detectTokenType helper
  (nonce, typ=at+jwt, token_use, scope, aud-vs-clientID heuristics);
  belt-and-braces nonce/token_use=id rejection on top.
- alg pinned to asymmetric allowlist (RS/PS/ES 256/384/512) BEFORE JWKS
  fetch, blocking alg=none and alg=HS* probes from amplifying into
  upstream calls.
- kid length capped at 256 bytes and charset-restricted before JWKS
  fetch, blocking pathological-kid JWKS amplification.
- Multi-audience tokens require azp == clientID.
- iat upper-age bound (MaxTokenAgeSeconds, default 24h) bounds
  clock-manipulation and forever-token abuse.
- Identifier sanitization: length cap, control-char + bidi-override +
  delimiter (, ; =) rejection.
- Per-IP failure throttle: configurable threshold/window/penalty;
  returns 429 + Retry-After. Limits offline-guessing-style attacks and
  protects the shared rate-limiter / JWKS endpoint.
- JTI replay marking suppressed via new internal verifyOpts
  {skipReplayMarking} so the same bearer can be reused until exp; the
  blacklist Get stays active so RevokeToken still terminates a bearer
  token immediately. The existing exported VerifyToken interface is
  unchanged so all mocks continue to work.
- Cookie wins by default when both bearer and cookie are present (safer
  against browser/extension/proxy bearer injection). Operator can flip
  via BearerOverridesCookie.
- Authorization header stripped on forward by default; also stripped on
  excluded URLs so the token can't leak into health/metrics downstream
  logs.
- Optional RFC 7662 introspection via existing
  requireTokenIntrospection. Introspection-endpoint failure returns 503
  (distinguishes infra from token rejection).
- 401s use RFC 6750 WWW-Authenticate hints (toggleable). Failure reason
  is logged at debug; raw tokens are never logged.

Implementation

- principal.go: pure-data principal type and buildPrincipalFromSession.
- bearer_auth.go: alg/kid pin, classifier, identifier sanitization,
  multi-aud azp gate, iat age check, per-IP failure tracker,
  handleBearerRequest, buildPrincipalFromBearerToken.
- token_manager.go: VerifyToken now wraps a new verifyTokenWithOpts that
  accepts internal-only verifyOpts.
  Existing callers, the TokenVerifier interface, and all mocks
  unchanged.
- middleware.go: extracted forwardAuthorized from
  processAuthorizedRequest; wired bearer detection after init wait +
  after bypass; excluded-URL Authorization strip when bearer enabled.
- settings.go: ten new config fields with defaults applied in
  CreateConfig.
- main.go: startup validation for audience + identifier-claim guard;
  bearer failure tracker init.

Tests

- bearer_auth_test.go: table-driven helper tests for every new component
  (parseBearerJOSEHeader, sanitizeBearerIdentifier,
  resolveBearerIdentifier, enforceMultiAudienceAzp, enforceIatAge,
  bearerFailureTracker, detectBearerToken). Integration tests through
  ServeHTTP covering happy path, ID-token rejection, alg=none rejection,
  oversized kid, multi-aud with/without azp, iat-too-old, bidi
  identifier, replay (100x reuse), 429 throttle trip, excluded-URL
  strip, roles gate, cookie-wins precedence, BearerOverridesCookie,
  oversized token, malformed JWT, feature-off pass-through. Startup
  validation for audience-required and email-identifier-rejected.
- All existing tests pass unmodified (cookie-path regression).
- go vet clean. golangci-lint clean (0 issues). Race detector clean on
  bearer tests.

Documentation

- README.md: bearer auth section with security highlights and config
  snippet; doc link in the index.
- .traefik.yml: commented config block exposing every bearer knob.
- docs/CONFIGURATION.md: new subsection with full parameter table.
- docs/BEARER_AUTH.md: threat model, hardening matrix, failure response
  table, operational guidance, known follow-ups.
- docs/superpowers/specs/2026-05-18-bearer-token-auth-design.md: design
  spec + security-review hardening history.

* fix(cache): redact raw cache keys in debug logs (CodeQL go/clear-text-logging)

CodeQL flagged 9 high-severity alerts (go/clear-text-logging) where the
in-memory cache and the hybrid L1+L2 backend printed `key=%s` at debug.
Cache callers (token cache, blacklist, introspection cache) pass raw
access / refresh / id tokens as cache keys, so any debug-enabled
deployment would write them to log streams.

Pre-existing issue. CodeQL started flagging it on this PR because the new
bearer-auth path adds a data-flow source
(req.Header.Get("Authorization")) that reaches the existing logging sinks
via the same cache. The cookie path had the same risk but wasn't tracked
as taint by CodeQL.

Fix: hash the key (SHA-256[:8] hex) before printing. Same approach the
bearer-auth logger uses for principal identifiers (spec §13). Doesn't
change cache semantics — same key still produces the same hash, so debug
correlation across log lines is preserved without exposing the raw value.

Touches both affected packages:
- internal/cache/cache.go (2 sites: Set + LRU eviction)
- internal/cache/backends/hybrid.go (12 sites: L1/L2 read/write/fallback)

New helper `redactKey` colocated with each package (unexported,
package-local) keeps the change blast radius narrow. Tests green; lint
clean.

* docs(bearer): how to obtain bearer tokens from the OIDC provider

Adds a section walking operators through the OAuth 2.0
client_credentials flow (RFC 6749 §4.4) and the JWT bearer assertion
alternative (RFC 7523), with a worked Auth0-shape curl example, a
per-provider quick reference (Auth0, Okta, Keycloak, Entra v2, Cognito,
GitLab, Google), operational notes (token TTL, caching, JWKS rotation,
revocation, scope vs audience, secret hygiene), and a three-line
validation loop.

Most common operator confusion: "I enabled the feature but tokens get
401'd" — almost always missing or wrong audience. The new section makes
the audience-matching requirement loud, with per-provider parameter names
so people don't have to dig through IdP docs.
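
For reviewers who want the audience-matching requirement in concrete
terms, here is a minimal sketch of the comparison, assuming `aud` arrives
either as a single string or as an array of strings (the two shapes the
bearer path distinguishes). `audienceMatches` is a hypothetical helper
written for illustration only; it is not the middleware's own verifier,
and the URLs are placeholders.

```go
package main

import "fmt"

// audienceMatches reports whether the token's aud claim contains the
// audience configured on the middleware. Hypothetical helper for
// illustration; the real check lives in the existing verifier.
func audienceMatches(claims map[string]interface{}, expected string) bool {
	if expected == "" {
		// The bearer path refuses to start without an audience, so an
		// empty expectation never matches in this sketch.
		return false
	}
	switch aud := claims["aud"].(type) {
	case string:
		// Single-string aud: must equal the configured audience.
		return aud == expected
	case []interface{}:
		// Array aud: the configured audience must be one of the entries.
		for _, v := range aud {
			if s, ok := v.(string); ok && s == expected {
				return true
			}
		}
	}
	// Missing aud or an unexpected type.
	return false
}

func main() {
	claims := map[string]interface{}{"aud": "https://api.example.com"}
	fmt.Println(audienceMatches(claims, "https://api.example.com"))
	fmt.Println(audienceMatches(claims, "https://other.example.com"))
}
```

A token minted without the `audience` parameter (or with a different
one) fails this comparison, which is the 401 operators usually hit. On
the bearer path a multi-element `aud` additionally requires
azp == clientID, as listed in the hardening notes above.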
Locations: - docs/BEARER_AUTH.md — full section under "Quick start" - README.md — short snippet + deep link --- .traefik.yml | 22 + README.md | 87 ++ bearer_auth.go | 592 +++++++++++++ bearer_auth_test.go | 812 ++++++++++++++++++ docs/BEARER_AUTH.md | 250 ++++++ docs/CONFIGURATION.md | 20 + .../2026-05-18-bearer-token-auth-design.md | 459 ++++++++++ internal/cache/backends/hybrid.go | 24 +- internal/cache/backends/log_redact.go | 26 + internal/cache/cache.go | 4 +- internal/cache/log_redact.go | 22 + main.go | 99 ++- middleware.go | 193 +++-- principal.go | 58 ++ settings.go | 93 +- token_manager.go | 35 +- types.go | 13 + 17 files changed, 2702 insertions(+), 107 deletions(-) create mode 100644 bearer_auth.go create mode 100644 bearer_auth_test.go create mode 100644 docs/BEARER_AUTH.md create mode 100644 docs/superpowers/specs/2026-05-18-bearer-token-auth-design.md create mode 100644 internal/cache/backends/log_redact.go create mode 100644 internal/cache/log_redact.go create mode 100644 principal.go diff --git a/.traefik.yml b/.traefik.yml index 216e3f6..858eacc 100644 --- a/.traefik.yml +++ b/.traefik.yml @@ -80,3 +80,25 @@ testData: # address: redis:6379 # password: urn:k8s:secret:redis:password # cacheMode: hybrid + + # Optional: bearer-token authentication for M2M (machine-to-machine) API + # clients. Default off. When enabled, requests presenting + # "Authorization: Bearer " are validated against the configured OIDC + # provider (signature/issuer/audience/exp) and forwarded without creating + # a cookie session. The bearer path REJECTS ID tokens, requires a non- + # default audience, and never trusts the `email` claim as the identifier. + # See docs/BEARER_AUTH.md for the full threat model. + # + # enableBearerAuth: true # opt-in + # audience: https://api.example.com # REQUIRED when bearer is enabled + # bearerIdentifierClaim: sub # default; used as X-Forwarded-User. `email` is rejected. 
+ # stripAuthorizationHeader: true # default; drops the raw token before forwarding + # bearerEmitWWWAuthenticate: true # default; RFC 6750 hint on 401s + # bearerOverridesCookie: false # default; cookie wins when both are present + # requireTokenIntrospection: false # opt-in; calls RFC 7662 introspection per request + # maxTokenAgeSeconds: 86400 # 24h cap on iat (rejects clock-skew/forever tokens) + # maxIdentifierLength: 256 # cap on the sanitised principal identifier + # bearerFailureThreshold: 20 # consecutive 401s/IP that trip the throttle + # bearerFailureWindowSeconds: 60 # rolling window over which 401s are counted + # bearerFailurePenaltySeconds: 60 # 429 + Retry-After duration after threshold trips + diff --git a/README.md b/README.md index 61d8c03..d8576fb 100644 --- a/README.md +++ b/README.md @@ -9,6 +9,7 @@ manages sessions, and forwards user identity to downstream services. - [Configuration reference](docs/CONFIGURATION.md) — every parameter - [Provider guide](docs/PROVIDERS.md) — Google, Azure, Auth0, Okta, Keycloak, Cognito, GitLab, GitHub, generic - [Auth0 audience guide](docs/AUTH0_AUDIENCE_GUIDE.md) — custom APIs, opaque tokens, token confusion +- [Bearer-token (M2M) auth](docs/BEARER_AUTH.md) — opt-in `Authorization: Bearer` path, threat model - [Redis cache](docs/REDIS.md) — multi-replica deployments - [Dynamic Client Registration](docs/DCR.md) — RFC 7591 - [Development](docs/DEVELOPMENT.md) · [Testing](docs/TESTING.md) @@ -171,6 +172,92 @@ Each instance must use a unique `cookiePrefix` **and** `sessionEncryptionKey`, otherwise a session minted by one instance can grant access through another. See [issue #87](https://github.com/lukaszraczylo/traefikoidc/issues/87). +### Bearer-token (M2M) authentication + +Opt-in path for API clients that present `Authorization: Bearer ` instead +of logging in via the browser flow. Default off. 
When enabled, the middleware +validates the bearer JWT against the configured OIDC provider (signature, +issuer, audience, expiry) and forwards the request downstream with the +principal headers — no cookie session is created. + +```yaml +enableBearerAuth: true +audience: https://api.example.com # REQUIRED when bearer is enabled +# optional, defaults shown: +bearerIdentifierClaim: sub # claim used as X-Forwarded-User +stripAuthorizationHeader: true # drop the raw token before forwarding +bearerEmitWWWAuthenticate: true # RFC 6750 hint on 401s +bearerOverridesCookie: false # cookie wins when both are present (safer) +maxTokenAgeSeconds: 86400 # 24h cap on iat +bearerFailureThreshold: 20 # consecutive 401s/IP before 429 throttle +``` + +Hardening built in by default: + +- **Audience required.** Startup fails if `enableBearerAuth=true` and + `audience` is unset. Eliminates the "token issued for service B accepted + by A" confusion vector. +- **ID tokens explicitly rejected.** Bearer is access-token-only. ID tokens + (detected via `nonce`, `typ: at+jwt`, `token_use`, `scope`, or audience + shape) return `401`. +- **`alg` and `kid` pinned at the entrypoint.** Asymmetric-only allowlist + (`RS256/384/512`, `PS256/384/512`, `ES256/384/512`); `kid` length and + charset capped — both checked **before** any JWKS fetch so attacker noise + can't amplify into upstream calls. +- **Identifier sanitised.** Default identifier source is `sub`; `email` is + rejected unless explicitly opted in (which the middleware still refuses to + avoid the unverified-email spoofing footgun). Control characters, bidi- + override codepoints, and the delimiters `, ; =` are all rejected before + the value reaches `X-Forwarded-User`. +- **Multi-audience tokens require `azp`.** When `aud` is an array of more + than one element, the token must carry `azp == clientID`. +- **`iat` upper-age bound.** Tokens older than `maxTokenAgeSeconds` are + rejected even if `exp` is far in the future. 
+- **Per-IP 401 throttle.** After `bearerFailureThreshold` consecutive 401s + from one source IP, further bearer requests from that IP are rejected + with `429 Too Many Requests` + `Retry-After`. +- **Cookie-wins by default.** When both a session cookie and an + `Authorization: Bearer` header arrive on the same request, the cookie path + runs (safer against browser/extension/proxy bearer injection). Set + `bearerOverridesCookie: true` for the AWS/GCP/Kubernetes convention. +- **Replay protection preserved.** The bearer path skips the JTI **Set** + (so the same token can be reused) but the **Get** stays active — + `RevokeToken` still terminates a bearer token immediately. +- **Excluded URLs strip Authorization.** When `enableBearerAuth=true`, + excluded paths (e.g. `/health`, `/metrics`) get the `Authorization` header + removed before forwarding so the token can't leak into public endpoint + logs. +- **Optional real-time revocation.** Set `requireTokenIntrospection: true` + to call RFC 7662 introspection on every cache miss; revoked tokens fail + immediately. Introspection endpoint failures return `503` (distinguishes + infra outage from credential rejection). + +**Obtaining bearer tokens** — minting is the IdP's job, not the +middleware's. The canonical M2M flow is OAuth 2.0 `client_credentials` +(RFC 6749 §4.4); Google requires JWT bearer assertion (RFC 7523) instead. +Minimal Auth0-shape request: + +```bash +curl -s -X POST https://issuer.example.com/oauth/token \ + -H 'Content-Type: application/json' \ + -d '{ + "grant_type": "client_credentials", + "client_id": "your-m2m-client-id", + "client_secret": "your-m2m-client-secret", + "audience": "https://api.example.com", + "scope": "api:read api:write" + }' +``` + +The `audience` you request from the IdP **must match** the `audience` you +configured on the middleware. 
Per-provider endpoints, parameter names, and +gotchas (Entra v2 endpoint, Cognito Resource Servers, Keycloak audience +mappers, Google's opaque-token quirk) are documented in +[docs/BEARER_AUTH.md](docs/BEARER_AUTH.md#obtaining-bearer-tokens-from-your-oidc-provider). + +Full threat model, configuration matrix, and follow-up gaps in +[docs/BEARER_AUTH.md](docs/BEARER_AUTH.md). + ### SSE and WebSocket endpoints Browser clients cannot follow an OIDC `302` redirect on an SSE stream or a diff --git a/bearer_auth.go b/bearer_auth.go new file mode 100644 index 0000000..8796ca3 --- /dev/null +++ b/bearer_auth.go @@ -0,0 +1,592 @@ +// Package traefikoidc — bearer-token (M2M) authentication path. +// +// Disabled by default. When enabled via Config.EnableBearerAuth, requests +// presenting "Authorization: Bearer " are validated against the +// configured OIDC provider (signature, issuer, audience, exp, replay-Get) +// and the request is forwarded downstream without creating a cookie session. +// +// Design rules (kept here in code as the single source of truth): +// - Access tokens only. ID tokens are rejected via detectTokenType. +// - Audience is mandatory (enforced at startup in main.go). +// - alg + kid pinned BEFORE JWKS fetch to deny amplification probes. +// - iat upper-age cap bounds clock-skew / forever-token abuse. +// - Multi-audience tokens require matching azp. +// - Per-IP 401 throttle returns 429 + Retry-After after a threshold. +// - JTI Set is suppressed (skipReplayMarking) but JTI Get stays — revoked +// tokens (RevokeToken adds to blacklist) are still rejected. +// - Identifier is read from BearerIdentifierClaim (default "sub"), never +// from UserIdentifierClaim, to avoid the unverified-email spoofing path. +// - Identifier is sanitized: length cap, control chars, bidi-override, +// delimiter chars (, ; =) rejected. +// - On excluded URLs the Authorization header is stripped before forwarding. 
+// +// See docs/superpowers/specs/2026-05-18-bearer-token-auth-design.md and +// docs/BEARER_AUTH.md for the full threat model. +package traefikoidc + +import ( + "crypto/sha256" + "encoding/base64" + "encoding/hex" + "encoding/json" + "fmt" + "net" + "net/http" + "strings" + "sync" + "time" + "unicode" +) + +const bearerPrefix = "Bearer " + +// bearerAlgAllowlist is the set of JWS algorithms accepted on the bearer +// path. Asymmetric-only — HS* would allow public-key-as-HMAC-secret attacks +// if any operator ever rotates a key into the symmetric branch by mistake; +// "none" is obvious. Matches the allowlist enforced inside jwt.Verify but is +// checked here BEFORE the JWKS fetch so attacker noise can't amplify. +var bearerAlgAllowlist = map[string]struct{}{ + "RS256": {}, "RS384": {}, "RS512": {}, + "PS256": {}, "PS384": {}, "PS512": {}, + "ES256": {}, "ES384": {}, "ES512": {}, +} + +// bearerKidMaxLen caps the JOSE kid header length to keep memory and cache-key +// usage bounded against attacker-controlled values. +const bearerKidMaxLen = 256 + +// validKidChar is the allowlist for kid header characters. Letters, digits, +// dot, underscore, hyphen, equals. Intentionally narrow; real-world kid +// values are short URL-safe-base64-ish identifiers. +func validKidChar(r rune) bool { + if r >= 'a' && r <= 'z' { + return true + } + if r >= 'A' && r <= 'Z' { + return true + } + if r >= '0' && r <= '9' { + return true + } + switch r { + case '.', '_', '-', '=': + return true + } + return false +} + +// bearerError categorizes failure modes for the response builder. Categories +// map 1:1 to the table in docs/superpowers/specs/2026-05-18-bearer-token-auth-design.md +// §9 so behavior is auditable from spec to code. 
+type bearerErrorKind int + +const ( + bearerErrInvalidRequest bearerErrorKind = iota + bearerErrInvalidToken + bearerErrTokenInactive + bearerErrInvalidIdentifier + bearerErrForbidden + bearerErrThrottled + bearerErrIntrospectionUnavailable +) + +type bearerError struct { + kind bearerErrorKind + reason string +} + +func (e *bearerError) Error() string { return e.reason } + +func newBearerError(kind bearerErrorKind, reason string) *bearerError { + return &bearerError{kind: kind, reason: reason} +} + +// joseHeader is the minimal subset of the JWS protected header we inspect +// BEFORE running the full verification pipeline. Lifted out so the alg+kid +// pin can run without paying for parseJWT's full claim decode. +type joseHeader struct { + Alg string `json:"alg"` + Kid string `json:"kid"` + Typ string `json:"typ"` +} + +// parseBearerJOSEHeader decodes the first JWT segment for early alg/kid pinning. +// Does not touch the payload or signature — those are the verifier's job. +// Returns nil on success; *bearerError on rejection so the handler can map +// directly to a status code. The decoded header itself is not surfaced because +// callers don't need it (verifyTokenWithOpts re-parses internally). +func parseBearerJOSEHeader(token string) *bearerError { + dot := strings.IndexByte(token, '.') + if dot <= 0 { + return newBearerError(bearerErrInvalidToken, "malformed JWT: no header segment") + } + raw, err := base64.RawURLEncoding.DecodeString(token[:dot]) + if err != nil { + // Some IdPs pad with '='; tolerate by retrying with StdEncoding. 
+ raw, err = base64.URLEncoding.DecodeString(token[:dot]) + if err != nil { + return newBearerError(bearerErrInvalidToken, "malformed JWT: header not base64url") + } + } + var hdr joseHeader + if err := json.Unmarshal(raw, &hdr); err != nil { + return newBearerError(bearerErrInvalidToken, "malformed JWT: header not JSON") + } + if _, ok := bearerAlgAllowlist[hdr.Alg]; !ok { + return newBearerError(bearerErrInvalidToken, fmt.Sprintf("disallowed alg %q on bearer path", hdr.Alg)) + } + if hdr.Kid == "" { + return newBearerError(bearerErrInvalidToken, "missing kid header") + } + if len(hdr.Kid) > bearerKidMaxLen { + return newBearerError(bearerErrInvalidToken, "kid header exceeds max length") + } + for _, r := range hdr.Kid { + if !validKidChar(r) { + return newBearerError(bearerErrInvalidToken, "kid header contains disallowed characters") + } + } + return nil +} + +// sanitizeBearerIdentifier validates and trims a principal identifier before +// it is injected into request headers. Layered defense: net/http will reject +// CRLF on the wire too, but rejecting early gives clearer error logs and +// prevents bidi-override / delimiter chars that pass net/http's narrower +// checks but confuse downstream parsers and admin UIs. +func sanitizeBearerIdentifier(raw string, maxLen int) (string, *bearerError) { + identifier := strings.TrimSpace(raw) + if identifier == "" { + return "", newBearerError(bearerErrInvalidIdentifier, "identifier claim empty") + } + if maxLen > 0 && len(identifier) > maxLen { + return "", newBearerError(bearerErrInvalidIdentifier, "identifier exceeds max length") + } + for _, r := range identifier { + if unicode.IsControl(r) { + return "", newBearerError(bearerErrInvalidIdentifier, "identifier contains control character") + } + // Unicode bidi-override range (RTL spoofing of admin UI / SIEM). 
+ if (r >= 0x202A && r <= 0x202E) || (r >= 0x2066 && r <= 0x2069) { + return "", newBearerError(bearerErrInvalidIdentifier, "identifier contains bidi-override character") + } + if r == ',' || r == ';' || r == '=' { + return "", newBearerError(bearerErrInvalidIdentifier, "identifier contains delimiter character") + } + } + return identifier, nil +} + +// resolveBearerIdentifier picks the principal identifier from claims using +// the configured BearerIdentifierClaim (default "sub"). Decoupled from +// userIdentifierClaim (cookie path) to avoid the unverified-email spoofing +// vector documented in the spec §13. +func resolveBearerIdentifier(claims map[string]interface{}, claimName string) (string, *bearerError) { + if claimName == "" { + claimName = "sub" + } + raw, ok := claims[claimName] + if !ok { + return "", newBearerError(bearerErrInvalidIdentifier, fmt.Sprintf("missing claim %q", claimName)) + } + str, ok := raw.(string) + if !ok { + return "", newBearerError(bearerErrInvalidIdentifier, fmt.Sprintf("claim %q not a string", claimName)) + } + return str, nil +} + +// enforceMultiAudienceAzp implements the spec hardening: when aud is a +// multi-element array, require an azp claim equal to clientID. Single-string +// aud is unaffected (existing verifyAudience handles it). 
+func enforceMultiAudienceAzp(claims map[string]interface{}, clientID string) *bearerError { + audRaw, ok := claims["aud"] + if !ok { + return nil // verifyToken already rejects missing aud + } + arr, ok := audRaw.([]interface{}) + if !ok { + return nil // single-string aud + } + if len(arr) <= 1 { + return nil + } + azpRaw, ok := claims["azp"] + if !ok { + return newBearerError(bearerErrInvalidToken, "multi-audience token missing azp") + } + azp, ok := azpRaw.(string) + if !ok || azp == "" { + return newBearerError(bearerErrInvalidToken, "multi-audience token has empty/non-string azp") + } + if azp != clientID { + return newBearerError(bearerErrInvalidToken, "multi-audience token azp does not match clientID") + } + return nil +} + +// enforceIatAge implements the spec MaxTokenAgeSeconds bound on iat. Bounds +// clock-manipulation / forever-token abuse without rejecting tokens with a +// normal iat just because the issuer's clock skews a few seconds. +func enforceIatAge(claims map[string]interface{}, maxAge time.Duration) *bearerError { + if maxAge <= 0 { + return nil + } + iatRaw, ok := claims["iat"].(float64) + if !ok { + // jwt.Verify already requires iat; this branch shouldn't be reached. + return newBearerError(bearerErrInvalidToken, "missing iat claim") + } + iat := time.Unix(int64(iatRaw), 0) + if time.Since(iat) > maxAge { + return newBearerError(bearerErrInvalidToken, "token iat outside age bound") + } + return nil +} + +// hashIdentifierForLog returns a short SHA-256 prefix safe for info-level +// logs. Full identifier is only emitted at debug. Satisfies the audit +// requirement (trace which principal was rejected) without leaking PII. 
+func hashIdentifierForLog(identifier string) string { + if identifier == "" { + return "(none)" + } + sum := sha256.Sum256([]byte(identifier)) + return hex.EncodeToString(sum[:4]) // 8 hex chars +} + +// --- Per-IP failure throttle --- + +// bearerFailureTracker records consecutive bearer-auth 401s per source IP and +// parks repeat offenders in a 429 penalty box. Limits offline-guessing-style +// attacks and protects the shared rate-limiter / JWKS endpoint from being +// burned by a single source. +type bearerFailureTracker struct { + mu sync.Mutex + entries map[string]*bearerFailureEntry + // Configuration snapshot. Captured at construction so a hot reconfigure + // doesn't race with the per-request paths. + threshold int + window time.Duration + penalty time.Duration +} + +type bearerFailureEntry struct { + firstFailureAt time.Time + penaltyUntil time.Time + count int +} + +func newBearerFailureTracker(threshold int, window, penalty time.Duration) *bearerFailureTracker { + if threshold <= 0 { + threshold = 20 + } + if window <= 0 { + window = 60 * time.Second + } + if penalty <= 0 { + penalty = 60 * time.Second + } + return &bearerFailureTracker{ + entries: make(map[string]*bearerFailureEntry), + threshold: threshold, + window: window, + penalty: penalty, + } +} + +// blocked reports whether the source IP is currently in the penalty box. +// Returns (true, retryAfter) when blocked; (false, 0) when allowed. +func (b *bearerFailureTracker) blocked(ip string) (bool, time.Duration) { + if b == nil || ip == "" { + return false, 0 + } + b.mu.Lock() + defer b.mu.Unlock() + e, ok := b.entries[ip] + if !ok { + return false, 0 + } + now := time.Now() + if !e.penaltyUntil.IsZero() && now.Before(e.penaltyUntil) { + return true, time.Until(e.penaltyUntil) + } + return false, 0 +} + +// recordFailure increments the failure counter for the given IP and trips +// the penalty box once threshold-within-window is exceeded. 
+func (b *bearerFailureTracker) recordFailure(ip string) { + if b == nil || ip == "" { + return + } + b.mu.Lock() + defer b.mu.Unlock() + now := time.Now() + e, ok := b.entries[ip] + if !ok || now.Sub(e.firstFailureAt) > b.window { + e = &bearerFailureEntry{firstFailureAt: now} + b.entries[ip] = e + } + e.count++ + if e.count >= b.threshold { + e.penaltyUntil = now.Add(b.penalty) + } +} + +// recordSuccess clears the failure counter for the given IP after a +// successful bearer auth. +func (b *bearerFailureTracker) recordSuccess(ip string) { + if b == nil || ip == "" { + return + } + b.mu.Lock() + defer b.mu.Unlock() + delete(b.entries, ip) +} + +// clientIPForBearer returns the source IP used to key the failure tracker. +// Trusts only the request's transport-level RemoteAddr; X-Forwarded-For is +// intentionally ignored to avoid attacker-controlled key spoofing. Behind a +// trusted reverse proxy where every request shares one IP, the throttle is +// still useful (caps attacker churn through that proxy) — operators wanting +// per-real-client throttling must terminate at this middleware. +func clientIPForBearer(req *http.Request) string { + if req == nil { + return "" + } + host, _, err := net.SplitHostPort(req.RemoteAddr) + if err != nil { + return req.RemoteAddr + } + return host +} + +// --- Bearer auth entrypoint --- + +// detectBearerToken returns (token, true) when the request carries a usable +// Authorization: Bearer header. Case-insensitive on the scheme. Returns +// ("", false) for any other shape. 
+func detectBearerToken(req *http.Request) (string, bool) { + if req == nil { + return "", false + } + h := req.Header.Get("Authorization") + if len(h) < len(bearerPrefix) { + return "", false + } + if !strings.EqualFold(h[:len(bearerPrefix)], bearerPrefix) { + return "", false + } + token := strings.TrimSpace(h[len(bearerPrefix):]) + if token == "" { + return "", false + } + return token, true +} + +// hasSessionCookie reports whether the request carries any cookie matching +// the session prefix. Used to implement the cookie-wins-by-default +// precedence rule when both bearer and cookie are present. +func (t *TraefikOidc) hasSessionCookie(req *http.Request) bool { + if t.sessionManager == nil { + return false + } + prefix := t.sessionManager.GetCookiePrefix() + if prefix == "" { + return false + } + for _, c := range req.Cookies() { + if strings.HasPrefix(c.Name, prefix) { + return true + } + } + return false +} + +// writeBearerError writes the canonical 401/403/429/503 response per spec §9. +// Body is always generic; reason is logged at debug only. The +// WWW-Authenticate hint is gated by config (default on, RFC 6750 compliant). 
+func (t *TraefikOidc) writeBearerError(rw http.ResponseWriter, req *http.Request, err *bearerError) { + var ( + status int + errCode string + body string + retryAfter time.Duration + ) + switch err.kind { + case bearerErrInvalidRequest: + status = http.StatusUnauthorized + errCode = "invalid_request" + body = "Unauthorized" + case bearerErrInvalidToken, bearerErrTokenInactive, bearerErrInvalidIdentifier: + status = http.StatusUnauthorized + errCode = "invalid_token" + body = "Unauthorized" + case bearerErrForbidden: + status = http.StatusForbidden + body = "Access denied" + case bearerErrThrottled: + status = http.StatusTooManyRequests + body = "Too Many Requests" + retryAfter = t.bearerFailurePenalty + case bearerErrIntrospectionUnavailable: + status = http.StatusServiceUnavailable + body = "Service Unavailable" + default: + status = http.StatusUnauthorized + body = "Unauthorized" + } + + if t.bearerEmitWWWAuthenticate && errCode != "" { + rw.Header().Set("WWW-Authenticate", fmt.Sprintf(`Bearer error=%q`, errCode)) + } + if retryAfter > 0 { + rw.Header().Set("Retry-After", fmt.Sprintf("%d", int(retryAfter.Seconds()))) + } + rw.Header().Set("Content-Type", "text/plain; charset=utf-8") + rw.WriteHeader(status) + _, _ = rw.Write([]byte(body)) // Safe to ignore: best-effort error body write + + if t.logger != nil { + t.logger.Debugf("bearer auth rejected: status=%d category=%v reason=%q path=%s", + status, err.kind, err.reason, req.URL.Path) + } +} + +// handleBearerRequest is the entry point invoked by ServeHTTP when the +// EnableBearerAuth flag is set, the request carries an Authorization: Bearer +// header, and the (configurable) cookie-precedence rule allows the bearer +// path to run. 
+func (t *TraefikOidc) handleBearerRequest(rw http.ResponseWriter, req *http.Request) { + ip := clientIPForBearer(req) + + if blocked, retryAfter := t.bearerFailureTracker.blocked(ip); blocked { + throttled := newBearerError(bearerErrThrottled, "ip in penalty box") + // Preserve the actual retry-after even if it diverged from the + // configured default (clock-skew, partial-window expiry). + if retryAfter > 0 { + rw.Header().Set("Retry-After", fmt.Sprintf("%d", int(retryAfter.Seconds()))) + } + t.writeBearerError(rw, req, throttled) + return + } + + token, ok := detectBearerToken(req) + if !ok { + t.bearerFailureTracker.recordFailure(ip) + t.writeBearerError(rw, req, newBearerError(bearerErrInvalidRequest, "missing or empty bearer token")) + return + } + if len(token) > AccessTokenConfig.MaxLength { + t.bearerFailureTracker.recordFailure(ip) + t.writeBearerError(rw, req, newBearerError(bearerErrInvalidToken, "token exceeds max length")) + return + } + if strings.Count(token, ".") != 2 { + t.bearerFailureTracker.recordFailure(ip) + t.writeBearerError(rw, req, newBearerError(bearerErrInvalidToken, "token is not a 3-segment JWT")) + return + } + + if bErr := parseBearerJOSEHeader(token); bErr != nil { + t.bearerFailureTracker.recordFailure(ip) + t.writeBearerError(rw, req, bErr) + return + } + + p, bErr := t.buildPrincipalFromBearerToken(token) + if bErr != nil { + t.bearerFailureTracker.recordFailure(ip) + t.writeBearerError(rw, req, bErr) + return + } + + t.bearerFailureTracker.recordSuccess(ip) + if t.logger != nil { + t.logger.Debugf("bearer auth success: identifier_hash=%s path=%s", + hashIdentifierForLog(p.Identifier), req.URL.Path) + } + t.forwardAuthorized(rw, req, p) +} + +// buildPrincipalFromBearerToken runs the full bearer verification pipeline +// described in spec §7.3 and returns a principal ready for forwardAuthorized. +// Returns a typed *bearerError on failure so the caller can map to status. 
+func (t *TraefikOidc) buildPrincipalFromBearerToken(token string) (*principal, *bearerError) { + if err := t.verifyTokenWithOpts(token, verifyOpts{skipReplayMarking: true}); err != nil { + return nil, newBearerError(bearerErrInvalidToken, "token verification failed: "+err.Error()) + } + + parsed, err := parseJWT(token) + if err != nil { + return nil, newBearerError(bearerErrInvalidToken, "post-verify parseJWT failed: "+err.Error()) + } + claims := parsed.Claims + + // Token-type guard. Reuse the well-tested classifier which already + // checks nonce / typ=at+jwt / token_use / scope / aud-vs-clientID. + if t.detectTokenType(parsed, token) { + return nil, newBearerError(bearerErrInvalidToken, "ID tokens are not accepted on the bearer path") + } + // Belt-and-braces explicit rejection (cheap, catches edge cases not + // covered by detectTokenType's heuristic). + if nonce, ok := claims["nonce"].(string); ok && nonce != "" { + return nil, newBearerError(bearerErrInvalidToken, "nonce claim present (ID-token shape)") + } + if tu, ok := claims["token_use"].(string); ok && tu == "id" { + return nil, newBearerError(bearerErrInvalidToken, "token_use=id rejected") + } + + if bErr := enforceMultiAudienceAzp(claims, t.clientID); bErr != nil { + return nil, bErr + } + if bErr := enforceIatAge(claims, t.maxTokenAge); bErr != nil { + return nil, bErr + } + + if t.requireTokenIntrospection { + if bErr := t.introspectOnBearerPath(token); bErr != nil { + return nil, bErr + } + } + + rawIdentifier, bErr := resolveBearerIdentifier(claims, t.bearerIdentifierClaim) + if bErr != nil { + return nil, bErr + } + identifier, bErr := sanitizeBearerIdentifier(rawIdentifier, t.maxIdentifierLength) + if bErr != nil { + return nil, bErr + } + + subject, _ := claims["sub"].(string) + clientID, _ := claims["azp"].(string) + if clientID == "" { + clientID, _ = claims["client_id"].(string) + } + + return &principal{ + Source: sourceBearer, + Identifier: identifier, + Subject: subject, + ClientID: 
clientID, + Claims: claims, + AccessToken: token, + }, nil +} + +// introspectOnBearerPath calls the existing RFC 7662 introspector when the +// operator demands real-time revocation. Distinguishes "token revoked" (401) +// from "endpoint unavailable" (503) so transient infra failures don't look +// like credential failures. +func (t *TraefikOidc) introspectOnBearerPath(token string) *bearerError { + resp, err := t.introspectToken(token) + if err != nil { + return newBearerError(bearerErrIntrospectionUnavailable, "introspection failed: "+err.Error()) + } + if !resp.Active { + return newBearerError(bearerErrTokenInactive, "introspection reports token inactive") + } + return nil +} diff --git a/bearer_auth_test.go b/bearer_auth_test.go new file mode 100644 index 0000000..94fcf60 --- /dev/null +++ b/bearer_auth_test.go @@ -0,0 +1,812 @@ +package traefikoidc + +import ( + "context" + "encoding/base64" + "encoding/json" + "fmt" + "net/http" + "net/http/httptest" + "strings" + "sync/atomic" + "testing" + "time" + + "golang.org/x/time/rate" +) + +// ============================================================================= +// Helper builders +// ============================================================================= + +// makeBearerJWT constructs a JWT with explicit header + claims for tests. +// Signature is opaque (b64("signature")) — bearer tests don't exercise the +// real cryptographic verifier; verification is bypassed via tokenCache pre- +// seed so the bearer pipeline under test sees a "verified" token. 
+func makeBearerJWT(t *testing.T, header, claims map[string]interface{}) string { + t.Helper() + hb, err := json.Marshal(header) + if err != nil { + t.Fatalf("marshal header: %v", err) + } + cb, err := json.Marshal(claims) + if err != nil { + t.Fatalf("marshal claims: %v", err) + } + return fmt.Sprintf("%s.%s.%s", + base64.RawURLEncoding.EncodeToString(hb), + base64.RawURLEncoding.EncodeToString(cb), + base64.RawURLEncoding.EncodeToString([]byte("signature")), + ) +} + +// defaultBearerHeader produces the standard RS256+kid header used in tests. +func defaultBearerHeader() map[string]interface{} { + return map[string]interface{}{"alg": "RS256", "kid": "test-kid"} +} + +// defaultBearerClaims produces a baseline access-token claim set. Tests +// shallow-clone and override fields as needed. +func defaultBearerClaims() map[string]interface{} { + return map[string]interface{}{ + "iss": "https://issuer.example.com", + "aud": "https://api.example.com", + "sub": "service-account-1", + "scope": "api:read api:write", + "exp": float64(time.Now().Add(time.Hour).Unix()), + "iat": float64(time.Now().Unix()), + } +} + +// makeBearerOIDC constructs a TraefikOidc wired for bearer auth tests. The +// real verifyTokenWithOpts pipeline is short-circuited via tokenCache pre- +// seed: any token Set into t.tokenCache returns nil from VerifyToken, +// letting tests exercise the post-verify bearer logic (classifier, identifier, +// throttle, header forwarding) without standing up JWKs. 
+func makeBearerOIDC(t *testing.T, next http.Handler) *TraefikOidc { + t.Helper() + sm := createTestSessionManager(t) + oidc := &TraefikOidc{ + next: next, + logger: NewLogger("error"), + initComplete: make(chan struct{}), + sessionManager: sm, + firstRequestReceived: true, + metadataRefreshStarted: true, + issuerURL: "https://issuer.example.com", + audience: "https://api.example.com", + clientID: "https://api.example.com", + tokenCache: NewTokenCache(), + excludedURLs: map[string]struct{}{"/favicon.ico": {}}, + allowedRolesAndGroups: map[string]struct{}{}, + limiter: rate.NewLimiter(rate.Every(time.Second), 1000), + ctx: context.Background(), + enableBearerAuth: true, + stripAuthorizationHeader: true, + bearerEmitWWWAuthenticate: true, + bearerOverridesCookie: false, + bearerIdentifierClaim: "sub", + maxIdentifierLength: 256, + maxTokenAge: 24 * time.Hour, + bearerFailureThreshold: 20, + bearerFailureWindow: 60 * time.Second, + bearerFailurePenalty: 60 * time.Second, + bearerFailureTracker: newBearerFailureTracker(20, 60*time.Second, 60*time.Second), + } + oidc.extractClaimsFunc = extractClaims + close(oidc.initComplete) + return oidc +} + +// seedVerified pre-populates the tokenCache so verifyTokenWithOpts short- +// circuits to nil for the given token. Mirrors the production fast-return +// path at token_manager.go for previously-verified tokens. 
+func seedVerified(t *testing.T, oidc *TraefikOidc, token string, claims map[string]interface{}) { + t.Helper() + if oidc.tokenCache == nil { + oidc.tokenCache = NewTokenCache() + } + oidc.tokenCache.Set(token, claims, time.Hour) +} + +// ============================================================================= +// Unit tests — small helpers +// ============================================================================= + +func TestDetectBearerToken(t *testing.T) { + t.Parallel() + cases := []struct { + name string + header string + want string + ok bool + }{ + {"missing header", "", "", false}, + {"basic auth", "Basic abc", "", false}, + {"bearer with token", "Bearer abc.def.ghi", "abc.def.ghi", true}, + {"lowercase bearer", "bearer abc.def.ghi", "abc.def.ghi", true}, + {"mixed case", "BeArEr abc.def.ghi", "abc.def.ghi", true}, + {"empty token after prefix", "Bearer ", "", false}, + {"bearer no space", "Bearerabc", "", false}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + req := httptest.NewRequest("GET", "/", nil) + if tc.header != "" { + req.Header.Set("Authorization", tc.header) + } + got, ok := detectBearerToken(req) + if ok != tc.ok || got != tc.want { + t.Fatalf("got=(%q, %v), want=(%q, %v)", got, ok, tc.want, tc.ok) + } + }) + } +} + +func TestParseBearerJOSEHeader(t *testing.T) { + t.Parallel() + mk := func(t *testing.T, h map[string]interface{}) string { + return makeBearerJWT(t, h, map[string]interface{}{"sub": "x"}) + } + cases := []struct { + header map[string]interface{} + name string + wantErr bool + }{ + {name: "valid RS256", header: map[string]interface{}{"alg": "RS256", "kid": "k1"}, wantErr: false}, + {name: "valid ES512", header: map[string]interface{}{"alg": "ES512", "kid": "abc-_.="}, wantErr: false}, + {name: "alg=none rejected", header: map[string]interface{}{"alg": "none", "kid": "k1"}, wantErr: true}, + {name: "alg=HS256 rejected", header: map[string]interface{}{"alg": "HS256", "kid": "k1"}, wantErr: true}, 
+ {name: "missing kid", header: map[string]interface{}{"alg": "RS256"}, wantErr: true}, + {name: "kid too long", header: map[string]interface{}{"alg": "RS256", "kid": strings.Repeat("a", bearerKidMaxLen+1)}, wantErr: true}, + {name: "kid bad chars", header: map[string]interface{}{"alg": "RS256", "kid": "evil/../etc/passwd"}, wantErr: true}, + {name: "kid with space", header: map[string]interface{}{"alg": "RS256", "kid": "key one"}, wantErr: true}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + token := mk(t, tc.header) + err := parseBearerJOSEHeader(token) + if (err != nil) != tc.wantErr { + t.Fatalf("err=%v wantErr=%v", err, tc.wantErr) + } + }) + } +} + +func TestSanitiseBearerIdentifier(t *testing.T) { + t.Parallel() + cases := []struct { + name string + in string + want string + wantErr bool + }{ + {"normal sub", "service-account-1", "service-account-1", false}, + {"email-like", "alice@example.com", "alice@example.com", false}, + {"trim whitespace", " abc ", "abc", false}, + {"empty", "", "", true}, + {"only whitespace", " ", "", true}, + {"control char (newline)", "alice\nbob", "", true}, + {"control char (CR)", "alice\rbob", "", true}, + {"control char (NUL)", "alice\x00bob", "", true}, + {"bidi override", "alice\u202ebob", "", true}, + {"bidi isolate", "alice\u2066bob", "", true}, + {"comma delimiter", "alice,bob", "", true}, + {"semicolon delimiter", "alice;bob", "", true}, + {"equals delimiter", "alice=bob", "", true}, + {"over length", strings.Repeat("a", 257), "", true}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got, err := sanitizeBearerIdentifier(tc.in, 256) + if (err != nil) != tc.wantErr { + t.Fatalf("err=%v wantErr=%v", err, tc.wantErr) + } + if !tc.wantErr && got != tc.want { + t.Fatalf("got=%q want=%q", got, tc.want) + } + }) + } +} + +func TestResolveBearerIdentifier(t *testing.T) { + t.Parallel() + cases := []struct { + claims map[string]interface{} + name string + claim string + want 
string + wantErr bool + }{ + {name: "default sub", claims: map[string]interface{}{"sub": "abc"}, claim: "", want: "abc"}, + {name: "explicit sub", claims: map[string]interface{}{"sub": "abc"}, claim: "sub", want: "abc"}, + {name: "custom client_id claim", claims: map[string]interface{}{"client_id": "svc"}, claim: "client_id", want: "svc"}, + {name: "missing claim", claims: map[string]interface{}{"other": "x"}, claim: "sub", wantErr: true}, + {name: "non-string claim", claims: map[string]interface{}{"sub": 123}, claim: "sub", wantErr: true}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got, err := resolveBearerIdentifier(tc.claims, tc.claim) + if (err != nil) != tc.wantErr { + t.Fatalf("err=%v wantErr=%v", err, tc.wantErr) + } + if !tc.wantErr && got != tc.want { + t.Fatalf("got=%q want=%q", got, tc.want) + } + }) + } +} + +func TestEnforceMultiAudienceAzp(t *testing.T) { + t.Parallel() + const cid = "https://api.example.com" + cases := []struct { + claims map[string]interface{} + name string + wantErr bool + }{ + {name: "single string aud", claims: map[string]interface{}{"aud": "x"}, wantErr: false}, + {name: "single element array", claims: map[string]interface{}{"aud": []interface{}{"x"}}, wantErr: false}, + {name: "multi-aud with matching azp", claims: map[string]interface{}{"aud": []interface{}{"a", "b"}, "azp": cid}, wantErr: false}, + {name: "multi-aud missing azp", claims: map[string]interface{}{"aud": []interface{}{"a", "b"}}, wantErr: true}, + {name: "multi-aud empty azp", claims: map[string]interface{}{"aud": []interface{}{"a", "b"}, "azp": ""}, wantErr: true}, + {name: "multi-aud wrong azp", claims: map[string]interface{}{"aud": []interface{}{"a", "b"}, "azp": "other"}, wantErr: true}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + err := enforceMultiAudienceAzp(tc.claims, cid) + if (err != nil) != tc.wantErr { + t.Fatalf("err=%v wantErr=%v", err, tc.wantErr) + } + }) + } +} + +func TestEnforceIatAge(t 
*testing.T) { + t.Parallel() + now := time.Now() + cases := []struct { + name string + iat float64 + maxAge time.Duration + wantErr bool + }{ + {name: "fresh", iat: float64(now.Unix()), maxAge: time.Hour, wantErr: false}, + {name: "23h59m old, max 24h", iat: float64(now.Add(-23*time.Hour - 59*time.Minute).Unix()), maxAge: 24 * time.Hour, wantErr: false}, + {name: "25h old, max 24h", iat: float64(now.Add(-25 * time.Hour).Unix()), maxAge: 24 * time.Hour, wantErr: true}, + {name: "1970 token", iat: float64(0), maxAge: 24 * time.Hour, wantErr: true}, + {name: "maxAge disabled (0)", iat: float64(0), maxAge: 0, wantErr: false}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + err := enforceIatAge(map[string]interface{}{"iat": tc.iat}, tc.maxAge) + if (err != nil) != tc.wantErr { + t.Fatalf("err=%v wantErr=%v", err, tc.wantErr) + } + }) + } +} + +func TestBearerFailureTracker(t *testing.T) { + t.Parallel() + tr := newBearerFailureTracker(3, 60*time.Second, 60*time.Second) + const ip = "10.0.0.1" + // Below threshold: not blocked. + for i := 0; i < 2; i++ { + tr.recordFailure(ip) + if b, _ := tr.blocked(ip); b { + t.Fatalf("blocked too early after %d failures", i+1) + } + } + // Threshold reached: blocked. + tr.recordFailure(ip) + if b, retry := tr.blocked(ip); !b || retry <= 0 { + t.Fatalf("expected blocked with positive retry, got=%v retry=%v", b, retry) + } + // Success clears the counter. + tr.recordSuccess(ip) + if b, _ := tr.blocked(ip); b { + t.Fatalf("expected unblocked after success") + } + // Other IPs are unaffected. 
+ if b, _ := tr.blocked("10.0.0.2"); b { + t.Fatalf("unrelated IP should not be blocked") + } +} + +// ============================================================================= +// Integration tests — full ServeHTTP via the bearer pipeline +// ============================================================================= + +func TestServeHTTP_Bearer_HappyPath(t *testing.T) { + t.Parallel() + var nextCalled atomic.Bool + var capturedHeaders http.Header + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + nextCalled.Store(true) + capturedHeaders = r.Header.Clone() + w.WriteHeader(http.StatusOK) + }) + oidc := makeBearerOIDC(t, next) + claims := defaultBearerClaims() + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if !nextCalled.Load() { + t.Fatalf("expected next handler to run; got status=%d body=%q", rw.Code, rw.Body.String()) + } + if rw.Code != http.StatusOK { + t.Fatalf("status=%d, want 200", rw.Code) + } + if got := capturedHeaders.Get("X-Forwarded-User"); got != "service-account-1" { + t.Fatalf("X-Forwarded-User=%q, want service-account-1", got) + } + if got := capturedHeaders.Get("Authorization"); got != "" { + t.Fatalf("Authorization should be stripped, got=%q", got) + } +} + +func TestServeHTTP_Bearer_StripAuthDisabled(t *testing.T) { + t.Parallel() + var capturedAuth string + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + capturedAuth = r.Header.Get("Authorization") + w.WriteHeader(http.StatusOK) + }) + oidc := makeBearerOIDC(t, next) + oidc.stripAuthorizationHeader = false + claims := defaultBearerClaims() + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", 
"Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if !strings.HasPrefix(capturedAuth, "Bearer ") { + t.Fatalf("expected Authorization to be forwarded, got=%q", capturedAuth) + } +} + +func TestServeHTTP_Bearer_RejectIDToken(t *testing.T) { + t.Parallel() + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + t.Fatalf("next must not run for ID token rejection") + }) + oidc := makeBearerOIDC(t, next) + // ID-token shape: nonce claim present and no scope. detectTokenType + // returns true. + claims := map[string]interface{}{ + "iss": "https://issuer.example.com", + "aud": "https://api.example.com", + "sub": "user-1", + "nonce": "n-0S6_WzA2Mj", + "exp": float64(time.Now().Add(time.Hour).Unix()), + "iat": float64(time.Now().Unix()), + } + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if rw.Code != http.StatusUnauthorized { + t.Fatalf("status=%d, want 401", rw.Code) + } + if wa := rw.Header().Get("WWW-Authenticate"); !strings.Contains(wa, `error="invalid_token"`) { + t.Fatalf("expected WWW-Authenticate invalid_token, got=%q", wa) + } +} + +func TestServeHTTP_Bearer_AlgNoneRejected(t *testing.T) { + t.Parallel() + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + t.Fatalf("next must not run for alg=none") + }) + oidc := makeBearerOIDC(t, next) + header := map[string]interface{}{"alg": "none", "kid": "k1"} + claims := defaultBearerClaims() + token := makeBearerJWT(t, header, claims) + // Even if we pre-seeded the cache, the early alg pin runs FIRST. 
+ seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if rw.Code != http.StatusUnauthorized { + t.Fatalf("status=%d, want 401", rw.Code) + } +} + +func TestServeHTTP_Bearer_KidTooLongRejected(t *testing.T) { + t.Parallel() + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + t.Fatalf("next must not run for oversized kid") + }) + oidc := makeBearerOIDC(t, next) + header := map[string]interface{}{"alg": "RS256", "kid": strings.Repeat("a", bearerKidMaxLen+1)} + claims := defaultBearerClaims() + token := makeBearerJWT(t, header, claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if rw.Code != http.StatusUnauthorized { + t.Fatalf("status=%d, want 401", rw.Code) + } +} + +func TestServeHTTP_Bearer_MultiAudRequiresAzp(t *testing.T) { + t.Parallel() + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + t.Fatalf("next must not run for multi-aud without azp") + }) + oidc := makeBearerOIDC(t, next) + claims := defaultBearerClaims() + claims["aud"] = []interface{}{"https://api.example.com", "https://other.example.com"} + delete(claims, "azp") + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if rw.Code != http.StatusUnauthorized { + t.Fatalf("status=%d, want 401", rw.Code) + } +} + +func TestServeHTTP_Bearer_MultiAudWithAzpAccepted(t *testing.T) { + t.Parallel() + var nextCalled atomic.Bool + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + nextCalled.Store(true) + 
w.WriteHeader(http.StatusOK) + }) + oidc := makeBearerOIDC(t, next) + claims := defaultBearerClaims() + claims["aud"] = []interface{}{"https://api.example.com", "https://other.example.com"} + claims["azp"] = oidc.clientID + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if rw.Code != http.StatusOK || !nextCalled.Load() { + t.Fatalf("expected 200 + next called; got status=%d called=%v", rw.Code, nextCalled.Load()) + } +} + +func TestServeHTTP_Bearer_IatTooOldRejected(t *testing.T) { + t.Parallel() + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + t.Fatalf("next must not run for old iat") + }) + oidc := makeBearerOIDC(t, next) + claims := defaultBearerClaims() + claims["iat"] = float64(time.Now().Add(-25 * time.Hour).Unix()) + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if rw.Code != http.StatusUnauthorized { + t.Fatalf("status=%d, want 401", rw.Code) + } +} + +func TestServeHTTP_Bearer_IdentifierWithBidiRejected(t *testing.T) { + t.Parallel() + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + t.Fatalf("next must not run for bidi identifier") + }) + oidc := makeBearerOIDC(t, next) + claims := defaultBearerClaims() + claims["sub"] = "alice\u202ebob" + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if rw.Code != http.StatusUnauthorized { + t.Fatalf("status=%d, want 401", 
rw.Code) + } +} + +func TestServeHTTP_Bearer_ReplayRegression(t *testing.T) { + t.Parallel() + var successCount atomic.Int32 + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + successCount.Add(1) + w.WriteHeader(http.StatusOK) + }) + oidc := makeBearerOIDC(t, next) + claims := defaultBearerClaims() + claims["jti"] = "regression-jti" + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + for i := 0; i < 100; i++ { + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + if rw.Code != http.StatusOK { + t.Fatalf("iteration %d: status=%d, want 200", i, rw.Code) + } + } + if successCount.Load() != 100 { + t.Fatalf("successCount=%d, want 100", successCount.Load()) + } +} + +func TestServeHTTP_Bearer_ThrottleTrips429(t *testing.T) { + t.Parallel() + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + t.Fatalf("next must not run during throttle test") + }) + oidc := makeBearerOIDC(t, next) + oidc.bearerFailureTracker = newBearerFailureTracker(3, 60*time.Second, 60*time.Second) + + // Send malformed bearers from the same RemoteAddr until threshold trips. + send := func() *httptest.ResponseRecorder { + req := httptest.NewRequest("GET", "/api/work", nil) + req.RemoteAddr = "10.0.0.5:1234" + req.Header.Set("Authorization", "Bearer not-a-jwt") + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + return rw + } + for i := 0; i < 3; i++ { + rw := send() + if rw.Code != http.StatusUnauthorized { + t.Fatalf("pre-throttle iteration %d: status=%d, want 401", i, rw.Code) + } + } + // 4th request: throttled. 
+ rw := send() + if rw.Code != http.StatusTooManyRequests { + t.Fatalf("expected 429 after threshold, got %d", rw.Code) + } + if ra := rw.Header().Get("Retry-After"); ra == "" { + t.Fatalf("expected Retry-After header on 429") + } +} + +func TestServeHTTP_Bearer_ExcludedURLStripsAuth(t *testing.T) { + t.Parallel() + var capturedAuth string + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + capturedAuth = r.Header.Get("Authorization") + w.WriteHeader(http.StatusOK) + }) + oidc := makeBearerOIDC(t, next) + oidc.excludedURLs = map[string]struct{}{"/favicon.ico": {}} + + req := httptest.NewRequest("GET", "/favicon.ico", nil) + req.Header.Set("Authorization", "Bearer abc.def.ghi") + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if rw.Code != http.StatusOK { + t.Fatalf("excluded path should pass; got %d", rw.Code) + } + if capturedAuth != "" { + t.Fatalf("Authorization must be stripped on excluded paths, got=%q", capturedAuth) + } +} + +func TestServeHTTP_Bearer_RolesGate(t *testing.T) { + t.Parallel() + cases := []struct { + name string + rolesClaim []interface{} + want int + }{ + {name: "matching role", rolesClaim: []interface{}{"admin"}, want: http.StatusOK}, + {name: "no matching role", rolesClaim: []interface{}{"viewer"}, want: http.StatusForbidden}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.WriteHeader(http.StatusOK) + }) + oidc := makeBearerOIDC(t, next) + oidc.allowedRolesAndGroups = map[string]struct{}{"admin": {}} + oidc.roleClaimName = "roles" + claims := defaultBearerClaims() + claims["roles"] = tc.rolesClaim + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + if rw.Code != tc.want { + t.Fatalf("status=%d, 
want %d", rw.Code, tc.want) + } + }) + } +} + +func TestServeHTTP_Bearer_CookieWinsByDefault(t *testing.T) { + t.Parallel() + // Both cookie and bearer present: cookie path runs (which will redirect + // to /authorize since the cookie is empty/unauthenticated). + var nextCalled atomic.Bool + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + nextCalled.Store(true) + w.WriteHeader(http.StatusOK) + }) + oidc := makeBearerOIDC(t, next) + claims := defaultBearerClaims() + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + prefix := oidc.sessionManager.GetCookiePrefix() + req.AddCookie(&http.Cookie{Name: prefix + "main", Value: "irrelevant"}) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + // Cookie path consumed the request; bearer was ignored. Since the + // cookie is empty, the cookie path will either 302 to /authorize or + // return 401 — in either case, next must NOT be called. 
+ if nextCalled.Load() { + t.Fatalf("next must not be called when bearer is ignored due to cookie precedence") + } +} + +func TestServeHTTP_Bearer_BearerOverridesCookie(t *testing.T) { + t.Parallel() + var nextCalled atomic.Bool + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + nextCalled.Store(true) + w.WriteHeader(http.StatusOK) + }) + oidc := makeBearerOIDC(t, next) + oidc.bearerOverridesCookie = true + claims := defaultBearerClaims() + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + prefix := oidc.sessionManager.GetCookiePrefix() + req.AddCookie(&http.Cookie{Name: prefix + "main", Value: "irrelevant"}) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + + if !nextCalled.Load() || rw.Code != http.StatusOK { + t.Fatalf("expected bearer to win with override; status=%d called=%v", rw.Code, nextCalled.Load()) + } +} + +func TestServeHTTP_Bearer_OversizedToken(t *testing.T) { + t.Parallel() + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + t.Fatalf("next must not run for oversized token") + }) + oidc := makeBearerOIDC(t, next) + huge := strings.Repeat("a", AccessTokenConfig.MaxLength+1) + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+huge) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + if rw.Code != http.StatusUnauthorized { + t.Fatalf("status=%d, want 401", rw.Code) + } +} + +func TestServeHTTP_Bearer_MalformedJWT(t *testing.T) { + t.Parallel() + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + t.Fatalf("next must not run for malformed JWT") + }) + oidc := makeBearerOIDC(t, next) + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer not.jwt") // 1 dot + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + if rw.Code != 
http.StatusUnauthorized { + t.Fatalf("status=%d, want 401", rw.Code) + } +} + +func TestServeHTTP_Bearer_FeatureOffPassesThrough(t *testing.T) { + t.Parallel() + next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + // Should not be reached: cookie path runs and (with no session) + // will redirect or 401. We assert no panic / next not called. + t.Fatalf("next must not run when bearer is off and no valid session exists") + }) + oidc := makeBearerOIDC(t, next) + oidc.enableBearerAuth = false + claims := defaultBearerClaims() + token := makeBearerJWT(t, defaultBearerHeader(), claims) + seedVerified(t, oidc, token, claims) + req := httptest.NewRequest("GET", "/api/work", nil) + req.Header.Set("Authorization", "Bearer "+token) + rw := httptest.NewRecorder() + oidc.ServeHTTP(rw, req) + // Expect non-200: either 302 to /authorize or 401. The point is the + // bearer pipeline didn't run. + if rw.Code == http.StatusOK { + t.Fatalf("expected non-200 when bearer is off; got %d", rw.Code) + } +} + +// ============================================================================= +// Startup validation tests +// ============================================================================= + +func TestStartupValidation_BearerRequiresAudience(t *testing.T) { + t.Parallel() + cfg := CreateConfig() + cfg.ProviderURL = "https://issuer.example.com" + cfg.ClientID = "id" + cfg.ClientSecret = "secret" + cfg.CallbackURL = "/oauth/callback" + cfg.SessionEncryptionKey = "0123456789abcdef0123456789abcdef0123456789abcdef" + cfg.EnableBearerAuth = true + cfg.Audience = "" + _, err := New(context.Background(), http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {}), cfg, "bearer-test") + if err == nil || !strings.Contains(err.Error(), "requires Audience") { + t.Fatalf("expected audience-required error, got %v", err) + } +} + +func TestStartupValidation_BearerRejectsEmailIdentifier(t *testing.T) { + t.Parallel() + cfg := CreateConfig() + cfg.ProviderURL = 
"https://issuer.example.com" + cfg.ClientID = "id" + cfg.ClientSecret = "secret" + cfg.CallbackURL = "/oauth/callback" + cfg.SessionEncryptionKey = "0123456789abcdef0123456789abcdef0123456789abcdef" + cfg.EnableBearerAuth = true + cfg.Audience = "https://api.example.com" + cfg.BearerIdentifierClaim = "email" + _, err := New(context.Background(), http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {}), cfg, "bearer-test") + if err == nil || !strings.Contains(err.Error(), "bearerIdentifierClaim=\"email\"") { + t.Fatalf("expected email-identifier rejection, got %v", err) + } +} + +// ============================================================================= +// Principal invariants +// ============================================================================= + +func TestBuildPrincipalFromSession_NoIdentifier(t *testing.T) { + t.Parallel() + oidc := &TraefikOidc{logger: NewLogger("error")} + if p := oidc.buildPrincipalFromSession(nil); p != nil { + t.Fatalf("nil session must produce nil principal") + } +} diff --git a/docs/BEARER_AUTH.md b/docs/BEARER_AUTH.md new file mode 100644 index 0000000..1d85c2c --- /dev/null +++ b/docs/BEARER_AUTH.md @@ -0,0 +1,250 @@ +# Bearer Token (M2M) Authentication + +Opt-in path that lets API clients present `Authorization: Bearer <token>` to +authenticate without going through the cookie-based OIDC redirect flow. +Designed for machine-to-machine (M2M) traffic — services calling other +services with tokens minted by your OIDC provider. + +The bearer path lives next to the cookie path: both go through the same +post-auth pipeline (`forwardAuthorized`) that injects identity headers, +checks `allowedRolesAndGroups`, applies security headers, and forwards to +the backend. The only thing that differs is how the principal is established +for that single request. 
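The "two entrypoints, one pipeline" shape described above can be illustrated with a deliberately simplified sketch. This is not the plugin's code: the `principal` struct here has only two fields, `forwardAuthorized` takes the next handler as a parameter, and `backendView` is a hypothetical helper added only to observe what the backend receives. The real pipeline also runs the roles/groups check and applies security headers, which are omitted here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// principal is a simplified, hypothetical stand-in for the plugin's
// principal type; the real one carries more fields (claims, tokens, ...).
type principal struct {
	Source     string // "cookie" or "bearer"
	Identifier string
}

// forwardAuthorized is the shared post-auth step in this sketch: it injects
// the identity header, strips the inbound credential, and calls the backend.
// It does not care which path produced the principal.
func forwardAuthorized(next http.Handler, rw http.ResponseWriter, req *http.Request, p *principal) {
	req.Header.Set("X-Forwarded-User", p.Identifier)
	req.Header.Del("Authorization")
	next.ServeHTTP(rw, req)
}

// backendView (hypothetical helper) sends one request through the shared
// step and reports the headers the backend handler observed.
func backendView(p *principal) (user, auth string) {
	next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		user = r.Header.Get("X-Forwarded-User")
		auth = r.Header.Get("Authorization")
	})
	req := httptest.NewRequest(http.MethodGet, "/api/work", nil)
	req.Header.Set("Authorization", "Bearer example-token")
	forwardAuthorized(next, httptest.NewRecorder(), req, p)
	return user, auth
}

func main() {
	principals := []*principal{
		{Source: "cookie", Identifier: "alice@example.com"},
		{Source: "bearer", Identifier: "service-account-1"},
	}
	for _, p := range principals {
		user, auth := backendView(p)
		fmt.Printf("%s: X-Forwarded-User=%q Authorization=%q\n", p.Source, user, auth)
	}
}
```

Both principals take the same route to the backend; only the code that builds the `principal` value differs between the cookie and bearer paths.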
+ +## Quick start + +```yaml +enableBearerAuth: true +audience: https://api.example.com # REQUIRED when bearer is enabled +clientID: my-api-client-id +providerURL: https://issuer.example.com +sessionEncryptionKey: <32+-byte secret> +callbackURL: /oauth2/callback +``` + +That is the minimum. Everything else has a secure default. + +## Obtaining bearer tokens from your OIDC provider + +The middleware only **validates** bearer tokens — minting them is the IdP's job. For M2M traffic the canonical mint flow is OAuth 2.0 **`client_credentials`** (RFC 6749 §4.4); some providers require **JWT bearer assertion** (RFC 7523) instead. + +``` +┌────────────┐ POST /token ┌──────────┐ +│ client │ ───────────────────────────────►│ IdP │ +│ (service) │ grant_type=client_credentials │ /token │ +│ │ client_id=… │ │ +│ │ client_secret=… (or JWT) │ │ +│ │ audience=https://api.… ←── critical │ +│ │ scope=api:read … │ +│ │ ◄───────────────────────────────│ │ +│ │ access_token (JWT) │ │ +└────────────┘ └──────────┘ + │ + │ GET /protected + │ Authorization: Bearer + ▼ + Your service (behind Traefik + this plugin) +``` + +The IdP returns a JWT signed by the same JWKs the middleware already trusts (it discovers them from `providerURL`/.well-known). On the first protected request, the middleware verifies signature + issuer + **audience** + `exp` + identifier claim, then forwards downstream with `X-Forwarded-User` set. + +### Minimal worked example (Auth0-shape) + +```bash +# 1. Mint a token +curl -s -X POST https://issuer.example.com/oauth/token \ + -H 'Content-Type: application/json' \ + -d '{ + "grant_type": "client_credentials", + "client_id": "your-m2m-client-id", + "client_secret": "your-m2m-client-secret", + "audience": "https://api.example.com", + "scope": "api:read api:write" + }' +# → {"access_token":"eyJhbGciOiJSUzI1NiIs…","token_type":"Bearer","expires_in":86400,…} + +# 2. 
Use it +curl -H 'Authorization: Bearer eyJhbGciOiJSUzI1NiIs…' https://api.example.com/protected +``` + +The `audience` field in the token request **must match** the `audience` you configured on the middleware. Mismatch → 401 with `Bearer error="invalid_token"`. + +### Per-provider quick reference + +| Provider | Grant | Token endpoint | Audience parameter | Notes | +|---|---|---|---|---| +| **Auth0** | `client_credentials` | `https://TENANT.auth0.com/oauth/token` | `audience=` | Register an "API" + "Machine to Machine Application" authorised against that API. Without `audience` you get an opaque /userinfo token, which the bearer path rejects. See `docs/AUTH0_AUDIENCE_GUIDE.md`. | +| **Okta** | `client_credentials` | `https://TENANT.okta.com/oauth2/default/v1/token` | Configured in the authorization server; default `aud` is the auth-server URL | Service app must enable the `client_credentials` flow and be granted the requested scopes. | +| **Keycloak** | `client_credentials` | `https://kc/realms/REALM/protocol/openid-connect/token` | Configure an "Audience" mapper on a client scope, or use `client_id` as the audience | Client must have `serviceAccountsEnabled: true` plus role mappings. | +| **Entra ID / Azure AD** | `client_credentials` (v2.0 endpoint) | `https://login.microsoftonline.com/TENANT/oauth2/v2.0/token` | Pass `scope=/.default`; `aud` ends up being the API's App ID URI | Requires an App Registration + API permissions + admin consent. **Use the v2.0 endpoint** — v1 issues Microsoft-proprietary access tokens that are opaque to non-Microsoft clients. | +| **AWS Cognito** | `client_credentials` | `https://YOUR_DOMAIN.auth.REGION.amazoncognito.com/oauth2/token` | Scopes from a "Resource Server" attached to your User Pool | App client must have `client_credentials` flow enabled. Use HTTP **Basic** auth header for `client_id:client_secret`. 
| +| **GitLab** | `client_credentials` | `https://gitlab.com/oauth/token` | Audience matches the GitLab issuer | Rarely used for protecting external APIs; better suited for GitLab's own resources. | +| **Google** | **JWT bearer (RFC 7523)** — *not* `client_credentials` | `https://oauth2.googleapis.com/token` | Signed assertion JWT carries `aud=https://oauth2.googleapis.com/token`; resulting access token is **opaque** unless you specifically request a Google-issued JWT for your API | Google service-account flow is not the best fit for this middleware (opaque tokens are rejected on the bearer path). Run Auth0 / Okta / Keycloak in front, or use ID-token-based flows on the cookie path. | + +### RFC 7523 (JWT bearer assertion) — secretless alternative + +When shared secrets are forbidden (FAPI, internal compliance), swap `client_secret` for a signed JWT assertion: + +``` +POST /token +grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer +assertion= +``` + +The assertion JWT carries `iss=`, `sub=`, `aud=`, `exp`. The IdP verifies the signature against a public key you've pre-registered and returns an access token. + +This middleware already supports JWT assertions on the *middleware → IdP* hop via `clientAuthMethod: private_key_jwt` (see `docs/CONFIGURATION.md`). For the *client → IdP* hop, the same pattern applies — the client signs its own assertion. + +### Operational notes + +- **Token TTL is typically 1–24 hours.** Clients should refresh on `401`, not on a polling timer — saves the IdP. +- **Cache and reuse tokens.** The middleware caches verified tokens too, so repeated presentations are cheap. Clients SHOULD reuse a token until ~80 % of `expires_in`. +- **JWKS rotation is transparent.** The middleware auto-refreshes its JWKS cache when the IdP rotates keys. Clients don't need to do anything. +- **Revocation is generally not per-token** with `client_credentials`. 
If you need real-time revocation, set `requireTokenIntrospection: true` on the middleware and the IdP is consulted on every cache miss. +- **`scope` vs `audience`.** Scope says *what the client may do*; audience says *which service the token is for*. The middleware enforces audience; the backend service should enforce scope. +- **Secret hygiene.** Store `client_secret` in a secrets manager (Vault, AWS Secrets Manager, Kubernetes `Secret`). For higher assurance, switch the client to `private_key_jwt` (no shared secret at all). + +### Quickest validation loop + +```bash +# 1. Mint +TOKEN=$(curl -s -X POST https://issuer.example.com/oauth/token \ + -H 'Content-Type: application/json' \ + -d '{"grant_type":"client_credentials","client_id":"…","client_secret":"…","audience":"https://api.example.com"}' \ + | jq -r .access_token) + +# 2. Inspect claims to confirm aud/iss/exp match the middleware config (the payload is base64url, so map it to the standard alphabet before decoding) +echo "$TOKEN" | cut -d. -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | jq + +# 3. Hit the protected route +curl -i -H "Authorization: Bearer $TOKEN" https://api.example.com/protected +``` + +`HTTP/1.1 200` with `X-Forwarded-User` on the backend confirms the loop works end-to-end. `401` with `WWW-Authenticate: Bearer error="invalid_token"` plus a middleware debug log explaining the rejection (audience mismatch, ID token presented, `iat` outside the 24h window, etc.) confirms the hardening is firing as designed. + +## Threat model and design rules + +Bearer authentication has materially different security properties from +cookie sessions: no `HttpOnly`/`Secure`/`SameSite` shielding, the token is +visible in headers and logs, and it's easier to exfiltrate. The bearer path +treats every one of these as a first-class concern. + +| Property | Behaviour | Why | +|---|---|---| +| Default state | `enableBearerAuth=false` | Bearer is opt-in; existing deployments observe no change. | +| Audience | **Mandatory.** Startup fails if `audience` is empty when bearer is enabled. 
| Eliminates the "token issued for service B accepted by service A" confusion attack. | +| Token format | JWT only (3 segments, JOSE-encoded). Opaque tokens are not accepted on the bearer path. | Matches the validation pipeline; opaque tokens require introspection only and bypass JWT-specific defences. | +| `alg` allowlist | Hard-pinned asymmetric: `RS256/384/512`, `PS256/384/512`, `ES256/384/512`. Checked **before** any JWKS fetch. | Denies `alg=none` and `alg=HS*` probes; prevents attacker noise from amplifying into JWKS round-trips. | +| `kid` hardening | Max 256 bytes; charset `[A-Za-z0-9._\-=]`. Checked **before** JWKS fetch. | Prevents cache-key explosion / pathological-`kid` JWKS amplification. | +| Token type | ID tokens are explicitly rejected (`nonce` claim, `typ: at+jwt`, `token_use=id`, scope/aud heuristics — reuses the existing `detectTokenType` helper). | ID tokens are not API credentials; treating them as such is classic token confusion. | +| Multi-audience | When `aud` is an array of length > 1, the token must carry `azp == clientID`. | OIDC §2 hardening against tokens minted for one client being replayed by another. | +| `iat` upper-age | Rejects tokens older than `maxTokenAgeSeconds` (default 24h). | Bounds clock-manipulation / forever-token abuse, even if `exp` is far in the future. | +| Identifier claim | `bearerIdentifierClaim` (default `"sub"`). Resolved value drives `X-Forwarded-User`. | Decoupled from the cookie path's `UserIdentifierClaim` (default `email`) so the M2M flow can never accidentally trust an unverified email. | +| Identifier sanitisation | Length cap (`maxIdentifierLength`, default 256). Rejects control chars, Unicode bidi-overrides (U+202A–U+202E, U+2066–U+2069), and the delimiters `, ; =`. | Defence in depth against downstream header injection / log injection / admin-UI spoofing. | +| JTI replay marking | Bearer path skips the JTI **Set** (so the same token can be reused until `exp`) but the **Get** stays active. 
| Allows legitimate bearer reuse without false-positive replay detection; revoked tokens (added to the blacklist by `RevokeToken`) still fail immediately. | +| Mixed bearer + cookie | **Cookie wins by default.** Flip to bearer-wins with `bearerOverridesCookie=true`. | Safer against browser/extension/proxy bearer injection scenarios. The cookie is the authoritative authenticator when present. | +| `Authorization` strip | `stripAuthorizationHeader=true` by default. | Keeps the raw token out of downstream services and their logs. | +| Excluded URLs | `Authorization` is stripped on excluded paths when `enableBearerAuth=true`. | Prevents bearer leakage into public health/metrics endpoint logs and prevents recon via excluded paths. | +| Per-IP throttle | After `bearerFailureThreshold` consecutive 401s from one source IP within `bearerFailureWindowSeconds`, further bearer requests from that IP return `429 Too Many Requests` + `Retry-After` for `bearerFailurePenaltySeconds`. | Limits offline-guessing-style attacks and protects the shared rate-limiter / JWKS endpoint. | +| Optional introspection | `requireTokenIntrospection=true` calls RFC 7662 introspection on every cache miss. Introspection result is cached briefly. Endpoint failure returns `503` (distinguishes infra outage from credential rejection). | Real-time revocation for high-assurance environments. Adds per-request IdP latency. | +| Response shape | `401 Unauthorized` with generic body. `WWW-Authenticate: Bearer error="invalid_token"` per RFC 6750 §3 (toggleable via `bearerEmitWWWAuthenticate`). `403` for roles/groups denial. `429` for throttle. `503` for introspection-endpoint outage. | Auditable from spec to code; reason categories never leak into the response body. | +| Logging | Failure reason + identifier hash (SHA-256 truncated to 8 hex chars) logged at debug. Raw tokens are never logged. | Audit trail without secrets-in-logs. 
| + +## Configuration reference + +| Field | Default | Description | +|---|---|---| +| `enableBearerAuth` | `false` | Master switch for the bearer path. | +| `audience` | (unset) | **Required** when `enableBearerAuth=true`. Reuses the existing global `audience` field. | +| `bearerIdentifierClaim` | `"sub"` | JWT claim used as the principal identifier. `"email"` is rejected at startup. | +| `stripAuthorizationHeader` | `true` | Remove the `Authorization` header before forwarding to the backend. Disable only when a downstream needs to re-verify the bearer. | +| `bearerEmitWWWAuthenticate` | `true` | Include `WWW-Authenticate: Bearer error="..."` on 401 responses (RFC 6750 §3). Disable to reduce recon signal. | +| `bearerOverridesCookie` | `false` | Cookie wins when both are present (default). Set `true` for the AWS/GCP/Kubernetes bearer-wins convention. | +| `maxTokenAgeSeconds` | `86400` | Upper bound on `iat` claim age (24h). Set `0` to disable the check (not recommended). | +| `maxIdentifierLength` | `256` | Length cap for the post-sanitisation identifier. | +| `bearerFailureThreshold` | `20` | Consecutive 401s from one IP that trip the throttle. | +| `bearerFailureWindowSeconds` | `60` | Rolling window over which 401s are counted. | +| `bearerFailurePenaltySeconds` | `60` | Duration of the 429 penalty box after the threshold trips. | +| `requireTokenIntrospection` | `false` | Call RFC 7662 introspection on every cache miss. Adds per-request IdP latency. | + +## What the bearer path does NOT do + +- **Human-user / browser flows.** The bearer path is M2M-only in this + iteration. Browser SPAs that want to attach a bearer to fetch calls work + if your backend treats them as machine clients, but the spec defaults are + tuned for service-to-service traffic. +- **Opaque access tokens.** Tokens must be JWTs. Introspection is a + revocation overlay on top of JWT verification, not a substitute for it. 
+- **`email_verified` enforcement.** The bearer path rejects `email` as the + identifier claim at startup precisely because `email_verified` is not + enforced in this iteration. Adding human-user bearer support is a + follow-up that must include this check. +- **mTLS / API keys.** Out of scope. The `principal` abstraction enables + adding these later as additional auth methods that produce a principal + for the shared `forwardAuthorized` pipeline. +- **SSE / WebSocket bypass with bearer.** Bypass paths keep their existing + cookie-only behaviour; bearer headers are ignored on those endpoints. + Documented limitation; widen by removing the bypass if you need bearer on + streaming endpoints. + +## Operational guidance + +- **Always set `strictAudienceValidation: true` when bearer is enabled.** + Startup logs a recommendation if you don't. +- **Set a tight `maxTokenAgeSeconds`** for environments where tokens are + expected to be minted frequently — the default 24h is conservative. +- **Enable `requireTokenIntrospection`** if your IdP supports it and + revocation latency matters. Bearer-path introspection caches results for + a short window per token. +- **Monitor 429s.** Sustained 429 traffic indicates either a buggy client + loop or an active credential-stuffing attempt. The throttle is your + primary signal for both. +- **`stripAuthorizationHeader=false` extends the token's blast radius** to + every downstream service that sees the request. Treat those services' + logs as token stores. +- **Bearer reuse is normal.** Don't enable per-token rate limiting; that's + what `bearerFailureThreshold` is for (per-IP, not per-token). +- **Cookie-wins is the safer default.** Only flip `bearerOverridesCookie` + if you control all clients and have audited that none of them present a + cookie alongside a bearer they don't intend to authenticate with. 
+ +## Failure response matrix + +| Trigger | Status | Body | `WWW-Authenticate` | +|---|---|---|---| +| Empty bearer after prefix | 401 | `Unauthorized` | `Bearer error="invalid_request"` | +| Token over `MaxLength` | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| Not a 3-segment JWT | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| Disallowed `alg` (e.g. none, HS*) | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| Missing / oversized / bad-charset `kid` | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| Signature / issuer / audience / `exp` failure | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| `iat` older than `maxTokenAgeSeconds` | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| Multi-audience token without matching `azp` | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| Detected as ID token | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| JTI blacklisted (revoked) | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| Introspection reports `active=false` | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| Introspection endpoint failure | 503 | `Service Unavailable` | (none) | +| Identifier claim missing / empty | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| Identifier fails sanitisation | 401 | `Unauthorized` | `Bearer error="invalid_token"` | +| Per-IP failure threshold tripped | 429 | `Too Many Requests` | (none); `Retry-After: ` | +| Roles / groups not allowed | 403 | `Access denied` | (none) | + +## Known follow-ups (deferred) + +These are documented as future work, not blockers: + +- **Human-user bearer with `email_verified` enforcement.** Requires + decoupling the email-claim guard from the startup rejection and adding a + per-request `email_verified=true` check. 
+- **Introspection respects `client_assertion`.** The existing introspection + helper uses `client_secret_basic` only; operators on `private_key_jwt` + will see introspection silently use basic auth. +- **Per-route bearer configuration.** Single middleware-wide setting in this + iteration. + +## References + +- [PR design spec](superpowers/specs/2026-05-18-bearer-token-auth-design.md) — full design rationale, alternatives considered, and per-section sign-off history. +- [RFC 6750](https://www.rfc-editor.org/rfc/rfc6750) — Bearer Token Usage. +- [RFC 7662](https://www.rfc-editor.org/rfc/rfc7662) — OAuth 2.0 Token Introspection. +- [RFC 9068](https://www.rfc-editor.org/rfc/rfc9068) — JWT Profile for OAuth 2.0 Access Tokens. diff --git a/docs/CONFIGURATION.md b/docs/CONFIGURATION.md index 6a2697b..408cce4 100644 --- a/docs/CONFIGURATION.md +++ b/docs/CONFIGURATION.md @@ -261,6 +261,26 @@ strictAudienceValidation: true | `disableReplayDetection` | bool | `false` | Disable JTI-based replay attack detection | | `allowPrivateIPAddresses` | bool | `false` | Allow private IPs in provider URLs | +### Bearer-token (M2M) authentication + +Opt-in path that accepts `Authorization: Bearer ` instead of the cookie +session flow. M2M-only, default off, audience-mandatory. See +[docs/BEARER_AUTH.md](BEARER_AUTH.md) for the threat model and operational +guidance. + +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| `enableBearerAuth` | bool | `false` | Master switch. Startup fails if true with empty `audience` or with `bearerIdentifierClaim=email`. | +| `bearerIdentifierClaim` | string | `"sub"` | JWT claim used as the principal identifier. `"email"` is rejected at startup. | +| `stripAuthorizationHeader` | bool | `true` | Strip `Authorization` from forwarded requests after successful bearer auth. | +| `bearerEmitWWWAuthenticate` | bool | `true` | Emit RFC 6750 `WWW-Authenticate: Bearer error="..."` hints on 401. 
| +| `bearerOverridesCookie` | bool | `false` | Cookie wins when both bearer and cookie are present (default). Set true for bearer-wins. | +| `maxTokenAgeSeconds` | int64 | `86400` | Upper bound on `iat` claim age (24h). 0 disables the check. | +| `maxIdentifierLength` | int | `256` | Length cap on the sanitised principal identifier. | +| `bearerFailureThreshold` | int | `20` | Consecutive 401s from one source IP that trip the throttle. | +| `bearerFailureWindowSeconds` | int | `60` | Rolling window for counting 401s. | +| `bearerFailurePenaltySeconds` | int | `60` | 429 + `Retry-After` duration after the threshold trips. | + --- ## Session Management diff --git a/docs/superpowers/specs/2026-05-18-bearer-token-auth-design.md b/docs/superpowers/specs/2026-05-18-bearer-token-auth-design.md new file mode 100644 index 0000000..e333ca8 --- /dev/null +++ b/docs/superpowers/specs/2026-05-18-bearer-token-auth-design.md @@ -0,0 +1,459 @@ +# Bearer Token Authentication — Design Spec + +- **Date**: 2026-05-18 +- **Status**: Design — pending implementation plan +- **Supersedes**: PR #93 (broken implementation; recommended to close in favour of this design) + +## 1. Summary + +Add an opt-in path that lets API clients (machine-to-machine) authenticate by presenting a signed access token in the `Authorization: Bearer ` header, bypassing the cookie-based OIDC redirect flow. Identity, roles, and authorization checks remain consistent with the existing cookie path; the only thing that changes is how the principal is established for that single request. + +The feature is implemented by extracting a shared `forwardAuthorized` pipeline from the existing `processAuthorizedRequest`, introducing a `principal` value type, and adding a small bearer-specific entrypoint that builds a principal directly from a verified JWT — without synthesising a fake `SessionData`. + +## 2. 
Motivation + +PR #93 attempted this feature by building an in-memory `SessionData` from JWT claims and reusing `processAuthorizedRequest`. The approach has three latent defects: + +1. The synthetic session omits `mainSession.Values["user_identifier"]`. `processAuthorizedRequest` reads it via `GetUserIdentifier()`; when empty it bails to `defaultInitiateAuthentication` and issues an OIDC redirect. The feature is non-functional in practice despite the unit test passing. +2. `verifyToken` accepts both ID tokens (audience match against `clientID`) and access tokens. ID tokens are not API credentials; treating them as such is a classic token-confusion vector. +3. `verifyToken` adds JTI to the replay blacklist on first verify. Once the verified-token cache evicts, subsequent reuse of the same bearer token triggers a false-positive replay rejection. + +Rather than patch a synthetic-session approach that will keep generating bugs as `SessionData` evolves, this spec replaces it with a cleaner abstraction where session lifecycle and post-auth header injection live in separate units. + +## 3. Goals + +- Accept `Authorization: Bearer ` from M2M clients, validate the token, and forward the request downstream with identity headers populated. +- Enforce the same `allowedRolesAndGroups` policy as the cookie path. +- Default-off; safe defaults when enabled (audience required, ID tokens rejected, identifier sanitised). +- No behavioural change to the cookie path. Existing tests must continue to pass without modification. + +## 4. Non-Goals + +- Human-user / browser flows. Bearer is M2M-only in this iteration. +- Pure opaque access tokens on the bearer path. Tokens must be JWTs; introspection (RFC 7662) is supported *on top of* JWT verification for revocation state, not as a substitute for it. +- mTLS, API keys, or any other auth method. The `principal` abstraction enables them later, but they are not delivered here. +- Per-route bearer configuration. Single middleware-wide setting. 
+ +## 5. Decided Requirements + +| Topic | Decision | +|---|---| +| Consumer type | Machine-to-machine (M2M) only | +| Token format | JWT only (signature, issuer, audience, exp) | +| Audience | Mandatory when feature enabled; startup fails if `Audience == ""` | +| Token type | Access tokens only; ID tokens explicitly rejected | +| Revocation | JWT-only verification by default; introspection (RFC 7662) opt-in via existing `RequireTokenIntrospection` | +| Identity claim | New `BearerIdentifierClaim` config (string, default `"sub"`). Bearer path reads this claim exclusively; does NOT use `UserIdentifierClaim` (which defaults to `"email"` and drives the cookie path). Resolved value must be a non-empty string. `sub` is mandatory per `jwt.go:416` regardless, so even with a different `BearerIdentifierClaim` the token must still carry a valid `sub`. Decoupling avoids the M2M-vs-human-user identity-claim conflict and the email-spoofing footgun. | +| Identifier sanitisation | Reject value containing any `unicode.IsControl` char, any Unicode bidi-override (U+202A–U+202E, U+2066–U+2069), leading/trailing whitespace, commas, semicolons, equals signs. Max length 256 bytes. | +| Token classifier | **Reuse existing `detectTokenType(jwt, token)` at `token_manager.go:187-303`** which already handles `nonce`, `typ: at+jwt`, `token_use`, `scope`, and aud-vs-clientID priority. Bearer path rejects any token where `detectTokenType == true` (ID token). Do not invent a parallel classifier. | +| Algorithm pinning | Hard-pin `alg ∈ {RS256, RS384, RS512, PS256, PS384, PS512, ES256, ES384, ES512}`, enforced **before** JWKS lookup on the bearer path. Prevents wasted JWKS fetches for `alg=none`/HS attacker probes. | +| `kid` hardening | `kid` ≤ 256 bytes, charset `[A-Za-z0-9._\-=]`. Reject before JWKS lookup. | +| Token age | Bearer path enforces `now - iat <= MaxTokenAgeSeconds` (default 86400 / 24h, configurable). Cookie path unchanged. 
| +| Multi-audience policy | If `aud` is an array (length > 1), require `azp` claim to be present and equal to `clientID`. Single-string `aud` unaffected. | +| Mixed bearer + cookie precedence | **Cookie wins by default** when both are presented (safer for browser scenarios). Operator opt-in: `BearerOverridesCookie=true` to flip. Either way, a warning is logged on the request. | +| Bearer + excluded URL | `Authorization` header is **stripped** before forwarding when the request hits an excluded URL. Prevents bearer leaking into public endpoints' downstream logs and prevents recon via excluded paths. | +| Per-source bearer 401 throttle | New sharded cache `failedBearerAttempts` keyed by client IP. After N (default 20) consecutive 401s from one IP within 1 minute, reject further bearer requests from that IP with 429 for 60s. Applied BEFORE `verifyToken` to deny JWKS amplification. | +| `Authorization` header passthrough | New `StripAuthorizationHeader` config, default `true` | +| Roles/groups gating | Same `allowedRolesAndGroups` rules as cookie path | +| Default state | `EnableBearerAuth` = `false` | +| JTI replay marking | Suppressed on bearer path (only the JTI Set is skipped; the blacklist Get stays active); cookie path unchanged | +| Failure response shape | 401 with generic body; `WWW-Authenticate: Bearer error="invalid_token"` per RFC 6750 | +| Introspection endpoint outage | 503 (distinguishes infra outage from token rejection) | +| Mixed bearer + cookie | Cookie wins by default (see the precedence row above); bearer wins only when `BearerOverridesCookie=true` | +| SSE/WS bypass + bearer | Bypass paths keep cookie-only check; bearer header ignored on SSE/WS | + +## 6. 
Architecture + +``` + ┌──────────────────┐ + HTTP req ──► │ ServeHTTP │ (existing entry; adds bearer detection) + └─────────┬────────┘ + ┌───────────┴────────────┐ + ▼ ▼ + cookie / session bearer (Authorization: Bearer …) + │ │ + ▼ ▼ + ┌────────────────┐ ┌────────────────────┐ + │ buildPrincipal │ │ buildPrincipal │ + │ FromSession() │ │ FromBearerToken() │ + └────────┬───────┘ └─────────┬──────────┘ + │ produces *principal │ + └──────────────┬───────────┘ + ▼ + ┌────────────────────────────┐ + │ forwardAuthorized(rw,req,p)│ (shared pipeline) + │ • roles/groups gate │ + │ • header injection │ + │ • header templates │ + │ • security headers │ + │ • cookie stripping │ + │ • next.ServeHTTP │ + └────────────────────────────┘ +``` + +**Invariant**: `forwardAuthorized` never touches session storage. Session-specific concerns (Save, IsDirty, backchannel-logout invalidation) stay inside `processAuthorizedRequest` around the call to `forwardAuthorized`. + +**Feature gate**: when `EnableBearerAuth == false`, the bearer-detection check in `ServeHTTP` is a no-op. Existing deployments observe byte-identical behaviour. + +## 7. Components + +### 7.1 `principal` type (new file `principal.go`) + +```go +type principalSource int + +const ( + sourceSession principalSource = iota + sourceBearer +) + +type principal struct { + Identifier string // drives X-Forwarded-User + Email string // optional, "" for M2M + Subject string // sub claim + ClientID string // azp / client_id, M2M caller + Claims map[string]interface{} // raw claims for templates / groups + AccessToken string // for X-Auth-Request-Token (gated by minimalHeaders) + IDToken string // "" on bearer path + RefreshToken string // "" on bearer path + Source principalSource +} +``` + +Pure data. No methods that mutate it. No I/O. No manager pointer. 
+ +### 7.2 `buildPrincipalFromSession(*SessionData) *principal` (new in `principal.go`) + +Read-only adapter over existing `SessionData` getters: `GetUserIdentifier`, `GetEmail`, `GetAccessToken`, `GetIDToken`, `GetRefreshToken`, cached claims via `GetIDTokenClaims`. Does not write back to the session. This is the only function that still knows about `SessionData`. + +### 7.3 `buildPrincipalFromBearerToken(token string) (*principal, error)` (new in `bearer_auth.go`) + +1. **Length / format guards**: `len(token) <= AccessTokenConfig.MaxLength`, exactly two dots, non-empty after trim. +2. **Parse header for early alg/kid pinning** (without trusting payload): decode JOSE header; reject if `alg` ∉ asymmetric allowlist; reject if `kid` missing, > 256 bytes, or contains chars outside `[A-Za-z0-9._\-=]`. This happens **before** JWKS lookup so attacker noise doesn't amplify into JWKS fetches. +3. **Per-IP 401 throttle check**: if this IP is in the `failedBearerAttempts` penalty box, return 429 immediately. +4. `t.verifyToken(token, verifyOpts{skipReplayMarking: true})` — reuses signature, issuer, audience, expiration, JTI Get (replay detection). The `skipReplayMarking` flag gates ONLY the JTI Set at `token_manager.go:108-143`; the JTI Get at `token_manager.go:44-47, 80-89` remains active so revoked tokens (via `RevokeToken` adding to blacklist) are still rejected. +5. **Re-parse claims** (`parseJWT(token)` is cheap and already done internally; reuse via a single decode if practical). +6. **Token-type guard**: call existing `detectTokenType(jwt, token)` (`token_manager.go:187-303`). Reject when it returns `true` (ID token). Belt-and-braces: also reject if `claims["nonce"]` is a non-empty string or `claims["token_use"] == "id"`. +7. **Multi-audience hardening**: if `claims["aud"]` is a `[]interface{}` with length > 1, require `claims["azp"]` to be a non-empty string equal to `t.clientID`; reject otherwise. +8. 
**`iat` upper-age bound**: reject when `time.Now().Unix() - int64(claims["iat"].(float64)) > MaxTokenAgeSeconds` (default 86400). +9. **Optional introspection**: if `requireTokenIntrospection` is set, call `introspectToken`; reject if `active == false` (401); surface 503 on transport failure. Bearer-path introspection cache TTL is capped at 60s (not 5min) to keep the "real-time revocation" promise close to true. +10. **Identifier resolution**: read `t.bearerIdentifierClaim` (defaults to `"sub"`); do NOT use `t.userIdentifierClaim` (cookie path's setting, default `email`). The bearer path does NOT fall back to other claims because `jwt.Verify` already enforces non-empty `sub` (`jwt.go:416-419`). Empty/missing identifier → 401. +11. **Identifier sanitisation**: trim, then reject if length > 256 OR contains any of: `unicode.IsControl`, bidi-override (U+202A–U+202E, U+2066–U+2069), `,`, `;`, `=`. +12. Return `&principal{ Source: sourceBearer, … }`. + +On any failure path: increment the per-IP `failedBearerAttempts` counter; return the appropriate HTTP status (401 / 403 / 429 / 503) without revealing the failure reason in the response body. Reason is logged at debug only, with the identifier (if resolved) hashed via SHA-256 truncated to 8 hex chars. + +### 7.4 `forwardAuthorized(rw, req, *principal)` (new in `middleware.go`, extracted) + +The shared post-auth pipeline. Lifted verbatim from the existing `processAuthorizedRequest`: + +1. Roles/groups extraction via existing `extractGroupsAndRolesFromClaims`. +2. `allowedRolesAndGroups` gate (existing logic). +3. Inject `X-Forwarded-User`, `X-User-Groups`, `X-User-Roles`. +4. Inject `X-Auth-Request-*` (gated by `minimalHeaders`). +5. Header templates. +6. Security headers. +7. Cookie strip when `stripAuthCookies`. +8. **New**: `Authorization` header strip when `stripAuthorizationHeader` AND `principal.Source == sourceBearer`. +9. `t.next.ServeHTTP(rw, req)`. + +Does not call `Save`, does not check `IsDirty`. 
Session persistence stays with the cookie-path caller. + +### 7.5 `handleBearerRequest(rw, req)` (new in `bearer_auth.go`) + +``` +1. Detect "Authorization: Bearer " (case-insensitive prefix). +2. token = TrimSpace(authHeader[7:]); reject empty. +3. p, err := buildPrincipalFromBearerToken(token). + On err → 401 with WWW-Authenticate, log reason at debug. +4. forwardAuthorized(rw, req, p). +``` + +Target: ~40 lines. + +### 7.6 Refactor of `processAuthorizedRequest` (modify `middleware.go`) + +Splits along the principal boundary: +- Session-specific part (backchannel-logout invalidation, `IsDirty` / `Save`) stays in `processAuthorizedRequest`. +- Everything else moves to `forwardAuthorized`. +- `processAuthorizedRequest` ends with `forwardAuthorized(rw, req, buildPrincipalFromSession(session))`. + +### 7.7 `verifyOpts` extension to `verifyToken` (modify `token_manager.go`) + +Add a parameter struct: +```go +type verifyOpts struct { + skipReplayMarking bool // suppress JTI Set (token_manager.go:108-143); blacklist Get stays active +} +``` + +Both the type and field are unexported (internal-only knob). Signature change: `verifyToken(token string)` becomes `verifyToken(token string, opts verifyOpts)`. Existing callers pass `verifyOpts{}` (zero value = current behaviour). Bearer path passes `verifyOpts{skipReplayMarking: true}`. + +**Critical semantics — must be reflected in implementation and tests:** +- `skipReplayMarking` only gates the **Set** at `token_manager.go:108-143` (the call adding the JTI to the blacklist and replay cache). +- The blacklist **Get** at `token_manager.go:44-47, 80-89` stays unconditionally active on the bearer path. Tokens revoked via `RevokeToken` (which adds the JTI to the blacklist) MUST still be rejected on the bearer path. +- Must NOT be implemented by mutating `t.disableReplayDetection` (struct field) — that would create a cross-request race that disables replay protection globally. 
+ +A targeted regression test exercises: bearer token verified once → admin calls `RevokeToken` adding the JTI to the blacklist → same token replayed → 401. + +### 7.8 Config additions (modify `settings.go`) + +```go +EnableBearerAuth bool `json:"enableBearerAuth,omitempty"` +BearerIdentifierClaim string `json:"bearerIdentifierClaim,omitempty"` +StripAuthorizationHeader bool `json:"stripAuthorizationHeader,omitempty"` +BearerEmitWWWAuthenticate bool `json:"bearerEmitWWWAuthenticate,omitempty"` +BearerOverridesCookie bool `json:"bearerOverridesCookie,omitempty"` +MaxTokenAgeSeconds int64 `json:"maxTokenAgeSeconds,omitempty"` +MaxIdentifierLength int `json:"maxIdentifierLength,omitempty"` +BearerFailureThreshold int `json:"bearerFailureThreshold,omitempty"` +BearerFailureWindowSeconds int `json:"bearerFailureWindowSeconds,omitempty"` +BearerFailurePenaltySeconds int `json:"bearerFailurePenaltySeconds,omitempty"` +``` + +Defaults (applied in `CreateConfig` for the bearer-related fields; values >0 only honoured when `EnableBearerAuth=true`): +- `EnableBearerAuth`: `false`. +- `BearerIdentifierClaim`: `"sub"`. +- `StripAuthorizationHeader`: `true`. +- `BearerEmitWWWAuthenticate`: `true` (RFC 6750 hint enabled by default; flip to false if recon-exposure is a concern). +- `BearerOverridesCookie`: `false` (cookie wins when both present; flip to `true` for the legacy/industry-default behaviour). +- `MaxTokenAgeSeconds`: `86400` (24h upper bound on `iat`). +- `MaxIdentifierLength`: `256`. +- `BearerFailureThreshold`: `20` (consecutive 401s per IP before throttle). +- `BearerFailureWindowSeconds`: `60`. +- `BearerFailurePenaltySeconds`: `60` (429 reply for this long after threshold tripped). + +### 7.9 Startup validation (modify `main.go` `New()`) + +- `EnableBearerAuth && Audience == ""` → fatal error. +- `EnableBearerAuth && !StrictAudienceValidation` → warning log (recommended hardening). 
+- `EnableBearerAuth && BearerIdentifierClaim == "email"` → fatal error (the bearer path is M2M and an `email` identifier without `email_verified` enforcement is a spoofing vector; default `BearerIdentifierClaim=sub` avoids this; explicit override to `email` is rejected). +- `EnableBearerAuth && MaxTokenAgeSeconds <= 0` → reset to default 86400 with info log. +- `EnableBearerAuth && BearerFailureThreshold <= 0` → reset to default 20 with info log. + +## 8. Data Flow + +### 8.1 Bearer path + +``` +ServeHTTP entry (pre-init paths unchanged: logout, backchannel, frontchannel, excluded URLs, SSE/WS bypass) + │ + ├─ enableBearerAuth == false? → fall through to cookie path + │ + └─ enableBearerAuth == true AND Authorization starts with "Bearer " + │ + ▼ + handleBearerRequest + │ + ├─ format guards (empty, length, segment count) + │ + ▼ + verifyToken(token, verifyOpts{skipReplayMarking: true}) + │ signature, issuer, audience (strict), exp + │ + ▼ + classifyToken(claims) → reject ID tokens + │ + ▼ + if requireTokenIntrospection: introspectToken → active check + │ + ▼ + resolveIdentifier(claims) → sanitizeIdentifier + │ + ▼ + principal{Source: sourceBearer, …} + │ + ▼ + forwardAuthorized(rw, req, principal) + │ + ├─ roles/groups gate (403 on deny) + ├─ header injection + ├─ header templates + ├─ security headers + ├─ strip OIDC cookies (existing) + ├─ strip Authorization header (new, when configured) + └─ next.ServeHTTP(rw, req) +``` + +### 8.2 Cookie path (refactored, semantically unchanged) + +``` +processAuthorizedRequest + 1. Session validity / backchannel-logout invalidation (unchanged). + 2. principal := buildPrincipalFromSession(session). + 3. if session.IsDirty(): session.Save(). + 4. forwardAuthorized(rw, req, principal). +``` + +## 9.
Error Handling + +| Trigger | Status | Body | WWW-Authenticate | Debug log reason | +|---|---|---|---|---| +| Empty bearer after prefix | 401 | `Unauthorized` | `Bearer error="invalid_request"` | empty bearer token | +| Token over MaxLength | 401 | `Unauthorized` | `Bearer error="invalid_token"` | token exceeds max length | +| Not a 3-segment JWT | 401 | `Unauthorized` | `Bearer error="invalid_token"` | malformed JWT | +| Disallowed `alg` (e.g. none, HS*) | 401 | `Unauthorized` | `Bearer error="invalid_token"` | unsupported alg | +| Missing/oversized/bad-charset `kid` | 401 | `Unauthorized` | `Bearer error="invalid_token"` | invalid kid | +| Signature / issuer / aud / exp fail | 401 | `Unauthorized` | `Bearer error="invalid_token"` | reason from verifyToken (category only) | +| `iat` older than MaxTokenAgeSeconds | 401 | `Unauthorized` | `Bearer error="invalid_token"` | token too old (iat outside age bound) | +| Multi-aud without matching `azp` | 401 | `Unauthorized` | `Bearer error="invalid_token"` | multi-aud token without azp match | +| Detected as ID token | 401 | `Unauthorized` | `Bearer error="invalid_token"` | ID tokens not accepted on bearer path | +| JTI blacklisted (revoked) | 401 | `Unauthorized` | `Bearer error="invalid_token"` | token JTI in blacklist | +| Introspection `active=false` | 401 | `Unauthorized` | `Bearer error="invalid_token"` | token inactive at IdP | +| Introspection endpoint failure | 503 | `Service Unavailable` | (none) | introspection unavailable | +| Identifier claim missing/empty | 401 | `Unauthorized` | `Bearer error="invalid_token"` | no identifier claim | +| Identifier fails sanitisation | 401 | `Unauthorized` | `Bearer error="invalid_token"` | invalid identifier characters | +| Per-IP failure threshold tripped | 429 | `Too Many Requests` | (none); `Retry-After: ` | source IP in penalty box | +| Roles/groups not allowed | 403 | `Access denied` | (none) | user not in allowedRolesAndGroups | + +Responses never include token 
contents, never include the raw failure reason, and never set `Location` headers (API clients cannot follow redirects). + +## 10. Edge Cases + +1. **Both bearer header and cookie session present.** Cookie wins by default (safer against browser/extension/proxy bearer injection). `BearerOverridesCookie=true` flips to bearer-wins. Either way: WARN log includes both source markers so operators can audit. +2. **`Authorization: Basic …`.** Not bearer; cookie path runs as today. +3. **`Authorization: Bearer ` (trailing space, no value).** Empty after trim → 401. +4. **Mixed-case prefix (`bearer`, `BEARER`, `BeArEr`).** Case-insensitive prefix check; token value preserved verbatim. +5. **Multiple `Authorization` headers.** Use only the first (Go `http.Header.Get` default). Documented. +6. **Bearer during OIDC init wait.** Bearer requests also block on init: we need `issuerURL`, `audience`, JWKs ready. If init fails, bearer requests return 503 just like cookie requests. +7. **SSE / WebSocket bypass with bearer.** Bypass paths keep cookie-only behaviour. Operators who want bearer on streaming endpoints must remove SSE/WS bypass. Documented. +8. **Logout endpoint with bearer.** Logout runs before bearer detection. Treated as cookie-session logout; bearer token revocation requires IdP-side action. +9. **Excluded URLs with bearer.** Bypass excluded URLs as today; bearer not validated on excluded paths. ADDITIONALLY: `Authorization: Bearer` is stripped from the request before forwarding so the token can't leak into the excluded endpoint's downstream logs / metrics scrapers / health checks. +10. **Concurrent identical bearer requests.** Existing `tokenCache` is concurrency-safe; no new locking. +11. **Client rotates token between requests.** Independent verification per token; independent cache entries. +12. **Clock skew.** Use existing `jwt.Verify` leeway. (If absent, add ±30s as a separate change; out of scope here.) + +## 11. 
Testing Strategy + +### 11.1 Integration tests (new `bearer_auth_test.go`) + +Table-driven test against a real `httptest.Server` and the full `ServeHTTP` flow. Coverage matrix: + +- Valid access token + allowed roles → 200, `next` ran, `X-Forwarded-User` set. +- Valid token without configured roles → 200. +- Wrong audience, expired, tampered signature → 401, `next` did not run. +- ID token presented → 401 (`ID tokens not accepted`). +- Malformed JWT (2 segments) → 401. +- Oversized token (> MaxLength) → 401. +- Empty bearer → 401. +- Missing identifier claim → 401. +- Identifier containing `\r\n` → 401. +- `allowedRolesAndGroups` mismatch → 403. +- `allowedRolesAndGroups` match → 200. +- `EnableBearerAuth=false` + bearer header → cookie path runs (302 to `/authorize`). +- Bearer + valid cookie session → cookie wins by default, 200; with `BearerOverridesCookie=true` → bearer wins, 200. +- `StripAuthorizationHeader=true` → downstream sees no `Authorization`. +- `StripAuthorizationHeader=false` → downstream sees `Authorization`. +- Case variants (`bearer`, `BEARER`) → 200. +- SSE bypass + bearer → cookie-only check applies (bearer ignored). +- **Replay regression**: same token 1000 times in a row → all 200. +- **Cache-evict regression**: same token, force-evict `tokenCache` between iterations (call `tokenCache.Delete` directly), replay → still 200 (verifies `skipReplayMarking` doesn't poison the blacklist). +- **Revocation-while-bearer regression**: bearer token verified once → admin calls `RevokeToken` adding JTI to blacklist → same token presented → 401 (verifies blacklist Get stays active on bearer path even with `skipReplayMarking` set). +- **Alg-pin: token signed with `alg=none`** → 401, no JWKS fetch happens (verify with a counting mock). +- **`kid` injection: 50KB random kid** → 401 immediately, no JWKS fetch. +- **Per-IP throttle**: 21 bad bearer requests from same IP within 1 minute → 22nd returns 429 + Retry-After. +- **`iat` upper-age**: token with `iat = now - 25h` → 401 (older than 24h default).
+- **Multi-aud without azp**: aud = `["a", "b"]`, no azp → 401. +- **Multi-aud with matching azp**: aud = `["api-aud", "other"]`, azp = clientID → 200. +- **Identifier with bidi-override**: sub contains U+202E → 401. +- **Identifier with comma**: sub = `"alice,bob"` → 401. +- **Identifier over 256 bytes** → 401. +- **`BearerIdentifierClaim=email` at startup with EnableBearerAuth=true** → startup fails. +- **Excluded URL + bearer**: bearer header presented on excluded URL → request forwarded, downstream sees no `Authorization` header (stripped). + +### 11.2 Unit tests (in `bearer_auth_test.go`) + +- `classifyToken`: ID-token detection, access-token detection by `scope`/`scp`/`token_use`, ambiguous → reject. +- `resolveIdentifier`: reads `bearerIdentifierClaim` (default `sub`) with no fallback to other claims; missing → error; empty string → error. +- `sanitizeIdentifier`: rejects all `unicode.IsControl`; accepts email/sub-style values. + +### 11.3 Introspection tests (`bearer_auth_introspection_test.go`) + +- Token valid + introspection `active=true` → 200. +- Token valid + introspection `active=false` → 401. +- Introspection endpoint 500 → 503. +- Second request hits introspection cache (no second HTTP call). + +### 11.4 Startup validation tests (extend `settings_test.go` / `main_test.go`) + +- `EnableBearerAuth=true, Audience=""` → `New()` errors. +- `EnableBearerAuth=true, StrictAudienceValidation=false` → succeeds with warning. +- `EnableBearerAuth=false` → no validation; existing tests untouched. + +### 11.5 Cookie-path regression suite + +- All existing `TestServeHTTP_*` tests in `main_servehttp_test.go` pass unmodified. +- Add: cookie session, `EnableBearerAuth=true`, no bearer header → identical behaviour to baseline. +- Add: dirty session still triggers `Save()` after refactor. + +### 11.6 Principal invariants + +- `buildPrincipalFromSession`: `Source == sourceSession`; `IDToken` / `RefreshToken` populated when present in session.
+- `buildPrincipalFromBearerToken`: `Source == sourceBearer`; `IDToken == ""`, `RefreshToken == ""`. +- `forwardAuthorized` produces identical headers for equivalent principals regardless of source. + +### 11.7 Coverage gate + +- New code in `bearer_auth.go` and `principal.go`: ≥ 90% line coverage. +- `forwardAuthorized` coverage ≥ existing `processAuthorizedRequest` coverage baseline. + +### 11.8 Out of scope (follow-ups) + +- Load test of bearer vs cookie hot path. +- Fuzzing the JWT parser. +- Additional auth methods (mTLS, API keys): the design enables them, but they are separate work. + +## 12. Migration / Rollout + +Default-off. Existing deployments observe no behavioural change. Operators opt in by setting: + +```yaml +enableBearerAuth: true +audience: https://api.example.com # required when bearer enabled +# optional: +stripAuthorizationHeader: true # default +requireTokenIntrospection: false # default; set true for real-time revocation +bearerIdentifierClaim: client_id # optional override; defaults to sub (email is rejected at startup) +``` + +Documentation: update `docs/CONFIGURATION.md` with a bearer-auth section, and add a new `docs/BEARER_AUTH.md` covering the security model, threat assumptions (token issuer is trusted; audience must be set; bearer means trust the issuer's revocation policy unless introspection enabled), and recommended configurations for common IdPs. + +## 13. Security Considerations + +| Concern | Mitigation | +|---|---| +| Token confusion (ID token used as bearer) | Reuse `detectTokenType` (`token_manager.go:187-303`) which checks `nonce`, `typ: at+jwt`, `token_use`, `scope`, aud-vs-clientID. Belt-and-braces: explicit `nonce` + `token_use == "id"` rejection on top. | +| Audience confusion (token for service B accepted by A) | `Audience` mandatory at startup; verified via existing `VerifyJWTSignatureAndClaims`; multi-aud tokens require matching `azp == clientID`.
| +| Replay-via-blacklist false positive | `verifyOpts{skipReplayMarking: true}` on bearer path. Gates ONLY the Set; the Get stays so revoked tokens still fail. | +| Revocation lag | Optional RFC 7662 introspection. Bearer-path introspection cache TTL capped at 60s. Set `RequireTokenIntrospection=true` for real-time revocation. | +| `alg`-confusion / `alg=none` attacks | Hard-pin asymmetric allowlist at bearer entry, **before** JWKS fetch. Prevents wasted upstream calls and locks out HS/none probes. | +| `kid` injection / JWKS amplification | `kid` length cap (256 bytes) + charset allowlist enforced at bearer entry. | +| Bearer 401 brute-force / oracle | Per-IP `failedBearerAttempts` cache; configurable threshold + penalty box returning 429 + `Retry-After`. | +| `iat` clock-manipulation / forever-tokens | `MaxTokenAgeSeconds` upper bound (default 24h); cookie path unchanged. | +| Identifier-driven header injection | `sanitizeIdentifier`: length cap, control-char + bidi-override + `,;=` rejection. `net/http` rejects CRLF on the wire too (defence in depth). | +| Token leakage downstream | `StripAuthorizationHeader=true` by default. Also: `Authorization` stripped on excluded-URL requests so bearer can't leak into health/metrics downstream logs. | +| Token-in-logs | All log paths log reason categories, not raw tokens. Identifier hashed via SHA-256 truncated to 8 hex chars before any info/warn-level emission (full identifier only at debug). New `safeLogAuthEvent(category, hashedIdentifier, reasonCode)` helper makes this hard to misuse. | +| `email` claim spoofing | Startup fails if `EnableBearerAuth && BearerIdentifierClaim == "email"`; the cookie path's `UserIdentifierClaim` is unaffected. Future human-user bearer iteration must add `email_verified` enforcement. | +| Bypass on SSE / WS endpoints | SSE/WS bypass keeps cookie-only behaviour; bearer ignored. Operators choose to widen if needed. | +| Mixed bearer + cookie precedence | Cookie wins by default (safer for browser scenarios); `BearerOverridesCookie=true` flips.
WARN log on both-present requests. | +| Configuration drift (operator forgets audience) | Startup fails when `EnableBearerAuth=true && Audience==""`. | +| Downstream blast radius when `StripAuthorizationHeader=false` | Documented: forwarded bearer extends token's blast radius to all downstream services. Logs at those services become token stores. Operators must treat downstream log policy accordingly. | +| Introspection auth method (pre-existing gap, called out) | `token_introspection.go:80` uses `client_secret_basic` only; does not honour `private_key_jwt`. Out of scope for this PR but documented as a follow-up; operators using `ClientAuthMethod=private_key_jwt` + `RequireTokenIntrospection=true` should be aware introspection will use basic auth. | + +## 14. Open Questions + +None: all design decisions were resolved during brainstorming + security review. Implementation may surface incidental questions (e.g. exact clock-skew leeway in `jwt.Verify`); those are out of scope for this spec and handled in the implementation plan. + +## 14a. Security Review Reference + +This design was reviewed by the `security-reviewer` subagent on 2026-05-18. Findings incorporated: + +- **Critical**: C1 (classifier reuses `detectTokenType`), C2 (sub fallback dropped; unreachable due to `jwt.go:416`), C3 (replay-marking gates only Set, not Get; revocation regression test added). +- **High**: H1 (alg pinned at bearer entry), H2 (kid length + charset), H3 (cookie wins by default, configurable), H4 (per-IP 401 throttle), H5 (multi-aud requires azp). +- **Medium**: M1 (identifier max-length + bidi reject + delimiter chars), M2 (introspection cache TTL capped at 60s on bearer path), M4 (log-hashing via SHA-256[:8]), M5 (StripAuth blast-radius documented), M6 (iat upper-age bound), M7 (Authorization stripped on excluded URLs). +- **Low/Nit**: L2 (renamed to `BearerEmitWWWAuthenticate`), N3 (startup rejects `BearerIdentifierClaim=email`).
+- **Documented as pre-existing gaps (follow-up PRs)**: M3 (introspection auth method doesn't honour `private_key_jwt`). + +## 15. Implementation Plan Reference + +To be produced by the `writing-plans` skill in a follow-up document at `docs/superpowers/plans/2026-05-18-bearer-token-auth-plan.md`. The plan decomposes this design into ordered, independently-testable PRs. diff --git a/internal/cache/backends/hybrid.go b/internal/cache/backends/hybrid.go index 890c513..3e4be76 100644 --- a/internal/cache/backends/hybrid.go +++ b/internal/cache/backends/hybrid.go @@ -164,7 +164,7 @@ func (h *HybridBackend) Set(ctx context.Context, key string, value []byte, ttl t // Check if we're in fallback mode if h.fallbackMode.Load() { - h.logger.Debugf("Operating in fallback mode, skipping L2 write for key: %s", key) + h.logger.Debugf("Operating in fallback mode, skipping L2 write for key: %s", redactKey(key)) return nil // Don't fail the operation if L2 is down } @@ -176,13 +176,13 @@ func (h *HybridBackend) Set(ctx context.Context, key string, value []byte, ttl t // Synchronous write for critical cache types if err := h.secondary.Set(ctx, key, value, ttl); err != nil { h.errors.Add(1) - h.logger.Warnf("Failed to write to L2 cache (sync) for key %s: %v", key, err) + h.logger.Warnf("Failed to write to L2 cache (sync) for key %s: %v", redactKey(key), err) h.recordL2Error() // Don't fail the operation - L1 write succeeded return nil } h.l2Writes.Add(1) - h.logger.Debugf("Synchronous write to L2 completed for critical key: %s", key) + h.logger.Debugf("Synchronous write to L2 completed for critical key: %s", redactKey(key)) } else { // Asynchronous write for non-critical cache types select { @@ -192,10 +192,10 @@ func (h *HybridBackend) Set(ctx context.Context, key string, value []byte, ttl t ttl: ttl, ctx: ctx, }: - h.logger.Debugf("Queued async write to L2 for key: %s", key) + h.logger.Debugf("Queued async write to L2 for key: %s", redactKey(key)) default: // Buffer is full, log and 
continue - h.logger.Warnf("Async write buffer full, dropping L2 write for key: %s", key) + h.logger.Warnf("Async write buffer full, dropping L2 write for key: %s", redactKey(key)) h.errors.Add(1) } } @@ -209,7 +209,7 @@ func (h *HybridBackend) Get(ctx context.Context, key string) ([]byte, time.Durat value, ttl, exists, err := h.primary.Get(ctx, key) if err != nil { h.errors.Add(1) - h.logger.Debugf("L1 get error for key %s: %v", key, err) + h.logger.Debugf("L1 get error for key %s: %v", redactKey(key), err) } if exists { @@ -227,7 +227,7 @@ func (h *HybridBackend) Get(ctx context.Context, key string) ([]byte, time.Durat value, ttl, exists, err = h.secondary.Get(ctx, key) if err != nil { h.errors.Add(1) - h.logger.Debugf("L2 get error for key %s: %v", key, err) + h.logger.Debugf("L2 get error for key %s: %v", redactKey(key), err) h.recordL2Error() h.misses.Add(1) return nil, 0, false, nil // Don't propagate L2 errors @@ -544,7 +544,7 @@ func (h *HybridBackend) queueL1Backfill(key string, value []byte, ttl time.Durat case h.l1BackfillBuffer <- &l1BackfillItem{key: key, value: value, ttl: ttl}: default: h.l1BackfillDrops.Add(1) - h.logger.Debugf("L1 backfill buffer full, dropping for key: %s", key) + h.logger.Debugf("L1 backfill buffer full, dropping for key: %s", redactKey(key)) } } @@ -576,9 +576,9 @@ func (h *HybridBackend) l1BackfillWorker() { } writeCtx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond) if err := h.primary.Set(writeCtx, item.key, item.value, item.ttl); err != nil { - h.logger.Debugf("Failed to populate L1 cache from L2 for key %s: %v", item.key, err) + h.logger.Debugf("Failed to populate L1 cache from L2 for key %s: %v", redactKey(item.key), err) } else { - h.logger.Debugf("Populated L1 cache from L2 for key: %s", item.key) + h.logger.Debugf("Populated L1 cache from L2 for key: %s", redactKey(item.key)) } cancel() } @@ -619,11 +619,11 @@ func (h *HybridBackend) asyncWriteWorker() { writeCtx, cancel := 
context.WithTimeout(item.ctx, 500*time.Millisecond) if err := h.secondary.Set(writeCtx, item.key, item.value, item.ttl); err != nil { h.errors.Add(1) - h.logger.Debugf("Async write to L2 failed for key %s: %v", item.key, err) + h.logger.Debugf("Async write to L2 failed for key %s: %v", redactKey(item.key), err) h.recordL2Error() } else { h.l2Writes.Add(1) - h.logger.Debugf("Async write to L2 completed for key: %s", item.key) + h.logger.Debugf("Async write to L2 completed for key: %s", redactKey(item.key)) } cancel() } diff --git a/internal/cache/backends/log_redact.go b/internal/cache/backends/log_redact.go new file mode 100644 index 0000000..1022f44 --- /dev/null +++ b/internal/cache/backends/log_redact.go @@ -0,0 +1,26 @@ +// Package backends provides cache backend implementations for the Traefik OIDC plugin. +package backends + +import ( + "crypto/sha256" + "encoding/hex" +) + +// redactKey returns a short, deterministic hash prefix of a cache key for use +// in debug/info log lines. Cache keys in this plugin can include raw access / +// refresh / id tokens (any caller may pass an arbitrary string), and CodeQL +// flags `key=%s` formatters as a clear-text-logging sink for HTTP-header- +// sourced taint. The hash preserves cache-key uniqueness in logs (same key → +// same hash, useful for correlating a problematic key across log lines) while +// keeping the raw value out of disk-resident log streams. +// +// 8 hex chars (32 bits) is enough to disambiguate at human-debugging scale +// without making the hash itself a useful lookup primitive for an attacker +// who only has the log stream. 
+func redactKey(key string) string { + if key == "" { + return "(empty)" + } + sum := sha256.Sum256([]byte(key)) + return hex.EncodeToString(sum[:4]) +} diff --git a/internal/cache/cache.go b/internal/cache/cache.go index 7964684..0185910 100644 --- a/internal/cache/cache.go +++ b/internal/cache/cache.go @@ -190,7 +190,7 @@ func (c *Cache) Set(key string, value interface{}, ttl time.Duration) error { c.currentSize++ atomic.AddInt64(&c.sets, 1) - c.logger.Debugf("Cache: Set key=%s, size=%d, ttl=%v", key, size, ttl) + c.logger.Debugf("Cache: Set key=%s, size=%d, ttl=%v", redactKey(key), size, ttl) return nil } @@ -346,7 +346,7 @@ func (c *Cache) evictLRU() { item, _ := elem.Value.(*Item) // Safe to ignore: type assertion from known type c.removeItem(item.Key, item) atomic.AddInt64(&c.evictions, 1) - c.logger.Debugf("Cache: Evicted LRU item key=%s", item.Key) + c.logger.Debugf("Cache: Evicted LRU item key=%s", redactKey(item.Key)) } } diff --git a/internal/cache/log_redact.go b/internal/cache/log_redact.go new file mode 100644 index 0000000..809b4a5 --- /dev/null +++ b/internal/cache/log_redact.go @@ -0,0 +1,22 @@ +// Package cache provides the in-memory cache implementation for the Traefik +// OIDC plugin. +package cache + +import ( + "crypto/sha256" + "encoding/hex" +) + +// redactKey returns a short, deterministic hash prefix of a cache key for use +// in debug/info log lines. Cache keys may include raw access / refresh / id +// tokens (callers pass arbitrary strings) and CodeQL flags `key=%s` +// formatters as a clear-text-logging sink for HTTP-header-sourced taint. +// The hash preserves uniqueness in logs (same key → same hash) while keeping +// the raw value out of disk-resident log streams. 
+func redactKey(key string) string { + if key == "" { + return "(empty)" + } + sum := sha256.Sum256([]byte(key)) + return hex.EncodeToString(sum[:4]) +} diff --git a/main.go b/main.go index 6682464..1eb69d7 100644 --- a/main.go +++ b/main.go @@ -239,23 +239,63 @@ func NewWithContext(ctx context.Context, config *Config, next http.Handler, name } return 0 }(), - tokenCleanupStopChan: make(chan struct{}), - metadataRefreshStopChan: make(chan struct{}), - ctx: pluginCtx, - cancelFunc: cancelFunc, - suppressDiagnosticLogs: isTestMode(), - securityHeadersApplier: config.GetSecurityHeadersApplier(), - scopeFilter: NewScopeFilter(logger), // NEW - for discovery-based scope filtering - dcrConfig: config.DynamicClientRegistration, - allowPrivateIPAddresses: config.AllowPrivateIPAddresses, - minimalHeaders: config.MinimalHeaders, - stripAuthCookies: config.StripAuthCookies, - enableBackchannelLogout: config.EnableBackchannelLogout, - enableFrontchannelLogout: config.EnableFrontchannelLogout, - backchannelLogoutPath: normalizeLogoutPath(config.BackchannelLogoutURL), - frontchannelLogoutPath: normalizeLogoutPath(config.FrontchannelLogoutURL), - sessionInvalidationCache: cacheManager.GetSharedSessionInvalidationCache(), - refreshResultCache: cacheManager.GetSharedRefreshResultCache(), + tokenCleanupStopChan: make(chan struct{}), + metadataRefreshStopChan: make(chan struct{}), + ctx: pluginCtx, + cancelFunc: cancelFunc, + suppressDiagnosticLogs: isTestMode(), + securityHeadersApplier: config.GetSecurityHeadersApplier(), + scopeFilter: NewScopeFilter(logger), // NEW - for discovery-based scope filtering + dcrConfig: config.DynamicClientRegistration, + allowPrivateIPAddresses: config.AllowPrivateIPAddresses, + minimalHeaders: config.MinimalHeaders, + stripAuthCookies: config.StripAuthCookies, + enableBackchannelLogout: config.EnableBackchannelLogout, + enableFrontchannelLogout: config.EnableFrontchannelLogout, + backchannelLogoutPath: 
normalizeLogoutPath(config.BackchannelLogoutURL), + frontchannelLogoutPath: normalizeLogoutPath(config.FrontchannelLogoutURL), + sessionInvalidationCache: cacheManager.GetSharedSessionInvalidationCache(), + refreshResultCache: cacheManager.GetSharedRefreshResultCache(), + enableBearerAuth: config.EnableBearerAuth, + stripAuthorizationHeader: config.StripAuthorizationHeader, + bearerEmitWWWAuthenticate: config.BearerEmitWWWAuthenticate, + bearerOverridesCookie: config.BearerOverridesCookie, + bearerIdentifierClaim: func() string { + if config.BearerIdentifierClaim != "" { + return config.BearerIdentifierClaim + } + return "sub" + }(), + maxIdentifierLength: func() int { + if config.MaxIdentifierLength > 0 { + return config.MaxIdentifierLength + } + return 256 + }(), + maxTokenAge: func() time.Duration { + if config.MaxTokenAgeSeconds > 0 { + return time.Duration(config.MaxTokenAgeSeconds) * time.Second + } + return 24 * time.Hour + }(), + bearerFailureThreshold: func() int { + if config.BearerFailureThreshold > 0 { + return config.BearerFailureThreshold + } + return 20 + }(), + bearerFailureWindow: func() time.Duration { + if config.BearerFailureWindowSeconds > 0 { + return time.Duration(config.BearerFailureWindowSeconds) * time.Second + } + return 60 * time.Second + }(), + bearerFailurePenalty: func() time.Duration { + if config.BearerFailurePenaltySeconds > 0 { + return time.Duration(config.BearerFailurePenaltySeconds) * time.Second + } + return 60 * time.Second + }(), } // Log audience configuration @@ -265,6 +305,31 @@ func NewWithContext(ctx context.Context, config *Config, next http.Handler, name t.logger.Debugf("No custom audience specified, using clientID as audience: %s", t.clientID) } + // Bearer-auth startup validation. The bearer path is M2M-only and demands + // a non-default audience so tokens issued for a different resource cannot + // be replayed against this service. 
The BearerIdentifierClaim guard blocks + // the `email` claim explicitly — without email_verified enforcement (out of + // scope for M2M), trusting email is a spoofing vector for federated IdPs. + // See spec §7.9 / §13. + if config.EnableBearerAuth { + if config.Audience == "" { + cancelFunc() + return nil, fmt.Errorf("EnableBearerAuth=true requires Audience to be set explicitly (cannot default to clientID — that path accepts ID tokens)") + } + if t.bearerIdentifierClaim == "email" { + cancelFunc() + return nil, fmt.Errorf("enableBearerAuth=true with bearerIdentifierClaim=%q is rejected: email-based identity without email_verified enforcement is a spoofing vector for federated IdPs (use \"sub\" or a custom claim; cookie-path userIdentifierClaim is unaffected)", t.bearerIdentifierClaim) + } + if !config.StrictAudienceValidation { + t.logger.Infof("EnableBearerAuth=true with StrictAudienceValidation=false: recommend enabling strict audience validation for hardening") + } + t.bearerFailureTracker = newBearerFailureTracker( + t.bearerFailureThreshold, t.bearerFailureWindow, t.bearerFailurePenalty, + ) + t.logger.Infof("Bearer-token auth enabled: audience=%q identifierClaim=%q stripAuthz=%t bearerOverridesCookie=%t maxTokenAge=%s", + config.Audience, t.bearerIdentifierClaim, t.stripAuthorizationHeader, t.bearerOverridesCookie, t.maxTokenAge) + } + // Convert sessionMaxAge from seconds to duration (0 will use default 24 hours) sessionMaxAge := time.Duration(config.SessionMaxAge) * time.Second t.sessionManager, _ = NewSessionManager(config.SessionEncryptionKey, config.ForceHTTPS, config.CookieDomain, config.CookiePrefix, sessionMaxAge, t.logger) // Safe to ignore: session manager creation with fallback to defaults diff --git a/middleware.go b/middleware.go index 083b637..9d6a5d7 100644 --- a/middleware.go +++ b/middleware.go @@ -168,6 +168,14 @@ func (t *TraefikOidc) ServeHTTP(rw http.ResponseWriter, req *http.Request) { // unauthenticated traffic would silently expose 
the backend. if bypass, reason := t.shouldBypassAuth(req); bypass { t.logger.Debugf("Bypassing OIDC for %s (%s)", req.URL.Path, reason) + // When bearer auth is enabled, strip the Authorization header on + // bypassed paths so a bearer token can't leak into health/metrics/ + // public endpoint logs via downstream services that don't expect it. + // Excluded URLs are explicitly public; bearer is an artifact of the + // API auth flow that doesn't belong on them. + if t.enableBearerAuth { + req.Header.Del("Authorization") + } switch reason { case bypassReasonExcluded: // Operator-declared excluded URLs forward unconditionally. @@ -236,6 +244,24 @@ func (t *TraefikOidc) ServeHTTP(rw http.ResponseWriter, req *http.Request) { // Bypass checks already ran before the init wait; no need to repeat them. t.sessionManager.CleanupOldCookies(rw, req) + // Bearer-token auth (opt-in). Runs after init (we need issuer+JWKs+aud + // available) and after bypass (excluded URLs always win). Cookie-vs- + // bearer precedence is configurable; the safe default is cookie-wins. + // See bearer_auth.go for the full pipeline. + if t.enableBearerAuth { + if _, hasBearer := detectBearerToken(req); hasBearer { + cookiePresent := t.hasSessionCookie(req) + if !cookiePresent || t.bearerOverridesCookie { + if cookiePresent { + t.logger.Infof("Both Authorization: Bearer and session cookie present on %s; bearer-wins per BearerOverridesCookie=true", req.URL.Path) + } + t.handleBearerRequest(rw, req) + return + } + t.logger.Infof("Both Authorization: Bearer and session cookie present on %s; cookie-wins (default); bearer ignored", req.URL.Path) + } + } + session, err := t.sessionManager.GetSession(req) if err != nil { t.logger.Errorf("Error getting session: %v. 
Initiating authentication.", err) @@ -401,10 +427,17 @@ func (t *TraefikOidc) ServeHTTP(rw http.ResponseWriter, req *http.Request) { t.defaultInitiateAuthentication(rw, req, session, redirectURL) } -// processAuthorizedRequest processes requests for authenticated users. -// It extracts claims, validates roles/groups if configured, sets authentication headers, -// processes header templates, and forwards the request to the next handler. -// Domain checks should be performed before calling this method. +// processAuthorizedRequest processes requests for authenticated cookie/session +// users. It performs session-specific checks (identifier presence, backchannel- +// logout invalidation, claims extraction with potential re-auth), persists +// dirty session state, then delegates the post-auth pipeline (roles/groups, +// header injection, security headers, cookie strip, forward) to +// forwardAuthorized. +// +// The bearer-token path uses the same forwardAuthorized helper but takes a +// different route to it (see bearer_auth.go). Keeping forwardAuthorized +// session-agnostic is what lets the two auth methods share one pipeline. +// // Parameters: // - rw: The HTTP response writer. // - req: The HTTP request to process. @@ -442,8 +475,7 @@ func (t *TraefikOidc) processAuthorizedRequest(rw http.ResponseWriter, req *http // the parsed claims keyed on the raw ID token, so concurrent dashboard // panel requests on the same session don't repeatedly base64-decode and // JSON-unmarshal the same JWT (a real cost under the yaegi interpreter - // that hosts Traefik plugins). idClaims is reused below by the - // header-templates branch. + // that hosts Traefik plugins). 
idToken := session.GetIDToken() var ( idClaims map[string]interface{} @@ -472,18 +504,76 @@ func (t *TraefikOidc) processAuthorizedRequest(rw http.ResponseWriter, req *http return } - var groups, roles []string + if groupClaimsErr != nil && len(t.allowedRolesAndGroups) > 0 { + // Claims couldn't be extracted but roles checks are required: + // re-authenticate rather than 403 (session may be salvageable on + // re-issue). Bearer path uses 401 for the equivalent failure. + t.logger.Errorf("Failed to extract claims for roles/groups check: %v", groupClaimsErr) + session.ResetRedirectCount() + t.defaultInitiateAuthentication(rw, req, session, redirectURL) + return + } - if groupClaimsErr == nil && groupClaims != nil { - var err error - groups, roles, err = t.extractGroupsAndRolesFromClaims(groupClaims) - if err != nil && len(t.allowedRolesAndGroups) > 0 { - t.logger.Errorf("Failed to extract groups and roles: %v", err) - session.ResetRedirectCount() - t.defaultInitiateAuthentication(rw, req, session, redirectURL) + // Persist any dirty session state BEFORE forwardAuthorized writes the + // response. Once next.ServeHTTP fires, Set-Cookie can no longer reach + // the client. The forwardAuthorized pipeline does not mutate session + // state, so saving here is safe. + if session.IsDirty() { + if err := session.Save(req, rw); err != nil { + t.logger.Errorf("Failed to save session after processing headers: %v", err) + } + } else { + t.logger.Debug("Session not dirty, skipping save in processAuthorizedRequest") + } + + // Build the source-agnostic principal. ID-token claims drive header + // templates and roles when present; otherwise fall back to access-token + // claims (matches prior behavior for opaque-ID-token providers). 
+ p := &principal{ + Source: sourceSession, + Identifier: userIdentifier, + AccessToken: session.GetAccessToken(), + IDToken: idToken, + RefreshToken: session.GetRefreshToken(), + Claims: groupClaims, + } + + t.forwardAuthorized(rw, req, p) +} + +// forwardAuthorized completes the post-authentication pipeline shared by the +// cookie/session path and the bearer-token path. It performs: +// +// 1. Roles/groups extraction from p.Claims (idempotent; existing +// extractGroupsAndRolesFromClaims helper). +// 2. allowedRolesAndGroups gate — writes a 403 and returns if denied. +// 3. Identity-header injection (X-Forwarded-User, X-User-Groups, X-User-Roles, +// plus X-Auth-Request-* when !minimalHeaders). +// 4. Operator-defined header templates. +// 5. Security headers (delegated to t.securityHeadersApplier or fallback). +// 6. OIDC session-cookie strip (stripAuthCookies). +// 7. Authorization header strip on bearer source when stripAuthorizationHeader. +// 8. next.ServeHTTP. +// +// Session persistence is the CALLER's responsibility — it must happen before +// this function so Set-Cookie reaches the response. +func (t *TraefikOidc) forwardAuthorized(rw http.ResponseWriter, req *http.Request, p *principal) { + var ( + groups, roles []string + extractErr error + ) + if p.Claims != nil { + groups, roles, extractErr = t.extractGroupsAndRolesFromClaims(p.Claims) + if extractErr != nil && len(t.allowedRolesAndGroups) > 0 { + // Bearer path: 403 (caller already verified the token; principal + // claims are present but malformed for roles purposes). + // Cookie path can't reach here because processAuthorizedRequest + // catches groupClaimsErr earlier. 
+ t.logger.Errorf("Failed to extract groups and roles: %v", extractErr) + t.sendErrorResponse(rw, req, "Access denied", http.StatusForbidden) return } - if err == nil { + if extractErr == nil { if len(groups) > 0 { req.Header.Set("X-User-Groups", strings.Join(groups, ",")) } @@ -502,62 +592,46 @@ func (t *TraefikOidc) processAuthorizedRequest(rw http.ResponseWriter, req *http } } if !allowed { - t.logger.Infof("User %s does not have any allowed roles or groups", userIdentifier) + t.logger.Infof("User %s does not have any allowed roles or groups", p.Identifier) errorMsg := fmt.Sprintf("Access denied: You do not have any of the allowed roles or groups. To log out, visit: %s", t.logoutURLPath) t.sendErrorResponse(rw, req, errorMsg, http.StatusForbidden) return } } - req.Header.Set("X-Forwarded-User", userIdentifier) + req.Header.Set("X-Forwarded-User", p.Identifier) // When minimalHeaders is enabled, skip extra headers to prevent 431 errors if !t.minimalHeaders { req.Header.Set("X-Auth-Request-Redirect", req.URL.RequestURI()) - req.Header.Set("X-Auth-Request-User", userIdentifier) - if idToken != "" { - req.Header.Set("X-Auth-Request-Token", idToken) + req.Header.Set("X-Auth-Request-User", p.Identifier) + if p.IDToken != "" { + req.Header.Set("X-Auth-Request-Token", p.IDToken) } } if len(t.headerTemplates) > 0 { - if idClaimsErr != nil { - t.logger.Errorf("Failed to extract claims from ID Token for template headers: %v", idClaimsErr) - } else { - // idClaims may be nil when no ID token is present; templates - // referencing .Claims.* will simply produce empty values, which - // matches the prior behavior. 
- templateData := map[string]interface{}{ - "AccessToken": session.GetAccessToken(), - "IDToken": idToken, - "RefreshToken": session.GetRefreshToken(), - "Claims": idClaims, - } - - for headerName, tmpl := range t.headerTemplates { - var buf bytes.Buffer - if err := tmpl.Execute(&buf, templateData); err != nil { - t.logger.Errorf("Failed to execute template for header %s: %v", headerName, err) - continue - } - headerValue := buf.String() - req.Header.Set(headerName, headerValue) - t.logger.Debugf("Set templated header %s = %s", headerName, headerValue) - } - // NOTE: templates only mutate request headers (not session state), - // so we deliberately do NOT MarkDirty / Save here. Previously every - // authenticated request with header templates re-encrypted and - // rewrote all session cookies, which was a measurable CPU and - // Set-Cookie tax on dashboards that poll many panels per second. + // p.Claims may be nil (e.g. session without an ID token). Templates + // referencing .Claims.* will simply produce empty values — matches + // the prior behavior. Bearer-source principals always carry access- + // token claims (post-verifyToken). 
+ templateData := map[string]interface{}{ + "AccessToken": p.AccessToken, + "IDToken": p.IDToken, + "RefreshToken": p.RefreshToken, + "Claims": p.Claims, } - } - if session.IsDirty() { - if err := session.Save(req, rw); err != nil { - t.logger.Errorf("Failed to save session after processing headers: %v", err) + for headerName, tmpl := range t.headerTemplates { + var buf bytes.Buffer + if err := tmpl.Execute(&buf, templateData); err != nil { + t.logger.Errorf("Failed to execute template for header %s: %v", headerName, err) + continue + } + headerValue := buf.String() + req.Header.Set(headerName, headerValue) + t.logger.Debugf("Set templated header %s = %s", headerName, headerValue) } - } else { - t.logger.Debug("Session not dirty, skipping save in processAuthorizedRequest") } // Apply security headers if configured @@ -573,7 +647,7 @@ func (t *TraefikOidc) processAuthorizedRequest(rw http.ResponseWriter, req *http // Strip OIDC session cookies before forwarding to the backend to prevent // HTTP 431 "Request Header Fields Too Large" errors (GitHub issue #122). - if t.stripAuthCookies { + if t.stripAuthCookies && t.sessionManager != nil { prefix := t.sessionManager.GetCookiePrefix() filtered := make([]*http.Cookie, 0, len(req.Cookies())) for _, c := range req.Cookies() { @@ -587,7 +661,14 @@ func (t *TraefikOidc) processAuthorizedRequest(rw http.ResponseWriter, req *http } } - t.logger.Debugf("Request authorized for user %s, forwarding to next handler", userIdentifier) + // Bearer source: strip the Authorization header to keep the raw token + // out of downstream service logs. Off-by-config for operators who chain + // services that each re-verify the bearer. 
+ if p.Source == sourceBearer && t.stripAuthorizationHeader { + req.Header.Del("Authorization") + } + + t.logger.Debugf("Request authorized for user %s (source=%d), forwarding to next handler", p.Identifier, p.Source) t.next.ServeHTTP(rw, req) } diff --git a/principal.go b/principal.go new file mode 100644 index 0000000..ea42c8a --- /dev/null +++ b/principal.go @@ -0,0 +1,58 @@ +// Package traefikoidc — principal abstraction for the shared post-auth +// pipeline. A principal carries the resolved identity + tokens + claims +// produced by EITHER the cookie session path or the bearer-token path, so +// downstream header injection / roles checks / forwarding can be implemented +// once and reused. +package traefikoidc + +// principalSource indicates which auth path produced a principal. Used by +// forwardAuthorized to decide source-specific behavior (e.g. only strip the +// Authorization header for bearer-source principals). +type principalSource int + +const ( + sourceSession principalSource = iota + sourceBearer +) + +// principal is the immutable post-auth value passed to forwardAuthorized. +// No methods mutate it; no manager pointer; no I/O. Pure data. +type principal struct { + Claims map[string]interface{} + Identifier string + Subject string + ClientID string + AccessToken string + IDToken string + RefreshToken string + Source principalSource +} + +// buildPrincipalFromSession adapts an authenticated SessionData into a +// principal value WITHOUT writing back to the session. This is the only +// function that still knows about SessionData; the rest of the pipeline is +// session-agnostic. Returns nil when the session has no usable identity. 
+func (t *TraefikOidc) buildPrincipalFromSession(session *SessionData) *principal { + if session == nil { + return nil + } + identifier := session.GetUserIdentifier() + if identifier == "" { + return nil + } + + var claims map[string]interface{} + if idToken := session.GetIDToken(); idToken != "" && t.extractClaimsFunc != nil { + // Best-effort: cached on the session, never blocking. + claims, _ = session.GetIDTokenClaims(t.extractClaimsFunc) // Safe to ignore: claims-error path handled by header-template branch + } + + return &principal{ + Source: sourceSession, + Identifier: identifier, + AccessToken: session.GetAccessToken(), + IDToken: session.GetIDToken(), + RefreshToken: session.GetRefreshToken(), + Claims: claims, + } +} diff --git a/settings.go b/settings.go index 516e4db..c4f34b7 100644 --- a/settings.go +++ b/settings.go @@ -63,23 +63,23 @@ type Config struct { // IdPs do not expose RT TTL on the wire, so this is intentionally a // conservative heuristic; tune to match your provider configuration. // Default 21600 (6h). Set to 0 to disable the check. 
- MaxRefreshTokenAgeSeconds int `json:"maxRefreshTokenAgeSeconds"` - SessionMaxAge int `json:"sessionMaxAge"` - RateLimit int `json:"rateLimit"` - OverrideScopes bool `json:"overrideScopes"` - DisableReplayDetection bool `json:"disableReplayDetection,omitempty"` - RequireTokenIntrospection bool `json:"requireTokenIntrospection,omitempty"` - AllowOpaqueTokens bool `json:"allowOpaqueTokens,omitempty"` - StrictAudienceValidation bool `json:"strictAudienceValidation,omitempty"` - EnablePKCE bool `json:"enablePKCE"` - ForceHTTPS bool `json:"forceHTTPS"` - AllowPrivateIPAddresses bool `json:"allowPrivateIPAddresses,omitempty"` - MinimalHeaders bool `json:"minimalHeaders,omitempty"` - StripAuthCookies bool `json:"stripAuthCookies,omitempty"` - EnableBackchannelLogout bool `json:"enableBackchannelLogout,omitempty"` - EnableFrontchannelLogout bool `json:"enableFrontchannelLogout,omitempty"` - BackchannelLogoutURL string `json:"backchannelLogoutURL,omitempty"` - FrontchannelLogoutURL string `json:"frontchannelLogoutURL,omitempty"` + MaxRefreshTokenAgeSeconds int `json:"maxRefreshTokenAgeSeconds"` + SessionMaxAge int `json:"sessionMaxAge"` + RateLimit int `json:"rateLimit"` + OverrideScopes bool `json:"overrideScopes"` + DisableReplayDetection bool `json:"disableReplayDetection,omitempty"` + RequireTokenIntrospection bool `json:"requireTokenIntrospection,omitempty"` + AllowOpaqueTokens bool `json:"allowOpaqueTokens,omitempty"` + StrictAudienceValidation bool `json:"strictAudienceValidation,omitempty"` + EnablePKCE bool `json:"enablePKCE"` + ForceHTTPS bool `json:"forceHTTPS"` + AllowPrivateIPAddresses bool `json:"allowPrivateIPAddresses,omitempty"` + MinimalHeaders bool `json:"minimalHeaders,omitempty"` + StripAuthCookies bool `json:"stripAuthCookies,omitempty"` + EnableBackchannelLogout bool `json:"enableBackchannelLogout,omitempty"` + EnableFrontchannelLogout bool `json:"enableFrontchannelLogout,omitempty"` + BackchannelLogoutURL string 
`json:"backchannelLogoutURL,omitempty"` + FrontchannelLogoutURL string `json:"frontchannelLogoutURL,omitempty"` // CACertPath is an optional filesystem path to a PEM-encoded CA bundle used // to verify the OIDC provider's TLS certificate. Use this when the provider // is signed by an internal/private CA that is not in the system trust store. @@ -125,6 +125,52 @@ type Config struct { // ClientAssertionAlg is the JWS signing algorithm. Defaults to RS256. // Supported: RS256/384/512, PS256/384/512, ES256/384/512. ClientAssertionAlg string `json:"clientAssertionAlg,omitempty"` + + // --- Bearer-token auth (opt-in M2M path) --- + + // EnableBearerAuth turns on the Authorization: Bearer auth path. + // Default false. When true, Audience MUST be set or startup fails. The + // bearer path is M2M-only: it accepts validated access-token JWTs, rejects + // ID tokens, and forwards principal headers downstream without creating a + // cookie session. See docs/BEARER_AUTH.md for the threat model. + EnableBearerAuth bool `json:"enableBearerAuth,omitempty"` + // BearerIdentifierClaim names the JWT claim used as the principal identifier + // on the bearer-token auth path. Default "sub". Decoupled from + // UserIdentifierClaim (which defaults to "email" and drives the cookie path) + // so M2M bearer flow never accidentally relies on an unverified email. + BearerIdentifierClaim string `json:"bearerIdentifierClaim,omitempty"` + // StripAuthorizationHeader removes the Authorization header from the + // forwarded request after successful bearer auth, so downstream services + // never see the raw token. Default true. Disable only when a downstream + // explicitly needs to re-validate the bearer. + StripAuthorizationHeader bool `json:"stripAuthorizationHeader,omitempty"` + // BearerEmitWWWAuthenticate controls whether 401 responses on the bearer + // path include a WWW-Authenticate: Bearer error="invalid_token" hint per + // RFC 6750 §3. Default true. 
Disable to reduce reconnaissance signal. + BearerEmitWWWAuthenticate bool `json:"bearerEmitWWWAuthenticate,omitempty"` + // BearerOverridesCookie controls precedence when both Authorization: + // Bearer and a session cookie are present. Default false: cookie wins + // (safer against browser/extension/proxy bearer injection). Set true for + // the bearer-wins convention used by AWS/GCP/Kubernetes API gateways. + BearerOverridesCookie bool `json:"bearerOverridesCookie,omitempty"` + // MaxTokenAgeSeconds caps how old (iat-based) a bearer token may be. + // Default 86400 (24h). Bounds clock-manipulation tokens with implausibly + // distant iat values. + MaxTokenAgeSeconds int64 `json:"maxTokenAgeSeconds,omitempty"` + // MaxIdentifierLength bounds the post-sanitisation length of the bearer + // principal identifier (the value injected as X-Forwarded-User). Default + // 256. + MaxIdentifierLength int `json:"maxIdentifierLength,omitempty"` + // BearerFailureThreshold is the number of consecutive 401s from one + // source IP within BearerFailureWindowSeconds that trips the throttle. + // Default 20. + BearerFailureThreshold int `json:"bearerFailureThreshold,omitempty"` + // BearerFailureWindowSeconds is the rolling window (seconds) over which + // 401s are counted for throttling. Default 60. + BearerFailureWindowSeconds int `json:"bearerFailureWindowSeconds,omitempty"` + // BearerFailurePenaltySeconds is how long an IP is parked in the 429 + // penalty box after BearerFailureThreshold is exceeded. Default 60. + BearerFailurePenaltySeconds int `json:"bearerFailurePenaltySeconds,omitempty"` } // loadCACertPool assembles an x509.CertPool from CACertPath and CACertPEM. @@ -291,6 +337,19 @@ func CreateConfig() *Config { MaxRefreshTokenAgeSeconds: 21600, // 6h - conservative heuristic, see field doc SecurityHeaders: createDefaultSecurityConfig(), Redis: nil, // Redis is disabled by default, configure via Traefik or env vars + + // Bearer-auth defaults. 
EnableBearerAuth=false leaves the feature + // dormant; the rest are values that apply only when bearer is enabled. + EnableBearerAuth: false, + BearerIdentifierClaim: "sub", + StripAuthorizationHeader: true, + BearerEmitWWWAuthenticate: true, + BearerOverridesCookie: false, + MaxTokenAgeSeconds: 86400, + MaxIdentifierLength: 256, + BearerFailureThreshold: 20, + BearerFailureWindowSeconds: 60, + BearerFailurePenaltySeconds: 60, } return c diff --git a/token_manager.go b/token_manager.go index 3bb7172..b4a7596 100644 --- a/token_manager.go +++ b/token_manager.go @@ -29,6 +29,29 @@ import ( // //nolint:gocognit,gocyclo // Complex token verification logic requires multiple security checks func (t *TraefikOidc) VerifyToken(token string) error { + return t.verifyTokenWithOpts(token, verifyOpts{}) +} + +// verifyOpts are internal-only knobs for verifyTokenWithOpts. Kept unexported +// because they expose subtle replay-protection semantics that are dangerous +// to misuse. +type verifyOpts struct { + // skipReplayMarking suppresses the JTI -> blacklist Set near the bottom + // of verifyTokenWithOpts. The Get at the top remains active, so revoked + // tokens (added to the blacklist by RevokeToken) are still rejected. + // Used exclusively by the bearer-auth path, where bearer tokens are + // designed to be reused until exp. + skipReplayMarking bool +} + +// verifyTokenWithOpts runs the full token verification pipeline used by both +// the cookie path and the bearer path. The cookie path uses the zero-value +// opts; the bearer path sets skipReplayMarking=true. See the security spec +// (docs/superpowers/specs/2026-05-18-bearer-token-auth-design.md §7.7) for +// the exact contract: skipReplayMarking gates ONLY the JTI Set, never the Get. 
+// +//nolint:gocognit,gocyclo // Complex token verification logic requires multiple security checks +func (t *TraefikOidc) verifyTokenWithOpts(token string, opts verifyOpts) error { if token == "" { return fmt.Errorf("invalid JWT format: token is empty") } @@ -76,7 +99,9 @@ func (t *TraefikOidc) VerifyToken(token string) error { } // Only check JTI blacklist for tokens that aren't already in the cache - // This is for FIRST-TIME validation to detect replay attacks + // This is for FIRST-TIME validation to detect replay attacks. The + // blacklist Get is ALWAYS active on the bearer path too — only the + // Set below is gated by opts.skipReplayMarking. if jti, ok := parsedJWT.Claims["jti"].(string); ok && jti != "" { // Skip JTI blacklist check if replay detection is disabled if !t.disableReplayDetection { @@ -105,8 +130,12 @@ func (t *TraefikOidc) VerifyToken(token string) error { t.cacheVerifiedToken(token, jwt.Claims) - if jti, ok := jwt.Claims["jti"].(string); ok && jti != "" && !t.disableReplayDetection { - // Only add to blacklist if replay detection is enabled + // Replay marking: add JTI to blacklist so subsequent presentations of + // the SAME token can short-circuit via cache. Bearer path suppresses + // this Set (opts.skipReplayMarking=true) because bearer tokens are + // designed for reuse until exp; the cache-evict-then-replay scenario + // would otherwise trigger false replay detection. + if jti, ok := jwt.Claims["jti"].(string); ok && jti != "" && !t.disableReplayDetection && !opts.skipReplayMarking { expiry := time.Now().Add(defaultBlacklistDuration) if expClaim, expOk := jwt.Claims["exp"].(float64); expOk { expTime := time.Unix(int64(expClaim), 0) diff --git a/types.go b/types.go index 08c00ae..06323df 100644 --- a/types.go +++ b/types.go @@ -149,4 +149,17 @@ type TraefikOidc struct { enablePKCE bool forceHTTPS bool suppressDiagnosticLogs bool + + // Bearer-auth runtime state (populated only when EnableBearerAuth=true). 
+ bearerIdentifierClaim string + bearerFailureTracker *bearerFailureTracker + maxTokenAge time.Duration + maxIdentifierLength int + bearerFailureThreshold int + bearerFailureWindow time.Duration + bearerFailurePenalty time.Duration + enableBearerAuth bool + stripAuthorizationHeader bool + bearerEmitWWWAuthenticate bool + bearerOverridesCookie bool }