Commit Graph

51 Commits

Author SHA1 Message Date
lukaszraczylo 4d28fa01ab perf(jwk,cache): cache parsed public keys + RLock token cache reads
Hot-path JWT verification rebuilt the public key on every call:
  jwk -> ToRSAPublicKey -> x509.MarshalPKIXPublicKey -> pem.Encode
  -> verifySignature -> pem.Decode -> x509.ParsePKIXPublicKey -> verify
Under yaegi this pinned a CPU when many concurrent dashboard panels
poll behind the middleware. The PEM round trip is pure waste.

* jwk.go: cache pre-parsed crypto.PublicKey per kid alongside the
  raw JWKSet (parallel cache entry, same 1h TTL, invalidates together).
* jwt.go: split verifySignatureWithKey from verifySignature; existing
  PEM-input entry point preserved for backchannel-logout callers.
* token_manager.go: VerifyJWTSignatureAndClaims now goes straight from
  jwks cache to verifySignatureWithKey, no PEM round trip and no
  per-request availableKids slice.
* universal_cache.go: token/JWK/session Get() takes RLock when the
  entry is unexpired, so concurrent token verifications no longer
  serialize on a single mutex. LRU semantics for general and metadata
  caches are unchanged (tests cover the strict-LRU contract there).
* mocks: MockJWKCache, EnhancedMockJWKCache, mockJWKCacheForLogout,
  staticJWKCache satisfy the extended interface.
2026-04-30 10:14:10 +01:00
lukaszraczylo 6efb78b7a8 Smarter approach to the cookies (#103)
* Smarter approach to the cookies

  - Single maxCookieSize = 1400 constant with clear documentation
  - Combined cookie storage for ~40-45% size reduction
  - Backward compatible migration from legacy cookies

* Tuneup the code.
2025-12-12 18:35:06 +00:00
lukaszraczylo d0b920c4f0 multiple realms fix (#102)
* Allow to use multiple realms

This change is a ressurection of PR #88 which can't be merged due to significant refactor of the codebase.

* Fix the autocleanup routine to handle multiple realms correctly, update tests.

* Metadata rediscovery when provider is unavailable for any reason during the start.

This one prevents the permanent 503 from the plugin when OIDC provider was for some reason unavailable during the start.
2025-12-10 13:07:22 +00:00
lukaszraczylo c474bbafd6 Cleanup [dec2025] (#101)
* Cleanup excessive comments.

* Remove leftovers hanging around from previous refactor

* Improve test coverage
2025-12-09 01:38:02 +00:00
lukaszraczylo 9126c74723 December 2025 Improvements - Azure AD, Internal Networks, Startup Race Condition (#100)
* Allow internal IPs for OIDC configuration via extra flag.

Addresses issue #97

* Allow for internal IPs in OIDC configuration.

Addresses issue #97.

* feat: Add allowPrivateIPAddresses config option for internal networks

Adds a new configuration option `allowPrivateIPAddresses` that allows
OIDC provider URLs to use private IP addresses (10.x.x.x, 172.16-31.x.x,
192.168.x.x). This is useful for internal deployments where Keycloak or
other OIDC providers run on private networks without DNS resolution.

Security considerations:
- Loopback addresses (127.0.0.1, localhost, ::1) remain blocked
- Link-local addresses (169.254.x.x) remain blocked
- Default is false (secure by default)

Fixes #97

* feat: Support non-email user identifiers for Azure AD

Add userIdentifierClaim configuration option to support Azure AD users
without email addresses. This allows using alternative JWT claims like
"sub", "oid", "upn", or "preferred_username" for user identification.

- Default behavior uses "email" claim (backward compatible)
- Falls back to "sub" claim if configured claim is missing
- allowedUsers matches against the configured claim value
- allowedUserDomains only applies when using email-based identification

Fixes #95

* Race condition on traefik pod startup

When the plugin initializes and calls GetMetadataWithRecovery():

1. Checks cache first (if metadata is cached, returns immediately)
2. Creates a retry executor with startup-optimized settings (10 attempts, 1s delays)
3. Attempts to fetch metadata from the OIDC provider
4. If the fetch fails with a retryable error (connection refused, EOF, TLS/certificate errors, Traefik default cert), it waits and retries
5. After 10 attempts or on a non-retryable error, returns the error

This allows the plugin to handle the race condition where:
- Traefik initializes the plugin before routes are established
- Traefik serves its default certificate before loading real ones
- The OIDC provider pod isn't fully ready yet

Fixes issue #90

* Race condition on traefik pod startup

When the plugin initializes and calls GetMetadataWithRecovery():

1. Checks cache first (if metadata is cached, returns immediately)
2. Creates a retry executor with startup-optimized settings (10 attempts, 1s delays)
3. Attempts to fetch metadata from the OIDC provider
4. If the fetch fails with a retryable error (connection refused, EOF, TLS/certificate errors, Traefik default cert), it waits and retries
5. After 10 attempts or on a non-retryable error, returns the error

This allows the plugin to handle the race condition where:
- Traefik initializes the plugin before routes are established
- Traefik serves its default certificate before loading real ones
- The OIDC provider pod isn't fully ready yet

Fixes issue #90

* Headers too big and 431 responses

Added new option `minimalHeaders` to reduce the size of forwarded headers from the auth middleware to backend services.

  - When minimalHeaders: false (default): All headers are forwarded as before
    - X-Forwarded-User (always set)
    - X-Auth-Request-Redirect
    - X-Auth-Request-User
    - X-Auth-Request-Token (the large ID token)
    - X-User-Groups, X-User-Roles (if configured)
  - When minimalHeaders: true: Reduces header overhead
    - X-Forwarded-User (always set)
    - X-User-Groups, X-User-Roles (still forwarded if configured)
    - Custom templated headers (still processed)
    - Skipped: X-Auth-Request-Token, X-Auth-Request-User, X-Auth-Request-Redirect

Fixes issues #64 and #86
2025-12-08 14:21:17 +00:00
lukaszraczylo e64fc7f730 Add redis support for distributed caching (#83)
* Add redis support for distributed caching

* Move towards the self-provided Redis connection pool and RESP protocol implementation.
Official redis client library won't work with yaegi.

* fixup! Move towards the self-provided Redis connection pool and RESP protocol implementation. Official redis client library won't work with yaegi.

* fixup! fixup! Move towards the self-provided Redis connection pool and RESP protocol implementation. Official redis client library won't work with yaegi.

* fixup! fixup! fixup! Move towards the self-provided Redis connection pool and RESP protocol implementation. Official redis client library won't work with yaegi.

* fixup! fixup! fixup! fixup! Move towards the self-provided Redis connection pool and RESP protocol implementation. Official redis client library won't work with yaegi.

* fixup! fixup! fixup! fixup! fixup! Move towards the self-provided Redis connection pool and RESP protocol implementation. Official redis client library won't work with yaegi.

* ... and another all nighter.

* fixup! ... and another all nighter.

* fixup! fixup! ... and another all nighter.

* fixup! fixup! fixup! ... and another all nighter.

* Resolve issue #85 by adding ability to set custom claims in JWT tokens

* Remove redundant validation in auth middleware ( issue #89 )

* Add ability to set cookie prefix for session cookies ( #87 )

* fixup! Add ability to set cookie prefix for session cookies ( #87 )

* Add ability to set cookie max age - issue #91

* Potential fix for code scanning alert no. 10: Size computation for allocation may overflow

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fixup! Merge main into 0.8.0-redis: resolve conflicts

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-11-30 02:18:46 +00:00
lukaszraczylo 5fcbd54955 Add sharded cache and prevention of CPU spikes / locks (#96)
* Add sharded cache and prevention of CPU spikes / locks

* Add dynamic client registration with oidc provider

* Fix race condition introduced during the sharded cache implementation.

* Add page for traefikoidc.
2025-11-30 01:41:12 +00:00
lukaszraczylo bde1db1c3b traefik plugin 0.7.7 (#73)
* Automatic discovery of the scopes.

Issue #61 raised very valid concerns about users configuring scopes that are not supported by the provider.
This change introduces automatic discovery of supported scopes by fetching the provider's discovery document and filtering out unsupported scopes.

Before:
User configures: scopes: ["openid", "profile", "email", "offline_access"]
Self-hosted GitLab: "The requested scope is invalid, unknown, or malformed"
Authentication:  FAILS

After:
User configures: scopes: ["openid", "profile", "email", "offline_access"]
Middleware checks discovery doc → offline_access not supported
Automatically filters to: ["openid", "profile", "email"]
Authentication:  SUCCEEDS

* Resolves issue #74 by enabling user to specify expected audience in the configuration.

* Fix flaky tests.
2025-10-08 11:44:00 +01:00
lukaszraczylo 1b49e133da Complete rebuild of the plugin
* Fix bug affecting Azure OIDC authentication ( and most likely others )

* Fixes issue #51

* Ensure that appended roles are unique. Update the documentation.

* Improvements targetting possible memory usage spikes.

* Additional fixes and cleanup

* Refactoring code to fix the issues identified by the users.

* Modernize run

* Fieldalignment

* Multiple changes to improve performance and reduce complexity.
- Optimise the errors and recovery.
- Deduplicate code in metadata cache.
- Remove unused performance monitoring code.
- Simplify session management and settings handling.

* Fix claims issue.

* Add ability to overwrite the default scopes in the settings file

* Well.. that escalated quickly.

Completely forgot that Traefik uses outdated Yaegi and requires compatibility with 1.20 ( pre-generic Go code ).

* Bugfix #51: Ensures that user provided scopes overrides work.

* fixup! Bugfix #51: Ensures that user provided scopes overrides work.

* fixup! fixup! Bugfix #51: Ensures that user provided scopes overrides work.

* Abstract the provider logic into a separate package.

* Additional micro fixes and cleanups.

* Simplify all the things.

* fixup! Simplify all the things.

* fixup! fixup! Simplify all the things.

* fixup! fixup! fixup! Simplify all the things.

* fixup! fixup! fixup! fixup! Simplify all the things.

* ...

* Cleanup tests.

* fixup! Cleanup tests.

* fixup! fixup! fixup! Cleanup tests.

* fixup! fixup! fixup! fixup! Cleanup tests.

* fixup! fixup! fixup! fixup! fixup! Cleanup tests.

* Issue #53: Fix CSRF token handling in reverse proxy

1.  HTTPS Detection Fixed (session.go:723)
- Now uses X-Forwarded-Proto header instead of r.URL.Scheme
- Properly detects HTTPS in reverse proxy environments
2.  SameSite Cookie Attribute Fixed
- Removed automatic SameSiteStrictMode for HTTPS (would break OAuth)
- Keeps SameSiteLaxMode to allow OAuth callbacks from external domains
- Only uses Strict for AJAX requests which don't involve OAuth redirects
3.  Cookie Domain Handling Fixed
- Now respects X-Forwarded-Host header for cookie domain
- Ensures cookies are set for the public domain, not internal proxy domain
4.  EnhanceSessionSecurity Properly Integrated
- Function is now actually called during session save
- Applies security enhancements without breaking OAuth flow

Why Issue #53 Failed Before:

1. Cookies were not marked Secure in HTTPS environments (browser wouldn't send them back)
2. If they had been Secure with SameSite=Strict, Azure callbacks would still fail
3. Cookie domain might have been wrong (internal vs public domain)

Why It Works Now:

1. Cookies are properly marked Secure for HTTPS
2. Uses SameSite=Lax to allow OAuth provider callbacks
3. Cookie domain uses public domain from X-Forwarded-Host
4. CSRF token persists through the entire OAuth flow

* Next set of enhancements together with memory usage improvements.

* Memory leak fixes and optimisations.

* CSRF and Cookie Domain fixes

* fixup! CSRF and Cookie Domain fixes

* Metadata cache leak fix + profiling

* fixup! Metadata cache leak fix + profiling

* Memory leaks hunting, part 1337.

* Further pursue of perfection.

* fixup! Further pursue of perfection.

* fixup! fixup! Further pursue of perfection.

* fixup! fixup! fixup! Further pursue of perfection.

* fixup! fixup! fixup! fixup! Further pursue of perfection.

* fixup! fixup! fixup! fixup! fixup! Further pursue of perfection.

* fixup! fixup! fixup! fixup! fixup! fixup! Further pursue of perfection.

* fixup! fixup! fixup! fixup! fixup! fixup! fixup! Further pursue of perfection.

* fixup! fixup! fixup! fixup! fixup! fixup! fixup! fixup! Further pursue of perfection.

* fixup! fixup! fixup! fixup! fixup! fixup! fixup! fixup! fixup! Further pursue of perfection.

* Clear race conditions

* fixup! Clear race conditions

* Weekend fun with memory leaks

* Splitting code into multiple files with reasonable testing coverage.

```
ok      github.com/lukaszraczylo/traefikoidc    117.017s        coverage: 72.6% of statements
ok      github.com/lukaszraczylo/traefikoidc/auth       0.505s  coverage: 87.1% of statements
ok      github.com/lukaszraczylo/traefikoidc/circuit_breaker    0.283s  coverage: 99.0% of statements
        github.com/lukaszraczylo/traefikoidc/config             coverage: 0.0% of statements
ok      github.com/lukaszraczylo/traefikoidc/handlers   0.349s  coverage: 98.2% of statements
ok      github.com/lukaszraczylo/traefikoidc/internal/providers (cached)        coverage: 94.3% of statements
ok      github.com/lukaszraczylo/traefikoidc/middleware 0.808s  coverage: 78.0% of statements
ok      github.com/lukaszraczylo/traefikoidc/recovery   0.653s  coverage: 100.0% of statements
ok      github.com/lukaszraczylo/traefikoidc/session/chunking   (cached)        coverage: 87.8% of statements
ok      github.com/lukaszraczylo/traefikoidc/session/core       (cached)        coverage: 85.6% of statements
ok      github.com/lukaszraczylo/traefikoidc/session/crypto     (cached)        coverage: 81.8% of statements
ok      github.com/lukaszraczylo/traefikoidc/session/storage    (cached)        coverage: 93.5% of statements
ok      github.com/lukaszraczylo/traefikoidc/session/validators (cached)        coverage: 98.8% of statements
````

* fixup! Splitting code into multiple files with reasonable testing coverage.

* fixup! fixup! Splitting code into multiple files with reasonable testing coverage.

* Weekend fun with further optimisations.

* fixup! Weekend fun with further optimisations.

* fixup! fixup! Weekend fun with further optimisations.

* fixup! fixup! fixup! Weekend fun with further optimisations.

* fixup! fixup! fixup! fixup! Weekend fun with further optimisations.

* fixup! fixup! fixup! fixup! fixup! Weekend fun with further optimisations.

* Pre-release cleanup.

* Enhance test coverage.

* fixup! Enhance test coverage.

* fixup! fixup! Enhance test coverage.

* fixup! fixup! fixup! Enhance test coverage.
2025-09-18 11:01:30 +01:00
lukaszraczylo efa0cd708b Fixes issue #50 2025-05-26 02:48:20 +01:00
lukaszraczylo 99881f5837 Multiple fixes
- Unbounded Replay Cache: Now bounded to 10,000 entries with automatic cleanup
- Session Pool Leaks: Proper object lifecycle prevents accumulation
- HTTP Client Leaks: Reusable clients eliminate connection overhead
- Goroutine Leaks: Tracked lifecycle with graceful shutdown
2025-05-23 10:55:57 +01:00
lukaszraczylo 82a640cc3b Large scale refactoring for the v0.6
Cryptographic:
RSA Algorithm Support: RS256, RS384, RS512 (PKCS1v15) + PS256, PS384, PS512 (PSS)
Elliptic Curve Support: ES256 (P-256), ES384 (P-384), ES512 (P-521)
Security-First Approach: Proper rejection of HS256/HS384/HS512 and "none" algorithms
Algorithm Confusion Protection: Prevents downgrade attacks
JWK Multi-Format Support: RSA and EC key handling with correct curve parameters
Signature Verification: Comprehensive support for all major JWT algorithms

Security:
Real-time threat detection with automatic IP blocking
Comprehensive input validation against 11+ attack vectors
Advanced authentication protection with session security
CSRF protection with token-based validation
Multi-algorithm JWT support with proper cryptographic implementation
OWASP Top 10 compliance with full coverage
Zero vulnerabilities across all categories
Thread-safe security monitoring with proper synchronization
Header injection protection with complete validation

Reliability:
Circuit breaker patterns for automatic failure recovery
Retry mechanisms with exponential backoff
Graceful degradation for service continuity
Resource protection with memory and connection limits
Zero panics with comprehensive error handling
Perfect race condition elimination
Robust error recovery with modern Go patterns

Performance:
High throughput: 108,312 operations/second
Low latency: P95 < 1ms, P99 < 5ms
Efficient caching: 95%+ hit ratio
Optimized resource usage with automatic cleanup
Perfect metrics collection with detailed monitoring
Thread-safe performance tracking
2025-05-23 01:52:08 +01:00
lukaszraczylo 248ca018e2 Add user email filtering logic. 2025-05-21 10:43:42 +01:00
lukaszraczylo da70e69ad1 Memleak fixes. 2025-05-09 19:05:24 +01:00
lukaszraczylo 83693d2893 General improvements and tests related fixes. 2025-05-07 02:03:58 +01:00
lukaszraczylo bd7eaf6dff Bugfix: Refresh token not obtained when access token is expired. 2025-04-05 18:28:12 +01:00
lukaszraczylo 1910cd6000 Update documentation to the higher standards. 2025-04-05 11:31:45 +01:00
lukaszraczylo 23e019092a Multiple improvements for April 2025
* Improve refresh token handling in the background.

Resolves issue when user opens the website, allows the access token to expire, but continues browsing.
The background requests are failing with CORS errors to OIDC provider.

* fixup! Improve refresh token handling in the background.

* Abstract the token blacklisting.
2025-04-04 18:42:41 +01:00
lukaszraczylo 4322407129 Add support for PKCE (#31)
* Add PKCE support.
* Add option to toggle PKCE checks feature.
* GoFMT
2025-03-18 01:09:14 +00:00
lukaszraczylo 7d204113ea Cleanup the codebase, DRY and abstract functions, increase the test coverage. 2025-02-25 12:53:52 +00:00
lukaszraczylo d194cd778a gofmt the updated files. 2025-02-19 11:56:31 +00:00
lukaszraczylo 803a1e5e21 Clean the caches properly to avoid memleak 2025-02-19 11:55:32 +00:00
lukaszraczylo 09daa1025c Follow multiple redirects during the OIDC flow. 2025-02-06 23:31:13 +00:00
lukaszraczylo c09e7a9228 Add additional test cases to cover it. 2025-02-06 21:50:35 +00:00
lukaszraczylo 751933ffa0 Multiple improvements.
* Add todo list.

* fixup! Add todo list.

* fixup! fixup! Add todo list.

* fixup! fixup! fixup! Add todo list.

* Improve the session handling and cache.

* Fix an issue where expired session can cause infinite redirect loop

* fixup! Fix an issue where expired session can cause infinite redirect loop

* Add semver setup for automatic releases.

* fixup! Add semver setup for automatic releases.

* fixup! fixup! Add semver setup for automatic releases.

* fixup! fixup! fixup! Add semver setup for automatic releases.
2025-02-01 12:16:50 +00:00
lukaszraczylo 025107fe3e Well, release it finally. 2025-01-21 19:31:51 +00:00
lukaszraczylo dfb9c0771e Fix session handling and the redirection to the original URL incl. get parameters 2025-01-21 17:49:54 +00:00
lukaszraczylo 081c32925a fixup! Security improvements have been implemented and verified across four main areas: 2025-01-14 11:47:49 +00:00
lukaszraczylo 17dea67229 Security improvements have been implemented and verified across four main areas:
JWT Token Security:
Protected against algorithm switching attacks by validating and whitelisting algorithms (RS256, RS384, RS512, PS256, PS384, PS512, ES256, ES384, ES512)
Added 2-minute clock skew tolerance for time-based validations
Added "not before" (nbf) claim validation with clock skew tolerance
Required JWT ID (jti) claim to prevent replay attacks
Added strict algorithm validation to prevent downgrade attacks
Session Management Security:
Implemented cryptographically secure random cookie names to prevent targeting
Added automatic session ID rotation after successful login to prevent session fixation
Enforced 24-hour absolute session timeout
Added strict encryption key length validation (minimum 32 bytes)
Added comprehensive session validation including timeout checks
Implemented session pooling for secure resource management
Added secure session cleanup on expiration
Configuration and URL Security:
Enforced HTTPS for all provider URLs and external endpoints
Added minimum rate limit (10 req/sec) to prevent DOS attacks
Added strict validation for excluded URLs:
Must start with "/"
No path traversal (..)
No wildcards (*)
Made ForceHTTPS true by default for secure cookies
Added validation for secure redirect URIs
Added validation for all OIDC endpoints (must be HTTPS)
Added secure defaults in configuration
Test Coverage:
Added comprehensive test cases verifying all security validations
Added test cases for HTTPS enforcement on all endpoints
Added test cases for minimum rate limits
Added test cases for secure session management
Added test cases for token validation with clock skew
Added test cases for secure configuration defaults
All security improvements have been verified through passing test cases, protecting against:

Session fixation attacks
Token replay attacks
Algorithm switching attacks
Path traversal attacks
Session hijacking
Timing attacks
DOS attacks
Man-in-the-middle attacks through enforced HTTPS
2025-01-14 11:33:48 +00:00
lukaszraczylo 147aa0b169 Fix the issue #16
Removed global metadata cache and sync.Once
Each middleware instance now handles its own metadata initialization
Added tests to verify multiple instances work correctly
The changes ensure that:

Each route gets its own properly initialized middleware instance
Metadata is fetched and set correctly for each instance
No shared state between instances that could cause conflicts
Each instance can handle requests independently
The added test verifies this by creating multiple middleware instances with different routes and confirming they all initialize and function correctly. The test specifically checks that:

Each instance initializes successfully
Each instance gets its own metadata configuration
Each instance can handle requests independently
Callback URLs are correctly set per route
2025-01-06 11:23:12 +00:00
lukaszraczylo 11bc6f3e31 Re-introduce user roles separation with additional tests. 2024-12-11 09:08:50 +00:00
lukaszraczylo 2b7af88ff9 Move session management into session manager. Split the cookies to avoid the 4k limit ( resolves issue: #15 ) 2024-12-10 10:19:35 +00:00
lukaszraczylo 01ee7c4dc8 Improve cookie setting. 2024-12-10 10:19:35 +00:00
lukaszraczylo a6fa4d8789 Downgrade gorilla sessions preventing the publishing by traefik hub temporarily. 2024-12-10 10:19:34 +00:00
lukaszraczylo 8ca669105b Fix OIDC logout issue, improve test coverage, load provider once. 2024-11-06 11:33:29 +00:00
lukaszraczylo 218165d365 Cleanup and optimise the code. 2024-11-06 11:33:03 +00:00
lukaszraczylo 6f2cf65d49 Fix the tests hanging on the open channel. 2024-11-06 11:32:36 +00:00
lukaszraczylo 2bb1debeb3 First step in improvement of caching mechanism. 2024-11-06 11:32:36 +00:00
lukaszraczylo 7a53da6080 Update tests and additional fixups. 2024-11-06 11:32:35 +00:00
lukaszraczylo 66e08755c1 Update the tests to handle nonce 2024-11-06 11:32:35 +00:00
lukaszraczylo 9ff6779caa Add support for different signing algorithms 2024-10-03 14:00:43 +01:00
lukaszraczylo a7d42de0a4 Invalidate user session with provider on logout 2024-10-03 14:00:43 +01:00
lukaszraczylo 2fbca0a88c Add allowed domains list. 2024-10-03 14:00:43 +01:00
lukaszraczylo 38433dfff8 Improve handling of expired sessions 2024-10-03 14:00:43 +01:00
lukaszraczylo e9e1fccf5e Fix up the excluded URLs configuration. 2024-10-03 14:00:42 +01:00
lukaszraczylo 77ead9b8a1 Add option to exclude URLs from the authentication. 2024-10-03 14:00:41 +01:00
lukaszraczylo 448392e9bd Update: Don't refresh token / issue cookie on every request. 2024-10-03 14:00:41 +01:00
lukaszraczylo a4a943ae9c Fix the issue when expired token did not forced reauthentication. 2024-10-03 14:00:41 +01:00
lukaszraczylo 5652ade5e2 Add basic benchmarks. 2024-10-03 14:00:40 +01:00
lukaszraczylo 13572003a5 Even more tests, crossed 50% coverage 2024-10-03 14:00:39 +01:00