mirror of
https://github.com/lukaszraczylo/traefikoidc.git
synced 2026-06-05 22:44:17 +00:00
1b49e133da
245 lines
7.4 KiB
Go
package traefikoidc

import (
	"context"
	"fmt"
	"time"
)

// TokenResilienceConfig centralizes resilience configuration for token operations
type TokenResilienceConfig struct {
	// Circuit breaker configuration for token operations
	CircuitBreakerEnabled bool
	CircuitBreakerConfig  CircuitBreakerConfig

	// Retry configuration for token operations
	RetryEnabled bool
	RetryConfig  RetryConfig

	// Metadata cache progressive grace period configuration
	MetadataCacheConfig MetadataCacheResilienceConfig
}

// MetadataCacheResilienceConfig defines resilience settings for metadata cache
type MetadataCacheResilienceConfig struct {
	// EnableProgressiveGracePeriod allows extending cache TTL on failures
	EnableProgressiveGracePeriod bool

	// InitialGracePeriod is the first extension when service is unavailable (5 minutes)
	InitialGracePeriod time.Duration

	// ExtendedGracePeriod is the second extension for continued failures (15 minutes)
	ExtendedGracePeriod time.Duration

	// MaxGracePeriod is the maximum extension allowed (30 minutes for normal, 15 for security-critical)
	MaxGracePeriod time.Duration

	// SecurityCriticalMaxGracePeriod enforces Allan's security limit for critical metadata
	SecurityCriticalMaxGracePeriod time.Duration

	// SecurityCriticalFields defines which metadata fields are security-critical
	SecurityCriticalFields []string
}

// DefaultTokenResilienceConfig returns the default resilience configuration for token operations
func DefaultTokenResilienceConfig() TokenResilienceConfig {
	return TokenResilienceConfig{
		CircuitBreakerEnabled: true,
		CircuitBreakerConfig: CircuitBreakerConfig{
			MaxFailures:  3,
			Timeout:      30 * time.Second,
			ResetTimeout: 15 * time.Second,
		},
		RetryEnabled: true,
		RetryConfig: RetryConfig{
			MaxAttempts:   3,
			InitialDelay:  250 * time.Millisecond,
			MaxDelay:      2 * time.Second,
			BackoffFactor: 2.0,
			EnableJitter:  true,
			RetryableErrors: []string{
				"connection refused",
				"timeout",
				"temporary failure",
				"network unreachable",
				"connection reset",
				"no route to host",
			},
		},
		MetadataCacheConfig: DefaultMetadataCacheResilienceConfig(),
	}
}

// DefaultMetadataCacheResilienceConfig returns the default metadata cache resilience configuration
func DefaultMetadataCacheResilienceConfig() MetadataCacheResilienceConfig {
	return MetadataCacheResilienceConfig{
		EnableProgressiveGracePeriod:   true,
		InitialGracePeriod:             5 * time.Minute,
		ExtendedGracePeriod:            15 * time.Minute,
		MaxGracePeriod:                 30 * time.Minute,
		SecurityCriticalMaxGracePeriod: 15 * time.Minute, // Allan's security limit
		SecurityCriticalFields: []string{
			"jwks_uri",
			"authorization_endpoint",
			"token_endpoint",
			"revocation_endpoint",
			"end_session_endpoint",
		},
	}
}

// TokenResilienceManager coordinates resilience mechanisms for token operations
type TokenResilienceManager struct {
	config               TokenResilienceConfig
	errorRecoveryManager *ErrorRecoveryManager
	circuitBreaker       *CircuitBreaker
	retryExecutor        *RetryExecutor
	logger               *Logger
}

// NewTokenResilienceManager creates a new token resilience manager
func NewTokenResilienceManager(config TokenResilienceConfig, logger *Logger) *TokenResilienceManager {
	manager := &TokenResilienceManager{
		config: config,
		logger: logger,
	}

	// Initialize error recovery manager
	manager.errorRecoveryManager = NewErrorRecoveryManager(logger)

	// Initialize circuit breaker if enabled
	if config.CircuitBreakerEnabled {
		manager.circuitBreaker = NewCircuitBreaker(config.CircuitBreakerConfig, logger)
	}

	// Initialize retry executor if enabled
	if config.RetryEnabled {
		manager.retryExecutor = NewRetryExecutor(config.RetryConfig, logger)
	}

	return manager
}

// ExecuteTokenOperation executes a token operation with full resilience support
func (trm *TokenResilienceManager) ExecuteTokenOperation(ctx context.Context, operation string, fn func() error) error {
	if trm.logger != nil {
		trm.logger.Debugf("Executing token operation %s with resilience", operation)
	}

	// If no resilience mechanisms are enabled, execute directly
	if !trm.config.CircuitBreakerEnabled && !trm.config.RetryEnabled {
		return fn()
	}

	// Compose resilience mechanisms
	finalOperation := fn

	// Wrap with circuit breaker if enabled
	if trm.config.CircuitBreakerEnabled && trm.circuitBreaker != nil {
		originalOp := finalOperation
		finalOperation = func() error {
			return trm.circuitBreaker.ExecuteWithContext(ctx, originalOp)
		}
	}

	// Wrap with retry if enabled
	if trm.config.RetryEnabled && trm.retryExecutor != nil {
		originalOp := finalOperation
		finalOperation = func() error {
			return trm.retryExecutor.ExecuteWithContext(ctx, originalOp)
		}
	}

	err := finalOperation()

	if err != nil && trm.logger != nil {
		trm.logger.Errorf("Token operation %s failed after resilience mechanisms: %v", operation, err)
	} else if trm.logger != nil {
		trm.logger.Debugf("Token operation %s completed successfully", operation)
	}

	return err
}

// ExecuteTokenExchange executes token exchange with resilience
func (trm *TokenResilienceManager) ExecuteTokenExchange(ctx context.Context, t *TraefikOidc, grantType, codeOrToken, redirectURL, codeVerifier string) (*TokenResponse, error) {
	var result *TokenResponse
	var err error

	operation := fmt.Sprintf("token_exchange_%s", grantType)

	err = trm.ExecuteTokenOperation(ctx, operation, func() error {
		result, err = t.exchangeTokens(ctx, grantType, codeOrToken, redirectURL, codeVerifier)
		return err
	})

	return result, err
}

// ExecuteTokenRefresh executes token refresh with resilience
func (trm *TokenResilienceManager) ExecuteTokenRefresh(ctx context.Context, t *TraefikOidc, refreshToken string) (*TokenResponse, error) {
	var result *TokenResponse
	var err error

	err = trm.ExecuteTokenOperation(ctx, "token_refresh", func() error {
		result, err = t.getNewTokenWithRefreshToken(refreshToken)
		return err
	})

	return result, err
}

// GetMetrics returns metrics for all resilience mechanisms
func (trm *TokenResilienceManager) GetMetrics() map[string]interface{} {
	metrics := make(map[string]interface{})

	if trm.circuitBreaker != nil {
		metrics["circuit_breaker"] = trm.circuitBreaker.GetMetrics()
	}

	if trm.retryExecutor != nil {
		metrics["retry_executor"] = trm.retryExecutor.GetMetrics()
	}

	if trm.errorRecoveryManager != nil {
		recoveryMetrics := trm.errorRecoveryManager.GetRecoveryMetrics()
		metrics["error_recovery"] = recoveryMetrics
	}

	return metrics
}

// Reset resets all resilience mechanisms
func (trm *TokenResilienceManager) Reset() {
	if trm.circuitBreaker != nil {
		trm.circuitBreaker.Reset()
	}

	if trm.retryExecutor != nil {
		trm.retryExecutor.Reset()
	}

	if trm.logger != nil {
		trm.logger.Infof("Token resilience manager has been reset")
	}
}

// IsSecurityCriticalField checks if a metadata field is security-critical
func (config MetadataCacheResilienceConfig) IsSecurityCriticalField(fieldName string) bool {
	for _, criticalField := range config.SecurityCriticalFields {
		if fieldName == criticalField {
			return true
		}
	}
	return false
}

// GetEffectiveMaxGracePeriod returns the effective maximum grace period for a field
// considering Allan's security limits
func (config MetadataCacheResilienceConfig) GetEffectiveMaxGracePeriod(fieldName string) time.Duration {
	if config.IsSecurityCriticalField(fieldName) {
		return config.SecurityCriticalMaxGracePeriod
	}
	return config.MaxGracePeriod
}