mirror of
https://github.com/lukaszraczylo/traefikoidc.git
synced 2026-06-05 22:44:17 +00:00
1b49e133da
259 lines
7.6 KiB
Go
// Package recovery provides error recovery and resilience mechanisms
package recovery

import (
	"context"
	"sync"
	"sync/atomic"
	"time"
)

// ErrorRecoveryMechanism defines the interface for error recovery strategies.
// It provides a common contract for implementing various resilience patterns
// (circuit breaker, retry, graceful degradation) to handle transient failures
// and protect downstream services from cascading failures.
type ErrorRecoveryMechanism interface {
	// ExecuteWithContext executes a function with error recovery mechanisms
	ExecuteWithContext(ctx context.Context, fn func() error) error
	// GetMetrics returns metrics about the recovery mechanism's performance
	GetMetrics() map[string]interface{}
	// Reset resets the mechanism to its initial state
	Reset()
	// IsAvailable returns whether the mechanism is available for requests
	IsAvailable() bool
}

// Logger interface for dependency injection
type Logger interface {
	Infof(format string, args ...interface{})
	Errorf(format string, args ...interface{})
	Debugf(format string, args ...interface{})
}

// BaseRecoveryMechanism provides common functionality and metrics tracking
// for all error recovery mechanisms. It handles request/failure/success counting,
// timing information, and logging capabilities for derived recovery mechanisms.
type BaseRecoveryMechanism struct {
	// startTime tracks when the mechanism was created
	startTime time.Time
	// lastFailureTime records the most recent failure timestamp
	lastFailureTime time.Time
	// lastSuccessTime records the most recent success timestamp
	lastSuccessTime time.Time
	// logger for debugging and monitoring
	logger Logger
	// name identifies this recovery mechanism instance
	name string
	// totalRequests counts all requests processed
	totalRequests int64
	// totalFailures counts failed requests
	totalFailures int64
	// totalSuccesses counts successful requests
	totalSuccesses int64
	// mutex protects shared state access
	mutex sync.RWMutex
}

// NewBaseRecoveryMechanism creates a new base recovery mechanism with the given name and logger.
// This serves as the foundation for specific recovery mechanism implementations.
func NewBaseRecoveryMechanism(name string, logger Logger) *BaseRecoveryMechanism {
	if logger == nil {
		logger = NewNoOpLogger()
	}

	return &BaseRecoveryMechanism{
		name:      name,
		logger:    logger,
		startTime: time.Now(),
	}
}

// RecordRequest increments the total request counter.
// This method is thread-safe using atomic operations.
func (b *BaseRecoveryMechanism) RecordRequest() {
	atomic.AddInt64(&b.totalRequests, 1)
}

// RecordSuccess increments the success counter and updates the last success timestamp.
// This method is thread-safe using atomic operations for counters
// and mutex protection for timestamp updates.
func (b *BaseRecoveryMechanism) RecordSuccess() {
	atomic.AddInt64(&b.totalSuccesses, 1)

	b.mutex.Lock()
	defer b.mutex.Unlock()
	b.lastSuccessTime = time.Now()
}

// RecordFailure increments the failure counter and updates the last failure timestamp.
// This method is thread-safe using atomic operations for counters
// and mutex protection for timestamp updates.
func (b *BaseRecoveryMechanism) RecordFailure() {
	atomic.AddInt64(&b.totalFailures, 1)

	b.mutex.Lock()
	defer b.mutex.Unlock()
	b.lastFailureTime = time.Now()
}
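The counter-plus-timestamp pattern used by RecordSuccess and RecordFailure can be tried in isolation. The following is a minimal sketch, not code from this package: the `stats` type, its methods, and `recordConcurrently` are hypothetical names introduced for illustration. The only idea carried over is that the counter is updated with `sync/atomic` while the timestamp is guarded by a `sync.RWMutex`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// stats is a hypothetical stand-in for the counters kept by BaseRecoveryMechanism.
type stats struct {
	// failures comes first so it stays 64-bit aligned on 32-bit platforms,
	// which sync/atomic requires for int64 operations.
	failures    int64
	mu          sync.RWMutex
	lastFailure time.Time
}

// recordFailure bumps the counter atomically and stamps the time under the lock.
func (s *stats) recordFailure() {
	atomic.AddInt64(&s.failures, 1)

	s.mu.Lock()
	defer s.mu.Unlock()
	s.lastFailure = time.Now()
}

// snapshot reads the counter atomically and the timestamp under the read lock.
func (s *stats) snapshot() (int64, time.Time) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return atomic.LoadInt64(&s.failures), s.lastFailure
}

// recordConcurrently calls recordFailure from n goroutines and returns the final count.
func recordConcurrently(n int) int64 {
	var s stats
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.recordFailure()
		}()
	}
	wg.Wait()
	count, _ := s.snapshot()
	return count
}

func main() {
	fmt.Println(recordConcurrently(100)) // prints 100
}
```

Because the counter and the timestamp are updated in two separate steps, a reader may briefly observe a new count alongside an older timestamp. For monitoring metrics that is usually acceptable.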

// GetBaseMetrics returns basic metrics collected by the base recovery mechanism.
// This includes request counts, success/failure rates, and timing information.
func (b *BaseRecoveryMechanism) GetBaseMetrics() map[string]interface{} {
	b.mutex.RLock()
	defer b.mutex.RUnlock()

	totalReqs := atomic.LoadInt64(&b.totalRequests)
	totalSucc := atomic.LoadInt64(&b.totalSuccesses)
	totalFail := atomic.LoadInt64(&b.totalFailures)

	metrics := map[string]interface{}{
		"name":            b.name,
		"total_requests":  totalReqs,
		"total_successes": totalSucc,
		"total_failures":  totalFail,
		"start_time":      b.startTime,
	}

	if totalReqs > 0 {
		metrics["success_rate"] = float64(totalSucc) / float64(totalReqs)
		metrics["failure_rate"] = float64(totalFail) / float64(totalReqs)
	}

	if !b.lastSuccessTime.IsZero() {
		metrics["last_success_time"] = b.lastSuccessTime
		metrics["time_since_last_success"] = time.Since(b.lastSuccessTime)
	}

	if !b.lastFailureTime.IsZero() {
		metrics["last_failure_time"] = b.lastFailureTime
		metrics["time_since_last_failure"] = time.Since(b.lastFailureTime)
	}

	metrics["uptime"] = time.Since(b.startTime)

	return metrics
}

// LogInfo logs an info message if a logger is available
func (b *BaseRecoveryMechanism) LogInfo(format string, args ...interface{}) {
	if b.logger != nil {
		b.logger.Infof(format, args...)
	}
}

// LogError logs an error message if a logger is available
func (b *BaseRecoveryMechanism) LogError(format string, args ...interface{}) {
	if b.logger != nil {
		b.logger.Errorf(format, args...)
	}
}

// LogDebug logs a debug message if a logger is available
func (b *BaseRecoveryMechanism) LogDebug(format string, args ...interface{}) {
	if b.logger != nil {
		b.logger.Debugf(format, args...)
	}
}

// ErrorHandler provides centralized error handling and recovery coordination
type ErrorHandler struct {
	mechanisms []ErrorRecoveryMechanism
	logger     Logger
	mutex      sync.RWMutex
}

// NewErrorHandler creates a new error handler with the given mechanisms
func NewErrorHandler(logger Logger, mechanisms ...ErrorRecoveryMechanism) *ErrorHandler {
	return &ErrorHandler{
		mechanisms: mechanisms,
		logger:     logger,
	}
}

// AddMechanism adds a recovery mechanism to the handler
func (eh *ErrorHandler) AddMechanism(mechanism ErrorRecoveryMechanism) {
	eh.mutex.Lock()
	defer eh.mutex.Unlock()
	eh.mechanisms = append(eh.mechanisms, mechanism)
}

// ExecuteWithRecovery executes a function with all configured recovery mechanisms
func (eh *ErrorHandler) ExecuteWithRecovery(ctx context.Context, fn func() error) error {
	eh.mutex.RLock()
	mechanisms := make([]ErrorRecoveryMechanism, len(eh.mechanisms))
	copy(mechanisms, eh.mechanisms)
	eh.mutex.RUnlock()

	// If no mechanisms are configured, execute directly
	if len(mechanisms) == 0 {
		return fn()
	}

	// Chain the mechanisms - each wraps the next
	wrappedFn := fn
	for i := len(mechanisms) - 1; i >= 0; i-- {
		mechanism := mechanisms[i]
		currentFn := wrappedFn
		wrappedFn = func() error {
			return mechanism.ExecuteWithContext(ctx, currentFn)
		}
	}

	return wrappedFn()
}
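The reverse loop in ExecuteWithRecovery means the first mechanism in the slice ends up outermost. The sketch below illustrates that ordering in a standalone program. It is not part of this package: `mechanism`, `tracer`, `chain`, and `runTrace` are hypothetical names, and `mechanism` repeats only the `ExecuteWithContext` method of ErrorRecoveryMechanism.

```go
package main

import (
	"context"
	"fmt"
)

// mechanism mirrors only the ExecuteWithContext method of ErrorRecoveryMechanism.
type mechanism interface {
	ExecuteWithContext(ctx context.Context, fn func() error) error
}

// tracer is a hypothetical mechanism that records when it is entered and left.
type tracer struct {
	name  string
	trace *[]string
}

func (t tracer) ExecuteWithContext(ctx context.Context, fn func() error) error {
	*t.trace = append(*t.trace, "enter "+t.name)
	err := fn()
	*t.trace = append(*t.trace, "leave "+t.name)
	return err
}

// chain wraps fn from the last mechanism to the first, so mechanisms[0]
// becomes the outermost wrapper.
func chain(ctx context.Context, mechanisms []mechanism, fn func() error) func() error {
	wrapped := fn
	for i := len(mechanisms) - 1; i >= 0; i-- {
		m, next := mechanisms[i], wrapped
		wrapped = func() error { return m.ExecuteWithContext(ctx, next) }
	}
	return wrapped
}

// runTrace executes a two-mechanism chain and returns the recorded order.
func runTrace() []string {
	var trace []string
	ms := []mechanism{
		tracer{name: "first", trace: &trace},
		tracer{name: "second", trace: &trace},
	}
	fn := func() error {
		trace = append(trace, "call fn")
		return nil
	}
	_ = chain(context.Background(), ms, fn)()
	return trace
}

func main() {
	for _, step := range runTrace() {
		fmt.Println(step)
	}
}
```

The recorded order is enter first, enter second, call fn, leave second, leave first. A mechanism that should see every attempt made by the others (for example a circuit breaker around a retry) would therefore be registered first.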
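
// GetAllMetrics returns metrics from all configured mechanisms,
// keyed as "mechanism_0", "mechanism_1", and so on.
func (eh *ErrorHandler) GetAllMetrics() map[string]interface{} {
	eh.mutex.RLock()
	defer eh.mutex.RUnlock()

	allMetrics := make(map[string]interface{})
	for i, mechanism := range eh.mechanisms {
		// Build the decimal index by hand: string(rune(i)) would yield the
		// character with code point i (a control character for small i),
		// not the digits "0", "1", ...
		idx := ""
		for n := i; ; n /= 10 {
			idx = string(rune('0'+n%10)) + idx
			if n < 10 {
				break
			}
		}
		mechanismKey := "mechanism_" + idx
		allMetrics[mechanismKey] = mechanism.GetMetrics()
	}

	return allMetrics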
}

// ResetAll resets all configured mechanisms
func (eh *ErrorHandler) ResetAll() {
	eh.mutex.RLock()
	defer eh.mutex.RUnlock()

	for _, mechanism := range eh.mechanisms {
		mechanism.Reset()
	}
}

// IsHealthy returns true if all mechanisms are available
func (eh *ErrorHandler) IsHealthy() bool {
	eh.mutex.RLock()
	defer eh.mutex.RUnlock()

	for _, mechanism := range eh.mechanisms {
		if !mechanism.IsAvailable() {
			return false
		}
	}

	return true
}

// NoOpLogger provides a logger that does nothing
type NoOpLogger struct{}

// NewNoOpLogger creates a new no-op logger
func NewNoOpLogger() *NoOpLogger {
	return &NoOpLogger{}
}

// Infof does nothing
func (l *NoOpLogger) Infof(format string, args ...interface{}) {}

// Errorf does nothing
func (l *NoOpLogger) Errorf(format string, args ...interface{}) {}

// Debugf does nothing
func (l *NoOpLogger) Debugf(format string, args ...interface{}) {}
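
Any type with the three formatting methods satisfies the Logger interface, so callers are not limited to NoOpLogger. As a rough sketch, here is a hypothetical adapter over the standard library's `log.Logger`. The names `stdLogger` and `render` and the level prefixes are assumptions made for illustration and do not exist in this package. The interface is repeated locally so the example runs on its own.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// Logger repeats the three-method interface from the recovery package.
type Logger interface {
	Infof(format string, args ...interface{})
	Errorf(format string, args ...interface{})
	Debugf(format string, args ...interface{})
}

// stdLogger is a hypothetical adapter that forwards to a *log.Logger,
// prefixing each message with its level.
type stdLogger struct {
	out *log.Logger
}

func (s stdLogger) Infof(format string, args ...interface{}) {
	s.out.Printf("INFO "+format, args...)
}

func (s stdLogger) Errorf(format string, args ...interface{}) {
	s.out.Printf("ERROR "+format, args...)
}

func (s stdLogger) Debugf(format string, args ...interface{}) {
	s.out.Printf("DEBUG "+format, args...)
}

// Compile-time check that stdLogger satisfies Logger.
var _ Logger = stdLogger{}

// render logs two messages into a buffer and returns what was written.
func render() string {
	var buf bytes.Buffer
	var l Logger = stdLogger{out: log.New(&buf, "", 0)}
	l.Infof("mechanism %s ready", "retry")
	l.Errorf("attempt %d failed", 3)
	return buf.String()
}

func main() {
	fmt.Print(render())
}
```

In the real package, a value like this would be passed wherever a Logger is accepted, such as the `logger` argument of NewBaseRecoveryMechanism.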