Files
traefikoidc/docs/REDIS_CACHE.md
T
lukaszraczylo e64fc7f730 Add redis support for distributed caching (#83)
* Add redis support for distributed caching

* Move towards the self-provided Redis connection pool and RESP protocol implementation.
Official redis client library won't work with yaegi.

* fixup! Move towards the self-provided Redis connection pool and RESP protocol implementation. Official redis client library won't work with yaegi.

* fixup! fixup! Move towards the self-provided Redis connection pool and RESP protocol implementation. Official redis client library won't work with yaegi.

* fixup! fixup! fixup! Move towards the self-provided Redis connection pool and RESP protocol implementation. Official redis client library won't work with yaegi.

* fixup! fixup! fixup! fixup! Move towards the self-provided Redis connection pool and RESP protocol implementation. Official redis client library won't work with yaegi.

* fixup! fixup! fixup! fixup! fixup! Move towards the self-provided Redis connection pool and RESP protocol implementation. Official redis client library won't work with yaegi.

* ... and another all nighter.

* fixup! ... and another all nighter.

* fixup! fixup! ... and another all nighter.

* fixup! fixup! fixup! ... and another all nighter.

* Resolve issue #85 by adding ability to set custom claims in JWT tokens

* Remove redundant validation in auth middleware ( issue #89 )

* Add ability to set cookie prefix for session cookies ( #87 )

* fixup! Add ability to set cookie prefix for session cookies ( #87 )

* Add ability to set cookie max age - issue #91

* Potential fix for code scanning alert no. 10: Size computation for allocation may overflow

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fixup! Merge main into 0.8.0-redis: resolve conflicts

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-11-30 02:18:46 +00:00

28 KiB

Redis Cache for Traefik OIDC Plugin

Table of Contents

Overview

The Redis cache feature provides a distributed caching solution for the Traefik OIDC plugin, enabling seamless operation across multiple Traefik instances. It implements a pluggable backend architecture that supports memory-only, Redis-only, or hybrid caching strategies.

Key Features

  • Distributed JTI Replay Detection: Prevents token replay attacks across all instances
  • Shared Session Management: Consistent user sessions across replicas
  • Circuit Breaker: Automatic fallback to memory cache during Redis outages
  • Health Checking: Continuous monitoring of Redis connectivity
  • Flexible Cache Modes: Choose between memory, Redis, or hybrid caching
  • Zero-Downtime Migration: Seamlessly migrate from memory-only to Redis-backed cache
  • Yaegi Compatible: Pure-Go implementation works with both dynamic loading and pre-compiled deployments

Pure-Go Implementation

This plugin implements Redis support using a custom pure-Go RESP protocol client that is fully compatible with Traefik's Yaegi interpreter. Unlike other Redis clients that rely on the unsafe package, our implementation:

  • Works seamlessly with Yaegi's dynamic plugin loading
  • Provides full Redis functionality (GET, SET, DEL, TTL, etc.)
  • Includes connection pooling for performance
  • Supports both SETEX (seconds) and PSETEX (milliseconds) for precise TTL control
  • No external dependencies beyond the standard library

This means you get full Redis caching support whether you're using:

  • Traefik's dynamic plugin loading (Yaegi interpreter)
  • Pre-compiled Traefik builds with the plugin included

Why Use Redis Cache?

The Problem

When running multiple Traefik instances behind a load balancer, each instance maintains its own isolated in-memory cache. This isolation causes several issues:

  1. False Positive Replay Detection

    • User authenticates → Token stored in Instance A's JTI cache
    • Next request → Load balancer routes to Instance B
    • Instance B doesn't have the JTI → Falsely detects replay attack
    • Result: Authentication failures and user frustration
  2. Session Inconsistency

    • User session created on Instance A
    • Subsequent request routed to Instance B
    • Instance B has no knowledge of the session
    • Result: User forced to re-authenticate
  3. Token Metadata Fragmentation

    • Token refresh happens on Instance A
    • New tokens stored only in Instance A's cache
    • Other instances continue using old tokens
    • Result: Inconsistent authentication state

The Solution

Redis provides a centralized cache that all Traefik instances can share:

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  Traefik #1  │     │  Traefik #2  │     │  Traefik #3  │
│   (Plugin)   │     │   (Plugin)   │     │   (Plugin)   │
└──────┬───────┘     └──────┬───────┘     └──────┬───────┘
       │                    │                    │
       └────────────────────┼────────────────────┘
                           │
                    ┌──────▼──────┐
                    │    Redis    │
                    │   (Shared   │
                    │    Cache)   │
                    └─────────────┘

Benefits

  • Consistent Authentication: All instances share the same authentication state
  • True Replay Detection: JTI cache shared across all instances
  • Seamless Scaling: Add/remove instances without affecting user sessions
  • High Availability: Built-in resilience with circuit breakers and fallback
  • Performance: Hybrid mode provides local caching with Redis synchronization

Architecture

Cache Backend Interface

The plugin implements a pluggable cache backend architecture:

type CacheBackend interface {
    Get(ctx context.Context, key string) ([]byte, error)
    Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
    Delete(ctx context.Context, key string) error
    Exists(ctx context.Context, key string) (bool, error)
    Clear(ctx context.Context) error
    Health(ctx context.Context) error
}

Cache Implementations

1. Memory Backend (Default)

  • Use Case: Single-instance deployments
  • Pros: Fast, no external dependencies
  • Cons: Not suitable for multi-replica deployments

2. Redis Backend

  • Use Case: Multi-replica deployments requiring shared state
  • Pros: Distributed, persistent, scalable
  • Cons: External dependency, network latency

3. Hybrid Backend

  • Use Case: High-performance multi-replica deployments
  • Pros: Best of both worlds - speed + distribution
  • Cons: More complex, requires tuning

Hybrid Cache Architecture

The hybrid cache implements a two-tier caching strategy:

┌─────────────────────────────────────────┐
│            Client Request               │
└────────────────┬────────────────────────┘
                 ▼
        ┌────────────────┐
        │  Local Cache   │ ← L1 Cache (Fast)
        │   (Memory)     │
        └────────┬───────┘
                 │ Miss
                 ▼
        ┌────────────────┐
        │  Remote Cache  │ ← L2 Cache (Shared)
        │    (Redis)     │
        └────────────────┘

Read Path:

  1. Check local memory cache (L1)
  2. On miss, check Redis (L2)
  3. On hit in Redis, populate L1
  4. Return value

Write Path:

  1. Write to Redis (L2) for durability
  2. Write to local cache (L1) for speed
  3. Broadcast invalidation to other instances (future enhancement)

Circuit Breaker Pattern

The Redis backend implements a circuit breaker to handle Redis failures gracefully:

States: CLOSED → OPEN → HALF-OPEN → CLOSED

CLOSED (Normal Operation):
- All requests go to Redis
- Track failures
- Open circuit after threshold

OPEN (Redis Down):
- Fail fast, don't attempt Redis
- Fall back to memory cache
- Wait for recovery timeout

HALF-OPEN (Testing Recovery):
- Allow limited requests to Redis
- If successful, close circuit
- If failures continue, re-open

Configuration Reference

Plugin Configuration

apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: oidc-with-redis
spec:
  plugin:
    traefikoidc:
      # Standard OIDC configuration
      providerURL: https://accounts.google.com
      clientID: your-client-id
      clientSecret: your-client-secret
      sessionEncryptionKey: your-encryption-key
      callbackURL: /oauth2/callback

      # Redis cache configuration
      redis:
        enabled: true                          # Enable Redis cache
        address: "redis.example.com:6379"      # Redis server address
        password: "your-redis-password"        # Optional: Redis password
        db: 0                                   # Redis database number (0-15)
        keyPrefix: "traefikoidc"               # Prefix for all keys
        cacheMode: "hybrid"                    # Cache mode: memory|redis|hybrid

        # Connection pool settings
        maxRetries: 3                          # Max retry attempts
        poolSize: 10                           # Connection pool size
        minIdleConns: 5                        # Minimum idle connections
        maxConnAge: 3600                       # Max connection age (seconds)
        poolTimeout: 4                        # Pool timeout (seconds)
        idleTimeout: 900                       # Idle timeout (seconds)

        # Timeouts
        dialTimeout: 5                         # Connection timeout (seconds)
        readTimeout: 3                         # Read timeout (seconds)
        writeTimeout: 3                        # Write timeout (seconds)

        # Circuit breaker settings
        circuitBreakerThreshold: 5            # Failures before opening
        circuitBreakerTimeout: 60             # Recovery timeout (seconds)

        # TLS configuration (optional)
        tls:
          enabled: true
          certFile: "/path/to/cert.pem"
          keyFile: "/path/to/key.pem"
          caFile: "/path/to/ca.pem"
          insecureSkipVerify: false

Environment Variables

All Redis settings can be configured via environment variables:

# Basic Configuration
export REDIS_ENABLED=true
export REDIS_ADDRESS=redis.example.com:6379
export REDIS_PASSWORD=your-password
export REDIS_DB=0
export REDIS_KEY_PREFIX=traefikoidc
export REDIS_CACHE_MODE=hybrid

# Connection Pool
export REDIS_MAX_RETRIES=3
export REDIS_POOL_SIZE=10
export REDIS_MIN_IDLE_CONNS=5
export REDIS_MAX_CONN_AGE=3600
export REDIS_POOL_TIMEOUT=4
export REDIS_IDLE_TIMEOUT=900

# Timeouts
export REDIS_DIAL_TIMEOUT=5
export REDIS_READ_TIMEOUT=3
export REDIS_WRITE_TIMEOUT=3

# Circuit Breaker
export REDIS_CIRCUIT_BREAKER_THRESHOLD=5
export REDIS_CIRCUIT_BREAKER_TIMEOUT=60

# TLS
export REDIS_TLS_ENABLED=true
export REDIS_TLS_CERT_FILE=/path/to/cert.pem
export REDIS_TLS_KEY_FILE=/path/to/key.pem
export REDIS_TLS_CA_FILE=/path/to/ca.pem
export REDIS_TLS_INSECURE_SKIP_VERIFY=false

Cache Modes Explained

Memory Mode (Default)

redis:
  cacheMode: "memory"  # or omit redis config entirely
  • Uses only in-memory cache
  • Suitable for single-instance deployments
  • No Redis dependency

Redis Mode

redis:
  enabled: true
  address: "redis:6379"
  cacheMode: "redis"
  • All cache operations go directly to Redis
  • No local caching
  • Ensures consistency but higher latency
redis:
  enabled: true
  address: "redis:6379"
  cacheMode: "hybrid"
  • Local memory cache for fast reads
  • Redis for shared state and persistence
  • Best performance with consistency

Deployment Scenarios

Single Instance Deployment

For single Traefik instance deployments, Redis is optional:

# No Redis configuration needed
# Plugin uses in-memory cache by default
spec:
  plugin:
    traefikoidc:
      providerURL: https://accounts.google.com
      # ... other config
      # Redis not configured - uses memory cache

Multi-Replica with Docker Compose

version: '3.8'

services:
  redis:
    image: redis:7-alpine
    command: >
      redis-server
      --requirepass ${REDIS_PASSWORD}
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
    volumes:
      - redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "--raw", "incr", "ping"]
      interval: 30s
      timeout: 3s
      retries: 3
    networks:
      - traefik-net

  traefik:
    image: traefik:v3.2
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
    environment:
      - REDIS_ENABLED=true
      - REDIS_ADDRESS=redis:6379
      - REDIS_PASSWORD=${REDIS_PASSWORD}
      - REDIS_CACHE_MODE=hybrid
      - REDIS_KEY_PREFIX=traefikoidc
    volumes:
      - ./traefik.yml:/etc/traefik/traefik.yml:ro
      - ./dynamic.yml:/etc/traefik/dynamic.yml:ro
    networks:
      - traefik-net
    depends_on:
      redis:
        condition: service_healthy

volumes:
  redis-data:

networks:
  traefik-net:
    driver: overlay
    attachable: true

Kubernetes with Redis Operator

# Install Redis operator
kubectl apply -f https://raw.githubusercontent.com/spotahome/redis-operator/master/manifests/databases.spotahome.com_redis_crd.yaml
kubectl apply -f https://raw.githubusercontent.com/spotahome/redis-operator/master/manifests/databases.spotahome.com_redisfailovers_crd.yaml

---
# Redis Failover for HA
apiVersion: databases.spotahome.com/v1
kind: RedisFailover
metadata:
  name: traefikoidc-redis
  namespace: traefik
spec:
  sentinel:
    replicas: 3
    resources:
      requests:
        memory: 100Mi
      limits:
        memory: 200Mi
  redis:
    replicas: 3
    resources:
      requests:
        memory: 500Mi
      limits:
        memory: 1Gi
    config:
      maxmemory: 512mb
      maxmemory-policy: allkeys-lru

---
# ConfigMap for Redis configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: traefik-oidc-redis-config
  namespace: traefik
data:
  REDIS_ENABLED: "true"
  REDIS_ADDRESS: "rfs-traefikoidc-redis:6379"
  REDIS_CACHE_MODE: "hybrid"
  REDIS_KEY_PREFIX: "traefikoidc"
  REDIS_POOL_SIZE: "20"
  REDIS_CIRCUIT_BREAKER_THRESHOLD: "5"
  REDIS_CIRCUIT_BREAKER_TIMEOUT: "60"

---
# Secret for Redis password
apiVersion: v1
kind: Secret
metadata:
  name: traefik-oidc-redis-secret
  namespace: traefik
type: Opaque
data:
  REDIS_PASSWORD: <base64-encoded-password>

---
# Traefik Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: traefik
  namespace: traefik
spec:
  replicas: 3
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      containers:
      - name: traefik
        image: traefik:v3.2
        envFrom:
        - configMapRef:
            name: traefik-oidc-redis-config
        - secretRef:
            name: traefik-oidc-redis-secret
        ports:
        - containerPort: 80
        - containerPort: 443
        volumeMounts:
        - name: config
          mountPath: /etc/traefik
      volumes:
      - name: config
        configMap:
          name: traefik-config

---
# HorizontalPodAutoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: traefik-hpa
  namespace: traefik
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: traefik
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

AWS ECS with ElastiCache

{
  "family": "traefik-oidc",
  "taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsExecutionRole",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "traefik",
      "image": "traefik:v3.2",
      "essential": true,
      "environment": [
        {
          "name": "REDIS_ENABLED",
          "value": "true"
        },
        {
          "name": "REDIS_ADDRESS",
          "value": "traefikoidc-cache.abc123.ng.0001.use1.cache.amazonaws.com:6379"
        },
        {
          "name": "REDIS_CACHE_MODE",
          "value": "hybrid"
        },
        {
          "name": "REDIS_KEY_PREFIX",
          "value": "traefikoidc"
        },
        {
          "name": "REDIS_TLS_ENABLED",
          "value": "true"
        }
      ],
      "secrets": [
        {
          "name": "REDIS_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:redis-password"
        }
      ],
      "portMappings": [
        {
          "containerPort": 80,
          "protocol": "tcp"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/traefik",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ],
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024"
}

Redis Cluster Configuration

For high-throughput environments, use Redis Cluster:

# Redis Cluster configuration
redis:
  enabled: true
  # Provide one or more cluster nodes
  address: "redis-cluster-1:6379,redis-cluster-2:6379,redis-cluster-3:6379"
  cacheMode: "redis"  # Use redis mode for cluster
  clusterMode: true

  # Cluster-specific settings
  maxRedirects: 3     # Maximum cluster redirects
  readOnly: false     # Allow reads from replicas
  routeByLatency: true  # Route to fastest node
  routeRandomly: false  # Random routing

Performance Tuning

Key Design Patterns

1. TTL Strategy

# Recommended TTL values
JTI_CACHE_TTL: 3600       # 1 hour - matches token lifetime
SESSION_TTL: 86400        # 24 hours - user session duration
TOKEN_METADATA_TTL: 300   # 5 minutes - short-lived metadata

2. Connection Pool Optimization

redis:
  poolSize: 10          # Base formula: 2 * CPU cores
  minIdleConns: 5       # 50% of poolSize
  maxConnAge: 3600      # Rotate connections hourly
  idleTimeout: 900      # Close idle connections after 15 min

3. Memory Management

# Redis memory configuration
maxmemory 512mb              # Set appropriate limit
maxmemory-policy allkeys-lru # Evict least recently used

Benchmarking Results

Performance comparison across cache modes:

Operation Memory Mode Redis Mode Hybrid Mode
Read (p50) 0.1ms 2ms 0.2ms
Read (p99) 0.5ms 10ms 5ms
Write (p50) 0.2ms 3ms 3ms
Write (p99) 1ms 15ms 15ms
Throughput 100k/s 20k/s 80k/s

Optimization Tips

  1. Use Hybrid Mode for Production

    • Provides best balance of speed and consistency
    • Local cache reduces Redis load by 70-80%
  2. Configure Connection Pooling

    redis:
      poolSize: 20        # For high traffic
      minIdleConns: 10    # Maintain warm connections
    
  3. Enable Pipelining (Future Enhancement)

    • Batch multiple operations
    • Reduces round-trip latency
  4. Monitor Redis Memory

    redis-cli INFO memory
    # used_memory_human:250.34M
    # used_memory_peak_human:512.00M
    # maxmemory_policy:allkeys-lru
    
  5. Use Redis Persistence Wisely

    # For cache data, disable persistence for better performance
    save ""
    appendonly no
    

Monitoring and Observability

Key Metrics to Monitor

Application Metrics

  • Cache hit rate (target: >90% for hybrid mode)
  • Cache operation latency (p50, p95, p99)
  • Circuit breaker state and transitions
  • Redis connection pool utilization

Redis Metrics

# Monitor with redis-cli
redis-cli --stat

# Key metrics:
# - Connected clients
# - Ops/sec
# - Network I/O
# - Memory usage
# - Evicted keys

Prometheus Metrics

Export metrics for Prometheus monitoring:

# Grafana dashboard for visualization
apiVersion: v1
kind: ConfigMap
metadata:
  name: traefik-oidc-dashboard
data:
  dashboard.json: |
    {
      "panels": [
        {
          "title": "Cache Hit Rate",
          "targets": [
            {
              "expr": "rate(traefikoidc_cache_hits_total[5m]) / rate(traefikoidc_cache_requests_total[5m])"
            }
          ]
        },
        {
          "title": "Redis Latency",
          "targets": [
            {
              "expr": "histogram_quantile(0.99, traefikoidc_redis_operation_duration_seconds_bucket)"
            }
          ]
        },
        {
          "title": "Circuit Breaker State",
          "targets": [
            {
              "expr": "traefikoidc_circuit_breaker_state"
            }
          ]
        }
      ]
    }

Logging

Enable debug logging for troubleshooting:

# Plugin configuration
logLevel: debug

# Log entries to watch:
# - "Redis cache initialized"
# - "Circuit breaker opened"
# - "Falling back to memory cache"
# - "Redis connection restored"

Health Checks

Implement health check endpoints:

// Health check endpoint response
{
  "status": "healthy",
  "cache": {
    "mode": "hybrid",
    "redis": {
      "connected": true,
      "latency": "2ms",
      "pool": {
        "active": 5,
        "idle": 5,
        "total": 10
      }
    },
    "memory": {
      "entries": 1000,
      "size": "50MB"
    },
    "circuit_breaker": {
      "state": "closed",
      "failures": 0
    }
  }
}

Troubleshooting

Common Issues and Solutions

Issue 1: "Redis connection refused"

Symptoms:

  • Logs show "dial tcp: connection refused"
  • Circuit breaker opens immediately

Solutions:

  1. Verify Redis is running:

    redis-cli ping
    # Should return: PONG
    
  2. Check network connectivity:

    telnet redis-host 6379
    
  3. Verify Redis address in configuration:

    redis:
      address: "redis:6379"  # Ensure correct host:port
    

Issue 2: "Authentication failure"

Symptoms:

  • Logs show "NOAUTH Authentication required"

Solutions:

  1. Set Redis password:

    export REDIS_PASSWORD=your-password
    
  2. Or in configuration:

    redis:
      password: "your-password"
    

Issue 3: "Circuit breaker open"

Symptoms:

  • Logs show "Circuit breaker is open"
  • Falls back to memory cache

Solutions:

  1. Check Redis health:

    redis-cli INFO server
    
  2. Review circuit breaker settings:

    redis:
      circuitBreakerThreshold: 10  # Increase threshold
      circuitBreakerTimeout: 30    # Reduce timeout
    
  3. Monitor Redis performance:

    redis-cli --latency
    

Issue 4: "High memory usage"

Symptoms:

  • Redis memory constantly growing
  • OOM errors

Solutions:

  1. Configure Redis eviction:

    CONFIG SET maxmemory 512mb
    CONFIG SET maxmemory-policy allkeys-lru
    
  2. Review key expiration:

    # Ensure TTLs are set appropriately
    SESSION_TTL: 86400  # Not too long
    
  3. Monitor key count:

    redis-cli DBSIZE
    redis-cli --bigkeys
    

Issue 5: "Inconsistent cache state"

Symptoms:

  • Different responses from different replicas
  • Stale data being served

Solutions:

  1. Ensure all instances use same Redis:

    redis:
      address: "shared-redis:6379"  # Same for all instances
    
  2. Verify cache mode consistency:

    # All instances should use same mode
    export REDIS_CACHE_MODE=hybrid
    
  3. Check time synchronization:

    # Ensure all instances have synchronized time
    timedatectl status
    

Debug Commands

Useful Redis commands for debugging:

# Monitor all Redis commands in real-time
redis-cli MONITOR

# Check slow queries
redis-cli SLOWLOG GET 10

# Analyze memory usage
redis-cli MEMORY DOCTOR

# List all keys (careful in production)
redis-cli --scan --pattern "traefikoidc:*"

# Get key TTL
redis-cli TTL "traefikoidc:session:abc123"

# Check Redis info
redis-cli INFO all

Migration Guide

Migrating from Memory-Only to Redis

Phase 1: Preparation

  1. Deploy Redis infrastructure
  2. Test Redis connectivity
  3. Configure monitoring

Phase 2: Gradual Rollout

  1. Enable Redis on one instance:

    redis:
      enabled: true
      address: "redis:6379"
      cacheMode: "hybrid"
    
  2. Monitor performance and errors

  3. Gradually enable on more instances

Phase 3: Full Migration

  1. Enable Redis on all instances
  2. Remove disableReplayDetection: true if set
  3. Monitor for issues

Rollback Plan

If issues occur:

  1. Disable Redis: REDIS_ENABLED=false
  2. Falls back to memory cache automatically
  3. Investigate and resolve issues

Migration Checklist

  • Redis deployed and accessible
  • Redis password configured
  • Network connectivity verified
  • Monitoring configured
  • Backup plan prepared
  • Test environment validated
  • Gradual rollout planned
  • Team notified of changes

Best Practices

1. Security

  • Always use Redis password authentication
  • Enable TLS for production deployments
  • Use network segmentation (private subnets)
  • Rotate Redis passwords regularly

2. High Availability

  • Use Redis Sentinel or Cluster for HA
  • Configure appropriate circuit breaker thresholds
  • Implement proper health checks
  • Use connection pooling

3. Performance

  • Use hybrid cache mode for best performance
  • Configure appropriate TTLs
  • Monitor cache hit rates
  • Size Redis memory appropriately

4. Operations

  • Implement comprehensive monitoring
  • Set up alerting for circuit breaker state
  • Regular backup of Redis data (if persistence enabled)
  • Document Redis configuration

5. Development

  • Use memory mode for local development
  • Test with Redis in staging environment
  • Validate circuit breaker behavior
  • Load test with expected traffic patterns

FAQ

Q: Is Redis required for the plugin to work?

A: No, Redis is optional. The plugin works perfectly with in-memory cache for single-instance deployments. Redis is only needed for multi-replica deployments to share cache state.

Q: What happens if Redis goes down?

A: The plugin implements a circuit breaker pattern. When Redis becomes unavailable:

  1. Circuit breaker opens after threshold failures
  2. Plugin falls back to in-memory cache
  3. Periodically attempts to reconnect to Redis
  4. Resumes Redis operations when connection restored

Q: Can I use Redis Cluster?

A: Yes, Redis Cluster is supported. Configure with multiple node addresses and enable cluster mode in the configuration.

A: For production multi-replica deployments, use hybrid mode. It provides the best balance of performance and consistency.

Q: How much memory does Redis need?

A: Memory requirements depend on:

  • Number of active sessions
  • Token sizes
  • TTL configurations

Typical sizing:

  • Small (1-1000 users): 128MB
  • Medium (1000-10000 users): 256MB-512MB
  • Large (10000+ users): 1GB+

Q: Can I use managed Redis services?

A: Yes, the plugin works with:

  • AWS ElastiCache
  • Azure Cache for Redis
  • Google Cloud Memorystore
  • Redis Enterprise Cloud
  • Any Redis-compatible service

Q: How do I monitor cache performance?

A: Monitor these key metrics:

  • Cache hit rate (target >90%)
  • Redis latency (target <10ms p99)
  • Circuit breaker state
  • Connection pool utilization
  • Memory usage

Q: Is data encrypted in Redis?

A: Session data is encrypted before storing in Redis using the sessionEncryptionKey. Additionally, you can enable TLS for Redis connections.

Q: Can I migrate from memory to Redis without downtime?

A: Yes, the migration can be done without downtime:

  1. Deploy Redis
  2. Enable Redis on instances gradually
  3. Monitor for issues
  4. Complete migration

Q: What Redis versions are supported?

A: The plugin supports Redis 5.0 and later. Redis 6.0+ is recommended for production use.

Q: How do I handle Redis password rotation?

A: Password rotation strategy:

  1. Update secret in secret management system
  2. Rolling restart of Traefik instances
  3. Each instance picks up new password on restart
  4. No authentication failures during rotation

Q: Can I use Redis with TLS?

A: Yes, TLS is fully supported:

redis:
  tls:
    enabled: true
    certFile: "/path/to/cert.pem"
    keyFile: "/path/to/key.pem"
    caFile: "/path/to/ca.pem"

Q: What's the impact on latency?

A: Latency impact by cache mode:

  • Memory: ~0.1ms
  • Redis: ~2-5ms (network dependent)
  • Hybrid: ~0.2ms for hits, ~2-5ms for misses

Q: Should I enable Redis persistence?

A: For cache data, persistence is usually not needed:

  • Cache data is transient
  • Disabling persistence improves performance
  • Sessions can be re-established if data is lost

Q: How do I size the connection pool?

A: Connection pool sizing formula:

poolSize = 2 * CPU_cores * expected_replicas
minIdleConns = poolSize / 2

Example for 4 cores, 3 replicas:

  • poolSize: 24
  • minIdleConns: 12

Support and Resources

Documentation

Community

  • GitHub Issues: Report bugs and request features
  • Discussions: Ask questions and share experiences

Additional Resources


Last updated: 2025