lukaszraczylo e64fc7f730 Add redis support for distributed caching (#83)
2025-11-30 02:18:46 +00:00


# Redis Cache for Traefik OIDC Plugin
## Table of Contents
- [Overview](#overview)
- [Why Use Redis Cache?](#why-use-redis-cache)
- [Architecture](#architecture)
- [Configuration Reference](#configuration-reference)
- [Deployment Scenarios](#deployment-scenarios)
- [Performance Tuning](#performance-tuning)
- [Monitoring and Observability](#monitoring-and-observability)
- [Troubleshooting](#troubleshooting)
- [Migration Guide](#migration-guide)
- [Best Practices](#best-practices)
- [FAQ](#faq)
## Overview
The Redis cache feature provides a distributed caching solution for the Traefik OIDC plugin, enabling seamless operation across multiple Traefik instances. It implements a pluggable backend architecture that supports memory-only, Redis-only, or hybrid caching strategies.
### Key Features
- **Distributed JTI Replay Detection**: Prevents token replay attacks across all instances
- **Shared Session Management**: Consistent user sessions across replicas
- **Circuit Breaker**: Automatic fallback to memory cache during Redis outages
- **Health Checking**: Continuous monitoring of Redis connectivity
- **Flexible Cache Modes**: Choose between memory, Redis, or hybrid caching
- **Zero-Downtime Migration**: Seamlessly migrate from memory-only to Redis-backed cache
- **Yaegi Compatible**: Pure-Go implementation works with both dynamic loading and pre-compiled deployments
### ✨ Pure-Go Implementation
This plugin implements Redis support using a **custom pure-Go RESP protocol client** that is fully compatible with Traefik's Yaegi interpreter. Unlike other Redis clients that rely on the `unsafe` package, our implementation:
- Works seamlessly with Yaegi's dynamic plugin loading
- Provides full Redis functionality (GET, SET, DEL, TTL, etc.)
- Includes connection pooling for performance
- Supports both SETEX (seconds) and PSETEX (milliseconds) for precise TTL control
- No external dependencies beyond the standard library
This means you get **full Redis caching support whether you're using**:
- ✅ Traefik's dynamic plugin loading (Yaegi interpreter)
- ✅ Pre-compiled Traefik builds with the plugin included
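To make the "pure-Go RESP client" idea concrete, here is a minimal sketch of how a command such as SETEX is framed on the wire. It follows the public RESP format (an array of bulk strings) and uses only the standard library; the function name `encodeCommand` and the example key are illustrative assumptions, not the plugin's actual code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// encodeCommand serializes a Redis command as a RESP array of bulk strings:
// "*<argc>\r\n" followed by "$<byte length>\r\n<arg>\r\n" for every argument.
// Hypothetical helper for illustration only.
func encodeCommand(args ...string) string {
	var b strings.Builder
	b.WriteString("*" + strconv.Itoa(len(args)) + "\r\n")
	for _, a := range args {
		// len(a) is the byte length, which is what RESP expects.
		b.WriteString("$" + strconv.Itoa(len(a)) + "\r\n" + a + "\r\n")
	}
	return b.String()
}

func main() {
	// SETEX <key> <ttl seconds> <value>
	fmt.Printf("%q\n", encodeCommand("SETEX", "traefikoidc:jti:abc", "3600", "1"))
	// PSETEX takes the TTL in milliseconds instead.
	fmt.Printf("%q\n", encodeCommand("PSETEX", "traefikoidc:jti:abc", "1500", "1"))
}
```

Because the framing needs nothing beyond string building and a TCP connection, a client written this way avoids the `unsafe` package entirely.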
## Why Use Redis Cache?
### The Problem
When running multiple Traefik instances behind a load balancer, each instance maintains its own isolated in-memory cache. This isolation causes several issues:
1. **False Positive Replay Detection**
   - User authenticates → Token stored in Instance A's JTI cache
   - Next request → Load balancer routes to Instance B
   - Instance B doesn't have the JTI → Falsely detects replay attack
   - Result: Authentication failures and user frustration
2. **Session Inconsistency**
   - User session created on Instance A
   - Subsequent request routed to Instance B
   - Instance B has no knowledge of the session
   - Result: User forced to re-authenticate
3. **Token Metadata Fragmentation**
   - Token refresh happens on Instance A
   - New tokens stored only in Instance A's cache
   - Other instances continue using old tokens
   - Result: Inconsistent authentication state
### The Solution
Redis provides a centralized cache that all Traefik instances can share:
```
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  Traefik #1  │     │  Traefik #2  │     │  Traefik #3  │
│   (Plugin)   │     │   (Plugin)   │     │   (Plugin)   │
└──────┬───────┘     └──────┬───────┘     └──────┬───────┘
       │                    │                    │
       └────────────────────┼────────────────────┘
                     ┌──────▼──────┐
                     │    Redis    │
                     │   (Shared   │
                     │    Cache)   │
                     └─────────────┘
```
### Benefits
- **Consistent Authentication**: All instances share the same authentication state
- **True Replay Detection**: JTI cache shared across all instances
- **Seamless Scaling**: Add/remove instances without affecting user sessions
- **High Availability**: Built-in resilience with circuit breakers and fallback
- **Performance**: Hybrid mode provides local caching with Redis synchronization
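A minimal sketch of what a shared JTI store buys you: two replicas consult the same store, so a token ID presented a second time is recognised no matter which replica receives it. The in-process `sharedJTIStore` stands in for Redis, and all type and function names are hypothetical; the sketch also omits TTLs and does not claim to match how the plugin decides what counts as a replay. In Redis itself, an atomic first-use check is commonly expressed with `SET key value NX EX <ttl>`.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedJTIStore is an in-process stand-in for a Redis-backed JTI cache.
type sharedJTIStore struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newSharedJTIStore() *sharedJTIStore {
	return &sharedJTIStore{seen: make(map[string]struct{})}
}

// firstUse marks jti as seen and reports whether this was its first use.
// Check and insert happen under one lock, mirroring an atomic SET ... NX.
func (s *sharedJTIStore) firstUse(jti string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, dup := s.seen[jti]; dup {
		return false
	}
	s.seen[jti] = struct{}{}
	return true
}

// instance is a hypothetical Traefik replica that consults the shared store.
type instance struct {
	name  string
	store *sharedJTIStore
}

func (i instance) accept(jti string) bool { return i.store.firstUse(jti) }

func main() {
	store := newSharedJTIStore()
	a := instance{name: "traefik-1", store: store}
	b := instance{name: "traefik-2", store: store}

	fmt.Println(a.accept("jti-42")) // true: first presentation
	fmt.Println(b.accept("jti-42")) // false: same JTI seen via another replica
}
```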
## Architecture
### Cache Backend Interface
The plugin implements a pluggable cache backend architecture:
```go
type CacheBackend interface {
Get(ctx context.Context, key string) ([]byte, error)
Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
Delete(ctx context.Context, key string) error
Exists(ctx context.Context, key string) (bool, error)
Clear(ctx context.Context) error
Health(ctx context.Context) error
}
```
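As an illustration of what satisfying this interface involves, the following is a minimal in-memory backend with lazy TTL expiry. It is a sketch under stated assumptions, not the plugin's memory backend: `memoryBackend`, `errNotFound`, and the choice to report missing keys as an error are all hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// CacheBackend repeats the interface shown above so the sketch compiles alone.
type CacheBackend interface {
	Get(ctx context.Context, key string) ([]byte, error)
	Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
	Delete(ctx context.Context, key string) error
	Exists(ctx context.Context, key string) (bool, error)
	Clear(ctx context.Context) error
	Health(ctx context.Context) error
}

// errNotFound is a hypothetical sentinel for missing or expired keys.
var errNotFound = errors.New("cache: key not found")

type entry struct {
	value   []byte
	expires time.Time // zero value means "never expires"
}

type memoryBackend struct {
	mu    sync.Mutex
	items map[string]entry
}

func newMemoryBackend() *memoryBackend {
	return &memoryBackend{items: make(map[string]entry)}
}

func (m *memoryBackend) Get(_ context.Context, key string) ([]byte, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	e, ok := m.items[key]
	if !ok {
		return nil, errNotFound
	}
	if !e.expires.IsZero() && time.Now().After(e.expires) {
		delete(m.items, key) // evict lazily on read
		return nil, errNotFound
	}
	return e.value, nil
}

func (m *memoryBackend) Set(_ context.Context, key string, value []byte, ttl time.Duration) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	e := entry{value: append([]byte(nil), value...)} // copy to avoid aliasing
	if ttl > 0 {
		e.expires = time.Now().Add(ttl)
	}
	m.items[key] = e
	return nil
}

func (m *memoryBackend) Delete(_ context.Context, key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.items, key)
	return nil
}

func (m *memoryBackend) Exists(ctx context.Context, key string) (bool, error) {
	_, err := m.Get(ctx, key)
	if errors.Is(err, errNotFound) {
		return false, nil
	}
	return err == nil, err
}

func (m *memoryBackend) Clear(_ context.Context) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.items = make(map[string]entry)
	return nil
}

// Health always succeeds: an in-process map has no connection to lose.
func (m *memoryBackend) Health(_ context.Context) error { return nil }

// Compile-time check that memoryBackend satisfies the interface.
var _ CacheBackend = (*memoryBackend)(nil)

func main() {
	ctx := context.Background()
	var c CacheBackend = newMemoryBackend()
	_ = c.Set(ctx, "session:abc", []byte("data"), time.Minute)
	v, err := c.Get(ctx, "session:abc")
	fmt.Println(string(v), err)
}
```

A Redis or hybrid backend would implement the same six methods, which is what lets the cache mode be switched by configuration alone.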
### Cache Implementations
#### 1. Memory Backend (Default)
- **Use Case**: Single-instance deployments
- **Pros**: Fast, no external dependencies
- **Cons**: Not suitable for multi-replica deployments
#### 2. Redis Backend
- **Use Case**: Multi-replica deployments requiring shared state
- **Pros**: Distributed, persistent, scalable
- **Cons**: External dependency, network latency
#### 3. Hybrid Backend
- **Use Case**: High-performance multi-replica deployments
- **Pros**: Best of both worlds - speed + distribution
- **Cons**: More complex, requires tuning
### Hybrid Cache Architecture
The hybrid cache implements a two-tier caching strategy:
```
┌─────────────────────────────────────────┐
│             Client Request              │
└────────────────┬────────────────────────┘
        ┌────────────────┐
        │  Local Cache   │ ← L1 Cache (Fast)
        │    (Memory)    │
        └────────┬───────┘
                 │ Miss
        ┌────────────────┐
        │  Remote Cache  │ ← L2 Cache (Shared)
        │    (Redis)     │
        └────────────────┘
```
**Read Path**:
1. Check local memory cache (L1)
2. On miss, check Redis (L2)
3. On hit in Redis, populate L1
4. Return value
**Write Path**:
1. Write to Redis (L2) for durability
2. Write to local cache (L1) for speed
3. Broadcast invalidation to other instances (future enhancement)
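The read and write paths above can be sketched in a few lines. Plain maps stand in for both tiers, TTLs are omitted, and the `hybrid` type with its `get`/`set` methods is a hypothetical illustration rather than the plugin's implementation.

```go
package main

import "fmt"

// store is a deliberately tiny stand-in for one cache tier.
type store map[string]string

// hybrid combines a per-instance L1 with a shared L2.
type hybrid struct {
	local  store // L1: in-memory, private to this instance
	remote store // L2: stand-in for the shared Redis
}

// get follows the read path: L1 first, then L2, populating L1 on an L2 hit.
func (h *hybrid) get(key string) (string, bool) {
	if v, ok := h.local[key]; ok {
		return v, true
	}
	v, ok := h.remote[key]
	if ok {
		h.local[key] = v
	}
	return v, ok
}

// set follows the write path: L2 first for shared visibility, then L1.
func (h *hybrid) set(key, value string) {
	h.remote[key] = value
	h.local[key] = value
}

func main() {
	redis := store{}
	a := &hybrid{local: store{}, remote: redis}
	b := &hybrid{local: store{}, remote: redis}

	a.set("session:abc", "alice")
	v, ok := b.get("session:abc") // L1 miss on b, L2 hit
	fmt.Println(v, ok)

	_, cached := b.local["session:abc"]
	fmt.Println(cached) // b's L1 was populated by the read
}
```

Since invalidation broadcast is listed as a future enhancement, an L1 entry on another instance can remain stale after an update until it expires, which is one reason short L1 TTLs matter in hybrid mode.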
### Circuit Breaker Pattern
The Redis backend implements a circuit breaker to handle Redis failures gracefully:
```
States: CLOSED → OPEN → HALF-OPEN → CLOSED
CLOSED (Normal Operation):
- All requests go to Redis
- Track failures
- Open circuit after threshold
OPEN (Redis Down):
- Fail fast, don't attempt Redis
- Fall back to memory cache
- Wait for recovery timeout
HALF-OPEN (Testing Recovery):
- Allow limited requests to Redis
- If successful, close circuit
- If failures continue, re-open
```
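The state machine above can be sketched as follows. Times are plain seconds to keep the example deterministic, the field names are hypothetical, and the half-open state here admits every request until an outcome is recorded rather than limiting the number of probes, so treat it as a simplification of the described behaviour.

```go
package main

import "fmt"

type state int

const (
	closed state = iota
	open
	halfOpen
)

// breaker tracks consecutive failures and the moment the circuit opened.
type breaker struct {
	state      state
	failures   int
	threshold  int   // failures before opening (cf. circuitBreakerThreshold)
	timeoutSec int64 // recovery timeout in seconds (cf. circuitBreakerTimeout)
	openedAt   int64 // unix seconds at which the circuit opened
}

// allow reports whether a Redis call may be attempted at time nowSec.
func (b *breaker) allow(nowSec int64) bool {
	if b.state == open {
		if nowSec-b.openedAt < b.timeoutSec {
			return false // fail fast; the caller falls back to memory cache
		}
		b.state = halfOpen // recovery timeout elapsed: probe Redis again
	}
	return true
}

// record updates the breaker with the outcome of an attempted call.
func (b *breaker) record(success bool, nowSec int64) {
	if success {
		b.state, b.failures = closed, 0
		return
	}
	b.failures++
	// A failed probe re-opens immediately; otherwise open at the threshold.
	if b.state == halfOpen || b.failures >= b.threshold {
		b.state, b.openedAt = open, nowSec
	}
}

func main() {
	b := &breaker{threshold: 5, timeoutSec: 60}
	for i := int64(0); i < 5; i++ {
		b.record(false, i) // five consecutive failures
	}
	fmt.Println(b.allow(10)) // false: circuit is open
	fmt.Println(b.allow(70)) // true: half-open probe allowed
	b.record(true, 70)
	fmt.Println(b.state == closed) // true: probe succeeded
}
```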
## Configuration Reference
### Plugin Configuration
```yaml
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: oidc-with-redis
spec:
  plugin:
    traefikoidc:
      # Standard OIDC configuration
      providerURL: https://accounts.google.com
      clientID: your-client-id
      clientSecret: your-client-secret
      sessionEncryptionKey: your-encryption-key
      callbackURL: /oauth2/callback
      # Redis cache configuration
      redis:
        enabled: true                      # Enable Redis cache
        address: "redis.example.com:6379"  # Redis server address
        password: "your-redis-password"    # Optional: Redis password
        db: 0                              # Redis database number (0-15)
        keyPrefix: "traefikoidc"           # Prefix for all keys
        cacheMode: "hybrid"                # Cache mode: memory|redis|hybrid
        # Connection pool settings
        maxRetries: 3                      # Max retry attempts
        poolSize: 10                       # Connection pool size
        minIdleConns: 5                    # Minimum idle connections
        maxConnAge: 3600                   # Max connection age (seconds)
        poolTimeout: 4                     # Pool timeout (seconds)
        idleTimeout: 900                   # Idle timeout (seconds)
        # Timeouts
        dialTimeout: 5                     # Connection timeout (seconds)
        readTimeout: 3                     # Read timeout (seconds)
        writeTimeout: 3                    # Write timeout (seconds)
        # Circuit breaker settings
        circuitBreakerThreshold: 5         # Failures before opening
        circuitBreakerTimeout: 60          # Recovery timeout (seconds)
        # TLS configuration (optional)
        tls:
          enabled: true
          certFile: "/path/to/cert.pem"
          keyFile: "/path/to/key.pem"
          caFile: "/path/to/ca.pem"
          insecureSkipVerify: false
```
### Environment Variables
All Redis settings can be configured via environment variables:
```bash
# Basic Configuration
export REDIS_ENABLED=true
export REDIS_ADDRESS=redis.example.com:6379
export REDIS_PASSWORD=your-password
export REDIS_DB=0
export REDIS_KEY_PREFIX=traefikoidc
export REDIS_CACHE_MODE=hybrid
# Connection Pool
export REDIS_MAX_RETRIES=3
export REDIS_POOL_SIZE=10
export REDIS_MIN_IDLE_CONNS=5
export REDIS_MAX_CONN_AGE=3600
export REDIS_POOL_TIMEOUT=4
export REDIS_IDLE_TIMEOUT=900
# Timeouts
export REDIS_DIAL_TIMEOUT=5
export REDIS_READ_TIMEOUT=3
export REDIS_WRITE_TIMEOUT=3
# Circuit Breaker
export REDIS_CIRCUIT_BREAKER_THRESHOLD=5
export REDIS_CIRCUIT_BREAKER_TIMEOUT=60
# TLS
export REDIS_TLS_ENABLED=true
export REDIS_TLS_CERT_FILE=/path/to/cert.pem
export REDIS_TLS_KEY_FILE=/path/to/key.pem
export REDIS_TLS_CA_FILE=/path/to/ca.pem
export REDIS_TLS_INSECURE_SKIP_VERIFY=false
```
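A minimal sketch of how such variables might be read and validated. The variable names come from the list above; the `redisConfig` struct, the `loadRedisConfig` function, and the fallback values (memory mode, pool size 10) are assumptions for illustration and may differ from the plugin's real loader.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// redisConfig holds a hypothetical subset of the Redis settings.
type redisConfig struct {
	Enabled   bool
	Address   string
	CacheMode string
	PoolSize  int
}

// loadRedisConfig reads settings through lookup, which has the same
// signature as os.LookupEnv so it can be replaced in tests.
func loadRedisConfig(lookup func(string) (string, bool)) (redisConfig, error) {
	cfg := redisConfig{CacheMode: "memory", PoolSize: 10} // assumed defaults

	if v, ok := lookup("REDIS_ENABLED"); ok {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return cfg, fmt.Errorf("REDIS_ENABLED: %w", err)
		}
		cfg.Enabled = b
	}
	if v, ok := lookup("REDIS_ADDRESS"); ok {
		cfg.Address = v
	}
	if v, ok := lookup("REDIS_CACHE_MODE"); ok {
		switch v {
		case "memory", "redis", "hybrid":
			cfg.CacheMode = v
		default:
			return cfg, fmt.Errorf("REDIS_CACHE_MODE: unknown mode %q", v)
		}
	}
	if v, ok := lookup("REDIS_POOL_SIZE"); ok {
		n, err := strconv.Atoi(v)
		if err != nil || n <= 0 {
			return cfg, fmt.Errorf("REDIS_POOL_SIZE: invalid value %q", v)
		}
		cfg.PoolSize = n
	}
	return cfg, nil
}

func main() {
	cfg, err := loadRedisConfig(os.LookupEnv)
	fmt.Printf("%+v err=%v\n", cfg, err)
}
```

Rejecting an unknown cache mode at startup is a deliberate choice in this sketch: a typo such as `hybird` would otherwise silently leave every replica on its isolated memory cache.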
### Cache Modes Explained
#### Memory Mode (Default)
```yaml
redis:
  cacheMode: "memory" # or omit redis config entirely
```
- Uses only in-memory cache
- Suitable for single-instance deployments
- No Redis dependency
#### Redis Mode
```yaml
redis:
  enabled: true
  address: "redis:6379"
  cacheMode: "redis"
```
- All cache operations go directly to Redis
- No local caching
- Ensures consistency but higher latency
#### Hybrid Mode (Recommended for Production)
```yaml
redis:
  enabled: true
  address: "redis:6379"
  cacheMode: "hybrid"
```
- Local memory cache for fast reads
- Redis for shared state and persistence
- Best performance with consistency
## Deployment Scenarios
### Single Instance Deployment
For single Traefik instance deployments, Redis is optional:
```yaml
# No Redis configuration needed
# Plugin uses in-memory cache by default
spec:
  plugin:
    traefikoidc:
      providerURL: https://accounts.google.com
      # ... other config
      # Redis not configured - uses memory cache
```
### Multi-Replica with Docker Compose
```yaml
version: '3.8'
services:
  redis:
    image: redis:7-alpine
    command: >
      redis-server
      --requirepass ${REDIS_PASSWORD}
      --maxmemory 256mb
      --maxmemory-policy allkeys-lru
    volumes:
      - redis-data:/data
    healthcheck:
      # Authenticate, because requirepass is set above
      test: ["CMD-SHELL", "redis-cli --no-auth-warning -a \"${REDIS_PASSWORD}\" ping | grep -q PONG"]
      interval: 30s
      timeout: 3s
      retries: 3
    networks:
      - traefik-net
  traefik:
    image: traefik:v3.2
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
    environment:
      - REDIS_ENABLED=true
      - REDIS_ADDRESS=redis:6379
      - REDIS_PASSWORD=${REDIS_PASSWORD}
      - REDIS_CACHE_MODE=hybrid
      - REDIS_KEY_PREFIX=traefikoidc
    volumes:
      - ./traefik.yml:/etc/traefik/traefik.yml:ro
      - ./dynamic.yml:/etc/traefik/dynamic.yml:ro
    networks:
      - traefik-net
    depends_on:
      redis:
        condition: service_healthy
volumes:
  redis-data:
networks:
  traefik-net:
    driver: overlay
    attachable: true
```
### Kubernetes with Redis Operator
```yaml
# Install the Redis operator CRDs first (run from a shell, not part of this manifest):
#   kubectl apply -f https://raw.githubusercontent.com/spotahome/redis-operator/master/manifests/databases.spotahome.com_redis_crd.yaml
#   kubectl apply -f https://raw.githubusercontent.com/spotahome/redis-operator/master/manifests/databases.spotahome.com_redisfailovers_crd.yaml
---
# Redis Failover for HA
apiVersion: databases.spotahome.com/v1
kind: RedisFailover
metadata:
  name: traefikoidc-redis
  namespace: traefik
spec:
  sentinel:
    replicas: 3
    resources:
      requests:
        memory: 100Mi
      limits:
        memory: 200Mi
  redis:
    replicas: 3
    resources:
      requests:
        memory: 500Mi
      limits:
        memory: 1Gi
    config:
      maxmemory: 512mb
      maxmemory-policy: allkeys-lru
---
# ConfigMap for Redis configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: traefik-oidc-redis-config
  namespace: traefik
data:
  REDIS_ENABLED: "true"
  # Verify the service name and port that your operator version exposes for Redis
  REDIS_ADDRESS: "rfs-traefikoidc-redis:6379"
  REDIS_CACHE_MODE: "hybrid"
  REDIS_KEY_PREFIX: "traefikoidc"
  REDIS_POOL_SIZE: "20"
  REDIS_CIRCUIT_BREAKER_THRESHOLD: "5"
  REDIS_CIRCUIT_BREAKER_TIMEOUT: "60"
---
# Secret for Redis password
apiVersion: v1
kind: Secret
metadata:
  name: traefik-oidc-redis-secret
  namespace: traefik
type: Opaque
data:
  REDIS_PASSWORD: <base64-encoded-password>
---
# Traefik Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: traefik
  namespace: traefik
spec:
  replicas: 3
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      containers:
        - name: traefik
          image: traefik:v3.2
          envFrom:
            - configMapRef:
                name: traefik-oidc-redis-config
            - secretRef:
                name: traefik-oidc-redis-secret
          ports:
            - containerPort: 80
            - containerPort: 443
          volumeMounts:
            - name: config
              mountPath: /etc/traefik
      volumes:
        - name: config
          configMap:
            name: traefik-config
---
# HorizontalPodAutoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: traefik-hpa
  namespace: traefik
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: traefik
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
### AWS ECS with ElastiCache
```json
{
  "family": "traefik-oidc",
  "taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsExecutionRole",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "traefik",
      "image": "traefik:v3.2",
      "essential": true,
      "environment": [
        { "name": "REDIS_ENABLED", "value": "true" },
        { "name": "REDIS_ADDRESS", "value": "traefikoidc-cache.abc123.ng.0001.use1.cache.amazonaws.com:6379" },
        { "name": "REDIS_CACHE_MODE", "value": "hybrid" },
        { "name": "REDIS_KEY_PREFIX", "value": "traefikoidc" },
        { "name": "REDIS_TLS_ENABLED", "value": "true" }
      ],
      "secrets": [
        {
          "name": "REDIS_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:redis-password"
        }
      ],
      "portMappings": [
        { "containerPort": 80, "protocol": "tcp" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/traefik",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ],
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024"
}
```
### Redis Cluster Configuration
For high-throughput environments, use Redis Cluster:
```yaml
# Redis Cluster configuration
redis:
  enabled: true
  # Provide one or more cluster nodes
  address: "redis-cluster-1:6379,redis-cluster-2:6379,redis-cluster-3:6379"
  cacheMode: "redis"    # Use redis mode for cluster
  clusterMode: true
  # Cluster-specific settings
  maxRedirects: 3       # Maximum cluster redirects
  readOnly: false       # Set to true to allow reads from replicas
  routeByLatency: true  # Route to fastest node
  routeRandomly: false  # Random routing
```
## Performance Tuning
### Key Design Patterns
#### 1. TTL Strategy
```yaml
# Recommended TTL values
JTI_CACHE_TTL: 3600 # 1 hour - matches token lifetime
SESSION_TTL: 86400 # 24 hours - user session duration
TOKEN_METADATA_TTL: 300 # 5 minutes - short-lived metadata
```
#### 2. Connection Pool Optimization
```yaml
redis:
  poolSize: 10      # Base formula: 2 * CPU cores
  minIdleConns: 5   # 50% of poolSize
  maxConnAge: 3600  # Rotate connections hourly
  idleTimeout: 900  # Close idle connections after 15 min
```
#### 3. Memory Management
```bash
# Redis memory configuration (redis.conf; comments must be on their own line)
# Set an appropriate limit
maxmemory 512mb
# Evict least recently used keys
maxmemory-policy allkeys-lru
```
### Benchmarking Results
Performance comparison across cache modes:
| Operation | Memory Mode | Redis Mode | Hybrid Mode |
|-----------|------------|------------|-------------|
| Read (p50) | 0.1ms | 2ms | 0.2ms |
| Read (p99) | 0.5ms | 10ms | 5ms |
| Write (p50) | 0.2ms | 3ms | 3ms |
| Write (p99) | 1ms | 15ms | 15ms |
| Throughput | 100k/s | 20k/s | 80k/s |
### Optimization Tips
1. **Use Hybrid Mode for Production**
   - Provides best balance of speed and consistency
   - Local cache reduces Redis load by 70-80%
2. **Configure Connection Pooling**
   ```yaml
   redis:
     poolSize: 20      # For high traffic
     minIdleConns: 10  # Maintain warm connections
   ```
3. **Enable Pipelining** (Future Enhancement)
   - Batch multiple operations
   - Reduces round-trip latency
4. **Monitor Redis Memory**
   ```bash
   redis-cli INFO memory
   # used_memory_human:250.34M
   # used_memory_peak_human:512.00M
   # maxmemory_policy:allkeys-lru
   ```
5. **Use Redis Persistence Wisely**
   ```bash
   # For cache data, disable persistence for better performance
   save ""
   appendonly no
   ```
## Monitoring and Observability
### Key Metrics to Monitor
#### Application Metrics
- Cache hit rate (target: >90% for hybrid mode)
- Cache operation latency (p50, p95, p99)
- Circuit breaker state and transitions
- Redis connection pool utilization
#### Redis Metrics
```bash
# Monitor with redis-cli
redis-cli --stat
# Key metrics:
# - Connected clients
# - Ops/sec
# - Network I/O
# - Memory usage
# - Evicted keys
```
### Prometheus Metrics
The Grafana dashboard below queries the plugin's cache metrics from Prometheus, assuming they are scraped under the names shown:
```yaml
# Grafana dashboard for visualization
apiVersion: v1
kind: ConfigMap
metadata:
  name: traefik-oidc-dashboard
data:
  dashboard.json: |
    {
      "panels": [
        {
          "title": "Cache Hit Rate",
          "targets": [
            { "expr": "rate(traefikoidc_cache_hits_total[5m]) / rate(traefikoidc_cache_requests_total[5m])" }
          ]
        },
        {
          "title": "Redis Latency",
          "targets": [
            { "expr": "histogram_quantile(0.99, sum by (le) (rate(traefikoidc_redis_operation_duration_seconds_bucket[5m])))" }
          ]
        },
        {
          "title": "Circuit Breaker State",
          "targets": [
            { "expr": "traefikoidc_circuit_breaker_state" }
          ]
        }
      ]
    }
```
### Logging
Enable debug logging for troubleshooting:
```yaml
# Plugin configuration
logLevel: debug
# Log entries to watch:
# - "Redis cache initialized"
# - "Circuit breaker opened"
# - "Falling back to memory cache"
# - "Redis connection restored"
```
### Health Checks
A health check endpoint can report cache status with a response such as:
```json
{
  "status": "healthy",
  "cache": {
    "mode": "hybrid",
    "redis": {
      "connected": true,
      "latency": "2ms",
      "pool": {
        "active": 5,
        "idle": 5,
        "total": 10
      }
    },
    "memory": {
      "entries": 1000,
      "size": "50MB"
    },
    "circuit_breaker": {
      "state": "closed",
      "failures": 0
    }
  }
}
```
## Troubleshooting
### Common Issues and Solutions
#### Issue 1: "Redis connection refused"
**Symptoms:**
- Logs show "dial tcp: connection refused"
- Circuit breaker opens immediately
**Solutions:**
1. Verify Redis is running:
   ```bash
   redis-cli ping
   # Should return: PONG
   ```
2. Check network connectivity:
   ```bash
   telnet redis-host 6379
   ```
3. Verify Redis address in configuration:
   ```yaml
   redis:
     address: "redis:6379" # Ensure correct host:port
   ```
#### Issue 2: "Authentication failure"
**Symptoms:**
- Logs show "NOAUTH Authentication required"
**Solutions:**
1. Set Redis password:
   ```bash
   export REDIS_PASSWORD=your-password
   ```
2. Or in configuration:
   ```yaml
   redis:
     password: "your-password"
   ```
#### Issue 3: "Circuit breaker open"
**Symptoms:**
- Logs show "Circuit breaker is open"
- Falls back to memory cache
**Solutions:**
1. Check Redis health:
   ```bash
   redis-cli INFO server
   ```
2. Review circuit breaker settings:
   ```yaml
   redis:
     circuitBreakerThreshold: 10 # Increase threshold
     circuitBreakerTimeout: 30   # Reduce timeout
   ```
3. Monitor Redis performance:
   ```bash
   redis-cli --latency
   ```
#### Issue 4: "High memory usage"
**Symptoms:**
- Redis memory constantly growing
- OOM errors
**Solutions:**
1. Configure Redis eviction:
   ```bash
   redis-cli CONFIG SET maxmemory 512mb
   redis-cli CONFIG SET maxmemory-policy allkeys-lru
   ```
2. Review key expiration:
   ```yaml
   # Ensure TTLs are set appropriately
   SESSION_TTL: 86400 # Not too long
   ```
3. Monitor key count:
   ```bash
   redis-cli DBSIZE
   redis-cli --bigkeys
   ```
#### Issue 5: "Inconsistent cache state"
**Symptoms:**
- Different responses from different replicas
- Stale data being served
**Solutions:**
1. Ensure all instances use the same Redis:
   ```yaml
   redis:
     address: "shared-redis:6379" # Same for all instances
   ```
2. Verify cache mode consistency:
   ```bash
   # All instances should use same mode
   export REDIS_CACHE_MODE=hybrid
   ```
3. Check time synchronization:
   ```bash
   # Ensure all instances have synchronized time
   timedatectl status
   ```
### Debug Commands
Useful Redis commands for debugging:
```bash
# Monitor all Redis commands in real-time
redis-cli MONITOR
# Check slow queries
redis-cli SLOWLOG GET 10
# Analyze memory usage
redis-cli MEMORY DOCTOR
# List all keys (careful in production)
redis-cli --scan --pattern "traefikoidc:*"
# Get key TTL
redis-cli TTL "traefikoidc:session:abc123"
# Check Redis info
redis-cli INFO all
```
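The key names used in these commands follow a `prefix:kind:id` shape. The sketch below shows how such keys and the matching scan pattern could be built; the three-part layout is inferred from the example key `traefikoidc:session:abc123`, and `buildKey` and `scanPattern` are hypothetical helpers, not functions exported by the plugin.

```go
package main

import (
	"fmt"
	"strings"
)

// buildKey joins the configured key prefix, an entry kind, and an ID with ":".
// The layout is an assumption inferred from the example key in this guide.
func buildKey(prefix, kind, id string) string {
	return strings.Join([]string{prefix, kind, id}, ":")
}

// scanPattern returns the glob that matches every key under a prefix,
// suitable for `redis-cli --scan --pattern`.
func scanPattern(prefix string) string {
	return prefix + ":*"
}

func main() {
	fmt.Println(buildKey("traefikoidc", "session", "abc123")) // traefikoidc:session:abc123
	fmt.Println(scanPattern("traefikoidc"))                   // traefikoidc:*
}
```

Giving each environment its own `keyPrefix` keeps a scan like this from touching keys that belong to another deployment sharing the same Redis.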
## Migration Guide
### Migrating from Memory-Only to Redis
#### Phase 1: Preparation
1. Deploy Redis infrastructure
2. Test Redis connectivity
3. Configure monitoring
#### Phase 2: Gradual Rollout
1. Enable Redis on one instance:
   ```yaml
   redis:
     enabled: true
     address: "redis:6379"
     cacheMode: "hybrid"
   ```
2. Monitor performance and errors
3. Gradually enable on more instances
#### Phase 3: Full Migration
1. Enable Redis on all instances
2. Remove `disableReplayDetection: true` if set
3. Monitor for issues
#### Rollback Plan
If issues occur:
1. Disable Redis: `REDIS_ENABLED=false`
2. The plugin falls back to the memory cache automatically
3. Investigate and resolve issues
### Migration Checklist
- [ ] Redis deployed and accessible
- [ ] Redis password configured
- [ ] Network connectivity verified
- [ ] Monitoring configured
- [ ] Backup plan prepared
- [ ] Test environment validated
- [ ] Gradual rollout planned
- [ ] Team notified of changes
## Best Practices
### 1. Security
- Always use Redis password authentication
- Enable TLS for production deployments
- Use network segmentation (private subnets)
- Rotate Redis passwords regularly
### 2. High Availability
- Use Redis Sentinel or Cluster for HA
- Configure appropriate circuit breaker thresholds
- Implement proper health checks
- Use connection pooling
### 3. Performance
- Use hybrid cache mode for best performance
- Configure appropriate TTLs
- Monitor cache hit rates
- Size Redis memory appropriately
### 4. Operations
- Implement comprehensive monitoring
- Set up alerting for circuit breaker state
- Regular backup of Redis data (if persistence enabled)
- Document Redis configuration
### 5. Development
- Use memory mode for local development
- Test with Redis in staging environment
- Validate circuit breaker behavior
- Load test with expected traffic patterns
## FAQ
### Q: Is Redis required for the plugin to work?
**A:** No, Redis is optional. The plugin works perfectly with in-memory cache for single-instance deployments. Redis is only needed for multi-replica deployments to share cache state.
### Q: What happens if Redis goes down?
**A:** The plugin implements a circuit breaker pattern. When Redis becomes unavailable:
1. Circuit breaker opens after threshold failures
2. Plugin falls back to in-memory cache
3. Periodically attempts to reconnect to Redis
4. Resumes Redis operations when connection restored
### Q: Can I use Redis Cluster?
**A:** Yes, Redis Cluster is supported. Configure with multiple node addresses and enable cluster mode in the configuration.
### Q: What's the recommended cache mode?
**A:** For production multi-replica deployments, use `hybrid` mode. It provides the best balance of performance and consistency.
### Q: How much memory does Redis need?
**A:** Memory requirements depend on:
- Number of active sessions
- Token sizes
- TTL configurations
Typical sizing:
- Small (1-1000 users): 128MB
- Medium (1000-10000 users): 256MB-512MB
- Large (10000+ users): 1GB+
### Q: Can I use managed Redis services?
**A:** Yes, the plugin works with:
- AWS ElastiCache
- Azure Cache for Redis
- Google Cloud Memorystore
- Redis Enterprise Cloud
- Any Redis-compatible service
### Q: How do I monitor cache performance?
**A:** Monitor these key metrics:
- Cache hit rate (target >90%)
- Redis latency (target <10ms p99)
- Circuit breaker state
- Connection pool utilization
- Memory usage
### Q: Is data encrypted in Redis?
**A:** Session data is encrypted before storing in Redis using the `sessionEncryptionKey`. Additionally, you can enable TLS for Redis connections.
### Q: Can I migrate from memory to Redis without downtime?
**A:** Yes, the migration can be done without downtime:
1. Deploy Redis
2. Enable Redis on instances gradually
3. Monitor for issues
4. Complete migration
### Q: What Redis versions are supported?
**A:** The plugin supports Redis 5.0 and later. Redis 6.0+ is recommended for production use.
### Q: How do I handle Redis password rotation?
**A:** Password rotation strategy:
1. Update secret in secret management system
2. Rolling restart of Traefik instances
3. Each instance picks up new password on restart
4. No authentication failures during rotation
### Q: Can I use Redis with TLS?
**A:** Yes, TLS is fully supported:
```yaml
redis:
  tls:
    enabled: true
    certFile: "/path/to/cert.pem"
    keyFile: "/path/to/key.pem"
    caFile: "/path/to/ca.pem"
```
### Q: What's the impact on latency?
**A:** Latency impact by cache mode:
- **Memory**: ~0.1ms
- **Redis**: ~2-5ms (network dependent)
- **Hybrid**: ~0.2ms for hits, ~2-5ms for misses
### Q: Should I enable Redis persistence?
**A:** For cache data, persistence is usually not needed:
- Cache data is transient
- Disabling persistence improves performance
- Sessions can be re-established if data is lost
### Q: How do I size the connection pool?
**A:** Connection pool sizing formula:
```
poolSize = 2 * CPU_cores * expected_replicas
minIdleConns = poolSize / 2
```
Example for 4 cores, 3 replicas:
- poolSize: 24
- minIdleConns: 12
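The sizing formula can be checked with a few lines of Go. The function name `poolSizing` is hypothetical; the arithmetic is exactly the formula stated above.

```go
package main

import "fmt"

// poolSizing applies the formula from this FAQ entry:
// poolSize = 2 * CPU cores * expected replicas, minIdleConns = poolSize / 2.
func poolSizing(cpuCores, replicas int) (poolSize, minIdleConns int) {
	poolSize = 2 * cpuCores * replicas
	return poolSize, poolSize / 2
}

func main() {
	size, idle := poolSizing(4, 3)
	fmt.Println(size, idle) // 24 12
}
```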
## Support and Resources
### Documentation
- [Main README](../README.md)
- [Plugin Configuration Guide](../README.md#configuration-options)
- [Troubleshooting Guide](../README.md#troubleshooting)
### Community
- GitHub Issues: Report bugs and request features
- Discussions: Ask questions and share experiences
### Additional Resources
- [Redis Documentation](https://redis.io/documentation)
- [Redis Best Practices](https://redis.io/docs/manual/patterns/)
- [Traefik Documentation](https://doc.traefik.io/traefik/)
---
*Last updated: 2025*