Compare commits

...

28 Commits

Author SHA1 Message Date
lukaszraczylo 4ef42e5781 fixup! Github pages + benchmarks. 2025-12-07 16:49:50 +00:00
lukaszraczylo 996d29b57b Github pages + benchmarks. 2025-12-07 16:46:13 +00:00
lukaszraczylo 7c80d6adaa Create CNAME 2025-12-07 16:45:33 +00:00
lukaszraczylo 31fc3ae3d9 Switch to goreleaser. 2025-12-07 16:13:33 +00:00
lukaszraczylo da8ec5f21d Add LRU cache support. 2025-12-03 10:22:33 +00:00
lukaszraczylo 3d80f457d3 Update go.mod and go.sum
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-12-03 03:24:59 +00:00
lukaszraczylo 09c3e4cd95 Update go.mod and go.sum
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-12-02 03:26:31 +00:00
lukaszraczylo d07ee4090c Dashboard update. 2025-11-29 15:53:05 +00:00
lukaszraczylo b1045b8bc2 Retry budget. 2025-11-29 15:36:17 +00:00
lukaszraczylo cc35031db9 Fixes issue with the dashboard metrics. 2025-11-29 14:51:29 +00:00
lukaszraczylo 6a69694ab3 November improvements. (#29)
* Tackling the CPU / memory spikes after some time.

* Update admin dashboard, fix the circuit breaker and request coalescing.
2025-11-29 14:21:09 +00:00
lukaszraczylo b210627fb7 Update go.mod and go.sum
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-27 03:21:34 +00:00
lukaszraczylo edcabe3cf0 Update go.mod and go.sum
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-25 03:23:21 +00:00
lukaszraczylo c99bf2b245 fixup! Improve caching by adding user ids and roles to hash. 2025-11-22 17:10:59 +00:00
lukaszraczylo 39dc7b49cf Improve caching by adding user ids and roles to hash. 2025-11-22 17:02:16 +00:00
lukaszraczylo 28223b40da Update go.mod and go.sum
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-20 03:20:52 +00:00
lukaszraczylo ee5618c699 Update go.mod and go.sum
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-19 03:21:29 +00:00
lukaszraczylo 94c097bc6c fixup! Race condition in parseGraphQLQuery result pooling 2025-11-18 17:27:55 +00:00
lukaszraczylo 4e84cd7461 Race condition in parseGraphQLQuery result pooling
Under high concurrency, the sync.Pool pattern was creating a race condition
where the same result pointer was being reused by multiple concurrent requests.

The bug:
- parseGraphQLQuery() returns a pointer to 'res' from the pool
- The defer statement returns 'res' back to the pool on function exit
- While the caller is still using the returned pointer, another concurrent
  request could get the SAME pointer from the pool and modify it

This caused mutations to randomly get the wrong activeEndpoint value:
- Request A: mutation parsed → activeEndpoint set to :8080 (write)
- Request A: returns pointer to result
- Request A: defer runs → result returned to pool
- Request B: gets SAME pointer from pool
- Request B: query parsed → activeEndpoint overwritten to :8088 (read-only)
- Request A: still holding pointer, now sees :8088 instead of :8080!
- Result: mutation routed to read-only endpoint → database write failure

The fix:
Create a copy of the result before returning, so the pooled object can be
safely reused without affecting the returned value.
2025-11-18 17:03:11 +00:00
lukaszraczylo e37a8beaa7 Fix: Move endpoint routing outside loop to prevent mutation misrouting
BUG FIX: The endpoint routing logic was inside the loop that
processes all GraphQL definitions. This caused mutations to be incorrectly
routed to read-only endpoints when followed by other definitions (queries,
fragments, etc).

The bug manifested as: mutations → read-only Hasura → read-only pooler →
PostgreSQL replica → "cannot set transaction read-write mode during recovery"

Changes:
- Move endpoint routing logic AFTER the definition processing loop
- Ensures mutations are ALWAYS routed to write endpoint regardless of
  subsequent definitions in the document
- Add 3 comprehensive regression tests covering:
  1. Mutation with multiple operations
  2. Mutation followed by fragment
  3. Complex main-bot style mutation document

Tests: All pass including new regression tests
Impact: Fixes database write failures in main-bot and other services
2025-11-18 17:03:06 +00:00
lukaszraczylo 9dd8c11363 CRITICAL: Routing fix for mutations in case of the R/W replicas 2025-11-18 16:28:58 +00:00
lukaszraczylo 9fbee0d9a1 Update go.mod and go.sum
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-18 03:21:43 +00:00
lukaszraczylo 7df651c17a Update go.mod and go.sum
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-12 03:21:41 +00:00
lukaszraczylo 7ada94e4fa Fix nil pointers + improve the cleanup. 2025-11-11 10:43:07 +00:00
lukaszraczylo c510c29a8f Update go.mod and go.sum
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-11 03:22:29 +00:00
lukaszraczylo 370602858a Update go.mod and go.sum
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-09 03:21:44 +00:00
lukaszraczylo 6261be6e53 fixup! fixup! fixup! fixup! fixup! fixup! fixup! fixup! Update go.mod and go.sum 2025-11-06 16:55:12 +00:00
lukaszraczylo 5ae4ea1e25 fixup! fixup! fixup! fixup! fixup! fixup! fixup! Update go.mod and go.sum 2025-11-05 22:55:03 +00:00
41 changed files with 2963 additions and 495 deletions
+8 -64
View File
@@ -5,69 +5,13 @@ on:
schedule:
- cron: "0 3 * * *"
env:
GO_VERSION: ">=1.21"
permissions:
contents: write
actions: write
jobs:
# This job is responsible for preparation of the build
# environment variables.
prepare:
name: Preparing build context
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Install Go
uses: actions/setup-go@v5
id: cache
with:
go-version: ${{env.GO_VERSION}}
cache-dependency-path: "**/*.sum"
- name: Go get dependencies
if: steps.cache.outputs.cache-hit != 'true'
run: |
go get ./...
# This job is responsible for running tests and linting the codebase
test:
name: "Unit testing"
runs-on: ubuntu-latest
container: golang:1
needs: [prepare]
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0 # Ensure full history is checked out
token: ${{ secrets.GHCR_TOKEN }}
- name: Install Go
uses: actions/setup-go@v5
with:
go-version: ${{env.GO_VERSION}}
cache-dependency-path: "**/*.sum"
- name: Install dependencies
run: |
apt-get update
apt-get install ca-certificates make -y
update-ca-certificates
go mod tidy
go get -u -v ./...
go mod tidy -v
- name: Run unit tests
run: |
CI_RUN=${CI} make test
git config --global --add safe.directory /__w/graphql-monitoring-proxy/graphql-monitoring-proxy
- name: Commit changes
uses: stefanzweifel/git-auto-commit-action@v5
with:
commit_message: "Update go.mod and go.sum"
commit_options: "--no-verify --signoff"
file_pattern: "go.mod go.sum"
autoupdate:
uses: lukaszraczylo/shared-actions/.github/workflows/go-autoupdate.yaml@main
with:
go-version: "1.24"
release-workflow: "release.yaml"
+2 -2
View File
@@ -105,5 +105,5 @@ jobs:
summary-always: true
# auto-push only if it's on main branch
auto-push: false
gh-pages-branch: "gh-pages"
benchmark-data-dir-path: "docs"
gh-pages-branch: "main"
benchmark-data-dir-path: "docs/bench"
+60
View File
@@ -0,0 +1,60 @@
name: Release
on:
workflow_dispatch:
push:
paths-ignore:
- "**.md"
- "**/release.yaml"
- "static/**"
- "docs/**"
branches:
- main
permissions:
contents: write
packages: write
deployments: write
jobs:
release:
uses: lukaszraczylo/shared-actions/.github/workflows/go-release.yaml@main
with:
go-version: "1.24"
docker-enabled: true
secrets: inherit
benchmark:
name: Publish Benchmarks
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: "1.24"
- name: Run benchmarks
run: go test -bench=. -benchmem ./... -run=^# | tee output.txt
- name: Store benchmark result
uses: benchmark-action/github-action-benchmark@v1
with:
tool: "go"
output-file-path: output.txt
fail-on-alert: true
github-token: ${{ secrets.GITHUB_TOKEN }}
comment-on-alert: true
summary-always: true
auto-push: false
benchmark-data-dir-path: "docs/bench"
- name: Push benchmark results
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git add docs/bench
git diff --staged --quiet || git commit -m "Update benchmark results"
git push
-72
View File
@@ -1,72 +0,0 @@
name: Test and release
on:
workflow_dispatch:
push:
paths-ignore:
- "**/**.md"
- "**/**.yaml"
- "static/**"
branches:
- "main"
env:
GO_VERSION: ">=1.21"
permissions:
# deployments permission to deploy GitHub pages website
deployments: write
# contents permission to update benchmark contents in gh-pages branch
contents: write
jobs:
shared:
uses: telegram-bot-app/ci-scripts/.github/workflows/build-test-publish-inject.yaml@main
with:
enable-code-scans: false
should-deploy: false
secrets:
ghcr-token: ${{ secrets.GHCR_TOKEN }}
test:
name: "Benchmarking the results"
needs: [shared]
runs-on: ubuntu-latest
container: golang:1
# container: github/super-linter:v4
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Install Go
uses: actions/setup-go@v5
with:
go-version: ${{env.GO_VERSION}}
cache-dependency-path: "**/*.sum"
- name: Install dependencies
run: |
apt-get update
apt-get install ca-certificates make -y
update-ca-certificates
go mod tidy
git config --global --add safe.directory "$GITHUB_WORKSPACE"
- name: Run benchmark
run: |
go test -bench=. -benchmem ./... -run=^# | tee output.txt
- name: Store benchmark result
uses: benchmark-action/github-action-benchmark@v1
with:
tool: "go"
output-file-path: output.txt
fail-on-alert: true
github-token: ${{ secrets.GITHUB_TOKEN }}
comment-on-alert: true
summary-always: true
# auto-push only if it's on main branch
auto-push: true
gh-pages-branch: "gh-pages"
benchmark-data-dir-path: "docs"
+2
View File
@@ -3,3 +3,5 @@ test.sh
banned.json*
dist/
coverage.out
CLAUDE.md
graphql-monitoring-proxy
+67
View File
@@ -0,0 +1,67 @@
version: 2
before:
hooks:
- go mod tidy
builds:
- id: graphql-proxy
main: .
binary: graphql-proxy
env:
- CGO_ENABLED=0
goos:
- linux
- darwin
- windows
goarch:
- amd64
- arm64
ldflags:
- -s -w
archives:
- id: graphql-proxy
formats: [tar.gz]
name_template: "graphql-proxy-{{ .Os }}-{{ .Arch }}"
format_overrides:
- goos: windows
formats: [zip]
files:
- LICENSE
- README.md
checksum:
name_template: "graphql-proxy-checksums.txt"
algorithm: sha256
changelog:
sort: asc
filters:
exclude:
- '^docs:'
- '^test:'
- '^Merge'
- '^WIP'
- '^Update go.mod'
release:
github:
owner: lukaszraczylo
name: graphql-monitoring-proxy
name_template: "version {{.Version}}"
draft: false
prerelease: auto
dockers_v2:
- images:
- "ghcr.io/lukaszraczylo/graphql-monitoring-proxy"
tags:
- "{{ .Version }}"
- "latest"
platforms:
- linux/amd64
- linux/arm64
dockerfile: Dockerfile.goreleaser
extra_files:
- static/app
+6
View File
@@ -0,0 +1,6 @@
FROM gcr.io/distroless/base-debian12:nonroot
ARG TARGETPLATFORM
WORKDIR /go/src/app
COPY --chmod=777 --chown=nonroot:nonroot static/app /go/src/app
COPY ${TARGETPLATFORM}/graphql-proxy /go/src/app/graphql-proxy
ENTRYPOINT ["/go/src/app/graphql-proxy"]
+57 -6
View File
@@ -155,6 +155,8 @@ You can still use the non-prefixed environment variables in the spirit of the ba
| `CACHE_TTL` | The cache TTL | `60` |
| `CACHE_MAX_MEMORY_SIZE` | Maximum memory size for cache in MB | `100` |
| `CACHE_MAX_ENTRIES` | Maximum number of entries in cache | `10000` |
| `CACHE_USE_LRU` | Use LRU eviction algorithm (see [Cache Eviction](#cache-eviction-algorithms)) | `false` |
| `CACHE_PER_USER_DISABLED` | **⚠️ SECURITY**: Disable per-user cache isolation | `false` (**DO NOT** set to `true` in multi-user apps) |
| `ENABLE_REDIS_CACHE` | Enable distributed Redis cache | `false` |
| `CACHE_REDIS_URL` | URL to redis server / cluster endpoint | `localhost:6379` |
| `CACHE_REDIS_PASSWORD` | Redis connection password | `` |
@@ -347,19 +349,38 @@ The admin dashboard (`/admin`) provides:
The cache engine is enabled in the background by default, using no additional resources.
You can then start using the cache by setting the `ENABLE_GLOBAL_CACHE` or `ENABLE_REDIS_CACHE` environment variable to `true` - which will enable the cache for all queries without introspection. You can leave the global cache disabled and enable the cache for specific queries by adding the `@cached` directive to the query.
**Important**: The cache key is calculated from the **entire request body**, which includes both the GraphQL query and variables. This means:
**Important**: The cache key is calculated from the **request body + user context (user ID and role)**. This means:
- Identical queries with different variables are cached separately
- Identical queries with different variable values get their own cache entries
- This ensures correct caching behavior for parameterized queries
- **Identical queries from different users are cached separately** (security isolation)
- **Identical queries with different roles are cached separately** (prevents privilege escalation)
- This ensures correct caching behavior and prevents data leakage between users
**🔒 Security Update (v0.27.0+)**: Cache keys now include user context by default to prevent security vulnerabilities where users could see each other's cached data. This is enabled by default and should NOT be disabled in multi-user applications.
Example:
```graphql
# These two requests will have DIFFERENT cache keys:
# These requests will have DIFFERENT cache keys:
# Different variables
query GetUser($id: ID!) { user(id: $id) { name } }
variables: { "id": "123" }
variables: { "id": "123" } // Cache key: MD5(body + user:alice + role:user)
query GetUser($id: ID!) { user(id: $id) { name } }
variables: { "id": "456" }
variables: { "id": "456" } // Cache key: MD5(body + user:alice + role:user)
# Different users (SECURITY: prevents data leakage)
query GetMyProfile { me { email } }
Authorization: Bearer token_for_alice // Cache key: MD5(body + user:alice + role:user)
query GetMyProfile { me { email } }
Authorization: Bearer token_for_bob // Cache key: MD5(body + user:bob + role:user)
# Different roles (SECURITY: prevents privilege escalation)
query GetData { data { value } }
Authorization: Bearer token_admin // Cache key: MD5(body + user:alice + role:admin)
query GetData { data { value } }
Authorization: Bearer token_user // Cache key: MD5(body + user:alice + role:user)
```
In the case of the `@cached` you can add additional parameters to the directive which will set the cache for specific queries to the provided time.
@@ -419,12 +440,40 @@ These features ensure the cache runs efficiently even under high load and with l
Since version `0.5.30` the cache is gzipped in the memory, which should optimise the memory usage quite significantly.
Since version `0.15.48` the you can also use the distributed Redis cache.
#### Cache Eviction Algorithms
The proxy supports two cache eviction strategies:
**Standard (default):** Uses Go's `sync.Map` with approximate eviction. When memory limits are reached, entries are evicted based on iteration order (pseudo-random). This is memory-efficient and has excellent concurrent read performance.
**LRU (Least Recently Used):** Uses a proper LRU algorithm with a linked list to track access order. When limits are reached, the least recently accessed entries are evicted first. Enable with `CACHE_USE_LRU=true`.
| Feature | Standard | LRU |
|---------|----------|-----|
| Eviction order | Pseudo-random | Least recently used |
| Read performance | Excellent | Good |
| Memory tracking | Approximate | Precise |
| Best for | High read throughput | Cache hit optimization |
*LRU cache configuration:*
```bash
GMP_ENABLE_GLOBAL_CACHE=true
GMP_CACHE_TTL=300
GMP_CACHE_USE_LRU=true
GMP_CACHE_MAX_MEMORY_SIZE=200
GMP_CACHE_MAX_ENTRIES=5000
```
Use LRU when cache hit rate is critical and you want to ensure frequently accessed data stays cached. Use Standard (default) for maximum read throughput with less memory overhead.
#### Read-only endpoint
You can now specify the read-only GraphQL endpoint by setting the `HOST_GRAPHQL_READONLY` environment variable. The default value is empty, preventing the proxy from using the read-only endpoint for the queries and directing all the requests to the main endpoint specified as `HOST_GRAPHQL`. If the `HOST_GRAPHQL_READONLY` is set, the proxy will use the read-only endpoint for the queries with the `query` type and the main endpoint for the `mutation` type queries. Format of the read-only endpoint is the same as `HOST_GRAPHQL` endpoint, for example `http://localhost:8080/`.
You can check out the [example of combined deployment with RW and read-only hasura](static/kubernetes-single-deployment-with-ro.yaml).
**Important:** When using a read-only Hasura instance connected to a PostgreSQL read replica, you **must** disable event trigger processing on that instance by setting `HASURA_GRAPHQL_EVENTS_FETCH_INTERVAL=0` in the read-only Hasura container environment variables. This prevents the read-only instance from attempting to process event triggers (which require write access to event log tables), avoiding "cannot set transaction read-write mode during recovery" errors.
### Resilience
#### Circuit Breaker Pattern
@@ -723,6 +772,8 @@ Following tables are being cleaned:
- `hdb_catalog.hdb_cron_event_invocation_logs`
- `hdb_catalog.hdb_scheduled_event_invocation_logs`
**Important for RO/RW setups:** The `HASURA_EVENT_METADATA_DB` connection string must point to the **read-write primary database** where the `hdb_catalog` schema resides. The cleaner executes DELETE operations which require write permissions. Do not point this to a read-only replica.
### Security
+149 -20
View File
@@ -563,6 +563,26 @@
<span class="metric-label">Enabled</span>
<span class="metric-value" id="cb-enabled">--</span>
</div>
<div class="metric-row">
<span class="metric-label">Total Requests</span>
<span class="metric-value" id="cb-total-requests">--</span>
</div>
<div class="metric-row">
<span class="metric-label">Total Successes</span>
<span class="metric-value" id="cb-total-successes">--</span>
</div>
<div class="metric-row">
<span class="metric-label">Total Failures</span>
<span class="metric-value" id="cb-total-failures">--</span>
</div>
<div class="metric-row">
<span class="metric-label">Consecutive Successes</span>
<span class="metric-value" id="cb-consecutive-successes">--</span>
</div>
<div class="metric-row">
<span class="metric-label">Consecutive Failures</span>
<span class="metric-value" id="cb-consecutive-failures">--</span>
</div>
<div class="metric-row">
<span class="metric-label">Max Failures</span>
<span class="metric-value" id="cb-max-failures">--</span>
@@ -1055,6 +1075,63 @@
// Update all statistics
function updateAllStats(data) {
// Check if this is cluster mode data
if (data.cluster_mode && data.stats) {
// Cluster mode: data is structured differently
// Stats contains aggregated data with nested objects
const stats = data.stats;
// Update cluster status section
document.getElementById('cluster-status-section').style.display = 'block';
document.getElementById('cluster-total-instances').textContent = data.total_instances || 0;
document.getElementById('cluster-healthy-instances').textContent = data.healthy_instances || 0;
document.getElementById('overview-title').textContent = 'Cluster Overview';
// Update cluster info in toggle
const totalInstances = data.total_instances || 0;
document.getElementById('cluster-info').textContent =
`(${totalInstances} instance${totalInstances !== 1 ? 's' : ''} available)`;
// Build stats object with uptime from cluster_uptime
const statsWithUptime = {
...stats,
uptime_seconds: stats.cluster_uptime || 0,
uptime_human: formatUptime(stats.cluster_uptime || 0)
};
updateStats(statsWithUptime);
// Extract nested objects from stats for cluster mode
if (stats.circuit_breaker) updateCircuitBreaker(stats.circuit_breaker);
if (stats.coalescing) updateCoalescing(stats.coalescing);
if (stats.retry_budget) updateRetryBudget(stats.retry_budget);
if (stats.websocket) updateWebSocket(stats.websocket);
if (stats.connections) updateConnections(stats.connections);
// Handle memory for cluster mode (Redis doesn't track memory per instance)
if (stats.memory) {
const totalMemMB = stats.memory.total_usage_mb;
if (totalMemMB < 0) {
// All instances are using Redis cache
document.getElementById('cache-memory').textContent = 'N/A';
document.getElementById('cache-memory').title = 'Memory tracking not available for Redis cache';
document.getElementById('cache-memory-pct').textContent = 'Redis cache';
document.getElementById('memory-progress').style.width = '0%';
} else {
document.getElementById('cache-memory').textContent = totalMemMB.toFixed(2) + ' MB';
document.getElementById('cache-memory-pct').textContent = 'Cluster total';
}
}
// Update instance list if available
if (data.instances && data.instances.length > 0) {
document.getElementById('instance-details-section').style.display = 'block';
updateInstanceList(data.instances, null);
}
return;
}
// Non-cluster mode: original behavior
if (data.stats) updateStats(data.stats);
if (data.health) updateHealth(data.health);
if (data.circuit_breaker) updateCircuitBreaker(data.circuit_breaker);
@@ -1200,14 +1277,54 @@
stateEl.classList.remove('loading');
let badgeClass = 'badge-info';
if (data.state === 'closed') badgeClass = 'badge-success';
else if (data.state === 'open') badgeClass = 'badge-danger';
else if (data.state === 'half-open') badgeClass = 'badge-warning';
let stateText = data.state || 'unknown';
stateEl.innerHTML = `<span class="badge ${badgeClass}">${data.state || 'Unknown'}</span>`;
// For cluster mode, determine state from instance counts
if (data.instances_open !== undefined) {
// Cluster mode data
if (data.instances_open > 0) {
stateText = `${data.instances_open} open`;
badgeClass = 'badge-danger';
} else if (data.instances_halfopen > 0) {
stateText = `${data.instances_halfopen} half-open`;
badgeClass = 'badge-warning';
} else if (data.instances_closed > 0) {
stateText = `${data.instances_closed} closed`;
badgeClass = 'badge-success';
}
} else {
// Single instance mode
if (data.state === 'closed') badgeClass = 'badge-success';
else if (data.state === 'open') badgeClass = 'badge-danger';
else if (data.state === 'half-open') badgeClass = 'badge-warning';
}
stateEl.innerHTML = `<span class="badge ${badgeClass}">${stateText}</span>`;
document.getElementById('cb-enabled').textContent = data.enabled ? 'Yes' : 'No';
if (data.counts) {
// Single instance mode with detailed counts
document.getElementById('cb-total-requests').textContent =
(data.counts.requests || 0).toLocaleString();
document.getElementById('cb-total-successes').textContent =
(data.counts.total_successes || 0).toLocaleString();
document.getElementById('cb-total-failures').textContent =
(data.counts.total_failures || 0).toLocaleString();
document.getElementById('cb-consecutive-successes').textContent =
(data.counts.consecutive_successes || 0).toLocaleString();
document.getElementById('cb-consecutive-failures').textContent =
(data.counts.consecutive_failures || 0).toLocaleString();
} else if (data.instances_open !== undefined) {
// Cluster mode - show instance distribution instead
const total = (data.instances_open || 0) + (data.instances_closed || 0) + (data.instances_halfopen || 0);
document.getElementById('cb-total-requests').textContent = total + ' instances';
document.getElementById('cb-total-successes').textContent = (data.instances_closed || 0).toLocaleString();
document.getElementById('cb-total-failures').textContent = (data.instances_open || 0).toLocaleString();
document.getElementById('cb-consecutive-successes').textContent = '--';
document.getElementById('cb-consecutive-failures').textContent = '--';
}
if (data.config) {
document.getElementById('cb-max-failures').textContent = data.config.max_failures || '--';
document.getElementById('cb-timeout').textContent = (data.config.timeout || '--') + 's';
@@ -1217,23 +1334,31 @@
function updateCoalescing(data) {
document.getElementById('coalescing-rate').textContent =
(data.backend_savings_pct || 0).toFixed(1) + '%';
document.getElementById('coalescing-total').textContent =
(data.total_requests || 0).toLocaleString();
document.getElementById('coalescing-primary').textContent =
(data.primary_requests || 0).toLocaleString();
document.getElementById('coalescing-coalesced').textContent =
(data.coalesced_requests || 0).toLocaleString();
// Handle both single instance (total_requests) and cluster mode (total_coalesced + total_primary)
const totalRequests = data.total_requests ||
((data.total_coalesced_requests || 0) + (data.total_primary_requests || 0));
document.getElementById('coalescing-total').textContent = totalRequests.toLocaleString();
// Handle both single instance and cluster mode field names
const primaryRequests = data.primary_requests || data.total_primary_requests || 0;
document.getElementById('coalescing-primary').textContent = primaryRequests.toLocaleString();
const coalescedRequests = data.coalesced_requests || data.total_coalesced_requests || 0;
document.getElementById('coalescing-coalesced').textContent = coalescedRequests.toLocaleString();
document.getElementById('coalescing-savings').textContent =
(data.backend_savings_pct || 0).toFixed(1) + '%';
}
function updateRetryBudget(data) {
document.getElementById('retry-tokens').textContent =
data.current_tokens || '--';
document.getElementById('retry-current-tokens').textContent =
data.current_tokens || '--';
document.getElementById('retry-max-tokens').textContent =
data.max_tokens || '--';
// Use explicit undefined check to handle 0 values correctly
const currentTokens = data.current_tokens !== undefined ? data.current_tokens : '--';
const maxTokens = data.max_tokens !== undefined ? data.max_tokens : '--';
document.getElementById('retry-tokens').textContent = currentTokens;
document.getElementById('retry-current-tokens').textContent = currentTokens;
document.getElementById('retry-max-tokens').textContent = maxTokens;
document.getElementById('retry-total').textContent =
(data.total_attempts || 0).toLocaleString();
document.getElementById('retry-denied').textContent =
@@ -1243,13 +1368,17 @@
}
function updateWebSocket(data) {
document.getElementById('ws-connections').textContent =
data.active_connections || 0;
// Handle both single instance (active_connections) and cluster mode (total_connections)
const connections = data.active_connections !== undefined ? data.active_connections :
(data.total_connections !== undefined ? data.total_connections : 0);
document.getElementById('ws-connections').textContent = connections;
}
function updateConnections(data) {
document.getElementById('pool-connections').textContent =
data.active_connections || 0;
// Handle both single instance (active_connections) and cluster mode (total_active)
const connections = data.active_connections !== undefined ? data.active_connections :
(data.total_active !== undefined ? data.total_active : 0);
document.getElementById('pool-connections').textContent = connections;
}
async function resetCoalescing() {
+174 -14
View File
@@ -79,9 +79,44 @@ func (ad *AdminDashboard) serveDashboard(c *fiber.Ctx) error {
}
// getStats returns overall proxy statistics
// In cluster mode (when metrics aggregator is available), returns aggregated stats from all instances
func (ad *AdminDashboard) getStats(c *fiber.Ctx) error {
// Check if cluster mode is enabled - if so, return aggregated stats
if aggregator := GetMetricsAggregator(); aggregator != nil {
metrics, err := aggregator.GetAggregatedMetrics()
if err != nil {
if ad.logger != nil {
ad.logger.Error(&libpack_logger.LogMessage{
Message: "Failed to get aggregated metrics, falling back to local stats",
Pairs: map[string]interface{}{"error": err.Error()},
})
}
// Fall through to local stats on error
} else {
// Return aggregated cluster stats
response := map[string]interface{}{
"cluster_mode": true,
"total_instances": metrics.TotalInstances,
"healthy_instances": metrics.HealthyInstances,
"timestamp": metrics.LastUpdate.Format(time.RFC3339),
"version": "0.27.0",
}
// Add combined stats from aggregation
if metrics.CombinedStats != nil {
for k, v := range metrics.CombinedStats {
response[k] = v
}
}
return c.JSON(response)
}
}
// Local instance stats (fallback or non-cluster mode)
uptimeSeconds := time.Since(startTime).Seconds()
stats := map[string]interface{}{
"cluster_mode": false,
"timestamp": time.Now().Format(time.RFC3339),
"uptime_seconds": uptimeSeconds,
"uptime_human": formatDuration(time.Since(startTime)),
@@ -208,9 +243,17 @@ func (ad *AdminDashboard) getCircuitBreakerStatus(c *fiber.Ctx) error {
if cb != nil {
cbMutex.RLock()
state := cb.State()
counts := cb.Counts()
cbMutex.RUnlock()
status["state"] = state.String()
status["counts"] = map[string]interface{}{
"requests": counts.Requests,
"total_successes": counts.TotalSuccesses,
"total_failures": counts.TotalFailures,
"consecutive_successes": counts.ConsecutiveSuccesses,
"consecutive_failures": counts.ConsecutiveFailures,
}
status["config"] = map[string]interface{}{
"max_failures": cfg.CircuitBreaker.MaxFailures,
"failure_ratio": cfg.CircuitBreaker.FailureRatio,
@@ -225,9 +268,62 @@ func (ad *AdminDashboard) getCircuitBreakerStatus(c *fiber.Ctx) error {
}
// getCacheStats returns cache statistics
// In cluster mode, returns aggregated cache stats from all instances
func (ad *AdminDashboard) getCacheStats(c *fiber.Ctx) error {
// Check if cluster mode is enabled - if so, return aggregated cache stats
if aggregator := GetMetricsAggregator(); aggregator != nil {
metrics, err := aggregator.GetAggregatedMetrics()
if err != nil {
if ad.logger != nil {
ad.logger.Error(&libpack_logger.LogMessage{
Message: "Failed to get aggregated cache metrics, falling back to local stats",
Pairs: map[string]interface{}{"error": err.Error()},
})
}
// Fall through to local stats on error
} else {
// Build aggregated cache stats from combined stats
response := map[string]interface{}{
"cluster_mode": true,
"total_instances": metrics.TotalInstances,
}
// Add cache config from local config
if cfg != nil {
response["enabled"] = cfg.Cache.CacheEnable
response["redis_enabled"] = cfg.Cache.CacheRedisEnable
response["ttl_seconds"] = cfg.Cache.CacheTTL
response["max_memory_mb"] = cfg.Cache.CacheMaxMemorySize
response["max_entries"] = cfg.Cache.CacheMaxEntries
}
// Extract aggregated cache stats from combined stats
if metrics.CombinedStats != nil {
if cacheHits, ok := metrics.CombinedStats["cache_hits"]; ok {
response["cache_hits"] = cacheHits
}
if cacheMisses, ok := metrics.CombinedStats["cache_misses"]; ok {
response["cache_misses"] = cacheMisses
}
if cachedQueries, ok := metrics.CombinedStats["cached_queries"]; ok {
response["cached_queries"] = cachedQueries
}
if hitRate, ok := metrics.CombinedStats["cache_hit_rate_pct"]; ok {
response["hit_rate_pct"] = hitRate
}
if memoryMB, ok := metrics.CombinedStats["memory_usage_mb"]; ok {
response["memory_usage_mb"] = memoryMB
}
}
return c.JSON(response)
}
}
// Local instance stats (fallback or non-cluster mode)
stats := map[string]interface{}{
"enabled": false,
"cluster_mode": false,
"enabled": false,
}
if cfg != nil {
@@ -333,7 +429,7 @@ func (ad *AdminDashboard) getWebSocketStats(c *fiber.Ctx) error {
// clearCache clears the cache
func (ad *AdminDashboard) clearCache(c *fiber.Ctx) error {
// TODO: Implement cache clearing
libpack_cache.CacheClear()
return c.JSON(map[string]interface{}{
"success": true,
"message": "Cache cleared successfully",
@@ -582,8 +678,8 @@ func (ad *AdminDashboard) handleStatsWebSocket(c *websocket.Conn) {
ticker := time.NewTicker(2 * time.Second)
defer ticker.Stop()
// Send initial stats immediately
if stats := ad.gatherAllStats(); stats != nil {
// Send initial stats immediately (cluster-aware for dashboard)
if stats := ad.gatherAllStatsClusterAware(); stats != nil {
if data, err := json.Marshal(stats); err == nil {
c.WriteMessage(websocket.TextMessage, data)
}
@@ -593,8 +689,8 @@ func (ad *AdminDashboard) handleStatsWebSocket(c *websocket.Conn) {
for {
select {
case <-ticker.C:
// Gather all stats
stats := ad.gatherAllStats()
// Gather all stats (cluster-aware for dashboard)
stats := ad.gatherAllStatsClusterAware()
// Marshal to JSON
data, err := json.Marshal(stats)
@@ -627,8 +723,56 @@ func (ad *AdminDashboard) handleStatsWebSocket(c *websocket.Conn) {
}
// gatherAllStats collects all statistics into a single structure
// This always returns LOCAL stats for this instance (used by metrics aggregator)
func (ad *AdminDashboard) gatherAllStats() map[string]interface{} {
return ad.gatherAllStatsWithMode(false)
}
// gatherAllStatsClusterAware collects statistics with cluster awareness
// If cluster mode is available, returns aggregated stats from all instances
func (ad *AdminDashboard) gatherAllStatsClusterAware() map[string]interface{} {
return ad.gatherAllStatsWithMode(true)
}
// gatherAllStatsWithMode collects statistics with optional cluster mode
func (ad *AdminDashboard) gatherAllStatsWithMode(useClusterMode bool) map[string]interface{} {
// Check if cluster mode is requested and available
if useClusterMode {
if aggregator := GetMetricsAggregator(); aggregator != nil {
metrics, err := aggregator.GetAggregatedMetrics()
if err == nil && metrics != nil {
// Return aggregated cluster stats
result := map[string]interface{}{
"cluster_mode": true,
"total_instances": metrics.TotalInstances,
"healthy_instances": metrics.HealthyInstances,
}
// Build stats section from combined stats
stats := map[string]interface{}{
"timestamp": metrics.LastUpdate.Format(time.RFC3339),
"version": "0.27.0",
}
// Copy all combined stats
if metrics.CombinedStats != nil {
for k, v := range metrics.CombinedStats {
stats[k] = v
}
}
result["stats"] = stats
// Add per-instance details
result["instances"] = metrics.Instances
return result
}
}
}
// Fall back to local stats
result := make(map[string]interface{})
result["cluster_mode"] = false
// Main stats
uptimeSeconds := time.Since(startTime).Seconds()
@@ -732,9 +876,17 @@ func (ad *AdminDashboard) gatherAllStats() map[string]interface{} {
if cb != nil {
cbMutex.RLock()
state := cb.State()
counts := cb.Counts()
cbMutex.RUnlock()
cbStatus["state"] = state.String()
cbStatus["counts"] = map[string]interface{}{
"requests": counts.Requests,
"total_successes": counts.TotalSuccesses,
"total_failures": counts.TotalFailures,
"consecutive_successes": counts.ConsecutiveSuccesses,
"consecutive_failures": counts.ConsecutiveFailures,
}
cbStatus["config"] = map[string]interface{}{
"max_failures": cfg.CircuitBreaker.MaxFailures,
"failure_ratio": cfg.CircuitBreaker.FailureRatio,
@@ -771,16 +923,24 @@ func (ad *AdminDashboard) gatherAllStats() map[string]interface{} {
}
cacheStats["hit_rate_pct"] = hitRate
memoryUsage := libpack_cache.GetCacheMemoryUsage()
maxMemory := libpack_cache.GetCacheMaxMemorySize()
cacheStats["memory_usage_bytes"] = memoryUsage
cacheStats["memory_usage_mb"] = float64(memoryUsage) / (1024 * 1024)
// Only get memory usage for in-memory cache (not Redis)
if cfg.Cache.CacheEnable && !cfg.Cache.CacheRedisEnable {
memoryUsage := libpack_cache.GetCacheMemoryUsage()
maxMemory := libpack_cache.GetCacheMaxMemorySize()
cacheStats["memory_usage_bytes"] = memoryUsage
cacheStats["memory_usage_mb"] = float64(memoryUsage) / (1024 * 1024)
memoryUsagePct := 0.0
if maxMemory > 0 {
memoryUsagePct = float64(memoryUsage) / float64(maxMemory) * 100
memoryUsagePct := 0.0
if maxMemory > 0 {
memoryUsagePct = float64(memoryUsage) / float64(maxMemory) * 100
}
cacheStats["memory_usage_pct"] = memoryUsagePct
} else {
// For Redis cache, memory tracking is not available per instance
cacheStats["memory_usage_bytes"] = int64(-1)
cacheStats["memory_usage_mb"] = float64(-1)
cacheStats["memory_usage_pct"] = float64(-1)
}
cacheStats["memory_usage_pct"] = memoryUsagePct
}
}
result["cache"] = cacheStats
+8 -4
View File
@@ -214,12 +214,16 @@ func TestAdminDashboard_GetCacheStats(t *testing.T) {
CacheRedisEnable bool
CacheMaxMemorySize int
CacheMaxEntries int
CacheUseLRU bool
GraphQLQueryCacheSize int
PerUserCacheDisabled bool
}{
CacheEnable: true,
CacheTTL: 60,
CacheMaxMemorySize: 100,
CacheMaxEntries: 10000,
CacheEnable: true,
CacheTTL: 60,
CacheMaxMemorySize: 100,
CacheMaxEntries: 10000,
CacheUseLRU: false,
PerUserCacheDisabled: false,
},
}
+3 -1
View File
@@ -170,9 +170,11 @@ func apiCircuitBreakerHealth(c *fiber.Ctx) error {
})
}
// Get circuit breaker state
// Get circuit breaker state with proper mutex protection
cbMutex.RLock()
state := cb.State()
counts := cb.Counts()
cbMutex.RUnlock()
// Determine health status
var status string
+61 -22
View File
@@ -3,6 +3,7 @@ package libpack_cache
import (
"bytes"
"compress/gzip"
"fmt"
"io"
"sync/atomic"
"time"
@@ -26,8 +27,11 @@ type CacheConfig struct {
Memory struct {
MaxMemorySize int64 `json:"max_memory_size"` // Maximum memory size in bytes
MaxEntries int64 `json:"max_entries"` // Maximum number of entries
UseLRU bool `json:"use_lru"` // Use LRU eviction algorithm instead of random eviction
}
TTL int `json:"ttl"`
TTL int `json:"ttl"`
IncludeUserContext bool `json:"include_user_context"` // Include user ID and role in cache key
PerUserCacheDisabled bool `json:"per_user_cache_disabled"` // Disable per-user caching (backward compatibility)
}
type CacheStats struct {
@@ -52,10 +56,14 @@ var (
config *CacheConfig
)
// CalculateHash generates an MD5 hash from the request body.
// CalculateHash generates an MD5 hash from the request body and optionally user context.
// For GraphQL requests, this includes both the query and variables,
// ensuring that identical queries with different variables are cached separately.
//
// SECURITY FIX: This function now includes user ID and role in the cache key by default
// to prevent data leakage between authenticated users. Set CACHE_PER_USER_DISABLED=true
// to revert to the old behavior (NOT RECOMMENDED for multi-user applications).
//
// Example GraphQL request body:
//
// {
@@ -63,9 +71,30 @@ var (
// "variables": { "id": "123" }
// }
//
// Different variable values will produce different cache keys.
func CalculateHash(c *fiber.Ctx) string {
return strutil.Md5(c.Body())
// With user context enabled (default):
// - Same query, same variables, same user → same cache key
// - Same query, same variables, different user → different cache key
//
// Different variable values will always produce different cache keys.
func CalculateHash(c *fiber.Ctx, userID string, userRole string) string {
cacheKeyData := string(c.Body())
// Include user context in cache key (default behavior for security)
// Only skip if explicitly disabled via configuration (backward compatibility)
if config != nil && !config.PerUserCacheDisabled {
// Normalize empty user values to prevent cache key collisions
if userID == "" {
userID = "-"
}
if userRole == "" {
userRole = "-"
}
// Append user context to ensure cache isolation between users
cacheKeyData = fmt.Sprintf("%s|uid:%s|role:%s", cacheKeyData, userID, userRole)
}
return strutil.Md5(cacheKeyData)
}
func EnableCache(cfg *CacheConfig) {
@@ -96,34 +125,41 @@ func EnableCache(cfg *CacheConfig) {
cfg.Client = libpack_cache_redis.NewCacheWrapper(redisClient, cfg.Logger)
}
} else {
// Calculate memory and entry limits
maxMemory := cfg.Memory.MaxMemorySize
if maxMemory <= 0 {
maxMemory = libpack_cache_memory.DefaultMaxMemorySize
}
maxEntries := cfg.Memory.MaxEntries
if maxEntries <= 0 {
maxEntries = libpack_cache_memory.DefaultMaxCacheSize
}
cacheType := "standard"
if cfg.Memory.UseLRU {
cacheType = "LRU"
}
cfg.Logger.Debug(&libpack_logger.LogMessage{
Message: "Using in-memory cache",
Pairs: map[string]interface{}{
"max_memory_size_bytes": cfg.Memory.MaxMemorySize,
"max_entries": cfg.Memory.MaxEntries,
"type": cacheType,
"max_memory_size_bytes": maxMemory,
"max_entries": maxEntries,
},
})
// Use memory size and entry limits if configured, otherwise use defaults
if cfg.Memory.MaxMemorySize > 0 || cfg.Memory.MaxEntries > 0 {
maxMemory := cfg.Memory.MaxMemorySize
if maxMemory <= 0 {
maxMemory = libpack_cache_memory.DefaultMaxMemorySize
}
maxEntries := cfg.Memory.MaxEntries
if maxEntries <= 0 {
maxEntries = libpack_cache_memory.DefaultMaxCacheSize
}
if cfg.Memory.UseLRU {
// Use LRU cache with proper eviction algorithm
cfg.Client = libpack_cache_memory.NewLRUMemoryCache(maxMemory, maxEntries)
} else {
// Use standard sync.Map-based cache
cfg.Client = libpack_cache_memory.NewWithSize(
time.Duration(cfg.TTL)*time.Second,
maxMemory,
maxEntries,
)
} else {
// Backward compatibility
cfg.Client = libpack_cache_memory.New(time.Duration(cfg.TTL) * time.Second)
}
}
config = cfg
@@ -233,6 +269,9 @@ func CacheGetQueries() int64 {
}
func CacheClear() {
if !IsCacheInitialized() {
return
}
config.Client.Clear()
cacheStats = &CacheStats{}
}
+78 -16
View File
@@ -20,7 +20,7 @@ func (suite *Tests) Test_CalculateHash() {
// Test with empty body
suite.Run("empty body", func() {
ctx.Request().SetBody([]byte(""))
hash := CalculateHash(ctx)
hash := CalculateHash(ctx, "user1", "admin")
assert.NotEmpty(hash)
assert.Equal(32, len(hash)) // MD5 hash is 32 characters
})
@@ -28,7 +28,7 @@ func (suite *Tests) Test_CalculateHash() {
// Test with non-empty body
suite.Run("non-empty body", func() {
ctx.Request().SetBody([]byte("test body"))
hash := CalculateHash(ctx)
hash := CalculateHash(ctx, "user1", "admin")
assert.NotEmpty(hash)
assert.Equal(32, len(hash))
})
@@ -36,10 +36,10 @@ func (suite *Tests) Test_CalculateHash() {
// Test with different bodies produce different hashes
suite.Run("different bodies", func() {
ctx.Request().SetBody([]byte("body1"))
hash1 := CalculateHash(ctx)
hash1 := CalculateHash(ctx, "user1", "admin")
ctx.Request().SetBody([]byte("body2"))
hash2 := CalculateHash(ctx)
hash2 := CalculateHash(ctx, "user1", "admin")
assert.NotEqual(hash1, hash2)
})
@@ -51,10 +51,10 @@ func (suite *Tests) Test_CalculateHash() {
query2 := []byte(`{"query":"query GetUser($id: ID!) { user(id: $id) { name } }","variables":{"id":"456"}}`)
ctx.Request().SetBody(query1)
hash1 := CalculateHash(ctx)
hash1 := CalculateHash(ctx, "user1", "admin")
ctx.Request().SetBody(query2)
hash2 := CalculateHash(ctx)
hash2 := CalculateHash(ctx, "user1", "admin")
assert.NotEqual(hash1, hash2, "Different variables should produce different cache keys")
})
@@ -66,13 +66,83 @@ func (suite *Tests) Test_CalculateHash() {
query2 := []byte(`{"query":"query GetUsers { users { name } }","variables":{}}`)
ctx.Request().SetBody(query1)
hash1 := CalculateHash(ctx)
hash1 := CalculateHash(ctx, "user1", "admin")
ctx.Request().SetBody(query2)
hash2 := CalculateHash(ctx)
hash2 := CalculateHash(ctx, "user1", "admin")
assert.NotEqual(hash1, hash2, "Query with and without variables object should produce different cache keys")
})
// SECURITY TEST: Different users should get different cache keys
suite.Run("different users produce different cache keys", func() {
// Same query, same variables, but different users - CRITICAL SECURITY TEST
query := []byte(`{"query":"query GetMyProfile { me { id email } }"}`)
ctx.Request().SetBody(query)
hash1 := CalculateHash(ctx, "user1", "admin")
hash2 := CalculateHash(ctx, "user2", "user")
assert.NotEqual(hash1, hash2, "Different users MUST produce different cache keys to prevent data leakage")
})
// SECURITY TEST: Same user should get same cache key
suite.Run("same user produces same cache key", func() {
// Same query, same user
query := []byte(`{"query":"query GetMyProfile { me { id email } }"}`)
ctx.Request().SetBody(query)
hash1 := CalculateHash(ctx, "user1", "admin")
hash2 := CalculateHash(ctx, "user1", "admin")
assert.Equal(hash1, hash2, "Same user should get same cache key for cache effectiveness")
})
// SECURITY TEST: Different roles should get different cache keys
suite.Run("different roles produce different cache keys", func() {
// Same query, same user ID, but different roles
query := []byte(`{"query":"query GetData { data { value } }"}`)
ctx.Request().SetBody(query)
hash1 := CalculateHash(ctx, "user1", "admin")
hash2 := CalculateHash(ctx, "user1", "user")
assert.NotEqual(hash1, hash2, "Different roles MUST produce different cache keys to prevent privilege escalation")
})
// SECURITY TEST: Empty user context should be normalized
suite.Run("empty user context is normalized", func() {
query := []byte(`{"query":"query GetPublic { public { data } }"}`)
ctx.Request().SetBody(query)
// Empty strings should be normalized to "-"
hash1 := CalculateHash(ctx, "", "")
hash2 := CalculateHash(ctx, "-", "-")
assert.Equal(hash1, hash2, "Empty user context should be normalized to prevent cache key collisions")
})
// BACKWARD COMPATIBILITY TEST: Legacy mode without user context
suite.Run("legacy mode without user context", func() {
// Setup config with per-user cache disabled
oldConfig := config
config = &CacheConfig{
Logger: libpack_logger.New(),
Client: libpack_cache_memory.New(5 * time.Minute),
TTL: 60,
PerUserCacheDisabled: true, // Disable per-user caching
}
defer func() { config = oldConfig }()
query := []byte(`{"query":"query GetData { data { value } }"}`)
ctx.Request().SetBody(query)
// In legacy mode, different users should get the SAME cache key (backward compatibility)
hash1 := CalculateHash(ctx, "user1", "admin")
hash2 := CalculateHash(ctx, "user2", "user")
assert.Equal(hash1, hash2, "With per-user cache disabled, all users get same cache key (backward compatibility)")
})
}
func (suite *Tests) Test_CacheDelete() {
@@ -112,8 +182,6 @@ func (suite *Tests) Test_CacheDelete() {
suite.Run("uninitialized cache", func() {
// Save current config
oldConfig := config
// Set config to nil
config = nil
// This should not cause any errors
@@ -156,8 +224,6 @@ func (suite *Tests) Test_CacheStoreWithTTL() {
suite.Run("uninitialized cache", func() {
// Save current config
oldConfig := config
// Set config to nil
config = nil
// This should not cause any errors
@@ -194,8 +260,6 @@ func (suite *Tests) Test_CacheGetQueries() {
suite.Run("uninitialized cache", func() {
// Save current config
oldConfig := config
// Set config to nil
config = nil
// This should return 0
@@ -280,8 +344,6 @@ func (suite *Tests) Test_GetCacheStats() {
suite.Run("uninitialized cache", func() {
// Save current config
oldConfig := config
// Set config to nil
config = nil
// This should return empty stats
+5
View File
@@ -279,3 +279,8 @@ func (c *LRUMemoryCache) GetMemoryUsage() int64 {
func (c *LRUMemoryCache) GetMaxMemorySize() int64 {
return c.maxMemorySize
}
// CountQueries returns the number of entries in the cache
func (c *LRUMemoryCache) CountQueries() int64 {
return atomic.LoadInt64(&c.currentCount)
}
+343
View File
@@ -0,0 +1,343 @@
package libpack_cache_memory
import (
"fmt"
"sync"
"testing"
"time"
"github.com/stretchr/testify/suite"
)
type LRUMemoryCacheTestSuite struct {
suite.Suite
}
func TestLRUMemoryCacheTestSuite(t *testing.T) {
suite.Run(t, new(LRUMemoryCacheTestSuite))
}
func (suite *LRUMemoryCacheTestSuite) TestNewLRUMemoryCache() {
cache := NewLRUMemoryCache(1024*1024, 100) // 1MB, 100 entries
suite.NotNil(cache)
suite.Equal(int64(0), cache.CountQueries())
suite.Equal(int64(0), cache.GetMemoryUsage())
suite.Equal(int64(1024*1024), cache.GetMaxMemorySize())
}
func (suite *LRUMemoryCacheTestSuite) TestSetAndGet() {
cache := NewLRUMemoryCache(1024*1024, 100)
// Set a value
cache.Set("key1", []byte("value1"), 5*time.Second)
// Get the value
val, found := cache.Get("key1")
suite.True(found)
suite.Equal([]byte("value1"), val)
// Get non-existent key
val, found = cache.Get("nonexistent")
suite.False(found)
suite.Nil(val)
}
func (suite *LRUMemoryCacheTestSuite) TestUpdateExisting() {
cache := NewLRUMemoryCache(1024*1024, 100)
cache.Set("key1", []byte("value1"), 5*time.Second)
cache.Set("key1", []byte("value2"), 5*time.Second)
val, found := cache.Get("key1")
suite.True(found)
suite.Equal([]byte("value2"), val)
suite.Equal(int64(1), cache.CountQueries())
}
func (suite *LRUMemoryCacheTestSuite) TestDelete() {
cache := NewLRUMemoryCache(1024*1024, 100)
cache.Set("key1", []byte("value1"), 5*time.Second)
suite.Equal(int64(1), cache.CountQueries())
cache.Delete("key1")
suite.Equal(int64(0), cache.CountQueries())
val, found := cache.Get("key1")
suite.False(found)
suite.Nil(val)
// Delete non-existent key should not panic
cache.Delete("nonexistent")
}
func (suite *LRUMemoryCacheTestSuite) TestClear() {
cache := NewLRUMemoryCache(1024*1024, 100)
cache.Set("key1", []byte("value1"), 5*time.Second)
cache.Set("key2", []byte("value2"), 5*time.Second)
cache.Set("key3", []byte("value3"), 5*time.Second)
suite.Equal(int64(3), cache.CountQueries())
cache.Clear()
suite.Equal(int64(0), cache.CountQueries())
suite.Equal(int64(0), cache.GetMemoryUsage())
_, found := cache.Get("key1")
suite.False(found)
}
func (suite *LRUMemoryCacheTestSuite) TestExpiration() {
cache := NewLRUMemoryCache(1024*1024, 100)
cache.Set("key1", []byte("value1"), 100*time.Millisecond)
// Should exist immediately
val, found := cache.Get("key1")
suite.True(found)
suite.Equal([]byte("value1"), val)
// Wait for expiration
time.Sleep(150 * time.Millisecond)
// Should be expired
val, found = cache.Get("key1")
suite.False(found)
suite.Nil(val)
}
func (suite *LRUMemoryCacheTestSuite) TestEvictionByCount() {
cache := NewLRUMemoryCache(1024*1024, 3) // Max 3 entries
cache.Set("key1", []byte("value1"), 5*time.Second)
cache.Set("key2", []byte("value2"), 5*time.Second)
cache.Set("key3", []byte("value3"), 5*time.Second)
// All 3 should exist
_, found := cache.Get("key1")
suite.True(found)
_, found = cache.Get("key2")
suite.True(found)
_, found = cache.Get("key3")
suite.True(found)
// Add 4th entry - should evict oldest (key1)
cache.Set("key4", []byte("value4"), 5*time.Second)
suite.Equal(int64(3), cache.CountQueries())
// key1 should be evicted (it was least recently used)
_, found = cache.Get("key1")
suite.False(found)
// Others should still exist
_, found = cache.Get("key2")
suite.True(found)
_, found = cache.Get("key3")
suite.True(found)
_, found = cache.Get("key4")
suite.True(found)
}
func (suite *LRUMemoryCacheTestSuite) TestLRUOrder() {
cache := NewLRUMemoryCache(1024*1024, 3) // Max 3 entries
cache.Set("key1", []byte("value1"), 5*time.Second)
cache.Set("key2", []byte("value2"), 5*time.Second)
cache.Set("key3", []byte("value3"), 5*time.Second)
// Access key1 to make it recently used
cache.Get("key1")
// Add key4 - should evict key2 (now least recently used)
cache.Set("key4", []byte("value4"), 5*time.Second)
// key2 should be evicted
_, found := cache.Get("key2")
suite.False(found)
// key1 should still exist (was accessed recently)
_, found = cache.Get("key1")
suite.True(found)
}
func (suite *LRUMemoryCacheTestSuite) TestEvictionByMemory() {
// Small memory limit - 500 bytes
cache := NewLRUMemoryCache(500, 100)
// Each entry has ~64 bytes overhead + key + value
cache.Set("key1", []byte("value1"), 5*time.Second)
cache.Set("key2", []byte("value2"), 5*time.Second)
cache.Set("key3", []byte("value3"), 5*time.Second)
// Add large entry that should trigger eviction
largeValue := make([]byte, 200)
cache.Set("large", largeValue, 5*time.Second)
// Memory should be under limit
suite.LessOrEqual(cache.GetMemoryUsage(), int64(500))
}
func (suite *LRUMemoryCacheTestSuite) TestCompression() {
cache := NewLRUMemoryCache(1024*1024, 100)
// Create a compressible value (> 1KB to trigger compression)
largeValue := make([]byte, 2048)
for i := range largeValue {
largeValue[i] = 'A' // Highly compressible
}
cache.Set("compressed", largeValue, 5*time.Second)
// Should be able to retrieve it correctly
val, found := cache.Get("compressed")
suite.True(found)
suite.Equal(largeValue, val)
}
func (suite *LRUMemoryCacheTestSuite) TestGetStats() {
cache := NewLRUMemoryCache(1024*1024, 100)
cache.Set("key1", []byte("value1"), 5*time.Second)
cache.Set("key2", []byte("value2"), 5*time.Second)
stats := cache.GetStats()
suite.Equal(int64(2), stats["entries"])
suite.Equal(int64(1024*1024), stats["max_memory"])
suite.Equal(int64(100), stats["max_entries"])
suite.NotNil(stats["memory_bytes"])
suite.NotNil(stats["fill_percent"])
}
func (suite *LRUMemoryCacheTestSuite) TestConcurrentAccess() {
cache := NewLRUMemoryCache(10*1024*1024, 1000)
const numGoroutines = 50
const numOperations = 500
var wg sync.WaitGroup
wg.Add(numGoroutines * 3) // readers, writers, deleters
// Writers
for i := 0; i < numGoroutines; i++ {
go func(id int) {
defer wg.Done()
for j := 0; j < numOperations; j++ {
key := fmt.Sprintf("key-%d-%d", id, j)
value := []byte(fmt.Sprintf("value-%d-%d", id, j))
cache.Set(key, value, 5*time.Second)
}
}(i)
}
// Readers
for i := 0; i < numGoroutines; i++ {
go func(id int) {
defer wg.Done()
for j := 0; j < numOperations; j++ {
key := fmt.Sprintf("key-%d-%d", id, j)
cache.Get(key)
}
}(i)
}
// Deleters
for i := 0; i < numGoroutines; i++ {
go func(id int) {
defer wg.Done()
for j := 0; j < numOperations; j++ {
key := fmt.Sprintf("key-%d-%d", id, j%100)
cache.Delete(key)
}
}(i)
}
wg.Wait()
}
func (suite *LRUMemoryCacheTestSuite) TestCleanExpiredEntries() {
cache := NewLRUMemoryCache(1024*1024, 100)
cache.Set("expire1", []byte("value1"), 50*time.Millisecond)
cache.Set("expire2", []byte("value2"), 50*time.Millisecond)
cache.Set("keep", []byte("value3"), 5*time.Second)
suite.Equal(int64(3), cache.CountQueries())
// Wait for some to expire
time.Sleep(100 * time.Millisecond)
// Clean expired entries
cache.CleanExpiredEntries()
// Only "keep" should remain
suite.Equal(int64(1), cache.CountQueries())
_, found := cache.Get("keep")
suite.True(found)
}
func (suite *LRUMemoryCacheTestSuite) TestCountQueries() {
cache := NewLRUMemoryCache(1024*1024, 100)
suite.Equal(int64(0), cache.CountQueries())
cache.Set("key1", []byte("value1"), 5*time.Second)
suite.Equal(int64(1), cache.CountQueries())
cache.Set("key2", []byte("value2"), 5*time.Second)
suite.Equal(int64(2), cache.CountQueries())
cache.Delete("key1")
suite.Equal(int64(1), cache.CountQueries())
cache.Clear()
suite.Equal(int64(0), cache.CountQueries())
}
// Benchmarks
func BenchmarkLRUMemoryCacheSet(b *testing.B) {
cache := NewLRUMemoryCache(100*1024*1024, 100000)
value := []byte("benchmark-value")
b.ResetTimer()
for i := 0; i < b.N; i++ {
key := fmt.Sprintf("key-%d", i)
cache.Set(key, value, 5*time.Second)
}
}
func BenchmarkLRUMemoryCacheGet(b *testing.B) {
cache := NewLRUMemoryCache(100*1024*1024, 100000)
value := []byte("benchmark-value")
// Pre-populate
for i := 0; i < 10000; i++ {
key := fmt.Sprintf("key-%d", i)
cache.Set(key, value, 5*time.Minute)
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
key := fmt.Sprintf("key-%d", i%10000)
cache.Get(key)
}
}
func BenchmarkLRUMemoryCacheConcurrent(b *testing.B) {
cache := NewLRUMemoryCache(100*1024*1024, 100000)
value := []byte("benchmark-value")
b.RunParallel(func(pb *testing.PB) {
i := 0
for pb.Next() {
key := fmt.Sprintf("key-%d", i)
if i%2 == 0 {
cache.Set(key, value, 5*time.Second)
} else {
cache.Get(key)
}
i++
}
})
}
+6 -4
View File
@@ -33,8 +33,9 @@ func (suite *CircuitBreakerTestSuite) TestCircuitBreakerCacheFallback() {
ctx := app.AcquireCtx(requestCtx)
defer app.ReleaseCtx(ctx)
// Calculate the cache key that would be used
cacheKey := libpack_cache.CalculateHash(ctx)
// Calculate the cache key that would be used (with default user context since no auth headers)
// extractUserInfo() returns ("-", "-") when no auth is present
cacheKey := libpack_cache.CalculateHash(ctx, "-", "-")
// Add a test response to the cache
cachedResponse := []byte(`{"data":{"test":"cached-response"}}`)
@@ -158,8 +159,9 @@ func (suite *CircuitBreakerTestSuite) TestCacheDisabledFallback() {
ctx := app.AcquireCtx(requestCtx)
defer app.ReleaseCtx(ctx)
// Calculate cache key and store a response
cacheKey := libpack_cache.CalculateHash(ctx)
// Calculate cache key and store a response (with default user context since no auth headers)
// extractUserInfo() returns ("-", "-") when no auth is present
cacheKey := libpack_cache.CalculateHash(ctx, "-", "-")
cachedResponse := []byte(`{"data":{"test":"cached-response"}}`)
libpack_cache.CacheStore(cacheKey, cachedResponse)
+6 -10
View File
@@ -23,18 +23,14 @@ func NewCircuitBreakerMetrics(monitoring *libpack_monitoring.MetricsSetup) *Circ
// Initialize state value
cbm.stateValue.Store(float64(0))
// Create gauge with callback that reads the atomic value
cbm.stateGauge = monitoring.RegisterMetricsGauge(
// Create gauge with callback that reads the atomic value on every scrape
// This ensures the metric always reflects the current circuit breaker state
cbm.stateGauge = monitoring.RegisterMetricsGaugeFunc(
libpack_monitoring.MetricsCircuitState,
nil,
0, // Initial value doesn't matter as callback will be used
)
// Override the gauge callback to read from atomic value
cbm.stateGauge = monitoring.RegisterMetricsGauge(
libpack_monitoring.MetricsCircuitState,
nil,
cbm.GetState(),
func() float64 {
return cbm.GetState()
},
)
return cbm
+143
View File
@@ -0,0 +1,143 @@
package main
import (
"fmt"
"strings"
fiber "github.com/gofiber/fiber/v2"
"github.com/graphql-go/graphql/language/ast"
"github.com/graphql-go/graphql/language/parser"
"github.com/graphql-go/graphql/language/source"
libpack_logger "github.com/lukaszraczylo/graphql-monitoring-proxy/logging"
)
// debugParseGraphQLQuery provides detailed logging for mutation routing analysis
// This is automatically called when LOG_LEVEL=DEBUG to help identify routing issues
//
// It logs:
// - GraphQL query structure (operations, selections, directives)
// - Final routing decision (which endpoint was chosen)
// - Automatic detection of mutations routed to wrong endpoints
//
// To enable: Set LOG_LEVEL=DEBUG and restart the proxy
func debugParseGraphQLQuery(c *fiber.Ctx, query string) {
if cfg == nil || cfg.Logger == nil {
return
}
cfg.Logger.Info(&libpack_logger.LogMessage{
Message: "=== DEBUG: Parsing GraphQL Query ===",
Pairs: map[string]interface{}{
"query_length": len(query),
"query_preview": truncateString(query, 100),
},
})
// Parse the query
src := source.NewSource(&source.Source{
Body: []byte(query),
Name: "Debug GraphQL request",
})
p, err := parser.Parse(parser.ParseParams{Source: src})
if err != nil {
cfg.Logger.Error(&libpack_logger.LogMessage{
Message: "DEBUG: Failed to parse query",
Pairs: map[string]interface{}{"error": err.Error()},
})
return
}
cfg.Logger.Info(&libpack_logger.LogMessage{
Message: "DEBUG: Query parsed successfully",
Pairs: map[string]interface{}{
"definitions_count": len(p.Definitions),
},
})
// Analyze each definition
for i, d := range p.Definitions {
if oper, ok := d.(*ast.OperationDefinition); ok {
operationType := strings.ToLower(oper.Operation)
operationName := "unnamed"
if oper.Name != nil {
operationName = oper.Name.Value
}
// Count selections
selectionCount := 0
if oper.SelectionSet != nil {
selectionCount = len(oper.GetSelectionSet().Selections)
}
cfg.Logger.Info(&libpack_logger.LogMessage{
Message: fmt.Sprintf("DEBUG: Definition #%d (OperationDefinition)", i),
Pairs: map[string]interface{}{
"operation_type": operationType,
"operation_name": operationName,
"selection_count": selectionCount,
"is_mutation": operationType == "mutation",
"directive_count": len(oper.Directives),
},
})
// Log selections for mutations
if operationType == "mutation" && oper.SelectionSet != nil {
for j, sel := range oper.GetSelectionSet().Selections {
if field, ok := sel.(*ast.Field); ok {
cfg.Logger.Info(&libpack_logger.LogMessage{
Message: fmt.Sprintf("DEBUG: Mutation field #%d", j),
Pairs: map[string]interface{}{
"field_name": field.Name.Value,
},
})
}
}
}
} else if frag, ok := d.(*ast.FragmentDefinition); ok {
cfg.Logger.Info(&libpack_logger.LogMessage{
Message: fmt.Sprintf("DEBUG: Definition #%d (FragmentDefinition)", i),
Pairs: map[string]interface{}{
"fragment_name": frag.Name.Value,
},
})
}
}
// Now run the actual parsing to see the result
result := parseGraphQLQuery(c)
cfg.Logger.Info(&libpack_logger.LogMessage{
Message: "DEBUG: Final routing decision",
Pairs: map[string]interface{}{
"operation_type": result.operationType,
"operation_name": result.operationName,
"active_endpoint": result.activeEndpoint,
"should_block": result.shouldBlock,
"should_ignore": result.shouldIgnore,
"write_endpoint": cfg.Server.HostGraphQL,
"read_endpoint": cfg.Server.HostGraphQLReadOnly,
"is_using_write": result.activeEndpoint == cfg.Server.HostGraphQL,
},
})
// Check for potential issues
if result.operationType == "mutation" && result.activeEndpoint != cfg.Server.HostGraphQL {
cfg.Logger.Error(&libpack_logger.LogMessage{
Message: "DEBUG: ⚠️ BUG DETECTED: Mutation routed to wrong endpoint!",
Pairs: map[string]interface{}{
"expected_endpoint": cfg.Server.HostGraphQL,
"actual_endpoint": result.activeEndpoint,
},
})
}
if result.operationType == "mutation" && strings.Contains(strings.ToLower(result.activeEndpoint), "read") {
cfg.Logger.Error(&libpack_logger.LogMessage{
Message: "DEBUG: ⚠️ CRITICAL: Mutation endpoint contains 'read' in URL!",
Pairs: map[string]interface{}{
"endpoint": result.activeEndpoint,
},
})
}
}
+1
View File
@@ -0,0 +1 @@
graphql-monitoring-proxy.raczylo.com
+586
View File
@@ -0,0 +1,586 @@
<!doctype html>
<html lang="en" class="scroll-smooth">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>GraphQL Monitoring Proxy - High-Performance GraphQL Gateway</title>
<meta
name="description"
content="High-performance GraphQL proxy with monitoring, caching, circuit breaker, rate limiting, and security features. Zero cost monitoring at 30k req/s."
/>
<script src="https://cdn.tailwindcss.com"></script>
<script>
tailwind.config = {
darkMode: 'class'
}
</script>
<link
rel="stylesheet"
href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.5.1/css/all.min.css"
/>
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
<link
href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap"
rel="stylesheet"
/>
<style>
body { font-family: "Inter", sans-serif; }
code, pre { font-family: "JetBrains Mono", monospace; }
.theme-transition {
transition: background-color 0.3s ease, color 0.3s ease, border-color 0.3s ease;
}
@keyframes fadeInUp {
from { opacity: 0; transform: translateY(20px); }
to { opacity: 1; transform: translateY(0); }
}
@keyframes float {
0%, 100% { transform: translateY(0px); }
50% { transform: translateY(-10px); }
}
.animate-fade-in-up { animation: fadeInUp 0.6s ease-out; }
.animate-float { animation: float 3s ease-in-out infinite; }
.glass {
background: rgba(255, 255, 255, 0.7);
backdrop-filter: blur(10px);
-webkit-backdrop-filter: blur(10px);
border: 1px solid rgba(255, 255, 255, 0.2);
}
.dark .glass {
background: rgba(17, 24, 39, 0.7);
border: 1px solid rgba(255, 255, 255, 0.1);
}
.gradient-text {
background: linear-gradient(135deg, #e879f9 0%, #818cf8 100%);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
background-clip: text;
}
.dark .gradient-text {
background: linear-gradient(135deg, #f0abfc 0%, #a5b4fc 100%);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
background-clip: text;
}
.shadow-modern { box-shadow: 0 10px 40px -10px rgba(0, 0, 0, 0.1); }
.dark .shadow-modern { box-shadow: 0 10px 40px -10px rgba(0, 0, 0, 0.4); }
html { scroll-behavior: smooth; }
</style>
<script>
if (localStorage.theme === "dark" || (!("theme" in localStorage) && window.matchMedia("(prefers-color-scheme: dark)").matches)) {
document.documentElement.classList.add("dark");
} else {
document.documentElement.classList.remove("dark");
}
</script>
</head>
<body class="bg-white dark:bg-gray-900 text-gray-900 dark:text-gray-100 theme-transition">
<!-- Navigation -->
<nav class="fixed w-full glass shadow-modern z-50 theme-transition">
<div class="max-w-6xl mx-auto px-4 sm:px-6">
<div class="flex justify-between h-16 items-center">
<a href="#" class="flex items-center hover:opacity-80 transition-opacity duration-300 gap-2">
<i class="fas fa-diagram-project text-2xl gradient-text"></i>
<span class="text-xl font-bold gradient-text">graphql-monitoring-proxy</span>
</a>
<div class="hidden md:flex space-x-6">
<a href="#features" class="text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 font-medium">Features</a>
<a href="#installation" class="text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 font-medium">Installation</a>
<a href="#configuration" class="text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 font-medium">Configuration</a>
<a href="#endpoints" class="text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 font-medium">Endpoints</a>
</div>
<div class="flex items-center space-x-4">
<button id="theme-toggle" class="text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 p-2 min-w-[44px] min-h-[44px] flex items-center justify-center" aria-label="Toggle theme">
<i class="fas fa-moon dark:hidden text-xl"></i>
<i class="fas fa-sun hidden dark:inline text-xl"></i>
</button>
<a href="https://github.com/lukaszraczylo/graphql-monitoring-proxy" target="_blank" class="text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 p-2 min-w-[44px] min-h-[44px] flex items-center justify-center" aria-label="View on GitHub">
<i class="fab fa-github text-xl"></i>
</a>
<button id="mobile-menu-toggle" class="md:hidden text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 p-2 min-w-[44px] min-h-[44px] flex items-center justify-center" aria-label="Toggle menu">
<i class="fas fa-bars text-xl" id="menu-open-icon"></i>
<i class="fas fa-times text-xl hidden" id="menu-close-icon"></i>
</button>
</div>
</div>
</div>
<div id="mobile-menu" class="hidden md:hidden border-t border-gray-200 dark:border-gray-700">
<div class="px-4 py-3 space-y-1 bg-white dark:bg-gray-800">
<a href="#features" class="block px-3 py-3 text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 hover:bg-gray-50 dark:hover:bg-gray-700 rounded font-medium">Features</a>
<a href="#installation" class="block px-3 py-3 text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 hover:bg-gray-50 dark:hover:bg-gray-700 rounded font-medium">Installation</a>
<a href="#configuration" class="block px-3 py-3 text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 hover:bg-gray-50 dark:hover:bg-gray-700 rounded font-medium">Configuration</a>
<a href="#endpoints" class="block px-3 py-3 text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-gray-100 hover:bg-gray-50 dark:hover:bg-gray-700 rounded font-medium">Endpoints</a>
</div>
</div>
</nav>
<!-- Hero Section -->
<section class="relative pt-24 sm:pt-32 pb-12 sm:pb-20 overflow-hidden">
<div class="absolute inset-0 bg-gradient-to-br from-fuchsia-50 via-violet-50 to-indigo-50 dark:from-gray-900 dark:via-fuchsia-900/20 dark:to-indigo-900/20 theme-transition"></div>
<div class="absolute top-0 -left-4 w-72 h-72 bg-fuchsia-300 dark:bg-fuchsia-500 rounded-full mix-blend-multiply dark:mix-blend-soft-light filter blur-xl opacity-20 animate-float"></div>
<div class="absolute top-0 -right-4 w-72 h-72 bg-violet-300 dark:bg-violet-500 rounded-full mix-blend-multiply dark:mix-blend-soft-light filter blur-xl opacity-20 animate-float" style="animation-delay: 1s;"></div>
<div class="absolute -bottom-8 left-20 w-72 h-72 bg-indigo-300 dark:bg-indigo-500 rounded-full mix-blend-multiply dark:mix-blend-soft-light filter blur-xl opacity-20 animate-float" style="animation-delay: 2s;"></div>
<div class="relative max-w-6xl mx-auto px-4 sm:px-6">
<div class="text-center">
<div class="mb-8 sm:mb-10 flex justify-center animate-fade-in-up">
<div class="text-8xl sm:text-9xl animate-float">
<i class="fas fa-diagram-project gradient-text"></i>
</div>
</div>
<h1 class="text-3xl sm:text-4xl md:text-5xl lg:text-6xl font-bold text-gray-900 dark:text-gray-100 mb-4 sm:mb-6 leading-tight animate-fade-in-up" style="animation-delay: 0.1s;">
GraphQL Monitoring<br /><span class="gradient-text">Proxy</span>
</h1>
<p class="text-base sm:text-lg md:text-xl text-gray-600 dark:text-gray-300 mb-8 sm:mb-10 max-w-2xl mx-auto leading-relaxed px-4 animate-fade-in-up" style="animation-delay: 0.2s;">
High-performance GraphQL gateway with monitoring, caching, circuit breaker, rate limiting, and security. Tested at 30k req/s using 10MB RAM.
</p>
<div class="flex flex-col sm:flex-row gap-3 sm:gap-4 justify-center mb-8 sm:mb-12 px-4 animate-fade-in-up" style="animation-delay: 0.3s;">
<a href="#installation" class="group relative bg-gradient-to-r from-fuchsia-500 to-indigo-600 hover:from-fuchsia-600 hover:to-indigo-700 text-white px-8 py-3 rounded-lg font-medium transition-all duration-300 min-h-[48px] flex items-center justify-center shadow-lg hover:shadow-xl hover:scale-105">
<span class="relative z-10">Get Started</span>
</a>
<a href="https://github.com/lukaszraczylo/graphql-monitoring-proxy" class="group glass hover:shadow-lg text-gray-900 dark:text-gray-100 px-8 py-3 rounded-lg font-medium transition-all duration-300 min-h-[48px] flex items-center justify-center hover:scale-105">
<i class="fab fa-github mr-2"></i>View on GitHub
</a>
</div>
<div class="flex flex-wrap justify-center gap-2 sm:gap-4 text-sm px-4">
<img src="https://img.shields.io/github/v/release/lukaszraczylo/graphql-monitoring-proxy" alt="Version" class="h-5" />
<img src="https://img.shields.io/github/license/lukaszraczylo/graphql-monitoring-proxy" alt="License" class="h-5" />
<img src="https://goreportcard.com/badge/github.com/lukaszraczylo/graphql-monitoring-proxy" alt="Go Report" class="h-5" />
</div>
<div class="mt-12 sm:mt-16 max-w-3xl mx-auto px-4 animate-fade-in-up" style="animation-delay: 0.4s;">
<div class="relative group">
<div class="absolute -inset-1 bg-gradient-to-r from-fuchsia-500 to-indigo-600 rounded-xl blur opacity-25 group-hover:opacity-50 transition duration-500"></div>
<div class="relative bg-gray-900 rounded-xl p-6 text-left">
<div class="flex items-center gap-2 mb-4">
<div class="w-3 h-3 rounded-full bg-red-500"></div>
<div class="w-3 h-3 rounded-full bg-yellow-500"></div>
<div class="w-3 h-3 rounded-full bg-green-500"></div>
<span class="ml-2 text-gray-400 text-sm">terminal</span>
</div>
<pre class="text-gray-100 text-sm sm:text-base overflow-x-auto"><code><span class="text-gray-400"># Run with Docker</span>
<span class="text-fuchsia-400">$</span> docker pull ghcr.io/lukaszraczylo/graphql-monitoring-proxy:latest
<span class="text-gray-400"># Configure and run</span>
<span class="text-fuchsia-400">$</span> docker run -p 8080:8080 -p 9393:9393 \
-e GMP_HOST_GRAPHQL=http://your-graphql:4000/ \
ghcr.io/lukaszraczylo/graphql-monitoring-proxy:latest</code></pre>
</div>
</div>
</div>
</div>
</div>
</section>
<!-- Features Section -->
<section id="features" class="py-12 sm:py-16 md:py-20 bg-white dark:bg-gray-900 theme-transition">
<div class="max-w-6xl mx-auto px-4 sm:px-6">
<div class="text-center mb-8 sm:mb-12">
<h2 class="text-2xl sm:text-3xl md:text-4xl font-bold text-gray-900 dark:text-gray-100 mb-3 sm:mb-4">Features</h2>
<p class="text-base sm:text-lg text-gray-600 dark:text-gray-300 px-4">Enterprise-grade GraphQL gateway at zero cost</p>
</div>
<div class="grid sm:grid-cols-2 lg:grid-cols-3 gap-4">
<div class="glass p-5 rounded-xl group hover:shadow-lg transition-all duration-300">
<div class="flex items-start gap-4">
<div class="w-12 h-12 rounded-xl bg-gradient-to-br from-fuchsia-500 to-fuchsia-600 flex items-center justify-center flex-shrink-0 group-hover:scale-110 transition-transform duration-300">
<i class="fas fa-chart-line text-white"></i>
</div>
<div>
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-1">Prometheus Metrics</h3>
<p class="text-sm text-gray-600 dark:text-gray-400">Full observability with query timing, user tracking, and operation metrics</p>
</div>
</div>
</div>
<div class="glass p-5 rounded-xl group hover:shadow-lg transition-all duration-300">
<div class="flex items-start gap-4">
<div class="w-12 h-12 rounded-xl bg-gradient-to-br from-violet-500 to-violet-600 flex items-center justify-center flex-shrink-0 group-hover:scale-110 transition-transform duration-300">
<i class="fas fa-bolt text-white"></i>
</div>
<div>
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-1">Smart Caching</h3>
<p class="text-sm text-gray-600 dark:text-gray-400">Memory-aware caching with LRU eviction, compression, and Redis support</p>
</div>
</div>
</div>
<div class="glass p-5 rounded-xl group hover:shadow-lg transition-all duration-300">
<div class="flex items-start gap-4">
<div class="w-12 h-12 rounded-xl bg-gradient-to-br from-indigo-500 to-indigo-600 flex items-center justify-center flex-shrink-0 group-hover:scale-110 transition-transform duration-300">
<i class="fas fa-shield-halved text-white"></i>
</div>
<div>
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-1">Circuit Breaker</h3>
<p class="text-sm text-gray-600 dark:text-gray-400">Automatic failure detection with graceful degradation and cached fallbacks</p>
</div>
</div>
</div>
<div class="glass p-5 rounded-xl group hover:shadow-lg transition-all duration-300">
<div class="flex items-start gap-4">
<div class="w-12 h-12 rounded-xl bg-gradient-to-br from-rose-500 to-rose-600 flex items-center justify-center flex-shrink-0 group-hover:scale-110 transition-transform duration-300">
<i class="fas fa-gauge-high text-white"></i>
</div>
<div>
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-1">Rate Limiting</h3>
<p class="text-sm text-gray-600 dark:text-gray-400">Role-based rate limiting with JWT extraction and burst control</p>
</div>
</div>
</div>
<div class="glass p-5 rounded-xl group hover:shadow-lg transition-all duration-300">
<div class="flex items-start gap-4">
<div class="w-12 h-12 rounded-xl bg-gradient-to-br from-amber-500 to-amber-600 flex items-center justify-center flex-shrink-0 group-hover:scale-110 transition-transform duration-300">
<i class="fas fa-layer-group text-white"></i>
</div>
<div>
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-1">Request Coalescing</h3>
<p class="text-sm text-gray-600 dark:text-gray-400">Deduplicate concurrent identical queries, reducing backend load 50-80%</p>
</div>
</div>
</div>
<div class="glass p-5 rounded-xl group hover:shadow-lg transition-all duration-300">
<div class="flex items-start gap-4">
<div class="w-12 h-12 rounded-xl bg-gradient-to-br from-emerald-500 to-emerald-600 flex items-center justify-center flex-shrink-0 group-hover:scale-110 transition-transform duration-300">
<i class="fas fa-plug text-white"></i>
</div>
<div>
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-1">WebSocket Support</h3>
<p class="text-sm text-gray-600 dark:text-gray-400">Native GraphQL subscriptions with bidirectional proxying</p>
</div>
</div>
</div>
<div class="glass p-5 rounded-xl group hover:shadow-lg transition-all duration-300">
<div class="flex items-start gap-4">
<div class="w-12 h-12 rounded-xl bg-gradient-to-br from-cyan-500 to-cyan-600 flex items-center justify-center flex-shrink-0 group-hover:scale-110 transition-transform duration-300">
<i class="fas fa-eye-slash text-white"></i>
</div>
<div>
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-1">Introspection Blocking</h3>
<p class="text-sm text-gray-600 dark:text-gray-400">Block schema introspection with configurable allowlists</p>
</div>
</div>
</div>
<div class="glass p-5 rounded-xl group hover:shadow-lg transition-all duration-300">
<div class="flex items-start gap-4">
<div class="w-12 h-12 rounded-xl bg-gradient-to-br from-orange-500 to-orange-600 flex items-center justify-center flex-shrink-0 group-hover:scale-110 transition-transform duration-300">
<i class="fas fa-satellite-dish text-white"></i>
</div>
<div>
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-1">OpenTelemetry Tracing</h3>
<p class="text-sm text-gray-600 dark:text-gray-400">Distributed tracing with configurable OTLP collector endpoint</p>
</div>
</div>
</div>
<div class="glass p-5 rounded-xl group hover:shadow-lg transition-all duration-300">
<div class="flex items-start gap-4">
<div class="w-12 h-12 rounded-xl bg-gradient-to-br from-pink-500 to-pink-600 flex items-center justify-center flex-shrink-0 group-hover:scale-110 transition-transform duration-300">
<i class="fas fa-desktop text-white"></i>
</div>
<div>
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-1">Admin Dashboard</h3>
<p class="text-sm text-gray-600 dark:text-gray-400">Real-time web UI for monitoring health, metrics, and controls</p>
</div>
</div>
</div>
</div>
</div>
</section>
<!-- Installation Section -->
<section id="installation" class="py-12 sm:py-16 md:py-20 bg-gray-50 dark:bg-gray-800 theme-transition">
<div class="max-w-6xl mx-auto px-4 sm:px-6">
<div class="text-center mb-8 sm:mb-12">
<h2 class="text-2xl sm:text-3xl md:text-4xl font-bold text-gray-900 dark:text-gray-100 mb-3 sm:mb-4">Installation</h2>
<p class="text-base sm:text-lg text-gray-600 dark:text-gray-300 px-4">Deploy in seconds</p>
</div>
<div class="max-w-3xl mx-auto space-y-6">
<div class="glass p-6 rounded-xl">
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-3 flex items-center">
<i class="fab fa-docker mr-2 text-blue-500"></i>
Docker
</h3>
<pre class="bg-gray-900 text-gray-100 p-4 rounded-lg overflow-x-auto"><code>docker pull ghcr.io/lukaszraczylo/graphql-monitoring-proxy:latest</code></pre>
</div>
<div class="glass p-6 rounded-xl">
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-3 flex items-center">
<i class="fas fa-download mr-2 text-fuchsia-500"></i>
Binary Download
</h3>
<p class="text-gray-600 dark:text-gray-400 mb-3">Download from the <a href="https://github.com/lukaszraczylo/graphql-monitoring-proxy/releases/latest" class="text-fuchsia-600 dark:text-fuchsia-400 hover:underline">releases page</a>.</p>
<p class="text-sm text-gray-500 dark:text-gray-400">Supported: Darwin ARM64/AMD64, Linux ARM64/AMD64, Windows AMD64</p>
</div>
<div class="glass p-6 rounded-xl">
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-3 flex items-center">
<i class="fas fa-dharmachakra mr-2 text-indigo-500"></i>
Kubernetes
</h3>
<p class="text-gray-600 dark:text-gray-400 mb-3">Example manifests available in the repository:</p>
<ul class="text-sm text-gray-500 dark:text-gray-400 space-y-1">
<li><a href="https://github.com/lukaszraczylo/graphql-monitoring-proxy/blob/main/static/kubernetes-deployment.yaml" class="text-fuchsia-600 dark:text-fuchsia-400 hover:underline">Standalone deployment</a></li>
<li><a href="https://github.com/lukaszraczylo/graphql-monitoring-proxy/blob/main/static/kubernetes-single-deployment.yaml" class="text-fuchsia-600 dark:text-fuchsia-400 hover:underline">Combined deployment</a></li>
<li><a href="https://github.com/lukaszraczylo/graphql-monitoring-proxy/blob/main/static/kubernetes-single-deployment-with-ro.yaml" class="text-fuchsia-600 dark:text-fuchsia-400 hover:underline">Combined with read-only replica</a></li>
</ul>
</div>
</div>
</div>
</section>
<!-- Endpoints Section -->
<section id="endpoints" class="py-12 sm:py-16 md:py-20 bg-white dark:bg-gray-900 theme-transition">
<div class="max-w-6xl mx-auto px-4 sm:px-6">
<div class="text-center mb-8 sm:mb-12">
<h2 class="text-2xl sm:text-3xl md:text-4xl font-bold text-gray-900 dark:text-gray-100 mb-3 sm:mb-4">Endpoints</h2>
<p class="text-base sm:text-lg text-gray-600 dark:text-gray-300 px-4">Available HTTP endpoints</p>
</div>
<div class="max-w-3xl mx-auto">
<div class="glass p-6 rounded-xl">
<div class="space-y-4">
<div class="flex items-start gap-4 p-3 bg-gray-50 dark:bg-gray-800 rounded-lg">
<code class="text-fuchsia-600 dark:text-fuchsia-400 font-medium whitespace-nowrap">:8080/*</code>
<span class="text-gray-600 dark:text-gray-400">GraphQL passthrough endpoint</span>
</div>
<div class="flex items-start gap-4 p-3 bg-gray-50 dark:bg-gray-800 rounded-lg">
<code class="text-fuchsia-600 dark:text-fuchsia-400 font-medium whitespace-nowrap">:8080/admin</code>
<span class="text-gray-600 dark:text-gray-400">Admin dashboard UI</span>
</div>
<div class="flex items-start gap-4 p-3 bg-gray-50 dark:bg-gray-800 rounded-lg">
<code class="text-fuchsia-600 dark:text-fuchsia-400 font-medium whitespace-nowrap">:9393/metrics</code>
<span class="text-gray-600 dark:text-gray-400">Prometheus metrics endpoint</span>
</div>
<div class="flex items-start gap-4 p-3 bg-gray-50 dark:bg-gray-800 rounded-lg">
<code class="text-fuchsia-600 dark:text-fuchsia-400 font-medium whitespace-nowrap">:8080/healthz</code>
<span class="text-gray-600 dark:text-gray-400">Health check endpoint</span>
</div>
<div class="flex items-start gap-4 p-3 bg-gray-50 dark:bg-gray-800 rounded-lg">
<code class="text-fuchsia-600 dark:text-fuchsia-400 font-medium whitespace-nowrap">:8080/livez</code>
<span class="text-gray-600 dark:text-gray-400">Liveness probe endpoint</span>
</div>
<div class="flex items-start gap-4 p-3 bg-gray-50 dark:bg-gray-800 rounded-lg">
<code class="text-fuchsia-600 dark:text-fuchsia-400 font-medium whitespace-nowrap">:9090/api/*</code>
<span class="text-gray-600 dark:text-gray-400">Management API (ban/unban, cache, circuit breaker)</span>
</div>
</div>
</div>
</div>
</div>
</section>
<!-- Configuration Section -->
<section id="configuration" class="py-12 sm:py-16 md:py-20 bg-gray-50 dark:bg-gray-800 theme-transition">
<div class="max-w-6xl mx-auto px-4 sm:px-6">
<div class="text-center mb-8 sm:mb-12">
<h2 class="text-2xl sm:text-3xl md:text-4xl font-bold text-gray-900 dark:text-gray-100 mb-3 sm:mb-4">Configuration</h2>
<p class="text-base sm:text-lg text-gray-600 dark:text-gray-300 px-4">Environment variables (prefix with GMP_ recommended)</p>
</div>
<div class="max-w-4xl mx-auto space-y-6">
<div class="glass p-6 rounded-xl">
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-4 flex items-center">
<i class="fas fa-server mr-2 text-fuchsia-500"></i>
Core Settings
</h3>
<div class="overflow-x-auto">
<table class="w-full text-sm">
<thead>
<tr class="text-left text-gray-500 dark:text-gray-400 border-b border-gray-200 dark:border-gray-700">
<th class="pb-2 pr-4">Variable</th>
<th class="pb-2 pr-4">Description</th>
<th class="pb-2">Default</th>
</tr>
</thead>
<tbody class="text-gray-600 dark:text-gray-300">
<tr class="border-b border-gray-100 dark:border-gray-700">
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">HOST_GRAPHQL</code></td>
<td class="py-2 pr-4">GraphQL backend URL</td>
<td class="py-2">http://localhost/</td>
</tr>
<tr class="border-b border-gray-100 dark:border-gray-700">
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">PORT_GRAPHQL</code></td>
<td class="py-2 pr-4">Proxy listen port</td>
<td class="py-2">8080</td>
</tr>
<tr class="border-b border-gray-100 dark:border-gray-700">
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">MONITORING_PORT</code></td>
<td class="py-2 pr-4">Metrics endpoint port</td>
<td class="py-2">9393</td>
</tr>
<tr>
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">LOG_LEVEL</code></td>
<td class="py-2 pr-4">Logging level</td>
<td class="py-2">info</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="glass p-6 rounded-xl">
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-4 flex items-center">
<i class="fas fa-bolt mr-2 text-violet-500"></i>
Caching
</h3>
<div class="overflow-x-auto">
<table class="w-full text-sm">
<thead>
<tr class="text-left text-gray-500 dark:text-gray-400 border-b border-gray-200 dark:border-gray-700">
<th class="pb-2 pr-4">Variable</th>
<th class="pb-2 pr-4">Description</th>
<th class="pb-2">Default</th>
</tr>
</thead>
<tbody class="text-gray-600 dark:text-gray-300">
<tr class="border-b border-gray-100 dark:border-gray-700">
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">ENABLE_GLOBAL_CACHE</code></td>
<td class="py-2 pr-4">Enable memory cache</td>
<td class="py-2">false</td>
</tr>
<tr class="border-b border-gray-100 dark:border-gray-700">
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">CACHE_TTL</code></td>
<td class="py-2 pr-4">Cache TTL in seconds</td>
<td class="py-2">60</td>
</tr>
<tr class="border-b border-gray-100 dark:border-gray-700">
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">CACHE_MAX_MEMORY_SIZE</code></td>
<td class="py-2 pr-4">Max memory in MB</td>
<td class="py-2">100</td>
</tr>
<tr>
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">ENABLE_REDIS_CACHE</code></td>
<td class="py-2 pr-4">Enable Redis cache</td>
<td class="py-2">false</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="glass p-6 rounded-xl">
<h3 class="font-semibold text-gray-900 dark:text-gray-100 mb-4 flex items-center">
<i class="fas fa-shield-halved mr-2 text-indigo-500"></i>
Circuit Breaker
</h3>
<div class="overflow-x-auto">
<table class="w-full text-sm">
<thead>
<tr class="text-left text-gray-500 dark:text-gray-400 border-b border-gray-200 dark:border-gray-700">
<th class="pb-2 pr-4">Variable</th>
<th class="pb-2 pr-4">Description</th>
<th class="pb-2">Default</th>
</tr>
</thead>
<tbody class="text-gray-600 dark:text-gray-300">
<tr class="border-b border-gray-100 dark:border-gray-700">
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">ENABLE_CIRCUIT_BREAKER</code></td>
<td class="py-2 pr-4">Enable circuit breaker</td>
<td class="py-2">false</td>
</tr>
<tr class="border-b border-gray-100 dark:border-gray-700">
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">CIRCUIT_MAX_FAILURES</code></td>
<td class="py-2 pr-4">Failures before trip</td>
<td class="py-2">10</td>
</tr>
<tr>
<td class="py-2 pr-4"><code class="text-fuchsia-600 dark:text-fuchsia-400">CIRCUIT_TIMEOUT_SECONDS</code></td>
<td class="py-2 pr-4">Recovery timeout</td>
<td class="py-2">60</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="text-center mt-8">
<a href="https://github.com/lukaszraczylo/graphql-monitoring-proxy#configuration" class="inline-flex items-center text-fuchsia-600 dark:text-fuchsia-400 hover:underline font-medium">
View all configuration options
<i class="fas fa-arrow-right ml-2"></i>
</a>
</div>
</div>
</div>
</section>
<!-- Performance Section -->
<section class="py-12 sm:py-16 md:py-20 bg-white dark:bg-gray-900 theme-transition">
<div class="max-w-6xl mx-auto px-4 sm:px-6">
<div class="text-center mb-8 sm:mb-12">
<h2 class="text-2xl sm:text-3xl md:text-4xl font-bold text-gray-900 dark:text-gray-100 mb-3 sm:mb-4">Performance</h2>
<p class="text-base sm:text-lg text-gray-600 dark:text-gray-300 px-4">Battle-tested at scale</p>
</div>
<div class="max-w-3xl mx-auto">
<div class="grid sm:grid-cols-3 gap-4 text-center">
<div class="glass p-6 rounded-xl">
<div class="text-4xl font-bold gradient-text mb-2">100k+</div>
<p class="text-sm text-gray-600 dark:text-gray-400">Requests/second</p>
</div>
<div class="glass p-6 rounded-xl">
<div class="text-4xl font-bold gradient-text mb-2">10MB</div>
<p class="text-sm text-gray-600 dark:text-gray-400">RAM usage</p>
</div>
<div class="glass p-6 rounded-xl">
<div class="text-4xl font-bold gradient-text mb-2">0.1%</div>
<p class="text-sm text-gray-600 dark:text-gray-400">CPU usage</p>
</div>
</div>
<div class="mt-8 text-center">
<a href="bench/" class="inline-flex items-center text-fuchsia-600 dark:text-fuchsia-400 hover:underline font-medium">
View benchmarks
<i class="fas fa-arrow-right ml-2"></i>
</a>
</div>
</div>
</div>
</section>
<!-- Footer -->
<footer class="py-8 bg-gray-100 dark:bg-gray-800 theme-transition">
<div class="max-w-6xl mx-auto px-4 sm:px-6">
<div class="flex flex-col sm:flex-row justify-between items-center gap-4">
<div class="flex items-center gap-2">
<i class="fas fa-diagram-project text-xl gradient-text"></i>
<span class="font-semibold gradient-text">graphql-monitoring-proxy</span>
</div>
<div class="flex items-center gap-6">
<a href="https://github.com/lukaszraczylo/graphql-monitoring-proxy" class="text-gray-600 dark:text-gray-400 hover:text-gray-900 dark:hover:text-gray-100">
<i class="fab fa-github text-xl"></i>
</a>
<a href="https://github.com/lukaszraczylo/graphql-monitoring-proxy/issues" class="text-gray-600 dark:text-gray-400 hover:text-gray-900 dark:hover:text-gray-100 text-sm">
Issues
</a>
<a href="https://github.com/lukaszraczylo/graphql-monitoring-proxy/releases" class="text-gray-600 dark:text-gray-400 hover:text-gray-900 dark:hover:text-gray-100 text-sm">
Releases
</a>
</div>
<p class="text-gray-500 dark:text-gray-400 text-sm">MIT License</p>
</div>
</div>
</footer>
<script>
// Theme toggle
document.getElementById('theme-toggle').addEventListener('click', function() {
if (document.documentElement.classList.contains('dark')) {
document.documentElement.classList.remove('dark');
localStorage.theme = 'light';
} else {
document.documentElement.classList.add('dark');
localStorage.theme = 'dark';
}
});
// Mobile menu toggle
document.getElementById('mobile-menu-toggle').addEventListener('click', function() {
const menu = document.getElementById('mobile-menu');
const openIcon = document.getElementById('menu-open-icon');
const closeIcon = document.getElementById('menu-close-icon');
menu.classList.toggle('hidden');
openIcon.classList.toggle('hidden');
closeIcon.classList.toggle('hidden');
});
// Close mobile menu when clicking a link
document.querySelectorAll('#mobile-menu a').forEach(link => {
link.addEventListener('click', () => {
document.getElementById('mobile-menu').classList.add('hidden');
document.getElementById('menu-open-icon').classList.remove('hidden');
document.getElementById('menu-close-icon').classList.add('hidden');
});
});
</script>
</body>
</html>
-109
View File
@@ -106,117 +106,8 @@ func (e *ProxyError) WithMetadata(key string, value interface{}) *ProxyError {
return e
}
// Common error constructors
// NewConnectionError creates a connection-related error
func NewConnectionError(err error) *ProxyError {
code := ErrCodeConnectionRefused
if err != nil {
errStr := err.Error()
if contains(errStr, "reset") {
code = ErrCodeConnectionReset
}
}
return NewProxyError(code, "Failed to connect to backend", 502, true).
WithCause(err)
}
// NewTimeoutError creates a timeout error
func NewTimeoutError(err error) *ProxyError {
return NewProxyError(ErrCodeTimeout, "Request timed out", 504, false).
WithCause(err)
}
// NewCircuitOpenError creates a circuit breaker open error
func NewCircuitOpenError() *ProxyError {
return NewProxyError(ErrCodeCircuitOpen, "Service temporarily unavailable due to circuit breaker", 503, false).
WithDetails("The backend service is currently experiencing issues. Please try again later.")
}
// NewRateLimitError creates a rate limit error
func NewRateLimitError(userID, role string) *ProxyError {
return NewProxyError(ErrCodeRateLimited, "Rate limit exceeded", 429, false).
WithDetails("You have exceeded the rate limit for your role").
WithMetadata("user_id", userID).
WithMetadata("role", role)
}
// NewBackendError creates a backend error from status code
func NewBackendError(statusCode int, body string) *ProxyError {
code := ErrCodeBackendError
message := "Backend returned an error"
retryable := false
switch {
case statusCode == 429:
code = ErrCodeRateLimited
message = "Backend rate limit exceeded"
retryable = true
case statusCode == 503:
code = ErrCodeServiceUnavailable
message = "Backend service unavailable"
retryable = true
case statusCode == 502 || statusCode == 504:
code = ErrCodeBadGateway
message = "Bad gateway"
retryable = true
case statusCode >= 500:
code = ErrCodeBackendError
message = "Backend server error"
retryable = true
case statusCode == 404:
code = ErrCodeNotFound
message = "Resource not found"
case statusCode == 403:
code = ErrCodeForbidden
message = "Access forbidden"
case statusCode == 401:
code = ErrCodeUnauthorized
message = "Unauthorized"
case statusCode >= 400:
code = ErrCodeInvalidRequest
message = "Invalid request"
}
return NewProxyError(code, message, statusCode, retryable).
WithMetadata("backend_status", statusCode).
WithMetadata("backend_body", truncateString(body, 500))
}
// NewInvalidResponseError creates an invalid response error
func NewInvalidResponseError(details string) *ProxyError {
return NewProxyError(ErrCodeInvalidResponse, "Backend returned invalid response", 502, false).
WithDetails(details)
}
// NewInternalError creates an internal error
func NewInternalError(err error) *ProxyError {
return NewProxyError(ErrCodeInternalError, "Internal proxy error", 500, false).
WithCause(err)
}
// NewContextCanceledError creates a context canceled error
func NewContextCanceledError() *ProxyError {
return NewProxyError(ErrCodeContextCanceled, "Request canceled", 499, false).
WithDetails("The request was canceled by the client")
}
// Helper functions
func contains(s, substr string) bool {
return len(s) > 0 && len(substr) > 0 && len(s) >= len(substr) && (s == substr || len(s) > len(substr) && (s[:len(substr)] == substr || s[len(s)-len(substr):] == substr || containsMiddle(s, substr)))
}
func containsMiddle(s, substr string) bool {
for i := 0; i <= len(s)-len(substr); i++ {
if s[i:i+len(substr)] == substr {
return true
}
}
return false
}
func truncateString(s string, maxLen int) string {
if len(s) <= maxLen {
return s
+6 -5
View File
@@ -15,12 +15,13 @@ const (
)
// Use parameterized queries to prevent SQL injection
// Cast $1 to interval type to allow proper parameterized interval values
var delQueries = [...]string{
"DELETE FROM hdb_catalog.event_invocation_logs WHERE created_at < NOW() - INTERVAL $1",
"DELETE FROM hdb_catalog.event_log WHERE created_at < NOW() - INTERVAL $1",
"DELETE FROM hdb_catalog.hdb_action_log WHERE created_at < NOW() - INTERVAL $1",
"DELETE FROM hdb_catalog.hdb_cron_event_invocation_logs WHERE created_at < NOW() - INTERVAL $1",
"DELETE FROM hdb_catalog.hdb_scheduled_event_invocation_logs WHERE created_at < NOW() - INTERVAL $1",
"DELETE FROM hdb_catalog.event_invocation_logs WHERE created_at < NOW() - $1::INTERVAL",
"DELETE FROM hdb_catalog.event_log WHERE created_at < NOW() - $1::INTERVAL",
"DELETE FROM hdb_catalog.hdb_action_log WHERE created_at < NOW() - $1::INTERVAL",
"DELETE FROM hdb_catalog.hdb_cron_event_invocation_logs WHERE created_at < NOW() - $1::INTERVAL",
"DELETE FROM hdb_catalog.hdb_scheduled_event_invocation_logs WHERE created_at < NOW() - $1::INTERVAL",
}
func enableHasuraEventCleaner(ctx context.Context) error {
+2 -2
View File
@@ -340,8 +340,8 @@ func getDelQueries() []string {
// This should return the actual delQueries from the main package
// For testing purposes, we return expected parameterized queries
return []string{
"DELETE FROM hdb_catalog.event_log WHERE created_at < NOW() - INTERVAL '$1 days'",
"DELETE FROM hdb_catalog.event_invocation_logs WHERE created_at < NOW() - INTERVAL '$1 days'",
"DELETE FROM hdb_catalog.event_log WHERE created_at < NOW() - $1::INTERVAL",
"DELETE FROM hdb_catalog.event_invocation_logs WHERE created_at < NOW() - $1::INTERVAL",
}
}
+14 -14
View File
@@ -9,18 +9,18 @@ require (
github.com/alicebob/miniredis/v2 v2.33.0
github.com/avast/retry-go/v4 v4.7.0
github.com/goccy/go-json v0.10.5
github.com/gofiber/fiber/v2 v2.52.9
github.com/gofiber/fiber/v2 v2.52.10
github.com/gofiber/websocket/v2 v2.2.1
github.com/gofrs/flock v0.13.0
github.com/google/uuid v1.6.0
github.com/gookit/goutil v0.7.1
github.com/gookit/goutil v0.7.2
github.com/gorilla/websocket v1.5.3
github.com/graphql-go/graphql v0.8.1
github.com/jackc/pgx/v5 v5.7.6
github.com/lukaszraczylo/ask v0.0.0-20240916204100-6e9ef53a62d9
github.com/lukaszraczylo/go-ratecounter v0.1.12
github.com/lukaszraczylo/go-simple-graphql v1.2.84
github.com/redis/go-redis/v9 v9.16.0
github.com/lukaszraczylo/go-simple-graphql v1.2.89
github.com/redis/go-redis/v9 v9.17.2
github.com/sony/gobreaker v1.0.0
github.com/stretchr/testify v1.11.1
github.com/valyala/fasthttp v1.68.0
@@ -28,7 +28,7 @@ require (
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.38.0
go.opentelemetry.io/otel/sdk v1.38.0
go.opentelemetry.io/otel/trace v1.38.0
google.golang.org/grpc v1.76.0
google.golang.org/grpc v1.77.0
)
require (
@@ -47,7 +47,7 @@ require (
github.com/jackc/pgpassfile v1.0.0 // indirect
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 // indirect
github.com/jackc/puddle/v2 v2.2.2 // indirect
github.com/klauspost/compress v1.18.1 // indirect
github.com/klauspost/compress v1.18.2 // indirect
github.com/mattn/go-colorable v0.1.14 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/mattn/go-runewidth v0.0.19 // indirect
@@ -61,14 +61,14 @@ require (
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.38.0 // indirect
go.opentelemetry.io/otel/metric v1.38.0 // indirect
go.opentelemetry.io/proto/otlp v1.9.0 // indirect
golang.org/x/crypto v0.43.0 // indirect
golang.org/x/net v0.46.0 // indirect
golang.org/x/sync v0.17.0 // indirect
golang.org/x/sys v0.37.0 // indirect
golang.org/x/term v0.36.0 // indirect
golang.org/x/text v0.30.0 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20251103181224-f26f9409b101 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20251103181224-f26f9409b101 // indirect
golang.org/x/crypto v0.45.0 // indirect
golang.org/x/net v0.47.0 // indirect
golang.org/x/sync v0.18.0 // indirect
golang.org/x/sys v0.38.0 // indirect
golang.org/x/term v0.37.0 // indirect
golang.org/x/text v0.31.0 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20251202230838-ff82c1b0f217 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20251202230838-ff82c1b0f217 // indirect
google.golang.org/protobuf v1.36.10 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)
+28 -28
View File
@@ -36,8 +36,8 @@ github.com/goccy/go-json v0.10.5 h1:Fq85nIqj+gXn/S5ahsiTlK3TmC85qgirsdTP/+DeaC4=
github.com/goccy/go-json v0.10.5/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M=
github.com/goccy/go-reflect v1.2.0 h1:O0T8rZCuNmGXewnATuKYnkL0xm6o8UNOJZd/gOkb9ms=
github.com/goccy/go-reflect v1.2.0/go.mod h1:n0oYZn8VcV2CkWTxi8B9QjkCoq6GTtCEdfmR66YhFtE=
github.com/gofiber/fiber/v2 v2.52.9 h1:YjKl5DOiyP3j0mO61u3NTmK7or8GzzWzCFzkboyP5cw=
github.com/gofiber/fiber/v2 v2.52.9/go.mod h1:YEcBbO/FB+5M1IZNBP9FO3J9281zgPAreiI1oqg8nDw=
github.com/gofiber/fiber/v2 v2.52.10 h1:jRHROi2BuNti6NYXmZ6gbNSfT3zj/8c0xy94GOU5elY=
github.com/gofiber/fiber/v2 v2.52.10/go.mod h1:YEcBbO/FB+5M1IZNBP9FO3J9281zgPAreiI1oqg8nDw=
github.com/gofiber/websocket/v2 v2.2.1 h1:C9cjxvloojayOp9AovmpQrk8VqvVnT8Oao3+IUygH7w=
github.com/gofiber/websocket/v2 v2.2.1/go.mod h1:Ao/+nyNnX5u/hIFPuHl28a+NIkrqK7PRimyKaj4JxVU=
github.com/gofrs/flock v0.13.0 h1:95JolYOvGMqeH31+FC7D2+uULf6mG61mEZ/A8dRYMzw=
@@ -48,8 +48,8 @@ github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/gookit/goutil v0.7.1 h1:AaFJPN9mrdeYBv8HOybri26EHGCC34WJVT7jUStGJsI=
github.com/gookit/goutil v0.7.1/go.mod h1:vJS9HXctYTCLtCsZot5L5xF+O1oR17cDYO9R0HxBmnU=
github.com/gookit/goutil v0.7.2 h1:NSiqWWY+BT0MwIlKDeSVPfQmr9xTkkAqwDjhplobdgo=
github.com/gookit/goutil v0.7.2/go.mod h1:vJS9HXctYTCLtCsZot5L5xF+O1oR17cDYO9R0HxBmnU=
github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg=
github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE=
github.com/graphql-go/graphql v0.8.1 h1:p7/Ou/WpmulocJeEx7wjQy611rtXGQaAcXGqanuMMgc=
@@ -64,8 +64,8 @@ github.com/jackc/pgx/v5 v5.7.6 h1:rWQc5FwZSPX58r1OQmkuaNicxdmExaEz5A2DO2hUuTk=
github.com/jackc/pgx/v5 v5.7.6/go.mod h1:aruU7o91Tc2q2cFp5h4uP3f6ztExVpyVv88Xl/8Vl8M=
github.com/jackc/puddle/v2 v2.2.2 h1:PR8nw+E/1w0GLuRFSmiioY6UooMp6KJv0/61nB7icHo=
github.com/jackc/puddle/v2 v2.2.2/go.mod h1:vriiEXHvEE654aYKXXjOvZM39qJ0q+azkZFrfEOc3H4=
github.com/klauspost/compress v1.18.1 h1:bcSGx7UbpBqMChDtsF28Lw6v/G94LPrrbMbdC3JH2co=
github.com/klauspost/compress v1.18.1/go.mod h1:ZQFFVG+MdnR0P+l6wpXgIL4NTtwiKIdBnrBd8Nrxr+0=
github.com/klauspost/compress v1.18.2 h1:iiPHWW0YrcFgpBYhsA6D1+fqHssJscY/Tm/y2Uqnapk=
github.com/klauspost/compress v1.18.2/go.mod h1:R0h/fSBs8DE4ENlcrlib3PsXS61voFxhIs2DeRhCvJ4=
github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
@@ -74,8 +74,8 @@ github.com/lukaszraczylo/ask v0.0.0-20240916204100-6e9ef53a62d9 h1:pL8B9mjv6RPUf
github.com/lukaszraczylo/ask v0.0.0-20240916204100-6e9ef53a62d9/go.mod h1:M+UVdyqZs++xtEPrascaVmZdOMhCnxjZ2SgH+xHpR0c=
github.com/lukaszraczylo/go-ratecounter v0.1.12 h1:VO6hHYGw/Jy9JUizXf/bS0AI2QX1ueWWAWckMFVJ/w4=
github.com/lukaszraczylo/go-ratecounter v0.1.12/go.mod h1:TqXEOCtFJStk1i0tkipprv1kiDHGon1MVUisjSTBSKM=
github.com/lukaszraczylo/go-simple-graphql v1.2.84 h1:yP00k8XSYKFYo6PmZFOsDblexLOG6WZzVWhzdstrxiw=
github.com/lukaszraczylo/go-simple-graphql v1.2.84/go.mod h1:PxQYblQDZISmYYj8sNfazAWxAOh1rhAtU208y+uPV8s=
github.com/lukaszraczylo/go-simple-graphql v1.2.89 h1:Xbu1Ny+a0lT2Sr2SaSC8mcHmGQDwGD4TJKk4DDd+PwA=
github.com/lukaszraczylo/go-simple-graphql v1.2.89/go.mod h1:PxQYblQDZISmYYj8sNfazAWxAOh1rhAtU208y+uPV8s=
github.com/mattn/go-colorable v0.1.14 h1:9A9LHSqF/7dyVVX6g0U9cwm9pG3kP9gSzcuIPHPsaIE=
github.com/mattn/go-colorable v0.1.14/go.mod h1:6LmQG8QLFO4G5z1gPvYEzlUgJ2wF+stgPZH1UqBm1s8=
github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
@@ -84,8 +84,8 @@ github.com/mattn/go-runewidth v0.0.19 h1:v++JhqYnZuu5jSKrk9RbgF5v4CGUjqRfBm05byF
github.com/mattn/go-runewidth v0.0.19/go.mod h1:XBkDxAl56ILZc9knddidhrOlY5R/pDhgLpndooCuJAs=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/redis/go-redis/v9 v9.16.0 h1:OotgqgLSRCmzfqChbQyG1PHC3tLNR89DG4jdOERSEP4=
github.com/redis/go-redis/v9 v9.16.0/go.mod h1:u410H11HMLoB+TP67dz8rL9s6QW2j76l0//kSOd3370=
github.com/redis/go-redis/v9 v9.17.2 h1:P2EGsA4qVIM3Pp+aPocCJ7DguDHhqrXNhVcEp4ViluI=
github.com/redis/go-redis/v9 v9.17.2/go.mod h1:u410H11HMLoB+TP67dz8rL9s6QW2j76l0//kSOd3370=
github.com/rogpeppe/go-internal v1.14.1 h1:UQB4HGPB6osV0SQTLymcB4TgvyWu6ZyliaW0tI/otEQ=
github.com/rogpeppe/go-internal v1.14.1/go.mod h1:MaRKkUm5W0goXpeCfT7UZI6fk/L7L7so1lCWt35ZSgc=
github.com/savsgio/gotils v0.0.0-20250924091648-bce9a52d7761 h1:McifyVxygw1d67y6vxUqls2D46J8W9nrki9c8c0eVvE=
@@ -129,27 +129,27 @@ go.opentelemetry.io/proto/otlp v1.9.0 h1:l706jCMITVouPOqEnii2fIAuO3IVGBRPV5ICjce
go.opentelemetry.io/proto/otlp v1.9.0/go.mod h1:xE+Cx5E/eEHw+ISFkwPLwCZefwVjY+pqKg1qcK03+/4=
go.uber.org/goleak v1.3.0 h1:2K3zAYmnTNqV73imy9J1T3WC+gmCePx2hEGkimedGto=
go.uber.org/goleak v1.3.0/go.mod h1:CoHD4mav9JJNrW/WLlf7HGZPjdw8EucARQHekz1X6bE=
golang.org/x/crypto v0.43.0 h1:dduJYIi3A3KOfdGOHX8AVZ/jGiyPa3IbBozJ5kNuE04=
golang.org/x/crypto v0.43.0/go.mod h1:BFbav4mRNlXJL4wNeejLpWxB7wMbc79PdRGhWKncxR0=
golang.org/x/net v0.46.0 h1:giFlY12I07fugqwPuWJi68oOnpfqFnJIJzaIIm2JVV4=
golang.org/x/net v0.46.0/go.mod h1:Q9BGdFy1y4nkUwiLvT5qtyhAnEHgnQ/zd8PfU6nc210=
golang.org/x/sync v0.17.0 h1:l60nONMj9l5drqw6jlhIELNv9I0A4OFgRsG9k2oT9Ug=
golang.org/x/sync v0.17.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI=
golang.org/x/crypto v0.45.0 h1:jMBrvKuj23MTlT0bQEOBcAE0mjg8mK9RXFhRH6nyF3Q=
golang.org/x/crypto v0.45.0/go.mod h1:XTGrrkGJve7CYK7J8PEww4aY7gM3qMCElcJQ8n8JdX4=
golang.org/x/net v0.47.0 h1:Mx+4dIFzqraBXUugkia1OOvlD6LemFo1ALMHjrXDOhY=
golang.org/x/net v0.47.0/go.mod h1:/jNxtkgq5yWUGYkaZGqo27cfGZ1c5Nen03aYrrKpVRU=
golang.org/x/sync v0.18.0 h1:kr88TuHDroi+UVf+0hZnirlk8o8T+4MrK6mr60WkH/I=
golang.org/x/sync v0.18.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.37.0 h1:fdNQudmxPjkdUTPnLn5mdQv7Zwvbvpaxqs831goi9kQ=
golang.org/x/sys v0.37.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks=
golang.org/x/term v0.36.0 h1:zMPR+aF8gfksFprF/Nc/rd1wRS1EI6nDBGyWAvDzx2Q=
golang.org/x/term v0.36.0/go.mod h1:Qu394IJq6V6dCBRgwqshf3mPF85AqzYEzofzRdZkWss=
golang.org/x/text v0.30.0 h1:yznKA/E9zq54KzlzBEAWn1NXSQ8DIp/NYMy88xJjl4k=
golang.org/x/text v0.30.0/go.mod h1:yDdHFIX9t+tORqspjENWgzaCVXgk0yYnYuSZ8UzzBVM=
golang.org/x/sys v0.38.0 h1:3yZWxaJjBmCWXqhN1qh02AkOnCQ1poK6oF+a7xWL6Gc=
golang.org/x/sys v0.38.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks=
golang.org/x/term v0.37.0 h1:8EGAD0qCmHYZg6J17DvsMy9/wJ7/D/4pV/wfnld5lTU=
golang.org/x/term v0.37.0/go.mod h1:5pB4lxRNYYVZuTLmy8oR2BH8dflOR+IbTYFD8fi3254=
golang.org/x/text v0.31.0 h1:aC8ghyu4JhP8VojJ2lEHBnochRno1sgL6nEi9WGFGMM=
golang.org/x/text v0.31.0/go.mod h1:tKRAlv61yKIjGGHX/4tP1LTbc13YSec1pxVEWXzfoeM=
gonum.org/v1/gonum v0.16.0 h1:5+ul4Swaf3ESvrOnidPp4GZbzf0mxVQpDCYUQE7OJfk=
gonum.org/v1/gonum v0.16.0/go.mod h1:fef3am4MQ93R2HHpKnLk4/Tbh/s0+wqD5nfa6Pnwy4E=
google.golang.org/genproto/googleapis/api v0.0.0-20251103181224-f26f9409b101 h1:vk5TfqZHNn0obhPIYeS+cxIFKFQgser/M2jnI+9c6MM=
google.golang.org/genproto/googleapis/api v0.0.0-20251103181224-f26f9409b101/go.mod h1:E17fc4PDhkr22dE3RgnH2hEubUaky6ZwW4VhANxyspg=
google.golang.org/genproto/googleapis/rpc v0.0.0-20251103181224-f26f9409b101 h1:tRPGkdGHuewF4UisLzzHHr1spKw92qLM98nIzxbC0wY=
google.golang.org/genproto/googleapis/rpc v0.0.0-20251103181224-f26f9409b101/go.mod h1:7i2o+ce6H/6BluujYR+kqX3GKH+dChPTQU19wjRPiGk=
google.golang.org/grpc v1.76.0 h1:UnVkv1+uMLYXoIz6o7chp59WfQUYA2ex/BXQ9rHZu7A=
google.golang.org/grpc v1.76.0/go.mod h1:Ju12QI8M6iQJtbcsV+awF5a4hfJMLi4X0JLo94ULZ6c=
google.golang.org/genproto/googleapis/api v0.0.0-20251202230838-ff82c1b0f217 h1:fCvbg86sFXwdrl5LgVcTEvNC+2txB5mgROGmRL5mrls=
google.golang.org/genproto/googleapis/api v0.0.0-20251202230838-ff82c1b0f217/go.mod h1:+rXWjjaukWZun3mLfjmVnQi18E1AsFbDN9QdJ5YXLto=
google.golang.org/genproto/googleapis/rpc v0.0.0-20251202230838-ff82c1b0f217 h1:gRkg/vSppuSQoDjxyiGfN4Upv/h/DQmIR10ZU8dh4Ww=
google.golang.org/genproto/googleapis/rpc v0.0.0-20251202230838-ff82c1b0f217/go.mod h1:7i2o+ce6H/6BluujYR+kqX3GKH+dChPTQU19wjRPiGk=
google.golang.org/grpc v1.77.0 h1:wVVY6/8cGA6vvffn+wWK5ToddbgdU3d8MNENr4evgXM=
google.golang.org/grpc v1.77.0/go.mod h1:z0BY1iVj0q8E1uSQCjL9cppRj+gnZjzDnzV0dHhrNig=
google.golang.org/protobuf v1.36.10 h1:AYd7cD/uASjIL6Q9LiTjz8JLcrh/88q5UObnmY3aOOE=
google.golang.org/protobuf v1.36.10/go.mod h1:HTf+CrKn2C3g5S8VImy6tdcUvCska2kB7j23XfzDpco=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
+52 -13
View File
@@ -7,6 +7,7 @@ import (
"sync"
"sync/atomic"
"time"
"unicode"
"github.com/goccy/go-json"
fiber "github.com/gofiber/fiber/v2"
@@ -37,6 +38,40 @@ var (
currentCacheSize int64 // Use atomic operations for this
)
// sanitizeOperationName removes null bytes and other invalid characters from operation names
// This prevents panics when creating metrics with invalid label values
func sanitizeOperationName(name string) string {
if name == "" || name == "undefined" {
return name
}
var buf strings.Builder
buf.Grow(len(name))
for _, r := range name {
// Skip null bytes entirely
if r == '\x00' {
continue
}
// Replace control characters with underscores
if r < 32 || r == 127 {
buf.WriteByte('_')
continue
}
// Only allow printable characters
if unicode.IsPrint(r) {
buf.WriteRune(r)
}
}
result := buf.String()
// Return "undefined" if we ended up with an empty string after sanitization
if result == "" {
return "undefined"
}
return result
}
func prepareQueriesAndExemptions() {
introspectionAllowedQueries = make(map[string]struct{})
allowedUrls = make(map[string]struct{})
@@ -298,8 +333,8 @@ func parseGraphQLQuery(c *fiber.Ctx) *parseGraphQLQueryResult {
res.operationType = "mutation"
if oper.Name != nil {
mutationName = oper.Name.Value
// Use mutation name immediately
res.operationName = mutationName
// Use mutation name immediately, sanitized to prevent metric panics
res.operationName = sanitizeOperationName(mutationName)
}
break // Found a mutation, no need to continue first pass
}
@@ -316,7 +351,7 @@ func parseGraphQLQuery(c *fiber.Ctx) *parseGraphQLQueryResult {
// We already set operation type to mutation in first pass
// Only set name if we didn't find a mutation name earlier
if res.operationName == "undefined" && oper.Name != nil {
res.operationName = oper.Name.Value
res.operationName = sanitizeOperationName(oper.Name.Value)
}
} else {
// No mutation found, use the normal logic
@@ -325,18 +360,10 @@ func parseGraphQLQuery(c *fiber.Ctx) *parseGraphQLQueryResult {
}
if res.operationName == "undefined" && oper.Name != nil {
res.operationName = oper.Name.Value
res.operationName = sanitizeOperationName(oper.Name.Value)
}
}
// Handle endpoint routing - always use write endpoint for mutations
if res.operationType == "mutation" {
res.activeEndpoint = cfg.Server.HostGraphQL
} else if cfg.Server.HostGraphQLReadOnly != "" {
// Use read-only endpoint for non-mutation operations
res.activeEndpoint = cfg.Server.HostGraphQLReadOnly
}
// Block mutations in read-only mode
if res.operationType == "mutation" && cfg.Server.ReadOnlyMode {
if ifNotInTest() {
@@ -359,13 +386,25 @@ func parseGraphQLQuery(c *fiber.Ctx) *parseGraphQLQueryResult {
}
}
// Handle endpoint routing AFTER processing all definitions
// This ensures mutations are always routed to the write endpoint
if res.operationType == "mutation" {
res.activeEndpoint = cfg.Server.HostGraphQL
} else if cfg.Server.HostGraphQLReadOnly != "" {
// Use read-only endpoint for non-mutation operations
res.activeEndpoint = cfg.Server.HostGraphQLReadOnly
}
// Track parsing time
if ifNotInTest() && cfg.Monitoring != nil {
parseTime := float64(time.Since(startTime).Milliseconds())
cfg.Monitoring.IncrementFloat(libpack_monitoring.MetricsGraphQLParsingTime, nil, parseTime)
}
return res
// Create a copy to return, since the original will be returned to the pool
// This prevents race conditions where concurrent requests could modify the same result
result := *res
return &result
}
// processDirectives extracts caching directives from the operation
+539 -2
View File
@@ -1,14 +1,16 @@
package main
import (
"context"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"sync"
"sync/atomic"
"time"
"github.com/gofiber/fiber/v2"
"github.com/gookit/goutil/strutil"
libpack_cache "github.com/lukaszraczylo/graphql-monitoring-proxy/cache"
libpack_monitoring "github.com/lukaszraczylo/graphql-monitoring-proxy/monitoring"
"github.com/sony/gobreaker"
@@ -115,7 +117,8 @@ func (suite *Tests) TestCachingAndCircuitBreakerInteraction() {
suite.Equal(responseBody, firstResponseBody, "Response body should match server response")
// Calculate hash the same way the system does, before releasing context
cacheKey := strutil.Md5(ctx.Body())
// Use default user context ("-", "-") since no auth headers are set in this test
cacheKey := libpack_cache.CalculateHash(ctx, "-", "-")
// Store in cache directly for test
libpack_cache.CacheStore(cacheKey, []byte(responseBody))
@@ -495,3 +498,537 @@ func getMetricValue(metricName string) int {
}
return int(counter.Get())
}
// TestRequestCoalescingIntegration tests that request coalescing works end-to-end
// through the proxy layer, ensuring concurrent identical requests result in only
// one backend call while all clients receive the correct response.
func (suite *Tests) TestRequestCoalescingIntegration() {
// Save original config
originalCoalescing := cfg.RequestCoalescing
originalClient := cfg.Client.FastProxyClient
originalHostGraphQL := cfg.Server.HostGraphQL
// Restore after test
defer func() {
cfg.RequestCoalescing = originalCoalescing
cfg.Client.FastProxyClient = originalClient
cfg.Server.HostGraphQL = originalHostGraphQL
}()
// Track backend calls
var backendCallCount atomic.Int32
var requestDelay = 100 * time.Millisecond
// Create test server that counts requests and introduces delay
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
backendCallCount.Add(1)
time.Sleep(requestDelay) // Delay to allow concurrent requests to coalesce
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte(`{"data":{"users":[{"id":"1","name":"Test User"}]}}`))
}))
defer server.Close()
// Configure for test
cfg.Server.HostGraphQL = server.URL
cfg.Client.ClientTimeout = 5
cfg.Client.FastProxyClient = createFasthttpClient(cfg)
cfg.RequestCoalescing.Enable = true
// Initialize request coalescer for this test
// Reset the global coalescer by creating a new one
testCoalescer := NewRequestCoalescer(true, cfg.Logger, cfg.Monitoring)
// Temporarily replace global coalescer
originalCoalescer := requestCoalescer
requestCoalescer = testCoalescer
defer func() {
requestCoalescer = originalCoalescer
}()
// Test Case 1: Concurrent identical requests should coalesce
suite.Run("concurrent_identical_requests_coalesce", func() {
backendCallCount.Store(0)
testCoalescer.Reset()
concurrentRequests := 10
var wg sync.WaitGroup
wg.Add(concurrentRequests)
responses := make([]string, concurrentRequests)
errors := make([]error, concurrentRequests)
// Launch concurrent requests with identical query
for i := 0; i < concurrentRequests; i++ {
go func(index int) {
defer wg.Done()
reqCtx := &fasthttp.RequestCtx{}
reqCtx.Request.SetRequestURI("/graphql")
reqCtx.Request.Header.SetMethod("POST")
reqCtx.Request.Header.Set("Content-Type", "application/json")
reqCtx.Request.SetBody([]byte(`{"query": "query { users { id name } }"}`))
ctx := suite.app.AcquireCtx(reqCtx)
err := proxyTheRequest(ctx, cfg.Server.HostGraphQL)
errors[index] = err
responses[index] = string(ctx.Response().Body())
suite.app.ReleaseCtx(ctx)
}(i)
}
wg.Wait()
// Verify only 1 backend call was made
suite.Equal(int32(1), backendCallCount.Load(),
"Should make only 1 backend call for %d concurrent identical requests", concurrentRequests)
// Verify all requests succeeded with same response
expectedResponse := `{"data":{"users":[{"id":"1","name":"Test User"}]}}`
for i := 0; i < concurrentRequests; i++ {
suite.Nil(errors[i], "Request %d should succeed", i)
suite.Equal(expectedResponse, responses[i],
"Request %d should have correct response", i)
}
// Verify coalescing stats
stats := testCoalescer.GetStats()
suite.Equal(int64(concurrentRequests), stats["total_requests"],
"Total requests should match")
suite.Equal(int64(1), stats["primary_requests"],
"Should have 1 primary request")
suite.Equal(int64(concurrentRequests-1), stats["coalesced_requests"],
"Should have %d coalesced requests", concurrentRequests-1)
})
// Test Case 2: Different queries should NOT coalesce
suite.Run("different_queries_not_coalesced", func() {
backendCallCount.Store(0)
testCoalescer.Reset()
// Create server that returns query-specific responses
server2 := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
backendCallCount.Add(1)
time.Sleep(50 * time.Millisecond)
body := make([]byte, r.ContentLength)
_, _ = r.Body.Read(body)
var response string
if strings.Contains(string(body), "query1") {
response = `{"data":{"result":"query1"}}`
} else if strings.Contains(string(body), "query2") {
response = `{"data":{"result":"query2"}}`
} else {
response = `{"data":{"result":"unknown"}}`
}
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte(response))
}))
defer server2.Close()
cfg.Server.HostGraphQL = server2.URL
cfg.Client.FastProxyClient = createFasthttpClient(cfg)
var wg sync.WaitGroup
wg.Add(2)
var response1, response2 string
var err1, err2 error
// Launch two requests with different queries concurrently
go func() {
defer wg.Done()
reqCtx := &fasthttp.RequestCtx{}
reqCtx.Request.SetRequestURI("/graphql")
reqCtx.Request.Header.SetMethod("POST")
reqCtx.Request.Header.Set("Content-Type", "application/json")
reqCtx.Request.SetBody([]byte(`{"query": "query { query1 }"}`))
ctx := suite.app.AcquireCtx(reqCtx)
err1 = proxyTheRequest(ctx, cfg.Server.HostGraphQL)
response1 = string(ctx.Response().Body())
suite.app.ReleaseCtx(ctx)
}()
go func() {
defer wg.Done()
reqCtx := &fasthttp.RequestCtx{}
reqCtx.Request.SetRequestURI("/graphql")
reqCtx.Request.Header.SetMethod("POST")
reqCtx.Request.Header.Set("Content-Type", "application/json")
reqCtx.Request.SetBody([]byte(`{"query": "query { query2 }"}`))
ctx := suite.app.AcquireCtx(reqCtx)
err2 = proxyTheRequest(ctx, cfg.Server.HostGraphQL)
response2 = string(ctx.Response().Body())
suite.app.ReleaseCtx(ctx)
}()
wg.Wait()
// Both requests should succeed
suite.Nil(err1, "Query1 should succeed")
suite.Nil(err2, "Query2 should succeed")
// Should have made 2 backend calls (no coalescing for different queries)
suite.Equal(int32(2), backendCallCount.Load(),
"Should make 2 backend calls for 2 different queries")
// Responses should be different
suite.Contains(response1, "query1", "Response1 should be for query1")
suite.Contains(response2, "query2", "Response2 should be for query2")
})
// Test Case 3: Coalescing disabled should make separate calls
suite.Run("coalescing_disabled", func() {
// Create a fresh server for this test
var disabledCallCount atomic.Int32
serverDisabled := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
disabledCallCount.Add(1)
time.Sleep(50 * time.Millisecond)
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte(`{"data":{"users":[{"id":"1"}]}}`))
}))
defer serverDisabled.Close()
cfg.Server.HostGraphQL = serverDisabled.URL
cfg.Client.FastProxyClient = createFasthttpClient(cfg)
// Disable coalescing
cfg.RequestCoalescing.Enable = false
concurrentRequests := 5
var wg sync.WaitGroup
wg.Add(concurrentRequests)
// Launch concurrent identical requests
for i := 0; i < concurrentRequests; i++ {
go func() {
defer wg.Done()
reqCtx := &fasthttp.RequestCtx{}
reqCtx.Request.SetRequestURI("/graphql")
reqCtx.Request.Header.SetMethod("POST")
reqCtx.Request.Header.Set("Content-Type", "application/json")
reqCtx.Request.SetBody([]byte(`{"query": "query { users { id } }"}`))
ctx := suite.app.AcquireCtx(reqCtx)
_ = proxyTheRequest(ctx, cfg.Server.HostGraphQL)
suite.app.ReleaseCtx(ctx)
}()
}
wg.Wait()
// Should make separate backend calls when coalescing is disabled
suite.Equal(int32(concurrentRequests), disabledCallCount.Load(),
"Should make %d backend calls when coalescing is disabled", concurrentRequests)
// Re-enable for subsequent tests
cfg.RequestCoalescing.Enable = true
})
// Test Case 4: Error responses should be shared correctly
suite.Run("error_responses_coalesced", func() {
backendCallCount.Store(0)
testCoalescer.Reset()
// Create server that returns errors
serverError := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
backendCallCount.Add(1)
time.Sleep(50 * time.Millisecond)
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusInternalServerError)
_, _ = w.Write([]byte(`{"errors":[{"message":"Internal server error"}]}`))
}))
defer serverError.Close()
cfg.Server.HostGraphQL = serverError.URL
cfg.Client.FastProxyClient = createFasthttpClient(cfg)
concurrentRequests := 5
var wg sync.WaitGroup
wg.Add(concurrentRequests)
errors := make([]error, concurrentRequests)
for i := 0; i < concurrentRequests; i++ {
go func(index int) {
defer wg.Done()
reqCtx := &fasthttp.RequestCtx{}
reqCtx.Request.SetRequestURI("/graphql")
reqCtx.Request.Header.SetMethod("POST")
reqCtx.Request.Header.Set("Content-Type", "application/json")
reqCtx.Request.SetBody([]byte(`{"query": "query { fail }"}`))
ctx := suite.app.AcquireCtx(reqCtx)
errors[index] = proxyTheRequest(ctx, cfg.Server.HostGraphQL)
suite.app.ReleaseCtx(ctx)
}(i)
}
wg.Wait()
// Should still only make 1 backend call
suite.Equal(int32(1), backendCallCount.Load(),
"Should make only 1 backend call even for error responses")
// All requests should receive the same error
for i := 0; i < concurrentRequests; i++ {
suite.NotNil(errors[i], "Request %d should have error", i)
}
})
}
// TestRetryBudgetIntegration tests that retry budget correctly limits retry attempts
func (suite *Tests) TestRetryBudgetIntegration() {
// Initialize a retry budget with limited tokens for testing
budgetCtx := context.Background()
testBudget := NewRetryBudgetWithContext(budgetCtx, RetryBudgetConfig{
MaxTokens: 3, // Only allow 3 retries total
TokensPerSecond: 0, // Don't refill during test
Enabled: true,
}, cfg.Logger)
// Replace global retry budget
originalBudget := retryBudget
retryBudget = testBudget
defer func() {
testBudget.Shutdown()
retryBudget = originalBudget
}()
suite.Run("retry_budget_limits_retries", func() {
testBudget.Reset()
// Verify retry budget is set and works correctly
rb := GetRetryBudget()
suite.NotNil(rb, "Retry budget should be set")
suite.True(rb.enabled, "Retry budget should be enabled")
suite.T().Logf("Retry budget: enabled=%v, tokens=%d", rb.enabled, rb.currentTokens.Load())
// Test that AllowRetry consumes tokens correctly
initialTokens := rb.currentTokens.Load()
suite.Equal(int64(3), initialTokens, "Should start with 3 tokens")
// First 3 retries should be allowed
suite.True(rb.AllowRetry(), "First retry should be allowed")
suite.True(rb.AllowRetry(), "Second retry should be allowed")
suite.True(rb.AllowRetry(), "Third retry should be allowed")
// Fourth retry should be denied (tokens exhausted)
suite.False(rb.AllowRetry(), "Fourth retry should be denied - budget exhausted")
// Verify stats
stats := rb.GetStats()
suite.Equal(int64(4), stats["total_attempts"].(int64), "Should have 4 total attempts")
suite.Equal(int64(3), stats["allowed_retries"].(int64), "Should have 3 allowed retries")
suite.Equal(int64(1), stats["denied_retries"].(int64), "Should have 1 denied retry")
suite.T().Logf("Retry budget stats: total=%d, allowed=%d, denied=%d",
stats["total_attempts"], stats["allowed_retries"], stats["denied_retries"])
})
suite.Run("retry_budget_exhaustion", func() {
// Create a new budget with only 1 token
testBudget.Shutdown()
budgetCtx2 := context.Background()
testBudget2 := NewRetryBudgetWithContext(budgetCtx2, RetryBudgetConfig{
MaxTokens: 1, // Only allow 1 retry
TokensPerSecond: 0, // Don't refill
Enabled: true,
}, cfg.Logger)
retryBudget = testBudget2
defer func() {
testBudget2.Shutdown()
}()
// Test budget exhaustion with 1 token
rb := GetRetryBudget()
suite.NotNil(rb, "Retry budget should be set")
suite.Equal(int64(1), rb.currentTokens.Load(), "Should start with 1 token")
// First retry should be allowed
suite.True(rb.AllowRetry(), "First retry should be allowed")
// Second retry should be denied (only 1 token available)
suite.False(rb.AllowRetry(), "Second retry should be denied - budget exhausted")
// Verify stats
stats := rb.GetStats()
suite.Equal(int64(2), stats["total_attempts"].(int64), "Should have 2 total attempts")
suite.Equal(int64(1), stats["allowed_retries"].(int64), "Should have 1 allowed retry")
suite.Equal(int64(1), stats["denied_retries"].(int64), "Should have 1 denied retry")
suite.T().Logf("Retry budget stats: total=%d, allowed=%d, denied=%d",
stats["total_attempts"], stats["allowed_retries"], stats["denied_retries"])
})
}
// TestConnectionPoolStatsIntegration tests that connection pool stats are tracked
func (suite *Tests) TestConnectionPoolStatsIntegration() {
// Save original config
originalClient := cfg.Client.FastProxyClient
originalHostGraphQL := cfg.Server.HostGraphQL
originalCoalescing := cfg.RequestCoalescing.Enable
// Restore after test
defer func() {
cfg.Client.FastProxyClient = originalClient
cfg.Server.HostGraphQL = originalHostGraphQL
cfg.RequestCoalescing.Enable = originalCoalescing
}()
// Disable request coalescing for accurate tracking
cfg.RequestCoalescing.Enable = false
suite.Run("connection_success_tracked", func() {
// Create test server that succeeds
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte(`{"data":{"test":"success"}}`))
}))
defer server.Close()
cfg.Server.HostGraphQL = server.URL
cfg.Client.ClientTimeout = 5
cfg.Client.FastProxyClient = createFasthttpClient(cfg)
// Initialize connection pool
InitializeConnectionPool(cfg.Client.FastProxyClient)
defer ShutdownConnectionPool()
poolMgr := GetConnectionPoolManager()
suite.NotNil(poolMgr, "Connection pool manager should be initialized")
// Get stats before
statsBefore := poolMgr.GetConnectionStats()
successBefore := statsBefore["total_connections"].(int64)
// Make a successful request
reqCtx := &fasthttp.RequestCtx{}
reqCtx.Request.SetRequestURI("/graphql")
reqCtx.Request.Header.SetMethod("POST")
reqCtx.Request.Header.Set("Content-Type", "application/json")
reqCtx.Request.SetBody([]byte(`{"query": "query { test }"}`))
ctx := suite.app.AcquireCtx(reqCtx)
err := proxyTheRequest(ctx, cfg.Server.HostGraphQL)
suite.app.ReleaseCtx(ctx)
suite.Nil(err, "Request should succeed")
// Get stats after
statsAfter := poolMgr.GetConnectionStats()
successAfter := statsAfter["total_connections"].(int64)
suite.Greater(successAfter, successBefore,
"Total connections should increase after successful request")
})
suite.Run("connection_failure_tracked_on_5xx", func() {
// Create test server that returns 503
// Note: 503 triggers retry which records failures
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusServiceUnavailable)
_, _ = w.Write([]byte(`{"errors":[{"message":"Service unavailable"}]}`))
}))
defer server.Close()
cfg.Server.HostGraphQL = server.URL
cfg.Client.ClientTimeout = 2
cfg.Client.FastProxyClient = createFasthttpClient(cfg)
// Initialize connection pool
InitializeConnectionPool(cfg.Client.FastProxyClient)
defer ShutdownConnectionPool()
poolMgr := GetConnectionPoolManager()
suite.NotNil(poolMgr, "Connection pool manager should be initialized")
// Get stats before
statsBefore := poolMgr.GetConnectionStats()
failuresBefore := statsBefore["connection_failures"].(int64)
// Make a failing request (503 is retryable, so it will retry and track failures)
reqCtx := &fasthttp.RequestCtx{}
reqCtx.Request.SetRequestURI("/graphql")
reqCtx.Request.Header.SetMethod("POST")
reqCtx.Request.Header.Set("Content-Type", "application/json")
reqCtx.Request.SetBody([]byte(`{"query": "query { fail }"}`))
ctx := suite.app.AcquireCtx(reqCtx)
_ = proxyTheRequest(ctx, cfg.Server.HostGraphQL)
suite.app.ReleaseCtx(ctx)
// Get stats after - should have failures from retry attempts
statsAfter := poolMgr.GetConnectionStats()
failuresAfter := statsAfter["connection_failures"].(int64)
suite.Greater(failuresAfter, failuresBefore,
"Connection failures should increase after 5xx responses that trigger retries")
suite.T().Logf("Connection failures: before=%d, after=%d",
failuresBefore, failuresAfter)
})
suite.Run("stats_reflect_request_outcomes", func() {
// This test verifies that connection stats properly reflect the
// combination of successes and failures over multiple requests
// Start with a fresh server
var requestCount atomic.Int32
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
count := requestCount.Add(1)
// First 2 requests succeed, rest fail
if count <= 2 {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte(`{"data":{"test":"success"}}`))
} else {
w.WriteHeader(http.StatusInternalServerError)
_, _ = w.Write([]byte(`{"errors":[{"message":"Error"}]}`))
}
}))
defer server.Close()
cfg.Server.HostGraphQL = server.URL
cfg.Client.ClientTimeout = 2
cfg.Client.FastProxyClient = createFasthttpClient(cfg)
// Initialize connection pool
InitializeConnectionPool(cfg.Client.FastProxyClient)
defer ShutdownConnectionPool()
poolMgr := GetConnectionPoolManager()
suite.NotNil(poolMgr, "Connection pool manager should be initialized")
// Make 2 successful requests
for i := 0; i < 2; i++ {
reqCtx := &fasthttp.RequestCtx{}
reqCtx.Request.SetRequestURI("/graphql")
reqCtx.Request.Header.SetMethod("POST")
reqCtx.Request.Header.Set("Content-Type", "application/json")
reqCtx.Request.SetBody([]byte(`{"query": "query { test }"}`))
ctx := suite.app.AcquireCtx(reqCtx)
_ = proxyTheRequest(ctx, cfg.Server.HostGraphQL)
suite.app.ReleaseCtx(ctx)
}
// Get stats after successes
statsAfterSuccess := poolMgr.GetConnectionStats()
totalConnections := statsAfterSuccess["total_connections"].(int64)
suite.GreaterOrEqual(totalConnections, int64(2),
"Should have at least 2 successful connections tracked")
suite.T().Logf("Total connections after 2 successful requests: %d", totalConnections)
})
}
+51 -17
View File
@@ -131,8 +131,30 @@ func parseConfig() {
c.Cache.CacheTTL = getDetailsFromEnv("CACHE_TTL", 60)
c.Cache.CacheMaxMemorySize = getDetailsFromEnv("CACHE_MAX_MEMORY_SIZE", 100) // Default 100MB
c.Cache.CacheMaxEntries = getDetailsFromEnv("CACHE_MAX_ENTRIES", 10000) // Default 10000 entries
c.Cache.CacheUseLRU = getDetailsFromEnv("CACHE_USE_LRU", false) // Use LRU eviction algorithm
// GraphQL query parsing cache - auto-calculate based on CPU cores if not set
c.Cache.GraphQLQueryCacheSize = getDetailsFromEnv("GRAPHQL_QUERY_CACHE_SIZE", runtime.GOMAXPROCS(0)*250)
// SECURITY: Per-user cache isolation (enabled by default for security)
// Set CACHE_PER_USER_DISABLED=true ONLY if you have a single-user application
// or understand the security implications of shared cache across users
c.Cache.PerUserCacheDisabled = getDetailsFromEnv("CACHE_PER_USER_DISABLED", false)
// Log warning if per-user caching is disabled
if c.Cache.PerUserCacheDisabled {
defer func() {
if c.Logger != nil {
c.Logger.Warning(&libpack_logging.LogMessage{
Message: "⚠️ Per-user cache isolation is DISABLED - Users may see each other's cached data!",
Pairs: map[string]interface{}{
"security_risk": "CRITICAL - Do not use in multi-user applications",
"recommendation": "Remove CACHE_PER_USER_DISABLED or set it to false",
},
})
}
}()
}
// Redis cache
c.Cache.CacheRedisEnable = getDetailsFromEnv("ENABLE_REDIS_CACHE", false)
c.Cache.CacheRedisURL = getDetailsFromEnv("CACHE_REDIS_URL", "localhost:6379")
@@ -246,7 +268,7 @@ func parseConfig() {
c.Api.BannedUsersFile = validatedPath
}
c.Server.PurgeOnCrawl = getDetailsFromEnv("PURGE_METRICS_ON_CRAWL", false)
c.Server.PurgeEvery = getDetailsFromEnv("PURGE_METRICS_ON_TIMER", 0)
c.Server.PurgeEvery = getDetailsFromEnv("PURGE_METRICS_ON_TIMER", 1800) // Default: purge metrics every 30 minutes
// Hasura event cleaner
c.HasuraEventCleaner.Enable = getDetailsFromEnv("HASURA_EVENT_CLEANER", false)
c.HasuraEventCleaner.ClearOlderThan = getDetailsFromEnv("HASURA_EVENT_CLEANER_OLDER_THAN", 1)
@@ -355,8 +377,9 @@ func parseConfig() {
// Initialize cache if enabled
if cfg.Cache.CacheEnable || cfg.Cache.CacheRedisEnable {
cacheConfig := &libpack_cache.CacheConfig{
Logger: cfg.Logger,
TTL: cfg.Cache.CacheTTL,
Logger: cfg.Logger,
TTL: cfg.Cache.CacheTTL,
PerUserCacheDisabled: cfg.Cache.PerUserCacheDisabled,
}
// Redis cache configurations
if cfg.Cache.CacheRedisEnable {
@@ -368,9 +391,16 @@ func parseConfig() {
// Memory cache configurations
cacheConfig.Memory.MaxMemorySize = int64(cfg.Cache.CacheMaxMemorySize) * 1024 * 1024 // Convert MB to bytes
cacheConfig.Memory.MaxEntries = int64(cfg.Cache.CacheMaxEntries)
cacheConfig.Memory.UseLRU = cfg.Cache.CacheUseLRU
cacheType := "standard"
if cfg.Cache.CacheUseLRU {
cacheType = "LRU"
}
cfg.Logger.Info(&libpack_logging.LogMessage{
Message: "Configuring memory cache with limits",
Pairs: map[string]interface{}{
"type": cacheType,
"max_memory_mb": cfg.Cache.CacheMaxMemorySize,
"max_entries": cfg.Cache.CacheMaxEntries,
},
@@ -387,15 +417,7 @@ func parseConfig() {
initCircuitBreaker(cfg)
}
// Initialize retry budget
if cfg.RetryBudget.Enable {
retryBudgetConfig := RetryBudgetConfig{
TokensPerSecond: cfg.RetryBudget.TokensPerSecond,
MaxTokens: cfg.RetryBudget.MaxTokens,
Enabled: cfg.RetryBudget.Enable,
}
InitializeRetryBudget(retryBudgetConfig, cfg.Logger)
}
// Note: Retry budget is initialized in main() with context for graceful shutdown
// Initialize request coalescer
if cfg.RequestCoalescing.Enable {
@@ -420,11 +442,7 @@ func parseConfig() {
healthMgr.StartHealthChecking()
}
// Initialize RPS tracker for real-time requests per second monitoring
InitializeRPSTracker()
cfg.Logger.Info(&libpack_logging.LogMessage{
Message: "RPS tracker initialized",
})
// Note: RPS tracker is initialized in main() with context for graceful shutdown
// Load rate limit configuration with improved error handling
if err := loadRatelimitConfig(); err != nil {
@@ -462,6 +480,22 @@ func main() {
// Initialize shutdown manager
shutdownManager = NewShutdownManager(ctx)
// Initialize RPS tracker with context for graceful shutdown
InitializeRPSTracker(ctx)
cfg.Logger.Info(&libpack_logging.LogMessage{
Message: "RPS tracker initialized",
})
// Initialize retry budget with context for graceful shutdown
if cfg.RetryBudget.Enable {
retryBudgetConfig := RetryBudgetConfig{
TokensPerSecond: cfg.RetryBudget.TokensPerSecond,
MaxTokens: cfg.RetryBudget.MaxTokens,
Enabled: cfg.RetryBudget.Enable,
}
InitializeRetryBudgetWithContext(ctx, retryBudgetConfig, cfg.Logger)
}
// Create a wait group to manage goroutines
var wg sync.WaitGroup
+28 -2
View File
@@ -506,6 +506,7 @@ func (ma *MetricsAggregator) aggregateStats(instances []InstanceMetrics) map[str
totalCacheMisses int64
totalCachedQueries int64
totalMemoryUsageMB float64
hasValidMemoryStats bool // Track if any instance has valid memory stats
totalCurrentRPS float64
totalAvgRPS float64
totalActiveConnections int64
@@ -518,7 +519,10 @@ func (ma *MetricsAggregator) aggregateStats(instances []InstanceMetrics) map[str
totalRetryAllowed int64
totalRetryDenied int64
totalRetryAttempts int64
totalCurrentTokens int64
totalMaxTokens int64
retryBudgetEnabled = false
retryTokensPerSec float64 // Use max tokens_per_sec from any instance
// Circuit breaker stats
cbOpenCount int
@@ -598,9 +602,11 @@ func (ma *MetricsAggregator) aggregateStats(instances []InstanceMetrics) map[str
}
// Aggregate memory usage from full cache details
// Skip -1 values which indicate Redis cache (memory tracking not available)
if len(instance.Cache) > 0 {
if memMB, ok := instance.Cache["memory_usage_mb"].(float64); ok {
if memMB, ok := instance.Cache["memory_usage_mb"].(float64); ok && memMB >= 0 {
totalMemoryUsageMB += memMB
hasValidMemoryStats = true
}
}
@@ -642,6 +648,17 @@ func (ma *MetricsAggregator) aggregateStats(instances []InstanceMetrics) map[str
if attempts, ok := instance.RetryBudget["total_attempts"].(float64); ok {
totalRetryAttempts += int64(attempts)
}
if currentTokens, ok := instance.RetryBudget["current_tokens"].(float64); ok {
totalCurrentTokens += int64(currentTokens)
}
if maxTokens, ok := instance.RetryBudget["max_tokens"].(float64); ok {
totalMaxTokens += int64(maxTokens)
}
if tokensPerSec, ok := instance.RetryBudget["tokens_per_sec"].(float64); ok {
if tokensPerSec > retryTokensPerSec {
retryTokensPerSec = tokensPerSec
}
}
}
// Aggregate circuit breaker stats
@@ -718,7 +735,13 @@ func (ma *MetricsAggregator) aggregateStats(instances []InstanceMetrics) map[str
"total_cached": totalCachedQueries,
},
"memory": map[string]interface{}{
"total_usage_mb": totalMemoryUsageMB,
"total_usage_mb": func() float64 {
if hasValidMemoryStats {
return totalMemoryUsageMB
}
return -1
}(),
"available": hasValidMemoryStats,
},
"connections": map[string]interface{}{
"total_active": totalActiveConnections,
@@ -739,6 +762,9 @@ func (ma *MetricsAggregator) aggregateStats(instances []InstanceMetrics) map[str
"denied_retries": totalRetryDenied,
"total_attempts": totalRetryAttempts,
"denial_rate_pct": retryDenialRate,
"current_tokens": totalCurrentTokens,
"max_tokens": totalMaxTokens,
"tokens_per_sec": retryTokensPerSec,
},
"circuit_breaker": map[string]interface{}{
"enabled": circuitBreakerEnabled,
-4
View File
@@ -9,7 +9,3 @@ func (ms *MetricsSetup) RegisterDefaultMetrics() {
ms.RegisterMetricsCounter(MetricsCacheMiss, nil)
ms.RegisterMetricsCounter(MetricsQueriesCached, nil)
}
func (ms *MetricsSetup) RegisterGoMetrics() {
// TODO: metrics.WriteProcessMetrics(ms.metrics_set)
}
+107 -17
View File
@@ -68,26 +68,74 @@ func ensureDefaultLabels(labels *map[string]string, podName string) {
}
}
// sanitizeLabelValue removes or replaces characters that are invalid in metric labels
// This includes null bytes, newlines, carriage returns, quotes, and backslashes
func sanitizeLabelValue(value string) string {
if value == "" {
return value
}
var buf strings.Builder
buf.Grow(len(value))
for _, r := range value {
switch r {
case '\x00': // null byte
continue // Skip null bytes entirely
case '\n', '\r', '\t': // newlines, carriage returns, tabs
buf.WriteByte(' ') // Replace with space
case '"', '\\': // quotes and backslashes need escaping
buf.WriteByte('\\')
buf.WriteRune(r)
default:
// Only allow printable ASCII and common unicode characters
if unicode.IsPrint(r) {
buf.WriteRune(r)
}
}
}
return buf.String()
}
func appendSortedLabels(buf *bytes.Buffer, labels map[string]string) {
if len(labels) == 0 {
// Add defer/recover to prevent panics from crashing the application
defer func() {
if r := recover(); r != nil {
// Log the panic but don't crash
fmt.Fprintf(os.Stderr, "Recovered from panic in appendSortedLabels: %v\n", r)
}
}()
if len(labels) == 0 || buf == nil {
return
}
// Create a snapshot to avoid concurrent access issues
labelsCopy := make(map[string]string, len(labels))
for k, v := range labels {
labelsCopy[k] = v
if k == "" {
continue // Skip empty keys
}
// Sanitize the label value to remove null bytes and other invalid characters
labelsCopy[k] = sanitizeLabelValue(v)
}
if len(labelsCopy) == 0 {
return
}
keys := getSortedKeys(labelsCopy)
for i, k := range keys {
if i > 0 {
buf.WriteByte(',')
if v, ok := labelsCopy[k]; ok {
if i > 0 {
buf.WriteByte(',')
}
buf.WriteString(k)
buf.WriteString(`="`)
buf.WriteString(v)
buf.WriteByte('"')
}
buf.WriteString(k)
buf.WriteString(`="`)
buf.WriteString(labelsCopy[k])
buf.WriteByte('"')
}
}
@@ -117,7 +165,15 @@ func getSortedKeys(labels map[string]string) []string {
}
func labelsToString(labels map[string]string) string {
if labels == nil {
// Add defer/recover to prevent panics from crashing the application
defer func() {
if r := recover(); r != nil {
// Log the panic but don't crash
fmt.Fprintf(os.Stderr, "Recovered from panic in labelsToString: %v\n", r)
}
}()
if len(labels) == 0 {
return ""
}
@@ -126,17 +182,34 @@ func labelsToString(labels map[string]string) string {
values := make(map[string]string, len(labels))
for k, v := range labels {
if k == "" {
continue // Skip empty keys
}
keys = append(keys, k)
values[k] = v
}
if len(keys) == 0 {
return ""
}
sort.Strings(keys)
// Pre-allocate the builder with estimated capacity to avoid reallocation
var sb strings.Builder
estimatedSize := 0
for _, k := range keys {
sb.WriteString(k)
sb.WriteByte('=')
sb.WriteString(values[k])
sb.WriteByte(';')
estimatedSize += len(k) + len(values[k]) + 2 // key + value + '=' + ';'
}
sb.Grow(estimatedSize)
for _, k := range keys {
if v, ok := values[k]; ok {
sb.WriteString(k)
sb.WriteByte('=')
sb.WriteString(v)
sb.WriteByte(';')
}
}
return sb.String()
}
@@ -186,6 +259,14 @@ func is_special_rune(r rune) bool {
}
func compile_metrics_with_labels(name string, labels map[string]string) string {
// Add defer/recover to prevent panics from crashing the application
defer func() {
if r := recover(); r != nil {
// Log the panic but don't crash
fmt.Fprintf(os.Stderr, "Recovered from panic in compile_metrics_with_labels: %v\n", r)
}
}()
var buf bytes.Buffer
buf.WriteString(name)
@@ -197,16 +278,25 @@ func compile_metrics_with_labels(name string, labels map[string]string) string {
// Create a snapshot to avoid concurrent access issues
labelsCopy := make(map[string]string, len(labels))
for k, v := range labels {
if k == "" {
continue // Skip empty keys
}
labelsCopy[k] = v
}
if len(labelsCopy) == 0 {
return buf.String()
}
keys := getSortedKeys(labelsCopy)
for _, k := range keys {
buf.WriteByte('_')
buf.WriteString(k)
buf.WriteByte('_')
buf.WriteString(labelsCopy[k])
if v, ok := labelsCopy[k]; ok {
buf.WriteByte('_')
buf.WriteString(k)
buf.WriteByte('_')
buf.WriteString(v)
}
}
return buf.String()
+40 -2
View File
@@ -1,6 +1,7 @@
package libpack_monitoring
import (
"context"
"flag"
"fmt"
"time"
@@ -17,6 +18,8 @@ type MetricsSetup struct {
metrics_set_custom *metrics.Set
ic *InitConfig
metrics_prefix string
ctx context.Context
cancel context.CancelFunc
}
var log = libpack_logger.New().SetMinLogLevel(libpack_logger.LEVEL_INFO)
@@ -27,10 +30,18 @@ type InitConfig struct {
}
func NewMonitoring(ic *InitConfig) *MetricsSetup {
return NewMonitoringWithContext(context.Background(), ic)
}
// NewMonitoringWithContext creates a new monitoring instance with context for graceful shutdown
func NewMonitoringWithContext(ctx context.Context, ic *InitConfig) *MetricsSetup {
monCtx, cancel := context.WithCancel(ctx)
ms := &MetricsSetup{
ic: ic,
metrics_set: metrics.NewSet(),
metrics_set_custom: metrics.NewSet(),
ctx: monCtx,
cancel: cancel,
}
if flag.Lookup("test.v") == nil {
@@ -39,8 +50,14 @@ func NewMonitoring(ic *InitConfig) *MetricsSetup {
if ic.PurgeEvery > 0 {
ticker := time.NewTicker(time.Duration(ic.PurgeEvery) * time.Second)
go func() {
for range ticker.C {
ms.PurgeMetrics()
defer ticker.Stop()
for {
select {
case <-ms.ctx.Done():
return
case <-ticker.C:
ms.PurgeMetrics()
}
}
}()
}
@@ -49,6 +66,13 @@ func NewMonitoring(ic *InitConfig) *MetricsSetup {
return ms
}
// Shutdown stops the monitoring goroutines
func (ms *MetricsSetup) Shutdown() {
if ms.cancel != nil {
ms.cancel()
}
}
func (ms *MetricsSetup) startPrometheusEndpoint() {
app := fiber.New(fiber.Config{
DisableStartupMessage: true,
@@ -95,6 +119,20 @@ func (ms *MetricsSetup) RegisterMetricsGauge(metric_name string, labels map[stri
})
}
// RegisterMetricsGaugeFunc registers a gauge with a callback function that is called on every scrape
// This is useful for metrics that need to return a dynamic value
func (ms *MetricsSetup) RegisterMetricsGaugeFunc(metric_name string, labels map[string]string, fn func() float64) *metrics.Gauge {
if err := validate_metrics_name(metric_name); err != nil {
log.Error(&libpack_logger.LogMessage{
Message: "RegisterMetricsGaugeFunc() error - invalid metric name",
Pairs: map[string]interface{}{"error": err.Error(), "metric_name": metric_name},
})
// Return a dummy gauge instead of nil to prevent panics
return &metrics.Gauge{}
}
return ms.metrics_set_custom.GetOrCreateGauge(ms.get_metrics_name(metric_name, labels), fn)
}
func (ms *MetricsSetup) RegisterMetricsCounter(metric_name string, labels map[string]string) *metrics.Counter {
if err := validate_metrics_name(metric_name); err != nil {
log.Error(&libpack_logger.LogMessage{
+128 -27
View File
@@ -325,16 +325,72 @@ func setupTracing(c *fiber.Ctx) context.Context {
return ctx
}
// performProxyRequest executes the proxy request with retries and circuit breaker
// performProxyRequest executes the proxy request with retries, circuit breaker, and request coalescing
func performProxyRequest(c *fiber.Ctx, proxyURL string) error {
// Extract user context for cache key (needed for coalescing and circuit breaker fallback)
userID, userRole := extractUserInfo(c)
// Calculate cache key - includes user context for security
// This key is used for both request coalescing and cache fallback
cacheKey := libpack_cache.CalculateHash(c, userID, userRole)
// Check if request coalescing is enabled
rc := GetRequestCoalescer()
if rc != nil && cfg.RequestCoalescing.Enable {
// Use request coalescing to deduplicate identical concurrent requests
response, err := rc.Do(cacheKey, func() (*CoalescedResponse, error) {
// Execute the actual proxy request
proxyErr := performProxyRequestCore(c, proxyURL, cacheKey)
// Capture the response for coalescing
if proxyErr != nil {
return &CoalescedResponse{
Err: proxyErr,
StatusCode: c.Response().StatusCode(),
}, proxyErr
}
return &CoalescedResponse{
Body: c.Response().Body(),
StatusCode: c.Response().StatusCode(),
Headers: make(map[string]string),
}, nil
})
// Check for error from rc.Do (though it typically returns nil)
if err != nil {
return err
}
// Check for error stored in the response (for coalesced requests)
if response != nil && response.Err != nil {
return response.Err
}
// For coalesced requests (not the primary), we need to copy the response
if response != nil && response.Body != nil && len(response.Body) > 0 {
// Only set response if this is a coalesced request (body would be empty otherwise)
if len(c.Response().Body()) == 0 {
c.Response().SetStatusCode(response.StatusCode)
c.Response().SetBody(response.Body)
}
}
return nil
}
// No coalescing - execute directly
return performProxyRequestCore(c, proxyURL, cacheKey)
}
// performProxyRequestCore executes the proxy request with retries and circuit breaker
// This is the core implementation used by both direct calls and coalesced requests
func performProxyRequestCore(c *fiber.Ctx, proxyURL string, cacheKey string) error {
// If circuit breaker is not enabled, use the original method
if !cfg.CircuitBreaker.Enable || cb == nil {
return performProxyRequestWithRetries(c, proxyURL)
}
// Calculate cache key for potential fallback
cacheKey := libpack_cache.CalculateHash(c)
// Execute request through circuit breaker
_, err := cb.Execute(func() (interface{}, error) {
// Execute the request with retries
@@ -398,19 +454,36 @@ func executeProxyAttempt(c *fiber.Ctx, proxyURL string) error {
return retry.Unrecoverable(fmt.Errorf("fiber context became nil during retry"))
}
// Get connection pool manager for stats tracking
poolMgr := GetConnectionPoolManager()
// Execute the proxy request
if err := doProxyRequestWithTimeout(c, proxyURL, cfg.Client.FastProxyClient); err != nil {
proxyErr := doProxyRequestWithTimeout(c, proxyURL, cfg.Client.FastProxyClient)
if proxyErr != nil {
// Check if this is a connection error
if isConnectionError(err) {
if isConnectionError(proxyErr) {
notifyHealthManager(false)
return err // Connection errors are retryable
// Track connection failure
if poolMgr != nil {
poolMgr.RecordConnectionFailure()
}
return proxyErr // Connection errors are retryable
}
// Check if this is a timeout error - don't retry timeouts
if isTimeoutError(err) {
return retry.Unrecoverable(err)
if isTimeoutError(proxyErr) {
return retry.Unrecoverable(proxyErr)
}
return err
// Check if this is a retryable HTTP error (e.g., 503)
// These indicate the server responded but with an error status
if strings.Contains(proxyErr.Error(), "non-200 response") {
// Track as a failure for retryable HTTP errors
if poolMgr != nil {
poolMgr.RecordConnectionFailure()
}
}
return proxyErr
}
// Safety check before accessing response
@@ -425,10 +498,18 @@ func executeProxyAttempt(c *fiber.Ctx, proxyURL string) error {
if err == nil {
// Success case
notifyHealthManager(true)
// Track successful connection
if poolMgr != nil {
poolMgr.RecordConnectionSuccess()
}
return nil
}
if shouldRetry {
// Track connection failure for retryable errors (5xx, etc)
if poolMgr != nil {
poolMgr.RecordConnectionFailure()
}
return err // Retryable error
}
@@ -485,31 +566,51 @@ func performProxyRequestWithEnhancedRetries(c *fiber.Ctx, proxyURL string, backe
retry.LastErrorOnly(true),
retry.RetryIf(func(err error) bool {
// Don't retry if context is cancelled or context is nil
defer func() {
// Recover from any panic when accessing context
if r := recover(); r != nil {
// If we panic, don't retry
return
}
}()
if c == nil {
return false
}
// Try to safely access the context
ctx := c.Context()
if ctx == nil {
// Safely check if context is done/cancelled
// Note: fasthttp.RequestCtx.Done() can panic if not properly initialized
// If we panic, don't retry (maintains backward compatibility with test behavior)
shouldRetry := true
func() {
defer func() {
if r := recover(); r != nil {
// If we panic accessing context, don't retry
// This typically happens in test scenarios with mock contexts
shouldRetry = false
}
}()
ctx := c.Context()
if ctx == nil {
return
}
select {
case <-ctx.Done():
shouldRetry = false
default:
}
}()
if !shouldRetry {
return false
}
// Check if context is done/cancelled
select {
case <-ctx.Done():
return false
default:
return true
// Check retry budget before allowing retry
if rb := GetRetryBudget(); rb != nil {
if !rb.AllowRetry() {
cfg.Logger.Warning(&libpack_logger.LogMessage{
Message: "Retry denied by budget",
Pairs: map[string]interface{}{
"path": c.Path(),
"error": err.Error(),
},
})
return false
}
}
return true
}),
)
}
+30
View File
@@ -82,6 +82,36 @@ func (suite *Tests) Test_proxyTheRequest() {
wantErr: false,
wantEndpoint: "https://telegram-bot.app/",
},
{
name: "Test mutation with multiple operations (bug fix regression test)",
body: `{"query":"mutation getOrCreateUser { insert_tg_users_one(object: {id: 123}) { id } } query otherQuery { users { id } }"}`,
host: "https://telegram-bot.app/",
hostRO: "https://google.com/",
path: "/v1/graphql",
headers: supplied_headers,
wantErr: false,
wantEndpoint: "https://telegram-bot.app/",
},
{
name: "Test mutation followed by fragment (bug fix regression test)",
body: `{"query":"mutation insertUser { insert_users_one(object: {name: \"test\"}) { ...userFields } } fragment userFields on users { id name }"}`,
host: "https://telegram-bot.app/",
hostRO: "https://google.com/",
path: "/v1/graphql",
headers: supplied_headers,
wantErr: false,
wantEndpoint: "https://telegram-bot.app/",
},
{
name: "Test complex mutation document (main-bot style)",
body: `{"query":"mutation getOrCreateUser($user_id: bigint!, $group_id: bigint!) { insert_tg_users_one(object: {id: $user_id}, on_conflict: {constraint: tg_users_pkey, update_columns: last_seen}) { id } insert_tg_groups_one(object: {id: $group_id}, on_conflict: {constraint: tg_groups_pkey, update_columns: last_seen}) { id } }"}`,
host: "https://telegram-bot.app/",
hostRO: "https://google.com/",
path: "/v1/graphql",
headers: supplied_headers,
wantErr: false,
wantEndpoint: "https://telegram-bot.app/",
},
}
for _, tt := range tests {
+33 -5
View File
@@ -1,6 +1,7 @@
package main
import (
"context"
"sync"
"sync/atomic"
"time"
@@ -18,6 +19,8 @@ type RetryBudget struct {
mu sync.RWMutex
enabled bool
logger *libpack_logger.Logger
ctx context.Context
cancel context.CancelFunc
// Statistics
totalAttempts atomic.Int64
@@ -32,13 +35,21 @@ type RetryBudgetConfig struct {
Enabled bool // Whether retry budget is enabled
}
// NewRetryBudget creates a new retry budget
// NewRetryBudget creates a new retry budget (deprecated, use NewRetryBudgetWithContext)
func NewRetryBudget(config RetryBudgetConfig, logger *libpack_logger.Logger) *RetryBudget {
return NewRetryBudgetWithContext(context.Background(), config, logger)
}
// NewRetryBudgetWithContext creates a new retry budget with context for graceful shutdown
func NewRetryBudgetWithContext(ctx context.Context, config RetryBudgetConfig, logger *libpack_logger.Logger) *RetryBudget {
budgetCtx, cancel := context.WithCancel(ctx)
rb := &RetryBudget{
tokensPerSecond: config.TokensPerSecond,
maxTokens: int64(config.MaxTokens),
enabled: config.Enabled,
logger: logger,
ctx: budgetCtx,
cancel: cancel,
}
// Initialize with full bucket
@@ -91,8 +102,20 @@ func (rb *RetryBudget) refillLoop() {
ticker := time.NewTicker(100 * time.Millisecond) // Refill every 100ms
defer ticker.Stop()
for range ticker.C {
rb.refill()
for {
select {
case <-rb.ctx.Done():
return
case <-ticker.C:
rb.refill()
}
}
}
// Shutdown stops the retry budget goroutine
func (rb *RetryBudget) Shutdown() {
if rb.cancel != nil {
rb.cancel()
}
}
@@ -187,10 +210,15 @@ var (
retryBudgetOnce sync.Once
)
// InitializeRetryBudget initializes the global retry budget
// InitializeRetryBudget initializes the global retry budget (deprecated, use InitializeRetryBudgetWithContext)
func InitializeRetryBudget(config RetryBudgetConfig, logger *libpack_logger.Logger) *RetryBudget {
return InitializeRetryBudgetWithContext(context.Background(), config, logger)
}
// InitializeRetryBudgetWithContext initializes the global retry budget with context for graceful shutdown
func InitializeRetryBudgetWithContext(ctx context.Context, config RetryBudgetConfig, logger *libpack_logger.Logger) *RetryBudget {
retryBudgetOnce.Do(func() {
retryBudget = NewRetryBudget(config, logger)
retryBudget = NewRetryBudgetWithContext(ctx, config, logger)
if logger != nil && config.Enabled {
logger.Info(&libpack_logger.LogMessage{
Message: "Retry budget initialized",
+27 -8
View File
@@ -1,6 +1,7 @@
package main
import (
"context"
"sync"
"sync/atomic"
"time"
@@ -12,11 +13,17 @@ type RPSTracker struct {
lastSampleTime atomic.Int64 // Unix nano
currentRPS uint64 // stored as uint64, accessed with atomic operations
mu sync.RWMutex // for currentRPS updates
ctx context.Context
cancel context.CancelFunc
}
// NewRPSTracker creates a new RPS tracker
func NewRPSTracker() *RPSTracker {
tracker := &RPSTracker{}
// NewRPSTracker creates a new RPS tracker with context for graceful shutdown
func NewRPSTracker(ctx context.Context) *RPSTracker {
trackerCtx, cancel := context.WithCancel(ctx)
tracker := &RPSTracker{
ctx: trackerCtx,
cancel: cancel,
}
tracker.lastSampleTime.Store(time.Now().UnixNano())
go tracker.updateLoop()
return tracker
@@ -33,8 +40,20 @@ func (r *RPSTracker) updateLoop() {
ticker := time.NewTicker(1 * time.Second)
defer ticker.Stop()
for range ticker.C {
r.sample()
for {
select {
case <-r.ctx.Done():
return
case <-ticker.C:
r.sample()
}
}
}
// Shutdown stops the RPS tracker
func (r *RPSTracker) Shutdown() {
if r.cancel != nil {
r.cancel()
}
}
@@ -75,10 +94,10 @@ func (r *RPSTracker) GetCurrentRPS() float64 {
var globalRPSTracker *RPSTracker
// InitializeRPSTracker initializes the global RPS tracker
func InitializeRPSTracker() *RPSTracker {
// InitializeRPSTracker initializes the global RPS tracker with context for graceful shutdown
func InitializeRPSTracker(ctx context.Context) *RPSTracker {
if globalRPSTracker == nil {
globalRPSTracker = NewRPSTracker()
globalRPSTracker = NewRPSTracker(ctx)
}
return globalRPSTracker
}
+16 -2
View File
@@ -272,6 +272,17 @@ func processGraphQLRequest(c *fiber.Ctx) error {
// Parse the GraphQL query
parsedResult := parseGraphQLQuery(c)
// Debug logging for mutation routing analysis (enabled when LOG_LEVEL=DEBUG)
if cfg.LogLevel == "DEBUG" {
var m map[string]interface{}
if err := json.Unmarshal(c.Body(), &m); err == nil {
if query, ok := m["query"].(string); ok {
debugParseGraphQLQuery(c, query)
}
}
}
if parsedResult.shouldBlock {
return c.Status(fiber.StatusForbidden).SendString("Request blocked")
}
@@ -316,8 +327,11 @@ func extractUserInfo(c *fiber.Ctx) (string, string) {
// handleCaching manages the caching logic for GraphQL requests
func handleCaching(c *fiber.Ctx, parsedResult *parseGraphQLQueryResult, userID string) (bool, error) {
// Calculate query hash for cache key
calculatedQueryHash := libpack_cache.CalculateHash(c)
// Extract user role for cache key (in addition to userID already passed)
_, userRole := extractUserInfo(c)
// Calculate query hash for cache key - now includes user context for security
calculatedQueryHash := libpack_cache.CalculateHash(c, userID, userRole)
// Set cache time from header or default
if parsedResult.cacheTime == 0 {
@@ -97,6 +97,9 @@ spec:
value: "error"
- name: HASURA_GRAPHQL_SERVER_PORT
value: "8088"
# Disable event trigger processing on read-only instance
- name: HASURA_GRAPHQL_EVENTS_FETCH_INTERVAL
value: "0"
- name: graphql-proxy
image: ghcr.io/lukaszraczylo/graphql-monitoring-proxy:latest
+3 -1
View File
@@ -44,7 +44,9 @@ type config struct {
CacheRedisEnable bool
CacheMaxMemorySize int
CacheMaxEntries int
GraphQLQueryCacheSize int // Max number of parsed GraphQL queries to cache
CacheUseLRU bool // Use LRU eviction algorithm instead of random eviction
GraphQLQueryCacheSize int // Max number of parsed GraphQL queries to cache
PerUserCacheDisabled bool // Disable per-user cache isolation (SECURITY RISK - not recommended)
}
Client struct {
GQLClient *graphql.BaseClient
+91 -2
View File
@@ -8,6 +8,7 @@ import (
"sync/atomic"
"time"
"github.com/goccy/go-json"
"github.com/gofiber/fiber/v2"
"github.com/gofiber/websocket/v2"
gorillaws "github.com/gorilla/websocket"
@@ -141,8 +142,29 @@ func (wsp *WebSocketProxy) handleConnection(ctx context.Context, clientConn *web
// Set message size limit
clientConn.SetReadLimit(wsp.maxMessageSize)
// Connect to backend WebSocket with forwarded headers
backendConn, err := wsp.dialBackend(ctx, headers)
// Read first message to extract authentication from connection_init payload
// This bridges the gap between clients that send auth in payload vs Hasura expecting it in HTTP headers
messageType, message, err := clientConn.ReadMessage()
if err != nil {
wsp.errors.Add(1)
if wsp.logger != nil {
wsp.logger.Error(&libpack_logger.LogMessage{
Message: "Failed to read first message from client",
Pairs: map[string]interface{}{
"connection_id": connectionID,
"error": err.Error(),
},
})
}
clientConn.Close()
return
}
// Try to extract headers from connection_init payload (for GraphQL WebSocket protocols)
enrichedHeaders := wsp.extractAuthFromPayload(message, headers)
// Connect to backend WebSocket with enriched headers
backendConn, err := wsp.dialBackend(ctx, enrichedHeaders)
if err != nil {
wsp.errors.Add(1)
if wsp.logger != nil {
@@ -159,6 +181,21 @@ func (wsp *WebSocketProxy) handleConnection(ctx context.Context, clientConn *web
}
defer backendConn.Close()
// Forward the first message (connection_init) to backend
if err := backendConn.WriteMessage(messageType, message); err != nil {
wsp.errors.Add(1)
if wsp.logger != nil {
wsp.logger.Error(&libpack_logger.LogMessage{
Message: "Failed to forward connection_init to backend",
Pairs: map[string]interface{}{
"connection_id": connectionID,
"error": err.Error(),
},
})
}
return
}
if wsp.logger != nil {
wsp.logger.Debug(&libpack_logger.LogMessage{
Message: "Backend WebSocket connection established",
@@ -336,6 +373,58 @@ func (wsp *WebSocketProxy) proxyBackendToClient(ctx context.Context, backend *go
}
}
// extractAuthFromPayload extracts authentication headers from GraphQL WebSocket connection_init payload
// This bridges the gap between clients sending auth in payload and Hasura expecting it in HTTP headers
func (wsp *WebSocketProxy) extractAuthFromPayload(message []byte, originalHeaders http.Header) http.Header {
// Create a copy of original headers
enrichedHeaders := make(http.Header)
for k, v := range originalHeaders {
enrichedHeaders[k] = v
}
// Try to parse as JSON to extract headers from payload
var msg map[string]interface{}
if err := json.Unmarshal(message, &msg); err != nil {
// Not JSON or parse error, return original headers
return enrichedHeaders
}
// Check if this is a connection_init message
msgType, ok := msg["type"].(string)
if !ok || (msgType != "connection_init" && msgType != "start") {
// Not a connection_init, return original headers
return enrichedHeaders
}
// Extract payload
payload, ok := msg["payload"].(map[string]interface{})
if !ok {
return enrichedHeaders
}
// Try to extract headers from payload.headers (graphql-ws format)
if payloadHeaders, ok := payload["headers"].(map[string]interface{}); ok {
for key, value := range payloadHeaders {
if strValue, ok := value.(string); ok {
enrichedHeaders.Set(key, strValue)
}
}
}
// Also check top-level payload keys that look like headers (Apollo format)
for key, value := range payload {
if strValue, ok := value.(string); ok {
// Common auth headers
if key == "Authorization" || key == "authorization" ||
key == "x-hasura-role" || key == "x-hasura-admin-secret" {
enrichedHeaders.Set(key, strValue)
}
}
}
return enrichedHeaders
}
// dialBackend establishes a WebSocket connection to the backend
func (wsp *WebSocketProxy) dialBackend(ctx context.Context, headers http.Header) (*gorillaws.Conn, error) {
// Convert http:// to ws:// or https:// to wss://