Commit Graph

19 Commits

Author SHA1 Message Date
lukaszraczylo ba0797f5af chore: remove unreachable SetupWithManager(secret-only) on SourceReconciler
All registration goes through SetupWithManagerForResourceType (eager and
lazy paths in main.go); the legacy single-resource SetupWithManager and
its corev1 import were dead code that misled readers about how the
controller wires up.
2026-05-02 22:50:02 +01:00
lukaszraczylo 75f7c18f3c fix: hash drift, transformer leak guard, prod logger, ctx-aware wait
M7: extractUnstructuredContent only hashed 'spec' when present, dropping
all other top-level content fields. Resources with both spec and data
(or any non-spec content) silently drifted until the next 10m resync.
Now hashes every non-Kubernetes-managed top-level field, matching the
fields updateUnstructuredMirror copies.

M6: when a source has a transform annotation, also hash the source's
labels and annotations (filtered of kubemirror.raczylo.com/* keys to
avoid the controller's own bookkeeping churning the hash). Templates
read these via TransformContext; without this a label change wouldn't
re-render the transformed mirror.

H3: text/template.Execute is not context-aware, so applyTemplateRule's
timeout cancels the select but leaks the executor goroutine. Added a
process-wide semaphore (cap 64) so a runaway template can't spawn an
unbounded number of stuck goroutines on every reconcile.

M4: zap dev mode (DPanic-on-error, console output, stacktraces on
warning) was hardcoded on. Defaulted to production; --zap-devel flag
remains for opt-in.

M5: WaitForInitialDiscovery was anchored on context.Background() with
its own WithTimeout, so SIGTERM during startup couldn't abort the wait.
Now anchors on signalCtx.
2026-05-02 22:49:15 +01:00
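
The H3 item above (a timeout that cancels the wait but cannot stop `text/template` execution) can be sketched roughly as follows. This is a minimal illustration, not the project's `applyTemplateRule`: `renderWithTimeout`, `demoRender`, `execSlots` and `errTooManyExecutions` are hypothetical names. The cap of 64 is taken from the commit message. Refusing work when every slot is busy, rather than blocking, is an assumption.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"text/template"
	"time"
)

// execSlots bounds how many template executions may be in flight at once.
// The cap of 64 comes from the commit message; the name is hypothetical.
var execSlots = make(chan struct{}, 64)

// errTooManyExecutions is a hypothetical sentinel for a saturated semaphore.
var errTooManyExecutions = errors.New("template executor limit reached")

// renderWithTimeout runs tmpl.Execute in a goroutine and waits at most timeout.
// Execute cannot be cancelled, so a slow template keeps its slot until it
// really returns; that is what bounds the number of stuck goroutines.
func renderWithTimeout(ctx context.Context, tmpl *template.Template, data any, timeout time.Duration) (string, error) {
	// Acquire a slot without blocking (assumption: refuse rather than queue).
	select {
	case execSlots <- struct{}{}:
	default:
		return "", errTooManyExecutions
	}

	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	type result struct {
		out string
		err error
	}
	// Buffered so a goroutine that finishes after the timeout never blocks.
	done := make(chan result, 1)

	go func() {
		defer func() { <-execSlots }() // release the slot only when Execute returns
		var buf bytes.Buffer
		err := tmpl.Execute(&buf, data)
		done <- result{out: buf.String(), err: err}
	}()

	select {
	case r := <-done:
		return r.out, r.err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// demoRender parses text and renders it with a one-second timeout.
func demoRender(text string, data any) (string, error) {
	tmpl, err := template.New("demo").Parse(text)
	if err != nil {
		return "", err
	}
	return renderWithTimeout(context.Background(), tmpl, data, time.Second)
}

func main() {
	out, err := demoRender("hello {{.Name}}", map[string]string{"Name": "mirror"})
	fmt.Println(out, err)
}
```

The slot is released inside the goroutine rather than by the caller, so a timed-out caller returns promptly while the leaked executor still counts against the cap.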
lukaszraczylo cf095e93f4 fix(dynamic-manager): release mu before invoking registration
H2: scanAndRegister held d.mu (write lock) across registerController
and registerMirrorControllerOnly. Those calls enter controller-runtime's
manager state machine, which takes its own internal locks and can block
on cache sync — holding our application-level write lock across them
is a latent deadlock the moment any reentrant access happens (health
checks reading GetRegisteredCount, factories that introspect state).

Restructured into three phases: snapshot work under RLock, perform
registrations with NO lock held, then commit results under Lock.
Registration step routed through funcs to keep tests honest about
the lock state at the moment of invocation.
2026-05-02 22:45:27 +01:00
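
The three-phase restructuring described for H2 could look something like the sketch below. The `registry` type and its fields are hypothetical stand-ins for the dynamic manager; only the shape (snapshot under RLock, call out with no lock, commit under Lock) follows the commit message. Injecting the registration step as a function mirrors the "routed through funcs" remark and lets a test probe the lock state at the moment of invocation.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for the dynamic manager.
type registry struct {
	mu         sync.RWMutex
	registered map[string]bool
	// register is injected so tests can observe the lock state when it runs.
	register func(kind string) error
}

// scanAndRegister registers every candidate kind that is not yet registered.
func (r *registry) scanAndRegister(candidates []string) []error {
	// Phase 1: snapshot the pending work under the read lock.
	r.mu.RLock()
	var pending []string
	for _, kind := range candidates {
		if !r.registered[kind] {
			pending = append(pending, kind)
		}
	}
	r.mu.RUnlock()

	// Phase 2: call out with no lock held, so the callee may block or
	// re-enter the registry (for example through count) without deadlocking.
	var succeeded []string
	var errs []error
	for _, kind := range pending {
		if err := r.register(kind); err != nil {
			errs = append(errs, fmt.Errorf("register %s: %w", kind, err))
			continue
		}
		succeeded = append(succeeded, kind)
	}

	// Phase 3: commit the results under the write lock.
	r.mu.Lock()
	for _, kind := range succeeded {
		r.registered[kind] = true
	}
	r.mu.Unlock()
	return errs
}

// count is a reader of the kind the commit message mentions (health checks).
func (r *registry) count() int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.registered)
}

func main() {
	r := &registry{
		registered: map[string]bool{},
		register:   func(string) error { return nil },
	}
	errs := r.scanAndRegister([]string{"Secret", "ConfigMap"})
	fmt.Println(r.count(), len(errs))
}
```

One consequence of this shape, under the assumptions of the sketch: two concurrent scans can both see the same kind as pending, so the registration function has to tolerate being called twice for one kind.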
lukaszraczylo a8e48a9eb6 fix(controller): forward APIReader+CircuitBreaker through NamespaceReconciler
H4: NamespaceReconciler.reconcileMirror builds an ad-hoc SourceReconciler
to delegate mirror creation. The previous version left APIReader and
CircuitBreaker as nil, which silently disabled freshness verification
on the namespace-driven path (a label change to a target namespace
would mirror cached, possibly stale source data) and bypassed circuit
breaker accounting for those reconciles.

Construction extracted into newSourceReconciler so the forwarding is
covered by a unit test that pins both fields by identity.
2026-05-02 22:41:11 +01:00
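
As a rough illustration of the H4 fix, the sketch below extracts construction into one function and checks forwarding by identity. All types here are simplified, hypothetical stand-ins; the real reconcilers carry more fields and use controller-runtime interfaces.

```go
package main

import "fmt"

// apiReader and circuitBreaker are simplified stand-ins for the real dependencies.
type apiReader interface {
	Get(key string) (string, error)
}

type circuitBreaker struct{ failures int }

type sourceReconciler struct {
	APIReader      apiReader
	CircuitBreaker *circuitBreaker
}

type namespaceReconciler struct {
	APIReader      apiReader
	CircuitBreaker *circuitBreaker
}

// newSourceReconciler is the single place where the delegate is built, so a
// dependency cannot be silently left nil at one call site.
func (n *namespaceReconciler) newSourceReconciler() *sourceReconciler {
	return &sourceReconciler{
		APIReader:      n.APIReader,
		CircuitBreaker: n.CircuitBreaker,
	}
}

// fakeReader is a test double; the field keeps the struct non-zero-sized so
// pointer identity is meaningful.
type fakeReader struct{ value string }

func (f *fakeReader) Get(string) (string, error) { return f.value, nil }

func main() {
	reader := &fakeReader{value: "fresh"}
	breaker := &circuitBreaker{}
	n := &namespaceReconciler{APIReader: reader, CircuitBreaker: breaker}
	s := n.newSourceReconciler()
	fmt.Println(s.APIReader == apiReader(reader), s.CircuitBreaker == breaker)
}
```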
lukaszraczylo dfe08b35d1 fix(controller): stop self-triggered reconcile loops
C2: updateLastSyncStatus wrote the sync-status annotation on every
successful reconcile. Because the source's watch predicate is the
'enabled' label (server-side filter), that Update fires a watch event
that re-enters Reconcile. With reconciled/error counts varying across
cycles, the value differs each time, so the API server bumps RV and
the loop never quiesces. Now skips the Update when the value matches
the existing annotation.

C3: NamespaceReconciler's happy-path returned RequeueAfter=3s
unconditionally. Every namespace in the cluster re-reconciled every
3 seconds forever, generating constant List calls per source kind.
Now returns ctrl.Result{}; cache-staleness windows are handled by
the manager's resync period and source freshness verification.
2026-05-02 22:39:09 +01:00
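
The C2 guard (skip the Update when the annotation already holds the value) reduces to a small comparison. The sketch below works on a plain map so it stays self-contained; `setAnnotationIfChanged` and `syncStatusKey` are hypothetical, and the real code would go on to call the Kubernetes client only when the second return value is true.

```go
package main

import "fmt"

// syncStatusKey is a hypothetical annotation key used only in this sketch.
const syncStatusKey = "example.com/sync-status"

// setAnnotationIfChanged writes value under key and reports whether anything
// changed. A false result means the caller can skip the Update, which avoids
// the watch event that would re-enter Reconcile.
func setAnnotationIfChanged(annotations map[string]string, key, value string) (map[string]string, bool) {
	if current, ok := annotations[key]; ok && current == value {
		return annotations, false
	}
	if annotations == nil {
		annotations = map[string]string{}
	}
	annotations[key] = value
	return annotations, true
}

func main() {
	ann, changed := setAnnotationIfChanged(nil, syncStatusKey, "reconciled=3,errors=0")
	fmt.Println(changed, ann[syncStatusKey])

	_, changed = setAnnotationIfChanged(ann, syncStatusKey, "reconciled=3,errors=0")
	fmt.Println(changed)
}
```

The guard only helps if the value is stable across quiet cycles, so anything that varies on every reconcile (a timestamp, for instance) would defeat it.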
lukaszraczylo 99c0eccd53 fix: default verify-source-freshness=true; honor opt-out for glob
H1: --verify-source-freshness used to default to false, so any source
update whose annotation was still in the informer cache (5-20s lag)
would resolve the wrong target list. cleanupOrphanedMirrors then ran
against the stale list and missed orphans (manifested in e2e as
'Orphaned mirror in kubemirror-e2e-app-1 not deleted within timeout'
after target-namespaces was changed). Defaulting to true fixes the
race; the trade-off is one extra API read per stale-cache reconcile.

M2: ResolveTargetNamespaces glob branch checked filter.IsAllowed but
not the opt-out map, so a namespace labeled allow-mirrors=false would
still receive a mirror through patterns like 'app-*'. The 'all' branch
already had the guard; the glob branch now does too. Direct namespace
listings still bypass opt-out by design (explicit author intent).
2026-05-02 22:36:50 +01:00
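
The M2 behaviour (glob matches must pass both the allow filter and the opt-out map) might be sketched as below. `resolveGlobTargets` is a hypothetical function, not the project's `ResolveTargetNamespaces`, and using `path.Match` for glob semantics is an assumption; the project may use a different matcher.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// resolveGlobTargets returns the namespaces that match pattern, pass the
// allow filter, and have not opted out. All names here are hypothetical.
func resolveGlobTargets(pattern string, namespaces []string, allowed func(string) bool, optedOut map[string]bool) ([]string, error) {
	var targets []string
	for _, ns := range namespaces {
		matched, err := path.Match(pattern, ns)
		if err != nil {
			return nil, fmt.Errorf("invalid pattern %q: %w", pattern, err)
		}
		// The opt-out check is the guard the glob branch was missing.
		if !matched || !allowed(ns) || optedOut[ns] {
			continue
		}
		targets = append(targets, ns)
	}
	sort.Strings(targets)
	return targets, nil
}

func main() {
	namespaces := []string{"app-2", "db-1", "app-1"}
	optedOut := map[string]bool{"app-2": true} // e.g. labeled allow-mirrors=false
	allowAll := func(string) bool { return true }

	targets, err := resolveGlobTargets("app-*", namespaces, allowAll, optedOut)
	fmt.Println(targets, err)
}
```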
lukaszraczylo 4277c8ac39 fix(controller): guard mirror deletion + enforce secret blacklist
C1: deleteAllMirrors used to issue a blind Delete on every namespace
matching the source name+GVK, which would destroy unrelated resources
(e.g. a 'default' SA, 'ca-bundle' ConfigMap) sharing the source name.
Now reads each candidate, verifies managed-by label and source-reference
annotation, and only deletes confirmed mirrors.

M1: BlacklistedSecretTypes was declared but never enforced. Enabling
mirroring on a service-account-token / bootstrap-token / helm release
Secret would mirror credentials cluster-wide. Now refused at Reconcile.

M3: deleteAllMirrors swallowed per-namespace errors and returned nil,
so callers removed the finalizer even on partial failure (orphans).
Errors are now joined and returned.
2026-05-02 22:35:40 +01:00
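
C1 and M3 together describe a delete loop that verifies ownership before deleting and reports every failure. The sketch below models that against plain structs rather than the Kubernetes client; the `object` type, the label and annotation keys, `managerName`, and `errDeleteFailed` are all hypothetical. `errors.Join` (Go 1.20+) returns nil when there is nothing to report.

```go
package main

import (
	"errors"
	"fmt"
)

// object is a minimal stand-in for a fetched Kubernetes resource.
type object struct {
	Namespace   string
	Labels      map[string]string
	Annotations map[string]string
}

// Hypothetical keys and values; the real project defines its own.
const (
	managedByLabel      = "app.kubernetes.io/managed-by"
	managerName         = "kubemirror"
	sourceRefAnnotation = "example.com/source-reference"
)

// errDeleteFailed is a hypothetical sentinel used by the demo and the test.
var errDeleteFailed = errors.New("delete failed")

// isConfirmedMirror requires both ownership markers before a delete is allowed.
func isConfirmedMirror(o object, sourceRef string) bool {
	return o.Labels[managedByLabel] == managerName &&
		o.Annotations[sourceRefAnnotation] == sourceRef
}

// deleteMirrors deletes only confirmed mirrors and joins per-namespace errors,
// so a caller can keep the finalizer in place after a partial failure.
func deleteMirrors(candidates []object, sourceRef string, del func(object) error) error {
	var errs []error
	for _, c := range candidates {
		if !isConfirmedMirror(c, sourceRef) {
			continue // same name and kind, but not ours: leave it alone
		}
		if err := del(c); err != nil {
			errs = append(errs, fmt.Errorf("delete mirror in %s: %w", c.Namespace, err))
		}
	}
	return errors.Join(errs...)
}

func main() {
	mirror := func(ns string) object {
		return object{
			Namespace:   ns,
			Labels:      map[string]string{managedByLabel: managerName},
			Annotations: map[string]string{sourceRefAnnotation: "prod/ca-bundle"},
		}
	}
	candidates := []object{mirror("a"), {Namespace: "b"}, mirror("c")}

	err := deleteMirrors(candidates, "prod/ca-bundle", func(o object) error {
		if o.Namespace == "c" {
			return errDeleteFailed
		}
		return nil
	})
	fmt.Println(err)
}
```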
lukaszraczylo 096dca47d1 improvements jan2025 (#6)
* feat(controller): add lazy watcher, improve resource usage and add pattern validation

- [x] Add cache sync health check for readiness probe verification
- [x] Create namespace lister with API reader support for fresh label queries
- [x] Add pattern validation with warning logs for invalid glob patterns
- [x] Implement lazy watcher initialization mode to scan for active resources
- [x] Add requeue delay to namespace reconciler for cache settlement
- [x] Replace custom containsString with slices.Contains from stdlib
- [x] Add structured logging context to reconcilers (kind, group, version)
- [x] Improve error variable naming for clarity in nested conditions
- [x] Add nil-safe label access in namespace reconciler setup
- [x] Add APIReader to namespace and source reconcilers for fresh data
- [x] Improve type assertions with proper error handling in mirror operations
- [x] Reorder struct fields for consistency and readability
- [x] Add comprehensive pattern validation tests and validation API

* feat(controller): add lazy watcher, improve resource usage and add pattern validation

- [x] Add circuit breaker for reconciliation failure tracking and prevention
- [x] Implement granular registration state tracking (not-registered, source-only, fully-registered)
- [x] Add lazy controller initialization for active resource types only
- [x] Consolidate namespace listing into single API call for efficiency
- [x] Add mirror creation verification to catch webhook rejections
- [x] Implement high-cardinality resource detection and warnings
- [x] Add source deletion check in mirror reconciler to prevent races
- [x] Preserve transformation annotations on errors in mirror reconciliation
- [x] Expand constants documentation with labels vs annotations design rationale
- [x] Add comprehensive test coverage for circuit breaker and registration states
- [x] Add mutation-safety tests for hash computation

* fixup! feat(controller): add lazy watcher, improve resource usage and add pattern validation
2026-01-14 13:07:11 +00:00
lukaszraczylo 19e72e136a Add lazy watcher, improving resource usage; update website. 2025-12-27 01:28:46 +00:00
lukaszraczylo 1d49573fd1 Fix the last tests 2025-12-26 17:44:57 +00:00
lukaszraczylo 2f5faddf04 Fix transformer handling logic and improve content hashing 2025-12-26 17:39:33 +00:00
lukaszraczylo c8ebfe376b Reliability improvements. 2025-12-26 17:30:13 +00:00
lukaszraczylo ceff0ed67f CRD discovery, log noise reduction, e2e tests 2025-12-26 15:25:25 +00:00
lukaszraczylo e822eb3e17 Complement the reconciliation on annotation change with tests. 2025-12-26 01:42:16 +00:00
lukaszraczylo c6bdc1f559 Remove targets if annotations on source have changed. 2025-12-26 01:35:46 +00:00
lukaszraczylo 2dd34bf39e fix: Mirrored resources managed by other operators. 2025-12-26 01:02:55 +00:00
lukaszraczylo ca0cff3be3 fixup! Utilise shared workflows. 2025-12-25 23:20:03 +00:00
lukaszraczylo 3e872dfdeb Preparation for release. 2025-12-25 23:11:32 +00:00
lukaszraczylo 8adb52608f initial commit 2025-12-25 22:10:57 +00:00