lukaszraczylo 2b36071647 Multiple fixes (#29)
* Multiple fixes

- add goreleaser to the build / release process
- add kubectl plugin for job graphs visualization
- add installation scripts
- update dependencies

* Update the release & CRD content.

* Next set of improvements.

  Code Quality

  - Label constants: Added LabelWorkflowName, LabelGroupName, LabelJobName, LabelJobID in controllers/definitions.go
  - Removed commented debug code: Cleaned up dead code from multiple files
  - Removed unused dependencyTree field: Cleaned connPackage struct
  - Fixed snake_case variables: Changed to camelCase (runGroup, groupDep, runJob, jobDep, k8sJob)

  Kubernetes Best Practices

  - Finalizers: Implemented handleDeletion() and deleteChildJobs() for proper cleanup
  - Status enum validation: Added +kubebuilder:validation:Enum=pending;running;succeeded;failed;aborted
  - ImagePullPolicy default: Created getImagePullPolicy() helper that defaults to IfNotPresent
  - Resource limits support: Added Resources *corev1.ResourceRequirements to ManagedJobParameters

  Observability

  - Prometheus metrics: Created controllers/metrics.go with counters (jobs created/succeeded/failed), histogram (reconciliation duration), and gauge (active jobs)
  - Structured logging: Added logger field to connPackage, used context-based logging throughout

  Configuration

  - Leader election ID: Made configurable via --leader-election-id flag
  - Development mode: Made configurable via --dev-mode flag and LOG_LEVEL env var

  Performance

  - Dependency lookup optimization: Changed from O(n*m) to O(1) using lookup maps (jobDepMap, groupDepMap)
  - Reconciliation backoff: Added RequeueAfter: 30*time.Second when workflow is running

  Documentation & Testing

  - Godoc documentation: Added comprehensive comments to API types and controller
  - Unit tests: Added helpers_test.go with tests for all helper functions
  - Integration tests: Added managedjob_controller_test.go with Ginkgo/Gomega tests

* Add the helm chart release.

* Add reasonable test coverage.

Kubernetes Jobs Manager Operator

Description

This operator manages the lifecycle of complex workflows made up of multiple jobs. It keeps them easy to manage from a single definition, without dozens of YAML files or manual tricks to control ordering.

Getting Started

Installation with helm

helm repo add raczylo https://lukaszraczylo.github.io/helm-charts/
helm repo update raczylo
helm install jobs-manager raczylo/jobs-manager

Prerequisites for local runs

Jobs configuration

apiVersion: jobsmanager.raczylo.com/v1beta1
kind: ManagedJob
metadata:
  labels:
  name: managedjob-sample
spec:
  retries: 3
  params:
    env:
      - name: "FOO"
        value: "bar"
      - name: "QUE"
        value: "pasa"

  # Job groups definitions
  groups:
    - name: "first-group"
      parallel: true
      params:
        env:
          - name: "FEE"
            value: "bee"
      jobs:
        - name: "first-job"
          image: "busybox"
          args:
            - "echo"
            - "Hello world!"
          params:
            env:
              - name: "POO"
                value: "paz"

        - name: "second-job"
          image: "busybox"
          args:
            - "sleep"
            - "10"
        - name: "second-half-job"
          image: "busybox"
          args:
            - "sleep"
            - "10"

    - name: "second-group"
      parallel: true
      jobs:
        - name: "third-job"
          image: "busybox"
          args:
            - "echo"
            - "Hello world!"
          parallel: true

        - name: "fourth-job"
          image: "busybox"
          args:
            - "sleep"
            - "10"
          parallel: false

    - name: "third-group"
      parallel: false
      jobs:
        - name: "fifth-job"
          image: "busybox"
          args:
            - "echo"
            - "Hello world!"
          parallel: true

How does it look in practice?

managedjob-sample
├── first-group
│   ├── first-job
│   ├── second-job
│   │   └── Depends on: managedjob-sample-first-group-first-job
│   └── second-half-job
│       ├── Depends on: managedjob-sample-first-group-first-job
│       └── Depends on: managedjob-sample-first-group-second-job
├── second-group
│   ├── third-job
│   └── fourth-job
│       └── Depends on: managedjob-sample-second-group-third-job
└── third-group
    ├── fifth-job
    ├── Depends on group: first-group
    └── Depends on group: second-group

If a dependency exists at the group level, the group will not be executed until all of the groups it depends on have finished successfully. If a dependency exists at the job level, the job will not be executed until all of the jobs it depends on have finished successfully. Remember that ORDER matters: in the sample above, jobs and groups that are not marked parallel depend on the ones defined before them.
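The dependency tree above can be reproduced with a small sketch. This is not the operator's code; it assumes the rule that can be read off the sample (an item with parallel set to false waits for every item listed before it at the same level), and the Job, Group, jobDependencies and groupDependencies names are hypothetical.

```go
package main

import "fmt"

// Job and Group are hypothetical types that mirror only the fields
// used in the sample manifest; they are not the operator's API types.
type Job struct {
	Name     string
	Parallel bool
}

type Group struct {
	Name     string
	Parallel bool
	Jobs     []Job
}

// jobDependencies returns, for every non-parallel job in the group, the
// full names of the jobs it waits for. Assumed rule: a non-parallel job
// waits for every job listed before it in the same group.
func jobDependencies(workflow string, g Group) map[string][]string {
	deps := make(map[string][]string)
	var earlier []string
	for _, j := range g.Jobs {
		if !j.Parallel {
			// Copy the slice so later appends do not alias this entry.
			deps[j.Name] = append([]string(nil), earlier...)
		}
		earlier = append(earlier, fmt.Sprintf("%s-%s-%s", workflow, g.Name, j.Name))
	}
	return deps
}

// groupDependencies applies the same assumed rule one level up:
// a non-parallel group waits for every group listed before it.
func groupDependencies(groups []Group) map[string][]string {
	deps := make(map[string][]string)
	var earlier []string
	for _, g := range groups {
		if !g.Parallel {
			deps[g.Name] = append([]string(nil), earlier...)
		}
		earlier = append(earlier, g.Name)
	}
	return deps
}

func main() {
	groups := []Group{
		{Name: "first-group", Parallel: true, Jobs: []Job{
			{Name: "first-job"}, {Name: "second-job"}, {Name: "second-half-job"},
		}},
		{Name: "second-group", Parallel: true, Jobs: []Job{
			{Name: "third-job", Parallel: true}, {Name: "fourth-job"},
		}},
		{Name: "third-group", Jobs: []Job{{Name: "fifth-job", Parallel: true}}},
	}

	fmt.Println("group dependencies:", groupDependencies(groups))
	for _, g := range groups {
		fmt.Println(g.Name, "job dependencies:", jobDependencies("managedjob-sample", g))
	}
}
```

Moving a job or group to a different position in the list changes what it waits for, which is why the order of definitions matters.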

Things to remember

The params defined at each level are always merged downwards (workflow, then group, then job) to keep your definitions DRY. In the sample above, the result for the first job looks like this:

    - jobs:
      - args:
        - echo
        - Hello world!
        compiledParams:
          env:
          - name: POO
            value: paz
          - name: FEE
            value: bee
          - name: FOO
            value: bar
          - name: QUE
            value: pasa
        image: busybox
        name: first-job
        parallel: false
        status: succeeded
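The merge shown above can be illustrated with a minimal sketch. It is an illustration only, not the operator's implementation: the EnvVar type and mergeEnv function are hypothetical, and the handling of duplicate names (the most specific level wins) is an assumption that the sample does not exercise.

```go
package main

import "fmt"

// EnvVar is a hypothetical stand-in for a name/value environment entry.
type EnvVar struct {
	Name  string
	Value string
}

// mergeEnv concatenates env lists in the order given. Callers pass the
// most specific level first (job, then group, then workflow), which
// matches the order seen in compiledParams above.
// Assumption: when a name appears more than once, the first occurrence
// (the most specific level) is kept.
func mergeEnv(levels ...[]EnvVar) []EnvVar {
	seen := make(map[string]bool)
	var merged []EnvVar
	for _, level := range levels {
		for _, e := range level {
			if seen[e.Name] {
				continue
			}
			seen[e.Name] = true
			merged = append(merged, e)
		}
	}
	return merged
}

func main() {
	workflow := []EnvVar{{Name: "FOO", Value: "bar"}, {Name: "QUE", Value: "pasa"}}
	group := []EnvVar{{Name: "FEE", Value: "bee"}}
	job := []EnvVar{{Name: "POO", Value: "paz"}}

	for _, e := range mergeEnv(job, group, workflow) {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```

Run against the sample values, this prints the four variables in the same order as the compiledParams block: POO, FEE, FOO, QUE.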

Available params

Parameters are quite flexible. At every level where params are allowed, you can define:

params:
  fromEnv:
    - configMapRef:
        name: "configmap-name"
      key: "key-name"
  env:
    - name: "FOO"
      value: "bar"
  volumes:
    - name: secrets-store-api
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: api-secrets-provider
  volumeMount:
    - name: secrets-store-api
      mountPath: "/mnt/secrets-api"
      readOnly: true
  serviceAccount: "service-account-name"
  restartPolicy: "Never"
  imagePullSecrets:
    - "ghcr-token"
  imagePullPolicy:
    - "Always"
  labels:
    this/works: "true"
  annotations:
    this/works/aswell: "true"

Kustomization and references

If you run into issues with configMapGenerator or secretGenerator, add the following to your kustomization.yaml:

configurations:
  - crd-name-reference.yaml

Then create a crd-name-reference.yaml file with the following content:

---
nameReference:
  - kind: 'ConfigMap'
    fieldSpecs:
      - kind: 'ManagedJob'
        path: 'spec/params/fromEnv[]/configMapRef/name'
      - kind: 'ManagedJob'
        path: 'spec/params/env[]/configMapRef/name'

This instructs kustomize to rewrite the ConfigMap references in ManagedJob resources to the generated names when those ConfigMaps are managed by generators.

Running on the cluster

Manual installation

  1. Install instances of the custom resources:
kubectl apply -f config/samples/
  2. Build and push your image to the location specified by IMG:
make docker-build docker-push IMG=ghcr.io/lukaszraczylo/jobs-manager-operator:tag
  3. Deploy the controller to the cluster with the image specified by IMG:
make deploy IMG=ghcr.io/lukaszraczylo/jobs-manager-operator:tag

Manually uninstall CRDs

To delete the CRDs from the cluster:

make uninstall

Manually undeploy controller

To undeploy the controller from the cluster:

make undeploy

How it works

This project aims to follow the Kubernetes Operator pattern.

It uses Controllers, which provide a reconcile function responsible for synchronizing resources until the desired state is reached on the cluster.

License

Copyright 2023.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
