Kubernetes operator for cost optimization -- automatically scales down workloads during off-hours and detects idle resources.
Sleep Schedules -- scale down workloads on a time-based schedule:
- Scales Deployments and StatefulSets to zero replicas
- Scales Prometheus Operator CRDs (ThanosRuler, Alertmanager, Prometheus) to zero replicas
- Suspends MariaDB Operator CRDs (MariaDB, MaxScale) via `spec.suspend`
- Suspends CronJobs, FluxCD HelmReleases, and Kustomizations
- Hibernates CNPG PostgreSQL clusters
- Timezone-aware scheduling with day-of-week filters
- Overnight schedule support (e.g., 22:00-06:00)
- Label selectors and name-based matching (with wildcards)
- State preservation -- original replica counts, suspend states, and hibernation annotations are stored for restoration
Idle Detection -- detect and optionally scale down underutilized workloads:
- Monitors CPU and memory usage against configurable thresholds
- Configurable idle duration before action
- Three modes: `alert` (report only), `scale` (auto-scale to zero), or `resize` (in-place pod right-sizing via K8s 1.33+)
- Supports Deployments, StatefulSets, and CronJobs
- `matchNames` wildcard selector (e.g., `prod-*`)
- Finalizer ensures workloads are restored on detector deletion
- Tracks original state for safe restoration
Note: The idle detector requires metrics-server in the cluster. Without it, the operator runs in degraded mode (always returns not-idle).
```sh
helm install slumlord oci://ghcr.io/cschockaert/charts/slumlord --version 2.13.0
```

```sh
make install  # Install CRDs
make run      # Run locally against current kubeconfig
```
```yaml
apiVersion: slumlord.io/v1alpha1
kind: SlumlordSleepSchedule
metadata:
  name: nightly-sleep
spec:
  selector:
    matchLabels:
      slumlord.io/managed: "true"
    types:
      - Deployment
      - StatefulSet
  schedule:
    start: "22:00"
    end: "06:00"
    timezone: Europe/Paris
    days: [1, 2, 3, 4, 5]
```
```yaml
apiVersion: slumlord.io/v1alpha1
kind: SlumlordSleepSchedule
metadata:
  name: weekend-stop
spec:
  selector:
    matchLabels:
      slumlord.io/managed: "true"
    types:
      - Deployment
      - StatefulSet
      - CronJob
  schedule:
    start: "00:00"
    end: "23:59"
    timezone: Europe/Paris
    days: [0, 6]
```
```yaml
apiVersion: slumlord.io/v1alpha1
kind: SlumlordSleepSchedule
metadata:
  name: pg-hibernate
spec:
  selector:
    matchLabels:
      slumlord.io/managed: "true"
    types:
      - Cluster
  schedule:
    start: "20:00"
    end: "07:00"
    timezone: Europe/Paris
    days: [1, 2, 3, 4, 5]
```
```yaml
apiVersion: slumlord.io/v1alpha1
kind: SlumlordSleepSchedule
metadata:
  name: flux-suspend
spec:
  selector:
    matchLabels:
      slumlord.io/managed: "true"
    types:
      - HelmRelease
      - Kustomization
  schedule:
    start: "21:55"
    end: "06:05"
    timezone: Europe/Paris
    days: [1, 2, 3, 4, 5]
```
Important: When managing FluxCD resources alongside Deployments/StatefulSets, use a wider sleep window for the FluxCD schedule. Suspend Flux reconciliation before scaling workloads, and resume it after restoring them. This prevents Flux from restoring scaled-down workloads during the sleep window.
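One way to derive the wider window is to pad the workload schedule by a few minutes on each side — a 5-minute pad turns 22:00-06:00 into the 21:55-06:05 used above. A minimal sketch (the helper is illustrative, not part of the operator):

```python
from datetime import datetime, timedelta

def pad_window(start: str, end: str, pad_minutes: int = 5) -> tuple[str, str]:
    """Widen an HH:MM sleep window so a controller's schedule brackets its workloads'."""
    fmt = "%H:%M"
    s = datetime.strptime(start, fmt) - timedelta(minutes=pad_minutes)
    e = datetime.strptime(end, fmt) + timedelta(minutes=pad_minutes)
    return s.strftime(fmt), e.strftime(fmt)

print(pad_window("22:00", "06:00"))
```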
```yaml
apiVersion: slumlord.io/v1alpha1
kind: SlumlordSleepSchedule
metadata:
  name: monitoring-sleep
spec:
  selector:
    matchLabels:
      slumlord.io/managed: "true"
    types:
      - Prometheus
      - Alertmanager
      - ThanosRuler
  schedule:
    start: "20:00"
    end: "07:00"
    timezone: Europe/Paris
    days: [1, 2, 3, 4, 5]
```
Note: The Prometheus Operator itself should NOT be scaled down -- only its managed CRs. The operator must be running to reconcile the CRs back up on wake.
```yaml
apiVersion: slumlord.io/v1alpha1
kind: SlumlordSleepSchedule
metadata:
  name: mariadb-suspend
spec:
  selector:
    matchLabels:
      slumlord.io/managed: "true"
    types:
      - MariaDB
      - MaxScale
  schedule:
    start: "21:55"
    end: "06:05"
    timezone: Europe/Paris
    days: [1, 2, 3, 4, 5]
```
Important: Like FluxCD, the MariaDB Operator's `spec.suspend` pauses its reconciliation loop. Suspend the operator's CRs before scaling down the underlying workloads, and resume them after restoring. This prevents the operator from recreating resources during the sleep window.
```yaml
apiVersion: slumlord.io/v1alpha1
kind: SlumlordIdleDetector
metadata:
  name: idle-alert
spec:
  selector:
    matchLabels:
      slumlord.io/managed: "true"
    types:
      - Deployment
      - StatefulSet
  thresholds:
    cpuPercent: 5
    memoryPercent: 10
  idleDuration: "1h"
  action: alert
```
```yaml
apiVersion: slumlord.io/v1alpha1
kind: SlumlordIdleDetector
metadata:
  name: idle-scaler
spec:
  selector:
    matchNames:
      - "dev-*"
      - "staging-*"
    types:
      - Deployment
  thresholds:
    cpuPercent: 3
    memoryPercent: 5
  idleDuration: "2h"
  action: scale
```
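The `matchNames` patterns behave like shell-style globs. A minimal sketch of the matching logic (illustrative; the operator's exact wildcard semantics may differ):

```python
from fnmatch import fnmatchcase

def matches(name: str, patterns: list[str]) -> bool:
    """True if the workload name matches any matchNames pattern (shell-style globs)."""
    return any(fnmatchcase(name, p) for p in patterns)

print(matches("dev-api", ["dev-*", "staging-*"]))   # matched by dev-*
print(matches("prod-api", ["dev-*", "staging-*"]))  # no pattern matches
```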
```yaml
apiVersion: slumlord.io/v1alpha1
kind: SlumlordIdleDetector
metadata:
  name: idle-resizer
spec:
  selector:
    matchLabels:
      slumlord.io/managed: "true"
    types:
      - Deployment
      - StatefulSet
  thresholds:
    cpuPercent: 10
    memoryPercent: 15
  idleDuration: "1h"
  action: resize
  reconcileInterval: 10m
  resize:
    bufferPercent: 30
    minRequests:
      cpu: "50m"
      memory: "64Mi"
```
Note: The `resize` action requires Kubernetes 1.33+ with the InPlacePodVerticalScaling feature gate enabled. It patches pod resource requests in place without restarting pods.
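The right-sizing arithmetic can be sketched as observed usage plus `bufferPercent` headroom, floored at `minRequests`. This is an illustrative model in integer millicores; the operator works with Kubernetes resource quantities.

```python
def right_size(usage_millicores: int, buffer_percent: int, min_millicores: int) -> int:
    """New CPU request = observed usage + headroom, never below the configured floor."""
    proposed = usage_millicores * (100 + buffer_percent) // 100
    return max(proposed, min_millicores)

# 120m observed usage with 30% headroom -> 156m; 10m usage is floored at 50m
print(right_size(120, 30, 50))
print(right_size(10, 30, 50))
```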
| Field | Type | Required | Description |
|---|---|---|---|
| `spec.selector.matchLabels` | `map[string]string` | No | Label selector for target workloads |
| `spec.selector.matchNames` | `[]string` | No | Name patterns (supports wildcards) |
| `spec.selector.types` | `[]string` | No | Workload types to manage. Valid: Deployment, StatefulSet, CronJob, Cluster, HelmRelease, Kustomization, ThanosRuler, Alertmanager, Prometheus, MariaDB, MaxScale. Default: all types |
| `spec.schedule.start` | `string` | Yes | Sleep start time in HH:MM format |
| `spec.schedule.end` | `string` | Yes | Wake time in HH:MM format |
| `spec.schedule.timezone` | `string` | No | IANA timezone (e.g., Europe/Paris). Default: UTC |
| `spec.schedule.days` | `[]int` | No | Days of week (0=Sunday, 6=Saturday). Default: every day |
| `spec.suspend` | `bool` | No | Pause the schedule; sleeping workloads are woken up. Default: false |
| `spec.reconcileInterval` | `duration` | No | Override the reconcile interval (e.g., 2m, 10m). Default: 5m |
| Field | Type | Required | Description |
|---|---|---|---|
| `spec.selector.matchLabels` | `map[string]string` | No | Label selector for target workloads |
| `spec.selector.matchNames` | `[]string` | No | Name patterns with wildcard support (e.g., `prod-*`) |
| `spec.selector.types` | `[]string` | No | Workload types: Deployment, StatefulSet, CronJob. Default: all |
| `spec.thresholds.cpuPercent` | `int32` | No | CPU usage % threshold (0-100). Below = idle |
| `spec.thresholds.memoryPercent` | `int32` | No | Memory usage % threshold (0-100). Below = idle |
| `spec.idleDuration` | `string` | Yes | How long a workload must be idle before action (e.g., 30m, 1h) |
| `spec.action` | `string` | Yes | `alert` (report only), `scale` (auto-scale to zero), or `resize` (in-place right-sizing) |
| `spec.reconcileInterval` | `duration` | No | Override the reconcile interval (e.g., 5m, 10m). Default: 5m |
| `spec.resize.bufferPercent` | `int32` | No | Headroom % above actual usage for resize. Default: 25 |
| `spec.resize.minRequests.cpu` | `quantity` | No | Minimum CPU request floor. Default: 50m |
| `spec.resize.minRequests.memory` | `quantity` | No | Minimum memory request floor. Default: 64Mi |
```sh
kubectl get slumlordsleepschedules -A
```

```
NAMESPACE   NAME            SLEEPING   START   END     DAYS      AGE
default     nightly-sleep   false      22:00   06:00   Mon-Fri   19h
```

```sh
kubectl get slumlordidledetectors -A
```

```
NAMESPACE   NAME         ACTION   IDLE DURATION   LAST CHECK             AGE
default     idle-alert   alert    1h              2026-02-08T12:00:00Z   1d
```
The operator runs a reconciliation loop (default: every 5 minutes, configurable) for each SlumlordSleepSchedule resource:
- Checks if the current time (in the configured timezone) falls within the sleep window
- On sleep: scales Deployments/StatefulSets to 0, scales Prometheus Operator CRDs (ThanosRuler, Alertmanager, Prometheus) to 0, suspends CronJobs, hibernates CNPG clusters, suspends FluxCD HelmReleases/Kustomizations, and suspends MariaDB Operator CRDs (MariaDB, MaxScale)
- On wake: restores all workloads to their original state
- Original state (replica counts, suspend flags, hibernation annotations) is stored in `status.managedWorkloads` to survive operator restarts
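The save/restore cycle can be modeled as follows — an illustrative sketch using plain replica counts; in practice the operator persists this state in `status.managedWorkloads`:

```python
def sleep(workloads: dict[str, int]) -> dict[str, int]:
    """Scale everything to zero, remembering original replica counts."""
    saved = dict(workloads)        # snapshot taken before scaling down
    for name in workloads:
        workloads[name] = 0
    return saved

def wake(workloads: dict[str, int], saved: dict[str, int]) -> None:
    """Restore each workload to its recorded replica count."""
    for name, replicas in saved.items():
        workloads[name] = replicas

w = {"web": 3, "worker": 2}
saved = sleep(w)   # w is now {"web": 0, "worker": 0}
wake(w, saved)
print(w)
```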
The idle detector reconciles periodically (default: every 5 minutes, configurable) for each SlumlordIdleDetector resource:
- Lists workloads matching the selector (labels and/or name patterns)
- Checks resource usage against configured thresholds
- Tracks how long each workload has been continuously idle
- In `alert` mode: reports idle workloads in `status.idleWorkloads`
- In `scale` mode: scales down workloads idle longer than `idleDuration` and stores original state in `status.scaledWorkloads`
- On detector deletion: restores all scaled workloads via finalizer
The BinPacker reconciler performs cluster-wide list operations each cycle (Nodes, Pods, ReplicaSets, PDBs, Deployments, StatefulSets). These are served from the controller-runtime informer cache, not direct API server calls. On very large clusters (thousands of Deployments/Pods), consider scoping with nodeSelector and namespaces to reduce the working set.
```mermaid
graph LR
    A[SlumlordSleepSchedule] --> B[Sleep Controller]
    B --> C[Deployments]
    B --> D[StatefulSets]
    B --> E[CronJobs]
    B --> F[CNPG Clusters]
    B --> G[HelmReleases]
    B --> H[Kustomizations]
    B --> K[ThanosRulers]
    B --> L[Alertmanagers]
    B --> M[Prometheuses]
    B --> N[MariaDBs]
    B --> O[MaxScales]
    I[SlumlordIdleDetector] --> J[Idle Controller]
    J --> C
    J --> D
    J --> E
```
Each controller has a default reconcile interval that can be overridden globally via CLI flags or per-resource via `spec.reconcileInterval`:

| Controller | Default | CLI Flag | Per-resource field |
|---|---|---|---|
| SleepSchedule | 5m | `--sleep-reconcile-interval` | `spec.reconcileInterval` |
| IdleDetector | 5m30s | `--idle-reconcile-interval` | `spec.reconcileInterval` |
| BinPacker | 6m | `--binpacker-reconcile-interval` | `spec.reconcileInterval` |
| NodeDrainPolicy | 6m30s | `--nodedrain-reconcile-interval` | `spec.reconcileInterval` |
Per-resource overrides take priority over global CLI flags, which take priority over built-in defaults.
Helm values:
```yaml
reconcileIntervals:
  sleepSchedule: "10m"  # Reduce API server load in large clusters
  idleDetector: "10m"
  binPacker: "10m"
  nodeDrain: "10m"
```
Recommendations:
- Dev/staging: use longer intervals (10m+) to reduce load
- Production: use shorter intervals (2-5m) for faster response
- Large clusters: combine longer intervals with scoped selectors (`nodeSelector`, `namespaces`)
- Short intervals (< 1m) increase API server pressure with minimal benefit
Important: Remove Slumlord resources before uninstalling to restore workloads.
```sh
# Restore all workloads by deleting resources first
kubectl delete slumlordidledetectors --all -A
kubectl delete slumlordsleepschedules --all -A

# Then uninstall
helm uninstall slumlord
```
```sh
make generate   # Regenerate DeepCopy methods
make manifests  # Regenerate CRD manifests
make test       # Run tests
make lint       # Lint code
make build      # Build binary
```
Apache 2.0