[Sandbox] Velero #457

@nikhita

Project summary

Velero provides Kubernetes-native tools to back up, restore, and migrate cluster resources and persistent volumes, operating at the Kubernetes API server layer with a pluggable architecture supporting multiple cloud providers and storage systems.

Project description

What does Velero do?

Velero is a Kubernetes-native data protection and migration system that operates at the Kubernetes API layer and the storage snapshot layer, rather than at the control-plane datastore level. It captures and persists:

  • Kubernetes API objects (namespaced and cluster-scoped), serialized from the API server
  • Persistent volume data, coordinated via CSI VolumeSnapshot APIs or provider-specific snapshot/backup integrations
  • Restore-time transformations, enabling remapping of namespaces, storage classes, and resource references
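As a rough illustration of what a restore-time transformation does, the sketch below rewrites `metadata.namespace` on backed-up objects according to a mapping. It is not Velero's implementation; the function name `remap_namespaces` and the sample objects are hypothetical, and the mapping only mirrors the idea behind a namespace mapping on a restore.

```python
import copy


def remap_namespaces(objects, namespace_mapping):
    """Return copies of the objects with metadata.namespace rewritten.

    Cluster-scoped objects (no namespace) and namespaces that are not in
    the mapping are returned unchanged.
    """
    remapped = []
    for obj in objects:
        new_obj = copy.deepcopy(obj)  # never mutate the backed-up copy
        namespace = new_obj.get("metadata", {}).get("namespace")
        if namespace in namespace_mapping:
            new_obj["metadata"]["namespace"] = namespace_mapping[namespace]
        remapped.append(new_obj)
    return remapped


# Hypothetical backup contents: one namespaced and one cluster-scoped object.
backed_up = [
    {"kind": "Deployment", "metadata": {"name": "web", "namespace": "shop"}},
    {"kind": "ClusterRole", "metadata": {"name": "reader"}},
]
restored = remap_namespaces(backed_up, {"shop": "shop-restored"})
print([o["metadata"].get("namespace") for o in restored])
```

The backed-up objects are copied rather than edited in place, so the same backup could be restored again with a different mapping.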

Velero’s plugin architecture decouples core backup/restore logic from infrastructure-specific implementations, allowing consistent behavior across cloud providers, on-prem environments, and Kubernetes distributions.

Core Technical Capabilities

  • Backup & Restore (Kubernetes resources + Persistent Volumes): Velero can back up API objects and persistent volume contents, and restore them selectively using namespace, label, and resource filters.
  • Scheduled Backups with Retention Policies: Velero supports scheduled backups using cron expressions and a configurable retention period (TTL) per backup.
  • Pre and Post-Backup Hooks: Velero can run custom commands inside containers before and after backups.
  • CSI Snapshot Support (Volume Snapshotting): Velero integrates with Kubernetes CSI snapshot APIs to take snapshots of persistent volumes.
  • CSI Snapshot Data Movement (Cross-Storage and Cross-Cluster): Velero can copy or move snapshot data between storage locations, enabling migration workflows across clouds or clusters.
  • File System Backup (Restic/Kopia-based): When snapshots aren’t available, Velero uses Restic or, in later releases, Kopia to back up volume file systems in a portable fashion.
  • Windows Workload Support: Velero supports backup and restore of Windows-based Kubernetes workloads, including persistent storage.
  • Cluster Migration Use Case (Shared Storage): Velero can assist with migrating workloads between clusters by taking backups to shared object storage and restoring in target clusters.
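To make the scheduling, filtering, and retention capabilities above more concrete, here is a minimal sketch that builds a Schedule manifest as a Python dictionary and prints it as JSON. The helper `build_schedule`, the schedule name, the `velero` namespace, and all sample values are assumptions for illustration; the field names should be checked against the API reference for the Velero version in use.

```python
import json


def build_schedule(name, cron, namespaces, match_labels, ttl):
    """Build a Schedule manifest with namespace and label filters plus a TTL."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        # Assumption: Velero is installed in a namespace called "velero".
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            "schedule": cron,  # standard five-field cron expression
            "template": {
                "includedNamespaces": namespaces,
                "labelSelector": {"matchLabels": match_labels},
                "ttl": ttl,  # how long each resulting backup is retained
            },
        },
    }


# Hypothetical policy: back up the "shop" namespace nightly, keep for 30 days.
manifest = build_schedule(
    name="nightly-shop",
    cron="0 2 * * *",
    namespaces=["shop"],
    match_labels={"app": "web"},
    ttl="720h0m0s",
)
print(json.dumps(manifest, indent=2))
```

Because the output is plain JSON, it could be kept in Git and applied like any other Kubernetes manifest.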

Why is Velero needed?

Kubernetes provides no native mechanism for:

  • Consistent, cluster-wide backup of API objects
  • Coordinated recovery of Kubernetes resources and associated persistent volumes
  • Portable restore or migration of application state across clusters

etcd snapshots alone are insufficient for application recovery because they:

  • Capture control-plane state, not workload data
  • Are tightly coupled to a specific cluster, etcd version, and Kubernetes control plane
  • Cannot be safely or portably replayed across clusters

As Kubernetes adoption expands to stateful, business-critical workloads, operators need a solution that preserves application intent and data rather than raw control-plane state. Velero addresses this requirement without requiring application changes.

Org repo URL (provide if all repos under the org are in scope of the application)

Repos are part of the https://github.com/vmware-tanzu org

If Velero is accepted as a sandbox project, all repos outlined below will move to the github.com/velero-io org

Project repo URL in scope of application

https://github.com/vmware-tanzu/velero

Additional repos in scope of the application

Website URL

https://velero.io/

Roadmap

https://github.com/vmware-tanzu/velero/wiki (linked in https://github.com/vmware-tanzu/velero/blob/main/ROADMAP.md)

Contributing guide

https://velero.io/docs/main/code-standards/ (linked in https://github.com/vmware-tanzu/velero/blob/main/CONTRIBUTING.md)

Code of Conduct (CoC)

https://github.com/vmware-tanzu/velero/blob/main/CODE_OF_CONDUCT.md (Contributor Covenant v2.0)

Adopters

https://github.com/vmware-tanzu/velero/blob/main/ADOPTERS.md

Maintainers file

https://github.com/vmware-tanzu/velero/blob/main/MAINTAINERS.md

Security policy file

https://github.com/vmware-tanzu/velero/blob/main/SECURITY.md

Standard or specification?

N/A

Business product or service to project separation

The sponsoring organization is Broadcom. Velero is consumed by Broadcom products as part of their Kubernetes and data protection workflows; however, Velero itself is not offered as a standalone Broadcom product or commercial service.

Why CNCF?

Velero seeks CNCF membership to ensure long-term sustainability and vendor-neutral governance. As a mature project with maintainers from multiple organizations (Broadcom, Red Hat, Microsoft), Velero has outgrown single-vendor stewardship and requires the governance structure that CNCF provides.

Velero's existing governance already follows CNCF principles (consensus-based decision making with supermajority voting, lazy consensus with 5-day review periods, company-consolidated voting to prevent vendor domination). The project expects to leverage CNCF's neutrality to expand the maintainer and contributor base while continuing to serve enterprise production workloads.

Benefit to the landscape

Within the CNCF landscape, Velero fills a missing layer between orchestration and storage primitives:

  • CNCF projects such as CSI define storage interfaces, but not backup or recovery workflows
  • Control-plane recovery tools focus on cluster reconstruction, not application portability
  • There is no CNCF project providing Kubernetes-native, API-driven backup and migration semantics across environments

Velero provides this missing capability by enabling application-level disaster recovery, migration, and lifecycle operations in a way that is:

  • Kubernetes-version aware
  • Storage-backend agnostic
  • Compatible with managed and self-hosted clusters

This makes Velero a foundational building block for operating stateful Kubernetes workloads at scale.

Cloud native 'fit'

Velero is fundamentally cloud native in design, operation, and philosophy:

Kubernetes-Native Architecture:

  • CRDs: All backup operations are managed via CRDs (Backup, Restore, Schedule, BackupStorageLocation, VolumeSnapshotLocation) – declarative, version-controlled and GitOps-compatible
  • Controller pattern: Server component runs as a Deployment, reconciling desired state (CRDs) with actual state via control loops
  • API server integration: Operates entirely through Kubernetes API – no direct etcd access, no out-of-band agents
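The controller pattern described above can be sketched as a single reconcile step over an in-memory object. This is a simplified illustration under stated assumptions, not Velero's controller code: `reconcile` and `run_backup` are hypothetical names, and the phase values are used here only to show how desired state is driven toward actual state.

```python
def reconcile(backup, run_backup):
    """Advance one Backup-like object a single step toward completion.

    Objects that are already past the "New" phase are left untouched, so
    calling reconcile repeatedly is safe (idempotent).
    """
    status = backup.setdefault("status", {})
    if status.get("phase", "New") != "New":
        return backup
    status["phase"] = "InProgress"
    try:
        run_backup(backup["spec"])
        status["phase"] = "Completed"
    except Exception as err:  # record the failure instead of crashing the loop
        status["phase"] = "Failed"
        status["failureReason"] = str(err)
    return backup


calls = []


def fake_run_backup(spec):
    """Stand-in for the real work; fails for an illustrative bad namespace."""
    calls.append(spec)
    if "broken" in spec.get("includedNamespaces", []):
        raise RuntimeError("namespace not found")


objects = [
    {"metadata": {"name": "ok"}, "spec": {"includedNamespaces": ["shop"]}},
    {"metadata": {"name": "bad"}, "spec": {"includedNamespaces": ["broken"]}},
]

# Two passes of the control loop; the second pass has nothing left to do.
for _ in range(2):
    for obj in objects:
        reconcile(obj, fake_run_backup)

print([(o["metadata"]["name"], o["status"]["phase"]) for o in objects])
```

A real controller would be triggered by watch events from the API server and would persist status back through the API rather than mutating a local dictionary.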

Cloud Native Design Principles:

  • Distributed and containerized: Runs as a pod, schedules init containers for hooks, deploys node-agents as DaemonSets for file-system backup
  • Declarative configuration: Backup policies defined as YAML manifests, scheduled via cron expressions, version-controlled in Git
  • Horizontal scalability: Supports multiple concurrent backup/restore operations; plugin architecture allows vendor-specific optimizations
  • Resilience and self-healing: Integrates with Prometheus for health monitoring; supports backup retry logic and partial failure recovery

Multi-Cloud and Hybrid-Cloud:

  • Provider-agnostic storage backend
  • Cross-cloud migration capability
  • On-premises support

Ecosystem Integration:

  • CSI: First-class support for CSI volume snapshots and CSI snapshot data movement
  • Prometheus: Exposes metrics for backup success/failure rates, durations, and sizes
  • Helm: Official Helm chart for declarative installation
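As an example of how the exposed metrics might be consumed, the sketch below parses a small sample in the Prometheus text exposition format and computes an overall backup success rate. The sample values are invented, and the metric names `velero_backup_success_total` and `velero_backup_failure_total` are assumptions that should be verified against the metrics endpoint of the deployed Velero version.

```python
# Invented sample scrape; real output would come from the metrics endpoint.
SAMPLE = """\
# TYPE velero_backup_success_total counter
velero_backup_success_total{schedule="nightly"} 27
velero_backup_success_total{schedule="hourly"} 90
# TYPE velero_backup_failure_total counter
velero_backup_failure_total{schedule="nightly"} 3
velero_backup_failure_total{schedule="hourly"} 0
"""


def parse_counters(text):
    """Sum sample values per metric name, ignoring labels and comments.

    Assumes each sample line is "<name>{labels} <value>" with no trailing
    timestamp, which is enough for this illustration.
    """
    counters = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name_and_labels, value = line.rsplit(" ", 1)
        name = name_and_labels.split("{", 1)[0]
        counters[name] = counters.get(name, 0.0) + float(value)
    return counters


counters = parse_counters(SAMPLE)
success = counters["velero_backup_success_total"]
failure = counters["velero_backup_failure_total"]
success_rate = success / (success + failure)
print(f"backup success rate: {success_rate:.3f}")
```

In practice this calculation would more likely be written as a query or alert rule in Prometheus itself; the Python version only shows the arithmetic.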

Cloud native 'integration'

Direct Dependencies

| CNCF Project | How Velero Uses It |
| --- | --- |
| Kubernetes | Core dependency. Velero runs as a Kubernetes Deployment, uses CRDs for API, schedules operations via Jobs/Pods, watches API server for resource changes. |
| Prometheus | Velero exposes Prometheus metrics for backup/restore operations, durations, success/failure rates, and queue depths. Used for operational monitoring and alerting. |
| Helm | Official Velero Helm chart for declarative installation and configuration management. |

Complementary Integrations

| CNCF Project | Integration Type |
| --- | --- |
| Rook | Velero backs up Rook-managed Ceph clusters via CSI snapshots or file-system backup. Used together for distributed storage + data protection. |
| Longhorn | Longhorn users leverage Velero for disaster recovery and cross-cluster migration of Longhorn-backed persistent volumes. |
| cert-manager | Velero backs up cert-manager's Certificate and Secret resources, enabling recovery of TLS configurations. |

Ecosystem Position

Velero is a horizontal enabler – it doesn't compete with orchestration, networking, storage, or observability projects. Instead, it provides the data protection layer that every production Kubernetes deployment requires, regardless of which CNI, CSI driver, or ingress controller it uses.

Cloud native overlap

Velero is complementary to, not competitive with, existing CNCF backup-related projects.

| Project | CNCF Level | Focus | How Velero Differs |
| --- | --- | --- | --- |
| Kanister | Sandbox | Application-level data operations via reusable Blueprints. Focuses on database-specific backup workflows (MongoDB, PostgreSQL, MySQL, Cassandra). | Velero operates at the cluster resource + persistent volume level via the Kubernetes API. It backs up the entire cluster state (or filtered subsets), not just individual application data. Kanister is complementary: it can handle app-specific consistency while Velero handles the broader cluster. |
| K8up | Sandbox | PVC backup via Restic with namespace-level configuration. Simpler scope focused on persistent volume content. | Velero provides full cluster resource backup (API objects + volumes), cross-provider migration, CSI snapshot integration, a plugin architecture, and enterprise-scale features like scheduled backups with retention policies and pre/post hooks. |
| Longhorn | Incubating | Cloud-native distributed block storage for Kubernetes with built-in backup. | Longhorn is a storage system with backup as a feature. Velero is a backup system that works with any storage backend via plugins. They serve different primary use cases. |

Similar projects

N/A. Comparison with CNCF backup-related projects is covered above.

Landscape

Yes, Velero is already listed on the CNCF landscape.

Trademark and accounts

  • If the project is accepted, I agree to donate all project trademarks and accounts to the CNCF

IP policy

  • If the project is accepted, I agree the project will follow the CNCF IP Policy

Will the project require a license exception?

No, Velero is licensed under the Apache License 2.0, which is already an approved CNCF license. The project does not require any license exceptions.

Project "Domain Technical Review"

Presented the intent to donate Velero, along with the draft sandbox application, to TAG Operational Resilience during the Feb 4, 2026 meeting. Meeting notes are available here.

Application contact email(s)

nikhita dot raghunath at broadcom dot com

Contributing or sponsoring entity signatory information

| Name | Country | Email address |
| --- | --- | --- |
| Ali Emadi | USA | ali dot emadi at broadcom dot com |

CNCF contacts

TOC - Davanum Srinivas (@dims)
TAG Operational Resilience TL - Nabarun Pal (@palnabarun)

Metadata

Projects

Status

🤔 In voting

Status

New - Sandbox Pending Review
