343 changes: 271 additions & 72 deletions docs/install.md
@@ -3,120 +3,319 @@ SPDX-FileCopyrightText: The RamenDR authors
SPDX-License-Identifier: Apache-2.0
-->

# Install
# Installing Ramen

This guide describes how to install Ramen's disaster recovery operators on your
cluster environment.

## Overview

Ramen consists of two operators that work together to provide disaster recovery
capabilities:

1. **Ramen Hub Operator** - Installed on the OCM hub cluster, orchestrates DR
operations across managed clusters
1. **Ramen DR Cluster Operator** - Installed on each OCM managed cluster,
manages local volume replication and workload protection

## Prerequisites

### OCM managed multi-cluster setup
Before installing Ramen, ensure the following requirements are met:

### 1. Kubernetes Versions

**Supported versions:**

- Kubernetes 1.30 or higher

### 2. Open Cluster Management (OCM) Setup

Ramen requires an
[OCM](https://open-cluster-management.io/docs/concepts/architecture/) hub cluster
with at least two managed clusters for disaster recovery operations.

**Requirements:**

- OCM hub cluster with `multicluster-engine`, `ocm-controller` (from
multicloud-operators-foundation), and hub addons
(`application-manager`, `governance-policy-framework`)
- OCM managed cluster addons on each managed cluster:
`application-manager`, `governance-policy-framework`, and
`config-policy-controller`
- At least 2 managed clusters registered with the hub
- All clusters must be able to communicate with each other
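
As a rough check of the "at least 2 managed clusters" requirement, the sketch below counts `ManagedCluster` resources as seen from the hub. The function name `count_managed_clusters` and the context name passed to it are illustrative and not part of Ramen or OCM.

```bash
# Hypothetical helper: count ManagedCluster resources visible from a hub context.
# $1 is the kubectl context of the OCM hub (a name supplied by the caller).
count_managed_clusters() {
    local hub_context="$1"
    # --no-headers prints one line per resource, so counting lines counts clusters.
    # If kubectl fails, nothing reaches wc and the result is 0.
    kubectl --context "$hub_context" get managedclusters --no-headers 2>/dev/null \
        | wc -l | tr -d '[:space:]'
}
```

For example, `[ "$(count_managed_clusters hub)" -ge 2 ]` succeeds when two or more clusters are registered. This only counts registrations; it does not confirm that each cluster is joined and available.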

For OCM installation instructions, see the [OCM installation guide](https://open-cluster-management.io/docs/getting-started/installation/).

### 3. Storage Replication Support

Ramen supports two disaster recovery modes, each with different storage requirements:

#### Sync Mode (Metro DR)

Sync mode uses an external storage cluster that all managed clusters connect
to, providing all clusters access to the same storage backend.

**Supported:**

- Any CSI-compatible storage system that supports shared external storage
clusters
- CSI drivers that support static provisioning for PVCs and can attach the
same storage when a PV is transferred between clusters sharing the same
storage backend

**Required:**

- External storage cluster
- Storage provider installed that supports CSI and synchronous replication
- StorageClasses on managed clusters with the same
`ramendr.openshift.io/storageid` labels (indicating shared storage)
- Low-latency network connectivity between managed clusters and the external
storage cluster

#### Async Mode (Regional DR)

Review comment (Member): (future) Overall, we would need a doc on sync/async and what they mean in the DR and Ramen scope.


Async mode uses storage in each managed cluster with asynchronous replication
between clusters based on configurable time intervals.

**Supported:**

- Any CSI-compatible storage that supports VolumeReplication or
VolumeSnapshot

Ramen works as part of the [OCM hub](https://open-cluster-management.io/concepts/architecture/#hub-cluster)
cluster to orchestrate the [placement](https://open-cluster-management.io/concepts/placement/)
of [workloads](https://kubernetes.io/docs/concepts/workloads/) and their attachment
to PersistentVolumes, on [OCM managed clusters](https://open-cluster-management.io/concepts/managedcluster/).
**Required:**

[Ramen hub](#ramen-hub-operator) and [Ramen cluster](#ramen-cluster-operator)
operators hence require an OCM managed multi-cluster setup for their
operation.
- Storage provider installed in each managed cluster that supports CSI and
VolumeReplication or VolumeSnapshot
- StorageClasses on managed clusters with different
`ramendr.openshift.io/storageid` labels (indicating separate storage
instances)
- Either the [VolumeReplication](https://github.com/csi-addons/volume-replication-operator)
CRD with a VolumeReplicationClass, or the VolSync operator with a
VolumeSnapshotClass
- Network connectivity between managed clusters for replication
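
The `ramendr.openshift.io/storageid` label distinguishes the two modes: matching values indicate shared storage (sync), different values indicate separate storage instances (async). The sketch below reads the label from a StorageClass on two clusters and reports which mode the labels imply. The function names, the context names, and the assumption that the StorageClass has the same name on both clusters are all illustrative.

```bash
# Hypothetical helper: print the storageid label of a StorageClass in one context.
# $1 = kubectl context, $2 = StorageClass name. Dots in the label key are escaped
# for kubectl's jsonpath syntax.
storage_id() {
    kubectl --context "$1" get storageclass "$2" \
        -o jsonpath='{.metadata.labels.ramendr\.openshift\.io/storageid}'
}

# Hypothetical helper: report the DR mode implied by the labels on two clusters.
# $1, $2 = kubectl contexts of the managed clusters, $3 = StorageClass name.
dr_mode_for() {
    local id_a id_b
    id_a="$(storage_id "$1" "$3")"
    id_b="$(storage_id "$2" "$3")"
    if [ -z "$id_a" ] || [ -z "$id_b" ]; then
        echo "unlabeled"   # label missing on at least one cluster
    elif [ "$id_a" = "$id_b" ]; then
        echo "sync"        # same storageid: shared storage backend
    else
        echo "async"       # different storageid: separate storage instances
    fi
}
```

A call such as `dr_mode_for dr1 dr2 my-storage-class` prints `sync`, `async`, or `unlabeled`; the context and StorageClass names here are placeholders.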

### OCM Managed Cluster supporting VolumeReplication CRD
For information about installing storage providers that support CSI, see your
storage vendor's documentation or the
[Kubernetes CSI documentation](https://kubernetes-csi.github.io/docs/).

Ramen works as part of the [OCM managed clusters](https://open-cluster-management.io/concepts/managedcluster/)
to orchestrate:
### 4. S3 Object Storage

- [VolumeReplication](https://github.com/csi-addons/volume-replication-operator/blob/main/api/v1alpha1/volumereplication_types.go)
resources for all PVCs of a workload
- Preserving relevant cluster data regarding each PVC that is replicated
Ramen stores workload metadata and Ramen resources in S3-compatible object
storage for cross-cluster recovery.

VolumeReplication custom resources require storage providers to support
[CSI extensions](https://github.com/csi-addons/spec) that enable managing
replication features for provisioned volumes.
[Ceph-CSI](https://github.com/ceph/ceph-csi/) is one such storage provider
that supports the required extensions.
**Requirements:**

[Ramen cluster operator](#ramen-cluster-operator) hence should be deployed
to OCM managed clusters that support VolumeReplication extensions.
- S3-compatible object store accessible from all managed clusters
- Bucket(s) created for each managed cluster
- S3 credentials (access key and secret key)

### S3 store
**Supported:**

Ramen preserves cluster data related to PVC resources in an S3 compatible
object store. An S3 store endpoint is hence required as part of the setup.
- Any S3-compatible storage

**NOTE**: Ramen stores the PV cluster data for each replicated PVC so that it
can restore the PVs on a peer cluster before deploying the PVCs of the
workload. This ensures that the PVCs bind to the replicated storage
endpoints.
### 5. Operator Lifecycle Manager (OLM)

### Operator lifecycle manager (OLM)
Ramen operators are distributed via OLM catalogs.

Ramen components are provided as [OLM](https://olm.operatorframework.io/docs/getting-started/)
catalog sources in the [Ramen catalog](https://quay.io/repository/ramendr/ramen-operator-catalog?tab=info).
**Requirements:**

Every cluster on which Ramen hub or cluster components are installed must
also have OLM installed.
- OLM installed on hub and managed clusters
- For vanilla Kubernetes, [install OLM](https://olm.operatorframework.io/docs/getting-started/)

### Kubernetes versions
### 6. Required Tools

Kubernetes versions supported are [1.20](https://kubernetes.io/releases/)
or higher.
The following tools must be installed on your workstation:

### Tool versions
- **kubectl** >= v1.30 - Kubernetes CLI
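
The version requirement can be checked with a small comparison helper. The sketch below compares only the major and minor components of two `major.minor[.patch]` strings. The function name `version_at_least` is illustrative, and reading the client version from the JSON output with `jq` is an assumption about the tools on your workstation.

```bash
# Hypothetical helper: succeed if version $1 (e.g. "v1.31.2") is at least
# version $2 (e.g. "1.30"). Both must contain at least "major.minor".
version_at_least() {
    local have="${1#v}" want="${2#v}"
    local have_major="${have%%.*}" want_major="${want%%.*}"
    local have_rest="${have#*.}" want_rest="${want#*.}"
    local have_minor="${have_rest%%.*}" want_minor="${want_rest%%.*}"
    # Drop any non-digit suffix on the minor version (for example "30+").
    have_minor="${have_minor%%[!0-9]*}"
    want_minor="${want_minor%%[!0-9]*}"
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
        return $?
    fi
    [ "$have_minor" -ge "$want_minor" ]
}

# Example usage (requires kubectl and jq, so it is left commented out):
#   client="$(kubectl version --client -o json | jq -r .clientVersion.gitVersion)"
#   version_at_least "$client" 1.30 || echo "kubectl is older than v1.30"
```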

Installation and deployment require the following tools at specified versions
(or higher):
### 7. Optional Components

- kubectl > v1.21
- kubectl version can be verified using
These components enhance Ramen's capabilities:

```bash
kubectl version
```
#### Recipe CRD (for Recipe-based Protection)

## Ramen hub operator
Required if using Recipe-based workload protection.

`ramen-hub-operator` is the controller for managing the life cycle of user
created [DRPlacementControl (DRPC)](drpc-crd.md) Ramen API resources and
administrator created [DRPolicy](drpolicy-crd.md) Ramen API resources, and is
installed on the **OCM hub cluster**.
**Install on each managed cluster:**

### Install ramen-hub-operator
```bash
kubectl apply -k "https://github.com/RamenDR/recipe.git/config/crd?ref=main"
```

The Recipe CRD is also available at:
[recipe/config/crd/bases/ramendr.openshift.io_recipes.yaml](https://github.com/RamenDR/recipe/blob/main/config/crd/bases/ramendr.openshift.io_recipes.yaml)

## Installation Steps

### Step 1: Install Ramen Hub Operator

The `ramen-hub-operator` manages disaster recovery orchestration on the OCM hub
cluster. It controls:

- [DRPlacementControl (DRPC)](drpc-crd.md) - DR operations for individual
applications
- [DRPolicy](drpolicy-crd.md) - DR topology and replication configuration
- [DRCluster](drcluster-crd.md) - Managed cluster registration and S3 configuration

#### Install on Hub Cluster

Configure kubectl to use your OCM hub cluster context:

```bash
kubectl config use-context <hub-cluster-context>
```

To install `ramen-hub-operator`, configure [kubectl](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/#context)
to use the desired OCM hub cluster and execute:
Install the operator using OLM:

```bash
kubectl apply -k github.com/RamenDR/ramen/config/olm-install/hub/?ref=main
kubectl apply -k "https://github.com/RamenDR/ramen/config/olm-install/hub?ref=main"
```

**NOTE**: By default, `ramen-hub-operator` creates a deployment for its
controller in the `ramen-system` namespace. To verify, check the health of the
deployment:
This creates:

- `ramen-system` namespace
- Ramen hub operator deployment
- Required CRDs (DRPlacementControl, DRPolicy, DRCluster)
- RBAC resources

#### Verify Hub Operator Installation

**Check operator deployment:**

```bash
kubectl get deployments -n ramen-system ramen-hub-operator
kubectl get deployments -n ramen-system
```

## Ramen cluster operator
Expected output:

`ramen-dr-cluster-operator` is the controller for managing the life cycle of
[VolumeReplicationGroup](vrg-crd.md) Ramen API resources and is installed on
the **OCM managed clusters**.
```
NAME READY UP-TO-DATE AVAILABLE AGE
ramen-hub-operator 1/1 1 1 2m
```
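
Because OLM installs the operator asynchronously, the deployment may not exist or be ready immediately after `kubectl apply`. A minimal sketch of waiting for it, assuming the deployment name shown in the expected output above; the function name and the default timeout are illustrative.

```bash
# Hypothetical helper: block until the hub operator deployment has rolled out.
# $1 = kubectl context of the hub, $2 = optional timeout (default 300s).
wait_for_hub_operator() {
    local hub_context="$1" timeout="${2:-300s}"
    # rollout status exits non-zero if the rollout does not finish in time.
    kubectl --context "$hub_context" -n ramen-system \
        rollout status deployment/ramen-hub-operator --timeout="$timeout"
}
```

If the deployment has not been created yet, `rollout status` fails immediately with a not-found error, so retrying the call after a short pause may be needed.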

**NOTE**: The lifecycle of VolumeReplicationGroup resources is managed by
[Ramen hub](#ramen-hub-operator) on the required OCM managed clusters.
### Step 2: Install Ramen DR Cluster Operator

### Install ramen-dr-cluster-operator
Install the catalog source for the DR Cluster operator in all managed clusters.

To install `ramen-dr-cluster-operator`, configure [kubectl](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/#context)
to use the desired OCM managed cluster and execute:
Switch to each managed cluster context:

```bash
kubectl apply -k github.com/RamenDR/ramen/config/olm-install/dr-cluster/?ref=main
kubectl config use-context <managed-cluster-context>
```

**NOTE**: By default, `ramen-dr-cluster-operator` creates a deployment for its
controller in the `ramen-system` namespace. To verify, check the health of
the deployment:
Install the catalog source:

```bash
kubectl get deployments -n ramen-system ramen-dr-cluster-operator
kubectl apply -k "https://github.com/RamenDR/ramen/config/olm-install/base?ref=main"
```
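
Instead of switching contexts by hand for every managed cluster, the same command can be applied in a loop using kubectl's `--context` flag. This is a sketch only; the function name and the context names passed to it are placeholders.

```bash
# Hypothetical helper: install the catalog source on each managed cluster.
# Arguments are the kubectl contexts of the managed clusters.
install_catalog_source() {
    local ctx
    for ctx in "$@"; do
        # Stop at the first failure so a broken cluster is not silently skipped.
        kubectl --context "$ctx" apply -k \
            "https://github.com/RamenDR/ramen/config/olm-install/base?ref=main" || return 1
    done
}
```

For example, `install_catalog_source dr1 dr2` applies the kustomization to the contexts `dr1` and `dr2` without changing the current context.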

The DR Cluster operator will be installed automatically during configuration.
Refer to [configure.md](configure.md) for more details.

## Post-Installation

After installing Ramen operators, you need to configure them for your environment.

### Next Steps

**Configure Ramen** - Set up DRPolicy, DRCluster resources, and S3 storage.
See [configure.md](configure.md) for detailed configuration instructions.

### Configuration Prerequisites

Review comment (Member): Once configure.md is posted, I think I will delete this section from this doc.


Before configuring Ramen, prepare the following information:

- **S3 credentials** for each managed cluster

- Bucket names
- Access key and secret key
- S3 endpoint URL

- **Cluster information**

- Managed cluster names (as registered in OCM)
- Storage class names for PVCs
- VolumeReplicationClass or VolumeSnapshotClass names

- **Replication schedule** (for async DR)

- How often to replicate (e.g., "5m", "1h")

## Troubleshooting Installation

### Operator Pod Not Running

**Check pod status:**

```bash
kubectl get pods -n ramen-system
kubectl describe pod -n ramen-system <pod-name>
```

**Common issues:**

- Image pull errors - check image registry access
- OLM not installed - verify `kubectl get csv -n openshift-operators`
- Resource constraints - check cluster resources

### CRDs Not Created

**Verify OLM created the subscription:**

```bash
kubectl get subscription -n ramen-system
kubectl get installplan -n ramen-system
```

**Check catalog source:**

```bash
kubectl get catalogsource -n openshift-marketplace
```

### Operator Logs Show Errors

**View logs:**

```bash
kubectl logs -n ramen-system deployment/ramen-hub-operator -c manager --tail=100
```
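
When the log is long, filtering for likely error lines can speed up triage. The sketch below wraps the command above and keeps only lines that mention `error` or `failed`; the function name and the filter pattern are illustrative and may miss errors that are worded differently.

```bash
# Hypothetical helper: print recent hub operator log lines that look like errors.
# $1 = kubectl context of the hub, $2 = optional number of lines to scan (default 100).
hub_operator_errors() {
    local hub_context="$1" tail_lines="${2:-100}"
    # grep exits non-zero when nothing matches; "|| true" keeps that from
    # being reported as a failure of the helper itself.
    kubectl --context "$hub_context" -n ramen-system logs deployment/ramen-hub-operator \
        -c manager --tail="$tail_lines" | grep -iE 'error|failed' || true
}
```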

**Common errors:**

- RBAC permissions - check ClusterRole and ClusterRoleBinding
- API server connection issues - verify network connectivity
- Missing dependencies - ensure OCM is properly installed

### Rollback Installation

Review comment (Member): Add a note stating that if Ramen has been configured, the configuration must be rolled back before rolling back the installation.

**Note:** If Ramen has been configured, roll back configurations before rolling
back the installation.

**Remove hub operator:**

```bash
kubectl delete -k "https://github.com/RamenDR/ramen/config/olm-install/hub?ref=main"
```

**Note:** This will remove the operators but not the CRDs. To remove CRDs:

```bash
kubectl delete crd drplacementcontrols.ramendr.openshift.io
kubectl delete crd drpolicies.ramendr.openshift.io
kubectl delete crd drclusters.ramendr.openshift.io
```

**Remove catalog source in all managed clusters:**

```bash
kubectl delete -k "https://github.com/RamenDR/ramen/config/olm-install/base?ref=main"
```

## Development Installation

For development and testing, see:

- [user-quick-start.md](user-quick-start.md) - Complete test environment setup
with drenv
- [devel-quick-start.md](devel-quick-start.md) - Developer setup for
contributing to Ramen