Enhance install.md documentation #2341
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3,120 +3,319 @@ SPDX-FileCopyrightText: The RamenDR authors | |
| SPDX-License-Identifier: Apache-2.0 | ||
| --> | ||
|
|
||
| # Install | ||
| # Installing Ramen | ||
|
|
||
| This guide describes how to install Ramen's disaster recovery operators on your | ||
| cluster environment. | ||
|
|
||
| ## Overview | ||
|
|
||
| Ramen consists of two operators that work together to provide disaster recovery | ||
| capabilities: | ||
|
|
||
| 1. **Ramen Hub Operator** - Installed on the OCM hub cluster, orchestrates DR | ||
| operations across managed clusters | ||
| 1. **Ramen DR Cluster Operator** - Installed on each OCM managed cluster, | ||
| manages local volume replication and workload protection | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| ### OCM managed multi-cluster setup | ||
| Before installing Ramen, ensure the following requirements are met: | ||
|
|
||
| ### 1. Kubernetes Versions | ||
|
|
||
| **Supported versions:** | ||
|
|
||
| - Kubernetes 1.30 or higher | ||
|
|
||
| ### 2. Open Cluster Management (OCM) Setup | ||
|
|
||
| Ramen requires an | ||
| [OCM](https://open-cluster-management.io/docs/concepts/architecture/) hub cluster | ||
| with at least two managed clusters for disaster recovery operations. | ||
|
|
||
| **Requirements:** | ||
|
|
||
| - OCM hub cluster with `multicluster-engine`, `ocm-controller` (from | ||
| multicloud-operators-foundation), and hub addons | ||
| (`application-manager`, `governance-policy-framework`) | ||
| - OCM managed cluster addons on each managed cluster: | ||
| `application-manager`, `governance-policy-framework`, and | ||
| `config-policy-controller` | ||
| - At least 2 managed clusters registered with the hub | ||
| - All clusters must be able to communicate with each other | ||
|
|
||
| For OCM installation instructions, see the [OCM installation guide](https://open-cluster-management.io/docs/getting-started/installation/). | ||
|
|
||
| ### 3. Storage Replication Support | ||
|
|
||
| Ramen supports two disaster recovery modes, each with different storage requirements: | ||
|
|
||
| #### Sync Mode (Metro DR) | ||
|
|
||
| Sync mode uses an external storage cluster that all managed clusters connect | ||
| to, providing all clusters access to the same storage backend. | ||
|
|
||
| **Supported:** | ||
|
|
||
| - Any CSI-compatible storage systems that support shared external storage | ||
| clusters | ||
| - CSI drivers that support static provisioning for PVCs and can attach the | ||
| same storage when a PV is transferred between clusters sharing the same | ||
| storage backend | ||
|
|
||
| **Required:** | ||
|
|
||
| - External storage cluster | ||
| - Storage provider installed that supports CSI and synchronous replication | ||
| - StorageClasses on managed clusters with the same | ||
| `ramendr.openshift.io/storageid` labels (indicating shared storage) | ||
| - Low-latency network connectivity between managed clusters and the external | ||
| storage cluster | ||
|
|
||
| #### Async Mode (Regional DR) | ||
|
|
||
| Async mode uses storage in each managed cluster with asynchronous replication | ||
| between clusters based on configurable time intervals. | ||
|
|
||
| **Supported:** | ||
|
|
||
| - Any CSI-compatible storage that supports VolumeReplication or | ||
| VolumeSnapshot | ||
|
|
||
| Ramen works as part of the [OCM hub](https://open-cluster-management.io/concepts/architecture/#hub-cluster) | ||
| cluster to orchestrate the [placement](https://open-cluster-management.io/concepts/placement/) | ||
| of [workloads](https://kubernetes.io/docs/concepts/workloads/) and their attachment | ||
| to PersistentVolumes, on [OCM managed clusters](https://open-cluster-management.io/concepts/managedcluster/). | ||
| **Required:** | ||
|
|
||
| [Ramen hub](#ramen-hub-operator) and [Ramen cluster](#ramen-cluster-operator) | ||
| operators hence require an OCM managed multi-cluster setup for their | ||
| operation. | ||
| - Storage provider installed in each managed cluster that supports CSI and | ||
| VolumeReplication or VolumeSnapshot | ||
| - StorageClasses on managed clusters with different | ||
| `ramendr.openshift.io/storageid` labels (indicating separate storage | ||
| instances) | ||
| - [VolumeReplication](https://github.com/csi-addons/volume-replication-operator) | ||
| CRD and VolumeReplicationClass OR VolSync operator and VolumeSnapshotClass | ||
| - Network connectivity between managed clusters for replication | ||
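As a rough sketch of how the `ramendr.openshift.io/storageid` labels distinguish the two modes described above, the helper below compares the label values read from a StorageClass on each managed cluster. The function name `dr_mode_for_storageids` and the placeholder context and StorageClass names are hypothetical; the sync/async interpretation simply follows the requirements listed in this guide.

```bash
# Hypothetical helper: classify a pair of storageid label values.
# Matching IDs indicate shared storage (sync); different IDs indicate
# separate storage instances (async), per the requirements in this guide.
dr_mode_for_storageids() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "unknown"   # label missing on at least one StorageClass
    elif [ "$1" = "$2" ]; then
        echo "sync"
    else
        echo "async"
    fi
}

# Example usage (context and StorageClass names are placeholders):
#   id1=$(kubectl --context <cluster1-context> get storageclass <name> \
#       -o jsonpath='{.metadata.labels.ramendr\.openshift\.io/storageid}')
#   id2=$(kubectl --context <cluster2-context> get storageclass <name> \
#       -o jsonpath='{.metadata.labels.ramendr\.openshift\.io/storageid}')
#   dr_mode_for_storageids "$id1" "$id2"
```

An empty value is reported as `unknown` rather than guessed, since a missing label means the StorageClass has not been prepared for Ramen yet.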
|
|
||
| ### OCM Managed Cluster supporting VolumeReplication CRD | ||
| For information about installing storage providers that support CSI, see your | ||
| storage vendor's documentation or the | ||
| [Kubernetes CSI documentation](https://kubernetes-csi.github.io/docs/). | ||
|
|
||
| Ramen works as part of the [OCM managed clusters](https://open-cluster-management.io/concepts/managedcluster/) | ||
| to orchestrate: | ||
| ### 4. S3 Object Storage | ||
|
|
||
| - [VolumeReplication](https://github.com/csi-addons/volume-replication-operator/blob/main/api/v1alpha1/volumereplication_types.go) | ||
| resources for all PVCs of a workload | ||
| - Preserving relevant cluster data regarding each PVC that is replicated | ||
| Ramen stores workload metadata and Ramen resources in S3-compatible object | ||
| storage for cross-cluster recovery. | ||
|
|
||
| VolumeReplication custom resources require storage providers to support | ||
| [CSI extensions](https://github.com/csi-addons/spec) that enable managing | ||
| replication features for provisioned volumes. | ||
| [Ceph-CSI](https://github.com/ceph/ceph-csi/) is one such storage provider | ||
| that supports the required extensions. | ||
| **Requirements:** | ||
|
|
||
| [Ramen cluster operator](#ramen-cluster-operator) hence should be deployed | ||
| to OCM managed clusters that support VolumeReplication extensions. | ||
| - S3-compatible object store accessible from all managed clusters | ||
| - Bucket(s) created for each managed cluster | ||
| - S3 credentials (access key and secret key) | ||
|
|
||
| ### S3 store | ||
| **Supported:** | ||
|
|
||
| Ramen preserves cluster data related to PVC resources in an S3 compatible | ||
| object store. An S3 store endpoint is hence required as part of the setup. | ||
| - Any S3-compatible storage | ||
|
|
||
| **NOTE**: Ramen specifically stores PV cluster data for a replicated PVC, to | ||
| restore the same across peer cluster prior to deploying the PVCs of the | ||
| workload, this ensures proper binding of the PVC resources to the replicated | ||
| storage end points. | ||
| ### 5. Operator Lifecycle Manager (OLM) | ||
|
|
||
| ### Operator lifecycle manager (OLM) | ||
| Ramen operators are distributed via OLM catalogs. | ||
|
|
||
| Ramen components are provided as [OLM](https://olm.operatorframework.io/docs/getting-started/) | ||
| catalog sources in the [Ramen catalog](https://quay.io/repository/ramendr/ramen-operator-catalog?tab=info). | ||
| **Requirements:** | ||
|
|
||
| All clusters that require Ramen hub or cluster components installed, require | ||
| OLM installed on the same. | ||
| - OLM installed on hub and managed clusters | ||
| - For vanilla Kubernetes, [install OLM](https://olm.operatorframework.io/docs/getting-started/) | ||
|
|
||
| ### Kubernetes versions | ||
| ### 6. Required Tools | ||
|
|
||
| Kubernetes versions supported are [1.20](https://kubernetes.io/releases/) | ||
| or higher. | ||
| The following tools must be installed on your workstation: | ||
|
|
||
| ### Tool versions | ||
| - **kubectl** >= v1.30 - Kubernetes CLI | ||
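One possible way to check the kubectl requirement is sketched below. The helper name `version_at_least` is hypothetical, and it assumes a `sort` that supports `-V` (version sort); it only compares plain `vMAJOR.MINOR.PATCH` strings and does not handle pre-release suffixes.

```bash
# Hypothetical helper: succeed when version $2 is at least version $1.
# Both arguments may carry a leading "v" (for example v1.30.0).
version_at_least() {
    required="${1#v}"
    actual="${2#v}"
    # sort -V orders version strings; if the required version sorts first
    # (or both are equal), the actual version is new enough.
    lowest="$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)"
    [ "$lowest" = "$required" ]
}

# Example usage (assumes jq is installed to read the client version):
#   client="$(kubectl version --client -o json | jq -r '.clientVersion.gitVersion')"
#   version_at_least v1.30.0 "$client" || echo "kubectl $client is too old"
```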
|
|
||
| Installation and deployment require the following tools at specified versions | ||
| (or higher): | ||
| ### 7. Optional Components | ||
|
|
||
| - kubectl > v1.21 | ||
| - kubectl version can be verified using | ||
| These components enhance Ramen's capabilities: | ||
|
|
||
| ```bash | ||
| kubectl version | ||
| ``` | ||
| #### Recipe CRD (for Recipe-based Protection) | ||
|
|
||
| ## Ramen hub operator | ||
| Required if using Recipe-based workload protection. | ||
|
|
||
| `ramen-hub-operator` is the controller for managing the life cycle of user | ||
| created [DRPlacementControl (DRPC)](drpc-crd.md) Ramen API resources and | ||
| administrator created [DRPolicy](drpolicy-crd.md) Ramen API resources, and is | ||
| installed on the **OCM hub cluster**. | ||
| **Install on each managed cluster:** | ||
|
|
||
| ### Install ramen-hub-operator | ||
| ```bash | ||
| kubectl apply -k "https://github.com/RamenDR/recipe.git/config/crd?ref=main" | ||
| ``` | ||
|
|
||
| The Recipe CRD is also available at: | ||
| [recipe/config/crd/bases/ramendr.openshift.io_recipes.yaml](https://github.com/RamenDR/recipe/blob/main/config/crd/bases/ramendr.openshift.io_recipes.yaml) | ||
|
|
||
| ## Installation Steps | ||
|
|
||
| ### Step 1: Install Ramen Hub Operator | ||
|
|
||
| The `ramen-hub-operator` manages disaster recovery orchestration on the OCM hub | ||
| cluster. It controls: | ||
|
|
||
| - [DRPlacementControl (DRPC)](drpc-crd.md) - DR operations for individual | ||
| applications | ||
| - [DRPolicy](drpolicy-crd.md) - DR topology and replication configuration | ||
| - [DRCluster](drcluster-crd.md) - Managed cluster registration and S3 configuration | ||
|
|
||
| #### Install on Hub Cluster | ||
|
|
||
| Configure kubectl to use your OCM hub cluster context: | ||
|
|
||
| ```bash | ||
| kubectl config use-context <hub-cluster-context> | ||
| ``` | ||
|
|
||
| To install `ramen-hub-operator` configure [kubectl](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/#context) | ||
| to use the desired OCM hub cluster and execute: | ||
| Install the operator using OLM: | ||
|
|
||
| ```bash | ||
| kubectl apply -k github.com/RamenDR/ramen/config/olm-install/hub/?ref=main | ||
| kubectl apply -k "https://github.com/RamenDR/ramen/config/olm-install/hub?ref=main" | ||
| ``` | ||
|
|
||
| **NOTE**: By default `ramen-hub-operator` creates a deployment for its | ||
| controller in the `ramen-system` namespace. To verify check the health of the | ||
| deployment: | ||
| This creates: | ||
|
|
||
| - `ramen-system` namespace | ||
| - Ramen hub operator deployment | ||
| - Required CRDs (DRPlacementControl, DRPolicy, DRCluster) | ||
| - RBAC resources | ||
|
|
||
| #### Verify Hub Operator Installation | ||
|
|
||
| **Check operator deployment:** | ||
|
|
||
| ```bash | ||
| kubectl get deployments -n ramen-system ramen-hub-operator | ||
| kubectl get deployments -n ramen-system | ||
| ``` | ||
|
|
||
| ## Ramen cluster operator | ||
| Expected output: | ||
|
|
||
| `ramen-dr-cluster-operator` is the controller for managing the life cycle of | ||
| [VolumeReplicationGroup](vrg-crd.md) Ramen API resources and is installed on | ||
| the **OCM managed clusters**. | ||
| ``` | ||
| NAME READY UP-TO-DATE AVAILABLE AGE | ||
| ramen-hub-operator 1/1 1 1 2m | ||
| ``` | ||
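Instead of polling the deployment by hand, the check above could be scripted with `kubectl wait`. This is a minimal sketch: the helper name `wait_for_deployment` and the default timeout are assumptions, while the namespace and deployment name come from the expected output shown in this guide.

```bash
# Hypothetical helper: block until a deployment reports the Available
# condition, or fail when the timeout expires.
wait_for_deployment() {
    namespace="$1"
    name="$2"
    timeout="${3:-120s}"   # assumed default; adjust for slow registries
    kubectl wait --for=condition=Available --timeout="$timeout" \
        -n "$namespace" "deployment/$name"
}

# Example usage:
#   wait_for_deployment ramen-system ramen-hub-operator 300s
```

Note that `kubectl wait` fails immediately if the deployment does not exist yet, so with an OLM-based install it may need to be retried until OLM has created the deployment.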
|
|
||
| **NOTE**: Lifecycle of VolumeReplicationGroup resources are managed by | ||
| [Ramen hub](#ramen-hub-operator) on required OCM managed clusters. | ||
| ### Step 2: Install Ramen DR Cluster Operator | ||
|
|
||
| ### Install ramen-dr-cluster-operator | ||
| Install the catalog source for the DR Cluster operator in all managed clusters. | ||
|
|
||
| To install `ramen-dr-cluster-operator` configure [kubectl](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/#context) | ||
| to use the desired OCM managed cluster and execute: | ||
| Switch to each managed cluster context: | ||
|
|
||
| ```bash | ||
| kubectl apply -k github.com/RamenDR/ramen/config/olm-install/dr-cluster/?ref=main | ||
| kubectl config use-context <managed-cluster-context> | ||
| ``` | ||
|
|
||
| **NOTE**: By default `ramen-dr-cluster-operator` creates a deployment for its | ||
| controller in the `ramen-system` namespace. To verify check the health of | ||
| the deployment: | ||
| Install the catalog source: | ||
|
|
||
| ```bash | ||
| kubectl get deployments -n ramen-system ramen-dr-cluster-operator | ||
| kubectl apply -k "https://github.com/RamenDR/ramen/config/olm-install/base?ref=main" | ||
| ``` | ||
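Because this step is repeated on every managed cluster, it can be wrapped in a small loop that passes `--context` to kubectl rather than switching the current context each time. The helper name `install_catalog_source` and the context names in the usage comment are hypothetical; the kustomization URL is the one from the command above.

```bash
# Kustomization URL taken from the install command in this guide.
CATALOG_KUSTOMIZE_URL="https://github.com/RamenDR/ramen/config/olm-install/base?ref=main"

# Hypothetical helper: apply the catalog source on every kubectl context
# passed as an argument, stopping at the first failure.
install_catalog_source() {
    for context in "$@"; do
        echo "Installing catalog source on $context"
        kubectl --context "$context" apply -k "$CATALOG_KUSTOMIZE_URL" || return 1
    done
}

# Example usage (context names are placeholders):
#   install_catalog_source <managed-cluster-1-context> <managed-cluster-2-context>
```

Using `--context` leaves the current kubectl context untouched, which avoids accidentally running later hub commands against a managed cluster.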
|
|
||
| The DR Cluster operator will be installed automatically during configuration. | ||
| Refer to [configure.md](configure.md) for more details. | ||
|
|
||
| ## Post-Installation | ||
|
|
||
| After installing Ramen operators, you need to configure them for your environment. | ||
|
|
||
| ### Next Steps | ||
|
|
||
| **Configure Ramen** - Set up DRPolicy, DRCluster resources, and S3 storage. | ||
| See [configure.md](configure.md) for detailed configuration instructions. | ||
|
|
||
| ### Configuration Prerequisites | ||
|
Member
Once configure.md is posted, I think I will delete this section in this doc.
||
|
|
||
| Before configuring Ramen, prepare the following information: | ||
|
|
||
| - **S3 credentials** for each managed cluster | ||
|
|
||
| - Bucket names | ||
| - Access key and secret key | ||
| - S3 endpoint URL | ||
|
|
||
| - **Cluster information** | ||
|
|
||
| - Managed cluster names (as registered in OCM) | ||
| - Storage class names for PVCs | ||
| - VolumeReplicationClass or VolumeSnapshotClass names | ||
|
|
||
| - **Replication schedule** (for async DR) | ||
|
|
||
| - How often to replicate (e.g., "5m", "1h") | ||
|
|
||
| ## Troubleshooting Installation | ||
|
|
||
| ### Operator Pod Not Running | ||
|
|
||
| **Check pod status:** | ||
|
|
||
| ```bash | ||
| kubectl describe pod -n ramen-system <pod-name> | ||
| ``` | ||
|
|
||
| **Common issues:** | ||
|
|
||
| - Image pull errors - check image registry access | ||
| - OLM not installed - verify `kubectl get csv -n openshift-operators` | ||
| - Resource constraints - check cluster resources | ||
|
|
||
| ### CRDs Not Created | ||
|
|
||
| **Verify OLM created the subscription:** | ||
|
|
||
| ```bash | ||
| kubectl get subscription -n ramen-system | ||
| kubectl get installplan -n ramen-system | ||
| ``` | ||
|
|
||
| **Check catalog source:** | ||
|
|
||
| ```bash | ||
| kubectl get catalogsource -n openshift-marketplace | ||
| ``` | ||
|
|
||
| ### Operator Logs Show Errors | ||
|
|
||
| **View logs:** | ||
|
|
||
| ```bash | ||
| kubectl logs -n ramen-system deployment/ramen-hub-operator -c manager --tail=100 | ||
| ``` | ||
|
|
||
| **Common errors:** | ||
|
|
||
| - RBAC permissions - check ClusterRole and ClusterRoleBinding | ||
| - API server connection issues - verify network connectivity | ||
| - Missing dependencies - ensure OCM is properly installed | ||
|
|
||
| ### Rollback Installation | ||
|
|
||
|
Member
Add a note stating that, if Ramen has been configured, the configuration should be rolled back before rolling back the installation.
||
| **Note:** If Ramen has been configured, roll back configurations before rolling | ||
| back the installation. | ||
|
|
||
| **Remove hub operator:** | ||
|
|
||
| ```bash | ||
| kubectl delete -k "https://github.com/RamenDR/ramen/config/olm-install/hub?ref=main" | ||
| ``` | ||
|
|
||
| **Note:** This will remove the operators but not the CRDs. To remove CRDs: | ||
|
|
||
| ```bash | ||
| kubectl delete crd drplacementcontrols.ramendr.openshift.io | ||
| kubectl delete crd drpolicies.ramendr.openshift.io | ||
| kubectl delete crd drclusters.ramendr.openshift.io | ||
| ``` | ||
|
|
||
| **Remove catalog source in all managed clusters:** | ||
|
|
||
| ```bash | ||
| kubectl delete -k "https://github.com/RamenDR/ramen/config/olm-install/base?ref=main" | ||
| ``` | ||
|
|
||
| ## Development Installation | ||
|
|
||
| For development and testing, see: | ||
|
|
||
| - [user-quick-start.md](user-quick-start.md) - Complete test environment setup | ||
| with drenv | ||
| - [devel-quick-start.md](devel-quick-start.md) - Developer setup for | ||
| contributing to Ramen | ||
(future) Overall we would need a doc on sync/async and what that means in the DR and ramen scope