diff --git a/docs/deploy/aws.md b/docs/deploy/aws.md
index 49b0ea1d69ae19..d00ef9792c56d7 100644
--- a/docs/deploy/aws.md
+++ b/docs/deploy/aws.md
@@ -187,6 +187,58 @@ Note down the elb address in the address column. Add the DNS CNAME record to the
 from above) to the elb address. DNS updates generally take a few minutes to an hour. Once that is done, you should be able
 to access datahub-frontend through the host-name.
 
+## Increasing Storage for Elasticsearch, Kafka, and MySQL in Kubernetes
+
+When deploying DataHub in a Kubernetes environment, storage for external dependencies such as Elasticsearch, Kafka, and MySQL is configured in the `datahub-prerequisites` Helm chart, not in the `datahub` chart. To increase storage for these services, follow the steps below:
+
+### Understanding Helm Charts
+
+- **`datahub` Chart**: Used for deploying the DataHub application itself.
+- **`datahub-prerequisites` Chart**: Used for deploying external dependencies such as Elasticsearch, Kafka, and MySQL.
+
+### Step-by-Step Instructions
+
+1. **Locate the `values.yaml` File**: This file is part of the `datahub-prerequisites` chart and contains configurations for Elasticsearch, Kafka, and MySQL.
+
+2. **Update Storage Configurations**:
+   - **Elasticsearch**:
+     ```yaml
+     elasticsearch:
+       volumeClaimTemplate:
+         accessModes: ["ReadWriteOnce"]
+         resources:
+           requests:
+             storage: 20Gi  # Increase this to the desired size
+     ```
+   - **Kafka**:
+     ```yaml
+     kafka:
+       broker:
+         persistence:
+           size: 16Gi  # Increase this to the desired size
+     ```
+   - **MySQL**:
+     ```yaml
+     mysql:
+       persistence:
+         size: 10Gi  # Increase this to the desired size
+     ```
+
+3. **Apply Changes with Helm** (`prerequisites` is the release name used when the chart was installed; substitute your own if it differs):
+   ```bash
+   helm upgrade prerequisites datahub/datahub-prerequisites -f values.yaml
+   ```
+
+4. **Verify the Update**:
+   ```bash
+   kubectl get pvc
+   kubectl get pods
+   ```
+
+### Cautionary Note
+
+If your Kubernetes environment does not support dynamic resizing of PVCs, you may have to delete the existing PVCs manually, which can cause data loss. Always ensure you have backups before proceeding with storage changes.
+
 ## Use AWS managed services for the storage layer
 
 Managing the storage services like MySQL, Elasticsearch, and Kafka as kubernetes pods requires a great deal of
diff --git a/docs/deploy/kubernetes.md b/docs/deploy/kubernetes.md
index 5d6a61b98b7545..3e07419ba55405 100644
--- a/docs/deploy/kubernetes.md
+++ b/docs/deploy/kubernetes.md
@@ -85,6 +85,49 @@ any of the configuration and deploy by running the following command.
 helm install prerequisites datahub/datahub-prerequisites --values <>
 ```
 
+### Increasing Storage for Elasticsearch, Kafka, and MySQL
+
+To increase storage for Elasticsearch, Kafka, and MySQL, you need to update the `datahub-prerequisites` chart. Follow these steps:
+
+1. **Locate the `values.yaml` File**: This file is part of the `datahub-prerequisites` chart and contains configurations for Elasticsearch, Kafka, and MySQL.
+
+2. **Update Storage Configuration**:
+   - **Elasticsearch**: Modify the `volumeClaimTemplate.resources.requests.storage` value.
+     ```yaml
+     elasticsearch:
+       volumeClaimTemplate:
+         accessModes: ["ReadWriteOnce"]
+         resources:
+           requests:
+             storage: 20Gi  # Increase this to the desired size
+     ```
+   - **Kafka**: Modify the `broker.persistence.size` value.
+     ```yaml
+     kafka:
+       broker:
+         persistence:
+           size: 16Gi  # Increase this to the desired size
+     ```
+   - **MySQL**: Modify the `persistence.size` value.
+     ```yaml
+     mysql:
+       persistence:
+         size: 10Gi  # Increase this to the desired size
+     ```
+
+3. **Apply the Changes**: Use Helm to apply the changes by upgrading your `datahub-prerequisites` release (named `prerequisites` in the install command above).
+   ```bash
+   helm upgrade prerequisites datahub/datahub-prerequisites -f values.yaml
+   ```
+
+4. **Verify the Changes**: Check that the PVCs report the updated storage size and that the pods are running.
+   ```bash
+   kubectl get pvc
+   kubectl get pods
+   ```
+
+5. **Caution**: If your Kubernetes environment does not support dynamic resizing of PVCs, you may need to manually delete the existing PVCs and allow Kubernetes to create new ones with the updated size. This process can lead to data loss if not handled carefully, so ensure you have backups or a plan to migrate data.
+
 Run `kubectl get pods` to check whether all the pods for the dependencies are running. You should get a result similar
 to below.