52 changes: 52 additions & 0 deletions docs/deploy/aws.md
@@ -187,6 +187,58 @@ Note down the elb address in the address column. Add the DNS CNAME record to the
from above) to the elb address. DNS updates generally take a few minutes to an hour. Once that is done, you should be
able to access datahub-frontend through the host-name.

## Increasing Storage for Elasticsearch, Kafka, and MySQL in Kubernetes

When DataHub is deployed on Kubernetes, the storage configuration for external dependencies such as Elasticsearch, Kafka, and MySQL lives in the `datahub-prerequisites` Helm chart rather than in the `datahub` chart itself. Here’s how to increase storage for these services:

### Understanding Helm Charts

- **`datahub` Chart**: Used for deploying the DataHub application itself.
- **`datahub-prerequisites` Chart**: Used for deploying external dependencies such as Elasticsearch, Kafka, and MySQL.
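
If you are not sure which release corresponds to which chart, or where your current values live, Helm can show both. This is a minimal sketch using standard Helm commands; the chart reference matches the one used elsewhere in this guide, and release names depend on how your installation was named:

```bash
# List installed releases and the charts they were installed from;
# the prerequisites release typically comes from datahub/datahub-prerequisites.
helm list --all-namespaces

# Dump the chart's default values as a starting point for your own values.yaml.
helm show values datahub/datahub-prerequisites > values.yaml

# If the release is already installed, inspect the values it was deployed with.
helm get values <prerequisites-release-name>
```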

### Step-by-Step Instructions

1. **Locate the `values.yaml` File**: This is the values file you supply to the `datahub-prerequisites` chart; it contains the configuration for Elasticsearch, Kafka, and MySQL.

2. **Update Storage Configurations**:
- **Elasticsearch**:
```yaml
elasticsearch:
volumeClaimTemplate:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 20Gi # Increase this to the desired size
```
- **Kafka**:
```yaml
kafka:
broker:
persistence:
size: 16Gi # Increase this to the desired size
```
- **MySQL**:
```yaml
mysql:
persistence:
size: 10Gi # Increase this to the desired size
```

3. **Apply Changes with Helm**:
```bash
helm upgrade <prerequisites-release-name> datahub/datahub-prerequisites -f values.yaml
```

4. **Verify the Update** (a more targeted capacity check is sketched after these steps):
```bash
kubectl get pvc
kubectl get pods
```
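
To confirm that the resize actually took effect, rather than only that the pods are running, you can query the claimed capacity directly. This is a sketch using standard `kubectl` output options; actual PVC names depend on the release name:

```bash
# Show each PVC together with the capacity the cluster has actually provisioned.
kubectl get pvc -o custom-columns=NAME:.metadata.name,CAPACITY:.status.capacity.storage

# Inspect a single claim in detail; with some storage drivers the resize only
# completes (the FileSystemResizePending condition clears) after a pod restart.
kubectl describe pvc <pvc-name>
```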

### Cautionary Note

If the storage class backing the claims does not support dynamic volume expansion, increasing storage may require manually deleting and recreating PVCs, which can destroy the data on them. Always ensure you have backups before proceeding with storage changes.
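
Whether a resize can happen in place depends on the StorageClass backing the claims. A quick check, sketched here with standard `kubectl` commands (the `gp2` class name is only an example for EKS clusters; verify the patch is appropriate for your cluster before applying it):

```bash
# The ALLOWVOLUMEEXPANSION column shows whether in-place PVC resizing is supported.
kubectl get storageclass

# Example: enable expansion on the default EKS storage class backed by EBS.
kubectl patch storageclass gp2 -p '{"allowVolumeExpansion": true}'
```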

## Use AWS managed services for the storage layer

Managing the storage services like MySQL, Elasticsearch, and Kafka as kubernetes pods requires a great deal of
43 changes: 43 additions & 0 deletions docs/deploy/kubernetes.md
@@ -85,6 +85,49 @@ any of the configuration and deploy by running the following command.
helm install prerequisites datahub/datahub-prerequisites --values <<path-to-values-file>>
```

### Increasing Storage for Elasticsearch, Kafka, and MySQL

To increase storage for Elasticsearch, Kafka, and MySQL, you need to update the `datahub-prerequisites` chart. Follow these steps:

1. **Locate the `values.yaml` File**: This is the values file you supply to the `datahub-prerequisites` chart; it contains the configuration for Elasticsearch, Kafka, and MySQL.

2. **Update Storage Configuration**:
- **Elasticsearch**: Modify the `volumeClaimTemplate.resources.requests.storage` value.
```yaml
elasticsearch:
volumeClaimTemplate:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 20Gi # Increase this to the desired size
```
- **Kafka**: Modify the `broker.persistence.size` value.
```yaml
kafka:
broker:
persistence:
size: 16Gi # Increase this to the desired size
```
- **MySQL**: Modify the `persistence.size` value.
```yaml
mysql:
persistence:
size: 10Gi # Increase this to the desired size
```

3. **Apply the Changes**: Upgrade your `datahub-prerequisites` release with Helm so the new values take effect.
```bash
helm upgrade <prerequisites-release-name> datahub/datahub-prerequisites -f values.yaml
```

4. **Verify the Changes**: Check the status of the PVCs and pods to ensure they are running with the updated storage size.
```bash
kubectl get pvc
kubectl get pods
```

5. **Caution**: If your Kubernetes environment does not support dynamic resizing of PVCs, you may need to delete the existing PVCs and let them be recreated at the updated size when the chart is re-applied. This process can lead to data loss if not handled carefully, so ensure you have backups or a plan to migrate data.
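
If in-place expansion is not available, the recreate path generally means removing the old claims and letting the chart bring them back at the new size. The sketch below uses Elasticsearch as an example and assumes the StatefulSet and PVC names produced by the default prerequisites chart (`elasticsearch-master`); check the actual names with `kubectl get statefulset,pvc` first, and only proceed once your data is backed up:

```bash
# Delete the StatefulSet (and its pods); its PVCs are left behind by default.
kubectl delete statefulset elasticsearch-master

# Remove the old claim -- this discards its data unless the underlying
# volume's reclaim policy retains it.
kubectl delete pvc elasticsearch-master-elasticsearch-master-0

# Re-apply the chart so the StatefulSet and a new, larger PVC are recreated.
helm upgrade <prerequisites-release-name> datahub/datahub-prerequisites -f values.yaml
```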

Run `kubectl get pods` to check whether all the pods for the dependencies are running. You should get a result similar
to below.
