This is referred to as the "NeSI internal Helm chart" in the release notes.
**Check the tags for the version number.**
This is an updated version of the Gen3 Helm chart, used to deploy the application on the OpenStack platform.
It is based on gen3 Helm chart version gen3-0.2.3:
https://github.com/uc-cdis/gen3-helm
The purpose of this example is to deploy Gen3 on the OpenStack platform using services native to OpenStack. The deployment of the Helm chart on OpenStack is still evolving; we will share changes as they occur.
The example OpenStack deployment uses:

- External PostgreSQL database (deployed in the same k8s cluster)
- Open-source local Elasticsearch (via the sub-chart master-chart/helm/aws-es-proxy)
  - image: docker.elastic.co/elasticsearch/elasticsearch-oss
  - tag: 7.10.2
- The following services are deployed by this Helm chart:
  - microservices: arborist, elasticsearch (aws-es-proxy), fence, guppy, indexd, manifestservice, metadata, peregrine, portal, revproxy, sheepdog
  - cronjob: etl-cronjob
To better manage and gain insight into the configuration from a single place, we placed the important configuration values in the customized values.yaml file, which overrides the corresponding values in the gen3 chart and its subcharts.
These values include images, resources, and PostgreSQL settings for each subchart.
Key configurations, such as Fence client settings, are also in gen3/values.yaml.
Note: To keep the file size manageable, some subchart configurations remain in their respective values.yaml files. Comments in gen3/values.yaml point to them; be sure to review them.
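As an illustration, a minimal sketch of the kind of per-subchart override the customized values.yaml carries; the image repository, tag, resource figures, and postgres keys below are placeholders, not the values actually used in this deployment:

```yaml
# values.yaml (excerpt) -- illustrative placeholders only
fence:
  image:
    repository: quay.io/cdis/fence    # placeholder image
    tag: "x.y.z"                      # placeholder tag
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
  postgres:
    host: postgres-postgresql.postgres.svc.cluster.local
    dbCreate: false                   # databases are created beforehand
```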
Resolved an issue where, with dbCreate set to false, the dbcreated key was missing from the secrets arborist-dbcreds, fence-dbcreds, indexd-dbcreds, peregrine-dbcreds, and sheepdog-dbcreds.
helm/common/templates/_db_setup_job.tpl now correctly checks whether dbCreate = false in the subcharts and creates the key dbcreated: true in the above secrets.
The Kubernetes secrets (e.g., fence-dbcreds, sheepdog-dbcreds) must include a dbcreated: true key to prevent pod crashes.
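To confirm the key is present after a deployment, a quick check (the namespace is a placeholder); the decoded value should be true:

```bash
kubectl get secret fence-dbcreds -n <your-namespace> \
  -o jsonpath='{.data.dbcreated}' | base64 -d
```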
Updated helm/common/templates/_netpolicy_templates.tpl to open port 5432 for PostgreSQL communication.
This ensures that pods can connect to the PostgreSQL database without being blocked by the network policies.
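For illustration only, an egress rule of the kind the updated template renders; the policy name and pod selector below are placeholders, not the template's actual output:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-postgres-egress   # placeholder name
spec:
  podSelector: {}               # placeholder selector
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: TCP
          port: 5432            # PostgreSQL
```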
Added gen3job to both the ingressApps and egressApps sections in the netPolicy configuration.
This is necessary to allow the ETL job to communicate with the Elasticsearch pod; without it, the ETL job fails due to network policy restrictions.
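A sketch of what this looks like in the customized values.yaml, assuming the subchart exposes netPolicy.ingressApps and netPolicy.egressApps lists as shown in the notes further below:

```yaml
aws-es-proxy:
  netPolicy:
    ingressApps:
      - gen3job    # allow the ETL job to reach Elasticsearch
    egressApps:
      - gen3job
```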
These URLs are expected by subcharts such as portal and peregrine, which read them from the configmap manifest-global.
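To verify what the subcharts will see, you can inspect the rendered configmap (the namespace is a placeholder):

```bash
kubectl get configmap manifest-global -n <your-namespace> -o yaml
```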
(1) Create a secret regcred to store your Docker credentials in the k8s cluster:

    kubectl create secret docker-registry regcred \
      --docker-server=docker.io \
      --docker-username=xxxx \
      --docker-password=xxxx \
      -n <your-namespace>

(2) Add the configuration to the subchart in the following format:

    portal:
      imagePullSecrets:
        - name: "regcred"

(3) For metadata, you need to add the following code to helm/metadata/templates/deployment.yaml:

    {{- with .Values.imagePullSecrets }}
    imagePullSecrets:
    {{- toYaml . | nindent 8 }}
    {{- end }}

For better management and maintenance, we separate etlmapping and useryaml into individual files stored in the folders ./etlmapping and ./useryaml.
The values in these files are applied to the Gen3 instance after the pipeline runs.
We extract the portal customization values into individual files in the ./portal/defaults folder.
Users can maintain gitops.json, gitops.css, gitops-logo.png, and gitops-favicon.ico, and then run the pipeline; the changes will be applied to the portal.
- Admin access to a Kubernetes cluster (we have tested with CAPI 1.31)
  - the 2 control nodes use a flavor with 8 CPU, 16 GB RAM
  - the 3 worker nodes use a flavor with 16 CPU, 32 GB RAM
- Packages on the admin client (kubectl, helm, and the psql client)
- Access to a GitLab repo
- Deploy PostgreSQL to the k8s cluster via a Helm chart
  namespace: postgres
  internal url: postgres-postgresql.postgres.svc.cluster.local
  The recommended minimal resources for the PostgreSQL deployment:

      primary:
        resources:
          requests:
            cpu: 500m
            memory: 1024Mi
          limits:
            cpu: 2
            memory: 4096Mi
        persistence:
          size: 20Gi
        persistentVolumeClaimRetentionPolicy:
          enabled: true
          whenScaled: Retain
          whenDeleted: Retain
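  e.g., a minimal sketch of installing the PostgreSQL Helm chart with the Bitnami chart (the chart choice and the values file name are assumptions; the release name and namespace are chosen to match the internal URL above):

  ```bash
  helm repo add bitnami https://charts.bitnami.com/bitnami
  helm repo update
  # Release "postgres" in namespace "postgres" yields the service
  # postgres-postgresql.postgres.svc.cluster.local
  helm install postgres bitnami/postgresql -n postgres --create-namespace \
    -f postgres-values.yaml   # assumed values file holding the settings above
  ```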
- DB Creation
  Before deploying Gen3, create the following databases in PostgreSQL:
  arborist_db, fence_db, indexd_db, metadata_db, sheepdog_db
- User Creation
  Create the following database users in PostgreSQL:
  arborist_user, fence_user, indexd_user, sheepdog_user
- Permissions
  Grant all privileges on each database to the corresponding user.
  For sheepdog_db and metadata_db, grant privileges to sheepdog_user.
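  A sketch of the three steps above via the psql client (the password is a placeholder; repeat the pattern for each database/user pair):

  ```bash
  # Run as the postgres superuser against the internal service
  psql -h postgres-postgresql.postgres.svc.cluster.local -U postgres \
    -c "CREATE DATABASE fence_db;" \
    -c "CREATE USER fence_user WITH PASSWORD 'change-me';" \
    -c "GRANT ALL PRIVILEGES ON DATABASE fence_db TO fence_user;" \
    -c "GRANT ALL PRIVILEGES ON DATABASE metadata_db TO sheepdog_user;"
  # Repeat for arborist, indexd, and sheepdog.
  ```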
- Deploy nginx-ingress to handle ingress traffic (not included in the Helm chart)
- Create an SSL certificate for your URLs
- Create a TLS secret in k8s based on the SSL certificate
- Create and apply the ingress application (see the sketch below)
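A minimal sketch of the last two steps, assuming the nginx ingress class and that the Gen3 reverse proxy is exposed by a service named revproxy-service on port 80; the hostname, secret name, and file paths are placeholders:

```bash
# Create the TLS secret from your certificate and key (placeholder paths)
kubectl create secret tls gen3-tls \
  --cert=fullchain.pem --key=privkey.pem -n <your-namespace>
```

```yaml
# gen3-ingress.yaml -- illustrative only; apply with kubectl apply -f
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gen3-ingress
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - gen3.example.org
      secretName: gen3-tls
  rules:
    - host: gen3.example.org
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: revproxy-service   # assumed revproxy service name
                port:
                  number: 80             # assumed service port
```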
It is recommended to use a GitLab pipeline for deployment and maintenance. The following components are required in the Kubernetes cluster:
- gitlab agent
- gitlab runner
- certificate manager
- velero (k8s backup and restore)
- The example customized values.yaml file is at ./values.yaml
- Resource requirements for portal: the memory request should be at least 4Gi, otherwise the portal pod sometimes raises errors while compiling. The CPU request should be at least 1, otherwise the pod takes too long to start. (See the values sketch after this list.)
- imagePullSecrets is added to master-chart/helm/metadata/templates/deployment.yaml, for downloading the customized metadata image from Docker Hub.
- The env var discovery.type is added to master-chart/helm/aws-es-proxy/templates/deployment.yaml, for the Elasticsearch node configuration.
- Resource requirements for Elasticsearch: the aws-es-proxy subchart here deploys a local Elasticsearch; the recommended minimal request is 4Gi memory and 1 CPU. (See the values sketch after this list.)
- If global.netPolicy = true is set, some netPolicies must be configured to allow the necessary traffic, e.g.:

      peregrine:
        netPolicy:
          ingressApps:
            - fence
      aws-es-proxy:
        netPolicy:
          ingressApps:
            ...
            - gen3job
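An illustrative excerpt of the resource requests mentioned above, assuming the portal and aws-es-proxy subcharts follow the standard resources layout (the exact keys may differ in your values.yaml):

```yaml
# values.yaml (excerpt) -- resource requests discussed above
portal:
  resources:
    requests:
      cpu: "1"
      memory: 4Gi
aws-es-proxy:
  resources:
    requests:
      cpu: "1"
      memory: 4Gi
```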
Create CI/CD variables to store the secrets (such as DB passwords and client secrets), and reference them in the deployment pipeline.
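As an illustration, a fragment of a deploy job that passes masked CI/CD variables to helm; the variable names, the chart path, and the values key are placeholders, not the chart's actual layout:

```yaml
deploy:
  stage: deploy
  script:
    # FENCE_DB_PASSWORD and NAMESPACE are GitLab CI/CD variables (placeholder names)
    - helm upgrade --install gen3 ./helm/gen3 -n "$NAMESPACE" -f values.yaml --set fence.postgres.password="$FENCE_DB_PASSWORD"
```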
The ETL job runs every 30 minutes. If you want to trigger it manually, run the following command:

    kubectl create job --from=cronjob/etl-cronjob etl-cronjob-manual -n <your-namespace>
We recommend deploying Alloy and Prometheus so that logs and metrics are sent to a central monitoring system such as Grafana, Mimir, or Loki.
A sample gitlab-ci.yaml file is provided for deploying Gen3 via a GitLab pipeline at ./.gitlab-ci-ex.yml.
Use this as a reference for configuring your deployment pipeline.
More pipelines can be created to implement maintenance work such as updating users or updating the etlmapping.
helm/aws-es-proxy/templates/deployment.yaml
helm/common/templates/_db_setup_job.tpl
helm/common/templates/_netpolicy_templates.tpl
helm/gen3/templates/global-manifest.yaml
helm/metadata/templates/deployment.yaml
helm/portal/values.yaml