From dedb38f5741d230254f940fc2b7862a759862451 Mon Sep 17 00:00:00 2001
From: kpwrennbu
Date: Thu, 26 Feb 2026 17:10:11 -0500
Subject: [PATCH] add Apache Kafka on OpenShift documentation under Other Tools
 On NERC

Co-Authored-By: Claude Sonnet 4.6
---
 docs/other-tools/kafka/kafka-on-openshift.md | 353 +++++++++++++++++++
 mkdocs.yaml                                  |   1 +
 2 files changed, 354 insertions(+)
 create mode 100644 docs/other-tools/kafka/kafka-on-openshift.md

diff --git a/docs/other-tools/kafka/kafka-on-openshift.md b/docs/other-tools/kafka/kafka-on-openshift.md
new file mode 100644
index 00000000..bfd195a7
--- /dev/null
+++ b/docs/other-tools/kafka/kafka-on-openshift.md
@@ -0,0 +1,353 @@
# Apache Kafka on NERC OpenShift

## Apache Kafka Overview

[Apache Kafka](https://kafka.apache.org/) is a distributed event streaming
platform capable of handling trillions of events per day. Originally developed
at LinkedIn and open-sourced in 2011, Kafka is designed for high-throughput,
fault-tolerant, and scalable real-time data pipelines and streaming
applications.

Kafka uses a **publish-subscribe** model organized around the following core
concepts:

- **Broker**: A Kafka server that stores and serves messages.
- **Topic**: A named stream to which producers publish records and from which
  consumers read records.
- **Partition**: Topics are split into partitions for parallelism and fault
  tolerance.
- **Producer**: A client application that publishes records to one or more
  topics.
- **Consumer**: A client application that subscribes to topics and processes
  records.
- **Consumer Group**: A group of consumers that collectively consume a topic,
  with each partition assigned to exactly one member.

Running Kafka on [NERC OpenShift](https://nerc-project.github.io/nerc-docs/openshift/)
is the recommended approach for course workloads requiring persistent, scalable
message streaming.
This guide uses the **[Strimzi Operator](https://strimzi.io/)**, which is the
standard Kubernetes-native method for deploying Kafka on OpenShift.

## Prerequisites

Before proceeding, ensure you have:

- Access to a [NERC OpenShift project](https://nerc-project.github.io/nerc-docs/openshift/logging-in/access-the-openshift-web-console/)
- The `oc` CLI installed and authenticated to the NERC OpenShift cluster
- Sufficient quota in your project (at least 3 vCPUs and 6 GiB of memory are
  recommended for a minimal Kafka cluster)

!!! note "Checking Your Quota"

    You can view your project's resource quota by running:

    ```sh
    oc describe quota -n <your-namespace>
    ```

    If you need additional resources, contact your project PI or NERC support.

## Deploy Kafka Using the Strimzi Operator

Strimzi provides a Kubernetes Operator that manages the full lifecycle of Kafka
clusters on OpenShift. On NERC OpenShift, you will install Strimzi into your
own namespace.

### Install the Strimzi Operator

- Log in to the NERC OpenShift cluster and switch to your project namespace:

    ```sh
    oc login <cluster-api-url>
    oc project <your-namespace>
    ```

    For example:

    ```sh
    oc login https://api.shift.nerc.mghpcc.org:6443
    oc project ds551-kafka
    ```

- Download the Strimzi installation YAML files. Always check the
  [Strimzi releases page](https://github.com/strimzi/strimzi-kafka-operator/releases)
  for the latest version:

    ```sh
    STRIMZI_VERSION="0.45.0"
    wget https://github.com/strimzi/strimzi-kafka-operator/releases/download/${STRIMZI_VERSION}/strimzi-${STRIMZI_VERSION}.tar.gz
    tar -xzf strimzi-${STRIMZI_VERSION}.tar.gz
    cd strimzi-${STRIMZI_VERSION}
    ```

    !!! warning "Very Important Note"

        Check the [Strimzi compatibility matrix](https://strimzi.io/downloads/)
        to confirm that the Strimzi version supports the Kafka version and the
        Kubernetes/OpenShift version running on NERC. Mismatched versions can
        prevent the operator from starting.
- Update the installation files to use your project namespace. Replace all
  occurrences of the default `myproject` namespace with your actual namespace:

    ```sh
    sed -i 's/namespace: .*/namespace: <your-namespace>/' install/cluster-operator/*RoleBinding*.yaml
    ```

    For example:

    ```sh
    sed -i 's/namespace: .*/namespace: ds551-kafka/' install/cluster-operator/*RoleBinding*.yaml
    ```

- Apply the Strimzi Cluster Operator installation files:

    ```sh
    oc apply -f install/cluster-operator/ -n <your-namespace>
    ```

- Verify that the operator pod is running:

    ```sh
    oc get pods -n <your-namespace> -l name=strimzi-cluster-operator
    ```

    The output should look similar to:

    ```
    NAME                                        READY   STATUS    RESTARTS   AGE
    strimzi-cluster-operator-7d96bf8c59-kfzwp   1/1     Running   0          45s
    ```

    !!! note "Note"

        It may take 1–2 minutes for the operator pod to reach `Running` status.

### Create a Kafka Cluster

Once the Strimzi Operator is running, you can deploy a Kafka cluster by
creating a `Kafka` custom resource.

- Create a file named `kafka-cluster.yaml` with the following minimal cluster
  definition. It deploys a single-broker Kafka cluster with 3 ZooKeeper
  replicas and ephemeral (non-persistent) storage, suitable for development
  and testing:

    ```yaml
    apiVersion: kafka.strimzi.io/v1beta2
    kind: Kafka
    metadata:
      name: my-cluster
      namespace: <your-namespace>
    spec:
      kafka:
        version: 3.8.0
        replicas: 1
        listeners:
          - name: plain
            port: 9092
            type: internal
            tls: false
          - name: tls
            port: 9093
            type: internal
            tls: true
        config:
          offsets.topic.replication.factor: 1
          transaction.state.log.replication.factor: 1
          transaction.state.log.min.isr: 1
          default.replication.factor: 1
          min.insync.replicas: 1
        storage:
          type: ephemeral
      zookeeper:
        replicas: 3
        storage:
          type: ephemeral
      entityOperator:
        topicOperator: {}
        userOperator: {}
    ```

    !!! warning "Very Important Note"

        This configuration uses `ephemeral` storage, meaning all Kafka data
        will be lost if pods restart.
        For persistent storage across pod restarts, change the `storage` type
        to `persistent-claim` and specify a `size` and `storageClass`. See the
        [Strimzi storage documentation](https://strimzi.io/docs/operators/latest/full/deploying.html#type-EphemeralStorage-reference)
        for details.

- Apply the Kafka cluster definition:

    ```sh
    oc apply -f kafka-cluster.yaml -n <your-namespace>
    ```

- Watch the cluster come up. It may take 3–5 minutes for all pods to reach
  `Running` status:

    ```sh
    oc get pods -n <your-namespace> -l strimzi.io/cluster=my-cluster -w
    ```

    A healthy cluster shows output similar to:

    ```
    NAME                                          READY   STATUS    RESTARTS   AGE
    my-cluster-entity-operator-6d7f9c7d4b-xqtlp   2/2     Running   0          2m
    my-cluster-kafka-0                            1/1     Running   0          3m
    my-cluster-zookeeper-0                        1/1     Running   0          4m
    my-cluster-zookeeper-1                        1/1     Running   0          4m
    my-cluster-zookeeper-2                        1/1     Running   0          4m
    ```

### Create a Kafka Topic

- Create a file named `kafka-topic.yaml`:

    ```yaml
    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaTopic
    metadata:
      name: my-topic
      namespace: <your-namespace>
      labels:
        strimzi.io/cluster: my-cluster
    spec:
      partitions: 3
      replicas: 1
      config:
        retention.ms: 7200000       # 2 hours
        segment.bytes: 1073741824   # 1 GiB
    ```

- Apply the topic:

    ```sh
    oc apply -f kafka-topic.yaml -n <your-namespace>
    ```

- Verify that the topic was created:

    ```sh
    oc get kafkatopic my-topic -n <your-namespace>
    ```

    Expected output:

    ```
    NAME       CLUSTER      PARTITIONS   REPLICATION FACTOR   READY
    my-topic   my-cluster   3            1                    True
    ```

## Test the Kafka Cluster

Strimzi's Kafka container image ships with console producer and consumer
clients you can use to verify your cluster is working correctly.
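The topic created above has three partitions. A record's key determines which
partition it lands in, and all records with the same key stay in the same
partition, which is what gives Kafka its per-key ordering guarantee. The sketch
below is an illustration of the idea only: the `choose_partition` helper is
hypothetical (Kafka's Java client actually hashes serialized keys with
murmur2), but any stable hash shows the same property.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    # Stand-in hash for illustration; real Kafka clients use murmur2.
    # The same key always hashes to the same partition number.
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 3  # matches the my-topic definition above

for key in [b"sensor-1", b"sensor-2", b"sensor-1"]:
    print(key.decode(), "->", choose_partition(key, NUM_PARTITIONS))
```

Because both `sensor-1` records map to the same partition, a consumer reading
that partition sees them in publish order.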
### Run a Producer

- Start an interactive producer pod that publishes messages to `my-topic`:

    ```sh
    oc run kafka-producer -ti \
      --image=quay.io/strimzi/kafka:0.45.0-kafka-3.8.0 \
      --rm=true --restart=Never \
      -- bin/kafka-console-producer.sh \
      --bootstrap-server my-cluster-kafka-bootstrap:9092 \
      --topic my-topic
    ```

- Type a few test messages at the prompt, pressing `Enter` after each:

    ```
    > Hello from NERC OpenShift!
    > This is a Kafka test message.
    ```

    Press `Ctrl+C` to stop the producer.

### Run a Consumer

- In a separate terminal, start a consumer pod to read messages from the
  beginning of the topic:

    ```sh
    oc run kafka-consumer -ti \
      --image=quay.io/strimzi/kafka:0.45.0-kafka-3.8.0 \
      --rm=true --restart=Never \
      -- bin/kafka-console-consumer.sh \
      --bootstrap-server my-cluster-kafka-bootstrap:9092 \
      --topic my-topic \
      --from-beginning
    ```

    You should see the messages published by the producer:

    ```
    Hello from NERC OpenShift!
    This is a Kafka test message.
    ```

    Press `Ctrl+C` to stop the consumer.

    !!! tip "Consumer Groups"

        To simulate multiple consumers sharing a topic's workload, add the
        flag `--group <group-name>` to the consumer command. Each consumer in
        the same group receives messages from a distinct subset of partitions.
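The consumer-group behavior described in the tip above can be sketched in a
few lines: with three partitions and two consumers in one group, each
partition is owned by exactly one member. The `assign` function below is a
round-robin-style illustration only, not the broker's actual algorithm (real
Kafka supports several assignor strategies, such as range and sticky):

```python
def assign(partitions: list[int], members: list[str]) -> dict[str, list[int]]:
    # Each partition is owned by exactly one consumer in the group;
    # here partitions are dealt out to members in round-robin order.
    assignment: dict[str, list[int]] = {m: [] for m in members}
    for i, partition in enumerate(partitions):
        assignment[members[i % len(members)]].append(partition)
    return assignment


print(assign([0, 1, 2], ["consumer-a", "consumer-b"]))
# → {'consumer-a': [0, 2], 'consumer-b': [1]}
```

Adding a third consumer to the group would give each member one partition;
adding a fourth would leave one consumer idle, since `my-topic` has only three
partitions.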
## Connecting Applications to Kafka

Applications running inside the same OpenShift project can reach the Kafka
broker using the internal bootstrap address:

```
my-cluster-kafka-bootstrap:9092   # plaintext (no TLS)
my-cluster-kafka-bootstrap:9093   # TLS
```

For Python applications, use the [kafka-python](https://kafka-python.readthedocs.io/)
or [confluent-kafka](https://docs.confluent.io/kafka-clients/python/current/overview.html)
client libraries:

```python
from kafka import KafkaProducer, KafkaConsumer

# Producer example
producer = KafkaProducer(bootstrap_servers='my-cluster-kafka-bootstrap:9092')
producer.send('my-topic', b'Hello from Python!')
producer.flush()

# Consumer example
consumer = KafkaConsumer(
    'my-topic',
    bootstrap_servers='my-cluster-kafka-bootstrap:9092',
    auto_offset_reset='earliest',
    group_id='my-group'
)
for msg in consumer:
    print(f"Received: {msg.value.decode()}")
```

!!! note "Note"

    The bootstrap address `my-cluster-kafka-bootstrap` is an OpenShift Service
    created automatically by Strimzi. It is only reachable from within the
    same project namespace. If you need external access, configure a `route`
    or `loadbalancer` type listener in the Kafka CR.

## Clean Up Resources

When you are finished, remove all Kafka resources to free up project quota:

```sh
# Delete the Kafka topic
oc delete kafkatopic my-topic -n <your-namespace>

# Delete the Kafka cluster (also removes the ZooKeeper and Entity Operator pods)
oc delete kafka my-cluster -n <your-namespace>

# Remove the Strimzi Operator
oc delete -f install/cluster-operator/ -n <your-namespace>
```

!!! danger "Very Important Note"

    Deleting a Kafka cluster that uses ephemeral storage permanently destroys
    all messages stored in that cluster. Make sure you have consumed or
    exported any data you need before running these commands.
\ No newline at end of file
diff --git a/mkdocs.yaml b/mkdocs.yaml
index 48ad2b4d..da737a17 100644
--- a/mkdocs.yaml
+++ b/mkdocs.yaml
@@ -181,6 +181,7 @@ nav:
     - RStudio: other-tools/R/rstudio-server/running-rstudio-on-NERC.md
     - R Shiny: other-tools/R/r-shiny-server/running-rshiny-on-NERC.md
     - Apache Spark: other-tools/apache-spark/spark.md
+    - Apache Kafka on OpenShift: other-tools/kafka/kafka-on-openshift.md
     - MLflow:
         - MLflow Overview: other-tools/mlflow/mlflow-overview.md
         - MLflow Server Setup: other-tools/mlflow/mlflow-server-setup.md