Skip to content

feat: enhance observability-tracing-opensearch chart with multi-cluster installation modes#59

Merged
nilushancosta merged 1 commit intoopenchoreo:mainfrom
akila-i:multi-cluster-otel
Mar 19, 2026
Merged

feat: enhance observability-tracing-opensearch chart with multi-cluster installation modes#59
nilushancosta merged 1 commit intoopenchoreo:mainfrom
akila-i:multi-cluster-otel

Conversation

@akila-i
Copy link
Copy Markdown
Contributor

@akila-i akila-i commented Mar 19, 2026

Purpose

Related: openchoreo/openchoreo#2815
This pull request introduces support for multi-cluster installation modes for the OpenChoreo Observability Tracing Helm chart, allowing deployments in both single-cluster and multi-cluster topologies. It adds configuration validation, updates documentation, and improves secret handling for OpenSearch credentials. The changes also update resource names and enhance flexibility for exporter deployments.

Multi-cluster support and configuration changes:

  • Added support for three installation modes (singleCluster, multiClusterReceiver, multiClusterExporter) with corresponding configuration and documentation updates in README.md, values.yaml, and configMap.yaml. [1] [2] [3] [4] [5] [6] [7] [8]

Validation and error handling:

  • Introduced a Helm template helper for validating the global.installationMode and required fields, with a new validate.yaml template to enforce configuration correctness at install time. [1] [2]

OpenSearch credential and resource improvements:

  • Updated OpenSearch and OpenTelemetry Collector environment variable handling to source credentials from Kubernetes secrets instead of hardcoded values, and renamed the collector config map for clarity. [1] [2]

Networking and service configuration:

  • Added a new httproute.yaml template to expose the OpenTelemetry Collector OTLP endpoints in multi-cluster receiver mode, and changed the collector service type from LoadBalancer to ClusterIP for improved cluster compatibility. [1] [2]

Version bump:

  • Updated the chart and app versions from 0.3.7 to 0.3.8 to reflect the new features and changes.

Goals

Describe the solutions that this feature/fix will introduce to resolve the problems described above

Approach

  • Added support for multiClusterReceiver and multiClusterExporter installation modes in README.md.
  • Updated Chart.yaml version to 0.3.8.
  • Modified values.yaml to use secret references for OpenSearch credentials.
  • Introduced validation logic for installation modes in templates.
  • Updated ConfigMap and HTTPRoute configurations to accommodate multi-cluster setups.

User stories

Summary of user stories addressed by this change>

Release note

Brief description of the new feature or bug fix as it will appear in the release notes

Documentation

Link(s) to product documentation that addresses the changes of this PR. If no doc impact, enter “N/A” plus brief explanation of why there’s no doc impact

Training

Link to the PR for changes to the training content in https://github.com/wso2/WSO2-Training, if applicable

Certification

Type “Sent” when you have provided new/updated certification questions, plus four answers for each question (correct answer highlighted in bold), based on this change. Certification questions/answers should be sent to certification@wso2.com and NOT pasted in this PR. If there is no impact on certification exams, type “N/A” and explain why.

Marketing

Link to drafts of marketing content that will describe and promote this feature, including product page changes, technical articles, blog posts, videos, etc., if applicable

Automation tests

  • Unit tests

    Code coverage information

  • Integration tests

    Details about the test cases and coverage

Security checks

Samples

Provide high-level details about the samples related to this feature

Related PRs

List any other related PRs

Migrations (if applicable)

Describe migration steps and platforms on which migration has been tested

Test environment

List all JDK versions, operating systems, databases, and browser/versions on which this feature/fix was tested

Learning

Describe the research phase and any blog posts, patterns, libraries, or add-ons you used to solve the problem.

Summary by CodeRabbit

  • Documentation

    • Added detailed single-cluster and multi-cluster topology guides with step-by-step Helm install procedures and explicit installationMode usage notes.
  • New Features

    • Introduced distinct single-cluster / multi-cluster receiver/exporter modes, OTLP forwarding option for multi-cluster, and render-time validation of installationMode and required fields.
  • Chores

    • Bumped chart and app version to 0.3.9 and updated example install commands.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ea07f128-89c4-4f17-b86b-466238a146a7

📥 Commits

Reviewing files that changed from the base of the PR and between 618bb45 and 993781b.

📒 Files selected for processing (7)
  • observability-tracing-opensearch/README.md
  • observability-tracing-opensearch/helm/Chart.yaml
  • observability-tracing-opensearch/helm/templates/_helpers.tpl
  • observability-tracing-opensearch/helm/templates/opentelemetry-collector/configMap.yaml
  • observability-tracing-opensearch/helm/templates/opentelemetry-collector/httproute.yaml
  • observability-tracing-opensearch/helm/templates/validate.yaml
  • observability-tracing-opensearch/helm/values.yaml
✅ Files skipped from review due to trivial changes (2)
  • observability-tracing-opensearch/helm/Chart.yaml
  • observability-tracing-opensearch/README.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • observability-tracing-opensearch/helm/templates/_helpers.tpl
  • observability-tracing-opensearch/helm/values.yaml

📝 Walkthrough

Walkthrough

Adds multi-topology installation modes to the observability-tracing-opensearch Helm chart via global.installationMode (singleCluster, multiClusterReceiver, multiClusterExporter); introduces render-time validation, conditional OTLP exporter and HTTPRoute for receivers, secret-sourced OpenSearch credentials, and bumps chart version to 0.3.9.

Changes

Cohort / File(s) Summary
Chart metadata
observability-tracing-opensearch/helm/Chart.yaml
Bumped Helm chart version and appVersion from 0.3.80.3.9.
Documentation
observability-tracing-opensearch/README.md
Documented global.installationMode (singleCluster, multiClusterReceiver, multiClusterExporter), added single-/multi-cluster instructions and Helm install examples (chart refs updated to 0.3.9).
Validation helper & invocation
observability-tracing-opensearch/helm/templates/_helpers.tpl, observability-tracing-opensearch/helm/templates/validate.yaml
Added observability-tracing-opensearch.validate helper to enforce allowed installationMode values and require opentelemetryCollectorCustomizations.http.observabilityPlaneUrl when multiClusterExporter; added template to invoke validation at render time.
OTel Collector config
observability-tracing-opensearch/helm/templates/opentelemetry-collector/configMap.yaml
Renamed ConfigMap name; conditionalized opensearch exporter and basicauth extension for singleCluster/multiClusterReceiver; added conditional otlp exporter for multiClusterExporter targeting observabilityPlaneUrl (optional virtual host header); adjusted processors, service.extensions, and pipeline exporters per installationMode.
Receiver ingress
observability-tracing-opensearch/helm/templates/opentelemetry-collector/httproute.yaml
New conditional Gateway API HTTPRoute for multiClusterReceiver exposing gRPC (4317) and HTTP (4318) with optional hostnames.
Values & envs
observability-tracing-opensearch/helm/values.yaml
Added opentelemetryCollectorCustomizations.http keys (observabilityPlaneUrl, observabilityPlaneVirtualHost, hostnames); moved OpenSearch admin creds to opensearch-admin-credentials secret (valueFrom); added OPENSEARCH_USERNAME/PASSWORD collector envs; changed collector service type to ClusterIP; updated configMap existingName to opentelemetry-collector.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Dataplane as Dataplane App
    participant Exporter as OTel Collector\n(Exporter cluster)
    participant Receiver as OTel Collector\n(Receiver cluster)
    participant Gateway as K8s Gateway / HTTPRoute
    participant OpenSearch as OpenSearch

    Dataplane->>Exporter: send traces (OTLP)
    Exporter->>Receiver: forward OTLP to observabilityPlaneUrl
    Receiver->>Gateway: receives gRPC/HTTP (via HTTPRoute)
    Receiver->>OpenSearch: export traces (singleCluster or multiClusterReceiver)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • LakshanSS
  • nilushancosta

Poem

🐰 I hopped through charts with careful mind,
Split clusters, routes, and secrets I did find,
Receivers wait, exporters send on line,
Helm nudged forward to 0.3.9,
Traces bloom like carrots in sunshine!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding multi-cluster installation mode support to the observability-tracing-opensearch Helm chart.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@observability-tracing-opensearch/helm/templates/opentelemetry-collector/configMap.yaml`:
- Around line 40-41: The template currently always emits a Host header override
using
.Values.opentelemetryCollectorCustomizations.http.observabilityPlaneVirtualHost
even when it's an empty string; update the Helm template in the ConfigMap so the
headers block includes the Host key only when observabilityPlaneVirtualHost is
non-empty (i.e., add a conditional around the Host line that checks
.Values.opentelemetryCollectorCustomizations.http.observabilityPlaneVirtualHost)
to avoid producing an empty/null header in the collector config.

In
`@observability-tracing-opensearch/helm/templates/opentelemetry-collector/httproute.yaml`:
- Around line 14-17: When rendering the HTTPRoute, omit the entire hostnames
section when the list is empty; change the template around the current range
over .Values.opentelemetryCollectorCustomizations.http.hostnames so it only
emits the "hostnames:" key and the list items when that value is non-empty
(e.g., wrap the existing range block in a conditional that checks the slice is
not empty). Update the template that currently iterates
.Values.opentelemetryCollectorCustomizations.http.hostnames to guard emitting
"hostnames:" unless there are entries to avoid producing a blank hostnames:
field.

In `@observability-tracing-opensearch/helm/values.yaml`:
- Around line 49-59: The OPENSEARCH_USERNAME/OPENSEARCH_PASSWORD secret refs
under extraEnvs are always rendered and cause installs to fail; update the Helm
values/templates so these env entries are only included when
multiClusterExporter is enabled (check the multiClusterExporter flag) or mark
the secretKeyRef as optional (use secretKeyRef.optional: true) for the
OPENSEARCH_USERNAME and OPENSEARCH_PASSWORD entries referencing
opensearch-admin-credentials; modify the template that renders
opentelemetry-collector.extraEnvs (or the values block named extraEnvs) to guard
inclusion on .Values.multiClusterExporter.enabled or add optional: true to the
secretKeyRef objects.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2e767169-590f-45ae-8068-84fcca37c900

📥 Commits

Reviewing files that changed from the base of the PR and between 52bb0e7 and 56d9bb4.

📒 Files selected for processing (7)
  • observability-tracing-opensearch/README.md
  • observability-tracing-opensearch/helm/Chart.yaml
  • observability-tracing-opensearch/helm/templates/_helpers.tpl
  • observability-tracing-opensearch/helm/templates/opentelemetry-collector/configMap.yaml
  • observability-tracing-opensearch/helm/templates/opentelemetry-collector/httproute.yaml
  • observability-tracing-opensearch/helm/templates/validate.yaml
  • observability-tracing-opensearch/helm/values.yaml


This chart supports three `global.installationMode` values:

- **`singleCluster`**: Deploy everything (collector + OpenSearch + OpenSearch setup job) into a single cluster (uses when dataplane and observability plane are in the same cluster).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename it to OpenTelemetry collector. collector seems a bit too generic

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done through 618bb45

@akila-i akila-i force-pushed the multi-cluster-otel branch from 56d9bb4 to 618bb45 Compare March 19, 2026 08:31
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@observability-tracing-opensearch/helm/templates/opentelemetry-collector/configMap.yaml`:
- Around line 35-43: The config uses headers.Host under the otlp block and
tls.insecure, which won't set the HTTP/2 :authority and disables TLS; change the
OTLP exporter configuration to use configgrpc with the authority/server_name set
to
.Values.opentelemetryCollectorCustomizations.http.observabilityPlaneVirtualHost
(i.e., use the configgrpc.authority field or server_name override) and replace
tls.insecure: true with tls.insecure_skip_verify: true so TLS stays enabled but
skips cert validation when needed; update references around otlp, tls and the
observabilityPlaneVirtualHost value and remove the headers.Host entry.

In
`@observability-tracing-opensearch/helm/templates/opentelemetry-collector/httproute.yaml`:
- Around line 14-19: The template omits hostnames when
.Values.opentelemetryCollectorCustomizations.http.hostnames is empty, which
makes the HTTPRoute match all hostnames and become a catch-all with the
PathPrefix: / rule; either require a hostname via values validation for
.Values.opentelemetryCollectorCustomizations.http.hostnames or explicitly bind
the route to an OTLP-only listener by setting sectionName on the HTTPRoute rule
(so the route only applies to the dedicated listener); update the template
generating hostnames (and the similar block at lines 33–40) to either render an
error/validation when hostnames is empty or add a sectionName value (e.g.
.Values.opentelemetryCollectorCustomizations.http.sectionName) to the route rule
so unrelated traffic isn’t accepted on shared/wildcard listeners.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 90b17bd4-adaf-4df4-bb77-964c39ebf732

📥 Commits

Reviewing files that changed from the base of the PR and between 56d9bb4 and 618bb45.

📒 Files selected for processing (7)
  • observability-tracing-opensearch/README.md
  • observability-tracing-opensearch/helm/Chart.yaml
  • observability-tracing-opensearch/helm/templates/_helpers.tpl
  • observability-tracing-opensearch/helm/templates/opentelemetry-collector/configMap.yaml
  • observability-tracing-opensearch/helm/templates/opentelemetry-collector/httproute.yaml
  • observability-tracing-opensearch/helm/templates/validate.yaml
  • observability-tracing-opensearch/helm/values.yaml
✅ Files skipped from review due to trivial changes (2)
  • observability-tracing-opensearch/README.md
  • observability-tracing-opensearch/helm/values.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • observability-tracing-opensearch/helm/Chart.yaml
  • observability-tracing-opensearch/helm/templates/_helpers.tpl

Comment on lines +35 to +43
{{- if (eq .Values.global.installationMode "multiClusterExporter") }}
otlp:
endpoint: {{ .Values.opentelemetryCollectorCustomizations.http.observabilityPlaneUrl }}
tls:
insecure: true # TODO: Enable mTLS between collectors
{{- with .Values.opentelemetryCollectorCustomizations.http.observabilityPlaneVirtualHost }}
headers:
Host: {{ . | quote }}
{{- end }}
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In the OpenTelemetry Collector otlpexporter (gRPC), doesheaders: { Host: ... }change the HTTP/2:authority/ Host used for routing, and what is the exact behavior oftls.insecureversustls.insecure_skip_verify?

💡 Result:

No, headers: { Host: ... } in the OpenTelemetry Collector OTLP gRPC exporter does not change the HTTP/2 :authority / Host header used for routing. The :authority pseudo-header in gRPC over HTTP/2 is derived from the endpoint configuration (host:port), not from gRPC metadata headers set via the 'headers' config. gRPC headers are application-level metadata sent in grpc-metadata, separate from HTTP/2 transport headers like :authority, which is set by the gRPC dial options based on the target endpoint. There is a WithAuthority option in configgrpc to override it, but it is not exposed in the standard OTLP exporter config and requires custom ClientConfig settings. tls.insecure (default false): When true, disables TLS entirely (equivalent to grpc.WithInsecure), forcing plaintext gRPC connections even if the endpoint suggests TLS. Used for non-TLS endpoints. tls.insecure_skip_verify (default false): When true and TLS is enabled (tls.insecure=false), skips server certificate verification (sets tls.Config.InsecureSkipVerify=true), allowing self-signed or invalid certs while keeping encryption. Does not disable TLS. Sources confirm this distinction across HTTP/gRPC exporters and issues: for HTTPS endpoints, insecure=true does not skip verification (use insecure_skip_verify instead); gRPC issues show separate TLS behaviors.

Citations:


Use WithAuthority in configgrpc to set the gRPC authority; the headers.Host approach will not work for hostname routing.

The headers.Host setting only affects gRPC metadata headers, not the HTTP/2 :authority pseudo-header used for transport-level hostname routing. If the observability plane gateway is hostname-constrained, this config will fail. Additionally, tls.insecure: true disables TLS entirely instead of just skipping certificate verification—use tls.insecure_skip_verify: true if the endpoint requires TLS but certs cannot be validated. Properly configure the authority/server-name via configgrpc and the correct TLS settings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@observability-tracing-opensearch/helm/templates/opentelemetry-collector/configMap.yaml`
around lines 35 - 43, The config uses headers.Host under the otlp block and
tls.insecure, which won't set the HTTP/2 :authority and disables TLS; change the
OTLP exporter configuration to use configgrpc with the authority/server_name set
to
.Values.opentelemetryCollectorCustomizations.http.observabilityPlaneVirtualHost
(i.e., use the configgrpc.authority field or server_name override) and replace
tls.insecure: true with tls.insecure_skip_verify: true so TLS stays enabled but
skips cert validation when needed; update references around otlp, tls and the
observabilityPlaneVirtualHost value and remove the headers.Host entry.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will move to TLSRoute in a later iteration

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akila-i, understood! Acknowledged the deferral. Would you like me to open a follow-up GitHub issue to track the migration to TLSRoute (along with fixing the headers.Host → HTTP/2 :authority and tls.insecuretls.insecure_skip_verify concerns) so it's not lost?


✏️ Learnings added
Learnt from: akila-i
Repo: openchoreo/community-modules PR: 59
File: observability-tracing-opensearch/helm/templates/opentelemetry-collector/configMap.yaml:35-43
Timestamp: 2026-03-19T09:09:25.316Z
Learning: In `observability-tracing-opensearch/helm/templates/opentelemetry-collector/configMap.yaml`, the `multiClusterExporter` OTLP exporter currently uses `headers.Host` (which only sets gRPC metadata, not the HTTP/2 `:authority` pseudo-header) and `tls.insecure: true` (which disables TLS entirely). The maintainer (akila-i) has deferred fixing this to a later iteration by migrating to a `TLSRoute`-based approach, where the `configgrpc.authority` field (or `server_name`) and `tls.insecure_skip_verify` will be used properly.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: ruks
Repo: openchoreo/community-modules PR: 22
File: observability-logs-moesif/helm/values.yaml:118-124
Timestamp: 2026-03-12T13:54:17.645Z
Learning: In observability-logs-moesif/helm/values.yaml, the Fluent Bit initContainer set-volume-ownership runs chown -R 10000:10000 /var/lib/fluent-bit/db on a hostPath volume. The previous use of privileged: true was due to capabilites.drop: [ALL] removing CAP_CHOWN. Replace the full privileges with a least-privilege config: keep capabilities.drop: [ALL] but add back only CAP_CHOWN via capabilities.add: ["CHOWN"]. This avoids granting full node privileges while still allowing the required chown operation. Apply this change specifically in the initContainer spec that performs volume ownership, ensuring the container remains non-privileged except for the explicitly required CHOWN capability.

Comment on lines +14 to +19
{{- with .Values.opentelemetryCollectorCustomizations.http.hostnames }}
hostnames:
{{- range . }}
- {{ . | quote }}
{{- end }}
{{- end }}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Kubernetes Gateway API, if an HTTPRouteomitsspec.hostnames, does it match all hostnames allowed by the attached listener?

💡 Result:

Yes, in Kubernetes Gateway API, if an HTTPRoute omits spec.hostnames, it matches all hostnames allowed by the attached listener.

Citations:


Ensure a hostname is configured or bind this route to a dedicated listener.

When .Values.opentelemetryCollectorCustomizations.http.hostnames is empty, the hostnames field is omitted from the HTTPRoute. In Kubernetes Gateway API, this causes the route to match all hostnames allowed by the attached listener. Combined with the PathPrefix: / rule, this creates a catch-all that may accept unrelated traffic on a shared or wildcard listener. Either enforce a required hostname in the values validation, or bind this route to an OTLP-only listener via sectionName.

This also applies to lines 33–40.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@observability-tracing-opensearch/helm/templates/opentelemetry-collector/httproute.yaml`
around lines 14 - 19, The template omits hostnames when
.Values.opentelemetryCollectorCustomizations.http.hostnames is empty, which
makes the HTTPRoute match all hostnames and become a catch-all with the
PathPrefix: / rule; either require a hostname via values validation for
.Values.opentelemetryCollectorCustomizations.http.hostnames or explicitly bind
the route to an OTLP-only listener by setting sectionName on the HTTPRoute rule
(so the route only applies to the dedicated listener); update the template
generating hostnames (and the similar block at lines 33–40) to either render an
error/validation when hostnames is empty or add a sectionName value (e.g.
.Values.opentelemetryCollectorCustomizations.http.sectionName) to the route rule
so unrelated traffic isn’t accepted on shared/wildcard listeners.

…cluster installation modes

- Added support for `multiClusterReceiver` and `multiClusterExporter` installation modes in README.md.
- Updated Chart.yaml version to 0.3.8.
- Modified values.yaml to use secret references for OpenSearch credentials.
- Introduced validation logic for installation modes in templates.
- Updated ConfigMap and HTTPRoute configurations to accommodate multi-cluster setups.

Signed-off-by: Akila-I <akila.99g@gmail.com>
@akila-i akila-i force-pushed the multi-cluster-otel branch from 618bb45 to 993781b Compare March 19, 2026 09:03
@nilushancosta nilushancosta merged commit 834c819 into openchoreo:main Mar 19, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants