feat(csharp): implement TelemetryClientManager (WI-3.2) #172
Draft
jadewang-db wants to merge 17 commits into main
jadewang-db added a commit that referenced this pull request on Jan 23, 2026:
## 🥞 Stacked PR

Use this [link](https://github.com/adbc-drivers/databricks/pull/161/files) to review incremental changes.

- [**stack/wi-1.2-tag-definition-system**](#161) [[Files changed](https://github.com/adbc-drivers/databricks/pull/161/files)]
- [stack/wi-2.1-telemetry-data-models](#162) [[Files changed](https://github.com/adbc-drivers/databricks/pull/162/files/ab7fa964ff62f3fc9884034e17a7e57630fa8037..a566292aec78d19717c92e28f135535b09f25c80)]
- [stack/wi-2.1-exception-classifier](#163) [[Files changed](https://github.com/adbc-drivers/databricks/pull/163/files/a566292aec78d19717c92e28f135535b09f25c80..baa7a2ae32662fddc65272e0264e8bb7d1644716)]
- [stack/wi-3.1-circuit-breaker](#164) [[Files changed](https://github.com/adbc-drivers/databricks/pull/164/files/baa7a2ae32662fddc65272e0264e8bb7d1644716..03f7027e6731efe032c15555afe517ba49de3651)]
- [stack/wi-3.1-feature-flag-cache](#165) [[Files changed](https://github.com/adbc-drivers/databricks/pull/165/files/03f7027e6731efe032c15555afe517ba49de3651..1d6e3d5b1c4c31ec91361337e574e6e5411fbbb6)]
- [stack/wi-3.4-databricks-telemetry-exporter](#166) [[Files changed](https://github.com/adbc-drivers/databricks/pull/166/files/1d6e3d5b1c4c31ec91361337e574e6e5411fbbb6..eb382cb291c120a5f3cc3a1c38e0975b99c1369f)]
- [stack/wi-3.5-metrics-aggregator](#167) [[Files changed](https://github.com/adbc-drivers/databricks/pull/167/files/eb382cb291c120a5f3cc3a1c38e0975b99c1369f..67723fabe6f62d7ed16591c3e88e96aa269daddd)]
- [stack/wi-3.5-circuit-breaker-manager](#168) [[Files changed](https://github.com/adbc-drivers/databricks/pull/168/files/67723fabe6f62d7ed16591c3e88e96aa269daddd..6b66d37e9d97ca621d88c48a58ac60b2487425ea)]
- [stack/e2e-feature-flag-cache-tests](#169) [[Files changed](https://github.com/adbc-drivers/databricks/pull/169/files/6b66d37e9d97ca621d88c48a58ac60b2487425ea..2a6fff2b9b91c7fd6cff7558d1d3b3596c0fa3c2)]
- [stack/databricks-activity-listener](#170) [[Files changed](https://github.com/adbc-drivers/databricks/pull/170/files/2a6fff2b9b91c7fd6cff7558d1d3b3596c0fa3c2..39f6aed55278a533390e9aadf655f80dc11159c2)]
- [stack/circuit-breaker-telemetry-exporter](#171) [[Files changed](https://github.com/adbc-drivers/databricks/pull/171/files/39f6aed55278a533390e9aadf655f80dc11159c2..4473de5ca3cfca8579818e6d58f8a2b12e869a47)]
- [stack/telemetry-client-manager-wi-3.2](#172) [[Files changed](https://github.com/adbc-drivers/databricks/pull/172/files/4473de5ca3cfca8579818e6d58f8a2b12e869a47..94b678636d76a6d41a6612f76d00b4caccdab48a)]
- [stack/telemetry-client-wi-5.5](#173) [[Files changed](https://github.com/adbc-drivers/databricks/pull/173/files/94b678636d76a6d41a6612f76d00b4caccdab48a..ce00998cbd0372d94303ad1d69e9711e4489fe96)]
- [stack/telemetry-client-manager-e2e-wi-7](#174) [[Files changed](https://github.com/adbc-drivers/databricks/pull/174/files/ce00998cbd0372d94303ad1d69e9711e4489fe96..2646e86223ff1e7706b20d5970e556ec2f17867b)]
- [stack/telemetry-client-e2e-tests-wi-7-standalone](#175) [[Files changed](https://github.com/adbc-drivers/databricks/pull/175/files/2646e86223ff1e7706b20d5970e556ec2f17867b..0b9ebd3867250d92d0d8007cb17d6ce471d5560a)]
- [stack/wi-6.1-databricks-connection-telemetry-integration](#176) [[Files changed](https://github.com/adbc-drivers/databricks/pull/176/files/0b9ebd3867250d92d0d8007cb17d6ce471d5560a..4f553284c30eb7efcf67369c58dddd56675cd0be)]
- [stack/wi-6.2-telemetry-tags-driver-activities](#177) [[Files changed](https://github.com/adbc-drivers/databricks/pull/177/files/4f553284c30eb7efcf67369c58dddd56675cd0be..1f7cde0c5642072b06588665b16ee3a30a90d256)]
- [stack/wi-9-full-integration-e2e-tests](#178) [[Files changed](https://github.com/adbc-drivers/databricks/pull/178/files/1f7cde0c5642072b06588665b16ee3a30a90d256..c65e9fea7c65fa456f0114e95c867ee15f21bd87)]

---------

Co-authored-by: Jade Wang <jade.wang+data@databricks.com>
Co-authored-by: Claude <noreply@anthropic.com>
jadewang-db added a commit that referenced this pull request on Jan 23, 2026:
## 🥞 Stacked PR

Use this [link](https://github.com/adbc-drivers/databricks/pull/162/files) to review incremental changes.

- [**stack/wi-2.1-telemetry-data-models**](#162) [[Files changed](https://github.com/adbc-drivers/databricks/pull/162/files)]
- [stack/wi-2.1-exception-classifier](#163) [[Files changed](https://github.com/adbc-drivers/databricks/pull/163/files/1e58d3c3785fa7ec1b83da01f80ddea1f6167851..0dac01831e7d9d313c67dc31e4aacceb17e74298)]
- [stack/wi-3.1-circuit-breaker](#164) [[Files changed](https://github.com/adbc-drivers/databricks/pull/164/files/0dac01831e7d9d313c67dc31e4aacceb17e74298..59b0221cb4c9262d80a35041a2f1098376f6e19e)]
- [stack/wi-3.1-feature-flag-cache](#165) [[Files changed](https://github.com/adbc-drivers/databricks/pull/165/files/59b0221cb4c9262d80a35041a2f1098376f6e19e..8c30fc0649b09bc38e09cfd4d6875d66963ff6c0)]
- [stack/wi-3.4-databricks-telemetry-exporter](#166) [[Files changed](https://github.com/adbc-drivers/databricks/pull/166/files/8c30fc0649b09bc38e09cfd4d6875d66963ff6c0..a6e926c8017e9a3b3b6de31bbbafb367adaba884)]
- [stack/wi-3.5-metrics-aggregator](#167) [[Files changed](https://github.com/adbc-drivers/databricks/pull/167/files/a6e926c8017e9a3b3b6de31bbbafb367adaba884..c53df5d3c0124c490b920e1e1a611dd9c24e02a4)]
- [stack/wi-3.5-circuit-breaker-manager](#168) [[Files changed](https://github.com/adbc-drivers/databricks/pull/168/files/c53df5d3c0124c490b920e1e1a611dd9c24e02a4..de8757a697dd023628011d1aff9961896560bc95)]
- [stack/e2e-feature-flag-cache-tests](#169) [[Files changed](https://github.com/adbc-drivers/databricks/pull/169/files/de8757a697dd023628011d1aff9961896560bc95..0b77f8373958342da429c20f7e30c02105402331)]
- [stack/databricks-activity-listener](#170) [[Files changed](https://github.com/adbc-drivers/databricks/pull/170/files/0b77f8373958342da429c20f7e30c02105402331..9090bdefba63d6c7fbff45bf60c2c63668f3884e)]
- [stack/circuit-breaker-telemetry-exporter](#171) [[Files changed](https://github.com/adbc-drivers/databricks/pull/171/files/9090bdefba63d6c7fbff45bf60c2c63668f3884e..0a0159524a429726078bd7340057672d6927d1cd)]
- [stack/telemetry-client-manager-wi-3.2](#172) [[Files changed](https://github.com/adbc-drivers/databricks/pull/172/files/0a0159524a429726078bd7340057672d6927d1cd..75039c6574c2dc437f5d670e71b938b98719c06f)]
- [stack/telemetry-client-wi-5.5](#173) [[Files changed](https://github.com/adbc-drivers/databricks/pull/173/files/75039c6574c2dc437f5d670e71b938b98719c06f..254cdc75487f3e9344d3df6fb9b9cbf49fd03228)]
- [stack/telemetry-client-manager-e2e-wi-7](#174) [[Files changed](https://github.com/adbc-drivers/databricks/pull/174/files/254cdc75487f3e9344d3df6fb9b9cbf49fd03228..7371da59309d109e8d457f4c27edd13adfa38a2c)]
- [stack/telemetry-client-e2e-tests-wi-7-standalone](#175) [[Files changed](https://github.com/adbc-drivers/databricks/pull/175/files/7371da59309d109e8d457f4c27edd13adfa38a2c..5ff7e96827faa69e8bae1d5b5da06a9f95b91a8c)]
- [stack/wi-6.1-databricks-connection-telemetry-integration](#176) [[Files changed](https://github.com/adbc-drivers/databricks/pull/176/files/5ff7e96827faa69e8bae1d5b5da06a9f95b91a8c..7757345889dbfd0b1dcb22556e2e6c746d7fa0f0)]
- [stack/wi-6.2-telemetry-tags-driver-activities](#177) [[Files changed](https://github.com/adbc-drivers/databricks/pull/177/files/7757345889dbfd0b1dcb22556e2e6c746d7fa0f0..2364122ad5402c9205008f39acaec6a400a4db98)]
- [stack/wi-9-full-integration-e2e-tests](#178) [[Files changed](https://github.com/adbc-drivers/databricks/pull/178/files/2364122ad5402c9205008f39acaec6a400a4db98..698f3ea13f65a17b62385be8e8e4032497f88993)]

---------

Co-authored-by: Jade Wang <jade.wang+data@databricks.com>
Co-authored-by: Claude <noreply@anthropic.com>
Implements per-host feature flag caching with reference counting to avoid repeated API calls and rate limiting.
Key features:
- FeatureFlagContext: Holds cached telemetry enabled state, last fetched timestamp, reference count, and configurable cache duration (default 15 min)
- FeatureFlagCache: Singleton managing per-host contexts with thread-safe ConcurrentDictionary storage
API:
- GetInstance(): Returns the singleton instance
- GetOrCreateContext(host): Creates/returns context and increments RefCount
- ReleaseContext(host): Decrements RefCount, removes context when zero
- IsTelemetryEnabledAsync(): Returns cached value if valid, otherwise fetches
Thread safety ensured via ConcurrentDictionary and Interlocked operations.
Includes 46 comprehensive unit tests covering all exit criteria.
Co-Authored-By: Claude <noreply@anthropic.com>
…(WI-3.1)
Refactored FeatureFlagCache based on updated design doc requirements:
- Moved from Telemetry namespace to root namespace (AdbcDrivers.Databricks) to make it a generic, reusable component
- Added HTTP API integration to fetch flags from /api/2.0/connector-service/feature-flags/OSS_JDBC/{version}
- Implemented background refresh scheduler with server-provided TTL
- Added FeatureFlagsResponse model for API response parsing
- Updated FeatureFlagContext interface:
  - GetFlagValue(string) - get individual flag value
  - GetAllFlags() - get all cached flags as dictionary
  - IsFeatureEnabled(string) - check if flag is "true"
  - Shutdown() - stop background refresh scheduler
  - IDisposable for proper cleanup
- Updated FeatureFlagCache.GetOrCreateContext() to accept HttpClient and driver version parameters
- Updated all unit tests for new interface
Co-Authored-By: Claude <noreply@anthropic.com>
…I-3.1)
Integrated feature flag cache into the connection lifecycle:
- Fetch feature flags from server during connection initialization
- Merge flags into Properties dictionary with proper priority:
  User Properties > Feature Flags > Driver Defaults
- Track host for proper context cleanup on Dispose
- Release feature flag context when connection is disposed
- All feature flag operations are fail-safe (errors logged, not thrown)
The feature flag endpoint used is:
GET /api/2.0/connector-service/feature-flags/OSS_JDBC/{driver_version}
Co-Authored-By: Claude <noreply@anthropic.com>
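The priority order in this commit (User Properties > Feature Flags > Driver Defaults) could be sketched as a layered dictionary merge. This is a minimal illustration only: `PropertyMerger` and `Merge` are hypothetical names, and the case-insensitive key comparison is an assumption rather than something the commit states.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not the driver's actual merge code.
public static class PropertyMerger
{
    // Layers are applied from lowest to highest priority, so a later layer
    // overwrites any key already set by an earlier one.
    public static Dictionary<string, string> Merge(
        IReadOnlyDictionary<string, string> driverDefaults,
        IReadOnlyDictionary<string, string> featureFlags,
        IReadOnlyDictionary<string, string> userProperties)
    {
        // Assumption: property names are compared case-insensitively.
        var merged = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var layer in new[] { driverDefaults, featureFlags, userProperties })
        {
            foreach (var pair in layer)
            {
                merged[pair.Key] = pair.Value;
            }
        }
        return merged;
    }
}
```

With this ordering, a server-provided flag can change a driver default, but a value the user set explicitly always wins.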
…-3.1)
- Add EnsureSuccessStatusCode pattern for HTTP response handling
- Extract common HTTP fetch code into single FetchFeatureFlags method
- Make feature flag endpoint configurable via optional parameter
- Replace Debug.WriteLine with Activity trace pattern
- Add E2E tests for FeatureFlagCache using real Databricks instance
Co-Authored-By: Claude (databricks-claude-opus-4-5) <noreply@anthropic.com>
…WI-3.1)
- Move MergePropertiesWithFeatureFlags, TryGetHost, CreateFeatureFlagHttpClient, and MergeProperties helper methods from DatabricksConnection to FeatureFlagCache
- Replace Debug.WriteLine with ActivitySource tracing for structured events
- DatabricksConnection now delegates to FeatureFlagCache.GetInstance().MergePropertiesWithFeatureFlags()
Co-Authored-By: Claude (databricks-claude-opus-4-5) <noreply@anthropic.com>
…ructor (WI-3.1)
Replace hardcoded "1.0.0" with ApacheUtility.GetAssemblyVersion() to use the actual driver version in the test constructor.
Co-Authored-By: Claude (databricks-claude-opus-4-5) <noreply@anthropic.com>
…-3.1)
- Add proxy support using HiveServer2ProxyConfigurator
- Handle protocol prefix in host (e.g., "https://myhost.databricks.com")
- Add configurable timeout via FeatureFlagTimeoutSeconds parameter
- Use consistent User-Agent format: DatabricksJDBCDriverOSS/{version} (ADBC)
- Rename variables to localProperties/remoteProperties for clarity
- Remove IsFeatureEnabled method from FeatureFlagContext
- Use EnsureSuccessOrThrow extension method for HTTP error handling
- Enhance E2E tests to verify flags fetched and cache cleanup
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add support for OAuth client_credentials (M2M) authentication in addition to token-based (PAT) auth for feature flag API calls. This ensures feature flags work with all supported authentication methods.
- Add AuthHelper class with shared token extraction methods
- Update FeatureFlagCache to use AuthHelper.GetAccessToken
- Update HttpHandlerFactory to use AuthHelper.GetTokenFromProperties
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move the test factory method from production to test code:
- Make FeatureFlagContext constructor internal instead of private
- Make Ttl setter internal to allow tests to configure TTL
- Remove CreateForTesting from FeatureFlagContext.cs
- Add CreateTestContext helper method in FeatureFlagCacheTests.cs
This addresses the PR review feedback that test-only code should not be in production source files.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement the HTTP exporter that sends telemetry events to Databricks service.
Key features:
- ITelemetryExporter interface with ExportAsync method
- Creates TelemetryRequest wrapper with uploadTime and protoLogs
- Uses /telemetry-ext for authenticated requests
- Uses /telemetry-unauth for unauthenticated requests
- Implements retry logic for transient failures
- Uses ExceptionClassifier for terminal vs retryable errors
- Never throws exceptions (all swallowed and logged at TRACE level)
- Cancellation is propagated (not swallowed)
Files added:
- src/Telemetry/ITelemetryExporter.cs
- src/Telemetry/DatabricksTelemetryExporter.cs
- test/Unit/Telemetry/DatabricksTelemetryExporterTests.cs
Co-Authored-By: Claude <noreply@anthropic.com>
Implement MetricsAggregator that aggregates Activity data by statement_id and handles exception buffering with terminal vs retryable classification.
Key features:
- ProcessActivity extracts tags and aggregates by statement_id using ConcurrentDictionary<string, StatementTelemetryContext>
- CompleteStatement emits aggregated TelemetryEvent
- RecordException flushes terminal exceptions immediately
- RecordException buffers retryable exceptions until CompleteStatement
- FlushAsync exports when batch size or time interval reached
- Uses TelemetryTagRegistry to filter tags
- Creates TelemetryFrontendLog wrapper with workspace_id
- All exceptions swallowed and logged at TRACE level
Implementation details:
- Connection events emit immediately (no aggregation needed)
- Statement events aggregate until CompleteStatement is called
- Timer-based periodic flush using System.Threading.Timer
- Thread-safe aggregation using ConcurrentDictionary
- Nested StatementTelemetryContext holds aggregated metrics and buffered exceptions per statement
Test coverage:
- 29 unit tests covering all exit criteria
- Tests for exception handling, tag filtering, frontend log wrapping
- End-to-end statement lifecycle tests
Co-Authored-By: Claude <noreply@anthropic.com>
Implement CircuitBreakerManager as a singleton that manages circuit breakers per host. Each host gets its own circuit breaker instance for isolation, preventing one failing endpoint from affecting others.
Key features:
- Singleton pattern with GetInstance() method
- Per-host circuit breaker isolation using ConcurrentDictionary
- Thread-safe concurrent access
- Case-insensitive host matching
- Support for both default and custom configurations
This follows the JDBC driver pattern in CircuitBreakerManager.java.
Co-Authored-By: Claude <noreply@anthropic.com>
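The per-host isolation with case-insensitive matching described in this commit could look roughly like the sketch below. `SketchCircuitBreaker` and `SketchCircuitBreakerManager` are placeholder names for illustration; the real CircuitBreaker type and its configuration options are not shown on this page, so the breaker here carries no state beyond its host.

```csharp
using System;
using System.Collections.Concurrent;

// Placeholder breaker type (hypothetical); stands in for the real circuit breaker.
public sealed class SketchCircuitBreaker
{
    public string Host { get; }
    public SketchCircuitBreaker(string host) => Host = host;
}

// Hypothetical singleton manager holding one breaker per host.
public sealed class SketchCircuitBreakerManager
{
    private static readonly SketchCircuitBreakerManager s_instance =
        new SketchCircuitBreakerManager();

    // OrdinalIgnoreCase makes "Host.Example.com" and "host.example.com" share one breaker.
    private readonly ConcurrentDictionary<string, SketchCircuitBreaker> _breakers =
        new ConcurrentDictionary<string, SketchCircuitBreaker>(StringComparer.OrdinalIgnoreCase);

    private SketchCircuitBreakerManager() { }

    public static SketchCircuitBreakerManager GetInstance() => s_instance;

    public SketchCircuitBreaker GetCircuitBreaker(string host)
    {
        if (string.IsNullOrWhiteSpace(host))
        {
            throw new ArgumentException("Host must be non-empty.", nameof(host));
        }
        // GetOrAdd may invoke the factory more than once under contention,
        // but only one breaker is ever stored and returned for a given host.
        return _breakers.GetOrAdd(host, h => new SketchCircuitBreaker(h));
    }
}
```

Passing the comparer to the dictionary keeps the case-insensitivity in one place, so callers do not need to normalize host strings themselves.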
Add comprehensive E2E tests for feature flag fetching from real Databricks endpoints and validate caching and reference counting behavior:
- FeatureFlagCache_FetchFromRealEndpoint_ReturnsBoolean: Tests real endpoint
- FeatureFlagCache_CachesValue_DoesNotRefetchWithinTTL: Validates caching
- FeatureFlagCache_InvalidHost_ReturnsDefaultFalse: Tests error handling
- FeatureFlagCache_RefCountingWorks_CleanupAfterRelease: Tests ref counting
Additional tests cover:
- Cache expiry and refetch behavior
- Null/empty host handling
- Unknown host behavior
- Multiple hosts with independent ref counts
- Concurrent reference counting thread safety
- False value caching
- Cancellation propagation
Co-Authored-By: Claude <noreply@anthropic.com>
Add DatabricksActivityListener that listens to 'Databricks.Adbc.Driver' ActivitySource, extracts metrics from activities, and delegates to MetricsAggregator. This implements Phase 5 of the telemetry design.
Key features:
- ShouldListenTo returns true for 'Databricks.Adbc.Driver' source
- Sample callback respects feature flag (AllDataAndRecorded when enabled, None when disabled)
- ActivityStopped callback delegates to MetricsAggregator.ProcessActivity
- All callbacks wrapped in try-catch with TRACE logging
- StopAsync flushes pending metrics via MetricsAggregator.FlushAsync
- Supports dynamic feature flag checking via optional Func<bool>
Co-Authored-By: Claude <noreply@anthropic.com>
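For reviewers less familiar with `System.Diagnostics.ActivityListener`, the three callbacks named in this commit can be wired up roughly as follows. This is a sketch, not the PR's DatabricksActivityListener: `ListenerSketch`, `Create`, and the `onStopped` delegate (standing in for MetricsAggregator.ProcessActivity) are hypothetical, and TRACE logging is reduced to comments.

```csharp
using System;
using System.Diagnostics;

// Hypothetical factory showing how the listener callbacks fit together.
public static class ListenerSketch
{
    public static ActivityListener Create(Func<bool> isTelemetryEnabled, Action<Activity> onStopped)
    {
        var listener = new ActivityListener
        {
            // Only the driver's own ActivitySource is observed.
            ShouldListenTo = source => source.Name == "Databricks.Adbc.Driver",

            // The feature flag is re-evaluated for every activity that is about to start.
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
            {
                try
                {
                    return isTelemetryEnabled()
                        ? ActivitySamplingResult.AllDataAndRecorded
                        : ActivitySamplingResult.None;
                }
                catch (Exception)
                {
                    // Telemetry must never break the caller; treat errors as "disabled".
                    return ActivitySamplingResult.None;
                }
            },

            ActivityStopped = activity =>
            {
                try
                {
                    onStopped(activity);
                }
                catch (Exception)
                {
                    // Swallowed by design; the real listener logs at TRACE level.
                }
            }
        };

        // Registration is global; disposing the listener unregisters it.
        ActivitySource.AddActivityListener(listener);
        return listener;
    }
}
```

When `Sample` returns `None` and no other listener is registered, `ActivitySource.StartActivity` returns null, so a disabled feature flag means no Activity objects are allocated for telemetry.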
Implement wrapper exporter that protects inner telemetry exporter with circuit breaker pattern.
Key features:
- Wraps ITelemetryExporter with circuit breaker protection
- Uses CircuitBreakerManager.GetCircuitBreaker(host) for per-host isolation
- Exports events when circuit is closed
- Drops events silently when circuit is open (logged at DEBUG level)
- Circuit breaker tracks failures BEFORE exceptions are swallowed
This follows the design in Section 3.3 of the telemetry design document.
Co-Authored-By: Claude <noreply@anthropic.com>
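The ordering point in this commit (the breaker records a failure before the exception is swallowed) is easier to see in code. The sketch below is a simplified illustration under several assumptions: `SimpleBreaker` is a hypothetical breaker that opens after N consecutive failures and never resets, the inner exporter is modeled as a plain delegate that throws on failure, and cancellation is rethrown rather than counted.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, heavily simplified breaker: opens after a number of
// consecutive failures. No half-open state or timeout is modeled.
public sealed class SimpleBreaker
{
    private readonly int _threshold;
    private int _consecutiveFailures;

    public SimpleBreaker(int threshold) => _threshold = threshold;

    public bool IsOpen => Volatile.Read(ref _consecutiveFailures) >= _threshold;
    public void RecordSuccess() => Interlocked.Exchange(ref _consecutiveFailures, 0);
    public void RecordFailure() => Interlocked.Increment(ref _consecutiveFailures);
}

// Hypothetical wrapper; the delegate stands in for the inner exporter's ExportAsync.
public sealed class GuardedExporter
{
    private readonly Func<CancellationToken, Task> _innerExport;
    private readonly SimpleBreaker _breaker;

    public GuardedExporter(Func<CancellationToken, Task> innerExport, SimpleBreaker breaker)
    {
        _innerExport = innerExport;
        _breaker = breaker;
    }

    public async Task ExportAsync(CancellationToken ct = default)
    {
        if (_breaker.IsOpen)
        {
            return; // Circuit open: drop the batch without calling the endpoint.
        }
        try
        {
            await _innerExport(ct).ConfigureAwait(false);
            _breaker.RecordSuccess();
        }
        catch (OperationCanceledException)
        {
            throw; // Assumption: cancellation propagates and is not a failure.
        }
        catch (Exception)
        {
            // The breaker sees the failure first; only then is it swallowed.
            _breaker.RecordFailure();
        }
    }
}
```

If the failure were swallowed before reaching the breaker, the circuit could never open, which is why the order of these two steps matters.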
Implement per-host telemetry client management with reference counting to prevent rate limiting from concurrent connections.
- ITelemetryClient: Interface for telemetry clients with ExportAsync and CloseAsync methods
- TelemetryClientHolder: Holds client and reference count with atomic operations using Interlocked
- TelemetryClientManager: Singleton factory managing one client per host using ConcurrentDictionary for thread-safety
- TelemetryClientAdapter: Adapter bridging ITelemetryExporter to ITelemetryClient interface
Key features:
- GetInstance() returns singleton
- GetOrCreateClient() creates/returns client and increments RefCount
- ReleaseClientAsync() decrements RefCount, closes client when zero
- Same host returns same client instance (case-insensitive)
- Thread-safe with ConcurrentDictionary and atomic ref counting
- All exceptions swallowed per telemetry design requirement
Co-Authored-By: Claude <noreply@anthropic.com>
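The acquire/release lifecycle described here can be illustrated with a small sketch. Note that it deliberately swaps the ConcurrentDictionary plus Interlocked approach named in the commit for a single lock, because that makes the "decrement to zero, then remove" step trivially atomic. `SketchClient` and `SketchClientManager` are hypothetical names, and the client does nothing except record that it was closed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical client that only records whether it has been closed.
public sealed class SketchClient
{
    public bool Closed { get; private set; }

    public Task CloseAsync()
    {
        Closed = true;
        return Task.CompletedTask;
    }
}

// Hypothetical manager: one shared client per host, closed when the last user releases it.
public sealed class SketchClientManager
{
    private sealed class Holder
    {
        public readonly SketchClient Client = new SketchClient();
        public int RefCount;
    }

    private readonly object _gate = new object();
    private readonly Dictionary<string, Holder> _holders =
        new Dictionary<string, Holder>(StringComparer.OrdinalIgnoreCase);

    public SketchClient GetOrCreateClient(string host)
    {
        lock (_gate)
        {
            if (!_holders.TryGetValue(host, out var holder))
            {
                holder = new Holder();
                _holders[host] = holder;
            }
            holder.RefCount++;
            return holder.Client;
        }
    }

    public async Task ReleaseClientAsync(string host)
    {
        SketchClient toClose = null;
        lock (_gate)
        {
            // Decrement and removal happen under one lock, so a concurrent
            // GetOrCreateClient can never be handed a client about to close.
            if (_holders.TryGetValue(host, out var holder) && --holder.RefCount == 0)
            {
                _holders.Remove(host);
                toClose = holder.Client;
            }
        }
        if (toClose != null)
        {
            try
            {
                // Closing happens outside the lock so a slow flush blocks no one.
                await toClose.CloseAsync().ConfigureAwait(false);
            }
            catch (Exception)
            {
                // Swallowed, matching the telemetry design requirement above.
            }
        }
    }
}
```

A lock-free version needs extra care around the window between the count reaching zero and the entry being removed; the lock trades a little contention for avoiding that race entirely.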
🥞 Stacked PR
Use this link to review incremental changes.