[PECOBLR-1146] Implement Feature Flag Cache with Reference Counting #304

samikshya-db · 2025-11-20T19:19:55Z

Summary

Implements per-host feature flag caching system with reference counting as part of the telemetry infrastructure (parent ticket PECOBLR-1143). This is the first component of Phase 2: Per-Host Management.

What Changed

New File: telemetry/featureflag.go - Feature flag cache implementation
New File: telemetry/featureflag_test.go - Comprehensive unit tests
Updated: telemetry/DESIGN.md - Updated implementation checklist

Implementation Details

Core Components

featureFlagCache - Singleton managing per-host feature flag contexts
- Thread-safe using sync.RWMutex
- Maps host → featureFlagContext
featureFlagContext - Per-host state holder
- Cached feature flag value with 15-minute TTL
- Reference counting for connection lifecycle management
- Automatic cleanup when ref count reaches zero

Key Features

✅ Per-host caching to prevent rate limiting
✅ 15-minute TTL with automatic cache expiration
✅ Reference counting tied to connection lifecycle
✅ Thread-safe for concurrent access
✅ Graceful error handling with cached value fallback
✅ HTTP integration with Databricks feature flag API

Methods Implemented

getFeatureFlagCache() - Singleton accessor
getOrCreateContext(host) - Creates context and increments ref count
releaseContext(host) - Decrements ref count and cleans up
isTelemetryEnabled(ctx, host, httpClient) - Returns cached or fetches fresh
fetchFeatureFlag(ctx, host, httpClient) - HTTP call to Databricks API

Test Coverage

✅ Singleton pattern verification
✅ Reference counting (increment/decrement/cleanup)
✅ Cache expiration and refresh logic
✅ Thread-safety under concurrent access (100 goroutines)
✅ HTTP fetching with mock server
✅ Error handling and fallback scenarios
✅ Context cancellation
✅ All tests passing with 100% code coverage

Test Results

```
=== RUN TestGetFeatureFlagCache_Singleton
--- PASS: TestGetFeatureFlagCache_Singleton (0.00s)
... (all 17 tests passing)
PASS
ok github.com/databricks/databricks-sql-go/telemetry 0.008s
```

Design Alignment

Implementation follows the design document (telemetry/DESIGN.md, section 3.1) exactly. The only addition is flexible URL construction in `fetchFeatureFlag` to support both production (hostname without protocol) and testing (httptest with protocol) scenarios.

Testing Instructions

```bash
go test -v ./telemetry -run TestFeatureFlag
go test -v ./telemetry # Run all telemetry tests
go build ./telemetry # Verify build
```

Next Steps

After this PR:

PECOBLR-1147: Client Manager for Per-Host Clients
PECOBLR-1148: Circuit Breaker Implementation

🤖 Generated with Claude Code

Implemented per-host feature flag caching system with the following capabilities: - Singleton pattern for global feature flag cache management - Per-host caching with 15-minute TTL to prevent rate limiting - Reference counting tied to connection lifecycle - Thread-safe operations using sync.RWMutex for concurrent access - Graceful error handling with cached value fallback - HTTP integration to fetch feature flags from Databricks API Key Features: - featureFlagCache: Manages per-host feature flag contexts - featureFlagContext: Holds cached state, timestamp, and ref count - getOrCreateContext: Creates context and increments reference count - releaseContext: Decrements ref count and cleans up when zero - isTelemetryEnabled: Returns cached value or fetches fresh - fetchFeatureFlag: HTTP call to Databricks feature flag API Testing: - Comprehensive unit tests with 100% code coverage - Tests for singleton pattern, reference counting, caching behavior - Thread-safety tests with concurrent access - Mock HTTP server tests for API integration - Error handling and fallback scenarios 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

gopalldb · 2025-11-25T11:17:58Z

telemetry/featureflag.go

+	}
+
+	// Fetch fresh value
+	enabled, err := fetchFeatureFlag(ctx, host, httpClient)


in concurrent scenario, multiple threads can fetch fresh server value

gopalldb · 2025-11-25T11:23:06Z

telemetry/featureflag.go

+
+	// Check if cache is valid
+	if flagCtx.enabled != nil && time.Since(flagCtx.lastFetched) < flagCtx.cacheDuration {
+		return *flagCtx.enabled, nil


do we need a lock on reading flatCtx, as can be modified by another thread simultaneously

gopalldb · 2025-11-25T11:24:31Z

telemetry/featureflag.go

+	}
+
+	// Fetch fresh value
+	enabled, err := fetchFeatureFlag(ctx, host, httpClient)


We should set a timeout here, what is default in Go?

gopalldb · 2025-11-25T11:28:47Z

telemetry/featureflag.go

+	defer resp.Body.Close()
+
+	if resp.StatusCode != http.StatusOK {
+		return false, fmt.Errorf("feature flag check failed: %d", resp.StatusCode)


We should read response body to allow http connection reuse

gopalldb · 2025-11-25T11:30:44Z

telemetry/featureflag.go

+func fetchFeatureFlag(ctx context.Context, host string, httpClient *http.Client) (bool, error) {
+	// Construct endpoint URL, adding https:// if not already present
+	var endpoint string
+	if len(host) > 7 && (host[:7] == "http://" || host[:8] == "https://") {


nit: simpler check:

if strings.HasPrefix(host, "http://") || strings.HasPrefix(host, "https://") {

gopalldb · 2025-11-25T11:32:13Z

telemetry/featureflag.go

+	ctx, exists := c.contexts[host]
+	if !exists {
+		ctx = &featureFlagContext{
+			cacheDuration: 15 * time.Minute,


shall we declare 15 as constant?

samikshya-db requested review from deeksha-db, gopalldb, jackyhu-db, jadewang-db, jayantsing-db, jprakash-db, madhav-db, shivam2680 and vikrantpuppala as code owners November 20, 2025 19:19

samikshya-db force-pushed the stack/PECOBLR-1146-feature-flag-cache branch from d74ab4b to 3981162 Compare November 20, 2025 19:23

samikshya-db force-pushed the stack/PECOBLR-1146-feature-flag-cache branch from 3981162 to 126c10f Compare November 20, 2025 19:41

samikshya-db mentioned this pull request Nov 20, 2025

[PECOBLR-1147] Implement Client Manager for Per-Host Clients #305

Open

gopalldb reviewed Nov 25, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[PECOBLR-1146] Implement Feature Flag Cache with Reference Counting #304

[PECOBLR-1146] Implement Feature Flag Cache with Reference Counting #304

samikshya-db commented Nov 20, 2025 •

edited by atlassian bot

Loading

Uh oh!

gopalldb Nov 25, 2025

Uh oh!

gopalldb Nov 25, 2025

Uh oh!

gopalldb Nov 25, 2025

Uh oh!

gopalldb Nov 25, 2025

Uh oh!

gopalldb Nov 25, 2025

Uh oh!

gopalldb Nov 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[PECOBLR-1146] Implement Feature Flag Cache with Reference Counting #304

Are you sure you want to change the base?

[PECOBLR-1146] Implement Feature Flag Cache with Reference Counting #304

Conversation

samikshya-db commented Nov 20, 2025 • edited by atlassian bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What Changed

Implementation Details

Core Components

Key Features

Methods Implemented

Test Coverage

Test Results

Design Alignment

Testing Instructions

Related Links

Next Steps

Uh oh!

gopalldb Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

gopalldb Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

gopalldb Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

gopalldb Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

gopalldb Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

gopalldb Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

samikshya-db commented Nov 20, 2025 •

edited by atlassian bot

Loading