Commit ebe4b07
feat: Implement host-level telemetry batching to reduce rate limiting (#718)
* feat: Implement host-level telemetry batching to reduce rate limiting
Changes telemetry client architecture from per-session to per-host batching,
matching the JDBC driver implementation. This reduces the number of HTTP
requests to the telemetry endpoint and prevents rate limiting in test
environments.
Key changes:
- Add _TelemetryClientHolder with reference counting for shared clients
- Change TelemetryClientFactory to key clients by host_url instead of session_id
- Add getHostUrlSafely() helper for defensive null handling
- Update all callers (client.py, exc.py, latency_logger.py) to pass host_url
Before: 100 connections to same host = 100 separate TelemetryClients
After: 100 connections to same host = 1 shared TelemetryClient (refcount=100)
This fixes rate limiting seen in e2e tests, where 300+ parallel
connections overwhelmed the telemetry endpoint and received HTTP 429 responses.
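The per-host sharing described above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: `SketchTelemetryClient`, `ClientHolder`, `SketchFactory`, `acquire` and `release` are hypothetical names standing in for the real `TelemetryClient`, `_TelemetryClientHolder` and `TelemetryClientFactory`.

```python
import threading
from typing import Dict


class SketchTelemetryClient:
    """Hypothetical stand-in for the real telemetry client."""

    def __init__(self, host_url: str) -> None:
        self.host_url = host_url
        self.closed = False

    def close(self) -> None:
        self.closed = True


class ClientHolder:
    """Pairs a shared client with the number of connections using it."""

    def __init__(self, client: SketchTelemetryClient) -> None:
        self.client = client
        self.refcount = 1


class SketchFactory:
    """Illustrative factory keyed by host_url rather than session id."""

    _clients: Dict[str, ClientHolder] = {}
    _lock = threading.RLock()

    @classmethod
    def acquire(cls, host_url: str) -> SketchTelemetryClient:
        # The first connection to a host creates the client; later ones share it.
        with cls._lock:
            holder = cls._clients.get(host_url)
            if holder is None:
                holder = ClientHolder(SketchTelemetryClient(host_url))
                cls._clients[host_url] = holder
            else:
                holder.refcount += 1
            return holder.client

    @classmethod
    def release(cls, host_url: str) -> None:
        # The client is closed only when the last connection releases it.
        with cls._lock:
            holder = cls._clients.get(host_url)
            if holder is None:
                return
            holder.refcount -= 1
            if holder.refcount <= 0:
                del cls._clients[host_url]
                holder.client.close()
```

Under these assumptions, 100 calls to `acquire` with the same host return one client object, and it stays open until the 100th `release`.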
* chore: Change all telemetry logging to DEBUG level
Reduces log noise by changing all telemetry-related log statements
(info, warning, error) to debug level. Telemetry operations are
background tasks and should not clutter logs with operational messages.
Changes:
- Circuit breaker state changes: info/warning -> debug
- Telemetry send failures: error -> debug
- All telemetry operations now consistently use debug level
* chore: Fix remaining telemetry warning log to debug
Changes remaining logger.warning in telemetry_push_client.py to debug level
for consistency with other telemetry logging.
* fix: Update tests to use host_url instead of session_id_hex
- Update circuit breaker test to check logger.debug instead of logger.info
- Replace all session_id_hex test parameters with host_url
- Apply Black formatting to exc.py and telemetry_client.py
This fixes test failures caused by the signature change from session_id_hex
to host_url in the Error class and TelemetryClientFactory.
* fix: Revert session_id_hex in tests for functions that still use it
Only Error classes changed from session_id_hex to host_url.
Other classes (TelemetryClient, ResultSetDownloadHandler, etc.) still use session_id_hex.
Reverted:
- test_telemetry.py: TelemetryClient and initialize_telemetry_client
- test_downloader.py: ResultSetDownloadHandler
- test_download_manager.py: ResultFileDownloadManager
Kept as host_url:
- test_client.py: Error class instantiation
* fix: Update all Error raises and test calls to use host_url
Changes:
1. client.py: Changed all error raises from session_id_hex to host_url
- Connection class: session_id_hex=self.get_session_id_hex() -> host_url=self.session.host
- Cursor class: session_id_hex=self.connection.get_session_id_hex() -> host_url=self.connection.session.host
2. test_telemetry.py: Updated get_telemetry_client() and close() calls
- get_telemetry_client(session_id) -> get_telemetry_client(host_url)
- close(session_id) -> close(host_url=host_url)
3. test_telemetry_push_client.py: Changed logger.warning to logger.debug
- Updated test assertion to match debug logging level
These changes complete the migration from session-level to host-level
telemetry client management.
* fix: Update thrift_backend.py to use host_url instead of session_id_hex
Changes:
1. Added self._host attribute to store server_hostname
2. Updated all error raises to use host_url=self._host
3. Changed method signatures from session_id_hex to host_url:
- _check_response_for_error
- _hive_schema_to_arrow_schema
- _col_to_description
- _hive_schema_to_description
- _check_direct_results_for_error
4. Updated all method calls to pass self._host instead of self._session_id_hex
This completes the migration from session-level to host-level error reporting.
* Fix Black formatting by adjusting fmt directive placement
Moved the `# fmt: on` directive to the except block level instead
of inside the if statement to resolve Black parsing confusion.
* Fix telemetry feature flag tests to set mock session host
The tests were failing because they called get_telemetry_client("test")
but the mock session didn't have .host set, so the telemetry client was
registered under a different key (likely None or MagicMock). This caused
the factory to return NoopTelemetryClient instead of the expected client.
Fixed by setting mock_session_instance.host = "test" in all three tests.
* Add teardown_method to clear telemetry factory state between tests
Without this cleanup, tests were sharing telemetry clients because they
all used the same host key ("test"), causing test pollution. The first
test would create an enabled client, and subsequent tests would reuse it
even when they expected a disabled client.
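The pollution described above, and the teardown that avoids it, might look like the sketch below. `SketchFactory` and `get_or_create` are hypothetical simplifications; only `teardown_method` is the standard pytest hook.

```python
class SketchFactory:
    """Hypothetical class-level cache keyed by host, like the factory above."""

    _clients = {}

    @classmethod
    def get_or_create(cls, host_url, enabled):
        # Reuses whatever was registered first for this host key.
        return cls._clients.setdefault(host_url, {"enabled": enabled})


class TestTelemetryFlag:
    def teardown_method(self, method):
        # pytest calls this after every test method in the class.
        SketchFactory._clients.clear()

    def test_enabled(self):
        assert SketchFactory.get_or_create("test", True)["enabled"] is True

    def test_disabled(self):
        # Without the teardown, this would see the enabled entry from test_enabled.
        assert SketchFactory.get_or_create("test", False)["enabled"] is False
```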
* Clear feature flag context cache in teardown to fix test pollution
The FeatureFlagsContextFactory caches feature flag contexts per session,
causing tests to share the same feature flag state. This resulted in the
first test creating a context with telemetry enabled, and subsequent tests
incorrectly reusing that enabled state even when they expected disabled.
* fix: Access actual client from holder in flush worker
The flush worker was calling _flush() on _TelemetryClientHolder objects
instead of the actual TelemetryClient. Fixed by accessing holder.client
before calling _flush().
Fixes AttributeError in e2e tests: '_TelemetryClientHolder' object has
no attribute '_flush'
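A minimal sketch of the fix, assuming the registry maps host to a holder that exposes a `client` attribute. `SketchClient`, `Holder` and `flush_all` are hypothetical names used for illustration.

```python
class SketchClient:
    """Hypothetical client exposing the _flush() method named above."""

    def __init__(self):
        self.flushed = 0

    def _flush(self):
        self.flushed += 1


class Holder:
    """Wraps a client plus its reference count; has no _flush() of its own."""

    def __init__(self, client):
        self.client = client
        self.refcount = 1


def flush_all(clients_by_host):
    # Copy the values so the loop is safe if the registry changes meanwhile.
    for holder in list(clients_by_host.values()):
        # Calling holder._flush() would raise AttributeError; unwrap first.
        holder.client._flush()
```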
* Clear telemetry client cache in e2e test teardown
Added _clients.clear() to the teardown fixture to prevent telemetry
clients from persisting across e2e tests, which was causing session ID
pollution in test_concurrent_queries_sends_telemetry.
* Pass session_id parameter to telemetry export methods
With host-level telemetry batching, multiple connections share one
TelemetryClient. Each client stores session_id_hex from the first connection
that created it. This caused all subsequent connections' telemetry events
to use the wrong session ID.
Changes:
- Modified telemetry export method signatures to accept optional session_id
- Updated Connection.export_initial_telemetry_log() to pass session_id
- Updated latency_logger.py export_latency_log() to pass session_id
- Updated Error.__init__() to accept optional session_id_hex and pass it
- Updated all error raises in Connection and Cursor to pass session_id_hex
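The idea behind the optional session_id argument can be sketched as below. `SharedClient` and `export_event` are hypothetical; the real export methods have their own names and payloads.

```python
from typing import List, Optional, Tuple


class SharedClient:
    """Hypothetical client shared by every connection to one host."""

    def __init__(self, session_id_hex: str) -> None:
        # Captured from whichever connection created the client first.
        self._session_id_hex = session_id_hex
        self.events: List[Tuple[str, str]] = []

    def export_event(self, name: str, session_id: Optional[str] = None) -> None:
        # Prefer the caller's session id; fall back to the stored one.
        self.events.append((session_id or self._session_id_hex, name))
```

With this shape, a second connection can tag its own events, for example `client.export_event("query", session_id="bbb")`, instead of inheriting the first connection's id.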
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix Black formatting in telemetry_client.py
* Use 'test-host' instead of 'test' for mock host in telemetry tests
* Replace test-session-id with test-host in test_client.py
* Fix telemetry client lookup to use test-host in tests
* Make session_id_hex keyword-only parameter in Error.__init__
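A keyword-only parameter is declared by placing it after a bare `*` in the signature. The sketch below uses a hypothetical `SketchError` class; the other parameters are assumptions and may not match the real `Error.__init__`.

```python
from typing import Optional


class SketchError(Exception):
    """Hypothetical error class with a keyword-only session_id_hex."""

    def __init__(
        self,
        message: Optional[str] = None,
        host_url: Optional[str] = None,
        *,  # everything after this must be passed by keyword
        session_id_hex: Optional[str] = None,
    ) -> None:
        super().__init__(message)
        self.host_url = host_url
        self.session_id_hex = session_id_hex
```

Passing a third positional argument raises `TypeError`, so a caller cannot accidentally put a session id where a host URL is expected.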
---------
Co-authored-by: Claude <noreply@anthropic.com>
1 parent d524f0e commit ebe4b07
File tree
12 files changed (+262, -108 lines)
- src/databricks/sql
  - backend
  - telemetry
- tests
  - e2e
  - unit