fix(gateway): exponential backoff for Google Drive watch-channel rate limits #1529
Open
bingran-you wants to merge 1 commit into dev
… limits

The previous implementation tracked failed watch-channel registrations in an in-memory HashSet that was cleared on every gateway restart. When Google Drive rejected registrations with subscriptionRateLimitExceeded, the gateway continued hammering the endpoint on every poll cycle (30s) and every restart, silently losing file-change notifications for users' shared docs/sheets.

- Replace the failed_watch_files HashSet with a backoff map that records a retry-not-before instant and an exponential step per file.
- Classify 403 subscriptionRateLimitExceeded (and similar) as a rate-limited failure with a 5m base delay capped at 6h. Other failures use a 2m base capped at 1h.
- Cap new registrations at 5 per poll cycle to avoid cold-start bursts.
- On a rate-limit hit, stop further registrations within the same cycle.
- Clear a file's backoff entry on successful registration.

Refs #1528.
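The backoff behavior described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: only the map type `HashMap<String, (Instant, u32)>`, the base delays, the caps, and the per-cycle limit of 5 come from the description. The names `WatchFailure`, `WatchBackoff`, `backoff_delay`, and `run_cycle` are hypothetical, and both the doubling factor and counting attempts (rather than successes) toward the cap are assumptions.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Failure classes from the commit message (enum name is hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
enum WatchFailure {
    RateLimited, // e.g. 403 subscriptionRateLimitExceeded
    Other,
}

/// Delay for a step: base * 2^step, clamped to the cap.
/// Bases and caps come from the PR text; doubling per step is an assumption.
fn backoff_delay(kind: WatchFailure, step: u32) -> Duration {
    let (base_secs, cap_secs): (u64, u64) = match kind {
        WatchFailure::RateLimited => (5 * 60, 6 * 60 * 60),
        WatchFailure::Other => (2 * 60, 60 * 60),
    };
    // Saturate instead of overflowing for very large steps.
    let factor = 1u64.checked_shl(step).unwrap_or(u64::MAX);
    Duration::from_secs(base_secs.saturating_mul(factor).min(cap_secs))
}

/// Assumed per-cycle cap on registration attempts.
const MAX_NEW_REGISTRATIONS_PER_CYCLE: usize = 5;

/// Hypothetical state holder; only the map's type is taken from the PR.
#[derive(Default)]
struct WatchBackoff {
    failed_watch_backoff: HashMap<String, (Instant, u32)>,
}

impl WatchBackoff {
    /// True while the file's retry-not-before instant lies in the future.
    fn is_blocked(&self, file_id: &str, now: Instant) -> bool {
        self.failed_watch_backoff
            .get(file_id)
            .map_or(false, |(not_before, _)| now < *not_before)
    }

    /// Advance the file's step (0 on first failure) and store the new deadline.
    fn record_failure(&mut self, file_id: &str, kind: WatchFailure, now: Instant) -> Duration {
        let step = self
            .failed_watch_backoff
            .get(file_id)
            .map_or(0, |(_, s)| s.saturating_add(1));
        let delay = backoff_delay(kind, step);
        self.failed_watch_backoff
            .insert(file_id.to_string(), (now + delay, step));
        delay
    }

    /// A successful registration clears the file's backoff entry.
    fn record_success(&mut self, file_id: &str) {
        self.failed_watch_backoff.remove(file_id);
    }
}

/// One poll cycle: skip files still backing off, attempt at most the cap,
/// and stop the cycle early on a rate-limit hit. Returns the attempt count.
fn run_cycle<F>(state: &mut WatchBackoff, pending: &[&str], now: Instant, mut register: F) -> usize
where
    F: FnMut(&str) -> Result<(), WatchFailure>,
{
    let mut attempts = 0;
    for &file_id in pending {
        if attempts >= MAX_NEW_REGISTRATIONS_PER_CYCLE {
            break;
        }
        if state.is_blocked(file_id, now) {
            continue;
        }
        attempts += 1;
        match register(file_id) {
            Ok(()) => state.record_success(file_id),
            Err(kind) => {
                state.record_failure(file_id, kind, now);
                if kind == WatchFailure::RateLimited {
                    break;
                }
            }
        }
    }
    attempts
}
```

Because the map lives in memory, a sketch like this still loses its state on restart; under that assumption, the per-cycle cap is what limits cold-start bursts.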
Summary
Fixes #1528. The inbound gateway's Google Workspace poller was hammering Google's `files.watch` API on every 30s poll cycle and on every restart, repeatedly hitting `subscriptionRateLimitExceeded` and silently losing file-change notifications.

Changes
- Replace `failed_watch_files: HashSet<String>` with `failed_watch_backoff: HashMap<String, (Instant, u32)>`, which tracks a retry-not-before instant and an exponential-backoff step per file.
- Classify `subscriptionRateLimitExceeded` as a rate-limited failure with a 5m base delay capped at 6h; other failures use a 2m base capped at 1h.

Not in this PR (intentionally)
Test plan
- `cargo check -p scheduler_module --bin inbound_gateway` passes cleanly
- `pm2 logs dw_gateway` shows `backoff <N>s (step <K>)` instead of `- will not retry` `subscriptionRateLimitExceeded` floods
- `pm2 restart dw_gateway`: confirm cold-start does not immediately trigger 403s

Filed by the scheduled `dowhiz-service-debug` task.