Adding a retry logic for Hubspot associations loading #318

Merged
PascalCrow merged 3 commits into master from retry_logic_associations
Apr 13, 2026

Conversation

@PascalCrow

At one of our customers, the loading pipeline receives many connection errors for one of their endpoints. After some research, I found that this is likely because the endpoint loads a large amount of data and its associations in turn contain a large number of rows. This leads to many API calls, increasing the chance of hitting a connection error.

This PR introduces retry logic for the associations loading, similar to what is already implemented for the property loading, and additionally catches connection errors.

@PascalCrow PascalCrow requested review from Dimi727 and lpillmann April 13, 2026 09:40
@PascalCrow PascalCrow self-assigned this Apr 13, 2026
Comment thread ewah/hooks/hubspot.py Outdated
Comment thread ewah/hooks/hubspot.py
if request.status_code >= 200 and request.status_code < 300:
break
self.log.info(
"Status {0} - Waiting {2}s and trying again. Response:\n\n{1}".format(

[Style] The log message says "trying again" even on the final attempt, which is misleading. Consider checking try_number < retries to vary the message, or simply omit "trying again" from the final-attempt log.

@gemma-claude-assistant

PR #318 Review — HubSpot association retry logic

Overall: The retry mechanism is a sensible addition, but has one blocking bug and one style issue.

Blocking:

  • sleep(wait_for_seconds) is called unconditionally at the end of the while loop. On the last retry (when try_number == retries) the code sleeps 30 s, then the while condition is False, and then the assertion fires. Every fully-exhausted retry sequence wastes an extra 30 seconds. Fix: wrap the sleep with if try_number < retries:.
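The loop shape this fix points toward could be sketched as follows. This is an illustration only, not the code in ewah/hooks/hubspot.py: `request_with_retries`, `send`, `retry_on`, and `sleep_fn` are hypothetical names, the default of 3 retries is an assumption, and the built-in `ConnectionError` stands in for `requests.exceptions.ConnectionError`, which a real caller would pass via `retry_on`.

```python
import time


def request_with_retries(send, retries=3, wait_for_seconds=30,
                         retry_on=(ConnectionError,), sleep_fn=time.sleep):
    """Call send() until it returns a 2xx response or attempts run out.

    send is a hypothetical zero-argument callable that returns an object
    with a status_code attribute (for example a requests.Response).
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    try_number = 0
    while try_number < retries:
        try_number += 1
        is_last_attempt = try_number >= retries
        try:
            response = send()
        except retry_on:
            if is_last_attempt:
                raise  # no attempts left, surface the error without sleeping
        else:
            if 200 <= response.status_code < 300:
                return response
            if is_last_attempt:
                return response  # the caller asserts on the status code
        # Reached only when another attempt follows, so the final
        # attempt never pays for a pointless wait.
        sleep_fn(wait_for_seconds)
```

Injecting `sleep_fn` is a design choice made here so the waiting behaviour can be checked without actually sleeping; with three failing attempts the sketch sleeps twice rather than three times.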

Style:

  • The non-2xx log message says "Waiting Xs and trying again" even on the final attempt, which is misleading.

No issues found with:

  • sleep import (already present via from time import sleep)
  • VERSION bump (0.9.25 → 0.9.26, appropriate for a patch)
  • Airflow 2.3.x / Python 3.10 compatibility
  • ConnectionError path correctly re-raises on the final attempt without sleeping

@gemma-claude-assistant

Claude review cost: $0.2353 · 10 turns · 137s · eu.anthropic.claude-sonnet-4-6

Co-authored-by: gemma-claude-assistant[bot] <262789552+gemma-claude-assistant[bot]@users.noreply.github.com>
Comment thread ewah/hooks/hubspot.py
data=payload,
)
# Bad requests are retried in the while loop
if request.status_code >= 200 and request.status_code < 300:

[Suggestion] The retry fires on all non-2xx responses, including permanent 4xx client errors (400 Bad Request, 401 Unauthorized, 403 Forbidden). Retrying those wastes ~90 seconds per failed batch with no chance of success. Consider limiting retries to retriable codes only:

Suggested change
- if request.status_code >= 200 and request.status_code < 300:
+ if request.status_code >= 200 and request.status_code < 300:
+     break
+ retriable = request.status_code == 429 or request.status_code >= 500
+ if not retriable:
+     break

Then the assert below will still surface the error correctly.
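The classification behind this suggestion can also be pulled into a small helper, which makes it easy to test in isolation. The sketch below is hypothetical: `is_retriable_status` and `next_action` are names invented for illustration, and the upper bound of 600 is an added assumption that the suggested change itself does not include.

```python
def is_retriable_status(status_code):
    """Return True for rate limiting (429) and server errors (5xx)."""
    return status_code == 429 or 500 <= status_code < 600


def next_action(status_code):
    """Map an HTTP status code to what the retry loop should do next."""
    if 200 <= status_code < 300:
        return "done"
    return "retry" if is_retriable_status(status_code) else "fail"
```

With this split, a 400, 401, or 403 response leaves the loop immediately instead of waiting through the remaining attempts.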

Comment thread ewah/hooks/hubspot.py
# Connection errors are retried in the while loop
except requests.exceptions.ConnectionError as e:
self.log.info(
"Connection error on attempt {0}/{1}: {2} - Waiting {3}s and trying again.".format(

[Style] The log message always says "Waiting {3}s and trying again" but on the final attempt (try_number >= retries) the exception is re-raised immediately — no sleep, no retry. Same misleading wording applies to the HTTP-error log at the line above (self.log.info("Status {0} - Waiting {2}s and trying again...")) on the last iteration. Both should conditionally omit "trying again" on the last attempt to avoid confusing operators reading logs.
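One way to keep both log statements accurate is to build the message from a shared helper that knows whether another attempt follows. This is a minimal sketch: `retry_log_message` is a hypothetical name, and the wording of the final-attempt message is an assumption rather than anything proposed in the PR.

```python
def retry_log_message(reason, try_number, retries, wait_for_seconds):
    """Describe a failed attempt without promising a retry that won't happen."""
    prefix = "{0} on attempt {1}/{2}".format(reason, try_number, retries)
    if try_number < retries:
        return "{0} - Waiting {1}s and trying again.".format(
            prefix, wait_for_seconds
        )
    return "{0} - No attempts left, giving up.".format(prefix)
```

Both the connection-error path and the non-2xx path could pass their own `reason` string, for example "Connection error" or "Status 502".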

@gemma-claude-assistant

PR 318 Review — HubSpot association retry logic

Overall: The approach is sound — wrapping the association batch request in a retry loop with exponential-ish back-off addresses real transient failures. Two issues worth addressing:

  1. Retry scope too broad (Suggestion): The loop retries on all non-2xx codes, including permanent 4xx errors (400, 401, 403). Those will never succeed on retry and each exhausted retry costs ~90 s of unnecessary sleeping. Restricting retries to 429 and 5xx is safer.

  2. Misleading log messages (Style): Both error-path log statements say "trying again" unconditionally, but on the final attempt the code either re-raises the ConnectionError or exits the loop for the assert — no actual retry happens. Operators reading logs during an incident will be confused.

Everything else looks correct: sleep is already imported, the assert guards after the loop still catch exhausted retries with bad status codes, and the version bump (0.9.25 → 0.9.26) is present as required.

@gemma-claude-assistant

Claude review cost: $0.2284 · 8 turns · 149s · eu.anthropic.claude-sonnet-4-6

@PascalCrow PascalCrow merged commit 4bd7624 into master Apr 13, 2026
3 checks passed
2 participants