
fix: Download components.zip instead of components.py from registry#1005

Merged
Aaron ("AJ") Steers (aaronsteers) merged 3 commits into main from devin/1774498591-fix-components-zip-download
Mar 26, 2026


@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Mar 26, 2026

Summary

The Airbyte connector registry (connectors.airbyte.com) serves custom Python components as components.zip archives, but PyAirbyte was requesting components.py directly, which always returns a 404. As a result, PyAirbyte silently skipped components for all 56 manifest-only connectors that require custom Python components (e.g., source-slack, source-hubspot, source-jira), leading to runtime failures.

This PR changes _try_get_manifest_connector_files to:

  1. Download components.zip instead of components.py
  2. Extract components.py from the zip archive using zipfile.ZipFile
  3. Raise a clear error if the zip is corrupt or doesn't contain components.py
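
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the helper name `extract_components_py` and the use of `ValueError` are assumptions made for the example (the walkthrough further down states that the actual change raises `AirbyteConnectorInstallationError`).

```python
import io
import zipfile


def extract_components_py(zip_bytes: bytes) -> str:
    """Return the text of components.py stored at the root of a zip archive.

    Hypothetical helper for illustration; raises ValueError on any failure.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            # ZipFile.read raises KeyError when the member is missing.
            raw = archive.read("components.py")
    except zipfile.BadZipFile as ex:
        raise ValueError("components.zip is not a valid zip archive") from ex
    except KeyError as ex:
        raise ValueError("components.zip does not contain components.py") from ex
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as ex:
        raise ValueError("components.py is not valid UTF-8") from ex
```

Wrapping the downloaded bytes in `io.BytesIO` gives `zipfile.ZipFile` the seekable file object it needs without writing the archive to disk.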

Closes #997

Review & Testing Checklist for Human

  • Verify zip structure: Confirm that components.py lives at the root of the components.zip archive served by the registry (not nested in a subdirectory). You can check with: curl -s "https://connectors.airbyte.com/files/metadata/airbyte/source-slack/3.1.12/components.zip" | python3 -c "import io,sys,zipfile; print(zipfile.ZipFile(io.BytesIO(sys.stdin.buffer.read())).namelist())" (the archive is read into memory first because zipfile.ZipFile needs a seekable file, which a pipe is not)
  • End-to-end test with an affected connector: Run the reproduction script from the issue against a connector like source-slack to confirm components are loaded correctly and get_available_streams() succeeds
  • Verify 404 fallback still works: Confirm that connectors without custom components (where components.zip returns 404) still work — the 404 graceful handling path is unchanged but worth a sanity check
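
For the zip-structure check in the first item, a small helper can classify the member names returned by `namelist()`. This is only a sketch: `locate_components` is a hypothetical name and is not part of PyAirbyte.

```python
import posixpath


def locate_components(names):
    """Classify zip member names.

    Returns (at_root, nested_paths): whether 'components.py' sits at the
    archive root, and any nested members that share that file name.
    """
    at_root = "components.py" in names
    nested = [
        name
        for name in names
        # Zip member names always use forward slashes, hence posixpath.
        if name != "components.py" and posixpath.basename(name) == "components.py"
    ]
    return at_root, nested
```

Passing it `zipfile.ZipFile(...).namelist()` for a downloaded archive shows at a glance whether the file is at the root or only present in a subdirectory.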

Notes

  • No existing tests cover _try_get_manifest_connector_files; this fix is verified by lint/format only — manual or integration testing is needed to confirm correctness
  • The checksum is computed on the decoded UTF-8 string, consistent with the original behavior
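
To illustrate the checksum note: hashing the decoded text means re-encoding it to bytes before it reaches the hash function. The sketch below assumes MD5 (as the CodeRabbit walkthrough describes) and a UTF-8 re-encoding; the helper name `checksum_of_text` is hypothetical, and the exact call used in PyAirbyte is not shown in this PR page.

```python
import hashlib


def checksum_of_text(text: str) -> str:
    """Return the MD5 hex digest of a decoded components.py string."""
    # hashlib operates on bytes, so the string is encoded back to UTF-8.
    return hashlib.md5(text.encode("utf-8")).hexdigest()
```

Because a strict UTF-8 decode followed by a UTF-8 encode reproduces the original bytes, this digest matches one computed over the raw components.py bytes inside the archive, but not one computed over the components.zip file itself.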

Link to Devin session: https://app.devin.ai/sessions/2164a710e6d3473f8b5ce8d4198b937a
Requested by: Aaron ("AJ") Steers (@aaronsteers)



Summary by CodeRabbit

  • Bug Fixes

    • Clearer, more actionable error messages when connector component downloads or archive extraction fail, including explicit handling of HTTP 404 (components missing) and corrupt/malformed archives.
    • Graceful handling when components are not available, avoiding installation crashes.
  • Refactor

    • Switched to retrieving connector components from a packaged archive and extracting the components file for validation; checksum behavior for extracted content remains unchanged.

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

The Airbyte connector registry serves custom components as
components.zip archives, but PyAirbyte was trying to fetch
components.py (which 404s). This caused all 56 manifest-only
connectors with custom Python components to silently fail.

Now downloads components.zip and extracts components.py from
the archive before passing it to DeclarativeExecutor.

Closes #997

Co-Authored-By: AJ Steers <aj@airbyte.io>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.


Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1774498591-fix-components-zip-download' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1774498591-fix-components-zip-download'

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /uv-lock - Updates uv.lock file
  • /test-pr - Runs tests with the updated PyAirbyte
  • /prerelease - Builds and publishes a prerelease version to PyPI

Helpful Resources

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.


@aaronsteers Aaron ("AJ") Steers (aaronsteers) marked this pull request as ready for review March 26, 2026 04:19
Copilot AI review requested due to automatic review settings March 26, 2026 04:19
devin-ai-integration[bot]

This comment was marked as resolved.


Copilot AI left a comment


Pull request overview

Fixes manifest-only connectors with custom Python components by aligning PyAirbyte’s registry download logic with the registry’s artifact format (components.zip instead of components.py), so components are no longer silently skipped.

Changes:

  • Update the registry components URL template to point to components.zip.
  • Extract components.py from the downloaded zip archive and compute its checksum.
  • Improve error reporting for failed components archive downloads/extraction (while keeping 404-as-no-components behavior).


@coderabbitai

coderabbitai bot commented Mar 26, 2026

Walkthrough

Fetches components.zip from the registry instead of components.py, opens the ZIP, reads and decodes components.py from inside, and updates error messages to surface zip- and decoding-related failures; 404 still treated as “components not found.”

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Component Download Format Migration: airbyte/_executors/util.py | Changed DEFAULT_COMPONENTS_URL to point to components.zip. _try_get_manifest_connector_files now GETs the zip, opens it with zipfile.ZipFile(io.BytesIO(...)), reads the components.py entry, decodes UTF‑8, computes MD5 from the decoded content, and raises AirbyteConnectorInstallationError on HTTP failures, BadZipFile, missing components.py, or UnicodeDecodeError. HTTP 404 on the zip still returns (manifest_dict, None, None). |
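
The control flow summarized above, including the 404 fallback, might look something like the sketch below. It is an illustration under stated assumptions rather than the merged implementation: `load_components`, the injected `fetch` callable, and `ComponentsInstallError` (a stand-in for AirbyteConnectorInstallationError) are all hypothetical names, and the manifest handling is left out.

```python
import hashlib
import io
import zipfile
from typing import Callable, Optional, Tuple


class ComponentsInstallError(Exception):
    """Hypothetical stand-in for AirbyteConnectorInstallationError."""


def load_components(
    fetch: Callable[[str], Tuple[int, bytes]],
    components_url: str,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (components_py_text, md5_checksum), or (None, None) on HTTP 404.

    `fetch` is an injected callable returning (status_code, body_bytes).
    """
    status, body = fetch(components_url)
    if status == 404:
        # No custom components are published for this connector.
        return None, None
    if status != 200:
        raise ComponentsInstallError(f"HTTP {status} fetching {components_url}")
    try:
        with zipfile.ZipFile(io.BytesIO(body)) as archive:
            text = archive.read("components.py").decode("utf-8")
    except (zipfile.BadZipFile, KeyError, UnicodeDecodeError) as ex:
        raise ComponentsInstallError(
            f"Could not extract components.py from {components_url}"
        ) from ex
    checksum = hashlib.md5(text.encode("utf-8")).hexdigest()
    return text, checksum
```

Injecting `fetch` keeps the sketch independent of any particular HTTP client and makes the 404, corrupt-archive, and success branches easy to exercise without network access.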

Sequence Diagram(s)

sequenceDiagram
  participant Client as PyAirbyte
  participant Registry as Registry CDN
  participant Extractor as ZipExtractor
  participant Executor as DeclarativeExecutor

  Client->>Registry: GET /.../components.zip
  alt 404
    Registry-->>Client: 404 Not Found
    Client->>Executor: continue without components (manifest only)
  else 200
    Registry-->>Client: components.zip (bytes)
    Client->>Extractor: open(zip bytes) / read('components.py')
    alt zip or entry error
      Extractor-->>Client: BadZipFile / KeyError / UnicodeDecodeError
      Client->>Client: raise AirbyteConnectorInstallationError (components_url)
    else success
      Extractor-->>Client: components.py (text)
      Client->>Client: compute checksum from components.py
      Client->>Executor: initialize with manifest + components.py
    end
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Shall I also add a unit test sketch for the zip extraction error paths and the 404 case, wdyt?

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely describes the main change: updating the download source from components.py to components.zip in the registry endpoint. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #997 by changing DEFAULT_COMPONENTS_URL to components.zip, implementing zip extraction logic, and adding proper error handling for corrupt or incomplete zips. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to resolving the components.zip download issue; no unrelated modifications are present in airbyte/_executors/util.py. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


coderabbitai[bot]

This comment was marked as resolved.

Co-Authored-By: AJ Steers <aj@airbyte.io>
@github-actions

github-actions bot commented Mar 26, 2026

PyTest Results (Fast Tests Only, No Creds)

343 tests  ±0   343 ✅ ±0   5m 51s ⏱️ +4s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 84591e5. ± Comparison against base commit 3ab17f9.

♻️ This comment has been updated with latest results.

@github-actions

PyTest Results (Full)

413 tests  ±0   395 ✅ ±0   24m 38s ⏱️ - 1m 1s
  1 suites ±0    18 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 84591e5. ± Comparison against base commit 3ab17f9.

@aaronsteers Aaron ("AJ") Steers (aaronsteers) merged commit 65e1c03 into main Mar 26, 2026
23 checks passed
@aaronsteers Aaron ("AJ") Steers (aaronsteers) deleted the devin/1774498591-fix-components-zip-download branch March 26, 2026 04:55

Development

Successfully merging this pull request may close these issues.

Manifest-only connectors with custom components broken: PyAirbyte fetches components.py but registry serves components.zip

2 participants