fix: Download components.zip instead of components.py from registry #1005
The Airbyte connector registry serves custom components as components.zip archives, but PyAirbyte was trying to fetch components.py (which 404s). This caused all 56 manifest-only connectors with custom Python components to silently fail. PyAirbyte now downloads components.zip and extracts components.py from the archive before passing it to DeclarativeExecutor.
Closes #997
Co-Authored-By: AJ Steers <aj@airbyte.io>
Testing This PyAirbyte Version
You can test this version of PyAirbyte using the following:
# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1774498591-fix-components-zip-download' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1774498591-fix-components-zip-download'
Pull request overview
Fixes manifest-only connectors with custom Python components by aligning PyAirbyte’s registry download logic with the registry’s artifact format (components.zip instead of components.py), so components are no longer silently skipped.
Changes:
- Update the registry components URL template to point to components.zip.
- Extract components.py from the downloaded zip archive and compute its checksum.
- Improve error reporting for failed components archive downloads/extraction (while keeping 404-as-no-components behavior).
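The download step with the 404-as-no-components behavior can be sketched roughly as follows. This is an illustration, not PyAirbyte's actual code: the URL template is inferred from the example URL in this PR's checklist, and `download_components_zip` and the injected `fetch` callable are hypothetical names used so the sketch runs without network access.

```python
from typing import Callable, Optional, Tuple

# Assumed template, inferred from the example URL in this PR's checklist.
COMPONENTS_URL_TEMPLATE = (
    "https://connectors.airbyte.com/files/metadata/airbyte/"
    "{connector_name}/{version}/components.zip"
)

# Hypothetical fetcher signature: takes a URL, returns (status_code, body).
Fetcher = Callable[[str], Tuple[int, bytes]]


def download_components_zip(
    connector_name: str, version: str, fetch: Fetcher
) -> Optional[bytes]:
    """Return the archive bytes, or None when the connector has no components."""
    url = COMPONENTS_URL_TEMPLATE.format(
        connector_name=connector_name, version=version
    )
    status, body = fetch(url)
    if status == 404:
        # A 404 means the connector is manifest-only: not an error.
        return None
    if status != 200:
        # Any other failure is surfaced instead of being silently skipped.
        raise RuntimeError(f"Failed to download {url}: HTTP {status}")
    return body
```

Injecting the fetcher keeps the 404 branch and the error branch easy to exercise in a unit test; a real caller would wrap whatever HTTP client the project already uses.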
Walkthrough
Sequence Diagram(s)
sequenceDiagram
participant Client as PyAirbyte
participant Registry as Registry CDN
participant Extractor as ZipExtractor
participant Executor as DeclarativeExecutor
Client->>Registry: GET /.../components.zip
alt 404
Registry-->>Client: 404 Not Found
Client->>Executor: continue without components (manifest only)
else 200
Registry-->>Client: components.zip (bytes)
Client->>Extractor: open(zip bytes) / read('components.py')
alt zip or entry error
Extractor-->>Client: BadZipFile / KeyError / UnicodeDecodeError
Client->>Client: raise AirbyteConnectorInstallationError (components_url)
else success
Extractor-->>Client: components.py (text)
Client->>Client: compute checksum from components.py
Client->>Executor: initialize with manifest + components.py
end
end
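The error branch of the diagram above can be sketched in Python as below. `ComponentsInstallError` is a hypothetical stand-in for `AirbyteConnectorInstallationError` (whose constructor signature is not shown in this PR), and `read_components_py` is an illustrative helper name.

```python
import io
import zipfile


class ComponentsInstallError(Exception):
    """Hypothetical stand-in for AirbyteConnectorInstallationError."""

    def __init__(self, message: str, components_url: str) -> None:
        super().__init__(f"{message} (url: {components_url})")
        self.components_url = components_url


def read_components_py(zip_bytes: bytes, components_url: str) -> str:
    """Extract components.py as text, wrapping the three failure modes."""
    try:
        # ZipFile needs a seekable file object, so wrap the bytes in BytesIO.
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            return archive.read("components.py").decode("utf-8")
    except (zipfile.BadZipFile, KeyError, UnicodeDecodeError) as ex:
        # BadZipFile: payload is not a zip; KeyError: entry is missing;
        # UnicodeDecodeError: entry is not valid UTF-8.
        raise ComponentsInstallError(
            "Could not extract components.py from archive", components_url
        ) from ex
```

Chaining with `from ex` keeps the original cause in the traceback, while the URL on the wrapper tells the user which artifact was at fault.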
Estimated code review effort: 3 (Moderate), ~20 minutes
Shall I also add a unit test sketch for the zip extraction error paths and the 404 case, wdyt?
Pre-merge checks: 5 passed
Summary
The Airbyte connector registry (connectors.airbyte.com) serves custom Python components as components.zip archives, but PyAirbyte was requesting components.py directly, which always 404s. This caused PyAirbyte to silently skip components for all 56 manifest-only connectors that require custom Python components (e.g., source-slack, source-hubspot, source-jira), leading to runtime failures.
This PR changes _try_get_manifest_connector_files to:
- Download components.zip instead of components.py
- Extract components.py from the zip archive using zipfile.ZipFile
- Compute the checksum from the extracted components.py
Closes #997
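A minimal sketch of the extract-then-checksum step, assuming the checksum is taken over the extracted file rather than over the archive. The hash algorithm is not stated in this PR, so SHA-256 is an assumption here, and `extract_with_checksum` is a hypothetical helper name.

```python
import hashlib
import io
import zipfile
from typing import Tuple


def extract_with_checksum(zip_bytes: bytes) -> Tuple[str, str]:
    """Return (components.py text, hex digest of that text)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        text = archive.read("components.py").decode("utf-8")
    # Assumption: SHA-256 over the UTF-8 bytes of the extracted file.
    checksum = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return text, checksum
```

Hashing the extracted text rather than the zip bytes keeps the checksum stable if the registry repackages the archive with different compression settings or timestamps.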
Review & Testing Checklist for Human
- Verify that components.py lives at the root of the components.zip archive served by the registry (not nested in a subdirectory). You can check with:
curl -s "https://connectors.airbyte.com/files/metadata/airbyte/source-slack/3.1.12/components.zip" | python3 -c "import io,zipfile,sys; print(zipfile.ZipFile(io.BytesIO(sys.stdin.buffer.read())).namelist())"
- Test with source-slack to confirm components are loaded correctly and get_available_streams() succeeds
- Connectors without custom components (where components.zip returns 404) still work; the 404 graceful handling path is unchanged but worth a sanity check
Notes
- _try_get_manifest_connector_files: this fix is verified by lint/format only; manual or integration testing is needed to confirm correctness
Link to Devin session: https://app.devin.ai/sessions/2164a710e6d3473f8b5ce8d4198b937a
Requested by: Aaron ("AJ") Steers (@aaronsteers)
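The archive layout check from the checklist can also be done from Python once the archive bytes are in hand. The helper names below are hypothetical; the sketch only reports where components.py sits inside a given archive.

```python
import io
import zipfile
from typing import List


def find_components_entries(zip_bytes: bytes) -> List[str]:
    """List every archive entry whose file name is components.py."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [
            name
            for name in archive.namelist()
            if name.split("/")[-1] == "components.py"
        ]


def components_at_root(zip_bytes: bytes) -> bool:
    """True when components.py is a top-level entry, as the fix expects."""
    return "components.py" in find_components_entries(zip_bytes)
```

The bytes could come from `urllib.request.urlopen(url).read()` against the registry URL above. An entry such as `src/components.py` would show up in `find_components_entries` but make `components_at_root` return False, which signals that the extraction path needs adjusting.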
Important
Auto-merge enabled.
This PR is set to merge automatically when all requirements are met.