feat(connectors): add source-smoke-test connector for destination regression testing #74058
Closed
Aaron ("AJ") Steers (aaronsteers) wants to merge 2 commits into master
…ression testing Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>
Closing this PR in favor of tracking the monorepo source-smoke-test connector work as an issue in the PyAirbyte repo instead. A replacement issue will be created there linking back to this PR. Requested by Aaron ("AJ") Steers (@aaronsteers).
What
Adds a new `source-smoke-test` connector that generates synthetic data across 15 predefined scenarios designed to exercise common destination failure patterns. This is the monorepo counterpart to the PyAirbyte implementation in airbytehq/PyAirbyte#982.
Closes: airbytehq/PyAirbyte#981
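To make "scenario" concrete, here is a minimal sketch of how a predefined scenario might be declared, either with inline records or with a generator function for a large batch. The dictionary keys (`json_schema`, `records`, `record_generator`), the `null_handling_stream` name, and the helper functions are assumptions for illustration only; the actual layout in `scenarios.py` may differ.

```python
from typing import Any, Iterator

Record = dict[str, Any]

# Hypothetical scenario shape with inline records (null and unicode values).
NULL_HANDLING_SCENARIO: dict[str, Any] = {
    "name": "null_handling_stream",  # assumed name, not taken from the PR
    "json_schema": {
        "type": "object",
        "properties": {
            "id": {"type": "integer"},
            "label": {"type": ["null", "string"]},
        },
    },
    "records": [
        {"id": 1, "label": "caf\u00e9 \u2713"},
        {"id": 2, "label": None},
    ],
}


def generate_large_batch(count: int = 1000) -> Iterator[Record]:
    """Yield `count` deterministic records without building a list in memory."""
    for i in range(1, count + 1):
        yield {"id": i, "name": f"record_{i:05d}", "value": i * 1.5}


# Hypothetical generator-backed scenario; "high_volume" marks it as slow.
LARGE_BATCH_SCENARIO: dict[str, Any] = {
    "name": "large_batch_stream",
    "json_schema": {
        "type": "object",
        "properties": {
            "id": {"type": "integer"},
            "name": {"type": "string"},
            "value": {"type": "number"},
        },
    },
    "high_volume": True,
    "record_generator": generate_large_batch,  # assumed key name
}


def iter_records(scenario: dict[str, Any]) -> Iterator[Record]:
    """Return records from the generator if one is set, else the inline list."""
    generator = scenario.get("record_generator")
    if generator is not None:
        yield from generator()
    else:
        yield from scenario.get("records", [])
```

Because the generator depends only on its loop index, repeated reads produce identical records, which is what makes regression comparisons against a destination meaningful.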
How
Self-contained Python CDK-based source connector with:
- `source.py`: `SourceSmokeTest` class implementing `spec`, `check`, `discover`, `read`
- `scenarios.py`: 15 predefined scenario definitions covering type variations, null handling, naming edge cases, schema variations, batch sizes, and unicode strings
- `all_fast_streams` (default true) and `all_slow_streams` (default false) boolean flags, plus an explicit `scenario_filter` list
- `custom_scenarios` config field
- `metadata.yaml`, `pyproject.toml`, `main.py`, `README.md`, `icon.svg`
Review guide
Start here — these are the files that matter:
- `source_smoke_test/source.py`: Core connector logic. Key areas:
  - `_get_all_scenarios()` (line ~190): scenario selection/filtering logic combining boolean flags + explicit filter + custom scenarios
  - `check()`: validates custom scenario structure before calling `_get_all_scenarios`
  - `read()`: emits records from selected streams
- `source_smoke_test/scenarios.py`: All 15 predefined scenario definitions. Large file but straightforward data declarations. The `large_batch_stream` scenario uses a generator function rather than inline records.
- `metadata.yaml`: Registry configuration. Note: `cloud.enabled: false`, `oss.enabled: true`, `releaseStage: alpha`.
Things that warrant reviewer attention:
- `HIGH_VOLUME_SCENARIO_NAMES` is exported from `scenarios.py` but never imported by `source.py` (which checks `scenario.get("high_volume", False)` directly instead).
- `definitionId: 4745b886-d299-4a4f-b14b-4803e643159a` is randomly generated; verify no conflict and that this is the intended permanent ID.
- `poetry.lock`: other connectors include one, though CI passes without it.
- [...] (`secrets/config.json`, `sample_files/configured_catalog.json`).
User Impact
Adds a new alpha-stage OSS source connector. No impact on existing connectors or users. The connector generates deterministic test data with no external dependencies.
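For reviewers, the selection behavior that the review guide attributes to `_get_all_scenarios()` can be pictured roughly as follows. This is a sketch under stated assumptions, not the connector's code: the `select_scenarios` function and the `PREDEFINED` list are hypothetical, and it assumes that `scenario_filter` adds named scenarios on top of the boolean flags and that custom scenarios are always appended. The real implementation may combine these differently.

```python
from typing import Any

Scenario = dict[str, Any]

# Stand-ins for the predefined scenarios; only "large_batch_stream" is a name
# taken from the PR, the others are placeholders.
PREDEFINED: list[Scenario] = [
    {"name": "basic_types"},
    {"name": "null_handling"},
    {"name": "large_batch_stream", "high_volume": True},
]


def select_scenarios(config: dict[str, Any]) -> list[Scenario]:
    """Combine boolean flags, an explicit filter, and custom scenarios."""
    include_fast = config.get("all_fast_streams", True)   # default true
    include_slow = config.get("all_slow_streams", False)  # default false
    explicit = set(config.get("scenario_filter") or [])

    selected: list[Scenario] = []
    for scenario in PREDEFINED:
        # Slow scenarios are detected via the per-scenario flag, mirroring
        # the scenario.get("high_volume", False) check noted in the PR.
        is_slow = scenario.get("high_volume", False)
        enabled_by_flag = include_slow if is_slow else include_fast
        # Assumption: the explicit filter is additive, not restrictive.
        if enabled_by_flag or scenario["name"] in explicit:
            selected.append(scenario)

    # Assumption: user-supplied scenarios are appended regardless of flags.
    selected.extend(config.get("custom_scenarios") or [])
    return selected
```

With an empty config this sketch returns only the fast scenarios. Setting `all_fast_streams` to false and listing `large_batch_stream` in `scenario_filter` returns just that one stream.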
Can this PR be safely reverted and rolled back?
Requested by: Aaron ("AJ") Steers (@aaronsteers)
Devin session