Multi-Execution Smoke Test Scenarios
Problem
The current source-smoke-test architecture only supports single-sync scenarios: one catalog, one set of records, one execution. This means we cannot test important destination behaviors that require multiple sequential syncs, such as:
- Schema evolution (add/drop/change columns between syncs)
- Truncate refresh (overwrite existing data)
- Incremental append after initial load
These are among the most common sources of destination bugs and are well-covered by the CDK's BasicFunctionalityIntegrationTest and connector-specific component tests (e.g. SnowflakeTableSchemaEvolutionTest), but have no coverage in the lightweight smoke test framework.
Proposed Solution
Extend the scenario format to support multi-part scenarios — an ordered list of executions within a single scenario. Each execution would specify:
- A catalog (stream schema + sync mode)
- Records to emit
- (Optional) Whether to run a `check` operation before this execution
Example structure (conceptual):
```yaml
name: schema_evolution_add_column
description: "Adds a column between two syncs"
executions:
  - catalog:
      json_schema:
        properties:
          id: {type: integer}
          name: {type: string}
      primary_key: [["id"]]
    records:
      - {id: 1, name: "Alice"}
  - catalog:
      json_schema:
        properties:
          id: {type: integer}
          name: {type: string}
          email: {type: string}
      primary_key: [["id"]]
    records:
      - {id: 2, name: "Bob", email: "bob@example.com"}
```
Design Considerations
- Two executions is likely sufficient — schema evolution and most multi-sync scenarios only need before/after. No known need for three-part scenarios currently.
- Check operation: Evaluate whether including a `check` between executions adds value. It requires a full container spin-up/spin-down, so it should be opt-in per execution rather than automatic, to avoid unnecessary slowdown.
- Discovery is not needed: The smoke test source declares its own catalog via scenarios — there's no external schema to discover.
- Orchestrator changes: The `run_destination_smoke_test()` function (and the MCP tool / CLI) would need to understand that a scenario can be an ordered list of executions, running them sequentially against the same destination namespace.
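The orchestrator change could be sketched as a simple loop over executions. This is only a conceptual sketch: the real `run_destination_smoke_test()` signature may differ, and the `Execution`, `Scenario`, and `destination` interfaces here are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical data model and orchestrator loop; not the actual
# source-smoke-test implementation.

@dataclass
class Execution:
    catalog: dict            # stream schema + sync mode
    records: list            # records to emit in this sync
    run_check: bool = False  # opt-in: run `check` before this execution

@dataclass
class Scenario:
    name: str
    executions: list         # ordered list of Execution

def run_destination_smoke_test(scenario: Scenario, destination) -> None:
    """Run each execution sequentially against the same destination namespace."""
    for execution in scenario.executions:
        if execution.run_check:
            # A check requires a full container spin-up/spin-down, so it
            # only runs when the execution explicitly opts in.
            destination.check()
        destination.sync(catalog=execution.catalog, records=execution.records)
```

Keeping `run_check` per execution (defaulting to off) matches the opt-in design consideration above, and running all executions against one namespace is what lets the second sync observe state left by the first.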
Candidate Multi-Execution Scenarios
- `schema_evolution_add_column` — Sync with schema v1, then sync with schema v2 (new column added)
- `schema_evolution_drop_column` — Sync with schema v1, then sync with schema v2 (column removed)
- `schema_evolution_change_type` — Sync with a string column, then change it to integer
- `overwrite_refresh` — Initial sync, then a second sync that replaces all data
- `incremental_append` — Initial sync with records 1-3, then second sync with records 4-6
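Each candidate reduces to the same two-execution shape. As an illustration, here is an in-memory form of a hypothetical `incremental_append` scenario; the keys mirror the conceptual YAML format above and are not a finalized schema.

```python
# Hypothetical incremental_append scenario, expressed as a plain dict
# mirroring the conceptual scenario format; not a finalized schema.
INCREMENTAL_APPEND = {
    "name": "incremental_append",
    "description": "Initial sync with records 1-3, then append records 4-6",
    "executions": [
        {
            "catalog": {
                "json_schema": {"properties": {"id": {"type": "integer"}}},
                "primary_key": [["id"]],
            },
            "records": [{"id": 1}, {"id": 2}, {"id": 3}],
        },
        {
            "catalog": {
                "json_schema": {"properties": {"id": {"type": "integer"}}},
                "primary_key": [["id"]],
            },
            "records": [{"id": 4}, {"id": 5}, {"id": 6}],
        },
    ],
}

# All candidate scenarios fit the before/after shape: two executions.
assert len(INCREMENTAL_APPEND["executions"]) == 2
```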
Context
This issue was identified during an analysis of coverage gaps between destination-snowflake integration tests and source-smoke-test scenarios. The integration tests cover schema evolution, table operations, and sync mode transitions extensively, but source-smoke-test currently cannot test any of these because it only supports single-sync execution.
See also: Devin session for the full coverage analysis.