Description
After deleting a job that connects two parts of a lineage graph, datasets and jobs from the disconnected part still appear in lineage queries. This happens because the lineage traversal uses the deleted job's I/O mappings even though the job itself is hidden.
Steps to Reproduce
Create a lineage chain:
d1 → job1 → d2 → job3 → d3 → job2 → d4
Where:
job1 consumes d1 and produces d2
job3 consumes d2 and produces d3
job2 consumes d3 and produces d4
Delete job3
Query lineage for d1:
Expected Behavior
After deleting job3, the lineage for d1 should only show the directly connected portion:
d1 → job1 → d2
Since job3 (which connects d2 to d3) is deleted, there should be no path to d3, job2, or d4.
Actual Behavior
The lineage for d1 incorrectly includes:
d1 → job1 → d2
d3 → job2 → d4
Problem: d3, job2, and d4 appear in the graph even though:
job3 (the only connection from d2 to d3) is deleted
There's no visible path explaining how these nodes are related to d1
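The difference between the actual and expected results can be illustrated with a minimal sketch. This is a hypothetical in-memory model, not the project's code or schema: `JOB_IO`, `DELETED_JOBS`, `lineage`, and the `skip_deleted_io` flag are names invented for this example. It assumes the traversal walks job-to-dataset I/O mappings in both directions and hides deleted jobs only when building the result.

```python
from collections import deque

# Hypothetical model of the chain in this issue: each job maps to the
# datasets it reads or writes. Names are illustrative only.
JOB_IO = {
    "job1": {"d1", "d2"},
    "job3": {"d2", "d3"},
    "job2": {"d3", "d4"},
}
DELETED_JOBS = {"job3"}


def lineage(start, skip_deleted_io):
    """Return the visible nodes reachable from `start`.

    With skip_deleted_io=False the walk still follows the I/O mappings of
    deleted jobs and only hides the jobs themselves from the result
    (the behavior reported here). With skip_deleted_io=True the mappings
    of deleted jobs are ignored during traversal (the expected behavior).
    """
    adjacency = {}
    for job, datasets in JOB_IO.items():
        if skip_deleted_io and job in DELETED_JOBS:
            continue
        for dataset in datasets:
            adjacency.setdefault(job, set()).add(dataset)
            adjacency.setdefault(dataset, set()).add(job)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

    # Deleted jobs are always hidden from the returned graph.
    return sorted(seen - DELETED_JOBS)


print(lineage("d1", skip_deleted_io=False))  # ['d1', 'd2', 'd3', 'd4', 'job1', 'job2']
print(lineage("d1", skip_deleted_io=True))   # ['d1', 'd2', 'job1']
```

In this sketch, filtering deleted jobs before building the adjacency map is what removes d3, job2, and d4 from the result; filtering only the output leaves them in as an orphaned fragment.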