Retracing of Mapping Explicit to Objects (RoMEO)
Extract relevant information from SQL scripts such as source and target tables.
- Target table name: In each sql script the target table name can be derived from the create (table, view, procedure) statement.
Unvalid patterns:
| V1 |
|---|
| with |
Valid patterns:
| V1 |
|---|
| from |
| join |
Filter patterns:
| V1 |
|---|
| case |
| where |
Aggregation patterns:
| V1 |
|---|
| GROUP BY |
| SUM |
| AVG |
Create patterns:
| V1 |
|---|
| create procedure |
| create view |
Insert patterns:
| V1 |
|---|
| insert into |
- Multiple insert into statements: SQL scripts can have multiple insert statements. This implies that the relation is (n x m) for (n) = target table and (m) = source table.
- Scripts without from statements: SQL scripts can just serve the sole purpose to calculate are prepare something and do not refer to a source table load.
- Selected attributes: Selected attributes shall be extracted and thus create per script with one target table and one source table a (1 x n x m) relation with 1 = target table, (n) = source table(s), and (m) = attribute(s).
- Automated join loop: Extracted SQL script shall be joined until end.
- Re-organization of columns in output: Columns in the output table shall reflect lineage steps or layer design (however in legacy systems layer design is often violated).