I dug deeper and the current DB writing method in #577.
While the data seems to be added correctly for tables with one-column primary key it does not work with multiple columns. At least one table is affected (changed_dso_assignment) where a single-column PK is not sufficient leading to rows to be silently dropped, see #577.
The current method needs to be overhauled:
Writing method:
- Expected:
add_table_to_sqlite_database() for SQLite, add_table_to_non_sqlite_database() for non-SQLite
- Actual:
add_table_to_non_sqlite_database() for non-SQLite (OK) but add_table_to_sqlite_database() for SQLite with fallback to add_table_to_non_sqlite_database() on error (cf. "Error handling" below)
- Solution: Remove fallback, clearly distinct between the two cases and rework error handling
Error handling
- Expected: Integrity errors
sqlalchemy.exc.IntegrityError and sqlite3.IntegrityError are catched, also "NOT NULL" is included and results in some meaningful message
- Actual: Integrity errors
sqlalchemy.exc.IntegrityError are catched and for sqlite3.IntegrityError the general (non-specific) exception is raised causing fallback to add_table_to_non_sqlite_database() for some reason
- Solution: See "Writing method"
Handling duplicates
- Expected: Drop duplicates based on multi-column primary key and write to DB
- Actual: Drop duplicates based on single-column primary key and write to DB leading to rows to be dropped for multiple entries in PK, example:
changed_dso_assignment.EinheitMastrNummer, e.g. SEE909729154103
- Solution: Drop duplicates based on PK before DB write attempts, maybe here in
process_xml_file()
Messaging in write_single_entries_until_not_unique_comes_up()
I already started reworking some parts.
I dug deeper and the current DB writing method in #577.
While the data seems to be added correctly for tables with one-column primary key it does not work with multiple columns. At least one table is affected (
changed_dso_assignment) where a single-column PK is not sufficient leading to rows to be silently dropped, see #577.The current method needs to be overhauled:
Writing method:
add_table_to_sqlite_database()for SQLite,add_table_to_non_sqlite_database()for non-SQLiteadd_table_to_non_sqlite_database()for non-SQLite (OK) butadd_table_to_sqlite_database()for SQLite with fallback toadd_table_to_non_sqlite_database()on error (cf. "Error handling" below)Error handling
sqlalchemy.exc.IntegrityErrorandsqlite3.IntegrityErrorare catched, also "NOT NULL" is included and results in some meaningful messagesqlalchemy.exc.IntegrityErrorare catched and forsqlite3.IntegrityErrorthe general (non-specific) exception is raised causing fallback toadd_table_to_non_sqlite_database()for some reasonHandling duplicates
changed_dso_assignment.EinheitMastrNummer, e.g.SEE909729154103process_xml_file()Messaging in
write_single_entries_until_not_unique_comes_up()I already started reworking some parts.