fix problems with the new versions #49
💡 Codex Review
Here are some automated review suggestions for this pull request.
def _validate_datasets(self, csv_path, cdes, copy_from_file):
    csv_columns = pd.read_csv(csv_path, nrows=0).columns.tolist()
    if DATASET_COLUMN_NAME not in csv_columns:
        raise InvalidDatasetError(
            "The 'dataset' column is required to exist in the csv."
        )
    metadata_table = MetadataTable.from_db(data_model, self.sqlite_db)
    cdes = metadata_table.table

    if self.monetdb is None:
        LOGGER.debug("MonetDB disabled – skipping primary data table creation.")
        df = pd.read_csv(csv_path, usecols=["dataset"])
        unique_datasets = df["dataset"].unique().tolist()
        return unique_datasets
Skipping dataset schema validation when MonetDB is disabled
When ValidateDataset._validate_datasets sees that self.monetdb is None, it now returns immediately after reading only the dataset column. This short-circuit means that all other validations (type checks, min/max boundaries, and enumeration checks other than the dataset name) never run when the CLI is invoked with --no-monetdb. Previously, the same method always invoked validate_csv/validate_csv_with_volume, so invalid rows were rejected before import. After this change, any CSV with wrong types or values is accepted silently in SQLite-only mode, allowing bad data into the system. The MonetDB dependency is only needed for the COPY-FROM-FILE branch; the pure Pandas validation should still run without MonetDB. Consider removing the early return and running the same validation logic regardless of MonetDB availability.
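A minimal sketch of the suggested control flow, assuming a much simplified stand-in for the project's validation: the header check and the row-level checks run unconditionally, and the MonetDB handle is never consulted before validating. The helper `validate_rows`, the shape of the `cdes` rules dictionary, and the use of the standard-library `csv` module in place of Pandas are all assumptions made for illustration; they are not the repository's actual `validate_csv` implementation.

```python
import csv
import io

DATASET_COLUMN_NAME = "dataset"


class InvalidDatasetError(Exception):
    """Stand-in for the project's exception of the same name."""


def validate_rows(rows, cdes):
    """Hypothetical row check: enumerations and numeric min/max per column."""
    for line_no, row in enumerate(rows, start=2):  # the header is line 1
        for column, rule in cdes.items():
            value = row.get(column)
            if value is None or value == "":
                continue  # missing values are not checked in this sketch
            allowed = rule.get("enumerations")
            if allowed is not None and value not in allowed:
                raise InvalidDatasetError(
                    f"line {line_no}: {column}={value!r} is not an allowed value"
                )
            if "min" in rule or "max" in rule:
                try:
                    number = float(value)
                except ValueError:
                    raise InvalidDatasetError(
                        f"line {line_no}: {column}={value!r} is not numeric"
                    ) from None
                if "min" in rule and number < rule["min"]:
                    raise InvalidDatasetError(f"line {line_no}: {column} below min")
                if "max" in rule and number > rule["max"]:
                    raise InvalidDatasetError(f"line {line_no}: {column} above max")


def validate_datasets(csv_file, cdes, monetdb=None):
    """Validate every row, then return the dataset names in first-seen order."""
    reader = csv.DictReader(csv_file)
    if DATASET_COLUMN_NAME not in (reader.fieldnames or []):
        raise InvalidDatasetError(
            "The 'dataset' column is required to exist in the csv."
        )
    rows = list(reader)
    # No early return here: validation does not depend on `monetdb`.
    # Only a database-backed step (omitted in this sketch) would need it.
    validate_rows(rows, cdes)
    return list(dict.fromkeys(row[DATASET_COLUMN_NAME] for row in rows))
```

With this shape, a CSV containing a non-numeric value in a numeric column is rejected even when `monetdb` is `None`, which is the behavior the review asks to preserve.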