Make the CSV Base Importer's date function robust against empty lines #197

@johannesjh

Please make the beangulp.importers.csvbase.Importer.date method robust against empty lines in the CSV file, e.g., by simply adding if row to the generator expression, resulting in return max(row.date for row in self.read(filepath) if row).

Problem: The CSV Base Importer's date function currently is not robust against empty lines in the CSV files. In other words, empty lines in the CSV lead to an IndexError exception as follows:

HTTP 500 - Error calling method on importer:

Traceback (most recent call last):
  File ".venv/lib/python3.13/site-packages/fava/core/ingest.py", line 183, in wrapper
    return func(*args, **kwds)
  File ".venv/lib/python3.13/site-packages/fava/core/ingest.py", line 237, in file_import_info
    date = importer.date(str_path)
  File ".venv/lib/python3.13/site-packages/beangulp/importers/csvbase.py", line 264, in date
    return max(row.date for row in self.read(filepath))
  File ".venv/lib/python3.13/site-packages/beangulp/importers/csvbase.py", line 264, in <genexpr>
    return max(row.date for row in self.read(filepath))
               ^^^^^^^^
  File ".venv/lib/python3.13/site-packages/beangulp/importers/csvbase.py", line 72, in func
    value = tuple(obj[i] for i in idxs)
  File ".venv/lib/python3.13/site-packages/beangulp/importers/csvbase.py", line 72, in <genexpr>
    value = tuple(obj[i] for i in idxs)
                  ~~~^^^
IndexError: tuple index out of range

Analysis: The current implementation of beangulp.importers.csvbase.Importer.date is shown below. The code expects each row to have a "date" attribute, but rows resulting from empty lines in the .csv file do not meet this expectation.

    def date(self, filepath):
        """Implement beangulp.Importer::date()"""
        return max(row.date for row in self.read(filepath))
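
To illustrate the failure mode, here is a minimal sketch that uses only Python's csv module, not beangulp itself. It assumes the rows ultimately come from csv.reader, which reports a blank line as an empty row. The sample data and the first_column helper are hypothetical stand-ins for the index-based column lookup seen in the traceback.

```python
import csv
import io

# Hypothetical CSV content with a blank line between two records.
data = "2024-01-01,10.00\n\n2024-01-02,20.00\n"

# csv.reader yields an empty list for the blank line; convert each row to a tuple.
rows = [tuple(fields) for fields in csv.reader(io.StringIO(data))]


def first_column(row):
    """Hypothetical stand-in for a column lookup by index (not beangulp's code)."""
    return row[0]


# Looking up a column on the empty row fails in the same way as in the traceback.
try:
    first_column(rows[1])
    error = None
except IndexError as exc:
    error = exc

print(rows)
print(repr(error))
```

The middle row is an empty tuple, so any lookup by column index raises IndexError.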

On a side note, other parts of beangulp's CSV Base Importer are already robust against empty lines. Specifically, beangulp.importers.csvbase.Importer.extract includes the following lines to handle empty lines:

            # Skip empty lines.
            if not row:
                continue

Suggested solution: Make beangulp.importers.csvbase.Importer.date robust against empty lines, in the same way the extract method already is. This can be achieved by adding just the two words if row to the date function, as follows:

    def date(self, filepath):
        """Implement beangulp.Importer::date()"""
        return max(row.date for row in self.read(filepath) if row)
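
The effect of the added filter can be checked in isolation with a small sketch. It is not beangulp code: read and latest_date are hypothetical stand-ins for Importer.read and Importer.date, and the sample data assumes ISO-formatted dates in the first column.

```python
import csv
import datetime
import io


def read(text):
    """Hypothetical stand-in for Importer.read(): one tuple per CSV line."""
    for fields in csv.reader(io.StringIO(text)):
        yield tuple(fields)


def latest_date(text):
    """Mirror the proposed fix: `if row` drops the empty tuples from blank lines."""
    return max(
        datetime.date.fromisoformat(row[0]) for row in read(text) if row
    )


# Hypothetical data with one blank line in the middle and one at the end.
data = "2024-01-01,10.00\n\n2024-03-15,20.00\n\n"
result = latest_date(data)
print(result)
```

Without the if row filter, the same input raises IndexError on the first blank line.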
