Restrict DOI ingestion to specific file formats and structures #15

@FrancisTembo

Description

The current implementation of the DOI ingestion function reads the input file line by line, regardless of file format or structure, so any file type is accepted. Current code:

```python
with open(args.list_of_dois, "r") as csv_file:
    for line in csv_file:
        list_of_dois.append(line.strip())
```

Problem: This code does not validate the file type, so it will attempt to process any input file (e.g., .txt, .csv, .json, .yaml). This works for line-based formats, but if the input has a different format or structure, each raw line is still stripped and appended to the list as if it were a DOI.
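One possible shape for the restriction is sketched below. It is not the project's implementation: the function name `load_dois`, the allowed extensions in `ALLOWED_SUFFIXES`, and the loose DOI pattern are all assumptions made for illustration. Unlike the current code, it skips blank lines and rejects any line that does not look like a DOI, so a CSV with a header row would also be rejected.

```python
import re
from pathlib import Path

# Assumption: only plain, line-based inputs are accepted.
ALLOWED_SUFFIXES = {".txt", ".csv"}

# Loose DOI shape (assumption): "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def load_dois(path):
    """Return the DOIs listed in `path`, or raise ValueError if the file is rejected."""
    path = Path(path)

    # Reject unsupported file types before reading anything.
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(
            f"unsupported file type {path.suffix!r}; "
            f"expected one of {sorted(ALLOWED_SUFFIXES)}"
        )

    dois = []
    with open(path, "r", encoding="utf-8") as doi_file:
        for line_number, line in enumerate(doi_file, start=1):
            entry = line.strip()
            if not entry:
                continue  # blank lines are ignored
            if not DOI_PATTERN.match(entry):
                raise ValueError(f"{path.name}:{line_number}: not a DOI: {entry!r}")
            dois.append(entry)

    # An empty file is treated as an error rather than a silent no-op.
    if not dois:
        raise ValueError(f"{path.name}: no DOIs found")
    return dois
```

Raising `ValueError` keeps the decision about how to report the failure with the caller, which can log the message and stop the pipeline.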

Also, if an invalid .csv file is passed, the pipeline gives no failure feedback: it still reports a Success message.
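A minimal sketch of what failure feedback could look like follows. It assumes an `argparse`-style entry point with a positional `list_of_dois` argument (mirroring `args.list_of_dois` above); the `main` function, the `looks_like_doi` placeholder check, and the message wording are hypothetical and not taken from the project.

```python
import argparse
import sys


def looks_like_doi(entry):
    # Deliberately crude placeholder (assumption): starts with "10." and contains "/".
    return entry.startswith("10.") and "/" in entry


def main(argv=None):
    """Return 0 on success and 1 on any ingestion failure."""
    parser = argparse.ArgumentParser(description="Ingest a list of DOIs.")
    parser.add_argument("list_of_dois", help="path to a file with one DOI per line")
    args = parser.parse_args(argv)

    try:
        with open(args.list_of_dois, "r", encoding="utf-8") as doi_file:
            entries = [line.strip() for line in doi_file if line.strip()]
    except (OSError, UnicodeDecodeError) as error:
        print(f"Failure: cannot read {args.list_of_dois}: {error}", file=sys.stderr)
        return 1

    if not entries:
        print(f"Failure: no DOIs found in {args.list_of_dois}", file=sys.stderr)
        return 1

    invalid = [entry for entry in entries if not looks_like_doi(entry)]
    if invalid:
        print(
            f"Failure: {len(invalid)} invalid line(s) in {args.list_of_dois}",
            file=sys.stderr,
        )
        return 1

    # Success is reported only after every line has passed the check.
    print(f"Success: {len(entries)} DOI(s) accepted")
    return 0
```

Calling `sys.exit(main())` from the script's entry point would propagate the non-zero status to the shell, so a wrapper script or CI job can detect the failure instead of seeing a Success message.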

Metadata

Labels

bug (Something isn't working)
