Fix null publisher and None author crashes during DOI ingestion #50
ssheikholeslami wants to merge 2 commits into main from
Pull request overview
Prevents DOI ingestion crashes when OpenAire metadata contains null values for publisher or authors.
Changes:
- Default publisher to an empty string when OpenAire returns a null or missing publisher.
- Skip parse_author when authors is explicitly None to avoid AttributeError.
  title = clean_html(entity["mainTitle"])

- publisher = entity.get("publisher", None)
+ publisher = entity.get("publisher") or ""
This change alters metadata parsing by coercing a missing/None publisher to an empty string, but there’s no test exercising the publisher is None (or missing key) case. Add a unit test that parses an OpenAire fixture with publisher: null (and/or no publisher key) and asserts AnonymousArticle.publisher == "" to prevent regressions.
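A minimal sketch of the test cases this comment asks for. The helper `extract_publisher` is hypothetical and only isolates the changed line; the project's real parser and the `AnonymousArticle` model are not shown in this PR and are not reproduced here. It also illustrates why a `.get` default alone would not be enough: the default applies only when the key is absent, not when the value is an explicit null.

```python
from typing import Any, Dict


def extract_publisher(entity: Dict[str, Any]) -> str:
    """Hypothetical helper wrapping the changed line from this PR."""
    # `or ""` covers both a missing key and an explicit None value.
    return entity.get("publisher") or ""


def test_publisher_null_or_missing() -> None:
    # Explicit null, as in `"publisher": null` in the JSON response.
    assert extract_publisher({"publisher": None}) == ""
    # Key absent altogether.
    assert extract_publisher({}) == ""
    # A real value passes through unchanged.
    assert extract_publisher({"publisher": "Example Press"}) == "Example Press"
    # A .get default does not help when the key exists with a None value.
    assert {"publisher": None}.get("publisher", "") is None


test_publisher_null_or_missing()
```

One side effect worth noting: `or ""` also turns any other falsy publisher value (for example an already empty string) into `""`, which is harmless for this field.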
  authors = entity.get("authors", None)

  all_authors: List[AnonymousAuthor] = []
  if isinstance(authors, list):
      for x in authors:
          author = parse_author(x)
          if author:
              all_authors.append(author)
- else:
+ elif authors is not None:
      author = parse_author(authors)
      if author:
The new authors is not None guard prevents calling parse_author(None), but there’s no test covering the authors: null case in OpenAire responses. Add a unit test that parses metadata where authors is explicitly null and asserts parsing succeeds and returns an empty authors list.
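A rough sketch of the requested test, under stated assumptions. The `AnonymousAuthor` class and `parse_author` function below are hypothetical stand-ins (the real ones are not shown in this PR), and the `fullName` key is assumed purely for illustration. `collect_authors` is a hypothetical wrapper around the branch logic from the diff, including the statement that presumably follows the truncated `if author:` line.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class AnonymousAuthor:
    """Hypothetical stand-in for the project's author model."""
    name: str


def parse_author(raw: Any) -> Optional[AnonymousAuthor]:
    """Hypothetical stand-in; `fullName` is an assumed key."""
    # Calling .get on None raises AttributeError, the crash the PR avoids.
    name = raw.get("fullName")
    return AnonymousAuthor(name) if name else None


def collect_authors(entity: Dict[str, Any]) -> List[AnonymousAuthor]:
    """Branch logic from the diff, wrapped in a function for testing."""
    authors = entity.get("authors", None)
    all_authors: List[AnonymousAuthor] = []
    if isinstance(authors, list):
        for x in authors:
            author = parse_author(x)
            if author:
                all_authors.append(author)
    elif authors is not None:  # the new guard: skip an explicit null
        author = parse_author(authors)
        if author:
            all_authors.append(author)
    return all_authors


def test_authors_null() -> None:
    # Explicit null and missing key both yield an empty list.
    assert collect_authors({"authors": None}) == []
    assert collect_authors({}) == []
    # A single object and a list are both still parsed.
    single = collect_authors({"authors": {"fullName": "A. Author"}})
    assert single == [AnonymousAuthor("A. Author")]
    many = collect_authors({"authors": [{"fullName": "A"}, {"fullName": "B"}]})
    assert [a.name for a in many] == ["A", "B"]


test_authors_null()
```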
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
- Handle a None publisher from OpenAire metadata by falling back to an empty string, preventing a Pydantic validation error
- Skip parse_author when the authors field is None to prevent an AttributeError during metadata parsing