Fix null publisher and None author crashes during DOI ingestion #50

Open
ssheikholeslami wants to merge 2 commits into main from fix-author-publisher-checks

Conversation

@ssheikholeslami
Collaborator

  • Handle None publisher from OpenAire metadata by falling back to an empty string, preventing a Pydantic validation error
  • Skip parse_author when the authors field is None to prevent an AttributeError during metadata parsing


Copilot AI left a comment


Pull request overview

Prevents DOI ingestion crashes when OpenAire metadata contains null values for publisher or authors.

Changes:

  • Default publisher to an empty string when OpenAire returns null/missing publisher.
  • Skip parse_author when authors is explicitly None to avoid AttributeError.


Comment on lines 113 to 116
  title = clean_html(entity["mainTitle"])

- publisher = entity.get("publisher", None)
+ publisher = entity.get("publisher") or ""


Copilot AI Apr 1, 2026


This change alters metadata parsing by coercing a missing/None publisher to an empty string, but there’s no test exercising the publisher is None (or missing key) case. Add a unit test that parses an OpenAire fixture with publisher: null (and/or no publisher key) and asserts AnonymousArticle.publisher == "" to prevent regressions.
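A minimal sketch of what such a test could look like, under stated assumptions: `extract_publisher` is a hypothetical stand-in for the publisher handling in the parser (the real function name is not shown in this thread), and the JSON fixtures are invented for illustration. It also shows why `or ""` is needed: a default passed to `dict.get` is only used when the key is absent, not when the value is an explicit null.

```python
import json
from typing import Any, Dict


def extract_publisher(entity: Dict[str, Any]) -> str:
    """Hypothetical stand-in for the parser's publisher handling."""
    # `or ""` covers both a missing key and an explicit null value.
    return entity.get("publisher") or ""


def test_publisher_null_or_missing() -> None:
    # Invented fixtures; real OpenAire responses carry many more fields.
    null_fixture = json.loads('{"mainTitle": "A title", "publisher": null}')
    missing_fixture = json.loads('{"mainTitle": "A title"}')
    present_fixture = json.loads('{"mainTitle": "A title", "publisher": "ACME Press"}')

    assert extract_publisher(null_fixture) == ""
    assert extract_publisher(missing_fixture) == ""
    assert extract_publisher(present_fixture) == "ACME Press"

    # A default argument alone does not help when the key exists with a null value.
    assert null_fixture.get("publisher", "") is None


test_publisher_null_or_missing()
```

In the real test suite the assertion would presumably target `AnonymousArticle.publisher == ""` on the parsed result, as suggested above.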

Comment on lines 132 to 142
  authors = entity.get("authors", None)

  all_authors: List[AnonymousAuthor] = []
  if isinstance(authors, list):
      for x in authors:
          author = parse_author(x)
          if author:
              all_authors.append(author)
- else:
+ elif authors is not None:
      author = parse_author(authors)
      if author:

Copilot AI Apr 1, 2026


The new authors is not None guard prevents calling parse_author(None), but there’s no test covering the authors: null case in OpenAire responses. Add a unit test that parses metadata where authors is explicitly null and asserts parsing succeeds and returns an empty authors list.
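A rough sketch of such a test, with several assumptions: `Author` stands in for `AnonymousAuthor`, `parse_author` is a simplified hypothetical version that reads an assumed `fullName` key, and `collect_authors` wraps the guard from the diff. The body after the truncated `if author:` in the excerpt is assumed to append to the list, mirroring the list branch.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Author:
    """Hypothetical stand-in for AnonymousAuthor."""
    name: str


def parse_author(raw: Dict[str, Any]) -> Optional[Author]:
    """Simplified, assumed parser; calling it with None raises AttributeError."""
    name = raw.get("fullName")  # "fullName" is an assumed key
    return Author(name) if name else None


def collect_authors(entity: Dict[str, Any]) -> List[Author]:
    authors = entity.get("authors", None)
    all_authors: List[Author] = []
    if isinstance(authors, list):
        for x in authors:
            author = parse_author(x)
            if author:
                all_authors.append(author)
    elif authors is not None:  # the guard added in this PR
        author = parse_author(authors)
        if author:
            all_authors.append(author)  # assumed continuation
    return all_authors


def test_authors_null_or_missing() -> None:
    # Explicit null and a missing key both yield an empty list.
    assert collect_authors({"authors": None}) == []
    assert collect_authors({}) == []
    # A single author object and a list of authors still parse.
    assert collect_authors({"authors": {"fullName": "Ada"}}) == [Author("Ada")]
    assert collect_authors(
        {"authors": [{"fullName": "Ada"}, {"fullName": "Bob"}]}
    ) == [Author("Ada"), Author("Bob")]


test_authors_null_or_missing()
```

With the previous bare `else:` branch, the first assertion would fail in this sketch because `parse_author(None)` calls `.get` on `None`.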

@ssheikholeslami ssheikholeslami requested a review from Copilot April 1, 2026 14:36

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.



@ssheikholeslami ssheikholeslami requested a review from willu47 April 1, 2026 14:50

2 participants