Fix DOI ingestion bugs #47

Merged: willu47 merged 3 commits into main from fix/doi-ingestion-bugs on Feb 9, 2026

@ssheikholeslami
Collaborator

Summary

  • Fix AttributeError when OpenAIRE API returns None for author name/surname fields
  • Fix TypeError from double db injection when calling search_dois() from validate_dois()

Problem

When processing longer DOI lists, two bugs prevented successful ingestion:

  1. parser.py: The OpenAIRE API can return {"name": None} for authors. Using .get("name", "") only provides a default when the key is missing, not when the value is None, causing AttributeError: 'NoneType' object has no attribute 'title'.

  2. doi.py: Both validate_dois() and search_dois() are decorated with @connect_to_db. Calling self.search_dois(db) passed db explicitly while the decorator also injected it, causing TypeError: search_dois() takes 2 positional arguments but 3 were given.
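The first problem can be reproduced in isolation. The sketch below is illustrative only: the `metadata` record and the `title_case_name` helpers are hypothetical stand-ins, not the project's actual `parse_author` code. It shows that the default passed to `.get()` is ignored when the key exists with a `None` value, and how `or ""` covers both cases.

```python
# Hypothetical author record shaped like the {"name": None} case described above.
metadata = {"name": None, "surname": "curie"}

# The default applies only when the key is absent; a present key holding None is returned as-is.
value_when_none = metadata.get("name", "")
value_when_missing = {}.get("name", "")


def title_case_name(record):
    """Illustrative stand-in for the failing pattern (not the real parser code)."""
    return record.get("name", "").title()


def safe_title_case_name(record):
    """Same lookup, but `or ""` also replaces a None value with an empty string."""
    return (record.get("name") or "").title()


try:
    title_case_name(metadata)
    error_message = None
except AttributeError as exc:
    # Calling .title() on None raises AttributeError.
    error_message = str(exc)

safe_result = safe_title_case_name(metadata)
surname_result = (metadata.get("surname") or "").title()
```

One design note: `or ""` also replaces other falsy values (such as `0` or an empty list) with an empty string, which is acceptable here only on the assumption that name fields are either strings or `None`.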

Fix

  1. Changed metadata.get("name", "") to (metadata.get("name") or "") to handle None values
  2. Changed self.search_dois(db) to self.search_dois() since the decorator handles db injection
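The second fix can be illustrated with a toy decorator. This is a minimal sketch under stated assumptions: the real `@connect_to_db` implementation is not shown in this PR, so the version below (which appends a placeholder `db` object to the positional arguments) and the `DoiManager` class name are hypothetical. Only the method names `search_dois` and `validate_dois` come from the description above.

```python
import functools


def connect_to_db(func):
    """Toy stand-in for the project's decorator: injects a db handle into the call."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        db = object()  # placeholder for a real database driver (assumption)
        return func(self, *args, db, **kwargs)
    return wrapper


class DoiManager:  # hypothetical class name
    @connect_to_db
    def search_dois(self, db):
        return "searched"

    @connect_to_db
    def validate_dois_buggy(self, db):
        # db is passed explicitly here AND injected again by the decorator
        # wrapping search_dois, so search_dois receives it twice.
        return self.search_dois(db)

    @connect_to_db
    def validate_dois_fixed(self, db):
        # Rely on the decorator alone to supply db.
        return self.search_dois()


manager = DoiManager()

try:
    manager.validate_dois_buggy()
    double_injection_error = None
except TypeError as exc:
    double_injection_error = str(exc)

fixed_result = manager.validate_dois_fixed()
```

With this toy decorator, the buggy call fails because `search_dois` ends up with `self`, the explicit `db`, and the injected `db`, while the fixed call receives exactly one `db`.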

Test plan

  • Tested with longer DOI list (list_of_doi.csv from CCG) that previously failed.

ssheikholeslami and others added 2 commits February 4, 2026 16:57
The OpenAIRE API can return {"name": None} for authors. Using
.get("name", "") only provides a default when the key is missing,
not when the value is None. Changed to use `or ""` to handle both cases.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The search_dois() method is decorated with @connect_to_db which injects
the db argument automatically. Calling it with self.search_dois(db) from
validate_dois() (also decorated) caused the db to be passed twice.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

This PR fixes two runtime errors encountered during DOI ingestion: handling None author name fields from the OpenAIRE API and avoiding a double db injection when validating DOIs.

Changes:

  • Updated parse_author in parser.py to safely handle None values for name and surname from OpenAIRE metadata without raising AttributeError.
  • Updated validate_dois in doi.py to call the decorated search_dois without explicitly passing the db argument, preventing a TypeError from double-injecting the database driver.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
src/research_index_backend/parser.py | Adjusts author name parsing to coerce None name/surname values to empty strings before title-casing, preventing ingestion failures when OpenAIRE returns null for these fields.
src/research_index_backend/doi.py | Fixes the internal call from validate_dois to search_dois to rely solely on the @connect_to_db decorator, resolving the double-db injection error during DOI validation.

Contributor

@willu47 left a comment

I've tested the code locally and can confirm that it works.

@willu47 merged commit 2138482 into main on Feb 9, 2026
10 of 11 checks passed
@willu47 deleted the fix/doi-ingestion-bugs branch on February 9, 2026 15:12