Fix top 3 Sentry production errors (~104K events) #3196
Draft
cursor[bot] wants to merge 2 commits into main
…pleObjectsReturned, OpenSearch headline mapping

Fixes RESEARCHHUB-BACKEND-4KPZ: _extract_license_info crashes when the OpenAlex 'source' field is explicitly null (67K events). The dict.get() default only applies when the key is missing, not when the value is None. Use an isinstance check instead.

Fixes RESEARCHHUB-BACKEND-4KQ0: Author.objects.update_or_create raises MultipleObjectsReturned when duplicate openalex_ids exist (35K events). Replace with a filter().first() + manual update/create pattern.

Fixes RESEARCHHUB-BACKEND-4KKS: The UserDocument author_profile ObjectField lacks explicit property definitions, so OpenSearch's dynamic mapping can infer headline as an object type. Add explicit properties to prevent mapper_parsing_exception (1.1K events).

Co-authored-by: Kobe Attias <kobeattias@gmail.com>

…ndling

Tests for RESEARCHHUB-BACKEND-4KPZ: Verify _extract_license_info handles null source, missing source, valid source, empty/null primary_location, and non-dict source values.

Tests for RESEARCHHUB-BACKEND-4KQ0: Verify process_authors handles duplicate openalex_ids, creates new authors, and updates existing ones.

Co-authored-by: Kobe Attias <kobeattias@gmail.com>

Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main    #3196      +/-   ##
==========================================
+ Coverage   78.49%   78.59%   +0.10%
==========================================
  Files         612      612
  Lines       34250    34255       +5
==========================================
+ Hits        26885    26924      +39
+ Misses       7365     7331      -34



Addresses the three highest-impact unresolved production errors by event count:

1. RESEARCHHUB-BACKEND-4KPZ: AttributeError in OpenAlex license extraction (67,717 events)
- Cause: primary_location.get("source", {}) returns None (not {}) when the source key exists but its value is explicitly None. Calling .get("display_name") on None raises AttributeError.
- Fix: check isinstance(source, dict) before accessing nested fields.

2. RESEARCHHUB-BACKEND-4KQ0: Author.MultipleObjectsReturned (35,197 events)
- Cause: Author.objects.update_or_create(openalex_ids=...) calls .get() internally, which raises MultipleObjectsReturned when duplicate author records exist with the same openalex_ids.
- Fix: use a filter().first() + manual update/create pattern to handle duplicates gracefully.

3. RESEARCHHUB-BACKEND-4KKS: BulkIndexError in OpenSearch user indexing (1,130 events)
- Cause: UserDocument.author_profile is defined as ObjectField() without explicit property types. OpenSearch uses dynamic mapping, which can infer headline as an object type from one document and then reject string values from subsequent documents.
- Fix: add explicit properties (id, headline, profile_image) to the ObjectField definition.

Tests added:
- _extract_license_info edge cases (null source, missing source, valid source, empty/null primary_location, non-dict source)
- process_authors (duplicate handling, creation, update)
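
The null-source failure in item 1 can be reproduced without any project code. The sketch below is only an illustration: `extract_source_name` is a hypothetical stand-in, not the real `_extract_license_info`, and the sample journal name is made up. It shows that the `dict.get()` default applies only when the key is absent, and how an `isinstance` guard covers the null, missing, and non-dict cases.

```python
from typing import Any, Optional


def extract_source_name(primary_location: Optional[Any]) -> Optional[str]:
    """Hypothetical stand-in for the nested lookup; not the project's function."""
    # Guard against a null or non-dict primary_location.
    if not isinstance(primary_location, dict):
        return None
    source = primary_location.get("source")
    # Guard against an explicit null (or any non-dict) source value.
    if not isinstance(source, dict):
        return None
    return source.get("display_name")


# The default argument is used only when the key is missing.
missing = {}.get("source", {})                       # key absent: default is returned
explicit_null = {"source": None}.get("source", {})   # key present: stored None is returned

print(missing, explicit_null)
print(extract_source_name({"source": None}))
print(extract_source_name({"source": {"display_name": "Example Journal"}}))
```

The same guard also handles a string or list in the source field, which a plain `or {}` fallback would not.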
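
For item 3, the effect of declaring explicit properties is that the index mapping pins each sub-field's type up front instead of leaving it to dynamic mapping. A rough sketch of such a mapping body, written as a Python dict, follows. The three field names come from the description above; the concrete types (`integer`, `text`) are assumptions chosen for illustration and may differ from what the PR actually declares.

```python
# Sketch of an explicit mapping for author_profile.
# Field types are assumed for illustration; the PR's actual types are not shown here.
author_profile_mapping = {
    "properties": {
        "author_profile": {
            "type": "object",
            "properties": {
                "id": {"type": "integer"},          # assumed type
                "headline": {"type": "text"},       # pinned so it is not inferred as an object
                "profile_image": {"type": "text"},  # assumed type
            },
        }
    }
}

headline_type = author_profile_mapping["properties"]["author_profile"]["properties"]["headline"]["type"]
print(headline_type)
```

With the type of headline fixed in the mapping, its interpretation no longer depends on whichever document happens to be indexed first.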