
Fix 9 production error patterns with tests (~100K+ events)#3225

Draft
cursor[bot] wants to merge 2 commits into main from cursor/sentry-error-investigation-dee8

cursor Bot commented Mar 23, 2026

Addresses 9 high-impact production error patterns identified through code audit. None of the fixes from the previously closed PRs (#3196, #3199, #3209) were merged.

Fixes

1. OpenAlex license extraction AttributeError (~67K events)

  • primary_location.get("source", {}).get("display_name") raises AttributeError when source is explicitly null, because the {} default applies only when the key is missing
  • Fix: Use isinstance(source, dict) guard before accessing nested fields
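
A minimal sketch of the guard described above, assuming the payload is a plain dict. The function name get_source_display_name is hypothetical and is not the project's _extract_license_info.

```python
from typing import Optional


def get_source_display_name(primary_location: Optional[dict]) -> Optional[str]:
    """Return source.display_name, or None when source is absent, null, or not a dict.

    Hypothetical helper for illustration only.
    """
    if not isinstance(primary_location, dict):
        return None
    # .get("source", {}) falls back to {} only when the key is missing.
    # An explicit null in the payload comes back as None, so check the type.
    source = primary_location.get("source")
    if not isinstance(source, dict):
        return None
    return source.get("display_name")
```

The isinstance check also covers a source that arrives as a string or list, which an `or {}` fallback would not.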

2. Author MultipleObjectsReturned (~35K events)

  • Author.objects.update_or_create(openalex_ids=...) raises MultipleObjectsReturned when more than one author matches
  • Fix: Replace with filter().first() + manual update/create pattern
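
The filter().first() pattern could look roughly like the sketch below. The upsert_author function and its manager argument are assumptions for illustration. It relies only on the standard Django manager calls filter(), first(), create(), and save(update_fields=...), and it is written against any object that offers them so it can be read without the project's models.

```python
def upsert_author(manager, openalex_ids, defaults):
    """Update the first matching author or create one; returns (author, created).

    Hypothetical helper: `manager` stands in for Author.objects.
    """
    # first() returns one row or None and never raises MultipleObjectsReturned,
    # even when duplicate rows match the lookup.
    author = manager.filter(openalex_ids=openalex_ids).first()
    if author is None:
        return manager.create(openalex_ids=openalex_ids, **defaults), True
    for field, value in defaults.items():
        setattr(author, field, value)
    # Only write the fields that were actually changed.
    author.save(update_fields=list(defaults))
    return author, False
```

This tolerates existing duplicates but does not remove them; the other matching rows are left untouched.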

3. OpenAlex util AttributeError on null dict chains

  • Multiple .get("key", {}).get(...) patterns fail when values are explicitly null
  • Fix: Use (x or {}) pattern; guard against empty display_name causing IndexError
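
As an illustration of the (x or {}) pattern and the empty-name guard, here is a small sketch. The function author_name_parts and the authorship shape are assumptions, not the code in openalex_util.py.

```python
from typing import Optional, Tuple


def author_name_parts(authorship: Optional[dict]) -> Tuple[str, str]:
    """Split an authorship's display_name into (first, rest), tolerating nulls.

    Hypothetical helper for illustration only.
    """
    # `or {}` replaces both a missing key and an explicit null with an empty dict.
    author = (authorship or {}).get("author") or {}
    display_name = (author.get("display_name") or "").strip()
    parts = display_name.split()
    # An empty name yields no parts; indexing parts[0] would raise IndexError.
    if not parts:
        return "", ""
    return parts[0], " ".join(parts[1:])
```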

4. Persona webhook AttributeError on .lower()

  • _get_nested_attr(...).lower() crashes when status path returns None
  • Fix: Add null safety with (raw_status or "").lower() and defaults for all fields
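
A sketch of the null-safe status lookup, under stated assumptions: get_nested is a stand-in for the project's _get_nested_attr (whose signature is not shown here), and the data/attributes/status path is a guess at the payload layout.

```python
from typing import Any


def get_nested(data: Any, *keys: str) -> Any:
    """Walk nested dicts; return None as soon as a level is missing or not a dict.

    Hypothetical stand-in for _get_nested_attr.
    """
    current = data
    for key in keys:
        if not isinstance(current, dict):
            return None
        current = current.get(key)
    return current


def parse_status(payload: dict) -> str:
    """Return the lowercased status, or "" when the path resolves to None."""
    # Assumed payload path; the real webhook layout may differ.
    raw_status = get_nested(payload, "data", "attributes", "status")
    return (raw_status or "").lower()
```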

5. "Some authors not found" Sentry noise

  • log_error(ValueError(...)) sends to Sentry for a non-actionable condition
  • Fix: Downgrade to logger.warning
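
The downgrade can be sketched with the standard logging module. The function report_missing_authors and its arguments are hypothetical. Whether a warning still reaches Sentry depends on the project's logging integration settings, which are not shown here.

```python
import logging

logger = logging.getLogger(__name__)


def report_missing_authors(requested_ids, found_ids):
    """Log missing author IDs at WARNING level and return them sorted.

    Hypothetical helper for illustration only.
    """
    missing = sorted(set(requested_ids) - set(found_ids))
    if missing:
        # Expected, non-actionable condition: log it instead of raising
        # or wrapping it in an exception sent to error tracking.
        logger.warning("Some authors not found: %s", missing)
    return missing
```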

6. UserDocument BulkIndexError from dynamic mapping

  • Bare ObjectField() causes OpenSearch mapping conflicts for headline
  • Fix: Add explicit properties (id, headline, profile_image)
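
For context, with dynamic mapping OpenSearch infers a field's type from the first document that contains it, and later documents with a different shape for that field are rejected. An explicit mapping for the object might look like the dict below. The property names come from this PR; the field types are assumptions, since the description does not state them.

```python
# Sketch of an explicit object mapping for author_profile.
# The three property names are from the PR; every "type" value is assumed.
AUTHOR_PROFILE_MAPPING = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "headline": {"type": "text"},
        "profile_image": {"type": "keyword"},
    },
}
```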

7. PersonDocument mapper_parsing_exception

  • Non-string headline values cause OpenSearch indexing failures
  • Fix: Add prepare_headline normalizer that returns empty string for non-strings
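
A minimal sketch of such a normalizer, written as a standalone class so it runs without the search library. The class name HeadlineNormalizer is hypothetical; in the PR the method lives on PersonDocument.

```python
class HeadlineNormalizer:
    """Hypothetical holder for the prepare_headline hook, for illustration only."""

    def prepare_headline(self, instance) -> str:
        """Return the headline if it is a string, otherwise an empty string."""
        headline = getattr(instance, "headline", None)
        # Dicts, numbers, and None would not fit a text field, so index "" instead.
        return headline if isinstance(headline, str) else ""
```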

8. Contribution views NoneType error

  • The hypothesis model no longer exists but is still in the allowed models list
  • Fix: Remove from _get_allowed_models()

9. Search indexing BulkIndexError / NotFoundError

  • not_found results for delete actions in bulk operations crash Celery tasks
  • Fix: Catch NotFoundError and benign BulkIndexError patterns
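
One way to classify a bulk failure as benign is sketched below. It is not the PR's _is_benign_bulk_error. The function name only_missing_deletes is hypothetical, and the item shape (one action key mapping to a detail dict with status and result) is an assumption about what the bulk helper reports in its errors list.

```python
def only_missing_deletes(errors) -> bool:
    """Return True when every bulk error is a delete of a document that was already gone.

    Hypothetical helper; assumes items shaped like
    {"delete": {"_id": "1", "status": 404, "result": "not_found"}}.
    """
    if not errors:
        return False
    for item in errors:
        if not isinstance(item, dict) or len(item) != 1:
            return False
        ((action, detail),) = item.items()
        if action != "delete" or not isinstance(detail, dict):
            return False
        if detail.get("status") != 404 and detail.get("result") != "not_found":
            return False
    return True
```

A caller would swallow the exception only when this returns True and re-raise otherwise, so real indexing failures still propagate.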

Tests added

  • 6 tests for license extraction null safety
  • 7 tests for openalex_util null safety
  • 4 tests for Persona webhook null safety
  • 4 tests for PersonDocument headline normalizer
  • 6 tests for search celery error handling (including _is_benign_bulk_error)

- Fix _extract_license_info None source in OpenAlex mapper (~67K events)
  Source field can be explicitly null; use isinstance check
- Fix Author update_or_create MultipleObjectsReturned (~35K events)
  Replace with filter().first() pattern to handle duplicates
- Fix openalex_util.py None-in-dict chains
  Use (x or {}) pattern instead of .get('key', {}) for null values
  Guard against empty display_name causing IndexError
- Fix Persona webhook .lower() on None
  Add null safety for status, first_name, last_name, inquiry_id
- Downgrade 'Some authors not found' from Sentry error to logger.warning
- Add explicit properties to UserDocument author_profile ObjectField
  Prevents BulkIndexError from dynamic mapping conflicts
- Add prepare_headline normalizer to PersonDocument
  Prevents mapper_parsing_exception for non-string headlines
- Remove 'hypothesis' from contribution allowed models
  Model no longer exists, causes NoneType errors
- Add BulkIndexError handling in search celery tasks
  Ignore benign delete not_found errors, handle NotFoundError
- 6 tests for _extract_license_info null safety (None source, missing,
  non-dict, etc.)
- 7 tests for openalex_util null safety (None author in authorship,
  empty display_name, null authorships)
- 4 tests for Persona webhook null safety (_get_nested_attr with None
  paths, missing status field)
- 4 tests for PersonDocument prepare_headline normalizer (valid string,
  None, non-string, empty string)
- 6 tests for search celery BulkIndexError handling (NotFoundError,
  benign bulk error, real error propagation, _is_benign_bulk_error)

codecov Bot commented Mar 23, 2026

Codecov Report

❌ Patch coverage is 87.87879% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.79%. Comparing base (ce7eedb) to head (250ee5d).

Files with missing lines          Patch %   Lines
src/search/celery.py              66.66%    6 Missing ⚠️
src/paper/openalex_util.py        95.83%    1 Missing ⚠️
src/paper/views/paper_views.py    66.66%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3225      +/-   ##
==========================================
+ Coverage   78.75%   78.79%   +0.03%     
==========================================
  Files         610      610              
  Lines       34334    34379      +45     
==========================================
+ Hits        27039    27088      +49     
+ Misses       7295     7291       -4     
