
[Expert Finder] Move experts from unstructured JSON toward a relational model #3295

Closed
nicktytarenko wants to merge 11 commits into main from
expert-finder/refactor-expert-search-models-serializers-migrate-expert-results-v2

Conversation

Contributor

@nicktytarenko nicktytarenko commented Apr 10, 2026

What changed

Relational experts (Expert + SearchExpert)

  • Adds Expert: one row per normalized professional email, with registered_user, last_email_sent_at, and structured fields (name parts, academic title, affiliation, etc.).
  • Adds SearchExpert: ordered membership linking an Expert to an ExpertSearch.

This mirrors behavior that existed in ARTEMIS, where a JSON blob was enough; in ResearchHub we normalize that data into proper tables so the same concepts are queryable, deduplicated, and consistent with the rest of the stack.
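
The two tables described above can be sketched in plain Python. This is only an illustration, not the PR's Django models: the ExpertStore class, the normalize_email rule (trim and lowercase), and the tuple layout for memberships are all assumptions made so the sketch runs without Django.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional, Tuple


def normalize_email(raw: str) -> str:
    """Assumed normalization rule: trim whitespace and lowercase."""
    return raw.strip().lower()


@dataclass
class Expert:
    email: str  # normalized; one Expert per distinct value
    first_name: str = ""
    middle_name: str = ""
    last_name: str = ""
    academic_title: str = ""
    affiliation: str = ""
    registered_user_id: Optional[int] = None  # stands in for the registered_user link
    last_email_sent_at: Optional[datetime] = None


@dataclass
class ExpertStore:
    """Hypothetical in-memory stand-in for the Expert and SearchExpert tables."""

    experts: Dict[str, Expert] = field(default_factory=dict)
    # Each entry is (search_id, position, expert_email): ordered membership.
    search_experts: List[Tuple[int, int, str]] = field(default_factory=list)

    def get_or_create(self, raw_email: str, **fields) -> Expert:
        key = normalize_email(raw_email)
        if key not in self.experts:
            self.experts[key] = Expert(email=key, **fields)
        return self.experts[key]

    def add_to_search(self, search_id: int, raw_email: str, **fields) -> Expert:
        expert = self.get_or_create(raw_email, **fields)
        # Position is the number of experts already attached to this search.
        position = sum(1 for sid, _, _ in self.search_experts if sid == search_id)
        self.search_experts.append((search_id, position, expert.email))
        return expert


store = ExpertStore()
store.add_to_search(1, "Ada@Example.org ", first_name="Ada")
store.add_to_search(2, "ada@example.org")  # same person, second search
```

Both calls resolve to the same Expert entry, while each search keeps its own membership record. That separation is what makes deduplication a property of the table rather than of each JSON blob.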

Why it matters

  • Duplicate experts: easier to fix now that experts are keyed by normalized email and stored in a structured table instead of only in per-search JSON.
  • Exclusions: we no longer exclude experts by name, which was fragile and ambiguous. excluded_expert_names is removed; excluded_search_ids (IDs of other searches) now drives exclusion: the experts attached to those searches are resolved to expert IDs, and those IDs are excluded.
  • Invited experts: no separate invited-expert table. Expert.registered_user links a canonical expert row to a User when someone registers with the same email.
  • Names from the LLM: the model used to return a single “full name” string, and suffixes like PhD could get glued onto the wrong part. Honorific, first, middle, last, suffix, and academic title are separate fields so we can build display/salutation strings (for now: first + middle + last as needed) without mis-parsing credentials.
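
The exclusion and name-handling points above can be sketched as follows. The SEARCH_EXPERTS mapping, the function names, and the sample IDs are hypothetical; the real code presumably queries SearchExpert rows rather than a dictionary.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical stand-in for SearchExpert rows: search ID -> ordered expert IDs.
SEARCH_EXPERTS: Dict[int, List[int]] = {
    101: [1, 2, 3],
    102: [3, 4],
}


def resolve_excluded_expert_ids(excluded_search_ids: Iterable[int]) -> Set[int]:
    """Collect the expert IDs that belong to any of the excluded searches."""
    excluded: Set[int] = set()
    for search_id in excluded_search_ids:
        excluded.update(SEARCH_EXPERTS.get(search_id, []))
    return excluded


def filter_candidates(candidate_ids: Iterable[int],
                      excluded_search_ids: Iterable[int]) -> List[int]:
    """Drop candidates already present in the excluded searches, keeping order."""
    excluded = resolve_excluded_expert_ids(excluded_search_ids)
    return [expert_id for expert_id in candidate_ids if expert_id not in excluded]


def display_name(first: str = "", middle: str = "", last: str = "") -> str:
    """Join first + middle + last, skipping empty parts; credentials stay separate."""
    parts = (first, middle, last)
    return " ".join(p.strip() for p in parts if p and p.strip())


remaining = filter_candidates([1, 4, 5, 6], excluded_search_ids=[102])
salutation = display_name("Ada", "", "Lovelace")
```

Because exclusion works on IDs, two experts who share a name are never confused. Because the suffix and academic title live in their own fields, a value such as PhD never ends up inside the display string.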

Data migration

  • Adds a management command to backfill Expert / SearchExpert from the legacy JSON field. It is temporary and can be removed after migration is complete and JSON is no longer the source of truth.
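
A rough sketch of what such a backfill might look like, including a dry-run mode that reports counts without writing anything. The legacy JSON shape (a list of dicts with "email" and "name" keys per search), the backfill function, and the in-memory containers are assumptions for illustration; the actual command works against the database.

```python
from typing import Any, Dict, List, Tuple


def backfill(
    legacy_searches: Dict[int, List[Dict[str, Any]]],
    experts: Dict[str, Dict[str, Any]],
    memberships: List[Tuple[int, int, str]],
    dry_run: bool = True,
) -> Dict[str, int]:
    """Copy legacy per-search JSON rows into expert and membership records.

    With dry_run=True nothing is written; only the counts are returned.
    """
    stats = {"experts_created": 0, "memberships_created": 0, "skipped": 0}
    seen = set(experts)  # emails that already have an expert record
    for search_id, rows in legacy_searches.items():
        in_search = set()  # avoid attaching the same expert twice to one search
        position = 0
        for row in rows:
            email = (row.get("email") or "").strip().lower()
            if not email or email in in_search:
                stats["skipped"] += 1
                continue
            in_search.add(email)
            if email not in seen:
                seen.add(email)
                stats["experts_created"] += 1
                if not dry_run:
                    experts[email] = {"email": email, "name": row.get("name", "")}
            stats["memberships_created"] += 1
            if not dry_run:
                memberships.append((search_id, position, email))
            position += 1
    return stats


legacy = {
    1: [{"email": "Ada@Example.org", "name": "Ada Lovelace"}, {"email": ""}],
    2: [{"email": "ada@example.org "}],
}
experts: Dict[str, Dict[str, Any]] = {}
memberships: List[Tuple[int, int, str]] = []

preview = backfill(legacy, experts, memberships, dry_run=True)
nothing_written = (len(experts), len(memberships))
applied = backfill(legacy, experts, memberships, dry_run=False)
```

Running the dry run first and comparing its counts with the real run is one way to check the migration before the JSON field stops being the source of truth.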

Important

Should be released together with ResearchHub/web#758

Introduce canonical Expert rows (keyed by email) and SearchExpert
membership for expert lists per ExpertSearch.
One-off / operational command to copy legacy expert_results JSON into
Expert and SearchExpert rows, with dry-run support for safe rollout.
Align finder flow with relational experts: persist/replace search
experts, parse markdown table into structured fields, and keep prompts
and constants consistent with the new exclusion and column layout.
context, invited expert derivation without the removed invited table
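
The commit note above about parsing the markdown table into structured fields could look roughly like this. The column names and the sample table are invented for illustration; the actual prompt's column layout is not shown in this PR page, and this sketch does not handle escaped pipes inside cells.

```python
from typing import Dict, List, Optional


def parse_markdown_table(text: str) -> List[Dict[str, str]]:
    """Turn the first pipe-delimited table found in text into a list of dicts."""
    header: Optional[List[str]] = None
    rows: List[Dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # prose around the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            # "Academic Title" -> "academic_title"
            header = [cell.lower().replace(" ", "_") for cell in cells]
            continue
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # separator row such as |---|:---:|
        if len(cells) != len(header):
            continue  # malformed row; skip rather than guess
        rows.append(dict(zip(header, cells)))
    return rows


sample = """
Here are the experts:

| First Name | Last Name | Academic Title | Email |
|---|---|---|---|
| Ada | Lovelace | PhD | ada@example.org |
| Alan | Turing | Professor | alan@example.org |
"""

parsed = parse_markdown_table(sample)
```

Each parsed row keeps the academic title in its own key, so it can be stored separately from the name parts instead of being split out of a single full-name string.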

codecov Bot commented Apr 10, 2026

Codecov Report

❌ Patch coverage is 87.16356% with 62 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.33%. Comparing base (4a83c39) to head (3596017).
⚠️ Report is 8 commits behind head on main.

Files with missing lines                                 Patch %   Lines
src/research_ai/serializers.py                           74.24%    17 Missing ⚠️
src/research_ai/services/expert_persist.py               70.45%    13 Missing ⚠️
src/research_ai/services/expert_llm_table.py             92.74%    9 Missing ⚠️
src/research_ai/views/expert_finder_views.py             55.00%    9 Missing ⚠️
src/research_ai/services/expert_finder_service.py        82.50%    7 Missing ⚠️
src/research_ai/models.py                                94.73%    2 Missing ⚠️
src/research_ai/services/expert_display.py               95.65%    2 Missing ⚠️
...rc/research_ai/services/email_generator_service.py    80.00%    1 Missing ⚠️
...c/research_ai/services/email_template_variables.py    80.00%    1 Missing ⚠️
...rc/research_ai/services/invited_experts_service.py    96.42%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3295      +/-   ##
==========================================
+ Coverage   79.28%   79.33%   +0.04%     
==========================================
  Files         619      623       +4     
  Lines       35224    35558     +334     
==========================================
+ Hits        27929    28211     +282     
- Misses       7295     7347      +52     


@nicktytarenko nicktytarenko marked this pull request as ready for review April 13, 2026 17:24
@nicktytarenko nicktytarenko requested a review from a team as a code owner April 13, 2026 17:24
…kend into expert-finder/refactor-expert-search-models-serializers-migrate-expert-results-v2