@caviri caviri commented Nov 6, 2025

Note

Adds ROR/Infoscience URL validation agents and integrates them into enrichment, updates models (use orcid, unify relatedToOrganizations), normalizes/validates links, and optimizes prompts with token breakdown logging.

  • Validations:
    • Add URL validation agent (src/agents/url_validation.py) with ROR API v2 and Infoscience HTML normalization; new ValidationResult model and utilities (validation_utils.py).
    • Integrate validation into organization and academic catalog enrichment (normalize URLs, drop invalid, adjust confidence).
  • Organization Enrichment:
    • Pre-search ROR data and include in prompts; add validate_ror_organization_tool; simplify instructions; add detailed token breakdown logging.
  • Data Models:
    • Person: replace orcidId with orcid; email supports Union[str, List[str]].
    • Repository: merge relatedToOrganizationsROR into relatedToOrganizations as List[Union[str, Organization]]; validators/mappings updated.
    • AcademicCatalogRelation: remove externalId; fix matchedOn typing.
    • Infoscience entities: add type discriminators and strict URL validators; normalized URLs.
    • Conversion/exports updated to reflect new fields and JSON-LD mappings.
  • Agents/Prompts:
    • Academic catalog: validate Infoscience relations; structured outputs assigned directly.
    • Repository/User/Org prompts: enforce single-search policy on 0 results; optimize wording; add token breakdown logs.
    • Analysis flows updated to handle mixed org strings/objects and new orcid field.
  • Context/Utils:
    • Infoscience parsers now emit normalized entity URLs; “STOP searching” messaging in empty results.
    • URL utilities tolerate empty strings and update ORCID checks.
  • Dependencies:
    • Add markdownify for HTML→markdown conversion.
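
The "normalize URLs, drop invalid" step for ROR and ORCID links could look roughly like the sketch below. This is not the code from `src/agents/url_validation.py`: the `ValidationResult` fields, the helper names, and the offline-only checks are assumptions made for illustration (the PR itself validates against the ROR API v2, which this sketch does not call).

```python
import re
from dataclasses import dataclass
from typing import Optional

# Identifier shapes: ROR IDs are "0" + 6 base32 chars + 2 digits;
# ORCID iDs are four dash-separated groups, last char may be "X".
ROR_ID = re.compile(r"^0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")
ORCID_ID = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


@dataclass
class ValidationResult:
    """Hypothetical shape; the PR's real model may differ."""
    is_valid: bool
    normalized_url: Optional[str] = None
    reason: str = ""


def _strip_prefix(value: str, host: str) -> str:
    """Remove scheme, optional 'www.', the given host, and stray slashes."""
    value = value.strip()
    value = re.sub(r"^https?://", "", value, flags=re.IGNORECASE)
    value = re.sub(r"^www\.", "", value, flags=re.IGNORECASE)
    if value.lower().startswith(host + "/"):
        value = value[len(host) + 1:]
    return value.strip("/")


def normalize_ror(value: Optional[str]) -> ValidationResult:
    """Return the canonical https://ror.org/<id> form, or an invalid result."""
    if not value or not value.strip():
        return ValidationResult(False, reason="empty value")
    ident = _strip_prefix(value, "ror.org").lower()
    if not ROR_ID.match(ident):
        return ValidationResult(False, reason="not a ROR identifier")
    return ValidationResult(True, f"https://ror.org/{ident}")


def _orcid_checksum_ok(ident: str) -> bool:
    """ISO 7064 MOD 11-2 check over the first 15 digits."""
    digits = ident.replace("-", "")
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    expected = (12 - total % 11) % 11
    return digits[-1] == ("X" if expected == 10 else str(expected))


def normalize_orcid(value: Optional[str]) -> ValidationResult:
    """Return the canonical https://orcid.org/<id> form, or an invalid result."""
    if not value or not value.strip():
        return ValidationResult(False, reason="empty value")
    ident = _strip_prefix(value, "orcid.org").upper()
    if not ORCID_ID.match(ident):
        return ValidationResult(False, reason="not an ORCID iD")
    if not _orcid_checksum_ok(ident):
        return ValidationResult(False, reason="checksum mismatch")
    return ValidationResult(True, f"https://orcid.org/{ident}")
```

A caller would keep only entries whose result has `is_valid` set and store `normalized_url` in place of the raw string; empty strings are rejected rather than raising, in line with the "tolerate empty strings" bullet.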

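For reviewers updating downstream code, the data-model changes above (`orcidId` renamed to `orcid`, `email` accepting a string or a list, and `relatedToOrganizations` holding a mix of strings and `Organization` objects) can be pictured with the sketch below. It uses plain dataclasses rather than the project's actual model classes; `legalName`, `ror`, `emails()`, `organization_names`, and `migrate_person` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Organization:
    legalName: str              # assumed field name
    ror: Optional[str] = None   # assumed field name


@dataclass
class Person:
    name: str
    orcid: Optional[str] = None                  # replaces the former orcidId
    email: Union[str, List[str], None] = None    # single address or several

    def emails(self) -> List[str]:
        """Always return a list, whichever form `email` was given in."""
        if self.email is None:
            return []
        if isinstance(self.email, str):
            return [self.email] if self.email else []
        return [e for e in self.email if e]


@dataclass
class Repository:
    name: str
    # Merged field: plain names and structured objects may be mixed.
    relatedToOrganizations: List[Union[str, Organization]] = field(
        default_factory=list
    )


def organization_names(repo: Repository) -> List[str]:
    """Flatten mixed string/object entries to display names."""
    return [
        org if isinstance(org, str) else org.legalName
        for org in repo.relatedToOrganizations
    ]


def migrate_person(raw: dict) -> dict:
    """Rename a legacy `orcidId` key to `orcid`, keeping an existing `orcid`."""
    data = dict(raw)
    if "orcidId" in data:
        legacy = data.pop("orcidId")
        data.setdefault("orcid", legacy)
    return data
```

Code that previously read `relatedToOrganizationsROR` would need an `isinstance` branch like the one in `organization_names`, since an entry may now be either form.
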
Written by Cursor Bugbot for commit f944e6f.

@caviri caviri merged commit 370e63f into develop Nov 6, 2025
3 checks passed
@caviri caviri deleted the feat-links-validator branch November 6, 2025 18:34