feat(analytics): implement NER tagging & entity filtering for news article (#455) #485

Merged
Cedarich merged 2 commits into Pulsefy:main from titilayo967:feature/Enhanced-News-Tagging-and-Entity-Recognition-NER
Mar 28, 2026

@titilayo967 titilayo967 commented Mar 27, 2026

Summary

Implements Named Entity Recognition (NER) for automatic tagging of news articles with detected entities (projects, people, assets) and adds API filtering by entity to enable ecosystem-specific news discovery.

Key Features:

  • NERService using spaCy with crypto-domain entity ruler patterns
  • detected_entities JSON column added to Article model
  • Automatic entity extraction on article save in PostgresService
  • Entity-based filtering support for news queries (?entity=Soroban)
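
The entity-ruler approach named in the first bullet can be sketched as follows. This is a minimal illustration, not the PR's `NERService`: the term list, labels, and the `extract_entities` name are assumptions, and the regex branch exists only so the sketch still runs where spaCy is not installed (the PR's own fallback is described as a blank spaCy pipeline).

```python
import re
from collections import defaultdict

# Hypothetical crypto-domain terms; the PR's real pattern list is not shown.
TERMS = {"soroban": "PROJECT", "stellar": "PROJECT", "xlm": "ASSET"}

try:
    import spacy

    # A blank English pipeline only tokenizes, so no model download is needed.
    _nlp = spacy.blank("en")
    _ruler = _nlp.add_pipe("entity_ruler")
    _ruler.add_patterns(
        [{"label": label, "pattern": [{"LOWER": term}]} for term, label in TERMS.items()]
    )

    def _find(text):
        return [(ent.text, ent.label_) for ent in _nlp(text).ents]

except ImportError:
    # Portability fallback for this sketch only: whole-word regex matching.
    _regex = re.compile(
        r"\b(" + "|".join(map(re.escape, TERMS)) + r")\b", re.IGNORECASE
    )

    def _find(text):
        return [(m.group(0), TERMS[m.group(0).lower()]) for m in _regex.finditer(text)]


def extract_entities(text):
    """Return {label: sorted unique entity strings} for one article text."""
    grouped = defaultdict(set)
    for value, label in _find(text):
        grouped[label].add(value)
    return {label: sorted(values) for label, values in grouped.items()}
```

Rule-based patterns like these cover known project and asset names; detecting people would normally need a trained statistical model, which this sketch does not attempt.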

Linked Issue

Closes #455

Type of Change

  • feat
  • fix
  • docs
  • refactor
  • test
  • chore

Validation

  • Lint passed for affected area(s)
  • Tests passed for affected area(s)
  • Manual verification completed (if applicable)

Test Results:

  • Unit tests: NER extraction accuracy validated
  • Integration tests: Entity persistence and filtering verified
  • All existing tests passing

Documentation

  • Documentation updated (or N/A with explanation)
  • Screenshots/videos attached for UI changes

Note: This is a backend-only change; the new ?entity filter is documented in the API endpoint descriptions.

Files Added/Modified

| File | Type | Description |
| --- | --- | --- |
| apps/data-processing/src/analytics/ner_service.py | Added | NER extraction with spaCy and entity ruler patterns |
| apps/data-processing/alembic/versions/002_add_detected_entities_to_articles.py | Added | Migration for detected_entities JSON column |
| apps/data-processing/tests/test_ner_service.py | Added | Unit tests for NER extraction |
| apps/data-processing/src/db/models.py | Modified | Added detected_entities field to Article |
| apps/data-processing/src/db/postgres_service.py | Modified | Auto-tag on save, entity filter support |
| apps/data-processing/src/api/server.py | Modified | Added ?entity query parameter to /news endpoint |
| apps/data-processing/requirements.txt | Modified | Added spacy>=3.7.0 |

API Usage

# Get all news articles
GET /news

# Filter news by entity
GET /news?entity=Soroban
GET /news?entity=Stellar
GET /news?entity=Project&days=7
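
How the `entity` and `days` parameters might combine can be sketched with the standard library alone. The helper names, the article dictionary shape, and the case-insensitive match are assumptions for illustration; the PR does not show the handler in `server.py`.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def parse_news_query(url):
    """Extract the optional entity and days filters from a /news URL."""
    params = parse_qs(urlsplit(url).query)
    entity = params.get("entity", [None])[0]
    days = params.get("days", [None])[0]
    return entity, int(days) if days is not None else None


def filter_articles(articles, entity=None, days=None, now=None):
    """Keep articles that mention `entity` and fall inside the time window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days) if days is not None else None
    wanted = entity.lower() if entity else None
    result = []
    for article in articles:
        if cutoff is not None and article["published_at"] < cutoff:
            continue  # older than the requested window
        # Flatten {label: [names]} into one lowercase set (assumed storage shape).
        mentioned = {
            name.lower()
            for group in article["detected_entities"].values()
            for name in group
        }
        if wanted is not None and wanted not in mentioned:
            continue
        result.append(article)
    return result
```

With no query parameters both filters are `None`, so every article passes through, which matches the plain `GET /news` call.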

Checklist

  • Branch name uses feat/, fix/, or docs/
  • Commit messages follow Conventional Commits
  • [x] PR scope matches linked issue acceptance criteria

…ticles (Pulsefy#455)

- Add NERService in src/analytics/ner_service.py using spaCy with
  crypto-domain entity ruler patterns and graceful blank-pipeline fallback
- Add detected_entities JSON column to Article model and Alembic migration 002
- Auto-tag articles with detected entities on save in PostgresService
- Add entity= filter support to get_recent_articles()
- Add GET /news?entity=Soroban API endpoint with asset and time window filters
- Add unit tests for NER extraction and integration test for entity
  persistence and filtering
- Add spacy>=3.7.0 to requirements.txt
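
The tag-on-save and filter-on-read flow from the commit message can be sketched as below. SQLite with a JSON string in a TEXT column stands in for the PR's PostgreSQL JSON column so the example is self-contained; the `tag` helper, the schema, and the function signatures are assumptions, not the code in `postgres_service.py`.

```python
import json
import sqlite3


def tag(text):
    """Stand-in for the NER step: naive keyword lookup over hypothetical terms."""
    known = {"Soroban": "PROJECT", "Stellar": "PROJECT", "XLM": "ASSET"}
    found = {}
    for term, label in known.items():
        if term.lower() in text.lower():
            found.setdefault(label, []).append(term)
    return found


def open_db():
    """Create an in-memory database with an articles table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
        "detected_entities TEXT NOT NULL DEFAULT '{}')"
    )
    return conn


def save_article(conn, title):
    """Tag the article at write time so reads never need to re-run NER."""
    conn.execute(
        "INSERT INTO articles (title, detected_entities) VALUES (?, ?)",
        (title, json.dumps(tag(title))),
    )


def get_recent_articles(conn, entity=None):
    """Return titles, newest first, optionally only those tagged with `entity`."""
    rows = conn.execute(
        "SELECT title, detected_entities FROM articles ORDER BY id DESC"
    )
    titles = []
    for title, raw in rows:
        names = [name for group in json.loads(raw).values() for name in group]
        if entity is None or entity in names:
            titles.append(title)
    return titles
```

Filtering in Python keeps the sketch portable; against PostgreSQL the same check would more likely be pushed into the query itself.
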
@drips-wave

drips-wave bot commented Mar 27, 2026

@titilayo967 Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

@titilayo967

@Cedarich please review.

@Cedarich Cedarich merged commit c8281fd into Pulsefy:main Mar 28, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

Enhanced News Tagging & Entity Recognition (NER)