feat: Instaloader scraping, entity extraction, and user management #1

Open
undeemed wants to merge 1 commit into main from feat/instaloader-and-improvements

@undeemed
Member

Changes

Instagram Scraping

  • Replace Apify scraper with Instaloader (in-process, no proxy needed)
  • Carousel/sidecar support — all slides analyzed, not just the cover image
  • Caption-aware image analysis to reduce hallucinations
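
As a rough sketch of the carousel handling described above, the helper below collects every image URL from a post rather than only the cover, and carries the caption along so the analysis step can be grounded in it. It is duck-typed against attribute names that Instaloader's `Post` and sidecar nodes expose (`typename`, `url`, `caption`, `is_video`, `get_sidecar_nodes()`, `display_url`). The function name `collect_image_urls` and the returned dictionary shape are assumptions for illustration, not code from this PR.

```python
from typing import Any, Dict, List


def collect_image_urls(post: Any) -> Dict[str, Any]:
    """Return the caption and every still-image URL of a post.

    `post` is expected to look like instaloader.Post: carousel (sidecar)
    posts report typename == "GraphSidecar" and yield one node per slide.
    """
    urls: List[str] = []
    if post.typename == "GraphSidecar":
        for node in post.get_sidecar_nodes():
            # Skip video slides; only still images go to image analysis.
            if not node.is_video:
                urls.append(node.display_url)
    elif not post.is_video:
        # Single-image post: the post URL points at the image itself.
        urls.append(post.url)
    # The caption may be None on Instagram; normalize it to an empty string.
    return {"caption": post.caption or "", "image_urls": urls}
```

With a real Instaloader session, each item from `Profile.get_posts()` could presumably be passed in directly. Keeping the helper free of network calls makes it easy to exercise with stand-in objects.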

AI Pipeline

  • Replace broken Pioneer GLiNER2 with Reka-based interest extraction
  • Strict extraction prompt: max 5 interests, no generic verbs/visual descriptions
  • Plural/singular entity category normalization (hobbies→hobby, brands→brand)
  • Brand nodes now appear in graph visualization
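
The category normalization and the five-interest limit could be sketched as follows. Only the `hobbies→hobby` and `brands→brand` mappings come from the description above. The names `CATEGORY_ALIASES`, `normalize_category`, and `clean_entities` are hypothetical, and enforcing the cap in code (in addition to the prompt) is an assumption.

```python
from typing import Dict, List, Tuple

# Plural -> singular map; only these two pairs are named in the PR text.
CATEGORY_ALIASES: Dict[str, str] = {
    "hobbies": "hobby",
    "brands": "brand",
}

MAX_INTERESTS = 5  # limit stated for the extraction prompt


def normalize_category(raw: str) -> str:
    """Lower-case and trim a category, mapping known plurals to singular."""
    key = raw.strip().lower()
    return CATEGORY_ALIASES.get(key, key)


def clean_entities(entities: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Normalize categories, drop empty names and duplicates, cap the list."""
    seen = set()
    result: List[Tuple[str, str]] = []
    for name, category in entities:
        item = (name.strip(), normalize_category(category))
        if item[0] and item not in seen:
            seen.add(item)
            result.append(item)
    return result[:MAX_INTERESTS]
```

Normalizing before deduplication means `("Nike", "brands")` and `("Nike", "brand")` collapse into a single entity, which is one plausible reason brand nodes would have been missing from the graph when categories were compared verbatim.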

User Management

  • User sessions with localStorage persistence
  • Account panel with add/remove Instagram accounts
  • Force bypass for cooldown (?force=true, auto on localhost)
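
A minimal sketch of the cooldown bypass, assuming the server receives the `force` query parameter and the request's host as strings. The function names, the `LOCAL_HOSTS` set, and the one-hour `COOLDOWN_SECONDS` value are illustrative assumptions; the PR text only states that `?force=true` bypasses the cooldown and that the bypass is automatic on localhost.

```python
from typing import Optional

LOCAL_HOSTS = {"localhost", "127.0.0.1"}  # assumed definition of "localhost"
COOLDOWN_SECONDS = 3600  # assumed value; the PR does not state the cooldown


def should_bypass_cooldown(force_param: Optional[str], host: str) -> bool:
    """True when ?force=true was sent or the request targets a local host."""
    # Strip an optional ":port" suffix (IPv6 literals are not handled here).
    hostname = host.split(":")[0].lower()
    forced = force_param is not None and force_param.lower() == "true"
    return forced or hostname in LOCAL_HOSTS


def can_scrape(
    last_scraped_at: Optional[float],
    now: float,
    force_param: Optional[str],
    host: str,
) -> bool:
    """Decide whether a new scrape may start at time `now` (epoch seconds)."""
    if should_bypass_cooldown(force_param, host):
        return True
    if last_scraped_at is None:
        return True  # never scraped before, so no cooldown applies
    return now - last_scraped_at >= COOLDOWN_SECONDS
```

Passing `now` in explicitly, rather than calling the clock inside the function, keeps the check deterministic and easy to test.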

Bug Fixes

  • JSON serialization: json.dumps instead of str()/eval()
  • Job polling endpoint now works correctly
  • Removed stale scraper_url config
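
To illustrate the serialization fix, the snippet below contrasts `str()` with `json.dumps` on a small record. The record contents and the helper name `is_valid_json` are made up for the example. `str()` produces a Python repr (single quotes, `False`, `None`) that JSON parsers reject, and reading it back with `eval()` would execute arbitrary code, whereas `json.dumps`/`json.loads` round-trips safely.

```python
import json

# Hypothetical record resembling what a scraping job might store.
record = {
    "username": "example_user",
    "interests": ["climbing"],
    "verified": False,
    "bio": None,
}


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


as_repr = str(record)         # Python repr, not JSON
as_json = json.dumps(record)  # valid JSON: double quotes, false, null

restored = json.loads(as_json)  # safe round trip, no eval() needed
```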

Docs

  • Added project README with setup instructions and architecture

…EADME

- Replace Apify scraper with Instaloader (in-process, no proxy needed)
- Add carousel/sidecar image support (all slides analyzed)
- Replace broken Pioneer GLiNER2 with Reka-based interest extraction
- Caption-aware image analysis to prevent hallucinations
- User account management (login, account panel, re-sync)
- Force bypass for cooldown (?force=true, auto on localhost)
- Brand nodes in graph visualization
- Fix JSON serialization (json.dumps instead of str/eval)
- Plural/singular entity category normalization
- Add README with setup instructions and architecture