barkat-10 edited this page May 12, 2025 · 4 revisions

Welcome to the NAF Wiki

Welcome to the NAF wiki pages! This wiki contains all the information about our project and how each component functions.
The project is organized into three major components:

  • Crawler
  • Analyzer
  • Enricher

Each component has its own dedicated wiki page with setup instructions, functionality breakdowns, and implementation details.


🔍 Crawler

Purpose:

Automatically searches for potential NAF alumni profiles using Google queries. The queries are tailored to relevant attributes so that the results yield as many NAF alumni as possible.

How it works:

  • Queries Google for LinkedIn pages using custom search strings.
  • Iterates through the resulting links and collects the HTML of each LinkedIn profile.
  • Parses the HTML to generate corresponding JSON objects.
  • Stores profiles that match known identifiers (such as schools, companies, or certifications) in the development database.
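
The query and matching steps above can be sketched roughly as follows. This is a minimal illustration, not the crawler's actual code: the function names, the profile field names (schools, companies, certifications), and the example identifiers are all assumptions, and fetching and parsing the LinkedIn HTML is left out because this page does not describe it.

```python
import json
from urllib.parse import urlencode

# Hypothetical identifiers; the real list is not given on this page.
KNOWN_IDENTIFIERS = ["Example High School", "Example Certification"]


def build_search_url(terms, site="linkedin.com/in"):
    """Build a Google search URL restricted to LinkedIn profile pages."""
    quoted = " ".join(f'"{term}"' for term in terms)
    return "https://www.google.com/search?" + urlencode({"q": f"site:{site} {quoted}"})


def matching_identifiers(profile, known):
    """Return the known identifiers found in the profile (case-insensitive)."""
    values = set()
    for field in ("schools", "companies", "certifications"):  # assumed field names
        values.update(value.lower() for value in profile.get(field, []))
    return sorted(k for k in known if k.lower() in values)


def profiles_to_store(profiles, known):
    """Keep only the profiles that match at least one known identifier."""
    return [p for p in profiles if matching_identifiers(p, known)]


def to_json(profile):
    """Serialize a parsed profile as a JSON object string."""
    return json.dumps(profile, sort_keys=True)
```

In this sketch, a profile that matches none of the known identifiers is simply dropped before the storage step. Whether the real crawler discards such profiles or stores them with a flag is not stated here.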

🧠 Analyzer

Purpose:

Evaluates the likelihood that a profile collected by the crawler belongs to a NAF alum.

How it works:

  • Assigns weights to different identifiers such as schools or job roles.
  • Calculates a confidence percentage for each profile based on those weights.
  • Labels each profile as a likely NAF alum or not, based on that confidence percentage.
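
One way to picture the weighting is the sketch below. The weights, the identifier names, and the 50% threshold are invented for illustration. The analyzer's real values and formula are documented on its own wiki page, not here.

```python
# Hypothetical weights per identifier type; higher means stronger evidence.
WEIGHTS = {"school": 5, "certification": 3, "job_role": 2}

# Hypothetical cut-off, in percent, for labelling a profile as likely alumni.
THRESHOLD = 50.0


def confidence(matched):
    """Return the matched share of the total weight as a percentage."""
    total = sum(WEIGHTS.values())
    hit = sum(weight for name, weight in WEIGHTS.items() if name in matched)
    return 100.0 * hit / total


def label(matched, threshold=THRESHOLD):
    """Label a profile from the set of identifier types it matched."""
    score = confidence(matched)
    verdict = "likely NAF alum" if score >= threshold else "unlikely NAF alum"
    return score, verdict
```

With these assumed weights, a profile matching a school and a job role scores 70%, while one matching only a job role scores 20% and falls below the threshold.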

🧩 Enricher

Purpose:

Updates and enriches information on individuals already present in the NAF database, particularly when a person is looked up manually.

How it works:

  • First searches for the individual in a dynamic PostgreSQL database.
  • If the person is not found, it uses a headless browser to search for them online (primarily LinkedIn).
  • Scrapes key information from the resulting profiles.
  • Exports the updated information to a CSV file and stores structured data in the database.
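
The database-first lookup with an online fallback can be sketched as below. The two lookup callables stand in for the PostgreSQL query and the headless-browser scrape, which are not reproduced here. Their names, the record fields, and the added "source" column are assumptions made for the example.

```python
import csv
import io


def enrich(name, db_lookup, web_lookup):
    """Look a person up in the database first, then fall back to the web.

    db_lookup and web_lookup are hypothetical callables that take a name
    and return a dict of profile fields, or None when nothing is found.
    """
    record = db_lookup(name)
    source = "database"
    if record is None:
        record = web_lookup(name)
        source = "web"
    if record is None:
        return None
    return {**record, "source": source}


def to_csv(records, fieldnames):
    """Render enriched records as CSV text, ignoring unlisted fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Passing the lookups in as arguments keeps the control flow testable without a live database or browser. For example, `enrich("Jane Doe", lambda n: None, lambda n: {"name": n})` exercises the fallback path.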

📄 Explore Further

To dive deeper into how a component works, or to learn how to set it up and run it locally, see that component's dedicated wiki page.
