# Notion → AI Research Pipeline

Automates the full research workflow for crypto projects stored in a Notion CRM or Cards Database:

  1. Detect a card whose "Due-Diligence Questionnaire" (DDQ) child-page has been marked as Completed.
  2. Generate an in-depth Markdown report via the multi-step Deep Research agent (web scraping + LLM).
  3. Publish/overwrite a child page called "AI Deep Research Report" directly under the Notion card.
  4. Run an LLM-based scoring function to rate the project and save the JSON answers.
  5. Publish a 🔥 Ratings inline database with nested cards that relays the project scores and Q&A, so analysts can consult and comment on them directly in Notion.

All steps run entirely serverless: schedule `python main.py` via cron or GitHub Actions and you are done.


## 🖼️ Pipeline Overview

```mermaid
flowchart TD
    A[watcher.py poll Notion DB] -->|Completed DDQ| B[research.py deep-web research]
    B --> C[writer.py create/overwrite report page]
    C --> D[scorer.py LLM JSON scoring]
    D --> E[pusher.py update 🔥 Ratings DB]
```

Every module can be executed independently (useful during development), yet main.py stitches them together for weekly automation.
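The stitching amounts to running the five stages in sequence for every eligible card. The sketch below is illustrative only: the real function names and signatures in `src/` are not documented here, so the stages are passed in as plain callables.

```python
# Hypothetical sketch of what main.py stitches together; the actual module
# APIs (function names, signatures, return types) are assumptions.

def run_pipeline(poll, research, publish, score, push):
    """Run the five pipeline stages for every card with a completed DDQ."""
    results = []
    for card in poll():                      # watcher.py: cards whose DDQ is Completed
        report_md = research(card)           # research.py: deep-web research -> Markdown
        page_id = publish(card, report_md)   # writer.py: create/overwrite the report page
        scores = score(report_md)            # scorer.py: LLM JSON scoring
        push(card, scores)                   # pusher.py: update the 🔥 Ratings DB
        results.append((card, page_id, scores))
    return results
```

Keeping the stages as independent callables is what lets each module also run (and be tested) on its own.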


## 🗄️ Repo Layout

```
.
├── src/                # Runtime modules (watcher, research, writer, scorer, pusher)
├── web_research/       # Deep-research agent & async search/scrape stack
├── tests/              # Pytest suite covering the whole flow
├── main.py             # Orchestrator (weekly cron job)
├── requirements.txt    # Pinned dependencies
└── README.md           # ← you are here
```

## ⚡ Quick Start

### 1. Clone & enter

```bash
git clone https://github.com/Liscivia/AI_intern.git
cd AI_intern
```

### 2. Create a virtual environment (Python 3.11)

**Windows PowerShell**

```powershell
python -m venv ai_intern
.\ai_intern\Scripts\Activate.ps1
```

**macOS / Linux**

```bash
python3 -m venv ai_intern
source ai_intern/bin/activate
```

### 3. Install dependencies

```bash
pip install --upgrade pip
pip install -r requirements.txt
# Playwright needs its browser binaries installed once
playwright install
```

### 4. Environment variables

Create a .env file, using .env.example as a reference:

```
# Notion
NOTION_TOKEN=secret_…
NOTION_DB_ID=<database-id>   # enable automation on the Notion CRM itself

# LLM provider
OPENAI_API_KEY=sk-…
OPENAI_MODEL=gpt-4o-mini    # or any compatible model (deep-thinking & strong tool-calling capabilities recommended)

# Web-scraping
#   "firecrawl" (API-based) is the default; set a key or flip to the Playwright fallback.
FIRECRAWL_API_KEY=fc_…       # recommended
DEFAULT_SCRAPER=firecrawl    # or playwright_ddgs

# Prompts
DEEP_RESEARCH_PROMPT="You are…"
```
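The pipeline fails fast when a required variable is missing (see the error in Troubleshooting). A minimal sketch of such a check, assuming the loader lives somewhere in `src/` and that the defaults match the .env comments above; the real implementation may differ:

```python
import os

# Variables without which the pipeline cannot run at all.
REQUIRED_VARS = ("NOTION_TOKEN", "NOTION_DB_ID", "OPENAI_API_KEY")

def require_env(name, env=os.environ):
    """Return a required variable's value, or raise the pipeline's error."""
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is required.")
    return value

def load_config(env=os.environ):
    """Collect required settings plus optional ones with sensible defaults."""
    config = {name: require_env(name, env) for name in REQUIRED_VARS}
    config["OPENAI_MODEL"] = env.get("OPENAI_MODEL", "gpt-4o-mini")
    config["DEFAULT_SCRAPER"] = env.get("DEFAULT_SCRAPER", "firecrawl")
    return config
```

Failing at startup with an explicit variable name is much easier to debug than a cryptic 401 halfway through a research run.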

### 5. Run the pipeline once

```bash
python main.py
```

Or execute the full test that mirrors the cron job:

```bash
pytest tests/test_final.py -q
```

Or run any individual test to verify the functioning of each component:

```bash
pytest tests/test_watcher.py -v -s
```

## 📝 Logs

| File | Purpose |
| --- | --- |
| logs/watcher.log | Notion polling & pagination |
| logs/research.log | Deep-research orchestration & web searches |
| logs/writer.log | Markdown → Notion block conversion |
| logs/scorer.log | LLM scoring lifecycle |
Each line is written in key=value format so you can grep/filter easily.
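When grep isn't enough, key=value lines are also trivial to parse into dicts. A small illustrative filter; the actual keys emitted by the logs (`level`, `module`, `msg` below) are assumptions, not documented output:

```python
import shlex

def parse_log_line(line):
    """Split a key=value log line into a dict (shlex handles quoted values)."""
    pairs = (token.split("=", 1) for token in shlex.split(line))
    return {key: value for key, value in pairs}

def filter_log(lines, **wanted):
    """Yield parsed records matching all the given key=value filters."""
    for line in lines:
        record = parse_log_line(line)
        if all(record.get(k) == v for k, v in wanted.items()):
            yield record
```

For example, `filter_log(open("logs/watcher.log"), level="ERROR")` would surface only the error records.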


## 🪄 Troubleshooting

| Symptom | Fix |
| --- | --- |
| `RuntimeError: Environment variable NOTION_TOKEN is required.` | Load/define all required vars (`NOTION_TOKEN`, `NOTION_DB_ID`, `OPENAI_API_KEY`). |
| Firecrawl 429s / quota exhausted | Lower the `RESEARCH_CONCURRENCY` env var or set `DEFAULT_SCRAPER=playwright_ddgs`. |
