
dagosgi/ECC_scraping


ECC Transcript Scraping API (Local)

This repository lets you scrape earnings conference call (ECC) transcripts from The Motley Fool website.

Inputs:

  • ticker (required)
  • exchange (optional, default: nyse)
  • max_transcripts (optional, default: 5, range: 1–50)

Important

Valid quote exchanges on The Motley Fool website are:

nyse, nasdaq, crypto

If the ticker is missing or invalid, the call is rejected immediately.

If no quote page exists for the ticker+exchange combination, the result carries a "warning" field and an empty list of transcripts.
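The input rules above can also be checked on the caller's side before any scraping starts. The sketch below is a hypothetical helper: `check_inputs` and `VALID_EXCHANGES` are not part of the repository. It assumes the defaults and limits listed above, and it assumes that an unknown exchange or an out-of-range max_transcripts should be treated as an error, which the text above does not state.

```python
VALID_EXCHANGES = {"nyse", "nasdaq", "crypto"}  # the exchanges listed above


def check_inputs(ticker, exchange="nyse", max_transcripts=5):
    """Hypothetical pre-check; returns the normalized (ticker, exchange, max_transcripts)."""
    if not isinstance(ticker, str) or not ticker.strip():
        raise ValueError("ticker is required")

    exchange = exchange.strip().lower()
    if exchange not in VALID_EXCHANGES:
        raise ValueError(f"exchange must be one of {sorted(VALID_EXCHANGES)}")

    # bool is a subclass of int, so it is excluded explicitly.
    is_int = isinstance(max_transcripts, int) and not isinstance(max_transcripts, bool)
    if not is_int or not 1 <= max_transcripts <= 50:
        raise ValueError("max_transcripts must be an integer between 1 and 50")

    return ticker.strip(), exchange, max_transcripts
```

Calling `check_inputs("YOUR_TICKER", "nasdaq", 10)` returns the cleaned values, which can then be passed on to whichever option below you use.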

Output:

A "TranscriptResult" object. Its "items" attribute is a list of dictionaries, one per scraped ECC; its "warning" attribute carries the warning described above, if any.

Dictionary fields:

  • "quarter" → the fiscal quarter
  • "year" → the reference year
  • "date" → date of issue of the ECC
  • "url" → page URL
  • "page_title" → webpage title
  • "html_text" → the ECC text only, without the rest of the page (HTML format)
  • "full_soup" → full scraped content
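Since "html_text" is HTML rather than plain text, a common next step is to strip the tags. The following is a minimal sketch using only the Python standard library; `html_to_text` is a hypothetical helper, not something the repository provides, and it simply joins the text nodes with newlines.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the non-empty text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)


def html_to_text(html_text):
    """Return the text content of an HTML fragment, one text node per line."""
    collector = _TextCollector()
    collector.feed(html_text)
    collector.close()
    return "\n".join(collector.parts)
```

Given one dictionary `item` from the result, `html_to_text(item.get("html_text") or "")` would return the transcript as plain text.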

Option 1 -- Import as a Python library

Note

This is currently the recommended option.

!git clone https://github.com/dagosgi/ECC_scraping.git
!pip install playwright
!playwright install firefox
!playwright install-deps
!pip install -r ECC_scraping/requirements.txt

Tip

Example call:

from ECC_scraping import get_transcripts

result = await get_transcripts("YOUR_TICKER", exchange="nyse", max_transcripts=5)
print(result.warning)
print(len(result.items))

You can then extract, for example, the HTML-formatted text of the first ECC:

result.items[0].get("html_text")
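The example call above uses a bare `await`, which works in a notebook cell but not at the top level of an ordinary script. A minimal sketch of the same call from a plain script follows, using `asyncio.run`. The helper names `first_html` and `main` are hypothetical, and the code assumes only what the example above shows: that `get_transcripts` is awaitable and that the result exposes `warning` and `items`.

```python
import asyncio
import sys


async def first_html(get_transcripts, ticker, exchange="nyse", max_transcripts=5):
    """Await the scraper and return the first ECC's "html_text", or None if nothing was found."""
    result = await get_transcripts(ticker, exchange=exchange, max_transcripts=max_transcripts)
    if result.warning:
        print(f"warning: {result.warning}", file=sys.stderr)
    if not result.items:
        return None
    return result.items[0].get("html_text")


def main():
    # Imported here so the helper above can be reused (or tested) without the package installed.
    from ECC_scraping import get_transcripts

    html = asyncio.run(first_html(get_transcripts, "YOUR_TICKER"))
    print("no transcript found" if html is None else html[:200])
```

Call `main()` from your script's entry point. Passing `get_transcripts` in as an argument is a design choice of this sketch, not a requirement of the library.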

Option 2 -- Python virtual environment (non-Docker)

Setup

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
playwright install firefox
playwright install-deps firefox

Option 3 -- Use as a CLI

python -m ECC_scraping.cli --ticker YOUR_TICKER --exchange nyse --max-transcripts 5 --out file.json

Warning

If you omit --out, JSON is printed to stdout.
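If you want to drive the CLI from another Python program, the command above can be assembled and run with `subprocess`. This is a sketch under assumptions: `build_cli_command` and `run_cli` are hypothetical helpers, and only the flags shown in the command above are used.

```python
import subprocess
import sys


def build_cli_command(ticker, exchange="nyse", max_transcripts=5, out=None):
    """Build the argument list for the CLI invocation shown above."""
    cmd = [
        sys.executable, "-m", "ECC_scraping.cli",
        "--ticker", ticker,
        "--exchange", exchange,
        "--max-transcripts", str(max_transcripts),
    ]
    if out is not None:
        cmd += ["--out", str(out)]  # without --out, the JSON goes to stdout
    return cmd


def run_cli(ticker, **kwargs):
    """Run the CLI and return whatever it printed to stdout."""
    completed = subprocess.run(
        build_cli_command(ticker, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return completed.stdout
```

With `out` left unset, `run_cli("YOUR_TICKER")` returns the JSON text printed by the CLI, which can then be passed to `json.loads`.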

Option 4 -- Run as a local API server (FastAPI)

Start the server:

uvicorn ECC_scraping.api:app --reload

Open:

http://127.0.0.1:8000/docs

Tip

Example call:

http://127.0.0.1:8000/transcripts?ticker=YOUR_TICKER&exchange=nyse&max_transcripts=5
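The same request can be made from Python with the standard library. The sketch below assumes the server is running locally on port 8000 as shown above, and that the endpoint answers a GET request with a JSON body; the helper names `build_transcripts_url` and `fetch_transcripts` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # local server address from the section above


def build_transcripts_url(ticker, exchange="nyse", max_transcripts=5, base_url=BASE_URL):
    """Build the /transcripts URL with properly encoded query parameters."""
    query = urlencode({
        "ticker": ticker,
        "exchange": exchange,
        "max_transcripts": max_transcripts,
    })
    return f"{base_url}/transcripts?{query}"


def fetch_transcripts(ticker, timeout=120, **kwargs):
    """Call the local API and decode the response (assumed to be JSON)."""
    with urlopen(build_transcripts_url(ticker, **kwargs), timeout=timeout) as response:
        return json.load(response)
```

The generous timeout is a guess: scraping happens during the request, so a call may take a while to return.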

Option 5 -- Docker

With Docker Compose

docker compose up --build

Then open:

http://127.0.0.1:8000/docs

With plain Docker

docker build -t ecc_scraping .
docker run --rm -p 8000:8000 ecc_scraping
