camreon/property-pipeline

PA Property Lead Pipeline

Automated pipeline for generating property leads from Pennsylvania obituaries and public notices.

Features

  • Scrape: Daily monitoring of Legacy.com and PA Public Notice sites
  • Validate: Automated search across county tax assessor portals (DevNet, GIS, ArcGIS)
  • Enrich: Skip tracing via BatchData API
  • Deliver: Automatic sync to Google Sheets
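The four stages above chain together, each consuming the previous stage's output. A minimal sketch of that flow follows. The Lead class, the function names, and all sample values are hypothetical stand-ins, not the repository's actual code; no network calls are made.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lead:
    """Hypothetical lead record; the real models/ package may differ."""
    name: str
    county: str
    parcel_id: Optional[str] = None  # set by the validate stage
    phone: Optional[str] = None      # set by the enrich stage


def scrape() -> List[Lead]:
    # Stand-in for the obituary and public-notice scrapers.
    return [Lead("Jane Doe", "allegheny"), Lead("John Roe", "bucks")]


def validate(leads: List[Lead]) -> List[Lead]:
    # Stand-in for a county portal lookup: keep only leads with a parcel match.
    known_parcels = {"Jane Doe": "0001-A-00001"}  # made-up sample data
    matched = []
    for lead in leads:
        parcel = known_parcels.get(lead.name)
        if parcel is not None:
            lead.parcel_id = parcel
            matched.append(lead)
    return matched


def enrich(leads: List[Lead]) -> List[Lead]:
    # Stand-in for skip tracing; a real run would call the BatchData API here.
    for lead in leads:
        lead.phone = "555-0100"  # placeholder value
    return leads


def deliver(leads: List[Lead]) -> List[List[str]]:
    # Stand-in for the Google Sheets sync: build the rows that would be written.
    return [[l.name, l.county, l.parcel_id or "", l.phone or ""] for l in leads]


def run_pipeline() -> List[List[str]]:
    # Scrape -> Validate -> Enrich -> Deliver, in that order.
    return deliver(enrich(validate(scrape())))
```

In this sketch, leads that fail validation are dropped before enrichment, so no skip-trace lookups are spent on unmatched names. Whether the real pipeline filters at that point is an assumption.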

Setup

1. Install Dependencies

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
playwright install chromium

2. Configure Environment

cp .env.example .env
# Edit .env with your API keys
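
Before running the pipeline, it can help to confirm that .env is filled in. The sketch below parses a simple KEY=VALUE file with the standard library only and reports which required keys are empty or absent. The key name BATCHDATA_API_KEY in the usage note is an assumption; check .env.example for the names the project actually reads.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty, in the order given."""
    return [key for key in required if not values.get(key)]
```

For example, missing_keys(load_env_file(), ["BATCHDATA_API_KEY"]) returns an empty list once that (hypothetical) key has a value.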

3. Set Up Google Sheets

  1. Create a Google Cloud project
  2. Enable the Google Sheets API
  3. Create a service account and download JSON credentials
  4. Save the downloaded file as config/google_credentials.json
  5. Share your target spreadsheet with the service account email
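
Step 5 needs the service account's email address, which is stored in the downloaded JSON key under the client_email field. A small sketch for reading it with the standard library follows; the function name is hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def service_account_email(path: str = "config/google_credentials.json") -> str:
    """Return the address to share the spreadsheet with."""
    data = json.loads(Path(path).read_text())
    # Service account key files carry "type": "service_account".
    if data.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")
    return data["client_email"]
```

Calling service_account_email() after step 4 gives the address to paste into the spreadsheet's Share dialog.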

4. Configure Counties

Edit config/counties.yaml to add or modify county portal configurations.

Usage

Run Full Pipeline

python main.py run

Individual Commands

# Scrape obituaries and public notices
python main.py scrape

# Validate scraped leads against tax portals
python main.py validate

# Enrich validated leads with skip tracing
python main.py enrich

# Sync to Google Sheets
python main.py sync

Options

python main.py run --county allegheny  # Run for specific county
python main.py run --days 7            # Process last 7 days
python main.py run --dry-run           # Preview without changes
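
The commands and options above map naturally onto a subcommand-style CLI. The sketch below shows one way to express that interface with argparse. It is an illustration only: the real main.py may use a different library, and attaching the three options only to run is an assumption based on the examples shown.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # One subcommand per pipeline stage, plus "run" for the full pipeline.
    for name in ("run", "scrape", "validate", "enrich", "sync"):
        cmd = sub.add_parser(name)
        if name == "run":
            cmd.add_argument("--county", help="limit the run to one county")
            cmd.add_argument("--days", type=int, help="process the last N days")
            cmd.add_argument("--dry-run", action="store_true",
                             help="preview without making changes")
    return parser
```

Parsing ["run", "--days", "7", "--dry-run"] with this parser yields command set to "run", days set to the integer 7, and dry_run set to True.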

Project Structure

pa-property-pipeline/
├── config/           # Configuration files
├── scrapers/         # Web scrapers
├── cleaners/         # Data cleaning utilities
├── validators/       # County portal validators
├── enrichers/        # Skip trace integration
├── delivery/         # Google Sheets sync
├── models/           # Data models
├── utils/            # Shared utilities
├── logs/             # Runtime logs
├── data/             # Local data storage
└── tests/            # Test suite

About

CLI tool to connect data between tax portals and public notices
