Automated pipeline for generating property leads from Pennsylvania obituaries and public notices.
- Scrape: Daily monitoring of Legacy.com and PA Public Notice sites
- Validate: Automated search across county tax assessor portals (DevNet, GIS, ArcGIS)
- Enrich: Skip tracing via BatchData API
- Deliver: Automatic sync to Google Sheets
```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
playwright install chromium
cp .env.example .env
# Edit .env with your API keys
```

- Create a Google Cloud project
- Enable the Google Sheets API
- Create a service account and download JSON credentials
- Save as `config/google_credentials.json`
- Share your target spreadsheet with the service account email
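The address to share the spreadsheet with is the `client_email` field inside the downloaded credentials JSON. A small helper to read it (the function name is ours, not part of the project):

```python
import json

def service_account_email(path: str) -> str:
    # Google service-account credential files carry the account's
    # address under the "client_email" key.
    with open(path) as f:
        return json.load(f)["client_email"]

# e.g. service_account_email("config/google_credentials.json")
```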
Edit `config/counties.yaml` to add or modify county portal configurations.
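A hypothetical entry might look like the following; the keys are illustrative, and the actual schema is whatever the validators under `validators/` expect:

```yaml
counties:
  allegheny:
    portal_type: devnet        # assumed values: devnet | gis | arcgis
    search_url: https://example.com/allegheny/search   # placeholder URL
    rate_limit_seconds: 2
```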
```bash
# Run the full pipeline
python main.py run

# Scrape obituaries and public notices
python main.py scrape

# Validate scraped leads against tax portals
python main.py validate

# Enrich validated leads with skip tracing
python main.py enrich

# Sync to Google Sheets
python main.py sync
```

Common flags:

```bash
python main.py run --county allegheny  # Run for specific county
python main.py run --days 7            # Process last 7 days
python main.py run --dry-run           # Preview without changes
```

Project layout:

```
pa-property-pipeline/
├── config/      # Configuration files
├── scrapers/    # Web scrapers
├── cleaners/    # Data cleaning utilities
├── validators/  # County portal validators
├── enrichers/   # Skip trace integration
├── delivery/    # Google Sheets sync
├── models/      # Data models
├── utils/       # Shared utilities
├── logs/        # Runtime logs
├── data/        # Local data storage
└── tests/       # Test suite
```
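The subcommands and flags shown in the usage examples could be wired with `argparse`; a minimal sketch under the assumption that all subcommands share the same flags (the real `main.py` may structure its CLI differently):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)
    # Each subcommand from the usage section gets the same shared flags
    for name in ("run", "scrape", "validate", "enrich", "sync"):
        cmd = sub.add_parser(name)
        cmd.add_argument("--county", default=None, help="Limit to one county")
        cmd.add_argument("--days", type=int, default=1, help="Look-back window in days")
        cmd.add_argument("--dry-run", action="store_true", help="Preview without changes")
    return parser

args = build_parser().parse_args(["run", "--county", "allegheny", "--days", "7"])
```

Registering the flags per-subcommand (rather than on the top-level parser) is what allows `python main.py run --county allegheny` to parse with the flag after the command name.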