Data pipeline

Automating the classification process will require a new data pipeline. Beyond classification itself, the challenges include:

Matching donors across multiple records/campaigns/etc.
Deduping entries across amended submissions.
Matching entities to campaigns.
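As a rough illustration of the deduplication step, the sketch below keeps only the most recent amendment of each filing. The field names `filing_id` and `amendment` are hypothetical and stand in for whatever the real submission data provides; this is one possible approach under that assumption, not the pipeline's actual design.

```python
def latest_amendments(records):
    """Keep one record per filing: the one with the highest amendment number.

    Assumes each record is a dict with hypothetical keys "filing_id" and
    "amendment" (0 for the original submission, 1+ for amendments).
    """
    latest = {}
    for record in records:
        key = record["filing_id"]
        current = latest.get(key)
        # Replace the stored record only when a newer amendment shows up.
        if current is None or record["amendment"] > current["amendment"]:
            latest[key] = record
    return list(latest.values())


# Illustrative, made-up records.
records = [
    {"filing_id": "A", "amendment": 0, "amount": 100},
    {"filing_id": "A", "amendment": 1, "amount": 150},
    {"filing_id": "B", "amendment": 0, "amount": 50},
]
deduped = latest_amendments(records)
```

If amended submissions do not carry an explicit amendment number, a filing date could serve as the ordering key instead.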

While we remain focused on municipal elections, manually matching entities to campaigns is still feasible, but this approach will need to change in the long term.

Matching donors across records is a much bigger challenge. Donors can be identified by name and address, but this is hampered by typos and inconsistent formatting (e.g., "firstname lastname"; "lastname, firstname"; "firstname middleinitial lastname, honorific"). I recently found libpostal, an external library that does an excellent job of normalizing addresses, and have installed it on the server.
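For the name half of the problem, a minimal sketch of a normalizer that maps the name formats mentioned above to a single "firstname lastname" matching key might look like the following. The function name `normalize_name` and the honorific list are assumptions made for illustration; the sketch uses only the standard library, does not call libpostal, and does not attempt to correct typos.

```python
import re

# Hypothetical, non-exhaustive set of honorifics and suffixes to ignore.
HONORIFICS = {"mr", "mrs", "ms", "dr", "jr", "sr", "esq"}


def normalize_name(raw: str) -> str:
    """Return a lowercase "firstname lastname" key for a donor name."""
    # Lowercase and strip punctuation other than commas, apostrophes, hyphens.
    cleaned = re.sub(r"[^\w\s,'-]", "", raw.lower())

    # Split on commas, dropping honorifics and empty segments.
    segments = []
    for segment in cleaned.split(","):
        tokens = [t for t in segment.split() if t not in HONORIFICS]
        if tokens:
            segments.append(tokens)
    if not segments:
        return ""

    # "lastname, firstname" becomes "firstname lastname"; any segments
    # after the second are ignored in this sketch.
    if len(segments) >= 2:
        tokens = segments[1] + segments[0]
    else:
        tokens = segments[0]

    # Drop single-letter middle initials, keeping the first and last tokens.
    last = len(tokens) - 1
    kept = [t for i, t in enumerate(tokens) if len(t) > 1 or i in (0, last)]
    return " ".join(kept)


key_a = normalize_name("Jane Doe")
key_b = normalize_name("Doe, Jane")
key_c = normalize_name("Jane Q. Doe, Dr.")
```

All three calls produce the same key, so records using any of these formats could be grouped on it before comparing addresses. Full middle names are kept as-is, which means "Jane Quinn Doe" would not match "Jane Doe" without further handling.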

