Normalize addresses #17

@waldoj


Address data does not appear to be normalized. For instance, Virginia Engineers PAC's 09/07/2012 contribution says that their primary place of business is "Richmon, VA." (To be fair, that's not address data.) The software used by many committees appears to normalize addresses, but different packages normalize differently. For instance, some software normalizes on long street suffixes ("Court," "Boulevard," "Road," etc.), while other software normalizes on short street suffixes ("Ct.," "Blvd.," "Rd.," etc.). The good news is that each report is often internally consistent, which should make it easy to bring all of the reports into a single, collective consistency.

Implement an address normalization system (presumably the USPS's API) to deal with this problem.

The only question is at what point this should be done. Is it appropriate to do this prior to saving the data and generating the JSON? Or is it wrong to alter the SBE's data? Wouldn't this mean making tens of thousands of API calls every time that the parser is run?

This might be an argument for standardizing addresses via a cruder, local function at the time of input, and saving the USPS API calls for use beyond the Saberva pipeline.
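As a rough illustration of what such a cruder, local function could look like, here is a minimal sketch in Python. Everything in it is an assumption for discussion, not Saberva's actual code: the function name `normalize_address`, the `SUFFIXES` table (deliberately incomplete), and the choice to settle on short suffixes without periods. It only collapses whitespace and maps long or dotted street suffixes onto one short form, so it will mishandle cases like "St. Paul" or "Court Street."

```python
# Hypothetical, deliberately incomplete table of long suffix -> short suffix.
SUFFIXES = {
    "court": "Ct",
    "boulevard": "Blvd",
    "road": "Rd",
    "street": "St",
    "avenue": "Ave",
    "drive": "Dr",
    "lane": "Ln",
}

# Accept both the long form ("road") and the short form ("rd") as lookup keys.
LOOKUP = dict(SUFFIXES)
LOOKUP.update({short.lower(): short for short in SUFFIXES.values()})


def normalize_address(address):
    """Collapse whitespace and rewrite known street suffixes to one short form."""
    words = []
    for token in address.split():
        # Remember a trailing comma so "Road," and "Rd.," end up identical.
        comma = "," if token.endswith(",") else ""
        core = token.rstrip(",")
        # Ignore case and a trailing period when looking up the suffix.
        key = core.rstrip(".").lower()
        if key in LOOKUP:
            core = LOOKUP[key]
        words.append(core + comma)
    return " ".join(words)


print(normalize_address("123  Main Street"))
print(normalize_address("45 Oak Blvd., Richmond"))
```

Because a function like this is cheap and deterministic, it could run on every parse without touching the SBE's original values, provided the normalized form is stored alongside the raw address rather than replacing it.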
