This repository was archived by the owner on Jan 8, 2020. It is now read-only.

Difference Between Bulk and Single Line Geocoding #204

@kgudel

Using the addresses file in the test harness (~4,350 rows), I found addresses that the batch geocoder does not find, but that the state address points geocoder does find when the same addresses are geocoded individually. Some examples:

7437 South 1550 East SOUTH WEBER UT 84405
7473 South 1550 East SOUTH WEBER UT 84405
9 East 750 South FARMINGTON UT 84025
871 West Brandon Drive KAYSVILLE UT 84037
857 West Brandon Drive KAYSVILLE UT 84037
521 Wharton Road Lowell AR 72745
2 Arrow Brook Court Little Rock AR 72227
4 Amherst Cove Little Rock AR 72205
43 Temple Court Northwest WASHINGTON DC 20001
45 K Street Northwest WASHINGTON DC 20001
2 Lupine Lane SOUTH BURLINGTON VT 05403
57 Munson Drive WILLISTON VT 05495
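
One way to collect discrepancies like these systematically is to compare the two result sets keyed by input address. The following is only a minimal sketch, assuming each geocoder's output has already been reduced to a map from the input address string to an optional coordinate. `GeocodeDiff`, `missedByBatch`, `batchResults` and `singleResults` are hypothetical names used for illustration and are not part of grasshopper.

```scala
// Hypothetical helper, not part of grasshopper: lists the addresses that the
// single-line geocoder resolved but the batch geocoder did not.
object GeocodeDiff {
  type Coord = (Double, Double) // assumed (longitude, latitude) pair

  def missedByBatch(
      batchResults: Map[String, Option[Coord]],
      singleResults: Map[String, Option[Coord]]
  ): Seq[String] =
    singleResults.collect {
      // Keep addresses that single-line geocoding found and that batch
      // geocoding either returned as None or did not return at all.
      case (address, Some(_)) if batchResults.getOrElse(address, None).isEmpty =>
        address
    }.toSeq.sorted
}
```

Keying on the raw input string assumes both runs were fed identical address text. If either path normalizes the input first, the keys would need the same normalization before comparing.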

In case it is relevant: this happens after I made some changes to the geocoder. I added more synonyms to the loader, and I changed the file in grasshopper that builds the Elasticsearch query so that it uses proximity searching.

Specifically changing:

private def searchAddressFields(client: Client, index: String, indexType: String, number: String, streetName: String, city: String, state: String, zipCode: String): Array[SearchHit] = {
  val numberQuery = QueryBuilders.matchQuery("number", number)
  val streetQuery = QueryBuilders.matchPhraseQuery("street", streetName)
  val cityQuery = QueryBuilders.matchQuery("city", city)
  val stateQuery = QueryBuilders.matchQuery("state", state)
  val zipQuery = QueryBuilders.matchQuery("zip", zipCode)

  val query = QueryBuilders
    .boolQuery()
    .must(numberQuery)
    .must(streetQuery)
    //.must(cityQuery) Removing for now, decreases response rate if data is not 100% accurate
    .must(stateQuery)
    .must(zipQuery)

to

private def searchAddressFields(client: Client, index: String, indexType: String, number: String, streetName: String, city: String, state: String, zipCode: String): Array[SearchHit] = {
  val numberQuery = QueryBuilders.matchQuery("number", number)
  val streetQuery_strict = QueryBuilders.matchPhraseQuery("street", streetName)
  val cityQuery = QueryBuilders.matchQuery("city", city)
  val stateQuery = QueryBuilders.matchQuery("state", state)
  val zipQuery = QueryBuilders.matchQuery("zip", zipCode)
  val streetQuery_loose = QueryBuilders.matchQuery("street", streetName)

  val query = QueryBuilders
    .boolQuery()
    .must(numberQuery)
    //.must(streetQuery_strict)
    .must(streetQuery_loose)
    //.must(cityQuery) Removing for now, decreases response rate if data is not 100% accurate
    .must(stateQuery)
    .must(zipQuery)
    .should(streetQuery_strict)
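
To make the intent of the change easier to follow, here is a small toy model of the must/should behaviour in plain Scala. It does not call Elasticsearch, and its scoring (2 for a phrase hit, 1 otherwise) is a deliberate simplification rather than Elasticsearch's real relevance score. It only mimics three documented behaviours: a `match` query accepts a document when any query term is present (default OR operator), a `match_phrase` query needs the terms adjacent and in order, and a `should` clause next to `must` clauses affects ranking without excluding documents. `BoolQuerySketch`, `Doc` and the sample index contents in the usage note are made up for illustration.

```scala
// Toy model only: illustrates must(loose) + should(strict), not real Elasticsearch scoring.
object BoolQuerySketch {
  final case class Doc(number: String, street: String, state: String, zip: String)

  // Very rough stand-in for an analyzer: lowercase and split on whitespace.
  private def tokens(s: String): Seq[String] =
    s.toLowerCase.split("\\s+").filter(_.nonEmpty).toSeq

  // Like a match query with the default OR operator: any shared term is enough.
  def looseMatch(field: String, query: String): Boolean = {
    val fieldTokens = tokens(field)
    tokens(query).exists(t => fieldTokens.contains(t))
  }

  // Like a match_phrase query with slop 0: all terms, adjacent and in order.
  def phraseMatch(field: String, query: String): Boolean = {
    val queryTokens = tokens(query)
    queryTokens.nonEmpty && tokens(field).containsSlice(queryTokens)
  }

  // must clauses filter the candidates; the should clause only raises the score.
  def search(docs: Seq[Doc], number: String, street: String,
             state: String, zip: String): Seq[(Doc, Int)] =
    docs
      .filter(d => d.number == number && looseMatch(d.street, street) &&
        d.state == state && d.zip == zip)
      .map(d => (d, if (phraseMatch(d.street, street)) 2 else 1))
      .sortBy { case (_, score) => -score }
}
```

With a made-up index holding `Doc("871", "Brandon Drive", "UT", "84037")` and `Doc("871", "West Brandon Drive", "UT", "84037")`, searching for street `"West Brandon Drive"` keeps both documents and ranks the exact phrase first. Under the original strict-only query, the `"Brandon Drive"` document would have been dropped.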
