Difference Between Bulk and Single Line Geocoding #204
Description
Using the addresses file in the test harness (~4,350 rows), I found addresses that the batch geocoder fails to find, but that the state address points geocoder does find when the same addresses are geocoded individually. Some examples:
7437 South 1550 East SOUTH WEBER UT 84405
7473 South 1550 East SOUTH WEBER UT 84405
9 East 750 South FARMINGTON UT 84025
871 West Brandon Drive KAYSVILLE UT 84037
857 West Brandon Drive KAYSVILLE UT 84037
521 Wharton Road Lowell AR 72745
2 Arrow Brook Court Little Rock AR 72227
4 Amherst Cove Little Rock AR 72205
43 Temple Court Northwest WASHINGTON DC 20001
45 K Street Northwest WASHINGTON DC 20001
2 Lupine Lane SOUTH BURLINGTON VT 05403
57 Munson Drive WILLISTON VT 05495
In case it is relevant, this is after I made some changes to the geocoder: I added more synonyms to the loader, and I changed the file in grasshopper that builds the Elasticsearch query to add proximity searching.
Specifically changing:
private def searchAddressFields(client: Client, index: String, indexType: String, number: String, streetName: String, city: String, state: String, zipCode: String): Array[SearchHit] = {
  val numberQuery = QueryBuilders.matchQuery("number", number)
  val streetQuery = QueryBuilders.matchPhraseQuery("street", streetName)
  val cityQuery = QueryBuilders.matchQuery("city", city)
  val stateQuery = QueryBuilders.matchQuery("state", state)
  val zipQuery = QueryBuilders.matchQuery("zip", zipCode)

  val query = QueryBuilders
    .boolQuery()
    .must(numberQuery)
    .must(streetQuery)
    //.must(cityQuery) Removing for now, decreases response rate if data is not 100% accurate
    .must(stateQuery)
    .must(zipQuery)
to
private def searchAddressFields(client: Client, index: String, indexType: String, number: String, streetName: String, city: String, state: String, zipCode: String): Array[SearchHit] = {
  val numberQuery = QueryBuilders.matchQuery("number", number)
  val streetQuery_strict = QueryBuilders.matchPhraseQuery("street", streetName)
  val cityQuery = QueryBuilders.matchQuery("city", city)
  val stateQuery = QueryBuilders.matchQuery("state", state)
  val zipQuery = QueryBuilders.matchQuery("zip", zipCode)
  val streetQuery_loose = QueryBuilders.matchQuery("street", streetName)

  val query = QueryBuilders
    .boolQuery()
    .must(numberQuery)
    //.must(streetQuery_strict)
    .must(streetQuery_loose)
    //.must(cityQuery) Removing for now, decreases response rate if data is not 100% accurate
    .must(stateQuery)
    .must(zipQuery)
    .should(streetQuery_strict)
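
To make the effect of the change above on street matching easier to reason about, here is a simplified, self-contained Scala sketch. It does not call Elasticsearch. The names `StreetMatchSketch`, `tokens`, `looseMatch`, `strictMatch`, and `score` are hypothetical, the whitespace tokenizer is only a stand-in for the real analyzer, and the integer score is a toy substitute for relevance scoring. It assumes default behavior: `matchQuery` matches when any query token is present, `matchPhraseQuery` needs the tokens adjacent and in order, and a `should` clause alongside `must` clauses only affects ranking.

```scala
// Simplified model of the two street clauses.
// Assumption: this is NOT Elasticsearch's real analyzer or scoring.
object StreetMatchSketch {
  // Lowercase and split on whitespace, a stand-in for the index analyzer.
  def tokens(s: String): Seq[String] =
    s.toLowerCase.split("\\s+").filter(_.nonEmpty).toSeq

  // Models matchQuery with its default OR operator:
  // the field matches if it shares at least one token with the query.
  def looseMatch(indexed: String, query: String): Boolean = {
    val idx = tokens(indexed)
    tokens(query).exists(t => idx.contains(t))
  }

  // Models matchPhraseQuery with zero slop:
  // all query tokens must appear adjacent and in order.
  def strictMatch(indexed: String, query: String): Boolean = {
    val q = tokens(query)
    q.nonEmpty && tokens(indexed).containsSlice(q)
  }

  // Models must(loose) + should(strict) for the street field only:
  // None when the must clause fails, otherwise a toy score
  // in which a phrase match adds 1.
  def score(indexed: String, query: String): Option[Int] =
    if (!looseMatch(indexed, query)) None
    else Some(1 + (if (strictMatch(indexed, query)) 1 else 0))
}
```

In this model, `StreetMatchSketch.score("Brandon Drive", "West Brandon Drive")` returns `Some(1)`: the hypothetical indexed street is accepted by the loose clause even though the phrase does not match, whereas the original query, which required the phrase match, would have rejected it. An exact phrase such as `score("West Brandon Drive", "West Brandon Drive")` returns `Some(2)`, so it still ranks first. Whether the indexed data actually differs from the input in this way is an assumption, not something confirmed here.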