remove parenthesed portion of names #559

Draft
missinglink wants to merge 1 commit into master from parenthesed-names
Conversation

@missinglink (Member) commented Aug 11, 2021

This DRAFT PR explores the idea of removing parenthesized portions of names.
I'm not 100% sure this is a great idea; the test cases illustrate some positive and some potentially negative results.
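
To make the idea concrete, here is a minimal sketch of what stripping parenthesized portions from a name could look like. It is not the code from this PR: the helper name `strip_parenthesized`, the regular expression, and the sample names are all assumptions made for illustration.

```python
import re

# Hypothetical helper (not taken from this PR): matches one innermost
# "(...)" group together with any whitespace directly before it.
_PAREN_GROUP = re.compile(r"\s*\([^()]*\)")


def strip_parenthesized(name: str) -> str:
    """Remove parenthesized portions of a name, including nested ones."""
    result = name
    while True:
        # Each pass removes the innermost groups; repeat until nothing changes.
        reduced = _PAREN_GROUP.sub("", result)
        if reduced == result:
            break
        result = reduced
    # Collapse leftover runs of whitespace and trim the ends.
    result = " ".join(result.split())
    # Assumption: if the whole name was parenthesized, keep the original
    # rather than emit an empty name.
    return result or name


print(strip_parenthesized("Main Street (northbound)"))  # Main Street
print(strip_parenthesized("Springfield (Illinois)"))    # Springfield
print(strip_parenthesized("(closed)"))                  # (closed)
```

The second call hints at the kind of potentially negative result mentioned above: when the parenthesized part is what disambiguates the name, removing it loses information.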

@missinglink (Member, Author)

This was motivated by the following results, where a TV series shows up for the query 90210:

[Image: Screenshot 2021-08-11 at 15 03 31]

@missinglink (Member, Author) commented Aug 11, 2021

I also considered implementing something similar in pelias/schema, where we would store the original text verbatim but only index the tokens outside the parentheses. It's also not without its potential issues...
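
As a rough illustration of that alternative, the sketch below keeps the original name untouched for display and derives search tokens only from the text outside parentheses. It does not reflect how pelias/schema works: the function `build_index_entry`, the `stored`/`tokens` keys, and the simple tokenizer are hypothetical. In Elasticsearch a comparable effect could plausibly be achieved with a `pattern_replace` character filter, since character filters change only the analyzed tokens and not the stored source, but that mapping is an assumption here.

```python
import re

# Hypothetical names throughout; this only mimics the "store verbatim,
# index outside the parentheses" idea in plain Python.
_PAREN_GROUP = re.compile(r"\([^()]*\)")
_TOKEN = re.compile(r"\w+")


def build_index_entry(name: str) -> dict:
    """Return the verbatim name plus tokens taken from outside parentheses."""
    searchable = name
    while True:
        # Replace innermost groups with a space so neighbouring words stay apart.
        reduced = _PAREN_GROUP.sub(" ", searchable)
        if reduced == searchable:
            break
        searchable = reduced
    tokens = [token.lower() for token in _TOKEN.findall(searchable)]
    return {"stored": name, "tokens": tokens}


entry = build_index_entry("Main Street (northbound)")
print(entry["stored"])  # Main Street (northbound)
print(entry["tokens"])  # ['main', 'street']
```

One possible issue this makes visible: a name that is entirely parenthesized, such as `(closed)`, would yield no tokens at all and so could not be matched by a text query.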

@orangejulius (Member)

I like this! I'm sure it has a downside somewhere, but I think it's worth exploring. Definitely worth kicking off a build. The diff in the Vancouver extract actually looks very positive.

@orangejulius (Member)

I came across this PR again today and figured we should test it out. Branch is rebased and a planet build is kicked off :)
