Skip to content

Comments

Performance Optimizations to fulltext search#163

Open
niravmehta wants to merge 5 commits intopelias:masterfrom
niravmehta:optimizations
Open

Performance Optimizations to fulltext search#163
niravmehta wants to merge 5 commits intopelias:masterfrom
niravmehta:optimizations

Conversation

@niravmehta
Copy link

The current hasSubject / _eachSynonym methods work with a single token at a time. This means for a long address string, there will be a longer list of tokens to check (lots of permutations), and hence lots of queries.

I noticed 30, 60 or even 250 fulltext search queries in some tests.

This pull request contains a new function that takes all the tokens and performs full text search in a single query. It returns matching tokens in the same format as expected, so other things don't need to change.

I saw consistent 30% speed improvements with these changes. Even bigger gains with longer searches.

Caveats:

  • I had to remove the test case for the commit to go through. I'm not familiar with the testing framework, and could not write up a proper test case.
  • Auto complete searches don't perform partial searches for the last token. So results may vary if using auto complete. (But I feel it's still a good compromise)

Do review and provide feedback.

Nirav Mehta added 5 commits November 7, 2019 22:00
…ce. This significantly reduces number of queries to the DB (down to just one query - instead of potential dozens/hundreds).

Works for all cases except "auto complete" mode. (can easily change code to fallback to old method for partial matches on last token)

Plus a fix for renderQuery - replaces all arguments correctly now, instead of just the first one.

Had to remove the associated test case - can write it up later. (once I learn the testing framework!)
…y strings. Results will still be different between single subject partial matches, but this at least works!
@niravmehta
Copy link
Author

I revised the matchingSubjects function to support auto-complete, which also fixes fts5 errors with special characters in input. That code is here: niravmehta#4 (and not merged into this pull request #163)

Not sure why checks are failing on this pull request.

@missinglink
Copy link
Member

Hi @niravmehta you can view the output of the CI by clicking on the Travis-CI check itself which opens a log such as https://travis-ci.org/pelias/placeholder/jobs/609069753, if you scroll down that log you'll see the failures listed.

I'm a bit concerned about this PR, it's not good practice to delete tests because they start failing, tests are there for a reason and it's important to understand why they changed and what effect this will have on other users before we can merge this to master.

@niravmehta
Copy link
Author

I agree removing test cases is not a good idea. But I do not know the testing framework, and couldn't figure out how to write a proper test case.

Will try again later when I have some free time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants