-
Notifications
You must be signed in to change notification settings - Fork 0
ETT-827: full-text advanced search boolean operator bug fix #136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| locationData = [], | ||
| collid = null, | ||
| collectionName = null, | ||
| mockSubmit = null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New mockSubmit function included here as a prop to pass in as an argument for storybook test.
| lookFors.fill(''); | ||
| let bools = $state(new Array(4)); | ||
| bools.fill('AND'); | ||
| bools[0] = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting op1 as null since it's never used.
| } | ||
| function submitForm(event) { | ||
| function buildSearchUrl(event) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored this function and renamed as buildSearchUrl. I extracted the part that does the form "submission" so I could test the URL that is constructed in this step.
| const filteredAnyAlls = []; | ||
| const filteredBools = []; | ||
| for (let i = 0; i < lookFors.length; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop that filters the search arrays depending on whether that line in the search form was filled in.
| let hasSearchTerms = false; | ||
| lookFors.forEach((value, idx) => { | ||
| filteredLookFors.forEach((value, idx) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting params based on the filtered arrays.
| return url.toString(); | ||
| } | ||
| function submitForm(event) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extracted function that "submits" the form. It actually just reassigns the page's URL to the constructed URL. The storybook test is the only file that uses mockSubmit.
| types[3] = params.get('field4') || 'author'; | ||
| anyalls[3] = params.get('anyall4') || 'all'; | ||
| bools[4] = params.get('op3') || 'AND'; | ||
| bools[3] = params.get('op4') || 'AND'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bools needed some work. @moseshll pointed out the mismatch in an earlier comment and this rearranged the indexes to have them match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely seems like a case where it would be easier to reason about and work with if the params had more structure directly (i.e. were expressed in JSON), but not feasible to change now without making changes in ls.
| <div class="d-flex mb-3 justify-content-end"> | ||
| <button class="btn btn-primary btn-lg" type="submit"> | ||
| <button data-testid="advanced-search-submit" class="btn btn-primary btn-lg" type="submit"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gayathri , Ange, and I have all lamented that these buttons don't make sense and that advanced search in general needs a redesign. Why are there 2 "Advanced Search" buttons on this page? I don't know, but it was too hard to select this button for storybook without giving it a test ID.
| @@ -0,0 +1,12 @@ | |||
| <script> | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I needed to add a decorator to this story to account for the HT global variable in the component.
aelkiss
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having tests in storybook for this is great; as I've commented, in many ways we're limited by the interface that full-text search has given us. I think it would be tricky to do much more refactoring here in a way that makes it easier to understand. One benefit of what's here is that it makes some of the "data clumpiness" a little more apparent for consideration/rework in a future redesign of advanced search.
| await waitFor(() => expect(args.mockSubmit).toHaveBeenCalled()) | ||
|
|
||
| const calledUrl = args.mockSubmit.mock.calls[0][0]; | ||
| expect(calledUrl).toContain('lmt=ft&a=srchls&adv=1&q1=elephant&q2=conservation&field1=ocr&field2=subject&anyall1=all&anyall2=all&op2=OR') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stories look good; I verified that these URLs in full-text search (i.e. prepending this with https://babel.hathitrust.org/cgi/ls?) appears to work as expected.
| searchParams.set('c', collid); | ||
| } | ||
| const filteredLookFors = []; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like one of those "data clump" cases as we were discussing with 99 bottles, BUT it's very dependent on the interface with full-text search -- so this seems like more of a design opportunity to rethink the data structure here (maybe passing something with more structure as JSON) in the future, rather than doing anything different here right now.
| types[3] = params.get('field4') || 'author'; | ||
| anyalls[3] = params.get('anyall4') || 'all'; | ||
| bools[4] = params.get('op3') || 'AND'; | ||
| bools[3] = params.get('op4') || 'AND'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely seems like a case where it would be easier to reason about and work with if the params had more structure directly (i.e. were expressed in JSON), but not feasible to change now without making changes in ls.
|
@moseshll discovered a bug with catalog searches, so I'm going to fix that up before going for a second review from him. |
| } | ||
| } | ||
|
|
||
| export const FullTextFields1And4WithFirstOr = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New test for fields 1 and 4 if connected with the first OR (this wasn't working before).
|
|
||
| } | ||
| } | ||
| export const CatalogFields1And4WithLastOr = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New test for constructing a catalog search URL with fields 1 and 4 connected with the third boolean OR.
| } | ||
| } | ||
| const filteredLookFors = []; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved these filtered arrays to outside the ls scope so catalog URL construction could access filteredBools, too.
| filteredAnyAlls.push(anyalls[i]); | ||
| // use OR if it's the connecting bool between two lines, otherwise use AND | ||
| if (lastValidIndex !== null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept running into an issue while working on this where I could get the value for the first OR or the last OR but never both of them.
I used Claude to help figure this out. What Ange really wanted was to always default to an 'OR' if the user selected it somewhere:
would it make sense to write a rule where, because AND is the default, then selecting OR should have priority over AND when they come into potential conflict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, since it takes effort to select the non-default.
| return; | ||
| } | ||
| if (value && lookFors[idx] && lookFors[idx - 1]) { | ||
| filteredBools.forEach((value, idx) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored the catalog bools to use the filteredBools array.
a2eead3 to
b832cce
Compare
moseshll
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked the latest on dev-3 and I am happy with the result. APPROVE
This work-around more or less fixes the bug that was described in the ticket, but there is more under the surface here that I don't 100% understand.
To set the scene: Looking at this form, it's visually obvious that there can be 4 keywords/fields/types and only 3 boolean operators at the maximum. @moseshll discovered that
op1/[bools][0]seems to be doing nothing, and I agree, so it gets skipped in this new URL construction. All other search-related arrays are filtered against thelookFors/keyword array so that the related indexes are filtered and reassigned. In a search on prod, searching for "apples" on line 1 and "oranges" on line 4 results in a URL likeq1=apples&q4=oranges&op1=AND&op4=AND. This fix changes changes the URL output to something likeq1=apples&q2=oranges&op2=AND.Some history: The version of advanced search before this searched groups of clauses, and I think there is some leftover logic for that hanging out here in a way that isn’t supported in the interface. Lianet’s full-text XML spreadsheet indicates that the param “OP3” is “the operator between second and third search fields, i.e. between groups.” In the old interface, there were two groups of inputs connected with a separate boolean, making it possible to search like

(cake AND pie) OR (cake AND ice cream). Theop3param seems to be that connecting OR operator in my example. I'm not sure this makes much of a difference in search results now that I'm re-writing the placement of keywords (e.g.q1=cake&q4=pie&op1=AND&op4=ANDis nowq1=cake&q2=pie&op2=AND), but in prod currently, if you try to do a search on lines 2 and 3 or 3 and 4 without anything in the first line, it spits out an error about parentheses and/or booleans.This fix unfortunately creates a new sporadic bug in catalog advanced search. Occasionally, the third boolean radio group will have nothing selected, just two empty radio buttons. This doesn't really affect the searching mechanism or outcome of the search since a user can just select one and move on, but it is annoying. Thankfully (?), it's pretty challenging to get to the catalog version of advanced search, so I'm inclined to ignore this problem until we get complaints about it and/or redesign search.