👋 @philtweir — welcome! No work is in progress yet for this, so I would love to work with you to get this tidied up and merged in 🙂 Notes on your notes:
Thanks @bglw !
Only the metadata, so this was simply to be least invasive on the Rust end (and this is for batch runs, so I am less concerned about the size). However, keeping the IntermediaryPageData and passing that back instead of the encoded_data seems to address the primary use-case with much less overhead. Given that the metadata can hold arbitrary key-value pairs, the user can store their own ID for looking up the original data if they want.
Sounds good!
Of course, and thanks - no objection to doing that, but that went beyond my immediate need, so left it as a todo for this discussion (having assumed it would be essential for merge).
Grand, will do - I did do a build from a clean repo clone the same day that I submitted this, so it is perhaps more likely I have got some steps out of sequence (again, was keen to check before spending more time but know to take another crack now).
Great - not sure how I missed that - classic searching everywhere except the problem package name in the issue tracker 🙄.
Thank you!
Only recently came to pagefind and it is amazing, thank you!
As in #371, my use-case is geospatial, and using pagefind for text search works really well except that I really need to be able to filter by a map-window. In short, we have 30-60k records to search and while leaflet.js cluster markers can handle that, geographically windowing the search is critical for the tool to be usable.
Is this an XY problem?: The two ideas I had were (1) adding a (taxonomy-based) locality filter, but the reality is that would be unintuitive and hard to do in a useful way given how many localities there are, and (2) searching the full result set with the geographic window (thanks @indus for highlighting flatbush) and intersecting the results. The second does work, but without being able to map pagefind hashes to any lookup outside pagefind, that means loading all matching results via result.data() and using metadata. I cannot take the first N results because they may not contain O(N) matches in the geospatial window - users are likely to wish to search very generic terms in a very small area, so loading every fragment to filter down to 5 results generates a huge amount of traffic and kills the browser. As such, the minimum viable option seems to be adding GetIndexCatalogue so that I can intersect flatbush results with pagefind results before loading any fragments. Happy for other simpler alternatives.
I saw in #371 that this is the same direction @bglw is intending but as I could not see a PR and it was a blocker for me, I went ahead and did a minimal patch. There is clearly more work:
main to run (an issue where fragment requests get sent as single-element arrays with the fragment hash and 404, which I thought someone might have a quick explanation for) so this is tested on top of v1.3.0 tag
cargo build --extended in pagefind/pagefind
However, happy to tidy up if the direction is good - if there is equivalent work-in-progress or there is a better approach, feel free to close and I will keep using this only until it lands!
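For anyone following along, here is a rough sketch of the intersection step described above, to show why having a catalogue up front avoids loading fragments. Everything in it is an assumption for illustration: the only parts that mirror pagefind are that a search result has an id and a lazy data() loader. The catalogue shape (a map from result hash to a user-stored record ID), the Point and BBox types, and the function names are hypothetical, and the linear scan in recordsInWindow merely stands in for a real spatial index query such as flatbush.

```typescript
// Hypothetical shape: only `id` and the lazy `data()` loader are modelled
// on what a search result exposes; everything else is illustrative.
interface SearchResult<T> {
  id: string;
  data: () => Promise<T>;
}

interface BBox { minX: number; minY: number; maxX: number; maxY: number; }
interface Point { recordId: string; x: number; y: number; }

// Stand-in for a spatial index query: a plain linear scan that returns the
// record IDs whose coordinates fall inside the map window (bounds inclusive).
function recordsInWindow(points: Point[], w: BBox): Set<string> {
  const hits = new Set<string>();
  for (const p of points) {
    if (p.x >= w.minX && p.x <= w.maxX && p.y >= w.minY && p.y <= w.maxY) {
      hits.add(p.recordId);
    }
  }
  return hits;
}

// Keep only the search results whose hash maps (via the assumed catalogue)
// to a record inside the window. No fragment is loaded here; order is kept.
function intersectWithWindow<T>(
  results: SearchResult<T>[],
  catalogue: Map<string, string>, // assumed: result hash -> user record ID
  inWindow: Set<string>,
): SearchResult<T>[] {
  return results.filter((r) => {
    const recordId = catalogue.get(r.id);
    return recordId !== undefined && inWindow.has(recordId);
  });
}

// Load fragments only for the first `limit` results that survived the filter.
async function loadWindowedResults<T>(
  results: SearchResult<T>[],
  catalogue: Map<string, string>,
  inWindow: Set<string>,
  limit: number,
): Promise<T[]> {
  const kept = intersectWithWindow(results, catalogue, inWindow).slice(0, limit);
  return Promise.all(kept.map((r) => r.data()));
}
```

With this shape, a generic term that matches thousands of records in a tiny window triggers at most limit fragment requests, rather than one per match.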