Adding support for fewer fragment files (#74 and #739) #1020
yakovsh wants to merge 1 commit into Pagefind:main from
Conversation
👋 Thanks for the PR! This is a feature we need, though this isn't quite how I'd like to approach it. Some notes:
Let me know how you want to proceed with this one, whether you want to reshape this PR or if you'd rather I take a crack at this feature after 1.5.0 ships :)
With the "--max-fragments" approach, we would first need to know how many total pages there are, and then decide how many fragments to generate. If "--max-fragments" is larger than the total number of pages, I assume things can stay as they are today. Once we know the total page count, the number of pages per fragment would be pages / max_fragments. (An alternative would be to specify how many pages to store per fragment, something like "--pages-per-fragment", which might make this a little cleaner but is probably too tied to implementation details.) Then, instead of using page hashes, the fragments could be named like "en_001.pf_fragment", "en_002.pf_fragment", etc., and the metadata that ties pages to fragments could be numeric ("1, 2, 3") instead of the hashes used today. Is this how you understand it?
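The arithmetic described above could be sketched roughly as follows. This is a hypothetical illustration in Python, not Pagefind's actual (Rust) code; the function name, the `en_NNN.pf_fragment` naming, and the data shapes are all assumptions for the sake of the example:

```python
import math

def assign_fragments(page_ids, max_fragments):
    """Hypothetical sketch of the --max-fragments idea discussed above."""
    if max_fragments >= len(page_ids):
        # More fragment slots than pages: keep one fragment per page,
        # i.e. behaviour stays as it is today.
        return {pid: f"en_{i:03}.pf_fragment" for i, pid in enumerate(page_ids, 1)}
    # Otherwise pack pages into at most max_fragments numbered files.
    pages_per_fragment = math.ceil(len(page_ids) / max_fragments)
    return {
        pid: f"en_{(i // pages_per_fragment) + 1:03}.pf_fragment"
        for i, pid in enumerate(page_ids)
    }
```

For example, ten pages with `max_fragments=3` pack into three numbered files (four pages per file, with the last file underfull), while `max_fragments=20` leaves every page in its own fragment.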
No, we do still need hashes, and in fact an issue I didn't highlight with this PR is that reducing the hash prefix down too far will cause cache collisions. One of the jobs of the hashes is to allow indefinite caching of Pagefind assets, since they naturally cache bust when content changes.
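To make the collision concern concrete, here is a small illustration (the `fragment_name` helper and the use of SHA-256 are assumptions for the sketch, not Pagefind's actual hashing). With a one-character hex prefix there are only 16 possible file names, so by the pigeonhole principle any site with more than 16 pages must reuse a name, and a stale cached fragment for one page could be served for another after a rebuild:

```python
import hashlib

def fragment_name(content: bytes, prefix_len: int) -> str:
    # Name a fragment by a prefix of its content hash. Long prefixes
    # give natural cache busting; very short ones invite collisions.
    digest = hashlib.sha256(content).hexdigest()
    return f"en_{digest[:prefix_len]}.pf_fragment"

# 100 distinct pages funneled into at most 16 names vs. 16-hex-char names.
names_short = {fragment_name(f"page {i}".encode(), 1) for i in range(100)}
names_long = {fragment_name(f"page {i}".encode(), 16) for i in range(100)}
```

Here `names_short` can contain at most 16 distinct names, so most of the 100 pages collide, while the 16-character (64-bit) prefixes in `names_long` keep them effectively all distinct.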
I didn't think of that, since in my use case I refresh the cache manually via a CDN. Let me try to reshape the PR.
Going to re-open once I solve the merge/rebase issues.
I reshaped the PR to implement this as a new "max-fragments" option instead of the original approach. I haven't fully updated the NodeJS and Python wrappers yet, but let me know if this is the direction you were thinking of, and I will update those as well. Thank you!
Looks like some of the tests are failing on GitHub even though they pass locally. The same tests are failing in other PRs, which leads me to believe it's not due to my change. The specific test is "Web Components Tests > Summary component displays search status".
This PR adds support for grouping fragments into a smaller number of files by combining fragments whose hashes share the same first N characters into the same file (issues #74 and #739). I tried to make the changes minimally invasive and added some tests. Let me know if anything needs to be adjusted or changed.
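The grouping this PR describes can be sketched as follows (illustrative Python only; the function name and the `(hash, payload)` shape are assumptions, not the PR's actual Rust implementation):

```python
from collections import defaultdict

def group_by_hash_prefix(fragments, n):
    """Bucket (hash, payload) pairs into shared files keyed by the
    first n characters of each fragment's hash."""
    buckets = defaultdict(list)
    for frag_hash, payload in fragments:
        buckets[frag_hash[:n]].append(payload)
    return dict(buckets)
```

For example, fragments with hashes `abc123`, `abd999`, and `abcf00` grouped on the first three characters land in two files: one for the `abc` prefix (two fragments) and one for `abd`.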