C++ | newCache v2 #44
Conversation
MIT license in mapped_file.hpp and mapped_file.cpp.
- Supports 64-bit file seeking (4 GiB+ files).
- Can memory-map portions of the opened file, or the entire file.
- Can flush modified read-write mappings back to disk.
- Read-write regions grow the backing file in multiples of 4096-byte blocks.
- mapped::file class for accessing a file on disk.
- mapped::region class for memory-mapping a raw area of a file.
- mapped::struct_region&lt;T&gt; template for accessing an on-disk structure.
- mapped::array_region&lt;T&gt; template for accessing an on-disk array of T.
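As a rough illustration of how these classes might fit together, here is a hedged usage sketch. The constructor and method signatures below are assumptions made for illustration, not the library's documented API.

```cpp
#include <cstdint>
#include <memory>
#include "mapped_file.hpp"

// Hypothetical on-disk record type, for illustration only.
struct Record {
    uint64_t count;
    uint32_t values[16];
};

int main() {
    // Assumed usage: open a file and map one Record at file offset 0.
    auto file = std::make_shared<mapped::file>();
    if (file->open("data.bin") != 0)  // per the notes, a missing file yields -1
        return 1;
    mapped::struct_region<Record> rec(file, /*offset=*/0);
    rec->count += 1;  // mutate the mapped structure in place
    rec.flush();      // assumed: write the modified pages back to disk
}
```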
- Silence a few std::printf calls, since opening a non-existent file is already handled by returning -1.
The memory map now supports mapping an oversized "window" into the file:
- flush() and sync() only flush the user area.
- jump() and flushJump() have a fast path when the new user area fits into the oversized window.
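The fast-path condition amounts to a containment check. The following is a hedged sketch of the idea; the names are assumptions, not the PR's actual members.

```cpp
#include <cstddef>

// Hypothetical window bookkeeping, for illustration only.
struct Window {
    std::size_t offset;  // file offset where the oversized mapping begins
    std::size_t size;    // length of the oversized mapping
};

// jump() fast path: if the requested user area already lies inside the
// oversized window, no mmap()/munmap() syscalls are needed and only the
// internal user-area offsets move; otherwise the window must be remapped.
bool fitsInWindow(const Window& w, std::size_t userOffset, std::size_t userSize) {
    return userOffset >= w.offset && userOffset + userSize <= w.offset + w.size;
}
```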
- Provide region::writeAt() and region::readAt(), which enable copying data to/from the backing file even if the target area of the backing file is not memory-mapped.
- Fix the flushed length in flush() and sync().
- Run clang-format.
- Provide an FSTUNE flag that attempts to speed up file access when a new file is created with CREATE|RESIZE. It effectively sets the chattr +X and +A flags on the file.
- Make readAt() const-qualified.
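On Linux, per-file attribute flags like these are typically toggled with the FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls (the mechanism behind chattr). Below is a hedged sketch that sets the no-atime flag (chattr +A); how FSTUNE maps to the +X flag is left to the library and not reproduced here.

```cpp
#include <linux/fs.h>   // FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, FS_NOATIME_FL
#include <sys/ioctl.h>

// Set chattr +A (suppress atime updates) on an open descriptor.
// Returns 0 on success, -1 on failure (the filesystem may not support it).
int setNoAtime(int fd) {
    int attr = 0;
    if (ioctl(fd, FS_IOC_GETFLAGS, &attr) != 0)
        return -1;
    attr |= FS_NOATIME_FL;
    return ioctl(fd, FS_IOC_SETFLAGS, &attr) == 0 ? 0 : -1;
}
```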
- Provide a properly move-aware object. region objects are now safe to use in STL containers like vector/deque.
- Implement region::resident() (not tested).
- region::window() allows over-extending the memory mapping. The "user mapped" portion stays the same, but regionSize() changes.
- For resident() it is better to mark the entire mapped region rather than just the user area.
- Implement more fine-grained locking for region.
- Implement region::discard(). This effectively zero-fills the memory area within the mapping and punches a hole into the backing file (see the sketch below).
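On Linux the usual primitives for this are fallocate() with FALLOC_FL_PUNCH_HOLE (free the backing blocks; subsequent reads return zeros) and madvise(MADV_DONTNEED) (drop the resident pages). A hedged sketch of what a discard() like the one described might do, assuming a page-aligned range:

```cpp
#include <fcntl.h>      // fallocate, FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE
#include <sys/mman.h>   // madvise, MADV_DONTNEED
#include <sys/types.h>  // off_t
#include <cstddef>

// Zero-fill a page-aligned range of a mapping by punching a hole in the
// backing file and dropping the resident pages. `mapAddr` must point at
// the part of the mapping that corresponds to `fileOffset`.
int discardRange(int fd, void* mapAddr, off_t fileOffset, std::size_t length) {
    // Punch a hole: frees the blocks but keeps the file size unchanged.
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  fileOffset, static_cast<off_t>(length)) != 0)
        return -1;
    // Discard cached pages; the next access reads back zeros from the hole.
    return madvise(mapAddr, length, MADV_DONTNEED);
}
```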
- The filePointer points into read-only memory from mmap(), so apply const in a few places to ensure nothing writes into it.
- getCubesByShape() may return pointers past the end of the mmap() area if the shape-table entry size is zero. ShapeEntry::offset can be wrong if the size is also zero.
- I can actually read how the progress is calculated.
DEBUG_LEVEL selects the level of debug prints that are compiled in:
- 0 => Same as not compiling with DEBUG at all.
- 1 => Only DEBUG_PRINT().
- 2 => DEBUG1_PRINT() and lower levels are enabled.
- 3 => DEBUG2_PRINT() and lower levels are enabled.
Change a few of the noisiest prints to be silent with DEBUG_LEVEL == 1.
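A minimal sketch of how such a level scheme is typically wired up with the preprocessor (the macro names come from the commit message; the PR's actual definitions may differ):

```cpp
#include <cstdio>

// DEBUG_LEVEL 0 compiles all debug prints out; higher levels enable more.
#ifndef DEBUG_LEVEL
#define DEBUG_LEVEL 0
#endif

#if DEBUG_LEVEL >= 1
#define DEBUG_PRINT(...) std::printf(__VA_ARGS__)
#else
#define DEBUG_PRINT(...) ((void)0)
#endif

#if DEBUG_LEVEL >= 2
#define DEBUG1_PRINT(...) std::printf(__VA_ARGS__)
#else
#define DEBUG1_PRINT(...) ((void)0)
#endif

#if DEBUG_LEVEL >= 3
#define DEBUG2_PRINT(...) std::printf(__VA_ARGS__)
#else
#define DEBUG2_PRINT(...) ((void)0)
#endif
```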
This is the v3 revision of this hack: previously the uint8_t bit-field actually caused Cube to be 16 bytes due to padding. Bit-pack the size, the is_shared flag, and the memory address into a private struct bits_t. This halves the Cube struct size. Note: if we get any segfaults from dereferencing the pointer returned by the get() helper, this hack must be reverted.
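For illustration, a hedged sketch of the packing idea, assuming a typical x86-64 user space where addresses fit in the low 48 bits. The field layout and helper names are assumptions, not the PR's exact code, and the canonical-form caveat discussed later in this thread applies.

```cpp
#include <cassert>
#include <cstdint>

// Pack a 48-bit pointer, an 8-bit size and an is_shared flag into one
// 64-bit word, halving the struct size. Layout is assumed for illustration.
struct bits_t {
    uint64_t addr : 48;   // low 48 bits of the XYZ data pointer
    uint64_t size : 8;    // number of XYZ coordinates
    uint64_t shared : 1;  // true if the data is shared, not owned
};
static_assert(sizeof(bits_t) == 8, "packing failed");

struct Cube {
    bits_t bits;

    const uint8_t* get() const {
        // Reconstitute the pointer. This assumes the original address had
        // zero upper bits; see the 5-level paging caveat below.
        return reinterpret_cast<const uint8_t*>(static_cast<uintptr_t>(bits.addr));
    }
    void set(const uint8_t* p, uint8_t n, bool shared) {
        auto u = reinterpret_cast<uintptr_t>(p);
        assert((u >> 48) == 0 && "pointer does not fit in 48 bits");
        bits.addr = u;
        bits.size = n;
        bits.shared = shared;
    }
};
```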
- Small changes diffed.
- Launching new threads is expensive. Refactor the cubes.cpp threading code so that started threads are kept running until the main process is complete.
- Allow the main thread to do its preparation work in parallel with the running Workset. (The next cache file can be loaded while the old one is being processed.)
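The refactor described above is essentially a small thread pool. Here is a hedged, self-contained sketch of the pattern (workers start once and stay alive, consuming jobs from a queue); it is not the PR's actual code.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Minimal persistent thread pool: threads start once and keep running,
// so per-batch work avoids repeated thread launch costs.
class Pool {
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> jobs_;
    std::mutex m_;
    std::condition_variable cv_;
    bool stop_ = false;

public:
    explicit Pool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] {
                for (;;) {
                    std::function<void()> job;
                    {
                        std::unique_lock<std::mutex> lk(m_);
                        cv_.wait(lk, [this] { return stop_ || !jobs_.empty(); });
                        if (stop_ && jobs_.empty()) return;
                        job = std::move(jobs_.front());
                        jobs_.pop();
                    }
                    job();  // run outside the lock
                }
            });
    }
    void enqueue(std::function<void()> job) {
        { std::lock_guard<std::mutex> lk(m_); jobs_.push(std::move(job)); }
        cv_.notify_one();
    }
    ~Pool() {
        { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }
};
```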
Implement a replacement for Cache::save(). CacheWriter should produce files identical to the old code, but is slightly faster as it doesn't wait for file finalization. The old code still exists as a reference, but nothing uses it except tests.
- libmappedfile would allow the serialization process to be parallelized. (WIP, not implemented yet.)
- Move Header and ShapeEntry into the cacheformat namespace.
- Implement CacheWriter.
- Update cubes.cpp to use the new CacheWriter.
- Add a Cube::copyout() helper. The idea is that it keeps working even if the cube representation is something other than a plain XYZ array.
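A hedged sketch of what a copyout() helper could look like, written as a free function so it is self-contained. The signature is an assumption, and it includes the num &lt;= size() check suggested later in this conversation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct XYZ { int8_t x, y, z; };

// Copy up to `num` coordinates into `dest`, regardless of how the cube
// stores them internally. Returns the number of coordinates copied.
// The real helper in the PR may have a different signature.
std::size_t copyout(const XYZ* data, std::size_t size, std::size_t num, XYZ* dest) {
    assert(num <= size && "copyout: request exceeds cube size");
    std::size_t n = std::min(num, size);
    std::copy_n(data, n, dest);
    return n;
}
```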
- CacheWriter now uses a thread pool and copies the Hashy using worker threads. This would not be possible without libmapped_file. (N=13 now completes in less than 310 seconds, depending on disk.)
- Add a nice progress bar.
The old cache code has been deprecated since CacheWriter arrived. The only user was in tests/src/test_cache.cpp, so drop that test case because it no longer has any impact on the main cubes code.
- Delete the include/cache.hpp and src/cache.cpp source files. Hopefully they will not be missed. :-)
CacheWriter didn't properly wait for queued job(s) to complete. Fix with a counter that is incremented on queue and decremented *after* the task has run.
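A hedged sketch of that counter pattern (not the PR's exact code): increment when a job is queued, decrement only after the task body has run, and wake any waiter when the count reaches zero.

```cpp
#include <condition_variable>
#include <mutex>

// Completion tracking per the fix: ++ on enqueue, -- *after* the task ran.
class JobCounter {
    std::mutex m_;
    std::condition_variable cv_;
    unsigned pending_ = 0;

public:
    void onQueued() {
        std::lock_guard<std::mutex> lk(m_);
        ++pending_;
    }
    void onFinished() {  // call only after the task body has completed
        std::lock_guard<std::mutex> lk(m_);
        if (--pending_ == 0) cv_.notify_all();
    }
    void waitAll() {     // blocks until every queued job has finished
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return pending_ == 0; });
    }
};
```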
A few thoughts:
Perhaps a check whether the upper bits in pointers are used would prevent segfaults and could warn the user about the problem. Also, the copyout function should check that num <= size().
Did you get segfaults? 👀 I read the Intel 5-level paging wiki just now and realized the hack is likely broken in its current form on many systems: virtual addresses must be in canonical form, or the CPU will fault on them. I am converting this PR into a draft because the code is outdated and likely breaks on some systems.
Would an assert be sufficient?
I did not have any segfaults, but I thought it would be easy to detect whether we are truncating important bits from the pointers. Perhaps an extra function that does the check once is also a possibility. An assert seems fine. It should be a systemic error if num and size do not match.
For Intel x86-64 processors, the high 8 or 16 bits of a memory address must be a replication of bit 56 or bit 47 respectively (depending on whether 5-level paging is enabled). The check is CPU-architecture and system specific, and is not viable to do at runtime.
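For illustration only: if the packed representation keeps 48-bit storage, canonical form can be restored on x86-64 with 4-level paging by sign-extending from bit 47 when unpacking (bit 56 would be needed under 5-level paging). A hedged sketch:

```cpp
#include <cstdint>

// On x86-64 a canonical 48-bit virtual address has bits 63..47 all equal
// to bit 47. Shifting left then arithmetic-shifting right replicates
// bit 47 upward, restoring canonical form after unpacking.
inline uintptr_t canonical48(uint64_t packed) {
    return static_cast<uintptr_t>(static_cast<int64_t>(packed << 16) >> 16);
}
```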
Hello!
I have some upgrades to the newCache + project code:
This is my own library in the making, partly unrelated to the opencubes project.
OpenCubes just happens to be the first client for the library outside my own projects.
The library source files have an explicit MIT license prepended to them.
Overall, I can now do N=13 in less than 310 seconds, give or take. 😄
There is still no compression of any kind for the PolyCube data...
To generate starting cache files run `cubes -n 10 -t <threads> -w -s`, followed by `cubes -n 11 -t <threads> -w -s -u` to continue with split cache files.

EDIT:
I have rearranged my repository and enabled GPG-verified commits.
The default branch of https://github.com/JATothrim/opencubes is now a fork of mikepound/main
and includes the changes in this PR.
@bertie2 can you take a look at this?