Commit 6f22b57

Add 64-byte alignment (#67)
1 parent 5c86fb7 commit 6f22b57

15 files changed (+283 -67 lines changed)

.github/workflows/rust-lint.yml

Lines changed: 7 additions & 0 deletions
@@ -10,6 +10,13 @@ jobs:
       - uses: actions/checkout@v3
       - uses: dtolnay/rust-toolchain@stable
 
+      # Note: This is a workaround for an issue that just started appearing in lint checks
+      # and I'm not yet sure if it's due to GitHub Actions having updated something behind
+      # the scenes:
+      # error: 'cargo-fmt' is not installed for the toolchain 'stable-x86_64-unknown-linux-gnu'
+      - name: Install rustfmt
+        run: rustup component add rustfmt clippy
+
       - name: Install tools
         run: |
           cargo install cargo-deny

CHANGELOG.md

Lines changed: 33 additions & 1 deletion
@@ -15,7 +15,39 @@ The format is based on Keep a Changelog and this project adheres to
 ### Migration
 - If there are breaking changes, put a short, actionable checklist here.
 
-## [0.14.0-alpha] - 2024-09-08
+---
+
+## [0.15.0-alpha] - 2025-09-25
+### Breaking
+- Default payload alignment increased from 16 bytes to 64 bytes to ensure
+  SIMD- and cacheline-safe zero-copy access across SSE/AVX/AVX-512 code
+  paths. Readers/writers compiled with `<= 0.14.x-alpha` that assume
+  16-byte alignment will not be able to parse 0.15.x stores correctly.
+
+### Added
+- Debug/test-only assertions (`assert_aligned`, `assert_aligned_offset`)
+  to validate both pointer- and offset-level alignment invariants.
+
+### Changed
+- Updated documentation and examples to reflect the new 64-byte default
+  `PAYLOAD_ALIGNMENT` (still configurable in
+  `src/storage_engine/constants.rs`).
+- `EntryHandle::as_arrow_buffer` and `into_arrow_buffer` now check both
+  pointer and offset alignment when compiled in test or debug mode.
+
+### Migration
+- Stores created with 0.15.x are not backward-compatible with
+  0.14.x readers/writers due to the alignment change.
+- To migrate:
+  1. Read entries with your existing 0.14.x binary.
+  2. Rewrite into a fresh 0.15.x store (which will apply 64-byte
+     alignment).
+  3. Deploy upgraded readers before upgrading writers in multi-service
+     environments.
+
+---
+
+## [0.14.0-alpha] - 2025-09-08
 ### Breaking
 - Files written by 0.14.0-alpha use padded payload starts for fixed alignment.
   Older readers (<= 0.13.x-alpha) may misinterpret pre-pad bytes as part of the

Cargo.lock

Lines changed: 22 additions & 6 deletions
Generated file; diff not rendered by default.

Cargo.toml

Lines changed: 5 additions & 5 deletions
@@ -1,6 +1,6 @@
 [workspace.package]
 authors = ["Jeremy Harris <jeremy.harris@zenosmosis.com>"]
-version = "0.14.0-alpha"
+version = "0.15.0-alpha"
 edition = "2024"
 repository = "https://github.com/jzombie/rust-simd-r-drive"
 license = "Apache-2.0"
@@ -79,10 +79,10 @@ resolver = "2"
 
 [workspace.dependencies]
 # Intra-workspace crates
-simd-r-drive = { path = ".", version = "0.14.0-alpha" }
-simd-r-drive-entry-handle = { path = "./simd-r-drive-entry-handle", version = "0.14.0-alpha" }
-simd-r-drive-ws-client = { path = "./experiments/simd-r-drive-ws-client", version = "0.14.0-alpha" }
-simd-r-drive-muxio-service-definition = { path = "./experiments/simd-r-drive-muxio-service-definition", version = "0.14.0-alpha" }
+simd-r-drive = { path = ".", version = "0.15.0-alpha" }
+simd-r-drive-entry-handle = { path = "./simd-r-drive-entry-handle", version = "0.15.0-alpha" }
+simd-r-drive-ws-client = { path = "./experiments/simd-r-drive-ws-client", version = "0.15.0-alpha" }
+simd-r-drive-muxio-service-definition = { path = "./experiments/simd-r-drive-muxio-service-definition", version = "0.15.0-alpha" }
 muxio-tokio-rpc-client = "0.9.0-alpha"
 muxio-tokio-rpc-server = "0.9.0-alpha"
 muxio-rpc-service = "0.9.0-alpha"

README.md

Lines changed: 9 additions & 3 deletions
@@ -4,6 +4,8 @@
 
 `SIMD R Drive` is a high-performance, thread-safe storage engine using a single-file storage container optimized for zero-copy binary access.
 
+Payloads are written at fixed 64-byte aligned boundaries, ensuring efficient zero-copy access and predictable performance for SIMD and cache-friendly workloads.
+
 Can be used as a command line interface (CLI) app, or as a library in another application. Continuously tested on Mac, Linux, and Windows.
 
 [Documentation](https://docs.rs/simd-r-drive/latest/simd_r_drive/)
@@ -48,11 +50,13 @@ Additionally, `SIMD R Drive` is designed to handle datasets larger than availabl
 
 ## Fixed Payload Alignment (Zero-Copy Typed Slices)
 
-Every non-tombstone payload now starts at a fixed, power-of-two boundary (16 bytes by default, configurable). This guarantees that, when your payload length matches the element size, you can reinterpret bytes as typed slices (e.g., `&[u16]`, `&[u32]`, `&[u64]`, `&[u128]`) without copying.
+Every non-tombstone payload now begins on a fixed, power-of-two boundary (64 bytes by default). This matches the size of a typical CPU cacheline and ensures SIMD/vector loads (AVX, AVX-512, SVE, etc.) can operate at full speed without crossing cacheline boundaries.
+
+When your payload length matches the element size, you can safely reinterpret the bytes as typed slices (e.g., &[u16], &[u32], &[u64], &[u128]) without copying.
 
-This change is transparent to the public API and works with all write modes, including streaming. The on-disk layout may include a few padding bytes per entry to maintain alignment. Tombstones are unaffected.
+The on-disk layout may include a few padding bytes per entry to maintain alignment. Tombstones are unaffected.
 
-Practical benefits include faster vectorized reads, simpler use of zero-copy helpers (e.g., casting libraries), and fewer fallback copies. If you need a stricter boundary for a target platform, adjust the [alignment constant](./src/storage_engine/constants.rs) and rebuild.
+Practical benefits include cache-friendly zero-copy reads, predictable SIMD performance, simpler use of casting libraries, and fewer fallback copies. If a different boundary is required for your hardware, adjust the [alignment constant](./simd-r-drive-entry-handle/src/constants.rs) and rebuild.
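
Review note: the README's claim about reinterpreting payload bytes as typed slices can be illustrated with a minimal, std-only sketch. `AlignedPayload` and `as_u32_slice` are hypothetical names introduced here for illustration and are not part of the crate; the aligned struct merely stands in for a payload that starts on a 64-byte boundary. The view is only handed out when the start address is aligned for the element type and the length is a whole number of elements.

```rust
use std::mem::{align_of, size_of};

/// 64-byte aligned storage standing in for a payload inside the memory map.
/// (Hypothetical stand-in; not a type from the crate.)
#[repr(C, align(64))]
struct AlignedPayload([u8; 64]);

/// Hypothetical helper: view `bytes` as `&[u32]` without copying.
/// Returns `None` if the start is misaligned for `u32` or the length is
/// not a whole number of elements.
fn as_u32_slice(bytes: &[u8]) -> Option<&[u32]> {
    let aligned = (bytes.as_ptr() as usize) % align_of::<u32>() == 0;
    let whole = bytes.len() % size_of::<u32>() == 0;
    if !(aligned && whole) {
        return None;
    }
    // SAFETY: the pointer is aligned for `u32`, the length covers only whole
    // elements inside `bytes`, every bit pattern is a valid `u32`, and the
    // returned slice borrows from `bytes`.
    Some(unsafe {
        std::slice::from_raw_parts(bytes.as_ptr().cast::<u32>(), bytes.len() / size_of::<u32>())
    })
}

fn main() {
    let mut payload = AlignedPayload([0u8; 64]);
    payload.0[..4].copy_from_slice(&7u32.to_ne_bytes());

    let words = as_u32_slice(&payload.0).expect("64-byte aligned payload");
    println!("{} elements, first = {}", words.len(), words[0]);

    // One byte past an aligned start is misaligned for `u32`, so no view is returned.
    assert!(as_u32_slice(&payload.0[1..5]).is_none());
}
```

A 64-byte start satisfies the alignment of every primitive element type mentioned in the README, which is why only the length condition is left to the caller.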
 
 ## Single-File Storage Container for Binary Data
 
@@ -103,6 +107,8 @@ Think of it as a self-contained binary filesystem—capable of storing and retri
 <img src="./assets/storage-layout.png" title="Storage Layout" />
 </div>
 
+_Note: Illustration is conceptual and does not show the 64-byte aligned boundaries used in the actual on-disk format. In practice, every payload is padded to start on a fixed 64-byte boundary for cacheline and SIMD efficiency._
+
 Aligned entry (non-tombstone):
 
 | Offset Range | Field | Size (Bytes) | Description |

experiments/bindings/python-ws-client/Cargo.lock

Lines changed: 4 additions & 4 deletions
Generated file; diff not rendered by default.

experiments/bindings/python_(old_client)/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "simd-r-drive-py"
-version = "0.14.0-alpha"
+version = "0.15.0-alpha"
 description = "SIMD-optimized append-only schema-less storage engine. Key-based binary storage in a single-file storage container."
 repository = "https://github.com/jzombie/rust-simd-r-drive"
 license = "Apache-2.0"

simd-r-drive-entry-handle/src/constants.rs

Lines changed: 7 additions & 0 deletions
@@ -9,3 +9,10 @@ pub const CHECKSUM_RANGE: Range<usize> = 16..20;
 
 // Define checksum length explicitly since `CHECKSUM_RANGE.len()` isn't `const`
 pub const CHECKSUM_LEN: usize = CHECKSUM_RANGE.end - CHECKSUM_RANGE.start;
+
+/// Fixed alignment (power of two) for the start of every payload.
+/// 64 bytes matches cache-line size and SIMD-friendly alignment.
+/// This improves chances of staying zero-copy in vector kernels.
+/// Max pre-pad per entry is `PAYLOAD_ALIGNMENT - 1` bytes.
+pub const PAYLOAD_ALIGN_LOG2: u8 = 6; // 2^6 = 64
+pub const PAYLOAD_ALIGNMENT: u64 = 1 << PAYLOAD_ALIGN_LOG2;
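
Review note: the doc comment's bound of `PAYLOAD_ALIGNMENT - 1` pre-pad bytes follows from the usual power-of-two rounding arithmetic. The sketch below is an assumption about how such a pre-pad could be computed, not the crate's actual writer code; `pre_pad_len` and `aligned_start` are hypothetical helper names, and the two constants are copied from the hunk above.

```rust
/// Copied from the diff above.
const PAYLOAD_ALIGN_LOG2: u8 = 6; // 2^6 = 64
const PAYLOAD_ALIGNMENT: u64 = 1 << PAYLOAD_ALIGN_LOG2;

/// Hypothetical helper: padding bytes needed so that a payload that would
/// otherwise start at `offset` begins on a `PAYLOAD_ALIGNMENT` boundary.
fn pre_pad_len(offset: u64) -> u64 {
    // For a power-of-two alignment A: (A - offset % A) % A == (-offset) & (A - 1).
    offset.wrapping_neg() & (PAYLOAD_ALIGNMENT - 1)
}

/// Hypothetical helper: first aligned offset at or after `offset`.
fn aligned_start(offset: u64) -> u64 {
    offset + pre_pad_len(offset)
}

fn main() {
    for off in [0u64, 1, 20, 63, 64, 65, 1000] {
        let pad = pre_pad_len(off);
        println!("offset {off:>4} -> pad {pad:>2} -> payload starts at {}", aligned_start(off));
    }
}
```

The pad is zero when the offset is already aligned and at most 63 otherwise, which matches the stated maximum of `PAYLOAD_ALIGNMENT - 1`.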
Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
+/// Debug-only pointer alignment assertion that is safe to export.
+///
+/// Why this style:
+/// - We need to re-export a symbol other crates can call, but we do not
+/// want benches or release builds to pull in debug-only deps or code.
+/// - Putting `#[cfg(...)]` on the function itself makes the symbol
+/// vanish in release/bench. Callers would then need their own cfg
+/// fences, which is brittle across crates.
+/// - By keeping the function always present and gating only its body,
+/// callers can invoke it unconditionally. In debug/test it asserts;
+/// in release/bench it compiles to a no-op.
+///
+/// Build behavior:
+/// - In debug/test, the inner block runs and uses `debug_assert!`.
+/// - In release/bench, the else block keeps the args "used" so the
+/// function is a true no-op (no codegen warnings, no panic paths).
+///
+/// Cost:
+/// - Inlining plus the cfg-ed body means zero runtime cost in release
+/// and bench profiles.
+///
+/// Usage:
+/// - Call anywhere you want a cheap alignment check in debug/test,
+/// including from other crates that depend on this one.
+#[inline]
+pub fn debug_assert_aligned(ptr: *const u8, align: usize) {
+    #[cfg(any(test, debug_assertions))]
+    {
+        debug_assert!(align.is_power_of_two());
+        debug_assert!(
+            (ptr as usize & (align - 1)) == 0,
+            "buffer base is not {}-byte aligned",
+            align
+        );
+    }
+
+    #[cfg(not(any(test, debug_assertions)))]
+    {
+        // Release/bench: no-op. Keep args used to avoid warnings.
+        let _ = ptr;
+        let _ = align;
+    }
+}
+
+/// Debug-only file-offset alignment assertion that is safe to export.
+///
+/// Same rationale as `debug_assert_aligned`: keep a stable symbol that
+/// callers can invoke without cfg fences, while ensuring zero cost in
+/// release/bench builds.
+///
+/// Why not a module-level cfg or `use`:
+/// - Some bench setups compile with `--all-features` and may still pull
+/// modules in ways that trip cfg-ed imports. Gating inside the body
+/// avoids those hazards and keeps the bench linker happy.
+///
+/// Behavior:
+/// - Debug/test: checks that `off` is a multiple of the configured
+/// `PAYLOAD_ALIGNMENT`.
+/// - Release/bench: no-op, arguments are marked used.
+///
+/// Notes:
+/// - This asserts the *derived start offset* of a payload, not the
+/// pointer. Use the pointer variant to assert the actual address you
+/// hand to consumers like Arrow.
+#[inline]
+pub fn debug_assert_aligned_offset(off: u64) {
+    #[cfg(any(test, debug_assertions))]
+    {
+        use crate::constants::PAYLOAD_ALIGNMENT;
+
+        debug_assert!(
+            PAYLOAD_ALIGNMENT.is_power_of_two(),
+            "PAYLOAD_ALIGNMENT must be a power of two"
+        );
+        debug_assert!(
+            off.is_multiple_of(PAYLOAD_ALIGNMENT),
+            "derived payload start not {}-byte aligned (got {})",
+            PAYLOAD_ALIGNMENT,
+            off
+        );
+    }
+
+    #[cfg(not(any(test, debug_assertions)))]
+    {
+        // Release/bench: no-op. Keep arg used to avoid warnings.
+        let _ = off;
+    }
+}
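
Review note: the doc comments distinguish the offset check from the pointer check. A small std-only sketch of why the two are not interchangeable: an aligned offset only yields an aligned address when the base it is measured from is itself aligned. `Region`, `is_aligned`, and `payload_ptr_is_aligned` are hypothetical names for illustration; the aligned array stands in for a memory-mapped file, whose base is assumed here to be at least 64-byte aligned.

```rust
/// Assumed to match the default alignment introduced in this commit.
const PAYLOAD_ALIGNMENT: usize = 64;

/// 64-byte aligned region standing in for a memory-mapped file (hypothetical).
#[repr(C, align(64))]
struct Region([u8; 256]);

/// True if `addr` is a multiple of the power-of-two `align`.
fn is_aligned(addr: usize, align: usize) -> bool {
    assert!(align.is_power_of_two());
    addr & (align - 1) == 0
}

/// Hypothetical helper: is the payload at `off` within `base` aligned in memory?
fn payload_ptr_is_aligned(base: &[u8], off: usize) -> bool {
    is_aligned(base[off..].as_ptr() as usize, PAYLOAD_ALIGNMENT)
}

fn main() {
    let region = Region([0u8; 256]);
    let off = 128;

    // The offset alone passes an offset-level check.
    println!("offset aligned: {}", is_aligned(off, PAYLOAD_ALIGNMENT));

    // Aligned base plus aligned offset gives an aligned pointer.
    println!("aligned base:   {}", payload_ptr_is_aligned(&region.0, off));

    // Base shifted by one byte: same offset, but the pointer is now misaligned.
    println!("shifted base:   {}", payload_ptr_is_aligned(&region.0[1..], off));
}
```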

simd-r-drive-entry-handle/src/entry_handle.rs

Lines changed: 26 additions & 8 deletions
@@ -387,11 +387,20 @@ impl EntryHandle {
         use std::ptr::NonNull;
         use std::sync::Arc;
 
-        // Pointer to the start of the payload.
-        let ptr = NonNull::new(self.as_slice().as_ptr() as *mut u8).expect("non-null slice ptr");
+        let slice = self.as_slice();
+        #[cfg(any(test, debug_assertions))]
+        {
+            use crate::{
+                constants::PAYLOAD_ALIGNMENT, debug_assert_aligned, debug_assert_aligned_offset,
+            };
+            // Assert actual pointer alignment.
+            debug_assert_aligned(slice.as_ptr(), PAYLOAD_ALIGNMENT as usize);
+            // Assert derived file offset alignment.
+            debug_assert_aligned_offset(self.range.start as u64);
+        }
 
-        // Owner keeps the mmap alive for the Buffer's lifetime.
-        unsafe { Buffer::from_custom_allocation(ptr, self.size(), Arc::new(self.clone())) }
+        let ptr = NonNull::new(slice.as_ptr() as *mut u8).expect("non-null slice ptr");
+        unsafe { Buffer::from_custom_allocation(ptr, slice.len(), Arc::new(self.clone())) }
     }
 
     /// Convert this handle into an Arrow `Buffer` without copying.
@@ -418,11 +427,20 @@ impl EntryHandle {
         use std::ptr::NonNull;
         use std::sync::Arc;
 
-        let len: usize = self.size();
-        let ptr = NonNull::new(self.as_slice().as_ptr() as *mut u8).expect("non-null slice ptr");
+        let slice = self.as_slice();
+        #[cfg(any(test, debug_assertions))]
+        {
+            use crate::{
+                constants::PAYLOAD_ALIGNMENT, debug_assert_aligned, debug_assert_aligned_offset,
+            };
+            // Assert actual pointer alignment.
+            debug_assert_aligned(slice.as_ptr(), PAYLOAD_ALIGNMENT as usize);
+            // Assert derived file offset alignment.
+            debug_assert_aligned_offset(self.range.start as u64);
+        }
 
-        // Move self into the owner to avoid an extra Arc bump later.
-        unsafe { Buffer::from_custom_allocation(ptr, len, Arc::new(self)) }
+        let ptr = NonNull::new(slice.as_ptr() as *mut u8).expect("non-null slice ptr");
+        unsafe { Buffer::from_custom_allocation(ptr, slice.len(), Arc::new(self)) }
     }
 }
 
