Merged
7 changes: 7 additions & 0 deletions .github/workflows/rust-lint.yml
Original file line number Diff line number Diff line change
@@ -10,6 +10,13 @@ jobs:
      - uses: actions/checkout@v3
      - uses: dtolnay/rust-toolchain@stable

      # Note: This is a workaround for an issue that just started appearing in lint checks
      # and I'm not yet sure if it's due to GitHub Actions having updated something behind
      # the scenes:
      #   error: 'cargo-fmt' is not installed for the toolchain 'stable-x86_64-unknown-linux-gnu'
      - name: Install rustfmt and clippy
        run: rustup component add rustfmt clippy

      - name: Install tools
        run: |
          cargo install cargo-deny
34 changes: 33 additions & 1 deletion CHANGELOG.md
@@ -15,7 +15,39 @@ The format is based on Keep a Changelog and this project adheres to
### Migration
- If there are breaking changes, put a short, actionable checklist here.

## [0.14.0-alpha] - 2024-09-08
---

## [0.15.0-alpha] - 2025-09-25
### Breaking
- Default payload alignment increased from 16 bytes to 64 bytes to ensure
SIMD- and cacheline-safe zero-copy access across SSE/AVX/AVX-512 code
paths. Readers/writers compiled with `<= 0.14.x-alpha` that assume
16-byte alignment will not be able to parse 0.15.x stores correctly.

### Added
- Debug/test-only assertions (`debug_assert_aligned`,
`debug_assert_aligned_offset`) to validate both pointer- and offset-level
alignment invariants.

### Changed
- Updated documentation and examples to reflect the new 64-byte default
`PAYLOAD_ALIGNMENT` (still configurable, now in
`simd-r-drive-entry-handle/src/constants.rs`).
- `EntryHandle::as_arrow_buffer` and `into_arrow_buffer` now check both
pointer and offset alignment when compiled in test or debug mode.

### Migration
- Stores created with 0.15.x are not backward-compatible with
0.14.x readers/writers due to the alignment change.
- To migrate:
  1. Read entries with your existing 0.14.x binary.
  2. Rewrite into a fresh 0.15.x store (which will apply 64-byte
     alignment).
  3. Deploy upgraded readers before upgrading writers in multi-service
     environments.

---

## [0.14.0-alpha] - 2025-09-08
### Breaking
- Files written by 0.14.0-alpha use padded payload starts for fixed alignment.
Older readers (<= 0.13.x-alpha) may misinterpret pre-pad bytes as part of the
28 changes: 22 additions & 6 deletions Cargo.lock

10 changes: 5 additions & 5 deletions Cargo.toml
@@ -1,6 +1,6 @@
[workspace.package]
authors = ["Jeremy Harris <[email protected]>"]
version = "0.14.0-alpha"
version = "0.15.0-alpha"
edition = "2024"
repository = "https://github.com/jzombie/rust-simd-r-drive"
license = "Apache-2.0"
@@ -79,10 +79,10 @@ resolver = "2"

[workspace.dependencies]
# Intra-workspace crates
simd-r-drive = { path = ".", version = "0.14.0-alpha" }
simd-r-drive-entry-handle = { path = "./simd-r-drive-entry-handle", version = "0.14.0-alpha" }
simd-r-drive-ws-client = { path = "./experiments/simd-r-drive-ws-client", version = "0.14.0-alpha" }
simd-r-drive-muxio-service-definition = { path = "./experiments/simd-r-drive-muxio-service-definition", version = "0.14.0-alpha" }
simd-r-drive = { path = ".", version = "0.15.0-alpha" }
simd-r-drive-entry-handle = { path = "./simd-r-drive-entry-handle", version = "0.15.0-alpha" }
simd-r-drive-ws-client = { path = "./experiments/simd-r-drive-ws-client", version = "0.15.0-alpha" }
simd-r-drive-muxio-service-definition = { path = "./experiments/simd-r-drive-muxio-service-definition", version = "0.15.0-alpha" }
muxio-tokio-rpc-client = "0.9.0-alpha"
muxio-tokio-rpc-server = "0.9.0-alpha"
muxio-rpc-service = "0.9.0-alpha"
12 changes: 9 additions & 3 deletions README.md
@@ -4,6 +4,8 @@

`SIMD R Drive` is a high-performance, thread-safe storage engine using a single-file storage container optimized for zero-copy binary access.

Payloads are written at fixed 64-byte aligned boundaries, ensuring efficient zero-copy access and predictable performance for SIMD and cache-friendly workloads.

Can be used as a command line interface (CLI) app, or as a library in another application. Continuously tested on Mac, Linux, and Windows.

[Documentation](https://docs.rs/simd-r-drive/latest/simd_r_drive/)
@@ -48,11 +50,13 @@ Additionally, `SIMD R Drive` is designed to handle datasets larger than availabl

## Fixed Payload Alignment (Zero-Copy Typed Slices)

Every non-tombstone payload now starts at a fixed, power-of-two boundary (16 bytes by default, configurable). This guarantees that, when your payload length matches the element size, you can reinterpret bytes as typed slices (e.g., `&[u16]`, `&[u32]`, `&[u64]`, `&[u128]`) without copying.
Every non-tombstone payload now begins on a fixed, power-of-two boundary (64 bytes by default). This matches the size of a typical CPU cacheline and ensures SIMD/vector loads (AVX, AVX-512, SVE, etc.) can operate at full speed without crossing cacheline boundaries.

When the payload starts on an aligned boundary and its length is a multiple of the element size, you can safely reinterpret the bytes as typed slices (e.g., `&[u16]`, `&[u32]`, `&[u64]`, `&[u128]`) without copying.

This change is transparent to the public API and works with all write modes, including streaming. The on-disk layout may include a few padding bytes per entry to maintain alignment. Tombstones are unaffected.
The on-disk layout may include up to 63 padding bytes per entry (`PAYLOAD_ALIGNMENT - 1`) to maintain alignment. Tombstones are unaffected.

Practical benefits include faster vectorized reads, simpler use of zero-copy helpers (e.g., casting libraries), and fewer fallback copies. If you need a stricter boundary for a target platform, adjust the [alignment constant](./src/storage_engine/constants.rs) and rebuild.
Practical benefits include cache-friendly zero-copy reads, predictable SIMD performance, simpler use of casting libraries, and fewer fallback copies. If a different boundary is required for your hardware, adjust the [alignment constant](./simd-r-drive-entry-handle/src/constants.rs) and rebuild.
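
The typed-slice reinterpretation described in this section can be sketched with the standard library alone. This is a minimal illustration, not the crate's API: `try_as_u32_slice` and `AlignedPayload` are hypothetical names, and the 64-byte buffer stands in for a payload that starts on an aligned boundary. Casting libraries perform equivalent checks.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper (not part of the crate's API): view `bytes` as
/// `&[u32]` without copying, or return `None` if the cast would be unsound.
pub fn try_as_u32_slice(bytes: &[u8]) -> Option<&[u32]> {
    let addr = bytes.as_ptr() as usize;
    // The start address must satisfy the alignment of `u32`.
    if addr % align_of::<u32>() != 0 {
        return None;
    }
    // The length must be a whole number of `u32` elements.
    if bytes.len() % size_of::<u32>() != 0 {
        return None;
    }
    let len = bytes.len() / size_of::<u32>();
    // SAFETY: the pointer is aligned for `u32`, the byte range covers exactly
    // `len` elements, every bit pattern is a valid `u32`, and the returned
    // slice borrows from `bytes`, so it cannot outlive the source.
    Some(unsafe { std::slice::from_raw_parts(bytes.as_ptr().cast::<u32>(), len) })
}

/// Stand-in for a payload that starts on a 64-byte boundary (assumption:
/// mirrors the default alignment described above).
#[repr(C, align(64))]
pub struct AlignedPayload(pub [u8; 64]);
```

A 64-byte aligned start satisfies the alignment of every primitive integer type, so in this sketch only the length check can fail for a payload read from its first byte; a sub-slice starting at an odd offset is rejected.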

## Single-File Storage Container for Binary Data

Expand Down Expand Up @@ -103,6 +107,8 @@ Think of it as a self-contained binary filesystem—capable of storing and retri
<img src="./assets/storage-layout.png" title="Storage Layout" />
</div>

_Note: Illustration is conceptual and does not show the 64-byte aligned boundaries used in the actual on-disk format. In practice, every payload is padded to start on a fixed 64-byte boundary for cacheline and SIMD efficiency._

Aligned entry (non-tombstone):

| Offset Range | Field | Size (Bytes) | Description |
8 changes: 4 additions & 4 deletions experiments/bindings/python-ws-client/Cargo.lock

2 changes: 1 addition & 1 deletion experiments/bindings/python_(old_client)/pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "simd-r-drive-py"
version = "0.14.0-alpha"
version = "0.15.0-alpha"
description = "SIMD-optimized append-only schema-less storage engine. Key-based binary storage in a single-file storage container."
repository = "https://github.com/jzombie/rust-simd-r-drive"
license = "Apache-2.0"
7 changes: 7 additions & 0 deletions simd-r-drive-entry-handle/src/constants.rs
@@ -9,3 +9,10 @@ pub const CHECKSUM_RANGE: Range<usize> = 16..20;

// Define checksum length explicitly since `CHECKSUM_RANGE.len()` isn't `const`
pub const CHECKSUM_LEN: usize = CHECKSUM_RANGE.end - CHECKSUM_RANGE.start;

/// Fixed alignment (power of two) for the start of every payload.
/// 64 bytes matches cache-line size and SIMD-friendly alignment.
/// This improves chances of staying zero-copy in vector kernels.
/// Max pre-pad per entry is `PAYLOAD_ALIGNMENT - 1` bytes.
pub const PAYLOAD_ALIGN_LOG2: u8 = 6; // 2^6 = 64
pub const PAYLOAD_ALIGNMENT: u64 = 1 << PAYLOAD_ALIGN_LOG2;
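
The "max pre-pad" remark can be made concrete with a little arithmetic. The sketch below is an assumption about how a start offset might be rounded up to the boundary; the engine's actual padding code is not part of this diff, and `pre_pad_len` and `aligned_start` are hypothetical names.

```rust
/// Mirrors the constants added in this diff.
pub const PAYLOAD_ALIGN_LOG2: u8 = 6; // 2^6 = 64
pub const PAYLOAD_ALIGNMENT: u64 = 1 << PAYLOAD_ALIGN_LOG2;

/// Hypothetical helper: number of padding bytes needed so that a payload
/// placed at or after `offset` starts on a `PAYLOAD_ALIGNMENT` boundary.
pub const fn pre_pad_len(offset: u64) -> u64 {
    // Valid because PAYLOAD_ALIGNMENT is a power of two.
    let mask = PAYLOAD_ALIGNMENT - 1;
    (PAYLOAD_ALIGNMENT - (offset & mask)) & mask
}

/// Hypothetical helper: first aligned offset at or after `offset`.
pub const fn aligned_start(offset: u64) -> u64 {
    offset + pre_pad_len(offset)
}
```

With the 64-byte default, an offset of 100 needs 28 bytes of padding to reach 128, an already aligned offset needs none, and the worst case (an offset one past a boundary) needs 63.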
88 changes: 88 additions & 0 deletions simd-r-drive-entry-handle/src/debug_assert_aligned.rs
@@ -0,0 +1,88 @@
/// Debug-only pointer alignment assertion that is safe to export.
///
/// Why this style:
/// - We need to re-export a symbol other crates can call, but we do not
/// want benches or release builds to pull in debug-only deps or code.
/// - Putting `#[cfg(...)]` on the function itself makes the symbol
/// vanish in release/bench. Callers would then need their own cfg
/// fences, which is brittle across crates.
/// - By keeping the function always present and gating only its body,
/// callers can invoke it unconditionally. In debug/test it asserts;
/// in release/bench it compiles to a no-op.
///
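/// Build behavior:
/// - In debug/test, the inner block runs and uses `debug_assert!`. Note
///   that `debug_assert!` is itself compiled out whenever
///   `debug_assertions` is off, so by default `cargo test --release`
///   performs no check.
/// - In release/bench, the else block keeps the args "used" so the
///   function is a true no-op (no codegen warnings, no panic paths).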
///
/// Cost:
/// - Inlining plus the cfg-ed body means zero runtime cost in release
/// and bench profiles.
///
/// Usage:
/// - Call anywhere you want a cheap alignment check in debug/test,
/// including from other crates that depend on this one.
#[inline]
pub fn debug_assert_aligned(ptr: *const u8, align: usize) {
    #[cfg(any(test, debug_assertions))]
    {
        debug_assert!(align.is_power_of_two());
        debug_assert!(
            (ptr as usize & (align - 1)) == 0,
            "buffer base is not {}-byte aligned",
            align
        );
    }

    #[cfg(not(any(test, debug_assertions)))]
    {
        // Release/bench: no-op. Keep args used to avoid warnings.
        let _ = ptr;
        let _ = align;
    }
}

/// Debug-only file-offset alignment assertion that is safe to export.
///
/// Same rationale as `debug_assert_aligned`: keep a stable symbol that
/// callers can invoke without cfg fences, while ensuring zero cost in
/// release/bench builds.
///
/// Why not a module-level cfg or `use`:
/// - Some bench setups compile with `--all-features` and may still pull
/// modules in ways that trip cfg-ed imports. Gating inside the body
/// avoids those hazards and keeps the bench linker happy.
///
/// Behavior:
/// - Debug/test: checks that `off` is a multiple of the configured
/// `PAYLOAD_ALIGNMENT`.
/// - Release/bench: no-op, arguments are marked used.
///
/// Notes:
/// - This asserts the *derived start offset* of a payload, not the
/// pointer. Use the pointer variant to assert the actual address you
/// hand to consumers like Arrow.
#[inline]
pub fn debug_assert_aligned_offset(off: u64) {
    #[cfg(any(test, debug_assertions))]
    {
        use crate::constants::PAYLOAD_ALIGNMENT;

        debug_assert!(
            PAYLOAD_ALIGNMENT.is_power_of_two(),
            "PAYLOAD_ALIGNMENT must be a power of two"
        );
        debug_assert!(
            off.is_multiple_of(PAYLOAD_ALIGNMENT),
            "derived payload start not {}-byte aligned (got {})",
            PAYLOAD_ALIGNMENT,
            off
        );
    }

    #[cfg(not(any(test, debug_assertions)))]
    {
        // Release/bench: no-op. Keep arg used to avoid warnings.
        let _ = off;
    }
}
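
The "always-present symbol, cfg-gated body" pattern used by both helpers can be shown in isolation. The sketch below is a hypothetical variant, not the code in this PR: `check_aligned` and `CacheLine` are illustrative names, and it uses `assert!` inside the gated block so the check also fires under `cargo test --release`, where `cfg(test)` is set but `debug_assert!` is compiled out by default.

```rust
/// Hypothetical variant of the pattern: the function always exists, so
/// callers never need their own `#[cfg]` fences; only the body is gated.
#[inline]
pub fn check_aligned(ptr: *const u8, align: usize) {
    #[cfg(any(test, debug_assertions))]
    {
        // `assert!` (rather than `debug_assert!`) keeps the check active in
        // any build where this block is compiled in, including release tests.
        assert!(align.is_power_of_two(), "alignment must be a power of two");
        assert!(
            (ptr as usize) & (align - 1) == 0,
            "pointer is not {}-byte aligned",
            align
        );
    }

    #[cfg(not(any(test, debug_assertions)))]
    {
        // Release/bench: no-op. Keep args used to avoid warnings.
        let _ = (ptr, align);
    }
}

/// Stand-in for a 64-byte aligned buffer, used only to exercise the helper.
#[repr(C, align(64))]
pub struct CacheLine(pub [u8; 64]);
```

Callers invoke `check_aligned(buf.as_ptr(), 64)` unconditionally. Whether to prefer `assert!` or `debug_assert!` inside the gated block is a design choice that depends on whether release-mode test runs should still enforce the invariant.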
34 changes: 26 additions & 8 deletions simd-r-drive-entry-handle/src/entry_handle.rs
@@ -387,11 +387,20 @@ impl EntryHandle {
        use std::ptr::NonNull;
        use std::sync::Arc;

        // Pointer to the start of the payload.
        let ptr = NonNull::new(self.as_slice().as_ptr() as *mut u8).expect("non-null slice ptr");
        let slice = self.as_slice();
        #[cfg(any(test, debug_assertions))]
        {
            use crate::{
                constants::PAYLOAD_ALIGNMENT, debug_assert_aligned, debug_assert_aligned_offset,
            };
            // Assert actual pointer alignment.
            debug_assert_aligned(slice.as_ptr(), PAYLOAD_ALIGNMENT as usize);
            // Assert derived file offset alignment.
            debug_assert_aligned_offset(self.range.start as u64);
        }

        // Owner keeps the mmap alive for the Buffer's lifetime.
        unsafe { Buffer::from_custom_allocation(ptr, self.size(), Arc::new(self.clone())) }
        let ptr = NonNull::new(slice.as_ptr() as *mut u8).expect("non-null slice ptr");
        unsafe { Buffer::from_custom_allocation(ptr, slice.len(), Arc::new(self.clone())) }
    }

    /// Convert this handle into an Arrow `Buffer` without copying.
@@ -418,11 +427,20 @@ impl EntryHandle {
        use std::ptr::NonNull;
        use std::sync::Arc;

        let len: usize = self.size();
        let ptr = NonNull::new(self.as_slice().as_ptr() as *mut u8).expect("non-null slice ptr");
        let slice = self.as_slice();
        #[cfg(any(test, debug_assertions))]
        {
            use crate::{
                constants::PAYLOAD_ALIGNMENT, debug_assert_aligned, debug_assert_aligned_offset,
            };
            // Assert actual pointer alignment.
            debug_assert_aligned(slice.as_ptr(), PAYLOAD_ALIGNMENT as usize);
            // Assert derived file offset alignment.
            debug_assert_aligned_offset(self.range.start as u64);
        }

        // Move self into the owner to avoid an extra Arc bump later.
        unsafe { Buffer::from_custom_allocation(ptr, len, Arc::new(self)) }
        let ptr = NonNull::new(slice.as_ptr() as *mut u8).expect("non-null slice ptr");
        unsafe { Buffer::from_custom_allocation(ptr, slice.len(), Arc::new(self)) }
    }
}
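
The difference between the borrowing and consuming conversions ("owner keeps the mmap alive" versus "move self into the owner to avoid an extra Arc bump") can be illustrated without Arrow. The sketch below is a std-only analogy under stated assumptions: `Handle` and `View` are hypothetical stand-ins for `EntryHandle` and Arrow's `Buffer`, and an `Arc<Vec<u8>>` stands in for the shared mmap.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical stand-in for an entry handle: shared backing bytes plus the
/// payload range within them.
#[derive(Clone)]
pub struct Handle {
    backing: Arc<Vec<u8>>,
    range: Range<usize>,
}

/// Hypothetical stand-in for a zero-copy buffer: a raw view plus the owner
/// that keeps the viewed memory alive.
pub struct View {
    ptr: *const u8,
    len: usize,
    _owner: Arc<Handle>,
}

impl Handle {
    pub fn new(backing: Arc<Vec<u8>>, range: Range<usize>) -> Self {
        assert!(range.start <= range.end && range.end <= backing.len());
        Self { backing, range }
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.backing[self.range.clone()]
    }

    /// Borrowing conversion: clones the handle, so the backing reference
    /// count goes up by one.
    pub fn as_view(&self) -> View {
        let slice = self.as_slice();
        View { ptr: slice.as_ptr(), len: slice.len(), _owner: Arc::new(self.clone()) }
    }

    /// Consuming conversion: moves the handle into the owner, so the backing
    /// reference count is unchanged.
    pub fn into_view(self) -> View {
        let (ptr, len) = {
            let slice = self.as_slice();
            (slice.as_ptr(), slice.len())
        };
        View { ptr, len, _owner: Arc::new(self) }
    }
}

impl View {
    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: `_owner` holds a reference to the backing allocation, which
        // therefore stays alive and unmodified for as long as `self` exists.
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}
```

In this analogy the view stays valid after the original handle is moved or dropped, because the heap allocation behind the `Arc` does not move; only the reference count differs between the two conversions.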

3 changes: 3 additions & 0 deletions simd-r-drive-entry-handle/src/lib.rs
@@ -5,3 +5,6 @@ pub use entry_handle::*;

pub mod entry_metadata;
pub use entry_metadata::*;

pub mod debug_assert_aligned;
pub use debug_assert_aligned::*;
5 changes: 0 additions & 5 deletions src/storage_engine/constants.rs
@@ -5,8 +5,3 @@ pub const NULL_BYTE: [u8; 1] = [0];

/// Stream copy chunk size.
pub const WRITE_STREAM_BUFFER_SIZE: usize = 64 * 1024; // 64 KB

/// Fixed alignment (power of two) for the start of every payload.
/// 16 bytes covers u8/u16/u32/u64/u128 on mainstream targets.
pub const PAYLOAD_ALIGN_LOG2: u8 = 4;
pub const PAYLOAD_ALIGNMENT: u64 = 1 << PAYLOAD_ALIGN_LOG2;