A unified Rust library for creating, reading, and managing Engram archives - compressed, cryptographically signed archive files with embedded metadata and SQLite database support.
- 📦 Compressed Archives: LZ4 (fast) and Zstd (high compression ratio) with automatic format selection
- 🔐 Cryptographic Signatures: Ed25519 signatures for authenticity and integrity verification
- 📋 Manifest System: JSON-based metadata with file registry, author info, and capabilities
- 💾 Virtual File System (VFS): Direct SQL queries on embedded SQLite databases without extraction
- 🎡 Inline Spool Access: Read DataSpool cards directly from the archive via `open_spool()` — no temp extraction
- ⚡ Fast Lookups: O(1) file access via central directory with 320-byte fixed entries
- ✅ Integrity Verification: CRC32 checksums for all files
- 🔒 Encryption Support: AES-256-GCM encryption (per-file or full-archive)
- 🎯 Frame-based Compression: Efficient handling of large files (≥50MB) with incremental decompression
- 🛡️ Battle-Tested: 166 tests covering security, performance, concurrency, and reliability
Add this to your `Cargo.toml`:

```toml
[dependencies]
engram-rs = "1.0"
```

Create an archive:

```rust
use engram_rs::{ArchiveWriter, CompressionMethod};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a new archive
    let mut writer = ArchiveWriter::create("my_archive.eng")?;

    // Add files with automatic compression
    writer.add_file("readme.txt", b"Hello, Engram!")?;
    writer.add_file("data.json", br#"{"version": "1.0"}"#)?;

    // Add a file from disk
    writer.add_file_from_disk("config.toml", std::path::Path::new("./config.toml"))?;

    // Finalize the archive (writes the central directory)
    writer.finalize()?;
    Ok(())
}
```

Read an archive:

```rust
use engram_rs::ArchiveReader;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open an existing archive (convenience method)
    let mut reader = ArchiveReader::open_and_init("my_archive.eng")?;

    // List all files
    for filename in reader.list_files() {
        println!("📄 {}", filename);
    }

    // Read a specific file
    let data = reader.read_file("readme.txt")?;
    println!("Content: {}", String::from_utf8_lossy(&data));
    Ok(())
}
```

Create and verify a signed archive:

```rust
use engram_rs::{ArchiveWriter, ArchiveReader, Manifest, Author};
use ed25519_dalek::SigningKey;
use rand::rngs::OsRng;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Generate a keypair for signing
    let signing_key = SigningKey::generate(&mut OsRng);

    // Create the manifest
    let mut manifest = Manifest::new(
        "my-archive".to_string(),
        "My Archive".to_string(),
        Author::new("John Doe"),
        "1.0.0".to_string(),
    );

    // Sign the manifest
    manifest.sign(&signing_key, Some("John Doe".to_string()))?;

    // Create an archive with the signed manifest
    let mut writer = ArchiveWriter::create("signed_archive.eng")?;
    writer.add_file("data.txt", b"Important data")?;
    writer.add_manifest(&serde_json::to_value(&manifest)?)?;
    writer.finalize()?;

    // Later: verify the signature
    let mut reader = ArchiveReader::open_and_init("signed_archive.eng")?;
    if let Some(manifest_value) = reader.read_manifest()? {
        let loaded_manifest: Manifest =
            Manifest::from_json(&serde_json::to_vec(&manifest_value)?)?;
        let results = loaded_manifest.verify_signatures()?;
        println!("Signature valid: {}", results[0]);
    }
    Ok(())
}
```

Query an embedded SQLite database through the VFS:

```rust
use engram_rs::VfsReader;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open the archive with VFS support
    let mut vfs = VfsReader::open("archive_with_db.eng")?;

    // Open an embedded SQLite database
    let conn = vfs.open_database("data.db")?;

    // Execute SQL queries
    let mut stmt = conn.prepare("SELECT name, email FROM users WHERE active = 1")?;
    let users = stmt.query_map([], |row| {
        Ok((row.get::<_, String>(0)?, row.get::<_, String>(1)?))
    })?;
    for user in users {
        let (name, email) = user?;
        println!("{} <{}>", name, email);
    }
    Ok(())
}
```

Engrams can stitch DataSpool (.spool) files inline during compilation. When stored with `CompressionMethod::None`, `open_spool()` provides direct random access to individual cards without extracting the spool to a temp file:
```rust
use engram_rs::ArchiveReader;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let archive = ArchiveReader::open_and_init("lexicon.eng")?;

    // Open an inline spool — reads the SP01 header + index directly from the .eng file.
    let mut spool = archive.open_spool("d.spool")?;

    // Read a card by index — a single seek + read, no temp files.
    let card_bytes = spool.read_card(42)?;
    println!("Card 42: {} bytes", card_bytes.len());
    Ok(())
}
```

This eliminates the temp-extraction bottleneck for spool-heavy archives. In benchmarks on a 503 MB English lexicon Engram, warm query time dropped from ~792 ms (with temp extraction) to ~164 ms (inline access) — a 4.8x speedup.
Engram uses a custom binary format (v1.0) with the following structure:
```
┌────────────────────────────────────────────────┐
│ File Header (64 bytes)                         │
│  - Magic: 0x89 'E' 'N' 'G' 0x0D 0x0A 0x1A 0x0A │
│  - Format version (major.minor)                │
│  - Central directory offset/size               │
│  - Entry count, content version                │
│  - CRC32 checksum                              │
├────────────────────────────────────────────────┤
│ Local Entry Header 1 (LOCA)                    │
│ Compressed File Data 1                         │
├────────────────────────────────────────────────┤
│ Local Entry Header 2 (LOCA)                    │
│ Compressed File Data 2                         │
├────────────────────────────────────────────────┤
│ ...                                            │
├────────────────────────────────────────────────┤
│ Central Directory                              │
│  - Entry 1 (320 bytes fixed)                   │
│  - Entry 2 (320 bytes fixed)                   │
│  - ...                                         │
├────────────────────────────────────────────────┤
│ End of Central Directory (ENDR)                │
├────────────────────────────────────────────────┤
│ manifest.json (optional)                       │
│  - Metadata, author, signatures                │
└────────────────────────────────────────────────┘
```
Key Features:
- Magic Number: PNG-style magic bytes for file type detection
- Fixed-Width Entries: 320-byte central directory entries enable O(1) file lookup
- Local Headers: Enable sequential streaming reads without central directory
- End-Placed Directory: Enables streaming creation without manifest foreknowledge
- Manifest: JSON metadata with Ed25519 signature support
See ENGRAM_SPECIFICATION.md for the complete binary format specification.
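The magic-number check and O(1) lookup can be illustrated with a short sketch. This is not the crate's internals: the helper names are hypothetical, and only the magic bytes and the 320-byte entry width come from the format description above (consult the specification for the normative layout).

```rust
use std::collections::HashMap;

/// PNG-style magic bytes from the header diagram: 0x89 'E' 'N' 'G' 0x0D 0x0A 0x1A 0x0A
const MAGIC: [u8; 8] = [0x89, b'E', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];
const ENTRY_SIZE: usize = 320; // fixed-width central directory entries

/// File type detection: compare the first 8 bytes against the magic number.
fn is_engram(header: &[u8]) -> bool {
    header.len() >= 8 && header[..8] == MAGIC
}

/// Because every central directory entry is exactly 320 bytes, entry i sits at
/// dir_offset + i * 320; indexing the names in a HashMap then gives O(1) lookup.
fn index_entries(names: &[&str], dir_offset: u64) -> HashMap<String, u64> {
    names
        .iter()
        .enumerate()
        .map(|(i, n)| (n.to_string(), dir_offset + (i * ENTRY_SIZE) as u64))
        .collect()
}

fn main() {
    assert!(is_engram(&[0x89, b'E', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A]));
    let index = index_entries(&["readme.txt", "data.db"], 4096);
    // Second entry starts one fixed-width slot past the directory offset.
    assert_eq!(index["data.db"], 4096 + 320);
}
```

The fixed entry width is what makes the lookup constant-time: no variable-length parsing is needed to reach entry i.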
The library automatically selects compression based on file type and size:
| File Type | Size | Compression | Typical Ratio |
|---|---|---|---|
| Text files (.txt, .json, .md, etc.) | ≥ 4KB | Zstd (best ratio) | 50-100x |
| Binary files (.db, .wasm, etc.) | ≥ 4KB | LZ4 (fastest) | 2-5x |
| Already compressed (.png, .jpg, .zip, etc.) | Any | None | 1x |
| Small files | < 4KB | None | N/A |
| Large files | ≥ 50MB | Frame-based | Varies |
Compression Performance:
- Highly compressible data (zeros, patterns): 200-750x
- Text files (JSON, Markdown, code): 50-100x
- Mixed data: 50-100x
- Large files (≥50MB): Automatic 64KB frame compression
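The selection rules in the table above can be sketched as a simple decision function. This is an illustrative approximation, not the crate's actual heuristic: the `Method` enum, `choose_compression` name, and the extension lists are assumptions for the sketch.

```rust
#[derive(Debug, PartialEq)]
enum Method {
    None,
    Lz4,
    Zstd,
    Framed,
}

/// Illustrative sketch of the auto-selection rules from the table above
/// (hypothetical helper; extension lists abbreviated).
fn choose_compression(name: &str, size: u64) -> Method {
    const TEXT: &[&str] = &["txt", "json", "md"];
    const PRECOMPRESSED: &[&str] = &["png", "jpg", "zip"];
    let ext = name.rsplit('.').next().unwrap_or("").to_ascii_lowercase();

    if PRECOMPRESSED.contains(&ext.as_str()) {
        Method::None // already compressed: storing as-is avoids wasted CPU
    } else if size >= 50 * 1024 * 1024 {
        Method::Framed // large files: 64KB frames, incremental decompression
    } else if size < 4 * 1024 {
        Method::None // too small for compression overhead to pay off
    } else if TEXT.contains(&ext.as_str()) {
        Method::Zstd // text: best ratio
    } else {
        Method::Lz4 // binary: fastest
    }
}

fn main() {
    assert_eq!(choose_compression("notes.md", 8 * 1024), Method::Zstd);
    assert_eq!(choose_compression("photo.png", 1_000_000), Method::None);
    assert_eq!(choose_compression("blob.bin", 8 * 1024), Method::Lz4);
}
```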
You can also manually specify compression:
```rust
writer.add_file_with_compression("data.bin", data, CompressionMethod::Zstd)?;
```

Signing and verifying manifests:

```rust
use ed25519_dalek::SigningKey;
use rand::rngs::OsRng;

// Generate a keypair
let signing_key = SigningKey::generate(&mut OsRng);

// Sign the manifest
manifest.sign(&signing_key, Some("Author Name".to_string()))?;

// Verify signatures
let results = manifest.verify_signatures()?;
println!("All signatures valid: {}", results.iter().all(|&v| v));
```

Security:
- Constant-time signature verification (no timing attack vulnerabilities)
- Multiple signatures supported (multi-party signing)
- Modifying signed data invalidates the signature (tampering is detected)
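"Constant-time" means verification never short-circuits at the first mismatching byte, so timing reveals nothing about where two values diverge. A minimal illustration of the idea (ed25519-dalek handles this internally; this sketch and the `ct_eq` name are not the crate's code):

```rust
/// Constant-time byte comparison: always examines every byte pair,
/// accumulating differences with OR instead of returning early.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // XOR each pair; OR the results. The accumulator is 0 only if all
    // bytes match, and the loop's duration is independent of where (or
    // whether) the inputs differ.
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

fn main() {
    assert!(ct_eq(b"archive-digest", b"archive-digest"));
    assert!(!ct_eq(b"archive-digest", b"archive-digesT"));
}
```

An early-exit `==` on secret data leaks the position of the first difference through timing; the OR-fold removes that signal.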
```rust
// Encrypt individual files (per-file encryption)
writer.add_encrypted_file("secret.txt", password, data)?;

// Decrypt when reading
let data = reader.read_encrypted_file("secret.txt", password)?;
```

Encryption Modes:
- Archive-level: the entire archive is encrypted (backup/secure storage)
- Per-file: individual files are encrypted (allows selective decryption; SQL queries still work on unencrypted databases)
Benchmarks on a test file (10MB, Intel i7-12700K, NVMe SSD):
| Compression | Write Speed | Read Speed | Ratio |
|---|---|---|---|
| None | 450 MB/s | 500 MB/s | 1.0x |
| LZ4 | 380 MB/s | 420 MB/s | 2.1x |
| Zstd | 95 MB/s | 180 MB/s | 3.8x |
Scalability (tested):
- Archive size: Up to 1GB (500MB routinely tested)
- File count: Up to 10,000 files (1,000 files in <50ms)
- File access: O(1) HashMap lookup (sub-millisecond)
- Path length: Up to 255 bytes (engram format limit)
- Directory depth: Up to 20 levels tested
VFS Performance:
- SQLite queries: 80-90% of native filesystem performance
- Cold cache: 60-70% of native (decompression overhead)
- Warm cache: 85-95% of native (cache hits)
engram-rs has undergone comprehensive testing across 4 major phases:
- Total Tests: 166 (all passing)
- 23 unit tests
- 46 Phase 1 tests (security & integrity)
- 33 Phase 2 tests (concurrency & reliability)
- 16 Phase 3 tests + 4 stress tests (performance & scale)
- 26 Phase 4 tests (security audit)
- 10 integration tests
- 7 v1 feature tests
- 5 debug tests
Coverage:
- ✅ Corruption detection (15 tests): Magic number, version, header, central directory, truncation
- ✅ Fuzzing infrastructure: cargo-fuzz ready with seed corpus
- ✅ Signature security (13 tests): Tampering, replay attacks, algorithm downgrade, multi-sig
- ✅ Encryption security (18 tests): Archive-level, per-file, wrong keys, compression+encryption
Findings:
- All corruption scenarios properly detected and rejected
- Signature verification cryptographically sound
- AES-256-GCM implementation secure
- No undefined behavior on malformed inputs
Coverage:
- ✅ Concurrent VFS/SQLite access (5 tests): 10 threads × 1,000 queries
- ✅ Multi-reader stress tests (6 tests): 100 concurrent readers, 64K operations
- ✅ Crash recovery (13 tests): Incomplete archives, truncation at 10-90%, corruption
- ✅ Frame compression edge cases (9 tests): 50MB threshold, 200MB files, data integrity
Findings:
- Thread-safe VFS with no resource leaks
- True parallelism via separate file handles
- All incomplete archives properly rejected
- Frame compression works correctly for large files (≥50MB)
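The frame-based scheme can be sketched roughly as follows. The 64KB frame size comes from the text above; the stub compressor and function names are placeholders standing in for the real zstd/lz4 calls, not the crate's implementation.

```rust
const FRAME_SIZE: usize = 64 * 1024; // 64KB frames bound peak memory use

/// Stub standing in for a real zstd/lz4 frame compressor — illustrative only.
fn compress_frame(frame: &[u8]) -> Vec<u8> {
    frame.to_vec()
}

/// Split input into independent 64KB frames. Because each frame is
/// compressed on its own, a reader can decompress incrementally: serving
/// a byte range touches only the frames that cover it, never the whole file.
fn compress_framed(data: &[u8]) -> Vec<Vec<u8>> {
    data.chunks(FRAME_SIZE).map(compress_frame).collect()
}

fn main() {
    // 150 KB input → three frames: 64 KB + 64 KB + 22 KB
    let frames = compress_framed(&vec![0u8; 150 * 1024]);
    assert_eq!(frames.len(), 3);
    assert_eq!(frames[2].len(), 150 * 1024 - 2 * FRAME_SIZE);
}
```

Reading bytes 70,000..80,000 of that input would decompress only frame 1 (bytes 65,536..131,072), which is why large-file access stays incremental.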
Operations Tested:
- 10,000+ concurrent VFS database queries
- 64,000+ multi-reader operations
- 500MB+ data processed
Coverage:
- ✅ Large archives (8 tests): 500MB-1GB archives, 10K files, path edge cases
- ✅ Compression validation (8 tests): Text, binary, pre-compressed, effectiveness
Findings:
- Scales to 1GB+ archives with no issues
- 10,000+ files handled efficiently (O(1) lookup)
- Compression ratios: 50-227x typical, 227x for zeros, 59x for text
- Performance: ~120 MB/s write, ~200 MB/s read
Stress Tests (run with --ignored):
- 500MB archive: 4.3 seconds (500MB → 1MB, 500x compression)
- 1GB archive: ~10 seconds
- 10,000 files: ~1 second
Coverage:
- ✅ Path traversal prevention (10 tests): ../, absolute paths, null bytes, normalization
- ✅ ZIP bomb protection (8 tests): Compression ratios, decompression safety
- ✅ Cryptographic attacks (8 tests): Timing attacks, weak keys, side-channels
Findings:
Path Security:
- ⚠️ Path traversal attempts (../, absolute paths) accepted but normalized
- ⚠️ Applications must sanitize paths during extraction
- ✅ 255-byte path limit enforced (rejected at finalize())
- ✅ Case-sensitive storage (File.txt ≠ file.txt)
Compression Security:
- ✅ Excellent compression ratios (200-750x)
- ✅ No recursive compression (prevents nested bombs)
- ✅ Frame compression limits memory (64KB frames)
- ⚠️ Relies on zstd/lz4 library safety checks (no explicit bomb detection)
Cryptographic Security:
- ✅ Ed25519 signatures with constant-time verification
- ✅ No timing attack vulnerabilities detected
- ✅ Weak keys avoided (OsRng used)
- ✅ Signature invalidation on modification detected
- ✅ Multiple signatures supported
Verdict: No critical security vulnerabilities found. engram-rs is production-ready with proper application-level path sanitization.
Comprehensive testing documentation:
- TESTING_PLAN.md - Overall testing strategy and status
- TESTING_PHASE_1.1_FINDINGS.md - Corruption detection
- TESTING_PHASE_1.2_FUZZING.md - Fuzzing infrastructure
- TESTING_PHASE_1.3_SIGNATURES.md - Signature security
- TESTING_PHASE_1.4_ENCRYPTION.md - Encryption security
- TESTING_PHASE_2_CONCURRENCY.md - Concurrency tests
- TESTING_PHASE_3_PERFORMANCE.md - Performance tests
- TESTING_PHASE_4_SECURITY.md - Security audit
- `ArchiveWriter` - Create and write to archives
- `ArchiveReader` - Read from existing archives
- `VfsReader` - Query SQLite databases in archives
- `Manifest` - Archive metadata and signatures
- `CompressionMethod` - Compression algorithm selection
- `EngramError` - Error types
| Operation | Method |
|---|---|
| Create archive | `ArchiveWriter::create(path)` |
| Open archive | `ArchiveReader::open_and_init(path)` |
| Open encrypted | `ArchiveReader::open_encrypted(path, key)` |
| Add file | `writer.add_file(name, data)` |
| Add from disk | `writer.add_file_from_disk(name, path)` |
| Read file | `reader.read_file(name)` |
| List files | `reader.list_files()` |
| Add manifest | `writer.add_manifest(manifest)` |
| Sign manifest | `manifest.sign(key, signer)` |
| Verify signatures | `manifest.verify_signatures()` |
| Query database | `vfs.open_database(name)` |
| Open inline spool | `reader.open_spool(name)` |
See the examples/ directory for complete examples:
- `basic.rs` - Creating and reading archives
- `manifest.rs` - Working with manifests and signatures
- `compression.rs` - Compression options
- `vfs.rs` - Querying embedded databases
Run examples with:
```bash
cargo run --example basic
cargo run --example manifest
cargo run --example vfs
```

Run the test suite with:

```bash
# Run all tests (fast)
cargo test

# Run with output
cargo test -- --nocapture

# Run a specific test file
cargo test --test corruption_test

# Run stress tests (large archives, many files)
cargo test --test stress_large_archives_test -- --ignored --nocapture
```

Test Execution Time:
- Regular tests (162 tests): <2 seconds
- Stress tests (4 tests): 5-15 seconds (run with `--ignored`)
- Rust: 1.75+ (2021 edition)
- Platforms: Windows, macOS, Linux, BSD
- Architectures: x86_64, aarch64 (ARM64)
This library replaces the previous two-crate structure:
```rust
// Old
use engram_core::{ArchiveReader, ArchiveWriter};
use engram_vfs::VfsReader;

// New (engram-rs)
use engram_rs::{ArchiveReader, ArchiveWriter, VfsReader};
```

All functionality is now unified in a single crate with improved APIs:

- `open_and_init()` convenience method (was: `open()` then `initialize()`)
- `open_encrypted()` convenience method for encrypted archives
- Simplified manifest signing workflow
engram-rs does not reject path traversal attempts during archive creation. Applications must sanitize paths during extraction:
```rust
use std::path::{Path, PathBuf};

fn safe_extract_path(archive_path: &str, dest_root: &Path) -> Result<PathBuf, &'static str> {
    // Normalize Windows separators
    let normalized = archive_path.replace('\\', "/");

    // Reject absolute paths (Unix-style and drive letters)
    if normalized.starts_with('/') || normalized.contains(':') {
        return Err("Absolute paths not allowed");
    }

    // Reject parent directory references
    if normalized.contains("..") {
        return Err("Parent directory references not allowed");
    }

    // Build the final path and verify it stays within dest_root
    let final_path = dest_root.join(&normalized);
    if !final_path.starts_with(dest_root) {
        return Err("Path escapes destination directory");
    }
    Ok(final_path)
}
```

Always verify signatures before trusting archive contents:
```rust
let manifest: Manifest = Manifest::from_json(&manifest_data)?;
let results = manifest.verify_signatures()?;
if !results.iter().all(|&valid| valid) {
    return Err("Invalid signature detected".into());
}
```

For untrusted archives, set resource limits:
```bash
# Unix/Linux: set a virtual memory limit before processing untrusted archives
ulimit -v 1048576  # 1GB
```

```rust
// Monitor decompressed size during extraction (illustrative)
if decompressed_size > max_allowed_size {
    return Err("Decompression size exceeds limit".into());
}
```

Contributions are welcome! See CONTRIBUTING.md for guidelines.
Licensed under the MIT License.
- dataspool-rs - Card-indexed sequential storage (SP01 format), used for inline spool access
- engram-cli - Command-line tool for managing Engram archives
- engram-specification - Complete format specification
- engram-nodejs - Node.js bindings (native module)
- Crates.io: https://crates.io/crates/engram-rs
- Documentation: https://docs.rs/engram-rs
- Repository: https://github.com/blackfall-labs/engram-rs
- Issues: https://github.com/blackfall-labs/engram-rs/issues
- Format Specification: ENGRAM_SPECIFICATION.md