A thread-safe, in-memory hash store supporting concurrent fetches and writes.
This is not a traditional key-value store: it does not use keys of any kind.
Removing a specific item is not supported. Items are instead handed out through a fetching system, so the store can be thought of as a read-only dequeue database.
- Guarantees
- Trade-offs
- Use scenarios
- Benchmarking
- Installation
- Examples
  - Basics
  - Multithreaded
  - Disk
- Testing
- Contributing
- License

Guarantees:
- All items will eventually be fetched without duplication, but ordering is non-deterministic (neither FIFO nor FILO)
- Items are never removed once inserted (append-only; fetching hands out `Arc` references)
- All functions are thread-safe
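
The guarantees above can be pictured with a small standard-library model. This is only a sketch under assumptions: `ModelStore`, `Inner`, and the method signatures shown here are hypothetical and written for illustration, not taken from extractdb's implementation. The model also hands items out in FIFO order for simplicity, whereas the crate makes no ordering promise.

```rust
use std::collections::{HashSet, VecDeque};
use std::hash::Hash;
use std::sync::{Arc, Mutex};

/// Hypothetical std-only model of the guarantees; NOT extractdb's internals.
struct Inner<T> {
    /// Every item ever inserted. Nothing is removed from this set.
    seen: HashSet<Arc<T>>,
    /// Items that have not been handed out yet.
    pending: VecDeque<Arc<T>>,
}

struct ModelStore<T> {
    inner: Mutex<Inner<T>>,
}

impl<T: Eq + Hash> ModelStore<T> {
    fn new() -> Self {
        ModelStore {
            inner: Mutex::new(Inner {
                seen: HashSet::new(),
                pending: VecDeque::new(),
            }),
        }
    }

    /// Inserts an item. Returns `true` if it was new, `false` for a duplicate.
    fn push(&self, item: Arc<T>) -> bool {
        let mut inner = self.inner.lock().unwrap();
        if inner.seen.insert(Arc::clone(&item)) {
            inner.pending.push_back(item);
            true
        } else {
            false
        }
    }

    /// Hands out the next unfetched item, if any. The item stays in `seen`.
    fn fetch_next(&self) -> Option<Arc<T>> {
        self.inner.lock().unwrap().pending.pop_front()
    }

    /// Number of unique items ever inserted.
    fn total(&self) -> usize {
        self.inner.lock().unwrap().seen.len()
    }
}
```

In this model, pushing the same value twice stores it once, each stored item is handed out a single time, and the total count does not shrink after fetching.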

Trade-offs:
- No write-ahead log (WAL): data can be lost on power loss or crash
- No item removal
- Non-deterministic fetch order (it may appear deterministic, but this is not guaranteed)
- Concurrent write throughput is prioritized over read performance
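
One common way to favor concurrent writes over reads is to split the data across several independently locked shards. The sketch below illustrates that idea with the standard library only. Whether extractdb works this way is an assumption: `ShardedSet`, `shard_index`, and `concurrent_insert_demo` are hypothetical names invented for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical sharded set; an illustration, not extractdb's internals.
struct ShardedSet<T> {
    shards: Vec<Mutex<HashSet<Arc<T>>>>,
}

impl<T: Eq + Hash> ShardedSet<T> {
    fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "at least one shard is required");
        let shards = (0..shard_count)
            .map(|_| Mutex::new(HashSet::new()))
            .collect();
        ShardedSet { shards }
    }

    /// Picks a shard from the item's hash.
    fn shard_index(&self, item: &T) -> usize {
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        (hasher.finish() as usize) % self.shards.len()
    }

    /// A write locks only one shard, so writers rarely block each other.
    fn push(&self, item: Arc<T>) -> bool {
        let index = self.shard_index(&item);
        self.shards[index].lock().unwrap().insert(item)
    }

    /// A full read has to visit every shard, which is the slower path.
    fn len(&self) -> usize {
        self.shards
            .iter()
            .map(|shard| shard.lock().unwrap().len())
            .sum()
    }
}

/// Eight writer threads insert overlapping values; duplicates are dropped.
fn concurrent_insert_demo() -> usize {
    let set: Arc<ShardedSet<u32>> = Arc::new(ShardedSet::new(16));
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let set = Arc::clone(&set);
            thread::spawn(move || {
                for i in 0..1000u32 {
                    // `i % 500` forces duplicates within and across threads.
                    set.push(Arc::new(i % 500));
                }
            })
        })
        .collect();
    // Wait for every writer before counting.
    for handle in handles {
        handle.join().unwrap();
    }
    set.len()
}
```

The design choice shown here is that a write touches a single lock while a count touches all of them, which matches the stated priority of write throughput over read performance.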

Use scenarios:
- Concurrent queue with unique items only (`HashSet`+`VecDeque`-like)
- Fast concurrent insertions are needed more than fast concurrent reads
- Fast reading on a single thread with multiple concurrent writers
- Persistent in-memory hash store
This was originally built for a web-scraper which needs to write lots of links with fewer reads.

Installation:

    # Cargo.toml
    [dependencies]
    extractdb = "0.1.0"

Examples:

Basics:

    use extractdb::ExtractDb;
    use std::sync::Arc;

    fn main() {
        let database: ExtractDb<i32> = ExtractDb::default();

        database.push(Arc::new(100));

        let total_items_in_db = database.internal_count();
        let mut items_in_quick_access_memory = 0;
        if total_items_in_db > 0 {
            let item: Arc<i32> = database.fetch_next().unwrap();
            items_in_quick_access_memory = database.fetch_count();
        }

        println!("Total items: {} | Quick Access item count: {}", total_items_in_db, items_in_quick_access_memory);
    }

Multithreaded:

    use std::sync::Arc;
    use extractdb::ExtractDb;
    use std::thread;

    fn main() {
        let database: Arc<ExtractDb<String>> = Arc::new(ExtractDb::default());

        for thread_id in 0..8 {
            let local_database = Arc::clone(&database);
            thread::spawn(move || {
                local_database.push(Arc::new(format!("Hello from thread {}", thread_id)))
            });
        }

        // Will only print some of the items... since we are not waiting for thread completion.
        for _ in 0..8 {
            if let Ok(item) = database.fetch_next() {
                println!("Item: {}", item);
            }
        }
    }

Disk:

    use std::path::PathBuf;
    use std::sync::Arc;
    use extractdb::{ExtractConfig, ExtractDb};

    fn main() {
        let config = ExtractConfig::default()
            .database_directory(Some(PathBuf::from("./test_db")));
        let database: ExtractDb<String> = ExtractDb::new(config);

        // `true`: load all items back into the `fetch_next` queue
        database.load_from_disk(true).unwrap();

        database.push(Arc::new("Hello world!".to_string()));

        database.save_to_disk().unwrap();
    }

Disk, with background checkpoints:

    use std::sync::Arc;
    use std::path::PathBuf;
    use std::sync::atomic::{AtomicBool, Ordering};
    use extractdb::{CheckpointSettings, ExtractConfig, ExtractDb};

    fn main() {
        let config = ExtractConfig::default()
            .database_directory(Some(PathBuf::from("./test_db_2")));
        let database: Arc<ExtractDb<String>> = Arc::new(ExtractDb::new(config));

        // `true`: load all items back into the `fetch_next` queue
        database.load_from_disk(true).unwrap();

        let shutdown_flag = Arc::new(AtomicBool::new(false));
        let mut save_settings = CheckpointSettings::new(shutdown_flag.clone());
        save_settings.minimum_changes = 1000;

        // Spawns a background watcher thread.
        // This checks for a minimum of 1000 changes every 30 seconds (default)
        ExtractDb::background_checkpoints(save_settings, database.clone());

        // Perform single/multithreaded logic
        database.push(Arc::new("Hello world!".to_string()));

        // Gracefully shut down the background saving thread
        shutdown_flag.store(true, Ordering::Relaxed);
    }

Testing:

This project includes some basic tests to maintain functionality. Please use them.

    cargo test

See the internal doc-comments for more in-depth information about each test:
- `push`, `push_multiple`, `push_collided`, `push_structure`, `count_empty_store`, `count_loaded_store`, `fetch_data`, `fetch_data_multiple`, `fetch_data_empty`, `duplicate_fetch`
- `save_state_to_disk`, `load_state_from_disk`, `load_corrupted_state_from_disk`, `load_shard_mismatch_from_disk`, `load_mismatch_type_from_disk`
- `push_multi_thread`

Contributing:

Pull requests and issues are very welcome. Please feel free to suggest changes in PRs/Issues :)

License:

This project is licensed under either MIT or Apache-2.0, at your option.