scry-index - project details

A sorted map that learns where your keys live

Concurrent key-value store backed by learned index structures. Piecewise linear models predict key positions instead of searching a tree, giving O(1) expected lookups with lock-free reads and CAS-based writes.

01

The problem

why not BTreeMap

Rust's BTreeMap is the standard sorted container. It works well for general use, but at scale two costs dominate: O(log n) comparisons per lookup, and cache misses from pointer-chasing through tree nodes.

Learned indexes replace tree traversal with a math function. If you know the distribution of your keys, a linear model can predict the position of any key in one step. Collisions are handled by small child nodes, but for well-distributed data the tree stays shallow and most lookups resolve in a single prediction.
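As a toy illustration (hypothetical helper, not the crate's API), a fitted slope and intercept turn a lookup into one multiply-add:

```rust
// Toy sketch: predict a key's array position with a fitted linear model.
// `slope` and `intercept` would come from fitting key -> index pairs.
fn predict_slot(key: u64, slope: f64, intercept: f64, len: usize) -> usize {
    let pos = slope * key as f64 + intercept;
    (pos.max(0.0) as usize).min(len - 1) // clamp into the array bounds
}

fn main() {
    // keys 0, 10, 20, ..., 90 stored at indices 0..10: slope 0.1, intercept 0.0
    let keys: Vec<u64> = (0..10).map(|i| i * 10).collect();
    let i = predict_slot(30, 0.1, 0.0, keys.len());
    assert_eq!(keys[i], 30); // one prediction, no comparisons
}
```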

All existing concurrent learned indexes (SALI, FINEdex, XIndex, APEX) are C++ only. scry-index is the first Rust implementation.

02

How it works

LIPP chain method

Each node stores a linear model f(key) = slope * key + intercept fitted using the FMCD algorithm from the SALI paper (SIGMOD 2024). The model maps each key to a unique slot in a fixed-size array. When two keys collide on the same slot, a child node is created to hold both. No data is shifted, no rebalancing required.

Root node: f(k) = 0.03k + 0.0, 64 slots
  [key=10 val="a"] [empty] [key=30 val="c"] [child ↓] ... [key=80 val="h"]

Child node: f(k) = 0.3k - 12.0
  [key=40 val="d"] [empty] [key=45 val="e"]

Keys always land at their model-predicted slot. There is no prediction error and no correction scan. Conflicts create children, which can themselves have children, but the FMCD fitting keeps the tree shallow. When a subtree grows beyond a depth threshold, it is rebuilt inline by the thread that triggered it.
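A minimal single-threaded sketch of the slot/child mechanics (hypothetical types; the real crate is concurrent and fits models with FMCD rather than the naive two-point fit used here):

```rust
#[derive(Clone)]
enum Slot {
    Empty,
    Entry(u64, &'static str),
    Child(Box<Node>),
}

#[derive(Clone)]
struct Node {
    slope: f64,
    intercept: f64,
    slots: Vec<Slot>,
}

impl Node {
    fn with_model(slope: f64, intercept: f64, n: usize) -> Self {
        Node { slope, intercept, slots: vec![Slot::Empty; n] }
    }

    // Trivial two-point fit so both colliding keys map to distinct slots.
    fn fit_pair(k1: u64, k2: u64, n: usize) -> Self {
        let (lo, hi) = if k1 < k2 { (k1, k2) } else { (k2, k1) };
        let slope = (n - 1) as f64 / (hi - lo) as f64;
        Node::with_model(slope, -slope * lo as f64, n)
    }

    fn slot_of(&self, key: u64) -> usize {
        let pos = self.slope * key as f64 + self.intercept;
        (pos.max(0.0) as usize).min(self.slots.len() - 1)
    }

    fn insert(&mut self, key: u64, val: &'static str) {
        let i = self.slot_of(key);
        match std::mem::replace(&mut self.slots[i], Slot::Empty) {
            Slot::Empty => self.slots[i] = Slot::Entry(key, val),
            Slot::Entry(k, _) if k == key => self.slots[i] = Slot::Entry(key, val),
            Slot::Entry(k, v) => {
                // Collision: no shifting; both entries drop into a new child.
                let mut child = Node::fit_pair(k, key, 8);
                child.insert(k, v);
                child.insert(key, val);
                self.slots[i] = Slot::Child(Box::new(child));
            }
            Slot::Child(mut c) => {
                c.insert(key, val);
                self.slots[i] = Slot::Child(c);
            }
        }
    }

    fn get(&self, key: u64) -> Option<&'static str> {
        match &self.slots[self.slot_of(key)] {
            Slot::Entry(k, v) if *k == key => Some(v),
            Slot::Child(c) => c.get(key),
            _ => None,
        }
    }
}

fn main() {
    // root model spreads keys 0..160 across 16 slots
    let mut root = Node::with_model(0.1, 0.0, 16);
    root.insert(10, "a");
    root.insert(30, "c");
    root.insert(31, "d"); // collides with 30, lands in a new child node
    assert_eq!(root.get(30), Some("c"));
    assert_eq!(root.get(31), Some("d"));
}
```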

03

Concurrency model

no global lock

Reads

Lock-free. A thread pins itself to an epoch, follows the model prediction through atomic loads, and reads inline key-value data. No locks, no contention.

Writes

Per-slot CAS retry loops. An insert claims an empty slot atomically, or creates a child node on collision. Contention is limited to the individual slot being written.
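The per-slot claim can be sketched with a single compare-exchange (simplified: the slot here holds only a key, with u64::MAX as a hypothetical "empty" sentinel; the real crate stores full entries and handles reclamation):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Sentinel for an unclaimed slot (assumes keys never use this value).
const EMPTY: u64 = u64::MAX;

fn try_claim(slot: &AtomicU64, key: u64) -> bool {
    // Succeeds only for the one writer that swaps EMPTY -> key.
    slot.compare_exchange(EMPTY, key, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}

fn main() {
    let slot = Arc::new(AtomicU64::new(EMPTY));
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let s = Arc::clone(&slot);
            thread::spawn(move || try_claim(&s, 100 + i))
        })
        .collect();
    let wins = handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|&won| won)
        .count();
    assert_eq!(wins, 1); // exactly one writer claims the slot
}
```

Losers observe a non-empty slot and, in the full scheme, retry or descend into a child node instead of blocking.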

Reclamation

Epoch-based (crossbeam). Retired nodes are deferred until all readers from that epoch have unpinned. No reference counting, no garbage collector.

Rebuilds

Localized subtree rebuilds triggered by depth threshold. The parent's child pointer is frozen (tagged) during rebuild so concurrent inserts spin-wait. No global lock, no stop-the-world.
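The freeze-tag idea can be sketched with a low bit on an atomic word (addresses modeled as plain usize values here; real code would tag an aligned node pointer, whose low bit is always free):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Low tag bit marking a child pointer as frozen for rebuild.
const FROZEN: usize = 1;

// The rebuilding thread sets the tag and gets back the untagged old value.
fn freeze(p: &AtomicUsize) -> usize {
    p.fetch_or(FROZEN, Ordering::AcqRel) & !FROZEN
}

// Concurrent inserts spin until the tag clears, then see the new subtree.
fn load_unfrozen(p: &AtomicUsize) -> usize {
    loop {
        let v = p.load(Ordering::Acquire);
        if v & FROZEN == 0 {
            return v;
        }
        std::hint::spin_loop();
    }
}

// Publishing the rebuilt subtree overwrites the tag in one store.
fn publish(p: &AtomicUsize, new_ptr: usize) {
    p.store(new_ptr, Ordering::Release);
}

fn main() {
    let child = AtomicUsize::new(0x1000); // stands in for the old subtree
    let old = freeze(&child);
    assert_eq!(old, 0x1000);
    publish(&child, 0x2000); // swap in the rebuilt subtree, clearing the tag
    assert_eq!(load_unfrozen(&child), 0x2000);
}
```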

04

Benchmarks

criterion, 100K-500K keys
Operation                    scry-index   BTreeMap   speedup
Lookup (100K, sequential)    1.0 ms       6.2 ms     6.2x
Lookup (100K, random)        1.2 ms       11.6 ms    9.8x
Lookup (500K, sequential)    5.0 ms       32.4 ms    6.5x
Sequential insert (100K)     12.4 ms      12.4 ms    parity
Append-only insert (100K)    8.0 ms       8.9 ms     1.1x
Bulk load (100K)             3.7 ms       2.0 ms     0.5x

The advantage of O(1) model prediction over an O(log n) tree walk grows with scale. Random access patterns widen the gap further because BTreeMap suffers cache misses at depth. Bulk load is slower due to FMCD model-fitting overhead.

05

Design evolution

five phases

06

API

usage example
use scry_index::LearnedMap;

let map = LearnedMap::new();
let m = map.pin();

m.insert(42u64, "the answer");
m.insert(1, "first");
m.insert(100, "last");

assert_eq!(m.get(&42), Some(&"the answer"));

// sorted iteration, range queries
for (k, v) in m.range(1u64..50) {
    println!("{k}: {v}");
}

// atomic check-and-insert
let val = m.get_or_insert(42, "default");
assert_eq!(val, &"the answer"); // existing value, not "default"

For pre-sorted data, bulk_load builds the tree in one pass with optimal model fitting. bulk_load_dedup handles duplicates (keeps the last value per key, matching BTreeMap::from_iter semantics).

// bulk load from sorted data
let data: Vec<(u64, &str)> = vec![(1, "a"), (2, "b"), (3, "c")];
let map = LearnedMap::bulk_load(&data).unwrap();

// supports String, Vec<u8>, [u8; N], and all integer types
let string_map: LearnedMap<String, u64> = LearnedMap::new();

07

Constraints

non-negotiable rules

No unsafe in the public API

#![deny(unsafe_code)] at the crate level. Internal concurrency primitives use targeted unsafe blocks with documented safety invariants. The public surface is fully safe Rust.

No panics in library code

All fallible operations return Result or Option. unwrap() only appears in tests. Panics are treated as bugs.

Clippy pedantic + nursery

All warnings are errors in CI. Issues are fixed at the source, not suppressed, unless there is a documented exception.

Every bug fix ships a test

Regression tests are required for every fix. Property-based tests cover core algorithms. Target: >90% line coverage on core modules.

08

Tradeoffs

when to use something else

Random-key inserts are slower (~4-6x). When keys arrive in random order, the model cannot predict well and creates deep conflict chains. Use bulk_load when you have sorted data available.

Bulk load is slower than BTreeMap. The FMCD model fitting step adds overhead. The payoff comes at read time.

Memory overhead is higher. Each node allocates arrays with expansion factor headroom (default 2x) to leave space for future inserts. The allocated_bytes() API lets you monitor this at runtime.

Best fit: read-heavy concurrent workloads with sorted or semi-sorted keys. Time-series ingestion, lookup tables, analytics indexes. For write-heavy random workloads with no concurrency requirement, BTreeMap is the simpler choice.