Commit b8acf06

Rollup merge of rust-lang#135761 - hkBst:patch-9, r=ibraheemdev
Dial down detail of B-tree description

Fixes rust-lang#134088, though it is a shame to lose some of this wonderful detail.

r? `@workingjubilee`

EDIT: newest versions keep old detail, but move it down a bit.
2 parents 10ce6f6 + 4401a69 commit b8acf06

alloc/src/collections/btree/map.rs

Lines changed: 45 additions & 31 deletions
@@ -40,30 +40,15 @@ pub(super) const MIN_LEN: usize = node::MIN_LEN_AFTER_SPLIT;
 
 /// An ordered map based on a [B-Tree].
 ///
-/// B-Trees represent a fundamental compromise between cache-efficiency and actually minimizing
-/// the amount of work performed in a search. In theory, a binary search tree (BST) is the optimal
-/// choice for a sorted map, as a perfectly balanced BST performs the theoretical minimum amount of
-/// comparisons necessary to find an element (log<sub>2</sub>n). However, in practice the way this
-/// is done is *very* inefficient for modern computer architectures. In particular, every element
-/// is stored in its own individually heap-allocated node. This means that every single insertion
-/// triggers a heap-allocation, and every single comparison should be a cache-miss. Since these
-/// are both notably expensive things to do in practice, we are forced to, at the very least,
-/// reconsider the BST strategy.
-///
-/// A B-Tree instead makes each node contain B-1 to 2B-1 elements in a contiguous array. By doing
-/// this, we reduce the number of allocations by a factor of B, and improve cache efficiency in
-/// searches. However, this does mean that searches will have to do *more* comparisons on average.
-/// The precise number of comparisons depends on the node search strategy used. For optimal cache
-/// efficiency, one could search the nodes linearly. For optimal comparisons, one could search
-/// the node using binary search. As a compromise, one could also perform a linear search
-/// that initially only checks every i<sup>th</sup> element for some choice of i.
+/// Given a key type with a [total order], an ordered map stores its entries in key order.
+/// That means that keys must be of a type that implements the [`Ord`] trait,
+/// such that two keys can always be compared to determine their [`Ordering`].
+/// Examples of keys with a total order are strings with lexicographical order,
+/// and numbers with their natural order.
 ///
-/// Currently, our implementation simply performs naive linear search. This provides excellent
-/// performance on *small* nodes of elements which are cheap to compare. However in the future we
-/// would like to further explore choosing the optimal search strategy based on the choice of B,
-/// and possibly other factors. Using linear search, searching for a random element is expected
-/// to take B * log(n) comparisons, which is generally worse than a BST. In practice,
-/// however, performance is excellent.
+/// Iterators obtained from functions such as [`BTreeMap::iter`], [`BTreeMap::into_iter`], [`BTreeMap::values`], or
+/// [`BTreeMap::keys`] produce their items in key order, and take worst-case logarithmic and
+/// amortized constant time per item returned.
 ///
 /// It is a logic error for a key to be modified in such a way that the key's ordering relative to
 /// any other key, as determined by the [`Ord`] trait, changes while it is in the map. This is
@@ -72,14 +57,6 @@ pub(super) const MIN_LEN: usize = node::MIN_LEN_AFTER_SPLIT;
 /// `BTreeMap` that observed the logic error and not result in undefined behavior. This could
 /// include panics, incorrect results, aborts, memory leaks, and non-termination.
 ///
-/// Iterators obtained from functions such as [`BTreeMap::iter`], [`BTreeMap::into_iter`], [`BTreeMap::values`], or
-/// [`BTreeMap::keys`] produce their items in order by key, and take worst-case logarithmic and
-/// amortized constant time per item returned.
-///
-/// [B-Tree]: https://en.wikipedia.org/wiki/B-tree
-/// [`Cell`]: core::cell::Cell
-/// [`RefCell`]: core::cell::RefCell
-///
 /// # Examples
 ///
 /// ```
@@ -169,6 +146,43 @@ pub(super) const MIN_LEN: usize = node::MIN_LEN_AFTER_SPLIT;
 /// // modify an entry before an insert with in-place mutation
 /// player_stats.entry("mana").and_modify(|mana| *mana += 200).or_insert(100);
 /// ```
+///
+/// # Background
+///
+/// A B-tree is (like) a [binary search tree], but adapted to the natural granularity that modern
+/// machines like to consume data at. This means that each node contains an entire array of elements,
+/// instead of just a single element.
+///
+/// B-Trees represent a fundamental compromise between cache-efficiency and actually minimizing
+/// the amount of work performed in a search. In theory, a binary search tree (BST) is the optimal
+/// choice for a sorted map, as a perfectly balanced BST performs the theoretical minimum number of
+/// comparisons necessary to find an element (log<sub>2</sub>n). However, in practice the way this
+/// is done is *very* inefficient for modern computer architectures. In particular, every element
+/// is stored in its own individually heap-allocated node. This means that every single insertion
+/// triggers a heap-allocation, and every comparison is a potential cache-miss due to the indirection.
+/// Since both heap-allocations and cache-misses are notably expensive in practice, we are forced to,
+/// at the very least, reconsider the BST strategy.
+///
+/// A B-Tree instead makes each node contain B-1 to 2B-1 elements in a contiguous array. By doing
+/// this, we reduce the number of allocations by a factor of B, and improve cache efficiency in
+/// searches. However, this does mean that searches will have to do *more* comparisons on average.
+/// The precise number of comparisons depends on the node search strategy used. For optimal cache
+/// efficiency, one could search the nodes linearly. For optimal comparisons, one could search
+/// the node using binary search. As a compromise, one could also perform a linear search
+/// that initially only checks every i<sup>th</sup> element for some choice of i.
+///
+/// Currently, our implementation simply performs naive linear search. This provides excellent
+/// performance on *small* nodes of elements which are cheap to compare. However in the future we
+/// would like to further explore choosing the optimal search strategy based on the choice of B,
+/// and possibly other factors. Using linear search, searching for a random element is expected
+/// to take B * log(n) comparisons, which is generally worse than a BST. In practice,
+/// however, performance is excellent.
+///
+/// [B-Tree]: https://en.wikipedia.org/wiki/B-tree
+/// [binary search tree]: https://en.wikipedia.org/wiki/Binary_search_tree
+/// [total order]: https://en.wikipedia.org/wiki/Total_order
+/// [`Cell`]: core::cell::Cell
+/// [`RefCell`]: core::cell::RefCell
 #[stable(feature = "rust1", since = "1.0.0")]
 #[cfg_attr(not(test), rustc_diagnostic_item = "BTreeMap")]
 #[rustc_insignificant_dtor]
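
The doc comments in this diff make two claims that a small sketch can illustrate: iteration over a `BTreeMap` yields keys in `Ord` order, and the number of comparisons inside a node depends on whether the node is searched linearly or by binary search. The code below is only an illustration under stated assumptions. A sorted slice stands in for the keys of one node, and the helpers `linear_search`, `binary_search`, and `ordered_keys` are hypothetical names invented here; none of this is the standard library's actual node search code.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;

/// Linear scan over a sorted slice (a stand-in for the keys of one node).
/// Returns `Ok(index)` on a match or `Err(insertion_index)` otherwise,
/// together with the number of key comparisons performed.
fn linear_search<K: Ord>(keys: &[K], target: &K) -> (Result<usize, usize>, usize) {
    let mut comparisons = 0;
    for (i, key) in keys.iter().enumerate() {
        comparisons += 1;
        match target.cmp(key) {
            Ordering::Greater => continue,
            Ordering::Equal => return (Ok(i), comparisons),
            Ordering::Less => return (Err(i), comparisons),
        }
    }
    (Err(keys.len()), comparisons)
}

/// Binary search over the same slice, counting comparisons the same way.
fn binary_search<K: Ord>(keys: &[K], target: &K) -> (Result<usize, usize>, usize) {
    let mut comparisons = 0;
    let (mut lo, mut hi) = (0, keys.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        comparisons += 1;
        match target.cmp(&keys[mid]) {
            Ordering::Equal => return (Ok(mid), comparisons),
            Ordering::Greater => lo = mid + 1,
            Ordering::Less => hi = mid,
        }
    }
    (Err(lo), comparisons)
}

/// Inserts keys out of order and collects them back through `keys()`,
/// which iterates in key order.
fn ordered_keys() -> Vec<&'static str> {
    let mut map = BTreeMap::new();
    map.insert("pear", 3);
    map.insert("apple", 1);
    map.insert("fig", 2);
    map.keys().copied().collect()
}

fn main() {
    // Eleven keys, chosen arbitrarily as an example node size.
    let node = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110];
    for target in [10, 25, 110] {
        let (linear, linear_cmps) = linear_search(&node, &target);
        let (binary, binary_cmps) = binary_search(&node, &target);
        println!(
            "target {target}: linear {linear:?} in {linear_cmps} comparisons, \
             binary {binary:?} in {binary_cmps} comparisons"
        );
    }
    println!("keys in order: {:?}", ordered_keys());
}
```

Counting comparisons says nothing about cache behaviour, which is the other half of the trade-off the Background section describes, so this sketch should not be read as a benchmark.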
