implementation | search* | insert* | delete* | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ordered | ops on keys |
---|---|---|---|---|---|---|---|---|
seq search | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | | equals() |
binary search | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | X | compareTo() |
BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | \(\sqrt{N}\) | X | compareTo() |
goal | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
\(^*\)guarantee, \(^\dagger\)average.
Challenge: Guarantee performance
This lecture: 2-3 trees, left-leaning red-black BSTs, B-trees
Allow 1 or 2 keys per node
Symmetric order: Inorder traversal yields keys in ascending order
Perfect balance: Every path from root to null link has same length (how to maintain?)
Search
Insertion into a 2-node at bottom
Insertion into a 3-node at bottom
Invariants: Maintains symmetric order and perfect balance
Pf: Each transformation maintains symmetric order and perfect balance
Splitting a 4-node is a local transformation: constant number of operations
What is the range of heights of a 2-3 tree with \(N\) keys?
choice | best case | worst case |
---|---|---|
A. | \(\sim \log_4 N\) | \(\sim \log_3 N\) |
B. | \(\sim \log_3 N\) | \(\sim \log_2 N\) |
C. | \(\sim \log_3 N\) | \(\sim 2 \log_2 N\) |
D. | \(\sim \log_3 N\) | \(\sim N\) |
Perfect balance: Every path from root to null link has same length
Tree height: between \(\sim \log_3 N\) (best case, all 3-nodes) and \(\sim \log_2 N\) (worst case, all 2-nodes)
Bottom line: Guaranteed logarithmic performance for search and insert
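The height range of a perfectly balanced 2-3 tree can be checked with a little integer arithmetic. The following is a minimal sketch, not part of the lecture code: the class name `TwoThreeHeight` and its methods are hypothetical, and height is counted in links (a lone root has height 0).

```java
// Sketch under stated assumptions: height is counted in links, n >= 1.
class TwoThreeHeight {
    // Smallest possible height: every node is a 3-node,
    // so a tree of height h holds 3^(h+1) - 1 keys.
    static int minHeight(long n) {
        int h = 0;
        long capacity = 2;              // keys held by a single 3-node
        while (capacity < n) {
            capacity = capacity * 3 + 2; // 3^(h+2) - 1
            h++;
        }
        return h;
    }

    // Largest possible height: every node is a 2-node,
    // so a tree of height h needs at least 2^(h+1) - 1 keys.
    static int maxHeight(long n) {
        int h = 0;
        long minKeys = 1;               // keys in a single 2-node
        while (2 * minKeys + 1 <= n) {
            minKeys = 2 * minKeys + 1;  // 2^(h+2) - 1
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println(minHeight(n) + " .. " + maxHeight(n));
    }
}
```

Both loops run a logarithmic number of times, which mirrors the logarithmic bounds on the height itself.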
implementation | search* | insert* | delete* | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ordered | ops on keys |
---|---|---|---|---|---|---|---|---|
seq search | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | | equals() |
binary search | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | X | compareTo() |
BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | \(\sqrt{N}\) | X | compareTo() |
2-3 tree\(^\ddagger\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
\(^*\)guarantee, \(^\dagger\)average
\(^\ddagger\)but hidden constant \(c\) is large (depends upon implementation)
Direct implementation is complicated, because

    // fantasy code
    public void put(Key key, Value val) {
        Node x = root;
        while (x.getTheCorrectChild(key) != null) {
            x = x.getTheCorrectChild(key);
            if (x.is4Node()) x.split();
        }
        if      (x.is2Node()) x.make3Node(key, val);
        else if (x.is3Node()) x.make4Node(key, val);
    }

Bottom line: Could do it, but there's a better way
Challenge: How to represent a 3-node as a binary tree?
Approach 1: Regular BST
Approach 2: Regular BST with red "glue" nodes
Approach 3: Regular BST with red "glue" links
A 2-3 tree and corresponding red-black BST
Key property: 1-1 correspondence between 2-3 and LLRB
An LLRB tree is a BST such that
Every path from root to null link has the same number of black links ("perfect black balance")
Observation: Search is the same as for elementary BST (ignore color), but runs faster because of better balance

    public Value get(Key key) {
        Node x = root;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if      (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else              return x.val;
        }
        return null;
    }

Remark: Most other ops (e.g., floor, iteration, selection) are also identical
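To illustrate the remark, here is a minimal sketch of `floor` on a tree whose nodes carry a color bit that the operation never reads. The class `FloorSketch`, the `int` keys, and the hand-built `sample()` tree are assumptions made for this example only.

```java
// Sketch: floor() walks the tree exactly as it would in an elementary BST.
class FloorSketch {
    static class Node {
        int key;
        Node left, right;
        boolean color; // present, but ignored by floor()
        Node(int key) { this.key = key; }
    }

    // Returns the largest key <= target, or null if there is none.
    static Integer floor(Node x, int target) {
        Integer best = null;
        while (x != null) {
            if (target == x.key) return x.key;
            if (target < x.key) {
                x = x.left;            // floor must be in the left subtree
            } else {
                best = x.key;          // candidate; a larger one may be to the right
                x = x.right;
            }
        }
        return best;
    }

    // Hand-built tree in symmetric order: 20 30 40 50 70.
    static Node sample() {
        Node root = new Node(50);
        root.left = new Node(30);
        root.left.color = true;        // a red link, irrelevant to floor()
        root.left.left = new Node(20);
        root.left.right = new Node(40);
        root.right = new Node(70);
        return root;
    }
}
```

Because only key comparisons and child links are used, the same method works unchanged whether or not the tree is colored.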
Each node is pointed to by precisely one link (from its parent); can encode color of links in nodes

    private static final boolean RED   = true;
    private static final boolean BLACK = false;

    private class Node {
        Key key;
        Value val;
        Node left, right;
        boolean color; // color of parent link
    }

    private boolean isRed(Node x) {
        if (x == null) return false; // null links are black
        return x.color == RED;
    }

[Figure annotations: root.left.color == RED; root.right.color == BLACK]
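With the color stored in each node, perfect black balance can be verified by a single recursive pass. This is a minimal sketch; the class `BlackBalanceCheck`, the `int` keys, and the `-1` sentinel are choices made for this example rather than part of the lecture code.

```java
// Sketch: check that every root-to-null path has the same number of black links.
class BlackBalanceCheck {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        int key;
        Node left, right;
        boolean color; // color of parent link
        Node(int key, boolean color) { this.key = key; this.color = color; }
    }

    static boolean isRed(Node x) {
        return x != null && x.color == RED; // null links are black
    }

    // Number of black links on every path from x's parent link down to a
    // null link (null links themselves not counted), or -1 if paths disagree.
    static int blackHeight(Node x) {
        if (x == null) return 0;
        int l = blackHeight(x.left);
        int r = blackHeight(x.right);
        if (l < 0 || r < 0 || l != r) return -1;
        return l + (isRed(x) ? 0 : 1);
    }

    static boolean isBalanced(Node root) {
        return blackHeight(root) >= 0;
    }
}
```

A check like this is mainly useful as a debugging aid while developing the insertion code.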
Basic strategy: Maintain 1-1 correspondence with 2-3 trees
During internal operations, maintain symmetric order and perfect black balance
How? Apply elementary red-black BST operations: rotation and color flip
Left rotation: Orient a (temporarily) right-leaning red link to lean left

    private Node rotateLeft(Node h) {
        assert isRed(h.right);
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

Invariants: Maintains symmetric order and perfect black balance
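The invariant can be observed on a small hand-built tree. In this sketch the class `RotateLeftDemo`, the `char` keys, and the `sample()` tree are assumptions for illustration; `rotateLeft` follows the lecture code without the assertion.

```java
// Sketch: a left rotation changes the shape but not the inorder key sequence.
class RotateLeftDemo {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        char key;
        Node left, right;
        boolean color; // color of parent link
        Node(char key, boolean color) { this.key = key; this.color = color; }
    }

    static boolean isRed(Node x) {
        return x != null && x.color == RED;
    }

    static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    // Keys in symmetric (inorder) order, concatenated.
    static String inorder(Node x) {
        return x == null ? "" : inorder(x.left) + x.key + inorder(x.right);
    }

    // E with a right-leaning red link to S.
    static Node sample() {
        Node e = new Node('E', BLACK);
        e.left = new Node('A', BLACK);
        e.right = new Node('S', RED);
        e.right.left = new Node('M', BLACK);
        e.right.right = new Node('Z', BLACK);
        return e;
    }

    public static void main(String[] args) {
        Node h = sample();
        System.out.println(inorder(h));
        System.out.println(inorder(rotateLeft(h)));
    }
}
```

After the rotation S is the subtree root, E hangs off it by a left-leaning red link, and M has moved from S's left to E's right, which is what keeps the keys in order.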
Right rotation: Orient a left-leaning red link to (temporarily) lean right

    private Node rotateRight(Node h) {
        assert isRed(h.left);
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

Invariants: Maintains symmetric order and perfect black balance
Color flip: Recolor to split a (temporary) 4-node

    private void flipColors(Node h) {
        assert !isRed(h);
        assert isRed(h.left);
        assert isRed(h.right);
        h.color = RED;
        h.left.color = BLACK;
        h.right.color = BLACK;
    }

Invariants: Maintains symmetric order and perfect black balance
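A color flip touches only three color bits and no links. The sketch below builds a temporary 4-node by hand and flips it; the class `FlipColorsDemo` and the `char` keys are assumptions made for this example.

```java
// Sketch: splitting a temporary 4-node (E M S) by recoloring only.
class FlipColorsDemo {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        char key;
        Node left, right;
        boolean color; // color of parent link
        Node(char key, boolean color) { this.key = key; this.color = color; }
    }

    static boolean isRed(Node x) {
        return x != null && x.color == RED;
    }

    // The middle key's link becomes red (it is passed up to the parent);
    // the two outer keys become separate 2-nodes.
    static void flipColors(Node h) {
        h.color = RED;
        h.left.color = BLACK;
        h.right.color = BLACK;
    }

    // M with red links to both E and S: a temporary 4-node.
    static Node fourNode() {
        Node m = new Node('M', BLACK);
        m.left = new Node('E', RED);
        m.right = new Node('S', RED);
        return m;
    }
}
```

Since no link is redirected, symmetric order is untouched, and every path through M still crosses exactly one black link among these three nodes.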
Warmup 1: Insert into a tree with exactly 1 node
[Figure: inserting at a null link of the root converts the 2-node to a 3-node; in one of the two cases the new red link is (temporarily) right-leaning]
Case 1: Insert into a 2-node at the bottom
Warmup 2: Insert into a tree with exactly 2 nodes
Case 2: Insert into a 3-node at the bottom
[Figure: insertion trace with annotations "red, so flip colors" and "red, so rotate left"]
Same code for all cases

    private Node put(Node h, Key key, Value val) {
        if (h == null) {
            // insert at bottom and color it red
            return new Node(key, val, RED);
        }
        int cmp = key.compareTo(h.key);
        if      (cmp < 0) h.left  = put(h.left,  key, val);
        else if (cmp > 0) h.right = put(h.right, key, val);
        else              h.val   = val;

        // only a few extra LoC provides near-perfect balance
        if (isRed(h.right) && !isRed(h.left))     h = rotateLeft(h);  // lean left
        if (isRed(h.left)  && isRed(h.left.left)) h = rotateRight(h); // balance 4-node
        if (isRed(h.left)  && isRed(h.right))     flipColors(h);      // split 4-node
        return h;
    }

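The pieces above can be assembled into one small class. This is a sketch under stated assumptions rather than the lecture's full implementation: the class name `LLRBSketch`, the `Node` constructor, the public `put` wrapper that recolors the root black, and the `height()` helper are additions for this example, and the assertions from the rotation code are omitted.

```java
// Sketch: LLRB insertion and search assembled from the lecture's operations.
class LLRBSketch<Key extends Comparable<Key>, Value> {
    private static final boolean RED = true;
    private static final boolean BLACK = false;
    private Node root;

    private class Node {
        Key key; Value val; Node left, right;
        boolean color; // color of parent link
        Node(Key key, Value val, boolean color) {
            this.key = key; this.val = val; this.color = color;
        }
    }

    private boolean isRed(Node x) { return x != null && x.color == RED; }

    private Node rotateLeft(Node h) {
        Node x = h.right; h.right = x.left; x.left = h;
        x.color = h.color; h.color = RED;
        return x;
    }

    private Node rotateRight(Node h) {
        Node x = h.left; h.left = x.right; x.right = h;
        x.color = h.color; h.color = RED;
        return x;
    }

    private void flipColors(Node h) {
        h.color = RED; h.left.color = BLACK; h.right.color = BLACK;
    }

    // Assumption: the wrapper keeps the root's (nonexistent) parent link black.
    public void put(Key key, Value val) {
        root = put(root, key, val);
        root.color = BLACK;
    }

    private Node put(Node h, Key key, Value val) {
        if (h == null) return new Node(key, val, RED);
        int cmp = key.compareTo(h.key);
        if      (cmp < 0) h.left  = put(h.left,  key, val);
        else if (cmp > 0) h.right = put(h.right, key, val);
        else              h.val   = val;
        if (isRed(h.right) && !isRed(h.left))     h = rotateLeft(h);
        if (isRed(h.left)  && isRed(h.left.left)) h = rotateRight(h);
        if (isRed(h.left)  && isRed(h.right))     flipColors(h);
        return h;
    }

    public Value get(Key key) {
        Node x = root;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if      (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else              return x.val;
        }
        return null;
    }

    // Height in links; an empty tree has height -1.
    public int height() { return height(root); }

    private int height(Node x) {
        return x == null ? -1 : 1 + Math.max(height(x.left), height(x.right));
    }
}
```

A quick way to exercise it is to insert keys in ascending order, which degrades an elementary BST to a linked list, and then compare `height()` with the number of keys.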
255 insertions in ascending order
255 insertions in descending order
255 random insertions
What is the height of an LLRB tree with \(N\) keys in the worst case?
\(\sim \log_3 N\)
\(\sim \log_2 N\)
\(\sim 2 \log_2 N\)
\(\sim N\)
Proposition: Height of tree is \(\leq 2 \lg N\) in the worst case
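Pf: Every path from root to null link has the same number of black links, and no path has two red links in a row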
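One way to spell out the counting behind the bound is sketched here. The symbol \(b\) (the number of black links on each root-to-null path, with the null link counted) is introduced only for this derivation, and the last inequality is checked only for \(N \ge 3\).

```latex
\begin{align*}
2^{b} - 1 &\le N
  && \text{(the corresponding 2-3 tree has height } b-1\text{, so at least } 2^{b}-1 \text{ keys)} \\
b &\le \lg (N+1) \\
\#\,\text{red links on a path} &\le b
  && \text{(each red link is followed by a black link)} \\
\text{height} &\le 2b - 1 \;\le\; 2\lg(N+1) - 1 \;\le\; 2\lg N
  && \text{(for } N \ge 3\text{)}
\end{align*}
```

The smaller cases \(N = 1\) and \(N = 2\) have heights 0 and 1, which also satisfy the bound.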
Property: Height of tree is \(\sim 1.0 \lg N\) in typical applications
implementation | search* | insert* | delete* | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ordered | ops on keys |
---|---|---|---|---|---|---|---|---|
seq search | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | | equals() |
binary search | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | X | compareTo() |
BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | \(\sqrt{N}\) | X | compareTo() |
2-3 tree\(^\ddagger\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
LLRB\(^\star\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
\(^*\)guarantee, \(^\dagger\)average
\(^\ddagger\)hidden constant \(c\) is large (depends upon implementation)
\(^\star\)hidden constant \(c\) is small (at most \(2 \lg N\) compares)
Xerox PARC innovations (1970s)
Telephone company contracted with database provider to build real-time database to store customer information
Database implementation
Extended telephone service outage
“If implemented properly, the height of a red-black BST with \(N\) keys is at most \(2 \lg N\).”
—expert witness
Property: time required for a probe is much larger than time to access data within a page
Cost model: number of probes
Goal: access data using minimum number of probes
B-tree: Generalize 2-3 trees by allowing up to \(M\) keys per node
Proposition: A search or an insertion in a B-tree of order \(M\) with \(N\) keys requires between \(\sim \log_M N\) and \(\sim \log_{M/2} N\) probes.
Pf: All nodes (except possibly root) have between \(\left\lfloor M/2 \right\rfloor\) and \(M\) keys
In practice: Number of probes is at most \(4\) (for example, with \(M = 1024\) and \(N = 62 \text{ billion}\), \(\log_{M/2} N \leq 4\))
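The arithmetic behind the "at most 4" figure can be reproduced with integer multiplication. This is a minimal sketch; the class `BTreeProbes` and the method `ceilLog` are hypothetical names, and the loop assumes the running product fits in a `long`.

```java
// Sketch: smallest p with base^p >= n, i.e. the ceiling of log_base(n).
class BTreeProbes {
    static int ceilLog(long base, long n) {
        int p = 0;
        long reach = 1;           // base^p, the most keys distinguishable in p probes
        while (reach < n) {
            reach *= base;        // assumption: no overflow for the inputs used here
            p++;
        }
        return p;
    }

    public static void main(String[] args) {
        long m = 1024;
        long n = 62_000_000_000L;
        System.out.println("worst case probes: " + ceilLog(m / 2, n));
        System.out.println("best case probes:  " + ceilLog(m, n));
    }
}
```

Avoiding floating-point logarithms keeps the result exact at boundaries such as \(N = (M/2)^4\).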
Which of the following does the B in B-tree not mean?
A. Bayer
B. Balanced
C. Binary
D. Boeing
E. Broad
F. Bushy
Red-Black trees are widely used as system symbol tables
java.util.TreeMap, java.util.TreeSet
linux/rbtree.h
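As a small illustration of the library symbol tables listed above, the sketch below uses `java.util.TreeMap` for a few ordered operations. The class name `TreeMapDemo` and the sample keys are made up for this example; `firstKey`, `floorKey`, `ceilingKey`, and `headMap` are standard `TreeMap` methods.

```java
import java.util.TreeMap;

// Sketch: ordered symbol-table operations on java.util.TreeMap.
class TreeMapDemo {
    // Keys S E A R C H mapped to their insertion index.
    static TreeMap<String, Integer> sample() {
        TreeMap<String, Integer> st = new TreeMap<>();
        String[] keys = {"S", "E", "A", "R", "C", "H"};
        for (int i = 0; i < keys.length; i++) {
            st.put(keys[i], i);
        }
        return st;
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> st = sample();
        System.out.println(st.keySet());        // keys in ascending order
        System.out.println(st.floorKey("G"));   // largest key <= "G"
        System.out.println(st.ceilingKey("G")); // smallest key >= "G"
        System.out.println(st.headMap("E"));    // entries with keys strictly less than "E"
    }
}
```

Each of these calls depends only on the keys' ordering, which is why the table's "ops on keys" column lists compareTo().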
B-tree cousins: B+ tree, B* tree, B# tree, ...
B-trees (and cousins) are widely used for file systems and DBs