| implementation | search* | insert* | delete* | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ordered | ops on keys |
|---|---|---|---|---|---|---|---|---|
| seq search | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | | equals() |
| binary search | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | X | compareTo() |
| BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | \(\sqrt{N}\) | X | compareTo() |
| goal | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
\(^*\)guarantee, \(^\dagger\)average.
Challenge: Guarantee performance
This lecture: 2-3 trees, left-leaning red-black BSTs, B-trees

Binary search trees: Ordered, but cannot guarantee logarithmic height unless we make one small tweak
2-3 tree: Search tree, but allow 1 or 2 keys per node
Symmetric order: Inorder traversal yields keys in ascending order
Perfect balance: Every path from root to null link has same length (how to maintain?)

Search
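A minimal sketch of search in a 2-3 tree, assuming the usual rule: compare the search key against the keys in the node, find the interval containing it, and follow the associated link. The class `TwoThreeSearchSketch`, its `Node` layout, and the hand-built `sample()` tree are hypothetical and only illustrate the idea; they are not the lecture's implementation.

```java
// Hypothetical sketch of search in a 2-3 tree; all names are illustrative.
class TwoThreeSearchSketch {
    // A 2-node holds one key and two links; a 3-node holds two keys and three links.
    static class Node {
        int key1;                  // smaller (or only) key
        Integer key2;              // larger key, or null for a 2-node
        Node left, middle, right;  // middle is unused in a 2-node

        Node(int key1, Integer key2) {
            this.key1 = key1;
            this.key2 = key2;
        }
    }

    // Walk down from x, choosing the link whose interval contains the key.
    static boolean contains(Node x, int key) {
        while (x != null) {
            if (key == x.key1) return true;
            if (x.key2 != null && key == x.key2) return true;
            if (key < x.key1) x = x.left;                          // below the smaller key
            else if (x.key2 == null || key > x.key2) x = x.right;  // above the largest key
            else x = x.middle;                                     // between the two keys
        }
        return false;  // reached a null link: search miss
    }

    // Hand-built example: root 3-node [10 20] with children [5], [12 15], [30].
    static Node sample() {
        Node root = new Node(10, 20);
        root.left = new Node(5, null);
        root.middle = new Node(12, 15);
        root.right = new Node(30, null);
        return root;
    }

    public static void main(String[] args) {
        Node root = sample();
        System.out.println(contains(root, 15));  // search hit in the middle child
        System.out.println(contains(root, 13));  // search miss
    }
}
```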
Insertion always goes into a node at the bottom of the tree
Two cases:
Case 1: Insert into a 2-node at the bottom
Case 2: Insert into a 3-node at the bottom

Invariants: Maintains symmetric order and perfect balance
Pf: Each transformation maintains symmetric order and perfect balance






Splitting a 4-node is a local transformation: constant number of operations


What is the range of heights of a 2-3 tree with \(N\) keys?
| | best case | worst case |
|---|---|---|
| A. | \(\sim \log_4 N\) | \(\sim \log_3 N\) |
| B. | \(\sim \log_3 N\) | \(\sim \log_2 N\) |
| C. | \(\sim \log_3 N\) | \(\sim 2 \log_2 N\) |
| D. | \(\sim \log_3 N\) | \(\sim N\) |
Perfect balance: Every path from root to null link has same length

Tree height: Worst case \(\lg N\) (all 2-nodes); best case \(\log_3 N \approx 0.631 \lg N\) (all 3-nodes)
Bottom line: Guaranteed logarithmic performance for search and insert
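As a rough check on the height range, the sketch below computes exact bounds with integer arithmetic. It assumes height is counted in links from the root to a bottom node (a single-node tree has height 0): a perfectly balanced tree of height \(h\) holds \(2^{h+1}-1\) keys if every node is a 2-node and \(3^{h+1}-1\) keys if every node is a 3-node. The class and method names are hypothetical.

```java
// Hypothetical helper: exact height bounds for a 2-3 tree with n >= 1 keys.
class TwoThreeHeightBounds {
    // Largest h with 2^(h+1) - 1 <= n (upper bound: every node is a 2-node).
    static int maxHeight(long n) {
        int h = 0;
        long keys = 1;                 // keys in an all-2-node tree of height h
        while (2 * keys + 1 <= n) {    // an all-2-node tree of height h+1 still fits
            keys = 2 * keys + 1;
            h++;
        }
        return h;
    }

    // Smallest h with 3^(h+1) - 1 >= n (lower bound: every node is a 3-node).
    static int minHeight(long n) {
        int h = 0;
        long keys = 2;                 // keys in an all-3-node tree of height h
        while (keys < n) {             // tree of height h cannot hold n keys yet
            keys = 3 * keys + 2;
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println("min height bound: " + minHeight(n));
        System.out.println("max height bound: " + maxHeight(n));
    }
}
```

Both bounds grow logarithmically in \(N\), which is what guarantees logarithmic search and insert.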
| implementation | search* | insert* | delete* | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ordered | ops on keys |
|---|---|---|---|---|---|---|---|---|
| seq search | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | | equals() |
| binary search | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | X | compareTo() |
| BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | \(\sqrt{N}\) | X | compareTo() |
| 2-3 tree\(^\ddagger\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
\(^*\)guarantee, \(^\dagger\)average
\(^\ddagger\)but hidden constant \(c\) is large (depends upon implementation)
Direct implementation is complicated:
// fantasy code
public void put(Key key, Value val) {
    Node x = root;
    while (x.getTheCorrectChild(key) != null) {
        x = x.getTheCorrectChild(key);
        if (x.is4Node()) x.split();
    }
    if      (x.is2Node()) x.make3Node(key, val);
    else if (x.is3Node()) x.make4Node(key, val);
}
Bottom line: Could do it, but there's a better way
Challenge: How to represent a 3-node as a binary tree?
Approach 1: Regular BST
Approach 2: Regular BST with red "glue" nodes
Approach 3: Regular BST with red "glue" links




A 2-3 tree and corresponding red-black BST


Key property: 1-1 correspondence between 2-3 trees and LLRB trees


An LLRB tree is a BST such that:
No node has two red links connected to it
Every path from root to null link has the same number of black links ("perfect black balance")
Red links lean left
Observation: Search is the same as for elementary BST (ignore color), but runs faster because of better balance
public Value get(Key key) {
    Node x = root;
    while (x != null) {
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x = x.left;
        else if (cmp > 0) x = x.right;
        else              return x.val;
    }
    return null;
}
Remark: Most other ops (e.g., floor, iteration, selection) are also identical
Each node is pointed to by precisely one link (from its parent); can encode color of links in nodes
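private static final boolean RED   = true;
private static final boolean BLACK = false;

private class Node {
    Key key;
    Value val;
    Node left, right;
    boolean color;   // color of parent link

    // constructor used by put() when inserting a new node
    Node(Key key, Value val, boolean color) {
        this.key = key;
        this.val = val;
        this.color = color;
    }
}

private boolean isRed(Node x) {
    if (x == null) return false;   // null links are black
    return x.color == RED;
}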

root.left.color == RED; root.right.color == BLACK
Basic strategy: Maintain 1-1 correspondence with 2-3 trees
During internal operations, maintain symmetric order and perfect black balance (but not necessarily the color invariants)




How? Apply elementary red-black BST operations: rotation and color flip
Left rotation: Orient a (temporarily) right-leaning red link to lean left
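private Node rotateLeft(Node h) {
    assert isRed(h.right);
    Node x = h.right;     // x becomes the new root of this subtree
    h.right = x.left;
    x.left = h;
    x.color = h.color;    // x inherits the color of h's parent link
    h.color = RED;        // the link between x and h is now the red one
    return x;
}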
Invariants: Maintains symmetric order and perfect black balance


Right rotation: Orient a left-leaning red link to (temporarily) lean right
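private Node rotateRight(Node h) {
    assert isRed(h.left);
    Node x = h.left;      // x becomes the new root of this subtree
    h.left = x.right;
    x.right = h;
    x.color = h.color;    // x inherits the color of h's parent link
    h.color = RED;        // the link between x and h is now the red one
    return x;
}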
Invariants: Maintains symmetric order and perfect black balance


Color flip: Recolor to split a (temporary) 4-node
private void flipColors(Node h) {
    assert !isRed(h);
    assert isRed(h.left);
    assert isRed(h.right);
    h.color = RED;
    h.left.color = BLACK;
    h.right.color = BLACK;
}
Invariants: Maintains symmetric order and perfect black balance


Warmup 1: Insert into a tree with exactly 1 node
[Figure: the new node is attached by a red link at a null link of the root, converting the 2-node to a 3-node; one panel shows the new red link right-leaning]
Case 1: Insert into a 2-node at the bottom
Warmup 2: Insert into a tree with exactly 2 nodes
[Figure: the new node is attached by a red link at a null link]
Case 2: Insert into a 3-node at the bottom
[Figure: worked example; surviving annotations read "[...] red, so flip colors" and "[...] red, so rotate left"]
Same code for all cases
private Node put(Node h, Key key, Value val) {
    if (h == null) {
        // insert at bottom and color it red
        return new Node(key, val, RED);
    }
    int cmp = key.compareTo(h.key);
    if      (cmp < 0) h.left  = put(h.left, key, val);
    else if (cmp > 0) h.right = put(h.right, key, val);
    else              h.val   = val;
    // only a few extra LoC provides near-perfect balance
    if (isRed(h.right) && !isRed(h.left))    h = rotateLeft(h);   // lean left
    if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);  // balance 4-node
    if (isRed(h.left) && isRed(h.right))     flipColors(h);       // split 4-node
    return h;
}
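To see the pieces working together, here is a minimal self-contained sketch that assembles the node representation, rotations, color flip, and recursive `put` from the text into one class. The class name `LlrbSketch`, the public `put` wrapper that recolors the root black, and the `height` helper are assumptions added for illustration; they do not appear in the text above.

```java
// Minimal LLRB sketch. The class name, the public put() wrapper, and height()
// are assumptions added for illustration; the private methods follow the text.
class LlrbSketch<Key extends Comparable<Key>, Value> {
    private static final boolean RED = true;
    private static final boolean BLACK = false;

    private Node root;

    private class Node {
        Key key;
        Value val;
        Node left, right;
        boolean color;   // color of the link from the parent

        Node(Key key, Value val, boolean color) {
            this.key = key;
            this.val = val;
            this.color = color;
        }
    }

    private boolean isRed(Node x) {
        return x != null && x.color == RED;   // null links are black
    }

    private Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    private Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    private void flipColors(Node h) {
        h.color = RED;
        h.left.color = BLACK;
        h.right.color = BLACK;
    }

    // Assumed wrapper: insert recursively, then recolor the root black.
    public void put(Key key, Value val) {
        root = put(root, key, val);
        root.color = BLACK;
    }

    private Node put(Node h, Key key, Value val) {
        if (h == null) return new Node(key, val, RED);   // new node gets a red link
        int cmp = key.compareTo(h.key);
        if      (cmp < 0) h.left  = put(h.left, key, val);
        else if (cmp > 0) h.right = put(h.right, key, val);
        else              h.val   = val;

        if (isRed(h.right) && !isRed(h.left))    h = rotateLeft(h);   // lean left
        if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);  // balance 4-node
        if (isRed(h.left) && isRed(h.right))     flipColors(h);       // split 4-node
        return h;
    }

    public Value get(Key key) {
        Node x = root;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if      (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else              return x.val;
        }
        return null;
    }

    // Height in links; an empty tree has height -1.
    public int height() { return height(root); }

    private int height(Node x) {
        return x == null ? -1 : 1 + Math.max(height(x.left), height(x.right));
    }

    public static void main(String[] args) {
        LlrbSketch<Integer, Integer> st = new LlrbSketch<>();
        for (int i = 1; i <= 255; i++) st.put(i, i * i);   // ascending order
        System.out.println("height  = " + st.height());
        System.out.println("get(12) = " + st.get(12));
    }
}
```

Ascending insertions degrade an elementary BST into a linked list; here the rotations and color flips keep the height logarithmic.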
255 insertions in ascending order

255 insertions in descending order

255 random insertions

What is the height of an LLRB tree with \(N\) keys in the worst case?
A. \(\sim \log_3 N\)
B. \(\sim \log_2 N\)
C. \(\sim 2 \log_2 N\)
D. \(\sim N\)
Proposition: Height of tree is \(\leq 2 \lg N\) in the worst case
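Pf: Black height equals the height of the corresponding 2-3 tree, which is at most \(\lg N\); there are never two red links in a row, so at most every other link on a path is red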

Property: Height of tree is \(\sim 1.0 \lg N\) in typical applications
| implementation | search* | insert* | delete* | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ordered | ops on keys |
|---|---|---|---|---|---|---|---|---|
| seq search | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | | equals() |
| binary search | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | X | compareTo() |
| BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | \(\sqrt{N}\) | X | compareTo() |
| 2-3 tree\(^\ddagger\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
| LLRB\(^\star\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
\(^*\)guarantee, \(^\dagger\)average
\(^\ddagger\)hidden constant \(c\) is large (depends upon implementation)
\(^\star\)hidden constant \(c\) is small (at most \(2 \lg N\) compares)
Xerox PARC innovations (1970s)
Telephone company contracted with database provider to build real-time database to store customer information
Database implementation
Extended telephone service outage
“If implemented properly, the height of a red-black BST with \(N\) keys is at most \(2 \lg N\).”
—expert witness


Property: Time required for a probe is much larger than the time to access data within a page
Cost model: number of probes
Goal: access data using minimum number of probes
B-tree: Generalize 2-3 trees by allowing up to \(M\) keys per node



Proposition: A search or an insertion in a B-tree of order \(M\) with \(N\) keys requires between \(\sim \log_M N\) and \(\sim \log_{M/2} N\) probes.
Pf: All nodes (except possibly root) have between \(\left\lfloor M/2 \right\rfloor\) and \(M\) keys
In practice: Number of probes is at most \(4\). For example, with \(M = 1024\) and \(N = 62 \text{ billion}\), \(\log_{M/2} N \leq 4\).
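The arithmetic behind the "at most 4 probes" figure can be checked with a small sketch. It assumes the probe count is approximated by the ceiling of the logarithm, ignoring the lower-order terms hidden by the \(\sim\) notation; `BTreeProbeBound` and `ceilLog` are hypothetical names.

```java
// Hypothetical helper for the probe bounds log_M N and log_{M/2} N.
class BTreeProbeBound {
    // Smallest p with base^p >= n, i.e. ceil(log_base(n)), in integer arithmetic.
    static int ceilLog(long base, long n) {
        int p = 0;
        long reach = 1;            // number of keys reachable with p levels
        while (reach < n) {
            reach *= base;
            p++;
        }
        return p;
    }

    public static void main(String[] args) {
        long m = 1024;
        long n = 62_000_000_000L;  // 62 billion keys
        System.out.println("ceil(log_M N)     = " + ceilLog(m, n));
        System.out.println("ceil(log_{M/2} N) = " + ceilLog(m / 2, n));
    }
}
```

Since \(512^4 \approx 6.87 \times 10^{10}\) exceeds 62 billion, four levels suffice even when every node is only half full.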
Which of the following does the B in B-tree not mean?
A. Bayer
B. Balanced
C. Binary
D. Boeing
E. Broad
F. Bushy
Red-black trees are widely used as system symbol tables
java.util.TreeMap, java.util.TreeSet
linux/rbtree.h
B-tree cousins: B+ tree, B* tree, B# tree, ...
B-trees (and cousins) are widely used for file systems and DBs
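As a usage illustration, `java.util.TreeMap` (documented as a red-black tree based map) exposes the ordered operations that a balanced search tree makes efficient. The class name `TreeMapDemo` and the sample keys below are made up for this sketch.

```java
import java.util.TreeMap;

// Hypothetical demo of ordered symbol-table operations on java.util.TreeMap.
class TreeMapDemo {
    // Build a small symbol table; keys are inserted in unsorted order.
    static TreeMap<String, Integer> build() {
        TreeMap<String, Integer> st = new TreeMap<>();
        String[] keys = { "S", "E", "A", "R", "C", "H" };
        for (int i = 0; i < keys.length; i++) st.put(keys[i], i);
        return st;
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> st = build();
        System.out.println("keys in order: " + st.keySet());
        System.out.println("min key:       " + st.firstKey());
        System.out.println("max key:       " + st.lastKey());
        System.out.println("floor of G:    " + st.floorKey("G"));    // largest key <= G
        System.out.println("ceiling of G:  " + st.ceilingKey("G"));  // smallest key >= G
    }
}
```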