Definition: A BST is a binary tree in symmetric order
A binary tree is either empty or a node with two disjoint binary trees (left and right)
Symmetric order: Each node has a key, and every node's key is larger than all keys in its left subtree and smaller than all keys in its right subtree
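The symmetric-order condition can be checked mechanically. The sketch below is not from the slides: the class `SymmetricOrderCheck`, its `int`-keyed `Node`, and the method `isBST` are hypothetical names chosen for illustration. It assumes distinct keys and passes down the bounds that every key in a subtree must respect, since comparing a node only with its two children is not enough.

```java
// Hypothetical helper (not from the slides): checks symmetric order on int keys.
class SymmetricOrderCheck {
    static class Node {
        final int key;
        final Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isBST(Node x) {
        return isBST(x, null, null);
    }

    // lo and hi are exclusive bounds inherited from ancestors; null means unbounded.
    private static boolean isBST(Node x, Integer lo, Integer hi) {
        if (x == null) return true;
        if (lo != null && x.key <= lo) return false;
        if (hi != null && x.key >= hi) return false;
        return isBST(x.left, lo, x.key) && isBST(x.right, x.key, hi);
    }

    public static void main(String[] args) {
        Node ok  = new Node(5, new Node(3, new Node(1, null, null), new Node(4, null, null)),
                               new Node(8, null, null));
        // 6 is a valid right child of 3, but it sits in the left subtree of 5.
        Node bad = new Node(5, new Node(3, null, new Node(6, null, null)),
                               new Node(8, null, null));
        System.out.println(isBST(ok));   // true
        System.out.println(isBST(bad));  // false
    }
}
```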

Search: if less, go left; if greater, go right; if equal, search hit

Java definition: A BST is a reference to a root Node
A Node is composed of four fields:
- a Key and a Value
- left and right subtrees, for smaller and larger keys (resp.)
// Key and Value are generic types; Key is Comparable
private class Node {
    private Key key;
    private Value val;
    private Node left, right;
    public Node(Key key, Value val) {
        this.key = key;
        this.val = val;
    }
}
public class BST<Key extends Comparable<Key>, Value> {
    private Node root; // root of BST
    private class Node { /* prev slide */ }
    public void put(Key key, Value val) { /* next slides */ }
    public Value get(Key key) { /* next slides */ }
    public void delete(Key key) { /* next slides */ }
    public Iterable<Key> keys() { /* next slides */ }
}
Get: return value corresponding to given key, or null if no such key
public Value get(Key key) {
    Node x = root;
    while (x != null) {
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x = x.left;
        else if (cmp > 0) x = x.right;
        else              return x.val;
    }
    return null;
}
Cost: number of compares \(= 1 + \text{depth of node}\)
Put: Associate value with key
Search for key, then two cases:
- Key in tree: reset value
- Key not in tree: add new node
public void put(Key key, Value val) {
    root = put(root, key, val);
}
private Node put(Node x, Key key, Value val) {
    if (x == null) return new Node(key, val);
    int cmp = key.compareTo(x.key);
    // concise, but tricky, recursive code; read carefully!!
    if      (cmp < 0) x.left  = put(x.left, key, val);
    else if (cmp > 0) x.right = put(x.right, key, val);
    else              x.val   = val;
    return x;
}
Cost: Number of compares \(= 1 + \text{depth of node}\)

Bottom line: tree shape depends on order of insertion
Ex: insert keys in random order
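To make the dependence on insertion order concrete, here is a minimal sketch that inserts the same keys in sorted order and in shuffled order and reports the resulting height. The class `InsertionOrderDemo`, the helper `heightAfterInserting`, and the fixed random seed are assumptions made for this illustration; the `put` logic mirrors the recursive insert above, specialized to `int` keys without values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical demo class (not from the slides).
class InsertionOrderDemo {
    private static class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Same recursive insert as put() above, for int keys and no values.
    private static Node put(Node x, int key) {
        if (x == null) return new Node(key);
        if      (key < x.key) x.left  = put(x.left, key);
        else if (key > x.key) x.right = put(x.right, key);
        return x;
    }

    // Height = number of links on the longest root-to-leaf path (-1 if empty).
    private static int height(Node x) {
        if (x == null) return -1;
        return 1 + Math.max(height(x.left), height(x.right));
    }

    // Builds a BST by inserting keys in the given order and returns its height.
    static int heightAfterInserting(List<Integer> keys) {
        Node root = null;
        for (int key : keys) root = put(root, key);
        return height(root);
    }

    public static void main(String[] args) {
        int n = 1000;
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) keys.add(i);
        System.out.println("sorted order:   height = " + heightAfterInserting(keys));
        Collections.shuffle(keys, new Random(42)); // fixed seed only for repeatability
        System.out.println("shuffled order: height = " + heightAfterInserting(keys));
    }
}
```

Sorted insertion degenerates into a chain of height N-1 (999 here), while a shuffled order typically gives a height that is a small multiple of the logarithm of N.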

In what order does the traverse(root) code print out the keys in the BST?
private void traverse(Node x)
{
    if (x == null) return;
    traverse(x.left);
    StdOut.println(x.key);
    traverse(x.right);
}
// all keys, in order:
// [smaller keys, in order] key [larger keys, in order]
public Iterable<Key> keys() {
    Queue<Key> q = new Queue<Key>();
    inorder(root, q);
    return q;
}
private void inorder(Node x, Queue<Key> q) {
    if (x == null) return;
    inorder(x.left, q);
    q.enqueue(x.key);
    inorder(x.right, q);
}
Property: Inorder traversal of a BST yields keys in ascending order
What is the name of this sorting algorithm? (ignore in-place)
1. Shuffle the keys
2. Insert the keys into a BST, one at a time
3. Do an inorder traversal of the BST
[Figure: partitioning trace for the keys P S E U D O M Y T H I C A L, shown next to the corresponding BST; sorted result: A C D E H I L M O P S T U Y]
Remark: Correspondence is 1–1 if array has no duplicate keys
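The three steps (shuffle, insert, inorder traversal) can be sketched as a small sorting routine. The class `BstSortDemo` and its `sort` method are hypothetical names, and `java.util` collections stand in for the course's `Queue` and `StdOut`. The sketch assumes distinct keys, since a repeated key is simply dropped on insert.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class (not from the slides).
class BstSortDemo {
    private static class Node {
        final String key;
        Node left, right;
        Node(String key) { this.key = key; }
    }

    private static Node put(Node x, String key) {
        if (x == null) return new Node(key);
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x.left  = put(x.left, key);
        else if (cmp > 0) x.right = put(x.right, key);
        return x; // an equal key is already present: nothing to add
    }

    private static void inorder(Node x, List<String> out) {
        if (x == null) return;
        inorder(x.left, out);
        out.add(x.key);
        inorder(x.right, out);
    }

    static List<String> sort(List<String> keys) {
        List<String> shuffled = new ArrayList<>(keys);
        Collections.shuffle(shuffled);                      // step 1: shuffle
        Node root = null;
        for (String key : shuffled) root = put(root, key);  // step 2: insert one at a time
        List<String> sorted = new ArrayList<>();
        inorder(root, sorted);                              // step 3: inorder traversal
        return sorted;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
            "P", "S", "E", "U", "D", "O", "M", "Y", "T", "H", "I", "C", "A", "L");
        System.out.println(sort(keys));
    }
}
```

The shuffle changes the shape of the tree but not the result: whatever the insertion order, the inorder traversal returns the keys in ascending order.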
Proposition: If \(N\) distinct keys are inserted into a BST in random order, the expected number of compares for a search/insert is \(\sim 2\ln N\).
Pf: 1–1 correspondence with quicksort partitioning
Proposition [Reed, 2003]: If \(N\) distinct keys are inserted into a BST in random order, the expected height is \(\sim 4.311 \ln N\) (expected depth of function-call stack in quicksort)
But... Worst-case height is \(N-1\) (exponentially small chance when keys are inserted in random order)
| implementation | search\(^*\) | insert\(^*\) | search\(^\dagger\) | insert\(^\dagger\) | ops on keys |
|---|---|---|---|---|---|
| seq search (unordered list) | \(N\) | \(N\) | \(N\) | \(N\) | equals() |
| binary search (ordered array) | \(\log N\) | \(N\) | \(\log N\) | \(N\) | compareTo() |
| BST | \(N\) | \(N\) | \(\log N\) | \(\log N\) | compareTo() |
\(^*\)guarantee, \(^\dagger\)average
Why not shuffle to ensure a (probabilistic) guarantee of \(\log N\)?
Q. How to find the min / max?
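One way to answer this follows directly from symmetric order: the smallest key is reached by following left links from the root until a null left link, and the largest key by following right links. The sketch below is an illustration under that assumption, not code from the slides; `MinMaxDemo`, `build`, and the `int`-keyed `Node` are hypothetical names.

```java
// Hypothetical demo class (not from the slides).
class MinMaxDemo {
    static class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node put(Node x, int key) {
        if (x == null) return new Node(key);
        if      (key < x.key) x.left  = put(x.left, key);
        else if (key > x.key) x.right = put(x.right, key);
        return x;
    }

    // Convenience builder: inserts the keys in the order given.
    static Node build(int... keys) {
        Node root = null;
        for (int key : keys) root = put(root, key);
        return root;
    }

    // Smallest key: keep going left until there is no left child.
    static Node min(Node x) {
        if (x == null) return null;
        while (x.left != null) x = x.left;
        return x;
    }

    // Largest key: keep going right until there is no right child.
    static Node max(Node x) {
        if (x == null) return null;
        while (x.right != null) x = x.right;
        return x;
    }

    public static void main(String[] args) {
        Node root = build(5, 2, 8, 1, 9, 3);
        System.out.println("min = " + min(root).key); // min = 1
        System.out.println("max = " + max(root).key); // max = 9
    }
}
```

Both operations follow a single root-to-leaf path, so their cost is bounded by the height of the tree.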
Q. How to find the floor / ceiling?
Floor: Find largest key \(\leq k\)
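Three cases:
- k equals the key at the root: the floor of k is k
- k is less than the key at the root: the floor of k is in the left subtree
- k is greater than the key at the root: the floor of k is in the right subtree if any key \(\leq k\) is there; otherwise it is the key at the root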
Challenge: Prove to yourself that the above is correct.
public Key floor(Key key) {
    Node x = floor(root, key);
    if (x == null) return null;
    return x.key;
}
private Node floor(Node x, Key key) {
    if (x == null) return null;
    int cmp = key.compareTo(x.key);
    if (cmp == 0) return x;
    if (cmp < 0) return floor(x.left, key);
    Node t = floor(x.right, key);
    if (t != null) return t;
    else           return x;
}
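Ceiling (smallest key \(\geq k\)) is the mirror image of floor: swap left with right and less with greater. The sketch below shows one such mirrored version; the class `CeilingBST` is a hypothetical keys-only container written for this example, not part of the slides.

```java
// Hypothetical keys-only BST (not from the slides), just enough to show ceiling().
class CeilingBST<Key extends Comparable<Key>> {
    private Node root;

    private class Node {
        private final Key key;
        private Node left, right;
        Node(Key key) { this.key = key; }
    }

    public void put(Key key) { root = put(root, key); }

    private Node put(Node x, Key key) {
        if (x == null) return new Node(key);
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x.left  = put(x.left, key);
        else if (cmp > 0) x.right = put(x.right, key);
        return x;
    }

    // Smallest key >= the given key, or null if there is none.
    public Key ceiling(Key key) {
        Node x = ceiling(root, key);
        if (x == null) return null;
        return x.key;
    }

    private Node ceiling(Node x, Key key) {
        if (x == null) return null;
        int cmp = key.compareTo(x.key);
        if (cmp == 0) return x;
        if (cmp > 0) return ceiling(x.right, key); // x is too small: look right
        Node t = ceiling(x.left, key);             // x qualifies; try for a smaller one
        if (t != null) return t;
        else           return x;
    }

    public static void main(String[] args) {
        CeilingBST<String> st = new CeilingBST<>();
        for (String s : "S E X A R C H M".split(" ")) st.put(s);
        System.out.println(st.ceiling("G")); // H
        System.out.println(st.ceiling("Q")); // R
        System.out.println(st.ceiling("Z")); // null
    }
}
```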
Q. How to implement rank() and select() efficiently?

Number in node represents count of nodes in subtree rooted at node
public class BST<Key extends Comparable<Key>, Value> {
    private Node root;
    private class Node {
        /* ... */
        private int count; // number of nodes in subtree
    }
    private Node put(Node x, Key key, Value val) {
        if (x == null) {
            // init subtree count to 1
            return new Node(key, val, 1);
        }
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x.left  = put(x.left, key, val);
        else if (cmp > 0) x.right = put(x.right, key, val);
        else              x.val   = val;
        x.count = 1 + size(x.left) + size(x.right);
        return x;
    }
    public int size() { return size(root); }
    private int size(Node x) {
        if (x == null) return 0; // ok to call when x is null
        return x.count;
    }
}
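Rank: How many keys \(< k\)?
Easy recursive algorithm (3 cases):
- k is less than the key at the node: the rank of k in the left subtree
- k is greater than the key at the node: 1 + size of the left subtree + the rank of k in the right subtree
- k equals the key at the node: the size of the left subtree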
public int rank(Key key) { return rank(key, root); }
private int rank(Key key, Node x) {
    if (x == null) return 0;
    int cmp = key.compareTo(x.key);
    if      (cmp < 0) return rank(key, x.left);
    else if (cmp > 0) return 1 + size(x.left) + rank(key, x.right);
    else              return size(x.left);
}
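Select (the key of a given rank) can use the same subtree counts: compare the requested rank k with the size of the left subtree and descend accordingly. The sketch below is one possible version, not code from the slides; the class `SelectBST` is a hypothetical keys-only container, and returning null for an out-of-range rank is a choice made for this example.

```java
// Hypothetical keys-only BST with subtree counts (not from the slides).
class SelectBST<Key extends Comparable<Key>> {
    private Node root;

    private class Node {
        private final Key key;
        private Node left, right;
        private int count = 1; // number of nodes in subtree
        Node(Key key) { this.key = key; }
    }

    public int size() { return size(root); }
    private int size(Node x) { return (x == null) ? 0 : x.count; }

    public void put(Key key) { root = put(root, key); }

    private Node put(Node x, Key key) {
        if (x == null) return new Node(key);
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x.left  = put(x.left, key);
        else if (cmp > 0) x.right = put(x.right, key);
        x.count = 1 + size(x.left) + size(x.right);
        return x;
    }

    // Number of keys strictly less than key (same three cases as rank() above).
    public int rank(Key key) { return rank(key, root); }

    private int rank(Key key, Node x) {
        if (x == null) return 0;
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) return rank(key, x.left);
        else if (cmp > 0) return 1 + size(x.left) + rank(key, x.right);
        else              return size(x.left);
    }

    // Key of rank k (0-based), or null if k is out of range.
    public Key select(int k) {
        Node x = root;
        while (x != null) {
            int t = size(x.left);
            if      (t > k) x = x.left;                       // answer is in left subtree
            else if (t < k) { k = k - t - 1; x = x.right; }   // skip left subtree and x
            else            return x.key;                     // exactly t keys are smaller
        }
        return null;
    }
}
```

For the keys S E X A R C H M, `select(3)` returns H, and `rank(select(i)) == i` holds for every valid rank i.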
| | sequential search | binary search | BST |
|---|---|---|---|
| search | \(N\) | \(\log N\) | \(h\) |
| insert | \(N\) | \(N\) | \(h\) |
| min / max | \(N\) | \(1\) | \(h\) |
| floor / ceiling | \(N\) | \(\log N\) | \(h\) |
| rank | \(N\) | \(\log N\) | \(h\) |
| select | \(N\) | \(1\) | \(h\) |
| ordered iteration | \(N \log N\) | \(N\) | \(N\) |
where \(h\) is height of BST (proportional to \(\log N\) if keys inserted in random order)
| implementation | search\(^*\) | insert\(^*\) | delete\(^*\) | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ops on keys |
|---|---|---|---|---|---|---|---|
| seq search (unordered list) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | equals() |
| binary search (ordered array) | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | compareTo() |
| BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | ? | compareTo() |
\(^*\)guarantee, \(^\dagger\)average
Next: Deletion in BSTs
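To remove a node with a given key (lazy approach):
- Set its value to null
- Leave the key in the tree to guide searches (but don't consider it equal in a search)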
Cost: \(\sim 2 \ln N'\) per insert, search, and delete (if keys in random order), where \(N'\) is the number of key-value pairs ever inserted in the BST.
Unsatisfactory solution: tombstone (memory) overload
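A minimal sketch of the tombstone idea, assuming a null value marks a removed key. The class `LazyDeleteBST` and the method `nodeCount` are hypothetical names introduced here to make \(N'\) observable; they are not from the slides.

```java
// Hypothetical sketch of lazy (tombstone) deletion; not from the slides.
class LazyDeleteBST<Key extends Comparable<Key>, Value> {
    private Node root;
    private int nodes; // N': nodes ever created, tombstones included

    private class Node {
        private final Key key;
        private Value val; // null marks a tombstone
        private Node left, right;
        Node(Key key, Value val) { this.key = key; this.val = val; }
    }

    public void put(Key key, Value val) { root = put(root, key, val); }

    private Node put(Node x, Key key, Value val) {
        if (x == null) { nodes++; return new Node(key, val); }
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x.left  = put(x.left, key, val);
        else if (cmp > 0) x.right = put(x.right, key, val);
        else              x.val   = val; // also revives a tombstone
        return x;
    }

    private Node find(Key key) {
        Node x = root;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if      (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else              return x;
        }
        return null;
    }

    // A tombstone has a null value, so it looks the same as a missing key.
    public Value get(Key key) {
        Node x = find(key);
        return (x == null) ? null : x.val;
    }

    // Lazy delete: keep the node in place to guide searches, drop only its value.
    public void delete(Key key) {
        Node x = find(key);
        if (x != null) x.val = null;
    }

    public int nodeCount() { return nodes; }
}
```

After a delete, `get` reports the key as absent but `nodeCount()` does not shrink, which is the memory overload the slide refers to.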
To delete the minimum key:
- Go left until finding a node with a null left link
- Replace that node by its right link
- Update subtree counts
public void deleteMin() {
    root = deleteMin(root);
}
private Node deleteMin(Node x) {
    if (x == null) return null;
    if (x.left == null) return x.right;
    x.left = deleteMin(x.left);
    x.count = 1 + size(x.left) + size(x.right);
    return x;
}
Challenge: Prove to yourself that the above is correct.
To delete a node with key k: search for node t containing key k.
- Case 0 [0 children]: Delete t by setting parent link to null
- Case 1 [1 child]: Delete t by replacing parent link
- Case 2 [2 children]:
  - Find successor x of t: the minimum in t's right subtree
  - Delete x from tree, but don't garbage collect x! (x has no left child, so case "0 children" or "1 child")
  - Put x in t's spot
public void delete(Key key) {
    root = delete(root, key);
}
private Node delete(Node x, Key key) {
    if (x == null) return null;
    // search for key
    int cmp = key.compareTo(x.key);
    if      (cmp < 0) x.left  = delete(x.left, key);
    else if (cmp > 0) x.right = delete(x.right, key);
    else {
        if (x.right == null) return x.left;  // no right child
        if (x.left == null) return x.right;  // no left child
        // replace with successor
        Node t = x;
        x = min(t.right);
        x.right = deleteMin(t.right);
        x.left = t.left;
    }
    // update subtree counts
    x.count = size(x.left) + size(x.right) + 1;
    return x;
}
Unsatisfactory solution: not symmetric

Surprising consequence: Trees not random (!) \(\Rightarrow \sqrt{N}\) per op
Longstanding open problem: Simple and efficient delete for BSTs
| implementation | search\(^*\) | insert\(^*\) | delete\(^*\) | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ops on keys |
|---|---|---|---|---|---|---|---|
| seq search (unordered list) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | equals() |
| binary search (ordered array) | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | compareTo() |
| BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | \(\sqrt{N}\) | compareTo() |
\(^*\)guarantee, \(^\dagger\)average
Average case for other BST operations also becomes \(\sqrt{N}\) if deletions are allowed
Next: Guarantee logarithmic performance for all operations