Critical components in the world's computational infrastructure
Merge sort: last lecture
Quicksort: this lecture
Basic plan
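- Shuffle the array.
- Partition so that, for some `j`:
  - the entry `a[j]` is in place;
  - no entry to the left of `j` is larger;
  - no entry to the right of `j` is smaller.
- Sort each subarray recursively.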
    input:          Q U I C K S O R T E X A M P L E
    shuffle:        K R A T E L E P U I M Q C X O S
    partition item: ^----------v
    partition:      E C A I E |K| L P U T M Q R X O S
                    ( all <= K ) |K| ( all >= K )
    sort left:      A C E E I |K| . . . . . . . . . .
    sort right:     . . . . . |K| L M O P Q R S T U X
    result:         A C E E I K L M O P Q R S T U X
Sir Tony Hoare, 1980 Turing Award
“There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.”

“I call it my billion-dollar mistake. It was the invention of the null reference in 1965... This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.”
Repeat until the `i` and `j` pointers cross:

- Scan `i` from left to right so long as `a[i] < a[lo]`.
- Scan `j` from right to left so long as `a[j] > a[lo]`.
- Exchange `a[i]` with `a[j]`.

When the pointers cross:

- Exchange `a[lo]` with `a[j]`.
    private static int partition(Comparable[] a, int lo, int hi) {
        int i = lo, j = hi + 1;
        while (true) {
            while (less(a[++i], a[lo]))   // find item on left to swap
                if (i == hi) break;
            while (less(a[lo], a[--j]))   // find item on right to swap
                if (j == lo) break;
            if (i >= j) break;            // check if pointers cross
            exch(a, i, j);                // swap
        }
        exch(a, lo, j);                   // swap with partition item
        return j;                         // return index of item now known to be in place
    }
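To try the partitioning scheme on the shuffled example, here is a minimal self-contained sketch. The class name `PartitionDemo` and the bodies of the `less` and `exch` helpers are assumptions (the slides call these helpers without defining them), and the array type is narrowed to `String[]` for brevity.

```java
public class PartitionDemo {
    // Assumed helper: is v strictly smaller than w?
    public static boolean less(String v, String w) {
        return v.compareTo(w) < 0;
    }

    // Assumed helper: exchange a[i] and a[j].
    public static void exch(String[] a, int i, int j) {
        String t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Same loop structure as the partition method on the slide.
    public static int partition(String[] a, int lo, int hi) {
        int i = lo, j = hi + 1;
        while (true) {
            while (less(a[++i], a[lo]))   // scan i right past smaller items
                if (i == hi) break;
            while (less(a[lo], a[--j]))   // scan j left past larger items
                if (j == lo) break;
            if (i >= j) break;            // pointers crossed
            exch(a, i, j);
        }
        exch(a, lo, j);                   // put the partition item in place
        return j;
    }

    public static void main(String[] args) {
        String[] a = "K R A T E L E P U I M Q C X O S".split(" ");
        int j = partition(a, 0, a.length - 1);
        // prints: 5 E C A I E K L P U T M Q R X O S
        System.out.println(j + " " + String.join(" ", a));
    }
}
```

The result matches the "partition" row of the basic-plan example: `K` ends up at index 5 with nothing larger to its left and nothing smaller to its right.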
Q. How many compares (in the worst case) to partition an array of length \(N\)?
\(\sim \frac{1}{4} N\)
\(\sim \frac{1}{2} N\)
\(\sim N\)
\(\sim N \lg N\)
    public class Quick {
        private static int partition(Comparable[] a, int lo, int hi) {
            /* as before */
        }

        public static void sort(Comparable[] a) {
            // shuffle needed for performance guarantee (stay tuned...)
            StdRandom.shuffle(a);
            sort(a, 0, a.length - 1);
        }

        private static void sort(Comparable[] a, int lo, int hi) {
            if (hi <= lo) return;
            int j = partition(a, lo, hi);
            sort(a, lo, j-1);
            sort(a, j+1, hi);
        }
    }
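The class above relies on `StdRandom.shuffle` and on the `less`/`exch` helpers, none of which are defined in these slides. A runnable sketch that depends only on the JDK might look like the following; the class name `QuickDemo`, the generic signatures, and the Fisher-Yates `shuffle` standing in for `StdRandom.shuffle` are all assumptions.

```java
import java.util.Random;

public class QuickDemo {
    private static final Random RNG = new Random();

    private static <T extends Comparable<T>> boolean less(T v, T w) {
        return v.compareTo(w) < 0;
    }

    private static <T> void exch(T[] a, int i, int j) {
        T t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Fisher-Yates shuffle, standing in for StdRandom.shuffle (an assumption).
    private static <T> void shuffle(T[] a) {
        for (int i = a.length - 1; i > 0; i--)
            exch(a, i, RNG.nextInt(i + 1));
    }

    private static <T extends Comparable<T>> int partition(T[] a, int lo, int hi) {
        int i = lo, j = hi + 1;
        while (true) {
            while (less(a[++i], a[lo])) if (i == hi) break;
            while (less(a[lo], a[--j])) if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
        }
        exch(a, lo, j);
        return j;
    }

    public static <T extends Comparable<T>> void sort(T[] a) {
        shuffle(a);                      // guards against worst-case inputs
        sort(a, 0, a.length - 1);
    }

    private static <T extends Comparable<T>> void sort(T[] a, int lo, int hi) {
        if (hi <= lo) return;
        int j = partition(a, lo, hi);
        sort(a, lo, j - 1);
        sort(a, j + 1, hi);
    }

    public static void main(String[] args) {
        String[] a = "Q U I C K S O R T E X A M P L E".split(" ");
        sort(a);
        System.out.println(String.join(" ", a)); // A C E E I K L M O P Q R S T U X
    }
}
```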
    lo  j hi   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
               Q U I C K S O R T E X A M P L E   <- initial values
               K R A T E L E P U I M Q C X O S   <- random shuffle
     0  5 15   E C A I E|K|L P U T M Q R X O S   <- K is pivot
     0  3  4   E C A|E|I . . . . . . . . . . .   <- E is pivot
     0  2  2   A C|E|. . . . . . . . . . . . .   <- E is pivot
     0  0  1  |A|C . . . . . . . . . . . . . .   <- A is pivot
     1     1   .|C|. . . . . . . . . . . . . .   <- no partition
     4     4   . . . .|I|. . . . . . . . . . .   <- no partition (subarray of size 1)
     6  6 15   . . . . . .|L|P U T M Q R X O S   <- L is pivot
     7  9 15   . . . . . . . M O|P|T Q R X U S   <- P is pivot
     7  7  8   . . . . . . .|M|O . . . . . . .   <- M is pivot
     8     8   . . . . . . . .|O|. . . . . . .   <- no partition
    10 13 15   . . . . . . . . . . S Q R|T|U X   <- T is pivot
    10 12 12   . . . . . . . . . . R Q|S|. . .   <- S is pivot
    10 11 11   . . . . . . . . . . Q|R|. . . .   <- R is pivot
    10    10   . . . . . . . . . .|Q|. . . . .   <- no partition
    14 14 15   . . . . . . . . . . . . . .|U|X   <- U is pivot
    15    15   . . . . . . . . . . . . . . .|X|  <- no partition
               A C E E I K L M O P Q R S T U X   <- result
[Quicksort visualization; the key identifies the `i` and `j` pointers.]

Note: the data here are not shuffled before sorting; instead, a random item in each subarray is chosen as the pivot.
Partition in-place
Terminating the loop
Equal keys
Preserving randomness
Running time estimates:

| | Insertion sort | | | Merge sort | | | Quicksort | | |
|---|---|---|---|---|---|---|---|---|---|
| Computer | thousand | million | billion | thousand | million | billion | thousand | million | billion |
| home | instant | 2.8 hours | 317 years | instant | 1 second | 18 min | instant | 0.6 sec | 12 min |
| super | instant | 1 second | 1 week | instant | instant | instant | instant | instant | instant |
Lesson 1: Good algorithms are better than supercomputers
Lesson 2: Great algorithms are better than good ones
Best case: Number of compares is \(\sim N \lg N\)
    lo  j hi   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4
               A B C D E F G H I J K L M N O   <- initial values
               H A C B F E G D L I K J N M O   <- random shuffle
     0  7 14   D A C B F E G|H|L I K J N M O   <- 14 comparisons
     0  3  6   B A C|D|F E G . . . . . . . .   <- 6 comparisons
     0  1  2   A|B|C . . . . . . . . . . . .   <- 2 comparisons
     0     0  |A|. . . . . . . . . . . . . .
     2     2   . .|C|. . . . . . . . . . . .
     4  5  6   . . . . E|F|G . . . . . . . .   <- 2 comparisons
     4     4   . . . .|E|. . . . . . . . . .
     6     6   . . . . . .|G|. . . . . . . .
     8 11 14   . . . . . . . . J I K|L|N M O   <- 6 comparisons
     8  9 10   . . . . . . . . I|J|K . . . .   <- 2 comparisons
     8     8   . . . . . . . .|I|. . . . . .
    10    10   . . . . . . . . . .|K|. . . .
    12 13 14   . . . . . . . . . . . . M|N|O   <- 2 comparisons
    12    12   . . . . . . . . . . . .|M|. .
    14    14   . . . . . . . . . . . . . .|O|
               A B C D E F G H I J K L M N O   <- result, 34 comparisons
Worst case: Number of compares is \(\sim \frac{1}{2} N^2\)
    lo  j hi   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4
               A B C D E F G H I J K L M N O   <- initial values
               A B C D E F G H I J K L M N O   <- random shuffle
     0  0 14  |A|B C D E F G H I J K L M N O   <- 14 comparisons
     1  1 14   .|B|C D E F G H I J K L M N O   <- 13 comparisons
     2  2 14   . .|C|D E F G H I J K L M N O   <- 12 comparisons
     3  3 14   . . .|D|E F G H I J K L M N O   <- 11 comparisons
     4  4 14   . . . .|E|F G H I J K L M N O   <- 10 comparisons
     5  5 14   . . . . .|F|G H I J K L M N O   <- 9 comparisons
     6  6 14   . . . . . .|G|H I J K L M N O   <- 8 comparisons
     7  7 14   . . . . . . .|H|I J K L M N O   <- 7 comparisons
     8  8 14   . . . . . . . .|I|J K L M N O   <- 6 comparisons
     9  9 14   . . . . . . . . .|J|K L M N O   <- 5 comparisons
    10 10 14   . . . . . . . . . .|K|L M N O   <- 4 comparisons
    11 11 14   . . . . . . . . . . .|L|M N O   <- 3 comparisons
    12 12 14   . . . . . . . . . . . .|M|N O   <- 2 comparisons
    13 13 14   . . . . . . . . . . . . .|N|O   <- 1 comparison
    14    14   . . . . . . . . . . . . . .|O|
               A B C D E F G H I J K L M N O   <- result, 105 comparisons
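The quadratic growth in the worst case can be checked empirically. The sketch below is an illustration under stated assumptions: the class name `WorstCaseCompares` and the counting `less` helper are hypothetical, the shuffle is deliberately omitted, and the input is already sorted so that every partition is as lopsided as in the trace.

```java
public class WorstCaseCompares {
    static long compares = 0;

    // Assumed helper: compares two keys and counts the call.
    static boolean less(int v, int w) {
        compares++;
        return v < w;
    }

    static void exch(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    static int partition(int[] a, int lo, int hi) {
        int i = lo, j = hi + 1;
        while (true) {
            while (less(a[++i], a[lo])) if (i == hi) break;
            while (less(a[lo], a[--j])) if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
        }
        exch(a, lo, j);
        return j;
    }

    // Quicksort with no shuffle, so a sorted input triggers the worst case.
    static void sort(int[] a, int lo, int hi) {
        if (hi <= lo) return;
        int j = partition(a, lo, hi);
        sort(a, lo, j - 1);
        sort(a, j + 1, hi);
    }

    // Number of compares used to sort the keys 0..n-1 given in increasing order.
    public static long comparesOnSorted(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        compares = 0;
        sort(a, 0, n - 1);
        return compares;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 4000; n *= 2)
            System.out.println(n + ": " + comparesOnSorted(n) + " compares, N^2/2 = " + (long) n * n / 2);
    }
}
```

This partition code makes about `n + 1` compares on a subarray of length `n`, whereas the trace counts `n - 1`, so the totals are slightly larger than in the trace (133 rather than 105 for 15 items). The leading term is still \(\sim \frac{1}{2} N^2\): doubling the input size roughly quadruples the count.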
Proposition: The average number of compares \(C_N\) to quicksort an array of \(N\) distinct keys is \(\sim 2 N \ln N\) (and the number of exchanges is \(\sim\frac{1}{3} N \ln N\))
Pf: \(C_N\) satisfies the recurrence \(C_0 = C_1 = 0\) and for \(N \geq 2\):
\[\scriptsize C_N = \underbrace{(N+1)}_\text{partitioning} + \frac{C_0+C_{N-1}}{N} + \underbrace{\frac{\overbrace{C_1}^\text{left}+\overbrace{C_{N-2}}^\text{right}}{N}}_\text{partitioning probability} + \ldots + \frac{C_{N-1} + C_0}{N}\]
\[\scriptsize C_N = (N+1) + \frac{C_0+C_{N-1}}{N} + \frac{C_1+C_{N-2}}{N} + \ldots + \frac{C_{N-1} + C_0}{N}\]
Multiply both sides by \(N\) and collect terms:

\[N C_N = N(N+1) + 2(C_0 + C_1 + \ldots + C_{N-1})\]

Subtract the same equation for \(N-1\):

\[N C_N - (N-1)C_{N-1} = 2N + 2C_{N-1}\]

Rearrange terms and divide by \(N(N+1)\):

\[\frac{C_N}{N+1} = \frac{C_{N-1}}{N} + \frac{2}{N+1}\]

Apply this equation repeatedly (the sum telescopes):

\[\begin{array}{rcl} \frac{C_N}{N+1} & = & \frac{C_{N-1}}{N} + \frac{2}{N+1} \\ & = & \frac{C_{N-2}}{N-1} + \frac{2}{N} + \frac{2}{N+1} \\ & = & \frac{C_{N-3}}{N-2} + \frac{2}{N-1} + \frac{2}{N} + \frac{2}{N+1} \\ & = & \frac{2}{3} + \frac{2}{4} + \frac{2}{5} + \ldots + \frac{2}{N+1} \end{array}\]

Approximate the sum by an integral:

\[\begin{array}{rcl} C_N & = & 2(N+1) \left(\frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \ldots + \frac{1}{N+1}\right) \\ & \sim & 2(N+1) \int_3^{N+1} \frac{1}{x} dx \end{array}\]

Finally, the desired result:

\[C_N \sim 2(N+1) \ln N \approx 1.39 N \lg N\]
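As a sanity check on the leading term, the recurrence can be evaluated numerically and compared with \(2 N \ln N\). This is only a sketch: the class name `CompareRecurrence` and the method `averageCompares` are hypothetical, and the recurrence is used in the form \(C_N = (N+1) + \frac{2}{N}(C_0 + \ldots + C_{N-1})\) with \(C_0 = C_1 = 0\).

```java
public class CompareRecurrence {
    // Evaluates C_n from C_k = (k + 1) + (2 / k) * (C_0 + ... + C_{k-1}), C_0 = C_1 = 0.
    public static double averageCompares(int n) {
        double sum = 0.0;   // C_0 + ... + C_{k-1} at the start of each iteration
        double c = 0.0;     // most recently computed C_k
        for (int k = 2; k <= n; k++) {
            c = (k + 1) + 2.0 * sum / k;
            sum += c;
        }
        return c;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 1_000_000; n *= 10) {
            double exact = averageCompares(n);
            double estimate = 2.0 * n * Math.log(n);
            System.out.printf("N = %d: C_N = %.0f, 2 N ln N = %.0f, ratio = %.3f%n",
                              n, exact, estimate, exact / estimate);
        }
    }
}
```

The ratio approaches 1 only slowly, because the terms dropped by the \(\sim\) notation are linear in \(N\) while the leading term is \(N \ln N\).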
Quicksort is a (Las Vegas) randomized algorithm
Average case: Expected number of compares is \(\sim 1.39 N \lg N\)
Best case: Number of compares is \(\sim N \lg N\)
Worst case: Number of compares is \(\sim \frac{1}{2} N^2\) (but a lightning bolt is more likely to strike the computer during execution!)
Proposition: Quicksort is an in-place sorting algorithm
Pf:
Proposition: Quicksort is not stable.
Pf (by counterexample):
     i  j   0  1  2  3
            B1 C1 C2 A1   <- input with partition item B1
     1  3   B1 C1 C2 A1   <- found first inversion for partition
     1  3   B1 A1 C2 C1   <- swap (oh no!)
        1   A1 B1 C2 C1   <- swap partition item into place
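The counterexample can be replayed in code. In this sketch the class name `StabilityDemo` is hypothetical, each item is a string such as `"B1"` whose letter is the key and whose digit only records input order, and the shuffle is omitted so the run is deterministic. With this particular code, the later recursive call on `C2 C1` happens to swap the two back, so a second, assumed three-item input is included to show instability in a fully sorted result.

```java
public class StabilityDemo {
    // Compare by key only (the letter); the digit is a tag recording input order.
    static boolean less(String v, String w) {
        return v.charAt(0) < w.charAt(0);
    }

    static void exch(String[] a, int i, int j) {
        String t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static int partition(String[] a, int lo, int hi) {
        int i = lo, j = hi + 1;
        while (true) {
            while (less(a[++i], a[lo])) if (i == hi) break;
            while (less(a[lo], a[--j])) if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
        }
        exch(a, lo, j);
        return j;
    }

    // Quicksort without the shuffle, so the outcome is reproducible.
    public static void sort(String[] a, int lo, int hi) {
        if (hi <= lo) return;
        int j = partition(a, lo, hi);
        sort(a, lo, j - 1);
        sort(a, j + 1, hi);
    }

    public static void main(String[] args) {
        String[] a = {"B1", "C1", "C2", "A1"};
        partition(a, 0, a.length - 1);
        System.out.println(String.join(" ", a)); // A1 B1 C2 C1: C2 now precedes C1

        String[] b = {"B1", "A1", "B2"};
        sort(b, 0, b.length - 1);
        System.out.println(String.join(" ", b)); // A1 B2 B1: equal keys reordered
    }
}
```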
Insertion sort small subarrays
    private static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo + CUTOFF - 1) {
            Insertion.sort(a, lo, hi);
            return;
        }
        int j = partition(a, lo, hi);
        sort(a, lo, j-1);
        sort(a, j+1, hi);
    }
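The snippet above leaves `CUTOFF` and `Insertion.sort(a, lo, hi)` undefined. A self-contained sketch on `int[]` arrays could fill them in as follows; the class name `QuickCutoff`, the value `CUTOFF = 10`, and the local `insertionSort` helper are assumptions chosen for illustration, not values taken from the slides.

```java
import java.util.Random;

public class QuickCutoff {
    static final int CUTOFF = 10;   // assumed value; tune by experiment
    private static final Random RNG = new Random();

    private static void exch(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Insertion sort restricted to a[lo..hi]; stands in for Insertion.sort(a, lo, hi).
    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++)
            for (int j = i; j > lo && a[j] < a[j - 1]; j--)
                exch(a, j, j - 1);
    }

    private static int partition(int[] a, int lo, int hi) {
        int i = lo, j = hi + 1, v = a[lo];
        while (true) {
            while (a[++i] < v) if (i == hi) break;
            while (v < a[--j]) if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
        }
        exch(a, lo, j);
        return j;
    }

    private static void sort(int[] a, int lo, int hi) {
        if (hi <= lo + CUTOFF - 1) {        // subarray of at most CUTOFF items
            insertionSort(a, lo, hi);
            return;
        }
        int j = partition(a, lo, hi);
        sort(a, lo, j - 1);
        sort(a, j + 1, hi);
    }

    public static void sort(int[] a) {
        for (int i = a.length - 1; i > 0; i--)   // Fisher-Yates shuffle
            exch(a, i, RNG.nextInt(i + 1));
        sort(a, 0, a.length - 1);
    }

    public static void main(String[] args) {
        int[] a = {9, 3, 7, 1, 8, 2, 6, 0, 5, 4, 12, 11, 10};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```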
Median of sample
    private static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo) return;

        int median = medianOf3(a, lo, lo+(hi-lo)/2, hi);
        swap(a, lo, median);   // swap median into lo position

        int j = partition(a, lo, hi);
        sort(a, lo, j-1);
        sort(a, j+1, hi);
    }
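The `medianOf3` helper is not shown in the slides. One possible version, written for `int[]` arrays, returns the index (not the value) of the median of three entries, which is what the call above expects. The class name `MedianOf3Demo` and this particular comparison order are assumptions.

```java
public class MedianOf3Demo {
    // Returns whichever of the indices i, j, k holds the median of a[i], a[j], a[k].
    public static int medianOf3(int[] a, int i, int j, int k) {
        if (a[i] < a[j]) {
            if (a[j] < a[k]) return j;        // a[i] <  a[j] <  a[k]
            else if (a[i] < a[k]) return k;   // a[i] <  a[k] <= a[j]
            else return i;                    // a[k] <= a[i] <  a[j]
        } else {
            if (a[k] < a[j]) return j;        // a[k] <  a[j] <= a[i]
            else if (a[k] < a[i]) return k;   // a[j] <= a[k] <  a[i]
            else return i;                    // a[j] <= a[i] <= a[k]
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 3};
        int m = medianOf3(a, 0, 1, 2);
        System.out.println("median index = " + m + ", value = " + a[m]); // index 2, value 3
    }
}
```

Using the median of a small sample as the partitioning item makes badly unbalanced partitions less likely, at the cost of a few extra compares per call.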
When the array has a large number of duplicate keys, partition it into three parts: values \(< v\), values \(= v\), and values \(> v\), where \(v\) is the partitioning item.
    private static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo) return;
        int lt = lo, gt = hi;
        Comparable v = a[lo];
        int i = lo + 1;
        while (i <= gt) {
            int cmp = a[i].compareTo(v);
            if      (cmp < 0) swap(a, lt++, i++);
            else if (cmp > 0) swap(a, i, gt--);
            else              i++;
        }
        // a[lo..lt-1] < v = a[lt..gt] < a[gt+1..hi]
        sort(a, lo, lt - 1);
        sort(a, gt + 1, hi);
    }
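To experiment with 3-way partitioning on an input with many duplicates, a self-contained `int[]` sketch is given below. The class name `Quick3wayDemo`, the `swap` helper body, and the Fisher-Yates shuffle are assumptions; the partitioning loop mirrors the one above.

```java
import java.util.Arrays;
import java.util.Random;

public class Quick3wayDemo {
    private static final Random RNG = new Random();

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    private static void sort(int[] a, int lo, int hi) {
        if (hi <= lo) return;
        int lt = lo, gt = hi, i = lo + 1;
        int v = a[lo];
        while (i <= gt) {
            if      (a[i] < v) swap(a, lt++, i++);   // grow the "< v" region
            else if (a[i] > v) swap(a, i, gt--);     // grow the "> v" region
            else               i++;                  // equal to v: leave in the middle
        }
        // now a[lo..lt-1] < v = a[lt..gt] < a[gt+1..hi]
        sort(a, lo, lt - 1);
        sort(a, gt + 1, hi);
    }

    public static void sort(int[] a) {
        for (int i = a.length - 1; i > 0; i--)       // Fisher-Yates shuffle
            swap(a, i, RNG.nextInt(i + 1));
        sort(a, 0, a.length - 1);
    }

    public static void main(String[] args) {
        int[] a = {2, 0, 1, 1, 2, 1, 0, 2, 2, 1, 0, 2};   // only three distinct keys
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

Items equal to the partitioning item are excluded from both recursive calls, which is why heavily duplicated inputs benefit.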
Goal: Given an array of \(n\) items, find the \(k^\textit{th}\) smallest item
Examples:
Applications:
Use theory as a guide
Which is true?
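Partition the array so that:

- the entry `a[j]` is in place;
- no entry to the left of `j` is larger;
- no entry to the right of `j` is smaller.

Repeat in one subarray, depending on `j`; finished when `j` equals `k`.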
    public static Comparable select(Comparable[] a, int k) {
        StdRandom.shuffle(a);
        int lo = 0, hi = a.length - 1;
        while (hi > lo) {
            int j = partition(a, lo, hi);
            if      (j < k) lo = j + 1;
            else if (j > k) hi = j - 1;
            else            return a[k];
        }
        return a[k];
    }
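A runnable sketch of quick-select on `int[]` arrays follows. The class name `QuickSelectDemo`, the range check on `k`, and the Fisher-Yates shuffle replacing `StdRandom.shuffle` are assumptions. As in the code above, `k` is a zero-based rank, so `k = 0` asks for the smallest item.

```java
import java.util.Random;

public class QuickSelectDemo {
    private static final Random RNG = new Random();

    private static void exch(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    private static int partition(int[] a, int lo, int hi) {
        int i = lo, j = hi + 1, v = a[lo];
        while (true) {
            while (a[++i] < v) if (i == hi) break;
            while (v < a[--j]) if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
        }
        exch(a, lo, j);
        return j;
    }

    // Returns the item of zero-based rank k; rearranges the array as a side effect.
    public static int select(int[] a, int k) {
        if (k < 0 || k >= a.length)
            throw new IllegalArgumentException("k out of range");
        for (int i = a.length - 1; i > 0; i--)   // Fisher-Yates shuffle
            exch(a, i, RNG.nextInt(i + 1));
        int lo = 0, hi = a.length - 1;
        while (hi > lo) {
            int j = partition(a, lo, hi);
            if      (j < k) lo = j + 1;          // rank k lies in the right part
            else if (j > k) hi = j - 1;          // rank k lies in the left part
            else            return a[k];
        }
        return a[k];
    }

    public static void main(String[] args) {
        int[] a = {7, 2, 9, 4, 1};
        System.out.println(select(a, 2)); // 4, the median
    }
}
```

Only one side of each partition is explored, which is the intuition behind the linear average running time stated next.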
Proposition: Quick-select takes linear time on average.
Pf sketch
Proposition: There exists a compare-based selection algorithm whose worst-case running time is linear (Blum, Floyd, Pratt, Rivest, Tarjan, 1973).
Remark: Constants are high → not used in practice
Use theory as a guide