
Greedy Algorithms

COS 320 - Algorithm Design

Greedy Algorithms

Coin Changing

coin changing

Goal: Given U.S. currency denominations (\(\{1, 5, 10, 25, 100\}\)), devise a method to pay a given amount to a customer using the fewest coins.

Ex: 34¢

Cashier's Algorithm: At each iteration, add coin of the largest value that does not take us past the amount to be paid.

Ex: $2.89

cashier's algorithm

At each iteration, add coin of the largest value that does not take us past the amount to be paid.

Cashiers-Algorithm (x, c1, c2, ..., cn)
    Sort n coin denominations so that 0 < c1 < c2 < ... < cn.
    S <- { }  // multiset of coins selected
    While x > 0
        k <- largest index such that ck <= x
        If no such k
            Return "no solution"
        Else
            x <- x - ck
            S <- S | { ck }
    Return S
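A runnable sketch of the pseudocode in Python (the function name and signature are illustrative; looping over sorted denominations is equivalent to repeatedly taking the largest coin that fits):

```python
def cashier(x, denominations):
    """Greedy coin changing: repeatedly take the largest coin <= x.

    Returns the list of coins paid out, or None if the greedy strategy
    gets stuck (possible when 1 is not among the denominations).
    """
    coins = []
    for c in sorted(denominations, reverse=True):
        while x >= c:           # take as many of this coin as fit
            coins.append(c)
            x -= c
    return coins if x == 0 else None
```

Running it on the slide's examples: `cashier(289, [1, 5, 10, 25, 100])` pays $2.89 with 10 coins, and `cashier(15, [7, 8, 9])` returns `None`, matching the "no solution" branch of the pseudocode.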

quiz 1: greedy algorithms

Is the cashier's algorithm optimal?

  1. Yes, greedy algorithms are always optimal

  2. Yes, for any set of coin denominations \(c_1 < c_2 < \ldots < c_n\) provided \(c_1 = 1\)

  3. Yes, because of special properties of U.S. coin denominations

  4. No.

Cashier's algorithm (for arbitrary coin denominations)

Q. Is Cashier's algorithm optimal for any set of denominations?

  1. No. Consider U.S. postage: 1, 10, 21, 34, 70, 100, 350, 1225, 1500.
    • Cashier's algorithm: 140¢ = 100 + 34 + 1 + 1 + 1 + 1 + 1 + 1
    • Optimal: 140¢ = 70 + 70
  1. No. It may not even lead to a feasible solution if \(c_1 > 1\). Ex: denominations \(\{7, 8, 9\}\)
    • Cashier's algorithm: 15¢ = 9 + ?
    • Optimal: 15¢ = 7 + 8

properties of any optimal solution (U.S. coin denominations)

Property: Number of pennies ≤ 4
Pf: Replace 5 pennies with 1 nickel

Property: Number of nickels ≤ 1
Pf: Replace 2 nickels with 1 dime

Property: Number of quarters ≤ 3
Pf: Replace 4 quarters with 1 dollar

Property: Number of nickels + number of dimes ≤ 2
Pf: Replace 3 dimes (and 0 nickels) with 1 quarter + 1 nickel; replace 2 dimes + 1 nickel with 1 quarter


properties of any optimal solution (U.S. coin denominations)

Theorem: Cashier's Algorithm is optimal for U.S. coins {1,5,10,25,100}

Pf: (by induction on amount to be paid \(x\)) Let \(c_k\) be the largest denomination with \(c_k \leq x\). By the properties above, the coins of denominations \(c_1, \ldots, c_{k-1}\) in any optimal solution sum to less than \(c_k\) (see the table below), so any optimal solution must also use a coin of value \(c_k\). Removing one \(c_k\) coin from both solutions reduces to the amount \(x - c_k\), which greedy pays optimally by induction.

properties of any optimal solution (U.S. coin denominations)

\(k\)   \(c_k\)   all optimal solutions must satisfy   max value of coins \(c_1, c_2, \ldots, c_{k-1}\) in any optimal solution
1       1         \(P \leq 4\)                          —
2       5         \(N \leq 1\)                          \(4c_1 = 4\)
3       10        \(N + D \leq 2\)                      \(1c_2 + 4c_1 = 5 + 4 = 9\)
4       25        \(Q \leq 3\)                          \(2c_3 + 4c_1 = 20 + 4 = 24\)
5       100       no limit                              \(3c_4 + 2c_3 + 4c_1 = 75 + 20 + 4 = 99\)

Greedy Algorithms

Interval Scheduling (4.1)

interval scheduling

Job \(j\) starts at \(s_j\) and finishes at \(f_j\). Two jobs are compatible if they do not overlap. Goal: find a maximum-size subset of mutually compatible jobs.

           a            :   :   :   :   :   :   
:        b      :   :   :   :   :   :   :   :   
:   :   :      c    :   :   :   :   :   :   :   
:   :   :            d          :   :   :   :   
:   :   :   :        e      :   :   :   :   :   
:   :   :   :   :          f        :   :   :   
:   :   :   :   :   :          g        :   :   
:   :   :   :   :   :   :   :        h      :   
:   :   :   :   :   :   :   :   :   :   :   :   
0   1   2   3   4   5   6   7   8   9   10  11  

Example: jobs d and g are incompatible

quiz 2: greedy algorithms

Consider jobs in some order, taking each job provided it's compatible with the ones already taken. Which rule is optimal?


  1. Consider jobs in ascending order of \(s_j\) (earliest start time)

  2. Consider jobs in ascending order of \(f_j\) (earliest finish time)

  3. Consider jobs in ascending order of \(f_j-s_j\) (shortest interval)

  4. None of the above

interval scheduling: earliest-finish-time-first algorithm

Earliest-Finish-Time-First (n, s1, s2, ..., sn, f1, f2, ..., fn)
    Sort jobs by finish times and renumber so that f1 ≤ f2 ≤ ... ≤ fn
    S <- { }     // set of jobs selected
    For j = 1 to n
        If job j is compatible with S
            S <- S | { j }
    Return S

Proposition: Can implement earliest-finish-time-first in \(O(n \log n)\) time.
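One way to realize the \(O(n \log n)\) bound in Python (a sketch; sorting dominates, and the compatibility test is \(O(1)\) by remembering the last finish time):

```python
def earliest_finish_time_first(jobs):
    """jobs: list of (start, finish) pairs. Returns a maximum-size
    subset of mutually compatible jobs, chosen greedily by earliest
    finish time. Intervals are treated as open: s == f is compatible."""
    selected = []
    last_finish = float("-inf")
    for s, f in sorted(jobs, key=lambda job: job[1]):  # by finish time
        if s >= last_finish:        # compatible with everything selected
            selected.append((s, f))
            last_finish = f
    return selected
```

On the jobs a–h from the diagram (a = (0,6), b = (1,4), c = (3,5), d = (3,8), e = (4,7), f = (5,9), g = (6,10), h = (8,11)) this selects b, e, h, as in the demo.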

group: interval scheduling

Earliest-Finish-Time-First (n, s1, s2, ..., sn, f1, f2, ..., fn)
    Sort jobs by finish times and renumber so that f1 ≤ f2 ≤ ... ≤ fn
    S <- { }     // set of jobs selected
    For j = 1 to n
        If job j is compatible with S
            S <- S | { j }
    Return S
           a            :   :   :   :   :   :   
:        b      :   :   :   :   :   :   :   :   
:   :   :      c    :   :   :   :   :   :   :   
:   :   :            d          :   :   :   :   
:   :   :   :        e      :   :   :   :   :   
:   :   :   :   :          f        :   :   :   
:   :   :   :   :   :          g        :   :   
:   :   :   :   :   :   :   :        h      :   
:   :   :   :   :   :   :   :   :   :   :   :   
0   1   2   3   4   5   6   7   8   9   10  11  
:        b           e      :        h      :   
0   1   2   3   4   5   6   7   8   9   10  11  

interval scheduling: earliest-finish-time-first algorithm

Theorem: The earliest-finish-time-first algorithm is optimal
Pf: (by contradiction) Suppose greedy is not optimal. Let \(i_1, i_2, \ldots, i_k\) be the jobs greedy selects and \(j_1, j_2, \ldots, j_m\) the jobs in an optimal solution, with \(i_1 = j_1, \ldots, i_r = j_r\) for the largest possible value of \(r\). Job \(i_{r+1}\) finishes no later than \(j_{r+1}\) (greedy chose it), so replacing \(j_{r+1}\) with \(i_{r+1}\) yields another optimal solution that agrees with greedy in \(r+1\) positions, contradicting the maximality of \(r\).


quiz 3: greedy algorithms

Suppose that each job also has a positive weight and the goal is to find a maximum weight subset of mutually compatible intervals. Is the earliest-finish-time-first algorithm still optimal?


  1. Yes, because greedy algorithms are always optimal

  2. Yes, because the same proof of correctness is valid

  3. No, because the same proof of correctness is no longer valid

  4. No, because you could assign a huge weight to a job that overlaps the job with the earliest finish time

Greedy Algorithms

Interval Partitioning (4.1)

interval partitioning

Scheduling lectures to classrooms


Ex: This schedule uses 4 classrooms to schedule 10 lectures.

:      :      :      :                          e                     :      :                j          :      
          c          :                d          :                g          :      :      :      :      :      
                        b                        :      :      :                       h                 :      
          a          :      :      :      :      :                f          :                i          :      
09:00  09:30  10:00  10:30  11:00  11:30  12:00  12:30  13:00  13:30  14:00  14:30  15:00  15:30  16:00  16:30  

Example: jobs e and g are incompatible

interval partitioning

Scheduling lectures to classrooms


Ex: This schedule uses 3 classrooms to schedule 10 lectures.

:      :      :      :      :      :      :      :      :      :      :      :      :      :      :      :      
          c          :                d          :                f          :                j          :      
                        b                        :                g          :                i          :      
          a          :                          e                                      h                 :      
09:00  09:30  10:00  10:30  11:00  11:30  12:00  12:30  13:00  13:30  14:00  14:30  15:00  15:30  16:00  16:30  

Note: intervals are open, so e and h do not intersect. Need only 3 classrooms at 14:00.

quiz 4: greedy algorithms

Consider lectures in some order, assigning each lecture to first available classroom (opening a new classroom if none is available). Which rule is optimal?


  1. Consider lectures in ascending order of \(s_j\) (earliest start time)

  2. Consider lectures in ascending order of \(f_j\) (earliest finish time)

  3. Consider lectures in ascending order of \(f_j - s_j\) (shortest interval)

  4. None of the above

interval partitioning: earliest-start-time-first algorithm

// consider lectures in order of start time:
// - assign next lecture to any compatible classroom (if one exists)
// - otherwise, open up a new classroom
Earliest-Start-Time-First (n, s1, s2, ..., sn, f1, f2, ..., fn)
    Sort lectures by start times and renumber so that s1 ≤ s2 ≤ ... ≤ sn
    d <- 0     // number of allocated classrooms
    For j = 1 to n
        If lecture j is compatible with some classroom k
            Schedule lecture j in any such classroom k
        Else
            Allocate a new classroom d+1
            Schedule lecture j in classroom d+1
            d <- d + 1
    Return schedule

group: interval partitioning

// consider lectures in order of start time:
// - assign next lecture to any compatible classroom (if one exists)
// - otherwise, open up a new classroom
Earliest-Start-Time-First (n, s1, s2, ..., sn, f1, f2, ..., fn)
    Sort lectures by start times and renumber so that s1 ≤ s2 ≤ ... ≤ sn
    d <- 0     // number of allocated classrooms
    For j = 1 to n
        If lecture j is compatible with some classroom k
            Schedule lecture j in any such classroom k
        Else
            Allocate a new classroom d+1
            Schedule lecture j in classroom d+1
            d <- d + 1
    Return schedule
          a          :      :      :      :      :      :      :      :      :      :      :      :      :      
                        b                        :      :      :      :      :      :      :      :      :      
          c          :      :      :      :      :      :      :      :      :      :      :      :      :      
:      :      :      :                d          :      :      :      :      :      :      :      :      :      
:      :      :      :      :                       e                 :      :      :      :      :      :      
:      :      :      :      :      :      :      :                f          :      :      :      :      :      
:      :      :      :      :      :      :      :                g          :      :      :      :      :      
:      :      :      :      :      :      :      :      :      :                       h                 :      
:      :      :      :      :      :      :      :      :      :      :      :                i          :      
:      :      :      :      :      :      :      :      :      :      :      :                j          :      
9:00   9:30   10:00  10:30  11:00  11:30  12:00  12:30  13:00  13:30  14:00  14:30  15:00  15:30  16:00  16:30  
          c          :                d          :                f          :                j          :      
                        b                        :                g          :                i          :      
          a          :      :                       e                                  h                 :      
9:00   9:30   10:00  10:30  11:00  11:30  12:00  12:30  13:00  13:30  14:00  14:30  15:00  15:30  16:00  16:30  

interval partitioning: earliest-start-time-first algorithm

Proposition: The earliest-start-time-first algorithm can be implemented in \(O(n \log n)\) time.
Pf: Store classrooms in a priority queue (key = finish time of its last lecture)

Remark: This implementation chooses the classroom \(k\) whose last lecture finishes earliest
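The priority-queue idea from the proposition can be sketched in Python (names illustrative): the min-heap stores one entry per classroom, keyed on the finish time of its last lecture.

```python
import heapq

def partition_intervals(lectures):
    """lectures: list of (start, finish) pairs. Greedily assigns each
    lecture (in order of start time) to a classroom and returns the
    number of classrooms used. Intervals are open, so a classroom
    whose last lecture finishes at time s can host a lecture starting
    at s. The heap makes the whole loop O(n log n)."""
    heap = []  # finish time of the last lecture in each classroom
    for s, f in sorted(lectures):          # earliest start time first
        if heap and heap[0] <= s:          # some classroom is free
            heapq.heapreplace(heap, f)     # reuse the earliest-finishing one
        else:
            heapq.heappush(heap, f)        # open a new classroom
    return len(heap)
```

For instance, three pairwise-overlapping lectures need three classrooms, while back-to-back lectures share one.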

interval partitioning: lower bound on optimal solution

The depth of a set of open intervals is the maximum number of intervals that contain any given point.

Key observation: Number of classrooms needed ≥ depth

Q. Does minimum number of classrooms needed always equal depth?

  1. Yes! Moreover, earliest-start-time-first algorithm finds a schedule whose number of classrooms equals the depth.
          c          :                d          :                f          :                j          :      
                        b                        :                g          :                i          :      
          a          :      :                       e                                  h                 :      
9:00   9:30   10:00  10:30  11:00  11:30  12:00  12:30  13:00  13:30  14:00  14:30  15:00  15:30  16:00  16:30  
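The depth in the key observation can be computed directly with an endpoint sweep (a sketch; names are illustrative). Because intervals are open, a finish at time \(t\) is processed before a start at time \(t\).

```python
def depth(intervals):
    """Maximum number of open intervals containing any single point.
    Sweep sorted endpoints, counting opens (+1) and closes (-1);
    the tie-break key processes closes before opens at equal times."""
    events = []
    for s, f in intervals:
        events.append((s, +1))   # interval opens
        events.append((f, -1))   # interval closes
    events.sort(key=lambda e: (e[0], e[1]))  # -1 sorts before +1 at ties
    best = cur = 0
    for _, delta in events:
        cur += delta
        best = max(best, cur)
    return best
```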

interval partitioning: analysis of earliest-start-time-first algorithm

Observation: the earliest-start-time-first algorithm never schedules two incompatible lectures in the same classroom.

Theorem: Earliest-start-time-first algorithm is optimal.
Pf: Let \(d\) be the number of classrooms the algorithm allocates. Classroom \(d\) was opened because some lecture \(j\) was incompatible with a lecture in each of the other \(d-1\) classrooms. Each of those \(d-1\) lectures starts no later than \(s_j\) (lectures are considered in order of start time) and finishes after \(s_j\), so \(d\) lectures overlap just after time \(s_j\). Hence depth \(\geq d\); by the key observation, every schedule uses at least depth classrooms, so the algorithm is optimal.

Greedy Algorithms

Scheduling to minimize lateness (4.2)

Scheduling to minimize lateness

Single resource processes one job at a time. Job \(j\) requires \(t_j\) units of processing time and is due at time \(d_j\); if it finishes at time \(f_j\), its lateness is \(\ell_j = \max\{0, f_j - d_j\}\). Goal: schedule all jobs to minimize the maximum lateness \(L = \max_j \ell_j\).

\(j\)     1  2  3  4  5  6
\(t_j\)  3  2  1  4  3  2
\(d_j\)  6  8  9  9  14 15
 d3=9     d2=8       d6=15           d1=6             d5=14                 d4=9          :     
0     1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    

Example: \(j_1\) has \(d_1=6\), \(f_1=8\), and lateness of \(8-6=2\), and
\(j_4\) has \(d_4=9\), \(f_4=15\), and lateness of \(15-9=6\), so maximum lateness \(L = \max \{2, 6\} = 6\)

quiz 5: Greedy Algorithms

Schedule jobs according to some natural order. Which order minimizes the maximum lateness?


  1. Ascending order of processing time \(t_j\) (shortest processing time)

  2. Ascending order of deadline \(d_j\) (earliest deadline first)

  3. Ascending order of slack: \(d_j - t_j\) (smallest slack)

  4. None of the above

minimizing lateness: earliest deadline first

Earliest-Deadline-First (n, t1, t2, ..., tn, d1, d2, ..., dn)
    Sort jobs by due times and renumber so that d1 ≤ d2 ≤ ... ≤ dn
    t <- 0
    For j = 1 to n
        Assign job j to interval [t, t + tj]
        sj <- t; fj <- t + tj
        t <- t + tj
    Return intervals [s1, f1], [s2, f2], ..., [sn, fn]

\(j\)     1  2  3  4  5  6
\(t_j\)  3  2  1  4  3  2
\(d_j\)  6  8  9  9  14 15
       d1=6           d2=8     d3=9           d4=9                d5=14          d6=15    :     
0     1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    

Note: job 4 has lateness \(10 - 9 = 1\) and every other job is on time, so \(L = 1\)
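As a quick check, the earliest-deadline-first schedule and its maximum lateness can be recomputed with a short Python sketch (function name illustrative):

```python
def max_lateness_edf(jobs):
    """jobs: list of (t_j, d_j) pairs (processing time, deadline).
    Runs jobs back to back in ascending order of deadline and returns
    the maximum lateness max_j max(0, f_j - d_j)."""
    t = 0   # current time: next job starts as soon as the last finishes
    L = 0
    for tj, dj in sorted(jobs, key=lambda job: job[1]):  # earliest deadline first
        t += tj                 # this job finishes at time t
        L = max(L, t - dj)      # its lateness (negative means early)
    return L
```

On the table above, `max_lateness_edf([(3, 6), (2, 8), (1, 9), (4, 9), (3, 14), (2, 15)])` reproduces \(L = 1\).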

minimizing lateness: no idle time

Observation 1: There exists an optimal schedule with no idle time.

                          an optimal schedule                           
    d=4     :            d=6        :     :            d=12       :     
0     1     2     3     4     5     6     7     8     9     10    11    
                 an optimal schedule with no idle time                  
    d=4            d=6               d=12       :     :     :     :     
0     1     2     3     4     5     6     7     8     9     10    11    

Observation 2: The earliest-deadline-first schedule has no idle time.

minimizing lateness: inversions

Given a schedule \(S\), an inversion is a pair of jobs \(i\) and \(j\) such that \(i < j\) but \(j\) is scheduled before \(i\).

                  a schedule with an inversion (i < j)                  
                       j              i                           :     
0     1     2     3     4     5     6     7     8     9     10    11    

Recall: we assume the jobs are numbered so that \(d_1 \leq d_2 \leq \ldots \leq d_n\)


Observation 3: The earliest-deadline-first schedule is the unique idle-free schedule with no inversions.

  1     2     3     4     5     6    ...    n   :     
0     1     2     3     4     5     6     7     8     

minimizing lateness: inversions

Observation 4: If an idle-free schedule has an inversion, then it has an adjacent inversion (two inverted jobs scheduled consecutively)

Pf: Let \(i\)–\(j\) be an inversion with the fewest jobs scheduled between \(j\) and \(i\), and let \(k\) be the job scheduled immediately after \(j\). If \(k < j\), then \(j\)–\(k\) is an adjacent inversion. Otherwise \(k > j > i\), so \(k\)–\(i\) is an inversion with fewer jobs between its two jobs, a contradiction.

        j     k                 i               :     
0     1     2     3     4     5     6     7     8     

minimizing lateness: inversions

                            before exchange                             
                       j              i                           :     
0     1     2     3     4     5     6     7     8     9     10    11    
                             after exchange                             
                          i              j                        :     
0     1     2     3     4     5     6     7     8     9     10    11    

Key claim: Exchanging two adjacent, inverted jobs \(i\) and \(j\) reduces the number of inversions by 1 and does not increase the max lateness.

minimizing lateness: inversions

Key claim: Exchanging two adjacent, inverted jobs \(i\) and \(j\) reduces the number of inversions by 1 and does not increase the max lateness.

Pf: Let \(\ell\) denote the lateness before the swap, and let \(\ell'\) denote it afterwards.
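The heart of the claim is one chain of inequalities. Since \(i < j\) we have \(d_i \leq d_j\), and after the swap job \(j\) finishes exactly when job \(i\) used to finish, i.e. \(f'_j = f_i\). Hence

```latex
\[
\ell'_j \;=\; f'_j - d_j \;=\; f_i - d_j \;\leq\; f_i - d_i \;=\; \ell_i .
\]
```

Every other job finishes at the same time as before, and job \(i\) now finishes earlier, so no job's lateness increases and the max lateness does not grow.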

minimizing lateness: analysis of earliest-deadline-first algorithm

Theorem: The earliest-deadline-first schedule \(S\) is optimal

Pf (by contradiction): Define \(S^*\) to be an optimal idle-free schedule with the fewest inversions (such a schedule exists by Observation 1). If \(S^*\) has no inversions, then \(S = S^*\) by Observation 3. Otherwise \(S^*\) has an adjacent inversion (Observation 4); exchanging it yields an optimal schedule with one fewer inversion (key claim), contradicting the choice of \(S^*\).

greedy analysis strategies

Greedy algorithm stays ahead: show that after each step of the greedy algorithm, its solution is at least as good as any other algorithm's (e.g., interval scheduling).

Structural: discover a simple "structural" bound that every possible solution must meet, then show that the greedy algorithm achieves this bound (e.g., interval partitioning and depth).

Exchange argument: gradually transform any solution into the one found by the greedy algorithm without hurting its quality (e.g., minimizing lateness).

greedy algorithms

Other greedy algorithms: Dijkstra's algorithm, minimum spanning trees (Prim, Kruskal), Huffman coding, ...

Greedy Algorithms

Google's Foo.bar challenge

Group: Google's Foo.bar challenge

A "secret" web tool that Google uses to recruit developers.

  • Triggered by specific searches related to programming.
  • Algorithmic coding challenges of increasing difficulty.
Quantum antimatter fuel comes in small pellets, which is convenient since 
the many moving parts of the LAMBCHOP each need to be fed fuel one pellet 
at a time.  However, minions dump pellets in bulk into the fuel intake.  
You need to figure out the most efficient way to sort and shift the
pellets down to a single pellet at a time.

The fuel control mechanisms have 3 operations:
 -  Add 1 fuel pellet
 -  Remove 1 fuel pellet
 -  Divide the entire group of fuel pellets by 2 (due to the destructive
    energy released when a quantum antimatter pellet is cut in half, the 
    safety controls will only allow this to happen if there is an even
    number of pellets)

Write a function called answer(n) which takes a positive integer n as a
string and returns the minimum number of operations needed to transform
the number of pellets to 1.
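One greedy solution to the challenge (a sketch, not necessarily the intended reference solution): halving is the only operation that shrinks \(n\) quickly, so divide whenever \(n\) is even; when \(n\) is odd, look at the two lowest bits to decide whether \(+1\) or \(-1\) creates more trailing zeros, with \(n = 3\) as a special case.

```python
def answer(n):
    """Minimum number of +1 / -1 / halve operations to reduce n to 1.
    n is given as a decimal string, as in the challenge statement;
    Python integers are arbitrary precision, so large inputs are fine."""
    n = int(n)
    ops = 0
    while n > 1:
        if n % 2 == 0:
            n //= 2                      # always halve when possible
        elif n == 3 or n % 4 == 1:
            n -= 1                       # ...01 in binary: subtracting wins
        else:
            n += 1                       # ...11 in binary: adding wins
        ops += 1
    return ops
```

For example, `answer("15")` returns 5 (15 → 16 → 8 → 4 → 2 → 1), beating the subtract-first route, which needs 6.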

Greedy Algorithms

optimal caching (4.3)

optimal caching

Caching
  • Cache with room for \(k\) items
  • Sequence of \(m\) item requests \(d_1, d_2, \ldots, d_m\)
  • Cache hit: requested item is already in the cache
  • Cache miss: item must be brought into the cache; if the cache is full, some existing item must be evicted

Applications: CPU, RAM, hard drive, web, browser, ...

Goal: Eviction schedule that minimizes the number of evictions

optimal caching

Ex: \(k=2\), initial cache \(= ab\), requests: \(a, b, c, b, c, a, b\)

Optimal eviction schedule: 2 evictions

cache
a a b
b a b
c a c
b b c
c b c
a b a
b b a

Note: each cache miss (eviction) is highlighted in red in the original slides.

optimal offline caching: greedy algorithms

LIFO/FIFO: evict the item brought into the cache most/least recently

LRU: evict the item whose most recent access was earliest

LFU: evict the item that was least frequently requested

optimal offline caching: greedy algorithms

LIFO: Evict item brought in most recently (stack)

cache
a a w x y z
d d w x y z
a a w x y z
b b w x y z
c c w x y z
e e w x y z
g g w x y z
b b w x y z
e e w x y z
d d w x y z

optimal offline caching: greedy algorithms

FIFO: Evict item brought in least recently (queue)

cache
a v w x y a
d v w x d a
a v w x d a
b v w b d a
c v c b d a
e e c b d a
g e c b d g
b e c b d g
e e c b d g
d e c b d g

optimal offline caching: greedy algorithms

LRU: Evict item whose most recent access was earliest

cache
a v w x y a
d v w x d a
a v w x d a
b v w b d a
c v c b d a
e e c b d a
g e c b g a
b e c b g a
e e c b g a
d e c b g d
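The LRU rule above can be sketched compactly in Python (names illustrative): an `OrderedDict` keeps cached items in recency order, least recently used first.

```python
from collections import OrderedDict

def count_lru_evictions(cache, requests, k):
    """Simulate LRU: on a miss with a full cache, evict the item whose
    most recent access is oldest. Returns the number of evictions."""
    lru = OrderedDict((item, None) for item in cache)
    evictions = 0
    for item in requests:
        if item in lru:
            lru.move_to_end(item)        # refresh recency on a hit
        else:
            if len(lru) >= k:
                lru.popitem(last=False)  # evict least recently used
                evictions += 1
            lru[item] = None
    return evictions
```

On the earlier small example (\(k = 2\), initial cache \(ab\), requests \(a, b, c, b, c, a, b\)) LRU incurs 3 evictions, one more than the optimal 2.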

optimal offline caching: greedy algorithms

LFU: Evict item that was least frequently requested

cache
a v w x y a
d v w x d a
a v w x d a
b v w b d a
c v c b d a
e e c b d a
g e c b g a
b e c b g a
e e c b g a
d e c b d a

optimal offline caching: farthest-in-future

Farthest-in-future algorithm evicts item in the cache that is not requested until farthest in the future (clairvoyant algorithm)

Theorem [Bélády 1966]: FF is optimal eviction schedule

Pf: Algorithm and theorem are intuitive; proof is subtle.
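Before the proof, the theorem can at least be sanity-checked by simulation; a sketch (names illustrative), exercised on the earlier example (\(k = 2\), initial cache \(ab\), requests \(a, b, c, b, c, a, b\), optimal = 2 evictions):

```python
def count_ff_evictions(cache, requests, k):
    """Simulate Belady's farthest-in-future rule: on a miss with a full
    cache, evict the cached item whose next request lies farthest in
    the future (items never requested again are farthest of all)."""
    cache = list(cache)
    evictions = 0
    for i, item in enumerate(requests):
        if item in cache:
            continue                         # cache hit
        if len(cache) >= k:
            def next_use(x):
                # index of x's next request, or infinity if none
                for j in range(i + 1, len(requests)):
                    if requests[j] == x:
                        return j
                return float("inf")
            cache.remove(max(cache, key=next_use))
            evictions += 1
        cache.append(item)
    return evictions
```

(The linear scan in `next_use` makes this quadratic; precomputing next-use indices would make it efficient, but the simple version suffices to check small examples.)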

optimal offline caching: farthest-in-future

Farthest-in-future algorithm evicts item in the cache that is not requested until farthest in the future (clairvoyant algorithm)

cache
a a b c d e
f a b c f e
a a b c f e
b a b c f e
c a b c f e
e a b c f e
g a b c g e
b a b c g e
e a b c g e
d a b c g d

quiz 6: greedy algorithms

Which item will be evicted next using farthest-in-future schedule?

  1. d

  2. e

  3. c

  4. a

cache
b d b y a
c d b c a
e d e c a
f ? ? ? ?
c
d
a
e
a
c

reduced eviction schedules

A reduced schedule is a schedule that brings an item \(d\) into the cache in step \(j\) only if there is a request for \(d\) in step \(j\) and \(d\) is not already in the cache.

an unreduced schedule:

cache
a a b c
a a b c
c a d c
d a d c
a a c b
b a c b
c a c b
d d c b
d d c d

a reduced schedule:

cache
a a b c
a a b c
c a b c
d a d c
a a d c
b a d b
c a c b
d d c b
d d c b

reduced eviction schedules

Claim: Given any unreduced schedule \(S\), can transform it into a reduced schedule \(S'\) with no more evictions.

Pf by induction on number of steps \(j\):

reduced eviction schedules

unreduced schedule \(S\):

cache
      c
      c
      c
x     d
y     d
z     d
e     e
      e

schedule \(S'\):

cache
      c
      c
      c
x     c
y     c
z     c
e     e
      e

reduced eviction schedules

Claim: Given any unreduced schedule \(S\), can transform it into a reduced schedule \(S'\) with no more evictions.

Pf by induction on number of steps \(j\):

reduced eviction schedules

unreduced schedule \(S\):

cache
      c
      c
      c
x     d
y     d
z     d
d     d
      d

schedule \(S'\):

cache
      c
      c
      c
x     c
y     c
z     c
d     d
      d

reduced eviction schedules

Claim: Given any unreduced schedule \(S\), can transform it into a reduced schedule \(S'\) with no more evictions.

Pf by induction on number of steps \(j\):

reduced eviction schedules

unreduced schedule \(S\):

cache
  d a c
  d a c
  d a c
d d a d
d d a d
c c a d
b c a b
d c a d

schedule \(S'\):

cache
  d a c
  d a c
  d a c
d d a c
d d a c
c c a c
b c a b
d c a d

reduced eviction schedules

Claim: Given any unreduced schedule \(S\), can transform it into a reduced schedule \(S'\) with no more evictions.

Pf by induction on number of steps \(j\):

reduced eviction schedules

unreduced schedule \(S\):

cache
  d a c
  d a c
  d a c
d d a d
d d a d
c c a d
a c a d
d c a d

schedule \(S'\):

cache
  d a c
  d a c
  d a c
d d a c
d d a c
c c a c
a c a c
d c a d

reduced eviction schedules

Claim: Given any unreduced schedule \(S\), can transform it into a reduced schedule \(S'\) with no more evictions.

Pf by induction on number of steps \(j\):

farthest-in-future analysis

Theorem: FF is optimal eviction algorithm

Pf: Follows directly from the following invariant

Invariant: There exists an optimal reduced schedule \(S\) that has the same eviction schedule as \(S_\textit{FF}\) through the first \(j\) steps

farthest-in-future analysis

Invariant: There exists an optimal reduced schedule \(S\) that has the same eviction schedule as \(S_\textit{FF}\) through the first \(j\) steps

Pf by induction on number of steps \(j\):

farthest-in-future analysis

Schedule \(S\) at steps \(j, j+1\)

step cache
 
\(j\)   d e
\(j+1\) d d e

Schedule \(S'\) at step \(j, j+1\)

step cache
 
\(j\)   d e
\(j+1\) d d e

farthest-in-future analysis

Schedule \(S\) at steps \(j, j+1\)

step cache
 
\(j\)   e f
\(j+1\) d d f

Schedule \(S'\) at step \(j, j+1\)

step cache
 
\(j\)   e f
\(j+1\) d d f

farthest-in-future analysis

Schedule \(S\) at steps \(j, j+1\)

step cache
 
\(j\)   e f
\(j+1\) d e d

Schedule \(S'\) at step \(j, j+1\)

step cache
 
\(j\)   e f
\(j+1\) d d f

farthest-in-future analysis

Let \(j'\) be the first step after \(j+1\) that \(S'\) must take a different action from \(S\) (involves either \(e\) or \(f\) or neither); let \(g\) denote the item requested in step \(j'\).

Schedule \(S\) at step \(j, j+1, j'\)

step cache
 
\(j\)   e f
\(j+1\) d e d
 
\(j'\) e e d

Schedule \(S'\) at step \(j,j+1,j'\)

step cache
 
\(j\)   e f
\(j+1\) d d f
 
\(j'\) e d f

farthest-in-future analysis

Schedule \(S\) at step \(j, j+1, j'\)

step cache
 
\(j\)   e f
\(j+1\) d e d
 
\(j'\) f f d

Schedule \(S'\) at step \(j,j+1,j'\)

step cache
 
\(j\)   e f
\(j+1\) d d f
 
\(j'\) f d f

farthest-in-future analysis

Schedule \(S\) at step \(j, j+1, j'\)

step cache
 
\(j\)   e f
\(j+1\) d e d
 
\(j'\) f e f

Schedule \(S'\) at step \(j,j+1,j'\)

step cache
 
\(j\)   e f
\(j+1\) d d f
 
\(j'\) f e f

farthest-in-future analysis

Schedule \(S\) at step \(j, j+1, j'\)

step cache
 
\(j\)   e f
\(j+1\) d e d
 
\(j'\) g g d

Schedule \(S'\) at step \(j,j+1,j'\)

step cache
 
\(j\)   e f
\(j+1\) d d f
 
\(j'\) g d g

caching perspective

Online vs. offline algorithms
  • Offline: the full sequence of requests is known in advance (e.g., farthest-in-future)
  • Online: requests arrive one at a time, and eviction decisions must be made with no knowledge of future requests (e.g., FIFO, LRU)


FIFO: Evict item brought in least recently

LRU: Evict item whose most recent access was earliest (FF with direction of time reversed!)

caching perspective

Theorem: FF is optimal offline eviction algorithm

Greedy Algorithms

Dijkstra's algorithm (4.4)

single-pair shortest path problem

Problem: Given a digraph \(G = (V,E)\), edge lengths \(l_e \geq 0\), source \(s \in V\), and destination \(t \in V\), find a shortest directed path from \(s\) to \(t\).

Length of path: \(9 + 4 + 1 + 11 = 25\)

Assumption: there exists a path from \(s\) to every node

quiz 7: greedy algorithms

Suppose that you change the length of every edge of \(G\) as follows to create a new graph \(G'\). For which is every shortest path in \(G\) a shortest path in \(G'\)?


  1. \(l'_e = l_e + 17\) (Add 17)

  2. \(l'_e = 17 \cdot l_e\) (Multiply by 17)

  3. Both A and B

  4. Neither A nor B

quiz 8: greedy algorithms

Which variant in car GPS?


  1. Single source: from one node \(s\) to every other node

  2. Single sink: from every node to one node \(t\)

  3. Source-sink: from one node \(s\) to another node \(t\)

  4. All pairs: between all pairs of nodes

shortest path applications

[ Network Flows: Theory, Algorithms, and Applications, by Ahuja, Magnanti, and Orlin ]

dijkstra's algorithm

Dijkstra's alg (for single-source shortest paths problem)

Greedy approach: Maintain a set of explored nodes \(S\) for which the algorithm has determined the length \(d[u]\) of a shortest \(s {\leadsto} u\) path. Initialize \(S = \{s\}\) and \(d[s] = 0\); repeatedly choose an unexplored node \(v \notin S\) minimizing \(\pi(v)\), add \(v\) to \(S\), and set \(d[v] = \pi(v)\).

group: Dijkstra's alg: Demo

\(S\): blue nodes, \(\mathrm{pred}[v]\): blue arrow, \(\pi(v)\): node label


Dijkstra's alg: proof of correctness

Invariant: For each node \(u \in S\): \(d[u]\) is len of a shortest \(s {\leadsto} u\) path

Pf by induction on \(|S|\):

Dijkstra's alg: proof of correctness

Invariant: For each node \(u \in S\): \(d[u]\) is len of a shortest \(s {\leadsto} u\) path

dijkstra's alg: efficient implementation

Critical optimization 1: For each unexplored node \(v \notin S\), explicitly maintain \(\pi[v]\) instead of computing directly from definition

\[ \pi(v) = \min_{e = (u,v): u \in S} d[u] + l_e \]

dijkstra's alg: efficient implementation

Critical optimization 1: For each unexplored node \(v \notin S\), explicitly maintain \(\pi[v]\) instead of computing directly from definition

Critical optimization 2: Use a min-oriented priority queue (MinPQ) to choose an unexplored node that minimizes \(\pi[v]\)

dijkstra's alg: efficient implementation

Implementation

Dijkstra(V, E, l, s):
    ForEach v != s: pi[v] <- infty, pred[v] <- null
    pi[s] <- 0
    Create an empty priority queue pq
    ForEach v in V: Insert(pq, v, pi[v])
    While pq is not empty:
        u <- DelMin(pq)
        ForEach edge e=(u,v) in E leaving u:
            If pi[v] > pi[u] + l[e]
                DecreaseKey(pq, v, pi[u] + l[e])
                pi[v] <- pi[u] + l[e]
                pred[v] <- e
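The pseudocode assumes a priority queue with DecreaseKey; Python's `heapq` has no such operation, so a common workaround (sketched here, with an illustrative adjacency-list encoding) pushes duplicate entries and skips stale ones when popped:

```python
import heapq

def dijkstra(graph, s):
    """Single-source shortest paths with nonnegative edge lengths.
    graph: dict mapping node -> list of (neighbor, length) pairs.
    Returns dict of shortest-path lengths from s to each reachable node."""
    dist = {s: 0}
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                      # stale entry: u already settled
        for v, le in graph.get(u, []):    # edges leaving u
            nd = d + le
            if nd < dist.get(v, float("inf")):
                dist[v] = nd              # relax edge (u, v)
                heapq.heappush(pq, (nd, v))
    return dist
```

With the binary heap this runs in \(O(m \log n)\), matching the binary-heap row of the table that follows.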

dijkstra's alg: which priority queue?

Performance depends on PQ: \(n\) insert, \(n\) del-min, \(\leq m\) dec-key


priority queue     insert              del-min                       dec-key                 total
unordered array    \(O(1)\)            \(O(n)\)                      \(O(1)\)                \(O(n^2)\)
binary heap        \(O(\log n)\)       \(O(\log n)\)                 \(O(\log n)\)           \(O(m \log n)\)
d-way heap         \(O(d \log_d n)\)   \(O(d \log_d n)\)             \(O(\log_d n)\)         \(O(m \log_{m/n} n)\)
Fibonacci heap     \(O(1)\)            \(O(\log n)\) \(\ddagger\)    \(O(1)\) \(\ddagger\)   \(O(m + n \log n)\)
integer pq         \(O(1)\)            \(O(\log \log n)\)            \(O(1)\)                \(O(m + n \log \log n)\)

\(\ddagger\) amortized

[ d-way [Johnson 1975], fib heap [Fredman-Tarjan 1984], int pq [Thorup 2004] ]

quiz 9: greedy algorithms

How to solve the single-source shortest paths problem in undirected graphs with positive edge lengths?


  1. Replace each undirected edge with two antiparallel edges of same length. Run Dijkstra's algorithm in the resulting digraph

  2. Modify Dijkstra's algorithm so that when it processes node \(u\), it considers all edges incident to \(u\) (instead of edges leaving \(u\))

  3. Both A and B

  4. Neither A nor B

Dijkstra's alg: undirected graphs

Theorem [Thorup 1999]: Can solve single-source shortest paths problem in undirected graphs with positive integer edge lengths in \(O(m)\) time

Remark: Does not explore nodes in increasing order of distance from \(s\)

Extensions of Dijkstra's algorithm

Dijkstra's algorithm and proof extend to several related problems:

Key algebraic structure: Closed semiring (min-plus, bottleneck, Viterbi, ...) \[ \begin{array}{rcl} a+b & = & b+a \\ a + (b+c) & = & (a+b)+c \\ a + 0 & = & a \\ a \cdot (b \cdot c) & = & (a \cdot b) \cdot c \\ a \cdot 0 & = & 0 \cdot a = 0 \\ a \cdot 1 & = & 1 \cdot a = a \\ a \cdot (b + c) & = & a \cdot b + a \cdot c \\ (a + b) \cdot c & = & a \cdot c + b \cdot c \\ a^* = 1 + a\cdot a^* & = & 1 + a^* \cdot a \end{array}\]

Edsger Dijkstra

What's the shortest way to travel from Rotterdam to Groningen? It is the algorithm for the shortest path, which I designed in about 20 minutes. One morning I was shopping in Amsterdam with my young fiancée, and tired, we sat down on the café terrace to drink a cup of coffee and I was just thinking about whether I could do this, and I then designed the algorithm for the shortest path.

moral implications of shortest-path alg

Greedy Algorithms

Google's Foo.bar Challenge

Group: Google's foo.bar challenge

You have maps of parts of the space station, each starting at a prison exit
and ending at the door to an escape pod.  The map is represented as a matrix
of 0s and 1s, where 0s are passable space and 1s are impassable walls.  The
door out of the prison is at the top-left (0,0) and the door into an escape
pod is at the bottom-right (w - 1, h - 1).

Write a function that generates the length of a shortest path from the prison
door to the escape pod, where you are allowed to remove one wall as part of
your remodeling plans.
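One standard approach (a hedged sketch, not an official solution): run BFS over augmented states \((\mathrm{row}, \mathrm{col}, \mathrm{walls\ removed\ so\ far})\), so each cell is explored at most twice. The path length counts cells, including the start and end cells.

```python
from collections import deque

def shortest_escape(grid):
    """BFS over states (row, col, walls_removed): length of a shortest
    top-left to bottom-right path when at most one wall may be removed.
    Returns the number of cells on the path, or -1 if no path exists."""
    h, w = len(grid), len(grid[0])
    # dist[r][c][k]: fewest cells to reach (r, c) having removed k walls
    dist = [[[None, None] for _ in range(w)] for _ in range(h)]
    dist[0][0][0] = 1
    q = deque([(0, 0, 0)])
    while q:
        r, c, k = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                nk = k + grid[nr][nc]      # stepping into a wall uses the removal
                if nk <= 1 and dist[nr][nc][nk] is None:
                    dist[nr][nc][nk] = dist[r][c][k] + 1
                    q.append((nr, nc, nk))
    best = [d for d in dist[h - 1][w - 1] if d is not None]
    return min(best) if best else -1
```

Since all moves have unit cost, plain BFS on this layered graph suffices; no priority queue is needed.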

Greedy Algorithms

minimum spanning trees (4.5)

paths and cycles

Defn: A path is a sequence of edges which connects a sequence of nodes.

Defn: A cycle is a path with no repeated nodes or edges other than the starting and ending nodes.

path \(P = \{ (1,2), (2,3), (3,4), (4,5), (5,6) \}\)

cycle \(C = \{ (1,2), (2,3), (3,4), (4,5), (5,6), (6,1) \}\)

cuts and cut-sets

Defn: A cut is a partition of the vertices of a graph into two nonempty, disjoint subsets, \(S\) and \(V-S\)

Defn: The cut-set of a cut \(S\) is the set of edges with exactly one endpoint in \(S\)

cut \(S = \{ 4, 5, 8 \}\)

cut-set \(D = \{ (3,4), (3,5), (5,6), (5,7), (7,8) \}\)

quiz 10: greedy algorithms

Consider the cut \(S = \{1,4,6,7\}\). Which edge is in cut-set of \(S\)?


  1. \((5,7)\)
  2. \((1,7)\)
  3. \((2,3)\)
  4. \(S\) is not a cut (not connected)

quiz 11: greedy algorithms

Let \(C\) be a cycle and let \(D\) be a cut-set. How many edges do \(C\) and \(D\) have in common? Choose the best answer.


  1. 0
  2. 2
  3. not 1
  4. an even number

cycle-cut intersection

Proposition: A cycle and a cut-set intersect in an even number of edges

cycle \(C = \{ (1,2), (2,3), (3,4), (4,5), (5,6), (6,1) \}\)

cut-set \(D = \{ (3,4), (3,5), (5,6), (5,7), (7,8) \}\)

intersection \(C \cap D = \{ (3,4), (5,6) \}\)

cycle-cut intersection

Proposition: A cycle and a cut-set intersect in an even number of edges.

Pf by picture:
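The parity claim can also be checked exhaustively: walking around the cycle, every edge in the cut-set switches sides of the cut, and the walk ends where it began, so crossings come in pairs. A brute-force Python check over every cut of the slide's 6-cycle (a sketch, node labels as in the figure):

```python
from itertools import combinations

def cut_set(edges, S):
    """Edges with exactly one endpoint in S."""
    S = set(S)
    return {e for e in edges if (e[0] in S) != (e[1] in S)}

def crossings(cycle_edges, S):
    """Number of cycle edges crossing the cut (S, V - S)."""
    return len(cut_set(cycle_edges, S))

# The cycle 1-2-3-4-5-6-1 from the slide; every cut crosses it an even
# number of times.
cycle = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
nodes = [1, 2, 3, 4, 5, 6]
for r in range(1, len(nodes)):          # all nonempty proper subsets S
    for S in combinations(nodes, r):
        assert crossings(cycle, S) % 2 == 0
```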

spanning tree definition

Defn: Let \(H = (V,T)\) be a subgraph of an undirected graph \(G=(V,E)\). \(H\) is a spanning tree of \(G\) if \(H\) is both acyclic and connected.

graph \(G = (V, E)\)

spanning tree \(H = (V,T)\)

note: \(H\) contains all vertices of \(G\); \(T \subseteq E\)

quiz 12: greedy algorithms

Which of the following properties are true for all spanning trees \(H\)?

  1. Contains exactly \(|V|-1\) edges

  2. The removal of any edge \(e \in T\) disconnects it

  3. The addition of any other edge \(e \in E\) creates a cycle

  4. All of the above.

graph \(G = (V, E)\), spanning tree \(H = (V,T)\)

spanning tree properties

Proposition: Let \(H = (V,T)\) be a subgraph of an undirected graph \(G=(V,E)\). Then, the following are equivalent:

  - \(H\) is a spanning tree of \(G\)
  - \(H\) is acyclic and connected
  - \(H\) is connected and has \(|V|-1\) edges
  - \(H\) is acyclic and has \(|V|-1\) edges
  - \(H\) is minimally connected: removal of any edge disconnects it
  - \(H\) is maximally acyclic: addition of any edge creates a cycle

minimum spanning tree (MST)

Defn: Given a connected, undirected graph \(G=(V,E)\) with edge costs \(c_e\), a minimum spanning tree \((V,T)\) is a spanning tree of \(G\) such that the sum of the edge costs in \(T\) is minimized.

MST cost: \(4 + 6 + 8 + 5 + 11 + 9 + 7 = 50\)

Cayley's theorem: The complete graph on \(n\) nodes has \(n^{n-2}\) spanning trees (cannot solve by brute force)

quiz 13: greedy algorithms

Suppose that you change the cost of every edge in \(G\) as follows to create a new graph \(G'\). For which is every MST in \(G\) an MST in \(G'\) (and vice versa)? Assume \(c_e > 0\) for each \(e\).


  1. \(c'_e = c_e + 17\)

  2. \(c'_e = 17 \cdot c_e\)

  3. \(c'_e = \log_{17} c_e\)

  4. All of the above

applications

MST is fundamental problem with diverse applications

[ Network Flows: Theory, Algorithms, and Applications, by Ahuja, Magnanti, Orlin ]

fundamental cycle

Fundamental cycle: Let \(H=(V,T)\) be a spanning tree of \(G=(V,E)\).

  - For any non-tree edge \(e \in E\): \(T \cup \{e\}\) contains a unique cycle, say \(C\)
  - For any edge \(f \in C\): \((V, T \cup \{e\} - \{f\})\) is a spanning tree

graph \(G = (V, E)\), spanning tree \(H = (V,T)\)

Observation: If \(c_e < c_f\), then \(H\) is not an MST

fundamental cut-set

Fundamental cut-set: Let \(H=(V,T)\) be a spanning tree of \(G=(V,E)\).

  - For any tree edge \(f \in T\): \((V, T - \{f\})\) has two connected components
  - Let \(D\) denote the cut-set of the corresponding cut
  - For any edge \(e \in D\): \((V, T - \{f\} \cup \{e\})\) is a spanning tree

graph \(G = (V, E)\), spanning tree \(H = (V,T)\)

Observation: If \(c_e < c_f\), then \(H\) is not an MST.

the greedy algorithm

Red rule:

  - Let \(C\) be a cycle with no red edges
  - Select an uncolored edge of \(C\) of max cost and color it red

Blue rule:

  - Let \(D\) be a cut-set with no blue edges
  - Select an uncolored edge in \(D\) of min cost and color it blue

Greedy Algorithm:

  - Apply the red and blue rules (nondeterministically!) until all edges are colored
  - The blue edges form an MST

the greedy algorithm

greedy algorithm: proof of correctness

Color invariant: There exists an MST \((V,T^*)\) containing every blue edge and no red edge.

Pf by induction on number of iterations:

Base case: no edges colored ⇒ every MST satisfies invariant.

greedy algorithm: proof of correctness

Color invariant: There exists an MST \((V,T^*)\) containing every blue edge and no red edge.

Pf by induction on number of iterations:

Inductive step (blue): Suppose color invariant true before blue rule

greedy algorithm: proof of correctness

Color invariant: There exists an MST \((V,T^*)\) containing every blue edge and no red edge.

Pf by induction on number of iterations:

Inductive step (red): Suppose color invariant true before red rule

greedy algorithm: proof of correctness

Theorem: The greedy algorithm terminates. Blue edges form MST

Pf: We need to show that either the red or blue rule (or both) applies

greedy algorithm: proof of correctness

Theorem: The greedy algorithm terminates. Blue edges form MST

Pf: We need to show that either the red or blue rule (or both) applies

Greedy Algorithms

prim, kruskal, reverse-delete (4.6)

prim's algorithm

Theorem: Prim's algorithm computes an MST

Pf: Special case of greedy algorithm, where blue rule repeatedly applied to \(S\) (By construction, edges in cut-set are uncolored) ∎

prim's algorithm: implementation

Theorem: Prim's algorithm can be implemented to run in \(O(m \log n)\) time

Pf: Implementation almost identical to Dijkstra's algorithm

Prim(V, E, c):
    S <- {}, T <- {}
    s <- any node in V
    Foreach v != s: pi[v] <- infty, pred[v] <- null; pi[s] <- 0
    Create an empty minimum priority queue pq
    Foreach v in V: Insert(pq, v, pi[v])
    While Is-Not-Empty(pq):
        u <- Del-Min(pq)
        S <- Union(S, { u }), T <- Union(T, { pred[u] })
        Foreach edge e = (u,v) in E with v notin S:
            If c[e] < pi[v]:
                Decrease-Key(pq, v, c[e])
                pi[v] <- c[e]; pred[v] <- e
    Return T
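A Python sketch of the pseudocode above (an illustration, not course-supplied code). As with Dijkstra, `heapq` has no Decrease-Key, so a lazy-deletion queue is used instead:

```python
import heapq

def prim(n, adj):
    """MST of a connected undirected graph via Prim's algorithm.
    adj[u] is a list of (v, cost); returns (total cost, list of tree edges)."""
    INF = float("inf")
    pi = [INF] * n                 # cheapest known cost to attach each node
    pred = [None] * n              # attachment edge for each node
    in_S = [False] * n
    pi[0] = 0
    pq = [(0, 0)]                  # (attachment cost, node); lazy deletion
    total, T = 0, []
    while pq:
        c, u = heapq.heappop(pq)
        if in_S[u]:
            continue               # stale entry
        in_S[u] = True
        total += c
        if pred[u] is not None:
            T.append(pred[u])      # the blue edge chosen by the cut rule
        for v, w in adj[u]:
            if not in_S[v] and w < pi[v]:
                pi[v] = w
                pred[v] = (u, v)
                heapq.heappush(pq, (w, v))
    return total, T
```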

Kruskal's Algorithm

Consider edges in ascending order of cost: Add to tree unless it would create a cycle

Theorem: Kruskal's algorithm computes an MST

Pf: Special case of greedy algorithm

kruskal's algorithm: implementation

Theorem: Kruskal's algorithm can be implemented to run in \(O(m \log m)\) time.

Kruskal(V, E, c):
    Sort m edges by cost and renumber so that c[e1] <= c[e2] <= ... <= c[em]
    T <- {}
    Foreach v in V: UF-Make-Set(v)
    For i = 1 to m:
        (u,v) <- ei
        If UF-Find(u) != UF-Find(v): // u and v in diff components
            T <- Union(T, {ei})
            UF-Union(u, v)           // make u and v in same component
    Return T
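The same algorithm in Python (a sketch; the union-find here uses path compression but, for brevity, not union by rank, which is enough for the \(O(m \log m)\) bound since sorting dominates):

```python
def kruskal(n, edges):
    """MST via Kruskal: edges are (cost, u, v) triples, nodes 0..n-1.
    Returns (total cost, list of tree edges)."""
    parent = list(range(n))

    def find(x):                       # union-find root, with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, T = 0, []
    for c, u, v in sorted(edges):      # ascending order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                   # different components: no cycle created
            parent[ru] = rv
            total += c
            T.append((u, v))
    return total, T
```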

reverse-delete algorithm

Theorem: The reverse-delete algorithm computes an MST

Pf: Special case of greedy algorithm

reverse-delete algorithm

Theorem: The reverse-delete algorithm computes an MST

Pf: Special case of greedy algorithm


Fact: [Thorup 2000] Can be implemented to run in \(O(m \log n (\log \log n)^3)\) time
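For contrast with Thorup's bound, here is a naive Python sketch of reverse-delete: examine edges in decreasing cost order and delete each one unless removal would disconnect the graph. The connectivity test is a fresh DFS per edge, giving \(O(m(m+n))\) time, far from the bound above.

```python
def reverse_delete(n, edges):
    """MST via reverse-delete. edges are distinct (cost, u, v) triples on a
    connected graph with nodes 0..n-1; returns (total cost, surviving edges)."""
    T = sorted(edges, reverse=True)

    def connected(edge_list):
        adj = [[] for _ in range(n)]
        for _, u, v in edge_list:
            adj[u].append(v)
            adj[v].append(u)
        seen, stack = {0}, [0]
        while stack:                   # DFS from node 0
            for w in adj[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == n

    for e in sorted(edges, reverse=True):   # decreasing order of cost
        rest = [f for f in T if f != e]
        if connected(rest):            # e lies on a cycle: red rule applies
            T = rest
    return sum(c for c, _, _ in T), [(u, v) for _, u, v in T]
```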

review: the greedy mst algorithm

Red rule:

  - Let \(C\) be a cycle with no red edges; color a max-cost uncolored edge of \(C\) red

Blue rule:

  - Let \(D\) be a cut-set with no blue edges; color a min-cost uncolored edge in \(D\) blue

Greedy Algorithm:

  - Apply the red and blue rules until all edges are colored; the blue edges form an MST

Theorem: the greedy algorithm is correct.
Special cases: Prim, Kruskal, reverse-delete, ...

group: Find MST of graph

Prim: add any node to \(S\), repeat \(n-1\) times: find min-cost edge with one endpoint in \(S\), add other node to \(S\)

Kruskal: sort edges in inc order, add to tree unless it creates cycle

Reverse-delete: all edges in "tree", sort edges in dec order, del edge from tree unless it would disconnect tree
