Dynamic Programming (6)

COS 320 - Algorithm Design

Dynamic Programming (6)

Introduction to Dynamic Programming

algorithmic paradigms

Greedy: build up a solution incrementally, myopically optimizing some local criterion.

Divide-and-Conquer: Break up a problem into independent subproblems, solve each subproblem, and combine solution to subproblems to form solution to original problem.

Dynamic Programming: Break up problem into a series of overlapping subproblems, and build up solutions to larger and larger subproblems.

Note: "Dynamic Programming" is a fancy name for caching away intermediate results in a table for later reuse.

dynamic programming history

Bellman: Pioneered the systematic study of dynamic programming in 1950s

Etymology:

Dynamic programming: planning over time
Secretary of Defense was hostile to mathematical research
Bellman sought an impressive name to avoid confrontation

dynamic programming applications

Areas:

bioinformatics
control theory
information theory
operations research
computer science: theory, graphics, AI, compilers, systems, ...
...

dynamic programming applications

Some famous dynamic programming algorithms:

Unix diff for comparing two files
Viterbi for hidden Markov models
De Boor for evaluating spline curves
Smith-Waterman for genetic sequence alignment
Bellman-Ford for shortest path routing in networks
Cocke-Kasami-Younger for parsing context-free grammars
...

Dynamic Programming (6)

weighted interval scheduling

Weighted interval scheduling problem

Job \(j\) starts at \(s_j\), finishes at \(f_j\), and has weight or value \(v_j\)
Two jobs compatible if they don't overlap
Goal: find maximum weight subset of mutually compatible jobs

           a            :   :   :   :   :   :   
:        b      :   :   :   :   :   :   :   :   
:   :   :      c    :   :   :   :   :   :   :   
:   :   :            d          :   :   :   :   
:   :   :   :        e      :   :   :   :   :   
:   :   :   :   :          f        :   :   :   
:   :   :   :   :   :          g        :   :   
:   :   :   :   :   :   :   :        h      :   
0   1   2   3   4   5   6   7   8   9   10  11

earliest-finish-time first algorithm

Earliest finish-time first:

Consider jobs in ascending order of finish time
- Tie-breaking does not matter
Add job to subset if it is compatible with previously chosen jobs

Recall: Greedy algorithm is correct if all weights are 1

Observation: Greedy algorithm fails spectacularly for weighted version

                          b (999)                           :     :     
      a (1)       :     :     :     :     :     :     :     :     :     
:     :     :     :     :     :     :     :           h (1)       :     
0     1     2     3     4     5     6     7     8     9     10    11

weighted interval scheduling

Notation: Label jobs by finishing time: \(f_1 \leq f_2 \leq \ldots \leq f_n\)

Def: \(p(j)\) is largest index \(i<j\) such that job \(i\) is compatible with \(j\)

Ex: \(p(8) = 5, p(7) = 3, p(2) = 0\)

:        1      :   :   :   :   :   :   :   :   
:   :   :      2    :   :   :   :   :   :   :   
           3            :   :   :   :   :   :   
:   :   :   :        4      :   :   :   :   :   
:   :   :          5        :   :   :   :   :   
:   :   :   :   :          6        :   :   :   
:   :   :   :   :   :          7        :   :   
:   :   :   :   :   :   :   :        8      :   
0   1   2   3   4   5   6   7   8   9   10  11

dynamic programming: binary choice

Notation: \(\mathrm{OPT}(j)\) is value of optimal solution to the problem consisting of job requests \(1, 2, \ldots, j\)

Case 1\(^*\): \(\mathrm{OPT}\) selects job \(j\)

Collect profit \(v_j\)
Can't use incompatible jobs \(\{ p(j)+1, p(j)+2, \ldots, j-1 \}\)
Must include optimal solution to problem consisting of remaining compatible jobs \(1, 2, \ldots, p(j)\)

Case 2\(^*\): \(\mathrm{OPT}\) does not select job \(j\)

Must include optimal solution to problem consisting of remaining compatible jobs \(1, 2, \ldots, j-1\)

\[ \mathrm{OPT}(j) = \begin{cases} 0 & \text{if } j = 0 \\ \max \{ v_j + \mathrm{OPT}(p(j)), \mathrm{OPT}(j-1) \} & \text{otherwise} \end{cases} \]

\(^*\)optimal substructure property (proof via exchange argument)

weighted interval scheduling: brute force

\[ \mathrm{OPT}(j) = \begin{cases} 0 & \text{if } j = 0 \\ \max \{ v_j + \mathrm{OPT}(p(j)), \mathrm{OPT}(j-1) \} & \text{otherwise} \end{cases} \]

Brute-Force(n, s1, ..., sn, f1, ..., fn, v1, ..., vn):
    Sort jobs by finish time so that f[1] <= f[2] <= ... <= f[n]
    Compute p[1], p[2], ..., p[n]
    Return Compute-Opt(n)

Compute-Opt(j):
    If j == 0
        Return 0
    Else
        Return max{
            v[j] + Compute-Opt(p[j]),
            Compute-Opt(j-1)
        }

weighted interval scheduling: brute force

Observation: Recursive algorithm fails spectacularly because of redundant subproblems ⇒ exponential algorithm

Ex: Number of recursive calls for family of "layered" instances grows like Fibonacci sequence

     1      :   :   :   :   :   :   :   :   :   
:   :        2      :   :   :   :   :   :   :   
:   :   :   :        3      :   :   :   :   :   
:   :   :   :   :   :        4      :   :   :   
:   :   :   :   :   :   :   :        5      :   
0   1   2   3   4   5   6   7   8   9   10  11

\[p(1)=0, p(j)=j-2\] \[\mathrm{OPT}(j) = \max \left\{ \begin{array}{l} v_j + \mathrm{OPT}(p(j)), \\ \mathrm{OPT}(j-1) \end{array} \right\} \]

weighted interval scheduling: memoization

Memoization: Cache results of each subproblem; lookup as needed

Top-Down(n, s1, ..., sn, f1, ..., fn, v1, ..., vn):
    Sort jobs by finish time so that f[1] <= f[2] <= ... <= f[n]
    Compute p[1], p[2], ..., p[n]
    For j = 1 to n
        M[j] <- empty
    M[0] <- 0
    Return M-Compute-Opt(n)

M-Compute-Opt(j):
    If M[j] is empty
        M[j] <- max{
            v[j] + M-Compute-Opt(p[j]),
            M-Compute-Opt(j-1)
        }
    Return M[j]

weighted interval scheduling: running time

Claim: Memoized version of algorithm takes \(O(n \log n)\) time

Sort by finish time: \(O(n \log n)\)
Computing \(p(\cdot)\): \(O(n \log n)\) via sorting by start time
M-Compute-Opt(j)
- each invocation takes \(O(1)\) time and either
  1. returns an existing value M[j], or
  2. fills in one new entry M[j] and makes two recursive calls
Progress measure \(\Phi\) = number of nonempty entries of M[]
- Initially \(\Phi = 0\), throughout \(\Phi \leq n\)
- Step 2 above increases \(\Phi\) by \(1\) ⇒ at most \(2n\) recursive calls
Overall running time of M-Compute-Opt(n) is \(O(n)\) ∎

Remark: \(O(n)\) if jobs are presorted by finish times

weighted interval scheduling: finding a soln

Q: Dynamic Programming (DP) algorithm computes optimal value. How to find a solution itself?

A: Make a second pass

Find-Solution(j)
    If j = 0
        Return {}
    Else If v[j] + M[p[j]] > M[j-1]
        Return Union({ j }, Find-Solution(p[j]))
    Else
        Return Find-Solution(j-1)

Analysis: number of recursive calls \(\leq n\) ⇒ \(O(n)\)

weighted interval scheduling: bottom-up

Bottom-up dynamic programming: Unwind recursion

Bottom-Up(n, s1, ..., sn, f1, ..., fn, v1, ..., vn):
    Sort jobs by finish time so that f[1]<=f[2]<=...<=f[n]
    Compute p[1], p[2], ..., p[n]
    M[0] <- 0
    For j = 1 to n
        M[j] <- max { v[j] + M[p[j]], M[j-1] }

group: find the weighted interval schedule

Use the dynamic programming solution to solve the weighted interval schedule.

Bottom-Up(n, s1, ..., sn, f1, ..., fn, v1, ..., vn):
    Sort jobs by finish time so that f[1]<=f[2]<=...<=f[n]
    Compute p[1], p[2], ..., p[n]
    M[0] <- 0
    For j = 1 to n
        M[j] <- max { v[j] + M[p[j]], M[j-1] }

:       :       :        a (29) :       :       :       :       :       :       
                     b (29)                     :       :       :       :       
:                c (28)         :       :       :       :       :       :       
:       :        d (68) :       :       :       :       :       :       :       
         e (76)         :       :       :       :       :       :       :       
:       :       :       :       :       :            f (33)     :       :       
:       :       :       :       :                    g (48)             :       
:       :       :       :            h (26)     :       :       :       :       
:       :                    i (3)              :       :       :       :       
:       :       :                        j (71)                 :       :       
1       2       3       4       5       6       7       8       9       0

Dynamic Programming (6)

segmented least squares

least squares

Least squares: Foundational problem in statistics

Given \(n\) points in the plane: \((x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n)\)
Find a line \(y=ax+b\) that minimizes the sum of the squared error:

\[ \mathrm{SSE} = \sum_{i=1}^n (y_i - a x_i - b)^2 \]

Solution: Calculus ⇒ min error is achieved when

\[ a = \frac{n \sum_i x_i y_i - (\sum_i x_i) (\sum_i y_i)}{n \sum_i x_i^2 - (\sum_i x_i)^2}, b = \frac{\sum_i y_i - a \sum_i x_i}{n} \]

segmented least squares

Segmented least squares:

Points lie roughly on a sequence of several line segments
Given \(n\) points in plane: \((x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n)\) with \(x_1 < x_2 < \ldots x_n\), find sequence of lines that minimizes \(f(x)\)

Q: What is a reasonable choice for \(f(x)\) to balance accuracy (goodness of fit) and parsimony (number of lines)?

segmented least squares

Given \(n\) points in the plane: \((x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n)\) with \(x_1 < x_2 < \ldots < x_n\) and a constant \(c > 0\), find a sequence of lines that minimizes \(f(x) = E + c L\):

\(E\) is the sum of the sums of the squared errors in each segment
\(L\) is the number of lines

dynamic programming: multiway choice

Notation

\(\mathrm{OPT}(j)\) is minimum cost for points \(p_1, p_2, \ldots, p_j\)
\(e(i,j)\) is minimum sum of squares for points \(p_i, p_{i+1}, \ldots, p_j\)

To compute \(\mathrm{OPT}(j)\):

Last segment uses points \(p_i, p_{i+1}, \ldots, p_j\) for some \(i\)
Cost is \(e(i,j) + c + \mathrm{OPT}(i-1)\) (optimal substructure property; proof via exchange argument) \[ \mathrm{OPT}(j) = \begin{cases} 0 & \text{if } j=0 \\ \min\limits_{1 \leq i \leq j} \{ e(i,j) + c + \mathrm{OPT}(i-1) \} & \text{otherwise} \end{cases} \]

segmented least squares algorithm

Segmented-Least-Squares(n, p1, ..., pn, c):
    For j = 1 to n
        For i = 1 to j
            Compute the least squares e(i,j) for the segment pi--pj

    M[0] <- 0
    For j = 1 to n
        Find i in [1,j] that minimizes e(i,j) + c + M[i-1]
        M[j] <- e(i,j) + c + M[i-1]

    Return M[n]

segmented least squares analysis

Theorem: The dynamic programming algorithm solves the segmented least squares problem in \(O(n^3)\) time and \(O(n^2)\) space

\[ a = \frac{n \sum_i x_i y_i - (\sum_i x_i) (\sum_i y_i)}{n \sum_i x_i^2 - (\sum_i x_i)^2}, b = \frac{\sum_i y_i - a \sum_i x_i}{n} \]

Pf:

Bottleneck is computing \(e(i,j)\) for \(O(n^2)\) pairs
\(O(n)\) per pair using previous formula ∎

segmented least squares analysis

Theorem: The dynamic programming algorithm solves the segmented least squares problem in \(O(n^3)\) time and \(O(n^2)\) space

\[ a = \frac{n \sum_i x_i y_i - (\sum_i x_i) (\sum_i y_i)}{n \sum_i x_i^2 - (\sum_i x_i)^2}, b = \frac{\sum_i y_i - a \sum_i x_i}{n} \]

Remark: Can be improved to \(O(n^2)\) time and \(O(n)\) space

For each \(i\): precompute cumulative sums
- \(\sum_{k=1}^i x_k\), \(\sum_{k=1}^i y_k\), \(\sum_{k=1}^i x_k^2\), \(\sum_{k=1}^i x_k y_k\)
Using cumulative sums, can compute \(e_{ij}\) in \(O(1)\) time

Dynamic Programming (6)

knapsack problem

Knapsack problem

Given \(n\) objects and a "knapsack"
Item \(i\) weights \(w_i > 0\) and has value \(v_i > 0\)
Knapsack has capacity of \(W\)
Goal: fill knapsack so as to maximize total value

Example: Suppose \(W = 11\) and the weights and values are given in table at right.

Ex: \(\{1,2,5\}\) has value 35

Ex: \(\{3,4\}\) has value 40.

Ex: \(\{3,5\}\) has value 46, but exceeds weight limit.

\(i\)	\(v_i\)	\(w_i\)
1	1	1
2	6	2
3	18	5
4	22	6
5	28	7

knapsack problem

Greedy by value: Repeatedly add item with maximum \(v_i\)

Greedy by weight: Repeatedly add item with minimum \(w_i\)

Greedy by ratio: Repeatedly add item with maximum ratio \(v_i / w_i\)

Observation: None of greedy algorithms is optimal

dynamic programming: false start

Def: \(\mathrm{OPT}(i)\) is max profit subset of items \(1, \ldots, i\)

Case 1: \(\mathrm{OPT}(i)\) does not select item \(i\)

\(\mathrm{OPT}\) selects best of \(\{ 1, 2, \ldots, i-1 \}\) (optimal substructure property; proof via exchange argument)

Case 2: \(\mathrm{OPT}(i)\) selects item \(i\)

Selecting item \(i\) does not immediately imply that we will have to reject other items
Without knowing what other items were selected before \(i\), we don't even know if we have enough room for \(i\)

Conclusion: Need more subproblems!

dynamic programming: adding a new var

Def: \(\mathrm{OPT}(i, w)\) is max profit subset of items \(1, \ldots, i\) with weight limit \(w\)

Case 1: \(\mathrm{OPT}(i,w)\) does not select item \(i\) (possibly \(w_i > w\)?)

Select best of \(\{ 1, 2, \ldots, i-1 \}\) using weight limit \(w\)

Case 2: \(\mathrm{OPT}(i,w)\) selects item \(i\)

Collect value \(v_i\)
New weight limit: \(w - w_i\)
Select best of \(\{1,2,\ldots,i-1\}\) using this new weight limit

\[ \mathrm{OPT}(i,w) = \begin{cases} 0 & \text{if } i=0 \\ \mathrm{OPT}(i-1, w) & \text{if } w_i > w \\ \max \{ \mathrm{OPT}(i-1,w), v_i + \mathrm{OPT}(i-1,w-w_i)\} & \text{otherwise} \end{cases} \]

knapsack problem: bottom-up

Knapsack(n, W, w1, ..., wn, v1, ..., vn):
    For w = 0 to W
        M[0, w] <- 0

    For i = 1 to n
        For w = 1 to W
            If wi > w
                M[i, w] <- M[i-1, w]
            Else
                M[i, w] <- Max { M[i-1, w], vi+M[i-1, w-wi] }

    Return M[n, W]

\[ \mathrm{OPT}(i,w) = \begin{cases} 0 & \text{if } i=0 \\ \mathrm{OPT}(i-1, w) & \text{if } w_i > w \\ \max \{ \mathrm{OPT}(i-1,w), v_i + \mathrm{OPT}(i-1,w-w_i)\} & \text{otherwise} \end{cases} \]

group: knapsack algorithm

\(i\)	\(v_i\)	\(w_i\)
1	1	1
2	6	2
3	18	5
4	22	6
5	28	7

\[ \mathrm{OPT}(i,w) = \begin{cases} 0 & \text{if } i=0 \\ \mathrm{OPT}(i-1, w) & \text{if } w_i > w \\ \max \left\{ \begin{array}{l} \mathrm{OPT}(i-1,w), \\ v_i + \mathrm{OPT}(i-1,w-w_i) \end{array} \right\} & \text{otherwise} \end{cases} \]

\(\mathrm{OPT}(i,w) =\) max profit subset of
items \(1, \ldots, i\) with weight limit \(w\)

\(i\)	subset	1	2	3	4	5	6	7	8	9	10	11
0	\(\{ \}\)	0	0	0	0	0	0	0	0	0	0	0
1	\(\{ 1 \}\)
2	\(\{ 1,2 \}\)
3	\(\{ 1,2,3 \}\)
4	\(\{ 1,2,3,4 \}\)
5	\(\{ 1,2,3,4,5 \}\)

knapsack problem: running time

Running time: There exists an algorithm to solve the knapsack problem with \(n\) items and maximum weight \(W\) in \(\Theta(nW)\) time (weights are integers between \(1\) and \(W\))

Not polynomial in input size!
- \(W\) is an input, but not a size; only a number
- "Pseudo-polynomial"
Solving the maximum version of knapsack problem is NP-Hard, while decision version is NP-Complete (chp 8)
There exists a poly-time algorithm that produces a feasible / approximate solution that has value within 0.01% of optimum (see 11.8)
See Wikipedia for more details

Dynamic Programming (6)

RNA secondary structure

RNA: String \(B = b_1 b_2 \ldots b_n\) over alphabet \(\{ A, C, G, U \}\)

Secondary structure: RNA is single-stranded so it tends to loop back and form base pairs with itself. This structure is essential for understanding the behavior of the molecule.