- Characterize a subproblem structure.
- Define an array to contain the value for each relevant subproblem.
- Formulate a recursive definition for the array.
- Fill in the array in order (bottom-up).
- Compute a solution from the array values.

- F_{1} = 1
- F_{2} = 1
- for n>2, F_{n} = F_{n-1} + F_{n-2}

```cpp
int fib(int n)
{
    if (n < 3)
        return 1;
    else
        return fib(n-1) + fib(n-2);
}
```

This is exponential time: fib(n-1) and fib(n-2) recompute most of the same subproblems, over and over.

The more obvious loop computes F(n) in time which is linear in n (depending on your assumptions about the cost of addition):

```cpp
int fib(int n)
{
    int *f, answer;
    if (n < 3)
        return 1;
    f = new int[n+1];
    f[1] = f[2] = 1;
    for (int i = 3; i <= n; i++)
        f[i] = f[i-1] + f[i-2];
    answer = f[n];
    delete[] f;
    return answer;
}
```

We tend to recoil from this one because it is doing a malloc (or "new"), and malloc/free is a very expensive operation.

However, it turns out to run MUCH faster than the recursive version, for anything but the smallest values of n. A change to the 'if' there could take care of the remaining few special cases where the recursive version might be faster, but isn't really necessary.
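Incidentally, since the loop only ever looks back two entries, a sketch that keeps just the last two values avoids the array (and the allocation) entirely:

```cpp
// Iterative Fibonacci with no array: keep only the previous two values.
int fib_iter(int n)
{
    int a = 1, b = 1;               // F(1) and F(2)
    for (int i = 3; i <= n; i++) {
        int c = a + b;
        a = b;
        b = c;
    }
    return b;                       // F(n), for n >= 1
}
```

Same linear running time, constant space.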

But we know that recursion is good. Especially in the case of things which are defined recursively, such as the Fibonacci sequence, we can write a much simpler program by using recursion.

However, we can also note that something of the recursive solution is still
present in the second version of fib():
the *recurrence*, the formula stating the value of fib(n) in
terms of fib(j) for one or more values of j<n.

"Dynamic programming" inverts the recursion. For some problems (not all, or necessarily even most), even if you have to malloc and free the array each time, it can be much faster to compute all the values in an array and then take the ending value.

Part of what makes the dynamic programming approach more appropriate for fib()
than recursion is the fact that all values of fib(i) for i<n are used. In
other cases we fill in only what we need out of a sparse array, in a hybrid
between dynamic programming and recursion often called "memoization" --
memoization is the process of making a function remember values it has already
returned, so that it never repeats a calculation. This can be done to any
partial extent, such as remembering only the last value returned (and its
corresponding argument), so that if the function gets called with the same
argument twice in a row, it has no work to do the second time. Or we can
remember the last *k* arguments, etc.
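As a sketch of full memoization (assuming a std::map cache; any lookup table would do), fib() might remember every value it has returned:

```cpp
#include <map>

// Memoized Fibonacci: remember every value already computed,
// so each fib(i) is calculated at most once.
long long fib_memo(int n)
{
    static std::map<int, long long> cache;
    if (n < 3)
        return 1;
    std::map<int, long long>::iterator it = cache.find(n);
    if (it != cache.end())
        return it->second;          // seen this argument before: no work to do
    long long result = fib_memo(n-1) + fib_memo(n-2);
    cache[n] = result;
    return result;
}
```

This keeps the recursive shape of the original fib() but runs in linear time, because each argument is computed only once.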

You can carry a certain volume, or perhaps weight, of coins. Your knapsack has some capacity, in some sort of units. You have to walk ten kilometres home so you're not going to carry anything any other way than in your knapsack.

Is it best to fill it up with gold coins or silver coins? Gold coins are worth more, but they are heavier.

Gold coins, though, are worth more money per gram. So if you can carry a set
amount of weight, you would prefer gold coins.
It depends on how they're made, so let's say that the gold coins in this room
are worth *less* money per *millilitre* (or per litre) than the
silver coins. So if you have a capacity of a certain amount of *space*,
you would prefer the silver coins.

In any case, if you stuff your knapsack with all the coins you can of the one which gives you the most value for the cost, you may end up with some space or weight allowance left over.

You may take all the gold coins in the room, for example, and then have to start on the silver coins (unless you want to waste some of the room in your knapsack).

Or you may fill your knapsack until it has no room for any more gold coins,
but isn't *quite* full; perhaps there's space left which is equivalent to
space for 0.8 gold coins. If there are some other coins which are smaller or
lighter, you may be able to fit a few of those.

Perhaps if you leave out one gold coin, that 1.8 gold coins' worth of space can carry enough silver coins such that their value is greater than 1 gold coin, even though we know it's less than 1.8 gold coins.

So there are many considerations involved, and it's not trivial to write a computer program to solve it other than by "brute force", trying all possibilities. That algorithm ends up taking exponential time, so we'd like to do better.

Another example of a related situation is packing a box for moving. The box has a certain volume; or you may have a maximum weight for the box. So you might fill it up with large objects, and then squeeze some smaller objects in there too.

Here is a precise and formal definition of the knapsack problem. Not everything in the examples above applies to the knapsack problem. In particular, we assume we can't run out of a particular kind of item.

Suppose there are n kinds of items, where an item of kind i has weight
w_{i} and value v_{i}, and the knapsack has capacity C.
Find the list of indices I_{0}, I_{1}, ..., I_{k-1} (repetitions allowed)
such that

w_{I_{0}} + w_{I_{1}} + ... + w_{I_{k-1}} <= C

and v_{I_{0}} + v_{I_{1}} + ... + v_{I_{k-1}} is maximal.

Working on a knapsack problem solution:

It's hard even to see how to write the "brute-force" solution. We have to enumerate all candidate solutions, but there's an infinite number, because we can have any number of each item.

Actually we can fill the knapsack with only one kind of item at a time, and thus make a list of the maximum possible numbers of each individual item. This will be a lot of combinations, but it yields the brute-force approach.

The time complexity of the resulting algorithm will be something like the number of kinds of items to the power of the capacity. That is, this is an exponential-time algorithm, which is generally considered unacceptable.

To come up with a recursive or dynamic-programming solution, we need to figure out what kinds of partial solutions are of interest. That is, we need the formula for the reduced parameter in the recursive call.

Subproblems with fewer items or smaller weights don't end up yielding an interesting recursion.

The only thing which leads to a useful sub-problem is smaller capacities.

But it's tricky. If we have an optimal selection of objects for a capacity of ten, and another optimal selection of objects for a capacity of 15, if we combine those two groups we don't necessarily have an optimal selection of objects for a capacity of 25. We have a possible selection of objects for a capacity of 25, but not necessarily the best. For example, there could be an object of great value but a cost of 20, which we couldn't consider in the 10 or 15 cases, but is now an option.

However, this does work the other way around! If we have an optimal solution
for cost 25, and *if* it contains some items which have a total cost of
10, then if we remove those items, we necessarily have an optimal solution for
size 15.

This is not a contradiction because in the case of that valuable item of size 20, the sub-problem under consideration differs! That is, some set of subproblems (in terms of reduced capacity) do constitute a partial solution to the big problem, and some don't.

This consideration of subproblems yields a recurrence. The basis of recursion, or of dynamic programming, is a recurrence. We want to separate the recurrence from our choice of implementation, whether it's recursion, memoization, or dynamic programming, or what.

So we work on the recurrence first:

Let M(j) = maximum value attainable for capacity j

(Knapsack problem solution is M(C))

Recurrence:

M(0) = 0

For j>0, M(j) = max(

M(j-1),

MAX_{i such that w_{i} <= j} of M(j-w_{i}) + v_{i}

)

Here "MAX" is a quantifier and "max" is a two-argument function.

The above recurrence says that the maximum value attainable for capacity j is the maximum of

- the maximum value attainable for capacity j-1 (this represents leaving some empty space, but sometimes that's best)
- for each kind of item 'i', the maximum value attainable for (j-weight[i]), plus the value of an item of type i. This represents taking out one item of type i from the container of capacity j, finding an optimal sub-solution, then putting that item of type i back in to find the optimal solution for j.

Recursive function:

```cpp
int M(int j)
{
    int i, max;
    if (j == 0)
        return 0;
    max = M(j-1);
    for (i = 0; i < n; i++) {
        if (w[i] <= j) {
            int subvalue = M(j - w[i]) + v[i];
            if (subvalue > max)
                max = subvalue;
        }
    }
    return max;
}
```

This is pretty nasty, for the same reason as recursive Fibonacci, but worse -- the same subproblems get recomputed an enormous number of times.

However, the dynamic-programming algorithm for this recurrence turns out to be pretty good!

Dynamic-programming version:

```cpp
M[0] = 0;
for (j = 1; j <= C; j++) {
    M[j] = M[j-1];
    for (i = 0; i < n; i++)
        if (w[i] <= j && M[j - w[i]] + v[i] > M[j])
            M[j] = M[j - w[i]] + v[i];
}
```

This is nice. Just about O(nC): the outer loop runs C times, and each iteration does n units of work.

We're not quite done; this doesn't actually give the solution! This finds the maximum value, but it doesn't find the list of indices, the list of "I want a gold coin, a silver coin, another gold coin, another gold coin."

But as with the statistics-tracking in your simulation, the structure of the above program parallels the actual problem well enough that we can, in a straightforward way, add enough tracking to it to get the actual answer.

We keep another array which for all j, specifies ANY ONE OF the items involved in attaining a value of M[j]. Why is any one good enough? Because we can then look at the subproblem of C minus that item's weight and find another one, and so on.

Thus the solution extraction will be recurrent too.

Let L[j] (for "last") be the last item added in attaining the best value for capacity j.

```cpp
// compute best packing
M[0] = 0;
L[0] = -1;   // signal saying there are no more items in the list
for (j = 1; j <= C; j++) {
    M[j] = M[j-1];
    L[j] = L[j-1];
    for (i = 0; i < n; i++) {
        if (w[i] <= j && M[j - w[i]] + v[i] > M[j]) {
            M[j] = M[j - w[i]] + v[i];
            L[j] = i;
        }
    }
}

// extract solution
for (j = C; L[j] >= 0; j -= w[L[j]])
    printf("%d\n", L[j]);
```

Example: C = 11,

obj | weight | value |
---|---|---|
0 | 3 | 2 |
1 | 4 | 3 |
2 | 5 | 4 |
3 | 7 | 5 |

The filled-up array will be:

index | M | L |
---|---|---|
0 | 0 | -1 |
1 | 0 | -1 |
2 | 0 | -1 |
3 | 2 | 0 |
4 | 3 | 1 |
5 | 4 | 2 |
6 | 4 | 2 |
7 | 5 | 0 |
8 | 6 | 0 |
9 | 7 | 1 |
10 | 8 | 2 |
11 | 8 | 2 |

Notes:

- For 7, we could have used item 3 (weight 7, value 5), but the loop finds item 0 first; they're of equal value.
- For 10, we could do weights 7+3, or 4+5; both yield value 7. But two weight-5 items also fit, and they yield value 8!

The answer will be: 2, 2

(i.e. two items of type 2)

The next example is about matrix multiplication.

Suppose we have a bunch of matrices to multiply together.

A x B x C x D x E

Matrix multiplication is associative, but not commutative.

e.g. we can multiply AxB first, then multiply that result by C; or we can multiply BxC first, then multiply A by that.

But we can't multiply CxB instead. This is what it means for matrix multiplication not to be commutative. It would also often not be what we as computer programmers would think of as "type-correct" -- in multiplying two matrices, the number of columns in the left matrix must equal the number of rows in the right matrix.

There's still substantial choice though, through the associativity.

And while the mathematical answer is the same, it does turn out to matter considerably sometimes which way we do it. The number of little multiplication operations of the matrix elements can be very different.

An i by j matrix times a j by k matrix yields an i by k matrix, in a number of operations proportional to i*j*k.

Example: If A is 10 by 100, B is 100 by 5, C is 5 by 50

Computing (A*B)*C takes time proportional to 5000 + 2500

Computing A*(B*C) takes time proportional to 25000 + 50000

Ten times the work!
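The arithmetic above can be checked directly. The helper mult_cost() below is hypothetical, just encoding the i*j*k cost rule stated earlier:

```cpp
// Cost, in element-multiplications, of multiplying an i-by-j matrix
// by a j-by-k matrix: proportional to i*j*k.
int mult_cost(int i, int j, int k)
{
    return i * j * k;
}

// A is 10 by 100, B is 100 by 5, C is 5 by 50.

// (A*B)*C: cost of A*B, then the 10-by-5 result times C.
int ab_then_c(void) { return mult_cost(10, 100, 5) + mult_cost(10, 5, 50); }

// A*(B*C): cost of B*C, then A times the 100-by-50 result.
int a_then_bc(void) { return mult_cost(100, 5, 50) + mult_cost(10, 100, 50); }
```

ab_then_c() comes to 7500 and a_then_bc() to 75000: ten times the work for the same mathematical answer.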

How many ways are there to parenthesize a product like A x B x C x D x E ?

LOTS. It grows roughly as 4^{n} (these counts are the Catalan numbers).
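A small sketch that merely counts the parenthesizations (the hypothetical count_orders() below computes the Catalan numbers) shows how fast the count grows:

```cpp
// Count the ways to fully parenthesize a product of n matrices.
// These are the Catalan numbers: 1, 1, 2, 5, 14, 42, ...
long long count_orders(int n)
{
    if (n <= 2)
        return 1;   // one matrix, or only one way to multiply two
    long long total = 0;
    // The last multiplication splits the chain after matrix k-1;
    // each side can be parenthesized independently.
    for (int k = 1; k < n; k++)
        total += count_orders(k) * count_orders(n - k);
    return total;
}
```

Already for ten matrices there are 4862 orders, and the count roughly quadruples with each additional matrix.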

Dynamic programming to the rescue!

There is a good dynamic programming solution for figuring out the optimal order for doing the multiplying.

Recurrence:

The optimal order for multiplying A_{0} * A_{1} * ... * A_{n-1}
ends with a multiplication of

- the optimal multiplication of A_{0} * ... * A_{k-1}, times

- the optimal multiplication of A_{k} * ... * A_{n-1},

for some k.

There must be some last multiplication; the optimal solution for multiplying
the *n* matrices together is to multiply A_{0} through
A_{k-1} together, and multiply A_{k} through A_{n-1}
together, then multiply the two results.
These subproblems must in turn
be optimal for those lists of matrices, else we could get
a better overall result by a better subresult.

Let array 's' be such that matrices A_{0} through A_{n-1} have row
dimension s_{0}
through s_{n-1} and column dimension s_{1} through
s_{n}, respectively.

That is, the number of columns in, say, matrix A_{3} is s_{4}, which is also
defined as the number of rows in matrix A_{4}; those two numbers must be equal
for the product to be defined. The only point of mentioning the column
dimensions at all is to define s_{n} as the number of columns in the rightmost
matrix (which isn't equal to anything else, just like the number of rows in
the leftmost matrix).

Let cost[i][j] be the minimum cost of multiplying A_{i} * ... *
A_{j-1}

Base case: cost[i][i+1] = 0, for all i

(We don't use cost[i][i], because by that definition of the cost array, it means nothing.)

cost[i][j] = MIN_{i<k<j} of cost[i][k] + cost[k][j] + s[i]*s[k]*s[j]

Recursive version (bad, but a guide to writing the dynamic-programming version):

```cpp
// Cost of multiplying from i to j (semi-inclusive), given array of sizes 's'
int r_mat_prod(int i, int j, int *s)
{
    int k, mincost;
    if (j == i + 1)
        return 0;
    k = i + 1;
    mincost = r_mat_prod(i, k, s) + r_mat_prod(k, j, s) + s[i]*s[k]*s[j];
    for (k++; k < j; k++) {
        int subcost = r_mat_prod(i, k, s) + r_mat_prod(k, j, s) + s[i]*s[k]*s[j];
        if (subcost < mincost)
            mincost = subcost;
    }
    return mincost;
}
```

Dynamic version:

Again, cost[i][j] will be the cost of multiplying A_{i} * ... * A_{j-1}:

```cpp
int i, len;

for (i = 0; i < n; i++)
    cost[i][i+1] = 0;

/* above loop took care of the len==1 cases */
for (len = 2; len <= n; len++) {
    for (i = 0; i + len <= n; i++) {
        int j = i + len;
        int k = i + 1;
        cost[i][j] = cost[i][k] + cost[k][j] + s[i]*s[k]*s[j];
        for (k++; k < j; k++)
            if (cost[i][k] + cost[k][j] + s[i]*s[k]*s[j] < cost[i][j])
                cost[i][j] = cost[i][k] + cost[k][j] + s[i]*s[k]*s[j];
    }
}
```

As in the knapsack example above, this finds the optimal order but doesn't actually tell us what it is! Fortunately, also as in the knapsack example, we can easily insert appropriate tracking so that we can tell what the optimal order is:

```cpp
int i, len;

for (i = 0; i < n; i++) {
    cost[i][i+1] = 0;
    best[i][i+1] = i;
}

for (len = 2; len <= n; len++) {
    for (i = 0; i + len <= n; i++) {
        int j = i + len;
        int k = i + 1;
        best[i][j] = k;
        cost[i][j] = cost[i][k] + cost[k][j] + s[i]*s[k]*s[j];
        for (k++; k < j; k++) {
            if (cost[i][k] + cost[k][j] + s[i]*s[k]*s[j] < cost[i][j]) {
                cost[i][j] = cost[i][k] + cost[k][j] + s[i]*s[k]*s[j];
                best[i][j] = k;
            }
        }
    }
}
```

Now, to output it we can use a recursive function:

```cpp
void output_order(int i, int j)
{
    int k = best[i][j];
    if (i < k) {
        // best[i][j] splits the product into A_i..A_{k-1} and A_k..A_{j-1};
        // print the best order for each part, parenthesized.
        printf("(");
        output_order(i, k);
        printf(") * (");
        output_order(k, j);
        printf(")");
    } else {
        printf("A%d", i);   // a range of length one: just one matrix
    }
}
```

The initial call will be output_order(0, n), covering the entire product A_{0} * ... * A_{n-1}.
