Dynamic Programming

Remarks

DP is a method for solving optimization problem by breaking the problem into smaller subproblems.
DP requires that the overall optimal solution depends on the optimal solution to its subproblems.
We might be able to apply DP when:
1. The subproblems can be seen as smaller version of the original problem.
2. The optimal solution of a problem is a function of the optimal solutions to one or more of its subproblems.

Two methods using dynamic programming are top-down memoization and bottom up tabulation.

Top-down memoization is to recurssivly call the method to find solutions to subproblems and cache the results along the way to avoid redundent calculations.
Bottom up tabulation is to solving the smallest subproblem first and then filling up the N dimension optimal solution table.

Classic example for dynammic programming is the fibonacci number calculation.

def fib(n):
    if n <= 1: return 1
    return fib(n - 1) + fib(n - 2)

The recursive relationship here is \(f(n) = f(n - 1) + f(n - 2)\). The above code would work but will be very slow when n increases, due to the redundent calculations in the recursive calls.

With the top down memoization approach, we save the solutions of the subproblems and reuse the reuslts when we don’t need to recalculate.

memo = {0: 1, 1: 1}
def fib_memo(n):
    if n in memo: return memo[n]
    memo[n] = fib_memo(n - 1) + fib_memo(n - 2)
    return memo[n]

With the bottom up tabulation approach, we iterativly fill up the 1 dimensional table up untill we can answer the problem.

def fib_tab(n):
    fibs = [1, 1]
    for i in range(2, n + 1):
        fibs.append(fibs[-1] + fibs[-2])
    return fibs[n]

Patterns

1D DP

Many 1D DP can be solved without explicit 1D array.

Depends on a previous statistic Kadane
Depends on a previous solution
Depends on a couple of previous solutions
Depends on all previous solutions
Expand on the options

Buy and sell stock once max profit is max(running_max, price - previous lowest point)

prev_min_price = prices[0]
max_profit
for price in prices[1:]:
  max_profit = max(max_profit, price - prev_min_price))
  prev_min_price = min(prev_min_price, price)

Maximum subarray sum

max_sum = window_sum = nums[0]
for num in nums[1:]:
  if num > window_sum + num:  window_sum = 0
  window_sum += num
  max_sum = max(max_sum, window_sum)

Counting bits
num i has 1 more bit than i & (i - 1) -> bit_count(i) = bit_count(i & (i - 1)) + 1

dp = [0] * (n + 1)
for i in range(1, n + 1):
    dp[i] = dp[i & (i - 1)] + 1

Fibonacci sequence / climbing stairs at each stair, we either move 1 step up from previous step or move 2 step up from 2 step down…

dp = [1,1]
for i in range(2, n + 1):
    dp[i] = dp[i-1] + dp[i-2]

Min cost tickets

dp = [0] * (days[-1] + 1)
days = set(days)
for i in range(1, len(dp)):
  if i in days:
      dp[i] = min(
          dp[max(i-1, 0)] + costs[0],
          dp[max(i-7, 0)] + costs[1],
          dp[max(i-30, 0)] + costs[2]
      )
  else:
      dp[i] = dp[i-1]

Seat assignment problem

if n <= 2: return 1 / n
dp = [0] * (n + 1)
dp[1] = 1
dp[2] = 1 / 2
accum = 1
for i in range(3, n+1):
  accum += dp[i-1]
  dp[i] = accum / i

Coin Change 2

dp = [1] + [0] * amount
for coin in coins:
  for i in range(amount+1):
      if i + coin > amount: break
      dp[i+coin] += dp[i]

2D DP

Depends on previous row alone
Depends on previous row and current row previous column

Pascal’s Triangle

dp = []
for r in range(1, num_rows + 1):
  row = [None for _ in range(r)]
  row[0], row[-1] = 1, 1
  for j in range(1, r-1):
      row[j] = dp[-1][j-1] + trdpiangle[-1][j]
  triangle.append(row)

Unique Paths

dp = [1] * n
for row, col in product(range(1, m), range(1, n)):
  dp[col] = dp[col] + dp[col-1]

Size of square Submatrices

dp = [[0] * (ncol + 1) for _ in range(nrow + 1)]
for r, c in product(range(1, nrow + 1), range(1, ncol + 1)):
  if matrix[r-1][c-1] == 0: continue
  dp[r][c] = min(dp[r-1][c], dp[r][c-1], dp[r-1][c-1]) + 1

Minimum path sums

nrow, ncol = len(grid), len(grid[0])
dp = [[0] * ncol for _ in range(nrow)]
for r, c in product(range(nrow), range(ncol)):
  if r == 0:   dp[r][c] = sum([grid[0][cc] for cc in range(c + 1)])
  elif c == 0: dp[r][c] = sum([grid[rr][0] for rr in range(r + 1)])
  else:        dp[r][c] = min(dp[r - 1][c] + grid[r][c], dp[r][c - 1] + grid[r][c])

Advanced

Bitmask indexing
Relaxation

Campus bike

INF = 2 ** 32
dp = [[INF] * (1 << len(bikes)) for _ in range(len(workers) + 1)]
dp[0][0] = 0
for i in range(len(workers)):
  for j in range(len(bikes)):
      for bit in range((1<<len(bikes))):
          if bit & 1 << j: continue
          mask = bit | 1 << j
          dp[i+1][mask] = min(
              dp[i+1][mask],
              dp[i][bit] + dist[i][j]
          )