Dynamic Programming Revisited

The core of dynamic programming is recursion with memoization: break a big problem into small sub-problems and compute each sub-problem only once.

Notice that dynamic programming essentially searches the entire solution space.

It’s usually helpful to think about what the last step in your solution is (top-down) and what sub-problems it generates. But when it comes to building the solution, we usually prefer doing it bottom-up. This will be discussed more in later examples.

Weighted Interval Scheduling

Problem statement: Given an array of tasks containing start time, end time, and weight, Arr = [(s_1, e_1, v_1) … (s_n, e_n, v_n)], find a valid schedule that maximizes the total value. A valid schedule is one in which no two tasks i and j overlap.

When it comes to intervals, a common technique is to sort all intervals by their end time. This is the same first step as in unweighted interval scheduling, a special case that can be solved with a greedy algorithm.

After sorting all intervals by their end time, it’s easy to find, for any task i, all the tasks that don’t overlap with it: just binary search for the last task j that finishes before task i starts, i.e. e_j <= s_i. All tasks from index 0 to j, inclusive, do not overlap with task i. We can sort the original task array and find the last non-overlapping index for each task, p(i), in O(n*lg(n)) time.

Then, just consider the last interval in our sorted array. Do we take it or not? What are the subproblems?

Say OPT(i) means the maximum value we can get in intervals 0 to i, inclusive. Then just consider the last interval in our sorted interval array. There are only two cases:

  1. if we take it, we gain a value of v_i, but we lose the opportunity to take any interval from p(i) + 1 to i - 1. The sub-problem becomes: what is the maximum value we can get from intervals 0 to p(i), i.e. OPT(p(i))?
  2. if we drop task i, the sub-problem becomes: what is the maximum value we can get from intervals 0 to i - 1, i.e. OPT(i-1)?

Thus, the recurrence formula is

OPT(i) = Max(OPT(p(i)) + v_i, OPT(i-1))

The base case is simply when we have no intervals left: OPT(-1) = 0

When it comes to building the solution, we compute OPT(0), then OPT(1), and so on, bottom-up.

The recurrence itself takes only O(n) time, since we loop n times and each iteration takes O(1)

Thus the total time complexity (dominated by sorting) is O(n * lg(n)), with O(n) space
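
To make this concrete, here is a minimal Python sketch of the approach (the function name and the (start, end, value) tuple layout are illustrative; bisect performs the binary search for p(i)):

import bisect

def weighted_interval_scheduling(tasks):
    # sort tasks by end time, as described above
    tasks = sorted(tasks, key=lambda t: t[1])
    ends = [t[1] for t in tasks]
    n = len(tasks)

    # p[i]: index of the last task that ends no later than task i starts, or -1
    p = [bisect.bisect_right(ends, tasks[i][0]) - 1 for i in range(n)]

    # opt[i + 1] = OPT(i); opt[0] = OPT(-1) = 0 is the base case
    opt = [0] * (n + 1)
    for i in range(n):
        opt[i + 1] = max(opt[p[i] + 1] + tasks[i][2],  # take interval i
                         opt[i])                       # drop interval i
    return opt[n]

For example, weighted_interval_scheduling([(1, 3, 5), (2, 5, 6), (4, 6, 5)]) returns 10, taking the first and third intervals.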

0/1 Knapsack

Problem statement: given an array of items, each with a weight and a value, Arr = [(w_1, v_1) … (w_n, v_n)], and a total weight constraint W, what is the maximum value we can get such that the total weight is at most W?

This is pretty much the same as the previous problem.

Consider the last item i; there are only two cases:

  1. If we take it, we gain a value of v_i but use up part of the weight constraint, w_i. The subproblem is: what is the maximum value we can get from items 0 to i-1 under the updated constraint W - w_i?
  2. If we drop it, the subproblem is simply: what is the maximum value we can get from items 0 to i-1 under the original constraint W?

Thus, we can define OPT(i, W): the maximum value obtainable from items 0 to i under the constraint W, and our recurrence formula is

OPT(i, W) = Max(OPT(i-1, W-w_i) + v_i, OPT(i-1, W))

The base case is either that we run out of capacity, i.e. W = 0, or that we have no items left, i.e. i = -1. In both cases, the maximum value is just 0

Overall time complexity O(W*n)

Notice this is not an efficient solution in the formal sense, since the time complexity is polynomial in the numeric value of the input W rather than in the input size. Such an algorithm is said to run in pseudo-polynomial time.
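
A minimal Python sketch of this recurrence, filled bottom-up (the 1-d table is the usual space optimization of the 2-d OPT(i, W) table; iterating w downward ensures each item is taken at most once):

def knapsack(items, W):
    # items: list of (weight, value) pairs; W: the total weight constraint
    # dp[w] = maximum value achievable with total weight <= w
    dp = [0] * (W + 1)
    for weight, value in items:
        # iterate w downward so each item is used at most once (0/1)
        for w in range(W, weight - 1, -1):
            dp[w] = max(dp[w], dp[w - weight] + value)
    return dp[W]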

Sequence Alignment

Problem statement: Given two strings, str1 and str2, what is the minimum number of operations (add, replace, and delete) needed to make the two strings identical?

The idea is similar: consider the last char in str1, i.e. str1[i], and the last char in str2, i.e. str2[j]. What are the possible cases and subproblems?

  1. If str1[i] == str2[j], we don’t need to perform any operation; we just need the number of operations needed to align str1[:i-1] and str2[:j-1]. This is a subproblem that can be solved recursively.
  2. If str1[i] != str2[j], we can a) add a char to str1, b) add a char to str2, or c) replace str1[i] or str2[j] to make them match. The subproblem in case a) is the number of operations needed to align str1[:i] and str2[:j-1]; in case b), str1[:i-1] and str2[:j]; in case c), str1[:i-1] and str2[:j-1].

Define OPT(i, j): the number of operations needed to align str1[:i] and str2[:j]

Recurrence formula:
OPT(i, j) = Min(OPT(i-1, j-1) + (0 if str1[i] == str2[j] else 1), OPT(i-1, j) + 1, OPT(i, j-1) + 1)

The base cases are i == -1, where OPT(-1, j) = j + 1, and j == -1, where OPT(i, -1) = i + 1

We can see that OPT(i, j) depends on OPT(i-1, j), OPT(i, j-1), and OPT(i-1, j-1), so we can fill the table either row-wise or column-wise; row i (or column j) depends only on row i-1 (or column j-1).

Time complexity: O(m*n); space: O(m), since we only need to keep one row of the table
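
A minimal Python sketch of the recurrence with the one-row space optimization described above (indexes are shifted by one so that index 0 plays the role of the i == -1 base case):

def edit_distance(str1, str2):
    m, n = len(str1), len(str2)
    prev = list(range(n + 1))  # row for the empty prefix of str1
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            if str1[i - 1] == str2[j - 1]:
                curr[j] = prev[j - 1]              # chars match, no operation
            else:
                curr[j] = 1 + min(prev[j - 1],     # replace
                                  prev[j],         # delete from str1
                                  curr[j - 1])     # add to str1
        prev = curr
    return prev[n]

For example, edit_distance("kitten", "sitting") returns 3.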

Notice that if one needs to reconstruct the actual sequence of operations, the space is no longer linear. To achieve linear space with the same time complexity, we can add divide and conquer on top of this dynamic programming approach. This might be covered in a separate post.

Bellman-Ford: Finding the minimum distance in a graph with negative weights

Problem statement: consider a graph G = (V, E) where some edges e in E can have negative weights. What is the shortest distance from node s to node t?

First, notice that if there is a negative cycle on any path from s to t, the minimum distance is just -inf, since one can go around the cycle as many times as one wants and the distance keeps decreasing. In fact, Bellman-Ford can detect whether the graph contains a negative cycle.

Let’s just assume there are no negative cycles for now. The key underlying observation is that some path achieving the minimum distance always uses at most |V| - 1 edges.

The above holds because if a path contains more than |V| - 1 edges, it must contain a cycle. Since we assume there are no negative cycles in the graph, that cycle must have non-negative weight, and one can always remove it without increasing the s-t distance.

Then a natural question arises: does the minimum-cost path actually need |V| - 1 edges?

  1. If it does, consider all the nodes W = {w_1, … w_k} s.t. (w_i, t) in E, i.e. all nodes with an edge to the destination t. Any s-t path must go through at least one node in W. Since the edge (w_i, t) uses up one edge, the subproblem becomes: what is the minimum-cost path from s to w_i using at most |V|-2 edges?
  2. If it doesn’t, the subproblem is finding the minimum-cost path from s to t using at most |V|-2 edges.

Define OPT(i, v): the minimum cost of a path from s to v using at most i edges

Recurrence formula
OPT(i, v) = Min(OPT(i-1, v), Min over all w s.t. (w, v) in E of OPT(i-1, w) + cost(w, v))

The base cases are: a) we run out of edges, i.e. i == 0, where OPT(0, v) = INF for v != s; and b) v == s, where the cost is 0 since no edge is needed.

Time complexity: O(|V| * |E|). Notice that in each iteration, summing over all nodes, we only consider each node's incoming neighbors, and this is bounded by O(|E|).

We can also improve the space complexity to O(|V|). This might be discussed in a separate post.

Finding negative cycles in the graph is based on whether the minimum cost keeps decreasing. We can perform one more iteration to find the min-cost s-t path using |V| edges. If the min-cost path using |V| - 1 edges is greater than the one using |V| edges, the graph contains a negative cycle. As for picking node s and node t, we can add two new nodes s and t connected to all nodes in the graph and run Bellman-Ford with the extra iteration.
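
A minimal Python sketch, using the O(|V|)-space version mentioned above (nodes are assumed to be 0..n-1 and edges given as (u, v, cost) triples; the extra pass implements the negative-cycle check):

def bellman_ford(n, edges, s):
    INF = float('inf')
    dist = [INF] * n
    dist[s] = 0                          # base case: reaching s costs nothing
    for _ in range(n - 1):               # a shortest path has at most |V| - 1 edges
        for u, v, cost in edges:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
    # one extra iteration: any further improvement implies a negative cycle
    has_negative_cycle = any(dist[u] + cost < dist[v] for u, v, cost in edges)
    return dist, has_negative_cycle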

Summary

So far we have revisited a few classic dynamic programming problems. To solve these problems, it is usually helpful to think of the recurrence formula top-down and build the solution bottom-up. Again, the key to dynamic programming is recursion with memoization: identify the sub-problem that can be solved recursively and come up with a recurrence formula. When explaining a dynamic programming algorithm, it’s usually helpful to make the meaning of each table entry, and the relationship between entries, clear.

It’s very important to understand the principle of an algorithm and its applications instead of learning by rote, memorizing the pseudocode and complexity of individual problems.

I might have made mistakes in some places. Any comments or suggestions are appreciated.


Full Analysis of Finding K Smallest in Sorted/Unsorted Arrays

Problem Statement:

Given an unsorted array of n elements, return the k smallest elements

Method #1: Min-heap

Method 1: Use bottom-up heap construction to build a min-heap, then pop the smallest element k times

Complexity analysis:
    Time to construct min-heap: O(n)
    Time to extract min k times: O(k * lgn)
    Total time complexity: O(n + k * lgn)
    Total space complexity: O(1) extra (not counting the O(k) output)
def k_smallest_1(array, k):
    """
    input: int[] array, int k
    return: int[]

    Notice:
    1. the children of node i are 2*i+1 and 2*i+2
    """

    def swap(i, j):
        array[i], array[j] = array[j], array[i]

    def heapify(i, size):
        left, right = 2 * i + 1, 2 * i + 2
        if left >= size:
            return

        index = left if right >= size or array[left] < array[right] else right
        if array[index] < array[i]:
            swap(i, index)
            heapify(index, size)

    size = len(array)
    # bottom-up heap construction, O(n)
    for i in range(size - 1, -1, -1):
        heapify(i, size)

    res = []
    for _ in range(min(k, size)):
        # move the current min to the end, shrink the heap, restore the heap property
        swap(0, size - 1)
        res.append(array.pop())
        size -= 1
        heapify(0, size)

    return res

Method #2: Max-heap

Method 2: Use a size k max-heap of the first k elements, then loop through the remaining n-k elements; if element e is smaller than the 
largest number in the heap, extract the max element and insert element e into the heap

Complexity analysis:
    Time to construct a size k max-heap: O(k)
    Time to extract-max and insert n-k times: O((n-k) * lgk)
    Time to sort the final k elements by extracting the min k times: O(k * lgk)
    Total time complexity: O(k + (n-k) * lgk + k * lgk) = O(n * lgk)
def k_smallest_2(array, k):
    """
    Notice:
    1. we can use the array[:k] as our max-heap so when we do the heapify, the boundary should be k instead of n
    """

    def swap(i, j):
        array[i], array[j] = array[j], array[i]

    # heapify in max-heap
    def heapify(i):
        left, right = 2 * i + 1, 2 * i + 2
        if left >= k:
            return

        index = left if right >= k or array[left] > array[right] else right
        if array[index] > array[i]:
            swap(i, index)
            heapify(index)

    for i in range(k):
        heapify(k - i - 1)

    for i in range(k, len(array)):
        if array[i] < array[0]:
            swap(i, 0)
            heapify(0)

    return sorted(array[:k])

Method #3: Quick Select

Method 3: Use quick select to find the kth smallest element: randomly pick a pivot and partition the array s.t. all 
elements smaller than or equal to the pivot are on the left and all elements larger than the pivot are on the right. Compare 
the pivot's final index p with k - 1:
    1. if p == k - 1: the pivot is the kth smallest element
    2. if p > k - 1: search for the kth smallest in array[0:p]
    3. if p < k - 1: search for the (k-p-1)th smallest in array[p+1:n]
After we find the kth smallest element as the pivot, return a sorted array of the first k elements.

Complexity analysis:
    Time to find the kth element as pivot:
        - Assume the randomly picked pivot is always the midpoint, i.e. it always separates the array into two 
equal parts. Then the total number of elements we need to examine is O(n + n/2 + n/4 + ...) = O(n).
        - Or we can use the master theorem: a = # of sub-problems = 1, the sub-problem size is n/2 on average, and 
f(n) = O(time to partition) + O(time to combine) = O(n) + O(1) = O(n). Since f(n) dominates n ** log_b(a) = O(1), this 
is case #3: the total time complexity is O(n) on average
    Time to sort the first k elements: O(k * lgk)
    Average total time complexity: O(k * lgk + n)
import random

def k_smallest_3(array, k):
    """
    Notice:
    1. In the actual implementation, since we use left/right index boundaries, we always check whether the pivot's final index is k - 1.

    2. As in quick sort, after picking the pivot, swap it with the rightmost index, partition the remaining elements, and swap the pivot back into its final position

    """
    if k == 0:
        return []

    def swap(i, j):
        array[i], array[j] = array[j], array[i]

    def quick_select(left, right):
        pivot = random.randint(left, right)
        swap(pivot, right)
        l, r = left, right - 1
        while l <= r:
            if array[l] <= array[right]:
                l += 1
            elif array[r] > array[right]:
                r -= 1
            else:
                swap(l, r)
                l += 1
                r -= 1
        swap(l, right)
        if l == k - 1:
            return
        elif l > k - 1:
            quick_select(left, l - 1)
        else:
            quick_select(l + 1, right)

    quick_select(0, len(array) - 1)
    return sorted(array[:k])

Variant #1: Find k smallest elements in m sorted arrays

Method #4: Min heap

Method #4: Initialize a min-heap of size m containing the first element of each array. Execute extract-min k times, 
and after popping an element, add the next element from the same array into the heap

Complexity analysis:
    Time to construct heap: O(m)
    Time to extract-min and insert k times: O(k * lgm)
    Total time complexity: O(m + k * lgm)
import heapq

def kth_smallest_1(matrix, k):
    """
    Notice:
        1. heapq.heapify() transforms the list in place without returning the modified list
        2. syntax for enumerating a list: for idx, item in enumerate(list)
        3. to override the comparator of an object, define __lt__ on its class:
            MyClass.__lt__ = lambda self, other: self.val < other.val
        4. how to make a max-heap:
            push -1 * val into a min-heap
    """
    queue = [(row[0], idx, 0) for idx, row in enumerate(matrix)]
    heapq.heapify(queue)
    res = None
    for i in range(k):
        res, row_idx, col_idx = heapq.heappop(queue)
        if col_idx + 1 < len(matrix[row_idx]):
            heapq.heappush(queue, (matrix[row_idx][col_idx + 1], row_idx, col_idx + 1))
    return res

Method #5: Max-heap

Method #5: Initialize a max-heap of size k and scan through the remaining (N - k) elements (N is the total number of elements 
in the m sorted arrays); if e < root, execute extract-max and insert(e)

Complexity analysis:
    Time to construct max-heap: O(k)
    Time to extract max and insert N - k times: O((N - k) * lg(k))
    
This is slower than method #4, since it is the general method for finding the k smallest in an unsorted array and does not exploit the fact that the arrays are sorted
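
For completeness, a sketch of this general method using Python's heapq (a min-heap) with negated values to simulate the max-heap, as mentioned in the notes above (the helper name is illustrative):

import heapq

def k_smallest_max_heap(array, k):
    if k == 0:
        return []
    # max-heap of the k smallest seen so far, simulated by negating values
    heap = [-x for x in array[:k]]
    heapq.heapify(heap)                  # O(k)
    for x in array[k:]:                  # the remaining N - k elements
        if x < -heap[0]:                 # x is smaller than the current max
            heapq.heapreplace(heap, -x)  # extract-max + insert, O(lg k)
    return sorted(-x for x in heap)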

Variant #2: Sorted Matrix

If the m sorted arrays have the same size n and the columns are also sorted, find the kth smallest number

Method #6: Binary Search

Method #6: Since the minimum number in a matrix of size m*n is matrix[0][0] and the maximum number is matrix[m-1][n-1], we can do binary search within the range [matrix[0][0], matrix[m-1][n-1]]. The kth smallest is equivalent to the smallest number num in that range s.t. the number of elements in the matrix that are less than or equal to num is >= k.

Complexity Analysis:
    To check whether a number num satisfies this property needs only O(m+n) time. Notice that if we know the last positions in row X-1 and row X+1 whose elements are less than or equal to num, i and j, we only need O(i-j) time to check row X.

        row X-1   xxxxxxxxxxxxxxxxxxixxxxxxxxx
        row X     xxxxxxxxxxxxx[    ]xxxxxxxxx
        row X+1   xxxxxxxxxxxxxjxxxxxxxxxxxxxx
    
    X[j] <= X+1[j] <= num and X[i+1] >= X-1[i+1] > num => we only need to examine X[j+1:i+1] to find the index of the largest number smaller than or equal to num in row X. Applying this to every row, the total time for checking all m rows is just O(n + m).

    Total time complexity = O(lg(MAX - MIN) * (m + n))

Follow-up: can we use another binary search to find the boundary within X[j+1:i+1]?
    Time complexity: O(m * lg n), which is worse than O(m + n)
    
Compare with method #4:
    O(lg(MAX - MIN) * (m + n)) vs O(m + k * lgm)
def kth_smallest_3(matrix, k):
    """
    Notice:
        1. we can count num of element le than num by scanning from the top right corner to the bottom left corner
    """
    m, n = len(matrix), len(matrix[0])

    def count_num_le(num):
        cnt = 0
        row_idx, col_idx = 0, n - 1  # start from row 0
        while col_idx >= 0 and row_idx < m:
            if matrix[row_idx][col_idx] <= num:
                cnt += col_idx + 1
                row_idx += 1
            else:
                col_idx -= 1
        return cnt

    left, right = matrix[0][0], matrix[m - 1][n - 1]
    while left < right - 1:
        mid = left + (right - left >> 1)
        cnt = count_num_le(mid)
        if cnt < k:
            left = mid + 1
        else:
            right = mid

    return left if count_num_le(left) >= k else right

Variant #3 (Leetcode #4): Find the median of two sorted arrays

What if m == 2? Find the median; assume arr1 has n1 elements, arr2 has n2 elements, and n = n1 + n2. 

Method #7: Median Cut

Method #7: 
    Consider the definition of the median: a cut that separates the combined elements into two halves of size n // 2.
    
    Let's say a cut i is a number between 0 and n1 that separates arr1 into two parts: LEFT = arr1[0 ~ (i - 1)] and 
RIGHT = arr1[i ~ (n1 - 1)]. Let j = n // 2 - i. We can now add j more elements from arr2 into LEFT s.t. LEFT is the 
combination of arr1[0 ~ (i - 1)] and arr2[0 ~ (j - 1)], where size(LEFT) = i + j = i + (n // 2 - i) = n // 2.
    
                      LEFT             RIGHT
        arr1      0 ~ (i - 1)   |   i ~ (n1 - 1)
        arr2      0 ~ (j - 1)   |   j ~ (n2 - 1)
    
    Notice the range of i is 0 to n1 and the range of j is 0 to n2. The difference between size(LEFT) and size(RIGHT) 
is at most 1 (when n is odd).
    
    For a given cut i, i is a valid cut if 0 <= j <= n2; j < 0 or j > n2 means that for cut i there is no way to 
add j elements from arr2 to make two equal parts.
    
    A valid cut is a median cut if and only if max(LEFT) <= min(RIGHT), i.e. arr1[i - 1] <= arr2[j] 
and arr2[j - 1] <= arr1[i]. 
    
    Then the problem becomes finding a valid cut within range [0, n1] s.t. this cut is a median cut. We can use 
binary search to find such a cut.
    
    For any valid cut i, compare arr1[i-1] vs arr2[j] and arr2[j-1] vs arr1[i]:
        1. if arr1[i-1] <= arr2[j] and arr2[j-1] <= arr1[i]:
            we found the median
        2. if j < 0 or arr1[i-1] > arr2[j]:
            move i to the left
        3. if j > n2 or arr2[j-1] > arr1[i]:
            move i to the right
    Notice that before comparing values, we need to make sure their indexes are valid. The special cases are 
i = 0, i = n1, j = 0 and j = n2.

    When cut i successfully separates the arrays into two halves, we output the median as follows:
        if n is odd:
            min(arr1[i], arr2[j]) is the median: when n is odd, size(RIGHT) = size(LEFT) + 1, so the minimum element 
in RIGHT is the median
        
        if n is even:
            (max(arr1[i - 1], arr2[j - 1]) + min(arr1[i], arr2[j])) / 2 is the median: when n is even, 
size(LEFT) = size(RIGHT), so the median is the average of the two elements across cut i. Notice we also need to make 
sure these indexes are valid.
            
Complexity Analysis:
    O(lg n1) for the binary search over cuts of arr1; binary searching the shorter array gives O(lg(min(n1, n2)))
def find_median_in_two_sorted_array(arr1, arr2):
    n1, n2 = len(arr1), len(arr2)
    n = n1 + n2
    left, right = 0, n1
    while left <= right:
        i = left + (right - left >> 1)
        j = n // 2 - i

        # compare arr1[i-1] and arr2[j] if they exist
        if j < 0 or (i > 0 and j < n2 and arr1[i - 1] > arr2[j]):
            right = i - 1
        # compare arr2[j-1] and arr1[i] if they exist
        elif j > n2 or (i < n1 and j > 0 and arr2[j - 1] > arr1[i]):
            left = i + 1
        else:
            left_max = max(arr1[i - 1] if i > 0 else float('-inf'), arr2[j - 1] if j > 0 else float('-inf'))
            right_min = min(arr1[i] if i < n1 else float('inf'), arr2[j] if j < n2 else float('inf'))
            return right_min if n & 1 else (left_max + right_min) / 2

Variant #4: Find the kth smallest in two sorted arrays

Method #8: K-th Cut

Method #8: This is similar to Method #7. A cut i is a k-th cut if size(LEFT) = k

    Let j = k - i 
                      LEFT             RIGHT
        arr1      0 ~ (i - 1)   |   i ~ (n1 - 1)
        arr2      0 ~ (j - 1)   |   j ~ (n2 - 1)
        
    We can perform binary search to find i in range 0 ~ n1. Notice j must be in range 0 to n2 as well.
    
    Compare arr1[i-1] vs arr2[j] and arr2[j-1] vs arr1[i]:
        if j < 0 or arr1[i-1] > arr2[j]:
            move i to the left
        elif j > n2 or arr2[j-1] > arr1[i]:
            move i to the right
        else:
            return max(arr1[i-1], arr2[j-1])
        Notice we need to check that these indexes are valid before accessing them
def find_kth_smallest_in_two_sorted_1(arr1, arr2, k):
    n1, n2 = len(arr1), len(arr2)
    left, right = 0, len(arr1)
    while left <= right:
        i = left + (right - left >> 1)
        j = k - i
        if j < 0 or (i > 0 and j < n2 and arr1[i - 1] > arr2[j]):
            right = i - 1
        elif j > n2 or (i < n1 and j > 0 and arr2[j - 1] > arr1[i]):
            left = i + 1
        else:
            a_left = float('-inf') if i == 0 else arr1[i - 1]
            b_left = float('-inf') if j == 0 else arr2[j - 1]
            return max(a_left, b_left)

Method #9: Binary Search

Method #9: To find the kth smallest element in the combination of arr1 and arr2, we can compare arr1[k//2 - 1] and 
arr2[k//2 - 1]. We know the total number of elements in arr1[0 : k//2] + arr2[0 : k//2] is 2 * (k//2) <= k:
    if arr1[k//2 - 1] < arr2[k//2 - 1]:
        every element of arr1[0 : k//2] is less than arr2[k//2 - 1], so none of them can be the kth element. Remove 
arr1[0 : k//2] and update k to k - k//2

    if arr1[k//2 - 1] > arr2[k//2 - 1]:
        every element of arr2[0 : k//2] is less than arr1[k//2 - 1], so none of them can be the kth element. Remove 
arr2[0 : k//2] and update k to k - k//2

    if arr1[k//2 - 1] == arr2[k//2 - 1]:
        we found the kth element:
            if k is even, the kth element is arr1[k//2 - 1] = arr2[k//2 - 1]
            if k is odd, the 2 * (k//2) = k - 1 removed elements are all <= this common value, so the kth element 
is min(arr1[k//2], arr2[k//2])
    
    Notice we need to check that these indexes are valid before accessing them
    
Complexity analysis:
    In each iteration, k decreases to k - k//2 until it reaches 1 or one array runs out of elements. The total 
complexity is O(lgk) = O(lg(n1 + n2)) in the worst case
def find_kth_smallest_in_two_sorted_2(arr1, arr2, k):
    n1, n2 = len(arr1), len(arr2)
    a_left, b_left = 0, 0
    while k > 0:
        if a_left >= n1:
            return arr2[b_left + k - 1]
        elif b_left >= n2:
            return arr1[a_left + k - 1]
        elif k == 1:
            return min(arr1[a_left], arr2[b_left])

        idx = k // 2 - 1
        a_val = float('inf') if idx + a_left >= n1 else arr1[idx + a_left]
        b_val = float('inf') if idx + b_left >= n2 else arr2[idx + b_left]
        if a_val >= b_val:
            b_left += k // 2
        elif a_val < b_val:
            a_left += k // 2
        k -= k // 2

Dynamic Programming

Introduction

Dynamic programming is closely related to divide & conquer and has far-ranging applications. They share the same idea of dividing a bigger, complicated problem into smaller, easy-to-solve sub-problems of the same type, so both of them are based on recursion.

The difference: In divide & conquer, sub-problems share the same type with the original problem, but each sub-problem instance is generally distinct. (For example, when doing merge sort, we split the input array into a left half and a right half and sort them individually. The subproblems, sorting the left half and sorting the right half, have the same form as the original problem; the only difference is that the input array’s length is halved. But sorting the left half and sorting the right half are usually not the same instance.) In dynamic programming, exactly the same sub-problem occurs multiple times. To avoid solving the same sub-problem more than once, we can memoize each sub-problem’s result, and before solving one, check whether the same problem has been solved before. Thus every distinct sub-problem will be solved at most once.

To solve a problem using a dynamic programming approach, we normally need a 1-d, 2-d, or more complex array. We also have to figure out the meaning of each entry in that array and how later entries depend on earlier ones.

Compared to divide & conquer, dynamic programming may achieve lower time complexity but usually needs more additional space because of memoization.

Fibonacci numbers

Computing the Fibonacci numbers is a classic and straightforward problem, where dynamic programming can be applied.

Referring to the wiki page, the nth Fibonacci number Fib(n) is the nth number in the Fibonacci series. Its definition:

Fib(0) = 0, Fib(1) = 1

and for n > 1:

Fib(n) = Fib(n-1) + Fib(n-2)

The beginning of the Fibonacci series: 0, 1, 1, 2, 3, 5, 8, 13, 21, …

Our problem is to compute the nth Fibonacci number:

Computing nth Fibonacci number
input: an integer N >= 0
output: the N th Fibonacci number

Imagine we’d like to compute Fib(5). By definition, Fib(5) = Fib(4) + Fib(3) and the problem becomes computing Fib(4) and Fib(3). We can continue doing this until getting Fib(0) and Fib(1):

                           fib(5)   
                     /                \
               fib(4)                  fib(3)   
             /        \              /       \ 
         fib(3)      fib(2)       fib(2)     fib(1)
       /     \       /    \        /    \
  fib(2)   fib(1)  fib(1) fib(0) fib(1) fib(0)
 /      \
fib(1) fib(0)

The naive method is to compute every node in the above tree, a simple divide & conquer algorithm. In this case, the number of nodes we need to compute grows exponentially in n, O(2^n) in the worst case.

However, you might notice that some sub-problems are exactly the same. For example, Fib(3) appears 2 times, Fib(2) appears 3 times, and Fib(1) appears 5 times. If we memoize distinct nodes with their results, we only need to compute O(n) nodes. Also notice that the extra space needed here is O(n).

                           fib(5)   
                     /                \
               fib(4)                  fib(3)   
             /        \            
         fib(3)      fib(2)       
       /     \          
  fib(2)   fib(1)   
 /      \
fib(1) fib(0)

The pseudocode of Fib(n):

Algorithm computing nth Fibonacci number

int Fib(n) {
  int fibSeries[] <-- initialize an array of size n + 2
  fibSeries[0] = 0
  fibSeries[1] = 1
  for i from 2 to n
    fibSeries[i] = fibSeries[i - 1] + fibSeries[i - 2]
  return fibSeries[n]
}
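
The same computation can also be written top-down in Python, with the memoization handled by functools.lru_cache (a minimal sketch):

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # each distinct sub-problem is computed once; repeated calls hit the cache
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)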

One technique often used in dynamic programming is building a 1-d, 2-d, or more complex array to store the result of every distinct sub-problem; arrange the sub-problems in order from simplest to most complicated, and solve them (fill the array) in that order.

Longest Increasing Subsequence

Longest Increasing Subsequence
input: an array of integer
output: the longest increasing subsequence

Subsequence: a sequence obtained from another sequence by deleting zero or more elements while keeping the remaining elements in their original order.

For example, consider sequence 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15

The longest increasing subsequence is 0, 2, 6, 9, 11, 15

Usually, a useful technique to help us think about dynamic programming problems is considering the last element.

Here, we can fix the last element in the sequence, 15, and we have two choices: either 15 is in our longest increasing subsequence, or it is not. If we exclude 15, the question becomes finding the longest increasing subsequence of the remaining elements. If we include 15, since it is the last element, the question becomes finding the longest increasing subsequence of the remaining elements that ends with an element less than 15. So fixing one end (the last element of the subsequence) seems to work.

Let P(n): the length of the longest increasing subsequence of array arr[1...n] ending with arr[n]
P(n) = max(P(i) for all 1 <= i <= n-1 with arr[i] < arr[n]) + 1, or 1 if no such i exists
LIS(arr[1...n]) = max(P(1), ..., P(n))

*LIS stands for the longest increasing subsequence

Now, it looks similar to the Fibonacci numbers, since later values depend on previous ones: to compute P(2) we need P(1); to compute P(3) we need P(1) and P(2); to compute P(n) we need P(1), …, P(n-1). Since the same subproblem arises over and over again, memoization is a plus.

So we will have an array p[1…n] where p[i] stores the length of the LIS ending with arr[i]. Since later values depend on earlier ones, we can fill the array from the beginning.

Algorithm find LIS
int LIS(arr[1...n]) {
  p[1...n] <-- initialize empty array
  p[1] = 1 // base case
  for i from 2 to n
    p[i] = max(p[j]) + 1 for j from 1 to i-1 with arr[j] < arr[i], or 1 if no such j exists
  return max(p) // return the largest element in p[1...n]
}

We are using two loops here, so the time complexity will be O(n^2). Also, O(n) space is used to store array p[1…n].
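
A direct Python translation of the pseudocode above, run on the example sequence (a minimal sketch using 0-based indexes):

def lis_length(arr):
    n = len(arr)
    p = [1] * n  # p[i]: length of the LIS ending at arr[i]
    for i in range(1, n):
        for j in range(i):
            if arr[j] < arr[i]:
                p[i] = max(p[i], p[j] + 1)
    return max(p) if p else 0

print(lis_length([0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]))  # prints 6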

Have fun!

Divide & Conquer

Introduction

Divide and conquer is a basic but powerful technique that is widely used in many algorithms. As you might speculate from its name, the principle of divide and conquer is to recursively divide a complicated problem into a few simple sub-problems of the same type (divide), until they become simple enough and we can solve them quickly (conquer); finally we will combine the sub-result to get the final result. Thus, we’ll usually use recursion when we want to use the divide and conquer algorithm.

Merge Sort Algorithm

Merge Sort is a classic divide and conquer algorithm; let’s look at its program specification:

Algorithm: merge sort
input: an array arr[] of comparable element
output: sorted version of the input array, ascending

In merge sort, we’ll apply the divide and conquer idea. Divide: we recursively divide the array into two parts until we cannot divide anymore. At this point, every sub-array contains only one element and can therefore be considered a sorted array. Conquer: now the question becomes how to merge two sorted arrays (left half and right half) into a single sorted array; this is easy and can be done in O(n) time. To make it clear, we can have another function called merge:

Algorithm: merge
input: two sorted array arr1[], arr2[] of comparable element
output: one sorted array, ascending 

And we can write the pseudocode for merge sort:

arr[] mergesort(arr[]) {
  if (the length of arr[] is 1)
    return arr[]
  left[] <-- copy the left half of arr[]
  right[] <-- copy the right half of arr[]
  left[] = mergesort(left[])
  right[] = mergesort(right[])
  return merge(left[], right[])
}
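
The merge function is specified above but not written out; here is a minimal Python sketch using two pointers:

def merge(arr1, arr2):
    # merge two sorted arrays into one sorted array, O(n) time
    res, i, j = [], 0, 0
    while i < len(arr1) and j < len(arr2):
        if arr1[i] <= arr2[j]:
            res.append(arr1[i])
            i += 1
        else:
            res.append(arr2[j])
            j += 1
    res.extend(arr1[i:])  # at most one of these two is non-empty
    res.extend(arr2[j:])
    return res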

Here is a concrete example from GeeksforGeeks with every step shown in order, assuming our input array is [38, 27, 43, 3, 9, 82, 10].

Algorithm Analysis

Time Complexity: Let n be the number of elements in our input array. Since we recursively divide the array into two parts, the recursion tree has about log(n) levels of splitting followed by log(n) levels of merging. At each splitting level, all we do is copy the current arrays into left and right halves, which takes O(n) time per level. At each merging level, we merge pairs of sorted arrays with two pointers, again O(n) per level. Summing all these up, the total time complexity is O(n*logn)

Space Complexity: we need O(n) extra space to store the left-half and right-half sub-arrays alive at any time. If we also count the space used by recursive calls, we add O(logn) for the stack frames. Thus, the total space complexity is O(n) + O(logn) = O(n)

Conclusion

Similar to merge sort, other classic algorithms like binary search, quick sort, and closest pair of points in 2-d space also use the idea of divide & conquer. Two things need to be noticed here: 1. the subproblems must be easier to solve than the current problem; in other words, dividing needs to reduce the difficulty of the problem. 2. The subproblems need to have the same form so that they can be solved similarly and efficiently, which is time-saving.

Interestingly, what if the same subproblem occurs multiple times during division? In that case, another algorithm, dynamic programming, similar to divide & conquer, can be applied. We’d like to remember the state/result of the subproblem and only solve the same subproblem once instead of multiple times.

Have fun!

Introduction to Algorithm

In computer science, an algorithm is a sequence of well-defined instructions to solve a problem, usually within a finite amount of time and space. It is an essential and indispensable part of computer science, both theoretically and practically.

Here’s a rough schedule of what will be discussed in this topic:

  • Basic Graph Algorithms
    • Breadth-first search & Depth-first search
    • Dijkstra’s shortest path
    • Floyd-Warshall algorithm
    • Prim’s algorithm
  • Divide & Conquer
    • Merge Sort
  • Dynamic Programming
    • Longest Increasing Subsequence
    • Knapsack
    • Travelling Salesman Problem
  • Greedy
  • Network Flow
  • NP Optimization

Program Correctness

What is a program? A program, the implementation of an algorithm, is a sequence of instructions used to solve a problem or perform some calculation. It always comes with a specification, including its input and output: the process of producing the output from the input is what the program carries out.

Before getting started, we will discuss one of the basics of algorithms: program correctness, a method that helps us verify whether an algorithm is correct.

Program correctness is often split into two parts: soundness and termination.

  • Soundness: on every valid input, if the program terminates, then its output is correct.
  • Termination: on every valid input, the program terminates.

This is important as we will frequently use it to determine if our algorithm is correct. One technique that is helpful to prove program correctness is mathematical induction. If you are interested, here are more examples provided by Cornell.

Remember, algorithms can be intriguing and fun to explore! Good luck and have fun!


Credit

Some material in this topic is selected from the coursework of CS577: Introduction to Algorithm and ICPC training material written by Prof. Van Melkebeek @ University of Wisconsin-Madison.