Here are some of my notes on coding.

Backtracking

Remarks

  • The backtracking algorithm solves the optimization problem by incrementally builds out candidate solutions and backtracks a candidate as soon as we can determine it will not leads to optimal.
  • If we represent the partial solutions as a tree, backtracking is doing depth-first search in this tree.
  • Since it explores all of the tree in worst case, its upper bounded to have exponential time complexity. It usually need some constraints to prune the tree.
  • Very often we would pass both the candidate solution as well as the state of partial solution to the recursive call to speed up checks.

Template

To find one solution

def backtrack(candidate, state):
    if criterion(candidate, state): return candidate
    for choice in choices:
        if not constraints(candidate, choice, state): continue
        candidate, state = expand_candidate(candidate, choice, state)
        if backtrack(candidate): return candidate
        candidate, state = restore_candidate(candidate, choice, state)
    return

solution = backtrack(init_candidate, init_state)

To collect all possible solutions

solutions = set()
def backtrack(candidate, state):
    if criterion(candidate, state):
        solutions.add(candidate)
        return
    for choice in choices:
        if not constraints(candidate, choice, state): continue
        backtrack(expand_candidate(candidate, choice, state))

backtrack(init_candidate, init_state)

LC questions

10 17 22 37 39 40 46 47 77 78 401 784 1307


Binary Search

Remarks

  • The binary search algorithm solves the problem of finding a target value within a sorted array.
  • The algorithm iteratively updating the lower and upper bound positions in the array where the target value may resides, by checking the middle element determined by the current bounds.
  • The algorithm runs in O(logN) time worst case.

Template

To find a the target value in sorted array.

def binary_search(array, target):
    low, high = 0, len(array) - 1
    while low <= high:
        mid = (low + high) // 2
        if array[mid] == target: return mid
        elif array[mid] > target: high = mid - 1
        else: low = mid + 1
    return -1  

note at the end of loop, low is the insertion point for x in a to maintain sorted order.

This algorithm can be generalized to find values in bounds that satisfy a criterion, given that the feasible values are collocated as a contiguous range within the bounds, and that the criterion can give us direction to update the bounds. We can return either the left most or the right most value if so desired.

def binary_search(low, high):  # based on prior knowledge of bounds.
    # note here we are assuming a feasible solution exists in the bounds.  
    def criterion(mid):    
        # to be implemented per question

    while low < high:
        mid = (low + high) // 2
        if criterion(mid): high = mid
        else: low = mid + 1
    return # low for left most low - 1 for right most

python has a standard library bisect for binary search.

LC questions

33 34 35 67 74 153 162 278 374 410 658 704 875 1011 1268 1283


Breadth Frist Search

Remarks

  • The breadth first search algorithm traversals tree / graph by exploring all the immediate child nodes / neighboring nodes first before moving to next depth level.

Template

Some dummy criterion and constraints.

def criterion(node, target_val=4):
    return node and node.val == target_val

def constraints(node):
    return node

Iterative implementation using queue.

def bfs(root):
    queue = deque([root])

    while queue:
        node = queue.popleft()
        if criterion(node): return node

        for next_node in node.childrens:
            if not constraints(next_node): continue
            queue.append(next_node)
    return

result = bfs(root)

LC questions

301 133 200 785


Depth Frist Search

Remarks

  • The depth first search algorithm traversals tree / graph by exploring the node branch / edges as far as possible before backtrack and explore other alternatives.

Template

Iterative implementation using queue.

def dfs(root):
    stack = deque([root])

    while stack:
        node = stack.pop()
        if criterion(node): return node

        for next_node in node.childrens:
            if not constraints(next_node): continue
            stack.append(next_node)
    return

result = dfs(root)

Recursive implementation with a hash map memorizing visited nodes.

visited = set()

def constraints(node):
    return node and node not in visited

def dfs(node):
    visited.add(node)
    if criterion(node): return node

    for next_node in node.neighbors:
        if not constraints(next_node): continue
        result = dfs(next_node)
        if result: return result
    else:
        return

result = dfs(root)

LC questions

133


Linked Lists

Remarks

  • A Linked List is a recursive data structure that is either empty or a node holds an item and a reference to another linked list.
  • Its often useful to keep track of both the head and the tail of a linked list.

Common Operations with Linked Lists

class Node(object):
    def __init__(self, item=None):
        self.item = item
        self.next = None


first = Node('head')
second = Node('shoulders')
third = Node('knees')
forth = Node('toes')
first.next = second
second.next = third
third.next = forth

# Traversal to the end
node = first
while node and node.next:
    node = node.next
last = node

# Insert at the begining.  
prev_first = first
first = Node('hat')
first.next = prev_first

# Remove from the begining.
first = first.next

# Insert at the end.
prev_last = last
last = Node('shoes')
prev_last.next = last

Priority Queue / Heap

Remarks

  • Priority Queue is a collection data structure which supports log(N) time for both insertion of a new element and query the maximum(or minimum) element.
  • Heap is a tree in which the parent node is always larger(or smaller) than its children.
  • Binary heap a heap with the tree been a binary tree.
  • Binary heap queue is binary heap plus the binary tree is complete.
  • Binary heap queue is an implementation for priority queue and it is usually stored as an array.
  • The key operations for binary heap queue are sink and swim.
    • Swim is to move the newly inserted element(in the bottom) up to its rightful location.
    • Sink is to move the newly promoted element(at the top) down to its rightful location.

Implementation

from operator import lt, gt

class Heapq(object):
    def __init__(self, data=None, mode='min'):
        self.mode = mode
        self._heapq = [None] + list(data) if data else [None]
        self.comparator = gt if mode == 'min' else lt
        self._n = len(self._heapq) - 1
        if data: self.heapify()

    def __len__(self):
        return self._n

    def is_empty(self):
        return self._n == 0

    def _compare(self, i, j):
        return self.comparator(self._heapq[i], self._heapq[j])

    def _swap(self, i, j):
        self._heapq[i], self._heapq[j] = self._heapq[j], self._heapq[i]

    def top(self):
        assert not self.is_empty(), 'heapq is empty'
        return self._heapq[1]

    def push(self, x):
        self._heapq.append(x)
        self._n += 1
        self.swim(self._n)

    def pop(self):
        assert not self.is_empty(), 'heapq is empty'
        self._swap(1, self._n)
        self._n -= 1
        self.sink(1)
        return self._heapq.pop()

    def swim(self, k):
        while k > 1 and self._compare(k // 2, k):
            self._swap(k // 2, k)
            k //= 2

    def sink(self, k, n=None):
        n = n if n else self._n
        while k * 2 <= n:
            j = k * 2
            if j < n and self._compare(j, j + 1):
                j += 1
            if not self._compare(k, j):
                break
            self._swap(k, j)
            k = j

    def heapify(self):
        for k in range(self._n // 2, 0, -1):
            self.sink(k)

Queue

Remarks

  • A queue is a collection that is First-In-First-Out(FIFO).

Implementation

An implementation of Queue using linked list.

class Queue(object):
    class Node(object):
        def __init__(self, item=None):
            self.item = item
            self.next = None

    def __init__(self):
        self.head = None
        self.tail = None
        self._n = 0

    @property
    def is_empty(self):
        return not self.head

    def __len__(self):
        return self.n

    def enqueue(self, item):
        prev_tail = self.tail
        self.tail = self.Node(item)
        if self.is_empty:
            self.head = self.tail
        else:
            prev_tail.next = self.tail
        self._n += 1

    def dequeue(self):
        assert not self.is_empty, "Queue is empty"
        item = self.head.item
        self.head = self.head.next
        self._n -= 1
        if self.is_empty:
            self.last = None
        return item

Sorting

Here are implementations of a few array sorting algorithms.

a = list('sortingexample')

1. Selection Sort

  • Iteratively put the ith smallest element at ith location from the left.
  • We maintain a sorted section from the left.
  • In each iteration, we expand the sorted section by swap the smallest element from the unsorted section into the location.
def selection_sort(a):
    n = len(a)
    for i in range(n):
        s = i
        for j in range(i + 1, n):
            if a[j] < a[s]:
                s = j
        a[i], a[s] = a[s], a[i]

2. Bubble Sort

  • Iteratively put the ith largest element to ith location from the end.
  • We maintain a sorted seciton from the right.
  • In each iteration, we compare adjacent pairs of elements and swap larger values to the right until we reach the sorted part.
def bubble_sort(a):
    n = len(a)
    for i in range(1, n):
        for j in range(n - i):
            if a[j] > a[j + 1]:
                a[j + 1], a[j] = a[j], a[j + 1]

3. Insertion Sort

  • Iteratively insert new element into the sorted part.
  • We maintain a sorted section from the left.
  • In each iteration, we sink one new element from the right to the correct location in the sorted part.
def insertion_sort(a):
    n = len(a)
    for i in range(n):
        for j in reversed(range(1, i + 1)):
            if a[j] < a[j - 1]:
                a[j - 1], a[j] = a[j], a[j - 1]

4. Merge Sort

  • Recursively divide the array into two halves, sort both(recursively) and then merge the two sorted array.
  • It follows the Divide-and-Conquer paradigm.
# A quick and dirty non-inplace version
def merge_sort(a):
    def merge(left, right):
        merged = []
        while left and right:
            merged.append(left.pop(0) if left[0] < right[0] else right.pop(0))
        merged.extend(left + right)
        return merged

    if len(a) <= 1: return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    return merge(left, right)

# A in place version
def merge_sort(a):
    temp = a.copy()

    def merge(a, low, mid, high):
        for i in range(low, high + 1):
            temp[i] = a[i]

        i, j = low, mid + 1
        for k in range(low, high + 1):
            if i > mid:
                a[k] = temp[j]
                j += 1
            elif j > high:
                a[k] = temp[i]
                i += 1
            elif temp[j] < temp[i]:
                a[k] = temp[j]
                j += 1
            else:
                a[k] = temp[i]
                i += 1

    # bottom-up version
    # n = len(a)
    # step_size = 1
    # while step_size < n:
    #     for low in range(0, n, 2 * step_size):
    #         mid = low + step_size - 1
    #         high = min(n - 1, low + 2 * step_size - 1)
    #         merge(a, low, mid, high)
    #     step_size *= 2

    # top-down version
    def _merge_sort(a, low, high):
        if low >= high: return
        mid = (low + high) // 2
        _merge_sort(a, low, mid)
        _merge_sort(a, mid + 1, high)
        merge(a, low, mid, high)

    _merge_sort(a, 0, len(a) - 1)

Quick Sort

  • Pick a random pivot item, put it in the right place by moving items smaller than it to its left and items larger than it to its right.
  • Use the pivot item(since its in the right location now), to cut the problem in two halves and recursively sort the whole thing.
import random

# an inplace implementation, with random pivot item instead of shuffle and alwasy pick leftmost as pivot.
def quick_sort(a):
    def partition(l, h):
        if l >= h: return
        p = random.randint(l, h)
        a[l], a[p] = a[p], a[l]
        pivot = a[l]
        i, j = l, h
        while i < j:
            while i < h and a[i] <= pivot: i += 1
            while j > l and a[j] >= pivot: j -= 1
            if j > i and a[i] > a[j]: a[i], a[j] = a[j], a[i]
        a[j], a[l] = a[l], a[j]
        return j

    def sort(l, h):
        if h > l:
            j = partition(l, h)
            sort(l, j - 1)
            sort(j + 1, h)

    sort(0, len(a) - 1)

Heap Sort

  • Heapify the array then poping item one at a time.
  • See Heap section for how heap works.

Stack

Remarks

  • A stack is a collection that is Last-In-First-Out(LIFO).

Implementation

An implementation of Stack using linked list.

class Stack(object):
    class Node(object):
        def __init__(self, item=None):
            self.item = item
            self.next = None

    def __init__(self):
        self.head = None
        self._n = 0

    @property
    def is_empty(self):
        return not self.head

    def __len__(self):
        return self.n

    def push(self, item):
        prev_head = self.head
        self.head = self.Node(item)
        self.head.next = prev_head
        self._n += 1

    def pop(self):
        assert not self.is_empty, "Stack is empty"
        item = self.head.item
        self.head = self.head.next
        self._n -= 1
        return item

Union Find

Remarks

  • Union Find is a data structure keeps track of a set of elements that are partitioned into a number of disjoint subsets.
  • It has two operations union(p, q) and find(p, q), where find tells whether p and q are in the same subset and union merges the subsets containing p and q.
  • It memory usage is O(N), and find and union operations are near O(N).
  • The algorithm can be used to find all connected components in a network.

Template

class UnionFind(object):
    def __init__(self, n):
        self.ids = list(range(n))
        self.sizes = [1] * n

    def root(self, i):
        while i != self.ids[i]:
            self.ids[i] = self.ids[self.ids[i]]  # path compression
            i = self.ids[i]
        return i

    def find(self, p, q):
        return self.root(p) == self.root(q)

    def union(self, p, q):
        rootp, rootq = map(self.root, (p, q))
        small, big = sorted([rootp, rootq], key=lambda x: self.sizes[x])
        self.ids[small] = self.ids[big]
        self.sizes[big] += self.sizes[small]

Lets say there are n nodes in a graph and we encode the connectivity information with adjacency list as connections which looks like [(p1, q1), (p2, q2) ...].
To get the number of connected components:

uf = UnionFind(n)
for p, q in connections:
    uf.union(p, q)
num_components = len(set(uf.root(i) for i in range(n)))

LC questions

323