Coding Notes
Backtracking
Remarks
 The backtracking algorithm solves the optimization problem by incrementally building out candidate solutions and backtracks a candidate as soon as we can determine it will not leads to optimal.
 If we represent the partial solutions as a tree, backtracking is doing depthfirst search in this tree.
 Since it explores all of the tree in worst case, its upper bounded to have exponential time complexity. It usually need some constraints to prune the tree.
 Very often we would pass both the candidate solution as well as the state of partial solution to the recursive call to speed up checks. Or in some cases, we can mark the problem space to indicate candidate, so that we don’t need to carry the history around.
Template
To find one solution
def backtrack(candidate, state):
if criterion(candidate, state): return candidate
for choice in choices:
if not constraints(candidate, state, choice): continue
result = backtrack(expand(candidate, state, choice))
if result: return result
return
solution = backtrack(init_candidate, init_state)
To collect all possible solutions
solutions = set()
def backtrack(candidate, state):
if criterion(candidate, state): return solutions.add(candidate)
for choice in choices:
if not constraints(candidate, state, choice): continue
backtrack(expand(candidate, state, choice))
backtrack(init_candidate, init_state)
Binary Search
Remarks
Remarks
 The binary search algorithm solves the problem of finding a target within a sorted sequence.
 The algorithm iteratively updating the lower and upper bound of the search space where the target is sure to reside. The update is done by checking the middle element of search space.
 The algorithm runs in O(logN) time worst case.
 We can apply binary search on the domain of a monotonic function.
 The given sequence as a whole may not be ordered, but if we know certain subsequence is ordered, we can apply binary search on the subsequence.
Usage
 To find a value in array:
def binary_search(arr, t): l, h = 0, len(arr)  1 while l <= h: m = (h + l) // 2 # use l + (h  l) // 2 for non python if arr[m] == t: return m elif arr[m] > t: h = m  1 else: l = m + 1 return 1
In case the target value occurrs more than once, this binary search code will find one, but not guarantee to be either leftmost or rightmost.
 To find the left most occurrence of the value:
def binary_search_left(arr, t): l, h = 0, len(arr) while l < h: m = (h + l) // 2 if arr[m] < t: l = m + 1 else: h = m return l
In the case the target value does not exist in the array, the index returned by this code points to the first element that is larger than the target value. Thus it is the location where we can insert the target value while maintain the sorted order of the sequence.
That is to say, if binary_search_left(arr, t)
returns i
, then arr[:i] < t
and arr[i:] >= t
.
 To find the right most occurrence of the value:
def binary_search_right(arr, t): l, h = 0, len(arr) while l < h: m = (h + l) // 2 if arr[m] <= t: l = m + 1 else: h = m return l  1
In the case the target value does not exist in the array, the index returned by this code points to the last element that is smaller than the target value. Thus insert the target value at this location will violate the sorted order.
That is to say, if binary_search_right(arr, t)
returns i
, then arr[:i+1] <= t
and arr[i+1:] > t
.
Binary Tree Traversals
Remarks
DFS traversals
 Recurrsive implementation of the tree dfs traversals are pretty straightforwardly easy.
 With a hashmap, the iterative implementation can be very easy, but it need extra O(n) space for the hash.
 Otherwise, the iterative version of the three dfs traversals can be quite different and nasty.
def inorder(node):
if node.left: inorder(node.left)
print(node.val)
if node.right: inorder(node.right)
def postorder(node):
if node.left: postorder(node.left)
if node.right: postorder(node.right)
print(node.val)
def preorder(node):
print(node.val)
if node.left: preorder(node.left)
if node.right: preorder(node.right)
def tree_dfs_iterative(root):
stack = [root]
visited = set()
while stack:
node = stack.pop()
if not node: continue
if node in visited:
print(node.val)
else:
visited.add(node)
# stack.extend([node.right, node, node.left]) # inorder
# stack.extend([node, node.right, node.left]) # postorder
stack.extend([node.right, node.left, node]) # preorder
def inorder_iterative(root):
node, stack = root, []
while stack or node:
while node:
stack.append(node)
node = node.left
node = stack.pop()
print(node.val)
node = node.right
def preorder_iterative(root):
stack = [root]
while stack:
node = stack.pop()
if not node: continue
print(node.val)
stack.append(node.right)
stack.append(node.left)
# this version has to cut links, but reads more like the previous two order in code
def postorder_iterative(root):
node, stack = root, []
while stack or node:
if node:
stack.append(node)
node = node.left
elif stack[1].right:
node = stack[1].right
stack[1].right = None
else:
print(stack.pop().val)
# the double push version works better, but harder to explain what's going on.
# basically, the first bush is for backtrack, the second push is to traversal.
def postorder_iterative(root):
stack = [root, root]
while stack:
node = stack.pop()
if stack and stack[1] is node:
if node.right:
stack.extend([node.right] * 2)
if node.left:
stack.extend([node.left] * 2)
else:
print(node.val)
BFS traversal
 BFS traversal with queue is quite straightforward.
def tree_bfs_iterative(root):
queue = deque([root])
while queue:
node = queue.popleft()
if not node: continue
print(node.val)
queue.extend([node.left, node.right])
Bit Manipulation
Reference
4 bit signed integer decimal to binary table
decimal  binary  decimal  binary 

8  1000  7  0111 
7  1001  6  0110 
6  1010  5  0101 
5  1011  4  0100 
4  1100  3  0011 
3  1101  2  0010 
2  1110  1  0001 
1  1111  0  0000 
Operators
operator  name  what it does  example  sample property / usage 

~  complement  switch each 0 to 1 and each 1 to 0  ~0101 = 1010  ~x = x  1 and ~(x  1) = x 
&  and  1 iff both bits are 1 else 0  0110 & 0101 = 0100  x &  x to extract last bit, x & (x  1) to clear last bit 
  or  0 iff both bits are 0 else 1  0110  0101 = 0111  flags = x to set the flag 
^  xor (eXclusive OR)  1 iff only one bit is 1 else 0  0110 ^ 0101 = 0011  0 ^ x = x , x ^ x = 0 simple checksum reduce(operator.xor, words, 0) 
«  left shift  shift bits to left  0011 « 2 = 1100  x << y is \(x \times 2^y\) 
»  right shift  shift bits to right  0110 » 2 = 0001  x >> y is \(\lfloor{x \div 2^y}\rfloor\) 
Usages
 Set a bit at a position
def set_bit(x, pos): mask = 1 << pos return x  mask
 Clear a bit at a position
def clear_bit(x, pos): mask = ~(1 << pos) return x & mask
 Clear leading bits from position
def clear_leading_bits(x, pos): mask = (1 << pos)  1 return x & mask
 Clear trailing bits after position
def clear_trailing_bits(x, pos): mask = (1 << 31)  1 << pos return x & mask
 Flip a bit at position
def flip_bit(x, pos): mask = 1 << pos return x ^ mask
 Check a bit at position
def check_bit(x, pos): return x >> pos & 1
 Check number is even
def check_even(x): return (x & 1) == 0
 Check number is power of 2
def check_pow_2(x): return x & (x  1) == 0
 Check two numbers are of opposite sign
def check_opposite_sign(x, y): return x ^ y < 0
 Swap integers the bit manipulation way
def swap(x, y): x = x ^ y y = x ^ y x = x ^ y return x, y
Breadth First Search
Remarks
 The breadth first search algorithm traversals tree / graph by exploring all the immediate child nodes / neighboring nodes first before moving to next depth level.
 BFS is good for shortest path findings.
Template
Some dummy criterion and constraints.
def criterion(node, target_val=4):
return node and node.val == target_val
def constraints(node):
return node
Iterative implementation using queue.
def bfs(root):
queue = deque([root])
while queue:
node = queue.popleft()
if criterion(node): return node
for child in node.childrens:
if not constraints(child): continue
queue.append(child)
result = bfs(root)
Depth First Search
Remarks
 The depth first search algorithm traversals tree / graph by exploring the node branch / edges as far as possible before backtrack and explore other alternatives.
 DFS can be used to solve problems like:
 Reachability.
 Path finding.
 Topological sort.
 Directed cycle detection.
Template
Iterative implementation using queue.
def dfs(root):
stack = deque([root])
while stack:
node = stack.pop()
if criterion(node): return node
for child in node.childrens:
if not constraints(child): continue
stack.append(child)
result = dfs(root)
Recursive implementation with a hash map memorizing visited nodes.
visited = set()
def constraints(node):
return node and node not in visited
def dfs(node):
visited.add(node)
if criterion(node): return node
for child in node.childrens:
if constraints(child):
result = dfs(child)
if result: return result
result = dfs(root)
Dynamic Programming
Remarks
 DP is a method for solving optimization problem by breaking the problem into smaller subproblems.
 DP requires that the overall optimal solution depends on the optimal solution to its subproblems.
 We might be able to apply DP when:
 The subproblems can be seen as smaller version of the original problem.
 The optimal solution of a problem is a function of the optimal solutions to one or more of its subproblems.
Two methods using dynamic programming are topdown memoization and bottom up tabulation.
 Topdown memoization is to recurssivly call the method to find solutions to subproblems and cache the results along the way to avoid redundent calculations.
 Bottom up tabulation is to solving the smallest subproblem first and then filling up the N dimension optimal solution table.
Classic example for dynammic programming is the fibonacci number calculation.
def fib(n):
if n <= 1: return 1
return fib(n  1) + fib(n  2)
The recursive relationship here is \(f(n) = f(n  1) + f(n  2)\). The above code would work but will be very slow when n increases, due to the redundent calculations in the recursive calls.
With the top down memoization approach, we save the solutions of the subproblems and reuse the reuslts when we don’t need to recalculate.
memo = {0: 1, 1: 1}
def fib_memo(n):
if n in memo: return memo[n]
memo[n] = fib_memo(n  1) + fib_memo(n  2)
return memo[n]
With the bottom up tabulation approach, we iterativly fill up the 1 dimensional table up untill we can answer the problem.
def fib_tab(n):
fibs = [1, 1]
for i in range(2, n + 1):
fibs.append(fibs[1] + fibs[2])
return fibs[n]
Linked Lists
Remarks
 Linked List is a sequence of node objects where each node store pointers to (1) some other node in the sequence, (2) the data this node is associated with.
 Linked List can be used as the underline data structure for many common abstract data types like lists, stacks, queues, etc.
operation  time 

push at front  O(1) 
pop at front  O(1) 
push at back  O(1) 
pop at back  O(N) 
access via index  O(N) 
insert in middle  O(N) 
delete in middle  O(N) 
swap two nodes  O(N) 
Common operations on singly linked list.

Node class for singly linked list. ```python class ListNode(object): def init(self, item=None): self.item = item self.next = None
def repr(self): return ‘ ‘ .join([super().repr()[:1], ‘of’, str(self.item), ‘ >’])
nodes = list(map(ListNode, [‘Head’, ‘Shoulders’, ‘Knees’, ‘Toes’])) for prev, next_ in zip(nodes, nodes[1:]): prev.next = next_
head = nodes[0] tail = nodes[1] del nodes
2. Traversal the linked list.
```python
def traversal(head, verbose=0):
trav = head
n = 0
while trav:
if verbose: print(trav.item)
if not trav.next: tail = trav
trav = trav.next
n += 1
return tail, n
# def traversal(head, k):
# def traversal(head, val):
assert traversal(head) == (tail, 4)
 Push and pop node at front. ```python def push_front(head, node): node.next = head return node
def pop_front(head): return head.next
node = ListNode(‘Hat’)
head = push_front(head, node)
assert head == node and traversal(head)[1] == 5
head = pop_front(head) assert head == node.next and traversal(head)[1] == 4
4. Push and pop node at back.
```python
def push_back(tail, node):
tail.next = node
return node
def pop_back(head, tail):
trav = head
while trav.next is not tail: trav = trav.next
trav.next = None
tail = trav
return tail
node = ListNode('Shoes')
tail = push_back(tail, node)
assert tail == node and traversal(head)[1] == 5
tail = pop_back(head, tail)
assert traversal(head)[1] == 4 and tail.next == None
 Insert and delete nodes in the middle. ```python def get(head, i): trav = head prev = None while trav and i: i = 1 prev = trav trav = trav.next return prev, trav
def insert(head, i, node): prev, next_ = get(head, i) prev.next = node node.next = next_
def erase(head, i): prev, curr = get(head, i) prev.next = curr.next
node = ListNode(‘Belly Button’) insert(head, 2, node) assert get(head, 2)[1] == node and traversal(head)[1] == 5
erase(head, 2) assert get(head, 2)[1] == node.next and traversal(head)[1] == 4
6. Swap nodes
```python
def swap(head, i, j):
i, j = sorted([i, j])
prev_i, node_i = get(head, i)
prev_j, node_j = get(node_i, ji)
if prev_i: prev_i.next = node_j
prev_j.next = node_i
temp = node_i.next
node_i.next = node_j.next
node_j.next = temp
return head if prev_i else node_j
old_head = head
head = swap(head, 1, 0)
head = swap(head, 2, 3)
head = swap(head, 3, 2)
head = swap(head, 0, 1)
assert head == old_head
 Reverse Linked List
def reverse(head): prev = None trav = head while trav: # trav.next, prev, trav = prev, trav, trav.next next_ = trav.next trav.next = prev prev = trav trav = next_ return prev
Common Problems
 Two Pointers
 left, right  Palindrome
 slow, fast  cycle, last k
 Multiple Lists
 Merge, Intersection
 Search and Sorting
 Implement Stack/Queue/Deque etc.
 Doubly Linked List + Hash Table
Priority Queue / Heap
Remarks
 Priority Queue is a Queue, in which elements will be dequeued based on their priorites(other than timestamp).
 Elements on the queue should be comparable, otherwise we can’t deduce their ralative priorities.
 Priority Queue can be implemented with heap.
 Heap is a tree in which the parent is always larger(max heap) or smaller(min heap) than its children. (Heap property)
 We can implement priority queue using complete binary heap.
Time complexity of a Binary Heap Queue
 operation  time   —————  —:   construct  O(N)   front  O(1)   dequeue  O(logN)   enqueue  O(logN) 
Implementation
class HeapQueue(object):
from operator import lt, gt
comparators = {'min': lt, 'max': gt}
def __init__(self, data=None, comparator='min'):
self._heapq = [None] + list(data) if data else [None]
self._comparator = self.comparators.get(comparator, comparator)
self._n = len(self._heapq)  1
if data: self.heapify()
def _compare(self, i, j):
return self._comparator(self._heapq[i], self._heapq[j])
def __len__(self):
return self._n
def is_empty(self):
return self._n == 0
def _swap(self, i, j):
self._heapq[i], self._heapq[j] = self._heapq[j], self._heapq[i]
def front(self):
assert not self.is_empty(), 'heapq is empty'
return self._heapq[1]
def enqueue(self, x):
self._heapq.append(x)
self._n += 1
self.swim(self._n)
def dequeue(self):
assert not self.is_empty(), 'heapq is empty'
self._swap(1, self._n)
self._n = 1
self.sink(1)
return self._heapq.pop()
def swim(self, k):
while k > 1 and self._compare(k, k >> 1):
self._swap(k, k >> 1); k >>= 1
def sink(self, k):
n = self._n
while k * 2 <= n:
j = k * 2
if j < n and self._compare(j + 1, j): j += 1
if self._compare(k, j): break
self._swap(k, j); k = j
def heapify(self):
for k in range(self._n >> 1, 0, 1):
self.sink(k)
Queue
Remarks
 A queue is a abstract data type that is FirstInFirstOut(FIFO).
Implementation
An implementation of Queue using linked list.
class Queue(object):
class Node(object):
def __init__(self, item=None):
self.item = item
self.next = None
def __init__(self):
self.first = None
self.last = None
self._n = 0
@property
def is_empty(self):
return not self.first
def __len__(self):
return self.n
def peek(self):
return self.head.item
def enqueue(self, item):
prev = self.last
self.last = self.Node(item)
if self.is_empty():
self.first = self.last
else:
prev.next = self.last
self._n += 1
def dequeue(self):
assert not self.is_empty(), "Queue is empty"
item = self.first.item
self.first = self.first.next
self._n = 1
if self.is_empty():
self.last = None
return item
An implementation of Queue using fixed length array and pointer. Note the formula for pointer updates:
 Enqueue: (first + n) % capacity
 Dequeue: (first + 1) % capacity
class Queue(object):
def __init__(self, size):
self.size = size
self.queue = [None] * size
self.first = 0
self.n = 0
def enqueue(self, item):
assert not self.is_full(), 'Queue is full'
self.queue[(self.first + self.n) % self.size] = value
self.n += 1
def dequeue(self):
assert not self.is_empty(), 'Queue is empty'
item = self.queue[self.first]
self.queue[self.first] = None
self.first = (self.first + 1) % self.size
self.n = 1
return item
def peek(self):
return self.queue[self.first]
def is_empty(self):
return self.n == 0
def is_full(self):
return self.n == self.size
Sorting
Here are implementations of a few array sorting algorithms.
a = list('sortingexample')
1. Selection Sort
 Iteratively put the ith smallest element at ith location from the left.
 We maintain a sorted section from the left.
 In each iteration, we expand the sorted section by swap the smallest element from the unsorted section into the location.
def selection_sort(a):
n = len(a)
for i in range(n):
s = i
for j in range(i + 1, n):
if a[j] < a[s]:
s = j
a[i], a[s] = a[s], a[i]
2. Bubble Sort
 Iteratively put the ith largest element to ith location from the end.
 We maintain a sorted seciton from the right.
 In each iteration, we compare adjacent pairs of elements and swap larger values to the right until we reach the sorted part.
def bubble_sort(a):
n = len(a)
for i in range(1, n):
for j in range(n  i):
if a[j] > a[j + 1]:
a[j + 1], a[j] = a[j], a[j + 1]
3. Insertion Sort
 Iteratively insert new element into the sorted part.
 We maintain a sorted section from the left.
 In each iteration, we sink one new element from the right to the correct location in the sorted part.
def insertion_sort(a):
n = len(a)
for i in range(n):
for j in range(i, 0, 1):
if a[j] < a[j  1]:
a[j  1], a[j] = a[j], a[j  1]
4. Merge Sort
 Recursively divide the array into two halves, sort both(recursively) and then merge the two sorted array.
 It follows the DivideandConquer paradigm.
# A quick and dirty noninplace version
def merge_sort(a):
def merge(left, right):
merged = []
while left and right:
merged.append(left.pop() if left[1] > right[1] else right.pop())
merged.extend(left + right)
return merged[::1]
if len(a) <= 1: return a
mid = len(a) // 2
left = merge_sort(a[:mid])
right = merge_sort(a[mid:])
return merge(left, right)
# A in place version
def merge_sort(a):
temp = a.copy()
def merge(low, mid, high):
for i in range(low, high + 1):
temp[i] = a[i]
i, j = low, mid + 1
for k in range(low, high + 1):
if i > mid:
a[k] = temp[j]
j += 1
elif j > high:
a[k] = temp[i]
i += 1
elif temp[j] < temp[i]:
a[k] = temp[j]
j += 1
else:
a[k] = temp[i]
i += 1
# bottomup version
# n = len(a)
# step_size = 1
# while step_size < n:
# for low in range(0, n, 2 * step_size):
# mid = low + step_size  1
# high = min(n  1, low + 2 * step_size  1)
# merge(low, mid, high)
# step_size *= 2
# # topdown version
def _merge_sort(low, high):
if low >= high: return
mid = (low + high) // 2
_merge_sort(low, mid)
_merge_sort(mid + 1, high)
merge(low, mid, high)
_merge_sort(0, len(a)  1)
5. Quick Sort
 Pick a random pivot item, put it in the right place by moving items smaller than it to its left and items larger than it to its right.
 Use the pivot item(since its in the right location now), to cut the problem in two halves and recursively sort the whole thing.
import random
# an inplace implementation, with random pivot item instead of shuffle and alwasy pick leftmost as pivot.
def quick_sort(a):
def partition(l, h):
if l >= h: return
p = random.randint(l, h)
a[l], a[p] = a[p], a[l]
pivot = a[l]
i, j = l, h
while i < j:
while i < h and a[i] <= pivot: i += 1
while j > l and a[j] >= pivot: j = 1
if j > i and a[i] > a[j]: a[i], a[j] = a[j], a[i]
a[j], a[l] = a[l], a[j]
return j
def sort(l, h):
if h > l:
j = partition(l, h)
sort(l, j  1)
sort(j + 1, h)
sort(0, len(a)  1)
6. Heap Sort
 Heapify the array then poping item one at a time.
 See Heap section for how heap works.
Stack
Remarks
 A stack is a collection that is LastInFirstOut(LIFO).
Implementation
An implementation of Stack using linked list.
class Stack(object):
class Node(object):
def __init__(self, item=None):
self.item = item
self.next = None
def __init__(self):
self.node = None
self._n = 0
@property
def is_empty(self):
return not self.node
def __len__(self):
return self.n
def push(self, item):
prev = self.node
self.node = self.Node(item)
self.node.next = prev
self._n += 1
def pop(self):
assert not self.is_empty, "Stack is empty"
item = self.node.item
self.node = self.node.next
self._n = 1
return item
def peek(self):
return self.node.item
Sweep Line
Remarks
 It is more of a way of approaching problems than an algorithm.
 It generally means that We will process data in some kind of order, and once we processed one piece of data, we won’t go back anymore.
 Sweep a line across problem space, events of interest popsup, keep track of these events.
 Deal with the events at the line and be done with it.
 The information about past events can be summarized in various format to handle new events. Example can be past min/max etc for single reference, a monotone stack or a priority queue for multi past events.
 Given a problem, there might be more than one way to define what is a event, which can lead to different solutions.
Common Problems
 Segment intersection. 1D
 Rectangle intersection. 2D
Topological Sort
Remarks
 Topological sort gives a topological order of the nodes in a directed acyclic graph(DAG).
 A topological ordering implies: for each directed edge
u > v
,u
must comes beforev
in the ordering.  It is a useful preprocessing on the graph, we can implement search for shortest/longest path easily based on this ordering.
 It also gives sort of a notion of levels/diameter of the connected components in the graph.
Implementation
 One implementation of topological sort is to do DFS and put the postorder on to an array.
 The reverse of the postorder traversal is the topological order.
 We can also incorporate cycle detection into the DFS topological sort implementation. We can stop early when we found a cycle.
def topsort(nodes, edges):
n = len(nodes)
visited = set()
path = set() # for cycle detection
top_order = deque([])
def dfs(node):
''' Perform dfs traversal, Return True if cycle is found. '''
if node in visited: return False
if node in path: return True
path.add(node)
for neighbor in edges[node]:
if dfs(neighbor): return True
visited.add(node)
path.discard(node)
top_order.appendleft(node) # put postorder on to array
return False
for node in nodes:
if dfs(node): return []
return top_order
nodes = set(range(7))
edges = {
0: [2, 5, 1],
1: [4, 6],
2: [3],
3: [4],
4: [6],
5: [2,3],
6: []
}
print(topsort(nodes, edges))
 Another implementation for topological sort is BFS based, aka the Khan’s algorith.
 The BFS topological sort would require more graph preprocessing, we will also need the in degrees(or in edges) for each node.
 We can still detect cycle with BFS topological sort. But only after the procedure is done. When there are cycles in the graph, the ordering will be incomplete. Because, the in_degrees for the nodes in the cycle won’t be 0.
 With BFS topological sort, we can also test the uniqueness of the topological ordering. If the topological ordering is unique, the queue can only hold one node at any given time.
def topsort(queue, out_edges, in_degrees):
ordering = []
while queue:
node = queue.popleft()
ordering.append(node)
for next_ in out_edges[node]:
in_degrees[next_] = 1
if not in_degrees[next_]:
queue.append(next_)
return ordering
in_degrees = {0:0, 1:1, 2:2, 3:2, 4:2, 5:1, 6:2}
queue = deque([node for node in in_degrees if not in_degrees[node]])
print(topsort(queue, edges, in_degrees))
Tries / Prefix Trees
Remarks
 Trie(pronounce as try) or Prefix Tree is a Nway tree used to facilitate search of string valued keys.
 The node path corresponds to the characters in the key.
 We use a flag to indicate the end of key.
 Trie can be implemented with Tree Nodes or as hashmap of hashmaps.
 The basic operations like
add
,get
or prefix searchstartswith
all take \(O(k)\) time wherek = len(key)
Implementation
Below we implement a Trie using python dictionaries.
from collections import defaultdict
class Trie(object):
def __init__(self):
_trie = lambda: defaultdict(_trie)
self._trie = _trie()
self.flag = '#'
def add(self, word):
trie = self._trie
for char in word:
trie = trie[char]
trie[self.flag] = trie.get(self.flag, 0) + 1
def get(self, word):
trie = self._trie
for char in word:
if char not in trie: return False
trie = trie[char]
return False if self.flag not in trie else trie[self.flag]
def contains(self, word):
return self.get(word) is not False
def startswith(self, prefix):
trie = self._trie
for char in prefix:
if char not in trie: return False
trie = trie[char]
return True
When there are more stuff the node needs to keep track of, we can implement Trie with actual TrieNode objects:
class Trie(object):
class TrieNode:
def __init__(self):
self.childrens = defaultdict(Trie.TrieNode)
self.terminal = False
def __init__(self):
self.root = self.TrieNode()
def add(self, word: str) > None:
trie = self.root
for char in word:
trie = trie.childrens[char]
trie.terminal = True
def contains(self, word: str) > bool:
trie = self.root
for char in word:
if char not in trie.childrens: return False
trie = trie.childrens[char]
return trie.terminal
def startswith(self, prefix: str) > bool:
trie = self.root
for char in prefix:
if char not in trie.childrens: return False
trie = trie.childrens[char]
return True
Union Find / Disjoint Sets
Remarks
 Union Find is a data structure keeps track of a set of elements that are partitioned into a number of disjoint subsets.
 It has two operations
union(p, q)
andfind(p)
. the find/search finds the subset element
p
belongs to.  and union/merge merges the subsets containing
p
andq
.
 the find/search finds the subset element
 It memory usage is O(N), and each find and union operations are near O(1) in time.
 The algorithm can be used to find all connected components in a network.
 It is also used in kruskal’s algorithm to find the minimal spanning tree for a graph.
Implementation
class UnionFind(object):
def __init__(self, n):
self.parents = list(range(n))
self.sizes = [1] * n
def _find_recursive(self, i):
while i != self.parents[i]:
# path compression, have i points to the cluster centroid
self.parents[i] = self.find_recursive(self.parents[i])
i = self.parents[i]
return i
def _find_iterative(self, i):
# 1 path to find the root
root = i
while self.parents[root] != root:
root = self.parents[root]
# 2 path to assign every node in the path to points at root
while self.parents[i] != root:
parent = self.parents[i]
self.parents[i] = root
i = parent
return root
find = _find_recursive
def union(self, p, q):
root_p, root_q = map(self.find, (p, q))
if root_p == root_q: return
small, big = sorted([root_p, root_q], key=lambda x: self.sizes[x])
self.parents[small] = big
self.sizes[big] += self.sizes[small]
Use Union Find to find connected components in undirected graph.
connections = [[0, 1], [1, 2], [2, 3], [3, 4], [5, 6], [6, 8], [7, 9]]
uf = UnionFind(10)
for p, q in connections: uf.union(p, q)
num_components = len(set(uf.find(i) for i in range(10)))
print(num_components)
 To find minimum spanning tree in the graph where there are weights on the edges.
 Sort edges by edge weights ascendingly.
 Iteratve over the edges unify nodes when two nodes don’t belong to same cluster.
 Repeat 2 until no nodes or no edges.
An Implementation without preallocation
class UnionFind(object):
def __init__(self):
self.parents = dict()
self.sizes = dict()
self.n_sets = 0
def __contains__(self, i):
return i in self.parents
def insert(self, i):
if self.__contains__(i): return
self.parents[i] = i
self.sizes[i] = 1
self.n_sets += 1
def find(self, i):
while i != self.parents[i]:
self.parents[i] = self.find(self.parents[i])
i = self.parents[i]
return i
def union(self, p, q):
root_p, root_q = map(self.find, (p, q))
if root_p == root_q: return
small, big = sorted([root_p, root_q], key=lambda x: self.sizes[x])
self.parents[small] = big
self.sizes[big] += self.sizes[small]
self.n_sets = 1
Miscellaneous
A collection of handy preprocessing code for common data.
 Get unique characters from a collection of strings.
functools.reduce(set.union, map(set, list_of_words))
Patterns
1. Sliding Window
For problems on Array or Linked List, where the task is to find a subarray/ a section of the linked list satisify certain criterion. General approach is to use two pointers to indicate the left and right of the window, move, expand and contract the window along the line.
 Use some variables to summarize the window for simple questions like sum, avg.
 Use a hashmap/hashset/orderedhashmap to keep track of counts of elements in the window for more complex problems.