# Coding Notes

## Backtracking

### Remarks

- The backtracking algorithm solves the optimization problem by incrementally builds out candidate solutions and backtracks a candidate as soon as we can determine it will not leads to optimal.
- If we represent the partial solutions as a tree, backtracking is doing depth-first search in this tree.
- Since it explores all of the tree in worst case, its upper bounded to have exponential time complexity. It usually need some constraints to prune the tree.
- Very often we would pass both the candidate solution as well as the state of partial solution to the recursive call to speed up checks.

### Template

To find one solution

```
def backtrack(candidate, state):
if criterion(candidate, state): return candidate
for choice in choices:
if not constraints(candidate, state, choice): continue
result = backtrack(expand(candidate, state, choice))
if result: return result
return
solution = backtrack(init_candidate, init_state)
```

To collect all possible solutions

```
solutions = set()
def backtrack(candidate, state):
if criterion(candidate, state): return solutions.add(candidate)
for choice in choices:
if not constraints(candidate, state, choice): continue
backtrack(expand(candidate, state, choice))
backtrack(init_candidate, init_state)
```

## Binary Search

### Remarks

- The binary search algorithm solves the problem of finding a target value within a sorted array.
- The algorithm iteratively updating the lower and upper bound positions in the array where the target value may resides, by checking the middle element determined by the current bounds.
- The algorithm runs in O(logN) time worst case.

### Template

To find a the target value in sorted array.

```
def binary_search(array, target):
low, high = 0, len(array) - 1
while low <= high:
mid = (low + high) // 2
if array[mid] == target: return mid
elif array[mid] > target: high = mid - 1
else: low = mid + 1
return -1
```

note at the end of loop, low is the insertion point for x in a to maintain sorted order.

This algorithm can be generalized to find values in bounds that satisfy a criterion, given that the feasible values are collocated as a contiguous range within the bounds, and that the criterion can give us direction to update the bounds. We can return either the left most or the right most value if so desired.

```
def binary_search(low, high): # based on prior knowledge of bounds.
# note here we are assuming a feasible solution exists in the bounds.
def criterion(mid):
# to be implemented per question
while low < high:
mid = (low + high) // 2
if criterion(mid): high = mid
else: low = mid + 1
return # low for left most low - 1 for right most
```

python has a standard library bisect for binary search.

## Binary Tree Traversals

### Remarks

### DFS traversals

- Recurrsive implementation of the tree dfs traversals are pretty straightforwardly easy.
- With a hashmap, the iterative implementation can be very easy, but it need extra O(n) space for the hash.
- Otherwise, the iterative version of the three dfs traversals can be quite different and nasty.

```
def inorder(node):
if node.left: inorder(node.left)
print(node.val)
if node.right: inorder(node.right)
def postorder(node):
if node.left: postorder(node.left)
if node.right: postorder(node.right)
print(node.val)
def preorder(node):
print(node.val)
if node.left: preorder(node.left)
if node.right: preorder(node.right)
def tree_dfs_iterative(root):
stack = [root]
visited = set()
while stack:
node = stack.pop()
if not node: continue
if node in visited:
print(node.val)
else:
visited.add(node)
# stack.extend([node.right, node, node.left]) # inorder
# stack.extend([node, node.right, node.left]) # postorder
stack.extend([node.right, node.left, node]) # preorder
def inorder_iterative(root):
node, stack = root, []
while stack or node:
while node:
stack.append(node)
node = node.left
node = stack.pop()
print(node.val)
node = node.right
def preorder_iterative(root):
stack = [root]
while stack:
node = stack.pop()
if not node: continue
print(node.val)
stack.append(node.right)
stack.append(node.left)
# this version has to cut links, but reads more like the previous two order in code
def postorder_iterative(root):
node, stack = root, []
while stack or node:
if node:
stack.append(node)
node = node.left
elif stack[-1].right:
node = stack[-1].right
stack[-1].right = None
else:
print(stack.pop().val)
# the double push version works better, but harder to explain what's going on.
# basically, the first bush is for backtrack, the second push is to traversal.
def postorder_iterative(root):
stack = [root, root]
while stack:
node = stack.pop()
if stack and stack[-1] is node:
if node.right:
stack.extend([node.right] * 2)
if node.left:
stack.extend([node.left] * 2)
else:
print(node.val)
```

### BFS traversal

- BFS traversal with queue is quite straightforward.

```
def tree_bfs_iterative(root):
queue = deque([root])
while queue:
node = queue.popleft()
if not node: continue
print(node.val)
queue.extend([node.left, node.right])
```

## Bit Manipulation

### Operators

operators | name | rlue |
---|---|---|

& / operator.and_ | and | 1 if both 1 |

| / operator.or_ | or | 1 if either or both 1 |

~ / operator.not_ | not | 1 if 0 / 0 if 1 |

^ / operator.xor | eXclusive OR | 1 if only 1 is 1 |

« / operator.lshift | left shift | move bits to left, pad 0 to right |

« / operator.rshift | right shift | move bits to right, pad 0 to left |

### Simple Usages

usage | meaning |
---|---|

-x = ~(x - 1) | property |

-x - 1 = ~x | property |

0 ^ x = x | property |

x « y | x * 2 ** y |

x » y | x // 2 ** y |

x & - x | extract last/rightmost bit |

x & ~(x - 1) | extract last/rightmost bit |

x ^ (x & (x - 1)) | extract last/rightmost bit |

x & (x - 1) | clear last bit |

### Common Pattern

**Set bit at position**

```
def set_bit(x, pos):
mask = 1 << pos
return x | mask
```

**clear bit at position**

```
def clear_bit(x, pos):
mask = ~(1 << pos)
return x & mask
```

**clear bit from pos all the way to the left**

```
def clear_bit_until(x, pos):
mask = (1 << pos) - 1
return x & mask
```

**clear bit to the right of a position**

```
def clear_bit_until(x, pos):
mask = ((1 << 31) - 1 << pos)
return x & mask
```

**flip bit at position**

```
def flip_bit(x, pos):
mask = 1 << pos
return x ^ mask
```

**check bit set at position**

```
def check_bit(x, pos):
shifted = x >> pos
return shifted & 1
```

**check number is even**

```
def check_even(x):
return (x & 1) == 0
```

**check number is power of 2**

```
def check_pow_2(x):
return (x & (x - 1)) == 0
```

**check two number of opposite sign**

```
def check_opposite_sign(x, y):
return (x ^ y) < 0
```

**swap two integers with xor**

```
def swap(x, y):
x = x ^ y
y = x ^ y
x = x ^ y
return x, y
```

## Breadth First Search

### Remarks

- The breadth first search algorithm traversals tree / graph by exploring all the immediate child nodes / neighboring nodes first before moving to next depth level.
- BFS is good for shortest path findings.

### Template

Some dummy criterion and constraints.

```
def criterion(node, target_val=4):
return node and node.val == target_val
def constraints(node):
return node
```

Iterative implementation using queue.

```
def bfs(root):
queue = deque([root])
while queue:
node = queue.popleft()
if criterion(node): return node
for child in node.childrens:
if not constraints(child): continue
queue.append(child)
result = bfs(root)
```

## Depth First Search

### Remarks

- The depth first search algorithm traversals tree / graph by exploring the node branch / edges as far as possible before backtrack and explore other alternatives.
- DFS can be used to solve problems like:
- Reachability.
- Path finding.
- Topological sort.
- Directed cycle detection.

### Template

Iterative implementation using queue.

```
def dfs(root):
stack = deque([root])
while stack:
node = stack.pop()
if criterion(node): return node
for child in node.childrens:
if not constraints(child): continue
stack.append(child)
result = dfs(root)
```

Recursive implementation with a hash map memorizing visited nodes.

```
visited = set()
def constraints(node):
return node and node not in visited
def dfs(node):
visited.add(node)
if criterion(node): return node
for child in node.childrens:
if constraints(child):
result = dfs(child)
if result: return result
result = dfs(root)
```

## Dynamic Programming

### Remarks

- DP is a method for solving optimization problem by breaking the problem into smaller subproblems.
- DP requires that the overall optimal solution depends on the optimal solution to its subproblems.
- We might be able to apply DP when:
- The subproblems can be seen as smaller version of the original problem.
- The optimal solution of a problem is a function of the optimal solutions to one or more of its subproblems.

Two methods using dynamic programming are top-down memoization and bottom up tabulation.

- Top-down memoization is to recurssivly call the method to find solutions to subproblems and cache the results along the way to avoid redundent calculations.
- Bottom up tabulation is to solving the smallest subproblem first and then filling up the N dimension optimal solution table.

Classic example for dynammic programming is the fibonacci number calculation.

```
def fib(n):
if n <= 1: return 1
return fib(n - 1) + fib(n - 2)
```

The recursive relationship here is . The above code would work but will be very slow when n increases, due to the redundent calculations in the recursive calls.

With the top down memoization approach, we save the solutions of the subproblems and reuse the reuslts when we don’t need to recalculate.

```
memo = {0: 1, 1: 1}
def fib_memo(n):
if n in memo: return memo[n]
memo[n] = fib_memo(n - 1) + fib_memo(n - 2)
return memo[n]
```

With the bottom up tabulation approach, we iterativly fill up the 1 dimensional table up untill we can answer the problem.

```
def fib_tab(n):
fibs = [1, 1]
for i in range(2, n + 1):
fibs.append(fibs[-1] + fibs[-2])
return fibs[n]
```

## Linked Lists

### Remarks

- A Linked List is a recursive data structure that is either empty or a node holds an item and a reference to another linked list.
- Its often useful to keep track of both the head and the tail of a linked list.

### Common Operations with Linked Lists

```
class Node(object):
def __init__(self, item=None):
self.item = item
self.next = None
first = Node('head')
second = Node('shoulders')
third = Node('knees')
forth = Node('toes')
first.next = second
second.next = third
third.next = forth
head = Node(None)
head.next = first
tail = Node(None)
tail.next = forth
# Traversal to the end. O(n)
trav = head
num_nodes = 0
while trav and trav.next:
trav = trav.next
num_nodes += 1
assert tail.next == trav
# Insert at the begining. O(1)
prev_first = head.next
new_first = Node('hat')
new_first.next = prev_first
head.next = new_first
# Remove from the begining. O(1)
first = head.next
head.next = first.next
# or
head.next = head.next.next
# Insert at the end. O(1) with tail pointer, O(n) if need to create tail pointer
prev_last = tail.next
new_last = Node('shoes')
prev_last.next = new_last
tail.next = new_last
# Remove from the end. O(n).
trav = head
while trav.next and trav.next.next:
trav = trav.next
if not trav.next:
head.next = tail.next = None
else:
trav.next = None
tail.next = trav
# Reverse a linked list
def reverse(head, tail):
if not (head and tail and head.next): return head, tail
prev = None
tail.next = trav = head.next
while trav:
next = trav.next
trav.next = prev
prev = trav
trav = next
head.next = prev
return head, tail
```

## Priority Queue / Heap

### Remarks

- Priority Queue is a Queue that elements would be dequed based on their priorites. (like the current min or max go first, etc)
- Heap is a tree in which the parent node is always larger(or smaller) than its children.
- Binary heap a heap with the tree been a binary tree.
- Binary heap queue is binary heap plus the binary tree is complete.
- Binary heap queue is an implementation for priority queue and it is usually stored as an array.
- The key operations for binary heap queue are sink and swim.
- Swim is to move the newly inserted element(in the bottom) up to its rightful location.
- Sink is to move the newly promoted element(at the top) down to its rightful location.

### Implementation

```
from operator import lt, gt
class Heapq(object):
def __init__(self, data=None, mode='min'):
self.mode = mode
self._heapq = [None] + list(data) if data else [None]
self.comparator = gt if mode == 'min' else lt
self._n = len(self._heapq) - 1
if data: self.heapify()
def __len__(self):
return self._n
def is_empty(self):
return self._n == 0
def _compare(self, i, j):
return self.comparator(self._heapq[i], self._heapq[j])
def _swap(self, i, j):
self._heapq[i], self._heapq[j] = self._heapq[j], self._heapq[i]
def top(self):
assert not self.is_empty(), 'heapq is empty'
return self._heapq[1]
def push(self, x):
self._heapq.append(x)
self._n += 1
self.swim(self._n)
def pop(self):
assert not self.is_empty(), 'heapq is empty'
self._swap(1, self._n)
self._n -= 1
self.sink(1)
return self._heapq.pop()
def swim(self, k):
while k > 1 and self._compare(k // 2, k):
self._swap(k // 2, k)
k //= 2
def sink(self, k, n=None):
n = n if n else self._n
while k * 2 <= n:
j = k * 2
if j < n and self._compare(j, j + 1):
j += 1
if not self._compare(k, j):
break
self._swap(k, j)
k = j
def heapify(self):
for k in range(self._n // 2, 0, -1):
self.sink(k)
```

## Queue

### Remarks

- A queue is a abstract data type that is First-In-First-Out(FIFO).

### Implementation

An implementation of Queue using linked list.

```
class Queue(object):
class Node(object):
def __init__(self, item=None):
self.item = item
self.next = None
def __init__(self):
self.first = None
self.last = None
self._n = 0
@property
def is_empty(self):
return not self.first
def __len__(self):
return self.n
def peek(self):
return self.head.item
def enqueue(self, item):
prev = self.last
self.last = self.Node(item)
if self.is_empty():
self.first = self.last
else:
prev.next = self.last
self._n += 1
def dequeue(self):
assert not self.is_empty(), "Queue is empty"
item = self.first.item
self.first = self.first.next
self._n -= 1
if self.is_empty():
self.last = None
return item
```

An implementation of Queue using fixed length array and pointer. Note the formula for pointer updates:

- Enqueue: (first + n) % capacity
- Dequeue: (first + 1) % capacity

```
class Queue(object):
def __init__(self, size):
self.size = size
self.queue = [None] * size
self.first = 0
self.n = 0
def enqueue(self, item):
assert not self.is_full(), 'Queue is full'
self.queue[(self.first + self.n) % self.size] = value
self.n += 1
def dequeue(self):
assert not self.is_empty(), 'Queue is empty'
item = self.queue[self.first]
self.queue[self.first] = None
self.first = (self.first + 1) % self.size
self.n -= 1
return item
def peek(self):
return self.queue[self.first]
def is_empty(self):
return self.n == 0
def is_full(self):
return self.n == self.size
```

## Sorting

Here are implementations of a few array sorting algorithms.

```
a = list('sortingexample')
```

### 1. Selection Sort

- Iteratively put the ith smallest element at ith location from the left.
- We maintain a sorted section from the left.
- In each iteration, we expand the sorted section by swap the smallest element from the unsorted section into the location.

```
def selection_sort(a):
n = len(a)
for i in range(n):
s = i
for j in range(i + 1, n):
if a[j] < a[s]:
s = j
a[i], a[s] = a[s], a[i]
```

### 2. Bubble Sort

- Iteratively put the ith largest element to ith location from the end.
- We maintain a sorted seciton from the right.
- In each iteration, we compare adjacent pairs of elements and swap larger values to the right until we reach the sorted part.

```
def bubble_sort(a):
n = len(a)
for i in range(1, n):
for j in range(n - i):
if a[j] > a[j + 1]:
a[j + 1], a[j] = a[j], a[j + 1]
```

### 3. Insertion Sort

- Iteratively insert new element into the sorted part.
- We maintain a sorted section from the left.
- In each iteration, we sink one new element from the right to the correct location in the sorted part.

```
def insertion_sort(a):
n = len(a)
for i in range(n):
for j in range(i, 0, -1):
if a[j] < a[j - 1]:
a[j - 1], a[j] = a[j], a[j - 1]
```

### 4. Merge Sort

- Recursively divide the array into two halves, sort both(recursively) and then merge the two sorted array.
- It follows the Divide-and-Conquer paradigm.

```
# A quick and dirty non-inplace version
def merge_sort(a):
def merge(left, right):
merged = []
while left and right:
merged.append(left.pop() if left[-1] > right[-1] else right.pop())
merged.extend(left + right)
return merged[::-1]
if len(a) <= 1: return a
mid = len(a) // 2
left = merge_sort(a[:mid])
right = merge_sort(a[mid:])
return merge(left, right)
# A in place version
def merge_sort(a):
temp = a.copy()
def merge(low, mid, high):
for i in range(low, high + 1):
temp[i] = a[i]
i, j = low, mid + 1
for k in range(low, high + 1):
if i > mid:
a[k] = temp[j]
j += 1
elif j > high:
a[k] = temp[i]
i += 1
elif temp[j] < temp[i]:
a[k] = temp[j]
j += 1
else:
a[k] = temp[i]
i += 1
# bottom-up version
# n = len(a)
# step_size = 1
# while step_size < n:
# for low in range(0, n, 2 * step_size):
# mid = low + step_size - 1
# high = min(n - 1, low + 2 * step_size - 1)
# merge(low, mid, high)
# step_size *= 2
# # top-down version
def _merge_sort(low, high):
if low >= high: return
mid = (low + high) // 2
_merge_sort(low, mid)
_merge_sort(mid + 1, high)
merge(low, mid, high)
_merge_sort(0, len(a) - 1)
```

### 5. Quick Sort

- Pick a random pivot item, put it in the right place by moving items smaller than it to its left and items larger than it to its right.
- Use the pivot item(since its in the right location now), to cut the problem in two halves and recursively sort the whole thing.

```
import random
# an inplace implementation, with random pivot item instead of shuffle and alwasy pick leftmost as pivot.
def quick_sort(a):
def partition(l, h):
if l >= h: return
p = random.randint(l, h)
a[l], a[p] = a[p], a[l]
pivot = a[l]
i, j = l, h
while i < j:
while i < h and a[i] <= pivot: i += 1
while j > l and a[j] >= pivot: j -= 1
if j > i and a[i] > a[j]: a[i], a[j] = a[j], a[i]
a[j], a[l] = a[l], a[j]
return j
def sort(l, h):
if h > l:
j = partition(l, h)
sort(l, j - 1)
sort(j + 1, h)
sort(0, len(a) - 1)
```

### 6. Heap Sort

- Heapify the array then poping item one at a time.
- See Heap section for how heap works.

## Stack

### Remarks

- A stack is a collection that is Last-In-First-Out(LIFO).

### Implementation

An implementation of Stack using linked list.

```
class Stack(object):
class Node(object):
def __init__(self, item=None):
self.item = item
self.next = None
def __init__(self):
self.node = None
self._n = 0
@property
def is_empty(self):
return not self.node
def __len__(self):
return self.n
def push(self, item):
prev = self.node
self.node = self.Node(item)
self.node.next = prev
self._n += 1
def pop(self):
assert not self.is_empty, "Stack is empty"
item = self.node.item
self.node = self.node.next
self._n -= 1
return item
def peek(self):
return self.node.item
```

## Topological Sort

### Remarks

- Topological sort gives a topological order of the nodes in a directed acyclic graph(DAG).
- A topological ordering means for each directed edge
`u -> v`

,`u`

must comes before`v`

in the ordering. - It is a useful preprocessing on the graph, we can implement search for shortest/longest path easily based on this ordering.

### Implementation

- A implementation of topological sort is to do DFS and put the postorder on to an array.
- The reversal of the postorder array is the topological order.
- The code below assues the inputs are a list of nodes and the adjacency lists.
- There may be more than one valid topological order for the same DAG.

```
def topsort(nodes, edges):
n = len(nodes)
visited = set()
top_order = []
def dfs(node):
visited.add(node)
for neighbor in edges[node]:
if neighbor not in visited: dfs(neighbor)
top_order.append(node) # put postorder on to array
for node in nodes:
if node not in visited: dfs(node)
return top_order[::-1] # return the reversal
nodes = set(range(7))
edges = {
0: [2, 5, 1],
1: [4],
2: [0],
3: [2, 4, 5, 6],
4: [],
5: [2],
6: [0, 4]
}
print(topsort(nodes, edges))
```

- Another method is the Khan’s algorith, which requires more processing and uses BFS.
- In addition to the above two inputs, we also need the in_degrees for each node.

```
def topsort(nodes, out_edges, in_degrees):
queue = deque([node for node in in_degrees if not in_degrees[node]])
ordering = []
while queue:
node = queue.popleft()
ordering.append(node)
for nxt_node in out_edges[node]:
in_degrees[nxt_node] -= 1
if not in_degrees[nxt_node]:
queue.append(nxt_node)
return ordering
in_degrees = {0: 2, 1: 1, 2:3, 3:0, 4:3, 5: 2, 6:1 }
print(topsort(nodes, edges, in_degrees))
```

With this method, when there are cycles in the graph, the bfs will terminate early and the ordering is incomplete.

## Tries / Prefix Trees

### Remarks

- Trie(pronounce as try) or Prefix Tree is a N-way tree used to facilitate search of string valued keys.
- The node path corresponds to the characters in the key.
- We use a flag to indicate the end of key.
- Trie can be implemented with Tree Nodes or as hashmap of hashmaps.
- The basic operations like
`add`

,`get`

or prefix search`startswith`

all take time where`k = len(key)`

### Implementation

Below we implement a Trie using python dictionaries.

```
from collections import defaultdict
class Trie(object):
def __init__(self):
_trie = lambda: defaultdict(_trie)
self._trie = _trie()
self.flag = '#'
def add(self, word):
trie = self._trie
for char in word:
trie = trie[char]
trie[self.flag] = trie.get(self.flag, 0) + 1
def get(self, word):
trie = self._trie
for char in word:
if char not in trie: return False
trie = trie[char]
return False if self.flag not in trie else trie[self.flag]
def contains(self, word):
return self.get(word) is not False
def startswith(self, prefix):
trie = self._trie
for char in prefix:
if char not in trie: return False
trie = trie[char]
return True
```

When there are more stuff the node needs to keep track of, we can implement Trie with actual TrieNode objects:

```
class Trie(object):
class TrieNode:
def __init__(self):
self.childrens = defaultdict(Trie.TrieNode)
self.terminal = False
def __init__(self):
self.root = self.TrieNode()
def add(self, word: str) -> None:
trie = self.root
for char in word:
trie = trie.childrens[char]
trie.terminal = True
def contains(self, word: str) -> bool:
trie = self.root
for char in word:
if char not in trie.childrens: return False
trie = trie.childrens[char]
return trie.terminal
def startswith(self, prefix: str) -> bool:
trie = self.root
for char in prefix:
if char not in trie.childrens: return False
trie = trie.childrens[char]
return True
```

## Union Find

### Remarks

- Union Find is a data structure keeps track of a set of elements that are partitioned into a number of disjoint subsets.
- It has two operations
`union(p, q)`

and`find(p, q)`

, where*find*tells whether`p`

and`q`

are in the same subset and*union*merges the subsets containing`p`

and`q`

. - It memory usage is O(N), and
*find*and*union*operations are near O(N). - The algorithm can be used to find all connected components in a network.

### Template

```
class UnionFind(object):
def __init__(self, n):
self.ids = list(range(n))
self.sizes = [1] * n
def root(self, i):
ids = self.ids
while i != ids[i]:
ids[i] = ids[ids[i]] # path compression
i = ids[i]
return i
def find(self, p, q):
return self.root(p) == self.root(q)
def union(self, p, q):
ids, sizes = self.ids, self.sizes
rootp, rootq = map(self.root, (p, q))
small, big = sorted([rootp, rootq], key=lambda x: sizes[x])
ids[small] = ids[big]
sizes[big] += sizes[small]
```

Lets say there are `n`

nodes in a graph and we encode the connectivity information with adjacency list as `connections`

which looks like `[(p1, q1), (p2, q2) ...]`

.
To get the number of connected components:

```
connections = [[0, 1], [1, 2], [2, 3], [3, 4], [5, 6], [6, 8], [7, 9]]
uf = UnionFind(10)
for p, q in connections:
uf.union(p, q)
num_components = len(set(uf.root(i) for i in range(10)))
print(uf.ids)
```

- Union Find can be used to find minimum spanning tree in the graph where there are weights on the edges.
- Sort edges by edge weights ascendingly.
- Iteratve over the edges unify nodes when two nodes don’t belong to same cluster.
- Repeat 2 until no nodes or no edges.

## Miscellaneous

### A collection of handy preprocessing code for common data.

- Get unique characters from a collection of strings.
`functools.reduce(set.union, map(set, list_of_words))`

## Patterns

### 1. Sliding Window

For problems on Array or Linked List, where the task is to find a subarray/ a section of the linked list satisify certain criterion. General approach is to use two pointers to indicate the left and right of the window, move, expand and contract the window along the line.

- Use some variables to summarize the window for simple questions like sum, avg.
- Use a hashmap/hashset/orderedhashmap to keep track of counts of elements in the window for more complex problems.