Bit Manipulation Questions

from functools import reduce, partial
from operator import xor, and_

Hamming Distances / Bit counting.

191. Number of 1 Bits

Write a function that takes an unsigned integer and return the number of ‘1’ bits it has (also known as the Hamming weight.

Thought process:

x & (x - 1) clear the last bit.
we count the number of bits we need to clear to zero out the number.

def hamming_weight(n):
    c = 0
    while n:
        n &= (n - 1)
        c += 1
    return c

461. Hamming Distance

The Hamming distance between two integers is the number of positions at which the corresponding bits are different.

Thought process:

XOR gives the bit difference.
Once we get the difference, we use the above technique to count number of 1 bits.

def hamming_distance(x, y):
    diff = x ^ y
    distance = 0
    while diff:
        distance += 1
        diff &= (diff - 1)
    return distance

The Hamming distance between two integers is the number of positions at which the corresponding bits are different. Now your job is to find the total Hamming distance between all pairs of the given numbers.

Thought process:

Iterative over all possible pairs is \(N^2\).
Count the number of 0 and 1 at each bit, then the total difference from that bit is the product of the two frequencies.
(num & 1 << i) == 0 checks the ith bit is 1 or not.

def total_hamming_distance(nums):
    bits = [[0, 0] for _ in range(32)]
    for num in nums:
        for i in range(32):
            if (num & 1 << i) == 0:
                bits[i][0] += 1
            else:
                bits[i][1] += 1
    return sum(x * y for x, y in bits)

338. Counting Bits

Given a non negative integer number num. For every numbers i in the range 0 ≤ i ≤ num calculate the number of 1’s in their binary representation and return them as an array.

Thought process:

i & (i - 1) clears the last bit of given number.
Thus i is a larger number than i - 1 and i contains one more bit than i - 1.
So we have the recurrsive relation ship num_bits(i) = num_bits(i & (i - 1)) + 1.

def count_bits(num):
    solution = [0] * (num + 1)
    for i in range(1, num + 1):
        solution[i] = solution[i & (i - 1)] + 1
    return solution

Calculation

405. Convert a Number to Hexadecimal

Given an integer, write an algorithm to convert it to hexadecimal. For negative integer, two’s complement method is used.

Thought process:

Every four consecutive bits from right corresponds to a single digit in hex decimal right the right.
x & 15 will extract the last four bits from x.
We just do convertion one digit at a time.

def to_hex(num):
    digits = []
    for i in range(8):
        digits.append('0123456789abcdef'[num & 15])
        num = num >> 4
    return ''.join(digits)[::-1].lstrip('0') or '0'

371. Sum of Two Integers

Calculate the sum of two integers a and b, but you are not allowed to use the operator + and -.

Thought process:

XOR gives the remainder.
& gives the carry which needs to be shifted to left.
while b: a, b = a ^ b, (a & b) << 1 is the workhorse to do addition.
Note python’s binary number has sign, instead of using two’s complement, we need the mask to loss the sign in the loop, and in the end get the sign back.
To do so, let mask be the all 1 bitmask, & mask will cause the negative number to become two’s complement.
~(x ^ mask) will conver the number back to negative.

def get_sum(a, b):
    mask = (1 << 32) - 1
    while b:
        xor = a ^ b
        carry = (a & b) << 1
        a, b =  xor & mask, carry & mask
    return a if a <= (1 << 31) - 1 else ~(a ^ mask)

Finding Number

268. Missing Number

Given an array containing n distinct numbers taken from 0, 1, 2, …, n, find the one that is missing from the array.

Thought process:

x ^ x == 0 and x ^ 0 == x.
We loop over 0 to n once accumulate the xors, then loop over nums to cancel the pairs.
The remainder is the missing number.

def missing_number(nums):
    return reduce(xor, list(range(len(nums) + 1)) + nums)

136. Single Number

Given a non-empty array of integers, every element appears twice except for one. Find that single one.

Thought process:

x ^ x == 0, so the number with frequency two will cancer each other.
0 ^ x == x, so we can use 0 as default in reduction if nums contains only one number.

def single_number(nums):
    return reduce(xor, nums, 0)

137. Single Number II

Given a non-empty array of integers, every element appears three times except for one, which appears exactly once. Find that single one.

Thought process:

Main workhorse is still xor, x ^ x == 0 and 0 ^ x == 0.
The trick is we need to use seperate masks for bit flipping.

first	value	second	value	third	value
once	000	once	010	once	000
twice	000	twice	000	twice	010
num	010	num	010	num	010
once ^ num	010	once ^ num	000	once ^ num	010
~twice	111	~twice	111	~twice	101
~twice & (once ^ num)	010	~twice & (once ^ num)	000	~twice & (once ^ num)	000
once*	010	once*	000	once*	000
~once	101	~once	111	~once	111
twice ^ num	010	twice ^ num	010	twice ^ num	000
~once & (twice ^ num)	000	~once & (twice ^ num)	010	~once & (twice ^ num)	000
twice*	000	twice*	010	twice*	000

def single_number_ii(nums):
    once = twice = 0
    for num in nums:
        once = ~twice & (once ^ num)
        twice = ~once & (twice ^ num)
    return once

260. Single Number III

Given an array of numbers nums, in which exactly two elements appear only once and all the other elements appear exactly twice. Find the two elements that appear only once.

Thought process:

find the difference between the two single numbers with xor.
x & ~(x - 1) or other alternative can extract the last bit from a number.
This last bit must exsit in one and only one of the the two numbers.
We can filter the numbers with by this last bit.
Use xor reduction on the filtered numbers, we can recover one of the single number.
diff ^ x will seperate out the second number from the diff.

def single_number_iii(nums):
    diff = reduce(xor, nums)
    lsb = diff & ~(diff - 1)
    x = reduce(xor, filter(partial(and_, lsb), nums))
    return [x, diff ^ x]

Permutation

78. Subsets

Given a set of distinct integers, nums, return all possible subsets (the power set).

Thought process:

[bin(i) for i in range(1 << length)] will give a permutation map for sequence of given length.
For each permutation we iterate over the elements and ensemble them according to the permutation.

def subsets(nums):
    subsets = []
    for i in range(1 << len(nums)):
        subset = []
        for j, num in enumerate(nums):
            if i & (1 << j):
                subset.append(num)
        subsets.append(subset)
    return subsets

784. Letter Case Permutation

Given a string S, we can transform every letter individually to be lowercase or uppercase to create another string. Return a list of all possible strings we could create.

Thought process:

For each permutation, we iterate over the letters in the string and turn it to upper or lower based on the bit in the permutation.

def letter_case_permutation(S):
    res = []
    idx_map = [idx for idx, ch in enumerate(S) if ch.isalpha()]
    bound = 1 << len(idx_map)
    temp = list(S).copy()

    for permutation in range(bound):
        mask = bound >> 1
        for char_idx in idx_map:
            temp[char_idx] = temp[char_idx].upper() if mask & permutation else temp[char_idx].lower()
            mask >>= 1
        res.append(''.join(temp))
    return res

393. UTF-8 Validation

A character in UTF8 can be from 1 to 4 bytes long, subjected to the following rules:

For 1-byte character, the first bit is a 0, followed by its unicode code. For n-bytes character, the first n-bits are all one’s, the n+1 bit is 0, followed by n-1 bytes with most significant 2 bits being 10. This is how the UTF-8 encoding would work:

Char. number range	UTF-8 octet sequence
(hexadecimal)	(binary)
0000 0000-0000 007F	0xxxxxxx
0000 0080-0000 07FF	110xxxxx 10xxxxxx
0000 0800-0000 FFFF	1110xxxx 10xxxxxx 10xxxxxx
0001 0000-0010 FFFF	11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

Given an array of integers representing the data, return whether it is a valid utf-8 encoding.

Thought process:

Reading comprehension problem…

def valid_utf8(data):
    k = 0

    while k < len(data):
        if (data[k] >> 7 == 0b0):
            k += 1
        else:
            if (data[k] >> 5 == 0b110):
                offset = 2
            elif (data[k] >> 4 == 0b1110):
                offset = 3
            elif (data[k] >> 3 == 0b11110):
                offset = 4
            else:
                return False

            for i in range(1, offset):
                if (k + i >= len(data) or data[k + i] >> 6 != 0b10):
                    return False

            k += offset
    return True