Elements of Programming Interviews: Solutions

1. Foreward
2. Data Structures and Algorithms
3. Domain Specific Problems
4. Honors Class

1 Foreward

All problems presented here are from Elements of Programming Interviews, by Dr. Adnan Aziz, Tsung-Hsien Lee, and Amit Prakash. All credit goes to them.

All solutions presented are my own.

2 Data Structures and Algorithms

2.1 Primitive Types

2.1.1 DONE Computing the parity of a word

The parity of a binary word is 1 if the number of 1s in the word is odd; otherwise, it is 0. For example, the parity of 1011 is 1, and the parity of 10001000 is 0. Parity checks are used to detect single bit errors in data storage and communication. It is fairly straightforward to write code that computes the parity of a single 64-bit word

We assume that the word is provided as a 64-bit unsigned integer.

A naive implementation would be to iterate through every bit of the word, XOR-ing a counter for every 1-bit.

def parity(word):
    p = 0
    for shift in range(0,64):
        p ^= (word>>shift)&1
    return p

all([parity(int(w,2))==p for w,p in [
    ("1011", 1),
    ("0000", 0),
]])

True

This implementation is O(n), where n is the length of the input word.

We can, however, optimize this function further by precomputing the parities of words and storing the parities in a lookup table. For illustration’s purpose, we’ll define a lookup table that stores the parities of all words of length 2:

PARITIES_2 = {
    int(w,2): p for w,p in [
        ("00", 0),
        ("01", 1),
        ("10", 1),
        ("11", 0),
    ]
}

Resulting in the following implementation:

def memoized_parity(word):
    p = 0
    memo_word_length = 2
    for s in range(0,64/memo_word_length):
        mask = 2^memo_word_length - 1
        shift = s * memo_word_length
        p ^= PARITIES_2[(word >> shift) & mask]
    return p

all([memoized_parity(int(w,2))==p for w,p in [
    ("1011", 1),
    ("0000", 0),
]])

True

This revised implementation is O(n/w) = O(n), where w is the word length of the lookup key.

2.1.2 TODO Compute x^y

Compute x^y without using arithmetic operators, i.e. using only assignment, bitwise operators, and equality checks.

2.2 Arrays

2.2.1 TODO The Dutch national flag problem

Write a program that takes an array A and an index i into A, and rearranges the elements such that all elements less than A[i] (the “pivot”) appear first, followed by eleents equal to the pivot followed by elements greater than the pivot.

Hint: Think about the partition step in quicksort.

2.2.1.1 Solution

2.2.2 DONE Sample offline data

Implement an algorithm that takes as input an array of distinct elements and a size, and returns a subset of the given size of the array elements. All subsets should be equally likely.

2.2.2.1 Solution

We can use reservoir sampling to achieve a linear-time implementation.

def sample(N, l):
    from random import randint
    reservoir = [N[i] for i in range(0, l)]
    for i in range(l, len(N)):
        _i = randint(0, i)
        if _i < l:
            reservoir[_i] = N[i]
    return reservoir

2.2.3 DONE Sample online data

Design a program that takes as input a size k, and reads packets, continuously maintaining a uniform random subset of size k of the read packets.

2.2.3.1 Solution

Analogous to solution outlined in “Sample offline data.”

2.2.4 DONE Compute the spiral ordering of a 2D array

def spiral(mtx):
    bounds = {
        "top": -1,
        "bottom": len(mtx),
        "left": -1,
        "right": len(mtx[0]),
    }
    actions = {
        "left": {
            "update": (lambda i,j: (i,j-1)),
            "within": lambda i,j: j>bounds["left"],
            "next": "up",
        },
        "right": {
            "update": (lambda i,j: (i,j+1)),
            "within": lambda i,j: j<bounds["right"],
            "next": "down",
        },
        "up": {
            "update": (lambda i,j: (i-1,j)),
            "within": lambda i,j: i>bounds["top"],
            "next": "right",
        },
        "down": {
            "update": (lambda i,j: (i+1,j)),
            "within": lambda i,j: i<bounds["bottom"],
            "next": "left",
        },
    }

    action = "right"

    i,j=0,0
    moved = False

    while True:
        yield mtx[i][j]
        _i, _j = actions[action]["update"](i,j)
        if actions[action]["within"](_i, _j):
            i,j = _i, _j
            moved = True
        else:
            if action == "right":
                bounds["top"]+=1
            elif action == "down":
                bounds["right"]-=1
            elif action == "left":
                bounds["bottom"]-=1
            elif action == "up":
                bounds["left"]+=1

            action = actions[action]["next"]
            _i, _j = actions[action]["update"](i,j)
            if not actions[action]["within"](_i, _j):
                break
            else:
                i,j = _i, _j

all([
    list(spiral([[1,2],[3,4]])) == [1,2,4,3],
    list(spiral([[1,2,3],[4,5,6],[7,8,9]])) == [1,2,3,6,9,8,7,4,5],
])

True

2.2.5 DONE Buy and sell a stock once

This problem is concerned with the problem of optimally buying and selling a stock once. As an example, consider the following sequence of stock prices: <310, 315, 275, 295, 260, 270, 290, 230, 255, 250>. The maximum profit that can be made with one buy and one sell is 30 – buy at 260 and sell at 290. Note that 260 is not the lowest price, nor 290 the highest price.

Write a program that takes an array denoting the daily stock price, and returns the maximum profit that could be made by buying and then selling one share of that stock.

2.2.5.1 Solution

Note that this problem is a simplification of the knapsack problem. A naive solution would reduce this problem to its inspiration, giving us a O(n²) solution. However, we note that the problem doesn’t ask for exactly which stocks to buy and sell for maximum profit – only the profit amount. This simplification means that we do not need the comprehensive bookkeeping that a DP-based solution to the knapsack problem provides us.

We first note that a lower buying price always results in a higher profit with the same selling price.

We can then implement a O(n) solution that compares the “current profit” – defined as difference between the current sell-price under consideration and the as-yet-seen lowest buy price, with a rolling maximum of that value. Every time we see a value less than the as-yet-seen lowest buy price, we update accordingly. Once we reach the end of the list, we return the rolling max value.

def max_profit(*args):
    min_so_far = args[0]
    profit = 0
    for p in args:
        profit = max(profit, p - min_so_far)
        if p < min_so_far:
            min_so_far = p
    return profit

max_profit(310,315,275,295,260,270,290,230,255,250) == 30

True

2.3 Strings

2.3.1 DONE Interconvert strings and integers

Implement string/integer inter-conversion functions.

2.3.1.1 Solution

def stoi(s):
    i = 0
    for c in s:
        i = 10 * i + ord(c) - ord("0")
    return i

all([
    stoi("123") == 123,
    stoi("0") == 0,
    stoi("98765432198") == 98765432198,
])

True

def itos(i):
    import math
    s = ""
    while True:
        s += chr(ord("0") + i % 10)
        i = int(math.floor(i / 10))
        if i == 0:
            break
    return s[::-1]

all([
    itos(123) == "123",
    itos(0) == "0",
    itos(98765432198) == "98765432198",
])

True

2.3.2 TODO Base conversion

In the decimal number system, the position of a digit is used to signify the power of 10 that digit is to be multiplied with. For example, “314” denotes the number 3 * 100 + 1 * 10 + 4 * 1. The base b number system generalizes the decimal number system: the string “a_k-1a_k-2…a₁a₁”, where 0 ≤ a_i ≤ b, denotes in base-b the integer a₀ × b⁰ + a₁ × b¹ + a₂ × b² + … + a_k-1 × b^k-1.

Write a program that performs base conversion. The input is a string, an integer b₁, and another integer b₂. The string represents an integer in base b₁. The output should be the string representing the integer in base b₂. Assume 2 ≤ b₁, b₂ ≤ 16. Use “A” to represent 10, “B” for 11, …, and “F” for 15. (For example, if the string is “615”, b₁ is 7 and b₂ is 13, then the result should be “1A7”, since 6 × 7² + 1 × 7 + 5 = 1 × 13² + 10 × 13 + 7).

2.3.3 TODO Replace and remove

Consider the following two rules that are to be applied to an array of characters.

Replace each “a” by two “d”s.
Delete each entry containing a “b”.

For example, applying these rules to the array <a,c,d,b,b,c,a> results in the array <d,d,c,d,c,d,d>.

Write a program which takes as input an array of characters, and removes each “b” and replaces each “a” by two “d”s. Specifically, along with the array, you are provided an integer-valued size. Size denotes the number of entries of the array that the operation is to be applied to. You do not have to worry about preserving subsequent entries. For example, if the array is <a,b,a,c,_> and the size is 4, then you can return <d,d,d,d,c>. You can assume there is enough space in the array to hold the final result.

2.3.4 DONE Test palindromicity

For the purpose of this problem, define a palindromic string to be a string which when all the nonalphanumeric are removed it reads the same front to back ignoring case.

Implement a function which takes as input a string s and returns true if s is a palindromic string.

2.3.4.1 Solution

Use two cursors; one that starts at start of string, and one at the end.

Each step, perform equality comparison of the chars under each, returning early with False if equality does not hold. Continue until i_s > i_e and return True if reach end of iteration.

Time O(n) and space O(1).

def is_pal(S):
    i_s, i_e = 0, len(S) - 1
    while i_s < i_e:
        if S[i_s].lower() != S[i_e].lower():
            return False
        i_s += 1
        i_e -= 1
        while not S[i_s].isalnum() and i_s < i_e:
            i_s += 1
        while not S[i_e].isalnum() and i_s < i_e:
            i_e -= 1
    return True

all([
    is_pal("A man, a plan, a canal, Panama"),
    not is_pal(",,a,b,,"),
])

True

2.3.5 DONE Compute all mnemonics for a phone number

Write a program which takes as input a phone number, specified as a string of digits, and returns all possible character sequences that correspond to the phone nuber. The cell phone keypad is specified by a mapping that takes a digit and returns the corresponding set of characters. The character sequencs do not have to be legal words or phrases.

2.3.5.1 Solution

We maintain a static mapping from digits to sets of characters and use recursion to generate the power set of each digits’ set.

Complexity O(4ⁿ), since each recursion step “fans out” at most four times (due to keypad mapping).

def mnemonics(S):
    MAP = {
        "0": ["0"],
        "1": ["1"],
        "2": set("abc"),
        "3": set("def"),
        "4": set("ghi"),
        "5": set("jkl"),
        "6": set("mno"),
        "7": set("pqrs"),
        "8": set("tuv"),
        "9": set("wxyz"),
    }
    if S == "":
        return []
    elif S[0] not in MAP:
        raise Exception
    return set([c+t for c in MAP[S[0]] for t in mnemonics(S[1:])])

all([
    mnemonics("2276696") | { "ACRONYM", "ABPOMZN" },
])

True

2.4 Linked Lists

2.4.1 DONE Merge two sorted lists

Write a program that takes two lists, assumed to be sorted, and returns their merge. The only field your program can change in a node is its next field.

Hint: Two sorted arrays can be merged using two indices. For lists, take care when one iterator reaches the end.

2.4.1.1 Solution

We describe a solution that completes the task in linear time and constant space.

Call input lists A and B.

We decide on the head of the return list with respect to comparison. We save a reference H to this head for final return; in the meantime, we create an additional “work-in-progress” reference l that we will use to iteratively wire up the return value.

While neither A nor B have reached their ends, we compare the head values of each; whichever is less than or equal to the other, becomes the new target for l.next. We then increment both the assignee and l to their next links.

Once one of A or B have reached their end, we treat the other as the “remainder” list. Since the two input lists are given to be sorted, we have the invariant that every element in the remainder is greater than or equal to the current l. As such, we assign l.next = remainder.

For this solution’s purpose, we define a lightweight linked-list API as follows:

class LL():
    def __init__(self, v):
        self.v = v
        self.next = None
    def append(self, l):
        self.next = l
        return self
    def __eq__(self,l):
        me = self
        while me is not None and l is not None:
            if me.v != l.v:
                return False
            me = me.next
            l = l.next
        return me is None and l is None

Our solution is as follows:

def merge(A,B):
    if A is None:
        return B
    if B is None:
        return A
    if A.v < B.v:
        head = A
        A = A.next
    else:
        head = B
        B = B.next
    l = head # wip tracker
    cursors = { "A": A, "B": B }
    while cursors["A"] is not None and cursors["B"] is not None:
        k_next = "A" if cursors["A"].v < cursors["B"].v else "B"
        l.next = cursors[k_next]
        l = l.next
        cursors[k_next] = cursors[k_next].next
    l.next = cursors["A"] if cursors["A"] is not None else cursors["B"]
    return head

all([
    # base cases
    merge(None,None) == None,
    merge(None, LL(1).append(LL(2))) == LL(1).append(LL(2)),
    merge(LL(1).append(LL(3)), None) == LL(1).append(LL(3)),

    # "normal" case
    merge(
        LL(1).append(LL(3).append(LL(5))),
        LL(2).append(LL(4).append(LL(6))),
    ) == LL(1).append(LL(2).append(LL(3).append(LL(4).append(LL(5).append(LL(6)))))),

    # remainder case
    merge(
        LL(1).append(LL(5)),
        LL(2).append(LL(6).append(LL(10))),
    ) == LL(1).append(LL(2).append(LL(5).append(LL(6).append(LL(10))))),
])

True

2.4.2 DONE Reverse a singly linked list

2.4.2.1 Solution

class LL():
    def __init__(self, v):
        self.v = v
        self.next = None
    def append(self, l):
        self.next = l
        return self
    def __eq__(self,l):
        me = self
        while me is not None and l is not None:
            if me.v != l.v:
                return False
            me = me.next
            l = l.next
        return me is None and l is None

def ll_rev(L):
    tail = None
    cursor = L
    while cursor is not None:
        nxt = cursor.next
        cursor.next = tail
        tail = cursor
        cursor = nxt
    return tail

ll_rev(LL(4).append(LL(5).append(LL(6)))) == LL(6).append(LL(5).append(LL(4)))

2.4.3 DONE Test for cyclicity

2.4.3.1 Solution

class LL():
    def __init__(self, value, nxt=None):
        self.value = value
        self.nxt = nxt

For convenience’s sake, we assume that the linked-list has a uniqueness constraint. This constraint allows us to uniquely reference each node in the linked-list by its contained value.

This constraint can be removed by using, say, memory address as the unique reference, but the algorithm remains the same.

def is_cycle(ll):
    c_back = ll
    c_front = ll.nxt
    if not c_front:
        return False
    while c_back.value != c_front.value:
        c_back = c_back.nxt
        for _ in xrange(2):
            c_front = c_front.nxt
            if not c_front:
                return False
    return True

all([
    not is_cycle(LL(1)),
    not is_cycle(LL(1, LL(2, LL(3, LL(4, LL(5)))))),
    is_cycle(LL(1, LL(2, LL(3, LL(2, LL(3, LL(2, LL(3)))))))),
])

True

2.4.4 DONE Test for overlapping lists – lists are cycle-free

Write a program that takes two cycle-free singly linked lists, and determines if there exists a node that is common to both lists.

2.4.4.1 Solution

We note that the case where lists L₁ and L₂ are of equal length is trivial. We therefore attempt to reduce cases where the input lists are of different length to that simple case. Measure the lengths of lists L₁ and L₂; this can be done in O(n) time. Advance the longer of the two lists by the difference in lengths, at which point you’ve arrived at the trivial case; advance through both in tandem until you either reach the end of both lists – showing that there is no overlap – or until you reach the overlap.

2.4.5 DONE Remove the k^th last element from a list

Given a singly linked list and an integer k, write a program to remove the kth last element from the list. Your algorithm cannot use more than a few words of storage, regardless of the length of the list. In particular, you cannot assume that it is possible to record the length of the list.

Hint: If you know the length of the list, can you find the kth last node using two iterators?

2.4.5.1 Solution

We note that we do not need to know the specific length of the list L in order to find the kth-last element.

We use two cursors, c₁ and c₂, where c₂ is k steps ahead of c₁ in the list L. If L is not long enough to satisfy this invariant on initialization, we terminate with an error.

We then iterate each cursor in tandem, keeping a separate pointer to the previous item under c₁ on each iteration – call it c_p – until c₂ reaches the terminus of the list – concretely, the null-pointer of the linked list. At this point, c₁ is referring to the k-th last element of L. We then delete the element the usual way.

This implementation is O(n) in time and O(1) in space.

class LL():
    def __init__(self, v):
        self.v = v
        self.next = None
    def append(self, l):
        self.next = l
        return self
    def __eq__(self,l):
        me = self
        while me is not None and l is not None:
            if me.v != l.v:
                return False
            me = me.next
            l = l.next
        return me is None and l is None 

def cons(v, n=None):
    l = LL(v)
    l.next = n
    return l

def rm_kth_last(L, k):
    out = L
    c_p = None
    c_1, c_2 = out, out
    for _ in range(k):
        if c_2.next is None:
            raise Exception
        c_2 = c_2.next
    while c_2 is not None:
        c_p = c_1
        c_1 = c_1.next
        c_2 = c_2.next
    c_p.next = c_1.next
    return out

all([
    rm_kth_last(cons(1,cons(2,cons(3))), 1) == cons(1,cons(2)),
    rm_kth_last(cons(1,cons(2,cons(3,cons(4,cons(5))))), 3) == cons(1,cons(2,cons(4,cons(5)))),
])

True

2.4.6 TODO Implement even-odd merge

2.4.7 TODO Test whether a singly linked list is palindromic

2.5 Stacks and Queues

2.5.1 DONE Implement a stack with `max` API

Design a stack that includes a max operation, in addition to push and pop. The max method should return the maximum value stored in the stack.

2.5.1.1 Solution

We can use an augmentation of a “vanilla” stack for this purpose. Each element of this augmented stack – call it a “max stack” – will maintain a record of the maximum value at or below its current level. This will allow us to preserve the following invariant for given max-stack S:

S.head.max = max(S)

We can implement the max-stack as follows:

class MaxStack():
    def __init__(self, *args):
        self.record = []
        for v in args:
            self.push(v)
    def push(self, v):
        if not self.record:
            self.record.append((v,v))
        else:
            self.record.append((v,max(v,self.record[-1][1])))
        return self
    def pop(self):
        if not self.record:
            return None
        out = self.record[-1][0]
        self.record = self.record[0:-1]
        return out
    # drop silently pops 
    def drop(self):
        self.pop()
        return self
    def max(self):
        if not self.record:
            return None
        return self.record[-1][1]

all([
    MaxStack(1,4,3,2,5).max() == 5,
    MaxStack(1,4,3,2,5).drop().max() == 4,
    MaxStack(2,3,4,1).drop().drop().max() == 3,
])

True

This implementation is:

O(1) for push;
O(1) for pop;
O(1) for max lookup.

Space complexity is O(2n) = O(n), where n is the stack size.

2.5.2 DONE Test a string of parentheses, braces, and brackets for well-formedness

2.5.2.1 Solution

def is_well_formed(S):
    PAIRS = {
        "{": "}",
        "(": ")",
        "[": "]",
    }
    opens = []
    for c in S:
        if c in PAIRS:
            opens.append(c)
        elif opens and c == PAIRS[opens[-1]]:
            opens = opens[:-1]
        else:
            return False
    return not opens

all([
    is_well_formed(""),
    is_well_formed("()"),
    is_well_formed("[]"),
    is_well_formed("{}"),
    is_well_formed("{[()]}"),
    not is_well_formed("{[([)]}"),
    not is_well_formed("}"),
])

True

2.5.3 DONE Compute binary tree nodes in order of increasing depth

2.5.3.1 Solution

We use a queue as the basis of our solution. We start with the input tree T in the queue. For each node N in the queue, we enqueue its children, and then yield N. We continue until the queue is empty for a final time complexity of O(n) and likewise for space.

def serialize_inc_depth(T):
    q = [T]
    while q and q[0] is not None:
        curr = q[0]
        q.extend([c for c in [curr.l, curr.r] if c])
        yield q.popleft()

2.6 Binary Trees

2.6.1 DONE Test if a binary tree is height-balanced

A binary tree is said to be balanced if for each node in the tree, the difference in the height of its left and right subtrees is at most one. A perfect binary tree is balanced, as is a complete binary tree. A balanced binary tree does not have to be perfect or complete.

Write a program that takes as input the root of a binary tree and checks whether the tree is balanced.

2.6.1.1 Solution

We can use a post-order traversal as the backbone for our implementation.

For each subtree, we determine its height. When traversing parent nodes, if the difference in the height of its two subtrees is greater than 1, we return false immediately. Otherwise, we return one greater than the greater of the two children heights.

def is_height_balanced(T):
    def height(n):
        if not n:
            return 0
        hl, hr = height(n.left), height(n.right)
        if abs(hl - hr) > 1:
            raise Exception
        return max(hl, hr) + 1
    try:
        height(T)
    except Exception:
        return False
    return True

This implementation is O(n), where n is the number of nodes in the tree. It is O(1) in space.

2.6.2 DONE Compute the LCA when nodes have parent pointers

2.6.2.1 Solution

We note that the solution is trivial if the nodes are at the same depth: iterate in tandem until you reach the common ancestor node. This operation is O(log n).

Otherwise, if the nodes are at different depths, we can iterate the deeper node until both cursors are at the same depth, at which point the problem reduces to the same-depth case.

Both of these cases require us to determine the depths of the two nodes. This can be done by tracing the respective parent pointers to the root and storing the traversal length.

We note that both depth-determination and final traversal are O(log n); the combined solution is O(log n) w.r.t. time and O(1) w.r.t. space.

2.6.3 DONE Test if a binary tree is symmetric

A binary tree is symmetric if you can draw a vertical line through the root and then the left subtree is the mirror image of the right subtree.

Write a program that checks whether a binary tree is symmetric.

Hint: The definition of symmetry is recursive.

2.6.3.1 Solution

We note that trees T₁ and T₂ are symmetric if their root values are equal and T₁’s left child equals T₂’s right child and vice-versa.

We recursively check the input tree. The input root level is a special case where we simply check children equality. We then begin recursive “mirroring” comparison on the two child trees. “Mirroring” comparison consists of first checking that the left-right and right-left child value equalities are satisfied and then performing recursive mirroring comparison on the left-right and right-left pairs.

class Tree():
    def __init__(self, v, l=none, r=none):
        self.v = v
        self.l = l
        self.r = r

def is_sym(T):
    def is_mirror(T1, T2):
        return ((T1 is None and T2 is None)
                or (T1.v == T2.v
                    and is_mirror(T1.l, T2.r)
                    and is_mirror(T1.r, T2.l)))
    return T is None or is_mirror(T.l, T.r)

all([
    is_sym(None),
    is_sym(Tree(v=1, l=Tree(v=2), r=Tree(v=2))),
    is_sym(Tree(
        v=1,
        l=Tree(v=2, l=Tree(v=3, l=Tree(v=10)), r=Tree(v=4)),
        r=Tree(v=2, l=Tree(v=4), r=Tree(v=3, r=Tree(v=10))),
    )),
    not is_sym(Tree(v=1, l=Tree(v=2), r=Tree(v=3))),
])

True

2.6.4 DONE Reconstruct a binary tree from traversal data

Given an inorder traversal sequence and a postorder traversal sequence of a binary tree write a program to reconstruct the tree. Assume each node has a unique key.

Hint: Focus on the root.

The key insight here is that, for any input in-order traversal it and any input post-order traversal pt, we have the following invariants:

it and pt are equal in length;
The last element of pt corresponds to the “root” of the given tree.

Furthermore, we observe that it and pt are each laid out as follows:

it [<-—LHS-—><ROOT><-—RHS-—>] pt [<-—LHS-—><-—RHS-—><ROOT>]

As such, we can use the tail value of pt during every iteration to “bisect” the in-order traversal list, giving us the number of elements in both the left branch and the right branch. Using these quantities, we can extract the corresponding sub-lists out of the top-level post-order traversal, giving us everything that we need for a recursive implementation.

def teq(lhs, rhs):
    return (lhs is None and rhs is None) \
        or (lhs.value == rhs.value
            and teq(lhs.lhs, rhs.lhs)
            and teq(lhs.rhs, rhs.rhs))


class Tree():
    def __init__(self, value, lhs=None, rhs=None):
        self.value = value
        self.lhs = lhs
        self.rhs = rhs


def reconstruct(it, pt):
    assert len(it) == len(pt)
    if not it and not pt:
        return None

    if len(it) == 1 and len(pt) == 1:
        assert it[0] == pt[0]
        return Tree(it[0])

    v_root = pt[-1]

    it_root_index = it.index(v_root)

    it_lhs = it[0:it_root_index]
    pt_lhs = [] if it_root_index == 0 else pt[
        0:len(it_lhs)
    ]

    it_rhs = it[it_root_index+1:]

    pt_rhs_start = 0 if len(pt_lhs) == 0 else len(pt_lhs)
    pt_rhs = [] if it_root_index == len(it)-1 else pt[
        pt_rhs_start:pt.index(v_root)
    ]

    return Tree(
        value=v_root,
        lhs=reconstruct(it_lhs, pt_lhs),
        rhs=reconstruct(it_rhs, pt_rhs),
    )

all([
    teq(reconstruct("A", "A"), Tree("A")),
    teq(reconstruct("213", "231"),
        Tree("1", lhs=Tree("2"), rhs=Tree("3"))),
    teq(reconstruct("acbd", "abcd"),
        Tree("d", lhs=Tree("c", lhs=Tree("a"), rhs=Tree("b")))),
    teq(reconstruct("dacb", "abcd"),
        Tree("d", rhs=Tree("c", lhs=Tree("a"), rhs=Tree("b")))),
])

True

This implementation, given that it needs to perform an index-search on the unordered lists effectively for every element of the list, is O(n²) with respect to the number of elements in the tree.

2.6.4.1 Corollary

What about in-order and pre-order?

2.6.5 TODO Implement an inorder traversal with O(1) space

The direct implementation of an inorder traversal using recursion has O(h) space complexity, where h is the height of the tree. Recursion can be removed with an explicit stack, but the space complexity remains O(n).

Write a nonrecursive program for computing the inorder traversal sequence for a binary tree. Assume nodes have parent fields.

Hint: How can you tell whether a node is a left child or right child of its parent?

2.6.6 LOOKED-UP Reconstruct a binary tree from a preorder traversal with markers

Design an algorithm for reconstructing a binary tree from a preorder traversal visit sequence that uses null to mark empty children.

Hint: It’s difficult to solve this problem by examining the preorder traversal visit sequence from left-to-right.

def teq(lhs, rhs):
    return (lhs is None and rhs is None) \
        or (lhs.v == rhs.v
            and teq(lhs.l, rhs.l)
            and teq(lhs.r, rhs.r))


class Tree():
    def __init__(self, v, l=None, r=None):
        self.v = v
        self.l = l
        self.r = r


def from_pre(pre):
    idx = {'idx': 0}

    def _from_pre(pre):
        v = pre[idx['idx']]
        if not v:
            return None
        idx['idx'] += 1
        l = _from_pre(pre)
        idx['idx'] += 1
        r = _from_pre(pre)
        return Tree(v, l=l, r=r)

    return _from_pre(pre)


def test():
    assert teq(from_pre([1, 2, None, None, 3, None, None]),
               Tree(1, l=Tree(2), r=Tree(3)))

2.6.7 TODO Compute the right sibling tree

Assume each binary tree node has an extra field, call it level-next, that holds a binary tree node (this field is distinct from the fields for the left and right children). The level-next field will be used to compute a map from nodes to their right siblings. The input is assumed to be perfect binary tree.

Write a program that takes a perfect binary tree, and sets each node’s level-next field to the node on its right, if one exists.

Hint: Think of an appropriate traversal order.

2.7 Heaps

2.7.1 TODO Merge sorted files

2.7.2 TODO Compute the k closest stars

2.8 Searching

2.8.1 DONE Search a sorted array for first occurrence of `k`

Binary search commonly asks for the index of any element of a sorted array that is equal to a specified element. The following problem has a slight twist on this.

Write a method that takes a sorted array and a key and returns the index of the first occurrence of the key in the array.

2.8.1.1 Solution

A brute-force solution would be to iterate through the array A in its entirety, from start to end, until key k is found. This solution would be O(n), which isn’t terrible but would fail to take advantage of the fact that A is sorted.

To improve, we employ binary search, returning the identity of the first occurrence of k if k ∈ A and null otherwise. To account for relative indices in recursion, we pass in an “offset” that the returned index is then modulated by.

def first_occurrence(A, k):
    import math
    def fo(A, k, offset):
        if A == []:
            return None
        if len(A) == 1:
            return offset if A[0]==k else None
        i_mid = int(math.floor(len(A) / 2))
        if A[i_mid] >= k:
            i = fo(A[:i_mid], k, offset=0)
            return offset+i_mid if not i else i
        else:
            return fo(A[i_mid+1:], k, offset=i_mid+1)
    return fo(A,k,0)

all([
    first_occurrence([1,2,3], 3) == 2,
    first_occurrence([1,2,2,2,3], 2) == 1,
    first_occurrence([1,2,3,4,5], 6) == None,
])

True

2.8.2 TODO Compute the integer square root

2.8.3 TODO Find the k-th largest element

2.9 Hash Tables

2.9.1 DONE Partition into anagrams

Write a program that takes as input a set of words and returns groups of anagrams for those words. Each group must contain at least two words.

2.9.1.1 Solution

We can implement solution that avoids the need to compare all pairs of strings by hashing each string to its sorted version. Strings whose sorted forms are equal are anagrams. This implementation uses n calls to sort for O(n m log m), where n is the number of strings and m is the length of the max string.

def get_anagram_clusters(S):
    cs = {}
    for s in S:
        k = ''.join(sorted(s))
        if k not in cs:
            cs[k] = set()
        cs[k].add(s)
    return [v for _,v in cs.iteritems()]

all([
    s in get_anagram_clusters([
        "debitcard",
        "elvis",
        "silent",
        "badcredit",
        "lives",
        "freedom",
        "listen",
        "levis",
        "money",
    ]) for s in [
        set(["debitcard", "badcredit"]),
        set(["elvis", "lives", "levis"]),
        set(["silent", "listen"]),
    ]
])

True

2.9.2 DONE Test for palindromic permutations

Write a program to test whether the letters forming a string can be permuted to form a palindrome. For instance, “edified” can be permuted to form “deified”.

2.9.2.1 Solution

We assume that there is no requirement that the resulting palindrome be a word in the English language.

We note that, in the case of even-length strings, we require the count of each letter to be evenly divisible by two. We additionally note that, in the case of odd-length strings, there is one and only one letter with count of one.

This implementation is O(n) in time and space.

def can_palindrome(s):
    lcs = {}
    for c in s:
        if c not in lcs:
            lcs[c] = 0
        lcs[c] += 1
    if len(s) % 2 == 0:
        return all(v % 2 == 0 for k,v in lcs.iteritems())
    else:
        is_pivot_found = False
        for k,v in lcs.iteritems():
            if v == 1:
                if is_pivot_found:
                    return False
                else:
                    is_pivot_found = True
                    continue
            elif v % 2 != 0:
                return False
        return True

all([
    can_palindrome("racecar"),
    can_palindrome("rraacce"),
    not can_palindrome("foobar"),
])

True

2.9.3 DONE Is an anonymous letter constructible?

Write a program which takes text for an anonymous letter and text for a magazine and determines if it is possible to write the anonymous letter using the magazine. The letter can be written using the magazine if for each character in the letter, the number of times it appears in the anonymous letter is no more than the number of times it appears in the magazine.

2.9.3.1 Solution

We implement a solution that reduces the letter and the magazine into dictionaries. We then check that the magazine dictionary contains all of the letter dictionary’s keys and, for each of those keys, that it maps to a count greater than or equal to that contained in the letter dictionary.

This solution is in time O(n) with respect to the cumulative length of the letter and magazine. Space is, similarly, O(n).

For the sake of simplicity, we assume that inputs do not contain spaces. Accounting for spaces is trivial and would simply involve splitting each input on whitespace characters and iterating across sub-lists.

def is_possible(l, m):
    def to_dict(s):
        out = {}
        for c in s:
            if c not in out:
                out[c] = 0
            out[c] += 1
        return out

    dl = to_dict(l)
    dm = to_dict(m)

    for k,v in dl.iteritems():
        if k not in dm or dm[k] < v:
            return False

    return True

2.9.4 TODO Implement an ISBN cache

2.10 Sorting

2.10.1 DONE Compute the intersection of two sorted arrays

Write a program which takes as input two sorted arrays, and returns a new array containing elements that are present in both of the input arrays. The input arrays may have duplicate entries, but the returned array should be free of duplicates. For example, if the input is <2,3,3,5,5,6,7,7,8,12> and <5,5,6,8,8,9,10,10>, your output should be <5,6,8>.

2.10.1.1 Solution

def intersection(A,B):
    if not A or not B:
        return []
    out = []
    lower,upper = A, B
    while lower and upper:
        lower = lower if lower[0] < upper[0] else upper
        upper = upper if lower[0] < upper[0] else lower
        while lower and lower[0] != upper[0]:
            lower = lower[1:]
        if not lower or not upper:
            break
        item = lower[0]
        out.append(item)
        while lower and lower[0] == item:
            lower = lower[1:]
        while upper and upper[0] == item:
            upper = upper[1:]
    return out

all([
    intersection([],[]) == [],
    intersection([],[1,2,3]) == [],
    intersection([1,2,3],[]) == [],
    intersection(
        [1,2,3,4,5],
        [4,4,5,6,7],
    ) == [4,5,6,7],
    intersection(
        [1,2,3],
        [4,5,6],
    ) == [],
])

True

This implementation is linear on its inputs.

2.10.2 TODO Implement mergesort in-place

Write a program which takes as input two sorted arrays of integers, and updates the first to the combined entries of the two arrays in sorted order. Assume the first array has enough empty entries at its end to hold the result.

Hint: Avoid repeatedly moving entries.

2.10.3 DONE Count the frequencies of characters in a sentence

Given a string, print in alphabetical order each character that appears in the string, and the number of times that it appears. For example, if the string is “bcdacebe”, output (a,1), (b,2), (c,2), (d,1), (e,2).

Hint: Exploit the fact that the keys are drawn from a small set.

2.10.3.1 Solution

We assume that the input string consists solely of lowercase alphabetic characters. However, the solution is generalizeable.

We point out that the character domain is finite – specifically, of size 26. As such, we use an array of size 26, with index representing character, with “0” corresponding to “a” etc., to record the number of times the corresponding letter appears in the input string. It is then trivial to output the array values in alphabetical order.

Both the record-keeping operation and the output operation are linear. The overall solution is linear in time and constant in space.

def freqs(S):
    counts = [0] * 26
    for c in S:
        counts[ord(c) - ord("a")] += 1
    for i in range(len(counts)):
        if counts[i] > 0:
            yield (chr(ord("a")+i), counts[i])

list(freqs("bcdacebe")) == [("a",1),("b",2),("c",2),("d",1), ("e",2)]

True

2.10.4 DONE Render a calendar

Write a program that takes a set of events, and determines the maximum number of events that can take place concurrently.

2.10.4.1 Solution

We assume that the domain is unbounded – that is, that any event can occur at any given time t.

We assume that an event E is represented as a tuple (t_s, t_e), where t_s is the start time and t_e the end time.

Instead of considering discrete time values, we consider unit intervals. For instance, the event (t, t+2) falls into two interval “buckets” – the first representing the interval [t, t+1], and the second the interval [t+1, t+2].

We maintain a counter dictionary, keyed on the start times of these intervals, that keeps track of how many events overlap with the key interval. For each event E, we split E into its constituent unit intervals and populate the counter accordingly. We choose dictionary for the following reasons:

We assume no bound on the domain of time T, so we choose a data structure that doesn’t require an explicit initial size for convenience;
We make no assumptions about the proximity of the respective events’ intervals; we can very easily have events (0, 10) and (10000, 10010). Using an alternative storage construct, such as an array, would require us to allocate upwards of 10000 buckets to store information for these events, only for all but twenty of those buckets to be meaningless, i.e. with value zero. A dictionary, on the other hand, allows us to only allocate 20 buckets, for considerably greater space efficiency.

The resulting solution is O(nl) in time and space, where n is the number of events and l is the max length of the event intervals.

def atomize_interval(start, end):
    for s in range(start, end):
        yield (s, s+1)

list(atomize_interval(0,5)) == [(0,1), (1,2), (2,3), (3,4), (4,5)]

True

def max_sim(*E):
    time_to_sim = {}
    for e in E:
        for i in atomize_interval(*e):
            if i[0] not in time_to_sim:
                time_to_sim[i[0]] = 0
            time_to_sim[i[0]] += 1
    return time_to_sim[max(time_to_sim, key=(lambda k: time_to_sim[k]))]

all([
    max_sim((0,10)) == 1,
    max_sim((0,10),(2,11),(3,12)) == 3,

    # non-contiguous events
    max_sim((0,10), (2,11), (100, 110), (101,111), (102,112), (103,113)) == 4,
])

True

2.10.5 TODO Compute the union of intervals

2.10.6 TODO Implement a fast sorting algorithm for lists

2.10.7 TODO Partitioning and sorting an array with many repeated entries

2.11 Binary Search Trees

2.11.1 DONE Test if a binary tree satisfies the BST property

Write a program that takes as input a binary tree and checks if the tree satisfies the BST property.

2.11.1.1 Solution

Iterate through each subtree, keeping track of a local maximum and minimum. In addition to asserting that the two leaves relate to the node as necessary, similarly assert that the two leaves fall within the maximum and minimum. When recursing into leaves, update either the maximum or the minimum with the current node value depending on which leave is being recursed into.

2.11.2 DONE Find the first key larger than a given value in a BST

Write a program that takes as input a BST and a value, and returns the first key that would appear in an inorder traversal which is greater than the input value.

Hint: Perform binary search, keeping some additional state.

2.11.2.1 Solution

Given BST T and value v, perform binary search, keeping track of the current “minimum greater-than” value v_gt, which we can initialize to +∞. On finding v within T, if v_gt = +∞, return the right-hand sub-value of v; otherwise, return v_gt.

This implementation is O(1) in space (for v_gt) and O(h) in time.

class Tree():
    def __init__(self, v, l=None, r=None):
        self.v = v
        self.l = l
        self.r = r

def first_gt(T, v):
    v_gt = None
    cursor = T
    while cursor.v != v:
        if cursor.v < v:
            cursor = cursor.r
        elif cursor.v > v:
            v_gt = cursor.v if v_gt is None else min(v_gt, cursor.v)
            cursor = cursor.l
    return v_gt if v_gt != None else cursor.r.v if cursor.r is not None else cursor.r

_t = Tree(
    v=5,
    l=Tree(v=2, l=Tree(v=1), r=Tree(v=4, l=Tree(v=3))),
    r=Tree(
        v=8,
        l=Tree(v=7, l=Tree(v=6)),
        r=Tree(v=10, l=Tree(v=9), r=Tree(v=11))))
all([
    first_gt(_t, 6) == 7,
    first_gt(_t, 10) == 11,
    first_gt(_t, 11) == None,
])

True

2.11.3 DONE Find the k largest elements in a BST

2.11.4 DONE Compute the LCA in a BST

Design an algorithm that takes as input a BST and two nodes, and returns the LCA of the two nodes. Assume all keys are distinct. Nodes do not have references to their parents.

Hint: Take advantage of the BST property.

2.11.4.1 Solution

We note that the BST property gives us that the LCA is between the max of the two nodes values and the min.

We use a cursor C that starts at the head of input tree T. We iterate C according to the BST principle depending on if its current value is greater than or less than the max or the min of the two node values, respectively. Once we arrive at a node that’s in between the max and the min, we are done.

This implementation is O(h) in time.

class Tree():
    def __init__(self, v, l=None, r=None):
        self.v = v
        self.l = l
        self.r = r

def lca(T, v1, v2):
    while not(T.v > min(v1,v2) and T.v < max(v1,v2)):
        if T is None:
            raise Exception
        if T.v > max(v1,v2):
            T = T.l
        elif T.v < min(v1,v2):
            T = T.r
    return T.v

_t = Tree(
    v=19,
    l = Tree(
        v=7,
        l=Tree(v=3, l=Tree(v=2), r=Tree(v=5)),
        r=Tree(v=11, r=Tree(v=17, l=Tree(v=13))),
    ),
    r=Tree(
        v=43,
        l=Tree(v=23, r=Tree(v=37, l=Tree(v=29, r=Tree(v=31)), r=Tree(v=41))),
        r=Tree(v=47, r=Tree(v=53)),
    )
)

all([
    lca(_t, 5, 17) == 7,
    lca(_t, 13, 53) == 19,
])

True

2.11.5 TODO The most visited pages problem

You are given a server log file containing billions of lines. Each line contains a number of fields. For this problem the relevant field is an id denoting the page that was accessed.

Write a function to read a log file line, and a function to find the k most visited pages, where k is an input to the function. Optimize performance for the situation where calls to the two functions are interleaved. You can assume the set of distinct pages is small enough to fit in RAM.

As a concrete example, suppose the log file ids appear in the following order: g,a,t,t,a,a,a,g,t,c,t,a,t, i.e., there are four pages with ids a,c,g,t. After the first 10 lines have been read, the most common page is a with a count of 4, and the next most common page is t with a count of 3.

Hint: For each page, count the number of times it has been visited.

2.11.6 TODO Reconstruct a BST from traversal data

2.11.7 TODO Build a minimum height BST from a sorted array

2.11.8 TODO Enumerate numbers of the form a + b sqrt(2)

2.11.9 TODO Insertion and deletion in a BST

2.11.9.1 Solution

We can use reverse in-order traversal, yielding values until the count has been satisfied, for an implementation that is O(n) in time and O(log n) in space, where n is the number of entries in the BST.

class Tree():
    def __init__(self, v, l=None, r=None):
        self.v = v
        self.l = l
        self.r = r

def get_k_largest(T, k):
    def _get_k_largest(T,k):
        if not T:
            return [], k
        vs, k_rem = _get_k_largest(T.r,k)
        if k_rem == 0:
            return vs, 0
        vs.append(T.v)
        k_rem -= 1
        if k_rem == 0:
            return vs, 0
        lhs, k_rem = _get_k_largest(T.l, k_rem)
        vs.extend(lhs)
        if k_rem == 0:
            return vs, 0
        return vs, k_rem 
    vs, _ = _get_k_largest(T,k)
    return vs

all([
    get_k_largest(Tree(
        v = "A",
        l = Tree(v = "B", l = Tree(v = "D"), r = Tree(v = "E")),
        r = Tree(
            v = "C",
            l = Tree(v = "F"),
            r = Tree(
                v = "G",
                l = Tree(v = "H", r = Tree(v = "J")),
                r = Tree(v = "I"),
            ),
        ),
    ), 9) == ["I", "G", "J", "H", "C", "F", "A", "E", "B"],
])

True

2.12 Recursion

2.12.1 DONE Tower of Hanoi

We note that the base condition – a tower of height 1 – is trivial.

For each operation, each of the three pillars is assigned one of three “purposes”, as follows:

One is the “source” pillar, from which we’d like to move a tile;
One is the “destination” pillar, to which we’d like to move said tile; and
One is the “buffer” pillar, where we move all tiles that are blocking our tile in question from being moved to its destination.

For explanation’s sake, let’s say that all tiles start on Pillar 1, and that we’d like to move them all to Pillar 3 according to the rules of the game.

def hanoi(p1, p2=[], p3=[]):
    _source = p1
    _buff = p2
    _dest = p3

    def move(source=_source, dest=_dest, buff=_buff, n=len(_source)):
        if n == 0:
            return
        move(source=source, dest=buff, buff=dest, n=n-1)
        dest.append(source.pop())
        move(source=buff, dest=dest, buff=source, n=n-1)

    move()
    return _source, _buff, _dest

2.12.2 TODO Generate all nonattacking placements of n-Queens

2.13 Dynamic Programming

2.13.1 WRITTEN Count the number of score combinations

In an American football game, a play can lead to 2 points (safety), 3 points (field goal), or 7 points (touchdown, assuming the extra point). Many different combinations of 2, 3, and 7 point plays can make up a final score. For example, four combinations of plays yield a score of 12:

6 safeties;
3 safeties, 2 field goals;
1 safety, 1 field goal, and 1 touchdown;
4 field goals.

Write a program that takes a final score and scores for individual plays, and returns the number of combinations of plays that result in the final score.

2.13.1.1 Solution

We can memoize the number of combinations that lead to certain scores, iterating through the memo to arrive at the desired final score and, as a result, the final combination count.

Say we have possible play scores 2 and 3, and we’d like the number of possible plays that could lead to a score of 9. We can represent our memo as a two-dimensional array, where one axis is the score and the other represents the set of plays that can comprise the score, the first index representing, in this case, the set {2} and the second, the set {2,3}.

We note that, for a given score S and a given set of plays P = {P', p}, number of combinations leading to score S N(S, P) equals (informally):

N(S-p, P') + N(S-2p, P') + ... + N(0, P')

We say that N(x, y) = 0 for x<0 and any y.

	0	1	2	3	4	5	6	7	8	9
`{2}`	1	0	1	0	1	0	1	0	1	0
`{2,3}`	1	0	1	1	1	1	2	1	2	2

A solution that uses this memoization strategy will be O(S \times |P|), where S is the score and P is the set of play scores. Likewise for space.

2.13.2 TODO Compute the Levenshtein distance

2.13.3 DONE Count the number of ways to traverse a 2D array

2.13.3.1 Solution

Our memoization strategy is as follows. We use a matrix T of the same shape as the input matrix M to track the number of ways to traverse to that point in the input. Matrix T is populated according to function T(i,j), which we define as follows:

T(i,j) = T(i-1,j) + T(i, j-1)
T(i, j) = 0 ∀ j ∈ ℜ, i < 0
T(i, j) = 0 ∀ j < 0, i ∈ ℜ

Our solution then becomes as follows:

def num_traversals(M):
    t = [[0 for _ in M[0]] for _ in M]
    def T(t, i,j):
        if i == -1 or j == -1:
            return 0
        if i == 0 and j == 0:
            return 1
        return t[i-1][j] + t[i][j-1]
    for i in range(0, len(M)):
        for j in range(0, len(M[i])):
            t[i][j] = T(t, i, j)
    return t[len(M)-1][len(M[0])-1]

all([
    num_traversals([[0,0,0,0,0] for _ in xrange(5)]) == 70,
])

This implementation is linear for both time and space with respect to the number of elements in the input matrix.

2.13.4 DONE Compute the binomial coefficient without overflow

2.13.4.1 Solution

For illustration’s purpose, we outline a C matrix, where C[n][k] = C(n+1,k+1) ∀ n,k ∈ ℜ:

1	2	3	4
0	1	3	6
0	0	1	4
0	0	0	1

We note that this gives us the following recursive definition of the binominal coefficient: C(n, k) = C(n-1, k-1) + C(n-1, k). A naive implementation would directly translate this recursive definition into a recursive implementation, resulting in re-computation of the same values for exponential time complexity w.r.t nk. Instead, we memoize intermediate results in a manner identical to the example matrix above:

def bico(n,k):
    def C(_C, n, k):
        if n == k:
            return 1
        elif k == 0:
            return n + 1
        elif k > n:
            return 0
        else:
            return _C[k-1][n-1] + _C[k][n-1]
    _C = [[0 for _ in xrange(n)] for _ in xrange(k)]
    for _k in range(0, k):
        for _n in range(0, n):
            _C[_k][_n] = C(_C, _n, _k)
    return _C[k-1][n-1]

all([
    bico(29, 3) == 3654,
    bico(3, 2) == 3,
])

This solution is O(nk) for both time and space.

2.13.5 DONE Search for a sequence in a 2D array

Write a program that takes as arguments a 2D array and a 1D array, and checks whether the 1D array appears in the 2D array.

2.13.5.1 Solution

We can use iteration through each element of the 2D array as the backbone of our solution’s logic; during iteration, if we encounter an element that’s equal to the first element of the sequence, we break into tracing logic. This tracing logic considers all of the element’s “neighbors” to see if they equal the next value in the sequence. We “trace” the sequence in this way; if we reach the end of the sequence in this way, we return true and are done. If, however, tracing leads to only a partial match, we mark the latest element in the trace as “invalid” and propagate that mark backwards through the trace. This is to prevent re-tracing of paths that are already known to be “lost causes” – an implementation that would lead to time complexity of O(nml), where n and m are the matrix’s dimensions and l is the length of the sequence. The result, where we preemptively avoid tracing paths that have already been deemed to not match the argument sequence, is an implementation that is in time O(nm) (traversal of the input sequence is amortized).

def contains_sequence(M, S):
    # eligibility matrix
    m_e = [[True for _ in xrange(len(M[0]))] for _ in xrange(len(M))]
    def neighbor_coords(i, j):
        if i < len(M)-1:
            yield (i+1, j)
        if i > 0:
            yield (i-1, j)
        if j < len(M[0])-1:
            yield (i, j+1)
        if j > 0:
            yield (i, j-1)
    def trace(i, j, seq):
        if not seq:
            return True
        if not m_e[i][j]:
            return False
        if M[i][j] != seq[0] or not any([
                trace(nc[0], nc[1], seq[1:]) for nc in neighbor_coords(i, j)
        ]):
            m_e[i][j] = False
            return False
        else:
            return True
    for i in range(0, len(M)):
        for j in range(0, len(M[0])):
            if trace(i, j, S):
                return True
    return False

all([
    contains_sequence(M, S) == r for M,S,r in [
        (
            [[1,2,5],
             [3,4,3],
             [5,6,7]],
            [3,4,7],
            False,
        ),
        (
            [[1,2,5],
             [3,4,3],
             [5,6,7]],
            [3,4,6,7],
            True,
        ),
        (
            [[1,2,3],
             [3,4,5],
             [5,6,7]],
            [1,3,4,6],
            True,
        ),
        (
            [[1,2,3],
             [3,4,5],
             [5,6,7]],
            [1,2,3,4],
            False,
        ),
    ]
])

True

2.14 Greedy Algorithms and Invariants

2.14.1 DONE The 3-sum problem

Design an algorithm that takes as input an array and a number, and determines if there are three entries in the array (not necessarily distinct) which add up to the specified number. For example, if the array is <11,2,5,7,3> then there are three entries in the array which add up to 21 (3, 7, 11, and 5, 5, 11) (note that we can use 5 twice, since the problem statement said we c an use the same entry more than once). However, no three entries add up to 22.

Hint: How would you check if a given array entry can be added to two more entries to get the specified number?

2.14.1.1 Solution

Note that we do not need to return the specific set of three entries – only determine that it exists.

First we sort the array A in O(n log n). Then, for each index i, we iterate through indices j and k in opposite directions to determine if any satisfy A[j] + A[k] = v - A[i]. This brings our final time complexity to O(n²).

def has_three_sum(A, v):
    for a in A:
        j = 0
        k = len(A) - 1
        while j < k:
            _v = a + A[j] + A[k]
            if _v < v:
                j += 1
            elif _v > v:
                k -= 1
            else:
                return True
    return False

all([
    has_three_sum([11,2,5,7,3], 21),
    not has_three_sum([11,2,5,7,3], 100),
    not has_three_sum([11,2,5,7,3], 0),
])

True

2.14.2 LOOKED-UP The gasup problem

A number of cities are arranged on a circular road. You need to visit all the cities and come back to the starting city. A certain amount of gas is available at each city. The amount of gas summed up over all cities is equal to the amount of gas required to go around the road once. Your gas tank has unlimited capacity. Call a city ample if you can begin at that city with an empty tank, refill at it, then travel through all the remaining cities, refilling at each, and return to the ample city, without running out of gas at any point.

Given an instance of the gasup problem, how would you efficiently compute an ample city, if one exists?

Hint: Think about starting with more than enough gas to complete the circuit without gassing up. Track the amount of gas as you perform the circuit, gassing up at each city.

2.14.2.1 Solution

Find the city/cities for which gas level is at minimum on arrival (ignoring impossibility of negative gas levels). Call one of these cities c. Since we have that total amount of gas at each city is enough to complete the full circle, and that, due to being at a minimum on arrival, we will never have less gas than we would have on entry to c, we know that we can complete the full circle from c. Finding the exact value of c is a simple matter of finding the minimum entry gas level of each city, which can be done in linear time.

2.14.3 TODO Find the majority element

You are reading a sequence of strings. You know a priori that more than half of the strings are repetitions of a single string but the positions where the majority element occurs are unknown. Write a program that makes a single pass over the sequence and identifies the majority element. For example, if the input is <b,a,c,a,a,b,a,a,c,a>, then a is the majority element (it appears in 6 out of the 10 places).

Hint: Take advantage of the existence of a majority element to perform elimination.

2.14.4 TODO Compute the maximum water trapped by a pair of vertical lines

2.14.5 TODO Compute the largest rectangle under the skyline

2.14.6 TODO Implement Huffman coding

2.15 Graphs

2.15.1 DONE Search a maze

Given a 2D array of black and white entries representing a maze with designated entrance and exit points, find a path from the entrance to the exit, if one exists.

Hint: Model the maze as a graph.

2.15.1.1 <2018-05-05 Sat> Solution

Without loss of generality, we make the following simplifying assumptions for any maze of size MxN:

The entrance is at index (0,0); and
The exit is at index (M-1,N-1).

We can treat this problem as similar to a depth-first search on a graph, where each element on our stack tracks:

The “current” room; as well as
All preceding rooms visited.

At worst, the problem is equivalent to depth-first traversal, with complexity O(V+E) = O(V+4V) = O(V).

def traverse(maze):
    def _mark_visited(i,j):
        maze[i][j]=0
    def _is_unvisited_room(i, j):
        return maze[i][j] == 1

    def _is_within_bounds(i, j):
        return i >= 0 and i < len(maze) and j >= 0 and j < len(maze[0])

    def _get_next_candidates(i, j):
        return filter(
            lambda c: _is_within_bounds(c[0],c[1]) and _is_unvisited_room(c[0],c[1]),
            [(i+1, j), (i-1, j), (i, j+1), (i, j-1)],
        )

    ENTRANCE = (0, 0)
    EXIT = (len(maze)-1, len(maze[0])-1)

    progress = [([], ENTRANCE)]

    while progress:
        path, curr = progress.pop()

        if curr == EXIT:
            return True, path + [curr]

        progress.extend([
            (path + [curr], cand)
            for cand in _get_next_candidates(curr[0], curr[1])
        ])

        _mark_visited(curr[0],curr[1])

    return False, []

all([
    traverse([
        [1, 0, 0],
        [1, 1, 0],
        [0, 1, 1],
    ]) == (True, [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2)]),

    traverse([
        [1, 1, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 0, 0, 0],
        [0, 1, 1, 1, 1],
    ]) == (True, [
        (0, 0), (0, 1), (1, 1), (1, 2), (1, 3), (2, 3), (3, 3), (3, 2), (3, 1),
        (4, 1), (5, 1), (5, 2), (5, 3), (5, 4),
    ]),

    traverse([
        [1,1,1],
        [0,0,1],
        [0,0,0],
    ]) == (False, []),
])

True

2.15.2 WRITTEN Transform one string to another

Let s and t be strings and D a dictionary, i.e. a set of strings. Define s to produce t if there exists a sequence of strings from the dictionary P = <s₀, s₁, …, s_n-1> such that the first string is s, the last string is t, and adjacent strings have the same length and differ in exactly one character. The sequence P is called a production sequence. For example, if the dictionary is {bat, cot, dog, dag, dot, cat}, then <cat, cot, dot, dog> is a production sequence.

Given a dictionary D and two strings s and t, write a program to determine if s produces t. Assume that all characters are lowercase alphabets. If s does produce t, output the length of the shortest production sequence; otherwise, output -1.

Hint: Treat strings as vertices in an undirected graph, with an edge between u and v if and only if the corresponding strings differ in one character.

2.15.2.1 Solution

Create a graph according to the hint in time complexity O(n²). Traverse the graph using BFS in time O(|V| + |E|) = O(n + n²) = O(n²).

Note that we don’t need to necessarily “create” a graph per se; edges can be “discovered” ad-hoc by finding words in the dictionary that are one character off from the current vertex.

2.15.3 DONE Paint a Boolean matrix

Implement a routine that takes an n x m Boolean array A together with an entry (x,y) and flips the color of the region associated with (x,y).

Hint: Solve this conceptually, then think about implementation optimizations.

2.15.3.1 Solution

Record the target value of (x,y); if (x,y) is True, target is False, and vice-versa. From (x,y), fan out – recursively or with a queue – to all neighbor entries. If neighbor n is already set to the target, do nothing; otherwise, set n to the target, and add n’s neighbors to the visitation queue. Continue in this way until the visitation queue is empty.

Time O(n) and space O(n) for the size of the array.

def flip_region(Ab, i, j):
    def neighbors(Ab, i, j):
        if i > 0:
            yield (i-1,j)
        if i < len(Ab) - 1:
            yield (i+1, j)
        if j > 0:
            yield (i, j-1)
        if j < len(Ab[0]) - 1:
            yield (i, j+1)
    _Ab = Ab
    target = not _Ab[i][j]
    visit_queue = [(i,j)]
    while visit_queue:
        _i, _j = visit_queue.pop(0)
        curr = _Ab[_i][_j]
        if curr == target:
            continue
        for n in neighbors(_Ab, _i, _j):
            _Ab[_i][_j] = target
            visit_queue.append(n)
    return _Ab

all([
    flip_region([
        [1,0,1],
        [0,1,1],
        [1,1,1],
    ], 0, 2) == [
        [1,0,0],
        [0,0,0],
        [0,0,0],
    ],
])

True

2.15.4 DONE Compute enclosed regions

Consider a matrix of black and white entries.

We say that an element in a 2D matrix is “enclosed” if there is no path from any of them to the boundary that only passes through elements of the same color.

Write a program that takes a 2D array A, whose entries are either white or black, and replaces all white elements that cannot reach the boundary with black.

For simplicity’s sake, we’ll redefine A to be a 2D binary array, where 1 is white and 0 is black.

We iterate through all elements of the matrix. If the element is black, we ignore it. If it is white, we initiate depth-first search for an edge element. If we find one, we then initiate a breadth-first sweep of all adjacent white elements to each element on the stack, setting all to black.

def compute_enclosed(A):
    def _is_within_bounds(i, j):
        return i >= 0 and i < len(A) and j >= 0 and j < len(A[0])

    def _is_edge(i, j):
        return i == 0 or i == len(A)-1 or j == 0 or j == len(A[0])-1

    def _is_white(i, j):
        return A[i][j] == 1

    def _get_next_candidates(i, j):
        return filter(
            lambda c: _is_within_bounds(c[0], c[1]) and _is_white(c[0], c[1]),
            [(i+1, j), (i-1, j), (i, j+1), (i, j-1)],
        )

    def _clear(path):
        _path = path

        while _path:
            i, j = _path.pop()
            A[i][j] = 0
            _path.extend(_get_next_candidates(i, j))

    for i in range(len(A)):
        for j in range(len(A[0])):
            if not _is_white(i, j):
                continue

            path = [(i, j)]

            cleared = False
            while path and not cleared:
                _i, _j = path[-1]

                if _is_edge(_i, _j):
                    _clear(path)
                    cleared = True
                    continue

                cands = _get_next_candidates(_i, _j)
                if not cands:
                    cleared = True
                    continue

                path.extend(cands)

    return A