Find Words

StringPrefix SumBinary Search
https://www.lintcode.com/problem/find-words

# Solution

Let nn be length of str and mm length of dict. There are two pointers, i in str and j in dict.

Brute force solution is to compare 2 pointers until i gets to end of str. This is not time efficient especially when str is long. So we need to preprocess w/ data structures such as matrix or hashmap.

# Matrix

Go from back to front, nextpos[i][j] is the next position of char j after str[i]. If str[i]==j, then nextpos[i][j] = i.

Complexity

time: O(max(n,m))O(\max(n, m))
space: O(n)O(n)

def findWords(self, str, dict):
    if not str or not dict:
        return []
    n, res = len(str), []
    nextpos = [[n for _ in range(26)] for _ in range(n+1)]
    # preprocessing
    for i in reversed(range(n)):
        for j in range(26):
            nextpos[i][j] = nextpos[i+1][j]
            if ord(str[i]) - ord('a') == j:
                nextpos[i][j] = i
    # `nextpos` can also be dict
    # nextpos = {}
    # for i in range(n):
    #     nextpos.setdefault(str[i], []).append(i)

    # 2 pointers comparision
    for word in dict:
        # i points in str, j points in dict
        i, j, m = 0, 0, len(word)
        while i < n and j < m:
            i = nextpos[i][ord(word[j]) - ord('a')]
            if i == n: # word not found in str
                break
            i += 1
            j += 1
        if j == m:
            res.append(word)
    return res
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29

# Finite Automaton

index records the matched position of words in dict. Search across str, if index[j] is at least as long as length of current word, then word is already matched. Else if char str[i] is equal to next char of word j, increment index[j] by 1.

Note that we can't just append word to res as soon as index[j] >= len(dict[j]), as the problem requires res to be in dictionary order.

Complexity

time: O(nm)O(nm)
space: O(n)O(n)

def findWords(self, str, dict):
    n = len(dict)
    index = [0 for _ in range(n)]
    for i in range(len(str)):
        for j in range(n):
            # finished matching already
            if index[j] >= len(dict[j]):
                continue
            # if index[j]-th char in j-th word in `dict` matches w/ `str[i]`
            # then move index[j] to the right by 1
            if dict[j][index[j]] == str[i]:
                index[j] += 1
    res = []
    for j in range(n):
        if index[j] >= len(dict[j]):
            res.append(dict[j])
    return res
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17