Find the Duplicate Number

ArrayTwo PointersBinary SearchBit
https://leetcode.com/problems/find-the-duplicate-number

# Solution

See more at this LeetCode post (opens new window).

# Built-in Types

# Counter

Complexity

time: O(n)O(n)
space: O(n)O(n)

def findDuplicate(self, nums: List[int]) -> int:
    cnt = Counter(nums)
    return max(cnt, key=cnt.get)
1
2
3

# Sorting

Complexity

time: O(nlogn)O(n \log n)
space: O(n)O(n)

def findDuplicate(self, nums: List[int]) -> int:
    n = len(nums)
    nums.sort()

    for i in range(1, n):
        if nums[i] == nums[i-1]:
            return nums[i]
1
2
3
4
5
6
7

# Set

Complexity

time: O(n)O(n)
space: O(n)O(n)

def findDuplicate(self, nums: List[int]) -> int:
    hashset = set()
    for num in nums:
        if num in hashset:
            return num
        hashset.add(num)
1
2
3
4
5
6

# Follow Up

How can we prove that at least one duplicate number must exist in nums?

Proving that at least one duplicate must exist in nums is simple application of the pigeonhole principle (opens new window). Here, each number in nums is a "pigeon" and each distinct number that can appear in nums is a "pigeonhole". Because there are n+1n + 1 numbers are nn distinct possible numbers, the pigeonhole principle implies that at least one of the numbers is duplicated.

Can you solve the problem using only O(1)O(1) extra space?

cnt[i] represents number of elements in nums <= i

Say the duplicate is targettarget, then all elements in [1, target-1] satisfy cnt[i]<= i, and all elements in [target, n] satisfy cnt[i] > i.

Thus, use binary search to find targettarget according to the monotone cnt.

Complexity

time: O(nlogn)O(n \log n)
space: O(1)O(1)

def findDuplicate(self, nums: List[int]) -> int:
    n = len(nums)
    res = 0
    left = 1
    right = n-1

    while left <= right:
        mid = (left + right) >> 1
        cnt = 0
        for i in range(n):
            if nums[i] <= mid:
                cnt += 1
        if cnt <= mid:
            left = mid + 1
        else:
            right = mid - 1
            res = mid

    return res
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

# Bit Mask

For the ith bit, there are xx elements in nums[1..n] having it as 11, and there are yy elements in [1.n] it as 11. The targettarget has the ith bit as 11 iff x>yx > y.

Complexity

time: O(nlogn)O(n \log n)
space: O(1)O(1)

def findDuplicate(self, nums: List[int]) -> int:
    n = len(nums)
    res = 0
    bit_max = 31
    while not (n - 1) >> bit_max:
        bit_max -= 1
    for bit in range(bit_max+1):
        x = y = 0
        for i in range(n):
            if nums[i] & 1 << bit:
                x += 1
            if i>=1 and i&(1<<bit):
                y += 1
        if x > y:
            res |= 1 << bit
    return res
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

# Floyd's Cycle-Finding Algorithm

Form a graph against nums. Every index has an edge inums[i]i \rightarrow nums[i]. As there exists a duplicate targettarget, so at least 2 edges point to such a target. Thus, the graph has a cycle and targettarget is the entrance of the cycle.

See more at the similar problem Linked List Cycle II.

Complexity

time: O(n)O(n)
space: O(1)O(1)

def findDuplicate(self, nums: List[int]) -> int:
    n = len(nums)
    fast = slow = nums[0]

    # do-while loop
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        if slow == fast:
            break

    # find the entrance to the cycle
    fast = nums[0]
    while slow != fast:
        fast = nums[fast]
        slow = nums[slow]

    return fast
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18