# Find the Duplicate Number

ArrayTwo PointersBinary SearchBit

## # Solution

See more at this LeetCode post (opens new window).

### # Built-in Types

#### # Counter

Complexity

time: $O(n)$
space: $O(n)$

def findDuplicate(self, nums: List[int]) -> int:
cnt = Counter(nums)
return max(cnt, key=cnt.get)

1
2
3

#### # Sorting

Complexity

time: $O(n \log n)$
space: $O(n)$

def findDuplicate(self, nums: List[int]) -> int:
n = len(nums)
nums.sort()

for i in range(1, n):
if nums[i] == nums[i-1]:
return nums[i]

1
2
3
4
5
6
7

#### # Set

Complexity

time: $O(n)$
space: $O(n)$

def findDuplicate(self, nums: List[int]) -> int:
hashset = set()
for num in nums:
if num in hashset:
return num

1
2
3
4
5
6

How can we prove that at least one duplicate number must exist in nums?

Proving that at least one duplicate must exist in nums is simple application of the pigeonhole principle (opens new window). Here, each number in nums is a "pigeon" and each distinct number that can appear in nums is a "pigeonhole". Because there are $n + 1$ numbers are $n$ distinct possible numbers, the pigeonhole principle implies that at least one of the numbers is duplicated.

Can you solve the problem using only $O(1)$ extra space?

cnt[i] represents number of elements in nums <= i

Say the duplicate is $target$, then all elements in [1, target-1] satisfy cnt[i]<= i, and all elements in [target, n] satisfy cnt[i] > i.

Thus, use binary search to find $target$ according to the monotone cnt.

Complexity

time: $O(n \log n)$
space: $O(1)$

def findDuplicate(self, nums: List[int]) -> int:
n = len(nums)
res = 0
left = 1
right = n-1

while left <= right:
mid = (left + right) >> 1
cnt = 0
for i in range(n):
if nums[i] <= mid:
cnt += 1
if cnt <= mid:
left = mid + 1
else:
right = mid - 1
res = mid

return res

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

For the ith bit, there are $x$ elements in nums[1..n] having it as $1$, and there are $y$ elements in [1.n] it as $1$. The $target$ has the ith bit as $1$ iff $x > y$.

Complexity

time: $O(n \log n)$
space: $O(1)$

def findDuplicate(self, nums: List[int]) -> int:
n = len(nums)
res = 0
bit_max = 31
while not (n - 1) >> bit_max:
bit_max -= 1
for bit in range(bit_max+1):
x = y = 0
for i in range(n):
if nums[i] & 1 << bit:
x += 1
if i>=1 and i&(1<<bit):
y += 1
if x > y:
res |= 1 << bit
return res

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

### # Floyd's Cycle-Finding Algorithm

Form a graph against nums. Every index has an edge $i \rightarrow nums[i]$. As there exists a duplicate $target$, so at least 2 edges point to such a target. Thus, the graph has a cycle and $target$ is the entrance of the cycle.

See more at the similar problem Linked List Cycle II.

Complexity

time: $O(n)$
space: $O(1)$

def findDuplicate(self, nums: List[int]) -> int:
n = len(nums)
fast = slow = nums[0]

# do-while loop
while True:
slow = nums[slow]
fast = nums[nums[fast]]
if slow == fast:
break

# find the entrance to the cycle
fast = nums[0]
while slow != fast:
fast = nums[fast]
slow = nums[slow]

return fast

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18