Generated by Codex with GPT-5
Quick facts
- Difficulty:
HARD - Problem: First Missing Positive
- Main tags:
Array,Hash Table
What the problem is really asking
Given an unsorted integer array, return the smallest positive integer that does not appear in the array.
The tricky part is not the definition of the answer. The tricky part is the constraint: the best solution must run in O(n) time and use only O(1) extra space.
That immediately rules out the most obvious ideas:
- sorting the array, because that costs
O(n log n) - putting all positive values in a set, because that uses
O(n)extra space
So the real interview question is this:
how can the array itself be reused as a bookkeeping structure?
The key insight
If the array has length n, then the answer must be in the range 1 through n + 1.
Why? Because there are only n slots. If the array contains every positive integer from 1 through n, then the first missing positive is n + 1. Otherwise, the first missing positive is somewhere inside 1 through n.
That means values outside this range are never the answer:
- negative numbers do not matter
0does not matter- values larger than
ndo not matter
The only values worth carefully tracking are the integers 1 through n.
Once that becomes clear, the natural goal is:
put each useful value x into the position where it belongs, which is index x - 1.
If that rearrangement is done correctly, then a final left-to-right scan tells the answer immediately:
- if index
0does not contain1, the answer is1 - if index
1does not contain2, the answer is2 - and so on
How to derive the optimal solution
Start with the bound above: only values in 1..n can affect the answer.
That suggests treating the array like a compact hash table where each index represents whether a particular number exists:
- index
0represents number1 - index
1represents number2 - index
2represents number3
The cleanest in-place way to do that is repeated placement:
- Walk through the array.
- For each value
x, ifxis in1..n, try to move it to indexx - 1. - Keep swapping until the current index either contains the correct value, an irrelevant value, or a duplicate.
This works because every successful swap puts at least one value into its final home position. That prevents the algorithm from degenerating into quadratic behavior.
After the placement phase, the array has a very strong meaning:
- if
nums[i] == i + 1, then that positive integer exists - if
nums[i] != i + 1, theni + 1is missing
So the first mismatch is the answer.
The complexity is:
- Time:
O(n)because each value is moved only a limited number of times - Extra space:
O(1)because the algorithm only uses a few variables
Best approaches
There are two strong O(n) interview solutions.
The most direct one is the in-place placement approach used below. It is easy to reason about because it tries to make the array look like:
[1, 2, 3, 4, ...]
whenever those values exist.
Another equally valid approach is in-place sign marking:
- first normalize unusable values
- then use index positions as presence markers by flipping signs
- finally scan for the first index that was never marked
That solution has the same asymptotic complexity, but the placement approach is often easier to explain because the final scan becomes visually obvious.
Why the in-place placement solution works
The invariant is simple:
if a value x belongs in the array and it can be placed at index x - 1, the algorithm keeps doing that until the correct value sits there or a duplicate blocks further progress.
There are only three reasons to stop at a given index:
- the current value is out of range, so it can never help
- the current value is already in the correct position
- the destination already contains the same value, which means this is a duplicate
Because each productive swap fixes a home position, the algorithm keeps making irreversible progress.
When the placement phase ends, every positive integer that exists in the array and can be represented in 1..n is either sitting in its home slot or absent from the array entirely.
That is exactly why the first index where nums[i] != i + 1 identifies the smallest missing positive.
Python solution
from typing import List
class Solution:
def firstMissingPositive(self, numbers: List[int]) -> int:
self._place_values_into_home_positions(numbers)
return self._find_first_missing_positive(numbers)
def _place_values_into_home_positions(self, numbers: List[int]) -> None:
array_length = len(numbers)
current_index = 0
while current_index < array_length:
current_value = numbers[current_index]
if not self._should_place_value(
numbers=numbers,
current_index=current_index,
current_value=current_value,
):
current_index += 1
continue
target_index = current_value - 1
self._swap(numbers, current_index, target_index)
def _should_place_value(
self,
numbers: List[int],
current_index: int,
current_value: int,
) -> bool:
array_length = len(numbers)
if current_value < 1 or current_value > array_length:
return False
target_index = current_value - 1
if target_index == current_index:
return False
# If the target already holds the same value, this occurrence is a
# duplicate and swapping would loop forever.
if numbers[target_index] == current_value:
return False
return True
@staticmethod
def _swap(numbers: List[int], left_index: int, right_index: int) -> None:
numbers[left_index], numbers[right_index] = (
numbers[right_index],
numbers[left_index],
)
@staticmethod
def _find_first_missing_positive(numbers: List[int]) -> int:
for index, value in enumerate(numbers):
expected_value = index + 1
if value != expected_value:
return expected_value
return len(numbers) + 1Interview follow-ups
What changes if the interviewer removes the O(1) extra-space requirement?
Then the simplest answer is to use a hash set. Insert every positive value into the set, then start at 1 and keep checking whether that number exists. The first value that is not in the set is the answer.
That approach is easy to implement and still runs in expected O(n) time, but it spends O(n) extra memory. It is a good fallback when the constraints are relaxed, but it misses the real point of this problem, which is learning to reuse the input array itself.
Is there another valid O(n) time and O(1) space solution besides swapping values into place?
Yes. A common alternative is sign marking. First, convert unusable values so they do not interfere. Then use the index for each positive number as a presence marker by making the value at that index negative. After that, the first index that remains positive corresponds to the missing number.
That method works because index positions again stand in for the values 1..n. The complexity is still O(n) time and O(1) extra space. The tradeoff is that the setup phase is a bit more subtle, so many candidates find the placement solution easier to derive under interview pressure.
Why is the answer guaranteed to be at most n + 1?
Because only n distinct positions are available in the array. To make every number from 1 through n present, the array would already have to use all n slots to cover that full range. In that best-case situation, the smallest missing positive is exactly n + 1.
If even one number from 1 through n is absent, then the answer is smaller and appears inside that range instead. This bound is what makes the in-place index-mapping idea possible in the first place.
Why do duplicates matter in the swap-based solution?
Duplicates are the main source of accidental infinite loops. If the current value wants to go to an index that already stores the same value, swapping would not change the array, so the algorithm would get stuck repeating the same step forever.
That is why the duplicate check is part of the core logic. It preserves the linear-time bound by ensuring the algorithm only performs swaps that make real progress toward placing a value into its correct home slot.
Practical takeaway
The cleanest way to think about this problem is to ignore almost all of the integers and focus only on whether the array contains each value from 1 through n.
Once the array is treated as its own lookup table, the hard-looking constraints become manageable, and the first missing positive falls out from one final scan.