LeetCode HARD 41 First Missing Positive Summary

Generated by Codex with GPT-5

Quick facts

Difficulty: HARD
Problem: First Missing Positive
Main tags: Array, Hash Table

What the problem is really asking

Given an unsorted integer array, return the smallest positive integer that does not appear in the array.

The tricky part is not the definition of the answer. The tricky part is the constraint: the best solution must run in O(n) time and use only O(1) extra space.

That immediately rules out the most obvious ideas:

sorting the array, because that costs O(n log n)
putting all positive values in a set, because that uses O(n) extra space

So the real interview question is this:

how can the array itself be reused as a bookkeeping structure?

The key insight

If the array has length n, then the answer must be in the range 1 through n + 1.

Why? Because there are only n slots. If the array contains every positive integer from 1 through n, then the first missing positive is n + 1. Otherwise, the first missing positive is somewhere inside 1 through n.

That means values outside this range are never the answer:

negative numbers do not matter
0 does not matter
values larger than n do not matter

The only values worth carefully tracking are the integers 1 through n.

Once that becomes clear, the natural goal is:

put each useful value x into the position where it belongs, which is index x - 1.

If that rearrangement is done correctly, then a final left-to-right scan tells the answer immediately:

if index 0 does not contain 1, the answer is 1
if index 1 does not contain 2, the answer is 2
and so on

How to derive the optimal solution

Start with the bound above: only values in 1..n can affect the answer.

That suggests treating the array like a compact hash table where each index represents whether a particular number exists:

index 0 represents number 1
index 1 represents number 2
index 2 represents number 3

The cleanest in-place way to do that is repeated placement:

Walk through the array.
For each value x, if x is in 1..n, try to move it to index x - 1.
Keep swapping until the current index either contains the correct value, an irrelevant value, or a duplicate.

This works because every successful swap puts at least one value into its final home position. That prevents the algorithm from degenerating into quadratic behavior.

After the placement phase, the array has a very strong meaning:

if nums[i] == i + 1, then that positive integer exists
if nums[i] != i + 1, then i + 1 is missing

So the first mismatch is the answer.

The complexity is:

Time: O(n) because each value is moved only a limited number of times
Extra space: O(1) because the algorithm only uses a few variables

Best approaches

There are two strong O(n) interview solutions.

The most direct one is the in-place placement approach used below. It is easy to reason about because it tries to make the array look like:

[1, 2, 3, 4, ...]

whenever those values exist.

Another equally valid approach is in-place sign marking:

first normalize unusable values
then use index positions as presence markers by flipping signs
finally scan for the first index that was never marked

That solution has the same asymptotic complexity, but the placement approach is often easier to explain because the final scan becomes visually obvious.

Why the in-place placement solution works

The invariant is simple:

if a value x belongs in the array and it can be placed at index x - 1, the algorithm keeps doing that until the correct value sits there or a duplicate blocks further progress.

There are only three reasons to stop at a given index:

the current value is out of range, so it can never help
the current value is already in the correct position
the destination already contains the same value, which means this is a duplicate

Because each productive swap fixes a home position, the algorithm keeps making irreversible progress.

When the placement phase ends, every positive integer that exists in the array and can be represented in 1..n is either sitting in its home slot or absent from the array entirely.

That is exactly why the first index where nums[i] != i + 1 identifies the smallest missing positive.

Python solution

from typing import List


class Solution:
    def firstMissingPositive(self, numbers: List[int]) -> int:
        self._place_values_into_home_positions(numbers)
        return self._find_first_missing_positive(numbers)

    def _place_values_into_home_positions(self, numbers: List[int]) -> None:
        array_length = len(numbers)
        current_index = 0

        while current_index < array_length:
            current_value = numbers[current_index]

            if not self._should_place_value(
                numbers=numbers,
                current_index=current_index,
                current_value=current_value,
            ):
                current_index += 1
                continue

            target_index = current_value - 1
            self._swap(numbers, current_index, target_index)

    def _should_place_value(
        self,
        numbers: List[int],
        current_index: int,
        current_value: int,
    ) -> bool:
        array_length = len(numbers)

        if current_value < 1 or current_value > array_length:
            return False

        target_index = current_value - 1

        if target_index == current_index:
            return False

        # If the target already holds the same value, this occurrence is a
        # duplicate and swapping would loop forever.
        if numbers[target_index] == current_value:
            return False

        return True

    @staticmethod
    def _swap(numbers: List[int], left_index: int, right_index: int) -> None:
        numbers[left_index], numbers[right_index] = (
            numbers[right_index],
            numbers[left_index],
        )

    @staticmethod
    def _find_first_missing_positive(numbers: List[int]) -> int:
        for index, value in enumerate(numbers):
            expected_value = index + 1
            if value != expected_value:
                return expected_value

        return len(numbers) + 1

Interview follow-ups

What changes if the interviewer removes the `O(1)` extra-space requirement?

Then the simplest answer is to use a hash set. Insert every positive value into the set, then start at 1 and keep checking whether that number exists. The first value that is not in the set is the answer.

That approach is easy to implement and still runs in expected O(n) time, but it spends O(n) extra memory. It is a good fallback when the constraints are relaxed, but it misses the real point of this problem, which is learning to reuse the input array itself.

Is there another valid `O(n)` time and `O(1)` space solution besides swapping values into place?

Yes. A common alternative is sign marking. First, convert unusable values so they do not interfere. Then use the index for each positive number as a presence marker by making the value at that index negative. After that, the first index that remains positive corresponds to the missing number.

That method works because index positions again stand in for the values 1..n. The complexity is still O(n) time and O(1) extra space. The tradeoff is that the setup phase is a bit more subtle, so many candidates find the placement solution easier to derive under interview pressure.

Why is the answer guaranteed to be at most `n + 1`?

Because only n distinct positions are available in the array. To make every number from 1 through n present, the array would already have to use all n slots to cover that full range. In that best-case situation, the smallest missing positive is exactly n + 1.

If even one number from 1 through n is absent, then the answer is smaller and appears inside that range instead. This bound is what makes the in-place index-mapping idea possible in the first place.

Why do duplicates matter in the swap-based solution?

Duplicates are the main source of accidental infinite loops. If the current value wants to go to an index that already stores the same value, swapping would not change the array, so the algorithm would get stuck repeating the same step forever.

That is why the duplicate check is part of the core logic. It preserves the linear-time bound by ensuring the algorithm only performs swaps that make real progress toward placing a value into its correct home slot.

Practical takeaway

The cleanest way to think about this problem is to ignore almost all of the integers and focus only on whether the array contains each value from 1 through n.

Once the array is treated as its own lookup table, the hard-looking constraints become manageable, and the first missing positive falls out from one final scan.