LeetCode MEDIUM 78 Subsets Summary

Generated by Codex with GPT-5

Quick facts

Difficulty: MEDIUM
Problem: Subsets
Main tags: Array, Backtracking, Bit Manipulation

What the problem is really asking

Given an array nums of distinct integers, return every possible subset of those numbers.

A subset can contain none of the numbers, some of the numbers, or all of the numbers. The empty subset [] is valid, and the full array is valid too. Because the input numbers are distinct, there is no duplicate-handling problem in the base version.

The key detail is that the answer itself is large. If nums has length n, each number has two choices: it is either included or excluded. That means there are exactly 2^n subsets. No algorithm can do better than producing 2^n results, because the output has that many rows.

The core insight

The problem is a decision tree.

For each number, decide whether it belongs in the current subset. Walking through all combinations of those decisions generates the full power set.

There are two common ways to express that idea:

Include/exclude recursion: at index i, branch into “skip nums[i]” and “take nums[i]”.
Start-index backtracking: record the current subset, then try adding each later number one at a time.

Both are optimal. The start-index version is often cleaner in Python because every recursive call represents one valid subset immediately. It avoids waiting until the bottom of the tree to append results.

How to derive the backtracking approach

Start with a tiny input: nums = [1, 2, 3].

The empty subset comes first: [].

From there, choose a first element. If the first choice is 1, the current subset becomes [1]. After choosing 1, future choices can only come from the numbers to its right: 2 and 3. That gives [1, 2], [1, 2, 3], and [1, 3].

Then return to the empty subset and choose 2 as the first element. Future choices can only come after 2, so that gives [2] and [2, 3].

Then choose 3 as the first element, giving [3].

This “choose, explore, undo” pattern is exactly backtracking:

Save a copy of the current subset because it is a valid answer.
Try each candidate number starting from the current index.
Add the candidate to the current subset.
Recurse with the next starting index, so numbers are not reused.
Remove the candidate before trying the next one.

The start index is what prevents duplicates. Once 1 has chosen 2, the algorithm never later creates [2, 1], because it only moves forward through the input.

Why this is optimal

There are 2^n subsets, and many of those subsets contain up to n numbers. Returning them as lists requires copying the current subset into the answer.

That gives the natural complexity:

Time: O(n * 2^n), counting the cost to copy subsets into the result.
Space: O(n) auxiliary recursion space, plus O(n * 2^n) for the returned output.

The output space is unavoidable. Any accepted solution must store or emit every subset.

Python solution

from typing import Sequence


def generate_subsets(numbers: Sequence[int]) -> list[list[int]]:
    """Return every subset of a sequence of distinct integers."""
    all_subsets: list[list[int]] = []
    current_subset: list[int] = []

    def backtrack(start_index: int) -> None:
        # The current path is already a complete valid subset.
        all_subsets.append(current_subset.copy())

        for candidate_index in range(start_index, len(numbers)):
            current_subset.append(numbers[candidate_index])

            # Move forward so each number is used at most once and ordering
            # duplicates like [2, 1] are never generated after [1, 2].
            backtrack(candidate_index + 1)

            current_subset.pop()

    backtrack(start_index=0)
    return all_subsets


class Solution:
    def subsets(self, nums: list[int]) -> list[list[int]]:
        """LeetCode entry point."""
        return generate_subsets(nums)

Why this implementation works

The recursive state is small and precise:

current_subset contains the numbers chosen so far.
start_index marks the first position that is still eligible to be chosen next.
all_subsets stores snapshots of every valid state reached by the search.

The algorithm appends current_subset.copy() at the start of every call because every partial choice is a valid subset. This is different from problems like permutations, where an answer is complete only after it reaches a certain length.

The loop then tries each remaining number as the next addition. After the recursive call returns, pop() restores the previous state. That restoration step is what lets the same list object be reused safely across the whole search.

Because the recursion only calls backtrack(candidate_index + 1), the search never goes backward. Each subset is generated once, in increasing index order.

Interview follow-ups

What changes if `nums` can contain duplicates?

Sort the input first, then skip duplicate values at the same recursion depth.

For example, with [1, 2, 2], the second 2 should not be allowed to start a branch immediately after the first 2 was skipped at that same depth. Otherwise, the algorithm would produce the same subset more than once.

The usual rule is: inside the loop, if candidate_index > start_index and numbers[candidate_index] == numbers[candidate_index - 1], skip that candidate. Sorting groups equal values together, and the depth check ensures only duplicate sibling branches are removed. Duplicate values are still allowed when one copy was already chosen in the current path.

The time is still proportional to the number of generated subsets times the cost to copy them, with an added O(n log n) sort. The output may be smaller than 2^n because duplicates collapse into fewer unique subsets.

How would you generate subsets of exactly size `k`?

Keep the same backtracking shape, but only save a subset when len(current_subset) == k.

The recursion can also prune impossible branches. If the current subset still needs remaining_needed numbers and there are fewer than that many candidates left, the branch can stop immediately. This matters when n is large and k is small, because the number of valid answers is C(n, k) rather than 2^n.

The time becomes O(k * C(n, k)) for the output copies, and the auxiliary recursion space is O(k) if the recursion stops once the subset reaches size k.

Can this be solved iteratively instead of recursively?

Yes. Start with result = [[]]. For each number, take every subset that already exists and append a new subset with that number added.

For nums = [1, 2], the process starts as [[]]. After seeing 1, it becomes [[], [1]]. After seeing 2, it extends those existing subsets into [2] and [1, 2], producing [[], [1], [2], [1, 2]].

This approach has the same O(n * 2^n) time and output space. It avoids recursion and can be easier to reason about for some candidates, but the backtracking version usually adapts more naturally to follow-ups that add constraints.

Can bit manipulation generate the subsets?

Yes. Treat each integer from 0 to 2^n - 1 as a bitmask. If bit i is set, include nums[i] in that subset.

For n = 3, mask 101 means include positions 0 and 2, so the subset is [nums[0], nums[2]].

The bitmask approach is compact and makes the 2^n count very explicit. Its complexity is also O(n * 2^n) time and O(n * 2^n) output space. The tradeoff is readability: bit operations are concise, but backtracking usually communicates the search structure more clearly in interviews.

What if the full output is too large to store in memory?

The combinatorial explosion cannot be avoided if the caller truly needs every subset. However, the interface can be changed from “return a list” to “yield subsets one at a time.”

A generator-based backtracking solution would yield current_subset.copy() instead of appending into all_subsets. That reduces memory usage from storing the entire output to storing only the current recursion path plus one yielded subset at a time.

The total time remains exponential because every subset is still produced. The improvement is memory behavior, not asymptotic work.