Generated by Codex with GPT-5
Quick facts
- Difficulty:
HARD - Problem: Russian Doll Envelopes
- Main tags:
Array,Binary Search,Dynamic Programming,Sorting
What the problem is really asking
Each envelope has a width and a height. One envelope can go inside another only if both dimensions are strictly smaller.
So the real question is:
what is the longest chain of envelopes where both width and height keep increasing?
That makes this feel like a two-dimensional version of the Longest Increasing Subsequence problem.
Why the problem is tricky
If there were only one number per item, this would be a standard LIS problem.
But here there are two numbers, and both must increase at the same time. A naive dynamic programming solution compares every envelope with every other envelope after sorting, which works in O(n^2) time.
That is a perfectly reasonable first answer in an interview, but the more interesting solution is O(n log n).
The key insight
The breakthrough is to sort envelopes like this:
- sort by width ascending
- for equal widths, sort by height descending
After that sort, the width condition is already handled from left to right. Then the problem becomes:
find the longest strictly increasing subsequence of the heights.
The descending tie-break on height is the critical detail. Without it, two envelopes with the same width could incorrectly appear usable in the height LIS, even though equal widths are not allowed to nest.
For example, if envelopes with width 5 were ordered as heights 4, 5, the LIS on heights could pick both, which would be invalid. Sorting equal widths as 5, 4 prevents that mistake because a strictly increasing height subsequence can no longer take both.
How to derive the optimal solution
A clean interview path is:
Start with the O(n^2) DP idea. Sort the envelopes, then let dp[i] be the longest valid chain ending at envelope i. For every earlier envelope j, if both width and height are smaller than envelope i, then envelope i can extend that chain. This proves the structure of the problem.
Then look for a way to avoid comparing each pair.
The important observation is that sorting can remove one dimension of difficulty. If widths are processed in ascending order, the only remaining concern is making sure the height sequence is strictly increasing. The special tie-break for equal widths makes that reduction valid. Once that is in place, the remaining task is exactly LIS on the height array, and the standard patience-sorting-style LIS algorithm runs in O(n log n) time with binary search.
A simple mental model
Think of the sort step as lining up the envelopes in an order where width mistakes are impossible to make later.
After that, each envelope contributes only its height to the decision. The algorithm keeps a list called tails where:
tails[k]is the smallest possible ending height of any valid chain of lengthk + 1
Smaller ending heights are better because they leave more room to extend the chain later. Each new height either extends the longest chain so far or replaces an existing tail with a smaller one.
That is why the algorithm finds the maximum chain length without explicitly storing every chain.
Python solution
from bisect import bisect_left
from typing import Iterable, List, Sequence, Tuple
class EnvelopeNestingSolver:
"""Compute the maximum number of envelopes that can be nested."""
def max_nestable_envelopes(self, envelopes: Sequence[Sequence[int]]) -> int:
if not envelopes:
return 0
ordered_envelopes = self._sort_envelopes(envelopes)
ordered_heights = self._extract_heights(ordered_envelopes)
return self._longest_strictly_increasing_subsequence_length(ordered_heights)
def _sort_envelopes(
self,
envelopes: Sequence[Sequence[int]],
) -> List[Tuple[int, int]]:
"""
Sort by width ascending and height descending.
The descending height tie-break for equal widths prevents envelopes with
the same width from being chained together by the LIS step.
"""
return sorted(
((int(width), int(height)) for width, height in envelopes),
key=lambda envelope: (envelope[0], -envelope[1]),
)
def _extract_heights(
self,
envelopes: Iterable[Tuple[int, int]],
) -> List[int]:
"""Project the sorted envelopes into their height sequence."""
return [height for _, height in envelopes]
def _longest_strictly_increasing_subsequence_length(
self,
values: Sequence[int],
) -> int:
"""
Return the LIS length using the classic tails array technique.
tails[i] stores the smallest possible tail value for an increasing
subsequence of length i + 1.
"""
tails: List[int] = []
for value in values:
insertion_index = bisect_left(tails, value)
if insertion_index == len(tails):
tails.append(value)
else:
tails[insertion_index] = value
return len(tails)
class Solution:
def maxEnvelopes(self, envelopes: List[List[int]]) -> int:
solver = EnvelopeNestingSolver()
return solver.max_nestable_envelopes(envelopes)Why this works
After sorting by width ascending, any later envelope has width greater than or equal to an earlier one.
The only dangerous case is equal widths, because those are not allowed to nest. Sorting equal widths by height descending neutralizes that case. Among envelopes with the same width, heights now go downward, so a strictly increasing subsequence of heights can never pick more than one of them.
That means every strictly increasing subsequence of heights corresponds to a valid nesting chain, and every valid nesting chain appears as a strictly increasing subsequence of heights in the sorted order.
The LIS routine is correct because tails[length - 1] always keeps the smallest possible ending height for a chain of that length. Replacing a larger tail with a smaller one never hurts future possibilities; it only makes extension easier. So when the algorithm finishes, the length of tails is exactly the maximum number of envelopes that can be nested.
The time complexity is O(n log n):
- sorting costs
O(n log n) - the LIS scan does
nbinary searches, so anotherO(n log n)
The extra space is O(n) for the sorted list, height list, and LIS tails array.
Interview follow-ups
What if the interviewer asks for the simpler dynamic programming solution first?
That version sorts the envelopes and uses dp[i] for the best chain ending at i. For each i, scan all earlier j values and extend from any envelope where both width and height are smaller. This is often the easiest way to show understanding before optimizing. It works because every valid chain ending at i must come from some earlier valid predecessor, so trying all earlier envelopes is exhaustive. The tradeoff is time: it takes O(n^2) and becomes too slow on large inputs, which is why the LIS reduction is the stronger final answer.
Why is sorting equal widths by height descending necessary?
Because the LIS step only looks at heights. If equal-width envelopes were sorted by height ascending, something like (5, 4) followed by (5, 5) would look increasing in height, so the LIS could take both even though the widths are equal and nesting is illegal. Sorting equal widths in the opposite direction makes those heights decreasing inside each width group, which blocks the LIS from chaining them together. This small sort rule is what turns a dangerous 2D problem into a correct 1D LIS problem.
What if the interviewer wants the actual sequence of envelopes, not just the maximum count?
The idea stays the same, but the implementation stores predecessor pointers and the index chosen for each LIS length. When a height extends or replaces a tail, the algorithm records which earlier envelope it came from. At the end, it backtracks from the last chosen index to reconstruct one optimal nesting chain. This still uses the same sorted order and the same width-descending tie-break. The tradeoff is additional bookkeeping and slightly more complex code, but the asymptotic complexity stays O(n log n).
What if rotation were allowed?
Then each envelope can be normalized first by sorting its two sides so the smaller side is treated as width and the larger side as height. After that normalization, the same nesting logic applies because every envelope has a consistent orientation before comparison. The rest of the solution remains sort-plus-LIS. The important tradeoff is that this changes the problem definition: once rotation is allowed, preprocessing is part of the solution, and correctness depends on applying that normalization consistently before sorting and LIS.