Generated by Codex with GPT-5

Quick facts

  • Difficulty: HARD
  • Problem: Maximal Rectangle
  • Main tags: Array, Dynamic Programming, Stack, Matrix, Monotonic Stack

What the problem is really asking

The matrix contains only "0" and "1", and the goal is to find the area of the largest axis-aligned rectangle made entirely of "1" cells.

The hard part is that rectangles can start and end anywhere. They are not tied to one row, one column, or one obvious anchor cell. A brute-force search over all possible rectangles is far too expensive because there are too many ways to choose top, bottom, left, and right boundaries.

The right way to see the problem is this:

  • pick one row and pretend it is the bottom of the rectangle
  • for every column, compute how many consecutive "1" cells extend upward to that row
  • now the problem on that row becomes: what is the largest rectangle in this histogram?

Once that reduction is clear, the 2D problem turns into a sequence of 1D subproblems.

The key reduction: each row becomes a histogram

Suppose the current row is treated as the bottom edge of a candidate rectangle. For each column:

  • if the current cell is "1", the height grows by one
  • if the current cell is "0", the height resets to zero

That produces a histogram where each bar says, “How tall can a rectangle be if it ends on this row in this column?”

For example, if the running heights become [3, 1, 3, 2, 2], then any all-1 rectangle ending on the current row must correspond to a contiguous block in that histogram. The area of such a rectangle is:

height * width

So after updating the heights for a row, the only remaining task is to compute the largest rectangle area inside that histogram. Do that for every row, and take the maximum.

Why the histogram subproblem is linear-time

The standard histogram solution uses a monotonic increasing stack of bar indices.

The stack keeps bars whose heights are in nondecreasing order. When a new bar is shorter than the bar at the top of the stack, the taller bar can no longer extend to the right past the current position. That means its best possible rectangle can be finalized immediately:

  • the popped bar gives the height
  • the current index is the first position to the right where that height fails
  • the new top of the stack gives the nearest smaller bar to the left

That tells the algorithm the full width over which the popped height is the limiting height. Each index is pushed once and popped once, so the histogram work is O(columns) per row.

Combining that with the row-by-row scan gives total time O(rows * columns).

How to derive this in an interview

A strong interview explanation usually progresses in three steps.

First, acknowledge the brute-force baseline: enumerate rectangles and test whether they are all ones. That makes it obvious why the naive approach is too slow.

Second, notice that rectangles are much easier to reason about if the bottom row is fixed. Once the bottom row is fixed, each column contributes a height of consecutive ones, which naturally produces a histogram.

Third, reuse the classic “largest rectangle in histogram” solution with a monotonic stack. That is the key leap: the matrix problem is really many histogram problems stitched together.

This derivation is worth stating clearly because it shows understanding instead of memorization. The optimal solution is not a mysterious matrix trick. It is a clean reduction.

Python solution

from __future__ import annotations

from typing import List


class HistogramRectangleSolver:
    """Compute the largest rectangle area in a histogram in linear time."""

    def largest_rectangle_area(self, heights: List[int]) -> int:
        max_area = 0
        increasing_index_stack: List[int] = []

        # A trailing zero forces every remaining bar to be processed by the end.
        heights_with_sentinel = heights + [0]

        for current_index, current_height in enumerate(heights_with_sentinel):
            while (
                increasing_index_stack
                and heights_with_sentinel[increasing_index_stack[-1]] > current_height
            ):
                tallest_bar_index = increasing_index_stack.pop()
                tallest_bar_height = heights_with_sentinel[tallest_bar_index]

                left_smaller_index = (
                    increasing_index_stack[-1] if increasing_index_stack else -1
                )
                width = current_index - left_smaller_index - 1
                max_area = max(max_area, tallest_bar_height * width)

            increasing_index_stack.append(current_index)

        return max_area


class MaximalRectangleSolver:
    """Solve the maximal all-ones rectangle problem row by row."""

    def __init__(self) -> None:
        self.histogram_solver = HistogramRectangleSolver()

    def maximal_rectangle(self, matrix: List[List[str]]) -> int:
        if not matrix or not matrix[0]:
            return 0

        column_count = len(matrix[0])
        consecutive_one_heights = [0] * column_count
        best_area = 0

        for row in matrix:
            self._update_histogram(row, consecutive_one_heights)
            best_area = max(
                best_area,
                self.histogram_solver.largest_rectangle_area(
                    consecutive_one_heights
                ),
            )

        return best_area

    def _update_histogram(self, row: List[str], heights: List[int]) -> None:
        """Update each column height based on whether the current cell is 1."""
        for column_index, cell_value in enumerate(row):
            if cell_value == "1":
                heights[column_index] += 1
            else:
                heights[column_index] = 0


class Solution:
    def maximalRectangle(self, matrix: List[List[str]]) -> int:
        solver = MaximalRectangleSolver()
        return solver.maximal_rectangle(matrix)

Why this works

The histogram array is correct because after processing row r, each entry stores exactly the number of consecutive "1" cells ending at row r in that column. That means every all-1 rectangle whose bottom edge is row r corresponds to a contiguous set of histogram bars, and its height is limited by the shortest bar in that span.

The monotonic stack is correct because when a bar is popped, the algorithm has found the first strictly smaller height on its right, and the new top of the stack is the first strictly smaller height on its left. Those two smaller bars define the widest interval over which the popped bar can serve as the minimum height. So the area computed at pop time is the largest possible rectangle that uses that bar as its limiting height.

Since the algorithm evaluates every row as a possible bottom boundary, and the histogram solver evaluates every possible limiting height within that row, the global maximum rectangle area is guaranteed to be found.

Interview follow-ups

What if the interviewer wants the rectangle coordinates, not just the area?

The same overall idea still works. During the histogram step, when a bar is popped from the stack, the algorithm already knows the rectangle’s height and its left and right boundaries in the current row. From that, it can reconstruct the full rectangle. The bottom row is the current row, the top row is current_row - height + 1, the left column is left_smaller_index + 1, and the right column is current_index - 1.

Instead of storing only the best area, the implementation can store the best area plus these four coordinates. The runtime stays O(rows * columns) and the space stays O(columns). The only change is keeping richer metadata when a new best rectangle is found.

Is there another optimal solution besides reusing Largest Rectangle in Histogram?

Yes. A common alternative dynamic-programming formulation tracks three arrays for each row: height[col] for consecutive ones ending at this row, left[col] for the leftmost valid boundary, and right[col] for the rightmost valid boundary. After updating those arrays, the rectangle area at each column is height[col] * (right[col] - left[col]). That solution is also O(rows * columns) time and O(columns) space.

The tradeoff is mostly conceptual. The left/right DP version is very good once it is understood, but many candidates find the histogram reduction easier to derive and easier to defend, especially if they already know the monotonic stack pattern. In interviews, the histogram approach is often the clearest answer.

What if the rows arrive one at a time as a stream?

That version is still manageable because the algorithm only needs the current histogram, not the whole matrix history.

For each new row, update the running heights, solve the histogram problem for that row, and update the best answer seen so far. That means the state can stay at O(columns) instead of storing all prior rows. The per-row work remains linear in the number of columns.

This is a useful follow-up because it highlights the real strength of the approach: it is incremental. Each row contributes just enough information to refresh the histogram, and the algorithm never needs to revisit older rows directly.

Practical takeaway

The fastest way to unlock this problem is to stop treating it as a raw 2D search.

Once each row is viewed as the bottom of a histogram, the problem becomes a repeat application of a well-structured 1D algorithm. That reduction is the entire interview story: build heights, solve histogram, keep the best area.