LeetCode HARD 76 Minimum Window Substring Summary

Generated by Codex with GPT-5

Quick facts

Difficulty: HARD
Problem: Minimum Window Substring
Main tags: Hash Table, String, Sliding Window

What the problem is really asking

There are two strings:

s, the big string where the answer must come from
t, the set of required characters the answer must contain

The task is to return the shortest substring of s that contains every character from t, including duplicates.

That duplicate requirement is the part people often miss. If t = "AABC", then a window with only one A is not good enough. The window has to contain two As, one B, and one C.

So the problem is not just “find a substring containing all distinct letters.” It is really “find the smallest window whose character counts are at least the required counts from t.”

The key insight

This is a classic sliding-window problem because the answer is a contiguous substring and the window can be adjusted from both sides.

The useful mental model is:

expand the right edge until the window becomes valid
once valid, shrink the left edge as much as possible
record the best valid window seen so far
repeat until the right edge reaches the end

The reason this works is that once a window is valid, any larger window with the same right boundary is automatically worse than the smallest valid one. So after reaching validity, the only smart move is to squeeze the left side inward until the window would break.

How to derive the optimal solution

The brute-force idea is straightforward: try every substring, count its characters, and check whether it covers t. That is far too slow because there are O(n^2) substrings and each check can cost additional time.

To improve it, notice that the window only changes by one character at a time when a pointer moves. That means the character counts can be updated incrementally instead of recomputed from scratch.

From there, the structure becomes natural:

keep a frequency map for the characters required by t
keep another frequency map for the current window in s
track how many required character types are currently satisfied
move the right pointer to add characters
when all required types are satisfied, move the left pointer to remove unnecessary characters

Each character enters the window once and leaves the window once, so the total work is linear.

Best approach

The best default solution uses a sliding window with frequency counts.

Time: O(len(s) + len(t))
Extra space: O(len(unique characters in t))

This is optimal for the standard interview version because the input must be read at least once, and the algorithm only does constant extra work for each pointer movement.

Python solution

from collections import Counter, defaultdict
from typing import DefaultDict


class Solution:
    def minWindow(self, source_text: str, required_text: str) -> str:
        """Return the shortest substring of source_text that covers required_text."""
        if not source_text or not required_text:
            return ""

        if len(required_text) > len(source_text):
            return ""

        required_character_counts = self._build_required_character_counts(required_text)
        window_character_counts: DefaultDict[str, int] = defaultdict(int)

        total_required_character_types = len(required_character_counts)
        satisfied_character_types = 0

        best_window_start = 0
        best_window_length = float("inf")
        left_index = 0

        for right_index, current_character in enumerate(source_text):
            # Only characters that appear in required_text can help satisfy the window.
            if current_character in required_character_counts:
                window_character_counts[current_character] += 1

                if (
                    window_character_counts[current_character]
                    == required_character_counts[current_character]
                ):
                    satisfied_character_types += 1

            # When all required character counts are covered, shrink from the left
            # to find the smallest valid window for this right boundary.
            while satisfied_character_types == total_required_character_types:
                current_window_length = right_index - left_index + 1
                if current_window_length < best_window_length:
                    best_window_start = left_index
                    best_window_length = current_window_length

                left_character = source_text[left_index]
                if left_character in required_character_counts:
                    window_character_counts[left_character] -= 1

                    if (
                        window_character_counts[left_character]
                        < required_character_counts[left_character]
                    ):
                        satisfied_character_types -= 1

                left_index += 1

        if best_window_length == float("inf"):
            return ""

        best_window_end = best_window_start + best_window_length
        return source_text[best_window_start:best_window_end]

    @staticmethod
    def _build_required_character_counts(required_text: str) -> Counter:
        """Count how many times each character must appear in a valid window."""
        return Counter(required_text)

Why this works

The algorithm maintains one invariant: whenever satisfied_character_types equals the number of required character types, the current window covers all of t.

Expanding the right pointer is the only way to turn an invalid window into a valid one. Shrinking the left pointer is the only way to remove extra characters and make the window minimal. Because the algorithm always shrinks as soon as the window becomes valid, every candidate it records is the smallest valid window for that specific right boundary.

That is enough to guarantee correctness. If a better answer existed, then when the right pointer reached that answer’s end position, the shrinking loop would have reduced the window down to at most that size and recorded it.

Interview follow-ups

What changes if the interviewer asks for only the length, not the substring itself?

The core algorithm stays the same. The only difference is that instead of storing the starting index and slicing the final substring, the solution can keep only the best length seen so far. That slightly simplifies the bookkeeping and avoids the final substring extraction, but it does not change the asymptotic complexity.

This version still works because the difficult part of the problem is deciding when a window is valid and how to shrink it safely. Once that logic is in place, returning the length or the actual substring is just a reporting detail.

How would you handle a version where `s` is extremely large and most characters are irrelevant?

One practical optimization is to pre-filter s down to the positions whose characters appear in t, then run the same sliding-window logic on that filtered list of (index, character) pairs. This does not improve the worst-case asymptotic bound, because the filtered list can still be large, but it can reduce the constant factors a lot when only a small fraction of s matters.

It works because characters not present in t can never help satisfy the requirement. Skipping them does not change which windows are valid; it only makes the implementation do less pointless work between useful positions. The tradeoff is extra preprocessing and slightly more complex indexing when reconstructing the final substring.

What if the interviewer asks whether the solution still works when `t` contains repeated characters?

Yes, and that is exactly why the frequency map matters. The window is not considered valid when it has merely seen each distinct character once. It is valid only when, for every character, the window count is at least the required count from t.

For example, if t = "AABC", then the window must contain two As. A set-based approach would fail here because it would treat one A and two As as equivalent. The counting-based sliding window handles this cleanly, with no change in complexity.

Could this be solved with binary search on the answer length?

Yes, but it is usually worse in interviews. One can binary-search a candidate window length and ask whether any substring of that length covers t, using rolling frequency counts to check each fixed-size window. That gives a valid solution pattern, but it adds an extra logarithmic factor and is harder to explain cleanly.

The direct sliding-window method is better because it finds the minimal answer in one pass. Instead of repeatedly testing lengths, it grows and shrinks the window exactly when the counts say it should. That is both simpler and more efficient.

Practical takeaway

The most important way to remember this problem is not as a memorized trick, but as a pattern:

use a sliding window when the answer is a contiguous range, use frequency counts when duplicates matter, and shrink immediately once the window becomes valid.

That recipe is what turns a seemingly complicated hard problem into a very systematic linear-time solution.