Generated by Codex with GPT-5
Quick facts
- Difficulty:
HARD - Problem: Word Ladder
- Main tags:
Breadth-First Search,Hash Table,String,Graph
What the problem is really asking
The input gives a beginWord, an endWord, and a dictionary called wordList.
Each move can change exactly one letter, and every intermediate word must appear in wordList. The task is to return the number of words in the shortest valid transformation sequence from beginWord to endWord. If no such sequence exists, return 0.
For example, hit -> hot -> dot -> dog -> cog has length 5 because the sequence contains five words, not because it makes five edits.
The key constraint is that every move has the same cost: one word change. That makes this a shortest-path problem on an unweighted graph.
The graph hidden inside the problem
Think of each word as a node. Two words have an edge between them if they differ by exactly one character.
The brute-force graph would compare every word with every other word to see whether an edge exists. That is too much work for larger dictionaries. The better mental model is to generate only possible neighbors of the current word:
- change one position
- try each lowercase letter
- keep the candidate only if it is in the dictionary
Because dictionary membership can be checked with a set, each neighbor test is fast.
Best approach
The standard optimal approach is BFS, because BFS explores all words at distance 1, then all words at distance 2, and so on. The first time it reaches endWord, it has found the shortest transformation.
A practical interview-grade improvement is bidirectional BFS. Instead of searching only from beginWord, search from both beginWord and endWord, always expanding the smaller frontier. This often cuts the explored search space dramatically because word-ladder graphs branch quickly.
The algorithm is:
- Put
beginWordin the forward frontier andendWordin the backward frontier. - Expand the smaller frontier by one letter change.
- If a generated word has already been reached from the other side, the two searches have met and the shortest ladder length is known.
- Remove visited words from the unvisited set so they are never processed twice.
- If a frontier becomes empty, no ladder exists.
How to derive it
Start with the simplest correct idea: from beginWord, try every valid one-letter transformation, then every transformation from those words, and so on. That is exactly BFS.
The reason BFS is enough is that every edge has equal weight. There is no cheaper or more expensive transformation; each step changes one word. So the first time BFS reaches a word, it has reached that word in the fewest possible steps.
Bidirectional BFS comes from noticing that a shortest path can be discovered from either end. If the shortest path has length d, a one-sided BFS may explore roughly branching_factor^d states. Two-sided BFS instead tries to meet near the middle, closer to 2 * branching_factor^(d / 2) in the typical case. The worst-case bound is still tied to the dictionary size, but the practical difference is large.
Complexity
Let n be the number of words in wordList and l be the word length.
- Time:
O(n * l * 26 * l)in this implementation, commonly simplified toO(n * l^2)because the alphabet size is fixed. - Space:
O(n)for the unvisited set and BFS frontiers.
The extra l in the time bound comes from building candidate strings with slicing. A wildcard-pattern index can reduce some constant factors, but the BFS reasoning is the same.
Python solution
from string import ascii_lowercase
from typing import Iterable, Iterator
class BidirectionalWordLadderSearch:
"""Find the shortest word ladder length with bidirectional BFS."""
def __init__(
self,
begin_word: str,
end_word: str,
word_list: Iterable[str],
) -> None:
self.begin_word = begin_word
self.end_word = end_word
self.word_length = len(begin_word)
self.unvisited_words = {
word for word in word_list if len(word) == self.word_length
}
def find_shortest_length(self) -> int:
if len(self.end_word) != self.word_length:
return 0
if self.end_word not in self.unvisited_words:
return 0
forward_frontier = {self.begin_word}
backward_frontier = {self.end_word}
forward_visited = {self.begin_word}
backward_visited = {self.end_word}
self.unvisited_words.discard(self.begin_word)
self.unvisited_words.discard(self.end_word)
ladder_length = 1
while forward_frontier and backward_frontier:
# Expanding the smaller frontier keeps the branching factor low.
if len(forward_frontier) > len(backward_frontier):
forward_frontier, backward_frontier = (
backward_frontier,
forward_frontier,
)
forward_visited, backward_visited = (
backward_visited,
forward_visited,
)
next_frontier: set[str] = set()
for current_word in forward_frontier:
for candidate_word in self._one_letter_variants(current_word):
if candidate_word in backward_visited:
return ladder_length + 1
if candidate_word in self.unvisited_words:
self.unvisited_words.remove(candidate_word)
forward_visited.add(candidate_word)
next_frontier.add(candidate_word)
forward_frontier = next_frontier
ladder_length += 1
return 0
def _one_letter_variants(self, word: str) -> Iterator[str]:
for index in range(self.word_length):
prefix = word[:index]
suffix = word[index + 1 :]
original_letter = word[index]
for replacement_letter in ascii_lowercase:
if replacement_letter == original_letter:
continue
yield f"{prefix}{replacement_letter}{suffix}"
class Solution:
def ladderLength(
self,
beginWord: str,
endWord: str,
wordList: list[str],
) -> int:
search = BidirectionalWordLadderSearch(beginWord, endWord, wordList)
return search.find_shortest_length()Why this works
The invariant is that each frontier contains exactly the words reachable in the current number of steps from its side of the search.
When the algorithm expands a frontier, every generated valid candidate is one transformation farther away. If that candidate has already been reached from the opposite side, the two partial paths connect into a complete ladder. Because both sides expand level by level, that meeting point gives the shortest possible sequence.
Removing newly visited words from unvisited_words is safe because BFS reaches each word at its minimum distance from the side currently expanding. Revisiting the same word later cannot produce a shorter path; it would only repeat work.
Interview follow-ups
How would you solve it with ordinary one-direction BFS?
The one-direction solution starts from beginWord, uses a queue, and pushes every unvisited dictionary word that differs by one letter. The first time it dequeues or generates endWord, it returns the current distance plus one.
This works for the same reason as the bidirectional version: the graph is unweighted, so BFS discovers nodes in increasing distance order. The tradeoff is performance. One-direction BFS is simpler to code, but it may explore a much larger part of the dictionary before reaching the target.
How can a wildcard-pattern map help?
Instead of trying all 26 replacement letters for every position, the solution can preprocess each word into patterns such as h*t, *ot, and ho*. Words that share a pattern differ by one character at that position, so each pattern bucket gives immediate neighbors.
This works because every valid edge corresponds to at least one shared wildcard pattern. The tradeoff is memory and preprocessing time: the map stores n * l pattern entries. In return, neighbor lookup can be cleaner and often faster when the alphabet is large or when generating candidates is inconvenient.
What if the interviewer asks for the actual transformation path?
Then the BFS needs parent pointers, not just distances. In one-direction BFS, store parent[child] = current_word when a word is first visited. When endWord is found, walk backward through the parent map and reverse the result.
For bidirectional BFS, path reconstruction is more delicate because the search can meet in the middle and the active direction may swap. The usual approach is to keep parent links for both sides, then stitch the begin-side path and end-side path together at the meeting word. The time complexity stays linear in the number of explored edges, but the implementation has more bookkeeping.
What changes if all shortest ladders are required?
That becomes Word Ladder II. A normal visited set is not enough because multiple parents at the same BFS depth may lead to different shortest paths.
The expected approach is to run BFS level by level, build a graph of parent links for all shortest-depth transitions, and stop expanding after the first level that reaches endWord. Then use backtracking from endWord to beginWord through those parent links. This preserves all shortest paths without wandering into longer ones. The tradeoff is output size: there can be many shortest ladders, so the result itself may be large.
What if words can have different lengths?
If transformations still change exactly one character and do not allow insertion or deletion, then words of different lengths can never be neighbors. The solver can safely filter the dictionary to words with the same length as beginWord, as the implementation does.
If the rules change to allow insertions or deletions, the graph definition changes. Neighbor generation would need to include deletion and insertion candidates, and the branching factor would increase. BFS would still be the right shortest-path tool as long as every operation costs the same.
Practical takeaway
The main move is to stop thinking of the problem as string manipulation and start thinking of it as shortest path in an implicit graph.
Once that graph view is clear, BFS is the natural answer. Bidirectional BFS is the practical upgrade: it keeps the correctness of level-order search while reducing the number of words the algorithm usually has to explore.