Generated by Codex with GPT-5
Difficulty: MEDIUM
Problem: Lowest Common Ancestor of a Binary Tree
Problem gist
Given the root of a binary tree and two existing nodes p and q, return their lowest common ancestor.
An ancestor can be the node itself. That detail matters: if p is above q, then p is the answer. “Lowest” means deepest in the tree, not smallest by value. This is a normal binary tree, so there is no useful ordering rule like there would be in a binary search tree.
The problem is really asking where the paths from the root to p and q split. The lowest common ancestor is the last shared node on those two root-to-node paths.
Deriving the optimal solution
A direct way to think about the problem is to search each subtree and ask a small question: does this subtree contain p, q, or neither?
Use a postorder DFS because a node can only know whether it is the answer after it hears from its children:
- If the current node is
None, it contributes no match. - If the current node is
porq, return it upward as a found target. - Recursively search the left and right subtrees.
- If both sides return a node, the current node is where the two targets meet, so return the current node.
- If only one side returns a node, pass that node upward.
This works because the recursion returns useful evidence, not just booleans. A returned target means “this subtree contains one of the nodes.” A returned ancestor means “this subtree already contains both nodes, and this is their lowest meeting point.” Once both left and right branches report a match, no lower node can contain both targets because the targets are split across different child subtrees.
The algorithm visits each tree node at most once, so the time complexity is O(n). The extra space is O(h) for the recursion stack, where h is the tree height. In a balanced tree that is O(log n); in a skewed tree it can be O(n).
Python solution
from dataclasses import dataclass
from typing import Optional
# Definition for a binary tree node.
# class TreeNode:
# def __init__(self, x: int):
# self.val = x
# self.left = None
# self.right = None
@dataclass(frozen=True)
class SearchResult:
ancestor: Optional["TreeNode"]
found_first_target: bool
found_second_target: bool
class Solution:
def lowestCommonAncestor(
self,
root: "TreeNode",
p: "TreeNode",
q: "TreeNode",
) -> "TreeNode":
"""Return the deepest node that has both p and q as descendants."""
result = self._find_lowest_common_ancestor(root, p, q)
if result.ancestor is None:
raise ValueError("Both target nodes must exist in the tree.")
return result.ancestor
def _find_lowest_common_ancestor(
self,
node: Optional["TreeNode"],
first_target: "TreeNode",
second_target: "TreeNode",
) -> SearchResult:
if node is None:
return SearchResult(
ancestor=None,
found_first_target=False,
found_second_target=False,
)
left_result = self._find_lowest_common_ancestor(
node.left,
first_target,
second_target,
)
# If one child subtree already contains both targets, that ancestor is
# lower than the current node and cannot be improved by searching here.
if left_result.ancestor is not None:
return left_result
right_result = self._find_lowest_common_ancestor(
node.right,
first_target,
second_target,
)
if right_result.ancestor is not None:
return right_result
# LeetCode passes actual node objects, so identity comparison is exact.
found_first_target = (
left_result.found_first_target
or right_result.found_first_target
or node is first_target
)
found_second_target = (
left_result.found_second_target
or right_result.found_second_target
or node is second_target
)
ancestor = node if found_first_target and found_second_target else None
return SearchResult(
ancestor=ancestor,
found_first_target=found_first_target,
found_second_target=found_second_target,
)The helper has one clear contract: return whether each target appears in the current subtree, and return an ancestor only once both targets are known to be inside that subtree. This is a production-friendly version of the common interview solution because it does not silently return a partial answer when one target is missing.
Interview follow-ups
What if one or both target nodes may not exist in the tree?
The usual LeetCode version guarantees both nodes exist, so many interview solutions return a node directly from each subtree and skip validation. Production code often cannot assume that. The clean extension is for the DFS to return three pieces of information: the candidate LCA, whether p was seen, and whether q was seen.
The Python solution above uses that extension. Each recursive call combines its left and right results with whether the current node is one of the targets. The caller only accepts the candidate ancestor if both p and q were found somewhere in the tree. This still works because the original LCA logic is unchanged; the algorithm simply carries enough validation state to avoid returning p when q is absent.
The time complexity remains O(n) because every node is still visited once. The recursion stack remains O(h). The tradeoff is slightly more verbose code because each return value carries status flags instead of only a node reference.
How does the solution change for a binary search tree?
For a binary search tree, node values provide direction. If both target values are smaller than the current node’s value, the LCA must be in the left subtree. If both are larger, it must be in the right subtree. Otherwise, the current node is the split point and therefore the LCA.
This works because every left subtree value is less than the current node and every right subtree value is greater. The first node where the targets stop going in the same direction is exactly the deepest shared ancestor.
The time complexity becomes O(h) instead of O(n) because the algorithm follows one root-to-leaf path rather than scanning the whole tree. In a balanced BST that is O(log n), while a skewed BST can still degrade to O(n).
What if each node has a parent pointer?
With parent pointers, the tree can be treated like two linked lists that climb toward the root. One practical approach is to walk from p to the root and store every ancestor in a set. Then walk from q upward; the first node already in the set is the lowest common ancestor.
The first repeated ancestor is lowest because the walk from q starts at q and moves upward one level at a time. As soon as it touches an ancestor of p, it has found the closest shared ancestor from q’s side.
This takes O(h) time and O(h) extra space. If space must be constant, compute the depths of p and q, lift the deeper node until both are at the same depth, then move both upward together until they meet. That version is still O(h) time but only O(1) extra space.
What if the interviewer asks for the LCA of many node pairs?
If there are only a few queries, running the DFS solution once per query is usually fine. Each query costs O(n), which is simple and reliable.
For many repeated queries on a mostly static tree, preprocessing becomes useful. One common approach is binary lifting: store each node’s parent at powers of two distances, such as its 2^0, 2^1, 2^2, and larger ancestors. To answer a query, first lift the deeper node up to the same depth as the shallower node, then lift both nodes upward in decreasing powers until their parents match.
Preprocessing costs O(n log n) time and space. Each query then costs O(log n). The tradeoff is only worth it when query volume is high enough to justify the extra data structure.
What if the tree can contain duplicate values?
The recursive solution should compare node identity, not node values. Duplicate values do not matter if p and q are actual node references, because the algorithm is locating specific nodes in the tree.
If the input only gives values instead of node references, the problem is ambiguous when duplicates exist. The interviewer must define whether the first matching value, all matching values, or a specific occurrence is intended. Once the target nodes are identified unambiguously, the same LCA algorithm applies.
Takeaway
The useful mental model is that each subtree reports what it found. A node becomes the lowest common ancestor exactly when one target appears on one side and the other target appears on the other side, or when the node itself is one target and the other target appears below it. That gives a one-pass DFS with optimal O(n) time for a general binary tree.