Generated by Codex with GPT-5

Quick facts

What the problem is really asking

This problem is easy to misread at first.

The path does not have to start at the root, and it does not have to end at a leaf. It can start anywhere and end anywhere, as long as it follows parent-child edges and does not reuse a node.

That means the best path might:

  • stay completely inside one subtree
  • go from a node up to some ancestor and then down into the other subtree
  • consist of just one node when every other extension would make the sum worse

So the real question is not “what is the best root-to-leaf path?”

It is:

for every node, what is the best path that uses this node as the highest point of the path, and how can all of those candidates be checked in one traversal?

The key insight

Every valid path in a tree has a unique “turning point” or highest node.

At that turning point, the path can use:

  • the node itself
  • a downward path from the left child
  • a downward path from the right child

That immediately suggests two different quantities for each node.

First, the node should report a downward gain to its parent:

downward_gain(node) = node.val + max(0, left_gain, right_gain)

This value is restricted to one side because once the path continues upward to the parent, it cannot split in two directions without reusing the current node.

Second, the node should also be allowed to act as the full turning point of a completed path:

path_through_node = node.val + max(0, left_gain) + max(0, right_gain)

That candidate may use both children because the path is allowed to come up from one side and go down the other side exactly once.

The max(0, ...) part matters because a negative branch never helps. If a child subtree only lowers the total, the best choice is to skip it entirely.

How to derive the optimal solution

The brute-force mindset is to consider many possible start and end nodes, but that explodes because a tree has too many distinct paths.

The cleaner way is to ask what information a parent actually needs from a child.

A parent does not need every possible path inside the child subtree. It only needs the best single downward path that starts at that child, because that is the only kind of path the parent can extend upward through itself.

Once each child provides that one number, the parent can do both jobs locally:

  1. compute the best downward gain it can pass upward
  2. compute the best complete path that turns at this node

That is exactly a postorder dynamic programming traversal:

  • children first
  • then combine their results at the parent

The overall answer is the maximum path_through_node seen anywhere in the tree.

This gives:

  • Time: O(n) because every node is processed once
  • Extra space: O(n) in the iterative Python implementation below

Best approaches

The standard interview solution is a recursive postorder DFS. It is concise and captures the core idea very clearly.

For Python specifically, there is one practical issue: the tree can be very deep, and a skewed tree can exceed Python’s default recursion limit. Because of that, the implementation below uses an iterative postorder traversal. It computes the exact same dynamic programming state, but it does not depend on recursion depth.

Python solution

The implementation below does two passes without recursion:

  1. build a postorder list of nodes
  2. process that list from the leaves upward while computing each node’s downward gain and updating the best complete path sum
from typing import Dict, List, Optional


# Definition for a binary tree node.
class TreeNode:
    def __init__(
        self,
        val: int = 0,
        left: Optional["TreeNode"] = None,
        right: Optional["TreeNode"] = None,
    ) -> None:
        self.val = val
        self.left = left
        self.right = right


def build_postorder_nodes(root: TreeNode) -> List[TreeNode]:
    """
    Return nodes in postorder without recursion.

    The first loop collects nodes in root-right-left order, which becomes
    left-right-root after reversal.
    """
    traversal_stack: List[TreeNode] = [root]
    reverse_postorder: List[TreeNode] = []

    while traversal_stack:
        node = traversal_stack.pop()
        reverse_postorder.append(node)

        if node.left is not None:
            traversal_stack.append(node.left)
        if node.right is not None:
            traversal_stack.append(node.right)

    reverse_postorder.reverse()
    return reverse_postorder


def child_gain(child: Optional[TreeNode], gain_by_node_id: Dict[int, int]) -> int:
    """Return the child's downward gain, or 0 when the child is missing."""
    if child is None:
        return 0
    return gain_by_node_id[id(child)]


def compute_max_path_sum(root: Optional[TreeNode]) -> int:
    """
    Compute the maximum non-empty path sum in O(n) time.

    gain_by_node_id[id(node)] stores the best sum of a path that starts at
    that node and extends downward through at most one child.
    """
    if root is None:
        raise ValueError("The tree must contain at least one node.")

    postorder_nodes = build_postorder_nodes(root)
    gain_by_node_id: Dict[int, int] = {}
    best_path_sum = root.val

    for node in postorder_nodes:
        left_gain = max(0, child_gain(node.left, gain_by_node_id))
        right_gain = max(0, child_gain(node.right, gain_by_node_id))

        # Treat the current node as the turning point of the full path.
        path_through_node = node.val + left_gain + right_gain
        best_path_sum = max(best_path_sum, path_through_node)

        # Only one side can continue upward to the parent.
        gain_by_node_id[id(node)] = node.val + max(left_gain, right_gain)

    return best_path_sum


class Solution:
    def maxPathSum(self, root: Optional[TreeNode]) -> int:
        return compute_max_path_sum(root)

Why this works

The algorithm works because every candidate path is evaluated exactly where it should be evaluated: at its highest node.

When processing a node, the downward gain from each child is already known because the traversal is postorder. That lets the algorithm decide two things immediately:

  • what is the best one-sided path that can be extended by the parent?
  • what is the best complete path that turns at this node?

Those are the only two facts that matter.

If a child contributes a negative gain, ignoring that child is always better than including it. If a child contributes a positive gain, including it can only improve the total. That is why clamping child gains at zero is correct.

Because every simple path in a tree has a unique highest node, the global optimum must appear as path_through_node for exactly one processed node. The traversal checks all of them, so the best answer cannot be missed.

Interview follow-ups

Can you also return the actual path, not just the sum?

Yes. The same dynamic programming idea still works, but the implementation has to store reconstruction data in addition to the numeric gain. For each node, it is useful to remember which child produced the best downward gain, and when a new global maximum is found, also remember whether the winning path used the left side, the right side, or both. After the traversal finishes, the path can be rebuilt by walking down the recorded child choices from the turning node. This still runs in O(n) time, but the bookkeeping is more detailed than the sum-only version.

What changes if the tree is not binary and each node can have many children?

The core idea stays the same. A node still reports one downward gain to its parent, and a complete path that turns at that node can still use at most two child branches. The only difference is that instead of looking at exactly two children, the algorithm scans all children and keeps the two largest non-negative downward gains. The best completed path at that node becomes node.val + best_gain_1 + best_gain_2, while the upward gain is node.val + best_gain_1. This remains O(n) time overall if each edge is processed once, but the per-node combination step grows with the node’s number of children.

What if the interviewer changes the problem so the path must start at the root?

That becomes a simpler problem because the path no longer has freedom to start anywhere or turn in the middle. If the path must go from the root down to any node, then a preorder or postorder traversal can carry the best prefix sum seen so far. If the path must go from the root to a leaf, the recurrence is just root.val + max(left_subtree_answer, right_subtree_answer), with careful handling for missing children. These variants are easier because the algorithm no longer needs a separate global “turning point” candidate at every node.

Practical takeaway

The cleanest mental model is this:

  • one value travels upward: the best downward gain starting at the current node
  • one value updates the global answer locally: the best path that turns at the current node

Once that split is clear, the problem stops feeling like a complicated tree search and becomes a very manageable postorder dynamic programming pattern.