LeetCode MEDIUM 399 Evaluate Division Summary

Generated by Codex with GPT-5

Quick facts

Difficulty: MEDIUM
Problem: Evaluate Division
Main tags: Graph, DFS, BFS, Union-Find

What the problem is really asking

Each equation like a / b = 2.0 gives a multiplicative relationship between two variables.

The query x / y asks:

can the known equations connect x to y, and if they can, what ratio do those equations imply?

That is the whole problem. The variables are not numbers on their own. They only have meaning relative to one another. So the task is really about building a structure that preserves relative ratios inside each connected group of variables.

The key modeling step

The cleanest first model is a weighted graph.

If a / b = 2.0, then:

there is an edge from a to b with weight 2.0
there is an edge from b to a with weight 0.5

Now a query like a / c becomes:

find any path from a to c, then multiply the edge weights along that path.

For example, if:

a / b = 2
b / c = 3

then a / c = 2 * 3 = 6.

That immediately explains why graph traversal works: every valid path composes ratios by multiplication.

The first correct solution

The most direct solution is:

Build the weighted graph from the equations.
For each query, run DFS or BFS from the numerator to the denominator.
Multiply weights as the search moves through the graph.
If the destination is never reached, return -1.0.

This is easy to derive in an interview and easy to justify. The downside is repeated work. If there are many queries inside the same connected component, the algorithm keeps walking the same graph over and over again.

That is the clue that a stronger preprocessing approach exists.

Why weighted union-find is the better final answer

Union-find already knows how to answer one important question efficiently:

are two nodes in the same connected component?

This problem needs one extra piece of information:

if two variables are connected, what is their ratio?

So the natural upgrade is weighted union-find:

each variable points to a parent
each variable also stores the ratio between itself and its parent
after path compression, each variable effectively knows its ratio to the root of its component

Once that is true, answering a query is easy.

If:

x / root = wx
y / root = wy

then:

x / y = wx / wy

That turns a path problem into a constant-time ratio calculation after the structure is built.

How to derive the weighted union-find formula

Suppose an equation says:

x / y = value

Assume:

x / root_x = wx
y / root_y = wy

If the two roots are different, union-find needs to connect one root under the other. Say root_x becomes a child of root_y.

Then the weight stored on that new parent edge must satisfy:

x / y = (x / root_x) * (root_x / root_y) / (y / root_y)

So:

value = wx * (root_x / root_y) / wy

which means:

root_x / root_y = value * wy / wx

That one equation is the entire trick. Once that relationship is stored, every future query in the component can reuse it.

Why the algorithm is correct

The invariant is:

after every union and every compressed find, weight_to_parent[var] participates in a chain whose product equals var / root(var).

That invariant starts true for isolated variables because each variable is its own root with ratio 1.0.

During path compression, a node is rewired directly to the root, and its stored weight is multiplied by its old parent’s weight-to-root. That preserves the same overall ratio.

During union, the new edge between component roots is chosen precisely so the given equation remains true. Since every other node in each component already has a correct ratio to its own root, the whole merged component becomes consistent as well.

Because queries compare two ratios to the same root, dividing those ratios gives the correct answer whenever the variables are connected. If they are not connected, no chain of equations defines the requested division, so -1.0 is correct.

Best approach

There are two strong answers:

Weighted graph + DFS/BFS is the best way to discover the problem and is often the fastest route to a correct interview solution.
Weighted union-find is the best final answer when there are many queries, because it preprocesses the relationships once and then answers each query very cheaply.

For this note, the implementation uses weighted union-find.

Build: O(E * alpha(V))
Query: O(alpha(V)) amortized
Space: O(V)

Here V is the number of distinct variables and E is the number of equations.

Python solution

The implementation below wraps the logic in a dedicated WeightedUnionFind class. Each variable stores its parent and the ratio between the variable and that parent. After find, the stored weight becomes the ratio from the variable directly to the root, which makes queries simple and fast.

from typing import Dict, List


class WeightedUnionFind:
    """
    Union-find where each node stores:
    node / parent[node] = weight_to_parent[node]
    """

    def __init__(self) -> None:
        self.parent: Dict[str, str] = {}
        self.weight_to_parent: Dict[str, float] = {}

    def add(self, variable: str) -> None:
        """Initialize a variable as its own component root."""
        if variable not in self.parent:
            self.parent[variable] = variable
            self.weight_to_parent[variable] = 1.0

    def find(self, variable: str) -> str:
        """
        Return the component root.

        After path compression finishes:
        weight_to_parent[variable] = variable / root
        """
        parent_variable = self.parent[variable]
        if parent_variable != variable:
            root = self.find(parent_variable)

            # Multiply through the old parent so the stored weight becomes
            # variable / root after compression.
            self.weight_to_parent[variable] *= self.weight_to_parent[parent_variable]
            self.parent[variable] = root

        return self.parent[variable]

    def union(self, numerator: str, denominator: str, ratio: float) -> None:
        """
        Merge the components so that:
        numerator / denominator = ratio
        """
        self.add(numerator)
        self.add(denominator)

        numerator_root = self.find(numerator)
        denominator_root = self.find(denominator)

        if numerator_root == denominator_root:
            return

        numerator_to_root = self.weight_to_parent[numerator]
        denominator_to_root = self.weight_to_parent[denominator]

        # Attach numerator_root under denominator_root.
        # We need:
        # numerator / denominator
        # = (numerator / numerator_root)
        #   * (numerator_root / denominator_root)
        #   / (denominator / denominator_root)
        # = ratio
        #
        # Therefore:
        # numerator_root / denominator_root
        # = ratio * denominator_to_root / numerator_to_root
        self.parent[numerator_root] = denominator_root
        self.weight_to_parent[numerator_root] = (
            ratio * denominator_to_root / numerator_to_root
        )

    def are_connected(self, numerator: str, denominator: str) -> bool:
        """Return whether two variables belong to the same component."""
        if numerator not in self.parent or denominator not in self.parent:
            return False

        return self.find(numerator) == self.find(denominator)

    def divide(self, numerator: str, denominator: str) -> float:
        """Return numerator / denominator if defined, otherwise -1.0."""
        if not self.are_connected(numerator, denominator):
            return -1.0

        return self.weight_to_parent[numerator] / self.weight_to_parent[denominator]


class Solution:
    def calcEquation(
        self,
        equations: List[List[str]],
        values: List[float],
        queries: List[List[str]],
    ) -> List[float]:
        union_find = WeightedUnionFind()

        for (numerator, denominator), ratio in zip(equations, values):
            union_find.union(numerator, denominator, ratio)

        answers: List[float] = []
        for numerator, denominator in queries:
            answers.append(union_find.divide(numerator, denominator))

        return answers

Interview follow-ups

What if the interviewer wants the simpler graph solution first?

That is often the best way to start. Build an adjacency list where each equation adds two directed edges with reciprocal weights. Then answer each query with DFS or BFS while carrying the running product from the start node to the current node. This is easier to invent on the spot because it follows directly from the equation semantics. The tradeoff is that each query may cost O(V + E) in the worst case, so repeated queries over the same component redo a lot of work.

Why is union-find better when there are many queries?

Union-find turns repeated traversals into preprocessing. After the equations are merged, each variable effectively knows its ratio to a component root. Then a query only needs to check whether the two variables share a root and, if they do, divide their root-relative weights. That gives near-constant amortized query time. The tradeoff is that the implementation is less obvious than graph traversal, so it is usually better as the polished final answer than as the first thing said in an interview.

How would you handle contradictory equations?

The original LeetCode problem guarantees consistency, so the basic solution does not need contradiction checks. In an interview variant without that guarantee, the algorithm should detect when an equation connects two variables that are already in the same component but implies a different ratio from the one currently stored. In the graph approach, that means the new equation disagrees with an existing path product. In the union-find approach, it means numerator / denominator computed from root weights does not match the new value within a small floating-point tolerance. The tradeoff is extra validation logic, but the asymptotic performance remains essentially the same.

What if equations are added over time and queries keep arriving?

Weighted union-find handles incremental additions very well as long as the system only adds equations and never deletes them. Each new equation becomes another union operation, and each query remains a connectivity check plus a ratio calculation. That makes it a strong fit for online updates. The important limitation is deletions: standard union-find is not designed to remove relationships cleanly, so if the interviewer adds deletions, a graph-based structure or a more advanced dynamic connectivity technique becomes necessary.

What if the interviewer asks about numerical stability?

The stored values are floating-point ratios, so tiny rounding differences can accumulate in long chains. For the LeetCode constraints this is fine, but in a production system it is worth comparing values with a tolerance instead of exact equality when validating consistency. Another practical step is keeping path compression enabled, because it shortens chains and reduces the number of multiplications needed on later operations. The tradeoff is that floating-point arithmetic remains approximate, but in ordinary interview settings that is acceptable and expected.