LeetCode MEDIUM 12 Integer to Roman Summary

Generated by Codex with GPT-5

Quick facts

Difficulty: MEDIUM
Problem: Integer to Roman
Main tags: Hash Table, Math, String

What the problem is really asking

The input is an integer between 1 and 3999, and the output must be its Roman numeral form.

The main challenge is not basic lookup. The hard part is handling the subtractive cases correctly:

4 is IV, not IIII
9 is IX, not VIIII
40 is XL
90 is XC
400 is CD
900 is CM

So the real interview question is:

how can the conversion be structured so those special cases fall out naturally instead of becoming a pile of ad hoc conditionals?

The key insight

Roman numerals become simple if the symbol table includes both the ordinary symbols and the subtractive combinations.

That means using this ordered list:

1000 -> M
900 -> CM
500 -> D
400 -> CD
100 -> C
90 -> XC
50 -> L
40 -> XL
10 -> X
9 -> IX
5 -> V
4 -> IV
1 -> I

Once the table is written in descending order, the conversion is greedy:

Take the largest symbol value that does not exceed the remaining number.
Append that symbol.
Subtract its value.
Repeat until nothing is left.

Because the subtractive forms are already in the table, the greedy choice never gets stuck producing an invalid representation.

How to derive the optimal solution

Start with the most obvious version: repeatedly peel off the largest Roman chunk that still fits.

That idea almost works with only M, D, C, L, X, V, and I, but it breaks on values like 4, 9, 40, and 900. So the next step is to stop treating those as exceptions and instead promote them into first-class symbols in the table.

After that change, the algorithm becomes uniform. Every step asks the same question:

what is the largest legal Roman block that can be emitted next?

This leads directly to the standard optimal solution:

store the legal value-symbol pairs in descending order
walk the table once
emit each symbol as many times as it fits

For LeetCode’s constraints, this is effectively constant time and constant extra space apart from the output string. More generally, the running time is proportional to the number of emitted Roman symbols, which is still tiny here.

Best approaches

There are two strong ways to solve this problem.

The first is the greedy value-symbol table used below. It is easy to derive in an interview, easy to explain, and easy to extend if the symbol set changes.

The second is a place-value lookup approach:

precompute Roman strings for the thousands digit
precompute Roman strings for the hundreds digit
precompute Roman strings for the tens digit
precompute Roman strings for the ones digit

Then index into those four tables and concatenate the results. That version is also optimal for this problem and can be slightly shorter, but the greedy table is usually the better interview answer because it shows the rule behind the notation rather than hardcoding digit buckets.

Python solution

from typing import List, Tuple


ROMAN_VALUE_SYMBOL_PAIRS: List[Tuple[int, str]] = [
    (1000, "M"),
    (900, "CM"),
    (500, "D"),
    (400, "CD"),
    (100, "C"),
    (90, "XC"),
    (50, "L"),
    (40, "XL"),
    (10, "X"),
    (9, "IX"),
    (5, "V"),
    (4, "IV"),
    (1, "I"),
]


def convert_integer_to_roman(number: int) -> str:
    """Convert a positive integer in the range 1..3999 to Roman numerals."""
    if not 1 <= number <= 3999:
        raise ValueError("Roman numeral conversion expects 1 <= number <= 3999.")

    remaining_value = number
    roman_parts: List[str] = []

    for value, symbol in ROMAN_VALUE_SYMBOL_PAIRS:
        if remaining_value == 0:
            break

        # Emit the current Roman block as many times as it fits.
        symbol_count = remaining_value // value
        if symbol_count == 0:
            continue

        roman_parts.append(symbol * symbol_count)
        remaining_value -= value * symbol_count

    return "".join(roman_parts)


class Solution:
    def intToRoman(self, num: int) -> str:
        return convert_integer_to_roman(num)

Interview follow-ups

What if the interviewer asks for the reverse conversion, Roman to Integer?

Then the clean approach is to scan from left to right and compare each symbol with the one after it. If the current symbol is smaller than the next one, subtract it; otherwise add it. That works because subtractive notation only appears when a smaller symbol precedes a larger one, such as I before V or X before C.

The time complexity is linear in the string length and the extra space is O(1). This is a good follow-up because it checks whether the candidate understands the structure of Roman numerals in both directions, not just how to memorize one mapping table.

How would you validate invalid Roman numerals instead of assuming the input is well-formed?

A production parser usually needs more than simple addition and subtraction. It must reject illegal repetitions like VV, invalid subtractive pairs like IL, and malformed ordering patterns.

One practical strategy is to parse the Roman numeral to an integer, convert that integer back to a canonical Roman numeral with the same conversion function, and compare the rebuilt string with the original input. If they differ, the original string was not in canonical form. That keeps the logic simple and centralizes the formatting rules in one place, though it adds an extra conversion pass.

What changes if the numeral system must support values larger than 3999?

The LeetCode problem avoids that complication by capping the input at 3999, which fits the standard Roman symbols. If the system must go beyond that, the first thing to decide is the notation rule: some variants use overlines to mean multiplication by 1000, while others define nonstandard extensions.

Once the notation is defined, the same greedy-table design still works. The table just needs more value-symbol pairs. The tradeoff is not algorithmic complexity; it is making sure the symbol set and formatting rules are specified clearly enough that there is still one canonical representation for each number.

Practical takeaway

The simplest way to remember this problem is: Roman numeral conversion is just greedy subtraction over a table that already includes the subtractive forms.

If the symbol table is correct and ordered from largest to smallest, the implementation stays short, readable, and robust.