Generated by Codex with GPT-5

Quick facts

  • Difficulty: HARD
  • Problem: Custom LeetCode-style prompt: Encode Any Input to JSON String
  • Topics: Serialization, Depth-First Search, Hash Table, Graph, Design

Problem gist

The task is to write an encoder that accepts an input value and returns the same kind of string a normal JSON serializer would return:

  • 1 becomes "1"
  • True becomes "true"
  • False becomes "false"
  • None becomes "null"
  • "abc" becomes "\"abc\""
  • [1, True] becomes "[1,true]"
  • {"name": "Ada"} becomes "{\"name\":\"Ada\"}"

The recursive part is straightforward for tree-shaped input. The hard edge case is a cyclic runtime object graph. For example:

a = []
b = {"name": a}
a.append(b)

Now a contains b, and b["name"] points back to a. A naive recursive encoder will loop forever:

list -> dict -> list -> dict -> ...

There is no ordinary JSON string that preserves that cycle. JSON values are trees, not graphs. So a strict regular-JSON encoder should handle the cycle by detecting it and raising a clear error instead of recursing forever. This is also how standard serializers behave: cyclic input is not JSON-serializable.

Key idea

Build the JSON string recursively, but track the containers currently on the recursion stack:

  • primitives encode directly
  • strings need JSON escaping
  • arrays encode each item and join them with commas
  • objects encode each key as a JSON string and each value recursively
  • before descending into a list or dictionary, add its object identity to an active set
  • after finishing that container, remove it from the active set
  • if a container is already active, the input contains a cycle, so raise an error

Use an active recursion stack rather than a permanent “seen” set. A repeated object is not necessarily a cycle:

child = []
value = [child, child]

This can be encoded as [[],[]] by visiting the same child twice. It only becomes impossible when the traversal reaches an object that is already on the current path.

How to derive the solution

Start with the primitive cases:

  1. None maps to null
  2. booleans map to lowercase true and false
  3. integers and finite floats map to their decimal representation
  4. strings map to quoted strings with JSON escape rules

Then add recursive containers:

  1. lists and tuples become [item1,item2,...]
  2. dictionaries become {"key":value,...}
  3. dictionary keys must become JSON object field names, so this solution accepts string keys and raises for other key types in strict mode

Finally, wrap container encoding with cycle detection. The encoder inserts id(container) into active_container_ids before recursively encoding children, and removes it in a finally block so the set is cleaned up even if a child raises.

Python solution

import math
from collections.abc import Mapping
from typing import Any


class Solution:
    def encode(self, value: Any) -> str:
        return JsonEncoder().encode(value)


class JsonEncoder:
    def __init__(self) -> None:
        self._active_container_ids: set[int] = set()

    def encode(self, value: Any) -> str:
        if value is None:
            return "null"

        if value is True:
            return "true"

        if value is False:
            return "false"

        if isinstance(value, str):
            return self._encode_string(value)

        if isinstance(value, int):
            return str(value)

        if isinstance(value, float):
            if not math.isfinite(value):
                raise ValueError("JSON does not support NaN or Infinity")
            return repr(value)

        if isinstance(value, (list, tuple)):
            return self._encode_array(value)

        if isinstance(value, Mapping):
            return self._encode_object(value)

        raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")

    def _encode_array(self, values: list[Any] | tuple[Any, ...]) -> str:
        self._enter_container(values)
        try:
            return "[" + ",".join(self.encode(item) for item in values) + "]"
        finally:
            self._leave_container(values)

    def _encode_object(self, mapping: Mapping[Any, Any]) -> str:
        self._enter_container(mapping)
        try:
            fields: list[str] = []
            for key, value in mapping.items():
                if not isinstance(key, str):
                    raise TypeError("JSON object keys must be strings")
                fields.append(f"{self._encode_string(key)}:{self.encode(value)}")
            return "{" + ",".join(fields) + "}"
        finally:
            self._leave_container(mapping)

    def _enter_container(self, value: Any) -> None:
        object_id = id(value)
        if object_id in self._active_container_ids:
            raise ValueError("Circular reference detected")
        self._active_container_ids.add(object_id)

    def _leave_container(self, value: Any) -> None:
        self._active_container_ids.remove(id(value))

    def _encode_string(self, value: str) -> str:
        escaped_parts: list[str] = ['"']

        for char in value:
            code_point = ord(char)

            if char == '"':
                escaped_parts.append('\\"')
            elif char == "\\":
                escaped_parts.append("\\\\")
            elif char == "\b":
                escaped_parts.append("\\b")
            elif char == "\f":
                escaped_parts.append("\\f")
            elif char == "\n":
                escaped_parts.append("\\n")
            elif char == "\r":
                escaped_parts.append("\\r")
            elif char == "\t":
                escaped_parts.append("\\t")
            elif code_point < 0x20:
                escaped_parts.append(f"\\u{code_point:04x}")
            else:
                escaped_parts.append(char)

        escaped_parts.append('"')
        return "".join(escaped_parts)

Example outputs:

print(Solution().encode(1))                 # 1
print(Solution().encode(True))              # true
print(Solution().encode(None))              # null
print(Solution().encode("a\nb"))            # "a\nb"
print(Solution().encode([1, True, None]))   # [1,true,null]
print(Solution().encode({"name": "Ada"}))   # {"name":"Ada"}

For the cyclic sample, the encoder raises instead of looping:

a = []
b = {"name": a}
a.append(b)

Solution().encode(a)  # ValueError: Circular reference detected

The runtime is O(n + s), where n is the number of reachable values and s is the size of the produced JSON string. The extra memory is O(d), where d is the maximum nesting depth, because the active set only tracks containers on the current recursion path.

Important edge cases

Booleans must be checked before integers. In Python, bool is a subclass of int, so checking int first would incorrectly turn True into "True" or "1" instead of "true".

Shared references are not the same as cycles. This input is valid:

child = []
value = [child, child]

It should encode as [[],[]]. The same object appears twice, but not while it is already on the current recursion path.

Self-cycles need the same detection as larger cycles:

a = []
a.append(a)

This should raise ValueError. Without the active-container set, the recursive call would never terminate.

Special floating-point values also need deliberate handling. Strict JSON does not support NaN, Infinity, or -Infinity, so the encoder raises instead of emitting non-standard tokens.

Interview follow-ups

What if the interviewer asks why cycles cannot be encoded as regular JSON?

Regular JSON has arrays, objects, strings, numbers, booleans, and null. None of those can express “this value points back to an earlier container.” A parsed JSON document is a tree, while a cyclic runtime object is a graph. Preserving the cycle requires a non-standard convention such as "$id" and "$ref", which is valid JSON syntax but no longer the ordinary JSON representation of the original value.

So if the output must be regular JSON, detecting and rejecting cycles is the correct behavior. If the output must preserve cycles, the problem statement must allow an extended schema.

What if the interviewer wants to handle arbitrary objects?

A common extension is to encode an object’s public fields as a JSON object, similar to vars(obj). That still has to respect the same rules: field names become JSON object keys, field values are encoded recursively, and object identity is added to the active set while its fields are being encoded.

The tradeoff is that “any object” is language-specific. Some objects have file handles, lazy properties, descriptors, private state, or fields that are not JSON-compatible. In an interview, it is cleaner to first solve the JSON data model, then state how object fields would be adapted.

What if dictionary keys are not strings?

Strict JSON object keys are strings. Python’s json.dumps will coerce some keys such as integers and booleans into strings, but that can be surprising and can create collisions. For an interview implementation, raising on non-string keys is the most defensible strict behavior.

If the interviewer explicitly wants Python-like coercion, add a key conversion helper and document the collision behavior.

What if recursion depth is very large?

The recursive version is the easiest to explain, but deeply nested input can exceed the language’s call stack. A production serializer can use an explicit stack of frames to build the output iteratively.

That does not change the core idea. The implementation still needs to enter a container before processing its children, leave it afterward, and raise if it sees a container that is already active.

Practical takeaway

The main trap is mixing up two different requirements:

  • regular JSON output
  • preserving cyclic object identity

You cannot have both at the same time. For regular JSON, the right answer is to serialize JSON-compatible tree-shaped values and reject cycles cleanly. The cycle-handling requirement is still important, but “handle” means “detect and fail predictably,” not “invent a reference format” unless the problem explicitly permits one.