Generated by Codex with GPT-5
Quick facts
- Difficulty:
MEDIUM - Problem: LRU Cache
- Main tags:
Hash Table,Linked List,Design,Doubly-Linked List
What the problem is really asking
This problem is not mainly about storing key-value pairs. It is about building a data structure that can answer two different questions at the same time:
- “Can I find the value for this key immediately?”
- “Can I tell which key was used least recently and evict it immediately?”
The cache has a fixed capacity. Once it is full, inserting a new key means one
old key must be removed. The one to remove is not the smallest key or the
oldest inserted key. It is the least recently used key, meaning the key
whose last get or put happened furthest in the past.
That combination is what makes the problem interesting. The cache must support:
get(key)inO(1)put(key, value)inO(1)- moving a key to the “most recently used” position in
O(1) - evicting the least recently used key in
O(1)
Why the obvious approaches break down
A plain hash map handles fast lookup, but it does not preserve recency order in a way that lets the least recently used entry be removed in constant time.
A plain list or queue can preserve order, but if a key is accessed again, that
entry must move to the front. Removing a node from the middle of a normal list
or array is not O(1) unless there is already a direct pointer to that exact
node and the structure supports constant-time splice operations.
So the problem is really asking for a hybrid:
- one structure for fast lookup by key
- one structure for fast updates to recency order
The optimal idea: combine a hash map with a doubly linked list
The standard solution uses two pieces:
- a hash map from
key -> node - a doubly linked list ordered from most recently used to least recently used
Each node stores:
- the key
- the value
- a pointer to the previous node
- a pointer to the next node
The hash map makes it possible to jump straight to a node in O(1). The
doubly linked list makes it possible to remove that node from its current
position and move it to the front in O(1).
The list order carries the recency information:
- the front of the list is the most recently used item
- the back of the list is the least recently used item
That means:
get(key)looks up the node, moves it to the front, and returns its valueput(key, value)updates an existing node and moves it to the front, or creates a new node at the front- if capacity is exceeded, the node at the back is evicted
Using dummy head and dummy tail nodes makes the code cleaner because insertion and deletion no longer need special cases for empty lists or one-element lists.
How to derive this solution in an interview
The clean way to derive the solution is to start from the required operations.
First, get must be O(1), so a hash map is almost unavoidable. But once a
key is accessed, it becomes the most recently used item. That means the data
structure must also support a fast “promote this item to the front” operation.
Next, when the cache is full, put must evict the least recently used item in
O(1). That strongly suggests that the least recently used item should always
live at a known fixed position, which is naturally the end of a linked list.
At that point, the design becomes clear:
- the hash map solves direct access
- the linked list solves recency order
- storing node references in the map ties both halves together
This is the key interview insight: the hash map finds the node, and the doubly linked list makes moving and evicting that node constant time.
Python solution
from __future__ import annotations
from dataclasses import dataclass
from typing import Dict, Optional
@dataclass(slots=True)
class CacheNode:
key: int
value: int
prev: Optional["CacheNode"] = None
next: Optional["CacheNode"] = None
class LRUCache:
def __init__(self, capacity: int):
if capacity < 1:
raise ValueError("capacity must be at least 1")
self.capacity = capacity
self.nodes_by_key: Dict[int, CacheNode] = {}
# Sentinels let every insert/remove use the same pointer updates.
self.head = CacheNode(key=0, value=0)
self.tail = CacheNode(key=0, value=0)
self.head.next = self.tail
self.tail.prev = self.head
def get(self, key: int) -> int:
node = self.nodes_by_key.get(key)
if node is None:
return -1
self._move_to_front(node)
return node.value
def put(self, key: int, value: int) -> None:
existing_node = self.nodes_by_key.get(key)
if existing_node is not None:
existing_node.value = value
self._move_to_front(existing_node)
return
if len(self.nodes_by_key) == self.capacity:
self._evict_least_recently_used()
new_node = CacheNode(key=key, value=value)
self.nodes_by_key[key] = new_node
self._insert_after(self.head, new_node)
def _move_to_front(self, node: CacheNode) -> None:
self._detach(node)
self._insert_after(self.head, node)
def _evict_least_recently_used(self) -> None:
least_recent_node = self.tail.prev
if least_recent_node is None or least_recent_node is self.head:
return
self._detach(least_recent_node)
del self.nodes_by_key[least_recent_node.key]
def _detach(self, node: CacheNode) -> None:
previous_node = node.prev
next_node = node.next
if previous_node is None or next_node is None:
raise RuntimeError("Cannot detach a node that is not in the list")
previous_node.next = next_node
next_node.prev = previous_node
node.prev = None
node.next = None
def _insert_after(self, left_node: CacheNode, node_to_insert: CacheNode) -> None:
right_node = left_node.next
if right_node is None:
raise RuntimeError("List is missing the tail sentinel")
node_to_insert.prev = left_node
node_to_insert.next = right_node
left_node.next = node_to_insert
right_node.prev = node_to_insertWhy this works
The hash map ensures that locating a key is constant time. The linked list ensures that once the node is located, removing it from its old position and re-inserting it at the front is also constant time.
That means every important operation stays O(1):
getdoes one hash lookup and at most one node moveputdoes one hash lookup, possibly one eviction, and one insertion
The least recently used item is always the real node just before the tail sentinel, so eviction never requires scanning.
The time complexity is O(1) per operation, and the space complexity is
O(capacity) because the cache stores at most capacity nodes plus the map.
Practical implementation mistakes to avoid
The most common bug is forgetting that updating an existing key also counts
as usage. If put changes a value but leaves the node in place, the recency
order becomes wrong and later evictions will remove the wrong key.
Another common bug is using a singly linked list. A singly linked list can tell
what comes after a node, but it cannot remove an arbitrary node in O(1)
without also knowing the previous node. That is why the doubly linked list is
the right choice here.
The last frequent pitfall is messy edge-case handling around the front and back of the list. Dummy head and tail nodes remove most of that complexity and make pointer code much easier to trust.
Interview follow-ups
Could this be implemented with Python’s OrderedDict instead of a custom linked list?
Yes. In production Python, collections.OrderedDict can be a very practical
solution because it already maintains insertion order and supports moving keys
to one end efficiently. A compact implementation can call move_to_end when a
key is used and popitem(last=False) when the least recently used key must be
removed.
That said, interviewers often prefer the custom hash map plus doubly linked
list solution because it proves the candidate understands the underlying data
structure instead of relying on a language-specific helper. The tradeoff is
between implementation brevity and demonstrating the core design idea. In
Python production code, OrderedDict may be perfectly reasonable. In a data
structures interview, the manual design is usually the safer answer.
How would the design change if the cache needed to be thread-safe?
The main issue is that every get and put mutates shared state. Even get
changes recency order, so concurrent readers are not actually read-only. In a
thread-safe version, those operations must be protected so the hash map and the
linked list cannot drift out of sync.
A straightforward answer is to place a lock around the full body of get and
put. That keeps the implementation correct and easy to reason about, though
it reduces concurrency because only one thread can modify the cache at a time.
More advanced designs may shard the cache or use finer-grained locking, but
that raises complexity quickly. In interview terms, correctness comes first:
the map and list must always be updated atomically as one logical operation.
What if the interviewer asks for LFU cache instead of LRU cache?
That is a materially harder problem because the eviction rule is no longer based on recency alone. Instead of removing the least recently used key, the cache must remove the key with the lowest access frequency, often breaking ties by recency among keys with the same frequency.
The usual LFU design still uses a hash map, but the ordering structure changes.
Rather than one doubly linked list for all entries, the cache keeps frequency
buckets, where each bucket has its own linked list of nodes with the same
frequency. Accessing a node moves it from one frequency bucket to the next. The
time target can still be O(1), but the bookkeeping is more complex because
the system must track both frequency and recency. That contrast helps explain
why LRU is often considered the easier and more interview-friendly cache design.
What if the cache capacity can change after construction?
If capacity increases, the structure barely changes; the cache simply gains room for more nodes. If capacity decreases, the cache must repeatedly evict the least recently used nodes until the current size fits the new capacity.
The core design still works because the least recently used entry is always at the back of the list. The main new concern is API semantics. The implementation must clearly define whether shrinking the capacity triggers immediate eviction or waits until the next insertion. In most practical designs, immediate eviction is cleaner because it keeps the cache state valid after every call and avoids surprising temporary overflow.