Generated by Codex with GPT-5
Quick facts
- Difficulty:
MEDIUM - Problem: Insert Delete GetRandom O(1)
- Main tags:
Array,Hash Table,Math,Design,Randomized
What the problem is really asking
The task is to design a set-like data structure with three operations:
insert(val)adds a value only if it is not already present.remove(val)deletes a value only if it is present.getRandom()returns one currently stored value, with every stored value equally likely.
The important constraint is that all three operations should run in average O(1) time.
A normal hash set makes insert and remove easy, but it does not let us pick a uniformly random element in constant time. A normal array makes random access easy, but deleting a value from the middle is expensive because the later elements must shift. The optimal design combines both structures so each covers the other’s weakness.
The key idea
Use two pieces of state:
- a dynamic array containing all current values
- a hash map from each value to its index in that array
The array makes getRandom() simple: choose a random index and return the value at that index. Since every index is equally likely, every value is equally likely.
The hash map makes lookup simple: it can tell immediately whether a value exists and where it lives in the array.
The only tricky operation is remove. Removing from the middle of an array would normally shift elements, which is not O(1). Instead, move the array’s last value into the deleted value’s slot, update that moved value’s index in the hash map, and then pop the last array element.
This “swap with the last element” trick is what keeps deletion constant time.
How to derive the optimal solution
Start with the uniform-random requirement. If getRandom() must be O(1), the structure needs direct access to a random position, which strongly suggests an array.
Then look at deletion. An array alone cannot delete an arbitrary value in O(1) because it first has to find the value, and deleting from the middle causes shifting. A hash map solves the finding problem by storing each value’s current index.
The remaining issue is shifting. If order does not matter, there is no reason to preserve the array order after a deletion. That means the value being removed can be replaced by the last value, and the array can shrink with a normal pop().
The final invariant is simple:
For every value in the set, indices_by_value[value] must point to the position where that value appears in values.
As long as every insert and remove preserves that invariant, all operations are average O(1).
A small walkthrough
Suppose the structure currently stores:
values = [10, 20, 30, 40]
indices_by_value = {10: 0, 20: 1, 30: 2, 40: 3}
Now remove 20.
The value 20 is at index 1. The last value is 40. Instead of shifting 30 and 40 left, place 40 into index 1, update 40’s stored index to 1, pop the last slot, and delete 20 from the map.
The new state is:
values = [10, 40, 30]
indices_by_value = {10: 0, 30: 2, 40: 1}
The order changed, but the problem never required order. The set contents are correct, and the operation touched only a constant number of array and hash map entries.
Python solution
import random
from typing import Dict, List
class RandomizedSet:
"""
Store values in an array for O(1) random access and keep a value-to-index
map so insertions and deletions can find each value in O(1) average time.
"""
def __init__(self) -> None:
self._values: List[int] = []
self._indices_by_value: Dict[int, int] = {}
def insert(self, val: int) -> bool:
"""
Add val if it is not already present.
Returns True when val is inserted and False when it already exists.
"""
if val in self._indices_by_value:
return False
self._indices_by_value[val] = len(self._values)
self._values.append(val)
return True
def remove(self, val: int) -> bool:
"""
Remove val if it is present.
The array does not preserve order. To delete in O(1), the last value is
moved into val's old slot, then the final array position is popped.
"""
if val not in self._indices_by_value:
return False
removal_index = self._indices_by_value[val]
last_value = self._values[-1]
self._values[removal_index] = last_value
self._indices_by_value[last_value] = removal_index
self._values.pop()
del self._indices_by_value[val]
return True
def getRandom(self) -> int:
"""
Return a uniformly random stored value.
LeetCode only calls this when at least one value is present.
"""
return self._values[random.randrange(len(self._values))]Why this works
The array and hash map always describe the same set from two angles. The array stores the values compactly, with no gaps, so a random array index maps cleanly to a random value. The hash map stores each value’s current position, so the code never has to scan the array.
Insert is constant time because a new value is appended at the end of the array and recorded in the map.
Remove is constant time because the code does not preserve order. It overwrites the removed value’s slot with the last array value, fixes that moved value’s index in the map, and pops the duplicate last slot. Even when the removed value is already the last value, the same logic still works: the value is assigned to its own position, popped, and removed from the map.
getRandom() is uniform because the array has exactly one copy of each stored value and no empty positions. Picking a random index uniformly is therefore the same as picking a random stored value uniformly.
Complexity
insert, remove, and getRandom all run in average O(1) time. The average-case wording comes from the hash map operations.
The data structure uses O(n) space for n stored values, because every value appears once in the array and once in the hash map.
Interview follow-ups
What if duplicate values are allowed?
The main change is that a value can no longer map to just one index. It must map to a set of indices where that value appears in the array.
Insertion still appends the value and records the new index in that value’s index set. Removal chooses any one index from the value’s index set, swaps the last array value into that index, updates the moved value’s index set, pops the array, and deletes the removed value’s index. If a value’s index set becomes empty, remove that value from the map.
This is the idea behind the “Insert Delete GetRandom O(1) - Duplicates allowed” variant. The time complexity remains average O(1), but the implementation is more careful because both the removed value’s index set and the moved value’s index set may need updates.
Why not use only a hash set?
A hash set supports membership checks, insertion, and deletion in average O(1) time, but it does not expose a stable O(1) way to choose a uniformly random element. Iterating to a random offset would take O(n) time, and relying on internal hash table layout would not be a clean or portable design.
The array is necessary because it gives direct index-based access. Once the values are stored compactly in an array, uniform random selection is just a random index lookup.
Why not use only an array?
An array can return a random element in O(1), but it cannot support all required set operations efficiently by itself. Checking whether a value exists would require a scan, and finding a value before deletion would also require a scan.
The hash map removes that scan. It tells the structure exactly where a value is located, so deletion can jump directly to the right slot.
Does swapping with the last element affect random fairness?
No. Random fairness depends on the final array having exactly one slot for each stored value and choosing each slot with equal probability. The order of those slots does not matter.
After a removal, the swapped array still contains each remaining value exactly once. Since getRandom() picks a uniform index from the current array length, each remaining value still has probability 1 / n.
What if getRandom() must be cryptographically secure?
The LeetCode version only needs ordinary random selection, so Python’s random module is enough. If the caller needs cryptographic unpredictability, use the secrets module instead, for example secrets.randbelow(len(values)).
That change improves unpredictability but not asymptotic complexity. The design is still the same array plus hash map structure; only the random-number generator changes.
Practical takeaway
This problem is a classic example of designing around complementary data structures. A hash map gives fast lookup, while an array gives fast random access. The swap-with-last deletion trick connects them by avoiding the one operation arrays are bad at: removing from the middle while preserving order.