Problem. Design a structure that supports inserting numbers and querying the median of all numbers seen so far.
This is the classic online order-statistics with heaps pattern.
Examples:
- After adding
1, the median is1.0 - After adding
1and then2, the median is1.5 - After adding
1,2, and3, the median is2.0
Solution. Keep the lower half in a max-heap and the upper half in a min-heap. Rebalance so their sizes differ by at most one. The median is then either the top of the larger heap or the average of the two tops.
import heapq
class MedianFinder:
def __init__(self):
self.lower = [] # max-heap via negated values
self.upper = [] # min-heap
def add_num(self, num):
heapq.heappush(self.lower, -num)
heapq.heappush(self.upper, -heapq.heappop(self.lower))
if len(self.upper) > len(self.lower):
heapq.heappush(self.lower, -heapq.heappop(self.upper))
def find_median(self):
if len(self.lower) > len(self.upper):
return float(-self.lower[0])
return (-self.lower[0] + self.upper[0]) / 2
O(log n) per insertion, O(1) per median query, O(n) space.
Gotchas.
- Python only has a min-heap, so the lower half is stored as negated values.
- Rebalancing after every insert keeps the median query trivial.
- Be explicit about returning a float when the count of values is even.