Top K Frequent Elements
Problem
Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.
- 1 ≤ nums.length ≤ 10⁵
- −10⁴ ≤ nums[i] ≤ 10⁴
- k is in the range [1, the number of unique elements in the array]
- The answer is guaranteed to be unique.
Example
nums = [1,1,1,2,2,3], k = 2[1,2]The algorithm first counts the frequency of each element: 1 appears 3 times, 2 appears 2 times, and 3 appears once. The goal is to find the top 2 elements by frequency. The critical moment is when the frequency map is built, enabling the next step to efficiently identify the top k frequent elements.
Approach
Straightforward Solution
A brute-force approach would count frequencies and then sort all unique elements by their frequency in descending order, which takes O(n log n) time due to sorting. This is inefficient for large inputs.
Core Observation
The problem reduces to counting frequencies of elements and then selecting the k elements with the highest counts. Frequency counting is a fundamental step that transforms the problem into a selection problem over counts.
Path to Optimal
The key recognition signals are 'top k frequent elements' and 'frequency counts'. These indicate using a heap (priority queue) because it allows maintaining the k largest frequencies efficiently without sorting the entire dataset. By pushing frequencies into a min-heap of size k, the algorithm discards less frequent elements, ensuring O(n log k) time complexity.
Optimal Approach
Count frequencies using a hash map. Then iterate over the frequency map, pushing each (frequency, element) pair into a min-heap. If the heap size exceeds k, pop the smallest frequency element. After processing all elements, the heap contains the k most frequent elements. Extract and return these elements. This approach balances time and space efficiently for large inputs.
Time
O(n log k)
Counting frequencies takes O(n). Each of the unique elements (at most n) is pushed and possibly popped from a heap of size k, each heap operation costing O(log k), resulting in O(n log k) time.
Space
O(n)
The hash map stores frequencies for up to n unique elements, and the heap stores up to k elements. Both scale linearly with input size in the worst case.
Pattern Spotlight
Heap / Priority Queue (Top K Elements)
Use a min-heap to maintain the top k elements by frequency, pushing all candidates and popping the smallest when exceeding size k, ensuring efficient selection without full sorting.
Solution
| 1 | import heapq
|
| 2 |
|
| 3 | class Solution:
|
| 4 | def topKFrequent(self, nums: List[int], k: int) -> List[int]:
|
| 5 | counts = {}
|
| 6 | for n in nums:
|
| 7 | counts[n] = counts.get(n, 0) + 1
|
| 8 |
|
| 9 | min_heap = []
|
| 10 | for num, freq in counts.items():
|
| 11 | heapq.heappush(min_heap, (freq, num))
|
| 12 | if len(min_heap) > k:
|
| 13 | heapq.heappop(min_heap)
|
| 14 |
|
| 15 | top_k = [num for freq, num in min_heap]
|
| 16 | return top_k |
Step-by-Step Solution
Build Frequency Map from Input Array
| 5 | counts = {}
|
| 6 | for n in nums:
|
| 7 | counts[n] = counts.get(n, 0) + 1
|
Objective
To count the occurrences of each number in the input array using a hash map.
Key Insight
Counting frequencies transforms the problem from raw data to a frequency distribution, which is essential for identifying the most frequent elements. Using a hash map provides O(1) average-time updates per element, enabling a linear pass over the input.
Interview Quick-Check
Core Logic
The hash map stores each unique number as a key and its frequency as the value, enabling constant-time frequency updates.
Complexity
Frequency counting runs in O(n) time, where n is the length of the input array.
Maintain a Min-Heap to Track Top K Frequent Elements
| 9 | min_heap = []
|
| 10 | for num, freq in counts.items():
|
| 11 | heapq.heappush(min_heap, (freq, num))
|
| 12 | if len(min_heap) > k:
|
| 13 | heapq.heappop(min_heap)
|
Objective
To use a min-heap to keep track of the k elements with the highest frequencies efficiently.
Key Insight
By pushing frequency-element pairs into a min-heap and popping the smallest frequency when the heap exceeds size k, the algorithm ensures that only the top k frequent elements remain. This avoids sorting the entire frequency map and reduces time complexity from O(n log n) to O(n log k).
Interview Quick-Check
Core Logic
The min-heap stores tuples of (frequency, element), allowing the smallest frequency to be efficiently removed when the heap size exceeds k.
State & Boundaries
The heap size is maintained at most k to ensure efficient operations and correct final output.
Common Pitfalls & Bugs
Forgetting to pop from the heap when its size exceeds k leads to incorrect results and increased time complexity.
Extract and Return the Top K Frequent Elements
| 15 | top_k = [num for freq, num in min_heap]
|
| 16 | return top_k |
Objective
To collect the elements from the min-heap and return them as the final result.
Key Insight
After maintaining the heap, the remaining elements correspond to the top k frequent numbers. Extracting the elements from the heap and returning them completes the solution. The order is not guaranteed but is acceptable per problem statement.
Interview Quick-Check
Core Logic
Extracting elements from the heap yields the k most frequent elements without additional sorting.
State & Boundaries
The final list length equals k, matching the problem requirement.
Line Analysis
This solution has 3 Critical lines interviewers watch for.
counts[n] = counts.get(n, 0) + 1
Increment the frequency count for the current number.
Using get with default 0 ensures correct initialization and increments frequency counts accurately for all elements.
heapq.heappush(min_heap, (freq, num))
Push the (frequency, number) tuple onto the min-heap.
Heap push maintains the heap property, allowing efficient retrieval and removal of the smallest frequency element.
heapq.heappop(min_heap)
Pop the smallest frequency element from the heap if size exceeds k.
Removing the smallest frequency element discards less frequent elements, preserving only the top k frequent ones in the heap.
counts = {}
Initialize an empty dictionary to count frequencies.
This dictionary is the foundation for frequency counting, enabling O(1) average-time updates for each element's count.
if len(min_heap) > k:
Check if the heap size exceeds k.
Maintaining the heap size at most k ensures that only the top k frequent elements are kept, optimizing time and space.
min_heap = []
Initialize an empty list to serve as a min-heap.
The min-heap will maintain the top k frequent elements by frequency, enabling efficient insertion and removal operations.
top_k = [num for freq, num in min_heap]
Extract the numbers from the heap into a list.
This list comprehension collects the final top k frequent elements from the heap for return.
for num, freq in counts.items():
Iterate over each number and its frequency in the frequency map.
Processing each unique element allows the algorithm to consider all candidates for the top k frequent elements.
return top_k
Return the list of top k frequent elements.
Returning the collected elements completes the solution, fulfilling the problem's requirement.
for n in nums:
Iterate over each number in the input array.
A single pass over the input array is necessary to count frequencies efficiently in O(n) time.
Test Your Understanding
Why use a min-heap of size k instead of sorting all elements by frequency?
A min-heap of size k allows maintaining only the top k frequent elements efficiently, avoiding the O(n log n) cost of sorting all unique elements. This reduces time complexity to O(n log k), which is more efficient when k is much smaller than n.
Related Problems
Heap / Priority Queue pattern
Don't just read it. Drill it.
Reconstruct Top K Frequent Elements from memory until it sticks. AlgoDrill blanks out key lines and makes you fill them back in, step by step.
Start drilling Top K Frequent Elements for free