Top K Frequent Words
Problem
Given an array of strings words and an integer k, return the k most frequent strings in words. The answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.
- 1 ≤ words.length ≤ 50000
- 1 ≤ words[i].length ≤ 10
- words[i] consists of lowercase English letters.
- k is in the range [1, Number of unique words]
Example
words = ["i", "love", "leetcode", "i", "love", "coding"], k = 2["i", "love"]The word "i" appears twice, "love" appears twice, and the others appear once. Since "i" and "love" have the same frequency, they are sorted alphabetically, so "i" comes before "love".
Approach
Straightforward Solution
A brute-force approach would count frequencies and then sort all words by frequency descending and lexicographical ascending. This approach is straightforward but sorting all unique words can be costly if the number of unique words is large.
Core Observation
The problem requires identifying the most frequent words and sorting them by frequency and lexicographical order. Frequency counting is a fundamental step, and sorting by multiple criteria is necessary to meet the ordering requirements.
Path to Optimal
PreviewThe key recognition signals are 'k most frequent', 'sorted by frequency', and 'lexicographical order for ties'. These indicate a priority queue or sorting approach…
Full step-by-step walkthrough on Pro →
Optimal Approach
PreviewCount the frequency of each word using a hash map (Counter). Extract the unique words and sort them with a custom key: first by negative frequency (to get descending order), then by lexicographical order (ascending)…
Full step-by-step walkthrough on Pro →
Want the full reasoning chain?
Unlock the complete walkthrough, line-by-line analysis, and recall drill.
Unlock ProTime
O(n + m log m)
Counting frequencies takes O(n) time where n is the number of words. Sorting the m unique words takes O(m log m) time. Since m ≤ n, the overall complexity is O(n log n) in the worst case.
Space
O(m)
The hash map stores frequencies for m unique words, requiring O(m) auxiliary space. The sorted list also requires O(m) space. This space is necessary to store frequency information and the unique words.
Pattern Spotlight
Heap / Priority Queue (Frequency Sorting with Custom Comparator)
When asked for the top k frequent elements with tie-breaking rules, first count frequencies with a hash map, then sort or use a heap with a custom comparator that orders by frequency descending and lexicographical ascending to efficiently retrieve the top k.
Solution
| 1 | class Solution: |
| 2 | def topKFrequent(self, words: List[str], k: int) -> List[str]: |
| 3 | count = Counter(words) |
| 4 | |
| 5 | words = list(count.keys()) |
| 6 | |
| 7 | words.sort(key=lambda w: (-count[w], w)) |
| 8 | |
| 9 | return words[:k] |
Step-by-Step Solution
Count Frequencies of Each Word Using a Hash Map
| 3 | count = Counter(words) |
Objective
To build a frequency map that records how many times each word appears in the input list.
Key Insight
Counting frequencies is the foundation for identifying the most frequent words. Using a hash map (Counter) provides O(1) average-time insertion and lookup, enabling a single pass over the input words to build the frequency distribution efficiently.
Interview Quick-Check
Core Logic
Use a hash map to count occurrences of each word in O(n) time.
Complexity
Counting frequencies requires O(n) time and O(m) space, where m is the number of unique words.
Extract Unique Words and Sort by Frequency and Lexicographical Order
To order the unique words by descending frequency and ascending lexicographical order to satisfy the problem's output requirements.
Return the Top K Words from the Sorted List
To select and return the first k words from the sorted list as the final answer.
2 more steps with full analysis available on Pro.
Line Analysis
This solution has 2 Critical lines interviewers watch for.
words.sort(key=lambda w: (-count[w], w))
Sort the unique words by descending frequency and ascending lexicographical order.
The custom sort key (-count[w], w) ensures that words with higher frequency come first, and ties are broken alphabetically, exactly matching the problem's ordering requirements.
count = Counter(words)
Count the frequency of each word in the input list using Counter.
Using Counter efficiently builds a frequency map in O(n) time, which is essential for identifying the most frequent words without repeated scanning.
Full line-by-line criticality + rationale for all 4 lines available on Pro.
Test Your Understanding
Why is sorting by (-frequency, word) the correct way to order the words for this problem?
See the answer with Pro.
Related Problems
Heap / Priority Queue pattern
Don't just read it. Drill it.
Reconstruct Top K Frequent Words from memory until it sticks. AlgoDrill blanks out key lines and makes you fill them back in, step by step.
Unlock the Top K Frequent Words drill