Longest Consecutive Sequence
Problem
Given an unsorted array of integers nums, return the length of the longest consecutive elements sequence.
- 0 ≤ nums.length ≤ 10⁵
- −10⁹ ≤ nums[i] ≤ 10⁹
Example
nums = [100, 4, 200, 1, 3, 2]4The longest consecutive sequence is [1, 2, 3, 4]. The algorithm first converts the array to a set for O(1) lookups. It then iterates over each number, checking if it is the start of a sequence by verifying that num-1 is not in the set. For num=1, since 0 is not in the set, it begins counting consecutive numbers: 2, 3, 4 are found consecutively, yielding a streak length of 4. Other numbers like 100 or 200 are either isolated or start shorter sequences. The maximum streak found is 4.
Approach
Straightforward Solution
A naive approach sorts the array and then scans for consecutive runs, which takes O(n log n) time. This is inefficient for large inputs and does not meet the O(n) time requirement.
Core Observation
The fundamental truth is that consecutive sequences can be identified by checking if a number is the start of a sequence (i.e., num-1 is not present). This avoids redundant counting of sequences multiple times.
Path to Optimal
The key recognition signals are 'longest consecutive sequence' and 'unsorted array'. These indicate a Hash Map (or Set) approach because we need O(1) membership checks to efficiently detect sequence starts and count consecutive elements without sorting. By storing all numbers in a set, the algorithm can identify sequence starts and count lengths in O(n) time.
Optimal Approach
Convert the array to a set for O(1) lookups. Iterate over each number and check if it is the start of a sequence by verifying that num-1 is not in the set. If it is a start, count consecutive numbers by incrementing num and checking membership. Track the maximum streak length found. This approach guarantees each number is visited at most twice, resulting in O(n) time complexity.
Time
O(n)
Each number is inserted into a set once (O(n)), and the inner while loop runs at most n times across all iterations because each number is counted only once when it is part of a sequence start.
Space
O(n)
The set stores all unique numbers from the input array, requiring O(n) auxiliary space proportional to the input size.
Pattern Spotlight
Hash Maps (Set Membership for Sequence Detection)
To find the longest consecutive sequence in O(n), identify sequence starts by checking absence of predecessor, then count forward using O(1) membership checks, ensuring each element is processed only once.
Solution
| 1 | class Solution:
|
| 2 | def longestConsecutive(self, nums: List[int]) -> int:
|
| 3 | if not nums:
|
| 4 | return 0
|
| 5 |
|
| 6 | num_set = set(nums)
|
| 7 | longest_streak = 0
|
| 8 |
|
| 9 | for num in num_set:
|
| 10 | if (num - 1) not in num_set:
|
| 11 | current_num = num
|
| 12 | current_streak = 1
|
| 13 |
|
| 14 | while (current_num + 1) in num_set:
|
| 15 | current_num += 1
|
| 16 | current_streak += 1
|
| 17 |
|
| 18 | longest_streak = max(longest_streak, current_streak)
|
| 19 |
|
| 20 | return longest_streak |
Step-by-Step Solution
Establish Base Case and Build Constant Time Lookup State
| 3 | if not nums:
|
| 4 | return 0
|
| 6 | num_set = set(nums)
|
| 7 | longest_streak = 0
|
Objective
Handle the empty input edge case, then initialize the two pieces of state needed for an O(n) set scan: a membership set and a best answer tracker.
Key Insight
The algorithm relies on constant time membership checks to detect sequence boundaries without sorting. Before that logic can run, you must (1) return 0 when no elements exist, (2) convert the input into a set for O(1) lookups, and (3) initialize the running maximum so each discovered streak can update the best answer.
Interview Quick-Check
State & Boundaries
If nums is empty, the longest consecutive length is 0 by definition, so return immediately.
Core Logic
Convert nums to a set to enable O(1) checks for num-1 and num+1, which is the core enabler of linear time.
Core Logic
Initialize the best answer tracker to 0 so it can be updated by max(best, streak) during scanning.
Iterate Over Set to Identify Sequence Starts and Count Lengths
| 9 | for num in num_set:
|
| 10 | if (num - 1) not in num_set:
|
| 11 | current_num = num
|
| 12 | current_streak = 1
|
| 14 | while (current_num + 1) in num_set:
|
| 15 | current_num += 1
|
| 16 | current_streak += 1
|
| 18 | longest_streak = max(longest_streak, current_streak)
|
Objective
To find the start of each consecutive sequence and count its length by checking consecutive successors.
Key Insight
By only starting to count when num-1 is not in the set, the algorithm ensures each sequence is counted exactly once. The inner while loop increments through consecutive numbers, leveraging O(1) membership checks to efficiently count the streak length. This approach avoids sorting and redundant counting, achieving linear time complexity.
Interview Quick-Check
Core Logic
Check if num-1 is absent to identify sequence starts, then count consecutive numbers by incrementing current_num and checking membership.
State & Boundaries
The outer loop iterates over unique numbers, and the inner loop only runs for sequence starts, ensuring O(n) total iterations.
Common Pitfalls & Bugs
Failing to check for num-1 leads to counting the same sequence multiple times, increasing time complexity.
Return the Maximum Consecutive Sequence Length Found
| 20 | return longest_streak |
Objective
To output the length of the longest consecutive sequence after processing all numbers.
Key Insight
Tracking the maximum streak length during iteration allows the algorithm to return the correct result in O(1) time after the loop completes. This final step consolidates the results of the counting process.
Interview Quick-Check
Core Logic
Return the maximum streak length found, which represents the longest consecutive sequence in the input.
Line Analysis
This solution has 3 Critical lines interviewers watch for.
if (num - 1) not in num_set:
Check if the current number is the start of a sequence by verifying num-1 is not in the set.
This condition identifies sequence starts, preventing redundant counting of sequences multiple times and ensuring linear time complexity.
num_set = set(nums)
Create a set from the input list for O(1) membership checks.
Using a set enables constant-time lookups for presence of numbers, which is essential for efficiently detecting sequence starts and counting consecutive elements.
while (current_num + 1) in num_set:
Check if the next consecutive number exists in the set.
This membership check determines if the sequence continues, enabling counting of consecutive elements efficiently.
longest_streak = max(longest_streak, current_streak)
Update longest_streak if the current streak is longer.
Maintaining the maximum streak length ensures the algorithm returns the correct longest consecutive sequence length after processing all numbers.
if not nums:
Check if the input list is empty.
An empty input means no sequences exist, so returning 0 immediately handles this edge case efficiently.
return 0
Return 0 for empty input.
This early return prevents unnecessary computation and avoids errors from processing an empty list.
for num in num_set:
Iterate over each unique number in the set.
Iterating over the set ensures each number is processed once, which is critical for achieving O(n) time complexity.
current_num += 1
Increment current_num to the next consecutive number.
Advancing current_num allows the algorithm to traverse the consecutive sequence forward.
current_streak += 1
Increment current_streak to count the consecutive number found.
Updating current_streak tracks the length of the current consecutive sequence accurately.
return longest_streak
Return the length of the longest consecutive sequence found.
This final return outputs the solution after all sequences have been evaluated, completing the algorithm.
longest_streak = 0
Initialize a variable to track the longest consecutive sequence length found.
This variable accumulates the maximum streak length discovered during iteration, allowing the algorithm to return the correct final result.
current_num = num
Initialize current_num to the start of the sequence.
Setting current_num to num prepares for counting consecutive numbers starting from this sequence start.
current_streak = 1
Initialize current_streak to 1 for the starting number.
The streak count begins at 1 because the sequence includes the starting number itself.
Test Your Understanding
Why does the algorithm only start counting a sequence when num-1 is not in the set?
Because if num-1 is in the set, then num is not the start of a sequence and will be counted when the sequence start at num-1 is processed, preventing redundant counting and ensuring O(n) time.
Related Problems
Hash Maps pattern
Don't just read it. Drill it.
Reconstruct Longest Consecutive Sequence from memory until it sticks. AlgoDrill blanks out key lines and makes you fill them back in, step by step.
Start drilling Longest Consecutive Sequence for free