Accounts Merge
Problem
Given a list of accounts where each account contains a name and a list of emails, merge accounts that share at least one email and return the merged accounts with unique emails sorted lexicographically.
- 1 ≤ accounts.length ≤ 1000
- 2 ≤ accounts[i].length ≤ 10
- 1 ≤ accounts[i][j].length ≤ 30
- accounts[i][0] consists of English letters.
- accounts[i][j] (for j > 0) is a valid email.
Example
accounts = [["John", "johnsmith@mail.com", "john00@mail.com"], ["John", "johnnybravo@mail.com"], ["John", "johnsmith@mail.com", "john_newyork@mail.com"], ["Mary", "mary@mail.com"]][["John", "john00@mail.com", "john_newyork@mail.com", "johnsmith@mail.com"], ["John", "johnnybravo@mail.com"], ["Mary", "mary@mail.com"]]The accounts with emails "johnsmith@mail.com" appear in two accounts, so these accounts are merged. The merged account contains all unique emails sorted lexicographically. The other accounts remain separate because they share no emails.
Approach
Straightforward Solution
A brute-force approach would compare every pair of accounts for shared emails, merging them repeatedly until no overlaps remain. This approach is inefficient (potentially O(n^2 * m^2)) and impractical for large inputs.
Core Observation
Accounts sharing emails form connected components in a graph where emails are nodes and edges connect emails appearing in the same account. Merging accounts is equivalent to finding connected components of this graph.
Path to Optimal
PreviewRecognizing the problem as graph connectivity allows building an undirected graph where each email is a node connected to other emails in the same account. Using DFS to find connected components groups all emails belonging to the same user…
Full step-by-step walkthrough on Pro →
Optimal Approach
PreviewConstruct a graph mapping emails to their neighbors and a map from email to account name. Perform DFS on unvisited emails to collect connected components…
Full step-by-step walkthrough on Pro →
Want the full reasoning chain?
Unlock the complete walkthrough, line-by-line analysis, and recall drill.
Unlock ProTime
O(N * M log M)
Building the graph takes O(N * M) where N is number of accounts and M is max emails per account. DFS visits each email once. Sorting each connected component's emails dominates with O(M log M) per component, leading to overall O(N * M log M).
Space
O(N * M)
The graph and email-to-name map store all emails and their connections, requiring space proportional to total emails.
Pattern Spotlight
DFS (Graph Connected Components)
When merging entities connected by shared attributes, model the problem as a graph and use DFS to find connected components, which represent merged groups.
Solution
| 1 | class Solution: |
| 2 | def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]: |
| 3 | graph = defaultdict(list) |
| 4 | email_to_name = {} |
| 5 | |
| 6 | for account in accounts: |
| 7 | name = account[0] |
| 8 | first_email = account[1] |
| 9 | |
| 10 | for email in account[1:]: |
| 11 | email_to_name[email] = name |
| 12 | graph[first_email].append(email) |
| 13 | graph[email].append(first_email) |
| 14 | |
| 15 | visited = set() |
| 16 | res = [] |
| 17 | |
| 18 | def dfs(email, component_emails): |
| 19 | visited.add(email) |
| 20 | component_emails.append(email) |
| 21 | |
| 22 | for neighbor in graph[email]: |
| 23 | if neighbor not in visited: |
| 24 | dfs(neighbor, component_emails) |
| 25 | |
| 26 | for email in graph: |
| 27 | if email not in visited: |
| 28 | component_emails = [] |
| 29 | dfs(email, component_emails) |
| 30 | |
| 31 | res.append([email_to_name[email]] + sorted(component_emails)) |
| 32 | |
| 33 | return res |
Step-by-Step Solution
Build Email Graph and Map Emails to Names
| 3 | graph = defaultdict(list) |
| 4 | email_to_name = {} |
| 6 | for account in accounts: |
| 7 | name = account[0] |
| 8 | first_email = account[1] |
| 10 | for email in account[1:]: |
| 11 | email_to_name[email] = name |
| 12 | graph[first_email].append(email) |
| 13 | graph[email].append(first_email) |
Objective
To construct an undirected graph connecting emails that appear in the same account and map each email to its account name.
Key Insight
By connecting all emails in an account to the first email, the graph encodes all relationships between emails belonging to the same user. The email-to-name map allows retrieving the user's name after grouping emails. This graph representation transforms the merging problem into a connected components problem.
Interview Quick-Check
Core Logic
Connecting each email to the first email in the account creates edges that link all emails of the same user, enabling DFS to find connected components.
State & Boundaries
The email_to_name map stores the name for each email, ensuring the correct name is associated with merged emails.
Common Pitfalls & Bugs
Forgetting to add edges in both directions (undirected graph) can cause incomplete traversal during DFS.
Traverse Graph with DFS to Collect Connected Emails
To perform DFS on each unvisited email to find all connected emails forming one merged account.
Aggregate and Sort Emails for Each Merged Account
To collect the connected emails from DFS, sort them lexicographically, prepend the account name, and add to the result list.
Return the List of Merged Accounts
To return the final list of merged accounts after processing all connected components.
3 more steps with full analysis available on Pro.
Line Analysis
This solution has 5 Critical lines interviewers watch for.
graph[first_email].append(email)
Add an edge from the first email to the current email in the graph.
Connecting the first email to all others in the account creates edges that link emails belonging to the same user.
graph[email].append(first_email)
Add an edge from the current email back to the first email to maintain an undirected graph.
Adding edges in both directions ensures the graph is undirected, which is necessary for correct DFS traversal of connected components.
visited.add(email)
Mark the current email as visited.
Marking emails as visited before recursion prevents cycles and duplicate visits, ensuring correctness and efficiency.
Full line-by-line criticality + rationale for all 23 lines available on Pro.
Test Your Understanding
Why is DFS on the email graph the correct method to merge accounts?
See the answer with Pro.
Related Problems
DFS pattern
Don't just read it. Drill it.
Reconstruct Accounts Merge from memory until it sticks. AlgoDrill blanks out key lines and makes you fill them back in, step by step.
Unlock the Accounts Merge drill