Longest Common Subsequence
Problem
Given two strings text1 and text2, return the length of their longest common subsequence. A subsequence is a sequence that appears in the same relative order, but not necessarily contiguous. You may assume the input strings consist of only lowercase English letters.
- 1 ≤ text1.length, text2.length ≤ 1000
- text1 and text2 consist of only lowercase English characters.
Example
text1 = "abcde", text2 = "ace"3The longest common subsequence is "ace" with length 3. The algorithm explores the strings starting from index 0 in both. At index 0, characters match ('a' == 'a'), so it adds 1 and recurses on the next indices (1,1). At index 1 and 1, characters differ ('b' != 'c'), so it explores two paths: skipping 'b' in text1 or skipping 'c' in text2. It recursively computes the maximum length from these paths, eventually finding the longest subsequence length 3.
Approach
Straightforward Solution
A brute-force approach tries all subsequence combinations, resulting in exponential time complexity because it explores all subsets of characters in both strings.
Core Observation
The longest common subsequence problem exhibits optimal substructure: the solution for prefixes of the strings depends on solutions to smaller prefixes. Specifically, if characters match at current indices, the LCS length increases by one plus the LCS of the remaining substrings; otherwise, it is the maximum LCS length obtained by advancing one index in either string.
Path to Optimal
PreviewRecognizing the overlapping subproblems and optimal substructure leads to dynamic programming. The problem can be solved by defining a recursive function dp(i, j) that returns the LCS length of text1[i:] and text2[j:]…
Full step-by-step walkthrough on Pro →
Optimal Approach
PreviewImplement a top-down recursive function with memoization that returns 0 when either index reaches the end of its string. If characters at i and j match, return 1 plus dp(i+1, j+1)…
Full step-by-step walkthrough on Pro →
Want the full reasoning chain?
Unlock the complete walkthrough, line-by-line analysis, and recall drill.
Unlock ProTime
O(m * n)
Each subproblem dp(i, j) corresponds to a unique pair of indices in text1 and text2. Since there are m * n such pairs and each is computed once due to memoization, the total time is O(m * n).
Space
O(m * n)
The memo dictionary stores results for each pair of indices, requiring O(m * n) auxiliary space. The recursion stack depth is at most m + n, which is O(m + n), but dominated by the memo size.
Pattern Spotlight
Dynamic Programming (Top-Down Memoization)
When a problem involves making decisions at each position with overlapping subproblems and optimal substructure, model it as a recursive state with memoization to avoid redundant work and achieve polynomial time.
Solution
| 1 | class Solution:
|
| 2 | def longestCommonSubsequence(self, text1: str, text2: str) -> int:
|
| 3 | memo = {}
|
| 4 |
|
| 5 | def dp(i, j):
|
| 6 | if i >= len(text1) or j >= len(text2):
|
| 7 | return 0
|
| 8 | if (i, j) in memo:
|
| 9 | return memo[(i, j)]
|
| 10 |
|
| 11 | if text1[i] == text2[j]:
|
| 12 | result = 1 + dp(i + 1, j + 1)
|
| 13 | else:
|
| 14 | result = max(dp(i + 1, j), dp(i, j + 1))
|
| 15 |
|
| 16 | memo[(i, j)] = result
|
| 17 | return result
|
| 18 |
|
| 19 | return dp(0, 0) |
Step-by-Step Solution
Initialize Memoization Cache for Subproblem Results
| 3 | memo = {}
|
Objective
To create a dictionary that caches results of subproblems identified by index pairs (i, j) to avoid redundant recursive calls.
Key Insight
Memoization transforms the naive exponential recursion into polynomial time by storing the results of each unique subproblem. This cache acts as a memory of all previously solved states, enabling instant retrieval and preventing repeated work.
Interview Quick-Check
Core Logic
The memo dictionary keys are tuples (i, j) representing the current indices in text1 and text2, ensuring each subproblem is uniquely identified.
Common Pitfalls & Bugs
Forgetting to check the memo before recursive calls leads to exponential time due to repeated computations.
Recursively Compute LCS Length with Memoization
To define a recursive function that computes the LCS length for substrings starting at indices i and j, using memoization to cache results.
Return Final LCS Length from Starting Indices
To initiate the recursive computation from the start of both strings and return the final longest common subsequence length.
2 more steps with full analysis available on Pro.
Line Analysis
This solution has 6 Critical lines interviewers watch for.
if (i, j) in memo:
Check if the current subproblem result is already cached in memo.
This lookup prevents recomputation of the same subproblem, ensuring the algorithm runs in polynomial time rather than exponential.
if i >= len(text1) or j >= len(text2):
Check if either index has reached the end of its string, returning 0 as the base case.
This base case terminates recursion when one string is exhausted, as no further common subsequence can be found beyond this point.
return memo[(i, j)]
Return the cached result if found.
Returning the memoized value immediately prunes the recursion tree, avoiding redundant work and improving efficiency.
Full line-by-line criticality + rationale for all 13 lines available on Pro.
Test Your Understanding
Why does memoization reduce the time complexity from exponential to polynomial in this problem?
See the answer with Pro.
Related Problems
Dynamic Programming pattern
Don't just read it. Drill it.
Reconstruct Longest Common Subsequence from memory until it sticks. AlgoDrill blanks out key lines and makes you fill them back in, step by step.
Unlock the Longest Common Subsequence drill