How do you solve Distinct Subsequences?

Implement a recursive function dp(i, j) that returns the number of subsequences of s[i:] matching t[j:]. Use memoization to cache results. If characters match, sum the counts of matching both characters and skipping s[i]. If not, skip s[i]. Base cases handle when t is fully matched (return 1) or s is exhausted without matching t (return 0). This approach guarantees O(n*m) time and space complexity, where n and m are lengths of s and t respectively.

What pattern does Distinct Subsequences use?

The Pattern: Dynamic Programming (Top-Down Memoization for Counting Subsequences) The Heuristic: When counting subsequences or combinations with overlapping subproblems, model the problem as a state machine over indices and cache results to avoid exponential recomputation, transforming recursion into efficient DP.

Distinct Subsequences — Dynamic Programming

Q: What is the time complexity of Distinct Subsequences?

O(n * m). Each state (i, j) is computed once and cached. There are at most n * m states, where n = len(s) and m = len(t). Each state computation takes O(1) time.

Problem

Given two strings s and t, return the number of distinct subsequences of s which equals t.

0 ≤ s.length, t.length ≤ 1000
s and t consist of English letters.

Example

Input: s = "rabbbit", t = "rabbit"

Output: 3

The string "rabbbit" has 3 distinct subsequences that equal "rabbit". The algorithm explores the string s starting from index 0 and t from index 0. At each step, if characters match, it considers two paths: matching these characters and moving both indices forward, or skipping the current character in s and moving only s's index forward. If characters don't match, it skips the current character in s. Memoization caches results for each (i, j) pair to avoid redundant computations. The critical insight is that the problem breaks down into counting subsequences by exploring these two choices recursively, and memoization ensures polynomial time complexity.

Approach

Straightforward Solution

A brute-force recursive approach tries all subsequences of s and checks if they equal t, resulting in exponential time due to repeated exploration of the same states.

Core Observation

The count of subsequences matching t from position j in s starting at position i depends on whether s[i] matches t[j]. If they match, the total count is the sum of counts from advancing both indices and counts from skipping s[i]. If they don't match, only skipping s[i] is valid. This recursive structure exhibits overlapping subproblems and optimal substructure.

Path to Optimal

Preview

Recognizing the overlapping subproblems, memoization is introduced to cache results for each (i, j) state, preventing redundant computations. This transforms the exponential recursion into a polynomial time dynamic programming solution…

Full step-by-step walkthrough on Pro →

Optimal Approach

Preview

Implement a recursive function dp(i, j) that returns the number of subsequences of s[i:] matching t[j:]. Use memoization to cache results…

Full step-by-step walkthrough on Pro →

Want the full reasoning chain?

Unlock the complete walkthrough, line-by-line analysis, and recall drill.

Unlock Pro

Time

O(n * m)

Each state (i, j) is computed once and cached. There are at most n * m states, where n = len(s) and m = len(t). Each state computation takes O(1) time.

Space

O(n * m)

The memoization dictionary stores results for each pair of indices (i, j), requiring O(n * m) auxiliary space. The recursion stack depth is at most O(n + m), which is bounded by O(n * m).

Solution

Python

1class Solution:
  def numDistinct(self, s: str, t: str) -> int:
      memo = {}
4
      def dp(i, j):
          if j == len(t):
              return 1
          if i == len(s):
              return 0
          if (i, j) in memo:
              return memo[(i, j)]
          
          if s[i] == t[j]:
              memo[(i, j)] = dp(i + 1, j + 1) + dp(i + 1, j)
          else:
              memo[(i, j)] = dp(i + 1, j)
          
          return memo[(i, j)]
19
      return dp(0, 0)

Step-by-Step Solution

1

Cache Recursive Results to Avoid Redundant Computations

3memo = {}
10    if (i, j) in memo:
11        return memo[(i, j)]

Objective

To store the results of subproblems defined by indices (i, j) to prevent exponential recomputation.

Key Insight

Memoization transforms the naive recursive solution into an efficient dynamic programming approach by caching the number of subsequences for each pair of indices. This avoids recomputing the same subproblems multiple times, which is the main cause of exponential time complexity in naive recursion.

Interview Quick-Check

Core Logic

Memoization caches results keyed by (i, j), ensuring each subproblem is solved once.

Common Pitfalls & Bugs

Forgetting to cache results leads to exponential time due to repeated exploration of identical states.

2

Return 1 When Target Fully Matched and 0 When Source Exhausted

To define base cases that terminate recursion when the target string is fully matched or the source string is exhausted without a match.

3

Sum Counts of Matching and Skipping Characters to Count Subsequences

To compute the number of subsequences by considering both matching the current characters and skipping the current character in s.

4

Return Cached Result to Complete Recursive Computation

To return the computed or cached number of subsequences for the current indices.

5

Initiate Recursive Counting from Start of Both Strings

To start the recursive counting process from the beginning of s and t.

4 more steps with full analysis available on Pro.

Line Analysis

This solution has 8 Critical lines interviewers watch for.

Line 6 Critical

if j == len(t):

Check if the target string has been fully matched.

When j equals the length of t, it means a valid subsequence has been found, so return 1 to count this successful path.

Line 8 Critical

if i == len(s):

Check if the source string has been exhausted before matching the target.

If i equals the length of s but j is not at the end of t, no further subsequences can match, so return 0 to indicate failure.

Line 14 Critical

memo[(i, j)] = dp(i + 1, j + 1) + dp(i + 1, j)

Calculate and cache the sum of counts from matching and skipping the current character.

This line embodies the core DP recurrence: when characters match, the total count is the sum of subsequences that include this match and those that skip it, ensuring all distinct subsequences are counted.

Full line-by-line criticality + rationale for all 13 lines available on Pro.

Test Your Understanding

Why does the algorithm add the results of dp(i + 1, j + 1) and dp(i + 1, j) when s[i] == t[j]?

See the answer with Pro.

Don't just read it. Drill it.

Reconstruct Distinct Subsequences from memory until it sticks. AlgoDrill blanks out key lines and makes you fill them back in, step by step.

Unlock the Distinct Subsequences drill

or drill a free problem

1	`class Solution:`
2	`def numDistinct(self, s: str, t: str) -> int:`
3	`memo = {}`
4
5	`def dp(i, j):`
6	`if j == len(t):`
7	`return 1`
8	`if i == len(s):`
9	`return 0`
10	`if (i, j) in memo:`
11	`return memo[(i, j)]`
12
13	`if s[i] == t[j]:`
14	`memo[(i, j)] = dp(i + 1, j + 1) + dp(i + 1, j)`
15	`else:`
16	`memo[(i, j)] = dp(i + 1, j)`
17
18	`return memo[(i, j)]`
19
20	`return dp(0, 0)`

3	`memo = {}`
10	`if (i, j) in memo:`
11	`return memo[(i, j)]`