Sale: Get 60% Off on all Pro Plans. Buy Now

Substring with Concatenation of All Words

Problem

Given a string s and an array of strings words, return all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.

  • 1 ≤ s.length ≤ 10⁴
  • 1 ≤ words.length ≤ 5000
  • 1 ≤ words[i].length ≤ 30
  • s and words[i] consist of lowercase English letters

Example

Input: s = "barfoothefoobarman", words = ["foo","bar"]
Output: [0,9]

The substrings starting at indices 0 and 9 are "barfoo" and "foobar" respectively, both concatenations of the words "foo" and "bar" without any extra characters. A brute-force approach would check every substring of length equal to total concatenation length, verifying if it contains all words exactly once, which is inefficient. The sliding window approach partitions the string into word-length segments and slides over these segments, maintaining counts of seen words and adjusting the window to ensure no word is overused. This method efficiently finds all valid starting indices in O(n) time relative to the string length.

Approach

Straightforward Solution

A brute-force solution would check every substring of length total_len and verify if it contains all words exactly once by counting frequencies. This approach is O(n * m * k) where n is string length, m is number of words, and k is word length, which is too slow for large inputs.

Core Observation

The problem reduces to finding all substrings of length equal to the total length of all words concatenated, where the substring is composed of the words in any order, each exactly once. Since all words have the same length, the string can be segmented into fixed-size chunks, enabling a sliding window approach that moves in increments of word length.

Path to Optimal

Preview

The key insight is to slide over the string in word_len increments, maintaining a window of words and their counts…

Full step-by-step walkthrough on Pro

Optimal Approach

Preview

Iterate over possible offsets from 0 to word_len - 1 to cover all possible alignments. For each offset, use two pointers (left and right) to define a sliding window of words…

Full step-by-step walkthrough on Pro

Want the full reasoning chain?

Unlock the complete walkthrough, line-by-line analysis, and recall drill.

Unlock Pro

Time

O(n * word_len)

The algorithm iterates over word_len offsets, and for each offset, it scans the string in increments of word_len, processing each word once. Each word is added and removed from the window at most once, resulting in O(n) operations per offset and O(n * word_len) overall.

Space

O(m)

The auxiliary space is dominated by the hash maps storing the target word counts and the current window counts, both proportional to the number of words m.

Pattern Spotlight

Sliding Window (Fixed-Size Chunk Partitioning)

When searching for concatenations of fixed-length words in a string, partition the string into word-length segments and slide a window over these segments, adjusting boundaries to maintain valid word counts and efficiently find all matches.

Solution

Python
1class Solution:
2 def findSubstring(self, s: str, words: List[str]) -> List[int]:
3 if not s or not words:
4 return []
5
6 word_len = len(words[0])
7 num_words = len(words)
8 total_len = word_len * num_words
9
10 target = {}
11 for w in words:
12 target[w] = target.get(w, 0) + 1
13
14 result = []
15
16 for offset in range(word_len):
17 left = offset
18 window = {}
19 words_used = 0
20
21 for right in range(offset, len(s) - word_len + 1, word_len):
22 word = s[right:right + word_len]
23
24 if word in target:
25 window[word] = window.get(word, 0) + 1
26 words_used += 1
27
28 while window[word] > target[word]:
29 left_word = s[left:left + word_len]
30 window[left_word] -= 1
31 left += word_len
32 words_used -= 1
33
34 if words_used == num_words:
35 result.append(left)
36
37 else:
38 window.clear()
39 words_used = 0
40 left = right + word_len
41
42 return result

Step-by-Step Solution

1

Validate Inputs and Derive Fixed Word Parameters

3if not s or not words:
4 return []
6word_len = len(words[0])
7num_words = len(words)
8total_len = word_len * num_words
14result = []

Objective

Ensure the inputs are valid and compute the structural parameters needed for the sliding window algorithm.

Key Insight

Because all words share the same length, the string can be processed in fixed-size word segments. Determining word length and word count early enables the sliding window to move in consistent increments.

Interview Quick-Check

Core Logic

Early validation prevents unnecessary computation and avoids accessing words[0] when the list is empty.

State & Boundaries

The fixed word length is the structural property that makes the sliding window approach possible.

2

Build Frequency Map of Target Words

To create a hash map that counts how many times each word appears in the input list, serving as the target frequency for valid substrings.

3

Slide Window Over String in Word-Length Increments for Each Offset

To iterate over the string starting at each possible offset and use a sliding window to find valid concatenations of words.

4

Return All Valid Starting Indices

To output the list of all starting indices where valid concatenations of all words occur.

3 more steps with full analysis available on Pro.

Line Analysis

This solution has 7 Critical lines interviewers watch for.

Line 28 Critical
while window[word] > target[word]:

While the current word's count exceeds its target frequency, shrink the window from the left.

This loop is the core sliding window adjustment that ensures the window remains valid by removing words from the left until the frequency constraints are satisfied, enabling efficient detection of valid concatenations.

Line 3 Critical
if not s or not words:

Check if input string or words list is empty.

An empty string or empty words list means no valid concatenations can exist, so returning an empty list immediately avoids unnecessary computation.

Line 16 Critical
for offset in range(word_len):

Iterate over all possible offsets from 0 to word_len - 1.

Considering all offsets ensures that all possible alignments of word-length segments are checked, preventing missed matches due to misalignment.

Full line-by-line criticality + rationale for all 30 lines available on Pro.

Test Your Understanding

Why does the algorithm iterate over multiple offsets from 0 to word_len - 1 instead of just once from the start?

See the answer with Pro.

Related Problems

Sliding Window pattern

Don't just read it. Drill it.

Reconstruct Substring with Concatenation of All Words from memory until it sticks. AlgoDrill blanks out key lines and makes you fill them back in, step by step.

Unlock the Substring with Concatenation of All Words drill

or drill a free problem