What pattern does Encode and Decode Strings use?

The Pattern: String Manipulation (Length-Prefixed Encoding) The Heuristic: When encoding multiple strings into one string, prefix each string with its length and a delimiter to enable exact slicing during decoding, avoiding ambiguity caused by delimiter characters inside the strings.

Encode and Decode Strings — Simulation

Q: What is the time complexity of Encode and Decode Strings?

O(n). Encoding and decoding each character exactly once results in linear time complexity proportional to the total length of all strings combined.

Problem

Given a list of strings, encode them into a single string such that it can be decoded back to the original list of strings exactly.

0 ≤ strs.length ≤ 10⁴
0 ≤ strs[i].length ≤ 10⁴
strs[i] may contain any ASCII characters

Example

Input: strs = ["hello", "world"]

Output: "5#hello5#world"

The encoding concatenates each string preceded by its length and a delimiter '#'. For example, "hello" has length 5, so it is encoded as "5#hello". The decoder reads the length prefix, then extracts the substring of that length after the '#'. This process repeats until the entire encoded string is decoded back into the original list ["hello", "world"]. The critical moment is correctly parsing the length prefix and using it to slice the exact substring, ensuring no ambiguity even if strings contain digits or '#'.

Approach

Straightforward Solution

A naive approach might join strings with a delimiter like a comma or space, but this fails if strings contain that delimiter. Another naive approach is to escape delimiters, but escaping can be complex and error-prone.

Core Observation

The fundamental challenge is to encode a list of strings into a single string without losing information or introducing ambiguity, especially since strings can contain any characters including digits and the delimiter itself.

Path to Optimal

The key insight is to prefix each string with its length and a unique delimiter (here '#'). This length prefix acts as a precise marker, allowing the decoder to know exactly how many characters to read for each string, regardless of content. This approach avoids ambiguity and does not require escaping characters inside the strings.

Optimal Approach

Encode by concatenating each string as '<length>#<string>'. Decode by reading the length prefix until the delimiter '#', then extract the substring of that length. Repeat until the entire encoded string is processed. This guarantees lossless and unambiguous encoding and decoding.

Time

O(n)

Encoding and decoding each character exactly once results in linear time complexity proportional to the total length of all strings combined.

Space

O(n)

The encoded string and the decoded list require space proportional to the total input size. Auxiliary space is linear due to building the output string and result list.

Solution

Python

1class Solution:
  def encode(self, strs: List[str]) -> str:
      res = ""
      for s in strs:
          res += str(len(s)) + "#" + s
      return res
7
  def decode(self, s: str) -> List[str]:
      res = []
      i = 0
      while i < len(s):
          j = i
          while s[j] != '#':
              j += 1
          length = int(s[i:j])
          
          start_of_str = j + 1
          end_of_str = start_of_str + length
          res.append(s[start_of_str:end_of_str])
          
          i = end_of_str
      return res

Step-by-Step Solution

1

Build Encoded String by Prefixing Length and Delimiter

3res = ""
4for s in strs:
5    res += str(len(s)) + "#" + s
6return res

Objective

To transform the list of strings into a single encoded string by concatenating each string with its length and a delimiter.

Key Insight

Prefixing each string with its length and a delimiter '#' creates a self-describing encoding. This allows the decoder to parse the length first, then extract the exact substring, regardless of the string's content. This approach avoids the complexity and pitfalls of escaping delimiters inside strings.

Interview Quick-Check

Core Logic

Concatenate each string as '<length>#<string>' to enable exact substring extraction during decoding.

Common Pitfalls & Bugs

Forgetting to convert the length to string or omitting the delimiter '#' leads to ambiguous decoding.

2

Decode Encoded String by Parsing Length Prefix and Extracting Substrings

9res = []
10i = 0
11while i < len(s):
12    j = i
13    while s[j] != '#':
14        j += 1
15    length = int(s[i:j])
17    start_of_str = j + 1
18    end_of_str = start_of_str + length
19    res.append(s[start_of_str:end_of_str])
21    i = end_of_str
22return res

Objective

To reconstruct the original list of strings by reading length prefixes and slicing substrings accordingly.

Key Insight

The decoder scans the encoded string, first reading characters until it encounters the delimiter '#', which marks the end of the length prefix. Converting this prefix to an integer gives the exact length of the next string. Using this length, the decoder extracts the substring immediately following the delimiter. This process repeats until the entire encoded string is consumed, ensuring accurate and lossless reconstruction.

Interview Quick-Check

Core Logic

Parse the length prefix by scanning until '#', then slice the substring of that length to decode each original string.

State & Boundaries

Use a pointer to track the current position in the encoded string, advancing it by the length of the parsed substring plus the length prefix and delimiter.

Common Pitfalls & Bugs

Failing to update the pointer correctly after extracting each substring can cause infinite loops or incorrect decoding.

Line Analysis

This solution has 2 Critical lines interviewers watch for.

Line 5 Critical

res += str(len(s)) + "#" + s

Append the length of the string, a delimiter '#', and the string itself to the result.

This line implements the core encoding strategy: prefixing each string with its length and a delimiter to enable unambiguous decoding.

Line 15 Critical

length = int(s[i:j])

Convert the substring between pointers i and j to an integer length.

Parsing the length prefix as an integer determines how many characters to extract for the next string, enabling precise slicing.

Line 19 Core

res.append(s[start_of_str:end_of_str])

Extract the substring of the specified length and append it to the result list.

This step reconstructs one original string from the encoded data, preserving order and content.

Line 13 Core

while s[j] != '#':

Advance the secondary pointer until the delimiter '#' is found.

Locating the delimiter is essential to extract the length prefix accurately.

Line 21 Core

i = end_of_str

Advance the main pointer to the end of the extracted substring for the next iteration.

Updating the pointer ensures the decoder progresses through the encoded string without overlap or repetition.

Line 11 Standard

while i < len(s):

Start a loop to process the encoded string until fully parsed.

This loop ensures that decoding continues until all encoded substrings have been extracted.

Line 12 Standard

j = i

Set a secondary pointer to find the delimiter '#' marking the end of the length prefix.

This pointer scans forward to locate the delimiter, which separates the length prefix from the string content.

Line 14 Standard

j += 1

Increment the secondary pointer to continue searching for the delimiter.

This step iteratively moves the pointer forward, ensuring the delimiter is found even if the length prefix has multiple digits.

Line 4 Standard

for s in strs:

Iterate over each string in the input list.

Processing each string individually allows encoding them with their length prefixes in sequence.

Line 10 Standard

i = 0

Initialize a pointer to track the current position in the encoded string.

The pointer allows sequential parsing of the encoded string, ensuring all substrings are decoded in order.

Line 17 Standard

start_of_str = j + 1

Calculate the start index of the encoded string content after the delimiter.

This index marks where the actual string begins in the encoded string, immediately after the length prefix and delimiter.

Line 18 Standard

end_of_str = start_of_str + length

Calculate the end index of the string content using the parsed length.

This index defines the boundary for slicing the exact substring representing the original string.

Line 6 Standard

return res

Return the fully encoded string after processing all input strings.

Returning the concatenated encoded string completes the encoding process, providing the input for decoding.

Line 22 Standard

return res

Return the list of decoded strings after fully processing the encoded input.

Returning the reconstructed list completes the decoding process, providing the original input list.

Line 3 Basic

res = ""

Initialize an empty string to accumulate the encoded result.

This string will hold the concatenation of all encoded strings, forming the final encoded output.

Line 9 Basic

res = []

Initialize an empty list to store decoded strings.

This list accumulates the decoded strings as they are extracted from the encoded input.

Test Your Understanding

Why is prefixing each string with its length and a delimiter necessary instead of just joining with a delimiter?

Because strings can contain any characters including the delimiter, joining alone can cause ambiguity. The length prefix precisely indicates how many characters to read for each string, ensuring unambiguous and lossless decoding.

Don't just read it. Drill it.

Reconstruct Encode and Decode Strings from memory until it sticks. AlgoDrill blanks out key lines and makes you fill them back in, step by step.

Start drilling Encode and Decode Strings for free

1	`class Solution:`
2	`def encode(self, strs: List[str]) -> str:`
3	`res = ""`
4	`for s in strs:`
5	`res += str(len(s)) + "#" + s`
6	`return res`
7
8	`def decode(self, s: str) -> List[str]:`
9	`res = []`
10	`i = 0`
11	`while i < len(s):`
12	`j = i`
13	`while s[j] != '#':`
14	`j += 1`
15	`length = int(s[i:j])`
16
17	`start_of_str = j + 1`
18	`end_of_str = start_of_str + length`
19	`res.append(s[start_of_str:end_of_str])`
20
21	`i = end_of_str`
22	`return res`

3	`res = ""`
4	`for s in strs:`
5	`res += str(len(s)) + "#" + s`
6	`return res`

9	`res = []`
10	`i = 0`
11	`while i < len(s):`
12	`j = i`
13	`while s[j] != '#':`
14	`j += 1`
15	`length = int(s[i:j])`
17	`start_of_str = j + 1`
18	`end_of_str = start_of_str + length`
19	`res.append(s[start_of_str:end_of_str])`
21	`i = end_of_str`
22	`return res`