String Compression
Problem
Given an array of characters chars, compress it in-place such that consecutive groups of the same character are replaced by the character followed by the group's length, and return the new length of the array after compression.
- 1 ≤ chars.length ≤ 2000
- chars[i] is a lowercase English letter, uppercase English letter, digit, or symbol.
Example
chars = ["a","a","b","b","c","c","c"]6The groups are "aa", "bb", and "ccc". After compression, the array becomes ["a","2","b","2","c","3"]. The new length is 6. The algorithm scans the array, counts consecutive characters, writes the character once, and if the count is greater than 1, writes the count digits sequentially. This in-place approach avoids extra space by using two pointers: one to read and one to write.
Approach
Straightforward Solution
A naive approach would build a new string or array to store the compressed result, which uses extra space and violates the in-place constraint. Another brute-force approach might repeatedly shift elements, resulting in O(n²) time complexity.
Core Observation
Compression depends on identifying consecutive groups of identical characters and replacing them with the character plus the count if greater than one. The key is to perform this in-place, which requires careful management of read and write pointers to avoid overwriting unprocessed data.
Path to Optimal
PreviewThe insight is to use two pointers: one to iterate through the input array (read pointer) and one to track the position to write compressed characters (write pointer)…
Full step-by-step walkthrough on Pro →
Optimal Approach
PreviewIterate through the array with a read pointer. For each group of identical characters, count its length by advancing the read pointer…
Full step-by-step walkthrough on Pro →
Want the full reasoning chain?
Unlock the complete walkthrough, line-by-line analysis, and recall drill.
Unlock ProTime
O(n)
Each character is visited at most twice: once by the read pointer to count groups, and once by the write pointer to write compressed characters, resulting in linear time.
Space
O(1)
The algorithm uses only a fixed number of pointers and variables for counting and writing, modifying the input array in-place without additional data structures.
Pattern Spotlight
Two Pointers (In-Place Compression)
Use one pointer to read and count groups, and another to write compressed output in-place, ensuring no data loss or overwriting by carefully advancing pointers only after processing each group.
Solution
| 1 | class Solution: |
| 2 | def compress(self, chars: List[str]) -> int: |
| 3 | write = 0 |
| 4 | i = 0 |
| 5 | |
| 6 | while i < len(chars): |
| 7 | char = chars[i] |
| 8 | count = 0 |
| 9 | |
| 10 | while i < len(chars) and chars[i] == char: |
| 11 | i += 1 |
| 12 | count += 1 |
| 13 | |
| 14 | chars[write] = char |
| 15 | write += 1 |
| 16 | |
| 17 | if count > 1: |
| 18 | for d in str(count): |
| 19 | chars[write] = d |
| 20 | write += 1 |
| 21 | |
| 22 | return write |
Step-by-Step Solution
Initialize Read and Write Pointers for Traversal and Compression
| 3 | write = 0 |
| 4 | i = 0 |
Objective
To set up two pointers: one for reading through the input array and one for writing compressed characters in-place.
Key Insight
Using two pointers allows simultaneous reading and writing without extra space. The read pointer scans the input to identify groups, while the write pointer tracks where to write compressed output. This separation prevents overwriting unprocessed characters and enables in-place modification.
Interview Quick-Check
Core Logic
The read pointer traverses the input array to identify groups, while the write pointer marks the position to write compressed characters.
State & Boundaries
Both pointers start at zero, ensuring the entire array is processed from the beginning.
Common Pitfalls & Bugs
Initializing pointers incorrectly or mixing their roles can cause data corruption or incorrect compression.
Traverse Input to Identify Consecutive Character Groups
To iterate through the input array, counting the length of each group of identical characters.
Write Compressed Characters and Counts In-Place
To write the compressed representation of each group into the array at the write pointer position.
Return Length of Compressed Array
To return the final length of the compressed array after all groups have been processed.
3 more steps with full analysis available on Pro.
Line Analysis
This solution has 4 Critical lines interviewers watch for.
while i < len(chars) and chars[i] == char:
Start inner loop to count consecutive occurrences of the current character.
This loop advances the read pointer and increments count while the characters match, efficiently identifying group length.
chars[write] = char
Write the current character at the write pointer position.
Writing the character first ensures the compressed array starts with the correct character for the group.
chars[write] = d
Write each count digit at the write pointer position.
Writing digits individually maintains the correct order and placement of the count in the compressed array.
Full line-by-line criticality + rationale for all 15 lines available on Pro.
Test Your Understanding
Why must the write pointer only advance after fully processing each group of characters?
See the answer with Pro.
Related Problems
Two Pointers pattern
Don't just read it. Drill it.
Reconstruct String Compression from memory until it sticks. AlgoDrill blanks out key lines and makes you fill them back in, step by step.
Unlock the String Compression drill