String Compression

Implement simple string compression where 4 or more identical characters in sequence produces [Character]#[count] in a compressed output string. Consecutive identical characters of count 3 or less are left as-is in the output string. For example:

ABC           -> ABC
AAABBBCCC     -> AAABBBCCC
AAAABBBBBC    -> A#4B#5C
AAAABBBBBCC   -> A#4B#5CC
AAAABBBBBCCC  -> A#4B#5CCC
AAAABBBBBCCCC -> A#4B#5C#4

The approach taken here is to count identical characters in a row and when the current character is different from the previous (or we get to the end of the string), generate more of the output string. If 4 or more identical characters are counted, concatenate to the output string using the #[Count] notation. Otherwise concatenate the actual character 3 or fewer times.

For n characters, the runtime is O(n). Each character in the input loop is visited once. On the output, any identical character sequence produces at most 3 characters on the output.

The output worst case is a string with m identical character sequences of length 3. m * 3 = n chars on the output. So, each input char is visited and n chars are also produced on the output. n inputs + n outputs = 2n total chars = O(n).

Finally, for any input string length n, with one or more identical character sequence where count >= 4, the output string length will be less than n, since any count >= 4 results in its output representation being 3 characters.

func encode(_ input: String) -> String {
  
  var count:Int = 0
  var output:String = ""
  
  guard var char = input.first else { return "" }
  
  for c in input {
    if c == char {
      count = 1 + count
    }
    else {
      output += subencode(char:char, count:count)
      char = c
      count = 1
    }
  }

  output += subencode(char:char, count:count)

  return output
}

func subencode(char:Character, count:Int) -> String {
  var output = ""

  if count > 3 {
    output += "\(char)#\(count)"
  }
  else {
    for _ in 0 ..< count {
      output += "\(char)"
    }
  }
  
  return output
}

print(encode(""))
print(encode("ABC"))
print(encode("AAABBBCCC"))
print(encode("AAAABBBBBC"))
print(encode("AAAABBBBBCC"))
print(encode("AAAABBBBBCCC"))
print(encode("AAAABBBBBCCCC"))
Scroll to top