#String Compression Challenge

4 messages · Page 1 of 1 (latest)

tranquil aspen
#

Goal:

Given a list of N strings, your task is to create a new string that incorporates all the strings from the list. The resultant string should be the shortest possible by taking advantage of overlapping sections between the strings.
Guidelines:

The resulting string should contain all the original strings entirely.
The strings from the list can appear in any order in the resulting string.
If multiple shortest strings are possible, returning any one of them is acceptable.
If there's no possible overlap among the strings, the result will simply be a concatenation of all the strings in the list.
All strings in the input list are unique.

Input:

A list of N strings, where each string has a length between 1 and 100 inclusive.

Output:

A single string that represents the most compressed combination of the given strings.

Example:

Given the input list:
strings = ["pleasure", "apple", "rendered", "clap", "suren"]A possible output is:
"clappleasurendered"

tranquil aspen
#

my solution
cam probably be optimised a bit, just quickly threw it together based on my graph test
||```cpp
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>

typedef struct Node {
std::string word1;
std::string word2;
int overlap;
} Node;

int findOverlap(const std::string &a, const std::string &b) {
int len1 = a.length();
int len2 = b.length();
int maxOverlap = std::min(len1, len2);

for (int o = maxOverlap; o > 0; --o) {
    if (a.substr(len1 - o) == b.substr(0, o)) {
        return o;
    }
}

return 0;

}

Node findMaxOverlap(const std::string &word1, const std::string &word2) {
int overlap1 = findOverlap(word1, word2);
int overlap2 = findOverlap(word2, word1);

if (overlap1 >= overlap2) {
    return Node{word1, word2, overlap1};
} else {
    return Node{word2, word1, overlap2};
}

}

int main() {
std::vectorstd::string words = {"pleasure", "apple", "clap", "suren"};
std::vector<Node> nodes;

while (words.size() > 1) {
  for (uint i = 0; i < words.size() - 1; i++) {
      for (uint j = i + 1; j < words.size(); j++) {
          nodes.push_back(findMaxOverlap(words[i], words[j]));
      }
  }

  std::sort(nodes.begin(), nodes.end(), [](const Node &a, const Node &b) {
      return a.overlap > b.overlap;
  });

  Node top = nodes[0];
  printf("top: %s, %s, %d = %s\n", top.word1.c_str(), top.word2.c_str(), top.overlap, (top.word1 + top.word2.substr(top.overlap)).c_str());
  words.erase(std::remove(words.begin(), words.end(), top.word1), words.end());
  words.erase(std::remove(words.begin(), words.end(), top.word2), words.end());
  words.push_back(top.word1 + top.word2.substr(top.overlap));
  nodes.clear();
}

printf("result: %s\n", words[0].c_str());

return 0;

}

waxen granite
#

@tranquil aspen My solution is going to be O(n!) — based on the different unique ways to order a list of N items

#

reason is because the shortest "compressed" strings depends on the order of N strings given, and I'm not sure if there's an algorithm to find one of these shortest strings consistently 🤷🏿‍♂️