Split a string into maximum number of unique substrings
The problem at hand involves splitting a given string into the maximum number of unique substrings. In simpler terms, we want to find the most optimal way to divide a string into multiple pieces in such a way that each piece is distinct from the others. The goal is to determine the maximum number of these distinct substrings that can be obtained from the given string.
Problem Statement
Given a string, we need to devise an algorithm that will split it into as many unique substrings as possible. The uniqueness here means that no two substrings should be identical.
Example
Let's take the string "AABCCCCCDD" as an example. One possible way to split it into unique substrings is:
- A AB C CCC CD D
Notice that each substring is distinct and appears only once. The maximum number of partitions (or unique substrings) in this case is 6.
Idea to Solve
To solve this problem, we can utilize a recursive approach combined with dynamic programming. The key idea is to iterate through the string and consider all possible prefixes. We keep track of the substrings we have encountered so far using a HashSet. At each step, we try to add a prefix to the set, and then recursively move on to the remaining part of the string. We maximize the result by considering both scenarios: including the current prefix and excluding it.
Pseudocode
function maxUniqueSubString(text, record):
result = 0
for i = 1 to length(text):
prefix = text.substring(0, i)
if prefix not in record:
record.add(prefix)
result = max(result, maxUniqueSubString(text.substring(i), record) + 1)
record.remove(prefix)
return result
function maxUniqueSplit(text):
record = empty HashSet
result = maxUniqueSubString(text, record)
return result
main:
task = new Partition()
result1 = task.maxUniqueSplit("AABCCCCCDD")
result2 = task.maxUniqueSplit("ABCDE")
result3 = task.maxUniqueSplit("ABBBCCDDEA")
print("Unique substrings:", result1, result2, result3)
Algorithm Explanation
- We start by defining the
maxUniqueSubString
function which takes the current substring and the set of encountered substrings (record
) as input. It returns the maximum number of unique substrings that can be obtained from the given substring. - Inside
maxUniqueSubString
, we iterate through the string, considering each prefix from length 1 to the length of the current string. - If the prefix is not already present in the
record
, we add it to therecord
and recursively calculate the maximum number of unique substrings for the remaining part of the string. - We store the maximum of the result with and without including the current prefix.
- We remove the prefix from the
record
after exploring all possibilities. - The
maxUniqueSplit
function initializes an emptyrecord
, calculates the result usingmaxUniqueSubString
, and returns it.
Code Solution
import java.util.HashSet;
/*
Java Program for
Split a string into maximum number of unique substrings
*/
public class Partition
{
// Returns a max value of two numbers
public int maxValue(int a, int b)
{
if (a > b)
{
return a;
}
return b;
}
public int maxUniqueSubString(String text, HashSet < String > record)
{
int result = 0;
for (int i = 1; i <= text.length(); i++)
{
// Collect prefix of length i
String prefix = text.substring(0, i);
if (!record.contains(prefix))
{
// When prefix not exist
// Add prefix value.
record.add(prefix);
// Calculate the maximum distinct substring of
// suffix and prefix substring.
result = maxValue(
maxUniqueSubString(text.substring(i), record) + 1,
result);
// Remove prefix
record.remove(prefix);
}
}
return result;
}
public void maxUniqueSplit(String text)
{
HashSet < String > record = new HashSet < String > ();
System.out.print("\nGiven Text : " + text);
int result = maxUniqueSubString(text, record);
System.out.print("\n Unique substring : " + result);
}
public static void main(String[] args)
{
Partition task = new Partition();
/*
Test A
Given text = AABCCCCCDD
————————————————————————
A AB C CCC CD D
or
AA B C CCC CD D
or
AA B CC CCC DD
or
AA BC C CC CD D
...
etc
—————————————————————————————
Maximum number of partition 6.
Note Unique means no two substring are same
[Example (XX,XX) or (X,X) This is not valid because both are same]
*/
task.maxUniqueSplit("AABCCCCCDD");
/*
Test B
Given text = ABCDE
————————————————————————
A B C D E
Result = 5
*/
task.maxUniqueSplit("ABCDE");
/*
Test B
Given text = ABBBCCDDEA
————————————————————————
AB B BC C D DE A
or
A B BB C CD D EA
etc
————————————————————————
Result = 7
*/
task.maxUniqueSplit("ABBBCCDDEA");
}
}
Output
Given Text : AABCCCCCDD
Unique substring : 6
Given Text : ABCDE
Unique substring : 5
Given Text : ABBBCCDDEA
Unique substring : 7
// Include header file
#include <iostream>
#include <set>
#include <string>
using namespace std;
/*
C++ Program for
Split a string into maximum number of unique substrings
*/
class Partition
{
public:
// Returns a max value of two numbers
int maxValue(int a, int b)
{
if (a > b)
{
return a;
}
return b;
}
int maxUniqueSubString(string text, set < string > &record)
{
int result = 0;
for (int i = 1; i <= text.length(); i++)
{
// Collect prefix of length i
string prefix = text.substr(0, i);
if (record.find(prefix) == record.end())
{
// When prefix not exist
// Add prefix value.
record.insert(prefix);
// Calculate the maximum distinct substring of
// suffix and prefix substring.
result = this->maxValue(
this->maxUniqueSubString(text.substr(i), record) + 1,
result);
// Remove prefix
record.erase(prefix);
}
}
return result;
}
void maxUniqueSplit(string text)
{
set < string > record;
cout << "\nGiven Text : " << text;
int result = this->maxUniqueSubString(text, record);
cout << "\n Unique substring : " << result;
}
};
int main()
{
Partition *task = new Partition();
/*
Test A
Given text = AABCCCCCDD
————————————————————————
A AB C CCC CD D
or
AA B C CCC CD D
or
AA B CC CCC DD
or
AA BC C CC CD D
...
etc
—————————————————————————————
Maximum number of partition 6.
Note Unique means no two substring are same
[Example (XX,XX) or (X,X) This is not valid because both are same]
*/
task->maxUniqueSplit("AABCCCCCDD");
/*
Test B
Given text = ABCDE
————————————————————————
A B C D E
Result = 5
*/
task->maxUniqueSplit("ABCDE");
/*
Test B
Given text = ABBBCCDDEA
————————————————————————
AB B BC C D DE A
or
A B BB C CD D EA
etc
————————————————————————
Result = 7
*/
task->maxUniqueSplit("ABBBCCDDEA");
return 0;
}
Output
Given Text : AABCCCCCDD
Unique substring : 6
Given Text : ABCDE
Unique substring : 5
Given Text : ABBBCCDDEA
Unique substring : 7
// Include namespace system
using System;
using System.Collections.Generic;
/*
Csharp Program for
Split a string into maximum number of unique substrings
*/
public class Partition
{
// Returns a max value of two numbers
public int maxValue(int a, int b)
{
if (a > b)
{
return a;
}
return b;
}
public int maxUniqueSubString(String text, HashSet < string > record)
{
int result = 0;
for (int i = 1; i <= text.Length; i++)
{
// Collect prefix of length i
String prefix = text.Substring(0, i);
if (!record.Contains(prefix))
{
// When prefix not exist
// Add prefix value.
record.Add(prefix);
// Calculate the maximum distinct substring of
// suffix and prefix substring.
result = this.maxValue(
this.maxUniqueSubString(text.Substring(i), record) + 1,
result);
// Remove prefix
record.Remove(prefix);
}
}
return result;
}
public void maxUniqueSplit(String text)
{
HashSet < string > record = new HashSet < string > ();
Console.Write("\nGiven Text : " + text);
int result = this.maxUniqueSubString(text, record);
Console.Write("\n Unique substring : " + result);
}
public static void Main(String[] args)
{
Partition task = new Partition();
/*
Test A
Given text = AABCCCCCDD
————————————————————————
A AB C CCC CD D
or
AA B C CCC CD D
or
AA B CC CCC DD
or
AA BC C CC CD D
...
etc
—————————————————————————————
Maximum number of partition 6.
Note Unique means no two substring are same
[Example (XX,XX) or (X,X) This is not valid because both are same]
*/
task.maxUniqueSplit("AABCCCCCDD");
/*
Test B
Given text = ABCDE
————————————————————————
A B C D E
Result = 5
*/
task.maxUniqueSplit("ABCDE");
/*
Test B
Given text = ABBBCCDDEA
————————————————————————
AB B BC C D DE A
or
A B BB C CD D EA
etc
————————————————————————
Result = 7
*/
task.maxUniqueSplit("ABBBCCDDEA");
}
}
Output
Given Text : AABCCCCCDD
Unique substring : 6
Given Text : ABCDE
Unique substring : 5
Given Text : ABBBCCDDEA
Unique substring : 7
<?php
/*
Php Program for
Split a string into maximum number of unique substrings
*/
class Partition
{
// Returns a max value of two numbers
public function maxValue($a, $b)
{
if ($a > $b)
{
return $a;
}
return $b;
}
public function maxUniqueSubString($text, &$record)
{
$result = 0;
for ($i = 1; $i <= strlen($text); $i++)
{
// Collect prefix of length i
$prefix = substr($text, 0, $i);
if (!in_array($prefix, $record, TRUE))
{
// When prefix not exist
// Add prefix value.
if (in_array($prefix, $record) == false)
{
$record[] = $prefix;
}
// Calculate the maximum distinct substring of
// suffix and prefix substring.
$result = $this->maxValue(
$this->maxUniqueSubString(
substr($text, $i), $record) + 1, $result);
// Remove prefix
unset($record[$prefix]);
}
}
return $result;
}
public function maxUniqueSplit($text)
{
$record = array();
echo("\nGiven Text : ".$text);
$result = $this->maxUniqueSubString($text, $record);
echo("\n Unique substring : ".$result);
}
}
function main()
{
$task = new Partition();
/*
Test A
Given text = AABCCCCCDD
————————————————————————
A AB C CCC CD D
or
AA B C CCC CD D
or
AA B CC CCC DD
or
AA BC C CC CD D
...
etc
—————————————————————————————
Maximum number of partition 6.
Note Unique means no two substring are same
[Example (XX,XX) or (X,X) This is not valid because both are same]
*/
$task->maxUniqueSplit("AABCCCCCDD");
/*
Test B
Given text = ABCDE
————————————————————————
A B C D E
Result = 5
*/
$task->maxUniqueSplit("ABCDE");
/*
Test B
Given text = ABBBCCDDEA
————————————————————————
AB B BC C D DE A
or
A B BB C CD D EA
etc
————————————————————————
Result = 7
*/
$task->maxUniqueSplit("ABBBCCDDEA");
}
main();
Output
Given Text : AABCCCCCDD
Unique substring : 6
Given Text : ABCDE
Unique substring : 5
Given Text : ABBBCCDDEA
Unique substring : 7
/*
Node JS Program for
Split a string into maximum number of unique substrings
*/
class Partition
{
// Returns a max value of two numbers
maxValue(a, b)
{
if (a > b)
{
return a;
}
return b;
}
maxUniqueSubString(text, record)
{
var result = 0;
for (var i = 1; i <= text.length; i++)
{
// Collect prefix of length i
var prefix = text.substring(0, i);
if (!record.has(prefix))
{
// When prefix not exist
// Add prefix value.
record.add(prefix);
// Calculate the maximum distinct substring of
// suffix and prefix substring.
result = this.maxValue(
this.maxUniqueSubString(text.substring(i), record) + 1,
result);
// Remove prefix
record.delete(prefix);
}
}
return result;
}
maxUniqueSplit(text)
{
var record = new Set();
process.stdout.write("\nGiven Text : " + text);
var result = this.maxUniqueSubString(text, record);
process.stdout.write("\n Unique substring : " + result);
}
}
function main()
{
var task = new Partition();
/*
Test A
Given text = AABCCCCCDD
————————————————————————
A AB C CCC CD D
or
AA B C CCC CD D
or
AA B CC CCC DD
or
AA BC C CC CD D
...
etc
—————————————————————————————
Maximum number of partition 6.
Note Unique means no two substring are same
[Example (XX,XX) or (X,X) This is not valid because both are same]
*/
task.maxUniqueSplit("AABCCCCCDD");
/*
Test B
Given text = ABCDE
————————————————————————
A B C D E
Result = 5
*/
task.maxUniqueSplit("ABCDE");
/*
Test B
Given text = ABBBCCDDEA
————————————————————————
AB B BC C D DE A
or
A B BB C CD D EA
etc
————————————————————————
Result = 7
*/
task.maxUniqueSplit("ABBBCCDDEA");
}
main();
Output
Given Text : AABCCCCCDD
Unique substring : 6
Given Text : ABCDE
Unique substring : 5
Given Text : ABBBCCDDEA
Unique substring : 7
# Python 3 Program for
# Split a string into maximum number of unique substrings
class Partition :
# Returns a max value of two numbers
def maxValue(self, a, b) :
if (a > b) :
return a
return b
def maxUniqueSubString(self, text, record) :
result = 0
i = 1
while (i <= len(text)) :
# Collect prefix of length i
prefix = text[0: i]
if (not prefix in record) :
# When prefix not exist
# Add prefix value.
record.add(prefix)
# Calculate the maximum distinct substring of
# suffix and prefix substring.
result = self.maxValue(
self.maxUniqueSubString(text[i: ], record) + 1,
result)
# Remove prefix
record.remove(prefix)
i += 1
return result
def maxUniqueSplit(self, text) :
record = set()
print("\nGiven Text : ", text, end = "")
result = self.maxUniqueSubString(text, record)
print("\n Unique substring : ", result, end = "")
def main() :
task = Partition()
# Test A
# Given text = AABCCCCCDD
# ————————————————————————
# A AB C CCC CD D
# or
# AA B C CCC CD D
# or
# AA B CC CCC DD
# or
# AA BC C CC CD D
# ...
# etc
# —————————————————————————————
# Maximum number of partition 6.
# Note Unique means no two substring are same
# [Example (XX,XX) or (X,X) This is not valid because both are same]
task.maxUniqueSplit("AABCCCCCDD")
# Test B
# Given text = ABCDE
# ————————————————————————
# A B C D E
# Result = 5
task.maxUniqueSplit("ABCDE")
# Test B
# Given text = ABBBCCDDEA
# ————————————————————————
# AB B BC C D DE A
# or
# A B BB C CD D EA
# etc
# ————————————————————————
# Result = 7
task.maxUniqueSplit("ABBBCCDDEA")
if __name__ == "__main__": main()
Output
Given Text : AABCCCCCDD
Unique substring : 6
Given Text : ABCDE
Unique substring : 5
Given Text : ABBBCCDDEA
Unique substring : 7
# Ruby Program for
# Split a string into maximum number of unique substrings
class Partition
# Returns a max value of two numbers
def maxValue(a, b)
if (a > b)
return a
end
return b
end
def maxUniqueSubString(text, record)
result = 0
i = 1
while (i <= text.length)
# Collect prefix of length i
prefix = text[0...(i)]
if (!record.include?(prefix))
# When prefix not exist
# Add prefix value.
record.push(prefix)
# Calculate the maximum distinct substring of
# suffix and prefix substring.
result = self.maxValue(
self.maxUniqueSubString(
text[i..(text.length)], record) + 1, result)
# Remove prefix
record.pop()
end
i += 1
end
return result
end
def maxUniqueSplit(text)
record = []
print("\nGiven Text : ", text)
result = self.maxUniqueSubString(text, record)
print("\n Unique substring : ", result)
end
end
def main()
task = Partition.new()
# Test A
# Given text = AABCCCCCDD
# ————————————————————————
# A AB C CCC CD D
# or
# AA B C CCC CD D
# or
# AA B CC CCC DD
# or
# AA BC C CC CD D
# ...
# etc
# —————————————————————————————
# Maximum number of partition 6.
# Note Unique means no two substring are same
# [Example (XX,XX) or (X,X) This is not valid because both are same]
task.maxUniqueSplit("AABCCCCCDD")
# Test B
# Given text = ABCDE
# ————————————————————————
# A B C D E
# Result = 5
task.maxUniqueSplit("ABCDE")
# Test B
# Given text = ABBBCCDDEA
# ————————————————————————
# AB B BC C D DE A
# or
# A B BB C CD D EA
# etc
# ————————————————————————
# Result = 7
task.maxUniqueSplit("ABBBCCDDEA")
end
main()
Output
Given Text : AABCCCCCDD
Unique substring : 6
Given Text : ABCDE
Unique substring : 5
Given Text : ABBBCCDDEA
Unique substring : 7
import scala.collection.mutable._;
/*
Scala Program for
Split a string into maximum number of unique substrings
*/
class Partition()
{
// Returns a max value of two numbers
def maxValue(a: Int, b: Int): Int = {
if (a > b)
{
return a;
}
return b;
}
def maxUniqueSubString(text: String, record: Set[String]): Int = {
var result: Int = 0;
var i: Int = 1;
while (i <= text.length())
{
// Collect prefix of length i
var prefix: String = text.substring(0, i);
if (!record.contains(prefix))
{
// When prefix not exist
// Add prefix value.
record.add(prefix);
// Calculate the maximum distinct substring of
// suffix and prefix substring.
result = maxValue(
maxUniqueSubString(text.substring(i), record) + 1,
result);
// Remove prefix
record.remove(prefix);
}
i += 1;
}
return result;
}
def maxUniqueSplit(text: String): Unit = {
var record: Set[String] = Set();
print("\nGiven Text : " + text);
var result: Int = maxUniqueSubString(text, record);
print("\n Unique substring : " + result);
}
}
object Main
{
def main(args: Array[String]): Unit = {
var task: Partition = new Partition();
/*
Test A
Given text = AABCCCCCDD
————————————————————————
A AB C CCC CD D
or
AA B C CCC CD D
or
AA B CC CCC DD
or
AA BC C CC CD D
...
etc
—————————————————————————————
Maximum number of partition 6.
Note Unique means no two substring are same
[Example (XX,XX) or (X,X) This is not valid because both are same]
*/
task.maxUniqueSplit("AABCCCCCDD");
/*
Test B
Given text = ABCDE
————————————————————————
A B C D E
Result = 5
*/
task.maxUniqueSplit("ABCDE");
/*
Test B
Given text = ABBBCCDDEA
————————————————————————
AB B BC C D DE A
or
A B BB C CD D EA
etc
————————————————————————
Result = 7
*/
task.maxUniqueSplit("ABBBCCDDEA");
}
}
Output
Given Text : AABCCCCCDD
Unique substring : 6
Given Text : ABCDE
Unique substring : 5
Given Text : ABBBCCDDEA
Unique substring : 7
/*
Kotlin Program for
Split a string into maximum number of unique substrings
*/
class Partition
{
// Returns a max value of two numbers
fun maxValue(a: Int, b: Int): Int
{
if (a > b)
{
return a;
}
return b;
}
fun maxUniqueSubString(text: String, record: MutableSet <String> ): Int
{
var result: Int = 0;
var i: Int = 1;
while (i <= text.length)
{
// Collect prefix of length i
val prefix: String = text.substring(0, i);
if (!record.contains(prefix))
{
// When prefix not exist
// Add prefix value.
record.add(prefix);
// Calculate the maximum distinct substring of
// suffix and prefix substring.
result = this.maxValue(
this.maxUniqueSubString(text.substring(i), record) + 1,
result);
// Remove prefix
record.remove(prefix);
}
i += 1;
}
return result;
}
fun maxUniqueSplit(text: String): Unit
{
val record : MutableSet <String> = mutableSetOf <String> ();
print("\n Given Text : " + text);
val result: Int = this.maxUniqueSubString(text, record);
print("\n Unique substring : " + result);
}
}
fun main(args: Array < String > ): Unit
{
val task: Partition = Partition();
/*
Test A
Given text = AABCCCCCDD
————————————————————————
A AB C CCC CD D
or
AA B C CCC CD D
or
AA B CC CCC DD
or
AA BC C CC CD D
...
etc
—————————————————————————————
Maximum number of partition 6.
Note Unique means no two substring are same
[Example (XX,XX) or (X,X) This is not valid because both are same]
*/
task.maxUniqueSplit("AABCCCCCDD");
/*
Test B
Given text = ABCDE
————————————————————————
A B C D E
Result = 5
*/
task.maxUniqueSplit("ABCDE");
/*
Test B
Given text = ABBBCCDDEA
————————————————————————
AB B BC C D DE A
or
A B BB C CD D EA
etc
————————————————————————
Result = 7
*/
task.maxUniqueSplit("ABBBCCDDEA");
}
Output
Given Text : AABCCCCCDD
Unique substring : 6
Given Text : ABCDE
Unique substring : 5
Given Text : ABBBCCDDEA
Unique substring : 7
package main
import "fmt"
/*
Go Program for
Split a string into maximum number of unique substrings
*/
// Returns a max value of two numbers
func maxValue(a, b int) int {
if a > b {
return a
}
return b
}
func maxUniqueSubString(
text string,
record map[string] bool) int {
var result int = 0
for i := 1 ; i <= len(text) ; i++ {
// Collect prefix of length i
var prefix string = text[0:i]
if _, found := record[prefix] ; !found {
// When prefix not exist
// Add prefix value.
record[prefix] = true
// Calculate the maximum distinct substring of
// suffix and prefix substring.
result = maxValue(maxUniqueSubString(text[i:len(text)], record) + 1, result)
// Remove prefix
record[prefix] = false
}
}
return result
}
func maxUniqueSplit(text string) {
var record = make(map[string] bool)
fmt.Print("\nGiven Text : ", text)
var result int = maxUniqueSubString(text, record)
fmt.Print("\n Unique substring : ", result)
}
func main() {
/*
Test A
Given text = AABCCCCCDD
————————————————————————
A AB C CCC CD D
or
AA B C CCC CD D
or
AA B CC CCC DD
or
AA BC C CC CD D
...
etc
—————————————————————————————
Maximum number of partition 6.
Note Unique means no two substring are same
[Example (XX,XX) or (X,X) This is not valid because both are same]
*/
maxUniqueSplit("AABCCCCCDD")
/*
Test B
Given text = ABCDE
————————————————————————
A B C D E
Result = 5
*/
maxUniqueSplit("ABCDE")
/*
Test B
Given text = ABBBCCDDEA
————————————————————————
AB B BC C D DE A
or
A B BB C CD D EA
etc
————————————————————————
Result = 7
*/
maxUniqueSplit("ABBBCCDDEA")
}
Output
Given Text : AABCCCCCDD
Unique substring : 6
Given Text : ABCDE
Unique substring : 5
Given Text : ABBBCCDDEA
Unique substring : 7
Result Explanation
- For the given examples:
- "AABCCCCCDD" yields a result of 6, as explained earlier.
- "ABCDE" can be split into distinct characters, resulting in 5 unique substrings.
- "ABBBCCDDEA" has a maximum of 7 unique substrings.
- The output matches the expected results for these test cases.
Time Complexity
The time complexity of the algorithm can be high due to the recursive nature. In the worst case, each prefix might be considered multiple times, resulting in exponential time complexity. However, the use of dynamic programming (memoization) can significantly improve this. If we denote the length of the input string as "n," then the time complexity can be approximated as O(n^2) when using memoization. Without memoization, it could be O(2^n) in the worst case.
Please share your knowledge to improve code and content standard. Also submit your doubts, and test case. We improve by your feedback. We will try to resolve your query as soon as possible.
New Comment