1. Introduction
In this tutorial, we’ll show how to check in Java whether a String consists entirely of one substring repeated two or more times.
2. The Problem
Before we continue with the implementation, let’s set up some conditions. First, we’ll assume that our String has at least two characters.
Second, the substring must be repeated at least once, so it appears two or more times in a row and those copies cover the whole String.
This is best illustrated with a few examples of Strings made of repeated substrings:
"aa"
"ababab"
"barrybarrybarry"
And a few non-repeated ones:
"aba"
"cbacbac"
"carlosxcarlosy"
We’ll now show a few solutions to the problem.
3. A Naive Solution
Let’s implement the first solution.
The process is rather simple: we’ll check the String’s length first and reject Strings shorter than two characters.
Then, since the repeated substring can’t be longer than half of the String’s length, we’ll iterate through the first half of the String and build a candidate substring in every iteration by appending the next character to the previous candidate.
Next, we’ll remove every occurrence of the candidate from the original String and check if the length of the “stripped” result is zero. That would mean the String is made only of copies of that substring:
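// Requires: import java.util.regex.Pattern;
public static boolean containsOnlySubstrings(String string) {
    // Strings shorter than two characters can't contain a repetition
    if (string.length() < 2) {
        return false;
    }

    StringBuilder substr = new StringBuilder();
    for (int i = 0; i < string.length() / 2; i++) {
        // Grow the candidate substring by one character
        substr.append(string.charAt(i));

        // Quote the candidate so replaceAll() matches it literally, not as a regex
        String clearedFromSubstrings
          = string.replaceAll(Pattern.quote(substr.toString()), "");
        if (clearedFromSubstrings.length() == 0) {
            return true;
        }
    }

    return false;
}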
Let’s create some Strings to test our method:
String validString = "aa";
String validStringTwo = "ababab";
String validStringThree = "baeldungbaeldung";
String invalidString = "aca";
String invalidStringTwo = "ababa";
String invalidStringThree = "baeldungnonrepeatedbaeldung";
Finally, we can verify that the method classifies them correctly:
assertTrue(containsOnlySubstrings(validString));
assertTrue(containsOnlySubstrings(validStringTwo));
assertTrue(containsOnlySubstrings(validStringThree));
assertFalse(containsOnlySubstrings(invalidString));
assertFalse(containsOnlySubstrings(invalidStringTwo));
assertFalse(containsOnlySubstrings(invalidStringThree));
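One detail worth keeping in mind is that replaceAll() interprets its first argument as a regular expression, so candidates containing characters such as "." or "(" may not behave like plain text. The following is a minimal sketch of the difference, not part of the original solution; the class and helper names (RegexQuotingSketch, stripAsRegex, stripLiteral) are made up for illustration:

```java
import java.util.regex.Pattern;

final class RegexQuotingSketch {

    // Removes every match of the candidate, treating it as a regular expression.
    static String stripAsRegex(String text, String candidate) {
        return text.replaceAll(candidate, "");
    }

    // Removes every literal occurrence of the candidate.
    static String stripLiteral(String text, String candidate) {
        return text.replaceAll(Pattern.quote(candidate), "");
    }

    public static void main(String[] args) {
        // As a regex, "." matches any character, so the whole input disappears.
        System.out.println("[" + stripAsRegex(".a", ".") + "]");  // prints "[]"

        // Quoted, only the literal dot is removed.
        System.out.println("[" + stripLiteral(".a", ".") + "]");  // prints "[a]"
    }
}
```

With the unquoted variant, an input like ".a" would be reported as a repetition even though it isn’t one, which is why quoting the candidate (or using the literal String.replace()) is the safer choice.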
Although this solution works, it’s not very efficient, since we iterate through half of the String and call the replaceAll() method in every iteration.
Each of those calls scans the whole String, so with about n/2 iterations the method runs in O(n^2) time.
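As a side note, a candidate substring can only tile the whole String if its length divides the String’s length evenly. The sketch below is one possible variant of the naive idea that skips the other lengths and compares characters directly instead of building stripped copies. It’s an illustration under that assumption, and the names DivisorCheckSketch, isRepeatedUnit, and matchesWithPeriod are hypothetical:

```java
final class DivisorCheckSketch {

    // Returns true if text is some shorter unit repeated two or more times.
    static boolean isRepeatedUnit(String text) {
        int n = text.length();
        // For n < 2 the loop body never runs, so the result is false.
        for (int len = 1; len <= n / 2; len++) {
            // Only unit lengths that divide n can cover the String exactly.
            if (n % len != 0) {
                continue;
            }
            if (matchesWithPeriod(text, len)) {
                return true;
            }
        }
        return false;
    }

    // Checks that every character equals the one len positions before it.
    static boolean matchesWithPeriod(String text, int len) {
        for (int i = len; i < text.length(); i++) {
            if (text.charAt(i) != text.charAt(i - len)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isRepeatedUnit("ababab"));  // prints "true"
        System.out.println(isRepeatedUnit("ababa"));   // prints "false"
    }
}
```

This still tries several candidate lengths, so it remains a brute-force approach, but it avoids both the regular-expression machinery and the temporary Strings.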
4. The Efficient Solution
Now, we’ll illustrate another approach.
Namely, we can make use of the fact that a String is made of repeated substrings if and only if it’s equal to a nontrivial rotation of itself.
A rotation here means that we remove some characters from the beginning of the String and put them at the end; it’s nontrivial if we move at least one character, but not all of them. For example, “eldungba” is a rotation of “baeldung”. If we rotate a String and get the original one back, then we can apply this rotation over and over again, which shows that the String consists of repeated substrings.
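To make the notion of rotation concrete, here is a small sketch. The helper rotateLeft and the class RotationSketch are hypothetical names introduced only for this illustration, and the helper assumes that k lies between 0 and the String’s length:

```java
final class RotationSketch {

    // Moves the first k characters of text to its end (assumes 0 <= k <= text.length()).
    static String rotateLeft(String text, int k) {
        return text.substring(k) + text.substring(0, k);
    }

    public static void main(String[] args) {
        // Moving "ba" to the end gives the rotation from the example above.
        System.out.println(rotateLeft("baeldung", 2));  // prints "eldungba"

        // A String built from repeats of "ab" maps onto itself.
        System.out.println(rotateLeft("ababab", 2));    // prints "ababab"

        // A String without such a repetition changes under the same rotation.
        System.out.println(rotateLeft("ababa", 2));     // prints "abaab"
    }
}
```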
Next, we need to check if this is the case for a given String. To accomplish this, we’ll make use of a theorem which says that if String A and String B have the same length, then A is a rotation of B if and only if A is a substring of BB. The example from the previous paragraph confirms this: “eldungba” appears inside “baeldungbaeldung”, starting at index 2.
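The theorem translates almost directly into code. The following is a minimal sketch, with isRotation and RotationCheckSketch being hypothetical names that don’t appear in the original solution:

```java
final class RotationCheckSketch {

    // a is a rotation of b if both have the same length and a occurs inside b + b.
    static boolean isRotation(String a, String b) {
        return a.length() == b.length() && (b + b).contains(a);
    }

    public static void main(String[] args) {
        System.out.println(isRotation("eldungba", "baeldung"));  // prints "true"
        System.out.println(isRotation("eldungab", "baeldung"));  // prints "false"
    }
}
```

The length check matters: without it, any shorter substring of BB would be accepted as well.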
A String A of length n always occurs in AA at index 0 and at index n, and these two positions correspond to the trivial rotations. Therefore, we only need to search for A in AA starting from index 1 and check whether the first match comes before index n:
public static boolean containsOnlySubstringsEfficient(String string) {
    return (string + string).indexOf(string, 1) != string.length();
}
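We can test this method the same way as the previous one. This time, the work comes down to a single substring search in a String of length 2n, which takes O(n) time with a linear-time search algorithm. Note that the Javadoc of indexOf() doesn’t promise a particular complexity, so the actual worst case depends on the JDK implementation.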
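As a quick illustration, the sketch below wraps the method in a hypothetical class (EfficientCheckSketch) and prints both the index found by indexOf() and the final result for one repeated and one non-repeated String from the earlier tests:

```java
final class EfficientCheckSketch {

    static boolean containsOnlySubstringsEfficient(String string) {
        return (string + string).indexOf(string, 1) != string.length();
    }

    // Helper for illustration: the first index >= 1 at which the String occurs in its doubled form.
    static int firstShiftedMatch(String string) {
        return (string + string).indexOf(string, 1);
    }

    public static void main(String[] args) {
        // "ababab" is found again at index 2, before its length of 6.
        System.out.println(firstShiftedMatch("ababab"));                // prints "2"
        System.out.println(containsOnlySubstringsEfficient("ababab"));  // prints "true"

        // "ababa" is only found again at index 5, which is its own length.
        System.out.println(firstShiftedMatch("ababa"));                 // prints "5"
        System.out.println(containsOnlySubstringsEfficient("ababa"));   // prints "false"
    }
}
```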
We can find some useful theorems about the topic in String analysis research.
5. Conclusion
In this article, we illustrated two ways of checking in Java if a String consists only of repeated substrings.
All code samples used in the article are available over on GitHub.