1. Overview
String manipulation is a crucial skill in Java programming. One common task is splitting a String based on the last occurrence of a specific character.
In this quick tutorial, we’ll explore different ways to achieve this.
2. Introduction to the Problem
Our goal is to split a String into two parts by the last occurrence of the specified character. An example can explain it quickly.
Let’s say we’re given a String input:
static final String INPUT1 = "a b c@e f g@x y z";
Now, we’d like to split the String by the last ‘@’ character and obtain a String array with two elements:
static final String[] EXPECTED1 = new String[] { "a b c@e f g", "x y z" };
The example shows that the input has two ‘@’ characters, but we must split it only at the last one.
Sometimes, an input contains only one ‘@’, located at the very beginning, for instance:
static final String INPUT2 = "@a b c";
In this scenario, we expect to have a String array with two elements. One is an empty String, and the other is the substring after ‘@’:
static final String[] EXPECTED2 = new String[] { "", "a b c" };
Similarly, the only ‘@’ character can be located at the very end of the input:
static final String INPUT3 = "a b c@";
static final String[] EXPECTED3 = new String[] { "a b c", "" };
Finally, there is one more special case: the input doesn’t contain a ‘@’ character at all. In this case, the result String array should contain only one element, the input itself:
static final String INPUT4 = "a b c";
static final String[] EXPECTED4 = new String[] { "a b c" };
In this tutorial, we’ll address two methods to solve the problem. For simplicity, we’ll skip null checks in the implementations.
Next, let’s dive into the implementation.
3. Using lastIndexOf()
A straightforward way to solve the problem is to first find the index of the last ‘@’ in the input String and then obtain the two substrings before and after that ‘@’ character.
Java’s lastIndexOf() and substring() methods can help us achieve our goal:
String[] splitByLastOccurrence(String input, char character) {
    int idx = input.lastIndexOf(character);
    return new String[] { input.substring(0, idx), input.substring(idx + 1) };
}
This implementation works for most inputs. However, it fails if the input doesn’t contain the given character, such as our INPUT4. In that case, lastIndexOf() returns -1, so idx is -1, and substring(0, -1) throws a StringIndexOutOfBoundsException.
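To make the failure concrete, here is a minimal sketch that reproduces it. The class name MissingDelimiterDemo and the method name naiveSplit are hypothetical and used only for illustration; naiveSplit mirrors the unguarded implementation shown earlier.

```java
class MissingDelimiterDemo {

    // Same logic as the first implementation: no guard for a missing character
    static String[] naiveSplit(String input, char character) {
        int idx = input.lastIndexOf(character);
        return new String[] { input.substring(0, idx), input.substring(idx + 1) };
    }

    public static void main(String[] args) {
        // lastIndexOf() signals "not found" with -1
        System.out.println("a b c".lastIndexOf('@')); // prints -1

        try {
            naiveSplit("a b c", '@');
        } catch (StringIndexOutOfBoundsException e) {
            // substring(0, -1) rejects the negative end index
            System.out.println("Split failed: " + e.getMessage());
        }
    }
}
```

The exact exception message depends on the JDK version, so the sketch only relies on the exception type.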
So, we must handle this special case to make the method work for all input Strings:
String[] splitByLastOccurrence(String input, char character) {
    int idx = input.lastIndexOf(character);
    if (idx < 0) {
        return new String[] { input };
    }
    return new String[] { input.substring(0, idx), input.substring(idx + 1) };
}
Next, let’s create a test to verify whether splitByLastOccurrence() works as expected:
String[] result1 = splitByLastOccurrence(INPUT1, '@');
assertArrayEquals(EXPECTED1, result1);
String[] result2 = splitByLastOccurrence(INPUT2, '@');
assertArrayEquals(EXPECTED2, result2);
String[] result3 = splitByLastOccurrence(INPUT3, '@');
assertArrayEquals(EXPECTED3, result3);
String[] result4 = splitByLastOccurrence(INPUT4, '@');
assertArrayEquals(EXPECTED4, result4);
The test shows this solution works for all scenarios.
4. Using split()
The String.split() method is a convenient tool for solving String splitting problems. So next, let’s create a regex pattern that matches only the last ‘@’ character. Then, we can solve our problem using split().
A positive lookahead can help us match the last ‘@’ character: “@(?=[^@]*$)”. This pattern matches a ‘@’ only if everything between it and the end of the input consists of non-‘@’ characters. In other words, it matches the last ‘@’ in a String by ensuring that no other ‘@’ character comes after it.
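Before using the pattern with split(), we can check what it matches on its own. The following sketch uses java.util.regex.Pattern and Matcher for that; the class name LastAtRegexDemo and its helper methods are hypothetical names chosen for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastAtRegexDemo {

    static final Pattern LAST_AT = Pattern.compile("@(?=[^@]*$)");

    // Index of the first (and only possible) match, or -1 if there is none
    static int lastAtIndex(String input) {
        Matcher matcher = LAST_AT.matcher(input);
        return matcher.find() ? matcher.start() : -1;
    }

    // Number of matches the pattern finds in the input
    static int countMatches(String input) {
        Matcher matcher = LAST_AT.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "a b c@e f g@x y z";
        System.out.println(countMatches(input));    // prints 1
        System.out.println(lastAtIndex(input));     // prints 11
        System.out.println(input.lastIndexOf('@')); // prints 11
    }
}
```

Although the input contains two ‘@’ characters, the pattern matches exactly once, at the same index that lastIndexOf() reports.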
Next, let’s see if we can get the expected results using split() with this regex pattern:
String regex = "@(?=[^@]*$)";
String[] result1 = INPUT1.split(regex);
assertArrayEquals(EXPECTED1, result1);
String[] result2 = INPUT2.split(regex);
assertArrayEquals(EXPECTED2, result2);
String[] result3 = INPUT3.split(regex);
assertArrayEquals(new String[] { "a b c" }, result3);
String[] result4 = INPUT4.split(regex);
assertArrayEquals(EXPECTED4, result4);
As we can see, the split() approach works for INPUT1, INPUT2, and INPUT4. However, after we split() INPUT3 (“a b c@”), the result array contains only one element. This is because the single-argument split() behaves as if limit were zero, and with a zero limit, split() discards trailing empty Strings.
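The effect of the limit parameter is easy to observe in isolation. The sketch below splits the same input with different limit values; it uses a plain “@” regex to keep the focus on limit, and the class name SplitLimitDemo is a hypothetical name for this illustration.

```java
import java.util.Arrays;

class SplitLimitDemo {

    public static void main(String[] args) {
        String input = "a b c@";

        // No limit is the same as limit = 0: trailing empty Strings are dropped
        System.out.println(Arrays.toString(input.split("@")));     // prints [a b c]
        System.out.println(Arrays.toString(input.split("@", 0)));  // prints [a b c]

        // A positive limit caps the array length and keeps trailing empty Strings
        System.out.println(Arrays.toString(input.split("@", 2)));  // prints [a b c, ]

        // A negative limit applies the pattern as often as possible and keeps them too
        System.out.println(Arrays.toString(input.split("@", -1))); // prints [a b c, ]
    }
}
```

Since Arrays.toString() renders an empty String as nothing, the output “[a b c, ]” stands for a two-element array whose second element is empty.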
We can pass limit=2 to split() to fix this problem:
String regex = "@(?=[^@]*$)";
String[] result1 = INPUT1.split(regex, 2);
assertArrayEquals(EXPECTED1, result1);
String[] result2 = INPUT2.split(regex, 2);
assertArrayEquals(EXPECTED2, result2);
String[] result3 = INPUT3.split(regex, 2);
assertArrayEquals(EXPECTED3, result3);
String[] result4 = INPUT4.split(regex, 2);
assertArrayEquals(EXPECTED4, result4);
As the test shows, it works for all cases when we pass 2 as limit to split().
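The regex above is hard-coded for ‘@’. If we wanted the same signature as the lastIndexOf() version, one possible sketch is shown below. Note that it is a variant and not the pattern used above: it assumes we quote the character with Pattern.quote() so that regex metacharacters such as ‘.’ are treated literally, and it swaps the positive lookahead for a negative lookahead. The class name SplitByLastCharRegex is hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitByLastCharRegex {

    static String[] splitByLastOccurrence(String input, char character) {
        // Quote the character so that metacharacters like '.' or '|' match literally
        String quoted = Pattern.quote(String.valueOf(character));

        // Match the character only if no further occurrence follows it.
        // (?s) lets '.' match line terminators, so multi-line inputs are covered too.
        String regex = "(?s)" + quoted + "(?!.*" + quoted + ")";

        // limit = 2 keeps a trailing empty String, as discussed above
        return input.split(regex, 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitByLastOccurrence("a b c@e f g@x y z", '@')));
        System.out.println(Arrays.toString(splitByLastOccurrence("a.b.c", '.')));
    }
}
```

For simple cases like this, the lastIndexOf() approach is usually easier to read. The regex variant mainly shows how the split() idea could be parameterized.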
5. Conclusion
In this article, we’ve explored two approaches to splitting a String by the last occurrence of a character: combining lastIndexOf() with substring(), and calling split() with a lookahead regex and a limit of 2. Both approaches handle a character at the beginning, at the end, or missing from the input.
As always, the complete source code for the examples is available over on GitHub.