1. Introduction
In this tutorial, we’ll look at how to sort alphanumeric Strings by the numbers they contain. We’ll focus on removing all non-numeric characters from the String before sorting multiple Strings by the numerical characters that remain.
We’ll look at common edge cases, including empty Strings and invalid numbers.
Finally, we’ll unit test our solution to ensure it works as expected.
2. Outlining the Problem
Before we begin, we need to describe what we want our code to achieve. For this particular problem, we’ll make the following assumptions:
- Our strings may contain only numbers, only letters or a mix of the two.
- The numbers in our strings may be integers or doubles.
- When numbers in a string are separated by letters, we should remove the letter and condense the digits together. For example, 2d3 becomes 23.
- For simplicity, when a number is invalid or missing, we should treat it as 0.
With this established, let’s get stuck into our solution.
3. A Regex Solution
Since our first step is to search for numeric patterns within our input String, we can use a regular expression, commonly known as a regex.
The first thing we need is the regex itself. We want to keep all digits and decimal points from the input String and discard everything else. We can achieve this with the following:
String DIGIT_AND_DECIMAL_REGEX = "[^\\d.]";
String digitsOnly = input.replaceAll(DIGIT_AND_DECIMAL_REGEX, "");
Let’s briefly explain what’s happening:
- ‘[^ ]’ – denotes a negated set, therefore targeting any character not listed inside the brackets
- ‘\d’ – matches any digit character (0 – 9)
- ‘.’ – matches a literal “.” character (inside a set, the dot loses its usual "any character" meaning)
We then use the String.replaceAll method to remove every character that the negated set matches, leaving only digits and decimal points. By doing this, we can ensure that the first three points of our goal can be achieved.
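To make the effect of the regex concrete, here is a minimal sketch that applies it to a few sample inputs. The class name DigitStripDemo and the helper stripNonNumeric are illustrative names, not part of the original code:

```java
class DigitStripDemo {

    // Negated set: matches every character that is neither a digit nor a literal dot
    private static final String DIGIT_AND_DECIMAL_REGEX = "[^\\d.]";

    // Hypothetical helper: removes everything except digits and dots
    static String stripNonNumeric(String input) {
        return input.replaceAll(DIGIT_AND_DECIMAL_REGEX, "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonNumeric("2d3"));     // 23
        System.out.println(stripNonNumeric("d2.2"));    // 2.2
        System.out.println(stripNonNumeric("d2.3.3d")); // 2.3.3 (not yet a valid number)
        System.out.println(stripNonNumeric("abc").isEmpty()); // true
    }
}
```

Note that the result is still a String at this point, and it may not be a parseable number, as "2.3.3" shows.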
Next, we need to add some conditions to ensure empty and invalid Strings return 0, while valid Strings return a valid Double:
if ("".equals(digitsOnly)) return 0;
try {
    return Double.parseDouble(digitsOnly);
} catch (NumberFormatException nfe) {
    return 0;
}
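Putting the two fragments together, the parsing step might look like the sketch below. The method name parseStringToNumber is the one referenced by the comparator later in this article; the wrapper class ParseSketch and the exact method signature are assumptions made for illustration:

```java
class ParseSketch {

    private static final String DIGIT_AND_DECIMAL_REGEX = "[^\\d.]";

    // Assumed signature: strips non-numeric characters, then parses what is left
    static double parseStringToNumber(String input) {
        String digitsOnly = input.replaceAll(DIGIT_AND_DECIMAL_REGEX, "");
        if ("".equals(digitsOnly)) return 0; // no digits at all
        try {
            return Double.parseDouble(digitsOnly);
        } catch (NumberFormatException nfe) {
            return 0; // e.g. "2.3.3" or a lone "."
        }
    }

    public static void main(String[] args) {
        System.out.println(parseStringToNumber("d2.f4"));   // 2.4
        System.out.println(parseStringToNumber("d2.3.3d")); // 0.0
        System.out.println(parseStringToNumber(""));        // 0.0
        System.out.println(parseStringToNumber("a-5"));     // 5.0
    }
}
```

One consequence of the regex worth knowing: because the minus sign is not a digit or a dot, it is stripped along with the letters, so "a-5" is read as 5 rather than -5.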
That completes our logic. All that’s left to do is plug it into a comparator so that we can conveniently sort Lists of input Strings.
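Assuming the logic above lives in a static parseStringToNumber method of a NaturalOrderComparators class, let’s create a static factory method that returns our comparator wherever we need it: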
public static Comparator<String> createNaturalOrderRegexComparator() {
    return Comparator.comparingDouble(NaturalOrderComparators::parseStringToNumber);
}
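As a rough end-to-end illustration, the sketch below combines the parsing step and the comparator factory in one class and sorts a small list. The class name NaturalOrderSketch and the sample data are hypothetical; the method names mirror the ones used in this article:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class NaturalOrderSketch {

    private static final String DIGIT_AND_DECIMAL_REGEX = "[^\\d.]";

    static double parseStringToNumber(String input) {
        String digitsOnly = input.replaceAll(DIGIT_AND_DECIMAL_REGEX, "");
        if ("".equals(digitsOnly)) return 0;
        try {
            return Double.parseDouble(digitsOnly);
        } catch (NumberFormatException nfe) {
            return 0;
        }
    }

    // Compares Strings by the double value extracted from each of them
    static Comparator<String> createNaturalOrderRegexComparator() {
        return Comparator.comparingDouble(NaturalOrderSketch::parseStringToNumber);
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("item10", "item2", "item1.5");
        items.sort(createNaturalOrderRegexComparator());
        System.out.println(items); // [item1.5, item2, item10]
    }
}
```

A plain lexicographic sort would place "item10" before "item2"; comparing by the extracted number gives the order a human reader would expect.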
4. Test, Test, Test
What good is code without tests to verify its functionality? Let’s set up a quick unit test to ensure it all works as we planned:
List<String> testStrings =
  Arrays.asList("a1", "d2.2", "b3", "d2.3.3d", "c4", "d2.f4"); // 1, 2.2, 3, 0, 4, 2.4
testStrings.sort(NaturalOrderComparators.createNaturalOrderRegexComparator());
List<String> expected = Arrays.asList("d2.3.3d", "a1", "d2.2", "d2.f4", "b3", "c4");
assertEquals(expected, testStrings);
In this unit test, we’ve packed in all of the scenarios we’ve planned for. Invalid numbers, integers, decimals, and letter-separated numbers are all included in our testStrings variable.
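Empty and letter-only Strings can be checked in the same spirit. The sketch below uses plain Java rather than a test framework so that it runs on its own; the class EdgeCaseCheck and the helper sortByNumber are hypothetical names. It relies on List.sort being stable, so inputs that all map to 0 keep their original relative order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class EdgeCaseCheck {

    static double parseStringToNumber(String input) {
        String digitsOnly = input.replaceAll("[^\\d.]", "");
        if ("".equals(digitsOnly)) return 0;
        try {
            return Double.parseDouble(digitsOnly);
        } catch (NumberFormatException nfe) {
            return 0;
        }
    }

    // Hypothetical helper: returns a sorted copy and leaves the input untouched
    static List<String> sortByNumber(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Comparator.comparingDouble(EdgeCaseCheck::parseStringToNumber));
        return copy;
    }

    public static void main(String[] args) {
        // "", "abc" and "." all evaluate to 0, so they sort first in input order
        List<String> sorted = sortByNumber(Arrays.asList("b2", "", "abc", "a1", "."));
        System.out.println(sorted); // [, abc, ., a1, b2]
    }
}
```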
5. Conclusion
In this short article, we’ve demonstrated how to sort alphanumeric strings based on the numbers within them, using a regular expression to strip out everything that isn’t part of a number.
We’ve handled standard exceptions that may occur when parsing input strings and tested the different scenarios with unit testing.
As always, the code can be found over on GitHub.