1. Overview
In Java, we often need to mask a String, for example to hide sensitive information printed in log files or displayed to the user.
In this tutorial, we’ll explore how to accomplish this using a few simple Java techniques.
2. Introduction to the Problem
In many situations, we might need to mask sensitive information, such as credit card numbers, social security numbers, or even email addresses. A common way to do this is to hide all but the last few characters of the String.
As usual, examples help us understand a problem quickly. Let’s say we have three sensitive String values:
static final String INPUT_1 = "a b c d 1234";
static final String INPUT_2 = "a b c d     "; // ends with five spaces
static final String INPUT_3 = "a b";
Now, we want to mask all characters in these String values except the last N. For simplicity, let’s take N=4 and mask each character using an asterisk (*) in this tutorial. Therefore, the expected results are:
static final String EXPECTED_1 = "********1234";
static final String EXPECTED_2 = "********    "; // eight asterisks, then four spaces
static final String EXPECTED_3 = "a b";
As we can see, *if an input String’s length is less than or equal to N (4), we skip masking it*. Further, we treat whitespace characters the same as regular characters.
Next, we’ll take these String inputs as examples and use different approaches to mask them to get the expected result. As usual, we’ll leverage unit test assertions to verify if each solution works correctly.
3. Using char Array
We know that a String is composed of a sequence of chars. Therefore, we can convert the String input to a char array and then apply the masking logic to the char array:
// Requires: import java.util.Arrays;
String maskByCharArray(String input) {
    if (input.length() <= 4) {
        return input;
    }
    char[] chars = input.toCharArray();
    // Overwrite everything except the last four chars with '*'
    Arrays.fill(chars, 0, chars.length - 4, '*');
    return new String(chars);
}
Next, let’s walk through the implementation and understand how it works.
First, we check if the input’s length is less than or equal to 4. If so, no masking is applied, and we return the input String as is.
Then, we convert the input String to a char[] using the toCharArray() method. Next, we leverage Arrays.fill() to mask the characters. Arrays.fill() allows us to define which part of the input char[] needs to be filled with a specific character. In this case, we only want to fill (mask) the first chars.length - 4 characters of the array.
Finally, after masking the required portion of the char array, we convert it back into a String using new String(chars) and return the result.
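To make the range semantics concrete, here is a minimal sketch showing that Arrays.fill(array, fromIndex, toIndex, value) fills from fromIndex (inclusive) up to toIndex (exclusive). The class name FillRangeSketch and the helper maskPrefix() are hypothetical names used only for illustration:

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the tutorial's code base.
final class FillRangeSketch {

    // Masks the first `count` chars of `input`; illustrative helper only.
    static String maskPrefix(String input, int count) {
        char[] chars = input.toCharArray();
        // fromIndex (0) is inclusive, toIndex (count) is exclusive
        Arrays.fill(chars, 0, count, '*');
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(maskPrefix("abcdef", 2)); // **cdef
        System.out.println(maskPrefix("abcdef", 0)); // abcdef (empty range, nothing is filled)
    }
}
```

This also suggests why the length check comes first: for an input shorter than four characters, chars.length - 4 would be negative, and Arrays.fill() throws an IllegalArgumentException when fromIndex is greater than toIndex.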
Next, let’s test if this solution works as expected:
assertEquals(EXPECTED_1, maskByCharArray(INPUT_1));
assertEquals(EXPECTED_2, maskByCharArray(INPUT_2));
assertEquals(EXPECTED_3, maskByCharArray(INPUT_3));
As the test shows, it masks our three inputs correctly.
4. Two Substrings
Our requirement is to mask all characters in the given String except the last four. In other words, we can divide the input String into two substrings: a substring to be masked (toMask), and a substring for the last four characters to keep plain (keepPlain).
Then, we can simply replace all of the characters in the toMask substring with ‘*’ and join the two substrings together to form the final result:
String maskBySubstring(String input) {
    if (input.length() <= 4) {
        return input;
    }
    String toMask = input.substring(0, input.length() - 4);
    String keepPlain = input.substring(input.length() - 4);
    return toMask.replaceAll(".", "*") + keepPlain;
}
As the code shows, similar to the char array approach, we first handle the case when the input’s length() <= 4.
Then we use the substring() method to extract two substrings: toMask and keepPlain.
We ask replaceAll() to mask toMask by replacing any character “.” with “*”. It’s important to note that the “.” parameter here is a regular expression (regex) to match any character rather than a literal period character.
Finally, we concatenate the masked portion with the unmasked portion (keepPlain) and return the result.
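One caveat is worth illustrating: by default, the regex “.” doesn't match line terminators such as \n, so a multi-line toMask portion keeps its line breaks unmasked. The sketch below compares the pattern used above with a variant that enables the embedded (?s) (DOTALL) flag. The class and method names are hypothetical, and whether line breaks should be masked at all depends on the requirements:

```java
// Hypothetical demo class comparing "." with "(?s)." in replaceAll().
final class DotAllSketch {

    // Same logic as maskBySubstring(): "." skips line terminators.
    static String maskDefault(String input) {
        if (input.length() <= 4) {
            return input;
        }
        String toMask = input.substring(0, input.length() - 4);
        String keepPlain = input.substring(input.length() - 4);
        return toMask.replaceAll(".", "*") + keepPlain;
    }

    // Variant: the embedded (?s) flag lets "." match line terminators, too.
    static String maskDotAll(String input) {
        if (input.length() <= 4) {
            return input;
        }
        String toMask = input.substring(0, input.length() - 4);
        String keepPlain = input.substring(input.length() - 4);
        return toMask.replaceAll("(?s).", "*") + keepPlain;
    }

    public static void main(String[] args) {
        String input = "ab\ncd1234";
        System.out.println(maskDefault(input).contains("\n")); // true: the line break survives
        System.out.println(maskDotAll(input));                 // *****1234
    }
}
```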
This method passes our tests, too:
assertEquals(EXPECTED_1, maskBySubstring(INPUT_1));
assertEquals(EXPECTED_2, maskBySubstring(INPUT_2));
assertEquals(EXPECTED_3, maskBySubstring(INPUT_3));
As we can see, this approach is a neat and concise solution to this problem.
5. Using Regex
In the two-substrings solution, we used replaceAll(). We also mentioned that this method supports regex. In fact, by using replaceAll() and a clever regex, we can solve this masking problem in a single step.
Next, let’s see how this is done:
String maskByRegex(String input) {
    if (input.length() <= 4) {
        return input;
    }
    return input.replaceAll(".(?=.{4})", "*");
}
In this example, apart from handling the input.length() <= 4 case, we apply the masking logic with a single replaceAll() call. Next, let’s understand the magic: the regex “.(?=.{4})”.
The (?=.{4}) part is a positive lookahead assertion. It lets the preceding “.” match only when at least four more characters follow it, without consuming those characters.
Simply put, the regex looks for any character (.) that is followed by at least four characters (.{4}) and replaces it with “*”. The last four characters are never followed by four more, so they never match, and only the characters before them are masked.
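The lookahead can also be checked in isolation. The following sketch shows that the regex on its own leaves inputs of four or fewer characters untouched, even without the length check. The class name is hypothetical, and the precompiled Pattern is an assumed optimization rather than something taken from the original code:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
final class LookaheadSketch {

    // Compiled once and reused; String.replaceAll() compiles its regex on every call.
    private static final Pattern ALL_BUT_LAST_FOUR = Pattern.compile(".(?=.{4})");

    // No explicit length check: short inputs simply produce no match.
    static String mask(String input) {
        return ALL_BUT_LAST_FOUR.matcher(input).replaceAll("*");
    }

    public static void main(String[] args) {
        System.out.println(mask("a b c d 1234")); // ********1234
        System.out.println(mask("abcde"));        // *bcde
        System.out.println(mask("abcd"));         // abcd (no character has four followers)
    }
}
```

Keeping the explicit length check, as maskByRegex() does, still has a benefit: it skips the regex engine entirely for short inputs.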
If we test it with our inputs, we get the expected result:
assertEquals(EXPECTED_1, maskByRegex(INPUT_1));
assertEquals(EXPECTED_2, maskByRegex(INPUT_2));
assertEquals(EXPECTED_3, maskByRegex(INPUT_3));
The regex approach handles the masking with a single replaceAll() call, which makes it a good fit when concise code matters.
6. Using the repeat() Method
Since Java 11, the repeat() method has joined the String family. It allows us to create a String value by repeating a String a given number of times. If we work with Java 11 or later, we can use repeat() to solve the masking problem:
String maskByRepeat(String input) {
    if (input.length() <= 4) {
        return input;
    }
    int maskLen = input.length() - 4;
    return "*".repeat(maskLen) + input.substring(maskLen);
}
In this method, we first calculate the mask length (maskLen): input.length() - 4. Then, we directly repeat the masking character for the required length and concatenate it with the unmasked substring, forming the final result.
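For projects that still run on Java 8, where repeat() isn't available, a similar result can be sketched with Collections.nCopies() and String.join(). This is an assumed alternative, not one of the original examples, and the class and method names are hypothetical:

```java
import java.util.Collections;

// Hypothetical demo class: a pre-Java 11 stand-in for "*".repeat(maskLen).
final class RepeatAlternativeSketch {

    static String maskWithoutRepeat(String input) {
        if (input.length() <= 4) {
            return input;
        }
        int maskLen = input.length() - 4;
        // nCopies() creates an immutable list of maskLen "*" strings; join() concatenates them
        String mask = String.join("", Collections.nCopies(maskLen, "*"));
        return mask + input.substring(maskLen);
    }

    public static void main(String[] args) {
        System.out.println(maskWithoutRepeat("a b c d 1234")); // ********1234
        System.out.println(maskWithoutRepeat("a b"));          // a b
    }
}
```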
As usual, let’s test this approach using our input Strings:
assertEquals(EXPECTED_1, maskByRepeat(INPUT_1));
assertEquals(EXPECTED_2, maskByRepeat(INPUT_2));
assertEquals(EXPECTED_3, maskByRepeat(INPUT_3));
As the test shows, the repeat() approach does the job.
7. Conclusion
In this article, we’ve explored different ways to mask a String while keeping the last four characters visible.
It’s worth noting that although we picked the ‘*’ character to mask sensitive information and kept the last four (N = 4) characters visible in this tutorial, these methods can be easily adjusted to suit different requirements, for example, a different mask character or N value.
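As a rough illustration of such an adjustment, the char array approach could be parameterized as follows. The class name, the method signature, and the choices to return null inputs unchanged and to reject a negative count are all assumptions made for this sketch:

```java
import java.util.Arrays;

// Hypothetical demo class: the char array approach with a configurable N and mask character.
final class ConfigurableMaskSketch {

    // `visible` is the number of trailing chars to keep plain (N in this tutorial).
    static String mask(String input, int visible, char maskChar) {
        if (visible < 0) {
            throw new IllegalArgumentException("visible must not be negative");
        }
        // Assumption: null and short inputs are returned as is
        if (input == null || input.length() <= visible) {
            return input;
        }
        char[] chars = input.toCharArray();
        Arrays.fill(chars, 0, chars.length - visible, maskChar);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(mask("a b c d 1234", 4, '*')); // ********1234
        System.out.println(mask("secret", 2, '#'));       // ####et
    }
}
```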
As always, the complete source code for the examples is available over on GitHub.