1. Overview
Java 21 introduced a new set of methods in the java.lang.Character class to provide better support for emojis. These methods allow us to easily check if a character is an emoji and to check the properties and characteristics of the emoji characters.
In this tutorial, we’ll explore the newly added methods and walk through the key concepts related to emoji handling in Java 21.
2. Character API Updates
Java 21 introduced six new methods in the java.lang.Character class related to emoji handling. All of the new methods are static, take an int representing the Unicode code point of a character as a parameter, and return a boolean response.
A Unicode code point is a unique numeric value assigned to each character in the Unicode standard. It represents a specific character across different platforms and languages. For example, the code point U+0041 represents the letter “A” which is 0x0041 in the hexadecimal form.
The Unicode Consortium, a non-profit corporation, maintains and develops the Unicode standard and provides a full list of emojis and their corresponding Unicode code points.
Now, let’s take a closer look at each of these new emoji-related methods.
2.1. isEmoji()
The isEmoji(int codePoint) method is the most basic of the new emoji methods. It takes an int value representing a character’s Unicode code point and returns a boolean indicating whether the character is an emoji or not.
Let’s take a look at its usage:
String messageWithEmoji = "Hello Java 21! 😄";
String messageWithoutEmoji = "Hello Java!";
assertTrue(messageWithEmoji.codePoints().anyMatch(Character::isEmoji));
assertFalse(messageWithoutEmoji.codePoints().anyMatch(Character::isEmoji));
2.2. isEmojiPresentation()
The isEmojiPresentation(int codePoint) method determines whether a character should render as an emoji or not. Certain characters, like digits (0-9) and currency symbols ($ or €), can be rendered as either an emoji or a text character depending on the context.
Here’s a code snippet that demonstrates the usage of this method:
String emojiPresentationMessage = "Hello Java 21! 🔥😄";
String nonEmojiPresentationMessage = "Hello Java 21!";
assertTrue(emojiPresentationMessage.codePoints().anyMatch(Character::isEmojiPresentation));
assertFalse(nonEmojiPresentationMessage.codePoints().anyMatch(Character::isEmojiPresentation));
2.3. isEmojiModifier()
The isEmojiModifier(int codePoint) method checks if a character is an emoji modifier. Emoji modifiers are characters that can modify the appearance of an existing emoji, such as applying skin tone variations.
Let’s see how we can use this method to detect emoji modifiers:
assertTrue(Character.isEmojiModifier(0x1F3FB)); // light skin tone
assertTrue(Character.isEmojiModifier(0x1F3FD)); // medium skin tone
assertTrue(Character.isEmojiModifier(0x1F3FF)); // dark skin tone
In this test, we use the hexadecimal form of Unicode code points, e.g., 0x1F3FB, instead of the actual emoji characters because emoji modifiers don’t typically render as standalone emojis and lack visual distinction.
2.4. isEmojiModifierBase()
The isEmojiModifierBase(int codePoint) method determines whether an emoji modifier can modify a given character. This method helps to identify the emojis that support modifications, as not all emojis have this capability.
Let’s look at some examples to understand this better:
assertTrue(Character.isEmojiModifierBase(Character.codePointAt("👍", 0)));
assertTrue(Character.isEmojiModifierBase(Character.codePointAt("👶", 0)));
assertFalse(Character.isEmojiModifierBase(Character.codePointAt("🍕", 0)));
We see that the thumbs up emoji “👍” and the baby emoji “👶” are valid emoji modifiers bases, and can be used to express diversity by applying skin tone variations to change their appearance.
On the other hand, the pizza emoji “🍕” does not qualify as a valid emoji modifier base, since it’s a standalone emoji that represents an object rather than a character or symbol that can have its appearance modified.
2.5. isEmojiComponent()
The isEmojiComponent(int codePoint) method checks if a character can be used as a component to create a new emoji character. These characters usually combine with other characters to form new emojis, rather than appearing as standalone emojis.
For example, the Zero Width Joiner (ZWJ) character is a non-printing character that indicates to the rendering system that the adjacent characters should be displayed as a single emoji. By combining the emojis of a man “👨” (0x1F468) and a rocket “🚀” (0x1F680) using the zero width joiner character (0x200D), we can create a new emoji of an astronaut “👨🚀”. We can use a Unicode code converter site to test this out with the input: 0x1F4680x200D0x1F680.
The skin tone characters are also emoji components. We can combine the dark skin tone character (0x1F3FF) with a waving hand emoji “👋” (0x1F44B) to create a waving hand with dark skin tone “👋🏿” (0x1F44B0x1F3FF). Since we’re modifying the appearance of an existing emoji instead of creating a new one, we don’t need to use a ZWJ character for skin tone changes.
Let’s take a look at its usage and detect emoji components in our code:
assertTrue(Character.isEmojiComponent(0x200D)); // Zero width joiner
assertTrue(Character.isEmojiComponent(0x1F3FD)); // medium skin tone
2.6. isExtendedPictographic()
The isExtendedPictographic(int codePoint) method checks if a character is part of the broader category of pictographic symbols, which includes not only traditional emojis but also other symbols that are often rendered by text processing systems differently.
Objects, animals, and other graphical symbols have the extended pictographic property. While not always considered typical emojis, they need to be recognised and processed as part of the emoji set to ensure proper display.
Here’s an example that demonstrates the usage of this method:
assertTrue(Character.isExtendedPictographic(Character.codePointAt("☀️", 0))); // Sun with rays
assertTrue(Character.isExtendedPictographic(Character.codePointAt("✔️", 0))); // Checkmark
Both of the above code points return false if passed to the isEmojiPresentation() method, as they’re part of the broader Extended Pictographic category but don’t have the Emoji Presentation property.
3. Emoji Support in Regular Expressions
In addition to the new emoji methods, Java 21 also introduced emoji support in regular expressions. We can now use the \p{IsXXXX} construct to match characters based on their emoji properties.
Let’s take an example to search for any emoji character in a string using regular expressions:
String messageWithEmoji = "Hello Java 21! 😄";
Matcher isEmojiMatcher = Pattern.compile("\\p{IsEmoji}").matcher(messageWithEmoji);
assertTrue(isEmojiMatcher.find());
String messageWithoutEmoji = "Hello Java!";
isEmojiMatcher = Pattern.compile("\\p{IsEmoji}").matcher(messageWithoutEmoji);
assertFalse(isEmojiMatcher.find());
Similarly, we can use the other emoji constructs in our regular expressions:
- IsEmoji_Presentation
- IsEmoji_Modifier
- IsEmoji_Modifier_Base
- IsEmoji_Component
- IsExtended_Pictographic
It’s important to note that the regex property constructs use the snakecase format to reference the methods. This is different from the lower camelcase format that the static methods in the Character class use.
These regular expression constructs provide a clean and easy way to search for and manipulate emoji characters in strings.
4. Conclusion
In this article, we explored the new emoji-related methods introduced in Java 21’s Character class. We understood the behaviour of these static methods by looking at various examples.
We’ve also discussed the newly added emoji support in regular expressions to search for emoji characters and their properties by using the Pattern class.
These new features can help us when we’re building a chat application, a social media platform, or any other type of application that uses emojis.
As always, all the code examples used in this article are available over on GitHub.