1. Overview
String is one of the most commonly used classes in the Java language.
In this quick article, we’ll explore the Java String Pool — the special memory region where Strings are stored by the JVM.
2. String Interning
Thanks to the immutability of Strings in Java, the JVM can optimize the amount of memory allocated for them by storing only one copy of each literal String in the pool. This process is called interning.
When we assign a String literal to a variable, the JVM searches the pool for a String of equal value.
If found, the JVM will simply return a reference to the existing object, without allocating additional memory.
If not found, it’ll be added to the pool (interned) and its reference will be returned.
Let’s write a small test to verify this:
String constantString1 = "Baeldung";
String constantString2 = "Baeldung";
assertThat(constantString1)
.isSameAs(constantString2);
3. Strings Allocated Using the Constructor
When we create a String via the new operator, the JVM will create a new object at runtime and store it in the heap as a regular object, separate from any pooled instance.
Every String created like this will point to a different memory region with its own address.
Let’s see how this is different from the previous case:
String constantString = "Baeldung";
String newString = new String("Baeldung");
assertThat(constantString).isNotSameAs(newString);
4. String Literal vs String Object
When we create a String object using the new operator, the JVM always creates a new object in heap memory. On the other hand, if we create an object using String literal syntax, e.g. "Baeldung", the JVM may return an existing object from the String pool. Otherwise, it will create a new String object and put it in the String pool for future reuse.
At a high level, both are String objects, but the main difference is that the new operator always creates a new String object, whereas a String created from a literal is interned.
This becomes clearer when we compare String objects created using String literals and the new operator:
String first = "Baeldung";
String second = "Baeldung";
System.out.println(first == second); // true
In this example, the String objects will have the same reference.
Next, let’s create two different objects using new and check that they have different references:
String third = new String("Baeldung");
String fourth = new String("Baeldung");
System.out.println(third == fourth); // false
Similarly, when we use the == operator to compare a String literal with a String object created using the new operator, it will return false:
String fifth = "Baeldung";
String sixth = new String("Baeldung");
System.out.println(fifth == sixth); // false
In general, we should use the String literal notation when possible. It is easier to read and it gives the compiler a chance to optimize our code.
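As a rough illustration of the kind of optimization the compiler can apply to literals, the sketch below contrasts a concatenation of two literals, which is a compile-time constant expression, with a concatenation that involves a method parameter. The class and method names are made up for this example.

```java
class ConstantFoldingDemo {

    // Both operands are literals, so this is a compile-time constant expression.
    // The compiler folds it into the single literal "Baeldung", which is pooled.
    static boolean foldedConstantIsPooled() {
        String folded = "Bael" + "dung";
        return folded == "Baeldung";
    }

    // The parameter is not a compile-time constant, so the concatenation happens
    // at runtime and yields a newly created object that is not the pooled one.
    static boolean runtimeConcatIsPooled(String suffix) {
        String joined = "Bael" + suffix;
        return joined == "Baeldung";
    }

    public static void main(String[] args) {
        System.out.println(foldedConstantIsPooled());      // true
        System.out.println(runtimeConcatIsPooled("dung")); // false
    }
}
```

Both Strings are equal according to equals(); only the reference comparison differs, which is why == should not be used to compare String contents.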
5. Manual Interning
We can manually intern a String in the Java String Pool by calling the intern() method on the object we want to intern.
Manually interning the String will store its reference in the pool, and the JVM will return this reference when needed.
Let’s create a test case for this:
String constantString = "interned Baeldung";
String newString = new String("interned Baeldung");
assertThat(constantString).isNotSameAs(newString);
String internedString = newString.intern();
assertThat(constantString)
.isSameAs(internedString);
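To make the effect of intern() a bit more tangible, here is a minimal sketch that counts how many distinct objects a list of equal Strings holds before and after interning. The class InternDedupDemo and its helper methods are hypothetical names introduced only for this example; the counting relies on an identity-based Set from the standard library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class InternDedupDemo {

    // Counts distinct objects by identity (==), not by equals().
    static int distinctInstances(List<String> values) {
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        identities.addAll(values);
        return identities.size();
    }

    // Replaces every element with its canonical instance from the String pool.
    static List<String> internAll(List<String> values) {
        List<String> result = new ArrayList<>(values.size());
        for (String value : values) {
            result.add(value.intern());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> raw = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            raw.add(new String("interned Baeldung")); // three separate heap objects
        }
        System.out.println(distinctInstances(raw));            // 3
        System.out.println(distinctInstances(internAll(raw))); // 1
    }
}
```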
6. Garbage Collection
Before Java 7, the JVM placed the Java String Pool in the PermGen space, which has a fixed size — it can’t be expanded at runtime and is not eligible for garbage collection.
The risk of interning Strings in the PermGen (instead of the Heap) is that we can get an OutOfMemoryError from the JVM if we intern too many Strings.
From Java 7 onwards, the Java String Pool is stored in the Heap space, which is garbage collected by the JVM. The advantage of this approach is the reduced risk of an OutOfMemoryError, because unreferenced Strings will be removed from the pool, thereby releasing memory.
7. Performance and Optimizations
In Java 6, the only optimization we can perform is increasing the PermGen space during the program invocation with the MaxPermSize JVM option:
-XX:MaxPermSize=1G
In Java 7, we have more detailed options to examine and expand/reduce the pool size. Let’s see the two options for viewing the pool size:
-XX:+PrintFlagsFinal
-XX:+PrintStringTableStatistics
If we want to increase the pool size in terms of buckets, we can use the StringTableSize JVM option:
-XX:StringTableSize=4901
Prior to Java 7u40, the default pool size was 1009 buckets, but this value has changed in more recent Java versions. To be precise, the default pool size from Java 7u40 until Java 11 was 60013, and it has since increased to 65536.
Note that increasing the pool size will consume more memory but has the advantage of reducing the time required to insert the Strings into the table.
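Besides the command-line flags, the configured table size can also be read from inside a running program. The following is a sketch that assumes a HotSpot-based JVM, since it relies on the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean; on other JVMs the bean or the option may not be available. The class name StringTableSizeDemo is made up for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class StringTableSizeDemo {

    // Reads the current value of the StringTableSize VM option.
    // Assumption: we are running on a HotSpot JVM that exposes this option.
    static String currentStringTableSize() {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption("StringTableSize").getValue();
    }

    public static void main(String[] args) {
        // Prints the number of buckets, e.g. the default or the value
        // passed via -XX:StringTableSize.
        System.out.println("StringTableSize = " + currentStringTableSize());
    }
}
```

Starting the program with a different -XX:StringTableSize value should change the number that is printed.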
8. A Note About Java 9
Until Java 8, Strings were internally represented as an array of characters – char[], encoded in UTF-16, so that every character uses two bytes of memory.
With Java 9, a new representation is provided, called Compact Strings. This new format stores the characters in a byte[] and chooses the appropriate encoding, Latin-1 or UTF-16, depending on the stored content.
Since the new String representation will use the UTF-16 encoding only when necessary, Strings that contain only Latin-1 characters can take up significantly less heap memory, which in turn causes less Garbage Collector overhead on the JVM.
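The saving can be approximated with a little arithmetic. The sketch below does not inspect the internal storage of a String; it only encodes the same text as Latin-1 and as UTF-16 and compares the resulting byte counts, which mirrors the one-byte versus two-byte cost per character. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class CompactStringSizeDemo {

    // Number of bytes needed when every character fits in one Latin-1 byte.
    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Number of bytes needed in UTF-16 (big-endian, so no byte order mark is added).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String text = "Baeldung";
        System.out.println(latin1Bytes(text)); // 8
        System.out.println(utf16Bytes(text));  // 16
    }
}
```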
9. Conclusion
In this guide, we showed how the JVM and the Java compiler optimize memory allocations for String objects via the Java String Pool.
All code samples used in the article are available over on GitHub.