1. Overview
In this article, we’ll look at Project Valhalla – the historical reasons for it, the current state of development and what it brings to the table for the day-to-day Java developer once it’s released.
2. Motivation and Reasons for the Valhalla Project
In one of his talks, Brian Goetz, Java language architect at Oracle, said one of the main motivations for the Valhalla Project is the desire to adapt the Java language and runtime to modern hardware. When the Java language was conceived (roughly 25 years ago at the time of writing), the cost of a memory fetch and an arithmetic operation was roughly the same.
Nowadays, this has shifted, with memory fetch operations being from 200 to 1,000 times more expensive than arithmetic operations. In terms of language design, this means that indirections leading to pointer fetches have a detrimental effect on overall performance.
Since most Java data structures in an application are objects, we can consider Java a pointer-heavy language (although we usually don’t see or manipulate them directly). This pointer-based implementation of objects is used to enable object identity, which itself is leveraged for language features such as polymorphism, mutability and locking. Those features come by default for every object, no matter if they are really needed or not.
Following the chain of identity leading to pointers and pointers leading to indirections, with indirections having performance drawbacks, a logical conclusion is to remove those for data structures that have no need for them. This is where value types come into play, but first, let’s have a look at the regular class.
3. Regular Class
Currently, in Java, a regular class looks like:
class Point {
    int x;
    int y;
}
with a memory layout of:
3.1. Negative Effect of Indirection When Using the Regular Class
In the following diagram, we demonstrate the negative effect of indirections when we use the regular Point class in an array.
Each array slot holds a heap reference to a Point object, which is costly in both performance and memory terms. Accessing an array element means following that reference, so every access pays for an indirection. Moreover, the referenced objects may be scattered across the heap, so iterating over a large array can cause frequent cache misses. Additionally, each Point carries its own object header, which increases the memory footprint.
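To make the cost of the indirection concrete, here is a minimal sketch in standard Java that contrasts an array of Point references with a hand-flattened layout. The class name ArrayLayoutSketch, the two sum methods and the interleaved int[] are assumptions made for illustration; they are not part of Project Valhalla, and flattening by hand like this is only an approximation of what the JVM could do automatically.

```java
class ArrayLayoutSketch {

    // Regular identity class, as in the article (constructor added for convenience).
    static class Point {
        int x;
        int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Array of references: every element access follows a pointer
    // to a separately allocated Point object on the heap.
    static long sumViaReferences(Point[] points) {
        long sum = 0;
        for (Point p : points) {
            sum += p.x + p.y;
        }
        return sum;
    }

    // Hand-flattened layout (illustrative only): x and y are interleaved
    // in one contiguous int[], so no pointer has to be followed.
    static long sumFlattened(int[] xy) {
        long sum = 0;
        for (int i = 0; i < xy.length; i += 2) {
            sum += xy[i] + xy[i + 1];
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 4;
        Point[] points = new Point[n];
        int[] xy = new int[2 * n];
        for (int i = 0; i < n; i++) {
            points[i] = new Point(i, 2 * i);
            xy[2 * i] = i;
            xy[2 * i + 1] = 2 * i;
        }
        System.out.println(sumViaReferences(points)); // prints 18
        System.out.println(sumFlattened(xy));         // prints 18
    }
}
```

Both methods compute the same result; the difference lies only in how the data is laid out in memory. The flattened version gives up the Point abstraction entirely, which is the trade-off Valhalla tries to remove.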
In response to this, Project Valhalla introduces the Value classes.
4. Value Class
Value classes are still reference types, like regular classes, and their instances are stored in the heap space. Both the classes and their fields are implicitly final.
The difference is that the Value class is encoded directly with its field values, minimizing the cost from inner object headers, heap allocations and indirections.
This allows the JVM to flatten value types into arrays and objects, as well as into other value types, to an extent.
The idea of value types is to represent pure data aggregates. This comes with dropping the features of regular objects, so we have pure data without identity. This means, of course, that we’re also losing features we could implement using object identity. Consequently, equality comparison can only happen based on state. We also can’t use representational polymorphism, and we can’t mutate such objects or synchronize on them.
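State-based equality can already be approximated in standard Java. The following sketch assumes Java 16 or later and uses a record as a stand-in for a pure data aggregate; the names StateEqualitySketch, IdentityPoint and ValueLikePoint are hypothetical. Note that a record still has identity, so this only illustrates the equality semantics, not the memory layout of a value type.

```java
class StateEqualitySketch {

    // Regular class without equals(): equality falls back to object identity.
    static class IdentityPoint {
        final int x;
        final int y;

        IdentityPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Record: equals() and hashCode() are derived from the component values.
    record ValueLikePoint(int x, int y) {}

    public static void main(String[] args) {
        // Two distinct objects with the same state are not equal by default.
        System.out.println(new IdentityPoint(1, 2).equals(new IdentityPoint(1, 2)));   // prints false

        // With state-based equality, the same state means equal.
        System.out.println(new ValueLikePoint(1, 2).equals(new ValueLikePoint(1, 2))); // prints true
    }
}
```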
The code and the corresponding memory layout of a Value Point class would be:
value class Point {
    int x;
    int y;
}
Value classes have a similar layout to the regular classes we saw above, but they improve on it: the JVM can nest one object within another, which allows more efficient storage in arrays and in memory. Also, since Value objects are immutable, a modification creates a new instance, in contrast with regular objects, where modifications take place in the same instance.
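The "modification creates a new instance" behaviour can be sketched with an ordinary immutable class in today's Java. ImmutablePoint, its withX method and ImmutableUpdateSketch are hypothetical names chosen for this illustration; a real Value class would get its final fields implicitly.

```java
// Immutable stand-in for a Value class: all fields are final.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() {
        return x;
    }

    int y() {
        return y;
    }

    // A "modification" returns a fresh instance; the original stays untouched.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }
}

class ImmutableUpdateSketch {
    public static void main(String[] args) {
        ImmutablePoint[] points = { new ImmutablePoint(1, 2) };
        ImmutablePoint original = points[0];

        // Updating the array slot means storing a new instance in it.
        points[0] = original.withX(10);

        System.out.println(original.x());  // prints 1
        System.out.println(points[0].x()); // prints 10
    }
}
```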
Still, there is room for improvement. Value classes are a middle ground between objects and primitives, and they present some drawbacks when used as fields or in arrays:
- A variable of a Value type could be null, so we require additional bits to encode null.
- Like any reference, a variable of a Value type must be modified atomically, so it is often not practical to inline Value objects.
Project Valhalla aims to tackle the above by introducing Primitive classes.
5. Primitive Classes
Primitive classes will offer some performance benefits over Value classes. Like the built-in primitives, their instances are not accessed through references: they can live on the stack or be inlined directly into fields and arrays. They also can’t be null, and they don’t have to be modified atomically, which is a performance-intensive guarantee. Moreover, we can consider their instances as composites of primitive types that additionally have utility methods.
This is a Primitive type Point[]:
This also enables the JVM to pass value types on the stack instead of having to allocate them on the heap. In the end, this means that we’re getting data aggregates that have runtime behaviour similar to Java primitives, such as int or float.
But unlike primitives, value types can have methods and fields. We can also implement interfaces and use them as generic types. So we can look at the value types from two different angles:
- Faster objects
- User-defined primitives
As additional icing on the cake, we can use value types as generic types without boxing. This directly leads us to the other big Project Valhalla feature: specialized generics, explained later in the article.
6. Classes for the Basic Primitives
Since Java 5, primitive values can be stored in wrapper classes and presented as objects through boxing and unboxing. Still, there is room for improvement: wrapping primitive values in objects has measurable runtime costs, and boxing the same value twice may produce two objects that are not identical (==) to each other.
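The identity pitfall of boxing is easy to reproduce with the current wrapper classes. In the sketch below, BoxingSketch and its two helper methods are hypothetical names; the behaviour for values between -128 and 127 is guaranteed by the language specification, while the result of the last comparison depends on the JVM's Integer cache settings and is therefore not stated as a fixed output.

```java
class BoxingSketch {

    // Compares the wrapped values (state).
    static boolean sameState(Integer a, Integer b) {
        return a.equals(b);
    }

    // Compares the references (identity).
    static boolean sameIdentity(Integer a, Integer b) {
        return a == b;
    }

    public static void main(String[] args) {
        Integer small1 = 127;
        Integer small2 = 127;
        Integer big1 = 1000;
        Integer big2 = 1000;

        // Values in the range -128..127 are boxed to cached, identical objects.
        System.out.println(sameIdentity(small1, small2)); // prints true

        // State-based comparison works for any value.
        System.out.println(sameState(big1, big2));        // prints true

        // Outside the cached range this is typically false, because each
        // boxing operation may create a separate object.
        System.out.println(sameIdentity(big1, big2));
    }
}
```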
Project Valhalla aims to migrate the wrapper classes (java.lang.Integer, java.lang.Double, etc.) to primitive classes.
As stated in JEP 401, this eliminates most of the overhead of modelling primitive values with classes. As a result, it is now practical to treat the basic primitives as class types, gaining all the capabilities of classes and delegating many details of these types to the standard library.
7. Enhanced Generics
When we want to use generics for language primitives, we currently use boxed types, such as Integer for int or Float for float. This boxing creates an additional layer of indirection, thereby defeating the purpose of using primitives for performance enhancement in the first place.
Therefore, we see many dedicated specializations for primitive types in existing frameworks and libraries, like IntStream.
So, enhanced generics is an effort to remove the need for those “hacks”. Instead, the Java language strives to enable generic types for basically everything: object references, primitives, value types, and maybe even void.
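The duplication that such specializations cause today can be seen in a small sketch. GenericsBoxingSketch and its two methods are hypothetical; the example only shows that the same logic currently has to be written once for the boxed generic type and once for the primitive, using the existing List and IntStream APIs.

```java
import java.util.List;
import java.util.stream.IntStream;

class GenericsBoxingSketch {

    // Generic collection: the elements are boxed Integer objects,
    // and each one is unboxed before it is added to the sum.
    static int sumBoxed(List<Integer> values) {
        int sum = 0;
        for (Integer value : values) {
            sum += value;
        }
        return sum;
    }

    // Primitive specialization: IntStream works directly on int values.
    static int sumPrimitive(int[] values) {
        return IntStream.of(values).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(List.of(1, 2, 3)));          // prints 6
        System.out.println(sumPrimitive(new int[] {1, 2, 3}));   // prints 6
    }
}
```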
8. Conclusion
We’ve taken a glimpse at the changes that Project Valhalla will bring to the Java language. Two of the main goals are enhanced performance and less leaky abstractions.
The performance enhancements are tackled by flattening object graphs and removing indirections. This leads to more efficient memory layouts and fewer allocations and garbage collections.
The better abstraction comes from primitives and objects behaving more similarly when used as generic types.
An early prototype of Project Valhalla, introducing value types into the existing type system, has the code name LW1.
We can find more information about Project Valhalla on the corresponding project page and JEPs: