1. Overview
One of the advantages of database abstraction layers, such as ORM (object-relational mapping) frameworks, is their ability to transparently cache data retrieved from the underlying store. This helps eliminate database-access costs for frequently accessed data.
Performance gains can be significant if the read/write ratios of cached content are high. This is especially true for entities that consist of large object graphs.
In this tutorial, we’ll explore the Hibernate second-level cache. We’ll cover some basic concepts and illustrate them with simple examples. We’ll use JPA, and fall back to the Hibernate native API only for those features that aren’t standardized in JPA.
2. What Is a Second-Level Cache?
As with most other fully equipped ORM frameworks, Hibernate has the concept of a first-level cache. It’s a session-scoped cache which ensures that each entity instance is loaded only once in the persistence context.
Once the session is closed, the first-level cache is terminated as well. This is actually desirable, as it allows for concurrent sessions to work with entity instances in isolation from each other.
Conversely, a second-level cache is SessionFactory-scoped, meaning it’s shared by all sessions created with the same session factory. When an entity instance is looked up by its id (either by application logic or by Hibernate internally, e.g., when it loads associations to that entity from other entities), and second-level caching is enabled for that entity, the following happens:
- If an instance is already present in the first-level cache, it’s returned from there.
- If an instance isn’t found in the first-level cache, and the corresponding instance state is cached in the second-level cache, then the data is fetched from there and an instance is assembled and returned.
- Otherwise, the necessary data are loaded from the database and an instance is assembled and returned.
Once the instance is stored in the persistence context (first-level cache), it’s returned from there in all subsequent calls within the same session until the session is closed, or the instance is manually evicted from the persistence context. The loaded instance state is also stored in the L2 cache if it wasn’t already there.
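The lookup order described above can be sketched with plain Java maps. This is only an illustrative model, not Hibernate’s implementation: the names LookupOrderSketch, SharedCache, and SessionSketch are hypothetical, entity state is reduced to a String, and the database is simulated by a function.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative model only: these classes are hypothetical and not part of Hibernate.
final class LookupOrderSketch {

    // Plays the role of the SessionFactory-scoped second-level cache shared by all sessions.
    static final class SharedCache {
        final Map<Long, String> stateById = new ConcurrentHashMap<>();
    }

    // Plays the role of a session with its own first-level cache.
    static final class SessionSketch {
        private final Map<Long, String> firstLevel = new HashMap<>();
        private final SharedCache secondLevel;
        private final Function<Long, String> database;
        int databaseHits = 0;

        SessionSketch(SharedCache secondLevel, Function<Long, String> database) {
            this.secondLevel = secondLevel;
            this.database = database;
        }

        String find(long id) {
            // 1. An instance already in the first-level cache is returned as-is.
            String instance = firstLevel.get(id);
            if (instance != null) {
                return instance;
            }
            // 2. Otherwise, try the state held in the second-level cache.
            String state = secondLevel.stateById.get(id);
            if (state == null) {
                // 3. Otherwise, load from the database and populate the second-level cache.
                databaseHits++;
                state = database.apply(id);
                secondLevel.stateById.put(id, state);
            }
            // The assembled instance now lives in this session's first-level cache.
            firstLevel.put(id, state);
            return state;
        }
    }

    public static void main(String[] args) {
        SharedCache sharedCache = new SharedCache();
        Function<Long, String> database = id -> "Foo#" + id;

        SessionSketch first = new SessionSketch(sharedCache, database);
        first.find(1L);
        first.find(1L); // served by the first-level cache

        SessionSketch second = new SessionSketch(sharedCache, database);
        second.find(1L); // served by the second-level cache

        System.out.println(first.databaseHits + " " + second.databaseHits); // prints "1 0"
    }
}
```

Only the first lookup in the first session reaches the simulated database. The second session never does, because the state is already in the shared cache.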
3. Region Factory
Hibernate second-level caching is designed to be unaware of the actual cache provider used. The framework only needs to be provided with an implementation of the org.hibernate.cache.spi.RegionFactory interface which encapsulates all the details specific to the actual cache providers. Basically, it acts as a bridge between Hibernate and cache providers.
In older versions (5.x) of Hibernate, the maintainers used to provide implementations of RegionFactories. We can find, for example, modules for specific stacks such as hibernate-ehcache or hibernate-infinispan.
Since the 6.x series, the standard way to plug in a second-level cache is through a JSR-107 (JCache) adapter. This approach frees the Hibernate maintainers from having to implement a RegionFactory for each cache provider.
In this article, we’ll see how to plug in Ehcache as the second-level cache. To accomplish that, we’ll need the following Maven dependencies:
<dependency>
    <groupId>org.hibernate.orm</groupId>
    <artifactId>hibernate-jcache</artifactId>
    <version>6.5.2.Final</version>
</dependency>
<dependency>
    <groupId>org.ehcache</groupId>
    <artifactId>ehcache</artifactId>
    <version>3.10.8</version>
    <classifier>jakarta</classifier>
</dependency>
The hibernate-jcache module brings a compatible hibernate-core as a transitive dependency. In the current release policy, both core and jcache versions will match. For instance, if we select hibernate-jcache-6.5.2.Final, then hibernate-core-6.5.2.Final will be used.
The Ehcache dependency must be declared with the jakarta classifier to be compatible with Hibernate 6.x series.
4. Enabling Second-Level Caching
With the following four properties, we’ll tell Hibernate that L2 caching is enabled, give it the name of the region factory implementation class, and point it to the Ehcache configuration file and the JCache provider:
hibernate.cache.use_second_level_cache=true
hibernate.cache.region.factory_class=org.hibernate.cache.jcache.internal.JCacheRegionFactory
hibernate.javax.cache.uri=ehcache.xml
hibernate.javax.cache.provider=org.ehcache.jsr107.EhcacheCachingProvider
For example, in persistence.xml, it would look like:
<properties>
    ...
    <property name="hibernate.cache.use_second_level_cache" value="true" />
    <property name="hibernate.cache.region.factory_class" value="org.hibernate.cache.jcache.internal.JCacheRegionFactory" />
    <property name="hibernate.javax.cache.uri" value="ehcache.xml" />
    <property name="hibernate.javax.cache.provider" value="org.ehcache.jsr107.EhcacheCachingProvider" />
    ...
</properties>
To disable second-level caching (say for debugging purposes), we just set the hibernate.cache.use_second_level_cache property to false.
The hibernate.javax.cache.uri property is used to configure and customize Ehcache. When no protocol is specified, Hibernate looks for the configuration on the classpath.
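If we prefer to supply these settings programmatically, we could collect them in a map first. The following is a minimal sketch that uses only the JDK. The helper class CacheSettingsSketch is hypothetical, and the resulting map would typically be handed to Persistence.createEntityManagerFactory(String, Map) together with a persistence unit name of our own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the four settings shown above into a map.
final class CacheSettingsSketch {

    static Map<String, Object> secondLevelCacheSettings(boolean enabled) {
        Map<String, Object> settings = new LinkedHashMap<>();
        // Toggle second-level caching (false is handy while debugging).
        settings.put("hibernate.cache.use_second_level_cache", Boolean.toString(enabled));
        // The JCache bridge between Hibernate and the cache provider.
        settings.put("hibernate.cache.region.factory_class",
                "org.hibernate.cache.jcache.internal.JCacheRegionFactory");
        // Ehcache configuration file, resolved on the classpath when no protocol is given.
        settings.put("hibernate.javax.cache.uri", "ehcache.xml");
        settings.put("hibernate.javax.cache.provider",
                "org.ehcache.jsr107.EhcacheCachingProvider");
        return settings;
    }

    public static void main(String[] args) {
        // Print the settings, for example to compare them with persistence.xml.
        secondLevelCacheSettings(true)
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```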
5. Making an Entity Cacheable
To make an entity eligible for second-level caching, we’ll annotate it with the Hibernate-specific @org.hibernate.annotations.Cache annotation, and specify a cache concurrency strategy.
Some developers consider it a good convention to add the standard @jakarta.persistence.Cacheable annotation as well (although not required by Hibernate), so an entity class implementation might look like this:
@Entity
@Cacheable
@org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Foo {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "ID")
    private long id;
    @Column(name = "NAME")
    private String name;
    // getters and setters
}
For each entity class, Hibernate uses a separate cache region to store the state of instances of that class. By default, the region name is the fully qualified class name.
For example, Foo instances are stored in a cache named com.baeldung.hibernate.cache.model.Foo in Ehcache.
We may also opt to share a cache among distinct entities by explicitly setting the region name. However, it isn’t recommended to place versioned and unversioned entities in the same region.
To verify that caching is working, we can write a quick test:
@Autowired
EntityManagerFactory emf;
// inside a test method
var cache = emf.getCache();
Foo foo = new Foo();
fooService.create(foo);
fooService.findOne(foo.getId());
// after the load, the entity state should be present in the second-level cache
var wasCached = cache.contains(Foo.class, foo.getId());
assertTrue(wasCached);
Here we use an instance of jakarta.persistence.Cache to verify that the com.baeldung.hibernate.cache.model.Foo cache isn’t empty after we load a Foo instance.
We could also enable the logging of SQL generated by Hibernate, and invoke fooService.findOne(foo.getId()) multiple times in the test to verify that the select statement for loading Foo is printed only once (the first time), meaning that in subsequent calls, the entity instance is fetched from the cache.
6. Cache Concurrency Strategy
Based on use cases, we’re free to pick one of the following cache concurrency strategies:
- READ_ONLY: Used only for entities that never change (an exception is thrown if an attempt to update such an entity is made). It’s very simple and performant, which makes it a good fit for static reference data.
- NONSTRICT_READ_WRITE: Cache is updated after the transaction that changed the affected data has been committed. Thus, strong consistency isn’t guaranteed, and there’s a small time window in which stale data may be obtained from the cache. This kind of strategy is suitable for use cases that can tolerate eventual consistency.
- READ_WRITE: This strategy guarantees strong consistency, which it achieves by using ‘soft’ locks. When a cached entity is updated, a soft lock is stored in the cache for that entity as well, which is released after the transaction is committed. All concurrent transactions that access soft-locked entries will fetch the corresponding data directly from the database.
- TRANSACTIONAL: Cache changes are done in distributed XA transactions. A change in a cached entity is either committed or rolled back in both the database and cache in the same XA transaction.
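To make the soft-lock idea behind READ_WRITE more concrete, here’s a heavily simplified sketch that uses only the JDK. The class SoftLockSketch and its method names are hypothetical, and Hibernate’s real implementation handles timestamps, versions, and lock timeouts that are left out here. The sketch only shows that readers bypass a soft-locked entry until the updating transaction commits.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Simplified, hypothetical model of soft locking; not Hibernate's actual implementation.
final class SoftLockSketch {

    // Marker stored in place of the entity state while an update is in flight.
    private static final Object SOFT_LOCK = new Object();

    private final Map<Long, Object> region = new ConcurrentHashMap<>();

    // Called after a load from the database; never overwrites a soft lock.
    void putFromLoad(long id, String state) {
        region.merge(id, state, (current, loaded) -> current == SOFT_LOCK ? current : loaded);
    }

    // Called when a transaction starts updating the entity.
    void lock(long id) {
        region.put(id, SOFT_LOCK);
    }

    // Called after the updating transaction has committed.
    void unlockAfterCommit(long id, String committedState) {
        region.put(id, committedState);
    }

    // An empty result tells the caller to read from the database instead.
    Optional<String> get(long id) {
        Object entry = region.get(id);
        if (entry == null || entry == SOFT_LOCK) {
            return Optional.empty();
        }
        return Optional.of((String) entry);
    }

    public static void main(String[] args) {
        SoftLockSketch cache = new SoftLockSketch();
        cache.putFromLoad(1L, "old name");
        cache.lock(1L);
        System.out.println(cache.get(1L).isPresent()); // false while the entry is soft-locked
        cache.unlockAfterCommit(1L, "new name");
        System.out.println(cache.get(1L).orElse("?")); // the committed state
    }
}
```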
7. Cache Management
If expiration and eviction policies aren’t defined, the cache could grow indefinitely and eventually consume all of the available memory. In most cases, Hibernate leaves cache management duties like these to cache providers, as they are indeed specific to each cache implementation.
For example, we could define the following Ehcache configuration to limit the maximum number of cached Foo instances to 1000:
<cache-template name="entities">
    <resources>
        <heap unit="entries">1000</heap>
    </resources>
</cache-template>
<cache
    alias="com.baeldung.hibernate.cache.model.Foo"
    uses-template="entities">
</cache>
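The effect of such an entry limit can be illustrated with a small bounded map. This sketch assumes least-recently-used eviction purely for illustration. Ehcache chooses which entries to evict with its own algorithm, and the class BoundedRegionSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded region; LRU eviction is an assumption made for illustration.
final class BoundedRegionSketch<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    BoundedRegionSketch(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, least recently used first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the limit is exceeded.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedRegionSketch<Long, String> region = new BoundedRegionSketch<>(1000);
        for (long id = 0; id <= 1000; id++) {
            region.put(id, "Foo#" + id);
        }
        System.out.println(region.size());           // 1000: the limit is never exceeded
        System.out.println(region.containsKey(0L));  // false: the oldest entry was evicted
    }
}
```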
Additionally, some setup details are worth noticing, especially when migrating from older Hibernate versions:
- By default, Hibernate will automatically create undeclared caches for associated regions on the fly.
- To override this default, we can set the property hibernate.javax.cache.missing_cache_strategy to fail. This may be the intended behavior when we want to enforce fine-grained control over the amount of memory dedicated to caching.
- The regions org.hibernate.cache.UpdateTimestampsCache and org.hibernate.cache.internal.StandardQueryCache, used by older Hibernate versions, were replaced by default-update-timestamps-region and default-query-results-region, respectively.
8. Collection Cache
Collections aren’t cached by default, and we need to explicitly mark them as cacheable:
@Entity
@Cacheable
@org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Foo {
    ...
    @org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
    @OneToMany
    private Collection<Bar> bars;
    // getters and setters
}
9. Internal Representation of Cached State
Entities usually aren’t stored in the second-level cache as Java instances, but rather in their disassembled (hydrated) state:
- Id (primary key) isn’t stored (it’s stored as part of the cache key)
- Transient properties aren’t stored
- Collections aren’t stored (see below for more details)
- Non-association property values are stored in their original form
- Only id (foreign key) is stored for ToOne associations
This depicts the general Hibernate second-level cache design, where the cache model reflects the underlying relational model, which is space-efficient and makes it easy to keep the two synchronized.
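As a rough illustration of the rules above, the following sketch turns an entity into a disassembled array of values. The classes and the disassemble method are hypothetical stand-ins, not Hibernate’s internal types. The sketch only shows which pieces of state would end up in the cache entry.

```java
import java.util.Arrays;

// Hypothetical stand-ins for entities; not Hibernate's internal representation.
final class DisassembledStateSketch {

    static final class Bar {
        final long id;

        Bar(long id) {
            this.id = id;
        }
    }

    static final class Foo {
        final long id;               // becomes part of the cache key, not the cached value
        final String name;           // non-association property, stored as-is
        final Bar bar;               // ToOne association, only its id is stored
        final transient String note; // transient property, never stored

        Foo(long id, String name, Bar bar, String note) {
            this.id = id;
            this.name = name;
            this.bar = bar;
            this.note = note;
        }
    }

    // Builds the cached value: property values plus the foreign key of the association.
    static Object[] disassemble(Foo foo) {
        Long barId = foo.bar == null ? null : Long.valueOf(foo.bar.id);
        return new Object[] { foo.name, barId };
    }

    public static void main(String[] args) {
        Foo foo = new Foo(1L, "first", new Bar(42L), "scratch value");
        // The key would be derived from foo.id; the value is the disassembled state.
        System.out.println(Arrays.toString(disassemble(foo))); // prints "[first, 42]"
    }
}
```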
We can store read-only entities as direct references, as long as a set of constraints is satisfied:
- The entity class is marked with @org.hibernate.annotations.Immutable
- The model is flat, that is, it contains only scalar fields and no relationships
- The property hibernate.cache.use_reference_entries is set to true
Direct references yield greater lookup performance than disassembled states. However, their usage is limited to pure in-memory caches, so if the cache has some sort of replication mechanism or disk-based storage, direct references will not work.
9.1. Internal Representation of Cached Collections
We already mentioned that we have to explicitly indicate that a collection (OneToMany or ManyToMany association) is cacheable; otherwise, it isn’t cached.
Hibernate actually stores collections in separate cache regions, one for each collection. The region name is a fully qualified class name, plus the name of a collection property (for example, com.baeldung.hibernate.cache.model.Foo.bars). This gives us the flexibility to define separate cache parameters for collections, e.g., eviction/expiration policy.
It’s also important to mention that only the ids of entities contained in a collection are cached for each collection entry. This means that in most cases, it’s a good idea to make the contained entities cacheable as well.
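The consequence of caching only ids can be sketched as follows. The class CollectionCacheSketch is a hypothetical model built on plain maps, with the database replaced by a counter. When the contained Bar entities aren’t cached themselves, every element of a cached collection still costs a database load.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: a collection region holding ids and an entity region holding state.
final class CollectionCacheSketch {

    // Collection region (for example "...Foo.bars"): owner id -> ids of the contained Bars.
    final Map<Long, List<Long>> collectionRegion = new HashMap<>();

    // Entity region for Bar: id -> state. It stays empty if Bar isn't cacheable.
    final Map<Long, String> barRegion = new HashMap<>();

    int databaseLoads = 0;

    List<String> loadBars(long fooId) {
        List<String> bars = new ArrayList<>();
        for (Long barId : collectionRegion.getOrDefault(fooId, List.of())) {
            String state = barRegion.get(barId);
            if (state == null) {
                // The collection entry only gave us the id, so we need the database.
                databaseLoads++;
                state = "Bar#" + barId;
            }
            bars.add(state);
        }
        return bars;
    }

    public static void main(String[] args) {
        CollectionCacheSketch cache = new CollectionCacheSketch();
        cache.collectionRegion.put(1L, List.of(10L, 11L));
        cache.barRegion.put(10L, "Bar#10"); // only one of the two Bars is cached

        System.out.println(cache.loadBars(1L));
        System.out.println(cache.databaseLoads); // 1: the uncached Bar needed a database load
    }
}
```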
10. Cache Invalidation for HQL DML-Style Queries and Native Queries
When it comes to DML-style HQL (insert, update and delete HQL statements), Hibernate is able to determine which entities are affected by such operations:
entityManager.createQuery("update Foo set … where …").executeUpdate();
In this case, all Foo instances are evicted from the L2 cache, while the other cached content remains unchanged.
However, when it comes to native SQL DML statements, Hibernate can’t guess what’s being updated, so it invalidates all regions of the second-level cache, regardless of the relationships between the entity being updated and the others:
session.createNativeQuery("update FOO set … where …").executeUpdate();
This is probably not what we want. The solution is to tell Hibernate which entities are affected by native DML statements so that it can evict only the entries related to Foo entities:
Query nativeQuery = entityManager.createNativeQuery("update FOO set ... where ...");
nativeQuery.unwrap(org.hibernate.query.NativeQuery.class).addSynchronizedEntityClass(Foo.class);
nativeQuery.executeUpdate();
With this approach, we have to fall back to Hibernate’s native NativeQuery API, as this feature isn’t yet defined in JPA.
Note that the above applies only to DML statements (insert, update, delete, and native function/procedure calls). Native select queries don’t invalidate the cache.
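The difference between the two invalidation scopes can be modeled with a few maps. This is a hypothetical sketch (InvalidationSketch and its methods aren’t Hibernate APIs): an empty set of affected regions stands for a native DML statement without synchronization hints, while a non-empty set stands for one registered via addSynchronizedEntityClass.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of region invalidation; not Hibernate's actual API.
final class InvalidationSketch {

    // Region name -> cached entries (id -> state).
    private final Map<String, Map<Long, String>> regions = new HashMap<>();

    void cache(String region, long id, String state) {
        regions.computeIfAbsent(region, name -> new HashMap<>()).put(id, state);
    }

    // An empty set models a native DML statement whose affected entities are unknown.
    void afterNativeDml(Set<String> affectedRegions) {
        if (affectedRegions.isEmpty()) {
            // Nothing is known about the statement, so every region is cleared.
            regions.values().forEach(Map::clear);
            return;
        }
        for (String name : affectedRegions) {
            Map<Long, String> region = regions.get(name);
            if (region != null) {
                region.clear();
            }
        }
    }

    int size(String region) {
        Map<Long, String> entries = regions.get(region);
        return entries == null ? 0 : entries.size();
    }

    public static void main(String[] args) {
        InvalidationSketch cache = new InvalidationSketch();
        cache.cache("Foo", 1L, "foo state");
        cache.cache("Bar", 2L, "bar state");

        cache.afterNativeDml(Set.of("Foo")); // targeted: only Foo entries are evicted
        System.out.println(cache.size("Foo") + " " + cache.size("Bar")); // prints "0 1"

        cache.afterNativeDml(Set.of());      // unknown scope: everything is evicted
        System.out.println(cache.size("Bar")); // prints "0"
    }
}
```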
11. Query Cache
We can also cache the results of HQL queries. This is useful if we frequently execute a query on entities that rarely change.
To enable the query cache, we’ll set the value of the hibernate.cache.use_query_cache property to true:
hibernate.cache.use_query_cache=true
For each query, we have to explicitly indicate that the query is cacheable (via an org.hibernate.cacheable query hint):
entityManager.createQuery("select f from Foo f")
.setHint("org.hibernate.cacheable", true)
.getResultList();
11.1. Query Cache Best Practices
Here are some guidelines and best practices related to query caching:
- As is the case with collections, only the ids of entities returned as a result of a cacheable query are cached. Therefore, we strongly recommend enabling a second-level cache for such entities.
- There’s one cache entry per combination of query parameter values (bind variables) for each query, so queries for which we expect lots of different combinations of parameter values aren’t good candidates for caching.
- Queries that involve entity classes for which there are frequent changes in the database aren’t good candidates for caching either, because they will be invalidated whenever there’s a change related to any of the entity classes participating in the query, regardless of whether the changed instances are cached as part of the query result or not.
- By default, all query cache results are stored in the default-query-results-region. As with entity/collection caching, we can customize cache parameters for this region to define eviction and expiration policies according to our needs. For each query, we can also specify a custom region name in order to provide different settings for different queries.
- For all tables that are queried as part of cacheable queries, Hibernate keeps last update timestamps in a separate region named default-update-timestamps-region. Being aware of this region is very important if we use query caching because Hibernate uses it to verify that cached query results aren’t stale. The entries in this cache must not be evicted/expired as long as there are cached query results for the corresponding tables in the query results regions. It’s best to turn off automatic eviction and expiration for this cache region, as it doesn’t consume lots of memory anyway.
A good default for default-update-timestamps-region would be:
<cache alias="default-update-timestamps-region">
    <expiry>
        <none />
    </expiry>
    <resources>
        <heap unit="entries">1000</heap>
    </resources>
</cache>
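To see why the update-timestamps region matters, consider the following simplified model of the query cache. Everything in it is hypothetical (the class QueryCacheSketch, the logical clock, and a single table per query). It keys results by query text plus parameter values and treats a cached result as stale once the table has a newer update timestamp. In this sketch, a missing timestamp is treated as "never updated", which is why losing such an entry could let stale results through.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model of the query cache and its timestamp check.
final class QueryCacheSketch {

    private static final class CachedResult {
        final long cachedAt;
        final List<Long> entityIds; // only ids are cached, not entity state

        CachedResult(long cachedAt, List<Long> entityIds) {
            this.cachedAt = cachedAt;
            this.entityIds = entityIds;
        }
    }

    // Plays the role of the update-timestamps region: table -> last update time.
    private final Map<String, Long> lastUpdateByTable = new HashMap<>();
    // Plays the role of the query-results region.
    private final Map<List<Object>, CachedResult> queryResults = new HashMap<>();
    private long clock = 0; // logical clock standing in for real timestamps

    // One entry per combination of query text and parameter values.
    private static List<Object> key(String query, Object... parameters) {
        return List.of(query, Arrays.asList(parameters));
    }

    void recordTableUpdate(String table) {
        lastUpdateByTable.put(table, ++clock);
    }

    void put(String table, List<Long> entityIds, String query, Object... parameters) {
        queryResults.put(key(query, parameters), new CachedResult(++clock, entityIds));
    }

    Optional<List<Long>> get(String table, String query, Object... parameters) {
        CachedResult cached = queryResults.get(key(query, parameters));
        if (cached == null) {
            return Optional.empty();
        }
        // A missing timestamp is treated as "never updated" in this sketch.
        long lastUpdate = lastUpdateByTable.getOrDefault(table, 0L);
        // The result is stale if the table changed after the result was cached.
        return lastUpdate > cached.cachedAt ? Optional.empty() : Optional.of(cached.entityIds);
    }

    public static void main(String[] args) {
        QueryCacheSketch cache = new QueryCacheSketch();
        String query = "select f from Foo f where f.name = :name";
        cache.put("FOO", List.of(1L, 2L), query, "first");

        System.out.println(cache.get("FOO", query, "first").isPresent());  // true
        System.out.println(cache.get("FOO", query, "second").isPresent()); // false: other parameters
        cache.recordTableUpdate("FOO");
        System.out.println(cache.get("FOO", query, "first").isPresent());  // false: result is stale
    }
}
```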
12. Conclusion
In this article, we learned how to set up a Hibernate second-level cache. Hibernate is fairly easy to configure and use, making second-level cache utilization transparent to the application’s business logic.
The implementation of this article is available over on GitHub. This is a Maven-based project, so it should be easy to import and run as it is.