1. Overview
One of the advantages of database abstraction layers, such as ORM (object-relational mapping) frameworks, is their ability to transparently cache data retrieved from the underlying store. This helps eliminate database-access costs for frequently accessed data.
Performance gains can be significant if the read/write ratios of cached content are high. This is especially true for entities which consist of large object graphs.
In this tutorial, we’ll explore the Hibernate second-level cache. We’ll explain some basic concepts and illustrate them with simple examples. We’ll use JPA, and fall back to the Hibernate native API only for those features that aren’t standardized in JPA.
2. What Is a Second-Level Cache?
As with most other fully-equipped ORM frameworks, Hibernate has the concept of a first-level cache. It’s a session-scoped cache which ensures that each entity instance is loaded only once in the persistence context.
Once the session is closed, the first-level cache is terminated as well. This is actually desirable, as it allows for concurrent sessions to work with entity instances in isolation from each other.
Conversely, a second-level cache is SessionFactory-scoped, meaning it’s shared by all sessions created with the same session factory. When an entity instance is looked up by its id (either by application logic or by Hibernate internally, e.g. when it loads associations to that entity from other entities), and second-level caching is enabled for that entity, the following happens:
- If an instance is already present in the first-level cache, it’s returned from there.
- If an instance isn’t found in the first-level cache, and the corresponding instance state is cached in the second-level cache, then the data is fetched from there and an instance is assembled and returned.
- Otherwise, the necessary data are loaded from the database and an instance is assembled and returned.
Once the instance is stored in the persistence context (first-level cache), it’s returned from there in all subsequent calls within the same session until the session is closed, or the instance is manually evicted from the persistence context. The loaded instance state is also stored in the L2 cache if it wasn’t already there.
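The lookup order described above can be sketched with a toy model in plain Java. This is only an illustration under simplifying assumptions: LookupOrderSketch, its Session class and the maps standing in for the caches and the database are hypothetical names, not Hibernate types.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the lookup order; none of these types are Hibernate classes. */
final class LookupOrderSketch {

    /** Stand-in for an entity instance. */
    static final class Foo {
        final long id;
        final String name;

        Foo(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Shared by all sessions, like the SessionFactory-scoped L2 cache.
    // It holds entity state (here just the name), not entity instances.
    final Map<Long, String> secondLevel = new HashMap<>();

    // Stand-in for the database table.
    final Map<Long, String> database = new HashMap<>();

    // Counts how often the "database" had to be consulted.
    int databaseLookups = 0;

    /** One instance per session; its map plays the role of the persistence context. */
    final class Session {
        private final Map<Long, Foo> firstLevel = new HashMap<>();

        Foo find(long id) {
            // 1. First-level cache: the same instance is returned within a session.
            Foo cached = firstLevel.get(id);
            if (cached != null) {
                return cached;
            }
            // 2. Second-level cache: state is fetched and an instance is assembled.
            String state = secondLevel.get(id);
            if (state == null) {
                // 3. Database: load the state and store it in the L2 cache.
                databaseLookups++;
                state = database.get(id);
                if (state == null) {
                    return null;
                }
                secondLevel.put(id, state);
            }
            Foo foo = new Foo(id, state);
            firstLevel.put(id, foo);
            return foo;
        }
    }

    Session openSession() {
        return new Session();
    }
}
```

In this model, two sessions that load the same id get different Foo instances, but only the first load reaches the database.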
3. Region Factory
Hibernate second-level caching is designed to be unaware of the actual cache provider used. Hibernate only needs to be provided with an implementation of the org.hibernate.cache.spi.RegionFactory interface, which encapsulates all the details specific to the actual cache providers. Basically, it acts as a bridge between Hibernate and cache providers.
In this article, we’ll use Ehcache, a mature and widely used cache, as our cache provider. We could pick any other provider instead, as long as there’s an implementation of a RegionFactory for it.
We’ll add the Ehcache region factory implementation to the classpath with the following Maven dependency:
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-ehcache</artifactId>
    <version>5.6.15.Final</version>
</dependency>
We can check Maven Central for the latest version of hibernate-ehcache. However, we need to make sure that the hibernate-ehcache version matches the Hibernate version we’re using in our project (e.g. if we use hibernate-ehcache 5.6.15.Final, as in this example, then the version of Hibernate should also be 5.6.15.Final).
The hibernate-ehcache artifact has a dependency on the Ehcache implementation itself, which is transitively included in the classpath as well.
4. Enabling Second-Level Caching
With the following two properties, we’ll tell Hibernate that L2 caching is enabled, and give it the name of the region factory class:
hibernate.cache.use_second_level_cache=true
hibernate.cache.region.factory_class=org.hibernate.cache.ehcache.EhCacheRegionFactory
For example, in persistence.xml, it would look like:
<properties>
    ...
    <property name="hibernate.cache.use_second_level_cache" value="true"/>
    <property name="hibernate.cache.region.factory_class"
      value="org.hibernate.cache.ehcache.EhCacheRegionFactory"/>
    ...
</properties>
To disable second-level caching (say for debugging purposes), we just set the hibernate.cache.use_second_level_cache property to false.
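If we prefer to supply these settings programmatically, for instance to switch the cache off in a debugging profile, we could collect them in a map. The following is a minimal sketch; the SecondLevelCacheSettings class and its method are hypothetical helpers, while the two property names and the region factory class name are the ones shown above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that builds the two L2 cache settings as a property map. */
final class SecondLevelCacheSettings {

    static Map<String, Object> properties(boolean enabled) {
        Map<String, Object> props = new HashMap<>();
        // Toggles second-level caching as a whole.
        props.put("hibernate.cache.use_second_level_cache", String.valueOf(enabled));
        // Names the RegionFactory implementation that bridges to Ehcache.
        props.put("hibernate.cache.region.factory_class",
                "org.hibernate.cache.ehcache.EhCacheRegionFactory");
        return props;
    }
}
```

We could then pass such a map as the second argument of javax.persistence.Persistence.createEntityManagerFactory, together with the persistence unit name from our persistence.xml.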
5. Making an Entity Cacheable
To make an entity eligible for second-level caching, we’ll annotate it with the Hibernate-specific @org.hibernate.annotations.Cache annotation, and specify a cache concurrency strategy.
Some developers consider it a good convention to add the standard @javax.persistence.Cacheable annotation as well (although not required by Hibernate), so an entity class implementation might look like this:
@Entity
@Cacheable
@org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Foo {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "ID")
    private long id;
    @Column(name = "NAME")
    private String name;
    // getters and setters
}
For each entity class, Hibernate will use a separate cache region to store the state of instances of that class. The region name is the fully qualified class name.
For example, Foo instances are stored in a cache named com.baeldung.hibernate.cache.model.Foo in Ehcache.
To verify that caching is working, we can write a quick test:
Foo foo = new Foo();
fooService.create(foo);
fooService.findOne(foo.getId());
int size = CacheManager.ALL_CACHE_MANAGERS.get(0)
  .getCache("com.baeldung.hibernate.cache.model.Foo").getSize();
assertThat(size, greaterThan(0));
Here we use the Ehcache API directly to verify that the com.baeldung.hibernate.cache.model.Foo cache isn’t empty after we load a Foo instance.
We could also enable the logging of SQL generated by Hibernate, and invoke fooService.findOne(foo.getId()) multiple times in the test to verify that the select statement for loading Foo is printed only once (the first time), meaning that in subsequent calls, the entity instance is fetched from the cache.
6. Cache Concurrency Strategy
Based on use cases, we’re free to pick one of the following cache concurrency strategies:
- READ_ONLY: Used only for entities that never change (an exception is thrown if an attempt is made to update such an entity). It’s very simple and performant, which makes it suitable for static reference data.
- NONSTRICT_READ_WRITE: Cache is updated after the transaction that changed the affected data has been committed. Thus, strong consistency isn’t guaranteed, and there’s a small time window in which stale data may be obtained from the cache. This kind of strategy is suitable for use cases that can tolerate eventual consistency.
- READ_WRITE: This strategy guarantees strong consistency, which it achieves by using ‘soft’ locks. When a cached entity is updated, a soft lock is stored in the cache for that entity as well, which is released after the transaction is committed. All concurrent transactions that access soft-locked entries will fetch the corresponding data directly from the database.
- TRANSACTIONAL: Cache changes are done in distributed XA transactions. A change in a cached entity is either committed or rolled back in both the database and cache in the same XA transaction.
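To make the soft-lock idea behind READ_WRITE more concrete, here is a deliberately simplified sketch in plain Java. It is not how Hibernate implements the strategy (the real one also tracks timestamps and lock ownership); SoftLockSketch and its methods are hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Toy cache region in which an updated entry is replaced by a soft-lock marker. */
final class SoftLockSketch {

    // Marker stored in place of the entity state while an update is in flight.
    private static final Object SOFT_LOCK = new Object();

    private final Map<Long, Object> region = new ConcurrentHashMap<>();

    /** Caches the state of an entity that was loaded from the database. */
    void put(long id, String state) {
        region.put(id, state);
    }

    /** Called when a transaction updates the entity: the entry becomes soft-locked. */
    void lock(long id) {
        region.put(id, SOFT_LOCK);
    }

    /** Called after commit: the lock is released and the new state becomes visible. */
    void unlockAfterCommit(long id, String newState) {
        region.put(id, newState);
    }

    /** An empty result means "treat this as a miss and read from the database". */
    Optional<String> get(long id) {
        Object entry = region.get(id);
        if (entry == null || entry == SOFT_LOCK) {
            return Optional.empty();
        }
        return Optional.of((String) entry);
    }
}
```

While the entry is locked, concurrent readers get an empty result and go to the database, so they never see the outdated cached state.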
7. Cache Management
If expiration and eviction policies aren’t defined, the cache could grow indefinitely, and eventually consume all of the available memory. In most cases, Hibernate leaves cache management duties like these to cache providers, as they are indeed specific to each cache implementation.
For example, we could define the following Ehcache configuration to limit the maximum number of cached Foo instances to 1000:
<ehcache>
    <cache name="com.baeldung.hibernate.cache.model.Foo" maxElementsInMemory="1000" />
</ehcache>
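To see what a size limit means in practice, we can model a bounded region in plain Java. This is a sketch of the general idea only, assuming a least-recently-used policy; BoundedRegionSketch is a hypothetical class and doesn’t reflect Ehcache internals, whose eviction policy is configurable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy cache region that evicts the least recently used entry once it is full. */
final class BoundedRegionSketch<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    BoundedRegionSketch(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked by LinkedHashMap after each insertion; true drops the eldest entry.
        return size() > maxEntries;
    }
}
```

With a limit of 1000, inserting the 1001st entry removes the entry that was used least recently, so memory usage stays bounded.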
8. Collection Cache
Collections aren’t cached by default, and we need to explicitly mark them as cacheable:
@Entity
@Cacheable
@org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Foo {
    ...
    // javax.persistence.Cacheable can only be placed on types,
    // so the collection is marked with Hibernate's @Cache alone
    @org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
    @OneToMany
    private Collection<Bar> bars;
    // getters and setters
}
9. Internal Representation of Cached State
Entities aren’t stored in the second-level cache as Java instances, but rather in their disassembled (hydrated) state:
- Id (primary key) isn’t stored (it’s stored as part of the cache key)
- Transient properties aren’t stored
- Collections aren’t stored (see below for more details)
- Non-association property values are stored in their original form
- Only id (foreign key) is stored for ToOne associations
This depicts the general Hibernate second-level cache design, where the cache model reflects the underlying relational model, which is space-efficient and makes it easy to keep the two synchronized.
9.1. Internal Representation of Cached Collections
We already mentioned that we have to explicitly indicate that a collection (OneToMany or ManyToMany association) is cacheable, otherwise it isn’t cached.
Hibernate actually stores collections in separate cache regions, one for each collection. The region name is a fully qualified class name, plus the name of a collection property (for example, com.baeldung.hibernate.cache.model.Foo.bars). This gives us the flexibility to define separate cache parameters for collections, e.g. eviction/expiration policy.
It’s also important to mention that only the ids of entities contained in a collection are cached for each collection entry. This means that in most cases, it’s a good idea to make the contained entities cacheable as well.
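The storage layout described in this section can be sketched with a few maps. This is a toy model under stated assumptions: DisassembledStateSketch, the region fields and the label property of Bar are hypothetical, and real Hibernate cache entries carry more metadata.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of entity and collection regions holding disassembled state. */
final class DisassembledStateSketch {

    // Entity regions: the id is the cache key, the value holds only property values.
    final Map<Long, Object[]> fooRegion = new HashMap<>();
    final Map<Long, Object[]> barRegion = new HashMap<>();

    // Collection region for Foo.bars: owner id -> ids of the contained Bar entities.
    final Map<Long, List<Long>> fooBarsRegion = new HashMap<>();

    void cacheFoo(long id, String name, List<Long> barIds) {
        fooRegion.put(id, new Object[] { name });       // no id, no collection inside
        fooBarsRegion.put(id, new ArrayList<>(barIds)); // only ids of the elements
    }

    void cacheBar(long id, String label) {
        barRegion.put(id, new Object[] { label });
    }

    /** Ids of collection elements that would still have to be loaded from the database. */
    List<Long> barsMissingFromCache(long fooId) {
        List<Long> missing = new ArrayList<>();
        List<Long> barIds = fooBarsRegion.getOrDefault(fooId, Collections.<Long>emptyList());
        for (Long barId : barIds) {
            if (!barRegion.containsKey(barId)) {
                missing.add(barId);
            }
        }
        return missing;
    }
}
```

If Bar isn’t cacheable, every id in the cached collection ends up in the missing list, which is why caching the contained entities usually pays off.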
10. Cache Invalidation for HQL DML-Style Queries and Native Queries
When it comes to DML-style HQL (insert, update and delete HQL statements), Hibernate is able to determine which entities are affected by such operations:
entityManager.createQuery("update Foo set … where …").executeUpdate();
In this case, all Foo instances are evicted from the L2 cache, while the other cached content remains unchanged.
However, when it comes to native SQL DML statements, Hibernate can’t guess what’s being updated, so it invalidates the entire second-level cache:
session.createNativeQuery("update FOO set … where …").executeUpdate();
This is probably not what we want. The solution is to tell Hibernate which entities are affected by native DML statements, so that it can evict only the entries related to Foo entities:
Query nativeQuery = entityManager.createNativeQuery("update FOO set ... where ...");
nativeQuery.unwrap(org.hibernate.query.NativeQuery.class).addSynchronizedEntityClass(Foo.class);
nativeQuery.executeUpdate();
We have to fall back to the Hibernate native NativeQuery API (the successor of the deprecated SQLQuery), as this feature isn’t yet defined in JPA.
Note that the above applies only to DML statements (insert, update, delete, and native function/procedure calls). Native select queries don’t invalidate the cache.
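The difference between the two invalidation scopes can be modeled in a few lines of plain Java. This is only a sketch: InvalidationScopeSketch and its methods are hypothetical, and an empty set stands for a native DML statement for which no affected entities were declared.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of how the set of affected regions determines what gets evicted. */
final class InvalidationScopeSketch {

    // Region name -> cached entries of that region.
    final Map<String, Map<Long, String>> regions = new HashMap<>();

    void cache(String region, long id, String state) {
        regions.computeIfAbsent(region, r -> new HashMap<>()).put(id, state);
    }

    /** An empty set models a native DML statement with no declared entities. */
    void afterDml(Set<String> affectedRegions) {
        if (affectedRegions.isEmpty()) {
            // Nothing is known about the statement, so everything is evicted.
            regions.values().forEach(Map::clear);
            return;
        }
        // Only the declared regions are evicted; the rest stays cached.
        for (String name : affectedRegions) {
            Map<Long, String> region = regions.get(name);
            if (region != null) {
                region.clear();
            }
        }
    }

    int size(String region) {
        Map<Long, String> entries = regions.get(region);
        return entries == null ? 0 : entries.size();
    }
}
```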
11. Query Cache
We can also cache the results of HQL queries. This is useful if we frequently execute a query on entities that rarely change.
To enable the query cache, we’ll set the value of the hibernate.cache.use_query_cache property to true:
hibernate.cache.use_query_cache=true
For each query, we have to explicitly indicate that the query is cacheable (via an org.hibernate.cacheable query hint):
entityManager.createQuery("select f from Foo f")
.setHint("org.hibernate.cacheable", true)
.getResultList();
11.1. Query Cache Best Practices
Here are some guidelines and best practices related to query caching:
- As is the case with collections, only the ids of entities returned as a result of a cacheable query are cached. Therefore, we strongly recommend enabling a second-level cache for such entities.
- There’s one cache entry per each combination of query parameter values (bind variables) for each query, so queries for which we expect lots of different combinations of parameter values aren’t good candidates for caching.
- Queries that involve entity classes for which there are frequent changes in the database aren’t good candidates for caching either, because they will be invalidated whenever there’s a change related to any of the entity classes participating in the query, regardless of whether the changed instances are cached as part of the query result or not.
- By default, all query cache results are stored in a single region, named default-query-results-region since Hibernate 5.3 (org.hibernate.cache.internal.StandardQueryCache in earlier versions). As with entity/collection caching, we can customize cache parameters for this region to define eviction and expiration policies according to our needs. For each query, we can also specify a custom region name in order to provide different settings for different queries.
- For all tables that are queried as part of cacheable queries, Hibernate keeps last update timestamps in a separate region, named default-update-timestamps-region since Hibernate 5.3 (org.hibernate.cache.spi.UpdateTimestampsCache in earlier versions). Being aware of this region is very important if we use query caching, because Hibernate uses it to verify that cached query results aren’t stale. The entries in this cache must not be evicted/expired as long as there are cached query results for the corresponding tables in the query results regions. It’s best to turn off automatic eviction and expiration for this cache region, as it doesn’t consume lots of memory anyway.
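The interplay between cached query results and update timestamps can be sketched as follows. This is a toy model, not Hibernate’s implementation: QueryCacheSketch, its logical clock and the string query keys are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a query results region guarded by per-table update timestamps. */
final class QueryCacheSketch {

    /** Cached result: only entity ids plus the time at which the result was cached. */
    static final class CachedResult {
        final List<Long> ids;
        final long cachedAt;

        CachedResult(List<Long> ids, long cachedAt) {
            this.ids = ids;
            this.cachedAt = cachedAt;
        }
    }

    // Key = query text plus bind parameter values, so each combination gets its own entry.
    private final Map<String, CachedResult> queryResults = new HashMap<>();

    // Table name -> time of the last update to that table.
    private final Map<String, Long> updateTimestamps = new HashMap<>();

    // Simple logical clock, so the sketch doesn't depend on wall-clock time.
    private long clock = 0;

    void tableUpdated(String table) {
        updateTimestamps.put(table, ++clock);
    }

    void cacheResult(String queryKey, List<Long> ids) {
        queryResults.put(queryKey, new CachedResult(ids, ++clock));
    }

    /** Returns the cached ids, or null if there is no entry or the entry is stale. */
    List<Long> lookup(String queryKey, String... tables) {
        CachedResult result = queryResults.get(queryKey);
        if (result == null) {
            return null;
        }
        for (String table : tables) {
            Long updatedAt = updateTimestamps.get(table);
            if (updatedAt != null && updatedAt > result.cachedAt) {
                return null; // a queried table changed after the result was cached
            }
        }
        return result.ids;
    }
}
```

In this model, losing a timestamp entry would make a stale result look fresh, which illustrates why the timestamps region shouldn’t be subject to eviction or expiration.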
12. Conclusion
In this article, we learned how to set up a Hibernate second-level cache. Hibernate is fairly easy to configure and use, making second-level cache utilization transparent to the application business logic.
The implementation of this article is available over on GitHub. This is a Maven-based project, so it should be easy to import and run as it is.