1. Overview

When building our persistence layer, optimizing database query performance is an important requirement.

One technique databases use to improve query performance is SQL statement caching, which reuses the previously prepared SQL statements to avoid the overhead of generating the same execution plans repeatedly in the database engine.

However, statement caching encounters a challenge when dealing with IN clauses as they often have a varying number of parameters.

In this tutorial, we’ll explore how Hibernate’s parameter padding feature addresses this issue and improves the effectiveness of statement caching for queries with IN clauses.

2. Application Setup

Before we explore the concept of parameter padding in Hibernate, let’s set up a simple application that we’ll use throughout this tutorial.

2.1. Dependencies

Let’s start by adding the Hibernate dependency to our project’s pom.xml file:

<dependency>
    <groupId>org.hibernate.orm</groupId>
    <artifactId>hibernate-core</artifactId>
    <version>6.5.2.Final</version>
</dependency>

This dependency provides us with the core Hibernate ORM functionality, including the parameter padding feature we’re discussing in this tutorial.

2.2. Defining the Entity Class

Now, let’s define our entity class:

@Entity
class Pokemon {
    @Id
    private UUID id;

    private String name;

    // standard setters and getters
}

The Pokemon class is the central entity in our tutorial. In the upcoming sections, we’ll use it to see how parameter padding improves statement caching for queries that involve IN clauses.

3. SQL Statement Caching

SQL statement caching is a technique used to optimize database query performance. When our database receives a SQL query, it prepares an execution plan and executes it to retrieve the result. This process can be time-consuming, especially for complex queries.

To avoid repeating this overhead, the database engine caches the execution plan of a prepared statement and reuses it for subsequent executions of the same statement with different parameter values.

Let’s consider an example where we search for Pokemon by their name attribute:

String[] names = { "Pikachu", "Charizard", "Bulbasaur" };
String query = "SELECT p FROM Pokemon p WHERE p.name = :name";

for (String name : names) {
    Pokemon pokemon = entityManager.createQuery(query, Pokemon.class)
      .setParameter("name", name)
      .getSingleResult();

    assertThat(pokemon)
      .isNotNull()
      .hasNoNullFieldsOrProperties();
}

In our example, the JPQL query SELECT p FROM Pokemon p WHERE p.name = :name produces the same SQL statement in every iteration of the loop, so its execution plan can be prepared once and reused.

The named parameter :name is replaced with the actual parameter values stored in the names array during execution. This caching mechanism removes the overhead of repeatedly preparing execution plans for the same SQL query.
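To make the idea concrete, here’s a minimal sketch of a plan cache keyed by the SQL text. The PlanCacheSketch class and its methods are hypothetical and only simulate the behavior. Real database engines implement plan caching internally and in far more sophisticated ways:

```java
import java.util.HashMap;
import java.util.Map;

// Toy simulation of a plan cache, not a real database or Hibernate API
class PlanCacheSketch {

    private final Map<String, String> plansBySql = new HashMap<>();
    private int prepareCount = 0;

    // Looks up the plan by SQL text; the parameter values are not part of the cache key
    String execute(String sql, Object... params) {
        return plansBySql.computeIfAbsent(sql, key -> {
            prepareCount++; // a plan is "prepared" only on a cache miss
            return "plan#" + prepareCount;
        });
    }

    int getPrepareCount() {
        return prepareCount;
    }

    public static void main(String[] args) {
        PlanCacheSketch cache = new PlanCacheSketch();
        String sql = "select p1_0.id,p1_0.name from pokemon p1_0 where p1_0.name=?";
        for (String name : new String[] { "Pikachu", "Charizard", "Bulbasaur" }) {
            cache.execute(sql, name);
        }
        System.out.println(cache.getPrepareCount()); // 1
    }
}
```

Since the SQL text is identical for all three names, the simulated cache prepares a plan only on the first call.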

4. SQL Statement Caching With IN Clause

While SQL statement caching works well for most scenarios, it becomes less effective when dealing with IN clauses that have a varying number of parameters:

String[][] nameGroups = {
    { "Jigglypuff" },
    { "Snorlax", "Squirtle" },
    { "Pikachu", "Charizard", "Bulbasaur" }
};
String query = "SELECT p FROM Pokemon p WHERE p.name IN :names";

for (String[] names : nameGroups) {
    List<Pokemon> pokemons = entityManager.createQuery(query, Pokemon.class)
      .setParameter("names", Arrays.asList(names))
      .getResultList();

    assertThat(pokemons)
      .isNotEmpty();
}

In our example, we have groups of Pokemon names which we’re using to retrieve Pokemon entities using the IN clause. However, each group has a different number of names, resulting in a varying number of parameters in the IN clause.

In this case, the database generates a separate execution plan for each query with a different number of parameters. Consequently, statement caching becomes ineffective, as each query is treated as a new statement.
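We can illustrate why this happens with a small sketch that renders the IN clause the way it reaches the database, with one ? placeholder per bound value. The InClauseSqlShapes class and its sqlFor() helper are hypothetical and only approximate the SQL text that Hibernate generates:

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative only: approximates the SQL text produced for different IN-list sizes
class InClauseSqlShapes {

    // Renders the statement with one '?' placeholder per bound value
    static String sqlFor(int parameterCount) {
        String placeholders = String.join(",", Collections.nCopies(parameterCount, "?"));
        return "select p1_0.id,p1_0.name from pokemon p1_0 where p1_0.name in (" + placeholders + ")";
    }

    // Collects the distinct SQL texts produced for the given IN-list sizes
    static Set<String> distinctStatements(int... parameterCounts) {
        Set<String> statements = new LinkedHashSet<>();
        for (int count : parameterCounts) {
            statements.add(sqlFor(count));
        }
        return statements;
    }

    public static void main(String[] args) {
        // Sizes of the three name groups from the example above
        distinctStatements(1, 2, 3).forEach(System.out::println);
    }
}
```

The three groups produce three different SQL texts, so a cache keyed by the statement text can’t reuse a plan between them.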

5. Parameter Padding for IN Clause

To address the SQL statement caching issue with IN clauses, Hibernate 5.2.18 introduced parameter padding. This feature allows us to reuse cached statements even when the number of parameters in the IN clause varies.

We can enable this feature by setting the hibernate.query.in_clause_parameter_padding property in our persistence.xml file to true:

<property name="hibernate.query.in_clause_parameter_padding" value="true"/>

When working with Spring Data JPA, we can enable parameter padding by adding the following configuration to our application.yaml file:

spring:
  jpa:
    properties:
      hibernate:
        query:
          in_clause_parameter_padding: true

With parameter padding enabled, Hibernate rounds the number of parameters in the IN clause up to the next power of 2, padding the list by repeating the last parameter value.

For instance, if our IN clause contains 3 parameters, Hibernate will pad it to 4 parameters. This ensures that only one execution plan is prepared for queries having 3 or 4 parameters.

Similarly, if the number of parameters in our IN clause is between 5 and 8, Hibernate will use 8 parameters in the prepared statement.
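The padding rule itself can be sketched in a few lines of plain Java. The InClausePaddingSketch class below is a hypothetical illustration of the behavior described above, not Hibernate’s internal implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: mimics the padding rule, not Hibernate's actual code
class InClausePaddingSketch {

    // Returns the smallest power of 2 that is greater than or equal to count
    static int paddedSize(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be at least 1");
        }
        int highest = Integer.highestOneBit(count);
        return highest == count ? count : highest << 1;
    }

    // Pads a non-empty list to the next power of 2 by repeating its last element
    static <T> List<T> pad(List<T> values) {
        int targetSize = paddedSize(values.size());
        List<T> padded = new ArrayList<>(values);
        T last = values.get(values.size() - 1);
        while (padded.size() < targetSize) {
            padded.add(last);
        }
        return padded;
    }

    public static void main(String[] args) {
        System.out.println(paddedSize(3)); // 4
        System.out.println(paddedSize(5)); // 8
        System.out.println(pad(List.of("Pikachu", "Charizard", "Bulbasaur")));
    }
}
```

A list that already has a power-of-2 size, such as 4 or 8 values, is left unchanged by this rule.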

To understand this better, we’ll enable SQL logging in our application and look at the binding parameters:

List<String> names = List.of("Pikachu", "Charizard", "Bulbasaur");
String query = "SELECT p FROM Pokemon p WHERE p.name IN :names";
entityManager.createQuery(query, Pokemon.class)
  .setParameter("names", names)
  .getResultList();

When we run the above, we’ll see the following log output:

org.hibernate.SQL - select p1_0.id,p1_0.name from pokemon p1_0 where p1_0.name in (?,?,?,?)

org.hibernate.orm.jdbc.bind - binding parameter (1:VARCHAR) <- [Pikachu]
org.hibernate.orm.jdbc.bind - binding parameter (2:VARCHAR) <- [Charizard]
org.hibernate.orm.jdbc.bind - binding parameter (3:VARCHAR) <- [Bulbasaur]
org.hibernate.orm.jdbc.bind - binding parameter (4:VARCHAR) <- [Bulbasaur]

Although we’ve provided three names, Hibernate has padded the IN clause to four parameters. It repeats the last value, Bulbasaur, to fill the fourth slot.

This feature helps reduce the number of execution plans created, improving performance and memory usage when using the IN clause.
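To get a rough feel for the reduction, we can count how many distinct IN clause sizes reach the database when the list size varies from 1 to 100. The StatementShapeCount class is a hypothetical back-of-the-envelope sketch that assumes the power-of-2 rule described above:

```java
import java.util.stream.IntStream;

// Illustrative only: counts distinct IN-list sizes with and without padding
class StatementShapeCount {

    // Smallest power of 2 that is greater than or equal to n (for n >= 1)
    static int nextPowerOfTwo(int n) {
        int size = 1;
        while (size < n) {
            size <<= 1;
        }
        return size;
    }

    // Number of distinct placeholder counts for list sizes 1..maxListSize
    static long distinctShapes(int maxListSize, boolean padding) {
        return IntStream.rangeClosed(1, maxListSize)
          .map(n -> padding ? nextPowerOfTwo(n) : n)
          .distinct()
          .count();
    }

    public static void main(String[] args) {
        System.out.println(distinctShapes(100, false)); // 100
        System.out.println(distinctShapes(100, true));  // 8
    }
}
```

Under this assumption, 100 possible statement shapes collapse into 8, namely those with 1, 2, 4, 8, 16, 32, 64, and 128 placeholders.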

6. When Parameter Padding Fails

While parameter padding is a great feature to speed up SQL query execution in our database, there are certain scenarios where it may not provide the expected benefits or even degrade performance.

First, parameter padding isn’t useful for databases that don’t cache execution plans, such as SQLite or MySQL with the BLACKHOLE storage engine. In such cases, enabling it only adds overhead from the extra bind parameters.

Additionally, parameter padding may not help when the number of parameters in our IN clause is either very small or very large. If the number of parameters is consistently small, only a few distinct statements exist anyway, so the benefit is negligible. On the other hand, if the number of parameters is extremely large, padding can add many redundant bind values (for example, a list of 1,025 values is padded to 2,048), which increases statement size and memory consumption and can hurt performance.

7. Conclusion

In this article, we explored the concept of parameter padding in Hibernate and how it addresses the challenges of SQL statement caching with IN clauses.

We learned that enabling the hibernate.query.in_clause_parameter_padding property makes Hibernate round the number of parameters in the IN clause up to the next power of 2, which reduces the number of distinct cached statements and lets the database reuse them to improve performance.

As always, all the code examples used in this article are available over on GitHub.