1. Overview
In Spring Boot applications, we’re often tasked to present tabular data to clients in chunks of 20 or 50 rows at a time. Pagination is a common practice for returning a fraction of data from a large dataset. However, there are scenarios where we need to obtain the entire result at once.
In this tutorial, we’ll first revisit how to retrieve data in pagination using Spring Boot. Next, we’ll explore how to retrieve all results from one database table at once using pagination. Finally, we’ll dive into a more complex scenario that retrieves data with relationships.
2. Repository
The Repository is a Spring Data interface that provides data access abstraction. Depending on the Repository subinterface we have chosen, the abstraction provisions a predefined set of database operations.
We don’t need to write code for standard database operations such as select, save, and delete. All we need is to create an interface for our entity and extend it to the chosen Repository subinterface.
At runtime, Spring Data creates a proxy implementation that handles method invocations for our repository. When we invoke a method on the Repository interface, Spring Data generates the query dynamically based on the method and parameters.
There are three common Repository subinterfaces defined in Spring Data:
- CrudRepository – The most fundamental Repository interface provided by Spring Data. It provisions CRUD (Create, Read, Update, and Delete) entity operations
- PagingAndSortingRepository – It extends the CrudRepository interface, and it adds additional methods to support pagination access and result sorting with ease
- JpaRepository – It extends the PagingAndSortingRepository interface and introduces JPA-specific operations such as saving and flushing an entity and deleting entities in a batch
3. Fetching Paged Data
Let’s start with a simple scenario that obtains data from a database using pagination. We first create a Student entity class:
@Entity
@Table(name = "student")
public class Student {
@Id
@Column(name = "student_id")
private String id;
@Column(name = "first_name")
private String firstName;
@Column(name = "last_name")
private String lastName;
}
Subsequently, we’ll create a StudentRepository for retrieving Student entities from the database*.* The JpaRepository interface contains the method findAll(Pageable pageable) by default. Thus, we don’t need to define additional methods, given that we just want to retrieve data in pages without selecting a field:
public interface StudentRepository extends JpaRepository<Student, String> {
}
We can get the first page of Student with 10 rows per page by invoking findAll(Pageable) on StudentRepository. The first argument indicates the current page, which is zero-indexing, while the second argument denotes the number of records fetched per page:
Pageable pageable = PageRequest.of(0, 10);
Page<Student> studentPage = studentRepository.findAll(pageable);
Often, we have to return a paged result that is sorted by a specific field. In such cases, we provide a Sort instance when we create the Pageable instance. This example shows that we’ll sort the page results by the id field from Student in ascending order:
Sort sort = Sort.by(Sort.Direction.ASC, "id");
Pageable pageable = PageRequest.of(0, 10).withSort(sort);
Page<Student> studentPage = studentRepository.findAll(pageable);
4. Fetching All Data
A common question often arises: What if we want to retrieve all data at once? Do we need to call findAll() instead to obtain all the data? The answer is no. The Pageable interface defines a static method unpaged(), which returns a predefined Pageable instance that does not contain pagination information. We fetch all data by calling findAll(Pageable) with that Pageable instance:
Page<Student> studentPage = studentRepository.findAll(Pageable.unpaged());
If we require sorting the results, we can supply a Sort instance as an argument to the unpaged() method from Spring Boot 3.2 onward. For example, suppose we would like to sort the results by the lastName field in ascending order:
Sort sort = Sort.by(Sort.Direction.ASC, "lastName");
Page<Student> studentPage = studentRepository.findAll(Pageable.unpaged(sort));
However, achieving the same is a bit tricky in versions below 3.2, as unpaged() does not accept any argument. Instead, we have to create a PageRequest with the maximum page size and the Sort parameter:
Pageable pageable = PageRequest.of(0, Integer.MAX_VALUE).withSort(sort);
Page<Student> studentPage = studentRepository.getStudents(pageable);
5. Fetching Data With Relationships
We often define relationships between entities in the object-relational mapping (ORM) framework. Utilizing ORM frameworks such as JPA helps developers quickly model entities and relationships and eliminate the need to write SQL queries.
However, there’s a potential issue that arises with data retrieval if we do not thoroughly understand how it works underneath. We must take caution when attempting to retrieve a collection of results from an entity with relationships, as this could lead to a performance impact, especially when fetching all data.
5.1. N+1 Problem
Let’s have an example to illustrate the issue. Consider our Student entity with an additional many-to-one mapping:
@Entity
@Table(name = "student")
public class Student {
@Id
@Column(name = "student_id")
private String id;
@Column(name = "first_name")
private String firstName;
@Column(name = "last_name")
private String lastName;
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "school_id", referencedColumnName = "school_id")
private School school;
// getters and setters
}
Every Student now associates with a School, and we define the School entity as:
@Entity
@Table(name = "school")
public class School {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Column(name = "school_id")
private Integer id;
private String name;
// getters and setters
}
Now, we would like to retrieve all Student records from the database and investigate the actual number of SQL queries issued by JPA. Hypersistence Utilities is a database utility library that provides the assertSelectCount() method to identify the number of select queries executed. Let’s include its Maven dependency in our pom.xml file:
<dependency>
<groupId>io.hypersistence</groupId>
<artifactId>hypersistence-utils-hibernate-62</artifactId>
<version>3.7.0</version>
</dependency>
Now, we create a test case to retrieve all Student records:
@Test
public void whenGetStudentsWithSchool_thenMultipleSelectQueriesAreExecuted() {
Page<Student> studentPage = studentRepository.findAll(Pageable.unpaged());
List<StudentWithSchoolNameDTO> list = studentPage.get()
.map(student -> modelMapper.map(student, StudentWithSchoolNameDTO.class))
.collect(Collectors.toList());
assertSelectCount((int) studentPage.getContent().size() + 1);
}
In a complete application, we do not want to expose our internal entities to clients. We would map the internal entity to an external DTO and return it to the client in practice. In this example, we adopt ModelMapper to convert Student to StudentWithSchoolNameDTO, which contains all fields from Student and the name field from School:
public class StudentWithSchoolNameDTO {
private String id;
private String firstName;
private String lastName;
private String schoolName;
// constructor, getters and setters
}
Let’s observe the Hibernate log after executing the test case:
Hibernate: select studentent0_.student_id as student_1_1_, studentent0_.first_name as first_na2_1_, studentent0_.last_name as last_nam3_1_, studentent0_.school_id as school_i4_1_ from student studentent0_
Hibernate: select schoolenti0_.school_id as school_i1_0_0_, schoolenti0_.name as name2_0_0_ from school schoolenti0_ where schoolenti0_.school_id=?
Hibernate: select schoolenti0_.school_id as school_i1_0_0_, schoolenti0_.name as name2_0_0_ from school schoolenti0_ where schoolenti0_.school_id=?
...
Consider we have retrieved N Student records from the database. Instead of executing a single select query on the Student table, JPA executes additional N queries on the School table to fetch the associated record for each Student.
This behavior emerges during the conversion by ModelMapper when it attempts to read the school field in the Student instance. This issue in object-relational mapping performance is known as the N+1 problem.
It’s worth mentioning that JPA does not always issue N queries on the School table per Student fetch. The actual count is data-dependent. JPA has a first-level caching mechanism that ensures it does not fetch the cached School instances again from the database.
5.2. Avoid Fetching Relationships
When returning a DTO to the client, it’s not always necessary to include all fields in the entity class. Mostly, we only need a subset of them. To avoid triggering additional queries from associated relationships in the entity, we should extract essential fields only.
In our example, we can create a designated DTO class that includes fields merely from the Student table. JPA will not execute any additional query on School if we do not access the school field:
public class StudentDTO {
private String id;
private String firstName;
private String lastName;
// constructor, getters and setters
}
This approach assumes the association fetch type defined on the entity class we’re querying is set to perform a lazy fetch of the associated entity:
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "school_id", referencedColumnName = "school_id")
private School school;
It’s important to note that if the fetch attribute is set to FetchType.EAGER, JPA will actively execute additional queries upon fetching the Student record despite having no access to the field afterward.
5.3. Custom Query
Whenever a field in School is a necessity in the DTO, we can define a custom query to instruct JPA to execute a fetch join to retrieve the associated School entities eagerly in the initial Student query:
public interface StudentRepository extends JpaRepository<Student, String> {
@Query(value = "SELECT stu FROM Student stu LEFT JOIN FETCH stu.school",
countQuery = "SELECT COUNT(stu) FROM Student stu")
Page<Student> findAll(Pageable pageable);
}
Upon executing the same test case, we can observe from the Hibernate log that there is now only one query joining the Student and the School tables executed:
Hibernate: select s1_0.student_id,s1_0.first_name,s1_0.last_name,s2_0.school_id,s2_0.name
from student s1_0 left join school s2_0 on s2_0.school_id=s1_0.school_id
5.4. Entity Graph
A neater solution would be using the @EntityGraph annotation. This helps to optimize the retrieval performance by fetching entities in a single query rather than executing an additional query for each association. JPA uses this annotation to specify which associated entities should be eagerly fetched.
Let’s look at an ad-hoc entity graph example that defines attributePaths to instruct JPA to fetch the School association when querying the Student records:
public interface StudentRepository extends JpaRepository<Student, String> {
@EntityGraph(attributePaths = "school")
Page<Student> findAll(Pageable pageable);
}
There’s an alternative way to define an entity graph by placing the @NamedEntityGraph annotation on the Student entity:
@Entity
@Table(name = "student")
@NamedEntityGraph(name = "Student.school", attributeNodes = @NamedAttributeNode("school"))
public class Student {
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "school_id", referencedColumnName = "school_id")
private School school;
// Other fields, getters and setters
}
Subsequently, we add the annotation @EntityGraph to the StudentRepository findAll() method and refer to the named entity graph we defined in the Student class:
public interface StudentRepository extends JpaRepository<Student, String> {
@EntityGraph(value = "Student.school")
Page<Student> findAll(Pageable pageable);
}
We’ll see an identical join query is executed by JPA, compared to the custom query approach, upon executing the test case:
Hibernate: select s1_0.student_id,s1_0.first_name,s1_0.last_name,s2_0.school_id,s2_0.name
from student s1_0 left join school s2_0 on s2_0.school_id=s1_0.school_id
6. Conclusion
In this article, we’ve learned how to paginate and sort our query results in Spring Boot, including retrieval of partial data and full data. We also learned some efficient data retrieval practices in Spring Boot, particularly when dealing with relationships.
As usual, the sample code is available over on GitHub.