1. Introduction
Spring Batch is a robust framework that simplifies processing large volumes of data by providing reusable components and reliable infrastructure. In real-world scenarios, applications often need to run multiple jobs, either simultaneously or in a specific sequence, to optimize performance and manage dependencies effectively.
In this tutorial, we’ll explore various approaches to running multiple jobs in Spring Batch.
2. Understanding Spring Batch Jobs
In the context of Spring Batch, a job is a container for a sequence of steps, representing the entire process. Each job has a unique identifier and can consist of multiple steps executed in order or based on certain conditions. We can configure jobs using XML or Java, and a JobLauncher typically launches them.
Running multiple jobs is beneficial in scenarios such as:
- Parallel Processing
- Data Migration and ETL Processes
- Report Generation and more
Efficiently managing multiple jobs is essential for achieving optimal performance, maintainability, and scalability. Let’s explore the different approaches to achieve this in Spring Batch.
3. Configuration
First, let’s configure our dependencies:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-batch</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<scope>runtime</scope>
<version>2.2.224</version>
</dependency>
We’ve added spring-boot-starter-web as the basic Spring Boot dependency, spring-boot-starter-batch for batch processing, and h2 as the in-memory database.
Next, let’s enable batch processing and configure our data source:
@Configuration
@EnableBatchProcessing
public class BatchConfig {
@Bean
public DataSource dataSource() {
return DataSourceBuilder.create()
.driverClassName("org.h2.Driver")
.url("jdbc:h2:mem:batchdb;DB_CLOSE_DELAY=-1;")
.username("sa")
.password("")
.build();
}
@Bean
public DatabasePopulator databasePopulator(DataSource dataSource) {
ResourceDatabasePopulator populator = new ResourceDatabasePopulator();
populator.addScript(new ClassPathResource("org/springframework/batch/core/schema-h2.sql"));
populator.setContinueOnError(false);
populator.execute(dataSource);
return populator;
}
}
Now, let’s create two different jobs as an example. Each job will perform a simple task:
@Configuration
public class JobsConfig {
private static final Logger log = LoggerFactory.getLogger(JobsConfig.class);
@Bean
public Job jobOne(JobRepository jobRepository, Step stepOne) {
return new JobBuilder("jobOne", jobRepository).start(stepOne)
.build();
}
@Bean
public Step stepOne(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
return new StepBuilder("stepOne", jobRepository).tasklet((contribution, chunkContext) -> {
log.info("Hello");
return RepeatStatus.FINISHED;
}, transactionManager)
.build();
}
@Bean
public Job jobTwo(JobRepository jobRepository, Step stepTwo) {
return new JobBuilder("jobTwo", jobRepository).start(stepTwo)
.build();
}
@Bean
public Step stepTwo(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
return new StepBuilder("stepTwo", jobRepository).tasklet((contribution, chunkContext) -> {
log.info("World");
return RepeatStatus.FINISHED;
}, transactionManager)
.build();
}
}
The @EnableBatchProcessing annotation sets up the essential Spring Batch components, such as JobLauncher, JobRepository, and JobExplorer.
We defined two separate jobs, jobOne and jobTwo, as Spring beans. Each job has its own configuration and a single step. The steps are simple tasklets with transactional support that log a message to confirm when each step is executed.
Let’s confirm the definition of the jobs:
@Autowired
private Job jobOne;
@Autowired
private Job jobTwo;
@Test
void givenJobsDefinitions_whenJobsLoaded_thenJobNamesShouldMatch() {
assertNotNull(jobOne, "jobOne should be defined");
assertEquals("jobOne", jobOne.getName());
assertNotNull(jobTwo, "jobTwo should be defined");
assertEquals("jobTwo", jobTwo.getName());
}
4. Sequential Job Execution
If our jobs need to run one after another, especially when they depend on each other’s output, sequential execution is the way to go. Let’s see how this works with an example.
@Component
public class SequentialJobsConfig {
@Autowired
private Job jobOne;
@Autowired
private Job jobTwo;
@Autowired
private JobLauncher jobLauncher;
public void runJobsSequentially() {
JobParameters jobParameters = new JobParametersBuilder().addString("ID", "Sequential 1")
.toJobParameters();
JobParameters jobParameters2 = new JobParametersBuilder().addString("ID", "Sequential 2")
.toJobParameters();
// Run jobs one after another
try {
jobLauncher.run(jobOne, jobParameters);
jobLauncher.run(jobTwo, jobParameters2);
} catch (Exception e) {
// handle exception
e.printStackTrace();
}
}
}
We defined a component named SequentialJobsConfig and injected the two jobs we created earlier. We then ran the jobs one after another using the JobLauncher. To ensure each job instance is unique, we built the job parameters with a distinct ID using the addString() method. Because JobLauncher.run() returns a JobExecution, this approach lets us control the execution flow and check the result of each job before proceeding to the next.
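To make the idea of checking each result before moving on more concrete, here is a minimal plain-Java sketch of the gating logic. It doesn’t use Spring Batch itself: the SequentialGateSketch class and its Status enum are hypothetical stand-ins (in a real application, we’d inspect the BatchStatus of the JobExecution returned by jobLauncher.run()), and each Supplier represents one job launch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Plain-Java model of "run the next job only if the previous one completed".
// Status is a hypothetical stand-in for Spring Batch's BatchStatus.
final class SequentialGateSketch {

    enum Status { COMPLETED, FAILED }

    // Runs the jobs in the map's iteration order and stops after the first
    // job that does not complete. Returns the status of every job launched.
    static Map<String, Status> runInOrder(Map<String, Supplier<Status>> jobs) {
        Map<String, Status> results = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Status>> job : jobs.entrySet()) {
            Status status = job.getValue().get();
            results.put(job.getKey(), status);
            if (status != Status.COMPLETED) {
                break; // skip the remaining jobs
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Supplier<Status>> jobs = new LinkedHashMap<>();
        jobs.put("jobOne", () -> Status.FAILED);
        jobs.put("jobTwo", () -> Status.COMPLETED);
        System.out.println(runInOrder(jobs)); // prints {jobOne=FAILED}
    }
}
```

A LinkedHashMap keeps the jobs in insertion order, which is what makes the sequence predictable in this sketch.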
Let’s check that the jobs run successfully:
@Autowired
private SequentialJobsConfig sequentialJobsConfig;
@Test
void givenSequentialJobs_whenExecuted_thenRunJobsInOrder() {
assertDoesNotThrow(() -> sequentialJobsConfig.runJobsSequentially(), "Sequential job execution should execute");
}
5. Parallel Job Execution
When our jobs don’t depend on each other, running them in parallel can reduce the overall execution time. We can leverage Spring’s TaskExecutor interface to achieve this:
@Component
public class ParallelJobService {
@Autowired
private JobLauncher jobLauncher;
@Autowired
private Job jobOne;
@Autowired
private Job jobTwo;
public void runJobsInParallel() {
SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
taskExecutor.execute(() -> {
try {
jobLauncher.run(jobOne, new JobParametersBuilder().addString("ID", "Parallel 1")
.toJobParameters());
} catch (Exception e) {
e.printStackTrace();
}
});
taskExecutor.execute(() -> {
try {
jobLauncher.run(jobTwo, new JobParametersBuilder().addString("ID", "Parallel 2")
.toJobParameters());
} catch (Exception e) {
e.printStackTrace();
}
});
taskExecutor.close();
}
}
Here, we use Spring’s SimpleAsyncTaskExecutor to run each JobLauncher call on its own thread, so both jobs start without waiting for each other.
However, when using the parallel approach, we need to consider factors such as thread safety, resource contention, and transaction management to ensure stable and efficient execution.
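If the caller needs to know when all parallel jobs have finished, one option is to collect a future per launch and wait for all of them. The following is a minimal plain-Java sketch of that pattern, not part of the original example: ParallelLaunchSketch and runAll are hypothetical names, and each Supplier stands in for a jobLauncher.run() call.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Plain-Java model of launching independent tasks in parallel and waiting for all of them.
final class ParallelLaunchSketch {

    // Submits every task to a fixed-size pool first, then blocks until each one
    // has finished. Results are returned in the same order as the input tasks.
    static List<String> runAll(List<Supplier<String>> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<String>> futures = tasks.stream()
                .map(task -> CompletableFuture.supplyAsync(task, pool))
                .collect(Collectors.toList());
            return futures.stream()
                .map(CompletableFuture::join) // rethrows a task failure as CompletionException
                .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Supplier<String>> launches = List.of(
            () -> "jobOne finished on " + Thread.currentThread().getName(),
            () -> "jobTwo finished on " + Thread.currentThread().getName());
        runAll(launches, 2).forEach(System.out::println);
    }
}
```

Bounding the pool size is one simple way to limit resource contention when many jobs are launched at once.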
6. Using Job Scheduling
Sometimes, we don’t just want to run multiple jobs; we want to run them at specific times or intervals. This is where job scheduling comes into play. We can accomplish this using either Spring’s scheduling support or an external scheduler.
6.1. Using Spring’s @Scheduled Annotation
The @Scheduled annotation allows a method (here, one that launches a job) to be executed repeatedly at a given interval or on a cron schedule. This approach requires enabling scheduling with the @EnableScheduling annotation.
Let’s create a ScheduledJobs class with the required annotation to configure our job(s):
@Configuration
@EnableScheduling
public class ScheduledJobs {
private static final Logger log = LoggerFactory.getLogger(ScheduledJobs.class);
@Autowired
private Job jobOne;
@Autowired
private Job jobTwo;
@Autowired
private JobLauncher jobLauncher;
@Scheduled(cron = "0 */1 * * * *") // Run every minute
public void runJob1() throws Exception {
JobParameters jobParameters = new JobParametersBuilder()
.addString("jobID", String.valueOf(System.currentTimeMillis()))
.toJobParameters();
log.info("Executing scheduled job 1");
jobLauncher.run(jobOne, jobParameters);
}
@Scheduled(fixedRate = 1000 * 60 * 3) // Run every 3 minutes
public void runJob2() throws Exception {
JobParameters jobParameters = new JobParametersBuilder()
.addString("jobID", String.valueOf(System.currentTimeMillis()))
.toJobParameters();
log.info("Executing scheduled job 2");
jobLauncher.run(jobTwo, jobParameters);
}
}
In this example, we used the jobs we defined in the previous sections. We configured jobOne to run every minute with a cron expression and jobTwo to run every three minutes with a fixed rate. The @Scheduled annotation supports simple to complex scheduling patterns using fixed rates or cron expressions.
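To get a feel for what a fixed rate means, we can look at the JDK’s ScheduledExecutorService, whose scheduleAtFixedRate() method behaves in a conceptually similar way: the task is started at regular intervals measured from the start of the schedule. The sketch below is a plain-Java illustration under that assumption, not Spring’s implementation. FixedRateSketch and runAtFixedRate are hypothetical names, and the period is shortened to milliseconds so the example finishes quickly.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Plain-Java illustration of fixed-rate execution using the JDK scheduler.
final class FixedRateSketch {

    // Starts the task immediately, repeats it every periodMillis, and returns true
    // once it has run at least `runs` times (false if the timeout elapses first).
    static boolean runAtFixedRate(Runnable task, long periodMillis, int runs, long timeoutMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch remaining = new CountDownLatch(runs);
        try {
            scheduler.scheduleAtFixedRate(() -> {
                task.run();
                remaining.countDown();
            }, 0, periodMillis, TimeUnit.MILLISECONDS);
            return remaining.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow(); // stop the repeating task
        }
    }

    public static void main(String[] args) {
        boolean done = runAtFixedRate(() -> System.out.println("Executing scheduled job"), 100, 3, 5_000);
        System.out.println("ran at least three times: " + done);
    }
}
```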
6.2. Using Quartz Scheduler
The Quartz scheduler is a powerful library for scheduling tasks in Java applications. Just like @Scheduled, Quartz allows multiple jobs to run at specific times or intervals. To use Quartz, we’ll need to add the spring-boot-starter-quartz dependency:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-quartz</artifactId>
<version>3.3.2</version>
</dependency>
Next, let’s create two Quartz jobs, QuartzJobOne and QuartzJobTwo. Note that Job here is Quartz’s org.quartz.Job interface, not Spring Batch’s Job:
@Component
public class QuartzJobOne implements Job {
private static final Logger log = LoggerFactory.getLogger(QuartzJobOne.class);
@Override
public void execute(JobExecutionContext context) throws JobExecutionException {
try {
log.info("Job One is executing from quartz");
} catch (Exception e) {
log.error("Error executing Job One: {}", e.getMessage(), e);
throw new JobExecutionException(e);
}
}
}
@Component
public class QuartzJobTwo implements Job {
private static final Logger log = LoggerFactory.getLogger(QuartzJobTwo.class);
@Override
public void execute(JobExecutionContext context) throws JobExecutionException {
try {
log.info("Job Two is executing from quartz");
} catch (Exception e) {
log.error("Error executing Job Two: {}", e.getMessage(), e);
throw new JobExecutionException(e);
}
}
}
Now, let’s define a JobDetail bean and a Trigger bean for each job:
@Configuration
public class QuartzConfig {
@Autowired
private Job quartzJobOne;
@Autowired
private Job quartzJobTwo;
@Bean
public JobDetail job1Detail() {
return JobBuilder.newJob().ofType(quartzJobOne.getClass())
.withIdentity("quartzJobOne", "group1")
.storeDurably()
.build();
}
@Bean
public JobDetail job2Detail() {
return JobBuilder.newJob().ofType(quartzJobTwo.getClass())
.withIdentity("quartzJobTwo", "group1")
.storeDurably()
.build();
}
@Bean
public Trigger job1Trigger(JobDetail job1Detail) {
return TriggerBuilder.newTrigger()
.forJob(job1Detail)
.withIdentity("quartzJobOneTrigger", "group1")
.withSchedule(CronScheduleBuilder.cronSchedule("0/10 * * * * ?"))
.build();
}
@Bean
public Trigger job2Trigger(JobDetail job2Detail) {
return TriggerBuilder.newTrigger()
.forJob(job2Detail)
.withIdentity("quartzJobTwoTrigger", "group1")
.withSchedule(CronScheduleBuilder.cronSchedule("0/15 * * * * ?"))
.build();
}
@Bean
public SchedulerFactoryBean schedulerFactoryBean() {
SchedulerFactoryBean schedulerFactory = new SchedulerFactoryBean();
schedulerFactory.setJobDetails(job1Detail(), job2Detail());
schedulerFactory.setTriggers(job1Trigger(job1Detail()), job2Trigger(job2Detail()));
return schedulerFactory;
}
}
We created a JobDetail for each Quartz job using JobBuilder, specifying the job class and its identity. We then created a trigger for each job and defined when it should run using a cron expression: quartzJobOne runs every 10 seconds and quartzJobTwo every 15 seconds.
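In both cron expressions, the first field is the seconds field, and a value such as 0/10 means "start at second 0 and fire every 10 seconds". The following plain-Java sketch expands such a field into the seconds of each minute at which the trigger fires. It’s only an illustration: CronSecondsSketch and firingSeconds are hypothetical helpers, not part of the Quartz API, and they handle just the start/step form used here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper that expands a "start/step" cron seconds field.
final class CronSecondsSketch {

    // Returns the seconds (0-59) within each minute at which a field
    // such as "0/10" fires. Only the "start/step" form is supported.
    static List<Integer> firingSeconds(String field) {
        String[] parts = field.split("/");
        int start = Integer.parseInt(parts[0]);
        int step = Integer.parseInt(parts[1]);
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive: " + field);
        }
        List<Integer> seconds = new ArrayList<>();
        for (int second = start; second < 60; second += step) {
            seconds.add(second);
        }
        return seconds;
    }

    public static void main(String[] args) {
        System.out.println(firingSeconds("0/10")); // prints [0, 10, 20, 30, 40, 50]
        System.out.println(firingSeconds("0/15")); // prints [0, 15, 30, 45]
    }
}
```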
The schedulerFactoryBean bean registers the job details and triggers, and the scheduler then starts our jobs automatically. Quartz offers many other options, such as running jobs with parameters, scheduling with calendars, and pausing and resuming jobs.
Quartz is highly flexible and supports complex scheduling scenarios. However, it requires additional setup and is more complex than using @Scheduled.
7. Dynamic Job Execution
We’ve walked through a few approaches to running multiple jobs using Spring Batch, all of which require us to configure and define our jobs statically upfront. However, there are situations where we want to create jobs on demand based on runtime conditions. As usual with Spring Batch, we can accomplish this using either a chunk-oriented or a tasklet-based approach. For this example, we’ll use the chunk-oriented approach.
In the chunk-oriented approach, each job’s data is read by an ItemReader and then handled by an ItemProcessor. The processed items are collected into chunks, which are then passed to an ItemWriter.
Let’s create a class DynamicJobService and define the method that will be responsible for running our jobs:
@Service
public class DynamicJobService {
private final JobRepository jobRepository;
private final JobLauncher jobLauncher;
private final PlatformTransactionManager transactionManager;
public DynamicJobService(JobRepository jobRepository, JobLauncher jobLauncher, PlatformTransactionManager transactionManager) {
this.jobRepository = jobRepository;
this.jobLauncher = jobLauncher;
this.transactionManager = transactionManager;
}
public void createAndRunJob(Map<String, List<String>> jobsData) throws Exception {
List<Job> jobs = new ArrayList<>();
// Create chunk-oriented jobs
for (Map.Entry<String, List<String>> entry : jobsData.entrySet()) {
if (entry.getValue() != null) {
jobs.add(createJob(entry.getKey(), entry.getValue()));
}
}
// Run all jobs
for (Job job : jobs) {
JobParameters jobParameters = new JobParametersBuilder().addString("jobID", String.valueOf(System.currentTimeMillis()))
.toJobParameters();
jobLauncher.run(job, jobParameters);
}
}
private Job createJob(String jobName, List<String> data) {
return new JobBuilder(jobName, jobRepository).start(createStep(data))
.build();
}
private Step createStep(List<String> data) {
return new StepBuilder("step", jobRepository).<String, String> chunk(10, transactionManager)
.reader(new ListItemReader<>(data))
.processor(item -> item.toUpperCase())
.writer(items -> items.forEach(System.out::println))
.build();
}
}
In the example above, we created a method called createAndRunJob, which generates jobs based on jobsData and launches them. Here’s what happens during execution:
The ListItemReader configured through reader() reads items one at a time from the input list. Each item is passed to the processor, which converts the whole item to uppercase. The processed items are collected into a chunk, with a chunk size of 10. Once a chunk is full or there is no more data, all items in the chunk are passed to the writer, which prints them to the console. This process repeats until all items are processed.
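The flow described above can be modeled in a few lines of plain Java. This is a simplified sketch under our own assumptions, not Spring Batch’s actual implementation, which also handles transactions, restarts, and error handling. ChunkLoopSketch and its run method are hypothetical names, with an Iterator, a Function, and a Consumer standing in for the reader, processor, and writer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java model of a chunk-oriented step: read, process, write per chunk.
final class ChunkLoopSketch {

    // Reads items one at a time, processes each one, and hands every full chunk
    // to the writer. A final, partially filled chunk is written once the input
    // is exhausted. Returns the number of chunks written.
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        int chunksWritten = 0;
        List<O> chunk = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writer.accept(List.copyOf(chunk));
                chunk.clear();
                chunksWritten++;
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(List.copyOf(chunk)); // flush the last chunk
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        List<String> data = List.of("data1", "data2", "data3");
        int chunks = ChunkLoopSketch.<String, String>run(
            data.iterator(), item -> item.toUpperCase(), items -> System.out.println(items), 2);
        System.out.println("chunks written: " + chunks); // prints chunks written: 2
    }
}
```

With a chunk size of 2 and three items, the writer is called twice: once with a full chunk and once with the single remaining item.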
Let’s see the service in action:
@Autowired
private DynamicJobService dynamicJobService;
@Test
void givenJobData_whenJobsCreated_thenJobsRunSuccessfully() throws Exception {
Map<String, List<String>> jobsData = new HashMap<>();
jobsData.put("chunkJob1", Arrays.asList("data1", "data2", "data3"));
jobsData.put("chunkJob2", Arrays.asList("data4", "data5", "data6"));
assertDoesNotThrow(() -> dynamicJobService.createAndRunJob(jobsData), "Dynamic job creation and execution should run successfully");
}
We passed the data for two jobs to the createAndRunJob method of the service, with each entry holding a job name and its data.
In a real-world application, we’d likely run more complex processing logic. If the built-in implementations don’t meet our requirements, we can create custom implementations of ItemReader, ItemProcessor, and ItemWriter.
8. Conclusion
In this article, we’ve explored several approaches to running multiple jobs using Spring Batch: sequential and parallel execution, scheduling with @Scheduled and Quartz, and dynamic job creation. By understanding these basic examples, we can design batch-processing systems that are more efficient, scalable, and maintainable.
The right approach depends on what best fits our specific needs.
Full implementation is available over on GitHub.