1. Overview

Typically when making HTTP requests in our applications, we execute these calls sequentially. However, there are occasions when we might want to perform these requests simultaneously.

For example, we may want to do this when retrieving data from multiple sources or when we simply want to try giving our application a performance boost.

In this quick tutorial, we’ll take a look at several approaches to see how we can accomplish this by making parallel service calls using the Spring reactive WebClient.

2. Recap on Reactive Programming

To quickly recap WebClient was introduced in Spring 5 and is included as part of the Spring Web Reactive module. It provides a reactive, non-blocking interface for sending HTTP requests.

For an in-depth guide to reactive programming with WebFlux, check out our excellent Guide to Spring 5 WebFlux.

3. A Simple User Service

We’re going to be using a simple User API in our examples. This API has a GET method that exposes one method getUser for retrieving a user using the id as a parameter.

Let’s take a look at how to make a single call to retrieve a user for a given id:

WebClient webClient = WebClient.create("http://localhost:8080");
public Mono<User> getUser(int id) {
    LOG.info(String.format("Calling getUser(%d)", id));

    return webClient.get()
        .uri("/user/{id}", id)
        .retrieve()
        .bodyToMono(User.class);
}

In the next section, we’ll learn how we can call this method concurrently.

4. Making Simultaneous WebClient Calls

In this section, we’re going see several examples for calling our getUser method concurrently. We’ll also take a look at both publisher implementations Flux and Mono in the examples as well.

4.1. Multiple Calls to the Same Service

Let’s now imagine that we want to fetch data about five users simultaneously and return the result as a list of users:

public Flux fetchUsers(List userIds) {
    return Flux.fromIterable(userIds)
        .flatMap(this::getUser);
}

Let’s decompose the steps to understand what we’ve done:

We begin by creating a Flux from our list of userIds using the static fromIterable method.

Next, we invoke flatMap to run the getUser method we created previously. This reactive operator has a concurrency level of 256 by default, meaning it executes at most 256 getUser calls simultaneously. This number is configurable via method parameter using an overloaded version of flatMap.

It’s worth noting, that since operations are happening in parallel, we don’t know the resulting order. If we need to maintain the input order, we can use flatMapSequential operator instead.

As Spring WebClient uses a non-blocking HTTP client under the hood, there is no need to define any Scheduler by the user. WebClient takes care of scheduling calls and publishing their results on appropriate threads internally, without blocking.

4.2. Multiple Calls to Different Services Returning the Same Type

Let’s now take a look at how we can call multiple services simultaneously.

In this example, we’re going to create another endpoint which returns the same User type:

public Mono<User> getOtherUser(int id) {
    return webClient.get()
        .uri("/otheruser/{id}", id)
        .retrieve()
        .bodyToMono(User.class);
}

Now, the method to perform two or more calls in parallel becomes:

public Flux fetchUserAndOtherUser(int id) {
    return Flux.merge(getUser(id), getOtherUser(id));
}

The main difference in this example is that we’ve used the static method merge instead of the fromIterable method. Using the merge method, we can combine two or more Fluxes into one result.

4.3. Multiple Calls to Different Services Different Types

The probability of having two services returning the same thing is rather low. More typically we’ll have another service providing a different response type and our goal is to merge two (or more) responses.

The Mono class provides the static zip method which lets us combine two or more results:

public Mono fetchUserAndItem(int userId, int itemId) {
    Mono user = getUser(userId);
    Mono item = getItem(itemId);

    return Mono.zip(user, item, UserWithItem::new);
}

*The zip method combines the given user and item Monos into a new Mono with the type UserWithItem*. This is a simple POJO object which wraps a user and item.

5. Testing

In this section, we’re going to see how we can test the code we’ve already seen and, in particular, verify that service calls are happening in parallel.

For this, we’re going to use Wiremock to create a mock server and we’ll test the fetchUsers method:

@Test
public void givenClient_whenFetchingUsers_thenExecutionTimeIsLessThanDouble() {
        
    int requestsNumber = 5;
    int singleRequestTime = 1000;

    for (int i = 1; i <= requestsNumber; i++) {
        stubFor(get(urlEqualTo("/user/" + i)).willReturn(aResponse().withFixedDelay(singleRequestTime)
            .withStatus(200)
            .withHeader("Content-Type", "application/json")
            .withBody(String.format("{ \"id\": %d }", i))));
    }

    List<Integer> userIds = IntStream.rangeClosed(1, requestsNumber)
        .boxed()
        .collect(Collectors.toList());

    Client client = new Client("http://localhost:8089");

    long start = System.currentTimeMillis();
    List<User> users = client.fetchUsers(userIds).collectList().block();
    long end = System.currentTimeMillis();

    long totalExecutionTime = end - start;

    assertEquals("Unexpected number of users", requestsNumber, users.size());
    assertTrue("Execution time is too big", 2 * singleRequestTime > totalExecutionTime);
}

In this example, the approach we’ve taken is to mock the user service and make it respond to any request in one second. Now if we make five calls using our WebClient we can assume that it shouldn’t take more than two seconds as the calls happen concurrently.

To learn about other techniques for testing WebClient check out our guide to Mocking a WebClient in Spring.

6. Conclusion

In this tutorial, we’ve explored a few ways we can make HTTP service calls simultaneously using the Spring 5 Reactive WebClient.

First, we showed how to make calls in parallel to the same service. Later, we saw an example of how to call two services returning different types. Then, we showed how we can test this code using a mock server.

As always, the source code for this article is available over on GitHub.