1. Overview

In this article, we’ll be looking at two constructs from the java.util.concurrent package: LongAdder and LongAccumulator.

Both are created to be very efficient in the multi-threaded environment and both leverage very clever tactics to be lock-free and still remain thread-safe.

2. LongAdder

Let’s consider some logic that’s incrementing some values very often, where using an AtomicLong can be a bottleneck. This uses a compare-and-swap operation, which – under heavy contention – can lead to a lot of wasted CPU cycles.

LongAdder, on the other hand, uses a very clever trick to reduce contention between threads, when these are incrementing it.

When we want to increment an instance of the LongAdder, we need to call the increment() method. That implementation keeps an array of counters that can grow on demand.

And so, when more threads are calling increment(), the array will be longer. Each record in the array can be updated separately – reducing the contention. Due to that fact, the LongAdder is a very efficient way to increment a counter from multiple threads.

Let’s create an instance of the LongAdder class and update it from multiple threads:

LongAdder counter = new LongAdder();
ExecutorService executorService = Executors.newFixedThreadPool(8);

int numberOfThreads = 4;
int numberOfIncrements = 100;

Runnable incrementAction = () -> IntStream
  .range(0, numberOfIncrements)
  .forEach(i -> counter.increment());

for (int i = 0; i < numberOfThreads; i++) {
    executorService.execute(incrementAction);
}

The result of the counter in the LongAdder is not available until we call the sum() method. That method will iterate over all values of the underneath array, and sum those values returning the proper value. We need to be careful though because the call to the sum() method can be very costly:

assertEquals(counter.sum(), numberOfIncrements * numberOfThreads);

Sometimes, after we call sum(), we want to clear all state that is associated with the instance of the LongAdder and start counting from the beginning. We can use the sumThenReset() method to achieve that:

assertEquals(counter.sumThenReset(), numberOfIncrements * numberOfThreads);
assertEquals(counter.sum(), 0);

Note that the subsequent call to the sum() method returns zero meaning that the state was successfully reset.

Moreover, Java also provides DoubleAdder to maintain a summation of double values with a similar API to LongAdder.

3. LongAccumulator

LongAccumulator is also a very interesting class – which allows us to implement a lock-free algorithm in a number of scenarios. For example, it can be used to accumulate results according to the supplied LongBinaryOperator – this works similarly to the reduce() operation from Stream API.

The instance of the LongAccumulator can be created by supplying the LongBinaryOperator and the initial value to its constructor. The important thing to remember that LongAccumulator will work correctly if we supply it with a commutative function where the order of accumulation does not matter.

LongAccumulator accumulator = new LongAccumulator(Long::sum, 0L);

We’re creating a LongAccumulator which will add a new value to the value that was already in the accumulator. We are setting the initial value of the LongAccumulator to zero, so in the first call of the accumulate() method, the previousValue will have a zero value.

Let’s invoke the accumulate() method from multiple threads:

int numberOfThreads = 4;
int numberOfIncrements = 100;

Runnable accumulateAction = () -> IntStream
  .rangeClosed(0, numberOfIncrements)
  .forEach(accumulator::accumulate);

for (int i = 0; i < numberOfThreads; i++) {
    executorService.execute(accumulateAction);
}

Notice how we’re passing a number as an argument to the accumulate() method. That method will invoke our sum() function.

The LongAccumulator is using the compare-and-swap implementation – which leads to these interesting semantics.

Firstly, it executes an action defined as a LongBinaryOperator, and then it checks if the previousValue changed. If it was changed, the action is executed again with the new value. If not, it succeeds in changing the value that is stored in the accumulator.

We can now assert that the sum of all values from all iterations was 20200:

assertEquals(accumulator.get(), 20200);

Interestingly, Java also provides DoubleAccumulator with the same purpose and API but for double values.

4. Dynamic Striping

All adder and accumulator implementations in Java are inheriting from an interesting base-class called Striped64. Instead of using just one value to maintain the current state, this class uses an array of states to distribute the contention to different memory locations. 

Here’s a simple depiction of what Striped64 does:

Dynamic Striping

Different threads are updating different memory locations. Since we’re using an array (that is, stripes) of states, this idea is called dynamic striping. Interestingly, Striped64 is named after this idea and the fact that it works on 64-bit data types.

We expect dynamic striping to improve the overall performance. However, the way the JVM allocates these states may have a counterproductive effect.

To be more specific, the JVM may allocate those states near each other in the heap. This means that a few states can reside in the same CPU cache line. Therefore, updating one memory location may cause a cache miss to its nearby states. This phenomenon, known as false sharing, will hurt the performance.

To prevent false sharing. the Striped64 implementation adds enough padding around each state to make sure that each state resides in its own cache line:

False Sharing

The @Contended annotation is responsible for adding this padding. The padding improves performance at the expense of more memory consumption.

5. Conclusion

In this quick tutorial, we had a look at LongAdder and LongAccumulator and we’ve shown how to use both constructs to implement very efficient and lock-free solutions.

The implementation of all these examples and code snippets can be found in the GitHub project – this is a Maven project, so it should be easy to import and run as it is.