1. Overview
Generating random values is a very common task. This is why Java provides the java.util.Random class.
However, this class doesn’t perform well in a multi-threaded environment.
Put simply, Random performs poorly in a multi-threaded environment because of contention: multiple threads share the same Random instance and compete to update its state.
To address that limitation, Java introduced the java.util.concurrent.ThreadLocalRandom class in JDK 7 for generating random numbers in a multi-threaded environment.
Let’s see how ThreadLocalRandom performs and how to use it in real-world applications.
2. ThreadLocalRandom Over Random
ThreadLocalRandom is a combination of the ThreadLocal and Random classes (more on this later) and is isolated to the current thread. Thus, it achieves better performance in a multithreaded environment by simply avoiding any concurrent access to instances of Random.
The random numbers obtained by one thread are not affected by any other thread, whereas a java.util.Random instance shared across threads hands out numbers from one common sequence.
Also, unlike Random, ThreadLocalRandom doesn’t support setting the seed explicitly. Instead, it overrides the setSeed(long seed) method inherited from Random to always throw an UnsupportedOperationException if called.
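To see this behavior in practice, here’s a minimal sketch that tries to seed the current thread’s generator and reports whether the call was rejected. The class and method names (SetSeedDemo, setSeedIsRejected) are ours, chosen only for illustration:

```java
import java.util.concurrent.ThreadLocalRandom;

final class SetSeedDemo {

    // Returns true if setSeed() on the current thread's generator is rejected.
    static boolean setSeedIsRejected() {
        try {
            ThreadLocalRandom.current().setSeed(42L);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("setSeed rejected: " + setSeedIsRejected());
    }
}
```

If reproducible sequences are needed, for example in tests, a dedicated Random created with a fixed seed is the usual alternative.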
2.1. Thread Contention
So far, we’ve established that the Random class performs poorly in highly concurrent environments. To better understand this, let’s see how one of its primary operations, next(int), is implemented:
private final AtomicLong seed;

protected int next(int bits) {
    long oldseed, nextseed;
    AtomicLong seed = this.seed;
    do {
        oldseed = seed.get();
        nextseed = (oldseed * multiplier + addend) & mask;
    } while (!seed.compareAndSet(oldseed, nextseed));

    return (int)(nextseed >>> (48 - bits));
}
This is a Java implementation of the Linear Congruential Generator algorithm. All threads that share a Random instance also share the same seed instance variable.
To generate the next random set of bits, it first tries to change the shared seed value atomically via compareAndSet or CAS for short.
When multiple threads attempt to update the seed concurrently using CAS, one thread wins and updates the seed, and the rest lose. Losing threads will try the same process over and over again until they get a chance to update the value and ultimately generate the random number.
This algorithm is lock-free, and different threads can progress concurrently. However, when the contention is high, the number of CAS failures and retries will hurt the overall performance significantly.
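To make the CAS loop concrete, here’s a standalone sketch of the same generator. The class name LcgSketch and its helper method are hypothetical. The constants and the initial seed scrambling are the ones documented for java.util.Random, so the sketch should produce the same int sequence as a Random created with the same seed:

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

final class LcgSketch {

    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long ADDEND = 0xBL;
    private static final long MASK = (1L << 48) - 1;

    private final AtomicLong seed;

    LcgSketch(long initialSeed) {
        // Same scrambling that java.util.Random applies to a user-supplied seed.
        this.seed = new AtomicLong((initialSeed ^ MULTIPLIER) & MASK);
    }

    int next(int bits) {
        long oldSeed;
        long nextSeed;
        do {
            oldSeed = seed.get();
            nextSeed = (oldSeed * MULTIPLIER + ADDEND) & MASK;
            // If another thread changed the seed in the meantime, the CAS fails and we retry.
        } while (!seed.compareAndSet(oldSeed, nextSeed));
        return (int) (nextSeed >>> (48 - bits));
    }

    // Compares the sketch against java.util.Random for the first 'samples' ints.
    static boolean matchesJdkRandom(long initialSeed, int samples) {
        LcgSketch sketch = new LcgSketch(initialSeed);
        Random jdk = new Random(initialSeed);
        for (int i = 0; i < samples; i++) {
            if (sketch.next(32) != jdk.nextInt()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("matches java.util.Random: " + matchesJdkRandom(42L, 100));
    }
}
```

Every call to next() performs at least one CAS on the single shared AtomicLong, which is exactly the spot where threads collide under load.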
On the other hand, ThreadLocalRandom completely removes this contention, as each thread works with its own confined seed instead of a shared one.
Let’s now take a look at some of the ways to generate random int, long and double values.
3. Generating Random Values Using ThreadLocalRandom
As per the Oracle documentation, we just need to call the ThreadLocalRandom.current() method, and it will return the instance of ThreadLocalRandom for the current thread. We can then generate random values by invoking the available instance methods of the class.
Let’s generate a random int value without any bounds:
int unboundedRandomValue = ThreadLocalRandom.current().nextInt();
Next, let’s see how we can generate a random bounded int value, meaning a value between a given lower and upper limit.
Here’s an example of generating a random int value between 0 and 100:
int boundedRandomValue = ThreadLocalRandom.current().nextInt(0, 100);
Please note, 0 is the inclusive lower limit and 100 is the exclusive upper limit.
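A quick way to convince ourselves of these bounds is to draw many values and check each one. The sketch below (BoundedIntDemo and its methods are illustrative names, not part of the JDK) also shows that an empty range, where the lower limit is not smaller than the upper limit, is rejected with an IllegalArgumentException:

```java
import java.util.concurrent.ThreadLocalRandom;

final class BoundedIntDemo {

    // Checks that every draw from nextInt(origin, bound) lies in [origin, bound).
    static boolean allWithin(int origin, int bound, int draws) {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        for (int i = 0; i < draws; i++) {
            int value = random.nextInt(origin, bound);
            if (value < origin || value >= bound) {
                return false;
            }
        }
        return true;
    }

    // The lower limit must be strictly smaller than the upper limit.
    static boolean rejectsEmptyRange() {
        try {
            ThreadLocalRandom.current().nextInt(5, 5);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("all draws in [0, 100): " + allWithin(0, 100, 10_000));
        System.out.println("empty range rejected: " + rejectsEmptyRange());
    }
}
```

Because the upper limit is exclusive, nextInt(0, 1) can only ever return 0.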
We can generate random values for long and double by invoking nextLong() and nextDouble() methods in a similar way as shown in the examples above.
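The long and double variants follow the same pattern. Here’s a small sketch; the class and method names, as well as the chosen limits, are ours and serve only as an illustration:

```java
import java.util.concurrent.ThreadLocalRandom;

final class LongAndDoubleDemo {

    // A long in [1000, 2000): inclusive lower limit, exclusive upper limit.
    static long boundedLong() {
        return ThreadLocalRandom.current().nextLong(1_000L, 2_000L);
    }

    // A double in [0.5, 1.5).
    static double boundedDouble() {
        return ThreadLocalRandom.current().nextDouble(0.5, 1.5);
    }

    // Without arguments, nextDouble() returns a value in [0.0, 1.0).
    static double unitDouble() {
        return ThreadLocalRandom.current().nextDouble();
    }

    public static void main(String[] args) {
        System.out.println("long:   " + boundedLong());
        System.out.println("double: " + boundedDouble());
        System.out.println("unit:   " + unitDouble());
    }
}
```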
ThreadLocalRandom also provides the nextGaussian() method, as Random does, to generate the next normally distributed value with a mean of 0.0 and a standard deviation of 1.0 from the generator’s sequence.
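As a rough check of that distribution, we can draw a large sample and compute its mean and standard deviation, which should come out close to 0.0 and 1.0. This is only a sketch with hypothetical names (GaussianDemo, sampleStats), and the exact numbers will differ from run to run:

```java
import java.util.concurrent.ThreadLocalRandom;

final class GaussianDemo {

    // Returns { mean, standard deviation } of n draws from nextGaussian().
    static double[] sampleStats(int n) {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        double sum = 0.0;
        double sumOfSquares = 0.0;
        for (int i = 0; i < n; i++) {
            double value = random.nextGaussian();
            sum += value;
            sumOfSquares += value * value;
        }
        double mean = sum / n;
        double variance = sumOfSquares / n - mean * mean;
        return new double[] { mean, Math.sqrt(variance) };
    }

    public static void main(String[] args) {
        double[] stats = sampleStats(200_000);
        System.out.printf("mean=%.4f stdDev=%.4f%n", stats[0], stats[1]);
    }
}
```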
As with the Random class, we can also use the doubles(), ints() and longs() methods to generate streams of random values.
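For instance, the stream methods accept an optional size and range. The following sketch (RandomStreamDemo and its methods are illustrative names) collects five dice rolls into an array and averages a stream of doubles:

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

final class RandomStreamDemo {

    // Five ints in [1, 7), i.e. five rolls of a six-sided die.
    static int[] fiveDiceRolls() {
        return ThreadLocalRandom.current().ints(5, 1, 7).toArray();
    }

    // Average of 'count' doubles, each in [0.0, 1.0).
    static double averageOf(int count) {
        return ThreadLocalRandom.current()
          .doubles(count)
          .average()
          .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        System.out.println("rolls:   " + Arrays.toString(fiveDiceRolls()));
        System.out.println("average: " + averageOf(1_000));
    }
}
```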
4. Comparing ThreadLocalRandom and Random Using JMH
Let’s see how we can generate random values in a multi-threaded environment, by using the two classes, then compare their performance using JMH.
First, let’s create an example where all the threads are sharing a single instance of Random. Here, we’re submitting the task of generating a random value using the Random instance to an ExecutorService:
ExecutorService executor = Executors.newWorkStealingPool();
List<Callable<Integer>> callables = new ArrayList<>();
Random random = new Random();
for (int i = 0; i < 1000; i++) {
    callables.add(() -> {
        return random.nextInt();
    });
}
executor.invokeAll(callables);
Let’s check the performance of the code above using JMH benchmarking:
# Run complete. Total time: 00:00:36
Benchmark                                             Mode  Cnt    Score     Error  Units
ThreadLocalRandomBenchMarker.randomValuesUsingRandom  avgt   20  771.613 ± 222.220  us/op
Similarly, let’s now use ThreadLocalRandom instead of the shared Random instance, so that each thread in the pool generates values from its own thread-confined state:
ExecutorService executor = Executors.newWorkStealingPool();
List<Callable<Integer>> callables = new ArrayList<>();
for (int i = 0; i < 1000; i++) {
    callables.add(() -> {
        return ThreadLocalRandom.current().nextInt();
    });
}
executor.invokeAll(callables);
Here’s the result of using ThreadLocalRandom:
# Run complete. Total time: 00:00:36
Benchmark                                                        Mode  Cnt    Score     Error  Units
ThreadLocalRandomBenchMarker.randomValuesUsingThreadLocalRandom  avgt   20  624.911 ± 113.268  us/op
Finally, comparing the JMH results above, the average time taken to generate 1000 random values is about 772 microseconds with Random and about 625 microseconds with ThreadLocalRandom.
Thus, in this benchmark, ThreadLocalRandom comes out ahead in a highly concurrent environment, although the reported error margins (± 222 and ± 113 us/op) are wide enough that the two ranges overlap.
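The two snippets above are fragments, so here’s one way they could be wrapped into a self-contained class that runs without JMH. This is a sketch under our own assumptions: the RandomWorkloads class, its method names, and the choice to wrap checked exceptions are not taken from the benchmark code. It only confirms that all tasks complete and deliberately measures nothing, since timing should be left to JMH:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;

final class RandomWorkloads {

    // Submits 'count' copies of the task to a work-stealing pool and
    // returns how many of them completed successfully.
    static int runAll(Callable<Integer> task, int count) {
        ExecutorService executor = Executors.newWorkStealingPool();
        try {
            List<Callable<Integer>> callables = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                callables.add(task);
            }
            int completed = 0;
            for (Future<Integer> future : executor.invokeAll(callables)) {
                future.get(); // rethrows if the task itself failed
                completed++;
            }
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            executor.shutdown();
        }
    }

    // All tasks share one Random instance, and therefore one seed.
    static int withSharedRandom(int count) {
        Random shared = new Random();
        return runAll(() -> shared.nextInt(), count);
    }

    // Each task asks for the generator of whichever thread executes it.
    static int withThreadLocalRandom(int count) {
        return runAll(() -> ThreadLocalRandom.current().nextInt(), count);
    }

    public static void main(String[] args) {
        System.out.println("shared Random tasks completed:     " + withSharedRandom(1000));
        System.out.println("ThreadLocalRandom tasks completed: " + withThreadLocalRandom(1000));
    }
}
```

Note that ThreadLocalRandom.current() is called inside the task, on the worker thread, rather than once up front on the submitting thread.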
To learn more about JMH, check out our previous article here.
5. Implementation Details
It’s a good mental model to think of a ThreadLocalRandom as a combination of ThreadLocal and Random classes. As a matter of fact, this mental model was aligned with the actual implementation before Java 8.
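That mental model can be written down in a few lines. The sketch below is not the JDK’s code; ThreadLocalRandomModel is a hypothetical class, and it uses ThreadLocal.withInitial (a Java 8 API) purely for brevity. It shows that each thread lazily gets its own Random instance:

```java
import java.util.Random;

final class ThreadLocalRandomModel {

    // One lazily created Random per thread: the "ThreadLocal + Random" model.
    private static final ThreadLocal<Random> PER_THREAD =
      ThreadLocal.withInitial(Random::new);

    static Random current() {
        return PER_THREAD.get();
    }

    // Returns true if a second thread receives a different Random than the caller.
    static boolean otherThreadGetsDifferentInstance() {
        Random mine = current();
        Random[] theirs = new Random[1];
        Thread other = new Thread(() -> theirs[0] = current());
        other.start();
        try {
            other.join(); // join() makes the write to theirs[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return theirs[0] != null && theirs[0] != mine;
    }

    public static void main(String[] args) {
        System.out.println("same thread, same instance:  " + (current() == current()));
        System.out.println("other thread, other instance: " + otherThreadGetsDifferentInstance());
    }
}
```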
As of Java 8, however, this alignment broke down completely as the ThreadLocalRandom became a singleton. Here’s how the current() method looks in Java 8+:
static final ThreadLocalRandom instance = new ThreadLocalRandom();

public static ThreadLocalRandom current() {
    if (U.getInt(Thread.currentThread(), PROBE) == 0)
        localInit();

    return instance;
}
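We can observe the singleton from the outside by comparing the references that two different threads receive from current(). This is a sketch with a hypothetical class name (SingletonCheck). It relies on an implementation detail rather than a documented guarantee, so we’d expect true on Java 8 and later and false on Java 7:

```java
import java.util.concurrent.ThreadLocalRandom;

final class SingletonCheck {

    // Returns true if another thread gets the very same object from current().
    static boolean sameInstanceAcrossThreads() {
        ThreadLocalRandom mine = ThreadLocalRandom.current();
        ThreadLocalRandom[] theirs = new ThreadLocalRandom[1];
        Thread other = new Thread(() -> theirs[0] = ThreadLocalRandom.current());
        other.start();
        try {
            other.join(); // join() makes the write to theirs[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return theirs[0] == mine;
    }

    public static void main(String[] args) {
        System.out.println("same instance across threads: " + sameInstanceAcrossThreads());
    }
}
```

Even though the object is shared, the generated sequences stay independent because the state lives in the calling thread, not in the ThreadLocalRandom object.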
It’s true that sharing one global Random instance leads to sub-optimal performance in high contention. However, using one dedicated instance per thread is also overkill.
Instead of a dedicated instance of Random per thread, each thread only needs to maintain its own seed value. As of Java 8, the Thread class itself has been retrofitted to maintain the seed value:
public class Thread implements Runnable {
    // omitted

    @jdk.internal.vm.annotation.Contended("tlr")
    long threadLocalRandomSeed;

    @jdk.internal.vm.annotation.Contended("tlr")
    int threadLocalRandomProbe;

    @jdk.internal.vm.annotation.Contended("tlr")
    int threadLocalRandomSecondarySeed;
}
The threadLocalRandomSeed variable is responsible for maintaining the current seed value for ThreadLocalRandom. Moreover, the secondary seed, threadLocalRandomSecondarySeed, is usually used internally by the likes of ForkJoinPool.
This implementation incorporates a few optimizations to make ThreadLocalRandom even more performant:
- Avoiding false sharing by using the @Contended annotation, which basically adds enough padding to isolate the contended variables in their own cache lines
- Using sun.misc.Unsafe to update these three variables instead of using the Reflection API
- Avoiding extra hashtable lookups associated with the ThreadLocal implementation
6. Conclusion
This article illustrated the difference between java.util.Random and java.util.concurrent.ThreadLocalRandom.
We also saw the advantage of ThreadLocalRandom over Random in a multithreaded environment, compared their performance, and looked at how we can generate random values using the class.
ThreadLocalRandom is a simple addition to the JDK, but it can create a notable impact in highly concurrent applications.
And, as always, the implementation of all of these examples can be found over on GitHub.