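What Is Thread-Safety and How to Achieve It?

1. Overview

Java supports multithreading. This means that by running bytecode concurrently in multiple separate threads, the JVM is capable of improving application performance.

Although multithreading is a powerful feature, it comes at a price. In multithreaded environments, we need to write implementations in a thread-safe way. This means that different threads can access the same shared resources without exhibiting erroneous behavior or producing unpredictable results. This programming methodology is known as “thread-safety”.

In this tutorial, we'll look at several approaches to achieving it.

2. Stateless Implementations

In most cases, errors in multithreaded applications are the result of erroneously sharing state between several threads.

So, the first approach that we'll look at is to achieve thread-safety using stateless implementations.

To better understand this approach, let's consider a simple utility class, MathUtils, with a static method that calculates the factorial of a given number: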

    public class MathUtils {

        public static BigInteger factorial(int number) {
            BigInteger f = new BigInteger("1");
            for (int i = 2; i <= number; i++) {
                f = f.multiply(BigInteger.valueOf(i));
            }
            return f;
        }
    }

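The factorial() method is a stateless, deterministic function. Given a specific input, it always produces the same output.

The method neither relies on external state nor maintains any state of its own. Hence, it's considered thread-safe and can safely be called by multiple threads at the same time.

All threads can safely call the factorial() method and will get the expected result without interfering with each other.

Therefore, stateless implementations are the simplest way to achieve thread-safety.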
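
As a minimal sketch of what this means in practice, the following snippet calls the same stateless method from several pool threads at once. The StatelessDemo class, the computeConcurrently() helper, and the pool size are assumptions made for illustration; factorial() simply repeats the logic of MathUtils.factorial():

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of the original example
class StatelessDemo {

    // Uses only its parameter and local variables, so there is no shared state
    static BigInteger factorial(int number) {
        BigInteger f = BigInteger.ONE;
        for (int i = 2; i <= number; i++) {
            f = f.multiply(BigInteger.valueOf(i));
        }
        return f;
    }

    // Submits the same computation 'tasks' times to a small pool and collects the results
    static List<BigInteger> computeConcurrently(int number, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Callable<BigInteger> task = () -> factorial(number);
            List<Future<BigInteger>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(task));
            }
            List<BigInteger> results = new ArrayList<>();
            for (Future<BigInteger> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Every task yields the same value, no matter how the threads interleave
        System.out.println(computeConcurrently(10, 8));
    }
}
```

No locks or other coordination are needed here, because each call works exclusively on its own local variables.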

3. Immutable Implementations

If we need to share state between different threads, we can create thread-safe classes by making them immutable.

Immutability is a powerful, language-agnostic concept and it's fairly easy to achieve in Java.

To put it simply, a class instance is immutable when its internal state can't be modified after it has been constructed.

The easiest way to create an immutable class in Java is by declaring all the fields private and final and not providing setters:

public class MessageService {

    private final String message;

    public MessageService(String message) {
        this.message = message;
    }
    
    // standard getter
    
}

A MessageService object is effectively immutable since its state can't change after its construction. Hence, it's thread-safe.

Moreover, if MessageService were actually mutable, but multiple threads only had read-only access to it, it would be thread-safe as well.

Thus, immutability is just another way to achieve thread-safety.
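
One detail worth keeping in mind is that a final field only fixes the reference, not the object it points to. The sketch below uses a hypothetical MessageBatch class (not part of the original example) to show one common way of keeping a class immutable when a field holds a mutable collection: copy the input defensively and expose only an unmodifiable view:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical class used only for illustration
final class MessageBatch {

    private final List<String> messages;

    MessageBatch(List<String> messages) {
        // Defensive copy: later changes to the caller's list can't leak into this object
        this.messages = Collections.unmodifiableList(new ArrayList<>(messages));
    }

    List<String> getMessages() {
        // Unmodifiable view, so callers can't change the internal state either
        return messages;
    }
}

class MessageBatchDemo {

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(Arrays.asList("a", "b"));
        MessageBatch batch = new MessageBatch(source);

        source.add("c"); // does not affect the batch
        System.out.println(batch.getMessages().size()); // prints 2
    }
}
```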

4. Thread-Local Fields (ThreadLocal)

In object-oriented programming (OOP), objects actually need to maintain state through fields and implement behavior through one or more methods.

If we actually need to maintain state, we can create thread-safe classes that don't share state between threads by making their fields thread-local.

We can easily create classes whose fields are thread-local by simply defining private fields in Thread classes.

We could define, for instance, a Thread class that stores a list of integers:

public class ThreadA extends Thread {

    private final List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);

    @Override
    public void run() {
        numbers.forEach(System.out::println);
    }
}

While another one might hold a list of strings:

public class ThreadB extends Thread {

    private final List<String> letters = Arrays.asList("a", "b", "c", "d", "e", "f");

    @Override
    public void run() {
        letters.forEach(System.out::println);
    }
}

In both implementations, the classes have their own state, but it's not shared with other threads. Thus, the classes are thread-safe.

Similarly, we can create thread-local fields by assigning ThreadLocal instances to a field.

Let's consider, for example, the following StateHolder class:

public class StateHolder {

    private final String state;

    // standard constructors / getter
}

We can easily make it a thread-local variable as follows:

public class ThreadState {

    public static final ThreadLocal<StateHolder> statePerThread = new ThreadLocal<StateHolder>() {

        @Override
        protected StateHolder initialValue() {
            return new StateHolder("active");  
        }
    };

    public static StateHolder getState() {
        return statePerThread.get();
    }
}

Thread-local fields are pretty much like normal class fields, except that each thread that accesses them via a setter/getter gets an independently initialized copy of the field so that each thread has its own state.
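
To see the per-thread copies in action, here's a small sketch. The ThreadLocalDemo class and its helper methods are made up for this illustration; ThreadLocal.withInitial() is the standard factory method available since Java 8:

```java
// Hypothetical demo class, not part of the original example
class ThreadLocalDemo {

    // Each thread that calls get() receives its own copy, initialized to 0
    private static final ThreadLocal<Integer> counter = ThreadLocal.withInitial(() -> 0);

    // Increments the calling thread's copy 'times' times and returns its value
    static int incrementLocally(int times) {
        for (int i = 0; i < times; i++) {
            counter.set(counter.get() + 1);
        }
        return counter.get();
    }

    // Runs incrementLocally() in a fresh thread and returns what that thread observed
    static int runInNewThread(int times) {
        int[] result = new int[1];
        Thread worker = new Thread(() -> result[0] = incrementLocally(times));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runInNewThread(3));   // prints 3
        System.out.println(runInNewThread(5));   // prints 5, unaffected by the first thread
        System.out.println(incrementLocally(1)); // prints 1, the main thread has its own copy
    }
}
```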

5. Synchronized Collections

We can easily create thread-safe collections by using the set of synchronization wrappers included within the collections framework.

We can use, for instance, one of these synchronization wrappers to create a thread-safe collection:

Collection<Integer> syncCollection = Collections.synchronizedCollection(new ArrayList<>());
Thread thread1 = new Thread(() -> syncCollection.addAll(Arrays.asList(1, 2, 3, 4, 5, 6)));
Thread thread2 = new Thread(() -> syncCollection.addAll(Arrays.asList(7, 8, 9, 10, 11, 12)));
thread1.start();
thread2.start();

Let's keep in mind that synchronized collections use intrinsic locking in each method (we'll look at intrinsic locking later).

This means that the methods can be accessed by only one thread at a time, while other threads will be blocked until the method is unlocked by the first thread.

Thus, synchronization has a penalty in performance, due to the underlying logic of synchronized access.
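
The snippet above starts the threads but never waits for them. A slightly fuller sketch, with a hypothetical SyncCollectionDemo class and fillAndSum() helper, joins both threads and then iterates over the result. Since iteration spans many method calls, the Collections Javadoc requires us to hold the wrapper's lock manually while traversing:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical demo class, not part of the original example
class SyncCollectionDemo {

    // Fills the collection from two threads, waits for both, and sums the elements
    static int fillAndSum(Collection<Integer> target) {
        Thread thread1 = new Thread(() -> target.addAll(Arrays.asList(1, 2, 3, 4, 5, 6)));
        Thread thread2 = new Thread(() -> target.addAll(Arrays.asList(7, 8, 9, 10, 11, 12)));
        thread1.start();
        thread2.start();
        try {
            thread1.join();
            thread2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        int sum = 0;
        // Each single call is synchronized, but a whole traversal is not:
        // we lock on the wrapper itself for the duration of the loop
        synchronized (target) {
            for (int number : target) {
                sum += number;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Collection<Integer> syncCollection = Collections.synchronizedCollection(new ArrayList<>());
        System.out.println(fillAndSum(syncCollection)); // prints 78
        System.out.println(syncCollection.size());      // prints 12
    }
}
```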

6. Concurrent Collections

Alternatively to synchronized collections, we can use concurrent collections to create thread-safe collections.

Java provides the java.util.concurrent package, which contains several concurrent collections, such as ConcurrentHashMap:

Map<String,String> concurrentMap = new ConcurrentHashMap<>();
concurrentMap.put("1", "one");
concurrentMap.put("2", "two");
concurrentMap.put("3", "three");

Unlike their synchronized counterparts, concurrent collections achieve thread-safety by dividing their data into segments. In a ConcurrentHashMap, for instance, several threads can acquire locks on different map segments, so multiple threads can access the Map at the same time. (Since Java 8, ConcurrentHashMap locks individual hash bins rather than fixed segments, but the idea of fine-grained locking is the same.)

Concurrent collections usually perform much better than synchronized collections under contention, because threads working on different parts of the data don't block one another.

It's worth mentioning that synchronized and concurrent collections only make the collection itself thread-safe and not the contents.
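
A related caveat is that only individual method calls are thread-safe: a get() followed by a put() is still two separate steps. As a rough sketch, assuming a hypothetical WordCountDemo class, we can lean on the atomic merge() method of ConcurrentHashMap to update a value in one step:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of the original example
class WordCountDemo {

    // Counts the same word from several threads and returns the resulting map
    static Map<String, Integer> countConcurrently(String word, int threads, int perThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    // merge() performs the read-modify-write for this key atomically
                    counts.merge(word, 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently("one", 4, 1000)); // prints {one=4000}
    }
}
```

Replacing merge() with a separate get() and put() would make lost updates possible, even though the map itself would stay internally consistent.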

7. Atomic Objects

It's also possible to achieve thread-safety using the set of atomic classes that Java provides, including AtomicInteger, AtomicLong, AtomicBoolean, and AtomicReference.

Atomic classes allow us to perform atomic operations, which are thread-safe, without using synchronization. An atomic operation appears to other threads as a single indivisible step. These classes typically implement it with a machine-level compare-and-swap (CAS) instruction.

To understand the problem this solves, let's look at the following Counter class:

public class Counter {

    private int counter = 0;

    public void incrementCounter() {
        counter += 1;
    }
    
    public int getCounter() {
        return counter;
    }
}

Let's suppose that in a race condition, two threads access the incrementCounter() method at the same time.

In theory, the final value of the counter field will be 2. But we just can't be sure about the result, because the threads are executing the same code block at the same time and incrementation is not atomic.

Let's create a thread-safe implementation of the Counter class by using an AtomicInteger object:

public class AtomicCounter {

    private final AtomicInteger counter = new AtomicInteger();

    public void incrementCounter() {
        counter.incrementAndGet();
    }
    
    public int getCounter() {
        return counter.get();
    }
}

This is thread-safe because, while the compound assignment counter += 1 takes more than one operation (read, add, write), incrementAndGet is atomic.
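
To get a feel for how an atomic class can avoid locks, here's a sketch of an increment built on compareAndSet(). This is only a conceptual illustration of the retry idea, not the actual implementation of incrementAndGet(); the CasCounter and CasCounterDemo names are assumptions:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class illustrating a compare-and-set retry loop
class CasCounter {

    private final AtomicInteger counter = new AtomicInteger();

    int increment() {
        while (true) {
            int current = counter.get();
            int next = current + 1;
            // Succeeds only if no other thread changed the value in the meantime
            if (counter.compareAndSet(current, next)) {
                return next;
            }
            // Otherwise another thread won the race, so we read again and retry
        }
    }

    int get() {
        return counter.get();
    }
}

class CasCounterDemo {

    // Increments one shared counter from several threads and returns the final value
    static int hammer(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 10_000)); // prints 40000
    }
}
```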

8. Synchronized Methods

While the earlier approaches are very good for collections and primitives, we will at times need greater control than that.

So, another common approach that we can use for achieving thread-safety is implementing synchronized methods.

Simply put, only one thread can access a synchronized method at a time while blocking access to this method from other threads. Other threads will remain blocked until the first thread finishes or the method throws an exception.

We can create a thread-safe version of incrementCounter() in another way by making it a synchronized method:

public synchronized void incrementCounter() {
    counter += 1;
}

We've created a synchronized method by prefixing the method signature with the synchronized keyword.

Since one thread at a time can access a synchronized method, one thread will execute the incrementCounter() method, and in turn, others will do the same. No overlapping execution will occur whatsoever.

Synchronized methods rely on the use of “intrinsic locks” or “monitor locks”. An intrinsic lock is an implicit internal entity associated with a particular class instance.

In a multithreaded context, the term monitor is just a reference to the role that the lock performs on the associated object, as it enforces exclusive access to a set of specified methods or statements.

When a thread calls a synchronized method, it acquires the intrinsic lock. After the thread finishes executing the method, it releases the lock, hence allowing other threads to acquire the lock and get access to the method.

We can implement synchronization in instance methods, static methods, and statements (synchronized statements).
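
Instance and static synchronized methods don't use the same lock. The following sketch, built around a hypothetical LockOwnerDemo class, uses the standard Thread.holdsLock() method to show which intrinsic lock each kind of method acquires:

```java
// Hypothetical demo class, not part of the original example
class LockOwnerDemo {

    // An instance synchronized method holds the intrinsic lock of 'this'
    synchronized boolean instanceMethodHoldsThis() {
        return Thread.holdsLock(this);
    }

    // A static synchronized method holds the lock of the Class object instead
    static synchronized boolean staticMethodHoldsClass() {
        return Thread.holdsLock(LockOwnerDemo.class);
    }

    // Inside an instance method, the class-level lock is not held
    synchronized boolean instanceMethodHoldsClass() {
        return Thread.holdsLock(LockOwnerDemo.class);
    }

    public static void main(String[] args) {
        LockOwnerDemo demo = new LockOwnerDemo();
        System.out.println(demo.instanceMethodHoldsThis());  // prints true
        System.out.println(staticMethodHoldsClass());        // prints true
        System.out.println(demo.instanceMethodHoldsClass()); // prints false
    }
}
```

As a consequence, a static synchronized method and an instance synchronized method of the same class can run at the same time, so they shouldn't guard the same shared data.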

9. Synchronized Statements

Sometimes, synchronizing an entire method might be overkill if we just need to make a segment of the method thread-safe.

To exemplify this use case, let's refactor the incrementCounter() method:

public void incrementCounter() {
    // additional unsynced operations
    synchronized(this) {
        counter += 1; 
    }
}

The example is trivial, but it shows how to create a synchronized statement. Assuming that the method now performs a few additional operations, which don't require synchronization, we only synchronized the relevant state-modifying section by wrapping it within a synchronized block.

Unlike synchronized methods, synchronized statements must specify the object that provides the intrinsic lock, usually the this reference.

Synchronization is expensive, so with this option, we are able to only synchronize the relevant parts of a method.

9.1. Other Objects as a Lock

We can slightly improve the thread-safe implementation of the Counter class by exploiting another object as a monitor lock, instead of this.

Not only does this provide coordinated access to a shared resource in a multithreaded environment, but also it uses an external entity to enforce exclusive access to the resource:

public class ObjectLockCounter {

    private int counter = 0;
    private final Object lock = new Object();
    
    public void incrementCounter() {
        synchronized(lock) {
            counter += 1;
        }
    }
    
    // standard getter
}

We use a plain Object instance to enforce mutual exclusion. This implementation is slightly better, as it promotes security at the lock level.

When using this for intrinsic locking, an attacker could cause a deadlock by acquiring the intrinsic lock and triggering a denial of service (DoS) condition.

On the contrary, when using other objects, that private entity is not accessible from the outside. This makes it harder for an attacker to acquire the lock and cause a deadlock.

9.2. Caveats

Even though we can use any Java object as an intrinsic lock, we should avoid using Strings for locking purposes:

public class Class1 {
    private static final String LOCK  = "Lock";

    // uses the LOCK as the intrinsic lock
}

public class Class2 {
    private static final String LOCK  = "Lock";

    // uses the LOCK as the intrinsic lock
}

At first glance, it seems that these two classes are using two different objects as their lock. However, because of string interning, these two “Lock” values may actually refer to the same object in the string pool. That is, Class1 and Class2 are sharing the same lock!

This, in turn, may cause some unexpected behaviors in concurrent contexts.

In addition to Strings, we should avoid using any cacheable or reusable objects as intrinsic locks. For example, the Integer.valueOf() method caches small numbers. Therefore, calling Integer.valueOf(1) returns the same object even in different classes.
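
We can check this sharing directly with reference comparisons. In the sketch below, LockA, LockB, and SharedLockDemo are hypothetical names chosen for the illustration:

```java
// Two unrelated hypothetical classes that each declare "their own" locks
class LockA {
    static final String LOCK = "Lock";
    static final Integer ID_LOCK = Integer.valueOf(1);
    static final Object SAFE_LOCK = new Object();
}

class LockB {
    static final String LOCK = "Lock";
    static final Integer ID_LOCK = Integer.valueOf(1);
    static final Object SAFE_LOCK = new Object();
}

class SharedLockDemo {

    // String literals are interned, so both fields point to one object
    static boolean sameStringLock() {
        return LockA.LOCK == LockB.LOCK;
    }

    // Integer.valueOf() caches values from -128 to 127
    static boolean sameIntegerLock() {
        return LockA.ID_LOCK == LockB.ID_LOCK;
    }

    // Objects created with new are always distinct
    static boolean sameObjectLock() {
        return LockA.SAFE_LOCK == LockB.SAFE_LOCK;
    }

    public static void main(String[] args) {
        System.out.println(sameStringLock());  // prints true
        System.out.println(sameIntegerLock()); // prints true
        System.out.println(sameObjectLock());  // prints false
    }
}
```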

10. Volatile Fields

Synchronized methods and blocks are handy for addressing variable visibility problems among threads. Even so, the values of regular class fields might be cached by the CPU. Hence, subsequent updates to a particular field, even if the writes are synchronized, might not be visible to threads that read the field without acquiring the same lock.

To prevent this situation, we can use volatile class fields:

public class Counter {

    private volatile int counter;

    // standard constructors / getter

}

With the volatile keyword, we instruct the JVM and the compiler to store the counter variable in the main memory. That way, we make sure that every time the JVM reads the value of the counter variable, it will actually read it from the main memory, instead of from the CPU cache. Likewise, every time the JVM writes to the counter variable, the value will be written to the main memory.

Moreover, the use of a volatile variable ensures that all variables that are visible to a given thread will be read from the main memory as well.

Let's consider the following example:

public class User {

    private String name;
    private volatile int age;

    // standard constructors / getters
    
}

In this case, each time the JVM writes the age volatile variable to the main memory, it will write the non-volatile name variable to the main memory as well. This assures that the latest values of both variables are stored in the main memory, so subsequent updates to the variables will automatically be visible to other threads.

Similarly, if a thread reads the value of a volatile variable, all the variables visible to the thread will be read from the main memory too.

This extended guarantee that volatile variables provide is known as the full volatile visibility guarantee.
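
A typical use of this visibility guarantee is a stop flag. In the sketch below, the Worker and VolatileFlagDemo classes and the chosen timeouts are assumptions for illustration. Note that volatile only provides visibility: it doesn't make a compound operation such as counter += 1 atomic.

```java
// Hypothetical worker that spins until another thread asks it to stop
class Worker implements Runnable {

    // volatile makes the write in stop() visible to the loop in run()
    private volatile boolean running = true;
    private long iterations = 0; // touched only by the worker thread while it runs

    @Override
    public void run() {
        while (running) {
            iterations++;
        }
    }

    void stop() {
        running = false;
    }

    long getIterations() {
        return iterations;
    }
}

class VolatileFlagDemo {

    // Starts a worker, signals it to stop, and reports whether it terminated in time
    static boolean stopsAfterSignal() {
        Worker worker = new Worker();
        Thread thread = new Thread(worker);
        thread.start();
        try {
            Thread.sleep(50);   // let the worker spin for a moment
            worker.stop();      // volatile write
            thread.join(2000);  // wait up to two seconds for the loop to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(stopsAfterSignal());
    }
}
```

Without the volatile modifier, the worker would be allowed to keep reading a stale value of running and might never leave the loop.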

11. Reentrant Locks

Java provides an improved set of Lock implementations, whose behavior is slightly more sophisticated than the intrinsic locks discussed above.

With intrinsic locks, the lock acquisition model is rather rigid: one thread acquires the lock, then executes a method or code block, and finally releases the lock, so other threads can acquire it and access the method.

There's no underlying mechanism that checks the queued threads and gives priority access to the longest waiting threads.

ReentrantLock instances allow us to do exactly that, hence preventing queued threads from suffering some types of resource starvation:

public class ReentrantLockCounter {

    private int counter;
    private final ReentrantLock reLock = new ReentrantLock(true);
    
    public void incrementCounter() {
        reLock.lock();
        try {
            counter += 1;
        } finally {
            reLock.unlock();
        }
    }
    
    // standard constructors / getter
    
}

The ReentrantLock constructor takes an optional fairness boolean parameter. When it's set to true and multiple threads are trying to acquire the lock, the lock favors granting access to the longest-waiting thread.
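
The name of the class also hints at another property: a thread that already holds the lock can acquire it again without blocking. Here's a minimal sketch of that behavior, using a hypothetical ReentrancyDemo class together with the standard getHoldCount(), isFair(), and isLocked() methods:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, not part of the original example
class ReentrancyDemo {

    private final ReentrantLock lock = new ReentrantLock(true); // fair lock

    // Acquires the lock and then calls a method that acquires it again
    int outer() {
        lock.lock();
        try {
            return inner();
        } finally {
            lock.unlock();
        }
    }

    // The same thread re-acquires the lock without blocking; the hold count goes up
    private int inner() {
        lock.lock();
        try {
            return lock.getHoldCount();
        } finally {
            lock.unlock();
        }
    }

    boolean isFair() {
        return lock.isFair();
    }

    boolean isLocked() {
        return lock.isLocked();
    }

    public static void main(String[] args) {
        ReentrancyDemo demo = new ReentrancyDemo();
        System.out.println(demo.outer());    // prints 2
        System.out.println(demo.isFair());   // prints true
        System.out.println(demo.isLocked()); // prints false, every lock() was matched by unlock()
    }
}
```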

12. Read/Write Locks

Another powerful mechanism that we can use for achieving thread-safety is the use of ReadWriteLock implementations.

A ReadWriteLock actually uses a pair of associated locks, one for read-only operations and the other for write operations.

As a result, it's possible to have many threads reading a resource, as long as there's no thread writing to it. Moreover, the thread writing to the resource will prevent other threads from reading it.

We can use a ReadWriteLock as follows:

public class ReentrantReadWriteLockCounter {

    private int counter;
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    private final Lock readLock = rwLock.readLock();
    private final Lock writeLock = rwLock.writeLock();
    
    public void incrementCounter() {
        writeLock.lock();
        try {
            counter += 1;
        } finally {
            writeLock.unlock();
        }
    }
    
    public int getCounter() {
        readLock.lock();
        try {
            return counter;
        } finally {
            readLock.unlock();
        }
    }

   // standard constructors
   
}
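
To observe these rules, we can probe the lock from a second thread with tryLock(), which returns immediately instead of blocking. The ReadWriteLockDemo class and its helper methods below are assumptions made for this sketch:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not part of the original example
class ReadWriteLockDemo {

    // From another thread, tries the read lock and then the write lock.
    // Returns { readAcquired, writeAcquired }
    static boolean[] probeFromOtherThread(ReentrantReadWriteLock rwLock) {
        boolean[] result = new boolean[2];
        Thread other = new Thread(() -> {
            result[0] = rwLock.readLock().tryLock();
            if (result[0]) {
                rwLock.readLock().unlock();
            }
            result[1] = rwLock.writeLock().tryLock();
            if (result[1]) {
                rwLock.writeLock().unlock();
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    // Probes while the calling thread holds the read lock
    static boolean[] whileReading() {
        ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
        rwLock.readLock().lock();
        try {
            return probeFromOtherThread(rwLock);
        } finally {
            rwLock.readLock().unlock();
        }
    }

    // Probes while the calling thread holds the write lock
    static boolean[] whileWriting() {
        ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
        rwLock.writeLock().lock();
        try {
            return probeFromOtherThread(rwLock);
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        boolean[] reading = whileReading();
        System.out.println(reading[0] + " " + reading[1]); // prints true false
        boolean[] writing = whileWriting();
        System.out.println(writing[0] + " " + writing[1]); // prints false false
    }
}
```

In other words, a second reader gets in while we're reading, but a writer has to wait, and while we're writing nobody else gets in at all.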

13. Conclusion

In this article, we learned what thread-safety is in Java, and took an in-depth look at different approaches for achieving it.

As usual, all the code samples shown in this article are available over on GitHub.
