1. Overview

When managing key-value pairs in a Java application, we often find ourselves considering two main options: Hashtable and ConcurrentHashMap.

While both collections offer thread safety, their underlying architectures and capabilities differ significantly. Whether we’re maintaining a legacy system or building modern, microservices-based cloud applications, understanding these differences is critical for making the right choice.

In this tutorial, we’ll compare Hashtable and ConcurrentHashMap, looking at their synchronization mechanisms, performance, iterator behavior, and memory usage to help us make an informed decision.

2. Hashtable

Hashtable is one of the oldest collection classes in Java and has been present since JDK 1.0. It provides key-value storage and retrieval APIs:

Hashtable<String, String> hashtable = new Hashtable<>();
hashtable.put("Key1", "1");
hashtable.put("Key2", "2");
hashtable.putIfAbsent("Key3", "3");
String value = hashtable.get("Key2");

The primary selling point of Hashtable is thread safety, which is achieved through method-level synchronization.

Methods like put(), putIfAbsent(), get(), and remove() are synchronized. Only one thread can execute any of these methods at a given time on a Hashtable instance, ensuring data consistency.
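As a minimal sketch of what method-level synchronization gives us, the example below has several threads write to one Hashtable. The class name HashtableLockingSketch, the helper runInThreads(), and the thread and operation counts are illustrative assumptions, not part of the original examples. It also shows that, because the synchronized methods lock on the Hashtable instance itself, we can hold that same lock around a get() followed by a put() to make the pair atomic:

```java
import java.util.Hashtable;

final class HashtableLockingSketch {

    private static final int THREADS = 4;
    private static final int OPS_PER_THREAD = 1_000;

    // Runs the same task on several threads and waits for all of them to finish.
    static void runInThreads(Runnable task) {
        Thread[] workers = new Thread[THREADS];
        for (int t = 0; t < THREADS; t++) {
            workers[t] = new Thread(task);
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
    }

    // Each put() acquires the table's monitor, so no insertion is lost.
    static int concurrentPuts() {
        Hashtable<String, Integer> table = new Hashtable<>();
        runInThreads(() -> {
            // Default thread names are unique, which keeps the keys distinct.
            String prefix = Thread.currentThread().getName();
            for (int i = 0; i < OPS_PER_THREAD; i++) {
                table.put(prefix + "-" + i, i);
            }
        });
        return table.size();
    }

    // get() followed by put() is a compound action. Holding the table's own
    // monitor around both calls makes the pair atomic for other threads.
    static int lockedIncrements() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("counter", 0);
        runInThreads(() -> {
            for (int i = 0; i < OPS_PER_THREAD; i++) {
                synchronized (table) {
                    table.put("counter", table.get("counter") + 1);
                }
            }
        });
        return table.get("counter");
    }

    public static void main(String[] args) {
        System.out.println(concurrentPuts());
        System.out.println(lockedIncrements());
    }
}
```

With four threads performing 1,000 operations each, both methods return 4000. Without the synchronized block in lockedIncrements(), two threads could read the same counter value and one update would be lost, even though each individual call is thread-safe.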

3. ConcurrentHashMap

ConcurrentHashMap is a more modern alternative, introduced in Java 5 as part of the java.util.concurrent package.

Both Hashtable and ConcurrentHashMap implement the Map interface, which accounts for the similarity in method signatures:

ConcurrentHashMap<String, String> concurrentHashMap = new ConcurrentHashMap<>();
concurrentHashMap.put("Key1", "1");
concurrentHashMap.put("Key2", "2");
concurrentHashMap.putIfAbsent("Key3", "3");
String value = concurrentHashMap.get("Key2");

4. Differences

In this section, we’ll examine key aspects that set Hashtable and ConcurrentHashMap apart, including concurrency, performance, and memory usage.

4.1. Concurrency

As we discussed earlier, Hashtable achieves thread safety through method-level synchronization.

ConcurrentHashMap, on the other hand, provides thread safety with a higher level of concurrency. Reads generally proceed without locking, and a write locks only the part of the table it touches rather than the entire data structure, so multiple threads can read and write simultaneously. This is especially useful in applications that have more read operations than write operations.
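To make this concrete, here is a small sketch in which several threads update shared counters in a ConcurrentHashMap using merge(), which the class performs atomically per key. The class name ConcurrentCounterSketch, the method countParity(), and the "even"/"odd" keys are assumptions chosen for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;

final class ConcurrentCounterSketch {

    // Every thread counts how many even and odd numbers it sees in [0, perThread).
    static ConcurrentHashMap<String, Integer> countParity(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // merge() applies the update atomically for the given key,
                    // so no external lock is needed around the read-modify-write.
                    counts.merge(i % 2 == 0 ? "even" : "odd", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countParity(4, 1_000));
    }
}
```

Calling countParity(4, 1_000) yields 2000 for each key. The per-key atomic methods, such as merge(), compute(), and putIfAbsent(), are usually the idiomatic way to express compound updates on a ConcurrentHashMap.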

4.2. Performance

While both Hashtable and ConcurrentHashMap guarantee thread safety, they differ in performance due to their underlying synchronization mechanisms.

Hashtable locks the entire table for every operation, including get(), so a thread that’s reading or writing blocks all other readers and writers. This could be a bottleneck in a high-concurrency environment.

ConcurrentHashMap, however, allows concurrent reads and limited concurrent writes, making it more scalable and often faster in practice.

Differences in performance numbers may not be noticeable for small datasets. However, ConcurrentHashMap often shows its strength with larger datasets and higher levels of concurrency.

To substantiate this, let’s run benchmark tests using JMH (the Java Microbenchmark Harness). We configure the benchmark to use 10 threads to simulate concurrent activity and to perform three warm-up iterations followed by five measurement iterations. It reports the average time taken by each benchmark method:

@Benchmark
@Group("hashtable")
public void benchmarkHashtablePut() {
    for (int i = 0; i < 10000; i++) {
        hashTable.put(String.valueOf(i), i);
    }
}

@Benchmark
@Group("hashtable")
public void benchmarkHashtableGet(Blackhole blackhole) {
    for (int i = 0; i < 10000; i++) {
        Integer value = hashTable.get(String.valueOf(i));
        blackhole.consume(value);
    }
}

@Benchmark
@Group("concurrentHashMap")
public void benchmarkConcurrentHashMapPut() {
    for (int i = 0; i < 10000; i++) {
        concurrentHashMap.put(String.valueOf(i), i);
    }
}

@Benchmark
@Group("concurrentHashMap")
public void benchmarkConcurrentHashMapGet(Blackhole blackhole) {
    for (int i = 0; i < 10000; i++) {
        Integer value = concurrentHashMap.get(String.valueOf(i));
        blackhole.consume(value);
    }
}

Here are the test results:

Benchmark                                                        Mode  Cnt   Score   Error
BenchMarkRunner.concurrentHashMap                                avgt    5   1.788 ± 0.406
BenchMarkRunner.concurrentHashMap:benchmarkConcurrentHashMapGet  avgt    5   1.157 ± 0.185
BenchMarkRunner.concurrentHashMap:benchmarkConcurrentHashMapPut  avgt    5   2.419 ± 0.629
BenchMarkRunner.hashtable                                        avgt    5  10.744 ± 0.873
BenchMarkRunner.hashtable:benchmarkHashtableGet                  avgt    5  10.810 ± 1.208
BenchMarkRunner.hashtable:benchmarkHashtablePut                  avgt    5  10.677 ± 0.541

The results show the average execution time of each benchmark method for both Hashtable and ConcurrentHashMap.

Lower scores indicate better performance. In this run, ConcurrentHashMap outperforms Hashtable for both get() and put() operations.

4.3. Hashtable Iterators

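Hashtable iterators are “fail-fast”, which means that if the structure of the Hashtable is modified after an iterator has been created, other than through the iterator’s own remove() method, the iterator will throw a ConcurrentModificationException. This mechanism helps prevent unpredictable behavior by failing quickly when concurrent modifications are detected, although the check is made on a best-effort basis and shouldn’t be relied on for correctness. The legacy Enumeration objects returned by keys() and elements() aren’t fail-fast.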

In the example below, we have a Hashtable containing three key-value pairs, and we initiate two threads:

  • iteratorThread: iterates through the Hashtable keys, pausing for 100 milliseconds after each one
  • modifierThread: waits for 50 milliseconds and then adds a new key-value pair to the Hashtable

When modifierThread adds a new key-value pair to the Hashtable, iteratorThread throws a ConcurrentModificationException, indicating that the Hashtable structure was modified while the iteration was in progress:

Hashtable<String, Integer> hashtable = new Hashtable<>();
hashtable.put("Key1", 1);
hashtable.put("Key2", 2);
hashtable.put("Key3", 3);
AtomicBoolean exceptionCaught = new AtomicBoolean(false);

Thread iteratorThread = new Thread(() -> {
    Iterator<String> it = hashtable.keySet().iterator();
    try {
        while (it.hasNext()) {
            it.next();
            Thread.sleep(100);
        }
    } catch (ConcurrentModificationException e) {
        exceptionCaught.set(true);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
});

Thread modifierThread = new Thread(() -> {
    try {
        Thread.sleep(50);
        hashtable.put("Key4", 4);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
});

iteratorThread.start();
modifierThread.start();

iteratorThread.join();
modifierThread.join();

assertTrue(exceptionCaught.get());
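The fail-fast check doesn’t actually require a second thread. As a single-threaded sketch (the class name HashtableTraversalSketch and its helper methods are hypothetical), we can modify the Hashtable between two calls on the same traversal and compare an Iterator from keySet() with the legacy Enumeration from keys():

```java
import java.util.ConcurrentModificationException;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Iterator;

final class HashtableTraversalSketch {

    static Hashtable<String, Integer> sample() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("Key1", 1);
        table.put("Key2", 2);
        table.put("Key3", 3);
        return table;
    }

    // Returns true if the Iterator detects the structural modification.
    static boolean iteratorFailsFast() {
        Hashtable<String, Integer> table = sample();
        Iterator<String> it = table.keySet().iterator();
        it.next();
        table.put("Key4", 4); // structural change after the iterator was created
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Returns true if the Enumeration detects the structural modification.
    static boolean enumerationFailsFast() {
        Hashtable<String, Integer> table = sample();
        Enumeration<String> keys = table.keys();
        keys.nextElement();
        table.put("Key4", 4); // same structural change as above
        try {
            while (keys.hasMoreElements()) {
                keys.nextElement();
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Iterator fails fast: " + iteratorFailsFast());
        System.out.println("Enumeration fails fast: " + enumerationFailsFast());
    }
}
```

Here, iteratorFailsFast() returns true and enumerationFailsFast() returns false. Whether the Enumeration happens to visit the newly added key is unspecified, so the sketch makes no assumption about it.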

4.4. ConcurrentHashMap Iterators

In contrast to Hashtable, which uses “fail-fast” iterators, ConcurrentHashMap employs “weakly consistent” iterators.

These iterators can withstand concurrent modifications to the original map, reflecting the state of the map at some point at or since the creation of the iterator. They traverse each element that existed when the iterator was created exactly once and might also reflect later changes, but aren’t guaranteed to do so. Therefore, we can modify ConcurrentHashMap in one thread while iterating over it in another without getting a ConcurrentModificationException.

The example below demonstrates the weakly consistent nature of iterators in ConcurrentHashMap:

  • iteratorThread: iterates through the ConcurrentHashMap keys, pausing for 100 milliseconds after each one
  • modifierThread: waits for 50 milliseconds and then adds a new key-value pair to the ConcurrentHashMap

Unlike Hashtable “fail-fast” iterators, the weakly consistent iterator here doesn’t throw a ConcurrentModificationException. The iterator in iteratorThread continues without any issues, showcasing how ConcurrentHashMap is designed for high-concurrency scenarios:

ConcurrentHashMap<String, Integer> concurrentHashMap = new ConcurrentHashMap<>();
concurrentHashMap.put("Key1", 1);
concurrentHashMap.put("Key2", 2);
concurrentHashMap.put("Key3", 3);
AtomicBoolean exceptionCaught = new AtomicBoolean(false);

Thread iteratorThread = new Thread(() -> {
    Iterator<String> it = concurrentHashMap.keySet().iterator();
    try {
        while (it.hasNext()) {
            it.next();
            Thread.sleep(100);
        }
    } catch (ConcurrentModificationException e) {
        exceptionCaught.set(true);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
});

Thread modifierThread = new Thread(() -> {
    try {
        Thread.sleep(50);
        concurrentHashMap.put("Key4", 4);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
});

iteratorThread.start();
modifierThread.start();

iteratorThread.join();
modifierThread.join();

assertFalse(exceptionCaught.get());
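The same behavior can be observed in a single thread. In the sketch below (the class name WeaklyConsistentSketch and the method visitWhileModifying() are illustrative assumptions), we add a key to the map from inside the loop that iterates over it:

```java
import java.util.concurrent.ConcurrentHashMap;

final class WeaklyConsistentSketch {

    // Iterates over the key set while adding a key and returns how many keys were visited.
    static int visitWhileModifying(ConcurrentHashMap<String, Integer> map) {
        int visited = 0;
        for (String key : map.keySet()) {
            // Modifying the map mid-iteration is allowed and never throws
            // ConcurrentModificationException.
            map.put("Key4", 4);
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("Key1", 1);
        map.put("Key2", 2);
        map.put("Key3", 3);
        System.out.println("Visited: " + visitWhileModifying(map));
        System.out.println("Final size: " + map.size());
    }
}
```

The three original keys are each visited once, and the map ends up with four entries. Whether the iterator also visits Key4 depends on where that key lands in the table relative to the iterator’s position, so the visited count can be either 3 or 4.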

4.5. Memory

Hashtable uses a simple data structure, essentially an array of linked lists. Each bucket in this array holds a chain of zero or more entries, so the overhead is limited to the array itself and the entry nodes. Apart from a load factor and a resize threshold, it keeps no additional internal data structures for managing concurrency or other advanced functionality. Thus, Hashtable tends to consume less memory overall.

ConcurrentHashMap is more complex. Before Java 8, it consisted of an array of segments, each of which was essentially a separate Hashtable with its own lock. This allowed operations on different segments to proceed concurrently, but also consumed additional memory for the segment objects.

Each segment maintained extra information, such as its count, threshold, and load factor, which increased the memory footprint. The number of segments was fixed at construction by the concurrency level, while each segment resized its own table as entries were added. Since Java 8, the implementation has dropped segments in favor of a single node array with per-bucket locking, and it converts heavily populated buckets into balanced trees. This extra bookkeeping can still make it somewhat heavier than Hashtable.
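Both classes let us influence their initial footprint through constructor arguments. The sketch below uses the standard constructors; the class name SizingSketch and the specific values (64, 0.75f, 8) are arbitrary choices for illustration. In Java 8 and later, the concurrencyLevel argument of ConcurrentHashMap is treated only as a sizing hint:

```java
import java.util.Hashtable;
import java.util.concurrent.ConcurrentHashMap;

final class SizingSketch {

    // Fills both maps with the same entries and returns their sizes.
    static int[] filledSizes(int entries) {
        // Hashtable(initialCapacity, loadFactor)
        Hashtable<Integer, Integer> table = new Hashtable<>(64, 0.75f);
        // ConcurrentHashMap(initialCapacity, loadFactor, concurrencyLevel)
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>(64, 0.75f, 8);
        for (int i = 0; i < entries; i++) {
            table.put(i, i);
            map.put(i, i);
        }
        // Both grow automatically once the initial capacity is exceeded.
        return new int[] { table.size(), map.size() };
    }

    public static void main(String[] args) {
        int[] sizes = filledSizes(100);
        System.out.println(sizes[0] + " " + sizes[1]);
    }
}
```

Pre-sizing a map for the expected number of entries avoids intermediate resizes, at the cost of allocating a larger array up front.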

5. Conclusion

In this article, we learned the differences between Hashtable and ConcurrentHashMap.

Both Hashtable and ConcurrentHashMap serve the purpose of storing key-value pairs in a thread-safe manner. However, we saw that ConcurrentHashMap usually has the upper hand in terms of performance and scalability due to its advanced synchronization features.

Hashtable is still useful and might be preferable in legacy systems or scenarios where method-level synchronization is explicitly required. Understanding the specific needs of our application can help us make a more informed decision between these two.

As always, the source for the examples is available over on GitHub.