1. Overview
In this tutorial, we’ll explore one of the Guava collections – Multiset. Like a java.util.Set, it allows for efficient storage and retrieval of items without a guaranteed order.
However, unlike a Set, it allows for multiple occurrences of the same element by tracking the count of each unique element it contains.
2. Maven Dependency
First, let’s add the guava dependency:
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>32.1.3-jre</version>
</dependency>
3. Using Multiset
Let’s consider a bookstore that holds multiple copies of different books. We might want to perform operations like adding a copy, getting the number of copies, and removing one copy when it’s sold. As a Set doesn’t allow multiple occurrences of the same element, it can’t handle this requirement.
Let’s get started by adding copies of a book title. The Multiset should report that the title exists and provide us with the correct count:
Multiset<String> bookStore = HashMultiset.create();
bookStore.add("Potter");
bookStore.add("Potter");
bookStore.add("Potter");
assertThat(bookStore.contains("Potter")).isTrue();
assertThat(bookStore.count("Potter")).isEqualTo(3);
Now let’s remove one copy. We expect the count to be updated accordingly:
bookStore.remove("Potter");
assertThat(bookStore.contains("Potter")).isTrue();
assertThat(bookStore.count("Potter")).isEqualTo(2);
We can also set the count directly instead of calling add repeatedly:
bookStore.setCount("Potter", 50);
assertThat(bookStore.count("Potter")).isEqualTo(50);
Multiset validates the count value. If we pass a negative count, an IllegalArgumentException is thrown:
assertThatThrownBy(() -> bookStore.setCount("Potter", -1))
.isInstanceOf(IllegalArgumentException.class);
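The operations above work on one copy at a time. As a minimal sketch, the example below uses the overloads of add and remove that take a number of occurrences, and then walks the entries to list each title with its count. The class name BookStoreInventory, the method sampleInventory, and the titles and quantities are assumptions made up for illustration:

```java
import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;

// Hypothetical demo class; the name and the sample data are illustrative only.
final class BookStoreInventory {

    // Builds a small sample inventory using the bulk add/remove overloads.
    static Multiset<String> sampleInventory() {
        Multiset<String> bookStore = HashMultiset.create();
        bookStore.add("Potter", 3);    // receive three copies in one call
        bookStore.add("Hobbit", 2);    // receive two copies of another title
        bookStore.remove("Potter", 2); // sell two copies in one call
        return bookStore;
    }

    public static void main(String[] args) {
        Multiset<String> bookStore = sampleInventory();

        // entrySet() exposes each distinct title once, together with its count.
        for (Multiset.Entry<String> entry : bookStore.entrySet()) {
            System.out.println(entry.getElement() + ": " + entry.getCount());
        }

        // size() counts every copy, while elementSet() holds only the distinct titles.
        System.out.println("copies=" + bookStore.size()
          + ", titles=" + bookStore.elementSet().size());
    }
}
```

Since HashMultiset doesn’t guarantee an iteration order, the titles may be printed in any order.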
4. Comparison with Map
Without access to Multiset, we could achieve all of the operations above by implementing our own logic using java.util.Map:
Map<String, Integer> bookStore = new HashMap<>();
// adding 3 copies
bookStore.put("Potter", 3);
assertThat(bookStore.containsKey("Potter")).isTrue();
assertThat(bookStore.get("Potter")).isEqualTo(3);
// removing 1 copy
bookStore.put("Potter", 2);
assertThat(bookStore.get("Potter")).isEqualTo(2);
When we want to add or remove a copy using a Map, we first have to read the current count and then write back the adjusted value. We need to repeat this logic at every call site or wrap it in our own helper. Our code also has to validate the value itself. If we’re not careful, we can easily store null or a negative number, even though neither is a valid count:
bookStore.put("Potter", null);
assertThat(bookStore.containsKey("Potter")).isTrue();
bookStore.put("Potter", -1);
assertThat(bookStore.containsKey("Potter")).isTrue();
As we can see, Multiset takes care of both the counting and the validation for us, which makes it a lot more convenient than a Map.
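To make the amount of hand-written bookkeeping concrete, here is a minimal sketch of the kind of helper we’d have to maintain ourselves on top of a plain Map. The class MapBookStore and its method names are hypothetical and exist only for this illustration; it uses nothing beyond the JDK:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Guava or the JDK: a hand-rolled counter on top of a Map.
final class MapBookStore {

    private final Map<String, Integer> copies = new HashMap<>();

    // Adds one copy, starting from 1 if the title is not present yet.
    void addCopy(String title) {
        copies.merge(title, 1, Integer::sum);
    }

    // Removes one copy; returns false if there was nothing to remove.
    boolean removeCopy(String title) {
        Integer current = copies.get(title);
        if (current == null) {
            return false;
        }
        if (current == 1) {
            copies.remove(title); // drop the key so we never store a zero count
        } else {
            copies.put(title, current - 1);
        }
        return true;
    }

    // Returns 0 for unknown titles instead of null.
    int count(String title) {
        return copies.getOrDefault(title, 0);
    }

    // Rejects negative values, which a raw Map.put would happily accept.
    void setCount(String title, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count cannot be negative: " + count);
        }
        if (count == 0) {
            copies.remove(title);
        } else {
            copies.put(title, count);
        }
    }
}
```

Every method here duplicates behavior that Multiset already provides through add, remove, count and setCount.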
5. Concurrency
When we want to use Multiset in a concurrent environment, we can use ConcurrentHashMultiset, which is a thread-safe Multiset implementation.
We should note that thread safety alone doesn’t make the outcome predictable, though. Concurrent calls to the add or remove methods work well in a multi-threaded environment, but what if several threads call the setCount method?
With setCount, the final result depends on the order in which the threads execute, which we generally can’t predict. The add and remove methods are incremental, and ConcurrentHashMultiset applies each of them atomically, so no update is lost. Setting the count directly overwrites whatever value was there before, so it can cause unexpected results when used concurrently.
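As a rough sketch of the incremental case, the example below lets several threads add copies of the same title to a ConcurrentHashMultiset and then reads the final count. The class ConcurrentAddDemo, the method addConcurrently, and the ten-second timeout are assumptions chosen for this illustration:

```java
import com.google.common.collect.ConcurrentHashMultiset;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; names and the timeout value are illustrative only.
final class ConcurrentAddDemo {

    // Starts `threads` workers that each add `copiesPerThread` copies, then returns the final count.
    static int addConcurrently(int threads, int copiesPerThread) {
        ConcurrentHashMultiset<String> bookStore = ConcurrentHashMultiset.create();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < copiesPerThread; i++) {
                    bookStore.add("Potter"); // each add is an atomic increment
                }
            });
        }

        pool.shutdown();
        try {
            // Wait for all workers before reading the count.
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return bookStore.count("Potter");
    }
}
```

Because every add is applied atomically, the returned count should equal threads multiplied by copiesPerThread regardless of how the threads interleave. If the workers called setCount with different values instead, the result would be whichever write happened to come last.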
However, there’s another flavor of the setCount method that updates the count only if the current value matches the expected count we pass in. It returns true if the update was applied and false otherwise, which makes it a compare-and-set operation, a form of optimistic locking:
Multiset<String> bookStore = ConcurrentHashMultiset.create();
// updates the count to 2 because the current count is 0
assertThat(bookStore.setCount("Potter", 0, 2)).isTrue();
// would update the count to 5 only if the current count were 50; it is 2, so nothing changes
assertThat(bookStore.setCount("Potter", 50, 5)).isFalse();
If we want to use the setCount method in concurrent code, we should use the above version to guarantee consistency. A multi-threaded client could perform a retry if changing the count failed.
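The retry mentioned above could look roughly like the following sketch. The class CountUpdater and the method updateCount are hypothetical helpers written for this illustration and aren’t part of Guava; only count and the three-argument setCount come from the Multiset interface:

```java
import com.google.common.collect.ConcurrentHashMultiset;
import com.google.common.collect.Multiset;
import java.util.function.IntUnaryOperator;

// Hypothetical helper, not part of Guava: retries a conditional setCount until it succeeds.
final class CountUpdater {

    // Computes a new count from the current one and applies it only if no other
    // thread changed the count in the meantime; otherwise re-reads and tries again.
    static int updateCount(Multiset<String> bookStore, String title, IntUnaryOperator update) {
        while (true) {
            int current = bookStore.count(title);
            int next = update.applyAsInt(current);
            if (bookStore.setCount(title, current, next)) {
                return next; // our expected value still matched, so the update was applied
            }
            // The count changed between count() and setCount(); loop and retry.
        }
    }

    public static void main(String[] args) {
        Multiset<String> bookStore = ConcurrentHashMultiset.create();
        bookStore.add("Potter", 50);

        // Halve the stock, for example after moving half of it to another shop.
        int remaining = updateCount(bookStore, "Potter", count -> count / 2);
        System.out.println("remaining copies: " + remaining);
    }
}
```

The loop has no upper bound on attempts, which keeps the sketch short. Under heavy contention, we might prefer to cap the number of retries or back off between them.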
6. Conclusion
In this short tutorial, we discussed when and how to use a Multiset, compared it with a standard Map and looked at how best to use it in a concurrent application.
As always, the source code for the examples can be found over on GitHub.