1. Introduction
In this tutorial, we’re going to look at how we can get the index of an item in a Java Set. Sets in Java don’t allow duplicate elements, and some important implementations of the Set interface include the HashSet, TreeSet and LinkedHashSet.
2. Ordered, Unordered and Sorted Collections in Java
Before we look at the problem statement, let’s take a look at the difference between some of the types of collections we have in Java:
- Ordered Collections
- Unordered Collections
- Sorted Collections
Ordered collections maintain the insertion order of their elements. The elements are stored in the order they were inserted and can be accessed by their position. These collections typically provide a get(index) method to retrieve the element at a specific index. Examples of ordered collections include classes implementing the List interface, such as ArrayList and LinkedList.
Unordered collections in Java, on the other hand, do not guarantee any specific order of traversal. The elements are stored in an order that depends on the underlying data structure. Elements in an unordered collection are typically accessed by their values, not indices. HashSet and HashMap are examples of unordered collections.
Sorted collections are a special type of collection where traversing the collection yields elements in their natural order or in accordance with a specified Comparator. TreeSet and TreeMap are examples of sorted collections.
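To make the three categories concrete, here is a minimal sketch that loads the same five numbers into an ArrayList, a HashSet, a LinkedHashSet and a TreeSet and prints the traversal order of each. The class name CollectionOrderDemo and the helper traversalOrder() are hypothetical names chosen for this illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class, not part of the tutorial's code base.
class CollectionOrderDemo {

    // Copies the elements of any Iterable into a List, in traversal order.
    static List<Integer> traversalOrder(Iterable<Integer> source) {
        List<Integer> result = new ArrayList<>();
        for (Integer value : source) {
            result.add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(10, 2, 9, 15, 0);

        // Ordered: a List keeps insertion order and supports get(index).
        List<Integer> list = new ArrayList<>(input);
        System.out.println("ArrayList:     " + traversalOrder(list) + ", get(1) = " + list.get(1));

        // Unordered: a HashSet makes no promise about its traversal order.
        Set<Integer> hashSet = new HashSet<>(input);
        System.out.println("HashSet:       " + traversalOrder(hashSet));

        // A LinkedHashSet is a Set that does keep insertion order.
        Set<Integer> linkedHashSet = new LinkedHashSet<>(input);
        System.out.println("LinkedHashSet: " + traversalOrder(linkedHashSet));

        // Sorted: a TreeSet traverses in natural (ascending) order.
        Set<Integer> treeSet = new TreeSet<>(input);
        System.out.println("TreeSet:       " + traversalOrder(treeSet));
    }
}
```

The HashSet line is the one to watch: its output depends on the hash codes of the elements and on the table size, so it should not be relied upon.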
3. Why Sets Do Not Provide an indexOf()
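The Set interface itself makes no guarantee about the order of its elements and has no notion of position. A Set has the following important characteristics:
- guarantees the uniqueness of its elements
- can confirm the existence of an element efficiently: in constant time on average for a HashSet, and in logarithmic time for a TreeSet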
Sets come in different flavors. HashSet stores its elements based on a hash-based mechanism (uses a HashMap internally), while a TreeSet will use a default or custom comparator to store and order its elements.
Sets also need to be efficient in guaranteeing uniqueness, which means storing elements efficiently is more important than preserving their order. As a result, unlike with a List, it is not straightforward to get the index of an item in a Set.
4. The Problem Statement
The problem that we want to solve here is to find the index of an element in a given Set. The index of the element should always be the same and should not change on each query. If the element is absent in the set, we should return -1.
Example 1:
Input Set [10, 2, 9, 15, 0]
Query: getIndexOf(10)
Output: 0
Query: getIndexOf(0)
Output: 4
Example 2:
Input Set ["Java", "Scala", "Python", "Ruby", "Go"]
Query: getIndexOf("Scala")
Output: 1
5. Writing a Utility Method To Get the Index
5.1. Using an Iterator
An Iterator lets us traverse a collection one element at a time. We first obtain an Iterator from the Set and advance it until we reach the element we are looking for, counting the steps as we go. When we find the element, we return the current count as its index:
public int getIndexUsingIterator(Set<E> set, E element) {
    Iterator<E> iterator = set.iterator();
    int index = 0;
    while (iterator.hasNext()) {
        if (element.equals(iterator.next())) {
            return index;
        }
        index++;
    }
    return -1;
}
5.2. Using For-Each Loop
We can alternatively apply the same solution using a for-each loop to traverse through the provided set:
public int getIndexUsingForEach(Set<E> set, E element) {
    int index = 0;
    for (E current : set) {
        if (element.equals(current)) {
            return index;
        }
        index++;
    }
    return -1;
}
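Both methods above refer to a type parameter E, so they presumably live inside a generic class. The following self-contained sketch assumes such a wrapper, here given the hypothetical name IndexOfElementsInSet, and runs the iterator variant against the first example from the problem statement. A LinkedHashSet holds the input so that the traversal order matches the order in which the example lists the elements:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical wrapper class; the name is an assumption made for this sketch.
class IndexOfElementsInSet<E> {

    // Walks the set with an Iterator and counts steps until the element is found.
    public int getIndexUsingIterator(Set<E> set, E element) {
        Iterator<E> iterator = set.iterator();
        int index = 0;
        while (iterator.hasNext()) {
            if (element.equals(iterator.next())) {
                return index;
            }
            index++;
        }
        return -1; // element is not in the set
    }

    public static void main(String[] args) {
        // LinkedHashSet keeps the elements in the order written in Example 1.
        Set<Integer> numbers = new LinkedHashSet<>(Arrays.asList(10, 2, 9, 15, 0));
        IndexOfElementsInSet<Integer> finder = new IndexOfElementsInSet<>();

        System.out.println(finder.getIndexUsingIterator(numbers, 10)); // 0
        System.out.println(finder.getIndexUsingIterator(numbers, 0));  // 4
        System.out.println(finder.getIndexUsingIterator(numbers, 7));  // -1
    }
}
```

Note that the method calls element.equals(...), so passing a null element throws a NullPointerException. If null lookups matter, Objects.equals(element, current) is one way to avoid that.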
We use these utility methods alongside whatever Set instance we have. Each call runs in O(n), or linear time, where n is the size of the set, and requires no additional space.
Both implementations return the same index no matter how many times we call getIndexUsingIterator() or getIndexUsingForEach(), as long as the set is not modified between calls. This satisfies the requirement from our problem statement that the index must not change from one query to the next.
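If many lookups are made against a set that does not change, the linear cost per call can be traded for extra memory. The sketch below is one possible variation, not something the utility methods above do: it walks the set once and records each element's position in a HashMap. The class name SetIndexSnapshot is hypothetical:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: captures the traversal positions of a set at construction time.
class SetIndexSnapshot<E> {

    private final Map<E, Integer> indexByElement = new HashMap<>();

    // One O(n) pass over the set; uses O(n) additional space.
    public SetIndexSnapshot(Set<E> source) {
        int index = 0;
        for (E current : source) {
            indexByElement.put(current, index++);
        }
    }

    // Average constant-time lookup; returns -1 for elements not seen in the snapshot.
    public int indexOf(E element) {
        return indexByElement.getOrDefault(element, -1);
    }

    public static void main(String[] args) {
        Set<String> languages = new LinkedHashSet<>(Arrays.asList("Java", "Scala", "Python", "Ruby", "Go"));
        SetIndexSnapshot<String> snapshot = new SetIndexSnapshot<>(languages);

        System.out.println(snapshot.indexOf("Scala")); // 1
        System.out.println(snapshot.indexOf("C++"));   // -1
    }
}
```

The snapshot goes stale as soon as the source set is modified, so it only suits read-mostly data.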
However, if there is a need to match the index output of this method to the order in which the element was inserted, we need to dig a little deeper.
5.3. Applying the Implementation on Different Types of Sets
It is important to note here that the index returned by traversing using an iterator might not match the insertion order, especially if we are using a HashSet as the source:
Set<Integer> set = new HashSet<>();
set.add(100);
set.add(20);
// add more elements
Assert.assertEquals(2, integerIndexOfElementsInSet.getIndexUsingIterator(set, 100));
Although we inserted 100 first, the index we receive from our implementation is 2. The iterator visits the elements in the order in which they are stored in the HashSet, not the order in which they were inserted.
To solve this, we can swap out our HashSet with a LinkedHashSet:
Set<Integer> set = new LinkedHashSet<>();
set.add(100);
set.add(20);
// add more elements
Assert.assertEquals(0, integerIndexOfElementsInSet.getIndexUsingIterator(set, 100));
LinkedHashSet is backed by a LinkedHashMap, which maintains a doubly linked list running through its entries and therefore preserves the insertion order of the elements.
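One detail of LinkedHashSet matters for index stability: adding an element that is already present does not move it. The short sketch below, with the hypothetical class name LinkedHashSetOrderDemo, illustrates this:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class showing that re-insertion keeps the original position.
class LinkedHashSetOrderDemo {

    static List<Integer> insertThenReinsert() {
        Set<Integer> set = new LinkedHashSet<>();
        set.add(100);
        set.add(20);
        set.add(5);
        set.add(100); // already present: add() returns false and the order is unchanged
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(insertThenReinsert()); // [100, 20, 5]
    }
}
```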
Similarly, when we use a TreeSet, the index we get from our implementation is based on the natural ordering of the elements in the Set:
Set<Integer> set = new TreeSet<>();
set.add(0);
set.add(-1);
set.add(100);
// add more elements
Assert.assertEquals(0, integerIndexOfElementsInSet.getIndexUsingIterator(set, -1));
Assert.assertEquals(3, integerIndexOfElementsInSet.getIndexUsingIterator(set, 100));
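For sorted sets specifically, the index of an element equals the number of elements that sort before it. As an alternative sketch, not the approach used in this tutorial, the standard SortedSet.headSet() view can express this directly. The class name SortedSetIndex is hypothetical:

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper for sorted sets only.
class SortedSetIndex {

    // headSet(element) is the view of all elements strictly less than 'element',
    // so its size is the position of 'element' in sorted order.
    static <E> int indexOf(SortedSet<E> set, E element) {
        if (!set.contains(element)) {
            return -1;
        }
        return set.headSet(element).size();
    }

    public static void main(String[] args) {
        SortedSet<Integer> set = new TreeSet<>();
        set.add(0);
        set.add(-1);
        set.add(100);

        // With exactly these three elements, the sorted order is [-1, 0, 100].
        System.out.println(indexOf(set, -1));  // 0
        System.out.println(indexOf(set, 100)); // 2
        System.out.println(indexOf(set, 7));   // -1
    }
}
```

A TreeSet that uses natural ordering rejects null, so this helper throws a NullPointerException if element is null.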
In this section, we looked at how we can find the index of an element in a Set and how we can use LinkedHashSet to correctly find the index based on the insertion order.
6. Writing a Custom LinkedHashSet Implementation
We can also write a custom variation of the LinkedHashSet class that adds a method for getting the index of an element. Although creating a subclass just to add one utility method is usually unnecessary, it is still an option:
public class InsertionIndexAwareSet<E> extends LinkedHashSet<E> {
    public int getIndexOf(E element) {
        int index = 0;
        for (E current : this) {
            if (current.equals(element)) {
                return index;
            }
            index++;
        }
        return -1;
    }
}
Finally, we can create an instance of our custom class and call the getIndexOf() method to get the index:
@Test
public void givenIndexAwareSetWithStrings_whenIndexOfElement_thenGivesIndex() {
    InsertionIndexAwareSet<String> set = new InsertionIndexAwareSet<>();
    set.add("Go");
    set.add("Java");
    set.add("Scala");
    set.add("Python");
    Assert.assertEquals(0, set.getIndexOf("Go"));
    Assert.assertEquals(2, set.getIndexOf("Scala"));
    Assert.assertEquals(-1, set.getIndexOf("C++"));
}
7. Using Apache Commons Collections
Finally, let’s also see how to use the Commons Collections library to solve our problem. The Apache Commons Collections library provides an extensive set of utility methods that help us tackle and extend the functionalities of the Java Collections APIs.
First, we need to add the Maven dependency to our project:
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-collections4</artifactId>
    <version>4.4</version>
</dependency>
We’ll use the ListOrderedSet class here. ListOrderedSet implements the Set interface and uses the decorator pattern to retain the insertion order of the elements. It also exposes an indexOf() method, so we don’t need to write our own. If we add an element that is already in the set, the element remains in its original position:
@Test
public void givenListOrderedSet_whenIndexOfElement_thenGivesIndex() {
    ListOrderedSet<Integer> set = new ListOrderedSet<>();
    set.add(12);
    set.add(0);
    set.add(-1);
    set.add(50);
    Assert.assertEquals(0, set.indexOf(12));
    Assert.assertEquals(2, set.indexOf(-1));
}
8. Conclusion
In this article, we looked at different ways to find the index of an element in a Set. We first looked at why it is difficult to find the index of elements in a Set, then wrote utility methods that traverse the set, and saw how a LinkedHashSet, or our own extension of it, returns indices that follow the insertion order. Finally, we looked at how to use the Apache Commons Collections library for the same outcome.
As usual, all the code samples shown in this tutorial are available over on GitHub.