1. Overview

The Vavr library, formerly known as Javaslang, is a functional library for Java. In this article, we explore its powerful collections API.

To get more information about this library, please read this article.

2. Persistent Collections

A persistent collection when modified produces a new version of the collection while preserving the current version.

Maintaining multiple versions of the same collection might lead to inefficient CPU and memory usage. However, the Vavr collection library overcomes this by sharing data structure across different versions of a collection.

This is fundamentally different from Java’s unmodifiableCollection() from the Collections utility class, which merely provides a wrapper around an underlying collection.

Trying to modify such a collection results in UnsupportedOperationException instead of a new version being created. Moreover, the underlying collection is still mutable through its direct reference.

3. Traversable

Traversable is the base type of all Vavr collections – this interface defines methods that are shared among all data structures.

It provides some useful default methods such as size(), get(), filter(), isEmpty() and others which are inherited by sub-interfaces.

Let’s explore the collections library further.

4. Seq

We’ll start with sequences.

The Seq interface represents sequential data structures. It is the parent interface for List, Stream, Queue, Array, Vector, and CharSeq. All these data structures have their own unique properties which we’ll be exploring below.

4.1. List

A List is an eagerly-evaluated sequence of elements extending the LinearSeq interface.

Persistent Lists are formed recursively from a head and a tail:

  • Head – the first element
  • Tail – a list containing remaining elements (that list is also formed from a head and a tail)

There are static factory methods in the List API that can be used for creating a List. We can use the static of() method to create an instance of List from one or more objects.

We can also use the static empty() to create an empty List and ofAll() to create a List from an Iterable type:

List<String> list = List.of(
  "Java", "PHP", "Jquery", "JavaScript", "JShell", "JAVA");

Let’s take a look at some examples on how to manipulate lists.

We can use the drop() and its variants to remove first N elements:

List list1 = list.drop(2);                                      
assertFalse(list1.contains("Java") && list1.contains("PHP"));   
                                                                
List list2 = list.dropRight(2);                                 
assertFalse(list2.contains("JAVA") && list2.contains("JShell"));
                                                                
List list3 = list.dropUntil(s -> s.contains("Shell"));          
assertEquals(list3.size(), 2);                                  
                                                                
List list4 = list.dropWhile(s -> s.length() > 0);               
assertTrue(list4.isEmpty());

drop(int n) removes n number of elements from the list starting from the first element while the dropRight() does the same starting from the last element in the list.

dropUntil() continues removing elements from the list until the predicate evaluates to true whereas the dropWhile() continues dropping elements while the predicate is true.

There’s also dropRightWhile() and dropRightUntil() that starts removing elements from the right.

Next, take(int n) is used to grab elements from a list. It takes n number of elements from the list and then stops. There’s also a takeRight(int n) that starts taking elements from the end of the list:

List list5 = list.take(1);                       
assertEquals(list5.single(), "Java");            
                                                 
List list6 = list.takeRight(1);                  
assertEquals(list6.single(), "JAVA");            
                                                 
List list7 = list.takeUntil(s -> s.length() > 6);
assertEquals(list7.size(), 3);

Finally, takeUntil() continues taking elements from the list until the predicate is true. There’s a takeWhile() variant that takes a predicate argument as well.

Moreover, there are other useful methods in the API, e.g., actually the distinct() that returns a list of non-duplicate elements as well as the distinctBy() that accepts a Comparator to determine equality.

Very interestingly, there’s also the intersperse() that inserts an element in between every element of a list. It can be very handy for String operations:

List list8 = list
  .distinctBy((s1, s2) -> s1.startsWith(s2.charAt(0) + "") ? 0 : 1);
assertEquals(list8.size(), 2);

String words = List.of("Boys", "Girls")
  .intersperse("and")
  .reduce((s1, s2) -> s1.concat( " " + s2 ))
  .trim();  
assertEquals(words, "Boys and Girls");

Want to divide a list into categories? Well, there’s an API for that too:

Iterator<List<String>> iterator = list.grouped(2);
assertEquals(iterator.head().size(), 2);

Map<Boolean, List<String>> map = list.groupBy(e -> e.startsWith("J"));
assertEquals(map.size(), 2);
assertEquals(map.get(false).get().size(), 1);
assertEquals(map.get(true).get().size(), 5);

The group(int n) divides a List into groups of n elements each. The groupdBy() accepts a Function that contains the logic for dividing the list and returns a Map with two entries – true and false.

The true key maps to a List of elements that satisfy the condition specified in the Function; the false key maps to a List of elements that do not.

As expected, when mutating a List, the original List is not actually modified. Instead, a new version of the List is always returned.

We can also interact with a List using stack semantics – last-in-first-out (LIFO) retrieval of elements. To this extent, there are API methods for manipulating a stack such as peek(), pop() and push():

List<Integer> intList = List.empty();

List<Integer> intList1 = intList.pushAll(List.rangeClosed(5,10));

assertEquals(intList1.peek(), Integer.valueOf(10));

List intList2 = intList1.pop();
assertEquals(intList2.size(), (intList1.size() - 1) );

The pushAll() function is used to insert a range of integers onto the stack, while the peek() is used to get the head of the stack. There’s also the peekOption() that can wrap the result in an Option object.

There are other interesting and really useful methods in the List interface that are neatly documented in the Java docs.

4.2. Queue

An immutable Queue stores elements allowing a first-in-first-out (FIFO) retrieval.

A Queue internally consists of two linked lists, a front List, and a rear List. The front List contains the elements that are dequeued, and the rear List contains the elements that are enqueued.

This allows enqueue and dequeue operations to perform in O(1). When the front List runs out of elements, front and rear List’s are swapped, and the rear List is reversed.

Let’s create a queue:

Queue<Integer> queue = Queue.of(1, 2);
Queue<Integer> secondQueue = queue.enqueueAll(List.of(4,5));

assertEquals(3, queue.size());
assertEquals(5, secondQueue.size());

Tuple2<Integer, Queue<Integer>> result = secondQueue.dequeue();
assertEquals(Integer.valueOf(1), result._1);

Queue<Integer> tailQueue = result._2;
assertFalse(tailQueue.contains(secondQueue.get(0)));

The dequeue function removes the head element from the Queue and returns a Tuple2<T, Q>. The tuple contains the head element that has been removed as the first entry and the remaining elements of the Queue as the second entry.

We can use the combination(n) to get all the possible N combinations of elements in the Queue:

Queue<Queue<Integer>> queue1 = queue.combinations(2);
assertEquals(queue1.get(2).toCharSeq(), CharSeq.of("23"));

Again, we can see that the original Queue is not modified while enqueuing/dequeuing elements.

4.3. Stream

A Stream is an implementation of a lazy linked list and is quite different from java.util.stream. Unlike java.util.stream, the Vavr Stream stores data and is lazily evaluating next elements.

Let’s say we have a Stream of integers:

Stream<Integer> s = Stream.of(2, 1, 3, 4);

Printing the result of s.toString() to the console will only show Stream(2, ?). This means that it is only the head of the Stream that has been evaluated while the tail has not been evaluated.

Invoking s.get(3) and subsequently displaying the result of s.tail() returns Stream(1, 3, 4, ?). On the contrary, without invoking s.get(3) first which causes the Stream to evaluate the last element – the result of s.tail() will only be Stream(1, ?). This means just the first element of the tail has been evaluated.

This behaviour can improve performance and makes it possible to use Stream to represent sequences that are (theoretically) infinitely long.

Vavr Stream is immutable and may be Empty or Cons. A Cons consists of a head element and a lazy computed tail Stream. Unlike a List, for a Stream, only the head element is kept in memory. The tail elements are computed on demand.

Let’s create a Stream of 10 positive integers and compute the sum of the even numbers:

Stream<Integer> intStream = Stream.iterate(0, i -> i + 1)
  .take(10);

assertEquals(10, intStream.size());

long evenSum = intStream.filter(i -> i % 2 == 0)
  .sum()
  .longValue();

assertEquals(20, evenSum);

As opposed to Java 8 Stream API, Vavr’s Stream is a data structure for storing a sequence of elements.

Thus, it has methods like get(), append(), insert() and others for manipulating its elements. The drop(), distinct() and some other methods considered earlier are also available.

Finally, let’s quickly demonstrate the tabulate() in a Stream. This method returns a Stream of length n, which contains elements that are the result of applying a function:

Stream<Integer> s1 = Stream.tabulate(5, (i)-> i + 1);
assertEquals(s1.get(2).intValue(), 3);

We can also use the zip() to generate a Stream of Tuple2<Integer, Integer>, which contains elements that are formed by combining two Streams:

Stream<Integer> s = Stream.of(2,1,3,4);

Stream<Tuple2<Integer, Integer>> s2 = s.zip(List.of(7,8,9));
Tuple2<Integer, Integer> t1 = s2.get(0);
 
assertEquals(t1._1().intValue(), 2);
assertEquals(t1._2().intValue(), 7);

4.4. Array

An Array is an immutable, indexed, sequence that allows efficient random access. It is backed by a Java array of objects. Essentially, it is a Traversable wrapper for an array of objects of type T.

We can instantiate an Array by using the static method of(). We can also generate a range elements by using the static range() and rangeBy() methods. The rangeBy() has a third parameter that let us define the step.

The range() and rangeBy() methods will only generate elements starting from the start value to end value minus one. If we need to include the end value we can use either the rangeClosed() or rangeClosedBy():

Array<Integer> rArray = Array.range(1, 5);
assertFalse(rArray.contains(5));

Array<Integer> rArray2 = Array.rangeClosed(1, 5);
assertTrue(rArray2.contains(5));

Array<Integer> rArray3 = Array.rangeClosedBy(1,6,2);
assertEquals(rArray3.size(), 3);

Let’s manipulate the elements by index:

Array<Integer> intArray = Array.of(1, 2, 3);
Array<Integer> newArray = intArray.removeAt(1);

assertEquals(3, intArray.size());
assertEquals(2, newArray.size());
assertEquals(3, newArray.get(1).intValue());

Array<Integer> array2 = intArray.replace(1, 5);
assertEquals(array2.get(0).intValue(), 5);

4.5. Vector

A Vector is a kind of in-between Array and List providing another indexed sequence of elements that allows both random access and modification in constant time:

Vector<Integer> intVector = Vector.range(1, 5);
Vector<Integer> newVector = intVector.replace(2, 6);

assertEquals(4, intVector.size());
assertEquals(4, newVector.size());

assertEquals(2, intVector.get(1).intValue());
assertEquals(6, newVector.get(1).intValue());

4.6. CharSeq

CharSeq is a collection object to express a sequence of primitive characters. It is essentially a String wrapper with the addition of collection operations.

To create a CharSeq:

CharSeq chars = CharSeq.of("vavr");
CharSeq newChars = chars.replace('v', 'V');

assertEquals(4, chars.size());
assertEquals(4, newChars.size());

assertEquals('v', chars.charAt(0));
assertEquals('V', newChars.charAt(0));
assertEquals("Vavr", newChars.mkString());

5. Set

In this section, we elaborate on various Set implementations in the collections library. The unique feature of the Set data structure is that it doesn’t allow duplicate values.

There are, however, different implementations of Set – the HashSet being the basic one. The TreeSet doesn’t allow duplicate elements and can be sorted. The LinkedHashSet maintains the insertion order of its elements.

Let’s have a closer look at these implementations one by one.

5.1. HashSet

HashSet has static factory methods for creating new instances – some of which we have explored previously in this article – like of(), ofAll() and variations of range() methods.

We can get the difference between two sets by using the diff() method. Also, the union() and intersect() methods return the union set and intersection set of the two sets:

HashSet<Integer> set0 = HashSet.rangeClosed(1,5);
HashSet<Integer> set1 = HashSet.rangeClosed(3, 6);

assertEquals(set0.union(set1), HashSet.rangeClosed(1,6));
assertEquals(set0.diff(set1), HashSet.rangeClosed(1,2));
assertEquals(set0.intersect(set1), HashSet.rangeClosed(3,5));

We can also perform basic operations such as adding and removing elements:

HashSet<String> set = HashSet.of("Red", "Green", "Blue");
HashSet<String> newSet = set.add("Yellow");

assertEquals(3, set.size());
assertEquals(4, newSet.size());
assertTrue(newSet.contains("Yellow"));

The HashSet implementation is backed by a Hash array mapped trie (HAMT), which boasts a superior performance when compared to an ordinary HashTable and its structure makes it suitable for backing a persistent collection.

5.2. TreeSet

An immutable TreeSet is an implementation of the SortedSet interface. It stores a Set of sorted elements and is implemented using binary search trees. All its operations run in O(log n) time.

By default, elements of a TreeSet are sorted in their natural order.

Let’s create a SortedSet using natural sorting order:

SortedSet<String> set = TreeSet.of("Red", "Green", "Blue");
assertEquals("Blue", set.head());

SortedSet<Integer> intSet = TreeSet.of(1,2,3);
assertEquals(2, intSet.average().get().intValue());

To order elements in a customized manner, pass a Comparator instance while creating a TreeSet. We can also generate a string from the set elements:

SortedSet<String> reversedSet
  = TreeSet.of(Comparator.reverseOrder(), "Green", "Red", "Blue");
assertEquals("Red", reversedSet.head());

String str = reversedSet.mkString(" and ");
assertEquals("Red and Green and Blue", str);

5.3. BitSet

Vavr collections also contain an immutable BitSet implementation. The BitSet interface extends the SortedSet interface. BitSet can be instantiated using static methods in BitSet.Builder.

Like other implementations of the Set data structure, BitSet does not allow duplicate entries to be added to the set.

It inherits methods for manipulation from the Traversable interface. Note that it is different from the java.util.BitSet in the standard Java library. BitSet data can’t contain String values.

Let’s see how to create a BitSet instance using the factory method of():

BitSet<Integer> bitSet = BitSet.of(1,2,3,4,5,6,7,8);
BitSet<Integer> bitSet1 = bitSet.takeUntil(i -> i > 4);
assertEquals(bitSet1.size(), 4);

We use the takeUntil() to select the first four elements of BitSet. The operation returned a new instance. Take note that the takeUntil() is defined in the Traversable interface, which is a parent interface of BitSet.

Other methods and operations demonstrated above, that are defined in the Traversable interface, are also applicable to BitSet as well.

6. Map

A map is a key-value data structure. Vavr’s Map is immutable and has implementations for HashMap, TreeMap, and LinkedHashMap.

Generally, map contracts don’t allow duplicate keys – though there may be duplicate values mapped to different keys.

6.1. HashMap

A HashMap is an implementation of an immutable Map interface. It stores key-value pairs using the hash code of the keys.

Vavr’s Map uses Tuple2 to represent key-value pairs instead of a traditional Entry type:

Map<Integer, List<Integer>> map = List.rangeClosed(0, 10)
  .groupBy(i -> i % 2);
        
assertEquals(2, map.size());
assertEquals(6, map.get(0).get().size());
assertEquals(5, map.get(1).get().size());

Similar to HashSet, a HashMap implementation is backed by a hash array mapped trie (HAMT) resulting in constant time for almost all operations.

We can filter map entries by keys, using the filterKeys() method or by values, using the filterValues() method. Both methods accept a Predicate as an argument:

Map<String, String> map1
  = HashMap.of("key1", "val1", "key2", "val2", "key3", "val3");
        
Map<String, String> fMap
  = map1.filterKeys(k -> k.contains("1") || k.contains("2"));
assertFalse(fMap.containsKey("key3"));
        
Map<String, String> fMap2
  = map1.filterValues(v -> v.contains("3"));
assertEquals(fMap2.size(), 1);
assertTrue(fMap2.containsValue("val3"));

We can also transform map entries by using the map() method. Let’s, for example, transform map1 to a Map<String, Integer>:

Map<String, Integer> map2 = map1.map(
  (k, v) -> Tuple.of(k, Integer.valueOf(v.charAt(v.length() - 1) + "")));
assertEquals(map2.get("key1").get().intValue(), 1);

6.2. TreeMap

An immutable TreeMap is an implementation of the SortedMap interface. Similar to TreeSet, a Comparator instance is used to custom sort elements of a TreeMap.

Let’s demonstrate the creation of a SortedMap:

SortedMap<Integer, String> map
  = TreeMap.of(3, "Three", 2, "Two", 4, "Four", 1, "One");

assertEquals(1, map.keySet().toJavaArray()[0]);
assertEquals("Four", map.get(4).get());

By default, entries of TreeMap are sorted in the natural order of the keys. We can, however, specify a Comparator that will be used for sorting:

TreeMap<Integer, String> treeMap2 =
  TreeMap.of(Comparator.reverseOrder(), 3,"three", 6, "six", 1, "one");
assertEquals(treeMap2.keySet().mkString(), "631");

As with TreeSet, a TreeMap implementation is also modeled using a tree, hence its operations are of O(log n) time. The map.get(key) returns an Option that wraps a value at the specified key in the map.

7. Interoperability With Java

The collection API is fully interoperable with Java’s collection framework. Let’s see how this is done in practice.

7.1. Java to Vavr Conversion

Each collection implementation in Vavr has a static factory method ofAll() that takes a java.util.Iterable. This allows us to create a Vavr collection out of a Java collection. Likewise, another factory method ofAll() takes a Java Stream directly.

To convert a Java List to an immutable List:

java.util.List<Integer> javaList = java.util.Arrays.asList(1, 2, 3, 4);
List<Integer> vavrList = List.ofAll(javaList);

java.util.stream.Stream<Integer> javaStream = javaList.stream();
Set<Integer> vavrSet = HashSet.ofAll(javaStream);

Another useful function is the collector() that can be used in conjunction with Stream.collect() to obtain a Vavr collection:

List<Integer> vavrList = IntStream.range(1, 10)
  .boxed()
  .filter(i -> i % 2 == 0)
  .collect(List.collector());

assertEquals(4, vavrList.size());
assertEquals(2, vavrList.head().intValue());

7.2. Vavr to Java Conversion

Value interface has many methods to convert a Vavr type to a Java type. These methods are of the format toJavaXXX().

Let’s address a couple of examples:

Integer[] array = List.of(1, 2, 3)
  .toJavaArray(Integer.class);
assertEquals(3, array.length);

java.util.Map<String, Integer> map = List.of("1", "2", "3")
  .toJavaMap(i -> Tuple.of(i, Integer.valueOf(i)));
assertEquals(2, map.get("2").intValue());

We can also use Java 8 Collectors to collect elements from Vavr collections:

java.util.Set<Integer> javaSet = List.of(1, 2, 3)
  .collect(Collectors.toSet());
        
assertEquals(3, javaSet.size());
assertEquals(1, javaSet.toArray()[0]);

7.3. Java Collection Views

Alternatively, the library provides so-called collection views that perform better when converting to Java collections. The conversion methods from the previous section iterate through all the elements to build a Java collection.

Views, on the other hand, implement standard Java interfaces and delegate method calls to the underlying Vavr collection.

As of this writing, only the List view is supported. Each sequential collection has two methods, one to create an immutable view and another for a mutable view.

Calling mutator methods on immutable view results in an UnsupportedOperationException.

Let’s look at an example:

@Test(expected = UnsupportedOperationException.class)
public void givenVavrList_whenViewConverted_thenException() {
    java.util.List<Integer> javaList = List.of(1, 2, 3)
      .asJava();
    
    assertEquals(3, javaList.get(2).intValue());
    javaList.add(4);
}

To create an immutable view:

java.util.List<Integer> javaList = List.of(1, 2, 3)
  .asJavaMutable();
javaList.add(4);

assertEquals(4, javaList.get(3).intValue());

8. Conclusion

In this tutorial, we’ve learned about various functional data structures provided by Vavr’s Collection API.

Finally, it’s important to note that the library also defines Try, Option, Either, and Future that extend the Value interface and as a consequence implement Java’s Iterable interface. This implies that they can behave as a collection in some situations.

The complete source code for all the examples in this article can be found over on Github.


« 上一篇: Java Weekly, 第193期
» 下一篇: JDeferred指南