1. Introduction

Java 8 introduced the concept of Streams to the collection hierarchy. These allow for some very powerful processing of data in a very readable way, utilizing some functional programming concepts to make the process work.

We will investigate how we can achieve the same functionality using Kotlin idioms. We will also look at features that are not available in plain Java.

2. Java vs. Kotlin

In Java 8, the new fancy API can be used only when interacting with java.util.stream.Stream instances.

The good thing is that all standard collections – anything that implements java.util.Collection – have a particular method stream() that can produce a Stream instance.

It’s important to remember that the Stream is not a Collection. It does not implement java.util.Collection and it does not implement any of the normal semantics of Collections in Java. It is more akin to a one-time Iterator in that it is derived from a Collection and is used to work through it, performing operations on each element that is seen.

In Kotlin, all collection types already support these operations without needing to convert them first. A conversion is only needed if the collection semantics are wrong – e.g., a Set has unique elements but is unordered.

One benefit of this is that there’s no need for an initial conversion from a Collection into a Stream, and no need for a final conversion from a Stream back into a collection – using the collect() calls.

For example, in Java 8 we would have to write the following:

someList
  .stream()
  .map(n -> n * 2) // some operation
  .collect(Collectors.toList());

The equivalent in Kotlin is simply:

someList
  .map { n -> n * 2 } // some operation

Additionally, Java 8 Streams are not reusable. Once a Stream has been consumed, it can't be used again.

For example, the following will not work:

Stream<Integer> someIntegers = integers.stream();
someIntegers.forEach(System.out::println);
someIntegers.forEach(System.out::println); // throws IllegalStateException

In Kotlin, these are all just normal collections, so this problem never arises. Intermediate results can be assigned to variables, shared freely, and reused as we would expect.
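As a minimal sketch of that point, the intermediate list below is stored once and read twice. The function and variable names are illustrative only.

```kotlin
// An intermediate result is an ordinary List, so it can be read any number of times.
fun evenNumbers(numbers: List<Int>): List<Int> = numbers.filter { it % 2 == 0 }

fun main() {
    val evens = evenNumbers(listOf(1, 2, 3, 4, 5, 6))

    // Both calls read the same list; nothing is "consumed" by the first one.
    val evenSum = evens.sum()
    val evenCount = evens.count()

    println("sum=$evenSum, count=$evenCount") // sum=12, count=3
}
```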

3. Lazy Sequences

One of the key things about Java 8 Streams is that they are evaluated lazily. This means that no more work than needed will be performed.

This is especially useful if we are doing potentially expensive operations on the elements in the Stream, and it also makes it possible to work with infinite sequences.

For example, IntStream.generate() produces an infinite Stream of integers. If we call findFirst() on it, we get the first element without running into an infinite loop.

In Kotlin, collections are eager, rather than lazy. The exception here is Sequence, which does evaluate lazily.

This is an important distinction to note, as the following example shows:

val result = listOf(1, 2, 3, 4, 5) 
  .map { n -> n * n } 
  .filter { n -> n < 10 } 
  .first()

The Kotlin version of this will perform five map() operations, five filter() operations and then extract the first value. The Java 8 version will only perform one map() and one filter() because from the perspective of the last operation, no more is needed.

All collections in Kotlin can be converted to a lazy sequence using the asSequence() method.

Using a Sequence instead of a List in the above example performs the same number of operations as in Java 8.
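To make the difference visible, the following sketch counts how often each lambda runs for the eager List and for the lazy Sequence. The OpCounts class and both helper functions are hypothetical names introduced for this illustration.

```kotlin
// Holds the pipeline result together with the number of lambda invocations.
data class OpCounts(val result: Int, val maps: Int, val filters: Int)

// Eager version: each step processes the whole list before the next step starts.
fun eagerFirstSquare(numbers: List<Int>): OpCounts {
    var maps = 0
    var filters = 0
    val result = numbers
        .map { n -> maps++; n * n }
        .filter { n -> filters++; n < 10 }
        .first()
    return OpCounts(result, maps, filters)
}

// Lazy version: elements are pulled through the pipeline one at a time.
fun lazyFirstSquare(numbers: List<Int>): OpCounts {
    var maps = 0
    var filters = 0
    val result = numbers
        .asSequence() // switch to lazy evaluation
        .map { n -> maps++; n * n }
        .filter { n -> filters++; n < 10 }
        .first()
    return OpCounts(result, maps, filters)
}

fun main() {
    val numbers = listOf(1, 2, 3, 4, 5)
    println(eagerFirstSquare(numbers)) // OpCounts(result=1, maps=5, filters=5)
    println(lazyFirstSquare(numbers))  // OpCounts(result=1, maps=1, filters=1)
}
```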

4. Java 8 Stream Operations

In Java 8, Stream operations are broken down into two categories:

  • intermediate and
  • terminal

Intermediate operations essentially convert one Stream into another lazily – for example, a Stream of all integers into a Stream of all even integers.

Terminal operations are the final step of a Stream method chain and trigger the actual processing.

In Kotlin there is no such distinction. Instead, these are all just functions that take the collection as input and produce a new output.

Note that if we’re using an eager collection in Kotlin, then these operations are evaluated immediately, which may be surprising when compared to Java. If we need it to be lazy, remember to convert to a Sequence first.
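A Sequence behaves much like a Stream in this respect: nothing runs until a terminal function such as toList() is called, which also makes infinite sources workable. The sketch below assumes a hypothetical helper named firstEvenSquares and uses generateSequence() from the standard library as the infinite source.

```kotlin
// Takes the first `count` even squares from an infinite sequence of integers.
fun firstEvenSquares(count: Int): List<Int> =
    generateSequence(1) { it + 1 } // 1, 2, 3, ... with no upper bound
        .map { it * it }
        .filter { it % 2 == 0 }
        .take(count)
        .toList() // the only step that triggers evaluation

fun main() {
    val log = mutableListOf<Int>()
    val pipeline = listOf(1, 2, 3).asSequence().map { log.add(it); it * 2 }

    println(log.isEmpty())       // true: no element has been mapped yet
    println(pipeline.toList())   // [2, 4, 6]
    println(log)                 // [1, 2, 3]
    println(firstEvenSquares(3)) // [4, 16, 36]
}
```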

4.1. Intermediate Operations

Almost all intermediate operations from the Java 8 Streams API have equivalents in Kotlin. These are not intermediate operations though – except in the case of the Sequence class – as they result in fully populated collections from processing the input collection.

Out of these operations, there are several that work exactly the same (filter(), map(), flatMap(), distinct() and sorted()) and some that work the same but have different names: limit() is now take(), and skip() is now drop(). For example:

val oddSquared = listOf(1, 2, 3, 4, 5)
  .filter { n -> n % 2 == 1 } // 1, 3, 5
  .map { n -> n * n } // 1, 9, 25
  .drop(1) // 9, 25
  .take(1) // 9

This will return a list containing the single value 9, which is 3².

Some of these operations also have an additional version – suffixed with the word “To” – that outputs into a provided collection instead of producing a new one.

This can be useful for processing several input collections into the same output collection, for example:

val target = mutableListOf<Int>()
listOf(1, 2, 3, 4, 5)
  .filterTo(target) { n -> n % 2 == 0 }

This will insert the values “2” and “4” into the list “target”.
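The case of several inputs feeding one output could look roughly like this. The collectEvens function is a hypothetical helper, not part of the standard library.

```kotlin
// Collects the even numbers from several input lists into one shared target list.
fun collectEvens(vararg inputs: List<Int>): List<Int> {
    val target = mutableListOf<Int>()
    for (input in inputs) {
        // filterTo appends matching elements to target instead of creating a new list
        input.filterTo(target) { n -> n % 2 == 0 }
    }
    return target
}

fun main() {
    println(collectEvens(listOf(1, 2, 3, 4, 5), listOf(6, 7, 8))) // [2, 4, 6, 8]
}
```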

The one operation without an exact counterpart is peek(), which Java 8 uses to observe the entries in the Stream in the middle of a processing pipeline without interrupting the flow.

The closest replacement is onEach(). On a lazy Sequence it behaves like peek(), seeing each element as it flows through the pipeline. It is also available on eager collections (since Kotlin 1.1), but there it visits every element of the collection before the next step starts, so we need to be aware of which type we are using.
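A small sketch of onEach() on a Sequence, used the way peek() would be used in Java. The squaresWithTrace function and its trace parameter are assumptions made for this example.

```kotlin
// Records each odd number as it passes through the pipeline, then squares it.
fun squaresWithTrace(numbers: List<Int>, trace: MutableList<Int>): List<Int> =
    numbers
        .asSequence()
        .filter { it % 2 == 1 }
        .onEach { trace.add(it) } // side effect only; elements pass through unchanged
        .map { it * it }
        .toList()

fun main() {
    val trace = mutableListOf<Int>()
    println(squaresWithTrace(listOf(1, 2, 3, 4, 5), trace)) // [1, 9, 25]
    println(trace)                                          // [1, 3, 5]
}
```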

There are also some additional variations on the standard intermediate operations that make life easier. For example, the filter operation has additional versions filterNotNull(), filterIsInstance(), filterNot() and filterIndexed().

For example:

listOf(1, 2, 3, 4, 5)
  .map { n -> n * (n + 1) / 2 }
  .mapIndexed { i, n -> "Triangular number ${i + 1}: $n" }

This will produce the first five triangular numbers, in the form “Triangular number 3: 6”.
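The filter variants mentioned above can be sketched in the same way. The helper names below are illustrative; only the filter functions themselves come from the standard library.

```kotlin
// filterNotNull drops nulls and narrows the element type from Any? to Any.
fun withoutNulls(items: List<Any?>): List<Any> = items.filterNotNull()

// filterIsInstance keeps only elements of the requested type.
fun onlyStrings(items: List<Any?>): List<String> = items.filterIsInstance<String>()

// filterNot keeps the elements for which the predicate is false.
fun withoutShortWords(words: List<String>): List<String> = words.filterNot { it.length < 4 }

// filterIndexed passes the index alongside the element.
fun <T> everyOther(items: List<T>): List<T> = items.filterIndexed { index, _ -> index % 2 == 0 }

fun main() {
    val mixed = listOf(1, "two", null, 3, "four")
    println(withoutNulls(mixed))                                      // [1, two, 3, four]
    println(onlyStrings(mixed))                                       // [two, four]
    println(withoutShortWords(listOf("map", "filter", "take")))       // [filter, take]
    println(everyOther(listOf("a", "b", "c", "d", "e")))              // [a, c, e]
}
```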

Another important difference is in the way the flatMap operation works. In Java 8, the lambda is required to return a Stream instance, whereas in Kotlin it can return any Iterable, such as a List or a Set. This makes it easier to work with.

For example:

val letters = listOf("This", "Is", "An", "Example")
  .flatMap { w -> w.toList() } // Produces a List<Char>
  .filter { c -> c.isUpperCase() }

In Java 8, the lambda on the second line would have to return a Stream, for example w.chars().mapToObj(c -> (char) c), because Arrays.stream() has no overload for char[].

4.2. Terminal Operations

All of the standard Terminal Operations from the Java 8 Streams API have direct replacements in Kotlin, with the sole exception of collect.

A couple of them do have different names:

  • anyMatch() -> any()
  • allMatch() -> all()
  • noneMatch() -> none()

Some of them have additional variations that reflect Kotlin's null safety. For example, findFirst() is replaced by first() and firstOrNull(): first() returns a non-nullable type and throws a NoSuchElementException if the collection is empty, while firstOrNull() returns null instead.
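The two flavours can be compared side by side in a short sketch. The firstLongWord helper is a hypothetical name used only here.

```kotlin
// Returns the first word longer than five characters, or null when there is none.
fun firstLongWord(words: List<String>): String? =
    words.firstOrNull { it.length > 5 }

fun main() {
    val words = listOf("map", "filter", "reduce")
    println(words.first { it.length > 5 }) // filter
    println(firstLongWord(listOf("map")))  // null

    // first() throws NoSuchElementException when no element matches
    val failure = runCatching { listOf("map").first { it.length > 5 } }
    println(failure.exceptionOrNull() is NoSuchElementException) // true
}
```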

The interesting case is collect. Java 8 uses this to gather all Stream elements into some result using a provided strategy.

The strategy is an arbitrary Collector, which is handed every element in the Stream and produces an output of some kind. Collectors are usually obtained from the Collectors helper class, but we can write our own if needed.

In Kotlin, there are direct replacements for almost all of the standard collectors, available as functions on the collection itself, so there is no need for the additional step of providing a collector.
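As a rough illustration, here are Kotlin counterparts for three common collectors. The Person class and the function names are hypothetical; the mapping to specific Collectors methods in the comments is an informal comparison rather than an exact equivalence.

```kotlin
// Hypothetical domain class used only for this example.
data class Person(val name: String, val city: String, val age: Int)

// Roughly Collectors.groupingBy combined with Collectors.mapping
fun namesByCity(people: List<Person>): Map<String, List<String>> =
    people.groupBy({ it.city }, { it.name })

// Roughly Collectors.joining
fun joinedNames(people: List<Person>): String =
    people.joinToString(", ") { it.name }

// Roughly Collectors.partitioningBy, returned as a Pair instead of a Map
fun adultsAndMinors(people: List<Person>): Pair<List<Person>, List<Person>> =
    people.partition { it.age >= 18 }

fun main() {
    val people = listOf(
        Person("Ann", "Oslo", 34),
        Person("Bob", "Rome", 15),
        Person("Cid", "Oslo", 52)
    )
    println(namesByCity(people)) // {Oslo=[Ann, Cid], Rome=[Bob]}
    println(joinedNames(people)) // Ann, Bob, Cid

    val (adults, minors) = adultsAndMinors(people)
    println(adults.map { it.name }) // [Ann, Cid]
    println(minors.map { it.name }) // [Bob]
}
```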

The one exception here is the summarizingDouble/summarizingInt/summarizingLong methods, which produce mean, count, min, max and sum all in one go. Each of these can be produced individually, though that has a higher cost because every call iterates over the collection again.

Alternatively, we can manage it using a for-each loop and handle it by hand if needed – it is unlikely we will need all 5 of these values at the same time, so we only need to implement the ones that are important.
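A hand-written single pass might look like the following sketch. IntSummary and summarize are hypothetical names, loosely modelled on java.util.IntSummaryStatistics, including its behaviour for empty input.

```kotlin
// Hypothetical stand-in for java.util.IntSummaryStatistics.
data class IntSummary(val count: Int, val sum: Long, val min: Int, val max: Int) {
    // Average is 0.0 for an empty input, mirroring IntSummaryStatistics.
    val average: Double get() = if (count == 0) 0.0 else sum.toDouble() / count
}

// Computes count, sum, min and max in one iteration over the input.
fun summarize(numbers: Iterable<Int>): IntSummary {
    var count = 0
    var sum = 0L
    var min = Int.MAX_VALUE
    var max = Int.MIN_VALUE
    for (n in numbers) {
        count++
        sum += n
        if (n < min) min = n
        if (n > max) max = n
    }
    return IntSummary(count, sum, min, max)
}

fun main() {
    val summary = summarize(listOf(1, 2, 3, 4, 5))
    println(summary)         // IntSummary(count=5, sum=15, min=1, max=5)
    println(summary.average) // 3.0
}
```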

5. Additional Operations in Kotlin

Kotlin adds some additional operations to collections that are not possible in Java 8 without implementing them ourselves.

Some of these are simply extensions to the standard operations, as described above. For example, many of the operations have a variant that adds the result to an existing collection rather than returning a new one.

It is also possible in many cases to have the lambda provided with not only the element in question but also the index of the element – for collections that are ordered, and so indexes make sense.

There are also some operations that take explicit advantage of the null safety of Kotlin. For example, we can call filterNotNull() on a List<String?> to get a List<String> with all nulls removed.

Actual additional operations that can be done in Kotlin but not in Java 8 Streams include:

  • zip() and unzip() – are used to combine two collections into one list of pairs, and conversely to convert a collection of pairs into two collections
  • associate – is used for converting a collection into a map by providing a lambda to convert each entry in the collection into a key/value pair in the resulting map

For example:

val numbers = listOf(1, 2, 3)
val words = listOf("one", "two", "three")
numbers.zip(words)

This produces a List<Pair<Int, String>>, with values 1 to “one”, 2 to “two” and 3 to “three”.

val squares = listOf(1, 2, 3, 4, 5)
  .associate { n -> n to n * n }

This produces a Map<Int, Int>, where the keys are the numbers 1 to 5, and the values are the squares of those values.
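The reverse direction, unzip(), can be sketched as follows. The splitPairs wrapper is a hypothetical helper added only to give the call a name.

```kotlin
// Splits a list of pairs back into two parallel lists.
fun splitPairs(pairs: List<Pair<Int, String>>): Pair<List<Int>, List<String>> =
    pairs.unzip()

fun main() {
    val pairs = listOf(1 to "one", 2 to "two", 3 to "three")
    val (numbers, words) = splitPairs(pairs)

    println(numbers) // [1, 2, 3]
    println(words)   // [one, two, three]

    // Zipping the two lists again restores the original pairs.
    println(numbers.zip(words) == pairs) // true
}
```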

6. Summary

Most of the stream operations we are used to from Java 8 are directly usable in Kotlin on the standard Collection classes, with no need to convert to a Stream first.

In addition, Kotlin adds more flexibility to how this works, by adding more operations that can be used and more variation on the existing operations.

However, Kotlin collections are eager by default, not lazy. This can cause additional work to be performed if we are not careful about the collection types that are being used.

