1. Introduction

In this tutorial, we’ll look at how to fold lists in Scala using a variety of approaches. The List class’s fold functions take the data in the list, an initial value, and a combining function.

They then step through the elements in the list and apply the provided function to them. We can use any of the three folding functions, depending on our needs; fold, foldLeft or foldRight.

These functions all behave similarly but have different rules and use cases where they fit best.

2. Folding

There’s no special setup needed to fold lists as they are part of core Scala. However, to use fold, we need to have Scala 2.9 onwards. foldLeft and foldRight exist in earlier Scala versions.

Paraphrasing the Wikipedia definition, Folding involves the use of a higher-order function to analyze a recursive data structure and, by applying a given combining operation, recombine the results of processing its subparts, building up a return value:

val list = List(1, 2, 3, 4, 5)
val sum = list.fold(0)((x, y) => x + y)
assert(sum == 15)

The fold method takes two sets of arguments. One contains a start value and the other a combining function. It then steps through the list, recursively applying the function to two operands: an accumulated value and the next element in the list.

In the first step, the start value, 0,  acts as the first operand. The first element of the list acts as the second operand. The result of this computation becomes the first operand of the next step and the second element in the list, the second operand.

Subsequent steps then keep using the next element in the list as the second operand to the combining function, the accumulated value remaining the first operand in each case, until the end of the list:

Step 1: x(0) + y(1) = 1
Step 2: x(1) + y(2) = 3
Step 3: x(3) + y(3) = 6
Step 4: x(6) + y(4) = 10
Step 5: x(10) + y(5) = 15 

As we look at the options available to fold, let’s introduce a simple object to play around with:

case class Person(name: String, sex: String)

And a list of these objects to test left or right folding:

val persons = List(Person("Thomas", "male"), Person("Sowell", "male"), Person("Liz", "female"))

3. foldLeft

foldLeft iterates through the list from left to right, accumulating elements in that order. This also means that when processing the two operands to the combining function, the accumulator is the argument on the left:

val foldedList = persons.foldLeft(List[String]()) { (accumulator, person) =>
  val title = person.sex match {
    case "male" => "Mr."
    case "female" => "Ms."
  }
  accumulator :+ s"$title ${person.name}"
}
assert(foldedList == List("Mr. Thomas", "Mr. Sowell", "Ms. Liz"))

Notice the direction of elements in the final list.

More formally, foldLeft associates to the left. An accumulator will be initialized, and elements will be added to it in left-to-right order:

List(1, 2, 3).foldLeft(0)(f) = f(f(f(0, 1), 2), 3)

4. foldRight

foldRight, on the other hand, iterates through the list from right to left, accumulating elements in that order. Conversely, this means that our accumulator will be the operand on the right in each iteration:

val foldedList = persons.foldRight(List[String]()) { (person, accumulator) =>
  val title = person.sex match {
    case "male" => "Mr."
    case "female" => "Ms."
  }
  accumulator :+ s"$title ${person.name}"
}
assert(foldedList == List("Ms. Liz", "Mr. Sowell", "Mr. Thomas"))

Also, notice the order of the elements in the final list. More formally, foldRight associates to the right. i.e., an accumulator will be initialized, and elements will be accumulated in right-to-left order:

List(1, 2, 3).foldRight(0)(f) = f(1, f(2, f(3, 0)))

5. fold

foldLeft and foldRight are linear operations, while fold can be a tree operation, meaning there’s no guarantee of the processing order. It can decompose a list into subsets in arbitrary order. These can then be evaluated separately, and the results put together to give a final result.

The fold method primarily exists to support parallelism.

5.1. Parallelism

All three folding functions may work the same in some very simple cases, and we may not detect the differences. This is evident when the list, the output, and the initial value are all of the same types:

List(1, 2, 3, 4, 5).fold(0)(_ + _)
List(1, 2, 3, 4, 5).foldLeft(0)(_ + _)
List(1, 2, 3, 4, 5).foldRight(0)(_ + _)

All the above lines produce the same results. However, if we drill into their internal execution, we’ll see some glaring differences. First, let’s parallelize the list so fold can do the magic it’s intended to do. Without parallelizing, it will just work like foldLeft; like in the very first example:

val parallelNumSeq = List(1, 2, 3, 4, 5).par

We should pay close attention to the names of the parameters in the combiner function.

Summing with foldLeft:

val foldLeftResult =
  parallelNumSeq.foldLeft(0) { (acc, currNum) =>
    val sum = acc + currNum
    println(s"FoldLeft: acc($acc) + currNum($currNum) = $sum ")
    sum
  }
println(foldLeftResult)

Accumulation is done left to right. It’s even clearer when we see the printed result:

FoldLeft: acc(0) + currNum(1) = 1 
FoldLeft: acc(1) + currNum(2) = 3 
FoldLeft: acc(3) + currNum(3) = 6 
FoldLeft: acc(6) + currNum(4) = 10 
FoldLeft: acc(10) + currNum(5) = 15 
15

Summing with foldRight:

val foldRightResult = 
  parallelNumSeq.foldRight(0) { (currNum, acc) =>
    val sum = acc + currNum
    println(s"FoldRight: acc($acc) + currNum($currNum) = $sum")
    sum
  }
println(foldRightResult)

This time, the folding happens right to left. The results:

FoldRight: acc(0) + currNum(5) = 5
FoldRight: acc(5) + currNum(4) = 9
FoldRight: acc(9) + currNum(3) = 12
FoldRight: acc(12) + currNum(2) = 14
FoldRight: acc(14) + currNum(1) = 15
15

Finally, let’s sum with fold:

val foldResult = parallelNumSeq.fold(0) { (acc1, acc2) =>
    val sum = acc1 + acc2
    println(s"Fold: acc1($acc1) + acc2($acc2) = $sum")
    sum
  }
println(foldResult)

In fact, in parallel mode, both parameters to the combiner function are accumulators as they could both be results of a sequential fold performed over subsets of the list:

Fold: acc1(0) + acc2(1) = 1
Fold: acc1(0) + acc2(4) = 4
Fold: acc1(0) + acc2(2) = 2
Fold: acc1(0) + acc2(5) = 5
Fold: acc1(0) + acc2(3) = 3
Fold: acc1(4) + acc2(5) = 9
Fold: acc1(1) + acc2(2) = 3
Fold: acc1(3) + acc2(9) = 12
Fold: acc1(3) + acc2(12) = 15
15

Notice how the initial value (0) is used several times while in the first 2 cases, it’s used only once, and folding proceeds sequentially.

5.2. Method Signatures

To further understand the differences between the three folds, let’s take a look at their signatures:

fold[A1 >: A](z: A1)(op: (A1, A1) => A1): A1

foldLeft[B](z: B)(f: (B, A) => B): B

foldRight[B](z: B)(f: (A, B) => B): B

In all 3 folds, the start value z, and the return value B or A1, must be of the same type. But specifically, for a List of type A, foldLeft and foldRight return a result of type B. Here, B may be the same type as A, related to A, or unrelated to A.

There is no relationship defined between A and B in the declaration:

val stringifiedInts = List("1", "2", "3", "4", "5")
val foldLeftSum = stringifiedInts.foldLeft(0)((acc, currNum) => acc + currNum.toInt)
val foldRightSum = stringifiedInts.foldRight(0)((currNum, acc) => currNum.toInt + acc)
assert(foldLeftSum == 15)
assert(foldRightSum == 15)

In previous examples, the list was of type Int. In the above example, it’s of type String. This type is represented by A in the signatures of both foldLeft and foldRight. The only constraint we are dealing with here is that both the start value and the accumulator are of the same type. It doesn’t matter what the list type is, as long as the combining function can deal with the type difference.

This is not the case for fold. The start value, as well as the result, must be of the same type or super-type of the List we are folding. In the fold signature, A is the type variable bound in List[+A]. This means in our summation example, A is Int which meets the constraint.

This constraint for fold and lack thereof for foldLeft and foldRight is tied to the guarantee of the order we discussed earlier, fold cannot use combining function f as it cannot honor the contract of always ensuring B and A are provided in the same order, in one call it could provide B in place of A and vice versa.

5.3. The Neutral Element

As already discussed, while using fold, the final result is a result of combining multiple sequential folds, which all start by adding the neutral element. From the first example:

List(1, 2, 3, 4, 5).foldLeft(5)(_ + _)

This will consistently return 20:

5 + 1 + 2 + 3 + 4 + 5 = 20

However, anything other than 0 should not be used with fold as it’s not a neutral or zero elements and will return inconsistent results. The neutral element for addition is 0, and if this were a product or multiplication operation, it would be 1:

List(1, 2, 3).fold(5)(_ + _)

The above example may return:

(5 + 1 + 2) + (5 + 3) = 16

Or something else:

(5 + 1) + (5 + 2) + (5 + 3) = 21

6. Conclusion

In this article, we’ve explored different approaches for folding lists in Scala. As always, the full source code is available over on GitHub.


« 上一篇: Scala中的隐式导入