1. Overview
In this tutorial, we’ll see how to find the most frequent element in a Scala collection.
2. Finding the Most Frequent Element in a Scala List
To find the most frequent element in a Scala List, we can use common methods such as groupBy() and maxBy():
scala> val list = List(1,2,2,3,4,5,6)
list: List[Int] = List(1, 2, 2, 3, 4, 5, 6)
scala> list.groupBy(identity)
res1: scala.collection.immutable.Map[Int,List[Int]] = Map(5 -> List(5), 1 -> List(1), 6 -> List(6), 2 -> List(2, 2), 3 -> List(3), 4 -> List(4))
scala> list.groupBy(identity).maxBy(_._2.size)._1
res2: Int = 2
This solution starts by grouping equal elements into a Map, using groupBy(identity). This produces a Map in which each distinct element is a key, and the corresponding value is a list of all occurrences of that element in the original collection.
From there, maxBy(_._2.size) selects the entry whose list of occurrences is the longest, and ._1 extracts the key of that entry, which is the most frequent element.
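To make the individual steps easier to follow, here's a minimal sketch that names each intermediate value. The object name FrequencySteps and the val names are only illustrative, and the explicit counts Map is an extra step that the one-liner above doesn't need:

```scala
object FrequencySteps {
  val list = List(1, 2, 2, 3, 4, 5, 6)

  // Step 1: group equal elements; each key maps to all of its occurrences
  val grouped: Map[Int, List[Int]] = list.groupBy(identity)

  // Step 2: replace each list of occurrences with its length
  val counts: Map[Int, Int] =
    grouped.map { case (element, occurrences) => element -> occurrences.size }

  // Step 3: pick the entry with the highest count
  // Here 2 is the only element that occurs twice, so mostFrequent == 2 and frequency == 2
  val (mostFrequent, frequency) = counts.maxBy(_._2)
}
```

Keeping the counts around can be handy when we also want to report how often the winning element occurs.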
2.1. Elements With the Same Frequency
There’s a small detail missing. What if our list has several elements with the same frequency?
In that case, our solution picks a seemingly arbitrary element among those tied for the highest frequency. Although the maxBy() documentation states that it returns the first maximal element it finds, groupBy() returns a Map with no guaranteed iteration order, so we can’t predict which of the tied elements comes first:
scala> List(1,1,2,3,4,4).groupBy(identity)
res0: scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(1, 1), 2 -> List(2), 3 -> List(3), 4 -> List(4, 4))
scala> List(1,1,2,3,4,4,5).groupBy(identity)
res1: scala.collection.immutable.Map[Int,List[Int]] = Map(5 -> List(5), 1 -> List(1, 1), 2 -> List(2), 3 -> List(3), 4 -> List(4, 4))
As we can see, while the first example happens to return a Map in natural key order, the second doesn’t.
As a result, when we ask for the most frequent element, there’s no obvious logic to which of the tied elements is chosen:
scala> List(1,1,2,3,4,4).groupBy(identity).maxBy(_._2.size)._1
res0: Int = 1
scala> List(1,1,2,3,4,4,5).groupBy(identity).maxBy(_._2.size)._1
res1: Int = 1
scala> List(1,1,2,3,4,4,5,5).groupBy(identity).maxBy(_._2.size)._1
res2: Int = 5
In the first two examples, the key 1 comes before the key 4 in the Map, so maxBy() returns 1. In the last example, however, the key 5 appears first in the Map, so it’s the one returned by maxBy().
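If we need a predictable result, one option is to break ties ourselves instead of relying on the Map's iteration order. The following is only a sketch of one possible rule: among the tied elements, it returns the one that occurs earliest in the input. The names FirstMostFrequent and firstMostFrequent are hypothetical, and returning an Option for empty input is our own design choice:

```scala
object FirstMostFrequent {
  // Returns the most frequent element; ties go to the element that
  // occurs earliest in the input. An empty input yields None.
  def firstMostFrequent[A](xs: Seq[A]): Option[A] =
    if (xs.isEmpty) None
    else {
      // Count how many times each distinct element occurs
      val counts: Map[A, Int] =
        xs.groupBy(identity).map { case (element, occurrences) => element -> occurrences.size }
      // The highest frequency found in the input
      val top = counts.values.max
      // Scan the input in its original order, so the Map's ordering no longer matters
      xs.find(x => counts(x) == top)
    }
}
```

With this rule, FirstMostFrequent.firstMostFrequent(List(1,1,2,3,4,4,5,5)) gives Some(1) no matter how the intermediate Map is ordered, at the cost of one extra pass over the input.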
2.2. All Elements Are Unique
This is a particular case of the previous point: if all elements are unique, they have the same frequency. Just like in the previous point, the element that we return isn’t guaranteed:
scala> List(1,2,3,4).groupBy(identity).maxBy(_._2.size)._1
res0: Int = 1
scala> List(1,2,3,4,5).groupBy(identity).maxBy(_._2.size)._1
res1: Int = 5
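When several elements share the highest frequency, another approach is to return all of them and let the caller decide. The sketch below assumes that a Set of the tied elements is an acceptable result; the names AllMostFrequent and allMostFrequent are illustrative only:

```scala
object AllMostFrequent {
  // Returns every element whose frequency equals the highest frequency.
  // An empty input yields an empty Set.
  def allMostFrequent[A](xs: Iterable[A]): Set[A] =
    if (xs.isEmpty) Set.empty
    else {
      // Count how many times each distinct element occurs
      val counts = xs.groupBy(identity).map { case (element, occurrences) => element -> occurrences.size }
      // The highest frequency found in the input
      val top = counts.values.max
      // Keep only the elements that reach that frequency
      counts.collect { case (element, count) if count == top => element }.toSet
    }
}
```

For a collection of unique elements such as List(1,2,3,4,5), this returns all five elements, which makes the tie explicit instead of hiding it behind a single value.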
3. Finding the Most Frequent Element in Other Scala Collections
Although we’ve focused on getting the most frequent element in a List, the same approach works on other collections and Iterables as well:
scala> import scala.collection.mutable.Stack
import scala.collection.mutable.Stack
scala> Stack(1,2,3,1).groupBy(identity).maxBy(_._2.size)._1
res0: Int = 1
scala> "12342".groupBy(identity).maxBy(_._2.size)._1
res1: Char = 2
In fact, this works for most Scala collections.
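Since groupBy() and maxBy() are available on Iterable, we could wrap the one-liner in a small generic helper. This is just a sketch: the names MostFrequentAnyCollection and mostFrequent are hypothetical, and the helper keeps the tie behavior described earlier:

```scala
object MostFrequentAnyCollection {
  // Works for any Iterable. Like the one-liner it wraps, it doesn't define
  // which element wins a tie, and maxBy() throws on an empty collection.
  def mostFrequent[A](xs: Iterable[A]): A =
    xs.groupBy(identity).maxBy(_._2.size)._1

  // Each input below has a single most frequent element, so the results are stable
  val fromVector: String = mostFrequent(Vector("a", "b", "a")) // "a"
  val fromString: Char = mostFrequent("12342".toList)          // '2'
}
```

Here the String is converted with toList so that the helper clearly receives an Iterable[Char].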
4. Conclusion
In this article, we’ve learned how to find the most frequent element in a Scala collection. We also discussed the special case where several elements share the highest frequency and saw why, in that case, the element returned depends on the iteration order of the Map produced by groupBy().