1. Overview
In this article, we’ll explore how to find the mode of integers in an array using Java.
When working with datasets in Java, we often need statistical measures such as the mean, median, and mode. The mode is the value that appears most frequently in a dataset. If several numbers share the highest frequency, all of them are modes. By a common convention, a dataset in which no number repeats is said to have no mode; the code in this article doesn't special-case that situation and treats every value that shares the highest frequency as a mode.
2. Understanding the Problem
The algorithm aims to find the mode of integers in an array. Let’s consider some examples:
nums = {1, 2, 2, 3, 3, 4, 4, 4, 5}. The mode for this array would be 4.
nums = {1, 2, 2, 1}. Both 1 and 2 appear twice, so the modes of this array are 1 and 2.
For our code, let’s have an example array of integers:
int[] nums = { 1, 2, 2, 3, 3, 4, 4, 4, 5 };
3. Using Sorting
One way to find the mode is by sorting the array and finding the most frequent element. This approach leverages the fact that in a sorted array, duplicate elements are adjacent. Let’s see the code:
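Arrays.sort(nums);
int maxCount = 1;
int currentCount = 1;
Set<Integer> modes = new HashSet<>();
// seed with the first element so that it isn't skipped when its run has length 1
if (nums.length > 0) {
    modes.add(nums[0]);
}
for (int i = 1; i < nums.length; i++) {
    if (nums[i] == nums[i - 1]) {
        currentCount++;
    } else {
        currentCount = 1;
    }
    if (currentCount > maxCount) {
        // a longer run replaces all previous candidates
        maxCount = currentCount;
        modes.clear();
        modes.add(nums[i]);
    } else if (currentCount == maxCount) {
        // a run of equal length is an additional mode
        modes.add(nums[i]);
    }
}
This method sorts the input array in place and then traverses it, tracking the length of the current run of equal values. Whenever a run exceeds the longest one seen so far, the set of modes is reset to that value; whenever a run ties it, the value is added to the set. Seeding the set with the first element covers single-element arrays as well as arrays in which the first value occurs only once and is tied for the highest frequency, since the loop starts at index 1 and would otherwise never add it.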
Let’s have a look at time and space complexity:
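- Time Complexity: O(n log n) due to the sorting step.
- Space Complexity: O(k) for the set of modes, plus whatever working space the sort needs. For an int[], Arrays.sort() is documented as a dual-pivot quicksort rather than a mergesort, so its extra space is usually small, although the exact amount is implementation-dependent.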
Here, n is the number of elements in the array and k is the number of modes.
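To make the sorting approach easier to reuse, it can be wrapped in a method. The following is a minimal sketch rather than a definitive implementation: the class name SortedModeFinder and the method name findModes are hypothetical, and it assumes that an empty array should yield an empty set. It sorts a copy, so the caller's array keeps its original order:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SortedModeFinder {

    // Hypothetical helper: returns every value that occurs with the highest frequency.
    static Set<Integer> findModes(int[] nums) {
        Set<Integer> modes = new HashSet<>();
        if (nums.length == 0) {
            return modes; // assumption: an empty input has no mode
        }

        // Sort a copy so that the caller's array is left untouched.
        int[] sorted = Arrays.copyOf(nums, nums.length);
        Arrays.sort(sorted);

        int maxCount = 1;
        int currentCount = 1;
        modes.add(sorted[0]); // the first value starts with a run of length 1

        for (int i = 1; i < sorted.length; i++) {
            // Extend the current run of equal values, or start a new one.
            currentCount = (sorted[i] == sorted[i - 1]) ? currentCount + 1 : 1;

            if (currentCount > maxCount) {
                maxCount = currentCount;
                modes.clear();
                modes.add(sorted[i]);
            } else if (currentCount == maxCount) {
                modes.add(sorted[i]);
            }
        }
        return modes;
    }
}
```

The copy costs O(n) extra space, which is the price for not reordering the input.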
4. Using Frequency Array
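Counting how often each value occurs avoids the sorting step altogether. If the range of integers in the array is known and limited, a plain array indexed by value can hold the counts. The code below uses a HashMap as the frequency table instead, which works for any integers without knowing the range in advance. Let’s see how: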
// count the occurrences of each value
Map<Integer, Integer> frequencyMap = new HashMap<>();
for (int num : nums) {
    frequencyMap.put(num, frequencyMap.getOrDefault(num, 0) + 1);
}
// find the highest frequency
int maxFrequency = 0;
for (int frequency : frequencyMap.values()) {
    if (frequency > maxFrequency) {
        maxFrequency = frequency;
    }
}
// collect every value that reaches the highest frequency
Set<Integer> modes = new HashSet<>();
for (Map.Entry<Integer, Integer> entry : frequencyMap.entrySet()) {
    if (entry.getValue() == maxFrequency) {
        modes.add(entry.getKey());
    }
}
This code first populates a map with the frequency of each integer in the array, then determines the highest frequency in the map, and finally collects every integer whose frequency equals that maximum.
Let’s have a look at time and space complexity:
- Time Complexity: O(n + m), which is O(n) since m can never exceed n. This assumes the average-case constant-time behavior of HashMap operations.
- Space Complexity: O(m + k). In the worst case, this could be O(n) if all elements are unique and each is a mode.
Here, n is the number of elements in the array, m is the number of unique elements in the array and k is the number of modes.
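For the case where the range of values really is known and small, the counts can live in a plain int[] indexed by value. The sketch below is illustrative only: the class name FrequencyArrayModeFinder and the parameters minValue and maxValue are hypothetical, and it assumes that every element lies within [minValue, maxValue]. A value outside that range would cause an ArrayIndexOutOfBoundsException:

```java
import java.util.HashSet;
import java.util.Set;

class FrequencyArrayModeFinder {

    // Assumption: every element of nums lies in [minValue, maxValue]
    // and the range is small enough to allocate one counter per value.
    static Set<Integer> findModes(int[] nums, int minValue, int maxValue) {
        int[] counts = new int[maxValue - minValue + 1];
        int maxFrequency = 0;

        // Shift each value by minValue so that it maps to a valid index.
        for (int num : nums) {
            int count = ++counts[num - minValue];
            if (count > maxFrequency) {
                maxFrequency = count;
            }
        }

        Set<Integer> modes = new HashSet<>();
        if (maxFrequency == 0) {
            return modes; // empty input: no value was counted
        }

        // Every index whose counter reaches the maximum is a mode.
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] == maxFrequency) {
                modes.add(i + minValue);
            }
        }
        return modes;
    }
}
```

With r as the size of the range, this takes O(n + r) time and O(r) space, so it only pays off when r is not much larger than n.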
5. Using TreeMap
A TreeMap provides a frequency map whose keys are kept in sorted order, which may be useful in certain contexts. Here's the code:
Map<Integer, Integer> frequencyMap = new TreeMap<>();
for (int num : nums) {
    frequencyMap.put(num, frequencyMap.getOrDefault(num, 0) + 1);
}
int maxFrequency = 0;
for (int frequency : frequencyMap.values()) {
    if (frequency > maxFrequency) {
        maxFrequency = frequency;
    }
}
Set<Integer> modes = new HashSet<>();
for (Map.Entry<Integer, Integer> entry : frequencyMap.entrySet()) {
    if (entry.getValue() == maxFrequency) {
        modes.add(entry.getKey());
    }
}
The approach is the same as in the previous section; the only difference is the map implementation. A TreeMap keeps its keys in sorted order, which helps when later operations need to iterate over the values in ascending order. Note that the modes are still collected into a HashSet, which doesn't guarantee any iteration order.
Let’s have a look at time and space complexity:
- Time Complexity: O(n log m + m), which simplifies to O(n log m), since each of the n insertions or updates costs O(log m) in a TreeMap.
- Space Complexity: O(m + k). In the worst case, this could be O(n) if all elements are unique and each is a mode.
Here, n is the number of elements in the array, m is the number of unique elements in the array and k is the number of modes.
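If the modes themselves should come back in ascending order, one option is to collect them into a TreeSet instead of a HashSet. The following sketch shows that variation under stated assumptions: the class name SortedModes and the method name findModes are hypothetical, and it uses Map.merge() as a shorthand for the getOrDefault()/put() pair:

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class SortedModes {

    // Hypothetical helper: returns the modes in ascending order.
    static SortedSet<Integer> findModes(int[] nums) {
        Map<Integer, Integer> frequencyMap = new TreeMap<>();
        int maxFrequency = 0;

        for (int num : nums) {
            // merge() inserts 1 for a new key or adds 1 to the existing count,
            // and returns the updated count.
            int count = frequencyMap.merge(num, 1, Integer::sum);
            maxFrequency = Math.max(maxFrequency, count);
        }

        // A TreeSet keeps the collected modes sorted.
        SortedSet<Integer> modes = new TreeSet<>();
        for (Map.Entry<Integer, Integer> entry : frequencyMap.entrySet()) {
            if (entry.getValue() == maxFrequency) {
                modes.add(entry.getKey());
            }
        }
        return modes;
    }
}
```

Tracking the maximum while counting also removes the separate pass over the map's values.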
6. Using Streams
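The Stream API lets us express the same frequency-counting logic in a functional style. The stream below is sequential; for a large dataset, it could be switched to a parallel stream to utilize multi-core processors. Here's the code: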
Map<Integer, Long> frequencyMap = Arrays.stream(nums)
  .boxed()
  .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
long maxFrequency = Collections.max(frequencyMap.values());
Set<Integer> modes = frequencyMap.entrySet()
  .stream()
  .filter(entry -> entry.getValue() == maxFrequency)
  .map(Map.Entry::getKey)
  .collect(Collectors.toSet());
The code uses Java streams to process the array in a functional style, which keeps it concise and expressive.
First, boxed() converts the primitive int values to Integer objects so that they work with the generic stream operations. Then, we group the integers by value and count their occurrences using Collectors.groupingBy() and Collectors.counting(). The maximum frequency is found using Collections.max(), which throws a NoSuchElementException if the array is empty. Finally, the entries with the maximum frequency are filtered, and their keys are collected into a set.
This approach does the same work as the frequency map in section 4 and trades the explicit loops for a declarative pipeline.
Let’s have a look at time and space complexity:
- Time Complexity: O(n + m), which is O(n) since m can never exceed n. This assumes average-case constant-time operations on the map produced by groupingBy().
- Space Complexity: O(m + k). In the worst case, this could be O(n) if all elements are unique and each is a mode.
Here, n is the number of elements in the array, m is the number of unique elements in the array and k is the number of modes.
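To actually spread the counting across multiple cores, the stream can be made parallel. The sketch below is one possible variation, not a tuned implementation: the class name ParallelModeFinder and the method name findModes are hypothetical. It uses Collectors.groupingByConcurrent() so that all threads update a single concurrent map, and it replaces Collections.max() with a stream reduction that falls back to 0 for an empty array:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class ParallelModeFinder {

    // Hypothetical helper: counts frequencies with a parallel stream.
    static Set<Integer> findModes(int[] nums) {
        Map<Integer, Long> frequencyMap = Arrays.stream(nums)
          .parallel()
          .boxed()
          .collect(Collectors.groupingByConcurrent(Function.identity(), Collectors.counting()));

        // orElse(0L) avoids an exception when the array is empty.
        long maxFrequency = frequencyMap.values()
          .stream()
          .mapToLong(Long::longValue)
          .max()
          .orElse(0L);

        return frequencyMap.entrySet()
          .stream()
          .filter(entry -> entry.getValue() == maxFrequency)
          .map(Map.Entry::getKey)
          .collect(Collectors.toSet());
    }
}
```

Whether the parallel version is faster depends on the size of the array and the hardware, so it's worth measuring before adopting it.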
7. Conclusion
In this tutorial, we explored various ways to find the mode of integers in an array. Each of these methods has its advantages and is suitable for different scenarios. Here’s a quick summary to help us choose the right approach:
- Sorting: simple and effective for small to medium-sized arrays; note that it reorders the input array
- Frequency map (or a plain frequency array when the range of values is small): linear time on average
- TreeMap: useful if we need a sorted frequency map
- Streams: concise, functional style; can be switched to a parallel stream to utilize multiple cores on large datasets
By choosing the appropriate method based on our specific requirements, we can optimize the process of finding the mode of integers in an array in Java.
The source code of all these examples is available over on GitHub.