Arrays.sort 与 Arrays.parallelSort 的比较

1. Overview

We’ve all used Arrays.sort() to sort an array of objects or primitives. In JDK 8, creators enhanced the API to provide a new method: Arrays.parallelSort().

In this tutorial, we’ll draw a comparison between the sort() and parallelSort() methods.

2. Arrays.sort()

The Arrays.sort() method sorts the array of objects or primitives. The sorting algorithm used in this method is Dual-Pivot Quicksort. In other words, it is a custom implementation of the Quicksort algorithm to achieve better performance.

This method is single-threaded and there are two variants:

sort(array) – sorts the full array into ascending order
sort(array, fromIndex, toIndex) – sorts only the elements from fromIndex to toIndex

Let’s see an example of both variants:

@Test
public void givenArrayOfIntegers_whenUsingArraysSortMethod_thenSortFullArrayInAscendingOrder() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    Arrays.sort(array);

    assertArrayEquals(expected, array);

}

@Test
public void givenArrayOfIntegers_whenUsingArraysSortWithRange_thenSortRangeOfArrayAsc() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 10, 4, 1, 2, 6, 7, 8, 9, 3, 5 };

    Arrays.sort(array, 2, 8);

    assertArrayEquals(expected, array);
}

Let’s summarize the pros and cons of this approach:

PROS

CONS

Works fast on smaller data sets

Performance degrades for large datasets

Multiple cores of the system aren’t utilized

3. Arrays.parallelSort()

This method also sorts an array of objects or primitives. Similar to sort() it also has two variants to sort a full array and partial array:

@Test
public void givenArrayOfIntegers_whenUsingArraysParallelSortMethod_thenSortFullArrayInAscendingOrder() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    Arrays.parallelSort(array);

    assertArrayEquals(expected, array);
}

@Test
public void givenArrayOfIntegers_whenUsingArraysParallelSortWithRange_thenSortRangeOfArrayAsc() {
    int[] array = { 10, 4, 6, 2, 1, 9, 7, 8, 3, 5 };
    int[] expected = { 10, 4, 1, 2, 6, 7, 8, 9, 3, 5 };

    Arrays.parallelSort(array, 2, 8);

    assertArrayEquals(expected, array);
}

The parallelSort() is functionally different. Unlike sort(), which sorts data sequentially using a single thread, it uses a parallel sort-merge sorting algorithm. It breaks the array into sub-arrays that are themselves sorted and then merged.

For executing parallel tasks it uses the ForkJoin pool.

But we need to know that it uses parallelism only when certain conditions are met. If the array size is less than or equal to 8192 or the processor has only one core, then it uses the sequential Dual-Pivot Quicksort algorithm. Otherwise, it uses a parallel sort.

Let’s summarize the advantages and disadvantages of using it:

PROS

CONS

Offers better performance for large size datasets

Slower for smaller size arrays

Utilizes multiple cores of the system

4. Comparison

Let’s now see how both methods performed with different size datasets. Below numbers are derived using JMH benchmarking. The test environment uses AMD A10 PRO 2.1Ghz quad-core processor and JDK 1.8.0_221:

Array Size

Arrays.sort()

Arrays.parallelSort()

1000

o.048

0.054

10000

0.847

0.425

100000

7.570

4.395

1000000

65.301

37.998

5. Conclusion

In this quick article, we saw how sort() and parallelSort() differ.

Based on performance results, we can conclude that parallelSort() may be a better choice when we have a large dataset to sort. However, in the case of smaller size arrays, it’s better to go with sort() since it offers better performance.

As always, the complete source code is available over on GitHub.

Persistence

REST

Security

1. Overview

2. Arrays.sort()

3. Arrays.parallelSort()

4. Comparison

5. Conclusion