1. Introduction
When we look at sorting algorithms, we see that they can be divided into two main classes: those that use comparisons and those that count occurrences of elements.
In this tutorial, we’ll explore the latter class. More specifically, we’ll focus on comparing Counting, Bucket, and Radix sort. These algorithms typically take O(n + k) time, where n is the size of the array and k is the size of the largest number in the array.
2. Counting Sort
We’ll assume that we are given an array A with n elements ranging in value from 1 to k. We can allocate a counting array C of size k and initialize it to all zeroes:
We then run through A and, for each element A[i], increment the value in C[A[i]], as illustrated in the following diagram:
Once this process is complete, each C[j] contains the number of times j occurs in A. We can now copy the values from C back into the array A, ensuring that each value j is copied into A exactly C[j] times in succession, as shown:
2.1. Pseudocode
```
algorithm CountingSort(A, n):
    // INPUT
    //   A = an unsorted array of size n
    // OUTPUT
    //   Returns the sorted array A

    // find the maximum value k in A
    k <- A[1]
    for i <- 2 to n by 1:
        if A[i] > k:
            k <- A[i]

    C <- initialize the counting array containing k zeros

    // fill the counting array
    for i <- 1 to n by 1:
        C[A[i]] <- C[A[i]] + 1

    // overwrite A using the values in C
    place <- 1
    for i <- 1 to k:
        if C[i] > 0:
            for j <- 1 to C[i]:
                A[place] <- i
                place <- place + 1

    return A
```
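To make this concrete, here’s a minimal Python sketch of the same algorithm (the function name and the sample array are our own, and we assume an array of positive integers):

```python
def counting_sort(a):
    """Sort a list of positive integers by counting occurrences."""
    k = max(a)                       # the largest value fixes the count array size
    counts = [0] * (k + 1)           # counts[v] = number of times v occurs in a
    for value in a:
        counts[value] += 1
    place = 0
    for value in range(1, k + 1):    # walk the possible values in increasing order
        for _ in range(counts[value]):
            a[place] = value
            place += 1
    return a

print(counting_sort([4, 2, 7, 2, 9, 4]))   # [2, 2, 4, 4, 7, 9]
```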
Let’s look into the complexity of this algorithm. The time required to find the maximum k is O(n), as it only requires running through A once. The time for allocating and zeroing out C is O(k). Finally, we’ll have to fill up the sorted array A, and this takes O(n + k) time, as we’ll have to look at each element of C (time O(k)) and fill up A appropriately (time O(n)). The total time is O(n + k).
2.2. Critique
At first sight, this may look to us like a wonderful algorithm, but on careful examination, it has a number of drawbacks:
- it can only work for integer arrays
- it can require a huge counting array C. For example, an array such as A = [5, 3, 1000000, 2, 4] would require C to be of size 1000000, just to sort a five-element array
- huge counting arrays require correspondingly large run times: if the maximum element in the array is k, then the time required is O(n + k), which for k much larger than n² is worse than insertion or selection sort
Let’s look at variants of this algorithm that have better time and space requirements.
3. Bucket Sort
The first variant we’ll look at is Bucket sort. This algorithm works well if the numbers of the input array are uniformly distributed over a range, say [0, 99]. Of course, we can’t guarantee that the distribution is perfectly uniform, but we can come close to this condition. Here’s an example of a size-20 array whose elements are distributed over the range [0, 99]:
We can now create ten buckets to hold the elements of the array, with each bucket covering ten consecutive values (0–9, 10–19, and so on). These buckets hold elements as shown:
Since each bucket holds a variable number of elements, our best choice is to associate a linked list with each bucket. Each non-empty bucket then holds a linked list of the elements that fall into its sub-range. The order of the elements within a list is the order in which they are encountered when sweeping array A from left to right. Buckets marked with a diagonal line are empty: no elements of A fall into the corresponding range:
Now comes the interesting part: we can sort the individual linked lists using insertion sort. This results in:
In the figure above, we see that the buckets are in increasing order of magnitude. The linked list within each bucket is sorted.
We can now step through the buckets, copying the elements of each linked list back into the original array A. This results in a sorted array A, as shown:
3.1. Pseudocode
```
algorithm BucketSort(A, n, k, b):
    // INPUT
    //   A = unsorted array
    //   n = size of the array A
    //   k = maximum value in A
    //   b = number of buckets
    // OUTPUT
    //   Returns the sorted array A

    B <- allocate a bucket array of b empty linked lists

    // insert each element of A into its bucket
    for i <- 1 to n:
        // map the value range [0, k] onto the buckets 1 to b
        InsertList(B[floor(A[i] * b / (k + 1)) + 1], A[i])

    // sort each linked list with insertion sort
    for i <- 1 to b:
        SortList(B[i])

    // overwrite A using the linked lists from B
    Loc <- 1
    for i <- 1 to b:
        listPlace <- head of B[i]
        while listPlace != empty:
            A[Loc] <- Value(listPlace)
            listPlace <- Next(listPlace)
            Loc <- Loc + 1

    return A
```
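Here’s a minimal Python sketch of the same idea, with the caveat that Python lists stand in for the linked lists, sorted() stands in for insertion-sorting each bucket, and the bucket-mapping formula assumes non-negative integers:

```python
def bucket_sort(a, b=10):
    """Sort non-negative integers by distributing them into b buckets."""
    k = max(a)
    buckets = [[] for _ in range(b)]                 # each bucket is a Python list
    for value in a:
        buckets[value * b // (k + 1)].append(value)  # map value to bucket 0..b-1
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))                # sort each bucket, then concatenate
    return result

print(bucket_sort([29, 25, 3, 49, 9, 37, 21, 43]))   # [3, 9, 21, 25, 29, 37, 43, 49]
```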
Let’s analyze the running time of Bucket sort. Assuming the array A has n elements, there are b buckets, and the elements of A are uniformly distributed, each linked list will be of size n/b. The time complexity for this algorithm is then:
- O(b) to create the buckets, and O(n/b) to build each linked list; the total time to create all the linked lists is O(n)
- O((n/b)²) to sort each linked list with insertion sort; the total time to sort all the linked lists is O(b · (n/b)²) = O(n²/b)
- O(n) time to copy from the linked lists back into the original array
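Summing these contributions makes the total explicit:

O(b) + O(n) + O(n²/b) + O(n) = O(n + b + n²/b)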
Since we normally choose b ≤ n, the total time is dominated by the sorting time, which is O(n²/b).
3.2. Critique
Let’s review the above analysis of Bucket sort. The major assumption is that the numbers are uniformly distributed across the buckets. This assumption is unlikely to hold exactly in practice. We can gather that the O(n²/b) expression applies only to the uniformly distributed case.
In practice, the time will be O(m²), where m is the length of the longest list. In the extreme case, all n elements of the array fall into one bucket, leading to m = n and a run time of O(n²). This means we would have been better off just using plain insertion sort.
At this point, we may be wondering if an algorithm better than insertion sort could be used to sort the lists and obtain a better time than O(n²/b). Unfortunately, an algorithm such as heapsort relies on random access and cannot be used efficiently with linked lists, and while mergesort can be adapted to lists, the bookkeeping is rarely worthwhile for many short lists, so insertion sort remains the practical choice.
4. Radix Sort
In the case of Counting sort, we required time O(n + k), but k could be as large as the maximum number in the array to be sorted. This led to O(k) space and O(n + k) time complexity. For Bucket sort, we had a small number of buckets, but the linked list in a bucket could grow as large as n, leading to an O(n²) runtime since insertion sort was required to sort the linked lists.
Can we find a compromise between the two? Indeed we can: Radix sort uses a fixed number of buckets and repeatedly sorts on the digits of the numbers in the array. Let’s explore this powerful sorting algorithm.
In the following diagram, each number in the original array has four decimal digits: 1513, 5314, 1216, and so on. Radix sort proceeds by starting with the least significant digit (LSD) and sorting the numbers in the array using this digit only, via Counting sort. Since we are dealing with decimal digits, the count array need only be of size 10. We can see this in red in the following diagram, where the four-digit numbers are sorted on the least significant digit (digit 1).
We then repeat this process for digit 2 (blue), followed by digit 3 (green), and finally digit 4. We can see that the final array is correctly sorted:
4.1. Pseudocode
Radix sort uses Counting sort as a subroutine. In the following, CountingSort(A, i) means applying Counting sort to A using the i-th digit as the key, as described above. In our context, Counting sort is stable, which means that two numbers that match in the i-th digit always keep their original relative order. We can see in the figure above that 3480 and 2320 keep their original relative positions when sorting on digit 1. Stability of sorting is essential for Radix sort to work correctly:
```
algorithm RadixSort(A, d):
    // INPUT
    //   A = unsorted array of integers with at most d digits each
    // OUTPUT
    //   Returns the sorted array A

    for i <- 1 to d:
        A <- CountingSort(A, i)   // stable sort on the i-th digit
    return A
```
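Here’s a minimal Python sketch of the whole scheme, including the stable digit-wise Counting sort that the pseudocode relies on (the helper name and the digit-selection arithmetic are our own choices):

```python
def counting_sort_by_digit(a, exp):
    """Stable counting sort of a on the decimal digit selected by exp (1, 10, 100, ...)."""
    counts = [0] * 10
    for value in a:
        counts[(value // exp) % 10] += 1
    for d in range(1, 10):                 # prefix sums: end position of each digit's block
        counts[d] += counts[d - 1]
    output = [0] * len(a)
    for value in reversed(a):              # reverse scan keeps equal digits in order (stability)
        digit = (value // exp) % 10
        counts[digit] -= 1
        output[counts[digit]] = value
    return output

def radix_sort(a, d):
    """Sort non-negative integers with at most d decimal digits."""
    exp = 1
    for _ in range(d):
        a = counting_sort_by_digit(a, exp)
        exp *= 10
    return a

print(radix_sort([1513, 5314, 1216, 3480, 2320], 4))   # [1216, 1513, 2320, 3480, 5314]
```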
If we are given an array of size n, with each element being a decimal number with at most d digits, we can see that the extra space required for the count array is constant at 10. The time required is O(d · n), since each of the d passes is a Counting sort taking O(n + 10) = O(n) time.
If d is a fixed constant (e.g., for zip codes or social security numbers), we can drop the d and rightly claim that Radix sort runs in O(n) time.
4.2. Extensions
Let’s think about what happens if the input array is not decimal. In the case of hexadecimal numbers, for example, the count array would be of length 16. This would make no difference to the asymptotic complexity, which would remain O(d · n).
A popular trick is to combine adjacent digits into one, so that a 6-digit decimal number is treated as three 2-digit numbers (each 2-digit pair being considered a single base-100 digit). The count array then has to be of size 100, as it has to hold the numbers 0 through 99. This halves the number of passes, but we now have the overhead of initializing a ten-times-larger count array.
We can also think about the case where the magnitudes of the numbers in the input array are very large (e.g., 2³⁵). In this case, a careful analysis is needed to choose between Radix sort and the more traditional comparison-based sorts. We do point out that we can use Radix sort on the binary representation of the numbers: if a number has at most 35 bits, we can treat it as 5 groups of 7 bits each, with the count array having 2⁷ = 128 entries. This may well turn out to be better than using a comparison-based sort.
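As a rough illustration of this binary variant (the 7-bit group size, function name, and sample values are our own choices), here’s a Python sketch that sorts 35-bit numbers in five passes:

```python
def radix_sort_binary(a, bits=35, group=7):
    """Sort non-negative integers of at most `bits` bits using `group`-bit digits."""
    mask = (1 << group) - 1                 # 7 bits -> mask 0b1111111, i.e. 128 counters
    for shift in range(0, bits, group):     # one stable counting-sort pass per bit group
        counts = [0] * (mask + 1)
        for value in a:
            counts[(value >> shift) & mask] += 1
        for d in range(1, mask + 1):        # prefix sums give each digit's end position
            counts[d] += counts[d - 1]
        output = [0] * len(a)
        for value in reversed(a):           # reverse scan preserves stability
            d = (value >> shift) & mask
            counts[d] -= 1
            output[counts[d]] = value
        a = output
    return a

print(radix_sort_binary([34359738367, 42, 1 << 20, 7]))   # [7, 42, 1048576, 34359738367]
```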
5. Conclusions
In this article, we’ve looked at three interesting sort algorithms that don’t use comparisons. Counting sort is simple and straightforward and is used as a subroutine for Radix sort. Bucket sort is an interesting algorithm but has the limitation of unequally sized linked lists. Radix sort is widely used and has many interesting variants, some of which we’ve discussed above.