1. Introduction

The comb sort is an algorithm designed by Włodzimierz Dobosiewicz, a Polish computer specialist, at Warsaw University in 1980. It is well-known that bubble sort is one of the worst sorting algorithms having a quadratic O(n^2) running time. Bubble sort can be improved in a way, akin to how shell sort is a variant of insertion sort.

In this tutorial, we’ll learn how the comb sort improves upon bubble sort, aptly known as gapped bubble sort.

2. A Quick Refresher of Bubble Sort

Let’s take a quick gander at the bubble sort algorithm. Bubble sort is a comparison sorting algorithm, where large elements are bubbled (up) to the end of the array. We make several passes of the array. In the ith pass, we look at the sub-array A[0 \ldots N-i-1]. In each pass, we step through the input array element by element, comparing each element to its neighbor, and swapping their values, if A[j-1] > A[j].

Consider the following pseudo-code:

algorithm BubbleSort(A):
    // INPUT
    //    A = the reference to the shuffled array
    // OUTPUT
    //    The sorted array A with all its elements in increasing order

    i <- 0

    // Perform n passes of the array
    for i <- 0 to N - 1:
        isSwapped <- false

        // Look at the subarray A[0 : N-i-1]
        // Step through the input array element by element
        for j <- 1 to N - i - 1:
            // Compare each element with its neighbor, swapping values if necessary
            if A[j - 1] > A[j]:
                temp <- A[j - 1]
                A[j - 1] <- A[j]
                A[j] <- temp

                isSwapped <- true

        if not isSwapped:
            break

    return A

If there are no swaps performed in a given iteration, then the array must be sorted, we can safely terminate.

2.1. Rabbits and Turtles

Consider the shuffled array A=[9,3,4,0,8,7,6,5,1,2]. Let’s run through what the array A looks like after each iteration of the outer for-loop of the bubble sort algorithm.

    [\begin{array}{r|l} \text{Iteration} & \text{Array}\\ \hline 0 & [9,3,4,0,8,7,6,5,1,2]\\ 1 & [3, 4, 0, 8, 7, 6, 5, 1, 2, 9]\\ 2 & [3, 0, 4, 7, 6, 5, 1, 2, 8, 9]\\ 3 & [0, 3, 4, 6, 5, 1, 2, 7, 8, 9]\\ 4 & [0, 3, 4, 5, 1, 2, 6, 7, 8, 9]\\ 5 & [0, 3, 4, 1, 2, 5, 6, 7, 8, 9]\\ 6 & [0, 3, 1, 2, 4, 5, 6, 7, 8, 9]\\ 7 & [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] \end{array}]

The element 9 is a rabbit and the element 1 is a turtle. Heavier elements that are to be moved towards the end of the array move very quickly, whereas lighter elements that are to be moved to the beginning of the array move by just one step in each iteration.

3. The Comb Sort Algorithm

The bubble sort algorithm operates on neighboring elements of the array. The comb-sort algorithm performs comparison-swap operations on non-contiguous elements. In fact, the key to its speed is that, at the beginning, it compares and swaps, if necessary, elements that are far from each other. Then, it reduces the distance progressively, comparing-swapping nearer elements. That distance is called the gap or increment.

Turtles slow down bubble-sort tremendously. Comb-sort eliminates this problem by performing comparison-swap on distant elements to begin with.

3.1. Vanilla Comb Sort

The vanilla comb-sort algorithm uses increments that form a geometric progression. Let N be the number of elements in the array and let r be the shrinkage factor. Then, mathematically the gap size takes values from the following sequence:

    [\left\{\frac{N}{r},\frac{N}{r^2},\frac{N}{r^3},\ldots\right\}]

There are some considerations to the shrinkage factor r. As an extreme case, suppose that the array has 16 elements and the shrinkage factor r=2. In such a case, the gap size takes values (8,4,2,1). 8-sorting, 4-sorting and 2-sorting an array will result in no interaction between the even and odd elements. A large amount of work is left to the point where the gap size equals 1. In effect, we end up doing a near bubble-sort of array. This is undesirable. The authors have found that the optimal value of the shrinkage factor r approximately equals 1.3.

3.2. Pseudo-Code

Let’s consider the following pseudo-code:

algorithm CombSort(A):
    // INPUT
    //    A = the shuffled array
    // OUTPUT
    //    The sorted array with all its elements in increasing order

    gap <- length(A)
    isSwapped <- true

    // Keep iterating and stop if and only if the gap equals 1 and the array is fully sorted
    while gap != 1 or isSwapped:
        isSwapped <- false

        gap <- floor(gap / 1.3)

        if gap < 1:
            gap <- 1

        // Look at the subarray A[0 : N-gap-1]
        for i <- 0 to N - gap - 1:
            // Compare each pair of elements separated by a gap, swapping values if necessary
            if A[i] > A[i + gap]:
                temp <- A[i]
                A[i] <- A[i + gap]
                A[i + gap] <- temp

                isSwapped <- true

    return A

The outer while-loop iterates as long as at least one of the conditions (gap \neq 1)\lor(\text{atleast one swap was performed in the last pass}) holds true. Carefully negating this, using De-Morgan’s laws, we find that the while-loop will terminate, when

    [(gap == 1) \land (\text{no swaps occurred in the last pass})]

A sweep of the array with no swaps implicitly means that the array is monotonically increasing, no elements are out of order, and the array is sorted. So, the condition makes intuitive sense.

In each pass, we successively reduce the gap by a factor r = 1.3. If the computed value of the gap variable at any point is smaller than 1, we reset it to 1.

We perform comparison-swap operations on the sub-sequence of elements A[0],A[gap],A[2\times gap],\ldots,A[N-gap-1],A[N-1].

4. The Running Time of Comb Sort

Let’s see how comb sort stacks up, as far as the running time is concerned, as compared with Bubble Sort. The empirical results below show the speedup attained by comb sort vis-a-vis the bubble sort. The worst-case running time of comb sort is O(n^2).

    [\begin{array}{r|l|l} n & \text{Comb-Sort} & \text{Bubble-Sort}\\ \hline 10 & 0.0045960 & 0.00151599\\ 10^2 & 0.077594 & 0.023565999\\ 10^3 & 1.606186 & 1.7639360\\ 10^4 & 24.872394 & 178.3646059\\ 10^5 & 327.263792 & 18536.33366399\\ 10^6 & 4079.795338 & - \end{array}]

5. Implementing Comb Sort in Python

To demonstrate the practical application of Comb Sort, we’ll use Python to write an implementation that encapsulates the algorithm’s core principles by closely following the pseudo code, thanks to Python’s friendly syntax.

We’ll use 1.3, which is the optimal shrinking factor according to the literature. Our comb_sort function accepts an array as its parameter. The function loops over the array, using the shrinking factor to calculate progressively smaller gaps,  then swapping an element with the element located at a gap distance if that element is smaller, repeating this process until the gap reaches 1 and comparing adjacent elements:

def comb_sort(arr):
    gap = len(arr)
    is_swapped = True

    while gap != 1 or is_swapped:
        # Commonly used shrink factor 1.3
        is_swapped = False
        gap = int(gap / 1.3)
        if gap <= 1:
          gap = 1
      
        i = 0
        # range creates an exclusive range hence we do not need the -1
        for i in range(0, len(arr) - gap):
            if arr[i] > arr[i + gap]:
              arr[i], arr[i + gap] = arr[i + gap], arr[i]
              is_swapped = True

This process is repeated until the array is fully sorted, with no swaps needed in a final pass with the gap set to 1, indicating that the array is in order.

Using the  comb_sort function is as simple as calling it on an array. Since the algorithm sorts in place, we can use the same variable after sorting:

# Example usage
arr = [8, 4, 1, 56, 3, -44, 23, -6, 28, 0]
comb_sort(arr)
print("Sorted array:", arr)

6. Conclusion

In this article, we learned about how the comb sort algorithm attains a speed-up over bubble sort. To recap, comb-sort eliminates the problem of turtles by comparison-swap operations between far-away elements.


» 下一篇: 什么是数据仓库?