1. Introduction
2. Top-Down Approach
Top-down Merge Sort recursively divides a list into smaller halves, sorts each half, and then merges the sorted halves back together to form a sorted list:
def merge_sort_top_down(arr):
if len(arr) <= 1:
return arr
mid = len(arr) // 2
left_half = arr[:mid]
right_half = arr[mid:]
sorted_left = merge_sort_top_down(left_half)
sorted_right = merge_sort_top_down(right_half)
return merge(sorted_left, sorted_right)
The top-down Merge Sort makes a new list during the merge to help make the process easier and make sure it is stable. This way, we don’t need to worry about avoiding overwrites or errors when dealing with in-place merging. It fits well with the recursive nature of the algorithm and its following of the divide-and-conquer approach. It takes up extra space but brings clarity and reliability.
First, the function checks if the list has no elements or has exactly one. This is the base case of the recursion: such lists are already sorted.
Next comes the core part. The first thing to do is find the list’s middle index. To do this, we integer-divide the length by 2. Then, we split the list into two equal parts. The left half will have elements from the beginning of the list up to the middle, but it will not include the middle element. The right half will have elements from the list’s middle to end.
The sort is made by sorting the two halves recursively using merge_sort_top_d͏own. This implements the divide-and-conquer strategy.
The next and final step is to use the external merge function to merge the sorted portions into one final sorted portion for the entire input list. The purpose is to combine two sorted lists to form one merged and sorted list.
The merge function is a helper to the Merge Sort algorithm. It combines two sorted sublists (left, right) into a single sorted output list (merged):
def merge(left, right):
merged = []
left_index = 0
right_index = 0
while left_index < len(left) and right_index < len(right):
if left[left_index] <= right[right_index]:
merged.append(left[left_index])
left_index += 1
else:
merged.append(right[right_index])
right_index += 1
merged.extend(left[left_index:])
merged.extend(right[right_index:])
return merged
To start, left_index and right_index both equal zero at the start of the function, indicating which elements of the left and right lists are next to merge. At this point, we initialize another empty list, merged. It will hold the single merged list in sorted order.
Now begins the while loop, which continues until we exhaust one of the input lists. We iterate over the lists and insert the smaller of left[left_index] and right[right_index] into merged. Upon placing an element into the merged list, we increment the index corresponding to the input list from which it came.
After the loop ends, there may be leftover elements in one of the input lists because either left_index = len(left) or right_index = len(right). The left[left_index:] is an empty list in the former case. In the latter, right[right_index:] is empty. We append all leftover elements using the extend method.
3. Bottom-Up Approach
The Bottom-up Merge Sort treats each element in the list as a sorted sublist of size one. Then, it iteratively merges the consecutive pairs of the sublists, effectively doubling their sizes in each iteration. This process continues iteratively until the entire list is sorted:
def merge_sort_bottom_up(arr):
n = len(arr)
auxiliary_array = [None] * n
width = 1
while width < n:
for i in range(0, n, 2 * width):
left = i
mid = min(i + width, n)
right = min(i + 2 * width, n)
merge(arr, auxiliary_array, left, mid, right)
for i in range(n):
arr[i] = temp_arr[i]
width *= 2
return arr
This method eliminates the need for the recursive stack calls required in the top-down approach. As a result, the bottom-up Merge Sort can be more memory-efficient in some scenarios.
We start by finding the list’s length and initializing an auxiliary_array of the same length with all of its values to None. This auxiliary_array will serve as a temporary swap area for merging lists. Then, the variable width is initialized with the value of 1. It serves as the size of the sublists that are to be merged. Now, the function is ready to merge sublists by iteratively doubling their sizes, starting with the smallest possible sublists of size 1.
The function combines increasingly larger sorted sublists within the master list. It starts with the tiniest possible components—single elements—and gradually doubles the merging range in every iteration of the loop. The indices left, mid, and right denote the endpoints of the consecutive sublists to merge.
The merge function combines all the sublists defined by left, mid, and right indices into the auxiliary_array. *Only after all the sublists of the current width have been processed and merged are the elements from the auxiliary_array copied back to arr so that arr contains sorted sublists. After that, the algorithm doubles the width.*
Eventually, the merged sublists form a single sorted list.
Here’s the merge function we use in the bottom-up approach:
def merge(arr, auxiliary_array, left, mid, right):
i, j, k = left, mid, left
while i < mid and j < right:
if arr[i] <= arr[j]:
auxiliary_array[k] = arr[i]
i += 1
else:
auxiliary_array[k] = arr[j]
j += 1
k += 1
while i < mid:
auxiliary_array[k] = arr[i]
i += 1
k += 1
while j < right:
auxiliary_array[k] = arr[j]
j += 1
k += 1
The process starts by setting three index variables (i, j, and k) to track the positions of the passed lists.
Then, we enter a loop comparing two elements from the two sublists. If the current element from the first sublist is smaller than or equal to the current element from the second sublist, we place this element in the output and increment the i index. Otherwise, we do the same for the second sublist but increment the j index.
After processing both sublists, data may remain in either the first or second sublists. The function handles this by extending the auxiliary_array with any leftover elements.
4. Conclusion
In this article, we showed how to implement top-down and bottom-up versions of Merge Sort in Python.
The most well-suited use case of the top-down method is a non-stack limited environment that doesn’t mind the additional memory overhead brought by recursion.
On the other hand, the bottom-up approach to Merge Sort eliminates recursion and drastically reduces the related stack overhead. The bottom-up method is handy when considering large datasets: since stack overflow is not a consideration, we can sort arbitrarily large datasets with this method.
The size of the dataset that can be processed using the bottom-up Merge Sort is limited by the available system memory. This means that while the algorithm is efficient for large datasets, the amount of data it can handle is determined by the memory capacity of the hardware it runs on.