Sorting an array using Merge Sort

Photo by pine watt on Unsplash

Merge Sort is probably one of the best sorting algorithms. It works by repeatedly dividing an array into smaller parts and combining the results. It keeps dividing the array until the array size becomes 1 and then merges the sorted arrays to get the result. Hence the name merge sort. It's a classic recursion problem.

Let's suppose we have an array

[50, 10, 20, 40, 30]

We can start by dividing the array into smaller arrays until the array has only one element left. In Step 1, 2, 3 and 4, we continue dividing the array and by step 4 each array has only 1 element left.

We start merging the sorted subarrays in Step 5, and continue with that till Step 8, which gives us the sorted array. Now all that remains is to define how we would merge any two sorted arrays.


NB: The ordering here is just to explain the proccess. This is not in sync with how the compiler will run the code.


Let's take another example for that. Say we have 2 sorted arrays,

left = [10, 50]
right = [20, 30, 40]

For merging these 2 arrays we can take an empty array merged and 3 indices (indexes) leftIndex, rightIndex, mergedIndex initialized with 0. We will start a loop that will continue as long as,

leftIndex < LENGTH(left) AND rightIndex < LENGTH(right)

For each iteration we will do three things,

  1. Find the smaller number in both the arrays at the current index,
  2. Copy the smaller number to the mergedArray
  3. Increment the indexes for the mergedArray and the array whose current element was smaller.
WHILE leftIndex < LENGTH(left) AND rightIndex < LENGTH(right):
    IF left[leftIndex] < right[rightIndex]:
        merged[mergedIndex] = left[leftIndex]
        leftIndex = leftIndex + 1
    ELSE:
        merged[mergedIndex] = right[rightIndex]
        rightIndex = rightIndex + 1

    mergedIndex = mergedIndex + 1 

When there are no elements left in either one of these arrays we can copy all the elements left in the other array, but because we don't know which array got exhausted we will write the code for both the arrays.

WHILE leftIndex < LENGTH(left):
    merged[mergedIndex] = left[leftIndex]
    leftIndex = leftIndex + 1
    mergedIndex = mergedIndex + 1 
    
WHILE rightIndex < LENGTH(right):
    merged[mergedIndex] = right[rightIndex]
    rightIndex = rightIndex + 1
    mergedIndex = mergedIndex + 1 
    

And that's all we need to do!

I know this was more complicated than anything we have discussed uptil now in this series, but I'm sure if you have followed the rest of the posts you have the foundation ready for this and even complex problems. Let's look at the pseudocode now.

FUNCTION Merge(leftArray, rightArray):
    leftIndex = 0
    rightIndex = 0
    mergedIndex = 0
    mergedArray = []
    WHILE leftIndex < LENGTH(leftArray) AND rightIndex < LENGTH(rightArray):
        IF leftArray[leftIndex] < rightArray[rightIndex]:
            merged[mergedIndex] = leftArray[leftIndex]
            leftIndex = leftIndex + 1
        ELSE:
            merged[mergedIndex] = rightArray[rightIndex]
            rightIndex = rightIndex + 1

        mergedIndex = mergedIndex + 1 

    WHILE leftIndex < LENGTH(leftArray):
        merged[mergedIndex] = leftArray[leftIndex]
        leftIndex = leftIndex + 1
        mergedIndex = mergedIndex + 1 

    WHILE rightIndex < LENGTH(rightArray):
        merged[mergedIndex] = rightArray[rightIndex]
        rightIndex = rightIndex + 1
        mergedIndex = mergedIndex + 1 
    RETURN mergedArray
        
FUNCTION MergeSort(array):
    IF LENGTH(array) < 2:
        RETURN array
    middleIndex = LENGTH(array) / 2
    RETURN Merge(MergeSort(array[0 : middleIndex]), MergeSort(array[middleIndex:]))
    

We have used two different functions here, one for merging the 2 halves and one for the recursion calls.

We are ready to look at the code now.

Python

def merge_sort(array):
    if len(array) < 2:
        # Array has 0 or 1 elements which means it is already sorted
        return array
    middle_index = len(array) // 2

    # Call merge_sort() on left and right halves of the array separately and merge them
    return merge(merge_sort(array[0 : middle_index]), merge_sort(array[middle_index:]))


def merge(left_array, right_array):
    left_index = 0
    right_index = 0
    merged_index = 0
    merged_array = []

    # While both the arrays have elements pick the smallest possible number and append it to the merged_array
    while left_index < len(left_array) and right_index < len(right_array):
        if left_array[left_index] < right_array[right_index]:
            merged_array.append(left_array[left_index])
            left_index = left_index + 1
        else:
            merged_array.append(right_array[right_index])
            right_index = right_index + 1
        merged_index = merged_index + 1 

    # If the left_array still has some elements
    while left_index < len(left_array):
        merged_array.append(left_array[left_index])
        left_index = left_index + 1
        merged_index = merged_index + 1 

    # If the right_array still has some elements
    while right_index < len(right_array):
        merged_array.append(right_array[right_index])
        right_index = right_index + 1
        merged_index = merged_index + 1 
    
    return merged_array


print(merge_sort([10, 20, 50, 10, 30]))

It has the worst case complexity of O(n log n)

Hashjar

Hashjar