Skip to main content

Sorting

Merge Sort

Merge Sort is a divide-and-conquer algorithm that efficiently sorts an array or list by dividing it into smaller parts, sorting those parts, and then merging them back together.

What is a Divide and Conquer Algorithm?

A divide-and-conquer algorithmic paradigm is a systematic approach to solving complex problems by breaking them into smaller subproblems, solving those subproblems independently, and then combining their solutions to form the final result.

This strategy is widely used in many algorithms, such as Merge Sort, Quick Sort, and Binary Search.

Key Steps in Divide and Conquer

  1. Divide
    Break the original problem into smaller, more manageable subproblems. The subproblems should ideally be of the same type as the original problem.
  2. Conquer
    Solve each subproblem recursively. If the subproblem is small or simple enough, solve it directly (base case).
  3. Combine
    Merge or combine the solutions of the subproblems to solve the original problem.

Characteristics of Divide and Conquer

  1. Recursion
    Most divide-and-conquer algorithms use recursion to solve the subproblems.
  2. Subproblem Independence
    The subproblems are solved independently of each other, meaning there's no interaction between them during the solving step.
  3. Combining Step
    Once the subproblems are solved, their results are merged to form the overall solution.

Let us go deep down into the intuition that how this algorithm works!

Intuition

Divide

  • The array is split into two halves recursively until each sub-array contains a single element (or no elements, in the case of an empty array).
  • A single-element array is considered sorted by definition.

Conquer

  • The sorted sub-arrays are merged together in a way that preserves order. This is achieved by comparing the elements of the sub-arrays and taking the smallest elements one by one.

Combine

  • The merging step is repeated for higher levels of recursion until the original array is reassembled as a single sorted array.

Let us understand the merge sort with an example

Step by Step Algorithm

Consider sorting the array: [12, 11, 13, 10, 20, 6, 5]

Step 1: Divide
The array is repeatedly divided into two halves: [12, 11, 13, 10, 20, 6, 5]

Level 1: => [12, 11, 13] and [10, 20, 6, 5]
Level 2: => [12] [11, 13] and [10, 20] [6, 5]
Level 3: => [12] [11] [13] and [10] [20] [6] [5]

Step 2: Conquer (Merge)

Merging starts at the smallest level:

  1. Merge [11] and [13]: → [11, 13] (Level 3)
  2. Merge [12] and [11, 13]: → [11, 12, 13] (Level 2)
  3. Merge [10] and [20]: → [10, 20] (Level 3)
  4. Merge [6] and [5]: → [5, 6] (Level 3)
  5. Merge [10, 20] and [6, 5]: → [5, 6, 10, 20] (Level 2)
  6. Merge [11, 12, 13] and [5, 6, 10, 20]: → [5, 6, 10, 11, 12, 13, 20] (Level 1)

The sorted array is: [5, 6, 10, 11, 12, 13, 20]

In Merge Sort, a recursive function is used to divide an array into multiple subarrays, and a merge function is used to combine those independent subarrays.

Let's first understand the recursive function that helps us divide the array (larger problem) into subarrays (smaller problems).

To write a recursive function, we need a base case because, without it, the recursive function would never stop. Since we know that an array of size 1 is always sorted, the base case for our recursive function is when the array size is 1, as it cannot be divided further into smaller subarrays.

Base Case

If the size of the array (or subarray) is 0 or 1, it is already sorted. The function simply returns the array as it is.

Now if the size of the remaining subarray is not 1, we need to divide it further into smaller subproblems.

Recursive Case

If the size of the array is greater than 1:

  1. Find the Middle Index of the Array

When we want to split an array into two halves, the first step is to determine the middle index. This is crucial because it helps us define the bounds of the two subarrays: the left half and the right half.

The formula for the Middle Index:

To avoid overflow issues (when working with very large indices), the middle index is calculated as mid = start + (end - start) // 2

Where:

  • start is the index of the first element of the current subarray.
  • end is the index of the last element of the current subarray.
  • mid is the index of the middle element of the subarray.

The formula is a safe way to calculate the middle of the subarray without causing integer overflow. Here’s how the calculation works:

  • start + (end - start) // 2 gives the middle index, where (end - start) gives the length of the current subarray, and dividing by 2 gives the index that lies in the middle.
  1. Split the Array into Two Halves

Once we have the middle index, we can split the array into two parts:

  • Left Subarray: This will contain the elements from the beginning of the array up to the middle index (inclusive).
  • Right Subarray: This will contain the elements from the middle index + 1 to the end of the array.

For the array [12, 11, 13, 10, 20, 6, 5] and mid = 3, we split the array into:

  • Left subarray: [12, 11, 13, 10] (from index 0 to 3)
  • Right subarray: [20, 6, 5] (from index 4 to 6)
  1. Call mergeSort Recursively on Both Halves

After splitting the array into two subarrays, we then recursively apply Merge Sort to both halves.

This step follows the divide and conquer strategy:

  • Recursively call mergeSort on the left subarray.
  • Recursively call mergeSort on the right subarray.

This recursive process continues until the subarrays are reduced to their smallest possible size (arrays of size 1), which is the base case of the recursion (as already discussed).

Pseudocode for Merge Sort Function

FUNCTION mergeSort(arr, start, end):
    IF start >= end:                      // Base case: If array has 1 or 0 elements
        RETURN

    mid = start + (end - start) // 2      // Calculate middle index

    // Recursively sort the left half
    mergeSort(arr, start, mid)

    // Recursively sort the right half
    mergeSort(arr, mid + 1, end)

    // Merge the two sorted halves
    merge(arr, start, mid, end)

Now that we have learned how to divide the array into different subarrays, let's focus on how to merge (conquer) them to form the sorted array.

The merging step combines two sorted subarrays into a single sorted array. This is done by comparing the elements of the two subarrays and placing the smaller element into the resulting array. Here's how it works:

Understanding the Merge Process

The merge function takes three parameters:

  1. The array: The array containing the subarrays to be merged.
  2. The indices:
    • start: The starting index of the first subarray.
    • mid: The ending index of the first subarray (and implicitly the starting index of the second subarray is mid + 1).
    • end: The ending index of the second subarray.

The goal is to merge the two sorted subarrays arr[start...mid] and arr[mid+1...end] into a single sorted portion of the array.

Steps to Merge Two Subarrays

  1. Create Temporary Arrays
    Copy the elements of the two subarrays into temporary arrays left and right.
  2. Merge Elements
    Use two pointers (i for left and j for right) to compare elements:
    • Add the smaller element to the original array (arr[k]) and move the pointer forward.
  3. Copy Remaining Elements
    Once one of the temporary arrays is exhausted, copy any remaining elements from the other array into the original array.

Pseudocode for Merge Function

FUNCTION merge(arr, start, mid, end):
    // Step 1: Create temporary arrays
    left = COPY elements from arr[start to mid]
    right = COPY elements from arr[mid+1 to end]

    i = 0, j = 0, k = start      // Pointers for left, right, and original array

    // Step 2: Compare elements and merge
    WHILE i < SIZE(left) AND j < SIZE(right):
        IF left[i] <= right[j]:
            arr[k] = left[i]              // Add smaller element to arr
            i = i + 1
        ELSE:
            arr[k] = right[j]
            j = j + 1
        k = k + 1

    // Step 3: Copy the remaining elements from the left array
    WHILE i < SIZE(left):
        arr[k] = left[i]
        i = i + 1
        k = k + 1

    // Step 4: Copy the remaining elements from the right array
    WHILE j < SIZE(right):
        arr[k] = right[j]
        j = j + 1
        k = k + 1

Example for Merging two arrays

Suppose you are merging two sorted subarrays of arr = [11, 12, 13, 5, 6, 10, 20]:
Left subarray: [11, 12, 13]
Right subarray: [5, 6, 10, 20]

Step-by-Step Merge:

  1. Compare 11 and 5: 5 is smaller → arr=[5, _, _, _, _, _, _].
  2. Compare 11 and 6: 6 is smaller → arr=[5, 6, _, _, _, _, _].
  3. Compare 11 and 10: 10 is smaller → arr=[5, 6, 10, _, _, _, _].
  4. Compare 11 and 20: 11 is smaller → arr=[5, 6, 10, 11, _, _, _].
  5. Compare 12 and 20: 12 is smaller → arr=[5, 6, 10, 11, 12, _, _].
  6. Compare 13 and 20: 13 is smaller → arr=[5, 6, 10, 11, 12, 13, _].
  7. 20 is the only element left → arr=[5, 6, 10, 11, 12, 13, 20].

Let us visualize the Merge sort with a visualizer :

0:00
/2:24

Since we have understood the recursive mergeSort function and the merge function, let's take a look at the complete code for them.

Code Implementation
C++
// Function to merge two sorted subarrays into a single sorted array
void merge(vector<int>& arr, int start, int mid, int end) {
    // Create temporary arrays for the left and right subarrays
    int leftSize = mid - start + 1;
    int rightSize = end - mid;

    int left[leftSize], right[rightSize];

    // Copy elements into the left and right temporary arrays
    for (int i = 0; i < leftSize; i++) {
        left[i] = arr[start + i];
    }

    for (int i = 0; i < rightSize; i++) {
        right[i] = arr[mid + 1 + i];
    }

    int i = 0, j = 0, k = start;

    // Compare elements from the left and right subarrays and merge them
    while (i < leftSize && j < rightSize) {
        if (left[i] <= right[j]) {
            arr[k++] = left[i++];
        } else {
            arr[k++] = right[j++];
        }
    }

    // Copy remaining elements from the left subarray, if any
    while (i < leftSize) {
        arr[k++] = left[i++];
    }

    // Copy remaining elements from the right subarray, if any
    while (j < rightSize) {
        arr[k++] = right[j++];
    }
}

// Recursive function to divide the array and sort it
void mergeSort(vector<int>& arr, int start, int end) {
    // Base case: If the array has only one element, it is already sorted
    if (start >= end) {
        return;
    }

    // Find the middle index to divide the array
    int mid = start + (end - start) / 2;

    // Recursively sort the left half
    mergeSort(arr, start, mid);

    // Recursively sort the right half
    mergeSort(arr, mid + 1, end);

    // Merge the two sorted halves
    merge(arr, start, mid, end);
}

Java
public class MergeSort {

    // Function to merge two sorted subarrays into a single sorted array
    public static void merge(int[] arr, int start, int mid, int end) {
        int leftSize = mid - start + 1;
        int rightSize = end - mid;

        int[] left = new int[leftSize];
        int[] right = new int[rightSize];

        // Copy elements into the left and right temporary arrays
        for (int i = 0; i < leftSize; i++) {
            left[i] = arr[start + i];
        }

        for (int i = 0; i < rightSize; i++) {
            right[i] = arr[mid + 1 + i];
        }

        int i = 0, j = 0, k = start;

        // Compare elements from the left and right subarrays and merge them
        while (i < leftSize && j < rightSize) {
            if (left[i] <= right[j]) {
                arr[k++] = left[i++];
            } else {
                arr[k++] = right[j++];
            }
        }

        // Copy remaining elements from the left subarray, if any
        while (i < leftSize) {
            arr[k++] = left[i++];
        }

        // Copy remaining elements from the right subarray, if any
        while (j < rightSize) {
            arr[k++] = right[j++];
        }
    }

    // Recursive function to divide the array and sort it
    public static void mergeSort(int[] arr, int start, int end) {
        // Base case: If the array has only one element, it is already sorted
        if (start >= end) {
            return;
        }

        // Find the middle index to divide the array
        int mid = start + (end - start) / 2;

        // Recursively sort the left half
        mergeSort(arr, start, mid);

        // Recursively sort the right half
        mergeSort(arr, mid + 1, end);

        // Merge the two sorted halves
        merge(arr, start, mid, end);
    }
}

Python
def merge(arr, start, mid, end):
    leftSize = mid - start + 1
    rightSize = end - mid

    left = arr[start:start + leftSize]
    right = arr[mid + 1:mid + 1 + rightSize]

    i, j, k = 0, 0, start

    # Compare elements from the left and right subarrays and merge them
    while i < leftSize and j < rightSize:
        if left[i] <= right[j]:
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = right[j]
            j += 1
        k += 1

    # Copy remaining elements from the left subarray, if any
    while i < leftSize:
        arr[k] = left[i]
        i += 1
        k += 1

    # Copy remaining elements from the right subarray, if any
    while j < rightSize:
        arr[k] = right[j]
        j += 1
        k += 1

# Recursive function to divide the array and sort it
def mergeSort(arr, start, end):
    # Base case: If the array has only one element, it is already sorted
    if start >= end:
        return

    # Find the middle index to divide the array
    mid = start + (end - start) // 2

    # Recursively sort the left half
    mergeSort(arr, start, mid)

    # Recursively sort the right half
    mergeSort(arr, mid + 1, end)

    # Merge the two sorted halves
    merge(arr, start, mid, end)

Javascript
// Function to merge two sorted subarrays into a single sorted array
function merge(arr, start, mid, end) {
    let leftSize = mid - start + 1;
    let rightSize = end - mid;

    let left = [];
    let right = [];

    // Copy elements into the left and right temporary arrays
    for (let i = 0; i < leftSize; i++) {
        left[i] = arr[start + i];
    }

    for (let i = 0; i < rightSize; i++) {
        right[i] = arr[mid + 1 + i];
    }

    let i = 0, j = 0, k = start;

    // Compare elements from the left and right subarrays and merge them
    while (i < leftSize && j < rightSize) {
        if (left[i] <= right[j]) {
            arr[k++] = left[i++];
        } else {
            arr[k++] = right[j++];
        }
    }

    // Copy remaining elements from the left subarray, if any
    while (i < leftSize) {
        arr[k++] = left[i++];
    }

    // Copy remaining elements from the right subarray, if any
    while (j < rightSize) {
        arr[k++] = right[j++];
    }
}

// Recursive function to divide the array and sort it
function mergeSort(arr, start, end) {
    // Base case: If the array has only one element, it is already sorted
    if (start >= end) {
        return;
    }

    // Find the middle index to divide the array
    let mid = start + Math.floor((end - start) / 2);

    // Recursively sort the left half
    mergeSort(arr, start, mid);

    // Recursively sort the right half
    mergeSort(arr, mid + 1, end);

    // Merge the two sorted halves
    merge(arr, start, mid, end);
}

Time Complexity: O(n logn)

Here’s a detailed breakdown of why the time complexity is O(n log n):

1. Divide Step (Recursion):

Merge Sort works by dividing the array into two halves at each step. This divide step occurs recursively:

  • Initially, the entire array is split into two halves.
  • Each of those halves is further split into two subarrays.
  • This process continues until the array is broken down into individual elements (or one-element subarrays).

The number of times the array is halved is proportional to the logarithm of the array size. Specifically, the array of size n is divided into two parts recursively until we reach a base case where each subarray contains a single element.

Number of divisions = log₂(n). This is because the array is halved in each recursive step, and after log₂(n) steps, we reach a point where each subarray contains only one element.

2. Merge Step (Combining):

After dividing the array into smaller subarrays, we need to merge them back together in sorted order. The merge process involves comparing elements from two sorted subarrays and putting them into a new array in sorted order.

  • At each level of recursion, you are merging pairs of subarrays. In total, at each level of recursion, the total number of elements being merged is equal to the size of the array n (since each element must be merged back into the final sorted array).
  • The merging process takes O(n) time, as every element in the array is visited once during the merge.

3. Total Time Complexity Calculation:

To understand the time complexity, we combine the time for the divide and merge steps:

  • Divide Step: Takes O(log n) time, since we split the array into two halves recursively.
  • Merge Step: At each level, merging takes O(n) time, as every element must be considered.

Since the array is split into halves log₂(n) times, and at each level of recursion we perform a merge operation that takes O(n) time, the total time complexity is: O(n×log⁡n)

Visualizing the Complexity:

To make this clearer, let’s visualize the recursion tree:

  • At the top level, there is 1 merge operation involving n elements.
  • At the second level, there are 2 merge operations, each involving n/2 elements.
  • At the third level, there are 4 merge operations, each involving n/4 elements.
  • This pattern continues, where the number of operations doubles at each level, but the size of each subarray halved.

At each level, the total amThis takesber of elements merged) is always O(n), but the number of levels is O(log n). So, the total time complexity is O(n log n).

Example:

Let’s consider an array of size 8:

  1. Divide Step:
    • Divide [8 elements] → 2 subarrays of 4 elements each → 4 subarrays of 2 elements each → 8 subarrays of 1 element each.
    • This takes log₂(8) = 3 steps.
  2. Merge Step:
    • Level 1: Merge 8 subarrays (1 element each) into 4 subarrays (2 elements each) → O(8) time.
    • Level 2: Merge 4 subarrays (2 elements each) into 2 subarrays (4 elements each) → O(8) time.
    • Level 3: Merge 2 subarrays (4 elements each) into 1 subarray (8 elements) → O(8) time.

So, the total work is: O(8)+O(8)+O(8)=O(24)=O(nlogn)

Summary of Merge Sort Time Complexity:

  • Divide Step: This Takes O(log n) time because the array is split into halves.
  • Merge Step: This takes O(n) time because each element is merged once at each level.
  • Overall Time Complexity: O(n log n).

Space Complexity: O(n)

The space complexity of Merge Sort is O(n) because of the additional memory required during the merging process.

Here's a breakdown of why this happens:

1. Recursive Call Stack (Space for Function Calls):

In Merge Sort, the algorithm uses recursion to divide the array into smaller subarrays. Each recursive call creates a new stack frame, and the depth of recursion is O(log n) (because the array is halved at each level).

  • However, the recursive calls do not directly contribute to the space complexity in terms of additional memory usage. They do occupy space in the call stack, but this is typically not counted as additional space unless it's related to storing data.
  • So, while the recursion depth is O(log n), it does not contribute to the primary space complexity in terms of memory used to store data.

2. Auxiliary Array for Merging:

The key factor that causes O(n) space complexity is the use of an auxiliary array to store the merged subarrays.

  • During the merge step, Merge Sort requires extra space to hold the combined result of two sorted halves. In the worst case, when merging two halves of size n/2, the algorithm needs an additional array of size n to store the sorted result.
  • Therefore, for each merge operation, O(n) extra space is used to store the intermediate results.

Space Complexity Breakdown:

  • Recursive Stack: O(log n) due to the recursive calls (stack depth).
  • Auxiliary Array: O(n) due to the temporary array used to merge sorted subarrays.

So, the overall space complexity is dominated by the O(n) space required for the auxiliary array during the merge process, leading to an overall space complexity of O(n).

Stability of Algorithm

Merge Sort is a stable sorting algorithm, which means that it preserves the relative order of elements with equal values in the sorted output.

Why is Merge Sort Stable?

The stability of an algorithm refers to how it handles equal elements in the input array. If two elements are equal, a stable algorithm ensures that they appear in the same relative order in the sorted array as they did in the original array.

In Merge Sort, this stability arises from how the merging process works. Here's a more detailed explanation:

Merging Process:

When two sorted subarrays are being merged, the algorithm compares the first element of each subarray:

  1. If the element from the left subarray is less than the element from the right subarray, it is placed into the merged array.
  2. If the element from the right subarray is less than the element from the left subarray, it is placed into the merged array.
  3. If both elements are equal, Merge Sort places the element from the left subarray before the element from the right subarray. This ensures that the original relative order of equal elements is preserved.

Application of Merge Sort

  • Sorting Linked Lists:
    • Merge Sort is ideal for sorting linked lists because it doesn’t require shifting elements. It efficiently splits and merges linked list nodes, making it more suitable than other algorithms like Quick Sort for this data structure.
  • Counting Inversions in a Array:
    • Merge Sort can be adapted to count inversions (pairs where an earlier element is greater than a later one) in an array. This is useful in areas like data analysis and computer vision, providing an efficient O(n log n) solution compared to brute force approaches.
  • External Sorting:
    • Merge Sort is widely used in external sorting when data is too large to fit into memory. It sorts data stored on disk by dividing it into chunks, sorting those chunks in memory, and merging them back together, making it ideal for large-scale data processing like in database systems.
💡
Showcase your skills by joining LearnYard Technologies FZ-LLC as a Technical Content Writer. Apply now and inspire the next generation of learners—fill out the form: https://forms.gle/CGDsbGkcapSfvXKM8