Merge Sort: implementation and analysis

def merge(left, right):
    # Merge two already-sorted lists into a single sorted list.
    result = []
    i, j = 0, 0

    # Repeatedly take the smaller front element of the two lists.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1

    # Exactly one of the two lists is exhausted here; append the
    # remaining elements of the other (they are already sorted).
    result += left[i:]
    result += right[j:]

    return result


def mergesort(items):
    # Base case: a list with fewer than two elements is already sorted.
    if len(items) < 2:
        return items

    # Recursively sort each half, then merge the two sorted halves.
    middle = len(items) // 2  # floor division keeps middle an int
    left = mergesort(items[:middle])
    right = mergesort(items[middle:])

    return merge(left, right)


l = list(reversed(range(1, 11)))
print(l)
print(mergesort(l))
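
Running this prints the reversed input followed by the sorted result:

[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]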

What’s going on

  1. mergesort() keeps calling itself on the two halves of the list, until
  2. the list it receives has a single element; that list is returned as-is, and control goes back to the mergesort() call one level up, which now has values for both left and right and can return merge(left, right).
  3. The return value of merge(left, right) is a new sorted list, which becomes the left or right of the parent mergesort() call, all the way up to the very first call, which returns the final, sorted list (a trace of this is sketched below).
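
To make the recursion visible, here is a minimal sketch of an instrumented variant (trace_mergesort and its depth parameter are additions for illustration, not part of the original code):

def trace_mergesort(items, depth=0):
    # Same algorithm as mergesort(), but prints each call indented by depth.
    print("  " * depth + "sorting " + str(items))
    if len(items) < 2:
        return items

    middle = len(items) // 2
    left = trace_mergesort(items[:middle], depth + 1)
    right = trace_mergesort(items[middle:], depth + 1)

    merged = merge(left, right)
    print("  " * depth + "merged  " + str(merged))
    return merged

trace_mergesort([4, 2, 3, 1])

Each level of indentation in the output corresponds to one halving of the list, matching the depth argument discussed in the analysis below.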

Also note that this part

result += left[i:]
result += right[j:]

is needed because at some point either i will reach len(left) or j will reach len(right), causing the while loop to terminate. When that happens, the other list still has elements left to add: since both lists are sorted, those leftovers are all greater than or equal to everything merged so far, so they can be appended as-is. Whichever list is exhausted contributes an empty slice, which is why both lines are safe to run unconditionally.
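
As a quick check with the merge() defined above: in merge([1, 5, 9], [2, 3]) the right list is exhausted first, so left[i:] == [5, 9] is appended at the end.

print(merge([1, 5, 9], [2, 3]))   # prints [1, 2, 3, 5, 9]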

Analysis

The number of times we can halve \( n \) before getting to 1 is \( \log_2(n) \) (for example, 16 halves to 8, 4, 2, 1 in four steps, and \( \log_2(16) = 4 \)), therefore we have \( \log(n) \) levels of depth.

For every level \( j \) we have \( 2^j \) subproblems, each one with \( \frac{n}{2^j} \) elements. Since merging does a constant amount of work per element, the running time is the same for every level:

\[ \frac{n}{2^j} \times 2^j = n = O(n) \]

Combining the two, \( \log(n) \) levels of \( O(n) \) work each give Merge Sort a total running time of \( O(n \log(n)) \).
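
The same result can be read off the usual recurrence for Merge Sort (a standard derivation, stated here alongside the level-counting argument above):

\[ T(n) = 2\,T\!\left(\frac{n}{2}\right) + O(n) \]

Unrolling it gives \( O(n) \) work at each of the \( \log(n) \) levels, which again yields \( T(n) = O(n \log(n)) \).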