Timsort

Timsort

·

4 min read

Timesort is a hybrid sorting algorithm. We talked about another hybrid sorting algorithm Introspective Sort before, which combines quick sort, heap sort and selection sort to achieve better result. In timsort, we conbine selection sort and merge sort to make the sorting process more efficient.

Selectioin Sort

We know that for small array, selection sort tends to be faster. We talked about selection sort in the Introspective Sort article before, so check it if you are not familiar with it. Let's copy paste code directly in here.

function insertionSort(nums: number[], left: number, right: number): void {
    for (let i = left + 1; i <= right; i++) {
        const key = nums[i];

        let j = i - 1;
        for (; j >= left && nums[j] > key; j--) {
            nums[j + 1] = nums[j];
        }
        nums[j + 1] = key;
    }
}

Merge Sort

The leetcode problem Merge Sorted Array shows the technique to how to use merge sort.

The most simple approach is to get the smallest value one by one in the loop. Because we need to store all the data back into the first array, so we need to copy it first.

function merge(nums1: number[], m: number, nums2: number[], n: number): void {
    const copy = [...nums1];
    const length = m + n;

    let j = 0;
    let k = 0;
    for (let i = 0; i < length; i++) {
        if (k >= n || (j < m && copy[j] < nums2[k])) {
            nums1[i] = copy[j++];
        } else {
            nums1[i] = nums2[k++];
        }
    }
};

Because in this problem, the first array have enough room for storing all two arrays, so we can make use of the latter part first, then we don't need to copy the array. It works like we already copy the second array. So the process is to loop backward. Others are the same.

function merge(nums1: number[], m: number, nums2: number[], n: number): void {
    let j = m - 1;
    let k = n - 1;
    for (let i = m + n - 1; i >= 0; i--) {
        if (k < 0 || (j >= 0 && nums1[j] > nums2[k])) {
            nums1[i] = nums1[j--];
        } else {
            nums1[i] = nums2[k--];
        }
    }
};

In timsort, we actually only have 1 array and we need to store the result in-place. So we can't make use of this idea. And because there is only 1 array, so we make some changes to the merge sort function.

// merge two array: nums.slice(start, start+size) and nums.slice(start+size, start+size+size)
function mergeSort(nums: number[], start: number, size: number): void {
    // only one array left, return
    if (start + size >= nums.length) return;

    const copy = nums.slice(start, start + size);
    const length = Math.min(start + size + size, nums.length);
    let j = start;
    let k = start + size;
    for (let i = 0; i < length; i++) {
        if (k >= length || (j < start + size && copy[j] < nums[k])) {
            nums[i] = copy[j++];
        } else {
            nums[i] = nums[k++];
        }
    }
};

Timsort

OK, now let's see how to conbine these 2 algorithms in timsort.

First we need to define a variable run length, and divide the array into many subarrays with size as this variable. So if we have run length as 4 and an array as [9, 8, 7, 6, 4, 3, 2, 1, 11, 13, 12], then all the subarrays looks like below.

[9, 8, 7, 6, 4, 3, 2, 1, 11, 13, 12]

=>

[
    9, 8, 7, 6, 
    4, 3, 2, 1, 
    11, 13, 12
]

Second, for each subarray, we use insertion sort algorithm to make it sorted.

[
    6, 7, 8, 9,
    1, 2, 3, 4,
    11, 12, 13
]

As you can see, now the problem is converted into a problem to merge multiple sorted arrays. So we use merge sort algorithm to sort the array pair by pair until there is only 1 subarray left.

Now let see all the code.

function timSort(nums: number[]) {
    const runLength = 4;
    for (let i = 0; i < nums.length; i += runLength) {
        const right = i + runLength - 1;
        insertionSort(nums, i, right >= nums.length ? nums.length - 1 : right);
    }

    for (let size = runLength; size < nums.length; size *= 2) {
        for (let start = 0; start < nums.length; start += size * 2) {
            mergeSort(nums, start, size);
        }
    }
}