Sorting

Mon, Oct 4, 2021 8-minute read
What is in-place sorting?

An in-place sort uses only constant extra space, producing the output by modifying the given array itself.

E.g. insertion sort, selection sort, heap sort. (Merge sort and counting sort are not in-place.)

What are internal & external sorting?

External sorting: used when all the data to be sorted cannot fit in memory at once. Suited for massive amounts of data.

E.g. merge sort and its variations.

Internal sorting: used when all the data fits in memory.

What is stable sorting?

Stability matters mainly when we have key-value pairs with possibly duplicate keys and we wish to sort these objects by key. A sorting algorithm is stable if two objects with equal keys appear in the same order in the sorted output as they appear in the input array.

  • Stability is not an issue when equal elements are indistinguishable (e.g. plain integers), when the entire element is the key, or when all keys are distinct.

Stable sorting: Bubble Sort, Insertion Sort, Merge Sort, Counting Sort

Unstable sorting: Quick Sort, Heap Sort. They can be made stable by taking the position of elements into consideration.

Any comparison based sorting algorithm which is not stable by nature can be modified to be stable by changing the key comparison operation.
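One common way to change the key comparison is to augment each key with the element's original index, so ties break by input position. A minimal Python sketch of the idea (the function name is mine; the built-in sort stands in for any comparison sort):

```python
def stable_sort(items, key):
    """Make any comparison sort stable: compare (key, original index)
    pairs so equal keys keep their input order."""
    decorated = [(key(x), i, x) for i, x in enumerate(items)]
    decorated.sort(key=lambda t: (t[0], t[1]))  # ties resolved by index
    return [x for _, _, x in decorated]

pairs = [("b", 2), ("a", 1), ("b", 1), ("a", 2)]
print(stable_sort(pairs, key=lambda p: p[0]))
# [('a', 1), ('a', 2), ('b', 2), ('b', 1)]
```

Because no two (key, index) pairs compare equal, even an unstable underlying sort produces a stable result.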

Time Complexity

(considering to sort in ascending order)

                 Best          Average       Worst
Selection Sort   Ω(n^2)        θ(n^2)        O(n^2)
Bubble Sort      Ω(n)          θ(n^2)        O(n^2)
Insertion Sort   Ω(n)          θ(n^2)        O(n^2)
Heap Sort        Ω(n log n)    θ(n log n)    O(n log n)
Quick Sort       Ω(n log n)    θ(n log n)    O(n^2)
Merge Sort       Ω(n log n)    θ(n log n)    O(n log n)
Bucket Sort      Ω(n+k)        θ(n+k)        O(n^2)
Radix Sort       Ω(nk)         θ(nk)         O(nk)

  • Selection sort: no matter the scenario, it must scan the whole unsorted part of the list of n elements to find the ith element, so every case is quadratic.
  • Bubble sort and insertion sort: the best case is an input already in ascending order; the worst case is an input in descending order.

The lower bound for comparison-based sorting algorithms (merge sort, heap sort, quick sort, etc.) is Ω(n log n), i.e., they cannot do better than n log n comparisons in the worst case.

Selection Sort

Repeatedly find the minimum element in the unsorted part and put it at the beginning of that part. It maintains two subarrays: a sorted prefix and an unsorted suffix.
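The two-subarray idea can be sketched in Python as (function name is mine):

```python
def selection_sort(arr):
    """Grow a sorted prefix one element at a time by swapping the
    minimum of the unsorted suffix to the front of that suffix."""
    n = len(arr)
    for i in range(n - 1):
        min_idx = i
        for j in range(i + 1, n):       # scan the unsorted suffix
            if arr[j] < arr[min_idx]:
                min_idx = j
        arr[i], arr[min_idx] = arr[min_idx], arr[i]
    return arr

print(selection_sort([64, 25, 12, 22, 11]))  # [11, 12, 22, 25, 64]
```

The inner scan always runs over the whole suffix, which is why all cases are θ(n^2).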

Bubble Sort

Recursive Bubble Sort

Insertion Sort

Recursive Insertion Sort

Binary Insertion sort

Merge Sort

Divide and Conquer

MergeSort(arr[], l, r)
If r > l
     1. Find the middle point to divide the array into two halves:
             middle m = l + (r - l)/2
     2. Call mergeSort for first half:
             Call mergeSort(arr, l, m)
     3. Call mergeSort for second half:
             Call mergeSort(arr, m+1, r)
     4. Merge the two halves sorted in steps 2 and 3:
             Call merge(arr, l, m, r)
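The pseudocode above translates into Python roughly as follows (a sketch; the `merge` helper is my filling-in of the step the pseudocode only names):

```python
def merge_sort(arr, l, r):
    """Sort arr[l..r] in place, following the pseudocode above."""
    if r > l:
        m = l + (r - l) // 2
        merge_sort(arr, l, m)
        merge_sort(arr, m + 1, r)
        merge(arr, l, m, r)

def merge(arr, l, m, r):
    """Merge the sorted halves arr[l..m] and arr[m+1..r]."""
    left, right = arr[l:m + 1], arr[m + 1:r + 1]
    i = j = 0
    k = l
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # <= keeps the sort stable
            arr[k] = left[i]; i += 1
        else:
            arr[k] = right[j]; j += 1
        k += 1
    arr[k:r + 1] = left[i:] + right[j:]  # copy whichever half remains

data = [12, 11, 13, 5, 6, 7]
merge_sort(data, 0, len(data) - 1)
print(data)  # [5, 6, 7, 11, 12, 13]
```

The `<=` in the merge is what makes merge sort stable: on equal keys, the element from the left half (earlier in the input) wins.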

Iterative Merge Sort

Merge Sort for LinkedList

3-way Merge Sort

Quick Sort

Divide and Conquer

It picks an element as pivot and partitions the given array around the picked pivot. There are many different versions of quickSort that pick pivot in different ways.

  1. Always pick first element as pivot.
  2. Always pick last element as pivot.
  3. Pick a random element as pivot.
  4. Pick median as pivot.
/* low  --> Starting index,  high  --> Ending index */
quickSort(arr[], low, high)
{
    if (low < high)
    {
        /* pi is partitioning index, arr[pi] is now
           at right place */
        pi = partition(arr, low, high);

        quickSort(arr, low, pi - 1);  // Before pi
        quickSort(arr, pi + 1, high); // After pi
    }
}
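A Python version of the pseudocode, using Lomuto partitioning with the last element as pivot (option 2 above; the `partition` body is my filling-in of the step the pseudocode only names):

```python
def quick_sort(arr, low, high):
    """Sort arr[low..high] in place, mirroring the pseudocode above."""
    if low < high:
        pi = partition(arr, low, high)   # arr[pi] is now in final place
        quick_sort(arr, low, pi - 1)
        quick_sort(arr, pi + 1, high)

def partition(arr, low, high):
    """Lomuto scheme: last element as pivot; returns its final index."""
    pivot = arr[high]
    i = low - 1                          # boundary of the <= pivot region
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

data = [10, 7, 8, 9, 1, 5]
quick_sort(data, 0, len(data) - 1)
print(data)  # [1, 5, 7, 8, 9, 10]
```

With this fixed-pivot choice, an already-sorted input triggers the O(n^2) worst case; picking a random pivot (option 3) avoids that in expectation.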

Quick Sort V.S Merge Sort

Quick sort is an internal algorithm based on the divide-and-conquer strategy. In this:

  1. The array of elements is divided into parts repeatedly until it is not possible to divide it further.
  2. It is also known as “partition exchange sort”.
  3. It uses a key element (the pivot) for partitioning the elements.
  4. The left partition contains all the elements smaller than the pivot, and the right partition contains all the elements greater than it.

Merge sort is an external algorithm, also based on the divide-and-conquer strategy. In this:

  1. The elements are split into two sub-arrays (n/2) again and again until only one element is left.
  2. Merge sort uses additional storage for the auxiliary array.
  3. It uses three arrays: two to hold each half, and a third to store the final sorted list produced by merging the other two; each half is sorted recursively.
  4. Finally, all the sub-arrays are merged back into a single array of n elements.
Partition of elements in the array:

In merge sort, the array is always split into two halves (i.e. n/2), whereas in quick sort the array can be partitioned in any ratio; there is no requirement to divide the elements into equal parts.

Worst case complexity:

The worst-case complexity of quick sort is O(n^2), since a poorly chosen pivot sequence forces a quadratic number of comparisons, whereas in merge sort the worst case and the average case have the same complexity, O(n log n).

Usage with datasets:

Merge sort works well on data sets of any size, large or small, whereas quick sort does not work as well on very large datasets.

Additional storage space requirement:

Merge sort is not in-place because it requires additional memory to store the auxiliary arrays, whereas quick sort is in-place, needing no significant additional storage beyond the recursion stack.

Efficiency:

Merge sort is more efficient and works faster than quick sort for larger arrays or datasets, whereas quick sort is more efficient and faster for smaller ones.

Sorting method:

Quick sort is an internal sorting method: the data is sorted in main memory. Merge sort can be used as an external sorting method, for data that cannot fit in memory at once and needs auxiliary storage during sorting.

Stability:

Merge sort is stable: two elements with equal value appear in the same order in the sorted output as in the unsorted input. Quick sort is unstable in this scenario, although it can be made stable with some changes to the code.

Preferred for:

Quick sort is preferred for arrays, whereas merge sort is preferred for linked lists.

Locality of reference:

Quicksort exhibits good cache locality, which makes it faster than merge sort in many cases (such as in a virtual-memory environment).

Basis for comparison        Quick Sort                               Merge Sort
Partition of elements       Split in any ratio, not necessarily      Always split into two halves (i.e. n/2)
                            into halves
Worst case complexity       O(n^2)                                   O(n log n)
Works well on               Smaller arrays                           Arrays of any size
Speed of execution          Faster than other sorting algorithms     Consistent speed on any size of data
                            (e.g. selection sort) for small data
Additional storage          Less (in-place)                          More (not in-place)
Efficiency                  Inefficient for larger arrays            More efficient for larger arrays
Sorting method              Internal                                 External
Stability                   Not stable                               Stable
Preferred for               Arrays                                   Linked lists
Locality of reference       Good                                     Poor

Iterative Quick Sort

Quick Sort on Singly LinkedList

Quick Sort on Doubly LinkedList

3-way Quick Sort (Dutch National Flag)

Heap Sort

A comparison-based sorting technique built on the binary heap data structure. Like selection sort, it repeatedly selects an extreme element and moves it into its final position; the heap just makes that selection O(log n) instead of O(n).

What is a binary heap?

A complete binary tree in which the value in each parent node is greater than (max-heap) or smaller than (min-heap) the values in its two children. It can be represented as a binary tree or as an array.

  • Array: since a binary heap is a complete binary tree, it can be represented by an array, and the array-based representation is space-efficient. If a parent node is stored at index i, its left child is at index 2i + 1 and its right child at 2i + 2.
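A Python sketch of heap sort using the array representation and the 2i + 1 / 2i + 2 child indexing described above (function names are mine):

```python
def heapify(arr, n, i):
    """Sift arr[i] down so the subtree rooted at i satisfies the
    max-heap property; children of i live at 2i+1 and 2i+2."""
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

def heap_sort(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):   # build a max-heap bottom-up
        heapify(arr, n, i)
    for end in range(n - 1, 0, -1):       # swap current max to the end
        arr[0], arr[end] = arr[end], arr[0]
        heapify(arr, end, 0)              # restore heap on the prefix
    return arr

print(heap_sort([12, 11, 13, 5, 6, 7]))  # [5, 6, 7, 11, 12, 13]
```

Each extraction costs O(log n) sift-downs, giving O(n log n) in every case.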

Counting Sort

Counting sort is a linear-time sorting algorithm that sorts in O(n+k) time when the elements are in the range from 1 to k.
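A minimal Python sketch for non-negative integer keys in the range 0..k (function name is mine):

```python
def counting_sort(arr, k):
    """Sort non-negative integers in range 0..k in O(n + k) time."""
    count = [0] * (k + 1)
    for x in arr:                         # tally each key
        count[x] += 1
    for i in range(1, k + 1):             # prefix sums: count[i] is the
        count[i] += count[i - 1]          # number of elements <= i
    out = [0] * len(arr)
    for x in reversed(arr):               # right-to-left keeps it stable
        count[x] -= 1
        out[count[x]] = x
    return out

print(counting_sort([1, 4, 1, 2, 7, 5, 2], 7))  # [1, 1, 2, 2, 4, 5, 7]
```

The reversed final pass is what makes counting sort stable, which matters when it is used as the digit-sorting subroutine inside radix sort.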

Radix Sort

The idea of Radix Sort is to do digit by digit sort starting from least significant digit to most significant digit. Radix sort uses counting sort as a subroutine to sort.
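The digit-by-digit idea can be sketched in Python for non-negative decimal integers (function names are mine; a stable counting sort on each digit is the subroutine):

```python
def radix_sort(arr):
    """LSD radix sort: stable counting sort on each decimal digit,
    least significant digit first."""
    if not arr:
        return arr
    exp = 1
    while max(arr) // exp > 0:            # one pass per digit
        arr = counting_sort_by_digit(arr, exp)
        exp *= 10
    return arr

def counting_sort_by_digit(arr, exp):
    count = [0] * 10
    for x in arr:
        count[(x // exp) % 10] += 1
    for i in range(1, 10):
        count[i] += count[i - 1]
    out = [0] * len(arr)
    for x in reversed(arr):               # stability is essential here:
        d = (x // exp) % 10               # earlier passes must survive
        count[d] -= 1
        out[count[d]] = x
    return out

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Each pass preserves the order established by the previous (less significant) passes, which is why the digit sort must be stable.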

Is Radix Sort preferable to Comparison based sorting algorithms like Quick-Sort?

If we have log2 n bits for every digit, the running time of radix sort appears to be better than quick sort for a wide range of input numbers. However, the constant factors hidden in the asymptotic notation are higher for radix sort, and quick sort uses hardware caches more effectively. Also, radix sort uses counting sort as a subroutine, and counting sort takes extra space to sort the numbers.

Bucket Sort

Bucket sort is mainly useful when the input is uniformly distributed over a range. For example, consider sorting a large set of floating-point numbers uniformly distributed over [0.0, 1.0). Counting sort cannot be applied here, since it uses keys as array indices and the keys here are floating-point numbers.
The idea is to use bucket sort.

bucketSort(arr[], n)
1) Create n empty buckets (or lists).
2) For every array element arr[i]:
       a) Insert arr[i] into bucket[n * arr[i]]
3) Sort individual buckets using insertion sort.
4) Concatenate all sorted buckets.
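The steps above can be sketched in Python, assuming inputs uniformly distributed in [0, 1) (function names are mine):

```python
def bucket_sort(arr):
    """Scatter floats in [0, 1) into n buckets, insertion-sort each
    bucket, then concatenate the buckets in order."""
    n = len(arr)
    buckets = [[] for _ in range(n)]
    for x in arr:
        buckets[int(n * x)].append(x)     # bucket index = n * arr[i]
    result = []
    for b in buckets:
        insertion_sort(b)                 # buckets are small on average
        result.extend(b)
    return result

def insertion_sort(b):
    for i in range(1, len(b)):
        key = b[i]
        j = i - 1
        while j >= 0 and b[j] > key:
            b[j + 1] = b[j]
            j -= 1
        b[j + 1] = key

print(bucket_sort([0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434]))
# [0.1234, 0.3434, 0.565, 0.656, 0.665, 0.897]
```

With uniformly distributed input, each bucket holds O(1) elements on average, so the expected time is O(n + k); adversarial input that lands in one bucket degrades to the O(n^2) worst case in the table above.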

Shell Sort

Tim Sort

Comb Sort

Pigeonhole Sort

Cycle Sort

Cocktail Sort

Strand Sort

Bitonic Sort

Pancake Sorting

BogoSort / Permutation Sort

Gnome Sort

Sleep Sort - The King of Laziness/Sorting while Sleeping

Structure Sorting

Stooge Sort

Tag Sort (both sorted and original)

Tree Sort

Cartesian Tree Sorting

Odd-Even Sort / Brick Sort

(credit to GeeksforGeeks)