Sorting
What is in-place soring?
Use constant extra space for producing the output by modifying the given array only.
E.g. Insertion sort, selection sort, heap sort. (Merge sort and counting sort are not)
What are internal & external sorting?
External sorting: when all data that needs to be sorted cannot be placed in-memory at a time. Used for massive amount of data.
E.g. merge sort and its variations.
Internal soring: when all data is placed in-memory.
What is stable sorting?
Stability is mainly important when we have key value pairs with duplicate keys possible, and we wish to sort these objects by keys. A soring algorithm is stable if two objects with equal keys appear in the same order in sorted output as they appeat in the input array to be sorted.
- When equal elements are indistinguishable (with integers), or any data where the entire element is the key, stability is not an issue. Vice versa, if all keys are different.
Stable sorting: Bubble Sort, Insertion Sort, Merge Sort, Count Sort
Unstable sorting: Quick Sort, Heap Sort can be made stable by taking the position of elements into considertaion.
Any comparison based sorting algorithm which is not stable by nature can be modified to be stable by changing the key comparison operation.
Time Complexity
(considering to sort in ascending order)
Best | Average | Worst | |
---|---|---|---|
Selection Sort | Ω(n^2) | θ(n^2) | O(n^2) |
*no matter what scenario, selection sort need to go through the whole list of n elements to find the ith element | |||
Bubble Sort | Ω(n) | θ(n^2) | O(n^2) |
In ascending order | In descending order | ||
Insertion Sort | Ω(n) | θ(n^2) | O(n^2) |
In ascending order | In descending order | ||
Heap Sort | Ω(n log(n)) | θ(n log(n)) | O(n log(n)) |
Quick Sort | Ω(n log(n)) | θ(n log(n)) | O(n^2) |
Merge Sort | Ω(n log(n)) | θ(n log(n)) | O(n log(n)) |
Bucket Sort | Ω(n+k) | θ(n+k) | O(n^2) |
Radix Sort | Ω(nk) | θ(nk) | O(nk) |
The lower bound for Comparison based sorting algorithm (Merge Sort, Heap Sort, Quick-Sort .. etc) is Ω(nLogn), i.e., they cannot do better than nLogn.
Selection Sort
Repeatly finding the minimum element from unsorted part and putting it at the beginning. It maintains two subarrays.
Bubble Sort
Recursive Bubble Sort
Insertion Sort
Recursive Insertion Sort
Binary Insertion sort
Merge Sort
Divide and Conquer
MergeSort(arr[], l, r)
If r > l
1. Find the middle point to divide the array into two halves:
middle m = l+ (r-l)/2
2. Call mergeSort for first half:
Call mergeSort(arr, l, m)
3. Call mergeSort for second half:
Call mergeSort(arr, m+1, r)
4. Merge the two halves sorted in step 2 and 3:
Call merge(arr, l, m, r)
Iterative Merge Sort
Merge Sort for LinkedList
3-way Merge Sort
Quick Sort
Divide and Conquer
It picks an element as pivot and partitions the given array around the picked pivot. There are many different versions of quickSort that pick pivot in different ways.
- Always pick first element as pivot.
- Always pick last element as pivot.
- Pick a random element as pivot.
- Pick median as pivot.
/* low --> Starting index, high --> Ending index */
quickSort(arr[], low, high)
{
if (low < high)
{
/* pi is partitioning index, arr[pi] is now
at right place */
pi = partition(arr, low, high);
quickSort(arr, low, pi - 1); // Before pi
quickSort(arr, pi + 1, high); // After pi
}
}
Quick Sort V.S Merge Sort
Quick sort is an internal algorithm which is based on divide and conquer strategy. In this:
- The array of elements is divided into parts repeatedly until it is not possible to divide it further.
- It is also known as “partition exchange sort”.
- It uses a key element (pivot) for partitioning the elements.
- One left partition contains all those elements that are smaller than the pivot and one right partition contains all those elements which are greater than the key element.
Merge sort is an external algorithm and based on divide and conquer strategy. In this:
- The elements are split into two sub-arrays (n/2) again and again until only one element is left.
- Merge sort uses additional storage for sorting the auxiliary array.
- Merge sort uses three arrays where two are used for storing each half, and the third external one is used to store the final sorted list by merging other two and each array is then sorted recursively.
- At last, the all sub arrays are merged to make it ‘n’ element size of the array.
Partition of elements in the array :
In the merge sort, the array is parted into just 2 halves (i.e. n/2). whereas In case of quick sort, the array is parted into any ratio. There is no compulsion of dividing the array of elements into equal parts in quick sort.
Worst case complexity :
The worst case complexity of quick sort is O(n2) as there is need of lot of comparisons in the worst condition. whereas In merge sort, worst case and average case has same complexities O(n log n). Usage with datasets : Merge sort can work well on any type of data sets irrespective of its size (either large or small). whereas The quick sort cannot work well with large datasets.
Additional storage space requirement :
Merge sort is not in place because it requires additional memory space to store the auxiliary arrays. whereas The quick sort is in place as it doesn’t require any additional storage.
Efficiency :
Merge sort is more efficient and works faster than quick sort in case of larger array size or datasets. whereas Quick sort is more efficient and works faster than merge sort in case of smaller array size or datasets.
Sorting method :
The quick sort is internal sorting method where the data is sorted in main memory. whereas The merge sort is external sorting method in which the data that is to be sorted cannot be accommodated in the memory and needed auxiliary memory for sorting.
Stability :
Merge sort is stable as two elements with equal value appear in the same order in sorted output as they were in the input unsorted array. whereas Quick sort is unstable in this scenario. But it can be made stable using some changes in code.
Preferred for :
Quick sort is preferred for arrays. whereas Merge sort is preferred for linked lists.
Locality of reference :
Quicksort exhibits good cache locality and this makes quicksort faster than merge sort (in many cases like in virtual memory environment).
Basis for comparison | Quick Sort | Merge Sort |
---|---|---|
The partition of elements in the array | The splitting of a array of elements is in any ratio, not necessarily divided into half. | In the merge sort, the array is parted into just 2 halves (i.e. n/2). |
Worst case complexity | O(n2) | O(nlogn) |
Works well on | It works well on smaller array | It operates fine on any size of array |
Speed of execution | It work faster than other sorting algorithms for small data set like Selection sort etc | It has a consistent speed on any size of data |
Additional storage space requirement | Less(In-place) | More(not In-place) |
Efficiency | Inefficient for larger arrays | More efficient |
Sorting method | Internal | External |
Stability | Not Stable | Stable |
Preferred for | for Arrays | for Linked Lists |
Locality of reference | good | poor |
Iterative Quick Sort
Quick Sort on Singly LinkedList
Quick Sort on Doubly LinkedList
3-way Quick Sort (Dutch National Flag)
Heap Sort
A comparison-based sorting technique based on Binary Heap data structure. Similar to selection sort to find the minimum element and place it at the beginning.
what is Binary Heap?
A complete binary tree where items are stored in a special order such that the value in a parent node is greater than the values in its two children nodes. It can be represented by a binary tree or array.
- Array: since a binary heap is a complete binary tree, it can be represented by an array and the array-based representation is space-efficient. If parent node is stored at index i, the left child can be calculated by (2i + 1) and the right child by (2i + 2).
Counting Sort
Counting sort is a linear time sorting algorithm that sort in O(n+k) time when elements are in the range from 1 to k.
Radix Sort
The idea of Radix Sort is to do digit by digit sort starting from least significant digit to most significant digit. Radix sort uses counting sort as a subroutine to sort.
Is Radix Sort preferable to Comparison based sorting algorithms like Quick-Sort?
If we have log2n bits for every digit, the running time of Radix appears to be better than Quick Sort for a wide range of input numbers. The constant factors hidden in asymptotic notation are higher for Radix Sort and Quick-Sort uses hardware caches more effectively. Also, Radix sort uses counting sort as a subroutine and counting sort takes extra space to sort numbers.
Bucket Sort
Bucket sort is mainly useful when input is uniformly distributed over a range. For example, consider the following problem.
ounting sort can not be applied here as we use keys as index in counting sort. Here keys are floating point numbers.
The idea is to use bucket sort.
bucketSort(arr[], n)
1) Create n empty buckets (Or lists).
2) Do following for every array element arr[i].
.......a) Insert arr[i] into bucket[n*array[i]]
3) Sort individual buckets using insertion sort.
4) Concatenate all sorted buckets.
Shell Sort
Tim Sort
Comb Sort
Pigeonhole Sort
Cycle Sort
Cocktail Sort
Strand Sort
Bitonic Sort
Pancake Sorting
BogoSort / Permutation Sort
Gnome Sort
Sleep Sort - The King of Laziness/Sorting while Sleeping
Structure Sorting
Stooge Sort
Tag Sort (both sorted and original)
Tree Sort
Cartesian Tree Sorting
Odd-Even Sort / Brick Sort
(credit to GeeksforGeek)