Counting sort



         


Counting sort (like bucket sort) is a sorting algorithm which takes advantage of knowing the maximum value within the array to be sorted. If the maximum value is not known, an initial pass of the data will be necessary to find the largest value (this pass will be O(n)). It uses this to create an array (referenced by the values within the array to be sorted) that will serve to count the number of values of a certain number and, after this, the future positions of the values on a third array, which is the sorted permutation of the second array (i.e., A[i] <= A[j], i < j).

This algorithm is stable and it is O(n+k), where k is the maximum value within the array. In order for this algorithm to have any semblance of efficiency, k must be comparatively small. The values within the original array have to be known and non-negative. If there are some negative numbers in the array, an initial pass of O(n) is necessary to identify the most negative number, and then the absolute value of this number can be added to all numbers in the list for purposes of calculating which array bin to increment.

Note that the first array that is used in the counting sort has a length of the number of possible elements between 0 and the maximum known number. This is not a problem for small ranges of numbers, but large ranges of numbers will eat up more memory (or even hard drive space) than current systems have available. For instance, if one is ordering a list of numbers whose range is between 0 and 100, the counting sort may be the best possible sort. However, if one is trying to order a list of names alphabetically, the array would have to be of length 27n (using a numeric representation of letters, where 27 accounts for all the letters of the alphabet plus "space" to pad out names which are not of the maximum length), where n is the length of the largest name in characters. It is important to note that the length of the sorting array is independent on the number of items in the list to be sorted and dependent solely on the total possible values in the search space. Even a sort of a list of names where the maximum name length is 8 would require prohibitive amounts of RAM for array storage.

[Top]

References:






  View Live Article   This article is from Wikipedia. All text is available under the terms of the GNU Free Documentation License