Selection algorithm

In computer science, a selection algorithm is an algorithm for finding the kth smallest number in a list or array; such a number is called the kth order statistic. This includes the cases of finding the minimum, maximum, and median elements. There are O(n)-time (worst-case linear time) selection algorithms, and sublinear performance is possible for structured data; in the extreme, O(1) for an array of sorted data. Selection is a subproblem of more complex problems like the nearest neighbor and shortest path problems. Many selection algorithms are derived by generalizing a sorting algorithm, and conversely some sorting algorithms can be derived as repeated application of selection. In computer science, a selection algorithm is an algorithm for finding the kth smallest number in a list or array; such a number is called the kth order statistic. This includes the cases of finding the minimum, maximum, and median elements. There are O(n)-time (worst-case linear time) selection algorithms, and sublinear performance is possible for structured data; in the extreme, O(1) for an array of sorted data. Selection is a subproblem of more complex problems like the nearest neighbor and shortest path problems. Many selection algorithms are derived by generalizing a sorting algorithm, and conversely some sorting algorithms can be derived as repeated application of selection. The simplest case of a selection algorithm is finding the minimum (or maximum) element by iterating through the list, keeping track of the running minimum – the minimum so far – (or maximum) and can be seen as related to the selection sort. Conversely, the hardest case of a selection algorithm is finding the median. In fact, a specialized median-selection algorithm can be used to build a general selection algorithm, as in median of medians. The best-known selection algorithm is quickselect, which is related to quicksort; like quicksort, it has (asymptotically) optimal average performance, but poor worst-case performance, though it can be modified to give optimal worst-case performance as well. By sorting the list or array then selecting the desired element, selection can be reduced to sorting. This method is inefficient for selecting a single element, but is efficient when many selections need to be made from an array, in which case only one initial, expensive sort is needed, followed by many cheap selection operations – O(1) for an array, though selection is O(n) in a linked list, even if sorted, due to lack of random access. In general, sorting requires O(n log n) time, where n is the length of the list, although a lower bound is possible with non-comparative sorting algorithms like radix sort and counting sort. Rather than sorting the whole list or array, one can instead use partial sorting to select the k smallest or k largest elements. The kth smallest (resp., kth largest element) is then the largest (resp., smallest element) of the partially sorted list – this then takes O(1) to access in an array and O(k) to access in a list. If partial sorting is relaxed so that the k smallest elements are returned, but not in order, the factor of O(k log k) can be eliminated. An additional maximum selection (taking O(k) time) is required, but since k ≤ n {displaystyle kleq n} , this still yields asymptotic complexity of O(n). In fact, partition-based selection algorithms yield both the kth smallest element itself and the k smallest elements (with other elements not in order). This can be done in O(n) time – average complexity of quickselect, and worst-case complexity of refined partition-based selection algorithms. Conversely, given a selection algorithm, one can easily get an unordered partial sort (k smallest elements, not in order) in O(n) time by iterating through the list and recording all elements less than the kth element. If this results in fewer than k − 1 elements, any remaining elements equal the kth element. Care must be taken, due to the possibility of equality of elements: one must not include all elements less than or equal to the kth element, as elements greater than the kth element may also be equal to it. Thus unordered partial sorting (lowest k elements, but not ordered) and selection of the kth element are very similar problems. Not only do they have the same asymptotic complexity, O(n), but a solution to either one can be converted into a solution to the other by a straightforward algorithm (finding a max of k elements, or filtering elements of a list below a cutoff of the value of the kth element). A simple example of selection by partial sorting is to use the partial selection sort. The obvious linear time algorithm to find the minimum (resp. maximum) – iterating over the list and keeping track of the minimum (resp. maximum) element so far – can be seen as a partial selection sort that selects the 1 smallest element. However, many other partial sorts also reduce to this algorithm for the case k = 1, such as a partial heap sort.

Parent Topic

Child Topic

No Parent Topic