Selection sort
is an elementary sorting algorithm based on the following simple idea:
Find a smallest element in the array and swap it with the first one.
Repeat the previous step when considering the sub-array obtained by excluding the first element, until the sub-array consists of only one element.
A straightforward Scala implementation of the algorithm for integer arrays is given below.
defselectionSort(a:Array[Int]):Unit={valn=a.lengthvari=0// The first index of the sub-array consideredwhile(i<n-1){// Find the smallest element in the sub-array a(i..n-1)varsmallest_element=a(i)varsmallest_index=ivarj=i+1while(j<n){if(a(j)<smallest_element){smallest_element=a(j)smallest_index=j}j+=1}// Swap the elements at a(i) and a(smallest_index)a(smallest_index)=a(i)a(i)=smallest_elementi+=1}}
Example: Selection sort
The animation below shows how selection sort works on a smallish array of integers.
Observe how the example illustrates that selection sort is not stable: the relative positions of the elements 10 swap in the sorting process.
Note that only the current minimum element candidate is
stored outside the array at any point of time.
Thus the amount of extra memory required is \(\Theta(1)\) and
the algorithm works in place.
Running time analysis of the algorithm is quite straightforward.
The body of the inner while-loop takes constant time.
So do all the all instructions, excluding the inner while-loop,
in the outer while-loop.
For the outer while-loop index \(i\),
the inner loop body is executed \(n-1-i\) times.
Thus the running time is
\(\sum_{i=0}^{n-2}(c + (n-1-i)d)\),
where \(c\) and \(d\) are constants capturing the constant running times of
the outer while-loop body (excluding the inner while-loop) and the inner while-loop body.
The sum expands to
\((n-1)c + d((n-1) + (n-2) + ... + 1) = (n-1)c + d \frac{(n-1)(n-2)}{2} = \Theta(n^2)\).
The asymptotic best-case and worst-case running times are the same
as the outer and inner while-loops are always run to the end
regardless of the array contents.
The plots below shows the performance of the above Scala selection sort code
when compared to the sorted method of the Array class.
When inspecting the latter plot having logarithmic y-axis,
we can see that the gap between the lines increases when the array sizes increase.
Thus there is more than a constant time factor difference between the running times of the codes.
Based on the plots,
it is obvious that the sorted method implements
an algorithm that scales much better than selection sort.
By using generics
(see also this,
this and
this),
it is easy to build a generic selection sort implementation
that works for any comparable object type (floating point numbers, strings etc).
Using an Orderingord for the array elements,
we only have to change two lines in the code:
the method declaration and the comparison line.
defselectionSort[A](a:Array[A])(implicitord:math.Ordering[A]):Unit={valn=a.lengthvari=0while(i<n-1){// Find the smallest element in the sub-array a(i..n-1)varsmallest_element=a(i)varsmallest_index=ivarj=i+1while(j<n){if(ord.lt(a(j),smallest_element)){// CHANGEDsmallest_element=a(j)smallest_index=j}j+=1}// Swap the elements at a(i) and a(smallest_index)a(smallest_index)=a(i)a(i)=smallest_element// The sub-array a(0..i) is now sortedi+=1}}
Unfortunately, with current Scala compiler,
the generality does not come without a performance penalty.
As an example,
consider applying the generic selection sort to arrays of integers.
To apply the lt comparison method in the Ordering trait,
each integer fetched from the array is “boxed” (wrapped into an object),
causing a memory allocation and then later garbage collection.
In addition, several virtual function invocations are performed.
We can see the practical effect of this by comparing
the running times of our first selection sort working only on integer arrays
to those of the generic selection sort algorithm above.
If we use logarithmic scale on the \(y\)-axis,
we can see that the running times of the generic version are
a constant factor (of roughly ten) slower than those of the non-generic integer array version.