Hello everyone. Today we continue the series of articles written specifically for the launch of the OTUS "Algorithms and Data Structures" course.


Array sorting is one of the first serious tasks studied in a classic computer science course on algorithms and data structures. Because of this, problems about writing a sort, along with related questions, often come up in interviews for trainee and junior developer positions.

Statement of the problem

Traditionally, it is worth starting the presentation of a solution with the statement of the problem. Typically, the sorting task is formulated as follows: sort an array of integers in ascending order. In fact, this is a simplification. The algorithms described in this section can be used to order an array of any objects between which an order relation is defined (that is, for any two elements we can say: the first is greater than the second, the second is greater than the first, or they are equal). You can also sort in either ascending or descending order. We will stick with the standard simplification.

Insertion Sort

Last time we talked about one of the simplest sorts, selection sort. Today we will look at a slightly more complex algorithm: insertion sort.

Description of the algorithm

Insertion sort works as follows. Just as with selection sort, the array is divided into two parts: one part is called sorted, and the other unsorted. The algorithm makes a pass over the array until the length of the sorted part equals the length of the whole array. On each iteration we take the first element of the unsorted part and do the following with it: while our element is strictly smaller than the previous one, we swap them. Then we increase the length of the sorted part by one. By successively moving the element under consideration in this way, we bring it to its proper place. An example of one iteration is shown below:
1 3 5 | 2 9 6 -> 1 3 2 5 9 6 -> 1 2 3 5 9 6 -> 1 2 3 5 | 9 6


Here is an implementation of this algorithm in C:

void insertionSortFunction(double array[], int size) {
    int i, j;
    double temp; /* temp must be double, since the array holds doubles */
    /* i represents the length of the sorted part of the array; we start
       from 1 because a single element is considered ordered by itself */
    for (i = 1; i < size; i++) {
        temp = array[i];
        for (j = i - 1; j >= 0; j--) {
            /* <= (not <) stops at equal elements, preserving their
               original order and keeping the sort stable */
            if (array[j] <= temp) {
                break;
            }
            array[j + 1] = array[j];
            array[j] = temp;
        }
    }
}
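As noted in the problem statement, the same algorithm can order any objects with an order relation, ascending or descending. Here is a hedged sketch of that generalization, parameterizing the sort by a qsort-style comparator; the `cmp_fn` type and all names below are mine, not from the original article:

```c
#include <stddef.h>

/* Order relation supplied by the caller: negative if a precedes b,
   zero if equivalent, positive if a follows b (qsort-style). */
typedef int (*cmp_fn)(int a, int b);

int cmp_asc(int a, int b)  { return (a > b) - (a < b); }
int cmp_desc(int a, int b) { return (b > a) - (b < a); }

/* Insertion sort over ints, parameterized by the order relation.
   Elements that cmp places after key are shifted one slot right. */
void insertion_sort_by(int *array, size_t size, cmp_fn cmp) {
    for (size_t i = 1; i < size; i++) {
        int key = array[i];
        size_t j = i;
        while (j > 0 && cmp(array[j - 1], key) > 0) {
            array[j] = array[j - 1];
            j--;
        }
        array[j] = key;
    }
}
```

With `cmp_asc` this reproduces the standard ascending simplification; passing `cmp_desc` makes the very same loop sort in descending order.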


Let us analyze this algorithm.

The easiest place to start the analysis is the memory complexity. Regardless of the length and structure of the input array, memory is allocated only for two loop counters and one auxiliary variable used to swap two values. Thus, the following always holds:

$ M(n) = O(1) $


The time complexity is somewhat more interesting. The body of the inner loop itself executes in O(1); it does not depend on the size of the array being sorted. This means that to understand the asymptotic behavior of the algorithm, we need to count how many times this body executes. But the number of iterations of the inner loop depends on how well ordered (or not) the elements of the input array are, so we need to look at several cases.

The minimum number of iterations is achieved when the input array is already sorted. Indeed, for each iteration of the outer for loop, exactly one iteration of the inner loop occurs. This is the so-called best case.

$ T(n) = (n - 1) \cdot O(1) = O(n) $

Thus, sorting is done in linear time.

In the worst case the number of iterations is the largest possible, that is, the break never fires. On the first iteration of the outer loop, one iteration of the inner loop is performed. On the second iteration of the outer loop, 2 iterations of the inner loop are performed. Continuing this reasoning, on the last, (n - 1)-th, iteration of the outer loop, (n - 1) iterations of the inner loop are performed. We get:

$ T(n) = O(1) + 2 \cdot O(1) + 3 \cdot O(1) + \dots + (n - 1) \cdot O(1) = O(1 + 2 + 3 + \dots + (n - 1)) = O\left(\frac{n(n - 1)}{2}\right) = O(n ^ 2) $

These calculations use the properties of O-notation and the formula for the sum of an arithmetic progression.
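The best-case and worst-case bounds above can be sanity-checked empirically by instrumenting the inner loop and counting how many times its comparison runs. The counting variant below is illustrative and its names are mine, not from the original:

```c
#include <stddef.h>

/* Insertion sort over doubles that returns the number of times the
   inner-loop comparison executes. */
long insertion_sort_count(double *array, size_t size) {
    long comparisons = 0;
    for (size_t i = 1; i < size; i++) {
        double temp = array[i];
        size_t j = i;
        while (j > 0) {
            comparisons++;              /* one inner-loop test */
            if (array[j - 1] <= temp)   /* already in place: break early */
                break;
            array[j] = array[j - 1];    /* shift the larger element right */
            j--;
        }
        array[j] = temp;
    }
    return comparisons;
}
```

On an already sorted array of length n this reports n - 1 comparisons (linear), and on a reversed array it reports n(n - 1)/2 (quadratic), matching the two derivations above.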

In the average case, it is assumed that the number of inner-loop iterations for a particular iteration of the outer loop equals its average value, that is, its mathematical expectation. Suppose all admissible numbers of inner-loop iterations are equally likely; then, if i is the iteration number of the outer loop, the average number of inner-loop iterations is approximately i/2. To obtain the asymptotics, it remains to sum these over all iterations of the outer loop, that is, to count how many times the body of the inner loop executes on average:

$ T(n) = \sum_{i=1}^{n-1} \frac{i}{2} \cdot O(1) = O\left(\frac{n(n - 1)}{4}\right) = O(n ^ 2) $

To summarize: the memory complexity of the algorithm is

$ O(1) $

its time complexity in the best case is

$ O(n) $

and in both the average and the worst case it is

$ O(n ^ 2) $

Therefore, this sort belongs to the class of quadratic sorts.

It is also important to note that insertion sort in this implementation is stable. Let me remind you that a sort is called stable if it does not change the relative order of equal elements. This property is not very important for an educational task like sorting an array of numbers, but if we were sorting more complex objects with an order relation defined on them, it could matter. We may look at such an example some other time, when we talk about radix sort.
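The stability claim can be checked with a small example that sorts records by key only; the struct, its fields, and the function name below are hypothetical, chosen just for this illustration. Each record carries a tag recording its original position, so we can observe that equal keys keep their order:

```c
#include <stddef.h>

/* A record with a sort key and a tag recording its original position. */
typedef struct { int key; int tag; } Record;

/* Stable insertion sort on Record by key: the strict > comparison
   stops as soon as the previous key is <= the current one, so records
   with equal keys are never reordered. */
void insertion_sort_records(Record *a, size_t n) {
    for (size_t i = 1; i < n; i++) {
        Record temp = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1].key > temp.key) {
            a[j] = a[j - 1];   /* shift strictly larger keys right */
            j--;
        }
        a[j] = temp;
    }
}
```

Sorting {(2,0), (1,1), (2,2), (1,3)} by key yields (1,1), (1,3), (2,0), (2,2): within each group of equal keys, the original tag order survives.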


We have examined one more quadratic sort, insertion sort, and looked at a stable implementation of it. The sort is mostly of educational interest, although in practice it can be applied thanks to its good best-case asymptotics: if you need to add new data to a sufficiently large ordered collection so that all the data remains ordered, the inner for loop may prove useful. In this way you can maintain the ordering of the data in

$ O(n) $

per insertion.
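That use case can be sketched as a standalone helper: appending one new value to an already sorted array and restoring order is exactly one pass of the inner loop, i.e. O(n). The function name and calling convention below are assumptions of mine, not from the original:

```c
#include <stddef.h>

/* Insert value into a sorted array currently holding n elements
   (the buffer must have capacity for at least n + 1); a single
   right-to-left pass of shifts, O(n) time, O(1) extra memory. */
void sorted_insert(double *array, size_t n, double value) {
    size_t j = n;
    while (j > 0 && array[j - 1] > value) {
        array[j] = array[j - 1];   /* shift larger elements right */
        j--;
    }
    array[j] = value;
}
```

For example, inserting 4 into {1, 3, 5, 7, 9} shifts 5, 7, and 9 one slot right and places 4 after 3, keeping the whole array ordered without re-sorting it.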

Learn more about the course.