# Lexoranges - what is it and how to use them to efficiently sort lists

Dragging items in the list is a popular feature in modern applications, the presence of which will only please users. However, when implementing such functionality, you need to try not to step on the rake of poor optimization: a large number of elements, recounting the position each time, and if there are several sections in the list, then when dragging between sections, most likely, you need to implement additional logic. How not to get on the forehead, reduce the number of calculations, and how lexoranges will help us in this - read under the cut.

### Denote the problem

So, you decided to add the ability to drag and drop elements into your application. So, you need to sort the elements somehow, otherwise there is no point in dragging and dropping. And what first comes to mind?

### Positions

The most common unremarkable positions. Those same numbers from 1 to infinity (not really). Working with them is simple and convenient, the elements are sorted without problems. At first glance, everything is fine. It’s so good that for most applications this is what you need.

*What, then, is wrong with the numerical position?*

The problem lurks in related operations. Need to embed an element between the second and third elements? We shift everything forward by one starting from the third element, without forgetting to update the data in the database. Performing such an operation once does not look difficult, but this operation will be performed quite often.

Another problematic operation is updating data on the server. Updated the task - you need to send an update of all the affected tasks to the server. The server, in turn, must send this update to everyone who is “subscribed” to the task list. The more often users change the order of tasks in the list, the more data must be sent to the server, and the more data the server must send to clients.

It turns out that when we drag one task, we will not only change the positions of a large number of elements, but also send them to the server, which will then send them to other users.

Conclusion: I want something more optimal

### Solution Options

When we faced a similar problem at the company, the first possible solution was the following algorithm: we will place some standard positions at equal intervals (steps) for all elements. So, the first element will have a position equal to 1, and the second - equal to 1000. When the user wants to drag something between these two elements, we will calculate the average position - (1000 + 1)/2=~ 500. And so on and so forth.

Why this option is bad, I think you guessed right away. We are limited in the number of steps that can be taken. Those. between 1 and 1000 - 500. Between 1 and 500 - 250. Then 125... and ultimately there will be no space left. Increasing the step does not solve this problem.

Can we use fractional numbers?

No, fractional numbers do not fix the problem, but only delay the moment of its appearance.

After a little thought and googling, we came across a report on how lexoranges are used in Jira (Lexorank, report ).

They are based on three things:

1 - strings are easy to sort alphabetically

2 - between two lines you can find the middle line (not always, and this is not so simple)

3 - can not find the average? Let's use a bucket (it sounds strange, yes)

With sorting everything is clear, we go straight to item number 2.

There are letters in the English alphabet in “a” and “c”, and between them, obviously, “b”. But how to find this “b” mathematically?

Let's just subtract the code for the letter “a” from the code for the letter “c”, we get 2 (“c”=143, “a”=141). It remains to divide the result in half. We got 1. And really, if we add one to the code “a”, we get the code of the letter “b”.

*The combination of English letters is called*

**Lexorang />**Situations when there is no place between the two lines, there is also a place to be, and I already wrote that buckets are used to solve them.

*A bucket is a mark before a rank*, it looks like this: "0 | aaa".Here 0 is the bucket number. When there is no space left, the elements are transferred from one bucket to another, and the tags are re-arranged with the preservation of order. That's all the magic!

**How We Used It**

It is not exactly said (rather, we just didn’t find it) how easy and painless it is to find the middle line between the two. So we tensed up and came up with this. Immediately plunge into an example.

Take two lines: “aa” and “cc” and find the middle between them.

After character-by-character subtraction, as above we get the number 11. But what to do next? You might think that you just need to add the result to the string “aa”. Here the line “bb” between “aa” and “ss” will be true, but the algorithm will be incorrect, and now we will see why.

Let’s think what it looks like? “Aa”, “cc”, “11”. On some number system. On what? And on the 26th! Why? Because there are 26 letters in the English alphabet. Here it is.

We must translate the result, “11”, from the 26-decimal number system into the familiar 10-decimal system.

The formula is pretty simple:

*X=y*

_{0 }+ y_{1 }* size + y_{2 }* size ^^{2 }... y_{n }* size ^^{(n-1) }*Here, for*

**size**,**"size"**of the number system (in this case size=26) is indicatedy

_{n }- the nth number on the right

Remember this formula as, for example,

*formula 1*, it will still be useful to us.

We substitute our numbers and this is what happens: 2 + 2 * 26=54. Now we know how many characters are between the lines “aa” and “ss”. But we need to take the middle between the two. Divide 54 by 2, we get 27. It remains only to correctly add our result to the codes “aa”.

How to do it? First we find out how much you need to add to the first (right) character. To do this, we get the remainder of dividing 27 by 26. It turns out 1. Add to “a” 1 - we get the letter “b”.

Now you need to think about how to find out how many characters the second character needs to be shifted.

Here the following formula will help us:

*X=Y/size ^*

^{(n-1) }% sizeUsing this formula, we can find out how much you need to add to a certain place (symbol, set with n).

Substitute everything there, we get (n=2): (27/26)% 26=1. We add. We get the final result "bb".

It’s not so difficult to implement the algorithm on any language when you know exactly how it works. Below I added an implementation of the algorithm in the Dart language (the application in which this problem occurred is written in Flutter).

**Our implementation of finding the 'middle' row**

`String getRankBetween({@required String firstRank, @required String secondRank}) { assert(firstRank.compareTo(secondRank) < 0, "First position must be lower than second. Got firstRank $firstRank and second rank $secondRank");///Make positions equal while (firstRank.length != secondRank.length) { if (firstRank.length > secondRank.length) secondRank += "a"; else firstRank += "a"; } var firstPositionCodes=[]; firstPositionCodes.addAll(firstRank.codeUnits); var secondPositionCodes=[]; secondPositionCodes.addAll(secondRank.codeUnits); var difference=0; for (int index=firstPositionCodes.length - 1; index >= 0; index--) {///Codes of the elements of positions var firstCode=firstPositionCodes[index]; var secondCode=secondPositionCodes[index];///i.e. ' a < b ' if (secondCode < firstCode) {///ALPHABET_SIZE=26 for now secondCode += ALPHABET_SIZE; secondPositionCodes[index - 1] -= 1; }///formula: x=a * size^0 + b * size^1 + c * size^2 final powRes=pow(ALPHABET_SIZE, firstRank.length - index - 1); difference += (secondCode - firstCode) * powRes; } var newElement=""; if (difference <= 1) {///add middle char from alphabet newElement=firstRank + String.fromCharCode('a'.codeUnits.first + ALPHABET_SIZE ~/2); } else { difference ~/= 2; var offset=0; for (int index=0; index < firstRank.length; index++) {///formula: x=difference/(size^place - 1) % size;///i.e. difference=110, size=10, we want place 2 (middle),///then x=100/10^(2 - 1) % 10=100/10 % 10=11 % 10=1 final diffInSymbols=difference ~/pow(ALPHABET_SIZE, index) % (ALPHABET_SIZE); var newElementCode=firstRank.codeUnitAt( secondRank.length - index - 1) + diffInSymbols + offset; offset=0;///if newElement is greater then 'z' if (newElementCode > 'z'.codeUnits.first) { offset++; newElementCode -= ALPHABET_SIZE; } newElement += String.fromCharCode(newElementCode); } newElement=newElement .split('') .reversed .join(); } return newElement; } `

### But that's not all

In any case, for us it was not all. We added this feature to the already released application, so we needed a migration. Writing migrations for SQL is not a problem, but calculating the standard ranks is no longer so simple. But, knowing how the middle line is located, it becomes not difficult to do this. The algorithm will be as follows:

- we set the beginning and the end of the gap (we have "aaa" and "zzz" respectively)
- we consider how many combinations of different characters between the lines, here
*formula 1*will help us - now we divide what happened by the maximum possible number of elements in the list
- so, we have a step, there is an initial position, it remains only to add a step to the initial position, get a rank, then add a step to this rank, get a new rank, then add a step again and so on

All the same on Dart. the

*forNumOfTasks*parameter is responsible for how many positions you will receive. If you put down positions for a list where there are only three elements now, it makes no sense to calculate positions for the entire list (by 50, 100 or some more)

**Our implementation of finding 'default' ranks**

`///modify field forNumOfTasks to get certain number of positions List‹String› getDefaultRank({int forNumOfTasks=TASK_FOR_PROJECT_LIMIT_TOTAL}) { final startPos=START_POSITION; final endPos=END_POSITION; final startCode=startPos.codeUnits.first; final endCode=endPos.codeUnits.first; final diffInOneSymb=endCode - startCode;///x=a + b * size + c * size^2 final totalDiff=diffInOneSymb + diffInOneSymb * ALPHABET_SIZE + diffInOneSymb * ALPHABET_SIZE * ALPHABET_SIZE;///'~/' – div without remainder final diffForOneItem=totalDiff ~/(TASK_FOR_PROJECT_LIMIT_TOTAL + 1);///x=difference/size^(place - 1) % size final List‹int› diffForSymbols=[ diffForOneItem % ALPHABET_SIZE, diffForOneItem ~/ALPHABET_SIZE % ALPHABET_SIZE, diffForOneItem ~/(pow(ALPHABET_SIZE, 2)) % ALPHABET_SIZE ]; List‹String› positions=[]; var lastAddedElement=startPos; for (int ind=0; ind < forNumOfTasks; ind++) { var offset=0; var newElement=""; for (int index=0; index < 3; index++) { final diffInSymbols=diffForSymbols[index]; var newElementCode=lastAddedElement.codeUnitAt(2 - index) + diffInSymbols; if (offset != 0) { newElementCode += 1; offset=0; }///'z' code is 122 if 'll be needed if (newElementCode > 'z'.codeUnitAt(0)) { offset += 1; newElementCode -= ALPHABET_SIZE; } final symbol=String.fromCharCode(newElementCode); newElement += symbol; }///reverse element cuz we are calculating from the end newElement=newElement.split('').reversed.join(); positions.add(newElement); lastAddedElement=newElement; } positions.sort(); positions.forEach((p) => print(p)); return positions; } `

Fuuuuh, are you tired? The hardest part is behind us, very little is left!

We didn't really like the idea of the buckets. Objectively - she is good.But we didn’t like the idea of having a “recovery” algorithm: the positions were over - recover with the help of buckets! So, no buckets. However, the ranks are not endless, which means you need to come up with something.

*And we came up with*

If there is no space between the lines, then we decided to simply add the middle letter of the English alphabet (“n”) to the lower border. Those. if we want to stick an element between “aa” and “ab”, we get “aa”, “aan” and “ab”. Due to the fact that the lines are sorted element-wise from left to right, line elongation will not spoil the sorting. But we now have a place for new elements, and this is without any recovery algorithms.

This piece of code can also be found in the middle row algorithm.

**A piece of code with the addition of a 'middle' character**

`if (difference <= 1) {///add middle char from alphabet newElement=firstRank + String.fromCharCode('a'.codeUnits.first + ALPHABET_SIZE ~/2); } `

### Summary

Lexoranges seemed to us an excellent indexing tool, the use of which optimizes the work with the database and server: when changing the order of tasks, only one changed task needs to be updated.

Share your opinion about lexoranges and what your thoughts are about solving such problems.

Well, for all readers of Habr, we suggest evaluating the result that we have obtained. And also pick up a useful list of “Habr’s authors code" .

Thanks for your attention!.

Source