I woke up, as usual, in the late afternoon and decided: it's time, it's high time to add a new feature to my library. And while I'm at it, to fix and speed up the check of the graph for cycles. By my morning dinner I had made the new feature, improved the code, put the graph into a convenient representation, and got to the task of finding all simple cycles in the graph. Sipping a cup of cold water, I opened Google, typed "find all simple cycles in a graph" and saw...


I didn't see much... although it was only 4 in the morning. A couple of links to algorithms (link1, link2), a lot of hints that it was time to sleep, that a graph can contain an enormous number of cycles and that in general the problem cannot be solved in reasonable time. And one more question on Habr on the topic, whose answer didn't save me either - I already knew about depth-first search.


But since I had already decided, I would solve even an NP-hard problem in P time - all the more so because I drink water, not coffee, and water can't suddenly run out. And I knew that within my particular task all the cycles could be found in a short time, without supercomputers - the orders of magnitude just aren't there.


Let's digress a little from the detective story and figure out why I need this at all.


What kind of library?


The library is called DITranquillity, it is written in Swift, and its job is dependency injection. The library handles dependency injection with flying colors, has features that other Swift libraries can't offer, and does all of this at good speed.


But why did I need to check the dependency graph for cycles?


For the sake of killer features - if a library already nails its main functionality, you start looking for ways to improve it and make it better. One such killer feature is validation of the dependency graph: a set of different checks that let you avoid problems at runtime, which saves development time. And the cycle check stands apart from all the other checks, because this operation takes considerably more time. Until recently - indecently more time.


I had known about the problem for a long time, but I understood that with the way the graph was stored it was hard to make the check fast. And since the library can already validate the dependency graph, a "Graph API" suggests itself. The "Graph API" exposes the dependency graph for external use, so that you can:


  • Analyze it in your own way. For example, at one point, for work, I automatically collected all the dependencies between our modules, which helped remove unnecessary dependencies and speed up the build.
  • Not yet, but someday - visualize this graph with Graphviz.
  • Check the graph for correctness.

Especially for the sake of the second point - to each their own, but I like looking at these terrifying pictures and seeing just how bad everything is.


[figure: dependency graph visualization]

Source data


Let's see what we have to work with:


  • MacBook Pro 2019, 2.6 GHz 6-core i7, 32 GB RAM, Xcode 11.4, Swift 5.2
  • A Swift project with 300k+ lines of code (blank lines and comments not counted)
  • Over 900 vertices
  • Over 2000 edges
  • Maximum dependency depth of up to 40
  • Nearly 7000 cycles

All measurements were made in debug mode, not release, since the graph check will only be used in debug builds.


Before that night, the check took 95 minutes.


For the impatient

After optimization, the check time dropped to 3 seconds - an acceleration of three orders of magnitude.


Step 1. Representing the graph


Back to our detective story. My graph is not the Count of Monte Cristo - it is directed.
Each vertex of the graph is a component or, more simply, information about a class. Before the Graph API, all components were stored in a dictionary, and each time I had to dig into it and also construct a key - which is not very fast. So, being of sound mind, I understood that the graph should be put into a convenient representation.


Being a person not too deep into graph theory, I could remember only one representation of a graph - the adjacency matrix. True, my mind suggested there were others, and after straining my memory a little, I recalled three options for representing a graph:


  • List of vertices and list of edges - store the vertices separately and the edges separately.
  • Adjacency matrix - for each pair of vertices, store whether there is a transition from one to the other.
  • Adjacency list - for each vertex, store the list of transitions out of it.

But before the brain had a chance to strain itself, the fingers had already done their job, and while I was wondering what time it was, the code for building the graph as an adjacency matrix had already appeared on the screen. A pity, of course, but I had to rewrite it - for my task it is far more important to find outgoing edges as quickly as possible, while incoming ones are of little interest to me. Besides, memory is not exactly infinite these days - why not save some?


After rewriting the code, it turned out something like this:


Graph:
    vertices: [Vertex]
    adjacencyList: [[Edge]]

Vertex:
    ... information about the vertex

Edge:
    toIndices: [Int]
    ... information about the edge

Where Vertex is the vertex information, and Edge is the transition information, including the indices of the vertices this edge can lead to.
Note that an edge stores not a one-to-one transition but a one-to-many one. This is done specifically so that edges stay unique, which is very important in the case of dependencies, since two transitions to two vertices and one transition to two vertices mean different things.
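To make this representation more tangible, here is a minimal Swift sketch of such an adjacency-list graph; the type and property names follow the description above but are illustrative, not the library's actual API.

Sketch (Swift)
struct Vertex {
    // whatever information describes the component (illustrative)
    let name: String
}

struct Edge {
    // one edge can lead to several vertices at once,
    // which is what keeps edges unique (see the note above)
    let toIndices: [Int]
}

struct Graph {
    let vertices: [Vertex]
    // adjacencyList[i] holds the outgoing edges of vertices[i]
    let adjacencyList: [[Edge]]
}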


Step 2. Naive depth-first search


You may remember from the beginning of the article that by this point it was already 4 in the morning, which means the only idea I had for finding all simple cycles was the one I'd found on Google. And my teacher always said: "Before optimizing, make sure it's necessary." So the first thing I wrote was an ordinary depth-first search:


Code
func findAllCycles() -> [Cycle] {
    result: [Cycle] = []
    for index in vertices {
        result += findCycles(from: index)
    }
    return result
}

func findCycles(from index: Int) -> [Cycle] {
    result: [Cycle] = []
    dfs(startIndex: index, currentIndex: index, visitedIndices: [], result: &result)
    return result
}

func dfs(startIndex: Int, currentIndex: Int,
         // visitedIndices is copied on every call
         visitedIndices: Set<Int>,
         // result is always the same one - it is a reference
         result: ref [Cycle]) {
    if currentIndex == startIndex && !visitedIndices.isEmpty {
        result.append(cycle)
        return
    }
    if visitedIndices.contains(currentIndex) {
        return
    }
    visitedIndices += [currentIndex]
    for toIndex in adjacencyList[currentIndex] {
        dfs(startIndex: startIndex, currentIndex: toIndex, visitedIndices: visitedIndices, result: &result)
    }
}

I launched this algorithm, waited 10 minutes... and, of course, went to bed - by then the sun had already appeared from behind the rooftops.


While falling asleep, I kept wondering - why so long? I already wrote about the size of the graph, but what exactly is the problem with this algorithm? Frantically recalling the debug logs, I remembered that for many vertices the number of calls to the dfs function was around a million, and for some up to 30 million. That is, on average, 900 vertices * 1,000,000 = 900,000,000 calls to the dfs function...


Where do such crazy numbers come from? If this were an ordinary forest, everything would run quickly, but we have a graph with cycles... ZZzzz.


Step 3. Naive optimization


Once again I woke up, contrary to my own rules, in the afternoon, and the first thing I did was not go to the bathroom, and certainly not eat, but check how long my algorithm had been running - oh well, only about an hour and a half... Never mind: overnight I had figured out how to optimize it!


The first implementation searches only for cycles passing through the given starting vertex, and all sub-cycles found while searching for the main ones are simply ignored. The obvious desire is to stop ignoring them - then it turns out that you can throw out of consideration all the vertices we have already passed through. This is not hard to do, although it gets a bit more complicated when your fingers keep missing the keyboard; still, about half an hour later it was ready:


Code
func findAllCycles() -> [Cycle] {
    globalVisitedIndices: Set<Int> = []
    result: [Cycle] = []
    for index in vertices {
        if globalVisitedIndices.contains(index) {
            continue
        }
        result += findCycles(from: index, globalVisitedIndices: &globalVisitedIndices)
    }
    return result
}

func findCycles(from index: Int, globalVisitedIndices: ref Set<Int>) -> [Cycle] {
    result: [Cycle] = []
    dfs(currentIndex: index, visitedIndices: [], globalVisitedIndices: &globalVisitedIndices, result: &result)
    return result
}

func dfs(currentIndex: Int,
         // visitedIndices is copied on every call
         visitedIndices: Set<Int>,
         // globalVisitedIndices is always the same one - it is a reference
         globalVisitedIndices: ref Set<Int>,
         // result is always the same one - it is a reference
         result: ref [Cycle]) {
    if visitedIndices.contains(currentIndex) {
        // if visitedIndices is ordered, cutting out a slice of it gives the cycle itself
        result.append(cycle)
        return
    }
    visitedIndices += [currentIndex]
    // remember the vertex globally so later starting vertices can be skipped
    globalVisitedIndices += [currentIndex]
    for toIndex in adjacencyList[currentIndex] {
        dfs(currentIndex: toIndex, visitedIndices: visitedIndices, globalVisitedIndices: &globalVisitedIndices, result: &result)
    }
}

Then, following the programmer's proverb "while the code is building, the programmer eats," I went off to eat... I eat fast. Returning after 10 minutes and seeing there was still no result, I got upset and started thinking about what the problem could be:


  • If there are several large separate cycles, then a lot of time is spent on each such cycle - fortunately, only once.
  • Many cycles are "duplicated" - uniqueness has to be checked when adding them, and that comparison takes time.

Intuition told me that the first variant was preferable - it really is easier to analyze, since we collect cycles for a specific vertex rather than for everything at once.


Step 4. What if we cut the leaves out of the search?


By now I had managed to wake up properly, which means it was time for paper and pencil. Looking at the numbers and drawing pictures on a sheet of paper, it became clear that there is no point in traversing leaves - and not just leaves, but entire branches that contain no cycles. Why enter them at all if we are looking for cycles? Let's cut them off. While I was thinking this, my hands had already done everything and even hit run... Wow, what an optimization - I didn't even have time to blink and it had already found everything... Hmm, but why were so few cycles found... Eh... Clearly the problem was that I hadn't poured myself a glass of water. Actually, the problem is in this line:


if visitedIndices.contains(currentIndex) { 

I had decided that in this case we had also hit a leaf, but that is not true. Let's look at this graph:


[figure: example graph with vertices A, B, C, D, E]

In this graph there is a sub-cycle B → E → C → B, which means this if will fire. Now suppose we go this way first:
A → B → E → C → B. With this traversal, C and E get marked as leaves. After that, we find the cycle A → B → D → A.
But the cycle A → C → B → D → A will be missed, because vertex C has been marked as a leaf.


If you fix this and discard only truly leaf-only branches, the number of dfs calls goes down, but not significantly.
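For illustration, one way such cycle-free branches could be found up front is to iteratively prune every vertex whose outgoing edges all lead to already-pruned vertices (or that has no outgoing edges at all). This is a hedged sketch of the idea, not the code the library actually uses, and the adjacency list is simplified to plain vertex indices:

Sketch (Swift)
// adjacencyList[i] - indices of vertices reachable from vertex i by one edge.
func cycleFreeVertices(adjacencyList: [[Int]]) -> Set<Int> {
    var pruned: Set<Int> = []
    var changed = true
    while changed {
        changed = false
        for vertex in adjacencyList.indices where !pruned.contains(vertex) {
            // A vertex whose every outgoing edge leads to an already-pruned vertex
            // (or that has no outgoing edges) cannot lie on any cycle.
            if adjacencyList[vertex].allSatisfy({ pruned.contains($0) }) {
                pruned.insert(vertex)
                changed = true
            }
        }
    }
    return pruned
}

// Usage: the vertices returned here can be skipped entirely during the cycle search.
let skippable = cycleFreeVertices(adjacencyList: [[1], [2], [0], []])
// vertex 3 has no outgoing edges, so skippable == [3]; vertices 0, 1, 2 form a cycle.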


Step 5. Making preparations for the search


Okay, there was still half a day ahead. After looking at the pictures and various debug logs, it became clear that there are situations where the dfs function is called 30 million times but only 1-2 cycles are found. This is possible in cases like this:


[figure: vertices A and B on a small cycle, with vertex C leading into a large subgraph "Big"]

Where "Big" is some large subgraph with a bunch of cycles of its own, but with no cycle through A.


And here comes the idea! For all vertices in Big, and for C, we can find out in advance that they have no transitions to A or B, which means that upon reaching C it is already clear that this vertex need not be considered, since you cannot get back to A from it.


How do we find this out? In advance, for each vertex, run either a depth-first or breadth-first search that never visits a vertex twice, and then save the set of visited vertices. In the worst case - a complete graph - this takes O(N^2) time, but on real data it is much less.


Writing this part of the article took me longer than writing the implementation:


Code
func findAllCycles() -> [Cycle] {
    reachableIndices: [Set<Int>] = findAllReachableIndices()
    result: [Cycle] = []
    for index in vertices {
        result += findCycles(from: index, reachableIndices: &reachableIndices)
    }
    return result
}

func findAllReachableIndices() -> [Set<Int>] {
    reachableIndices: [Set<Int>] = []
    for index in vertices {
        reachableIndices[index] = findAllReachableIndices(for: index)
    }
    return reachableIndices
}

func findAllReachableIndices(for startIndex: Int) -> Set<Int> {
    visited: Set<Int> = []
    stack: [Int] = [startIndex]
    while let fromIndex = stack.popFirst() {
        visited.insert(fromIndex)
        for toIndex in adjacencyList[fromIndex] {
            if !visited.contains(toIndex) {
                stack.append(toIndex)
            }
        }
    }
    return visited
}

func findCycles(from index: Int, reachableIndices: ref [Set<Int>]) -> [Cycle] {
    result: [Cycle] = []
    dfs(startIndex: index, currentIndex: index, visitedIndices: [], reachableIndices: &reachableIndices, result: &result)
    return result
}

func dfs(startIndex: Int, currentIndex: Int,
         visitedIndices: Set<Int>,
         reachableIndices: ref [Set<Int>],
         result: ref [Cycle]) {
    if currentIndex == startIndex && !visitedIndices.isEmpty {
        result.append(cycle)
        return
    }
    if visitedIndices.contains(currentIndex) {
        return
    }
    if !reachableIndices[currentIndex].contains(startIndex) {
        return
    }
    visitedIndices += [currentIndex]
    for toIndex in adjacencyList[currentIndex] {
        dfs(startIndex: startIndex, currentIndex: toIndex, visitedIndices: visitedIndices, reachableIndices: &reachableIndices, result: &result)
    }
}

Expecting the worst, I launched the new implementation and went to look out the window at the nearest tree, five meters away - they say looking into the distance is good for you. And oh joy - the code finished in 15 minutes, which is 6-7 times faster than the previous version. Rejoicing at the mini-victory and tidying up the code, I started thinking about what to do next - this result still didn't satisfy me.


Step 6. Can past results be reused?


The whole time I was writing code, and even while I slept, one question nagged at me - can the results of previous computations somehow be reused? After all, all the cycles through some vertex have already been found; surely that must mean something for the other cycles.
To understand what exactly it means, I needed three iterations, each more optimal than the previous one.
It all started with the question: "Why start the search from a new vertex if all of its outgoing edges lead to vertices that either contain no cycle, or are vertices through which all cycles have already been built?" Then the train of thought reached the point where this check can be applied recursively. That brought the time down to 5 minutes.


And only after downing a whole glass of water - a full 250 ml - did I realize that this check could be put right inside the depth-first search:


Code
func findAllCycles() -> [Cycle] {
    reachableIndices: [Set<Int>] = findAllReachableIndices()
    result: [Cycle] = []
    for index in vertices {
        result += findCycles(from: index, reachableIndices: &reachableIndices)
    }
    return result
}

func findAllReachableIndices() -> [Set<Int>] {
    reachableIndices: [Set<Int>] = []
    for index in vertices {
        reachableIndices[index] = findAllReachableIndices(for: index)
    }
    return reachableIndices
}

func findAllReachableIndices(for startIndex: Int) -> Set<Int> {
    visited: Set<Int> = []
    stack: [Int] = [startIndex]
    while let fromIndex = stack.popFirst() {
        visited.insert(fromIndex)
        for toIndex in adjacencyList[fromIndex] {
            if !visited.contains(toIndex) {
                stack.append(toIndex)
            }
        }
    }
    return visited
}

func findCycles(from index: Int, reachableIndices: ref [Set<Int>]) -> [Cycle] {
    result: [Cycle] = []
    dfs(startIndex: index, currentIndex: index, visitedIndices: [], reachableIndices: &reachableIndices, result: &result)
    return result
}

func dfs(startIndex: Int, currentIndex: Int,
         visitedIndices: Set<Int>,
         reachableIndices: ref [Set<Int>],
         result: ref [Cycle]) {
    if currentIndex == startIndex && !visitedIndices.isEmpty {
        result.append(cycle)
        return
    }
    if visitedIndices.contains(currentIndex) {
        return
    }
    if currentIndex < startIndex || !reachableIndices[currentIndex].contains(startIndex) {
        return
    }
    visitedIndices += [currentIndex]
    for toIndex in adjacencyList[currentIndex] {
        dfs(startIndex: startIndex, currentIndex: toIndex, visitedIndices: visitedIndices, reachableIndices: &reachableIndices, result: &result)
    }
}

The only change is here: currentIndex < startIndex.


Looking at this simple solution, I pressed run and was ready to step away from the computer again, when all the checks passed... 6 seconds? No, it can't be... But according to the debug logs, all the cycles were found.


Of course, on some subconscious level I understood why this worked and what I had done. Let me try to put the idea into words: if all the cycles passing through vertex A have already been found, then when searching for cycles through any other vertex there is no point in considering A - since all cycles through A are already known, no new cycle can be found through it. And because the starting vertices are processed in increasing index order, "already processed" simply means "has a smaller index", which is exactly the currentIndex < startIndex check.


This check not only speeds up the work enormously, it also completely eliminates duplicates, without any need to trim or compare them.
That, in turn, saves time on how the cycles are stored - you can either not store them at all, or keep them in a plain array instead of a set, which saves another 5-10% of the execution time.
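To make the final idea easier to play with, here is a self-contained Swift sketch that puts everything together on a plain [[Int]] adjacency list: the reachability pre-pass from step 5 plus the index-based pruning from this step. The names and the simplified edge model are mine, not the library's, and the example graph at the end is the one from step 4, with its edges reconstructed from the description.

Sketch (Swift)
struct SimpleCycleFinder {
    // adjacencyList[v] - indices of the vertices reachable from v by one edge
    let adjacencyList: [[Int]]

    func findAllCycles() -> [[Int]] {
        let reachable = findAllReachableIndices()
        var result: [[Int]] = []
        for start in adjacencyList.indices {
            findCycles(startIndex: start, currentIndex: start,
                       visitedIndices: [], path: [],
                       reachable: reachable, result: &result)
        }
        return result
    }

    // For every vertex, the set of vertices reachable from it (including itself).
    private func findAllReachableIndices() -> [Set<Int>] {
        return adjacencyList.indices.map { start in
            var visited: Set<Int> = []
            var stack = [start]
            while let from = stack.popLast() {
                guard visited.insert(from).inserted else { continue }
                stack.append(contentsOf: adjacencyList[from])
            }
            return visited
        }
    }

    private func findCycles(startIndex: Int, currentIndex: Int,
                            visitedIndices: Set<Int>, path: [Int],
                            reachable: [Set<Int>], result: inout [[Int]]) {
        if currentIndex == startIndex && !visitedIndices.isEmpty {
            result.append(path) // closed a simple cycle back to the start
            return
        }
        if visitedIndices.contains(currentIndex) { return }
        // The two prunings: vertices with a smaller index already had all of their
        // cycles found, and vertices that cannot reach the start cannot close a cycle.
        if currentIndex < startIndex || !reachable[currentIndex].contains(startIndex) { return }
        var visitedIndices = visitedIndices
        visitedIndices.insert(currentIndex)
        for next in adjacencyList[currentIndex] {
            findCycles(startIndex: startIndex, currentIndex: next,
                       visitedIndices: visitedIndices, path: path + [currentIndex],
                       reachable: reachable, result: &result)
        }
    }
}

// The example graph from step 4 (edges reconstructed from the text):
// A = 0, B = 1, C = 2, D = 3, E = 4
let finder = SimpleCycleFinder(adjacencyList: [
    [1, 2], // A -> B, A -> C
    [4, 3], // B -> E, B -> D
    [1],    // C -> B
    [0],    // D -> A
    [2],    // E -> C
])
print(finder.findAllCycles())
// [[0, 1, 3], [0, 2, 1, 3], [1, 4, 2]]
// i.e. A->B->D->A, A->C->B->D->A and B->E->C->B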


Step 7. Profiling


A result of 5-6 seconds already suited me, but I wanted it even faster - the sun was shining outside, after all! So I opened the profiler. I realized that low-level optimization is nearly impossible in Swift, but sometimes you find problems in unexpected places.
And what a surprise it was to discover that half of those 6 seconds was taken up by the library's logging... especially considering that I had turned it off. As the saying goes: "Do you see the gopher? Neither do I. But it's there." My gopher was a big one - half the field. The problem was a typical one: a string expression was always being evaluated, regardless of whether it actually needed to be written to the log.
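The usual fix for this kind of problem is to make the log message lazy, so the string is built only when the logger actually needs it. Here is a minimal sketch of the idea using Swift's @autoclosure; the logger below is hypothetical, not DITranquillity's actual logging API.

Sketch (Swift)
enum LogLevel: Int { case none = 0, error = 1, info = 2 }

struct Logger {
    var level: LogLevel = .none

    // Without @autoclosure the caller would build the message string even when
    // logging is disabled; with it, the expression is evaluated only if needed.
    func info(_ message: @autoclosure () -> String) {
        guard level.rawValue >= LogLevel.info.rawValue else { return }
        print(message())
    }
}

func expensiveGraphDescription() -> String {
    // imagine an expensive description of the dependency graph being built here
    return "vertex dump..."
}

let log = Logger(level: .none)
// The interpolation below is never evaluated, because logging is disabled.
log.info("checked component: \(expensiveGraphDescription())")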


After launching the app and seeing 3 seconds, I was ready to stop, but a hunch about the breadth-first traversal kept nagging at me. I had long known that Apple's arrays are built so that insertion at the beginning and at the end takes constant time thanks to a ring-buffer-like implementation inside (sorry, I don't remember what it is called exactly). And in Swift the array has a handy popLast() function, but there is no analogue for the first element. It's easier to just show it.


Before (Swift)
var visited: Set<Int> = []
var stack: [Int] = [startVertexIndex]
while let fromIndex = stack.first {
    stack.removeFirst()
    visited.insert(fromIndex)
    for toIndex in graph.adjacencyList[fromIndex].flatMap({ $0.toIndices }) {
        if !visited.contains(toIndex) {
            stack.append(toIndex)
        }
    }
}
return visited

After (Swift)
var visited: Set<Int> = []
var stack: [Int] = [startVertexIndex]
while let fromIndex = stack.popLast() {
    visited.insert(fromIndex)
    for toIndex in graph.adjacencyList[fromIndex].flatMap({ $0.toIndices }) {
        if !visited.contains(toIndex) {
            stack.insert(toIndex, at: 0)
        }
    }
}
return visited

The changes look insignificant, and it even seems the second version should run slower - and in many languages it would - but in Swift it turned out 5-10% faster.


Summary


So what is the bottom line? The numbers speak for themselves: it was 95 minutes, it is now 2.5-3 seconds, and new checks were even added along the way.


Three seconds still doesn't look great, but don't forget that this is on a large and rather ugly dependency graph - you won't find one like it in mobile development even if you search high and low.
And on another project, which is much closer to an average mobile project, the time dropped from 15 minutes to under a second, and that on weaker hardware.


Well, this article owes its existence to Google, which didn't want to help me, so I had to come up with everything myself - although I realize I haven't discovered America here.


All the Swift code can be found in the folder.


A bit of advertising and Plans


Whether you liked the article or not, please visit the library page and give it a star.


Every time I voice development plans, I say "soon," and it always ends up being "someday." So I won't name dates, but someday there will be:


  • Conversion of the dependency graph into Graphviz format, which in turn will let you view the graph visually.
  • Optimization of the library's core functionality - with the arrival of many new features my library, while still fast, is no longer super fast, but merely on par with other libraries.
  • Moving graph validation and problem detection to compile time instead of application startup.

P.S. If you completely disable step 5 - that is, the extra preparation before the search starts - the speed drops by a factor of 1.5, to about 4.5 seconds. So even after all the other optimizations, this step still makes sense.


P.P.S. Some facts in the article are embellished for the sake of the story. But I really do drink only plain water - no tea, coffee, or alcohol.


Source