We go deep into the treasure island with the name "Algorithm."



Here is a new article in the series "What is an algorithm?" And again a difficult task. We need to “dive deeper” into the memory structures of a living organism. Unfortunately, from the perspective proposed by the article, this direction is still poorly studied. Which makes the material given below more complicated than that presented in the previous parts, which used proximity with the methods of "genetic algorithms", automated control systems, machine learning systems with a teacher and reinforcements. Therefore, to lay a path deep into the uncharted “territories” of the processes of memory will have to be largely independently. On the one hand it is very interesting. After all, we are pioneers with you! And on the other hand there is a danger of "getting lost." To cheer up at the beginning of the road, I can only say that the path has already been completed once, tours are stacked along our road, a map of the area is drawn, and the treasure at the end of the path promises to be impressive.

But no matter how valuable the final destination, the current article is just one more step towards effective methods of working with algorithms. And the proposed solutions again require discussion. I look forward to constructive criticism in the comments. I would like to mention once again that the description of all the important arguments in support of the obtained solutions cannot be placed in one article, in which it is important to present one side of the theory in as much detail as possible. Therefore, the reinforcement of these decisions by the processes of communication, training and the formation of a natural language will be given in the following parts.

Opening the memory structure will require all the results obtained in previous articles in the series ( Part 1 "Action" , Part 2 "Execution" , Part 3 "Memory" ). Without the calculations listed there, reading further will be more difficult.

Let's get started.

Evolution of chains

The structure of memorization in the brain is "a little" more complicated than was described in the article No. 3 . To start the description, it was necessary to present this structure in a simplified manner and highlight only the most important:

In the brain, synthesized algorithms (body strategies) are stored in the form of a chain structure (sequence of signs and actions ). A large set of synthesized algorithms has a "priority turn" of use, formed and transformed based on experienced different environmental conditions (situations) and evaluation functions (benefit-harm scale for the body).

If you look at the many algorithms in the brain through the eyes of a programmer, you can see that their combination is similar to a software project in the development process. And what helps a programmer deal with an unfamiliar and confusing project without documentation? If the project has a version control system, it will be useful for him to use a history analysis.

In our "brain project software", this can also help us. You need to try to track the transitions from simpler to subsequent complex states of the "program code" and at the same time write down the decisions that are fixed by each "commit". If we recall the geneticists, then this method is akin to evolutionary genomics . But we will not analyze the structure of genes, but the structure of Memory. And, of course, one should pay attention to the fact that this analysis is rather "archaeological", no one has seen the real sequence of development, and it certainly differs from the one given below. But the order of this sequence is not so important for us. The decisions of development made with each step are important. And, it seems, the first important decision for the body was the use of "action".

But put off a little new consideration of the action of the body.Because to begin with, it is necessary to point out the existing factors that help nature to carry out the development of those body algorithms that are not tied to the structures of its Memory.

The first important factor (the "engine" of evolution) is the uneven distribution of resources and environmental conditions. For example, if there is “Food” at the location of the organism at any time, then the body will never need the “Move in search of food” algorithm, and with it the obligatory action of this algorithm - CDMY0CDMY.

The second factor in the development of biological organisms is death. I would never have thought that I would have to say a phrase of the following content, but: "Death is useful." It allows you to perform the development of algorithms in a group of related organisms.

And it’s very good that, using Memory, nature invented more efficient methods of synthesizing algorithms, so the death of a modern programmer is not required in his work. But we have to consider the early stages of the development of body algorithms. Then in an environment where food began to be distributed unevenly, organisms that did not learn how to move became extinct. And the survivors acquired CDMY1CDMY.

Stage "Origin of action"

You might think that you cannot come up with an algorithm for one CDMY2CDMY without using sensors, receptors that emit signs . But actually it needs to be done. Because in the creation of this simple algorithm, an understanding appears of how the body synthesizes more complex algorithms.

How can the body affect the action ? He may or may not use it. He can use it occasionally and often. He can execute it periodically.

For the body in a food situation, it is advisable to use CDMY3CDMY from time to time:

  1. moved to a new place;
  2. stopped moving;
  3. do not use movement for a while (instead, try to eat in a new place);
  4. continue from point number 1.

In the form of a cycle described by the sequence of actions, this algorithm could be given to the programmer, but the organism in question is not yet a programmer, and he does not yet have 10 fingers and a keyboard to "code" this program.

What can ensure the execution of a periodic process in such a simple organism? There are many options. For example, it can be a certain structure with a long accumulation of potential and with the performance of a quick discharge of the accumulated potential when a certain threshold value is exceeded. If we hook the execution of the action in the process of discharging this potential, we get the implementation of the algorithm for cyclic repetition of the action . And the time spent on the accumulation of potential determines the period of such a cycle. For the further narrative, the method of implementing the periodic process is not so important, therefore, without enumerating other options, we will take this "functional base" as a basis.

The first algorithm is assembled! It is trivial, but it can already work (to help the body survive). Its implementation is available even on a biological elemental "functional base", and for a programmer this simple task will not even cause interest. But before complicating the TK programmer, let's introduce the structure of the description of the “commit” (stage of the evolution of memory).

For each stage of memory development, you must provide the following information:

  • an example of an evolutionary factor that caused the developmental step;
  • a verbal description of the emerging possibility of the body, with which you can synthesize new behavior;
  • "functional base" of the organism required to complete the current developmental step in addition to the one already inherited from the previous steps;
  • description of the type of algorithm added to the body’s arsenal at the current developmental stage
  • graphical representation of the algorithm.

Example of an evolutionary factor Universal Description Additional Features Required Add Algorithm Type
Moving is more beneficial than being in one place From time to time, the body performs an action and it’s more beneficial for it than not to do it "Capacity building - discharge by execution" Endless loop for executing a custom interval operation

stage action

Here, of course, it is worth clarifying that the use of Memory at the current stage of the organism’s development is minimal (this is only remembering the successful time interval between the launches of the action ). And the generated algorithm does not have connected execution (in terms of article No. 2 ), which somewhat detracts its value. But an endless loop can be useful to the programmer. And it's time for us to move on and evolutionarily introduce signs into the algorithms of the organism’s behavior.

Stage "Formation of inhibition"

Let's go back to the example of "Food + Movement". In this example, the body needs "energy" to execute CDMY4CDMY (we will not be distracted by clarifying the type of energy). To get this energy, the body needs "food". Absorbing the energy stored in food, the body accumulates this energy (potential action ) and, if sufficient, performs the action . The accumulation of energy in the indicated example is caused by external to the body signs - the amount of food in the current place in the environment. This is very similar to the description of the function "Capacity accumulation - discharge by execution". But from the point of view of the organism, such a naturally formed strategy is harmful. Because in a place where there is a lot of food, the body will quickly accumulate potential, and the strategy will force you to change this place to a new one. And there may not be food in a new place.

This means that the accumulated potential for executing the action should be an internal process, and as the considered example of “running away from food” shows, even with the accumulated potential it is useful not to use CDMY5CDMY if there is a lot of food in the current place. We have found a new evolutionary factor! So you need to take a development step.

Example of an evolutionary factor Universal Description Additional Features Required Add Algorithm Type
It’s harmful to move when there is a lot of food in the current place From time to time, the body performs an action , but does not begin to perform it if there is a limiting sign "Braking", that is, blocking discharge by sign An infinite loop with a sleep waiting for the flag to be removed to complete the operation

stage suppression

Stage "Reflex formation"

And it would be all right with the body and its algorithm for alternately moving and eating food. But the environment has changed, and in the next stage, the body itself has become food for more agile predator organisms. In such an environment, it is useful to run away, that is, execute CDMY6CDMY in a situation where there is a danger (predator) nearby. In the previous sentence, we wrote down a new evolutionary factor, which means it's time to form a new type of behavior algorithm. This time, a new type of algorithm is the reflex (almost in its classic biological significance ).

An example of an evolutionary trait Universal Description Additional Features Required Add Algorithm Type
It’s more useful to move-run when there is danger than to stay in one place An organism necessarily begins to perform an action if there is a attribute "Excitation", that is, the start of discharge on the basis of An infinite loop with a sleep waiting for the flag to be set to execute the operation

stage excitation

Stage "Universalization"

According to the general scheme, which did not require any new functionality from the body, the following two types of algorithms were constructed using the combination of features according to CDMY7CDMY. The simplicity of their structure allows you to immediately go to their tabular formulas.

First, consider the universalization of several constraints.

Example of an evolutionary factor Universal Description Additional Features Required Add Algorithm Type
It is harmful to run away if there is food or an individual for mating in the current place From time to time, the body may perform an action , but does not begin to perform it if there is a limiting sign ("No. 1") or if there is a limiting sign ("No. 2") - An endless loop with a sleep wait (unflashing flag1) OR (unflashing flag2) to execute an operation

stage suppression2

The universalization of several attributes for a reflex is presented below.

Example of an evolutionary factor Universal Description Additional Features Required Add Algorithm Type
Running away from a predator and running away from fire is more beneficial than staying in one place From time to time, the body may perform a action , but it must perform it if there is a sign ("No. 1") or if there is a sign ( "No. 2") - An endless loop with a sleep wait (setting flag1) OR (setting flag2) to execute the operation

stage excitation2

At this point, the simple and classic part of the story is over. And we move on to interesting "difficulties."

Stage "Competition of strategies"

As before, the stage begins with an unpleasant gift of the environment, prepared for the "long-suffering" organism. This “gift” is the possibility of finding both “food” and “danger” in one place. The body does not develop a behavior algorithm based on the types of algorithms that are in its arsenal. So "have to" develop.

Example of an evolutionary factor Universal Description Additional Features Required Add Algorithm Type
Running away from a predator is useful even if there is a lot of food nearby From time to time, the body may perform an action , but it must perform it if there is a priority sign ("No. 1") even if there is a limiting sign ("No. 2") Feature Priority An infinite loop with a sleep wait for a positive correlation of priorities of flag1 and flag2 for execution of the operation

stage competition

Let's look at the complexity of the current stage in terms of implementation.At this stage, there is a need to memorize and compare the utility ratio for previously synthesized behavior algorithms. For the programmer, this operation is only adding weights to the condition and to the constraint, and then correcting these weights according to the results of evaluating the usefulness of executing the action while participating in the initialization of the execution of these signs (conditions and restrictions).

Moreover, the correction of weights can be a "global procedure". That is, to affect a large number of currently unused attributes. It is provided by a simple increase in the weight of one current attribute . But at the same time, all relations with the weights of other related signs change. And after changing the weight of this sign in order to maintain the previously identified priorities of signs , it is necessary to carry out the procedure for adjusting the weights of all existing signs that are related by priorities with modified. Such a global correction is a very expensive operation, and in a large number of related signs it can become unsolvable due to the presence of cyclic dependencies.

The second and more effective way of correction is to localize the combination of the initial signs (specific conditions and limitations) by forming on their basis a macro sign and synthesis of the chain -algorithm with tying action already on this macro sign . Then you can not modify the weights of the original signs , that is, you do not need to carry out a global procedure for amending other signs already related by priorities. And a new strategy with a priority of its use becomes easier to develop. Indeed, the macro attribute has its own weight!

Most likely, the combination of these two methods of correction of weights is the method of biological solution, where the first option in many situations allows you to find a combination of priorities of strategies synthesized and used by the body even with a simple organization of the memory structure. And the second option serves as a solution to complex complexity with an increase in the number of signs in memory. In any case, we move on to one of the most important process in memory - the grouping of signs.

Stage "Grouping signs"


Memory Chains in all the previous steps were simply connected. That is, one connection “ sign - action ” was memorized (that is, strengthened) and weighed (that is, evaluated by use). To parse the grouping of signs , you need to remember more links. At this stage, there is a need to unite the bonds in a chain one after another.

And the appearance of such a chain again has a culprit - this is an environment that throws up new and special situations in the body in which it is useful to perform an action (or, conversely, it is harmful to perform it). Moreover, the peculiarity of each such situation is the characteristic simultaneous presence in the environment of two (or more) signs . And each of these signs , appearing separately in the environment, is not sufficient to detect this special situation. If you go to the "programmer", then the body needs the operation CDMY8CDMY on grounds . Fortunately for nature, the biological version of this operation turned out to be not so limited as that of a programmer, and its initial synchronous version can be “spread” over time with a rather easy transformation, which provides the body with the ability to detect not only the synchronous appearance of signs , but and the appearance of a sequential (with fixed time intervals).

Example of an evolutionary factor Universal Description Additional Features Required Add Algorithm Type
It is useful to run away from a large and toothy organism (lion), but not from just a large (elephant) and not from just a toothy (rabbit) From time to time, the body may perform a action , but it must perform it if there is a sign ("No. 1") and a sign (" No. 2 ") Record in the chain of detecting a combination of signs and a way to re-detect such a combination of signs An infinite loop with a sleep wait for the simultaneous (sequential) setting of flag1 and flag2 to execute the operation

stage sign chain

Now we will describe the “complexity” of the sign operation CDMY9CDMY and the features of their solution by constructing the chain . The first and most important difficulty is that if the number of basic signs in the body is  $ n $ , then the number of macro signs that can be distinguished in a synchronous pair combination of these signs - $ C ^ 2_n=\ frac {n!} {{(n-2)! } \ cdot 2!} $ . This number is greater than $ n $(for $ 3 & lt; n $). And the total number of macro signs increases with the addition of each additional ability to detect combinations of basic signs . Features like:

  • using a combination of more than two basic attributes ;
  • determining the order of appearance of the basic signs ;
  • indication of the time interval between the appearance of the basic signs .

That is, the number of macro signs can be very large. If for each occurrence of a combination of basic signs you try to apply a action and wait for reinforcement from the environment or the teacher, then the search process promises to be so long that the life time of the body may not be enough to build the algorithm using the useful macro tag .

Fortunately for the organism in our environment, the usefulness of the macro sign can be tested without performing the action and without waiting for reinforcement from the environment or the teacher. The way to do this is to identify the Replay (which article number 3 is dedicated to). Only repeated sequence combinations ("patterns") from the basic signs are useful for constructing the algorithm ( article No. 1 ).

Therefore, in the illustration of the grouping of signs , the orange arrow line, which is a chain , is limited by its area. This area is responsible for identifying the macro trait with an assessment of repeatability. In it, sequences of basic signs are "recorded", and in comparison with the record, repeated occurrences of the same sequences are determined. Each occurrence of a repetition of a recorded sequence strengthens the record of this sequence, that is, strengthens its forming connections. And those relationships that did not pass the test, as a result, can weaken and be used to record a new combination of basic signs .

The described method of separating the formation of a macro attribute from the assessment of the completed action allows you to:

  • fight the increasing complexity of "learning" on a large set of basic attributes ;
  • get rid of the strong connection of building a macro attribute and reinforcement;
  • build a hierarchy from the areas of chains and thereby ensure the formation of macro- signs (abstractions) of a higher level with a combination of different supporting reception systems (for example, sound and light).

It is noteworthy that if you look towards machine learning technology using neural networks , then use convolutional layers has basically a similar solution that describes a way to isolate the" training "of selected signs of fragments of the previous layer from decisions made by these attributes on the next layer. Thus, it is possible to build a hierarchy of signs with minimizing the influence of reinforcement from the training mechanism for the back propagation of errors on the inner layers and focusing this function (learning by errors) on the final layer, where for each macro sign teacher assigns a "name".

If we switch from an artificial neural network that works with images to visual cortex , it is almost universally accepted that there is no mechanism for the back propagation of an error. Macro signs of the image (and movements) in the biological version learn completely "autonomously". Perhaps a biological solution for recognizing visual images - Repeat ??

"Grouping of actions" stage

And if the sign of repeatability is controversial in the biological recognition of visual images, then its use for grouping actions hardly raises doubts (we recall the example of playing the piano discussed in article No. 3 ). The basis of the method of forming an algorithm using a sequence of actions is the body repeating the required sequence of actions! But what actions are we talking about? Previously, we considered only one action in the body, and he learned to use it for the benefit of himself. This restriction was used intentionally (for the sake of simplicity of the description of the above development steps). Let's regret the body and remove this restriction.

And this time we need an evolutionary factor that required the body to group actions . This factor was the presence of situations in which the sequence of actions is more useful compared to their single execution.

Example of an evolutionary factor Universal Description Additional Features Required Add Algorithm Type
To move the body in water, it is useful to use a sequence of movement with a flagellum (tail) first one way and then the other way From time to time, the body can perform the sequence action ("No. 1") and action ("No. 2") and it is more useful than not to perform this sequence or perform actions scattered Record in a chain a combination of executable actions and a way to re-execute such a combination of actions Infinite loop for executing a sequence of operations with a custom interval

stage action chain

The scheme, of course, is beautiful. But.

  1. In what way does the body select actions to the detected new situation in the environment?
  2. Where does he store an assessment of the usefulness (harmfulness) of memorized actions?
  3. How does he synthesize the chain of the sequence of actions ?

Let's look at these "difficulties" in order.

The answer to the first question (about how to select actions) is quite simple. In this way, what biologists call game activity . In fairness, it is worth saying that not all aspects of gaming activity are distinguishable at the stage of development of the organism under consideration. Many of its important components need to be considered separately (which is planned to be done in the next article). But the basis of game activity is already visible - this is the spontaneous activation by the body of its actions (due to signs or accumulation of internal potential) and the analysis of signs and reinforcements obtained in result.

After the spontaneous execution of the action , the body has two ways to check the usefulness of this action in the situation experienced. The first way we are familiar and well-developed is reinforcement from the environment or from the teacher. But there is another way. And for its implementation, we again come to the rescue comes Repeat. This method helps the body determine that the initial situation described by signs , after exposure to completed actions, was transformed into a new situation. And the signs of the initial situation, accompanied by such actions, each time lead to the signs of the final situation. Having distinguished such a law by repetition, the “playful” organism retains the chain -algorithm without even receiving reinforcement. And again it goes into spontaneous search for new algorithms.

The second question is more complicated. But you can make a few assumptions about storing the utility score of the chain actions . The easiest solution is to store this assessment at the junction of the attribute of the initial situation and the chain , which ensures the sequence of actions of the reaction. That is, in the input indicative node of the region. The activation of this node will be inhibited if the strategy remembered by the body is harmful, and will ensure the inhibition of the subsequent actions of the chain.

And the last question is the simplest. Because we almost answered him at the stage of "Grouping signs." The chain from the sequence of actions is synthesized by the body in the "Memory" area, in which the activation sequence of the nodes responsible for the start of the execution of actions , for example, spontaneously executed by the body, is stored in Game. The only (and not key) difference from grouping into the chain of nodes of signs is the connection of the internal nodes considered at this stage with the initialization of external actions of the body.

Is the overall memory scheme complete?


I would like all of these stages to be enough to complete the analysis of the processes taking place in the brain. But so far this is not so. Nature worked for a long time and, on the basis of the already described, created a really beautiful tool for working with algorithms. In the following articles we will have to get acquainted with a large number of development steps and evolutionary factors that pushed the body to introduce:

  • hierarchies in characteristic and executive areas of memory;
  • game activity;
  • algorithms for copying chains from organism to organism;
  • language units to enforce the collective algorithms of organisms;
  • using language units for copying chains from organism to organism;
  • methods for synthesizing algorithms based on "chains" of language units.

We are on the final line of this article. The landing on Treasure Island Algorithm has occurred. It turned out longer (in the number of letters) than I would like, but we should not be thrown back into the sea by the surf. Therefore, I had to solidify on the shore more thoroughly. On the other hand, the number of letters in the article is clearly not enough to describe everything you want to tell, and some pitfalls remained unlit overboard.There is a suggestion - to build a dialogue on discussing these difficulties in a question-and-answer format in a separate Issues (Open source ( GPL) of the GitLab project) .

Yes, and completely forgot about the apple.

This constant companion of my publications flew past all the illustrations of the current article. This is unfair. Let him be held in custody by the chief merryman of the treasure island, Dr. Livesey.

stage action chain

Thank you for your attention.


I will be very grateful for the feedback, suggestions and suggestions, as they help me adjust the direction of development of my work.

I have a separate excitement in the style of narration and formatting used in the article (quotation marks, paragraphs, italics). Please write if you have comments on them. You can use a personal message.