Again, we underestimate the complexity of the tasks.

The perceptron has been known to us for more than half a century, and we still use it in essentially the same form, only throwing ever more computing power at it.

A convolutional neural network is a mathematical implementation of the visual subsystem: edges at different angles and their hierarchical combinations into more complex images. The first layer could even be a Sobel filter - no training required - and the whole process would be simplified.
In my opinion, this should long ago have been pushed down to the hardware level of cameras and other technical systems.
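As a sketch of that idea, here is a fixed Sobel kernel used as an untrained first "layer". The image, the naive convolution, and the NumPy implementation are my own illustration, not taken from any specific camera pipeline:

```python
import numpy as np

# Fixed (untrained) Sobel kernel for vertical edges - a hard-wired first layer.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def conv2d(image, kernel):
    """Naive 'valid' 2D cross-correlation, no padding."""
    h, w = kernel.shape
    out = np.zeros((image.shape[0] - h + 1, image.shape[1] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

# A vertical edge: dark on the left, bright on the right.
img = np.zeros((5, 5))
img[:, 3:] = 1.0
edges = conv2d(img, SOBEL_X)
print(edges)  # nonzero responses only near the vertical edge
```

No learning happens here at all; the edge detector simply exists, which is exactly the point.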

On the Internet you can find plenty of information about our neurophysiology. There is a great article on Habr on exactly this topic. The author writes that not only the neuron is a nonlinear adder - the synapses are too. That alone grows the number of variables by an order of magnitude. But a neuron is not a static analog adder. A neuron, and all the more the brain, is a digital processor that constantly exchanges pulses at frequencies of up to hundreds of hertz.

The key word here is constantly.
Tens of billions of neurons and trillions of synapses are constantly rebuilding their potentials and connections for the current task. They accumulate experience. They form a Person.
And we have divided this process into at least two parts, training and inference, and then complain that both somehow make a lot of mistakes.

Constantly - not: train a CNN once and then use it statically for classification.
We also hold competitions for the best neural network architecture. But it is the best only on a particular dataset, while reality is constantly changing.
Tesla's autopilot saw a partially obscured sign - and that was it, an accident. Immediately the press hype: "Deep learning doesn't work!", "Robots are evil."
Or the even more entertaining story of the Burger King ad and Tesla.

When we encounter a new image, we usually experience fear, an adrenaline rush, and an emergency rewiring of the brain. We do not always notice this, because the changes are often insignificant. But in an unfamiliar or unexpected situation, we most likely freeze in a stupor.

Our brain does this - why shouldn't silicon do the same?
After all, there are interesting developments in spiking neural networks. Take a look at the neuron model - how is that for a perceptron?
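To get a feel for such a neuron model, here is a minimal leaky integrate-and-fire neuron - one common spiking-neuron model. The parameter values are illustrative, not taken from any particular paper:

```python
# Minimal leaky integrate-and-fire (LIF) neuron; parameters are illustrative.
def lif_run(input_current, v_rest=0.0, v_thresh=1.0, leak=0.9):
    v = v_rest
    spikes = []
    for i in input_current:
        v = leak * v + i      # leaky integration of the input
        if v >= v_thresh:     # threshold crossing -> emit a spike
            spikes.append(1)
            v = v_rest        # reset the membrane potential
        else:
            spikes.append(0)
    return spikes

# A constant input slowly charges the membrane until the neuron fires.
print(lif_run([0.4] * 10))  # [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
```

Unlike a perceptron, the output is not a number but a pulse train in time - the "constantly" from above is built into the model itself.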
We have also forgotten about SOINN - a very original algorithm that combines training and execution, and does so dynamically.
Perhaps it has a sequel somewhere in secret Japanese laboratories.

We are trying to reproduce the whole process on an architecture (silicon semiconductor circuits) that differs radically from the structure of the biological brain. We cut down the number of variables to hit realtime on a weak processor. We fit the neural network's architecture not to its biological prototype but to our technical means - and then complain that all the machine can do is classify, that there is no thinking, modeling, or creativity in it.

And maybe that is how it should be.

We "sharpened" our machines for evaluating formulas - more precisely, for the basic operations: addition, subtraction, multiplication, and division (let us leave jumps and other addressing instructions aside for now).
So let them do it. And they do it very efficiently.

Given a formula, the machine evaluates it at a speed unattainable for a human. A vivid example: spacecraft flight trajectories. It all began with devices like those, yet even they ultimately surpassed people in the speed of calculation.
There is no need to force a CPU/GPU/TPU to mimic biological neurons.

A machine can operate with mental blocks far more abstract than a neuron. We have plenty of groundwork on this topic. And we do not need to train the machine for 20 years before it "understands" abstractions at the level of "logistics", "cooking", or "quantum mechanics". The server joins the process immediately after loading the data - and data we already have in abundance.

Once upon a time, inspired by the work of Boston Dynamics, I experimented with robotics.
It is a pity I did not have the computing power I have now, but those experiments led to some very interesting ideas.
For a long time I could not train anything adequate on a standard fully connected network with a sigmoid activation. So I decided to try a function better suited to the mechanics of the model, based on a simple idea:
Rotating the joints of a robot's limbs is trigonometry.

Using cos as the activation function and limiting input and output values to [-1; 1] significantly improved control quality and learning speed, and reduced the size of the network.
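A minimal sketch of what I mean, in NumPy. The layer sizes and random weights are arbitrary; the point is the cos activation and the [-1; 1] bounds:

```python
import numpy as np

def cos_layer(x, W, b):
    """One layer with cos activation; outputs naturally stay in [-1, 1]."""
    return np.cos(W @ x + b)

def forward(x, weights, biases):
    # Clip inputs to the [-1, 1] range of normalized joint angles.
    a = np.clip(x, -1.0, 1.0)
    for W, b in zip(weights, biases):
        a = cos_layer(a, W, b)
    return a

rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 4)), rng.normal(size=(3, 8))]
biases = [np.zeros(8), np.zeros(3)]
out = forward(rng.normal(size=4), weights, biases)
print(out.min() >= -1.0 and out.max() <= 1.0)  # cos keeps every output bounded
```

In the robot-control setting the idea is that normalized joint angles map naturally through cos, so the network does not have to learn the trigonometry itself.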

Yes, cos is not the fastest function - it is computed from a lookup table or by a Taylor series expansion - but it already exists. And if such functions were built into the core of an ASIC, we would get roughly a 10-fold increase in computing speed.
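For illustration, here is the table-free variant - a truncated Taylor series. This is the generic textbook expansion, not how any specific ASIC computes it:

```python
import math

def cos_taylor(x, terms=6):
    """Approximate cos(x) by the first `terms` terms of its Taylor series."""
    # cos(x) = sum over k of (-1)^k * x^(2k) / (2k)!
    result = 0.0
    for k in range(terms):
        result += (-1) ** k * x ** (2 * k) / math.factorial(2 * k)
    return result

# Near zero, a handful of terms is already accurate to many digits.
print(abs(cos_taylor(0.5) - math.cos(0.5)) < 1e-9)  # True
```

A hardware core would of course use a fixed-point range-reduced version of this, but the operation count stays small either way.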
You do not have to search for it, derive it, or waste time training an ANN - all of that was done long ago with the help of our own "neural networks".
There are many functions far more resource-intensive than trigonometry. The algorithm is the same:
take a function from the database and connect its inputs and outputs to the necessary blocks.

These functions can be of a very high order, like those used in programming, and their parameters are object IDs of classes such as Product, Brand, Color, Address, and so on.
In other words, we pass a spike (an object code, an ID) to exactly the right place, rather than a vector of rational numbers that means who knows what.
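A toy sketch of passing IDs instead of number vectors. The Product/Address classes and the routing table are invented purely for illustration:

```python
from dataclasses import dataclass

# Hypothetical object classes; in a real system these would be database IDs.
@dataclass(frozen=True)
class Product:
    id: int

@dataclass(frozen=True)
class Address:
    id: int

def pick_warehouse(product: Product) -> Address:
    # Hypothetical routing table: product ID -> warehouse address ID.
    routing = {1: Address(10), 2: Address(20)}
    return routing[product.id]

def build_delivery(product: Product) -> tuple[Product, Address]:
    # Connect the blocks: the "spike" (object ID) goes to the right place.
    return product, pick_warehouse(product)

print(build_delivery(Product(1)))  # (Product(id=1), Address(id=10))
```

Every value that moves between blocks is a typed identifier, so a wrong connection fails loudly instead of silently producing a meaningless number.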

Returning to the neural network as a classifier: from a stream of images we obtain object classes. It is, in essence, an image encoder, to which you can attach a similar system and get memory and intuition.

There are plenty of schemes that are already "trained" and debugged. They, too, can be used as functions; the main thing is to connect them correctly. I am talking about BPMN and the like.
A BPMN diagram is far more understandable to a human than a multilayer neural network.

As a result, the Data Scientist programs... schemes - neural network architectures. Sometimes intuitively, sometimes drawing on literature and engineering experience, sometimes after AutoML selection.
"Dancing with a tambourine" - there is no other way, because deep learning works beautifully on the training dataset, and then reality puts everything in its place.

How to implement the "thinking process" with schemes at the program level is described in this article, which sparked a serious debate.
There, on a simple applied example, the process of optimally connecting functions is described - essentially training, similar to selecting weights in a neural network.
Moreover, the training (search) is conducted not over the rational numbers but in the discrete space of possible system states.

The search area shrinks significantly not only because of this, but also because each function parameter carries a list of its valid classes/values.
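A toy sketch of such a discrete search. The parameters, their domains, and the objective are invented to show the mechanics, not taken from the article:

```python
from itertools import product as cartesian

# Each parameter carries its list of valid values - the whole search space
# is the Cartesian product of these small domains.
param_domains = {
    "encoder": ["sobel", "cnn"],
    "activation": ["cos", "sigmoid"],
    "layers": [1, 2, 3],
}

def objective(config):
    # Toy objective standing in for "how well the connected functions work".
    score = config["layers"]
    if config["activation"] == "cos":
        score -= 1
    if config["encoder"] == "sobel":
        score -= 1
    return score

names = list(param_domains)
best = min(
    (dict(zip(names, combo)) for combo in cartesian(*param_domains.values())),
    key=objective,
)
print(best)  # the configuration with the lowest objective value
```

With valid values listed per parameter, the space here is only 2 x 2 x 3 = 12 configurations to enumerate, instead of a continuum of real-valued weights.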

The choice of a functional basis and of an objective function for such a system are topics for separate articles. I hope I will have enough inspiration to write them.
I sincerely count on your support.