I have a goal: to understand what happened in the 1960s and 70s at Xerox PARC and in its orbit, and how it came about that several teams of engineers, working hand in hand, created incredible technologies that defined our present and whose ideas will determine our future. Why is this not happening now? (And if it is, then where?) How do you assemble a similar team? Where did we take a wrong turn? What ideas did we miss that deserve a closer look?

I present a translation of the beginning of a long text by Alan Kay (150,000 characters), which he refers to again and again in his talks and in his answers on Quora and Hacker News.

If you are ready to help with the translation, write me a private message.

I. 1960–66 - The emergence of OOP and other new ideas of the 60s

There were many incentives for the emergence of OOP, but two were especially important. One was large-scale: to find a good approach to complex systems, one that allows their internals to be hidden. The other was smaller: to find a more convenient way to allocate computing resources, or to get rid of that task altogether. As usually happens with new ideas, the various aspects of OOP took shape independently of each other.

Any new idea goes through several stages of adoption, both by its creators and by everyone else. For the creators it goes like this. First they notice that the same approach keeps appearing in different projects. Later their guesses are confirmed, but no one yet grasps the grand significance of the new model. Then a great paradigm shift occurs, and the model becomes a new way of thinking. Finally it hardens into an ossified religion built around the very thing it grew out of. Everyone else accepts the novelty according to Schopenhauer: at first it is condemned as insane; a couple of years later it is considered obvious routine; in the end, those who rejected it proclaim themselves its creators.

It was the same with me. As a U.S. Air Force programmer, I repeatedly noticed hints of OOP in programs. The first time was on a Burroughs 220 computer of the Air Training Command, in the way files were transferred from one installation to another. In those days there were no standard operating systems or file formats, so some programmer (I still don't know who) found an elegant solution: he divided each file into three parts. The data itself lay in the third part and could be of any size and format. The second part stored B220 procedures that knew how to copy the data, including individual fields, and append to the third part. And the first part was an array of relative pointers to the entry points of the procedures in the second part (the initial pointers were stored in a standard order and had agreed-upon meanings). Needless to say, the idea was excellent and was used in many later systems. It quietly disappeared when COBOL arrived.
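The three-part scheme can be sketched in modern terms. The names below are invented for illustration; the point is that the data travels together with the procedures that know how to read it, reachable through a table of entry points kept in a standard order:

```python
# A toy model of the B220 file scheme described above (all names invented).
# Part 3: raw data in any private format the writer likes.
# Part 2: procedures that know how to read and copy that format.
# Part 1: a table of pointers to the procedures' entry points, stored in a
#         standard order with agreed-upon meanings, so a reader needs no
#         knowledge of the data's actual layout.

def make_weather_file(records):
    data = records                      # part 3: arbitrary-format data

    def copy_all(dst):                  # part 2: procedures bound to the data
        dst.extend(data)

    def get_field(i, name):
        return data[i][name]

    # Part 1: entry point 0 = copy, entry point 1 = field access.
    return [copy_all, get_field]

f = make_weather_file([{"temp": 21}, {"temp": 19}])
out = []
f[0](out)                # call entry point 0: copy the data out
print(f[1](1, "temp"))   # call entry point 1: read one field -> 19
```

Any program that knows the convention for part 1 can handle the file without knowing anything about part 3, which is the "hide the device" property Kay is pointing at.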

My second encounter with the beginnings of OOP came when the command decided to replace the 220 machines with the Burroughs B5000. I did not yet have enough experience to fully appreciate its new ideas, but I immediately noticed the segmented storage system, how efficiently high-level languages were compiled and bytecode executed, the automatic mechanisms for calling subroutines and switching between processes, the clean code sharing, the protection mechanisms, and so on. I also noticed that access through the program reference table resembled the B220 file system's way of handing out procedure interfaces. Still, my main lesson at the time was not OOP ideas but the translation and analysis of high-level languages.

After the Air Force, while finishing university, I worked at the National Center for Atmospheric Research, mainly on retrieval systems for large bodies of weather data. I became interested in simulation, in how one machine can simulate another, but there was only enough time to build a one-dimensional version of bit-field block transfer (bitblt) on the CDC 6600 in order to simulate different word sizes. The rest of the time went to studies (and, to be honest, to student theater). While working in Chippewa Falls helping to debug the B6600, I came across an article by Gordon Moore in which he predicted an exponential growth in the density of integrated circuits and a fall in their cost for many years to come. The prediction was amazing, but I could hardly take it in while standing in front of the room-filling B6600 with its 10 MHz clock and freon cooling.
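A minimal sketch of what such a one-dimensional bit-field transfer enables (the function name is invented): extracting "words" of arbitrary width from a packed bit stream, which is enough to simulate a machine whose word size differs from the host's.

```python
def get_bits(buf, start, width):
    """Read `width` bits starting at bit offset `start` from a bytes buffer,
    most significant bit first."""
    value = 0
    for i in range(start, start + width):
        byte, bit = divmod(i, 8)
        value = (value << 1) | ((buf[byte] >> (7 - bit)) & 1)
    return value

# Simulate a 12-bit-word machine on top of 8-bit bytes:
mem = bytes([0xAB, 0xCD, 0xEF])          # 24 bits = two 12-bit words
words = [get_bits(mem, k * 12, 12) for k in range(2)]
print([hex(w) for w in words])           # ['0xabc', '0xdef']
```

A real bitblt would also write bit fields and move whole runs at once; this shows only the read half of the idea.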


Sketchpad and Simula Graphics

By a lucky coincidence, in the fall of 1966, suspecting nothing, I ended up in graduate school at the University of Utah. That is, I knew nothing about the Advanced Research Projects Agency (ARPA), nothing about its projects, nor that the team's main goal was to solve the hidden-line problem in three-dimensional graphics. I knew none of this until I wandered into Dave Evans's office in search of work. On his desk lay a huge stack of brown folders. He handed me one of them with the words: "Here, read this."

Every newcomer received this folder. They were all titled the same: "Sketchpad: A Man-Machine Graphical Communication System." The system was amazing and unlike anything I had encountered before. Here are three of its merits that are easiest to appreciate: 1) the invention of modern interactive computer graphics; 2) the description of entities by "master drawings," from which specific instances could then be produced; 3) graphical "constraints," which, applied to the masters, could create sets of related entities and use them to control the machine. The data structures in this system were hard to understand. The only thing that seemed even slightly familiar was the embedding of pointers to procedures and jumping through them into subroutines (so-called reverse indexing, as in the B220 file system). It was also the first window system, with clipping and zooming of the image: its virtual canvas stretched for about half a kilometer along each axis!
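The master/instance split can be rendered as a toy sketch (the class and property names are invented, and this ignores Sketchpad's constraint solver entirely): a master holds the shared description, an instance holds only what differs and defers everything else to its master.

```python
# A toy rendering of Sketchpad's master/instance idea (names invented).

class Master:
    """The shared description ("master drawing") of an entity."""
    def __init__(self, **shared):
        self.shared = shared

    def instance(self, **own):
        return Instance(self, own)

class Instance:
    """A specific occurrence: stores only its own values."""
    def __init__(self, master, own):
        self.master = master        # back-pointer to the shared description
        self.own = own

    def lookup(self, key):
        # An instance's own values shadow the master's shared ones.
        return self.own.get(key, self.master.shared.get(key))

flange = Master(sides=6, radius=10)           # the master drawing
a = flange.instance(x=0, y=0)
b = flange.instance(x=50, y=0, radius=12)     # overrides one property
print(a.lookup("radius"), b.lookup("radius"))  # 10 12
```

Editing the master changes every instance at once, which is what made the scheme so powerful for drawings full of repeated parts.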







Digesting what I had read, I sat down at my desk. On it lay a pile of tapes, a program listing, and a note: "This is the Algol for the 1108. It doesn't work. Please fix it." A new graduate student gets the assignment no one else wanted.

The documentation was simply impossible to understand. The language looked something like the Case-Western Reserve 1107 Algol, but thoroughly reworked, and was called Simula. The documentation read as if someone had written it in Norwegian and then transliterated it into English (which, by the way, it had been). Some words, such as activity and process, were used in a completely unfamiliar sense.

Together with another graduate student I unrolled the 25-meter listing in the corridor, and we crawled along it, studying the code and shouting out whenever we found something interesting. The strangest part of the code was the storage allocator, which, unlike Algol's, did not use a stack. A couple of days later we understood why: Simula allocated storage to structures very much like instances of Sketchpad objects, and there were also descriptions of a sort that created objects independent of one another. What Sketchpad called masters and instances, Simula called activities and processes. Moreover, Simula turned out to be a procedural language for controlling Sketchpad-like objects, ahead of Sketchpad with its "constraints" in flexibility (though inferior to it in elegance).

Simula surprised me and changed me forever. It was the last straw: this latest encounter with the ideas of OOP let me grasp them in their general sense, almost as a catharsis. In mathematics I had worked with abstract algebras: small sets of operations universally applicable to a great variety of structures. In biology, with cell metabolism and high-level morphogenesis, where simple mechanisms control complex processes, and the body's universal building blocks can become whatever is needed here and now. The B220 file system, the B5000 itself, Sketchpad's graphics system, and finally Simula all used the same idea for different purposes. A few days earlier Bob Barton, chief designer of the B5000 and a professor at the University of Utah, had said in one of his talks: "The basic principle of recursive design is to make the parts have the same power as the whole." For the first time I applied this idea to the whole computer and wondered why anyone would want to split it up into weaker concepts, data structures and procedures. Why not divide it into small computers, as time-sharing systems do? Not ten of them, but thousands at once, each simulating some useful structure.

Leibniz's monads came to mind, Plato with his "carving everything into kinds, into natural parts," and other attempts to curb complexity. Of course, philosophy is dispute and opinion, and engineering is concrete results; somewhere between them lies science. It is no exaggeration to say that Simula became my main source of ideas from then on. Not that I wanted to improve it: I was attracted by the prospects it opened up, an entirely new approach to structuring computations. True, it took more than a year to comprehend the new ideas and implement them effectively.

II. 1967–69 - FLEX: the first OOP-based PC

Dave Evans believed you couldn't learn much in graduate school and, like many ARPA "contractors," wanted us to work on real projects rather than spend too much time on theory; dissertations were to cover only the latest advances. Dave usually placed his students as consultants. So in early 1967 he introduced me to the friendly Ed Cheadle, a true hardware genius. Ed was then working at a local aerospace firm on what he called a "little machine." It was not the first personal computer, that was Wes Clark's LINC, but Ed wanted his machine to serve noncomputer professionals. For example, he wanted it programmed in a high-level language, like BASIC. "Maybe JOSS instead?" I suggested. "Sure," he answered. And so began our pleasant project, which we called FLEX. The deeper we got into the development, the more clearly we saw that we wanted dynamic simulation and extensibility. Neither JOSS (nor any other language I knew) was particularly suited to either. Simula was ruled out at once: our computer was far too small for it. The beauty of JOSS was its incredible attention to the user; in that respect no language has surpassed it. But for serious computation JOSS was too slow, and it lacked real procedures, variable scopes, and so on. Euler, the language created by Niklaus Wirth, looked like JOSS but had much greater potential. It was a generalization of Algol along lines first set out by van Wijngaarden: types were removed, many constructs were unified, procedures became first-class objects, and so on. Something like Lisp, but without the special tricks.

Following Euler's example, we decided to simplify Simula by applying the same methods to it. The Euler compiler was part of the language's formal definition and could easily be converted to a bytecode similar to the B5000's. That was attractive: it meant Ed's little machine could run other people's bytecodes, even if they had to be emulated with long, slow microcode. One problem: the Euler compiler was written according to the awkward rules of extended precedence grammar, which forced compromises in the syntax. For example, a comma could be used in only one role, because a precedence grammar allows no state space. At first I took the Floyd-Evans parser, which worked bottom-up (and was built on Jerry Feldman's compiler-compiler), and later switched to a top-down approach. For some attempts I used META II, Schorre's compiler-writing language. In the end the translator settled into the language's own namespace.

We expected Simula, rather than Algol or Euler, to have the greatest influence on FLEX's semantics, but it turned out otherwise. And it was still unclear how users would interact with the system. Even Ed's first machine had a screen (for graphs and the like), and the LINC had a "glass teletype." We could afford only 16 thousand 16-bit words; on such a machine one could not even dream of Sketchpad.

Doug Engelbart and his NLS

So, it was the first half of 1967. We were deep in FLEX when Doug Engelbart, a visionary of truly biblical proportions, visited the University of Utah. He was one of the progenitors of what we now call the personal computer. He brought with him a 16-mm projector with a remote control, START/STOP standing in for a cursor, since cursors were still a novelty then. The main idea he promoted at ARPA was the oNLine System (NLS), intended to "augment human intellect" through an interactive vehicle for navigating "thought vectors in concept space." Even by today's standards the capabilities of his brainchild are astonishing: hypertext, graphics, multiple working panes, efficient navigation, convenient command input, collaboration tools; a whole world of concepts and abstractions. Thanks to Engelbart, everyone who wanted to "augment their intellect" immediately understood what interactive computers should be like. Naturally, I borrowed many of his ideas for FLEX.

The man-computer symbiosis at the center of all the ARPA projects, together with Ed's little machine, reminded me again of Moore's law, and this time its meaning finally sank in. I realized that the machine that once filled a room (the TX-2, or even the 10 MHz B6600) would soon fit on a desk. The thought frightened me: it was clear that the current approach to computing could not last, and the word "computer" was acquiring a new meaning. People who had just read Copernicus's treatise and looked out at the new Heaven from the new Earth must have felt the same.

Instead of a couple of thousand institutional mainframes scattered across the planet (even now, in 1992, there are only about 4,000 IBM mainframes in the world) and a few thousand users trained for particular applications, there would be millions of personal machines and users beyond the reach of universities and organizations. Where would the applications come from? How would people learn them? How would software developers find out what a particular user needs? An extensible system suggested itself: one that people could tailor to their own needs (and even modify). Thanks to the success of time-sharing systems, many at ARPA already understood this. The grand metaphor of man-machine symbiosis overshadowed every individual project and kept any of them from turning into a religion, holding attention on the abstract Holy Grail of "augmenting human intellect."
One of the interesting features of NLS was its parameterized interface: users could specify it themselves as a "grammar of interaction" in the TreeMeta compiler-compiler. William Newman had described something similar in his "Reaction Handler": using a tablet and stylus, the end user or developer described the interface with a conventional regular-expression grammar in which each state carried procedures that performed actions (in NLS, nesting was possible thanks to context-free rules).

The idea was tempting for many reasons, especially in Newman's version, but I saw a gaping flaw in it: the grammar forced the user into a system state from which he had to escape before starting anything new. Hence all those hierarchical menus and screens where you first had to climb back to the top level in order to get anywhere else. What was needed was the ability to jump from any state to any other, which fits poorly into the theory of formal grammars. In short, a "flatter" interface suggested itself, but how to make it sufficiently rich and interesting?
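The flaw Kay describes can be made concrete with a toy interaction grammar (the states and commands are invented; this is not NLS's actual grammar): each state lists the commands legal in it, so moving between branches means climbing back through the root first.

```python
# A toy "grammar of interaction" as a state machine (all names invented).
# Each state maps legal commands to the next state; a command not listed
# in the current state simply cannot be issued there.

STATES = {
    "top":       {"draw": "draw-menu", "text": "text-menu"},
    "draw-menu": {"line": "draw-menu", "up": "top"},
    "text-menu": {"type": "text-menu", "up": "top"},
}

def run(commands, state="top"):
    path = [state]
    for cmd in commands:
        state = STATES[state][cmd]   # KeyError: command illegal in this state
        path.append(state)
    return path

# To get from drawing to typing, the user is forced through "top":
print(run(["draw", "line", "up", "text", "type"]))
```

There is no edge from "draw-menu" to "text-menu"; adding a direct jump from every state to every other is exactly what blows up such grammars and what pushed Kay toward a flatter interface.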

Let me remind you that FLEX was too small to become a mini-NLS. We had to improvise to bring in at least some of the advanced ideas, and in some cases to develop them further. I decided that a universal window onto a huge virtual world in the spirit of Sketchpad was better than cramped horizontal panels. Ed and I devised a clipping algorithm very similar to the one created at Harvard: Sutherland and his students developed it at the same time as us, as part of their "virtual reality" head-mounted display project.

In FLEX, references to entities were a generalization of B5000 descriptors. Instead of different reference formats for numbers, arrays, and procedures, FLEX descriptors contained two pointers: one to the object's "master" and one to its specific instance (we later realized it was better to keep the first pointer in the instances, to save space). Generalized assignment we approached differently. The B5000 had l-values and r-values, which sufficed in some cases, but more complex objects needed something else. Here is an example. Let a be a sparse array whose elements default to 0. Then a[55] := 0 will still create an element, because ":=" is an operator and a[55] will be dereferenced into an l-value before anyone can find out that the r-value is the default value, regardless of whether a is an array or a procedure producing one. What is needed is something like a(55, ':=', 0). With this approach the program first examines all the relevant operands and only then, if necessary, creates an element. In other words, ":=" is no longer an operator but a kind of index into the method of a complex object. It took indecently long to understand this, probably because I had to invert the traditional notion of operators, functions, and so on. Only then did I realize that behavior should be part of the object. Simply put, an object is a set of key-value pairs in which some of the values are actions. A book on logic by Rudolf Carnap helped me see that such "intensional" definitions cover the same territory as the traditional extensional technique, but are more intuitive and convenient.
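The a(55, ':=', 0) idea can be sketched in a few lines (the class is invented; FLEX itself, of course, did this at the language level, not as a library): assignment becomes a message that the object interprets, so a sparse array can decline to create a default element.

```python
# A sketch of the a(55, ':=', 0) idea from the paragraph above: ':=' is no
# longer an operator applied to an l-value, but a message that the object
# itself interprets, seeing all operands before anything is created.

class SparseArray:
    def __init__(self, default=0):
        self.default = default
        self.cells = {}            # only non-default elements are stored

    def __call__(self, index, op, value=None):
        if op == ':=':
            if value == self.default:
                self.cells.pop(index, None)   # default value: store nothing
            else:
                self.cells[index] = value
        elif op == 'value':
            return self.cells.get(index, self.default)

a = SparseArray()
a(55, ':=', 0)          # creates no element: 0 is the default
a(7, ':=', 42)          # creates exactly one element
print(a(55, 'value'), a(7, 'value'), len(a.cells))   # 0 42 1
```

With an ordinary l-value assignment, a[55] would have been materialized before the comparison with the default could happen; here the object sees the whole message first and decides for itself.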


As in Simula, a coroutine control structure was used to suspend and resume objects. Persistent objects (files, documents) were, to the machine, suspended processes, sorted by the areas of their static variables. The user could see these areas on the screen and select the one needed. Coroutines were also used to organize loops. The while statement tested generators, which returned false when they could not produce a new value. Booleans were used to link several generators. For example, for-loops were written like this:

while i <= 1 to 30 by 2 ^ j <= 2 to k by 3 do j <- j * i;

The construct ... to ... by ... here is a kind of coroutine. Many of these ideas later reappeared in Smalltalk, in stronger form.
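A loose Python rendering of the loop above (the names to_by and flex_while are invented, and the unspecified k is taken as 8 for the demonstration): each "... to ... by ..." clause acts as a coroutine-like generator, and the loop ends as soon as any of them fails to produce a value, which is what "returning false" amounted to in FLEX.

```python
def to_by(start, stop, step):
    """A FLEX-style '<start> to <stop> by <step>' value generator."""
    value = start
    while value <= stop:
        yield value
        value += step

def flex_while(*gens):
    """Draw one value from each generator per turn; stop when any runs dry."""
    while True:
        row = []
        for g in gens:
            try:
                row.append(next(g))
            except StopIteration:
                return          # a generator "returned false": end the loop
        yield tuple(row)

# j <- j * i on each turn, with k taken as 8:
results = []
for i, j in flex_while(to_by(1, 30, 2), to_by(2, 8, 3)):
    results.append(j * i)
print(results)                  # [2, 15, 40]
```

The shorter generator (2, 5, 8) exhausts first, so the loop runs three turns even though the other clause could go on to 30.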

Another interesting FLEX control construct was when. It worked on "soft interrupts" triggered by events.
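One plausible reading of such a "soft interrupt" can be sketched as follows (the API is invented; FLEX's actual mechanism is not detailed here): handlers registered on a condition run at safe points between steps of the computation, rather than preempting running code the way a hardware interrupt would.

```python
# A sketch of a FLEX-style "when" as a soft interrupt (invented API):
# conditions are polled at safe points, not asynchronously.

class EventLoop:
    def __init__(self):
        self.handlers = []

    def when(self, condition, action):
        """Register: when `condition` holds, run `action`."""
        self.handlers.append((condition, action))

    def step(self, state):
        # A "safe point": poll every condition against the current state.
        for condition, action in self.handlers:
            if condition(state):
                action(state)

events = EventLoop()
log = []
events.when(lambda s: s["temp"] > 30, lambda s: log.append("overheat"))

for t in (20, 28, 35):
    events.step({"temp": t})     # the soft interrupt fires only on the last step
print(log)                       # ['overheat']
```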

To be continued.

Thanks to Alexei Nikitin for the translation.
(If you believe this article is important and want to help with the translation, write a PM or alexey.stacenko@gmail.com)

