An Imaginary Tour of a Biological Computer

(Why Computer Professionals and Molecular Biologists Should Start Collaborating)


Remarks of Seymour Cray to the Shannon Center for Advanced Studies, University of Virginia, May 30, 1996


Seymour R. Cray earned a Bachelor of Science degree in electrical engineering in 1950 from the University of Minnesota. In 1951 he earned a Master of Science degree in applied mathematics from that institution.


From 1950 to 1957, Mr. Cray held several positions with Engineering Research Associates (ERA), St. Paul, Minnesota. At ERA, he worked on the entire development of the ERA 1101 scientific computer for the United States government. Later, he had full design responsibility for a major portion of the ERA 1103, the first commercially successful scientific computer. While at ERA, he worked with the gamut of computer technologies, from vacuum tubes and magnetic amplifiers to transistors and other semiconductors.

Mr. Cray has spent his entire career designing large-scale computer equipment. He was one of the founders of Control Data Corporation (CDC) in 1957 and was responsible for the design of that company's most successful large-scale family of computers, the CDC 1604, 6600 and 7600 systems. He served as a director for CDC from 1957 to 1965 and was senior vice president at his departure in 1972.

In 1972, Cray founded Cray Research, Inc. to design and build the world's highest performance general-purpose supercomputers. His CRAY-1 computer established a new standard in supercomputing upon its introduction in 1976, and his CRAY-2 computer system, introduced in 1985, moved supercomputing forward yet again.

In July 1989, he started Cray Computer Corporation to continue to expand the frontiers of scientific and engineering supercomputing. He was able to incorporate gallium arsenide logic design and micro-miniature supercomputers. The CRAY-4 achieved a clock speed of one nanosecond.

Mr. Cray is the inventor of a number of technologies that have been patented by the companies for which he has worked. Among the more significant are the CRAY-1 vector register technology, the cooling technologies for most CRAY computers, the CDC 6600 freon-cooling system, a magnetic amplifier for ERA, the three-dimensional interconnected module assembly installed in the CRAY-3 and CRAY-5 supercomputers, and the first gallium arsenide logic processor design.

Table of Contents


Can Computers Think? Not Yet!

Petaflop Computing

Scaling Computers Down

Inside a Biological Computing Facility

Programming Code Needed by Living Cells

The Operating System for Living Cells

Interrupts in Living Cells

The Central Processor in Living Cells

The Role of Transfer RNA

Availability of Spare Codes for Programming Monsters

The Biological Power Supply

Control Circuits in Cells

Speculation: A Life Force Supplies Control Functions

Absence of References to the Life Force

Wave/Particle Duality and Computers

Giving Meaning to Binary Data

Are Fundamental Particles Real?

Mr. Cray's Remarks

Can Computers Think? Not Yet!

I remember about 10 years ago there was a lot of talk about artificial intelligence, writing a program that would learn. Particularly in Japan there was a lot of enthusiasm. Now that 10 years have gone by, I hear less and less about it. I'm sure there's progress. There are some signs that machines are doing things kind of close to thinking, but I don't think we can say that we have a machine that learns today.

I suspect many of you followed, as I did, the recent chess match between Garry Kasparov and the IBM machine. I found that quite interesting on several counts. First of all, machines have got better and better at playing chess, and they are beginning to approach the capabilities of good expert humans. And this machine, the IBM machine, was especially designed to do the absolute best that we thought could be done with the computer.

And so we had this chess match between the IBM machine and a world chess champion. It was for six games. They followed the rules of human chess competition. The chess clock was turned off and on for the computer, and the first game the computer won. And Kasparov was very impressed. So he sat up that evening studying how did he lose that match, what was the strategy of the computer. And what was the computer doing that night? Well, it was turned off in the corner.

So you know what happened. The computer didn't win another game! Garry Kasparov won three and tied two. So computers don't think yet. At least not chess computers.

Petaflop Computing

Not long ago I attended a workshop, and it was called enabling technologies for petaflop computing. Now, some of you may not know what a petaflop is, so let me explain that, assuming that some of you don't. Along about 1960, I remember, because I was involved, we invented the player piano sequence and made the floating point in our computer, versus bringing a subroutine to do it. And from that time on we could say how many flops does your machine do, Floating Point Operations, flops? How many flops does your computer do? And so today when we look at personal computers we say how many megaflops do they do? How many million floating point operations per second?

People that can afford big workstations can say how many gigaflops does your computer do? That's 1000 megaflops. Well, that's enough for most people. But, you know, there's always a government laboratory that wants something bigger. And so we have one today. It's the Sandia National Laboratory in Albuquerque, and they wanted a teraflop machine. And so they ordered one from Intel, and it's being delivered sort of piecemeal now, and by the middle of next year it's supposed to be all done, and it's supposed to run at a teraflop. And it has 9900 processors. It is a real monster. And, of course, all the other national laboratories are very jealous and they say, well, it costs too much, $40-some million, it won't work anyway, who needs one? But I think it's kind of nice that we have a teraflop machine because I guess we needed one. I'm not quite sure. Anyway, that's a teraflop.

Now, I think you know what a petaflop is. A petaflop is 1000 teraflops, and we're nowhere near to getting a petaflop machine. But agencies like to talk about it. So they were the sponsors of this workshop.
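As a small illustration of the unit ladder described above (mega, giga, tera, peta, each 1000 times the last), the following Python sketch expresses a raw rate in the largest unit that fits. The function name `format_flops` and the sample rate are assumptions made for this example, not anything from the talk.

```python
# Each unit is 1000 times the one below it.
UNITS = [
    ("petaflops", 10**15),
    ("teraflops", 10**12),
    ("gigaflops", 10**9),
    ("megaflops", 10**6),
]

def format_flops(rate):
    """Express a rate (floating point operations per second) in the largest unit that fits."""
    for name, scale in UNITS:
        if rate >= scale:
            return f"{rate / scale:g} {name}"
    return f"{rate:g} flops"

# 1000 teraflops is one step up the ladder.
print(format_flops(1000 * 10**12))
```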

I was the keynote speaker at this first petaflop conference. Now, they are annual. You know, once you get started you can do it every year. And so I talked about revolution. I talked about where we might go in the future to build a petaflop machine. And I talked about things like can't we use biology? And everyone smiled and said nice things, but as I listened to the other talks, everyone talked evolution. And what the group thought, and this is a group of probably 30 technical people who were all supposed to be top-notch in their various areas, was that if we just keep doing what we're doing, in 20 years we'll have a petaflop. And they had documentation to prove it. They had a straight line on semilog paper.

Now, you know how that works. I mean, anything is a straight line on semilog paper. And so what they'd done is they put two points on the chart for the last 10 years of progress in computers, and they just extended it for 24 years, and sure enough it came to a petaflop.
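The two-point extrapolation can be sketched in a few lines of Python. A straight line on semilog paper is exponential growth, so the line is extended in log space. The two data points below are invented for illustration, chosen only so that the line reaches a petaflop after 24 years; they are not the points on the workshop chart.

```python
import math

def extrapolate_semilog(x0, y0, x1, y1, x):
    """Extend the straight line through two points on semilog paper to position x.

    On a log-scaled y axis a straight line means log10(y) changes by a
    constant amount per unit of x, i.e. exponential growth.
    """
    slope = (math.log10(y1) - math.log10(y0)) / (x1 - x0)
    return 10 ** (math.log10(y1) + slope * (x - x1))

# Assumed points: 10 years ago (x = -10) and today (x = 0, one teraflop).
ten_years_ago = 10 ** 10.75   # hypothetical rate, about 56 gigaflops
today = 1e12                  # one teraflop

# Extend the line 24 years into the future.
print(extrapolate_semilog(-10, ten_years_ago, 0, today, 24))
```

With only two points the fit is always perfect, which is the weakness being pointed out: the line says nothing about whether the trend can continue.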

Scaling Computers Down

Well, I got to thinking about what that might mean. How did we make progress in the last 10 years? We made machines faster, and we made them smaller, and if we keep doing that for 24 years, what size is it? Well, now we're building half-micron circuit technology. We're soon going to be building quarter. Perhaps some people are now. And I've talked to people that are doing research, and they are talking about .15 micron technology. Well, if we extrapolate for 20 more years it's going to be really tiny.

Well, how tiny? How big is the molecule? Inorganic molecules are like a nanometer, but biological molecules are tens of nanometers. Well, .1 micron is only 100 nanometers. So we don't have far to go until we get down to the dimensions of biological molecules. Let's suppose that this chart is right, and in 20 years we'll build silicon that has details of that dimension. I think that we're going to find that we're coming up against a couple of basic physical things, like the uncertainty principle, that those things will be small enough so they won't behave the way macro things do, and I think we'll be coming face to face with the life force, which I view as a factor here. So I want to talk about those things.

Inside a Biological Computing Facility

What I think would be real interesting today is if we take a tour of a biological computing facility. Now, you have to use a little imagination on this tour. I'll be the tour guide. I want you all to imagine that you are computer engineers, and my job as a tour guide is to translate for you the biological names that we're viewing so you will understand them as computer engineers.

Now, there's another thing. You have to imagine yourself as being quite small, like, you know, maybe 1 micron tall, because biological things are really tiny. So if you're following me, I want to look inside a biological cell and try to identify those computing things which we can relate to our computers today with the name translations. Let's start with an overview. And let's take a human cell, because that's what we're studying most these days. Specifically, we're going to look at a human cell from the standpoint of how does it compute.

For the overview, when we look in the cell, the first thing we see is a big DRAM memory in the nucleus. It's called DNA. Then we look around the cell, and we see there are several thousand microprocessors. They are called ribosomes. And if we look further at how they work, they all share a common memory and they have two levels of cache. Now, you may not believe all this, but wait till we get into the details.

Let's look first at the big DRAM memory. Well, it's packaged in 48 bags. These are called chromosomes. Now, as we look at those we are a little puzzled because there are some little ones and some big ones and some middle-sized ones, and how did that happen?

Well, when you think about it, this computing facility started with a very small memory, and it's been upgraded a number of times, and you know when you go to the store you'd like to get the biggest DRAM parts, but you have to go with what's available. And that's what happened with the biological system. It had to go with what was available at the time it was upgraded.

If we look further into the big DRAM memory, we see that probably the packaging isn't important. Forty-eight banks probably aren't significant. We can view the whole memory as one string of bits, a one-dimensional memory. And biologists, I think, agree with that today. And so how big is it? Well, it's six gigabytes. Now, that's very big compared to a personal computer memory today. That's big compared to even most workstations today. So this is a really big DRAM memory.
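As a rough check on the size of this memory, the arithmetic can be sketched in Python. The base pair count is an assumption (roughly 3 billion for one copy of the human genome, a figure not given in the talk), and two bits per base pair follows the encoding described later in the talk. Under these assumptions the count comes to about six billion bits, so the figure of six fits gigabits more closely than gigabytes.

```python
BITS_PER_BASE_PAIR = 2               # four possible bases fit in two bits
HAPLOID_BASE_PAIRS = 3_000_000_000   # assumption: approximate, not from the talk

def genome_bits(base_pairs, bits_per_pair=BITS_PER_BASE_PAIR):
    """Number of bits needed to store a sequence of base pairs."""
    return base_pairs * bits_per_pair

def bits_to_gigabytes(bits):
    """Convert bits to gigabytes (8 bits per byte, 10**9 bytes per gigabyte)."""
    return bits / 8 / 1e9

bits = genome_bits(HAPLOID_BASE_PAIRS)
print(bits / 1e9, "gigabits")
print(bits_to_gigabytes(bits), "gigabytes")
```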

Programming Code Needed by Living Cells

The next question is how much of it is program code. You know how embarrassed we are about our program code now. It gets bigger and bigger. For the DNA in a human cell, about 10 percent of the memory is program code. What's the other 90 percent? Biologists tell us it's mostly noise. But if you look close, it looks like old program code that doesn't run anymore. And that's probably what it is.

Let's look at the part that's program code. It's organized into a lot of subroutines -- 150,000 subroutines. Now, that's a lot of subroutines for any program. We call those genes. And we have this great big project worldwide now, the human genome project, to reverse-engineer this thing, and to identify how each subroutine works in the program. And more than that, the end result of this human genome project is to identify every bit in every sequence so we know exactly the code in each subroutine. Now, that is a monster undertaking. We've been working on it as a human group for 10 years, and I think we can estimate about 20 years to finish. They are at somewhere between 15 and 20 percent through identifying function -- function, not bit sequences yet. So we have this big DRAM memory, 150,000 subroutines, and we are working on decoding it and figuring out what each one does.

The Operating System for Living Cells

How much of the program code is operating system? Well, that's another embarrassing thing we have with our computers. The operating system is always too big. So how big is it in the biological system? The answer is a little over 50 percent. That's kind of embarrassing. If we look at this and ask how did that get so big, well, you know, every system you add more extensions to the extension folder, and they keep piling up, and they never get smaller. When you think how long this biological system has been upgrading its operating system and how long it might take, then, to initialize, you know, the more extensions you have in the extension folder, the longer you have to wait while the screen goes through this long sequence. When you think about how big this is, you probably won't be surprised to hear it takes 13 years to initialize the operating system, but if all goes well, you get a smiley face, and it's just like a Macintosh.

Interrupts in Living Cells

After that, the system is interrupt-driven, and I want to talk about the interrupt-driven part of it for a moment, because I've been reading everything I can read. It's such a rapidly moving field because of the number of people working on the human genome project. But we have recently identified pretty much an entire interrupt sequence. Let's just kind of walk through this and see how it works, because I find this really fascinating.

The interrupt happens when a message comes in from outside the cell and says there is a virus loose, and this is the way it looks. Now, what we know is there is a single subroutine activated by that signal. And that subroutine calls another subroutine, and that one calls another subroutine, and we go through a long interrupt sequence involving hundreds of subroutines, and each one takes exactly the same amount of time, and the sequence is exactly the same no matter what the input signal was. In other words, it is very much like a computer interrupt routine.

And for the case that I'm talking about where a virus is identified and the cell goes through its sequence to make an antibody, it takes two weeks, and that's why it takes two weeks to cure a common cold. We've got to go through this sequence no matter what the input was.
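The fixed interrupt sequence described above can be sketched as a chain of calls in Python. This is only an analogy in code: the subroutine names, the `STEP_TIME` unit, and the function `run_chain` are all hypothetical. The point it illustrates is that the path and the elapsed time do not depend on the incoming signal.

```python
STEP_TIME = 1  # assumed unit; the talk only says every subroutine takes the same time

def run_chain(chain, signal, index=0, trace=None):
    """Call subroutine `index`, which calls the next one, until the chain ends.

    Returns (trace, elapsed). The signal is passed along but never
    changes which subroutines run or how long they take.
    """
    if trace is None:
        trace = []
    if index == len(chain):
        return trace, 0
    trace.append(chain[index])
    _, rest = run_chain(chain, signal, index + 1, trace)
    return trace, STEP_TIME + rest

# Hypothetical subroutine names standing in for the hundreds in the real sequence.
CHAIN = ["receive_signal", "call_helper", "build_antibody"]

print(run_chain(CHAIN, "virus A"))
print(run_chain(CHAIN, "virus B"))
```

Both calls print the same trace and the same elapsed time, which is the behavior the talk compares to a computer interrupt routine.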

The Central Processor in Living Cells

So let me leave the big DRAM memory now and look at some of the other parts. Let's look at the microprocessor, the ribosome, several thousand of them scattered around the cell, they are all built alike, and they all have two levels of cache memory. Let's look first at the cache memories, L1 and L2, as we computer people talk. Let's look at L2.

This is called messenger RNA in the biological lingo. Messenger RNA copies an entire subroutine out of the big DRAM memory and moves it close to the processor for fast access. How big is this? It's tens of thousands of bits long. That's comfortable for us. That sounds like an L2 memory. How about the L1 memory? Well, it takes small pieces out of the L2 memory and moves it into the microprocessor for translation of instructions. This is called transfer RNA.

The Role of Transfer RNA

Well, what size pieces does it take? It's pretty interesting because this reminds me of the old days. The biologist says first of all that the binary data in the big DRAM is paired, called base pairs. Base pairs, two bits. And for this purpose, three base pairs in a row is treated as an entity, a base pair triplet. In other words, six bits. And that's what's taken, one six-bit field at a time, out of the L2 memory and moved into the microprocessor.

I can remember when we built computers with six-bit codes. That was before the ASCII committee. Apparently the biological system never got the word. And so they're still using six bits in our biological system.
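The six-bit field can be made concrete with a short Python sketch. The particular assignment of two-bit values to the letters A, C, G and T is arbitrary and assumed here for illustration; any one-to-one assignment would do. The helper names are also hypothetical.

```python
# Assumed two-bit code for each base (the assignment itself is arbitrary).
BASE_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def pack_triplet(triplet):
    """Pack three bases into one six-bit integer, first base in the high bits."""
    code = 0
    for base in triplet:
        code = (code << 2) | BASE_BITS[base]
    return code

def split_into_triplets(sequence):
    """Cut a sequence into consecutive three-base fields, dropping any leftover bases."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]

for triplet in split_into_triplets("ATGTTTAAA"):
    print(triplet, format(pack_triplet(triplet), "06b"))
```

Three two-bit bases give 4 * 4 * 4 = 64 possible six-bit codes, from 000000 to 111111.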

The six-bit codes get translated in sequence to choose amino acids to make a protein molecule. One by one, they get assembled, the whole subroutine gets read, and we generate a protein molecule, and we send it off to do whatever it's going to do, and we run the subroutine again and we make another one. And we keep doing this until we have all the protein molecules we want, and then we reprogram the microprocessor with a different subroutine. So that sounds pretty familiar.
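The read-and-assemble loop can be sketched in Python as follows. The lookup table is a small excerpt of the standard codon assignments, written in messenger RNA letters (U in place of T); it is deliberately incomplete, and the function names and the sample message are assumptions for this example.

```python
# Partial lookup table: a few standard codon-to-amino-acid assignments.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "UGG": "Trp",
}

def translate(message):
    """Read the message three letters at a time and assemble the amino acid chain.

    Raises KeyError for a triplet missing from the partial table above.
    """
    protein = []
    for i in range(0, len(message) - 2, 3):
        protein.append(CODON_TABLE[message[i:i + 3]])
    return protein

def make_proteins(message, count):
    """Run the same subroutine `count` times, producing one protein per run."""
    return [translate(message) for _ in range(count)]

print(translate("AUGUUUGGCAAA"))
```

Calling `make_proteins` with the same message corresponds to running the subroutine again to make another molecule; loading a different message corresponds to reprogramming the microprocessor.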

Availability of Spare Codes for Programming Monsters

Now, you all know in our computers that we don't use all the codes. There are some spare ones. And it turns out to be true here, too. Actually, the way God designed it, he only used 20 of the 64 codes in the six bits. Well, wouldn't you know, biologists are already tinkering, and they are trying the unused codes. They are putting artificial codes into the DNA and seeing what happens, and sure enough, some of them do weird things. And we can make weird-looking protein molecules.

And so today we say there are 20 natural amino acids, and there are some others that are unnatural. And there's potential for monsters coming out of this one.

Well, that, I think, describes the microprocessor. It's interesting to see the current effort to identify the control section and the code generation section. As I read it lately, the translator is rather small and the code generation is rather big. But never mind. The basic function seems to be one of reading the code out of the DNA and generating protein molecules.

The Biological Power Supply

What else do we have, as we look around this biological computing system? Something that often gets left till last is the power supply. Well, this cell has a power supply, too, and it's called a mitochondrion. And this is pretty interesting. It takes big molecules from outside the cell, breaks them down into a string of little ATP molecules, and ships them around all through the cell to power the big DRAM memory and to power the microprocessors. This sounds a lot like converting high voltage AC to low voltage DC, to me.

Now, you know how design of power supplies always lags. If we look at this power supply, we find that the design is really ancient. It apparently dates back to when we crawled out of the ocean. And it has been passed down from motherboard to daughterboard unchanged through eons.

Control Circuits in Cells

What else have we got? We've got the big DRAM memory, we've got the microprocessors, we've got the power supply. There is one more very important thing: control circuits. When we build computers, control circuits are one of the big problems. We have to try to figure out every possible thing that will happen in the system and put in special hardware for control for every one of them. We look in the biological system for the control circuits, and there aren't any. How can that be?

Now I have to speculate, and I don't mind doing that. You realize, of course, that speculation in science can be career limiting at your age. But I'm old enough so I'm no longer concerned about career-limiting speculation, so I can do it. You probably concluded the same thing in the privacy of your own home, but you were afraid to say anything in public. Well, I want you to feel free today -- I mean, we do have a recorder here, but your voice won't be recognized, so if you want to speak out, you can.

Speculation: A Life Force Supplies Control Functions

Here I go now. I believe so far what I've said is relatively accurate, as we view biology today, but now I want to speculate. How does a biological system run without any control circuits for the big DRAM memory or the microprocessors or the power supply? And my conclusion is there is a life force that micromanages every molecule. Now, you probably thought that, too, but you were afraid to say so. That's the only explanation I can find.

Let me give you an example of why I think this must be true, because I can find no other answer. You know, question and answer time is coming, and you can tell me where I missed the boat here, but we recently discovered a single protein molecule which does the error correction code in the big DRAM memory. We're pretty familiar with that requirement. And, of course, there is the same requirement in DNA. It's a great big memory, it's continually being reproduced, it's got errors in it, somebody's got to correct it. We've identified this molecule, and we call it a base-flipper. Let me tell you what this does, because I find this pretty fascinating. Now, this is a single molecule.

It walks down the strands of DNA looking for a base pair that's wrong. When it finds one, it grabs the structure of the DNA, bends it sharply, and pops out the base that's wrong, puts it in the pocket in the molecule, makes sure it's wrong, and if it's wrong it puts the right one in, and then straightens out the strand again. Now, I find that pretty incredible, but I just read that recently, and it was a government report so you know it's got to be true.
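In software terms, the walk-check-replace behavior described above resembles a simple scan for mismatched pairs. The Python sketch below is only an analogy: it assumes one strand (called `template` here) is trusted and repairs the other against it, which is a simplification, and the function name `repair_strand` is hypothetical. The pairing rule (A with T, C with G) is standard.

```python
# Standard base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def repair_strand(template, copy):
    """Walk both strands in step and fix any base in `copy` that does not pair.

    Assumes `template` is correct. Returns the repaired copy and the
    positions that were changed.
    """
    repaired = []
    fixed_positions = []
    for position, (t_base, c_base) in enumerate(zip(template, copy)):
        expected = COMPLEMENT[t_base]
        if c_base != expected:
            # Pop out the wrong base and put the right one in.
            fixed_positions.append(position)
            c_base = expected
        repaired.append(c_base)
    return "".join(repaired), fixed_positions

print(repair_strand("ATGC", "TACC"))
```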

Well, how can this one molecule do this complicated job? Apparently it has arms and legs and a good-sized brain. And we know from physics that this isn't true. So I say it's being micromanaged by life force.

Absence of References to the Life Force

Well, why didn't I read about this in the textbooks? Now, I open a physics book, and it talks about all the forces of nature, strong force, weak force, electrostatic force, gravity, and it acts like everything is covered, but it doesn't mention the life force. Well, maybe the author thought that would be covered in the biology book. So I open a biology book, and starting right off with chapter 1 the author assumes you know all about the life force, and he starts talking about all the details of what the molecules do and how they interact and the life force is assumed. So I think we've got a real gap here.

Now, the reason is, of course, we don't know how it works, and everybody is too embarrassed to speculate in print. But I would rather see some speculation about this than to see nothing at all. And so that's why I'm here talking.

So how does a life force work at the molecular level? I would like to know the answer to that question. I'm sure you would, too, and I don't think we're making much progress with that.

Wave/Particle Duality and Computers

I want to talk a little bit about something I recently read. It's not completely off the track here. This has got some relationship with computers. This was a recent experiment, and if you'll excuse me, it asks the question of what does God think about computers. Now, you might feel what do I know about that. Well, every once in a while you get a little hint. God gives us a little hint. So I have to talk about this experiment. It's about a year old now, and it has to do with what God thinks about computers.

Now, this is called a wave-particle duality experiment. Now, I'm sure you all know about wave particle duality, but maybe some of you haven't thought about it lately. And so let me give you a very brief history of this wave-particle duality experiment which began in the 1920's and is still going on, and I've got this recent experiment to report when I finish my history.

Back in the 1920's there were two groups of physicists. One group believed that photons, for example, any basic particle could do the same thing. But let me talk about photons. This group believed that photons were waves, and they did experiments to prove that photons were waves. There was another group of physicists who thought photons were particles, and they did a group of experiments and they showed that photons were particles. The interesting thing was they were both doing exactly the same experiment. They were putting two slits close together in a metal plate and they put a target on the other side, fired a photon at it, and if it went through both slits and made a fringing pattern it was clearly a wave. If it went through just one slit and went splat at the target it was a particle.

Well, the physicists that looked for waves saw waves, and the physicists that looked for particles saw particles. All the time. And so this was a real problem. And so Heisenberg came along with this uncertainty principle and said humans aren't supposed to know everything. And one of the things you're not supposed to know is about elementary particles and where they are and how they behave.

Well, that makes the quantum theory that we have today, but wave-particle duality sort of sticks out like a wart on that quantum theory, because there is this one unique thing. You could say God had a bad day back here, and he couldn't decide between the two groups, and he said yes to both. Now, that's one possibility. There's another one, too. I'll come to it later.

But today what do we say, because we've done this experiment over and over through decades. Now we say the photon is undefined until observed. Well, what kind of talk is that?

It turns out what you look for is what you see, and it isn't half and half. It isn't both at once. These are exclusive. If you see a wave there is no particle. If you see a particle there is no wave. We've proved that over and over and over.

Okay. Now you're ready for today's experiment. Since the earlier experiments all showed that the observer determined which it was, we build an experiment with no observer. We put a computer in instead. And so we made a wave-particle duality experiment, a computer looked both for waves and particles, and put the data in a computer, a file for each, and we did the experiment again and it made another file for each, and we did it again and we made a long list of files.

Long after the experiment and no human has looked, a person, a human, goes up to the computer console and looks in the memory. And if he looks first for the wave results in the file he sees waves. If he then looks for particles, he sees none. If he first looks at the next experiment for particles he sees particles, and if he looks for waves he sees none. In other words, the computer was transparent to the experiment, and God doesn't think computers are observers. I think that's the conclusion.

Now, maybe if we make better computers he will change his mind. But right now, computers aren't observers. Isn't that fascinating.

Giving Meaning to Binary Data

Now, I have real trouble with this, because you know for elementary particles you can kind of excuse the fact you don't know what's going on and it depends on the observer and all that. But think about this computer now. Between the time the experiment was done and the time the observer looked at the screen on the console, there's the computer memory, it's got these files in it, the maintenance routine is all run. The data is binary, you know? It's all binary. How can it be undefined? This is macrostuff now. It isn't particles anymore. It's an extension, you see.

Well, the best I can do is that these bits in the memory are all defined, but they are defined by an event in the future, cause and result are reversed in time. That's really quite disturbing, I think. That's not the way we want it to be. But apparently that's the way it is.

You see how confused I am now. I'm getting ready for the question and answer session, so if any of you can help me with this, I'd like to hear about it.

Are Fundamental Particles Real?

I'd like to end my talk and start getting into discussion with one more thought which dates back about 10 years. This was a discussion I had in Lucerne, Switzerland with a French physicist whose job it was to find elementary particles. And he'd been doing this for most of his life. And we were having dinner, and so I was asking him about his work. And I said isn't it kind of strange that physicists find a whole set of particles and they all fit together and we get all our textbooks updated, and about 10 or 15 years goes by and then you find another whole set of particles that are smaller, and we get our textbooks all up to date again, and then another 10 or 15 years goes by and you do it all over again? And, you know, he'd thought this all through, because it only took a couple of seconds, and he looked me straight in the eye and he said these particles didn't always exist. God makes them up as physicists need them.

Well, I hope God does the same thing for computer engineers.