
06/11/2014

Edward Feigenbaum on Artificial Intelligence

Edward Feigenbaum is Professor Emeritus of Computer Science at Stanford University, where he was also co-director of the Knowledge Systems Laboratory. He received his PhD from Carnegie Institute of Technology (now Carnegie Mellon University) in 1960, working under the supervision of Herbert Simon and developing EPAM, “Elementary Perceiver and Memorizer.” He is considered one of […]

00:00:00.000
[Music]
00:00:03.000
This is KZSU Stanford.
00:00:06.000
Welcome to entitled opinions.
00:00:08.000
My name is Robert Harrison.
00:00:10.000
We're coming to you from the Stanford campus.
00:00:13.000
[Music]
00:00:25.000
I'm joined in the studio by a very special guest who is going to share some thoughts with us today about cognitive science and artificial intelligence.
00:00:36.000
Entitled Opinions specializes in embodied, existential, historical,
00:00:42.000
eminently human intelligence, so it should be very interesting to hear all about the history, the present, and the long-term future of machine intelligence.
00:00:52.000
No one is more qualified to speak about this topic than my guest Edward Feigenbaum, Professor Emeritus in the Department of Computer Science at Stanford.
00:01:02.000
Professor Feigenbaum received his PhD in 1960 from Carnegie Institute of Technology, now known as Carnegie Mellon University.
00:01:11.000
His thesis advisor was Herbert A. Simon, one of the founding fathers of cognitive science, artificial intelligence, and information processing.
00:01:21.000
In his thesis, Ed Feigenbaum developed EPAM, or Elementary Perceiver and Memorizer, which, if I understand it correctly, is a psychological theory of learning and memory implemented as a computer program.
00:01:39.000
Designed along with Herbert Simon, EPAM was originally intended to simulate verbal learning but was subsequently expanded to account for data on the psychology of expertise and concept formation.
00:01:53.000
We'll be hearing more about that shortly.
00:01:57.000
In 1994, Ed Feigenbaum, along with Raj Reddy, received the most prestigious award in computer science, the ACM Turing Award, for pioneering the design and construction of large-scale artificial intelligence systems, demonstrating the practical importance and potential commercial impact of artificial intelligence technology.
00:02:23.000
A former Chief Scientist of the Air Force, Ed Feigenbaum received the U.S. Air Force Exceptional Civilian Service Award in 1997.
00:02:33.000
In 2011, he was inducted into the IEEE Intelligent Systems AI Hall of Fame for significant contributions to the field of AI and intelligent systems.
00:02:49.000
Professor Feigenbaum was also chairman of the Computer Science Department and director of the Computer Center at Stanford University.
00:02:56.000
He is the founder of Stanford's Knowledge Systems Laboratory.
00:03:01.000
In 2012, he was made a fellow of the Computer History Museum for his pioneering work in artificial intelligence and expert systems, and this is just a sampling of his very impressive bio, which we'll be posting on our website.
00:03:15.000
But I for one am eager to get him talking, so let me welcome him to the program. Ed, I'm glad that you could join us here on Entitled Opinions today.
00:03:24.000
Robert, thanks for having me. It's really a privilege to be here, and it's an opportunity to thank you for the really great work that you've done in giving us so many interesting programs.
00:03:36.000
And so I'm happy to contribute to that.
00:03:39.000
Well, these interesting programs are in large part dependent on having guests like you who are willing to come and share their expertise with me on air.
00:03:48.000
And so that's why I'm looking forward to this topic in particular because it's not a domain that I tend to venture into a lot, artificial intelligence, but it's something that clearly affects all our lives, those of us who belong to the time that we live in,
00:04:06.000
and it's, I'm convinced, going to have a huge effect on all our futures.
00:04:15.000
So we're going to talk about the history and present as well as the future of artificial intelligence and cognitive science, and you have a very fascinating story to tell about that, both the history and the future.
00:04:30.000
Could I ask you just to give us a kind of basic version of that history and share with our listeners some of the things that you have done also personally to contribute to that history?
00:04:44.000
Here's a plan.
00:04:46.000
I'd like to focus some time on what I call the Big Bang, which took place in 1956 or thereabouts, and lasted from 1956 to about 1965, but there's a very interesting
00:04:59.000
story of what happened before the Big Bang. That's what people always ask physicists.
00:05:06.000
What happened before the Big Bang? I'll tell you, in artificial intelligence.
00:05:11.000
And then after that I'd like to focus on the stream of work that led, through machine learning, up to the present day, and some rather sensational applications, and also some of the maturation of what came at first.
00:05:28.000
So let me begin in the prehistory, which is absolutely fascinating.
00:05:37.000
In mid-century, actually in the 1930s, there was a young and brilliant logician in Britain named Alan Turing, for whom the Turing Award is named. He's regarded as the father of computer science.
00:05:55.000
He proved a special theorem about a rather abstract kind of device that we now call a computer. It wasn't a real computer, but it had all the properties of what we now call a computer.
00:06:09.000
But that didn't last very long because World War II came along, and Alan Turing, although he volunteered for some other duty, was actually assigned to a then-secret and now very famous
00:06:24.960
cryptographic laboratory, a cryptanalysis laboratory called Bletchley Park, and Turing's work with his colleagues at Bletchley Park resulted in breaking the Enigma code.
00:06:40.960
You've probably seen the play Breaking the Code. There was a movie made of it, and there was a TV program made of it.
00:06:48.960
So Turing, in order to do that, code breaking is very hard to do manually. That's why it's used for secrets.
00:06:55.960
So Turing had in mind the idea that you could design a computer, an information processing machine, not doing a lot of calculation, but just doing a lot of manipulation of symbols that you could do that.
00:07:09.960
And he designed a computer of its time, pre-electronic, electro-mechanical. It happened to be called the Bombe, B-O-M-B-E, and it was later replaced by a real electronic computer called the Colossus, which is really the first major electronic computer that came into existence, although the British intelligence service never let it be known to the world.
00:07:38.960
Such a thing as Colossus was developed during World War II?
00:07:42.960
Yes, it was developed during World War II in 1943 and 1944.
00:07:46.960
By Turing.
00:07:47.960
No.
00:07:48.960
By a group that followed on from Turing's work, another group, and a person who should have become famous had this not been a secret.
00:08:02.960
A post office research employee named Tommy Flowers actually designed the Colossus machine, which came into existence because the Germans had changed their coding machine from the Enigma to a much more complex machine, and the British were in the dark for about nine months until there was this electronic machine that was about a thousand times faster than Turing's machine.
00:08:27.960
So Tommy Flowers is an unsung hero in this pre-history?
00:08:31.960
A bitter one. Well, he's dead now, but he was a bitter and unsung hero, because the British didn't make this public until the 1980s, when everyone else had claimed credit for the invention of the electronic computer.
00:08:46.960
But anyway, during the war, in the early 1940s, Turing was talking to the young people that were working with him.
00:08:54.960
In particular, one of them named Donald Michie, who later became the pioneer of artificial intelligence in the UK, was working for Turing at the time.
00:09:06.960
And Donald told me many times the stories of Turing talking about how these computers ultimately will be capable of thinking.
00:09:20.960
In 1949, when Turing was leaving the employ of the British governmental lab called the National Physical Laboratory, he wrote a report for his boss at the National Physical Lab.
00:09:35.960
And it was on this subject.
00:09:39.960
And many people urged him to rewrite this article for public use in a journal.
00:09:48.960
And he did. And the journal was the journal Mind.
00:09:52.960
The article has some lengthy title, but most of us have shortened that title to something like "Can a Machine Think?"
00:10:02.960
And in that paper, Turing lays out, first of all, a point of view and second of all, a test, which has become known as the Turing test for intelligence.
00:10:14.960
But the point of view is what is most important, which is: thinking is as thinking does.
00:10:22.960
Thinking is just a word that we use to cover a great deal of behavior. From perceptual behavior all the way to the deepest kind of cognitive behavior that you would find involved in Einstein's thinking and Shakespeare's thinking and the kind of stuff we do on this show.
00:10:38.960
And the kind of stuff. And even language, just the conversation that we're doing right now.
00:10:45.960
And so, Turing introduced the idea of a behavioral definition of thinking.
00:10:51.960
Let's not be philosophical about it. Let's not be theological about it.
00:10:55.960
Let's just see if a machine can do what people can do.
00:11:00.960
And hide the person and hide the machine and have a human judge decide at a level better than chance, which is the human and which is the machine.
00:11:11.960
So that's the famous Turing test where a human judge would not know whether it's speaking to a computer or a human being.
00:11:19.960
And if it's not able to determine that, then it's irrelevant whether this kind of thinking is equivalent.
00:11:28.960
Yeah, let's just give it the same word we give it when we talk about people, namely thinking.
00:11:36.960
So nothing much happened in the field from the publication of that paper until 1956.
00:11:47.960
In 1955, John McCarthy, who was a professor here until he died two or three years ago, decided to convene a conference, a famous conference, which was held at Dartmouth in the summer of 1956,
00:12:01.960
in which about a dozen people were brought together to try to frame up what kind of scientific issues were involved in actually producing computer programs that would do that.
00:12:17.960
In the meantime, computers themselves were growing up from basically nothing to the first things that you could sell to businesses and you could use for scientific work.
00:12:30.960
By 1956, something really existed that you could run these programs on.
00:12:36.960
In mid-December of 1955, if you read the autobiography of Herbert Simon, my thesis advisor who later won the Nobel Prize in economics in the '70s, he was a genius in many disciplines.
00:12:59.960
He won some of the biggest prizes in political science, and psychology, and computer science, and economics.
00:13:08.960
Herbert Simon and a very advanced graduate student of his at the time who later became one of the most famous computer scientists, Allen Newell, really got down to business.
00:13:22.960
And they actually took a task, which was, people thought was a difficult task, and proceeded to write, I'll tell you what that is in a minute, and proceeded to write a computer program which solved problems within that task.
00:13:41.960
In the early part of the 20th century, Bertrand Russell and Alfred North Whitehead, in investigating the foundations of mathematics, had written a tome called Principia Mathematica, which dealt with what is called propositional logic.
00:14:02.960
And by propositional, it means that the elements of these theorems were simply P's, Q's, R's, S's, etc. They were not objects in the real world.
00:14:15.960
And that was a brilliant choice by Simon and Newell to do that, because we didn't know how to connect up these programs to anything meaningful in the real world.
00:14:24.960
So if I follow, Simon and Newell were trying to implement a computer program to solve some of the problems of Bertrand Russell's propositional logic.
00:14:36.960
Correct.
00:14:37.960
And the program was called LT, or the Logic Theorist, and it proved theorems in Chapter 2 of Principia Mathematica.
00:14:46.960
In fact, it proved all the theorems in Chapter 2, and proved one of the theorems that was given a rather lengthy proof by Whitehead and Russell.
00:14:55.960
It proved it in a few lines, a spectacularly short proof, which in fact Simon wrote up and sent to Russell, and Russell wrote back and was amazed.
00:15:07.960
And you can see Russell's letter back to Simon quoted in Simon's autobiography.
00:15:14.960
Russell was saying, in effect, "Whitehead and I wasted our time on this; I'm now prepared to believe that a machine can do anything."
00:15:22.960
Was this a mathematical computation that the machine had done?
00:15:26.960
Very interesting question, Robert, because you used the word computation.
00:15:31.960
Computation up till that point was thought to be a numerical thing.
00:15:38.960
You would compute a formula, and there would be numbers coming out.
00:15:42.960
Numbers went in and numbers came out.
00:15:44.960
This was pure symbolic manipulation.
00:15:48.960
These were theorems, there were no numbers involved.
00:15:51.960
You'd start with a theorem, and you'd end with an axiom, and just like a mathematician does, you would say QED.
00:15:59.960
And so that kind of computation became known as symbolic computation.
00:16:05.960
The key thought was that what people did when they went through the behavior that we call thinking is purely symbolic information processing.
00:16:19.960
In physics, when you construct a theory, a physicist normally aims at expressing that theory in mathematical terms.
00:16:30.960
But that turns out to be difficult and fruitless for something as complex as the behaviors that we call human thinking.
00:16:40.960
It turned out exactly the right language was the language of information processing that is reflected in the programming languages of computers.
00:16:50.960
You don't have to think of them as adding machines or multiplying machines.
00:16:54.960
You can think of them as simply devices that take a symbol out of memory, put it somewhere else, look at it, test it, manipulate it, do something with it, infer something from it, and put it back into memory.
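That picture of symbolic computation, taking symbols out of memory, testing them, inferring something, and putting new symbols back, can be sketched in a few lines of Python. This is a schematic illustration, not any historical program: a forward-chaining loop applying modus ponens to propositional symbols, with no numbers anywhere.

```python
def forward_chain(facts, rules):
    """Repeatedly apply modus ponens: from P and (P -> Q), conclude Q."""
    derived = set(facts)          # the "memory" of known symbols
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in derived and conclusion not in derived:
                derived.add(conclusion)   # a new symbol enters memory
                changed = True
    return derived

# P, Q, R are pure symbols standing for propositions, not numbers.
rules = [("P", "Q"), ("Q", "R")]
print(forward_chain({"P"}, rules))  # contains 'P', 'Q', and 'R'
```

The point of the sketch is that the machine only moves and tests symbols; at no step is anything added or multiplied.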
00:17:08.960
So what is a symbol here, could it be a word, or is it an abstract?
00:17:16.960
All of the above.
00:17:18.960
Down at the machine level, it's represented as a string of zeros and ones, which are choices.
00:17:25.960
They're not thought of as numbers. They're ons and offs.
00:17:29.960
But they're just a way of coding something.
00:17:33.960
I already mentioned that in connection with Turing's work on the German messages.
00:17:39.960
They weren't treated as numbers. They were treated as symbols.
00:17:45.960
So is this the Big Bang we're talking about?
00:17:48.960
Yeah, so this is the Big Bang. This is the big idea.
00:17:52.960
And McCarthy gave the field its name, Artificial Intelligence, by naming the conference that was held in 1956.
00:18:09.960
In addition to the logic theory program, one of the attendees at that conference, Herbert Gelernter of IBM,
00:18:19.960
did a program very much like the Logic Theorist to prove theorems in Euclidean geometry, high school geometry.
00:18:28.960
And that program, the Geometry Theorem-Proving program, was the only taker of the New York State Regents exam in plane geometry,
00:18:40.960
in, I believe, the year 1959, that scored 100%.
00:18:46.960
In addition to that, it actually proved one of the more famous of Euclid's theorems in a completely novel and interesting way,
00:18:56.960
which had not been noticed in some 2,000 years of examination by all of the people who have looked at Euclid, including all the teachers and professors and students and so on.
00:19:11.960
That was really the Big Bang. There are a couple of concepts there that are very important and carry on into the future.
00:19:19.960
The first is the concept that the essence of problem solving is search.
00:19:26.960
That is, problem solving really is decision making in a very large space of possibilities
00:19:33.960
That are implicit in the symbols that you're manipulating, but the whole trick is to find your way through that space of possibilities, to find the right answer or a good enough answer, actually in Simon's terminology, he used the word "satisficing" to contrast it with maximizing.
00:19:54.960
You're not trying to maximize anything, you're trying to find a solution which is good enough.
00:20:01.960
So by search, do you mean that the machine has to, if you want to use this figure, eliminate all the noise, that is irrelevant and zero in on what is most pertinent to the problem?
00:20:14.960
Yeah, and it's not necessarily noise. It could be just worse solutions, solutions that are not as good.
00:20:21.960
So for example, you can imagine starting off the search in a good chess move.
00:20:29.960
If I move here and he moves there and then if he does that, I have these possibilities and if I do that, he has those possibilities.
00:20:38.960
If you work that all out to the end of the game, that number of avenues of looking for a good move is a one followed by 120 zeros.
00:20:49.960
Well, there's no way. I mean, that's bigger than the number of particles in the universe.
00:20:55.960
And search is a way of narrowing down? Yeah.
00:20:59.960
And what you use to narrow it down is what we call heuristics. Heuristics constitute what I call the art of good guessing.
00:21:10.960
But it's based on knowledge. It's based on either real hard knowledge about the world, and you can therefore cut off some avenues of search because we know better than that.
00:21:24.960
Or it's just based on experience. We know that certain things work in the past, other things didn't work in the past. We have some common sense, particularly in the area of, in specialties where we're expert,
00:21:37.960
we have really good expert common sense about the area. If you're a doctor and a patient comes in, and the patient is a man, then you rule out all things that have to do with symptoms of pregnancy.
00:21:54.960
So that's just good, that's factual knowledge about the world and good common sense.
00:22:00.960
So heuristic programming became the essence of the Big Bang.
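The heuristic-programming idea, searching a space of possibilities, guided by a scoring heuristic, and stopping at a satisficing (good enough) answer rather than an exhaustive best, can be sketched as follows. The problem, the scoring function, and the "good enough" threshold below are invented purely for illustration.

```python
import heapq

def satisficing_search(start, neighbors, score, good_enough):
    """Best-first search guided by a heuristic score; stops at the first
    'good enough' state rather than exhaustively maximizing."""
    frontier = [(-score(start), start)]   # max-heap via negated scores
    seen = {start}
    while frontier:
        _, state = heapq.heappop(frontier)
        if good_enough(state):
            return state                  # satisfice: accept and stop
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-score(nxt), nxt))
    return None

# Toy problem: reach any number >= 20 from 1 using *2 or +3,
# exploring larger numbers first (the heuristic).
result = satisficing_search(
    1,
    neighbors=lambda n: [n * 2, n + 3],
    score=lambda n: n,
    good_enough=lambda n: n >= 20,
)
print(result)  # 32
```

The heuristic prunes nothing by proof; it just orders the exploration so that a good-enough answer turns up long before the space is exhausted.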
00:22:07.960
And it was carried on later into the second phase, which I'm going to tell you about right now.
00:22:14.960
When I got to Stanford in 1965, I had been at Berkeley for four years. Actually, I worked with Herb Simon long distance.
00:22:24.960
We didn't have the internet at the time. We didn't have any network at the time. The interaction with Herb Simon was via the post office and telephone.
00:22:32.960
But the field had divided in the Big Bang. It had divided into two pieces.
00:22:39.960
The people who wanted to construct machines that were as smart as possible, whether or not they were doing anything like what people were doing.
00:22:50.960
And those are the people who used the banner artificial intelligence.
00:22:56.960
And then there were the people who wanted to construct realistic and testable models of human information processing.
00:23:06.960
That is, they wished to behave like psychologists.
00:23:11.960
Their models were constructed not out of intricate verbal arguments, such as you would find in the book by Bruner, Goodnow, and Austin, A Study of Thinking, which was the hot book at the time in the 1950s.
00:23:30.960
And not out of mathematical expressions like you would find in the work of one of our most famous psychologists, Gordon Bower, here at Stanford, or Dick Atkinson, who later became president of the University of California system.
00:23:45.960
Or Pat Suppes. They were doing mathematical psychology, which turned out to be a dead end.
00:23:54.960
But these were information processing models, which you could write programs for and test.
00:24:01.960
So for example, my-
00:24:02.960
Your own work, now.
00:24:03.960
Yeah, for example, my work on human verbal learning. That line of research had started with Ebbinghaus back around the turn of the 20th century.
00:24:16.960
In psychology. In psychology. And there was a very large amount of stable experimental data from experiments that had taken place in learning verbal materials, particularly those verbal materials that were not tied to real-world meaning.
00:24:37.960
We call them nonsense syllables. And in my thesis, I constructed, I hypothesized and then programmed and then tested mechanisms, information processing mechanisms that would account for those experimental results.
00:24:55.960
And it became the best tested theory of verbal learning behavior at the time, quite well known.
00:25:03.960
And it's lasted right up to the present day. There are still people doing research using the EPAM model of verbal learning behavior.
00:25:12.960
That became known as cognitive science. Originally known as simulation of cognitive processes.
00:25:19.960
So my colleague Julian Feldman and I edited the first book in AI, which was called Computers and Thought. It was an anthology.
00:25:28.960
We divided that anthology into two parts: simulation of cognitive processes, which were basically psychology-oriented papers, where the original publications were in places like the journals of psychology,
00:25:42.960
the Psychological Review; and the other part was artificial intelligence, which was published in computer journals essentially.
00:25:54.960
So that's the work I was doing, but I decided when I moved to Stanford to move from the psychology end of the field, which I was not too happy with, to the artificial intelligence end of the field,
00:26:11.960
which had been my dream from day one: to do what, again, one of Turing's young people, I.J. Good, who turned out to be a famous statistician, had envisioned.
00:26:23.960
He called it the ultra-intelligent computer. He had a famous paper called "Toward the Ultra-Intelligent Computer."
00:26:31.960
The problem that I was interested in was the problem of induction: not theorem proving, not discovery of proofs, not puzzle solving, which constituted many of the tasks that were done in the Big Bang period, but rather the problem of inducing hypotheses from a lot of data.
00:26:53.960
And my thought was, this is what we all do, almost all of the time, data is pouring into us through our senses, and what we're doing is forming and continually reformulating a model of our world. What's going on around us?
00:27:10.960
Well, the people who are supposedly experts at this particular area are the people we call scientists. That's their job to take in data and form hypotheses about what's going on from that data in their field of science.
00:27:31.960
So I expressed that interest to a newly found Stanford colleague, Joshua Lederberg.
00:27:40.960
Josh was chairman of the genetics department and a Nobel Prize winner in medicine and physiology, in 1958, I believe, or '59.
00:27:52.960
But Josh had gotten interested in computing in 1964 and had written a rather sensational program which could serve as the basis for one of these induction programs that I was describing.
00:28:12.960
And so he said, "Ed, I've got just the problem for you. It's the induction of hypotheses about how organic chemical molecules are structured, where the atoms are, and how they're connected to other atoms, given a whole bunch of data that comes from an instrument, a rather modern instrument at that time, called a mass spectrometer."
00:28:41.960
I won't go into what that is, but it was a new instrument. It was particularly interesting to Lederberg because he was helping to design the instrumentation that would go on the first planetary probe that landed on Mars, looking for signs of life.
00:29:00.960
And in particular, looking for life precursor molecules called amino acids.
00:29:06.960
It's hard to do wet chemistry in a lab that lands on Mars, so you wanted some kind of instrument that would be able to do everything electronically and report back the data.
00:29:18.960
Oh, yes, I have seen this kind of a molecule. That's how Lederberg got started on that particular problem.
00:29:28.960
We began work on that, and as I mentioned, the essence of the artificial intelligence work is what I called heuristic programming, the application of heuristics to tasks like that.
00:29:41.960
The program that Lederberg had written to elucidate structures gave vast numbers of structures for even simple molecules.
00:29:52.960
So what we wanted to do was to cut that down, given the knowledge that we had of chemistry, given the knowledge that we had of the specific expertise in mass spectrometry, what's the best candidate?
00:30:10.960
And then we would give that to the chemist.
00:30:13.960
It turned out that we ran out of what Lederberg knew pretty quickly, because he was working on amino acids only.
00:30:24.960
So what we did was recruit the expertise of the head of the Stanford mass spectrometry laboratory, who turned out to be another genius, Carl Djerassi, the inventor of the birth control pill.
00:30:38.960
And Lederberg and Djerassi and I and our colleagues in the computer science department and in the mass spectrometry lab did a program called Dendral, or Heuristic Dendral, that actually achieved levels of performance on these tasks better than PhD-level, postdoc-level problem solving in Djerassi's lab.
00:31:04.960
And that gave us the courage to move on to do the same task for medical diagnosis, for example, in diagnosis of blood infections, diagnosis of pulmonary disease.
00:31:17.960
One of those programs is called MYCIN; the other was called PUFF.
00:31:21.960
We did it for several engineering problems.
00:31:23.960
We even did it for a problem that our sponsor, the Defense Department sponsor, ARPA, the same people who paid for the ARPAnet, which was the precursor to the internet.
00:31:35.960
They wanted a secret one done to detect Soviet submarines off the West Coast.
00:31:42.960
And we did it for that, and eventually it led to the formation of companies in Silicon Valley that did that kind of work.
00:31:55.960
Indeed, one of those projects that Lederberg and I did was called MOLGEN, which stood for molecular genetics; it was the first program that did what we now call computational molecular biology.
00:32:10.960
Remember, for the molecular biologists, the key things are letters, A, C, G, and T, in the DNA, not numbers.
00:32:19.960
And so we were able to do symbolic manipulation that was of powerful assistance to molecular biologists, and started a company called IntelliGenetics, which did that for the pharmaceutical industry and other biologists.
00:32:35.960
So that's basically the Phase 2 story.
00:32:39.960
Well, that's great. So we'll take just a short little musical break and we'll be right back to get the rest of it. Stay tuned.
00:32:47.960
[Music]
00:33:13.960
[Music]
00:33:39.960
We're back on air with Professor Edward Feigenbaum, continuing our story about artificial intelligence.
00:33:51.960
So we went through the prehistory, we went through the Big Bang, and then spoke about your own work on induction, and I'm eager to hear what comes next.
00:34:05.960
Well, the work on induction, as I said, led to a great body of work which dominated artificial intelligence for a period of maybe 10 or 15 years called expert systems, which are these programs that perform at high levels of intelligence when we give them a great deal of knowledge about some particular area of expertise.
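The expert-systems idea can be caricatured in a few lines. This is a toy sketch, not MYCIN or any real system, and the rules below are invented for illustration: knowledge elicited from a human expert is stored as explicit if-then rules, and the program's competence grows as more rules are added.

```python
# Hypothetical rules: (required findings, conclusion). In a real expert
# system these would be painstakingly elicited from domain experts.
RULES = [
    ({"fever", "stiff_neck"}, "suspect meningitis"),
    ({"fever", "cough"}, "suspect respiratory infection"),
]

def diagnose(findings, rules=RULES):
    """Fire every rule whose conditions are all present in the findings."""
    return [conclusion for conditions, conclusion in rules
            if conditions <= findings]   # set containment test

print(diagnose({"fever", "cough", "fatigue"}))
# ['suspect respiratory infection']
```

The knowledge lives entirely in the rule base, not in the inference loop, which is why adding knowledge, rather than cleverer search, is what improves performance.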
00:34:32.960
And that's, excuse me for interrupting, where your famous phrase comes in, the one you're known for: that through knowledge comes power.
00:34:40.960
Is that what it is? Is it that the more knowledge you put in there, the more power you get from the...
00:34:45.960
Yeah, that's what I would call the second Big Bang. In 1968, when Lederberg and my colleagues and I were going over the results of the original work that we did together,
00:35:00.960
we had a set of experiments which showed definitively that the program's behavior became better and better in terms of quality of chemical work.
00:35:15.960
The more knowledge we gave it of the domain in which we were working, namely chemistry and mass spectrometry.
00:35:23.960
So that's the source of the famous phrase, "knowledge is power." It wasn't used in anything having to do with technical language when Bacon first used it, so I changed it into "in the knowledge lies the power."
00:35:38.960
That was in contrast to the work that happened during the Big Bang where people were concerned about methods, search methods, heuristic methods, but they weren't concerned about knowledge.
00:35:49.960
So by the mid-70s, two of the famous researchers at MIT wrote a paper in which they talked about AI's shift to the knowledge-based paradigm.
00:36:02.960
That shift was initiated here at Stanford and it still dominates today.
00:36:09.960
So there actually is a brilliant counter-example to that which I like to cite because it satisfied one of the early predictions of Herb Simon actually,
00:36:25.960
that such a program would beat the world's chess champion. He had predicted that it would happen in ten years, but it actually took about 40 years, and that was a program called Deep Blue that used rather
00:36:38.960
a small amount of knowledge about chess, but did a tremendous amount of searching in that maze I was telling you about. If I do this and he does that and I do this, it looked at 250 million alternatives per move, roughly.
00:36:56.960
That was in the 80s? Yeah, that was in 1997 or '98; Deep Blue beat Garry Kasparov, the world's chess champion.
00:37:08.960
Now, Kasparov didn't search anything like that; he searched maybe a few hundred or a few thousand alternatives, but he started off with some superb pattern recognition, some experience that he had that said that certain patterns were great patterns.
00:37:25.960
And he didn't have to do all that amount of searching, but the searching beat him. And that was a counter-intuitive thing to the people like me who believed in the knowledge-is-power principle.
00:37:43.960
That whole set of ideas is called the knowledge versus search spectrum, and it's very important in artificial intelligence because we've been able to beat the search game by putting tremendous amounts of knowledge in about particular specialties of the world, or large bodies of common sense knowledge about the world.
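The search-heavy end of that spectrum, Deep Blue's end, is classically captured by depth-limited minimax: very little knowledge (just a small evaluation at the leaves) and a great deal of look-ahead. A minimal sketch on an invented toy game, not chess:

```python
def minimax(state, depth, maximizing, moves, evaluate):
    """Depth-limited game-tree search: almost no knowledge, lots of search.
    'evaluate' is the small bit of domain knowledge applied at the leaves."""
    options = moves(state)
    if depth == 0 or not options:
        return evaluate(state)
    results = (minimax(s, depth - 1, not maximizing, moves, evaluate)
               for s in options)
    return max(results) if maximizing else min(results)

# Toy game: a state is a number; each player in turn may add 1 or 2;
# the leaf evaluation is simply the number itself.
best = minimax(0, depth=4, maximizing=True,
               moves=lambda n: [n + 1, n + 2],
               evaluate=lambda n: n)
print(best)  # 6: max adds 2, min adds 1, over four plies
```

Everything interesting here is in the brute enumeration of futures; the "knowledge" is a one-line evaluation, which is the inverse of the expert-systems recipe.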
00:38:07.960
But what we haven't done is to exploit large amounts of computation, which are now available and weren't available then. So part of the future of artificial intelligence is to look down avenues like that for combinations of symbolic concepts that are novel and interesting,
00:38:36.960
and that induce a sense of awe in the human audience who sees it, just like in that match where Deep Blue beat Kasparov: among the grandmasters and masters watching it, there was a sense of awe.
00:39:03.960
And Kasparov, who himself is not a religious man, said it was like looking into the mind of God to see that move emerge.
00:39:15.960
So, about the relationship between knowledge and search: initially you would put as much human knowledge as possible into the machine, and the more knowledge you put in, the more it would arrive at the desired results.
00:39:34.960
But then quickly enough, I suppose, the machine can generate knowledge rather than merely receive it and process it. Is that correct?
00:39:43.960
That's a really great question. One of the parts of the field that received some attention, but not a lot of attention, in both of those early periods, the big bang period and the expert system period, was a part of the field called machine learning.
00:40:03.960
Some people thought of that as the critical element for an intelligent program or an intelligent human being, that it learns from experience.
00:40:19.960
But our efforts up until 1980 were rather minimal. In my own project with Josh Lederberg and Bruce Buchanan and my team,
00:40:31.960
our programs were able to learn the knowledge necessary about certain families of molecules, the knowledge necessary to solve particular problems.
00:40:48.960
We thought of that as a great achievement. That was the first time that any new knowledge about a field got published in a journal of its own discipline, not a computer journal, but in this case the Journal of the American Chemical Society.
00:41:10.960
So we thought that was fantastic, but it turns out machine learning didn't go in that direction. It went in the direction of statistical pattern recognition, which means that the AI people joined forces with the statisticians, and it's very hard now to distinguish a machine learning theorist from a statistician.
00:41:34.960
That's the big data stuff that we hear so much about. Exactly.
00:41:39.960
Now, the people who do inference from big data don't always use machine learning algorithms, but a great deal of the time, if you read Forbes magazine or the New York Times or something, you'll see the terms come up together: big data and machine learning come up together.
00:41:59.960
Now, the field developed extremely sophisticated statistical methods for doing pattern recognition starting with that work in 1980 and going right up to the present.
00:42:11.960
Now, if you do that, you're not doing a great deal of deep cognition. You're doing surface cognition. You're doing recognition as opposed to deep cognition.
00:42:26.960
So actually, a world-famous psychologist named Daniel Kahneman wrote a book two or three years ago called Thinking, Fast and Slow. Wonderful book.
00:42:43.960
And he distinguishes between the two kinds of thinking. The thinking fast is the perceptual recognition of things. And that's the kind of thing that AI programs are doing very well now using these machine learning statistical methods that hook up to big data, as you said.
00:43:04.960
So the thinking slow is when we really get down to thinking hard about a difficult problem that takes us minutes, hours, weeks and months, or in the case of Einstein years to solve.
00:43:20.960
The thinking fast programs have led us to intelligent sensing devices that are spectacular.
00:43:36.960
For example, the sensors and programs that help guide the Google self-driving car. The Google self-driving car is only now getting into what you might call the thinking part of driving.
00:43:53.960
But it's been sensational at the recognition part of driving, recognizing pedestrians, recognizing bicyclists, recognizing traffic lights.
00:44:03.960
And you've driven in one? Oh yeah, I took a ride in a Google self-driving car about three weeks ago, through Mountain View, up Shoreline Boulevard, across California Street over to Rengstorff, and back to the freeway.
00:44:22.960
It was really amazing. But one of the things that the Google people are doing right now is exercising this knowledge principle, that in the knowledge lies the power. They have to know a great deal about the streets of Mountain View in order to drive carefully in Mountain View.
00:44:40.960
You have to know the traffic light configurations, so you know where the red light really should be, so you can look there. When it stopped outside of a zone that says "Keep clear," I knew damn well that it wasn't reading K-E-E-P C-L-E-A-R.
00:45:01.960
It's that the team had marked that as a clear zone, and it knew enough about that not to stop inside the clear zone. In fact, it's not allowed to break any of the rules of the California driver's handbook.
00:45:19.960
It has to know a great deal about Mountain View, and it will have to know a great deal about San Francisco if it's going to drive in San Francisco.
00:45:28.960
And any of the cities that it drives in. Hence we arrive at what AI people call the knowledge bottleneck: it's going to have to learn those things
00:45:40.960
in some way; otherwise it's going to have to be told everything about them. And right now, the Google car is being told all of those things about Mountain View.
00:45:50.960
But it works very efficiently in Mountain View because it has such an extensive knowledge of the streets. It's actually spectacular. But it's wildly challenging to get that level of expert knowledge about all the cities in the United States.
00:46:08.960
But I imagine if anyone thinks big, it's Google. Yeah, it made me think that there's an opportunity for a wonderful business there, which is to provide other manufacturers, Ford and Toyota and Audi, with that body of knowledge, since Google isn't planning to go into the car business, I don't think.
00:46:32.960
But anyway, I told that to someone at Google and they said, "Of course, we know that. That's what we're aiming at."
00:46:40.960
Just to own and sell that body of knowledge.
00:46:43.960
So, is a car like that something that excites you? Is it also something that you consider to be one of the fruits of your own research over the years?
00:46:56.960
Or is it only fast thinking, and you're more interested in slow thinking?
00:47:02.960
It turns out that for me personally, I'm interested in slow thinking. But what is very important for the artificial intelligence field is to have what are called integrated systems that combine all the elements of fast thinking and slow thinking into a working system.
00:47:23.960
Another way to put it is that computer science mostly, and artificial intelligence in particular, are experimental disciplines. We learn by watching what we do and what mistakes our programs make; we humans learn how to make them better, and we learn how to build learning programs to make them better.
00:47:47.960
But we can only do that if we have integrated systems that do the complete job, albeit badly. Even if it does it badly, it will get better.
00:47:57.960
Something like the Siri of the iPhone.
00:48:01.960
Yeah. And so such an integrated system, is it a combination of fast and slow thinking?
00:48:09.960
Well, it's called Siri because if you pick out the letters S-R-I, the little company that Apple bought was founded by people at SRI.
00:48:25.960
SRI being Stanford Research Institute.
00:48:27.960
What used to be Stanford Research Institute was spun off from Stanford during the Vietnam War demonstration era.
00:48:36.960
So they added the letter I to make it pronounceable. And Siri was developed as a consequence of an absolutely huge DARPA program, the same Defense Department sponsorship, the Defense Advanced Research Projects Agency; it was a $300 million project.
00:48:56.960
It's a personal assistant, and Siri was the spinoff of that project, further developed by Apple, of course, as a commercial effort.
00:49:07.960
Now Siri does not do a lot of perception; its machine learning is not machine learning of sensations, because the Apple iPhone really doesn't have
00:49:25.960
that many sensors on it.
00:49:31.960
But what Siri has done is a lot of machine learning on natural language text that exists out on the internet.
00:49:43.960
The whole field of natural language processing, which is central to the artificial intelligence field, because language is so central to intelligence,
00:49:53.960
has really switched over to become statistical natural language processing, in which the meaning of words is implicit in the statistical juxtaposition of certain adjectives with certain nouns, certain adverbs, at certain distances from each other. And the big machines that we have today can process billions, and I say billions, you know,
00:50:22.960
the Carl Sagan billions and billions and billions. Actually, in the case of IBM's Watson program, another spectacular program we'll talk about in a minute,
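The statistical idea Feigenbaum is describing, that a word's meaning is implicit in which words habitually sit near it, can be sketched in a few lines. This is my own toy illustration, not any production NLP system: count co-occurrences within a small window over a tiny made-up corpus; real systems do this over billions of pages and normalize the counts with measures like pointwise mutual information.

```python
# Toy sketch of statistical NLP: the "meaning" of a word emerges from
# its statistical juxtaposition with nearby words. We count how often
# each pair of words appears within a small window of each other.
from collections import Counter

corpus = ("the red light turned green , the red light turned red , "
          "the driver saw the green light").split()

window = 2                      # how far apart two words may be
pairs = Counter()
for i, w in enumerate(corpus):
    for j in range(i + 1, min(i + 1 + window, len(corpus))):
        pairs[tuple(sorted((w, corpus[j])))] += 1

# Words that habitually co-occur get high counts:
print(pairs[("light", "red")])   # -> 3
print(pairs[("driver", "red")])  # -> 0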
00:50:32.960
they claim a trillion pieces of writing. So Siri uses that, and it also uses huge bodies of knowledge.
00:50:44.960
For example, it's linked up to one of the largest knowledge bases in the world. That's a thing called Wolfram Alpha, to which you can ask an absolutely enormous number of factual questions and get factual answers.
00:51:00.960
So Siri is an integrated system for coupling voice understanding, which itself was an artificial intelligence problem that was gradually solved over the years to be almost perfect now.
00:51:18.960
Coupling voice understanding with textual understanding, trying to infer what your meaning is in the question, and generating an answer for you.
00:51:28.960
Now it doesn't always generate the right answer, in just the same way that Watson, when it played Jeopardy, challenging the greatest Jeopardy champions of all time, didn't quite always get the right answers.
00:51:44.960
Sometimes it got really silly answers. Sometimes that's the case with Siri also, but that's the way it is with people too.
00:51:52.960
Sometimes people don't understand what you're saying, and they give you the wrong answer, and they have access to wrong knowledge.
00:51:59.960
So Siri is regarded by the artificial intelligence community as a real triumph. And it's not only Siri; it's Siri and its Microsoft competitor and its Google competitor by now.
00:52:12.960
So there are three of them out there. So how far away are we from a Data, who is a character in Star Trek: The Next Generation?
00:52:24.960
He's the android who not only has all the enhanced capacities of Siri exponentially multiplied, but is also a sentient, thinking, moving
00:52:41.960
synthetic human being in every respect, except for the fact that he was not biologically generated.
00:52:48.960
Yeah. Great question. Let me answer the question by challenging the "in every respect."
00:52:59.960
That's the key thing about Data, Mr. Data.
00:53:03.960
We can tell right away he is not humanlike, because Mr. Data is like an expert system.
00:53:13.960
He really knows a lot of specialties. What we love about Mr. Data, what makes him so cute on the program, is all that stuff he doesn't know that all the other humans on the ship know. Mr. Data is so narrow.
00:53:28.960
He only knows about his specialties. Now with the exception of fluid natural language processing, Mr. Data is what we have now in our expert systems.
00:53:39.960
We can construct expert systems that do the things Mr. Data does when he does his specialty work.
00:53:47.960
What we haven't been able to replicate yet is the fluidity of the natural language processing, but we're getting there.
00:53:55.960
So now when you ask about the other characters on that program, you see real human beings.
00:54:06.960
Those human beings know an enormous amount more than our programs today know.
00:54:14.960
And when I say "know," I don't mean know in the sense of Wolfram Alpha's specific hard knowledge about the world.
00:54:21.960
I mean experiential knowledge, the knowledge of everyday life, common sense knowledge.
00:54:28.960
Well there's also this phenomenon that I'm curious about which is in philosophy, it's called attunement.
00:54:35.960
To be human is always to be attuned to the world, to your environment, to others.
00:54:40.960
It's almost a musical metaphor that you're in tune with things and that sort of attunement which on that Star Trek show you see,
00:54:50.960
the human beings being attuned to each other, so many things are just natural, taken for granted.
00:54:57.960
And Mr. Data doesn't seem to have that attunement, although he's getting close.
00:55:06.960
Well, Robert, you're exactly right, and there we go back to that principle, the knowledge principle: in the knowledge lies the power.
00:55:16.960
Data never got that knowledge.
00:55:18.960
And what amuses us in the program are the occasional times when it begins to dawn on Data that there's something else he doesn't know, that there's a world outside of what he knows.
00:55:32.960
And we giggle, and we say, oh yeah, of course, Data doesn't know that.
00:55:37.960
But isn't that also the origin of consciousness, in a certain way? Aristotle, and a bunch of other people after him,
00:55:47.960
have this notion that thinking begins in wonder. It's when you become aware of the fact that you don't know something, that there's some incomprehensibility about the world or something that you're aware of, and that activates this process of thinking about things.
00:56:05.960
Well, my own feeling is that there are two separate questions there.
00:56:12.960
The first is something that we, the research community in artificial intelligence, are really on top of, and that's the question of how we can know when it is we don't know, so that our programs can set up the learning process to begin to figure out what it is they don't know.
00:56:34.960
That we have a reasonable handle on in artificial intelligence, we're moving in that direction, and that indeed is a big research direction.
00:56:44.960
Knowing what it is you don't know is the start, the framework, of machine learning.
00:56:50.960
The question of consciousness is an entirely different question, one that ranges from what
00:57:02.960
some people view as extremely fundamental to what other people, like me, view as an epiphenomenon.
00:57:15.960
The epiphenomenon view says that if you look inside some of the programs that I've described, particularly the one, the early one I described, E-PAM, you find components in there like the immediate memory.
00:57:31.960
We know what that is; it's heavily researched in psychology. It's the very most short-term memory: seven plus or minus two symbols.
00:57:43.960
We have the working memory, which is in the order of 300 to 1000 symbols, and we have the long-term memory.
00:57:51.960
We know that items in the working memory are passed into the long-term memory overnight, usually, or over a longer period of time, through a brain mechanism called the hippocampus, which transfers memories from working memory to long-term memory.
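The memory architecture Feigenbaum describes, a tiny immediate memory, a bounded working memory, and a consolidation step into long-term memory, can be rendered as a toy model. This is my own illustrative sketch, not EPAM itself; the span sizes follow the numbers given in the conversation, and `consolidate` plays the hippocampus role of the overnight transfer.

```python
# Toy model of the three memories described: immediate (7 +/- 2 symbols),
# working (300-1000 symbols), and long-term, with a hippocampus-like
# consolidation step. An illustration only, not EPAM's actual design.
from collections import deque

IMMEDIATE_SPAN = 7      # "seven plus or minus two" symbols
WORKING_SPAN = 300      # low end of the 300-to-1000-symbol range

immediate = deque(maxlen=IMMEDIATE_SPAN)
working = deque(maxlen=WORKING_SPAN)
long_term = set()

def perceive(symbol):
    """A new symbol enters both short-term stores; the oldest falls out."""
    immediate.append(symbol)
    working.append(symbol)

def consolidate():
    """The 'overnight' transfer from working memory into long-term memory."""
    long_term.update(working)
    working.clear()

for s in "abcdefghij":   # ten symbols; immediate memory retains only 7
    perceive(s)
consolidate()
```

After perceiving ten symbols, the immediate memory holds only the last seven, while consolidation has moved all ten from working memory into long-term storage, a crude picture of why we sometimes "wake up knowing" something the conscious store had already lost.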
00:58:08.960
During that period of accumulating symbols in these different working memories, we have an internal glimpse of the world that we're dealing with.
00:58:22.960
It's not 100% glimpse because there's all this unconscious work that's going on in the long-term memory that we don't have access to because those symbols are not available to us for working.
00:58:36.960
So the working memory is the consciousness; that's the epiphenomenon view. It just says, yes, of course these machines have a consciousness, but that consciousness consists of only the things that they're working on.
00:58:50.960
And when we say that we're thinking of something and it just takes place, oh, we woke up in the morning and suddenly we knew it:
00:58:58.960
That's because other processes were working on our behalf, but they were working outside of the working memory to deliver a result.
00:59:09.960
Now other people think that consciousness is ingrained in nature itself. You can read physicists who think that consciousness is actually a quantum phenomenon.
00:59:20.960
You can read the book by Sir Roger Penrose, who thinks that consciousness is really built into the quantum universe.
00:59:28.960
And there are all shades in between. But the great book to me, the single book someone should read if they're interested in the subject, is by the philosopher Daniel Dennett of Tufts University, called Consciousness Explained.
00:59:45.960
That is the single best book I would read if I were interested in that subject.
00:59:50.960
Well, here, can I read you something from a paper that you yourself published in the 90s, where you're talking about the far side of the dream of artificial intelligence?
01:00:01.960
And it's a view from the future looking back at our present from Professor Marvin Minsky of MIT, who says, can you imagine that they used to have libraries where the books didn't talk to each other?
01:00:17.960
And you go on to write: the libraries of today are warehouses of passive objects. The books and journals sit on shelves, waiting for us to use
01:00:25.960
our intelligence to find them, read them, interpret them, and cause them finally to divulge their stored knowledge.
01:00:31.960
Electronic libraries of today are no better. Their pages are pages of data files, but the electronic pages are equally passive.
01:00:39.960
And then you go on to ask us to imagine the library as an active intelligent knowledge server where these books would actually talk to each other.
01:00:50.960
So two questions here. One is as literary critics or historians, but especially literary critics, we look at the history of literature and all these books and part of our work, a great deal of our work is get these books to speak to each other.
01:01:08.960
Through our own hermeneutics and analysis and interpretation, making connections between them and coming up with a history of literature or interpretation.
01:01:19.960
So that would be one question: is that something that we've been doing naturally,
01:01:24.960
humanly, in any case? And the other question is, how do you envision this future,
01:01:35.960
what you call here the intelligent knowledge server that would get these books in the library to speak to each other independently of our human agency?
01:01:45.960
Back in those days, with the minor exception of a few disciplines, like I mentioned some of them, a few chemical disciplines, a few medical disciplines, a few engineering disciplines,
01:01:57.960
The knowledge was all dead stuff. It was words printed or formulas printed on paper stored in libraries.
01:02:05.960
And in one way that's a miracle, that we were able to do that. We as human beings, we don't find animals below our level writing down their symbols and transferring their knowledge; one baboon does not transfer its knowledge to another
01:02:27.760
baboon through writing and books and storing that material and passing it on to generation after generation by the act of reading.
01:02:38.760
Teaching, reading. It's miraculous; that's the essence of what we do as a culture.
01:02:49.760
Not all of our knowledge is passed through books. Some of it is experiential, but that is mostly a person by person. One-on-one, we gain experience and we get some knowledge that way, but as a culture, we pass our knowledge along through these books.
01:03:06.760
Now, we're not passing along dead symbols, ink on paper. We're passing along thoughts, concepts that are represented by the ink on paper.
01:03:19.760
The problem with the traditional library is it's dead. It sits there as ink on paper, not as actionable thoughts where you could pose a question and then those actionable thoughts would deliver,
01:03:35.760
an answer or a concept or a suggestion. None of that is available through a dead book.
01:03:44.760
So the quote that you have there says: let's try to make all of those books live, in that symbolic sense of tying themselves to other knowledge, tying themselves to language, tying themselves to the ability to communicate to human beings and
01:04:04.760
to other sources of knowledge that relate to them, analogical ones, or ones related in some other way, metaphorical ones.
01:04:18.760
So, how close are we to that? Well, there is a sense in which we are amazingly close to it.
01:04:33.760
Anybody who's used Google and uses it with any level of skill knows that the answer that you're looking for from these, as I said, billions and billions and billions of pages usually comes out right on the first page.
01:04:52.760
That's amazing. On the other hand, the Google people will be the first people to tell you that there's no understanding that's involved in that.
01:05:02.760
There's excellent keyword search and ranking of other people's search queries. So, realizing that, I remember one of the founders of Google, who gave a talk at Stanford once, saying that Google is an artificial intelligence company that just doesn't realize it yet.
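The "excellent keyword search and ranking" being contrasted here with real understanding can be sketched in a few lines. This is my own illustrative toy, not Google's algorithm (which also uses link analysis and query logs): score each document by how many query terms it contains, weighting rarer terms more heavily, a crude TF-IDF. The documents and query below are invented examples.

```python
# Toy keyword search: no understanding, just term matching.
# Each document is scored by term frequency times a rarity weight
# (inverse document frequency), then results are ranked by score.
import math

docs = {
    "d1": "knowledge is power in expert systems",
    "d2": "search beats knowledge in chess programs",
    "d3": "the cat sat on the mat",
}

def score(query, doc_text, all_docs):
    words = doc_text.split()
    total = 0.0
    for term in query.split():
        tf = words.count(term)                              # term frequency
        df = sum(term in d.split() for d in all_docs.values())
        if tf and df:
            total += tf * math.log(len(all_docs) / df)      # rarer = heavier
    return total

query = "knowledge power"
ranked = sorted(docs, key=lambda k: score(query, docs[k], docs), reverse=True)
print(ranked[0])   # -> d1
```

The document mentioning both "knowledge" and the rarer term "power" ranks first, with no grasp of what either word means, which is exactly the gap Feigenbaum points to.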
01:05:25.760
Well, they began to realize it, and they actually bought a couple of artificial intelligence companies that had some very large knowledge bases, and they incorporated that material and built on it in a thing called the Knowledge Graph.
01:05:44.760
And now, if you use Google, you'll see the knowledge graph for what you asked come up on the right-hand side, where there used to be ads. Now you find a great deal of knowledge about the subject you were asking about. So things are just beginning to become knowledge, rather than just dead symbols living on old trees.
01:06:09.760
Well, it's been quite a ride over the last hour, and you've taken us through so many different disciplines. I'm starting to realize that the work that you do over there with your colleagues in artificial intelligence really engages almost all our disciplines here in the university. There's psychology. There's language learning and symbolic systems, obviously chemistry.
01:06:38.760
And every sort of thing. So it seems like you can take the whole of the knowledge that we're involved with in the university and put it to work in your artificial intelligence programs.
01:06:52.760
Well, all of us who teach here are proud of Stanford being one of the world's most influential and most excellent interdisciplinary centers of thought.
01:07:07.760
And in this case, that magic has been worked over the last two decades. We've had an undergraduate program called the program in symbolic systems, which cuts across psychology,
01:07:21.760
linguistics, and computer science. And it's been one of Stanford's most successful interdisciplinary programs, and student interest has been very high in the symbolic systems program.
01:07:37.760
A huge number of majors, I gather. Yes, that's right, you can major in symbolic systems.
01:07:43.760
And I agree. I think that something about this university promotes interdisciplinarity, and it's the kind of place where, I would imagine, not every place could bring these things together the way we do it here.
01:08:00.760
Well, at Stanford, interdisciplinarity, if you can use that word, is highly rewarded in the faculties of the different departments, as opposed to other places, where publishing as many papers as you can in the same journals is, unfortunately, what's rewarded.
01:08:24.760
That's great. Well, Ed, thanks for coming on. I want to remind our listeners that we've been speaking with Professor Edward Feigenbaum from the Department of Computer Science, and you can check out his entire bio on our website; it's very extensive. I mentioned parts of it at the beginning of the show. So thanks again, Ed, for coming on, and thank you all for listening to entitled opinions.
01:08:48.760
[Music]