Artificial Intelligence
In reviewing the history of artificial intelligence, there is a risk of being side-tracked into philosophical discussions of what intelligence really is. We will avoid these discussions by beginning with Turing's operational definition of artificial intelligence; then we'll return to take a brief look at the philosophical questions at the end.
Turing's paper includes a sample conversation between an interrogator and a human, covering a range of topics, including arithmetic and poetry. After reading this dialogue, it is impossible to imagine that the participants are not intelligent. It is also clear that no existing system is even close to being able to pass the test. However, we do now have a definite goal to aim at. So, after Turing, AI is aiming at achieving particular performances, and leaving the question of whether they can be called intelligent to the philosophers.
"I went for a long boat ride"
and he responded
"Tell me about boats",
one would not assume that he knew nothing about boats, but that he had some purpose in so directing the subsequent conversation. ''
Weizenbaum was surprised and horrified to find that many people fell for these simple tricks and attributed intelligence to the program. It seems that people have an innate tendency to attribute personality to inanimate objects, just as we find it easy to see two dots and a line as making an expressive face :)
This does not invalidate the Turing test, but it does show that the test has to be administered sceptically and searchingly.
One particular researcher, Arthur Samuel, spent the years 1947 to 1967 programming an IBM 700-series computer to play checkers. This was one of the most successful early attempts at programming game-playing; by the 1960s, his program was playing at club level. (A.L. Samuel, ``Some Studies in Machine Learning Using the Game of Checkers II -- Recent Progress.'' IBM Journal of Research and Development, Volume 11, No. 6, pp. 601-617, November 1967.) Reaching world-class level took considerably longer: a checkers-playing program became world champion in 1996. (The website http://www.math.wisc.edu/~propp/chinook.html gives a good account of this.)
Chess-playing programs were developed by several researchers, including Turing himself and Claude Shannon, the inventor of information theory. (``Programming a Computer for Playing Chess'', Philosophical Magazine, 41, pp. 256-275 (1950).) The subsequent advance of chess-playing computers to the world championship is more a tribute to brute-force methods than to ingenious programming: Deep Blue can beat Kasparov because it is capable of immense parallel search, but this is not how humans play chess.
For us to call something intelligent, it must have a certain breadth of ability. Some human beings have remarkable arithmetic capabilities -- for example, they can multiply 50-digit numbers mentally and identify six-figure primes -- while being in other respects subnormal. We don't call them intelligent; rather, we have the expression `Idiot Savants' to describe them. Similarly, researchers in the Fifties were looking for ways to give computers more general problem-solving abilities. One significant move in this direction was the `GPS' or `General Problem Solver' developed by Newell, Shaw and Simon. (A. Newell and H.A. Simon, ``GPS: A Program that Simulates Human Thought'', in Feigenbaum and Feldman (eds), Computers and Thought, McGraw-Hill, New York, 1963, pp. 279-293.) GPS was used to solve the `Cannibal and Missionary' problem: we have three missionaries and three cannibals on one side of a river, plus a boat. All wish to cross to the other side, but the boat will only hold two, and if the cannibals ever outnumber the missionaries on either side of the river, the missionaries will get eaten. GPS solved this by, essentially, trial and error: the programmers provided a description of the problem in terms of states and operations, and the program then tried to reduce the difference between the current state and the goal state by applying operations, backtracking as necessary.
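To make this concrete, here is a minimal sketch, in Python (not a language Newell, Shaw and Simon worked in), of the state-space view of the problem: a state records how many missionaries and cannibals are on the starting bank and where the boat is, the operations ferry one or two people across, and a simple breadth-first search looks for a sequence of safe states. GPS itself used means-ends analysis rather than the blind search shown here.

    from collections import deque

    def safe(m, c):
        # Missionaries must never be outnumbered on either bank (totals are 3 and 3).
        for mm, cc in ((m, c), (3 - m, 3 - c)):
            if mm > 0 and cc > mm:
                return False
        return True

    def solve():
        start, goal = (3, 3, 1), (0, 0, 0)   # (missionaries, cannibals, boat) on the starting bank
        loads = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]   # possible boat loads
        frontier = deque([(start, [])])
        seen = {start}
        while frontier:
            (m, c, b), path = frontier.popleft()
            if (m, c, b) == goal:
                return path
            sign = -1 if b == 1 else 1       # the boat leaves, or returns to, the starting bank
            for dm, dc in loads:
                nm, nc, nb = m + sign * dm, c + sign * dc, 1 - b
                if 0 <= nm <= 3 and 0 <= nc <= 3 and safe(nm, nc) and (nm, nc, nb) not in seen:
                    seen.add((nm, nc, nb))
                    frontier.append(((nm, nc, nb), path + [(dm, dc)]))

    print(solve())   # prints the sequence of boat loads (missionaries, cannibals) that solves the puzzle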
Newell, Shaw and Simon's motive in developing this program was not to provide a practical tool for missionaries to use in the field. Rather, they used the missionary and cannibal problem as an example of a problem that could be solved by reasoning, and which was complex enough that there was no obvious algorithmic solution. If they could solve this problem, they expected, they could use very similar techniques to solve many other reasoning problems, including problems of practical interest. After all, they hadn't invented the problem to suit the computer; a very similar problem, concerning a farmer attempting to cross a river with a wolf, a goat and a cabbage, had been discussed by Alcuin, an advisor to Charlemagne, around 800 AD. Other river-crossing problems have been popular as brain-teasers ever since, and are found in Africa as well as most European traditions.
In retrospect, this problem has some peculiar features which make it unlike practical problems. Notably, all the information needed for a solution is provided in the problem. Any knowledge the solver may already have about boats, cannibals or missionaries is unnecessary -- and, since real cannibals and missionaries seldom behave in just this way, may even be a distraction.
The problem with this approach became clear when Newell and Simon attempted to extend it to larger problems: the number of possibilities to investigate becomes unmanageable very quickly, and keeping track of what has been tried becomes a major book-keeping problem. In 1967, further development of GPS was abandoned. A further difficulty with GPS was that the problems had to be `pre-digested' by the programmers: setting up the notation and the variables for each problem already goes halfway towards solving it.
In 1958, John McCarthy invented the programming language `LISP' at MIT. This language was explicitly designed for symbol processing, unlike earlier languages, which had been designed for numerical work. It continues to be one of the favourite languages for AI programming.
In 1962, Frank Rosenblatt published a book, ``Principles of Neurodynamics'', in which he described a class of devices known as perceptrons. These were parallel-processing devices, in which a `layer' of processing elements would each perform a simple calculation, based on a selection of a small number of inputs from a larger set. A second layer would then combine the results of these calculations to give a global conclusion, for example, ``The input is a picture of eight squares.'' Interest in these devices was high for several years, but the publication of `Perceptrons' by Minsky and Papert in 1969 discouraged further interest for about twenty years.
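The calculation performed by a single perceptron element is easy to sketch: a weighted sum of its inputs compared against a threshold, with the weights nudged whenever the output is wrong. The Python fragment below illustrates the idea only; it is not a reconstruction of Rosenblatt's machines.

    # A single perceptron element trained with the classical perceptron rule.

    def output(w, b, x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

    def train(samples, epochs=20, rate=0.1):
        w, b = [0.0] * len(samples[0][0]), 0.0
        for _ in range(epochs):
            for x, target in samples:
                err = target - output(w, b, x)          # 0 if correct, +1 or -1 if wrong
                w = [wi + rate * err * xi for wi, xi in zip(w, x)]
                b += rate * err
        return w, b

    # Learn the (linearly separable) OR function of two binary inputs.
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    w, b = train(data)
    print([output(w, b, x) for x, _ in data])           # [0, 1, 1, 1]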
In 1974, Marvin Minsky wrote an internal memo at the MIT AI labs, ``A Framework for Representing Knowledge''. This memo suggested a new strategy in AI, namely, creating sets of default expectations to aid programs in interpreting partial information. For example, if you're told that John went into a restaurant, ordered a hamburger, paid for it, and left, you can fill in various facts not given in the story, for example, that John probably ate the hamburger before leaving the restaurant. Minsky decided that one way to give a program appropriate expectations would be to provide it with `frames'. These can be thought of as skeleton scripts for particular situations, for example, ``Visiting a Restaurant'', ``Visiting the Doctor's Office''. The frames have various `slots', which represent possible roles in the script, for example, `customer', `waiter'. The frame can also contain default knowledge about these roles, for example, `the waiter brings food for the customer'. This idea has influenced the development of object-oriented programming.
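The essence of a frame can be sketched as a simple data structure: a set of named slots, some filled in by the particular story and the rest carrying default fillers. The sketch below illustrates the idea; it is not Minsky's own notation.

    # A toy `Visiting a Restaurant' frame: named slots with default fillers,
    # which the facts of a particular story can override.

    RESTAURANT_FRAME = {
        "customer": None,                    # to be filled in by the story
        "waiter": "some member of staff",    # default filler
        "food": None,
        "who_brings_food": "waiter",         # default knowledge about the roles
        "customer_eats_food": True,
        "customer_pays": True,
    }

    def instantiate(frame, **facts):
        """Fill a copy of the frame with the facts actually given in a story."""
        instance = dict(frame)
        instance.update(facts)
        return instance

    story = instantiate(RESTAURANT_FRAME, customer="John", food="hamburger")
    # Slots not mentioned in the story keep their defaults, so the program can
    # fill in that John probably ate the hamburger and that a waiter brought it.
    print(story["customer_eats_food"], story["who_brings_food"])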
Minsky's introduction of frames indicated a change in emphasis for AI research. Frames are a powerful method for organising knowledge about the world at appropriate levels of generality. Unfortunately, once we begin setting out our knowledge of the world in this (or any other) formal way, it becomes clear that we know a lot, but that our knowledge is very difficult to organise unambiguously.
One interesting attempt to organise some of our knowledge about the world was the `Naive Physics Project' [Hayes, P., ``The Naive Physics Manifesto'', in ``Expert Systems in the Microelectronic Age'', Editor D. Michie, Edinburgh University Press, 1978.]. Hayes started from the observation that we can predict quite a lot about the physical behaviour of the world -- for example, that if we tilt a glass of water sufficiently far, it will tip over and the water will spill out -- without doing any calculations. The standard method for generating a computer prediction of what happens to the glass would be to build a finite-element model of a contained liquid with a free surface in a gravitational field, then solve the Navier-Stokes equations for the fluid, starting from the initial condition of a given rotation about a horizontal axis. But it's quite clear that we don't make our prediction in this way. What do we need to know, and how do we reason, to conclude ``It'll spill!''?
This is a promising topic for investigation: we know from our own example that this kind of reasoning is possible; we can work on simple problems that are nevertheless situated within the real world; we can avoid having to reason about elusive concepts such as emotions and personalities; and a computer that can reason this way will probably be able to give understandable explanations for its predictions, which would be useful in many ways. Hayes estimated that this project would involve an order of magnitude more work than any prior AI project, but that it would be worth it.
Five years later, Hayes wrote `The Second Naive Physics Manifesto' [Hayes, P. J. (1985). The second naive physics manifesto. In Hobbs, J. R. and Moore, R. C., editors, Formal Theories of the Commonsense World, pages 1--36. Ablex, Norwood, New Jersey.]. Looking back on what had been achieved, he concluded that his original estimate of the size of the project had been off, and revised it to `two or three orders of magnitude more than any prior project.'
As the project proceeded, researchers began to distinguish between `naive physics', which would correspond to the common-sense predictions of ordinary people, and `qualitative physics', which would draw qualitative conclusions from qualitative axioms plus an appropriate set of reasoning tools: a qualitative mathematics. There were several candidates for this mathematics: an arithmetic in which the only three values were `negative', `zero' and `positive'; interval arithmetic; fuzzy mathematics. Naive physics, like ordinary people, might sometimes make wrong predictions; qualitative physics, on the other hand, should be a subset of real physics, so its predictions, though they might be vague, should never be wrong.
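The first of these candidates, an arithmetic over the three values negative, zero and positive, is simple enough to sketch directly. Note how adding values of opposite sign yields only `unknown', which is exactly where the vagueness of qualitative predictions comes from; the encoding below is a minimal illustration, not any particular qualitative-physics system.

    # Minimal sign arithmetic: values are '-', '0', '+', and '?' for "unknown".

    def sign_add(a, b):
        if a == '0': return b
        if b == '0': return a
        if a == b:   return a     # (+) + (+) = +, (-) + (-) = -
        return '?'                # (+) + (-) could come out either way, or zero

    def sign_mul(a, b):
        if a == '0' or b == '0': return '0'
        if '?' in (a, b):        return '?'
        return '+' if a == b else '-'

    print(sign_add('+', '+'))     # '+'
    print(sign_add('+', '-'))     # '?'  -- vague, but never wrong
    print(sign_mul('-', '-'))     # '+'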
The achievements of the qualitative physics projects were summed up fourteen
years after the original proposal by Sacks and Doyle, in
`Prolegomena to any future qualitative physics'.
Dr Douglas Lenat began work on such a project, called `CYC', in the mid-eighties.
As of 1994, CYC had half a million facts in its database, condensed from an earlier
total of two million. Various claims have been made for how much common sense
CYC currently displays; a sceptical assessment was made by Professor Vaughan
Pratt of Stanford -- see http://boole.stanford.edu/pub/cyc.report. Pratt's
conclusion was that at that time, CYC was unable to answer any commonsense
questions at all. His report contains an interesting list of such questions.
Eleven years later, how much more does CYC know? It's still rather difficult
to find out. You can download an open-source subset of CYC from http://www.opencyc.org/ ,
but what you get is a toolbox for a computer scientist, not an intelligence that you can
converse with. To ask CYC any of Pratt's questions, you still need a CYC expert
to translate them into a form CYC can handle, so even a successful answer may owe
more to the translator's common sense than to CYC's insight.
This falls short of the claims Lenat was making a decade ago -- by now, CYC was
supposed to understand English well enough to increase its understanding by doing
its own reading.
One aspect of being able to come up with `milk' is a certain degree of
self-monitoring: recognising that the question can be tackled by looking
at the letters making up the word, rather than at the meaning of the word
itself. This seems to be an essential part of human problem-solving -- asking at
intervals, ``Is this approach getting me anywhere? Is there a different
way of seeing the question?''
In a 1995 book, ``Fluid Concepts and Creative Analogies'', Hofstadter describes
a number of research projects to develop programs that can display this
kind of flexibility. One example is Copycat, a program that generates multiple
answers to questions of the form `If abc becomes abd, what
does kji become?'.
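Why such puzzles admit more than one answer is easy to exhibit: different descriptions of the change from abc to abd give different results when carried over to kji, as the sketch below shows. It is only a caricature of the problem Copycat addresses; Copycat itself builds its analogies with a far richer architecture of competing pressures.

    # Three ways of describing "abc -> abd", each applied to "kji".

    def replace_last_with_d(s):
        return s[:-1] + 'd'

    def increment_last_letter(s):
        return s[:-1] + chr(ord(s[-1]) + 1)

    def increment_leftmost_letter(s):         # seeing kji as abc read backwards
        return chr(ord(s[0]) + 1) + s[1:]

    for rule in (replace_last_with_d, increment_last_letter, increment_leftmost_letter):
        print(rule.__name__, '->', rule('kji'))
    # Prints kjd, kjj and lji -- three defensible answers to the same puzzle.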
Copycat is clearly designed for a microworld, the microworld of
short alphabetical sequences.
The difficulty of achieving analogical reasoning within such a microworld
shows, Hofstadter argues, that we are still very far from being able to
achieve it in a larger world.
In focusing on these activities, we neglect the much larger class of
activities in which all people outperform existing machines -- for example,
recognising faces, finding the way to the bus stop, carrying coffee cups
with the open end at the top. And these classes of activity are much less
accessible to introspection.
In the early literature on IQ testing, we find the idea that intelligence
is a single capacity, perhaps resolvable into a small number of
specializations, such as verbal, mathematical and visuo-spatial
intelligence. This provides a clear motivation for designing
intelligent systems: we have only to provide them with this single capacity,
and they will be able to handle whatever task we need performed.
One of our laws of engineering is applicable here: rather than solve a
thousand design problems for a thousand specialised tasks, we need only
solve a single design problem -- creating a general-purpose intelligence --
and then mass-produce intelligent robots, adaptable to any task at hand.
The history of AI research runs counter to this idea. We have a
collection of systems which excel at narrowly defined tasks -- playing chess,
doing arithmetic -- but no way of generalising these abilities has been
found.
So perhaps we have the wrong idea of what intelligence is.
When we use introspection to examine our thought processes, perhaps
we can only see those thought processes that move very slowly, while the
bulk of our mental activity goes by too fast to be noticed.
More difficult to contend with are philosophers who argue that
passing the Turing test -- or passing any other behavioural criterion
of intelligence -- is insufficient to show that the computer really
understands what it's doing.
These thinkers include John Searle, of the University of California at Berkeley;
Roger Penrose, of the University of Oxford; and John Bird, of SFU.
Penrose's argument makes use of Godel's incompleteness theorem, which states that
given any sufficiently powerful system of formal rules, either there is a
true statement that cannot be proved within the system, or the system
is self-contradictory. We can view the program of a computer
as a set of formal rules, and conclude that there will be true conclusions
that the computer cannot reach. But we can reach them.
Dr Bird's argument may be summarised as follows:
The expression `agent of necessity' means a system whose
outputs are fully determined by its inputs. For example, an `AND'
gate is a computer switch with two inputs and a single output.
The behaviour of the `AND' gate is fully specified by
the rule ``If both inputs are `1', output `1'; otherwise
output `0'.'' Clearly there is no room for an AND gate to
manifest intelligence, choice, or awareness. But, Dr Bird argues,
a computer is simply a network of such simple logic gates, and its behaviour will
be fully determined by the behaviour of these gates. Therefore,
however large the computer, it can never display any trace of
intelligence.
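The determinism this argument rests on is easy to exhibit: a network of logic gates is a pure function, and the same inputs always produce the same outputs. The sketch below, a one-bit half-adder built from AND, OR and NOT gates, illustrates that premise; it is not Dr Bird's own formulation.

    # A deterministic gate network: a one-bit half-adder.
    # The same inputs give the same outputs, every time the program is run.

    def AND(a, b): return a & b
    def OR(a, b):  return a | b
    def NOT(a):    return 1 - a

    def half_adder(a, b):
        total = AND(OR(a, b), NOT(AND(a, b)))   # XOR assembled from AND, OR and NOT
        carry = AND(a, b)
        return total, carry

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, half_adder(a, b))       # this table never varies between runs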
In solving large problems, computer scientists sometimes make use of a method called
`the Monte Carlo method'. This may be illustrated by a simple example:
suppose we wish to calculate a value for pi. We construct a circle of unit radius.
If we could measure its area, that would give us the value. We'll suppose, however,
that we cannot directly measure the area, though
we can tell whether any given point is inside or outside the circle.
(This is obviously unrealistic for something as simple as a circle, but
is quite reasonable for the regions in multi-dimensional space where this method
would actually be used.)
I have a random-number generator, based on measuring the time between successive disintegrations
of nuclei in a lump of a radio-isotope. I normalize a series of these random numbers to lie
between -1 and 1, and take successive pairs of numbers as the coordinates of points within the
square having opposite vertices at (-1,-1) and (1,1). I place a large number of points, and count
the fraction that land within the circle. This fraction is an estimate of pi/4.
Applying this method repeatedly would give different answers every time, though if the number of
points were large enough, they would all be fairly close to the right answer.
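The procedure is only a few lines of code. The sketch below uses the pseudo-random generator from Python's standard library rather than a radio-isotope, which is enough to make the point: repeated runs give slightly different answers, all close to pi.

    import random

    # Estimate pi by scattering random points over the square [-1,1] x [-1,1]
    # and counting the fraction that land inside the unit circle.

    def estimate_pi(n_points):
        inside = 0
        for _ in range(n_points):
            x, y = random.uniform(-1, 1), random.uniform(-1, 1)
            if x * x + y * y <= 1:          # the point landed inside the circle
                inside += 1
        return 4.0 * inside / n_points      # the fraction inside estimates pi/4

    for _ in range(3):
        print(estimate_pi(100000))          # three runs, three slightly different answers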
This might be a very small step in a much larger calculation. I might ask a grad student
to provide an estimate of pi, and not care about how he got it. So in describing the large
calculation, it would appear to me that each step followed from the previous step by
mathematical necessity. Yet repeating the whole process might give me a different answer.
Now, Dr Bird has anticipated this objection, and his response is to separate the random-number
generator from the rest of the system, and to argue that the rest of the system remains an
agent of necessity. But this may not be possible. Suppose, for example, that we are
considering a neural-net architecture of 1,000,000 neurons, and each neuron makes decisions
by saying to itself, ``If the sum of my inputs is greater than pi, I will fire; otherwise not.''
Its estimate of pi is, of course, provided by its own Monte Carlo routine and associated radio-isotope.
John's argument requires him to take the net apart and consider the programmable part and the random-number
generators separately; but then he's no longer studying the architecture I've designed.
So for Searle, and for Dr Bird, we can never really know if a creature is
intelligent or not. Suppose, for example, that a spaceship lands in the AQ tomorrow and an alien
emerges. How are we to tell if the alien is really intelligent? We could try giving him tests
or drawing conclusions from the fact that he's the one who's built a spaceship and reached our planet,
not vice-versa; but Searle's Chinese Room argument has already ruled these out.
So, can we look inside?
Assuming the alien will humour us for a while, we strap him onto an operating table and
prepare our X-rays, CAT scans and electron microscopes. But having got him on the table, it strikes
us: we haven't the faintest idea what we're looking for. If we look at the alien's brain, we
can expect to see matter in motion. But we already know that matter in motion obeys the laws of
physics, and the laws of physics are of only two kinds: there are deterministic laws, governing
the behaviour of macroscopic matter, and there are probabilistic laws, governing the collapse of
quantum wave-functions under measurement. Science knows of no other way for matter to behave.
So it seems that we're putting the alien through a test that none of us could pass: for Dr Bird
to call him intelligent, his brain has to behave in a way that contradicts the known laws of
physics. But we've never seen any other system behave in such a way, so why are we looking for it now?
As soon as we return from the world of philosophy to the real world, all these difficulties disappear.
We know very well what intelligence
is and how to measure it. Dr Bird himself, when assigning grades for a course, will test the understanding
of his students to see if they've really understood the material or are just repeating phrases they've
memorised from a textbook. And he does this, not by probing their brains,
but by asking questions: ``How will the circuit behave if this resistance is increased?
What could cause this circuit to go unstable?'' This is the approach we all use in making judgements of
intelligence, something we do every day with no difficulty.
CYC
How much knowledge does an entity need if it is going to be able to display
some approximation to common sense? Marvin Minsky was fairly optimistic:
``I can't think of a hundred things that I know a thousand facts about,''
he said, suggesting that a database of a hundred thousand facts might be
big enough. A hundred thousand facts is a lot, but not beyond the scope of
a large research project. And, if this big database is combined with a reasoning
engine, it should be able to fill in the gaps in its knowledge by inferring new
facts from what it already knows.
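The `reasoning engine' half of this proposal can be suggested with a very small sketch: a handful of facts and if-then rules, applied repeatedly until nothing new can be derived. CYC's machinery is of course far more elaborate; the facts and rules below are invented purely for illustration.

    # A toy forward-chaining reasoner: apply rules to a set of facts until
    # no new facts can be derived.

    facts = {("Clyde", "is-a", "elephant"),
             ("elephant", "is-a", "mammal")}

    def isa_is_transitive(f):
        return {(x, "is-a", z)
                for (x, r1, y) in f if r1 == "is-a"
                for (y2, r2, z) in f if r2 == "is-a" and y2 == y}

    def mammals_are_warm_blooded(f):
        return {(x, "has-property", "warm-blooded")
                for (x, r, y) in f if r == "is-a" and y == "mammal"}

    rules = [isa_is_transitive, mammals_are_warm_blooded]

    changed = True
    while changed:
        changed = False
        for rule in rules:
            new_facts = rule(facts) - facts
            if new_facts:
                facts |= new_facts
                changed = True

    # That Clyde is warm-blooded was never stated; it has been inferred.
    print(("Clyde", "has-property", "warm-blooded") in facts)   # True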
Hofstadter's Fluid Analogies
A researcher by the name of Douglas Hofstadter, best known for his Pulitzer-prize-winning
Godel, Escher, Bach: An Eternal Golden Braid, has pursued a quite different
line of attack.
Hofstadter observes that one of the essential characteristics of intelligence is
fluidity: the ability to solve a problem by seeing it in an unconventional
way. For example, imagine asking an expert system ``What is klim?''. We can imagine
a powerful computer scanning through extensive lists of the chemical and common names
for substances and drawing a blank, yet a smart ten-year-old would probably hit on
the right answer, which is that it's `milk' spelled backwards.
Late Nineties, Early Zeroes: Behaviour-Based Robotics
By the late nineties, it was becoming clear that we would not have a computer
capable of passing the Turing test anytime soon.
A researcher at MIT, Rodney Brooks, argued that most prior attempts to
achieve intelligence had been akin to building skyscrapers by starting at
the ninetieth floor: researchers had assumed that the hard problem was
reasoning about concepts. These concepts would be generated by perceptions
of the world, and plans resulting from the reasoning process would be
enacted by suitable actuators, but the engineering of suitable sensors
and actuators was a relatively trivial matter that could be left
to technicians.
But surveying the animal kingdom, Brooks argued, it was obvious that for the
first few hundred million years of evolution, animals had managed to survive
quite well without reasoning about any concepts whatever. It seems very unlikely
that a cockroach, for example, constructs a mental map of the world based on
its perceptions; and yet cockroaches have a fair degree of success in locating
food and avoiding dangers.
Therefore, the simplest strategy in designing an organism that can cope with the world
is to have its sensors drive its actuators directly, just as a bright light
immediately causes our iris to contract.
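A behaviour-based controller of this kind can be sketched in a few lines: each rule maps the current sensor readings directly onto motor commands, with no model of the world in between. The sensor and motor functions named below are hypothetical placeholders, not any particular robot's interface.

    # A minimal reactive controller: sensor readings drive the actuators directly.
    # read_light_sensors() and set_wheel_speeds() are hypothetical placeholders.

    def control_step(read_light_sensors, set_wheel_speeds):
        left_light, right_light = read_light_sensors()
        if left_light > 0.8 and right_light > 0.8:
            set_wheel_speeds(-1.0, 1.0)       # too bright ahead: spin away
        elif left_light > right_light:
            set_wheel_speeds(0.5, 1.0)        # veer away from the brighter side
        elif right_light > left_light:
            set_wheel_speeds(1.0, 0.5)
        else:
            set_wheel_speeds(1.0, 1.0)        # nothing to avoid: go straight ahead

    # In use, control_step would be called in a tight loop many times a second,
    # much as a bright light immediately drives the iris to contract.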
Conspicuous Failures, Invisible Successes
What Intelligence is
Parallel with the history of the AI movement is a change in our conception of
intelligence. In 1900, the ability to play a good game of chess would
certainly have been one criterion, as would facility in mental arithmetic,
brainteasers and logic puzzles.
These activities have several features in common: they are all activities in
which clever people outperform ordinary people, and they are all to some
extent accessible to introspection.
Philosophers
One of the best-known critics of AI is Hubert Dreyfus, who
argues that computers will never be capable of certain tasks.
(His list of such tasks at one point included `playing master-level
chess'.) His objections can most economically be dealt with
by waiting: progress, or lack of progress, in AI will eventually
show whether he's right.
Searle
Searle's chief argument is something he calls the `Chinese Room'.
It works as follows: we have a large box with a slot in it. If
you push a question, written in Chinese, through the slot, after a
few minutes an answer, also written in Chinese, will get pushed out
through the slot. A Chinese-speaking person may converse with the
box for several minutes, and come to believe that the box, or something
in the box, understands Chinese. But then we open the box, and inside we
find a grad student and a large collection of books. The student, who only
speaks English, looks up the incoming characters in the books, then follows
various rules, also printed in the books, to assemble an answer, which he then
pushes out through the slot. Note that he doesn't translate the
characters into English; he follows the rules blindly. But since he doesn't understand what he's written,
there really is no understanding of Chinese going on inside the box. Therefore,
says Searle, the fact that an entity acts like it understands is no proof that it really
understands.
Penrose
Dr John Bird
Some Counter-Arguments
Computers need not be deterministic
Penrose's argument and Dr Bird's argument both assume that all
computers are deterministic systems. But it is easy to construct
a computer which is non-deterministic:
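For instance, any ordinary computer becomes non-deterministic the moment it is allowed to read from a physical noise source -- the radio-isotope timer described above, or, in the sketch below, the operating system's entropy pool, which is fed by physical events outside the program's control. Repeated runs of the same program then produce different outputs.

    import os

    # A trivially non-deterministic computation: the result depends on bytes drawn
    # from the operating system's entropy pool (os.urandom) rather than on anything
    # in the program's inputs.

    def coin_flip():
        return "heads" if os.urandom(1)[0] % 2 == 0 else "tails"

    print([coin_flip() for _ in range(10)])   # a different sequence on every run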
Computers need not manipulate symbols according to formal rules
Searle's and Penrose's arguments both assume that the only thing computers can do is to
manipulate symbols according to rules. But there already exist computers, for example,
neural net computers, which make no use of symbols.
Symbols may emerge from the operation of the neural net, just as the operation of our
brains gives rise to symbolic entities such as words. But we don't find symbols at the level
of the individual neurons.
`Intelligent' is not a difficult word
Another characteristic of the arguments advanced by these three thinkers is that
they all make `intelligent' a mysterious word. For Searle, for example,
an entity that behaves intelligently may always turn out to be faking it
-- if we can just figure out how to look `inside' it.
John Jones
Tue Aug 30 14:38:19 PST 2005