
Artificial Intelligence

In reviewing the history of artificial intelligence, there is a risk of being side-tracked into philosophical discussions of what intelligence really is. We will avoid these discussions by beginning with Turing's operational definition of artificial intelligence; then we'll return to take a brief look at the philosophical questions at the end.

The Turing Test

The computer scientist Alan Turing proposed a practical test that would replace the question, `Can a machine think?'. The test he proposed (``Computing Machinery and Intelligence'', in Mind, 59, pp. 433-460, 1950) was to interrogate the supposedly intelligent entity via some neutral interface, such as a teletype, so that the interrogator would not be misled by superficial details such as whether the entity was made of metal or meat.

Turing's paper includes a sample conversation between an interrogator and a human, covering a range of topics, including arithmetic and poetry. After reading this dialogue, it is impossible to imagine that the participants are not intelligent. It is also clear that no existing system is even close to being able to pass the test. However, we do now have a definite goal to aim at. So, after Turing, AI is aiming at achieving particular performances, and leaving the question of whether they can be called intelligent to the philosophers.

The ELIZA Effect

Very early in the history of AI -- in the early Sixties -- a researcher at MIT wrote a very simple program that he called ELIZA. (See http://i5.nyu.edu/~mm64/x52.9265/january1966.html ). ELIZA was a relatively short program that ran on computers with less processing power than a modern digital watch. Its creator, Joseph Weizenbaum, used a few simple tricks to give the illusion that the program was intelligent. The first of these was to have the program simulate a non-directive psychiatrist -- in this way, the person interacting with the program is tempted to interpret the program's naive responses as subtle prompts. Weizenbaum gives the example: ``If, for example, one were to tell a psychiatrist

"I went for a long boat ride"

and he responded

"Tell me about boats",

one would not assume that he knew nothing about boats, but that he had some purpose in so directing the subsequent conversation.''

Weizenbaum was surprised and horrified to find that many people fell for these simple tricks and attributed intelligence to the program. It seems that people have an innate tendency to attribute personality to inanimate objects, just as we find it easy to see two dots and a line as making an expressive face :)

This does not invalidate the Turing test, but it does show that the test has to be administered sceptically and searchingly.

Games and Puzzles

In the late Forties, the Fifties and the Sixties, AI research concentrated on playing games and solving logic puzzles. The appeal of these domains was that they required very limited amounts of knowledge -- to play chess, you only need to know how the pieces move; you can remain ignorant of the existence of Dickens and even Shakespeare. This was a necessary limitation, since in those days memory was scarce and expensive.

One particular researcher, Arthur Samuel, spent the years 1947 to 1967 programming an IBM 700-series computer to play checkers. This was one of the most successful early attempts at programming game-playing; by the 1960s, it was playing at club level. (A.L. Samuel, ``Some Studies in Machine Learning Using the Game of Checkers II -- Recent Progress.'' IBM Journal of Research and Development, Volume 11, No. 6, pp. 601-617, November 1967.) Reaching world-class level took considerably longer, with a checkers-playing program becoming world champion in 1996. (The website http://www.math.wisc.edu/~propp/chinook.html gives a good account of this.)

Chess-playing programs were developed by several researchers, including Turing himself and Claude Shannon, the inventor of information theory. (``Programming a Computer for Playing Chess'', Philosophical Magazine, 41, pp. 256-275, 1950.) The subsequent advance of chess-playing computers to the world championship is more a tribute to brute-force methods than to ingenious programming; Deep Blue could beat Kasparov because it was capable of immense parallel search, but this is not how humans play chess.

For us to call something intelligent, it must have a certain breadth of ability. Some human beings have remarkable arithmetic capabilities -- for example, they can multiply 50-digit numbers mentally and identify six-figure primes -- while being in other respects subnormal. We don't call them intelligent; rather, we have the expression `idiot savants' to describe them. Similarly, researchers in the Fifties were looking for ways to give computers more general problem-solving abilities. One significant move in this direction was the `GPS' or `General Problem Solver' developed by Newell, Shaw and Simon. (A. Newell and H.A. Simon, ``GPS: A Program that Simulates Human Thought'', in Feigenbaum and Feldman (eds), Computers and Thought, McGraw-Hill, New York, 1963, pp. 279-293.) GPS was used to solve the `Cannibal and Missionary' problem -- we have three missionaries and three cannibals on one side of a river, plus a boat. All wish to cross to the other side, but the boat will only hold two, and if the cannibals ever outnumber the missionaries on either side of the river, the missionaries will get eaten. GPS solved this by, essentially, trial and error: the programmers provided a description of the problem in terms of states and operations, and the program then tried to reduce the difference between the current state and the goal state by applying operations, backtracking as necessary.
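
To make the state-and-operator formulation concrete, here is a minimal sketch in Python. (This is an illustration only, not the original GPS, which used means-ends analysis rather than the blind breadth-first search shown here.) A state records how many missionaries and cannibals, and whether the boat, are on the starting bank; the operators are the five possible boat-loads.

  from collections import deque

  # A state is (missionaries, cannibals, boat) counted on the starting bank.
  MOVES = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]   # the five possible boat-loads

  def safe(m, c):
      # Each bank is safe if it holds no missionaries, or at least as many
      # missionaries as cannibals.
      return (m == 0 or m >= c) and (3 - m == 0 or 3 - m >= 3 - c)

  def solve():
      start, goal = (3, 3, 1), (0, 0, 0)
      frontier = deque([(start, [start])])
      seen = {start}
      while frontier:
          (m, c, b), path = frontier.popleft()
          if (m, c, b) == goal:
              return path
          for dm, dc in MOVES:
              # Move the boat-load away from whichever bank the boat is on.
              nm, nc, nb = (m - dm, c - dc, 0) if b == 1 else (m + dm, c + dc, 1)
              state = (nm, nc, nb)
              if 0 <= nm <= 3 and 0 <= nc <= 3 and safe(nm, nc) and state not in seen:
                  seen.add(state)
                  frontier.append((state, path + [state]))

  # A solution exists, so solve() returns the sequence of states to print.
  for state in solve():
      print(state)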

Newell, Shaw and Simon's motive in developing this program was not to provide a practical tool for missionaries to use in the field. Rather, they used the missionary and cannibal problem as an example of a problem that could be solved by reasoning, and which was complex enough that there was no obvious algorithmic solution. If they could solve this problem then, they expected, they could use very similar techniques to solve many other reasoning problems, including problems of practical interest. After all, they hadn't invented the problem to suit the computer; a very similar problem concerning a farmer attempting to cross a river with a wolf, a goat and a cabbage had been discussed by Alcuin, an advisor to Charlemagne, around 800 AD. Other river-crossing problems have been popular as brain-teasers ever since, and are found in Africa as well as most European traditions.

In retrospect, this problem has some peculiar features which make it unlike practical problems. Notably, all the information needed for a solution is provided in the problem. Any knowledge the solver may already have about boats, cannibals or missionaries is unnecessary -- and, since real cannibals and missionaries seldom behave in just this way, may even be a distraction.

The problem with this approach became clear when Newell and Simon attempted to extend it to larger problems: the number of possibilities to investigate grows unmanageable very quickly, and keeping track of what's been tried becomes a major book-keeping problem. In 1967, further development of GPS was abandoned. A further problem with GPS was that the problems had to be `pre-digested' by the programmers: setting up the notation and the variables for each problem already goes halfway towards solving it.

In 1958, John McCarthy invented the programming language `LISP' at MIT. Unlike earlier languages, which had been designed for numerical work, LISP was explicitly designed for symbol processing. It continues to be one of the favourite languages for AI programming.

In 1962, Frank Rosenblatt published a book, ``Principles of Neurodynamics'', in which he described a class of devices known as perceptrons. These were parallel-processing devices, in which a `layer' of processing elements would each perform a simple calculation, based on a selection of a small number of inputs from a larger set. A second layer would then combine the results of these calculations to give a global conclusion, for example, ``The input is a picture of eight squares.'' Interest in these devices was high for several years, but the publication of `Perceptrons' by Minsky and Papert in 1969 discouraged further interest for about twenty years.
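
A minimal sketch of the arrangement just described (in Python, with modern notation rather than Rosenblatt's own, and with the crucial learning procedure for adjusting the weights left out): each first-layer unit thresholds a weighted sum of a few randomly chosen inputs, and a second-layer unit combines their verdicts into a single conclusion.

  import random

  def unit(inputs, weights, threshold):
      # Fire (return 1) if the weighted sum of the inputs exceeds the threshold.
      return 1 if sum(w * x for w, x in zip(weights, inputs)) > threshold else 0

  def perceptron(image):
      random.seed(0)                       # fixed wiring, so every call sees the same connections
      hidden = []
      for _ in range(8):
          positions = random.sample(range(len(image)), 3)
          # Each first-layer unit fires if at least two of its three pixels are on.
          hidden.append(unit([image[p] for p in positions], [1.0, 1.0, 1.0], 1.5))
      # The second layer fires only if most of the first-layer units agree.
      return unit(hidden, [1.0] * len(hidden), 4.0)

  example = [0, 1, 1, 0, 1, 0, 1, 1, 0]    # a toy 3x3 binary image, flattened
  print(perceptron(example))               # 1 or 0: the global conclusion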

The Seventies and Eighties

The fifties and sixties in AI were characterised by puzzle-solving and game-playing -- attempts to generate intelligent behaviour within a very limited domain. Puzzles and games share the feature that no background knowledge is required; once you know the statement of the puzzle or the rules of the game, all you need is reasoning power. But if we look at the model answers that Turing imagines his intelligent computer giving in the Turing test, it's clear that they use a lot of background knowledge -- for example, the computer knows who Mr Pickwick is. One approach to bridging the gap between puzzle-solving and intelligent behaviour in the real world was the creation of `micro-worlds': simplified versions of the world, yet complex enough that heuristic reasoning was needed to cope with them. Terry Winograd did this with the robot `SHRDLU', which inhabited an imaginary `Blocks World', consisting of various rectangular and pyramidal blocks on an imaginary table top. SHRDLU could answer questions about these blocks, and re-arrange them to order. However, despite the excitement that SHRDLU generated in the AI community, Winograd himself came to take a very pessimistic view of the prospects for AI's success -- see his book with Flores, ``Understanding Computers and Cognition''.

In 1974, Marvin Minsky wrote an internal memo at the MIT AI labs, ``A Framework for Representing Knowledge''. This memo suggested a new strategy in AI, namely, creating sets of default expectations to aid programs in interpreting partial information. For example, if you're told that John went into a restaurant, ordered a hamburger, paid for it, and left, you can fill in various facts not given in the story, for example, that John probably ate the hamburger before leaving the restaurant. Minsky decided that one way to give a program appropriate expectations would be to provide it with `frames'. These can be thought of as skeleton scripts for particular situations, for example, ``Visiting a Restaurant'', ``Visiting the Doctor's Office''. The frames have various `slots', which represent possible roles in the script, for example, `customer', `waiter'. The frame can also contain default knowledge about these roles, for example, `the waiter brings food for the customer'. This idea has influenced the development of object-oriented programming.
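
One way to picture a frame is as a record with named slots and default fillers. The sketch below is an illustration in Python, not Minsky's own notation: a `restaurant visit' frame whose unfilled slots fall back on default expectations.

  # A sketch of a frame: named slots, some filled from the story, the rest
  # falling back on default expectations.

  RESTAURANT_FRAME = {
      "customer": None,                   # to be filled in from the story
      "food":     None,
      "waiter":   "an unnamed waiter",    # defaults used when the story is silent
      "outcome":  "customer ate the food, paid, and left",
  }

  def instantiate(frame, **facts):
      # Fill the slots we are told about; keep the defaults for the rest.
      instance = dict(frame)
      instance.update(facts)
      return instance

  story = instantiate(RESTAURANT_FRAME, customer="John", food="a hamburger")
  # We were never told that John ate the hamburger, but the frame supplies
  # the default expectation:
  print(story["outcome"])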

Minsky's introduction of frames indicated a change in emphasis for AI research. Frames are a powerful method for organising knowledge about the world at appropriate levels of generality. Unfortunately, once we begin setting out our knowledge of the world in this (or any other) formal way, it becomes clear that we know a lot, but that our knowledge is very difficult to organise unambiguously.

One interesting attempt to organise some of our knowledge about the world was the `Naive Physics Project' [Hayes, P., ``The Naive Physics Manifesto'', in ``Expert Systems in the Microelectronic Age'', Editor D. Michie, Edinburgh University Press, 1978.]. Hayes started from the observation that we can predict quite a lot about the physical behaviour of the world -- for example, that if we tilt a glass of water sufficiently far, it will tip over and the water will spill out -- without doing any calculations. The standard method for generating a computer prediction of what happens to the glass would be to build a finite-element model of a contained liquid with a free surface in a gravitational field, then solve the Navier-Stokes equations for the fluid, starting from the initial condition of a given rotation about a horizontal axis. But it's quite clear that we don't make our prediction in this way. What do we need to know, and how do we reason, to conclude ``It'll spill!''?

This is a promising topic for investigation: we know from our own example that this kind of reasoning is possible; we can work on simple problems that are nevertheless situated within the real world; we can avoid having to reason about elusive concepts such as emotions and personalities; and a computer that can reason this way will probably be able to give understandable explanations for its predictions, which would be useful in many ways. Hayes estimated that this project would involve an order of magnitude more work than any prior AI project, but that it would be worth it.

Five years later, Hayes wrote `The Second Naive Physics Manifesto' [Hayes, P. J. (1985). The second naive physics manifesto. In Hobbs, J. R. and Moore, R. C., editors, Formal Theories of the Commonsense World, pages 1--36. Ablex, Norwood, New Jersey.]. Looking back on what had been achieved, he concluded that his original estimate of the size of the project had been off, and revised it to `two or three orders of magnitude more than any prior project.'

As the project proceeded, researchers began to distinguish between `naive physics', which would correspond to the common-sense predictions of ordinary people, and `qualitative physics', which would draw qualitative conclusions from qualitative axioms plus an appropriate set of reasoning tools: a qualitative mathematics. There were several candidates for this mathematics: an arithmetic in which the only three values were `negative', `zero' and `positive'; interval arithmetic; fuzzy mathematics. Naive physics, like ordinary people, might sometimes make wrong predictions; qualitative physics, on the other hand, should be a subset of real physics, so its predictions, though they might be vague, should never be wrong.
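
As a small illustration of the first of these candidates, here is a sketch of sign arithmetic in Python: the only values are negative, zero and positive, and some results are honestly reported as indeterminate rather than guessed at.

  # A sketch of qualitative (sign) arithmetic: the only values are
  # 'neg', 'zero', 'pos', plus '?' for "sign cannot be determined".

  def q_mul(a, b):
      if a == "zero" or b == "zero":
          return "zero"
      return "pos" if a == b else "neg"

  def q_add(a, b):
      if a == "zero":
          return b
      if b == "zero":
          return a
      if a == b:
          return a
      return "?"          # positive plus negative: the sign is indeterminate

  # Vague but never wrong: an inflow plus a leak gives an indeterminate net
  # rate, while two inflows give a positive one.
  print(q_add("pos", "neg"))   # '?'
  print(q_add("pos", "pos"))   # 'pos'
  print(q_mul("neg", "neg"))   # 'pos'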

The achievements of the qualitative physics projects were summed up fourteen years after the original proposal in `Prolegomena to any future qualitative physics', by Sacks and Doyle. Sacks and Doyle concluded that very little had been achieved: most examples in the literature of qualitative mathematics being successfully applied to physical problems involved minor variants on one or two examples. Many of the generalised differential equations derived in the search for a qualitative physics contained no useful information. The best tool for reasoning about physical systems, they suggested, was traditional advanced mathematics.

CYC

How much knowledge does an entity need if it is going to be able to display some approximation to common sense? Marvin Minsky was fairly optimistic: ``I can't think of a hundred things that I know a thousand facts about,'' he said, suggesting that a database of a hundred thousand facts might be big enough. A hundred thousand facts is a lot, but not beyond the scope of a large research project. And, if this big database is combined with a reasoning engine, it should be able to fill in the gaps in its knowledge by inferring new facts from what it already knows.

Dr Douglas Lenat began work on such a project, called `CYC', in the mid-eighties. As of 1994, CYC had half a million facts in its database, condensed from an earlier total of two million. Various claims have been made for how much common sense CYC currently displays; a sceptical assessment was made by Professor Vaughan Pratt of Stanford -- see http://boole.stanford.edu/pub/cyc.report. Pratt's conclusion was that, at that time, CYC was unable to answer any commonsense questions at all; his report contains an interesting list of such questions.

Eleven years later, how much more does CYC know? It's still rather difficult to find out. You can download an open-source subset of CYC from http://www.opencyc.org/ , but what you get is a toolbox for a computer scientist, not an intelligence that you can converse with. To ask CYC any of Pratt's questions, you still need a CYC expert to translate them into a form CYC can handle, so even a successful answer may owe more to the translator's common sense than to CYC's insight. This represents a failure of Lenat's claims from a decade ago -- by now, CYC was supposed to understand English well enough to increase its understanding by doing its own reading.

Hofstadter's Fluid Analogies

A researcher by the name of Douglas Hofstadter, best known for his Pulitzer-prize-winning Gödel, Escher, Bach: An Eternal Golden Braid, has pursued a quite different line of attack. Hofstadter observes that one of the essential characteristics of intelligence is fluidity: the ability to solve a problem by seeing it in an unconventional way. For example, imagine asking an expert system ``What is klim?''. We can imagine a powerful computer scanning through extensive lists of the chemical and common names for substances and drawing a blank, yet a smart ten-year-old would probably hit on the right answer, which is that it's `milk' spelled backwards.

One aspect of being able to come up with `milk' is a certain degree of self-monitoring: recognising that the question can be tackled by looking at the letters making up the word, rather than at the meaning of the word itself. This seems to be an essential part of human problem-solving -- asking at intervals, ``Is this approach getting me anywhere? Is there a different way of seeing the question?''

In a 1995 book, ``Fluid Concepts and Creative Analogies'', Hofstadter describes a number of research projects to develop programs that can display this kind of flexibility. One example is Copycat, a program that generates multiple answers to questions of the form `If abc becomes abd, what does kji become?'.
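
To see why such a puzzle admits more than one defensible answer, consider a few hand-coded readings of the rule. (This is a deliberately crude sketch, nothing like Copycat's actual architecture, which explores competing interpretations stochastically and in parallel.)

  # A crude sketch of why 'abc -> abd, kji -> ?' has several defensible answers:
  # three hand-coded readings of the same rule.

  def successor(ch):
      return chr(ord(ch) + 1)

  def rule_literal(s):
      # Reading 1: "replace the last letter with its successor"
      return s[:-1] + successor(s[-1])

  def rule_mirrored(s):
      # Reading 2: "kji runs backwards, so change its first letter instead"
      return successor(s[0]) + s[1:]

  def rule_positional(s):
      # Reading 3: "replace the third letter with 'd'" -- slavishly literal
      return s[:2] + "d"

  for rule in (rule_literal, rule_mirrored, rule_positional):
      print(rule("kji"))
  # Prints kjj, lji, kjd: three answers, each defensible under some reading.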

Copycat is clearly designed for a microworld, the microworld of short alphabetical sequences. The difficulty of achieving analogical reasoning within such a microworld shows, Hofstadter argues, that we are still very far from being able to achieve it in a larger world.

Late Nineties, Early Zeroes: Behaviour-Based Robotics

By the late nineties, it was becoming clear that we would not have a computer capable of passing the Turing test anytime soon. A researcher at MIT, Rodney Brooks, argued that most prior attempts to achieve intelligence had been akin to building skyscrapers by starting at the ninetieth floor: researchers had assumed that the hard problem was reasoning about concepts. These concepts would be generated by perceptions of the world, and plans resulting from the reasoning process would be enacted by suitable actuators, but the engineering of suitable sensors and actuators was a relatively trivial matter that could be left to technicians. But surveying the animal kingdom, Brooks argued, it was obvious that for the first few hundred million years of evolution, animals had managed to survive quite well without reasoning about any concepts whatever. It seems very unlikely that a cockroach, for example, constructs a mental map of the world based on its perceptions; and yet cockroaches have a fair degree of success in locating food and avoiding dangers. Therefore, the simplest strategy in designing an organism that can cope with the world is to have its sensors drive its actuators directly, just as a bright light immediately causes our pupils to contract.
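
The flavour of this approach can be suggested by a few reactive rules wired straight from sensor readings to motor commands, with no world model in between. (This is a toy sketch, not Brooks's subsumption architecture, which layers such behaviours much more carefully.)

  # A toy sketch of sensor-driven (reactive) control: no world model, no planning,
  # just rules from raw sensor readings to motor commands.

  def control(left_bump, right_bump, light_level):
      # Return (left_wheel, right_wheel) speeds directly from the sensors.
      if left_bump:                 # hit something on the left: back off and turn right
          return (-1.0, -0.5)
      if right_bump:                # hit something on the right: back off and turn left
          return (-0.5, -1.0)
      if light_level > 0.8:         # too bright: back away
          return (-0.5, -0.5)
      return (1.0, 1.0)             # otherwise, keep moving forward

  # Each control cycle simply re-reads the sensors and re-applies the rules.
  print(control(False, False, 0.2))   # (1.0, 1.0): nothing in the way, go forward
  print(control(True, False, 0.2))    # (-1.0, -0.5): bumped on the left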

Conspicuous Failures, Invisible Successes

What Intelligence is

Parallel with the history of the AI movement is a change in our conception of intelligence. In 1900, the ability to play a good game of chess would certainly have been one criterion, as would facility in mental arithmetic, brainteasers and logic puzzles. These activities have several features in common: they are all activities in which clever people outperform ordinary people, and they are all to some extent accessible to introspection.

In focusing on these activities, we neglect the much larger class of activities in which all people outperform existing machines -- for example, recognising faces, finding the way to the bus stop, carrying coffee cups with the open end at the top. And these classes of activity are much less accessible to introspection.

In the early literature on IQ testing, we find the idea that intelligence is a single capacity, perhaps resolvable into a small number of specializations, such as verbal, mathematical and visuo-spatial intelligence. This provides a clear motivation for designing intelligent systems: we have only to provide them with this single capacity, and they will be able to handle whatever task we need performed. One of our laws of engineering is applicable here: rather than solve a thousand design problems for a thousand specialised tasks, we need only solve a single design problem -- creating a general-purpose intelligence -- and then mass-produce intelligent robots, adaptable to any task at hand.

The history of AI research runs counter to this idea. We have a collection of systems which excel at narrowly defined tasks -- playing chess, doing arithmetic -- but no way of generalising these abilities has been found. So perhaps we have the wrong idea of what intelligence is. When we use introspection to examine our thought processes, perhaps we can only see those thought processes that move very slowly, while the bulk of our mental activity goes by too fast to be noticed.

Philosophers

One of the best-known critics of AI is Hubert Dreyfus, who argues that computers will never be capable of certain tasks. (His list of such tasks at one point included `playing master-level chess'.) His objections can most economically be dealt with by waiting: progress, or lack of progress, in AI will eventually show whether he's right.

More difficult to contend with are philosophers who argue that passing the Turing test -- or passing any other behavioural criterion of intelligence -- is insufficient to show that the computer really understands what it's doing. These thinkers include John Searle, of the University of California at Berkeley; Roger Penrose, of the University of Oxford; and John Bird, of SFU.

Searle

Searle's chief argument is something he calls the `Chinese Room'. It works as follows: we have a large box with a slot in it. If you push a question, written in Chinese, through the slot, after a few minutes an answer, also written in Chinese, will get pushed out through the slot. A Chinese-speaking person may converse with the box for several minutes, and come to believe that the box, or something in the box, understands Chinese. But then we open the box, and inside we find a grad student and a large collection of books. The student, who only speaks English, looks up the incoming characters in the books, then follows various rules, also printed in the books, to assemble an answer, which he then pushes out through the slot. Note that he doesn't translate the characters into English; he follows the rules blindly. But since he doesn't understand what he's written, there really is no understanding of Chinese going on inside the box. Therefore, says Searle, the fact that an entity acts like it understands is no proof that it really understands.

Penrose

Penrose's argument makes use of Gödel's proof, which states that given any system of formal rules powerful enough to express ordinary arithmetic, either there is a true statement that cannot be proved within the system, or the system is self-contradictory. We can view the program of a computer as a set of formal rules, and conclude that there will be true conclusions that the computer cannot reach. But we can reach them.

Dr John Bird

Dr Bird's argument may be summarised as follows:

  1. All computers are agents of necessity.

  2. No agent of necessity can be intelligent.

  3. Therefore, no computer can be intelligent.

The expression `agent of necessity' means a system whose outputs are fully determined by its inputs. For example, an `AND' gate is a computer switch with two inputs and a single output. The behaviour of the `AND' gate is fully specified by the rule ``If both inputs are `1', output `1'; otherwise output `0'.'' Clearly there is no room for an AND gate to manifest intelligence, choice, or awareness. But, Dr Bird argues, a computer is simply a network of AND gates, and its behaviour will be fully determined by the behaviour of these gates. Therefore, however large the computer, it can never display any trace of intelligence.
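
The premise about determinism can be made concrete: a network built only from such gates computes a fixed function of its inputs, so the same inputs always produce the same outputs. A small illustrative sketch:

  # A sketch of the premise: a network of AND gates is a fixed function of its
  # inputs -- the same inputs always give the same output.

  def AND(a, b):
      return 1 if a == 1 and b == 1 else 0

  def network(w, x, y, z):
      # a small tree of AND gates
      return AND(AND(w, x), AND(y, z))

  # Exhaustive truth table: every input pattern maps to exactly one output,
  # every time the network is run.
  for w in (0, 1):
      for x in (0, 1):
          for y in (0, 1):
              for z in (0, 1):
                  print((w, x, y, z), "->", network(w, x, y, z))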

Some Counter-Arguments

Computers need not be deterministic

Penrose's argument and Dr Bird's argument both assume that all computers are deterministic systems. But it is easy to construct a computer which is non-deterministic:

In solving large problems, computer scientists sometimes make use of a method called `the Monte Carlo method'. This may be illustrated by a simple example: suppose we wish to calculate a value for pi. We construct a circle of unit radius. If we could measure its area, that would give us the value. We'll suppose, however, that we cannot directly measure the area, though we can tell whether any given point is inside or outside the circle. (This is obviously unrealistic for something as simple as a circle, but is quite reasonable for the regions in multi-dimensional space where this method would actually be used.)

I have a random-number generator, based on measuring the time between successive disintegrations of nuclei in a lump of a radio-isotope. I normalize a series of these random numbers to lie between -1 and 1, and take successive pairs of numbers as the coordinates of points within the square having opposite vertices at (-1,-1) and (1,1). I place a large number of points, and count the fraction that land within the circle. This fraction is an estimate of pi/4.

Applying this method repeatedly would give different answers every time, though if the number of points were large enough, they would all be fairly close to the right answer.
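
A sketch of this calculation in Python (a software pseudo-random generator stands in for the radio-isotope timer; a genuinely physical source is the one thing that would make the procedure irreducibly non-deterministic):

  # A sketch of the Monte Carlo estimate of pi described above.

  import random

  def estimate_pi(n_points):
      inside = 0
      for _ in range(n_points):
          x = random.uniform(-1.0, 1.0)   # a point in the square with corners (-1,-1) and (1,1)
          y = random.uniform(-1.0, 1.0)
          if x * x + y * y <= 1.0:        # inside the unit circle?
              inside += 1
      return 4.0 * inside / n_points      # the fraction inside estimates pi/4

  # Repeated runs give slightly different answers, all near 3.14 for large n.
  print(estimate_pi(100_000))
  print(estimate_pi(100_000))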

This might be a very small step in a much larger calculation. I might ask a grad student to provide an estimate of pi, and not care about how he got it. So in describing the large calculation, it would appear to me that each step followed by mathematical necessity from the previous step. Yet repeating the whole process might give me a different answer.

Now, Dr Bird has anticipated this objection, and his response is to separate the random-number generator from the rest of the system, and to argue that the rest of the system remains an agent of necessity. But this may not be possible. Suppose, for example, that we are considering a neural-net architecture of 1,000,000 neurons, and each neuron makes decisions by saying to itself, ``If the sum of my inputs is greater than pi, I will fire; otherwise not.'' Its estimate of pi is, of course, provided by its own Monte Carlo routine and associated radio-isotope. Dr Bird's argument requires him to take the net apart and consider the programmable part and the random-number generators separately; but then he's no longer studying the architecture I've designed.
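
A sketch of the sort of unit described here: its firing threshold is its own fresh Monte Carlo estimate of pi, as in the previous sketch, so its behaviour near the threshold is not a fixed function of its inputs.

  # A sketch of the neuron described above: its threshold is a fresh Monte Carlo
  # estimate of pi on every call, so identical inputs near the threshold may
  # sometimes fire and sometimes not.

  import random

  def estimate_pi(n_points=1000):
      # as in the earlier sketch: fraction of random points inside the unit circle
      inside = sum(1 for _ in range(n_points)
                   if random.uniform(-1, 1) ** 2 + random.uniform(-1, 1) ** 2 <= 1)
      return 4.0 * inside / n_points

  def noisy_neuron(inputs):
      threshold = estimate_pi()            # slightly different each call
      return 1 if sum(inputs) > threshold else 0

  # Inputs summing to almost exactly pi: repeated calls may or may not fire.
  for _ in range(5):
      print(noisy_neuron([1.0, 1.0, 1.14159]))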

Computers need not manipulate symbols according to formal rules

Searle's and Penrose's arguments both assume that the only thing computers can do is to manipulate symbols according to rules. But there already exist computers, for example, neural net computers, which make no use of symbols. Symbols may emerge from the operation of the neural net, just as the operation of our brains gives rise to symbolic entities such as words. But we don't find symbols at the level of the individual neurons.

`Intelligent' is not a difficult word

Another characteristic of the arguments advanced by these three thinkers is that they all make `intelligent' a mysterious word. For Searle, for example, an entity that behaves intelligently may always turn out to be faking it -- if we can just figure out how to look `inside' it.

So for Searle, and for Dr Bird, we can never really know if a creature is intelligent or not. Suppose, for example, that a spaceship lands in the AQ tomorrow and an alien emerges. How are we to tell if the alien is really intelligent? We could try giving him tests or drawing conclusions from the fact that he's the one who's built a spaceship and reached our planet, not vice-versa; but Searle's Chinese Room argument has already ruled these out. So, can we look inside?

Assuming the alien will humour us for a while, we strap him to an operating table and prepare our X-rays, CAT scans and electron microscopes. But having got him on the table, it strikes us: we haven't the faintest idea what we're looking for. If we look at the alien's brain, we can expect to see matter in motion. But we already know that matter in motion obeys the laws of physics, and the laws of physics are of only two kinds: there are deterministic laws, governing the behaviour of macroscopic matter, and there are probabilistic laws, governing the collapse of quantum wave-functions under measurement. Science knows of no other way for matter to behave. So it seems that we're putting the alien through a test that none of us could pass: for Dr Bird to call him intelligent, his brain has to behave in a way that contradicts the known laws of physics. But we've never seen any other system behave in such a way, so why are we looking for it now?

As soon as we return from the world of philosophy to the real world, all these difficulties disappear. We know very well what intelligence is and how to measure it. Dr Bird himself, when assigning grades for a course, will test the understanding of his students to see if they've really understood the material or are just repeating phrases they've memorised from a text book. And he does this, not by probing their brains, but by asking questions: ``How will the circuit behave if this resistance is increased? What could cause this circuit to go unstable?'' This is the approach we all use in making judgements of intelligence, something we do every day with no difficulty.



John Jones
Tue Aug 30 14:38:19 PST 2005