Artificial Intelligence Tutorial Review

Developed and compiled by Eyal Reingold and Johnathan Nightingale

Welcome to the PSY371 Artificial Intelligence tutorial review. These pages were developed for the use of psychology students interested in the field of Artificial Intelligence, especially as it relates to the ongoing investigations in psychology aimed at understanding the human mind. This review is meant to be treated as a unified whole since there is a great deal of interconnection between the topics, and since later chapters may depend to some extent on earlier readings.

This review has been designed with the expectation that its readers are new to the area, and care is taken to explain concepts fully. The review should provide an interesting and accessible introduction for beginners, but may be somewhat redundant for readers with more background in the area. Nevertheless, more advanced readers may find interesting links and demonstrations throughout the review. Also, in hopes of keeping the tutorial accessible, many of the more technical issues in AI have been simplified or avoided, with more emphasis being put on conceptual developments and interactive examples. For readers interested in implementation issues, and a computer science perspective, see the University of Nottingham's AI Methods (AIM) course.


1) Early Work in AI
2) Can Machines Think?
     2.1) Turing Test
     2.2) Chinese Room
     2.3) Can Computers be Creative?
     2.4) Can Computers possess Emotional Intelligence?
     2.5) Consciousness in AI
3) Symbolic AI
     3.1) Common Sense Knowledge Problem
     3.2) CYC
     3.3) Expert Systems
     3.4) Game Playing: Man vs. Machine
4) Artificial Neural Networks
     4.1) Brainwave Neural Net Tutorial
5) Robotics and Computer Vision
6) Artificial Life
7) Additional Resources

 

Early Work in AI


Prehistory of AI

   For centuries, if not millennia, humanity has concerned itself - in fiction and often in engineering - with the creation of devices meant to mimic human behaviour, or to behave in a seemingly intelligent way. Even before science fiction brought the ideas of superhuman robots and megalomaniacal computers to the masses, people built bowing statues, letter-writing dolls, and player pianos in an effort to make the inanimate seem somehow more human.

   Once fiction got hold of the idea, bookshelves filled with ideas that, until recently, were entirely speculative, with no hope of being realised. Bruce Mazlish, a historian at MIT, has written an excellent article that further discusses the history of automata and other AI in culture and literature. This article helps the reader to appreciate what a vast cultural impact work in AI can have on society. It is well researched and documented, as well as being an interesting read.

The Modern Birth of AI

   Like many technological advances of the twentieth century, the development of computer science began as a military effort. In the early 1940s, both Germany and the United States were racing to produce electronic computers that could be used in ballistics calculations, or in deciphering coded messages. It wasn't until after the war that computing facilities, often several rooms in size, could be spared for less essential tasks. The excess computing power after the war, coupled with some major advances in the design of the machines, provided fertile ground for exploring some more esoteric ideas for computing.

   Among the first researchers to attempt to build intelligent programs were Newell and Simon. Their first well-known program, Logic Theorist, proved (or attempted to prove) statements using the accepted rules of logic and a problem solving procedure of their own design. Their results were promising: not only could Logic Theorist reproduce many of the proofs that humans had developed, but in the case of one theorem, it actually produced a better (i.e. shorter, more direct) proof than the one commonly found in logic textbooks.

   The flurry of enthusiasm that emerged out of Logic Theorist's humble accomplishments was astounding. In the summer of 1956, within a year of Newell and Simon's accomplishment, John McCarthy organised the "Dartmouth Summer Research Project on Artificial Intelligence" (McCarthy actually coined the term 'Artificial Intelligence'). This conference is considered an important milestone in the development of AI.

Early Successes... and failures

Hubert Dreyfus, a prominent critic of the AI movement, notes that almost all early work in AI was concerned with one of three areas: language translation, problem solving, or pattern recognition. In each area, significant early successes were produced. Language translation was an early leader: Dreyfus estimates that in the first 10 years of the research program (that is, from 1955 to 1965) five governments spent over $20 million on its development. By the late fifties, programs existed that could do a passable job of translating technical documents, and it was seen as only a matter of extra databases and more computing power to apply the techniques to less formal, more ambiguous texts. In reality, the programs simply could not scale up in the expected ways.

In spite of journalistic claims at various moments that machine translation was at last operational, this research produced primarily a much deeper knowledge of the unsuspected complexity of syntax and semantics. (Hubert Dreyfus, "What Computers Can't Do", p. 92)

One fundamental problem was the inability to mimic the human capacity to use context to disambiguate the meanings of words and sentences. By the mid-sixties, the research program had been substantially scaled down. Although more recent research in this area has had some success, early machine translation is still considered one of the great failures of early AI, and one of the first signs of over-optimism.

   Work in problem solving followed the same course of early success and eventual failure that Dreyfus later argued was at the heart of almost every AI attempt. Most problem solving work revolved around the work of Newell, Shaw and Simon on the General Problem Solver, or GPS. The GPS employed abstract problem solving rules (e.g. "If you can, always try to reduce differences between your current state and your goal state") to solve a wide variety of problems. The rules, most of which were heuristic in nature (i.e. they were good rules of thumb, rather than perfect, tedious algorithms), were postulated to be the same as those employed by human problem solvers. Over-optimism was also pervasive in this area. Unfortunately, the GPS did not fulfill its promise, not because of some simple lack of computing capacity, but because of deep theoretical problems. While it could solve some problems with its general techniques, any but the simplest eluded it. The fundamental problem is that general problem solving strategies are limited. People use domain-specific knowledge and skills to solve problems in different contexts, and in general such skills and knowledge do not generalize across areas. For example, knowledge relevant to chess will not be useful in the area of physics. What the GPS had were broad, weak strategies. Strengthening its abilities would mean adding domain-specific knowledge for all possible problem areas - an impossible task. In 1967, roughly 10 years after it began, Newell announced that the GPS program was being abandoned.
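   The difference-reduction rule quoted above can be made concrete with a small sketch. What follows is not the original GPS code; it is a minimal illustration of the general idea (often called means-ends analysis), written under our own assumptions. The Operator structure, the function names and the toy "bananas" problem are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Operator(NamedTuple):
    """A hypothetical action: what it needs, what it adds, what it removes."""
    name: str
    preconditions: frozenset
    adds: frozenset
    deletes: frozenset


def achieve(state, goal, operators, plan, depth=0, max_depth=10):
    """Return a state in which `goal` holds, or None if no operator helps."""
    if goal in state:
        return state              # no difference left to reduce
    if depth >= max_depth:
        return None               # give up rather than recurse forever
    for op in operators:
        if goal not in op.adds:
            continue              # this operator does not reduce the difference
        saved = len(plan)
        new_state = state
        # Treat each precondition as a sub-goal to be achieved first.
        for pre in sorted(op.preconditions):
            new_state = achieve(new_state, pre, operators, plan, depth + 1, max_depth)
            if new_state is None:
                break
        if new_state is None or not op.preconditions <= new_state:
            del plan[saved:]      # undo steps recorded for this failed attempt
            continue
        plan.append(op.name)
        return (new_state - op.deletes) | op.adds
    return None


def solve(start, goal, operators):
    """Return a list of operator names that reaches `goal`, or None."""
    plan = []
    final = achieve(frozenset(start), goal, operators, plan)
    return plan if final is not None else None


# A toy problem invented for this sketch.
OPERATORS = [
    Operator("walk to box", frozenset(), frozenset({"at box"}), frozenset({"at door"})),
    Operator("push box under bananas", frozenset({"at box"}),
             frozenset({"box under bananas", "at bananas"}), frozenset({"at box"})),
    Operator("climb box", frozenset({"box under bananas", "at bananas"}),
             frozenset({"on box"}), frozenset()),
    Operator("grab bananas", frozenset({"on box"}), frozenset({"has bananas"}), frozenset()),
]

print(solve({"at door"}, "has bananas", OPERATORS))
```

   Notice that the search strategy itself knows nothing about boxes or bananas. Everything specific to the problem lives in the hand-written operator list, which is one way of seeing why a general solver of this kind needs domain-specific knowledge supplied for every new problem area.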

   The last of the three early areas of AI was pattern recognition. Once again, there were some early successes. Computers were built that could handle Morse code, if there was very little line noise and the sender was another computer or a very efficient and precise human operator. There were even programs that could, with a reasonable (though not human-level) degree of accuracy, decipher handwriting (or at least the letter 'A') in various styles and orientations. None, however, did so by way of some fundamental discovery in pattern recognition. Instead, they used ultra-specific and inflexible templates, and were defeated by any significant distortion of the data. They were also incapable of resolving ambiguity or utilizing context.
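   To see why rigid templates are so brittle, consider the following minimal sketch. It is our own illustration, not a reconstruction of any historical program: the 5x5 grid, the TEMPLATE_A pattern, the function names and the 0.9 threshold are all assumptions made for the example. The program compares an image to a stored template cell by cell and accepts it only if nearly every cell agrees.

```python
# A hypothetical 5x5 template for the letter 'A' ('#' = ink, '.' = blank).
TEMPLATE_A = (
    "..#..",
    ".#.#.",
    "#####",
    "#...#",
    "#...#",
)

THRESHOLD = 0.9  # assumed cut-off for calling something a match


def match_score(image, template):
    """Return the fraction of cells on which image and template agree."""
    total = 0
    agree = 0
    for image_row, template_row in zip(image, template):
        for a, b in zip(image_row, template_row):
            total += 1
            agree += (a == b)
    return agree / total


def shift_right(image):
    """Distort an image by moving every row one cell to the right."""
    return tuple("." + row[:-1] for row in image)


def looks_like_a(image):
    """Accept the image only if it agrees with the template almost everywhere."""
    return match_score(image, TEMPLATE_A) >= THRESHOLD


print(looks_like_a(TEMPLATE_A))               # the undistorted letter
print(looks_like_a(shift_right(TEMPLATE_A)))  # the same letter, moved one cell
```

   The undistorted letter is accepted, but the very same letter shifted by a single cell agrees with the template on fewer than half of its cells and is rejected. A human reader would not even notice the difference.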

   From overwhelming optimism, the field was reduced to expressions like the following, from Vincent Giuliano, a researcher in the area.

Alas! I feel that many of the hoped-for objectives may well be porcelain eggs; they will never hatch, no matter how long heat is applied to them, because they require pattern discovery purely on the part of machines working alone. The tasks of discovery demand human qualities. (As cited by Dreyfus)


Can Machines Think?

Why do we care?

   The initial successes of computers in replicating seemingly intelligent behaviour quickly led to argument and speculation about what it would mean for a computer to be 'intelligent'. As any psychology student knows, intelligence is a very controversial topic, which makes classifying a man-made machine as intelligent even harder. In terms of raw calculation power, even computers of the mid-1950s could beat their human counterparts. The brute force calculations of programs like Newell and Simon's Logic Theorist had already produced results that, had a human produced them, would have been accepted without question as intelligent. Still the debate raged on. One side argued that there was no ingenuity and no insight: the computer merely followed human-programmed rules mindlessly until it reached a conclusion. The other side countered that this objection was moot: since the results were intelligent and meaningful, the computer had produced proof of intelligence.

   The developers of AI were claiming more than just replication of intelligent products, though; they claimed the replication of the intelligence process (this has come to be known as the Strong AI position, a term coined by philosopher John Searle). In other words, they were beginning to claim not only that computers were intelligent, but that they were intelligent in the same way as people are intelligent. Suddenly the debate became more than just a philosophical one: psychology entered the picture. If intelligent computers were produced, could they be considered a working model of human intelligence? All the computers to date, systems like the GPS (see previous section for more details), were mindless rule-followers. What did that say about human intelligence? If we accept them as intelligent, does it necessarily follow that they must be intelligent in the way that we are? These were significant questions for psychologists and philosophers, as well as computer scientists.

   Each side found its supporters quite quickly. Critics like Hubert Dreyfus (author of "What Computers Can't Do" and later, "What Computers Still Can't Do") argued not only that current machines were not intelligent, but that no computer could ever be. Supporters like Marvin Minsky offered remarkable optimism with quotes like "Within 10 years the problems of artificial intelligence will be substantially solved," and later, "Within 10 years computers won't even keep us as pets." The important thing to recognize is that when people started asking "Can Machines Think?", psychology became inextricably tied to the AI movement, a position it has held and strengthened in the 50 years since. Time lends focus, so it is not surprising that by the 1980s, Minsky's optimism had thinned, even if he remained hopeful. The two papers assigned, one from Minsky and the other from Dreyfus, present some of the more typical arguments and refutations in the field.



The Turing Test

Alan Turing and the Imitation Game

   Alan Turing, in a 1950 paper, proposed a test called "The Imitation Game" that might finally settle the issue of machine intelligence. The first version of the game he explained involved no computer intelligence whatsoever. Imagine three rooms, each connected via computer screen and keyboard to the others. In one room sits a man, in the second a woman, and in the third sits a person - call him or her the "judge". The judge's job is to decide which of the two people talking to him through the computer is the man and which is the woman. The woman will attempt to help the judge, offering whatever evidence she can (the computer terminals are used so that physical clues cannot be used) to prove that she is the woman. The man's job is to trick the judge, so he will attempt to deceive the judge, and counteract his opponent's claims, in hopes that the judge will erroneously identify him as the woman.

   What does any of this have to do with machine intelligence? Turing then proposed a modification of the game in which, instead of a man and a woman as contestants, there was a human, of either gender, and a computer at the other terminal. Now the judge's job is to decide which of the contestants is human, and which the machine. Turing proposed that if, under these conditions, a judge were no better than 50% accurate, that is, if a judge were as likely to pick either human or computer, then the computer must be a passable simulation of a human being and hence, intelligent. The game has been recently modified so that there is only one contestant, and the judge's job is not to choose between two contestants, but simply to decide whether the single contestant is human or machine.

   The Encyclopedia Britannica's entry on the Turing Test is short, but very clearly stated. A longer, point-form review of the imitation game and its modifications, written by Larry Hauser, is also available. Hauser's page may not contain enough detail to explain the test, but it is an excellent reference or study guide and contains some helpful diagrams for understanding the interplay of contestant and judge. The page also makes reference to John Searle's Chinese Room, a thought experiment developed as an attack on the Turing test and similar "behavioural" intelligence tests. We will discuss the Chinese Room in the next section.


Natural Language Processing (NLP)

   Partly out of an attempt to pass Turing's test, and partly just for the fun of it, there arose, largely in the 1970s, a group of programs that tried to cross the first human-computer barrier: language. These programs, often fairly simple in design, employed small databases of (usually English) language combined with a series of rules for forming intelligent sentences. While most were woefully inadequate, some grew to tremendous popularity. Perhaps the most famous such program was Joseph Weizenbaum's ELIZA. Written in 1966, it was one of the first and remained for quite a while one of the most convincing. ELIZA simulates a Rogerian psychotherapist (the Rogerian therapist is empathic but passive, asking leading questions but doing very little talking, e.g. "Tell me more about that," or "How does that make you feel?") and does so quite convincingly, for a while. There is no hint of intelligence in ELIZA's code; it simply scans for keywords like "Mother" or "Depressed" and then asks suitable questions from a large database. Failing that, it generates something generic in an attempt to elicit further conversation. Most programs since have relied on similar principles of keyword matching, paired with basic knowledge of sentence structure. There is, however, no better way to see what they are capable of than to try them yourself. We have compiled a set of links to some of the more famous attempts at NLP. Students are encouraged to interact with these programs in order to get a feeling for their strengths and weaknesses, but many of the pages provided here link to dozens of such programs, so don't get lost among the artificial people.
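   The keyword-scanning idea described above is simple enough to sketch in a few lines. The following is a deliberately tiny illustration and not Weizenbaum's actual program: the rule list, the canned replies and the function name respond are our own inventions, and the real ELIZA used a much larger script of patterns and responses.

```python
import re

# Hypothetical keyword -> reply pairs, standing in for ELIZA's much larger script.
RULES = [
    ("mother", "Tell me more about your family."),
    ("depressed", "I am sorry to hear you are feeling depressed. Why do you think that is?"),
]

# Generic reply used when no keyword is found.
FALLBACK = "Please go on."


def respond(utterance):
    """Scan the utterance for a known keyword and return the matching reply."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    for keyword, reply in RULES:
        if keyword in words:
            return reply
    return FALLBACK


print(respond("My mother never listens to me."))
print(respond("The weather has been strange lately."))
```

   Nothing in this sketch represents what a mother is or what being depressed feels like. The program reacts to the presence of a string of letters, which is why such conversations tend to fall apart as soon as the user strays from the expected keywords.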


Online Examples of NLP

   A series of online demos (many are Java applets, so be sure you are using a Java-capable browser) of some of the more famous NLP programs.


The Loebner Prize

   Although Turing proposed his test in 1950, it was not until some 40 years later, in 1991, that the test was first really implemented. Dr. Hugh Loebner, a professor very much interested in seeing AI succeed, pledged $100,000 to the first entrant that could pass the test. The 1991 contest had some serious problems, though (perhaps most notable was that the judges were all computer science specialists, and knew exactly what kind of questions might trip up a computer), and it was not until 1995 that the contest was re-opened. Since then, there has been an annual competition, which has yet to find a winner. While small prizes are given out to the most "human-like" computer, no program has had the 50% success Turing aimed for.

Validity of the Turing Test

   Alan Turing's imitation game has fueled 40 years of controversy, with little sign of slowing. On one side of the argument, human-like interaction is seen as absolutely essential to human-like intelligence. A successful AI is worthless if its intelligence lies trapped in an unresponsive program. Some have even extended the Turing Test. Stevan Harnad (see below) has proposed the "Total Turing Test", where instead of language, the machine must interact in all areas of human endeavor, and instead of a five minute conversation, the duration of the test is a lifetime. James Sennett has proposed a similar extension to the Turing Test that challenges AI to mimic not only human thought but also personhood as a whole. To illustrate his points, the author uses Star Trek: The Next Generation's character 'Data'.

   Opponents of Turing's behavioural criterion of intelligence argue that it is either not sufficient, or perhaps not even relevant at all. What is important, they argue, is that the computer demonstrates cognitive ability, regardless of behaviour. It is not necessary that a program speak in order for it to be intelligent. There are humans that would fail the Turing test, and unintelligent computers that might pass. The test is neither necessary nor sufficient for intelligence, they argue. In hopes of illuminating the debate, we have assigned two papers that deal with the Turing Test from very different points of view. The first is a criticism of the test; the second comes to its defense.

Additional Resources

   Students interested in more information on the Turing test and the surrounding controversy may find the links below helpful. Each is a compilation of Turing Test related material, the first dealing with the more applied issues, the Loebner prize and NLP programs in general; the second with the philosophical issues surrounding the Test and its variations.



The Chinese Room


The Motivation

   The Turing Test (discussed in the previous section) was the first attempt at resolving the question of machine intelligence. It was a behavioural test, judging intelligence based not on inner processes, or faithfulness to neuronal structure, but purely on a computer's ability to communicate verbally. This approach elicited numerous objections: Why should behaviour be the final test of intelligence? Hadn't psychology moved away from behaviourism? How can behaviour suffice if the internal mechanisms controlling it are nothing like a human being's? How can a conversation capture all of human intelligence? These questions essentially reduced themselves to the question of whether one could pass the Turing Test, that is, produce passable conversational speech, while still possessing no 'real' intelligence. This argument has been stated in numerous ways, but perhaps none more eloquent than John Searle's Chinese Room metaphor.


The Thought Experiment (An Adaptation of Searle's Original)

    Searle asks the reader to imagine a room, with a man trapped inside. The man speaks no Chinese, nor could he even confidently distinguish Chinese characters from random lines of similar structure. One day, as he is sitting in the room, someone slips a piece of paper under the door with (what he assumes to be) Chinese writing on it. Having puzzled over it for a moment, he notices that there is a book in the room titled "What to do if someone slides some Chinese writing under the door." Having nothing better to do, the man proceeds to open the book and begin reading. The book, he finds, is actually an enormous set of instructions for producing new Chinese symbols based on the ones received. The instructions are all if-then type statements describing a pattern in the text and the appropriate action or response. He follows these rules, using the piece of paper handed to him, and produces a new sheet, which he slides back under the door. The next day, another sheet comes in; he again opens the rule book, finds out which symbols to write and in which order, and passes the completed sheet back out. At no time does the man understand what he's doing as anything more than symbol manipulation. He does not understand the words coming in, or the words going out, and he isn't even sure they ARE words - but it's more exciting than doing nothing, so he continues. What the man in the room does not know is that the symbols coming in are questions, written in Chinese, and that the symbols he produces in turn are answers to those questions. What's more, the book of instructions has been written so well that his answers are not only proper Chinese, but they make sense, and are indistinguishable from those of an actual Chinese speaker. Outside, the world is amazed that this room can actually understand Chinese, that the room is intelligent. Inside though, we know that the man understands no Chinese whatsoever!

The Conclusion

   What Searle describes is a system that produces intelligent, meaningful output, in the absence of true understanding. If you accept this counter-example, then the Turing Test is doomed. The Chinese Room would pass the Turing test, even though it lacks understanding and intelligence. Searle's argument has, naturally, produced its own share of furious debate, and several strong counter-arguments have been levelled at it. The adaptation presented here is meant to familiarize students with the ideas Searle is trying to convey, but the thought experiment and the debate surrounding it deserve a more thorough analysis. Thus, two readings are provided for further elaboration of the argument and its replies.


Additional Resources

Can Computers Be Creative?


Theory

   Historically, human creativity has been a neglected topic in psychology in general and intelligence testing in particular. Despite this, creativity is considered by most to be an essential component of human intelligence. Consequently, in attempting to answer the question of whether computers can think, it is only natural to ask whether computers can think creatively. Many feel, in fact, that whereas computers can excel in well-structured areas of problem solving - e.g. logic, algebra, etc. - they have little hope of ever producing truly creative work. For a work to be creative, it must be novel and useful - this represents an enormous challenge for AI.

   The first two links below provide readers with general background on human creativity. The next two deal specifically with creativity in the context of AI.


Practice

   If AI is famous for anything, it's the so-called "engineering end-run": the idea that resources should not be spent on philosophical debates, and that the focus should instead be on building actual engineering solutions. Later, when we find an implementation that works, it can form the basis of a more adequate theory. Many in the field of AI will remind you that this is precisely the strategy that worked quite well for aviation. After almost a century of debate on whether a flying machine would be possible, the first machine was constructed, the debate was settled, and a wealth of new data was produced. With that in mind, there have been several attempts to build creative computers, despite the lack of conceptual and theoretical consensus. The most impressive of these is AARON, a painting program that produces both abstract and lifelike works. How are we to judge whether such works of art are truly creative? Is it sufficient to judge the products or is the process by which they were created the determining factor? Below are two pages which do a particularly good job of summarizing the current applied work in computer creativity. Read the articles, look at the illustrations, and ask yourself if you would ever doubt, under normal circumstances, that the pictures were produced by an intelligent mind.



Can Computers Possess Emotional Intelligence?


Rationale

   The rationale for attempting to mimic emotional intelligence in a computer is not immediately clear. Star Trek fans will remember the difficulties Commander Data felt with the introduction of his emotion chip. In fact, classic Western views of intelligence often pitted emotion and reason against each other. Emotion was seen as a disorganizing factor, harmful to reasoning and logic. However, with the recent introduction of the concept of emotional intelligence, the many positive contributions of emotional factors to intellectual functioning were highlighted. Furthermore, understanding emotion (and to a lesser extent, exhibiting it) may prove essential to any system that is designed to interact with human beings. Of course, the implementation of such a system represents an enormous challenge.

   So why is understanding emotions crucial? The most direct reason, and the obvious one after a moment's introspection, is that emotion is inextricably tied to everything we say and do. In more concrete terms, let us consider the advantages of having a computer that understood emotion. First of all, it could provide vastly improved interactions with users. More importantly though, emotional intelligence would be an enormous leap forward for systems attempting to learn about people, and the world in which they live.

   Also of interest to AI research, and psychology, is the idea of simulating emotion in computers. These simulations can be internal, external, or both, depending on the motivations of those designing the system. External simulation, that is, exhibiting emotion purely for the benefit of those interacting with the system, is more difficult than it may seem. While psychologists have known for some time that there are a good many physical correlates of emotion - voice changes, blushing, pupil dilation, etc. - reproducing them proves difficult. Often the effect is exaggerated, so much so that the emotional device becomes obvious. For an example of the type of work being done, see Janet Cahn's master's thesis work on generating expression in synthesized speech. There is also work being done on internal emotion, that is, programming a computer to actually 'experience emotions'. Emotion provides us with motivation and drive, with a set of personal preferences, and with a uniqueness that is desirable in a sophisticated AI. In some cases, most notably MIT's 'Kismet', the two (internal and external emotions) are combined into a robot that seems to possess and display emotion. For more information, see Kismet's homepage (this page is of a moderately technical level). Cahn's work comes out of the MIT Media Lab, and Kismet out of the MIT Artificial Intelligence Laboratory. Students are encouraged to explore the various research projects being pursued at MIT -- its labs are at the forefront of AI and emotional AI research.


AI and Consciousness

 

   Consciousness is perhaps one of the most controversial areas of research in psychology. Currently, there is no general consensus as to how to define or measure conscious awareness. Despite this, researchers and lay-persons alike feel that consciousness is a fundamental determinant of what it means to be human. Not surprisingly, in the field of AI, consciousness is just as controversial.

   One fundamental issue is whether or not conscious awareness is simply a by-product of complex intelligent systems. Those who assume that consciousness is simply a by-product or an emergent property argue as follows. In humans, a single neuron has nothing resembling intelligence. Yet in combination, billions of neurons form a mind that does possess intelligence. It would appear then, that the brain is more than the sum of its parts. Intelligence emerges with sufficient configural complexity of neurons. So it is not inconceivable that other attributes such as consciousness, creativity and emotionality may emerge as a by-product of complex artificially intelligent systems. In general then, the idea is that consciousness is just a by-product of any sufficiently complex brain, and AI engineers need not try to isolate and recreate it specifically; it will emerge automatically as needed.

   If one assumes, on the other hand, that consciousness is not such a by-product, then an additional question is whether or not it is possible to computationally define and simulate it. Thus, in asking whether computers can think, we must inevitably turn to the question of whether thinking computers would actually be conscious. In other words, at some point in the enterprise of AI it becomes important to define the relationship between consciousness and intelligence. For example, is consciousness a necessary condition for intelligent systems, or would intelligent systems necessarily display consciousness?

   Skeptics point out that a fundamental component of consciousness is subjective phenomenal experience, which may be beyond the scope of computational simulation. To illustrate this distinction, imagine a person, totally colourblind from birth, who as a result has never experienced the colour red as any different from an equally bright grey, or green. Then imagine that the person studies colour theory, physics, psychology of perception and the biology of the eye. Imagine that the person becomes totally knowledgeable about all aspects of the colour red. Skeptics argue that in studying all of this factual information, our person is learning the type of thing a computer might learn: wavelengths, frequencies, etc. However, what the person is not learning is what red really looks like - she can't see it, so her experience is lacking something. The argument then is that no amount of factual information - the kind you would give a computer - can give it the subjective experience of red, or anything else. What makes the argument significant is that the same skeptics argue that the subjective, personal, phenomenological experiences are what make up consciousness and thus without them, an AI cannot be conscious.



Symbolic AI

 

   The work started by projects like the General Problem Solver (see Early Work in AI) and other rule-based reasoning systems (like Logic Theorist, mentioned in the same chapter) became the foundation for almost 40 years of research. Symbolic AI (or Classical AI) is the branch of artificial intelligence research that concerns itself with attempting to explicitly represent human knowledge in a declarative form (i.e. facts and rules). If such an approach is to be successful in producing human-like intelligence, then it is necessary to translate the often implicit or procedural knowledge possessed by humans (i.e. knowledge and skills which are not readily accessible to conscious awareness) into an explicit form using symbols and rules for their manipulation. Symbolic AI has had some impressive successes. Artificial systems mimicking human expertise (Expert Systems, discussed later) are emerging in a variety of fields which constitute narrow but deep knowledge domains. Game playing programs are being written now that challenge the best human experts. The difficulties encountered by Symbolic AI have, however, been deep, possibly unresolvable ones. One difficult problem encountered by Symbolic AI pioneers came to be known as the common sense knowledge problem (discussed in the next chapter). In addition, areas which rely on procedural or implicit knowledge, such as sensory/motor processes, are much more difficult to handle within the Symbolic AI framework. In these fields, Symbolic AI has had limited success, and by and large has left the field to neural network architectures (discussed in a later chapter) which are more suitable for such tasks. In sections to follow we will elaborate on important sub-areas of Symbolic AI as well as difficulties encountered by this approach.


 

The Common Sense Knowledge Problem


The Problem

   In Early Work in AI we saw that AI's reach quickly exceeded its grasp when trying to build universal machines. One of the fundamental problems encountered became known as the general knowledge problem or the common sense knowledge problem. While researchers were aware that in an AI system, knowledge would have to be explicitly represented, they did not anticipate the vast amount of implicit knowledge we all share about the world and ourselves. Designers of AI systems did not consider producing rules like "If President Clinton is in Washington, then his left foot is also in Washington," or "If a father has a son, then the son is younger than the father and remains younger for his entire life." In retrospect, this is perhaps not surprising, because the implicit nature of this knowledge in humans means that we all take it for granted, and never have to state it or consider it explicitly.
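   To make the point concrete, here is a minimal Python sketch. The names located_in, part_of and where_is are our own illustrative assumptions and are not taken from any real AI system. A program that has been told only where the President is cannot say where his left foot is until someone states the "a part is wherever the whole is" rule explicitly.

```python
# Facts the system has been told explicitly (a hypothetical toy knowledge base).
located_in = {"President Clinton": "Washington"}
part_of = {"Clinton's left foot": "President Clinton"}


def where_is(thing, use_part_rule):
    """Return the location of `thing`, or None if the system cannot tell."""
    if thing in located_in:
        return located_in[thing]
    # The "obvious" rule: a part is wherever the whole is.
    # A human never has to state this; a symbolic system does.
    if use_part_rule and thing in part_of:
        return where_is(part_of[thing], use_part_rule)
    return None


print(where_is("Clinton's left foot", use_part_rule=False))  # None
print(where_is("Clinton's left foot", use_part_rule=True))   # Washington
```

   Even this one rule is only the beginning: a realistic system would need countless others of the same taken-for-granted kind.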

   Once the problem was acknowledged, it soon became clear that it represented an enormous hurdle for the development of general purpose intelligent systems. One hope, or perhaps wishful thinking, on the part of AI developers was that all that was needed was a decent learning program, and this knowledge would be acquired by computers as automatically as it is acquired by humans. A central part of the common sense knowledge problem has to do with the issue of knowledge representation in artificial systems. What is the best approach to represent knowledge? Are dictionary- or encyclopedia-like entries the best approach? Should everything be formulated as a series of if-then rules? Should multiple forms of representation be used? It is clear that not all human knowledge is represented in such an explicit or declarative form. The implicit nature of knowledge applies not only to common sense knowledge, but also to a wide variety of expertise and skills we possess. Such domain-specific knowledge is often represented as procedures, rather than facts and rules (these difficulties will be discussed further in the section on Expert Systems). The article below discusses some of the issues relevant to knowledge representation in both humans and computers.


CYC

 

   In the absence of a learning machine that can acquire common sense facts on its own, there would seem to be only one option left: manually programming in the millions of general knowledge items that we take entirely for granted. The CYC research project has actually undertaken this mammoth task.

The Most Ambitious AI Project Ever?

   Decide for yourself: CYC is a $25 million, 20-year project in Artificial Intelligence. Aimed at beating the common sense knowledge problem outlined in the last chapter, the project employs workers whose full-time occupation consists of entering volumes of common sense data into the ever-growing AI. Its database, at the time of writing, contains over 1,000,000 of these hand-entered facts, and is maintained in an entire room full of computers. The CYC programmers designed a representation scheme that claims to be standardized enough to be useful, while being flexible enough to represent almost any fact. In short, the project, headed by Douglas Lenat and already 15 years running, is one of the most impressive AI undertakings ever.

Why?

   The companies that eventually funded CYC must have been presented with the same staggering statistics that we have just presented to you: millions of dollars, decades of work, massive computing requirements. Yet it was companies that chose to invest: not governments or academic institutions, but private, for-profit corporations. What benefits could outweigh those costs?

   However large the task is, if successfully completed, think of the contribution to computing. Cycorp (the corporation in charge of CYC's construction) will be able to license the world's only common sense knowledge base out to other companies. There might be a common sense search engine on the web that actually finds you the links you want, instead of semi-randomly looking for keywords. Whatever the cost, CYC has the potential to revolutionize "intelligent" computing. No one, though, can better explain CYC's potential and successes than Cycorp themselves. We have included three readings, all written by Douglas Lenat and other members of the Cycorp team and all (naturally) beaming about the project's prospects.

Objections

   Of course, not everyone is as full of enthusiasm as Doug Lenat. Several critics (among them Hubert Dreyfus, the critic mentioned in earlier chapters) point out that while CYC makes for an interesting exercise in programming, it falls short of the mark as Strong AI. Strong AI claims to duplicate not only human-like intelligent products, but also human-like intelligent processes. As explained in the previous section, human knowledge is often implicit, procedural, and domain specific, rather than explicit, declarative, and general purpose. However, the focus of the CYC project is not Strong AI, or modeling human intelligence. Rather, it is an example of an 'engineering end-run' attempt to produce general purpose, intelligent machines.

   Lenat may, however, have overstepped his bounds at one point, and critics have taken him to task on it. In his enthusiasm for the project, Lenat has been found to hold some lofty ambitions: Cyc teaching schoolchildren, advising on public policy, and even dispensing justice. These predictions, if they came to pass, would certainly reflect great confidence in AI research, but critics warn that the results could be catastrophic, as the paper below argues.



Expert Systems

 

   Given the common sense knowledge problem, and the difficulty in creating general purpose intelligent machines, an alternative approach developed which attempted to mimic human performance within restricted domains of knowledge.

   The first serious attempt at applying this alternate approach came to be known as "Microworlds". The theory behind Microworlds was that the first step in AI ought to be producing intelligence in a restricted environment. Once that had been solved, one could gradually increase the complexity of the environment, and the AI, until eventually AI arrived at a level that could cope with real-world situations. It was a scaling theory, from specificity towards generality without loss of strength. The most famous microworlds project was Terry Winograd's SHRDLU (SHRDLU is just the 7th to 12th most common letters in English; he got the name from a Mad Magazine article). SHRDLU lived in a world called Blocks World. It had in its memory descriptions of various blocks: shapes, colours, sizes and positions. It also had a robotic arm that could move the blocks (actually, the entire thing was a simulation, so neither the blocks nor the arm actually existed). Finally, its intelligence programming included two components: the first was a problem solver that could look at the world, gather information, and make changes when possible, such as moving a block; the second was a natural language program that interacted with users while manipulating the blocks world. Within its world, SHRDLU was impressive. You could tell it to move the green block and it could do so. It would even ask clarifying questions or make comments like "By the green block, I assume you mean the green cube on the blue cube," or "To move the green block, I will have to move the red pyramid."
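   The flavour of such a problem solver can be suggested with a short Python sketch. This is not SHRDLU's actual code or data: the block names, the layout, and the functions block_on_top_of and move are assumptions made up for illustration. The sketch keeps a description of what rests on what, and clears a block before moving it.

```python
# A toy blocks world: each block maps to whatever it rests on.
# Block names and the layout are illustrative assumptions, not SHRDLU's data.
on = {
    "blue cube": "table",
    "green cube": "blue cube",
    "red pyramid": "green cube",
}


def block_on_top_of(block):
    """Return the block resting directly on `block`, or None if it is clear."""
    for upper, lower in on.items():
        if lower == block:
            return upper
    return None


def move(block, destination, log):
    """Move `block` onto `destination`, first clearing anything stacked on it.

    Simplification: the sketch does not check that the destination is clear.
    """
    obstacle = block_on_top_of(block)
    if obstacle is not None:
        log.append(f"To move the {block}, I will have to move the {obstacle}.")
        move(obstacle, "table", log)
    on[block] = destination
    log.append(f"Moved the {block} onto the {destination}.")


log = []
move("green cube", "table", log)
for line in log:
    print(line)
```

   Everything the program "knows" is contained in the small table of blocks, which is exactly why such a system is impressive inside its world and helpless outside it.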

   The problem with microworlds projects like SHRDLU was that they failed to scale up as the original strategy called for. Nevertheless, they provided proof that AI systems designed to operate within domains of knowledge that are narrow, but deep, could be highly effective. This realization inspired the creation of one of the most successful sub-areas of AI - the field of expert systems. The basic idea is that if one can codify human expertise within a narrow domain as a hierarchical series of if-then rules, then an AI system can be created that mimics or perhaps even exceeds the performance of a human expert. One problem which was encountered early in this enterprise was that experts cannot always explicitly state the rules which guide their performance. Even when experts did state rules explicitly, the performance obtained when those rules were implemented was inferior to that of the expert providing them, indicating that the stated rules were insufficient. This is the problem of implicit knowledge which was discussed previously. Given that domain specific knowledge is often implicit and procedural, one of the challenges for expert system developers was to find a way of interrogating experts and collecting information about expert performance in order to clarify the rules being used. The new occupation of "Knowledge Engineer" emerged to fill that purpose. Knowledge engineers spend a lot of time with human experts during the design stage of an expert system, as well as during multiple feedback and improvement cycles. Currently expert systems represent one of the major financial successes of AI, with an industry exceeding $1 billion.
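   As a rough illustration of expertise codified as if-then rules, consider the following Python sketch. The car-diagnosis rules and the forward_chain function are invented for this example and do not come from any real expert system. The program keeps applying rules to what it already knows until nothing new can be concluded, a strategy usually called forward chaining.

```python
# Each rule: (set of conditions that must all be known, conclusion to add).
# The car-diagnosis rules are invented for illustration only.
RULES = [
    ({"engine will not turn over", "lights are dim"}, "battery is weak"),
    ({"battery is weak"}, "recommend: recharge or replace the battery"),
    ({"engine turns over", "engine will not start"}, "fuel or ignition fault"),
]


def forward_chain(observations):
    """Apply the rules repeatedly until no rule adds a new conclusion."""
    known = set(observations)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


observed = {"engine will not turn over", "lights are dim"}
conclusions = forward_chain(observed)
print(sorted(conclusions - observed))
```

   Note that the second rule can only fire after the first has added its conclusion; this chaining of simple rules is what lets a modest rule set produce non-obvious advice. A commercial system might contain hundreds or thousands of such rules, each one extracted from an expert by a knowledge engineer.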

   We have assembled two sets of links below. The first set discusses expert systems in more detail than they have been described here. The second set consists of actual expert systems available for online interaction or study. Students should be aware that of the three Introduction and Discussion sites linked to, only the first is written for a lay audience. The second and third are accessible, but were written for students of artificial intelligence and, as such, tend to move through material at a quicker pace.


 

Introduction and Discussion

Illustrations and Demos



Game Playing

 

   Ever since the beginning of AI, there has been a great fascination with pitting the human expert against the computer. Game playing provided a high-visibility platform for this contest. It is important to note, however, that the performance of the human expert and that of the AI game-playing program reflect qualitatively different processes. More specifically, as mentioned earlier, the performance of the human expert utilizes a vast amount of domain specific knowledge and procedures. Such knowledge allows the human expert to generate a few promising moves for each game situation (irrelevant moves are never considered). In contrast, when selecting the best move, the game playing program exploits brute-force computational speed to explore as many alternative moves and consequences as possible. As the computational speed of modern computers increases, the contest of knowledge vs. speed is tilting more and more in the computers' favour, accounting for recent triumphs like Deep Blue's win over Garry Kasparov.
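   The brute-force strategy is easiest to see in a game far simpler than chess. The Python sketch below uses a toy game chosen purely for illustration: players alternately take 1, 2 or 3 stones from a pile, and whoever takes the last stone wins. The function names are our own. The program has no knowledge of the game beyond its rules; it simply tries every move and every reply all the way to the end.

```python
def player_to_move_wins(stones):
    """True if the player about to move can force a win from this position."""
    # Taking the last stone wins, so with no stones left the mover has lost.
    if stones == 0:
        return False
    # Brute force: try every legal move and look for one that leaves
    # the opponent in a losing position.
    return any(not player_to_move_wins(stones - take)
               for take in (1, 2, 3) if take <= stones)


def best_move(stones):
    """Return a winning number of stones to take, or 1 if none exists."""
    for take in (1, 2, 3):
        if take <= stones and not player_to_move_wins(stones - take):
            return take
    return 1


print(best_move(10))  # 2, leaving 8 stones, a lost position for the opponent
```

   A human player quickly notices the pattern (always leave a multiple of four) and never needs to search. The program finds the same move with no such insight. In a game as large as chess the search cannot reach the end of the game, so real programs cut it off at some depth and estimate how good the resulting positions are.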

   Readings for this section have been organised into two groups. In the first are two papers that discuss game playing in general, describing certain attempts and certain techniques common to the area. The second group deals specifically with Deep Blue's success, and its implications both for AI and for society.

Game Playing

Deep Blue



Artificial Neural Networks


What They Are

   A neural network is, in essence, an attempt to simulate the brain. Neural network theory revolves around the idea that certain key properties of biological neurons can be extracted and applied to simulations, thus creating a simulated (and very much simplified) brain. The first important thing to understand then, is that the components of an artificial neural network are an attempt to recreate the computing potential of the brain. The second important thing to understand, however, is that no one has ever claimed to simulate anything as complex as an actual brain. Whereas the human brain is estimated to have something on the order of ten to a hundred billion neurons, a typical artificial neural network (ANN) is not likely to have more than 1,000 artificial neurons.

   Before discussing the specifics of artificial neural nets though, let us examine what makes real neural nets - brains - function the way they do. Perhaps the single most important concept in neural net research is the idea of connection strength. Neuroscience has given us good evidence for the idea that connection strengths - that is, how strongly one neuron influences those neurons connected to it - are the real information holders in the brain. Learning, repetition of a task, even exposure to a new or continuing stimulus can cause the brain's connection strengths to change, with some synaptic connections being reinforced and new ones created, and others weakening or in some cases disappearing altogether. The second essential element of neural connectivity is the excitation/inhibition distinction. In human brains, each neuron is either excitatory or inhibitory, which is to say that its activation will either increase the firing rates of connected neurons, or decrease them, respectively. The amount of excitation or inhibition produced is, of course, dependent on the connection strength - a stronger connection means more inhibition or excitation, a weaker connection means less. The third important component in determining a neuron's response is called the transfer function. Without getting into more technical detail, the transfer function describes how a neuron's firing rate varies with the input it receives. A very sensitive neuron may fire with very little input, for example. A neuron may have a threshold, and fire rarely below threshold, and vigorously above it. A neuron may have a bell-curve style firing pattern, increasing its firing rate up to a maximum, and then levelling off or decreasing when over-stimulated. A neuron may sum its inputs, or average them, or do something entirely more complicated. Each of these behaviours can be represented mathematically, and that representation is called the transfer function. 
It is often convenient to forget the transfer function, and think of the neurons as being simple addition machines, more activity in equals more activity out. This is not really accurate though, and to develop a good understanding of an artificial neural network, the transfer function must be taken into account.
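   For readers who find a concrete form helpful, the Python sketch below writes out three possible transfer functions. The particular formulas and parameter values (limit, preferred, width) are assumptions chosen for illustration; many other choices are used in practice.

```python
import math


def threshold(total_input, limit=0.5):
    """Silent below the threshold, firing fully at or above it (a simplification)."""
    return 1.0 if total_input >= limit else 0.0


def sigmoid(total_input):
    """Smooth S-shaped response that levels off at a maximum of 1.0."""
    return 1.0 / (1.0 + math.exp(-total_input))


def bell_curve(total_input, preferred=1.0, width=0.5):
    """Strongest response near a preferred input, weaker when over-stimulated."""
    return math.exp(-((total_input - preferred) ** 2) / (2 * width ** 2))


# Compare how the three functions respond to the same inputs.
for x in (0.0, 0.5, 1.0, 2.0):
    print(x, threshold(x), round(sigmoid(x), 3), round(bell_curve(x), 3))
```

   Each function takes the same summed input and turns it into a different level of activity, which is why two nets with identical connections but different transfer functions can behave quite differently.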

   Armed with these three concepts: Connection Strength, Inhibition/Excitation, and the Transfer Function, we can now look at how artificial neural nets are constructed. In theory, an artificial neuron (often called a 'node') captures all the important elements of a biological one. Nodes are connected to each other, and the strength of each connection is normally given a numeric value between -1.0 for maximum inhibition and +1.0 for maximum excitation. All values between the two are acceptable, with higher magnitude values indicating stronger connection strength. The transfer function in artificial neurons, whether in a computer simulation or in actual microchips wired together, is typically built right into the nodes' design.
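   A single node of this kind can be sketched in a few lines of Python. The function name node_output and the choice of a logistic (S-shaped) transfer function are assumptions made for this example, not a description of any particular neural net package.

```python
import math


def node_output(inputs, weights):
    """Weighted sum of incoming activations passed through a transfer function."""
    for w in weights:
        if not -1.0 <= w <= 1.0:
            raise ValueError("connection strengths must lie in [-1.0, +1.0]")
    # Negative weights inhibit, positive weights excite.
    total = sum(a * w for a, w in zip(inputs, weights))
    # Logistic transfer function (an assumed choice for this sketch).
    return 1.0 / (1.0 + math.exp(-total))


# Two excitatory connections and one strongly inhibitory one.
print(node_output([1.0, 0.5, 1.0], [0.8, 0.4, -1.0]))
```

   In this example the inhibitory connection exactly cancels the two excitatory ones, so the node ends up at the midpoint of its range.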

   Perhaps the most significant difference between artificial and biological neural nets is their organization. While many types of artificial neural nets exist, most are organized according to the same basic structure (see diagram). There are three components to this organization: a set of input nodes, one or more layers of 'hidden' nodes, and a set of output nodes. The input nodes take in information, and are akin to sensory organs. Whether the information is in the form of a digitised picture, or a series of stock values, or just about any other form that can be numerically expressed, this is where the net gets its initial data. The information is supplied as activation values, that is, each node is given a number, higher numbers representing greater activation. This is just like human neurons except that rather than conveying their activation level by firing more frequently, as biological neurons do, artificial neurons indicate activation by passing this activation value to connected nodes. After receiving this initial activation, information is then passed through the network. Connection strengths, inhibition/excitation conditions, and transfer functions determine how much of the activation value is passed on to the next node. Each node sums the activation values it receives, arrives at its own activation value, and then passes that along to the next nodes in the network (after modifying its activation level according to its transfer function). Thus the activation flows through the net in one direction, from input nodes, through the hidden layers, until eventually the output nodes are activated. If a network is properly trained, this output should reflect the input in some meaningful way. For instance, a gender recognition net might be presented with a picture of a man or woman at its input nodes and must set an output node to 0.0 if the picture depicts a man, or 1.0 for a woman. In this way, the network communicates its knowledge to the outside world.
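   The flow of activation from input nodes to output nodes can be sketched as follows. The network size, the connection strengths and the logistic transfer function are all made-up values for illustration, and the net is untrained, so its output means nothing yet.

```python
import math


def transfer(total):
    """Logistic transfer function (an assumed choice for this sketch)."""
    return 1.0 / (1.0 + math.exp(-total))


def feed_forward(activations, layers):
    """Pass activation values through each layer of connection strengths in turn.

    `layers` is a list of weight tables; layers[k][j][i] is the strength of
    the connection from node i of one layer to node j of the next.
    """
    for weights in layers:
        activations = [
            transfer(sum(w * a for w, a in zip(node_weights, activations)))
            for node_weights in weights
        ]
    return activations


# Hypothetical net: 3 input nodes -> 2 hidden nodes -> 1 output node.
hidden_weights = [[0.5, -0.3, 0.8], [-0.6, 0.9, 0.1]]
output_weights = [[1.0, -1.0]]
output = feed_forward([0.2, 0.7, 1.0], [hidden_weights, output_weights])
print(output)
```

   Notice that the loop only ever moves forward: each layer's activations are computed from the layer before it and never feed back.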


 

How They Learn

   Having explained that connection strengths are storehouses of knowledge in neural net architectures, it should come as no surprise that learning in neural nets is primarily a process of adjusting connection strengths. In neural nets of the type described so far, the most popular method of learning is called Back-Propagation. To begin, the network is initialised: all the connection strengths are set randomly, and the network sits as a blank slate. The network is then presented with some information. Let us suppose that we are designing the "gender detector" mentioned earlier, and that the input nodes are receiving a digitised version of a photograph. The activation flows through the net (albeit haphazardly, since we have not yet set the connection strengths to anything but random values), and eventually the output node registers an activation level. However, since the net has not yet been trained, its responses will initially be random. This is where back-propagation steps in. The net's response is compared with the correct response for that picture (i.e. 0.0 for male, 1.0 for female). Then, working backwards from the output node, each connection strength is adjusted so that the next time the net is shown that picture, its answer will be closer to the desired one. (The process by which each node is adjusted involves mathematics more complicated than this course requires. Students who are interested will find that some of the papers provided at the bottom of this chapter discuss these methods in more detail.)
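   For curious readers, the Python sketch below shows one common form of the adjustment step for a very small net with a single hidden layer. It is an illustration under stated assumptions, not the method of any particular paper: the logistic transfer function, the learning rate, the network size, the constant 1.0 "bias" input and the four made-up training patterns are all our own choices, and the connection strengths are not confined to the -1.0 to +1.0 range.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, w_output):
    """Return (hidden activations, output activation) for one input pattern."""
    hidden = [sigmoid(sum(w * a for w, a in zip(ws, inputs))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_output, hidden)))
    return hidden, output


def train_step(inputs, target, w_hidden, w_output, rate=0.5):
    """One back-propagation cycle for a single pattern; returns squared error."""
    hidden, output = forward(inputs, w_hidden, w_output)
    error = target - output
    # Error signal at the output node, scaled by the slope of the sigmoid.
    delta_out = error * output * (1.0 - output)
    # Work backwards: each hidden node's share of the blame.
    delta_hidden = [delta_out * w_output[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    # Nudge every connection strength in the direction that reduces the error.
    for j in range(len(hidden)):
        w_output[j] += rate * delta_out * hidden[j]
        for i in range(len(inputs)):
            w_hidden[j][i] += rate * delta_hidden[j] * inputs[i]
    return error ** 2


random.seed(0)
# Two made-up input features plus a constant 1.0 that acts as a bias input.
patterns = [([0.1, 0.2, 1.0], 0.0), ([0.2, 0.1, 1.0], 0.0),
            ([0.9, 0.8, 1.0], 1.0), ([0.8, 0.9, 1.0], 1.0)]
# Start as a blank slate: every connection strength is random.
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_output = [random.uniform(-1, 1) for _ in range(2)]

first_error = sum(train_step(x, t, w_hidden, w_output) for x, t in patterns)
for _ in range(2000):
    last_error = sum(train_step(x, t, w_hidden, w_output) for x, t in patterns)
print(first_error, last_error)
```

   The two printed numbers are the total error on the first pass through the patterns and on the last; the second should be much smaller than the first, which is all that "learning" means here.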

   This whole process: input, processing, comparing output with correct answer, and adjusting connection strengths is called one 'back-propagation cycle', or often just one 'iteration'. The net is then presented with another picture and its answer is compared with the correct answer, the connection strengths adjusted where needed. This process can often take hundreds or thousands of iterations. Eventually, the net should become fairly proficient at identifying males and females. There is always a risk however, that the net has not learned to discriminate males from females, but rather that it has effectively memorized the response for each picture. To test for this, the pictures (or whatever input is being used) should be divided into two groups: The traini