The paper that I’m reviewing today is Computing Machinery and Intelligence by Alan Turing. It is one of the most important papers in the history of computation, and has hence shaped the world as we know it today. It could perhaps be called more an essay than a paper, for reasons that will hopefully become apparent below. I spent the 4th of July reading it, and it was a pretty fun way to spend my day.
The Imitation Game
“Can machines think?” We soon realize that we first need to define the terms “machine” and “think”. Let us set the scene. There are three people: A (a man), B (a woman) and C (the interrogator). None of them can see or speak to each other, and they can only communicate through typewritten messages. The interrogator (C) has the job of determining the genders of A and B (he doesn’t know that A is a man and B a woman). C can ask them any number of questions. A’s job is to mislead C into thinking that he is a woman. B’s job is to convince C that she is the woman, and that A is a man. Both A and B can lie in their answers to C.
Clearly, lying and misleading are traits of human beings, and require thought. Suppose we replace A (man) by a machine. Will the machine be as convincing a liar as A? Will it be able to dodge C’s questions as skillfully as A? If the probability of C calling out the machine’s lies is less than or equal to the probability of C calling out A’s lies, then Turing says that the machine can “think”. Testing whether someone or something can “think” would involve having them accomplish a task that would require considerable thought and planning. Lying and misleading clearly fall into that category. Without going into the metaphysics of what exactly “thought” is and why inanimate objects can’t think because they don’t own big squishy brains like us, I think this is a good definition of what it means for a machine to think.
What is a “machine” though? In order to do away with objections like “humans are also machines”, Turing says all machines that are considered in this paper are “digital computers”. This may sound like a very restrictive definition. However, he says that digital computers, given enough memory, can imitate all other machines (and are hence universal machines). This claim will be partly proved below. Hence, if we wish to prove that there exists some machine capable of a certain task, it is equivalent to proving that there exists a digital computer capable of that task.
One of the more fascinating parts of this section is where Turing describes how the storage of a computer would work, and how a loop program works. He then gives an example of a loop in human life, which is what a loop program is meant to imitate:
To take a domestic analogy. Suppose Mother wants Tommy to call at the cobbler’s every morning on his way to school to see if her shoes are done, and she can ask him afresh every morning. Alternatively, she can stick up a notice once and for all in the hall which he will see when he leaves for school and which tells him to call for the shoes, and also to destroy the notice when he comes back if he has the shoes with him.
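To make the notice analogy concrete, here is a minimal Python sketch of the loop. Turing gives no code in the paper, so the function name run_errand and the parameters days_until_ready and max_days are my own assumptions for illustration.

```python
def run_errand(days_until_ready: int, max_days: int = 30) -> int:
    """Count Tommy's visits to the cobbler before the notice comes down.

    days_until_ready and max_days are hypothetical parameters, not Turing's.
    """
    notice_up = True   # Mother sticks up the notice once and for all
    visits = 0
    day = 0
    # The notice plays the role of the loop condition: the instruction
    # repeats every morning until something removes it.
    while notice_up and day < max_days:
        day += 1
        visits += 1                          # Tommy calls at the cobbler's
        shoes_ready = day >= days_until_ready
        if shoes_ready:
            notice_up = False                # he destroys the notice on his return
    return visits
```

Calling run_errand(3) makes Tommy visit on three mornings and then stop. The max_days guard is only there so that the sketch terminates even if the shoes are never finished.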
Although fascinating, the details of such functions are quite well known, and hence I won’t elaborate more on them here.
A “machine” is a pretty broad term in general. It has certain states, and rules for how it will behave when it is in each state. For instance, the universe may be thought of as a machine. Its initial state was very, very hot. God’s Rule Book said that in such a state, the universe had to expand and cool down, and that’s exactly what it did. The universe could be described as a continuous state machine: if its initial state had been, say, even one millionth of a degree cooler, it would have evolved very differently. Hence, we can say that the universe is a machine that is very sensitive to initial conditions, and hence is a continuous state machine.
The opposite of a continuous state machine is a discrete state machine. This is a machine that is not very sensitive to initial conditions. An example would be your stereo system. Imagine that you have a stereo system with one of those old-fashioned volume knobs that you turn to increase or decrease the volume. It won’t make that much of a difference to your audio experience if you turn the knob slightly more than you intended to. 70 dB will sound almost the same as 71 dB. Hence, what a lot of companies do is that they break up the volume level on your stereo into discrete states. Because 70 dB sounds almost the same as 71 dB, they can both be clubbed into the volume level “15”. When you turn the volume knob, and see the volume bar in the visual display go up by 1, you’ve moved up from one state to the next. In this way, your stereo system is clearly a discrete state system.
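The stereo example can be sketched in a few lines of Python. The function name volume_level, the 5 dB step and the numbering of levels from 1 are all assumptions of mine, chosen only so that 70 dB and 71 dB both land on level 15 as in the example above; a real stereo may quantize differently.

```python
def volume_level(decibels: float, db_per_level: float = 5.0) -> int:
    """Map a continuous loudness reading onto a discrete volume level.

    The 5 dB bucket width is an assumed value, not a property of any real stereo.
    """
    if decibels < 0:
        raise ValueError("decibels must be non-negative")
    # Every reading inside the same 5 dB bucket collapses to one state,
    # so a small nudge of the knob usually leaves the state unchanged.
    return int(decibels // db_per_level) + 1
```

Here volume_level(70) and volume_level(71) return the same level, while volume_level(75) moves up to the next one. The small difference between 70 and 71 is simply thrown away, which is what makes the state discrete.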
Digital computers are discrete state machines. Discrete state machines are stable to small changes in initial conditions, and hence it becomes easy to predict their behavior in the future, based on knowing their initial state.
Turing then makes a fantastic prediction in this paper:
..I believe that at the end of this century…one will be able to speak of machines thinking without expecting to be contradicted.
Turing claims that this is only a conjecture, but that making conjectures is good for science. This is one conjecture that is perhaps still hotly debated across the world, and “verify you are a human” tests on many websites regularly play The Imitation Game with us.
Can machines have our immortal soul?
Turing anticipates many objections to his claim that machines can “think” if they can mislead other humans. And he deals with these anticipated objections one by one.
The Theological Objection might be that God provides immortal souls capable of thinking only to humans, and not animals or machines. Turing contends that “we should not irreverently usurp his power of creating souls”, and that He might indeed have provided all objects, animate and inanimate, with souls capable of thought. Turing also clarifies that religious arguments don’t impress him anyway.
The Head in the Sand Objection contends that contemplating machines having the capacity for thought is too scary, and hence we should assume that they will never gain this faculty. Intelligent people, basing their superiority on their capacity for thought, often make this argument. Turing thinks that this argument is too weak to even argue against, and that such “intelligent people” just need consolation.
The Mathematical Argument against machines being able to imitate humans well is that every machine is based on a certain logical system. Gödel proved that for every such logical system, you can ask a Yes/No question that it can only answer incorrectly, or never be able to answer. As humans (namely C, in this context) don’t have such a handicap, we can easily find out whether A is a machine or not. However, knowing what this question might be is going to be near impossible, as we wouldn’t know what type of a machine A might be (even if we are fairly sure it is a machine). Hence, we might not know which question will make the machine falter. Moreover, it is also entirely possible that humans would also give the incorrect answer to this question (as it is likely to be a tough logical question). In addition to this, what if this one machine is actually a collection of digital machines, with different such “trap” questions for each? It is going to be tough finding a question for which all such machines will fail. Therefore, this mathematical objection, although valid, is not something we need to worry about too much.
The Argument from Consciousness says that machines cannot think unless they can write a sonnet, feel pleasure at their success, feel warm because of flattery, etc. Turing contends that there is no way of knowing whether machines can feel any of this unless we can become the machine itself, which is impossible. One way to ascertain the emotional content of a machine is to conduct a viva voce, in which answering the questions posed would require some amount of emotional processing. In the example that Turing provides, which he expects a human and a machine to be able to conduct, the interrogator quizzes the witness on the word choices in its sonnet, for instance whether “a spring day” would do as well as “a summer’s day”.
Lady Lovelace, who described in detail the Analytical Engine that Charles Babbage designed, claimed that machines cannot do anything original, and can only do what they’re programmed to do. Turing contends that although the kinds of machines that Lady Lovelace could see perhaps led her to that conclusion, she was incorrect, and that even the Analytical Engine could be suitably programmed such that it could do “original” things. One variation of her statement could be that “machines cannot do anything new”. However, even the “new” things that humans do are inspired, at the very least, by their own experiences. Hence, an even better variant would be “machines cannot surprise us”. This is also incorrect, as humans often make slipshod calculations that lead them to certain conclusions. When they ask machines for answers, they’re often surprised by the (correct) answer that the machines provide. An analogy would be when we incorrectly calculate that preparing for a certain exam would take us one all-nighter, and are surprised by how bad that plan was. We did not correctly calculate our productivity over the course of one night.
The Argument from Continuity says that the nervous system is a continuous state system, and a digital computer is a discrete state system. Hence, a digital computer can never successfully imitate a human being. Turing counters this by saying that a digital computer is capable of imitating continuous state systems. Take a differential analyzer for instance, which is a continuous state system. If asked for the value of π, it might give any value between 3.12 and 3.16, based on its current state. This is a feature of continuous state systems: even a small deviation from the “ideal” state brings about a noticeable change in the output. This could be imitated by a digital computer by having it output 3.12, 3.13, 3.14, 3.15 or 3.16 with probabilities of 0.05, 0.15, 0.55, 0.19 and 0.06 respectively.
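Turing’s own illustration in the paper has the digital computer, when asked for the value of π, choose at random among 3.12, 3.13, 3.14, 3.15 and 3.16 with probabilities 0.05, 0.15, 0.55, 0.19 and 0.06. A minimal Python sketch of that idea follows. The name imitate_analyzer is hypothetical, and this is my reading of the trick rather than anything Turing wrote down as a program.

```python
import random

# Values and probabilities as quoted from Turing's illustration.
VALUES = [3.12, 3.13, 3.14, 3.15, 3.16]
PROBABILITIES = [0.05, 0.15, 0.55, 0.19, 0.06]


def imitate_analyzer(rng: random.Random) -> float:
    """Return one imprecise reading of pi, as a differential analyzer might."""
    # random.choices draws one item according to the given weights.
    return rng.choices(VALUES, weights=PROBABILITIES, k=1)[0]
```

Passing in a random.Random instance, for example imitate_analyzer(random.Random()), keeps the sketch easy to test with a fixed seed. Over many calls the answers cluster around 3.14, so an interrogator looking only at the output would struggle to tell the discrete machine from the continuous one.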
The Argument from Informality of Behavior says that machines follow a rule book, which tells them how to behave under certain circumstances. Moreover, the behavior of a machine can be completely studied in a reasonable amount of time, such that we’ll be able to predict the behavior of a machine perfectly in any given situation. Humans are not predictable in such a fashion. Hence, humans and machines are different, and can be easily distinguished. Turing argues against this by saying that it is possible that such a “rule book” for humans may also exist, and the unpredictability of humans is just a result of the fact that we haven’t found all the rules in the book yet. Moreover, he says he has written a program on a relatively simple Manchester computer, which when supplied with one 16 digit number returns another such number within 2 seconds. He claims that humans will not be able to predict what number this program returns even if they get a thousand years to study the machine.
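Turing does not describe his Manchester program, so the following is only a modern stand-in that I am assuming for illustration: it maps one 16 digit number to another using a cryptographic hash, which has the same flavor of being fully deterministic and yet very hard to predict from examples. The function name reply is hypothetical.

```python
import hashlib


def reply(number: int) -> int:
    """Map a number of at most 16 digits to another such number.

    This is NOT Turing's program; SHA-256 is a stand-in chosen for illustration.
    """
    if not 0 <= number < 10**16:
        raise ValueError("expected a number with at most 16 digits")
    digest = hashlib.sha256(str(number).encode("ascii")).digest()
    # Keep the first 8 bytes of the digest and reduce to at most 16 decimal digits.
    return int.from_bytes(digest[:8], "big") % 10**16
```

The same input always gives the same output, so the machine is following a fixed rule book. Even so, watching its replies to a handful of inputs tells you essentially nothing about its reply to a new one.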
One of the more interesting sections of the paper is where Turing says that if player B is telepathic, then this game would break down, as she (player B is a woman) would easily be able to pass tests like ‘C asks “What number am I thinking?”‘. Clearly, a machine would be unable to guess this number with any degree of certainty. While Turing contends that telepathy is indeed real, he overcomes this problem by suggesting that all the participants sit in “telepathy proof” rooms.
Turing says that in some sense, a machine can be made to become like a “super-critical” brain (something which is capable of devising plans of action after being given an idea), and that in another sense a brain is also a machine. However, these ideas are likely to be contested. The only proof that a learning machine can exist will come when one such machine is constructed, which for all practical purposes lies far ahead in the future.
But what are the constraints when one tries to construct such a machine? Turing cites estimates that the storage capacity of a human brain is somewhere between 10^10 and 10^15 binary digits, which it uses to process ideas and “think”. These binary digits can be thought of as analogues to neurons, which are 1 when they fire, and 0 when they don’t. A particular combination of neurons firing leads to resultant thoughts and actions. Turing thinks that the storage space needed for containing this many binary digits can be easily added to a computer. Hence, there is no physical constraint on constructing this device. The only question is, how do we program a computer to behave like a human being?
One could theoretically observe human beings closely, see how they behave under all possible circumstances, and then program a computer to behave in exactly that way. However, such an endeavor is likely to fail. One can instead program a computer to behave like a human child, and then make it experience the same things that a human child experiences (education, interacting with others, etc). Because a child is mostly a blank slate, and forms its picture of the world and behavioral paradigms based on its experiences, a computer may learn how to be a human adult (and hence imitate a human adult) in exactly the same way. Turing concedes that constructing a “child machine” is a difficult task, and that some trial and error is required.
A child learns through a “reward and punishment” system. What rewards and punishments can a teacher possibly give to a machine? A machine has to be programmed to respond to rewards and punishments much like a child does. It should repeat behaviors for which it is praised by the teacher, and not repeat behaviors for which it is scolded or punished. Also, it can be fed with programs that ask it to do exactly as the teacher says. Although inferring how the machine should behave based on fuzzy input from the external world might lead to mistakes, Turing contends that this is not any more likely than “falling over unfenced cliffs”. Moreover, suitable imperatives can be fed into the machine to further reduce such errors.
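The reward and punishment idea can be sketched as a tiny program. Turing describes the principle but not an algorithm, so everything here is an assumption of mine: the ChildMachine class, the doubling and halving of weights, and the behavior names used in the usage note.

```python
import random


class ChildMachine:
    """A toy learner that favors praised behaviors and avoids punished ones."""

    def __init__(self, behaviors, seed=0):
        # Every behavior starts out equally likely.
        self.weights = {behavior: 1.0 for behavior in behaviors}
        self.rng = random.Random(seed)

    def act(self):
        """Pick a behavior at random, in proportion to its current weight."""
        behaviors = list(self.weights)
        weights = [self.weights[b] for b in behaviors]
        return self.rng.choices(behaviors, weights=weights, k=1)[0]

    def feedback(self, behavior, rewarded):
        """Double the weight of a praised behavior, halve a punished one."""
        self.weights[behavior] *= 2.0 if rewarded else 0.5
```

A teacher loop would repeatedly call act(), then call feedback() with rewarded=True for, say, "say please" and rewarded=False for "shout". After a few rounds the machine almost always picks the praised behavior, without anyone having written that rule into it directly.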
A critical feature of “learning machines” is that they change their rules of behavior with time. Suppose they have been programmed to go to school every day at 8 am. One day, the teacher announces that school will begin at 9 am the next day. Learning machines are able to change the 8 am rule to 9 am. Adapting rules of behavior based on external input is also a feature of human beings.
How does a machine choose its mode of behavior though? Suppose the parents say “I want you to be on your best behavior in front of the guests”. What does that mean? There are a lot of near solutions to this problem: the machine could sit in a corner silently throughout the party, it could perform dangerous tricks to enthrall and entertain the guests, etc. How does it know which of these solutions is best? Turing suggests that we give the machine a random element. Let us suppose that all the “solutions” mentioned above are assigned numbers in some fixed range. The machine could pick any number in that range at random, try out the solution corresponding to that number, evaluate the efficacy of that solution, and then pick another number. After a certain number of trials (maybe 15), it could pick the solution that was the most effective. Why not try every number in the range? Because that would take too much time and computation.
By the time the machine is an adult, it will be difficult to accurately predict how the machine will behave in a certain situation, because we cannot know all the experiences that the machine has been through, and how its behavioral patterns have evolved. Hence, if we program something into it, it might behave completely unexpectedly: it might “surprise us”. Also, because its learned behavior and traits are unlikely to be perfect, it is expected to behave less than optimally in many situations, and hence mimic “human fallibility”. For all purposes, our machine will have become a human.
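The random trial procedure described above can be sketched as follows. The function name pick_behavior, its parameters and the idea of scoring each candidate with an evaluate function are my own assumptions; the paper only describes the random element in words.

```python
import random


def pick_behavior(candidates, evaluate, trials=15, seed=None):
    """Try a few randomly chosen candidates and keep the best one seen.

    evaluate is a hypothetical scoring function: higher means more effective.
    """
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(trials):
        # Pick a numbered solution at random rather than trying them all.
        index = rng.randrange(len(candidates))
        score = evaluate(candidates[index])
        if score > best_score:
            best, best_score = candidates[index], score
    return best
```

With a made-up scoring such as {"sit silently": 2, "make polite conversation": 9, "dangerous tricks": 1}, calling pick_behavior(list(scores), scores.get) returns the best behavior among those it happened to sample. With few trials and many candidates it can miss the true optimum, which is the price paid for not evaluating everything.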
What is incredible to me is that this is exactly how neural networks behave. It is insane that Turing could envision such a futuristic technology more than half a century ago. Thanks for reading!