Is the Turing Test the Last Word in Robot Intelligence? Don't Count On It

Convincing humans you're human is easy. Convincing humans you're smart takes some doing.

Robin Zebrowski/Flickr

Back in 1950, computer scientist, codebreaker, and war hero Alan Turing introduced the world to a very simple premise: If a robot can engage in a text-based conversation with a person and fool that person into believing it is human at least 30 percent of the time, surely we could agree that the robot is a “thinking” machine. Turing’s goal was to force people to think more creatively about computer interaction, but he inadvertently ended up creating the test that robot intelligence developers and commentators have relied on for years. Serious artificial intelligence thinkers, however, aren’t focused on fooling judges a third of the time to satisfy a long-dead genius. They’re focused on more substantive metrics.

Fundamentally, the problem with the Turing Test is that it’s poorly defined and therefore facilitates hype (e.g. that chatbot teaching assistant at Georgia Tech) rather than offering reproducible results. Beyond that, one can argue that it measures human weakness, not artificial strength. Deception and deflection can allow a relatively unsophisticated chatbot to “pass the test.” For example, a bot named Eugene Goostman, designed to impersonate a 13-year-old Ukrainian boy, recently tricked a third of a panel of judges into believing the ruse. Eugene comes off as a bit of a doofus in conversation, and this turned out to be his secret weapon. Judges were expecting a robot programmed for intelligence, not one that avoided questions, made bad jokes, dropped malapropisms, and peppered the text with emoticons.

If not the Turing Test, then what? Researchers around the globe have come up with some alternatives.

Deciphering Ambiguous Sentences

A fundamental problem with the Turing chatbots is that machines still have a really hard time understanding sentences that would immediately make sense to a human. “Peter yelled at Paul, because he slept with his girlfriend.” To a human, it’s immediately clear that Paul slept with Peter’s girlfriend, but to a computer “he” and “his” could each refer to either man. Understanding what happened requires knowing something about what it means to yell at someone, and under what conditions a person might be motivated to do it.

Hector Levesque, a professor of computer science at the University of Toronto, has proposed challenging machines to pull meaning from these sorts of ambiguously constructed sentences, called Winograd schemas, as an alternative to the Turing Test. This would require going beyond mimicking human language and into the realm of actual understanding. Already, a $25,000 prize is on offer to the developer who can make a bot that performs as well as a human on this task, although the bot may consider each question for up to five minutes.

Facial Recognition

Some A.I. researchers have considered the idea that machine intelligence can and should go beyond language. Facial recognition is an example of something that humans do particularly well — a baby can recognize its mother within weeks of birth, after all.

Some computers are already outcompeting humans at recognizing faces, although whether this is a measure of true intelligence is still a matter of debate. A machine programmed to be very good at one thing is quite different from one with the sort of flexible intelligence that could be put to use in different ways and in different situations.

College Acceptance

Japanese roboticists are trying to build a robot that can get into college. The entrance exams for the University of Tokyo are notoriously difficult, and much more so for a robot than a high school senior.

Unfortunately for robots, being good at tests takes a lot more than memorizing lots of facts. Math questions don’t give you an equation to solve — they describe a scenario in plain language, and leave it up to you to figure out how to build an equation that will come to the right answer. Even a straightforward question about a historical fact could be complicated if the robot can’t grasp the syntax or context of the language used.

And the entrance exams aren’t just a multiple choice test — the robot would also have to write essays. Presumably, plagiarism would not be allowed, and the machine would have to generate some prose on a given subject that is both original and intelligent. Given that robots have a hard enough time mimicking the language of a 13-year-old, this seems pretty far off. Still, the researchers involved say they hope to see their little bot off to college by 2021.


Sports Commentary

This one is a particularly high bar. Commenting on a sports game involves taking in complex audio-visual information and communicating what is happening in plain language. A robot would have to have very good language skills in addition to a visual processing system.

If a computer could produce even a half-decent live report on a football game, humans might be able to agree that that robot is pretty damn smart. Then again, perhaps 65 years from now sports commentator bots will seem particularly two-dimensional, and we’ll have to come up with some new hurdles for them to leap.
