Computing Community Consortium Blog

The goal of the Computing Community Consortium (CCC) is to catalyze the computing research community to debate longer range, more audacious research challenges; to build consensus around research visions; to evolve the most promising visions toward clearly defined initiatives; and to work with the funding organizations to move challenges and visions toward funding initiatives. The purpose of this blog is to provide a more immediate, online mechanism for dissemination of visioning concepts and community discussion/debate about them.

Nearing the Turing Test

September 4th, 2012 / in research horizons / by Erwin Gianchandani

Freelance writer Dan Falk penned an interesting story for The Telegraph last month, reflecting on his experience as a judge in the Turing Test Marathon this summer:

Will this summer be remembered as a turning point in the story of man versus machine? On June 23, with little fanfare, a computer program came within a hair’s breadth of passing the Turing test, a kind of parlour game for evaluating machine intelligence devised by mathematician Alan Turing more than 60 years ago.


This wasn’t as dramatic as Skynet becoming self-aware in the Terminator films, or HAL killing off his human crew mates in 2001, A Space Odyssey. But it was still a sign that machines are getting better at the art of talking — something that comes naturally to humans, but has always been a formidable challenge for computers.


Turing proposed the test — he called it “the imitation game” — in a 1950 paper titled “Computing machinery and intelligence.” Back then, computers were very simple machines, and the field known as Artificial Intelligence (AI) was in its infancy. But already scientists and philosophers were wondering where the new technology would lead. In particular, could a machine “think” [more following the link]?


Turing considered that question to be meaningless, so proposed the imitation game as a way of sidestepping the question. Better, he argued, to focus on what the computer can actually do: can it talk? Can it hold a conversation well enough to pass for human? If so, Turing argued, we may as well grant that the machine is, at some level, intelligent.


In a Turing test, judges converse by text with unseen entities, which may be either human or artificial. (Turing imagined using teletype; today it’s done with chat software.) A human judge must determine, based on a five-minute conversation, whether his correspondent is a person or a machine.


Turing speculated that by 2000, “an average interrogator will not have more than a 70 per cent chance of making the right identification” — that is, computers would trick the judges 30 per cent of the time. For years, his prediction failed to come true, as software systems couldn’t match wits with their human interrogators. But in June, they came awfully close.


The event in question, billed as a “Turing test marathon,” was organised by the University of Reading as part of the centenary celebrations of the mathematician’s birth — and held, appropriately enough, at Bletchley Park in Buckinghamshire, where he played a key role in cracking the Enigma code as part of the Allied code-breaking effort. I joined 29 other judges in chatting electronically with 25 “hidden humans” (ensconced in an adjacent room) and five sophisticated “chatbots” — computer programs designed to imitate human intelligence and ability to converse.


Altogether, some 150 separate conversations were held. The winning program, developed by a Russian team, was called “Eugene.” Attempting to emulate the personality of a 13-year-old boy, Eugene fooled the judges 29.2 per cent of the time, just a smidgen below Turing’s 30 per cent threshold.


As a judge, I got a first-hand look at the strengths and weaknesses of the test. First of all, there’s the five-minute time limit — an arbitrary figure mentioned by Turing in his paper. The shorter the conversation, the greater the computer’s advantage; the longer the interrogation, the higher the probability that the computer will give itself away — typically by changing the subject for no reason, or by not being able to answer a question. The 30 per cent mark, too, is arbitrary.


But what about the nature of the test itself? Traditionally, language has been seen as the ultimate hallmark of intelligence, which is why Turing chose to focus on it. Yet while it may be our most impressive cognitive tool, it is certainly not the only one. In fact, what gives our species its edge may be the sheer variety of skills we have at our disposal, rather than its proficiency at any one task. “Human intelligence,” says Manuela Veloso, a computer scientist at Carnegie Mellon University, “has to do with the breadth of things that we can do.”


Not that we would necessarily want a machine that could “do it all.” Aside from being a staggeringly ambitious task, the idea of building an all-purpose robot — an “artificial human” — has never been a useful approach to AI, not least because it would simply replicate our own impressive capabilities.


Instead, the greatest progress has come when AI is applied to very specific tasks, such as the satellite navigation system in your car, the apps on your iPhone, or the search engines that pull needles out of the Internet’s haystack. Indeed, its most widely publicised achievements — the chess-playing skills of the computer Deep Blue, or the quiz knowledge of an IBM supercomputer called Watson, which last year triumphed in the American TV show Jeopardy! — are very narrow indeed. (Watson can answer difficult trivia questions with impressive skill, but it can’t do your taxes, fold your laundry, or make you a cup of tea.)


“That very simplistic idea, that we’re trying to imitate a human being, has sort of become an embarrassment,” says Pat Hayes of the Institute for Human and Machine Cognition in Pensacola, Florida. “It’s not that we’re failing, which is what a lot of people think — it’s that we’ve decided to do other things which are far more interesting.”


Finally, there’s another aspect of the Turing test that’s easy to overlook: it makes a virtue out of deception, forcing the machine to pretend to be something it’s not. “This is a test of being a successful liar,” says Hayes. “If you had something that really could pass Turing’s imitation game, it would be a very successful human mimic.”



At the opposite end of the spectrum, there are those who feel that the difference between machines and humans is merely one of complexity. The staunchest defender of that view is probably the philosopher Daniel Dennett, of Tufts University in Massachusetts. Dennett rejects the idea that we have a mysterious “essence” endowed by our biological structure that underlies our cognitive abilities. “It’s not impossible to have a conscious robot,” he told me. “You’re looking at one.” What he means is that the human brain is the most complex known arrangement of matter in the universe.


That, more than anything, accounts for the challenges facing AI: that while the human body is a machine, too, it’s one that far exceeds the current capabilities of information technology.


Perhaps, however, we’re closer than we think to “true” AI. After the Wright brothers’ aeroplane lifted off in 1903, [skeptics] continued to debate whether we were “really” flying, an argument that simply faded away. It may be like that with AI. As Hayes argues, “You could argue we’ve already passed the Turing test.” If someone from 1950 could talk to Siri, he says, they’d think they were talking to a human being. “There’s no way they could imagine it was a machine — because no machine could do anything like that in 1950. So I think we’ve passed the Turing test, but we don’t know it.”

Read the full story here — and share your thoughts below.

(Contributed by Erwin Gianchandani, CCC Director)
