'Artificial Intelligence' Fools Judges, but Experts Aren't Impressed

A chat-bot recently passed the Turing Test, which is often considered the gold standard for judging artificial intelligence. However, experts are not as impressed as you might imagine, saying that the Turing Test's time is long gone.

The Turing Test - devised by computing pioneer and mathematician Alan Turing - is rather simple. It requires a number of human judges to have blind conversations with a computer program. If, at the end of these conversations, more than 30 percent of the judges are duped into thinking they were conversing with a real person, the program is deemed "intelligent," or at least efficient at imitating human intelligence.

Earlier this month, at an event held at the Royal Society in London, the program "Eugene Goostman" did just that, convincing 10 out of 30 judges in online chats that "he" was a 13-year-old Ukrainian boy.

Some scientists are hailing this as an alarming accomplishment of artificial intelligence, with author, lecturer, and CNN columnist Mark Goldfeder going so far as to ask when exactly robots will need to be granted "legal personhood," at least for purposes of liability.

However, many other experts are simply smirking at all this sudden interest in Eugene.

First Impressions

Celeste Biever, a New Scientist writer, met Eugene, "a guinea pig-owning, 13-year-old boy living in Odessa, Ukraine," two years ago at Bletchley Park in Milton Keynes, UK.

Biever served as a judge at the world's first 30-judge Turing test - a size deemed far more adequate for a true application of the test. There, she and her fellow judges carried out 150 separate conversations on an instant messaging program. Some of these conversations were with real, live people, while others were with programs like Eugene, according to a 2012 article detailing the experience.

According to Biever, unlike other chat-bots - such as programmer Rollo Carpenter's web-famous Cleverbot - Eugene always has the same personality. He is always from Ukraine, he is always 13, and he always owns a guinea pig.

Eugene's creator, Vladimir Veselov, told Biever that this helps Eugene get away with some strange answers.

"Thirteen years old is not too old to know everything and not too young to know nothing," he explained.

In that 2012 test, Eugene took first place - fooling nine judges - but fell just shy of "passing" the Turing Test. That was rectified this year when the six-year veteran program fooled 10 out of 30 people.
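The difference between nine and 10 judges comes down to simple arithmetic. The sketch below takes the article's figures at face value and assumes the pass mark is strictly "more than 30 percent" of judges, as described earlier. The function name and constant are illustrative only and do not come from the competition's organizers.

```python
from fractions import Fraction

# Assumed pass mark: strictly more than 30 percent of judges fooled.
PASS_THRESHOLD = Fraction(30, 100)


def passes_turing_threshold(judges_fooled: int, judges_total: int) -> bool:
    """Return True if strictly more than 30 percent of judges were fooled."""
    if judges_total <= 0:
        raise ValueError("judges_total must be positive")
    # Exact fractions avoid floating-point trouble right at the 30 percent mark.
    return Fraction(judges_fooled, judges_total) > PASS_THRESHOLD


print(passes_turing_threshold(9, 30))   # 2012: exactly 30 percent -> False
print(passes_turing_threshold(10, 30))  # this year: about 33.3 percent -> True
```

Under that reading, nine of 30 judges is exactly 30 percent and so falls short, while 10 of 30 clears the bar by a single judge.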

Eugene Isn't Alone

Despite what many media outlets are reporting, Eugene is not the first program to successfully pass the Turing Test.

A recent New Scientist article details how a program called PC Therapist successfully fooled five out of 10 judges in a small-scale Turing test in 1991.

Cleverbot, a relatively new favorite for these competitions, took third place in the larger 2012 competition. However, a year earlier the program convinced 59.3 percent of some 1,300 voters at the Techniche festival in Guwahati, India, that it was human.

According to the program's site, the chat-bot "learns" appropriate responses to questions from crowd-sourced social media and constant user interaction.

However, this means that depending on the tone and personality of the original reply's source or sources, Cleverbot's "mood" can rapidly change, potentially giving it away or making it seem even more human, depending on a bit of luck.

In my own attempt at chatting with Cleverbot, it went from wanting my help with homework to being my mortal enemy in fewer than 20 keystrokes.

Cracks in the Facade

This also shows just how far chat-bots like Eugene and Cleverbot really are from actual artificial intelligence.

Computing pioneer Alan Turing first proposed this test for artificial intelligence in 1950 in his revolutionary paper "Computing Machinery and Intelligence." The test was designed to measure how human-like a computer could be. However, in this work, the scientist quickly turns the question "can machines think?" into the more practical question, "are there imaginable digital computers which would do well in the imitation game?" noting that machines would only ever be able to "think" and "converse" in terms of what their designers deem to be sensible replies.

This is what we see in Cleverbot and even Eugene, despite its human-like personality.

Nature World News was unable to obtain a copy of any conversation transcripts with Eugene. However, David Auerbach, of Slate magazine's Bitwise column, was able to pull some strings to get his hands on a judge's conversation with the Eugene program back in 2012.

In an analysis of the transcript, Auerbach, a software engineer based in New York, quickly determined that if the Eugene program didn't understand a question, it would automatically tell a keyword-related joke or "attempt to change the topic."

For instance, at one point a judge asks "what is the weather like tomorrow?" Eugene responds with what appears to be a weather-related inside joke from his hometown, then quickly segues into asking the judge if he or she likes the weather.

According to Auerbach, this is clearly "a wholly scripted line by the program's creator."

Because the program couldn't possibly craft a response about the weather unless it were hooked up to the Internet, it is forced to use an automated statement.

"Eugene's joke and subsequent question attempt to make the judge forget that his own question was not answered," Auerbach adds.

While clever, this shows exactly what Turing meant by the "imitation game," in which these programs are not truly "thinking" about their responses, but instead crafting believable ones based on a programmed set of rules.
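The deflection behavior described in the transcript analysis above can be illustrated with a few lines of code. This is only a minimal sketch of a keyword-triggered, rule-based reply. It is not Eugene's actual code, and the keywords, canned lines, and the `reply` function are all hypothetical.

```python
import re

# Hypothetical scripted lines keyed by topic; none are taken from a real transcript.
SCRIPTED_DEFLECTIONS = {
    "weather": "In my town the forecast is always wrong! Do you like the weather today?",
    "music": "My guinea pig squeaks along to the radio. What do you listen to?",
}

# Hypothetical topic change used when no keyword matches.
FALLBACK = "Let's talk about something else. What do you do for a living?"


def reply(message: str) -> str:
    """Return a canned line for the first known keyword, else change the topic."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    for keyword, line in SCRIPTED_DEFLECTIONS.items():
        if keyword in words:
            # The question itself is never answered, only deflected.
            return line
    return FALLBACK


print(reply("What is the weather like tomorrow?"))
```

Nothing in the sketch looks up tomorrow's forecast. It only spots the word "weather" and returns a prepared line that ends in a question, which hands the conversation back to the judge.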

According to Biever, who saw right through Eugene's human imitation in 2012, the Turing Test is ill-suited for the modern world, where an online community has refined mindless "chatting" into an art.

"Turing Test has come to symbolize machine intelligence but it only tests machines on their ability to chat - and people are capable of so much more," she writes.

Biever and others are now calling for a "universal intelligence test" that truly analyzes thought and not just an appearance of it. However, how such a test would be designed remains very much a mystery.

Tags: Robotics, Artificial intelligence
