

AI Chatbots are easily fooled by nonsense. Can their flaws offer insights about the human brain?

Peeking into chatbot mistakes, scientists probe deeper into human cognition.

Tibi Puiu
September 16, 2023 @ 2:19 am


Credit: Pixabay.

You’ve likely chatted with an AI chatbot, such as ChatGPT or Google’s Bard, marveling at its ability to mimic human conversation. But ‘mimicry’ is the keyword here, as these bots aren’t actually thinking machines. Case in point: researchers deliberately threw a curveball at some of the most popular chatbots currently available, showing they can easily get tripped up by sentences that sound nonsensical to our ears.

These AIs, powered by immense neural networks and trained on millions upon millions of examples, perceived these nonsense sentences as ordinary language. It’s a good example of the limitations of systems whose capabilities are often severely overblown and hyped up on social media. If these results are any indication, we’re still a long way from Skynet (thank God!).

However, the same results also offer an intriguing revelation — studying these AI missteps could not only boost chatbot efficiency but also unveil secrets about the inner workings of human language processing.

Of Transformers and Recurrent Networks

Credit: Columbia Zuckerman Institute.

Researchers at Columbia University compiled hundreds of sentence pairs — one that made sense, the other more likely to be judged as gibberish — and had humans rate which one sounded more “natural”. They then challenged nine different large language models (LLMs) with the same sentence pairs. Would the AI judge the sentences as we did?

The results were telling. AIs built on what the tech world calls “transformer neural networks”, such as ChatGPT, outperformed peers that rely on simpler recurrent neural networks and statistical models. Yet all the models, irrespective of their sophistication, faltered: many times, they favored sentences that might make you scratch your head in confusion.

Here’s an example of a sentence pair used by the study:

  1. That is the narrative we have been sold.
  2. This is the week you have been dying.

Which one do you reckon you’d hear more often in a conversation and makes more sense? Humans in the study gravitated toward the first. Yet, BERT, a top-tier model, argued for the latter. GPT-2 agreed with us humans on this one, but even it failed miserably during other tests.
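The core test here is simple: ask a model which of two sentences it assigns a higher probability, and compare that choice to human judgments. The study used nine full-scale language models, but the idea can be sketched with a toy stand-in. Below is a minimal, hypothetical illustration using an add-one-smoothed bigram model trained on a few made-up sentences; the corpus, function names, and the n-gram approach itself are assumptions for illustration only, not the study's actual method (the researchers scored sentences with neural networks, not bigram counts).

```python
import math
from collections import defaultdict

def train_bigram(corpus):
    """Count unigram and bigram frequencies over a list of sentences."""
    unigrams, bigrams = defaultdict(int), defaultdict(int)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, tok in zip(tokens, tokens[1:]):
            unigrams[prev] += 1
            bigrams[(prev, tok)] += 1
        unigrams[tokens[-1]] += 1  # count the final </s> token too
    return unigrams, bigrams

def log_prob(sentence, unigrams, bigrams, vocab_size):
    """Add-one-smoothed log-probability of a sentence under the bigram model."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return sum(
        math.log((bigrams[(prev, tok)] + 1) / (unigrams[prev] + vocab_size))
        for prev, tok in zip(tokens, tokens[1:])
    )

# Tiny made-up training corpus (an assumption for illustration only).
corpus = [
    "that is the narrative we have been sold",
    "that is the story we have been told",
    "this is the week it all happened",
]
unigrams, bigrams = train_bigram(corpus)
vocab_size = len(unigrams)

pair = ("that is the narrative we have been sold",
        "this is the week you have been dying")
scores = {s: log_prob(s, unigrams, bigrams, vocab_size) for s in pair}
preferred = max(scores, key=scores.get)
print(preferred)  # the toy model prefers the sentence closer to its training data
```

The point of the sketch is the comparison step: whichever sentence the model assigns a higher (log-)probability is the one it "judges" more natural, and the study checked how often that judgment matched the human one.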

“Every model showcased limitations, sometimes tagging sentences as logical when humans deemed them gibberish,” remarked Christopher Baldassano, a professor of psychology at Columbia.

“The fact that advanced models perform well implies they have grasped something pivotal that simpler models overlook. However, their susceptibility to nonsense sentences indicates a disparity between AI computations and human language processing,” says Nikolaus Kriegeskorte, a key investigator at Columbia’s Zuckerman Institute.

The limits of AI and bridging the gap

This brings us to a pressing concern: AI still has blind spots, and it’s not nearly as ‘smart’ as you might think, which is both good and bad news depending on how you look at it.

In many ways, this is a paradox. We’ve heard how LLMs like ChatGPT can pass the US medical licensing and bar exams. At the same time, the same chatbots often can’t solve simple math problems or spell words like ‘lollipop’ backwards.

As the present research shows, there’s a wide gap between these LLMs and human intelligence. Untangling this performance gap will go a long way towards catalyzing advancements in language models.

For the Columbia researchers, however, the stakes are even higher. Their agenda doesn’t involve making LLMs better but rather teasing apart their idiosyncrasies to learn more about what makes us tick, specifically how the human brain processes language.

A human child exposed to a limited household vocabulary quickly learns to speak and articulate their thoughts. Meanwhile, ChatGPT was trained on millions of books, articles, and webpages, and it still gets fooled by utter nonsense.

“AI tools are powerful but distinct in processing language compared to humans. Assessing their language comprehension in juxtaposition to ours offers a fresh perspective on understanding human cognition,” says Tal Golan, the paper’s lead author, who recently moved from the Zuckerman Institute to Ben-Gurion University of the Negev.

In essence, as we peer into the errors of AI, we might just stumble upon deeper insights into ourselves. After all, in the words of the ancient philosopher Lao Tzu, “From wonder into wonder, existence opens.”

The findings appeared in the journal Nature Machine Intelligence.
