homehome Home chatchat Notifications


Voice mimicking AI dupes Alexa and other voice recognition devices

These algorithms open up a new avenue of scamming and hacking.

Tibi Puiu
October 14, 2021 @ 5:48 pm

share Share

Credit: Pixabay.

Deepfakes (a portmanteau of “deep learning” and “fake“) are synthetic media in which a real person’s pictures, videos, or speech are converted into someone else’s (often a celebrity’s) artificial, AI-generated likeness. You may have come across some on the internet before, such as Tom Cruise deepfakes on Tik Tok or Joe Rogan voice clones.

While image and video varieties are more convincing, the impression was that audio deepfakes have lagged behind — not without copious amounts of training audio, at least. But a new study serves as a wake-up call, showing that voice copying algorithms that are easy to find on the internet are already pretty good. In fact, the researchers found that with minimal amounts of training, these algorithms can fool voice recognition devices, such as Amazon’s Alexa.

Researchers at the University of Chicago’s Security, Algorithms, Networking and Data (SAND) Lab tested two of the most popular deepfake voice synthesis algorithms — SV2TTS and AutoVC — both of which are open-source and freely available on Github.

The two programs are known as ‘real-time voice cloning toolboxes’. The developers of SV2TTS boast that only five seconds’ worth of training recordings are enough to generate a passable imitation.

The researchers put both systems to the test by feeding them the same 90 five-minute voice recordings of different people talking. They also recorded their own samples from 14 volunteers, who were asked for permission to see whether the computer-generated voices could unlock their voice recognition devices, such as Microsoft Azure, WeChat, and Amazon Alexa.

SV2TTS was able to trick Microsoft Azure about 30 percent of the time but got the best of both WeChat and Amazon Alexa almost two-thirds, or 63 percent, of the time. A hacker could use this to log into WeChat with a synthetic vocal message mimicking the real user or access a person’s Alexa to make payments to third-party apps.

AutoVC performed quite poorly, being able to fool Microsoft Azure only 15 percent of the time. Since it fell short of expectations, the researchers didn’t bother to test it against WeChat and Alexa voice recognition security.

In another experiment, the researchers enlisted 200 volunteers who were asked to listen to pairs of recordings and identify which of two they thought was fake. The volunteers were tricked nearly half the time, which made their judgments no better than a coin toss.

The most convincing deepfake audios were those mimicking women’s voices and those of non-native English speakers. This is something that researchers are currently looking into.

‘We find that both humans and machines can be reliably fooled by synthetic speech and that existing defenses against synthesized speech fall short,’ the researchers wrote in a report posted on the open-access server arXiv

‘Such tools in the wrong hands will enable a range of powerful attacks against both humans and software systems [aka machines].’

In 2019, a scammer performed an ‘AI heist’, using deep fake voice algorithms to impersonate a German executive at an energy company and convince employees to wire him $240,000. According to the Washington Post, the person who performed the wire transfer found it odd that their boss would make such a request, but the German accent and familiar voice heard over the phone was convincing. Cybersecurity firm Symantec says it has identified similar cases of deepfake voice scams that resulted in losses in the millions of dollars. 

share Share

This 5,500-year-old Kish tablet is the oldest written document

Beer, goats, and grains: here's what the oldest document reveals.

A Huge, Lazy Black Hole Is Redefining the Early Universe

Astronomers using the James Webb Space Telescope have discovered a massive, dormant black hole from just 800 million years after the Big Bang.

Did Columbus Bring Syphilis to Europe? Ancient DNA Suggests So

A new study pinpoints the origin of the STD to South America.

The Magnetic North Pole Has Shifted Again. Here’s Why It Matters

The magnetic North pole is now closer to Siberia than it is to Canada, and scientists aren't sure why.

For better or worse, machine learning is shaping biology research

Machine learning tools can increase the pace of biology research and open the door to new research questions, but the benefits don’t come without risks.

This Babylonian Student's 4,000-Year-Old Math Blunder Is Still Relatable Today

More than memorializing a math mistake, stone tablets show just how advanced the Babylonians were in their time.

Sixty Years Ago, We Nearly Wiped Out Bed Bugs. Then, They Started Changing

Driven to the brink of extinction, bed bugs adapted—and now pesticides are almost useless against them.

LG’s $60,000 Transparent TV Is So Luxe It’s Practically Invisible

This TV screen vanishes at the push of a button.

Couple Finds Giant Teeth in Backyard Belonging to 13,000-year-old Mastodon

A New York couple stumble upon an ancient mastodon fossil beneath their lawn.

Worms and Dogs Thrive in Chernobyl’s Radioactive Zone — and Scientists are Intrigued

In the Chernobyl Exclusion Zone, worms show no genetic damage despite living in highly radioactive soil, and free-ranging dogs persist despite contamination.