This AI will let you listen to one person and mute everyone else in a crowd

When you pair this AI with noise-canceling headphones, it allows you to listen to only the person you want to talk to in a crowd.

Rupendra Brahambhatt
August 17, 2024 @ 1:46 am

Imagine you’re at a concert with your significant other and want to say something special when their favorite song starts to play. However, you realize the music is so loud that they won’t be able to hear your special words. What can you do?

You’ll most likely have to postpone it because talking and listening to someone in a loud, noisy, and crowded environment is often challenging. But, you know what? AI can solve this challenge too.

A team of researchers at the University of Washington (UW) has developed an AI system that lets you listen to one specific person in a crowded environment using ordinary noise-canceling headphones.

A kid wearing headphones. Image credits: Alireza Attari/Unsplash

All you need to do is look at the person, press a button, and enroll them. The AI system, called “Target Speech Hearing,” will then remove all the surrounding noise and sounds. Now you can talk and listen to the enrolled person even when they are not facing you or are lost somewhere in the crowd.

“As urban environments get more noisy, this technology gives us back some control over our acoustic scene and what we want to focus on. This can also be very beneficial for hearing aids for folks who have hearing loss,” Shyam Gollakota, one of the researchers and Head of the Mobile Intelligence Lab at the University of Washington, told ZME Science.

How does target speech hearing work?

Commercially available noise-canceling headphones eliminate the noise in your environment, allowing you to listen to songs undisturbed. However, you can’t use them to pick out the sound of one particular person or object. This is where target speech hearing (TSH) can help.

Do you ever wonder why familiar voices, like those of a close friend or parent, stand out to us in crowded environments? This is because our brains are capable of focusing on sounds from a target source, given prior knowledge of what the source sounds like. 

So, TSH works similarly to the human brain. It allows headphones to learn a target speaker’s voice and how it differs from other human voices in the environment. Here is a step-by-step explanation of how it works:

  • A user wearing headphones equipped with TSH clicks a button on the headphones and looks at the target speaker for a few (two to five) seconds. 
  • During this time, the system captures a noisy audio example from the target across the left and right microphones. 
  • The system uses this recording to extract the speaker’s voice characteristics even when there are other speakers and noises in the vicinity. This is called the enrollment stage. 
  • The neural network is then trained on the voice’s characteristics within the two- to five-second timeframe.
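The enrollment step above can be illustrated with a toy sketch. It rests on an assumption that is not spelled out in the article: when the wearer looks straight at the speaker, the speaker's voice reaches the left and right microphones at the same moment, while sound from the side reaches one ear slightly later. Averaging the two channels then keeps the frontal voice and weakens off-axis sound. The tones, the sample rate, and the function names (`tone`, `mix`, `power`, `enroll`) are all hypothetical stand-ins, and the real system uses a neural network rather than simple averaging.

```python
import math

SAMPLE_RATE = 16_000  # Hz; an assumed rate, not taken from the study


def tone(freq_hz, num_samples, delay_samples=0):
    """Return a pure tone, optionally delayed by a whole number of samples."""
    return [
        math.sin(2 * math.pi * freq_hz * (i - delay_samples) / SAMPLE_RATE)
        for i in range(num_samples)
    ]


def mix(*signals):
    """Add several equal-length signals sample by sample."""
    return [sum(samples) for samples in zip(*signals)]


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)


def enroll(left, right):
    """Average both ear channels: sound that is identical in both ears
    (straight ahead) is kept, sound that differs between ears is reduced."""
    return [(l + r) / 2 for l, r in zip(left, right)]


N = 1600  # 0.1 seconds of audio at the assumed sample rate

# Stand-in for the target voice: straight ahead, so identical in both ears.
target = tone(200, N)

# Stand-in for a talker off to one side: reaches the right ear 8 samples
# later. The delay is deliberately chosen as half of the 1000 Hz period
# (16 samples), so this idealized interferer cancels completely.
interferer_left = tone(1000, N)
interferer_right = tone(1000, N, delay_samples=8)

left = mix(target, interferer_left)
right = mix(target, interferer_right)
enrolled = enroll(left, right)

print("power at left ear :", round(power(left), 3))
print("power after enroll:", round(power(enrolled), 3))
print("power of target   :", round(power(target), 3))
```

In this idealized case the averaged signal carries about the same power as the target alone, because the side talker's contributions at the two ears cancel. The sketch also hints at the limitation mentioned later in the article: a second loud voice from the same direction as the target would be identical in both ears too, so averaging could not separate the two.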

Once the AI learns the voice characteristics, it cancels all other sounds in the environment and plays just the enrolled speaker’s voice in real time, even as the listener moves around in noisy places and no longer faces the speaker.

“Since all this is happening in real-time, we effectively suppress all sounds except for say the chirping of the birds,” Gollakota said.

Unlike ChatGPT, TSH doesn’t need data centers

According to the researchers, when people talk about neural networks and artificial intelligence these days, they typically mean large language models like ChatGPT. Such models run in huge data centers. However, relying on data centers to make TSH work would make the technology impractical.

“So, we had to design special neural networks that can run on a smartphone and can extract the sound we care about in real-time. This is because the kind of sound intelligence one needs for this is likely something that even small insects have. So, what we are showing here is that we do not need a large neural model to achieve these tasks,” Gollakota told ZME Science.

The researchers demonstrated the system on a specific pair of commercial noise-canceling headphones, but it can work with most noise-canceling headphones. The technology can also be used in earbuds and hearing aids.

However, the AI also has some limitations. For instance, the TSH system can enroll only one speaker at a time, and it’s only able to enroll a speaker when there is no other loud voice coming from the same direction as the target speaker’s voice. 

The researchers are working to overcome these limitations and plan to make the AI system commercially available through a startup.

“We are working on getting this to a much smaller form factor, e.g., a wireless earbud or a hearing aid. That would be transformative since it can then be included in billions of earbuds that folks use today,” Gollakota said.

The study is published in the ACM Digital Library.
