Researchers have long theorized that the superior temporal sulcus (STS) is involved in processing speech rhythms, but only recently has this been confirmed by a team at Duke University. Their findings show that the STS is sensitive to the timing of speech, a crucial element of spoken language. This could further our understanding of how some speech-impairing conditions arise in the brain, or help tutors design next-generation, computer-assisted foreign language courses.
Timing and the brain
Human brains are particularly efficient at perceiving, producing, and processing fine rhythmic information in music and speech. However, music is processed differently from speech, suggesting distinct underlying mechanisms. For instance, any type of sound, whether rhythmic or not, triggers activity in the temporal lobe's auditory cortex, but only speech lights up the STS.
Any linguist can tell you that timing is everything in language: speaking means timing and stitching together ultra-short, short, and long sounds. Phonemes are the shortest, most basic units of speech, lasting on average 30 to 60 milliseconds. Syllables take longer, around 200 to 300 milliseconds, while most whole words are longer still. That is an immense amount of information to process, lest we forget that the brain must also distinguish speech from other sounds, such as environmental noise (birds chirping, water splashing) or music, which shares speech's rhythmic structure.
The Duke researchers took speech recorded in a foreign language and cut it into short chunks ranging from 30 to 960 milliseconds in length. A novel computer algorithm then re-assembled the chunks into new sounds that the authors call 'speech quilts'. The result is basically gibberish, but it still sounds like some language. The shorter the segments of a speech quilt, the greater the disruption to the original structure of the speech.
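To make the manipulation concrete, here is a minimal sketch in Python of how such a quilt could be generated from a digitized recording. It simply chops the waveform into equal-length segments and reshuffles them; the authors' actual algorithm is more careful (for example, about how segment boundaries are joined), so the function name, parameters, and placeholder signal below are illustrative assumptions, not the published method.

# Minimal "speech quilt" sketch: chop audio into fixed-length segments
# and stitch them back together in a random order. Illustrative only;
# the published algorithm also handles segment boundaries smoothly.
import numpy as np

def make_speech_quilt(audio, sample_rate, segment_ms, seed=0):
    seg_len = int(sample_rate * segment_ms / 1000)      # samples per segment
    n_segments = len(audio) // seg_len                   # drop any remainder
    segments = audio[:n_segments * seg_len].reshape(n_segments, seg_len)
    order = np.random.default_rng(seed).permutation(n_segments)
    return segments[order].ravel()                       # re-assembled signal

# Example: quilts built from 30 ms vs. 960 ms segments of a 16 kHz signal
recording = np.random.randn(16000 * 10)                  # stand-in for real speech
quilt_short = make_speech_quilt(recording, 16000, 30)    # heavily disrupted
quilt_long = make_speech_quilt(recording, 16000, 960)    # structure largely intact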
The sounds were then played to volunteers who had their brain activity monitored. The researchers hypothesized that the STS would respond more strongly to speech quilts made up of longer segments. Indeed, that is what happened: the STS became highly active during the 480- and 960-millisecond quilts compared with the 30-millisecond quilts.
To make sure they weren't actually seeing some other response, the authors also played other sounds intended to mimic speech, but with some key differences. One of the synthetic sounds they created shared the frequency of speech but lacked its rhythms. Another removed all the pitch from the speech. A third used environmental sounds. Again, each control sound was chopped and quilted before being played to the participants. The STS didn't seem responsive to the quilting manipulation when it was applied to these control sounds, as reported in Nature Neuroscience.
“We really went to great lengths to be certain that the effect we were seeing in STS was due to speech-specific processing and not due to some other explanation, for example, pitch in the sound or it being a natural sound as opposed to some computer-generated sound,” said co-author Tobias Overath, an assistant research professor of psychology and neuroscience at Duke.