Many people can focus on a single speaker even when the surrounding din threatens to drown out that person’s voice. The setting may be a classroom, a bar, or a sporting event; the ability is common, and psychologists have dubbed it the “cocktail party effect.”
A new research effort led by a University of California, San Francisco neurosurgeon and a postdoctoral fellow set out to discover how selective hearing works in the brain.
Edward Chang, M.D., and Nima Mesgarani, Ph.D., worked with three patients who were undergoing brain surgery for severe epilepsy.
Part of this surgery involves pinpointing the parts of the brain responsible for the patients’ disabling seizures. That mapping takes about a week, during which a thin sheet of up to 256 electrodes is placed under the skull on the brain’s outer surface, or cortex. The electrodes record activity in the temporal lobe, home to the auditory cortex.
Chang said the ability to safely make intracranial recordings provided a unique opportunity to advance fundamental knowledge of how the brain works.
“The combination of high-resolution brain recordings and powerful decoding algorithms opens a window into the subjective experience of the mind that we’ve never seen before,” Chang said.
In the experiments, patients listened to two speech samples played simultaneously, each a different phrase spoken by a different speaker. They were asked to identify the words spoken by one of the two speakers.
The authors then applied new decoding methods to “reconstruct” what the subjects heard by analyzing their brain-activity patterns.
Strikingly, the authors found that neural responses in the auditory cortex reflected only the speech of the targeted speaker. Their decoding algorithm could predict, from those neural patterns alone, which speaker and even which specific words the subject was attending to. It could also tell when the listener’s attention strayed to the other speaker.
“The algorithm worked so well that we could predict not only the correct responses, but also even when they paid attention to the wrong word,” Chang said.
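Conceptually, this kind of attention decoding can be framed as a linear stimulus-reconstruction step followed by a correlation test: map neural activity back to an estimated spectrogram, then ask which candidate speaker’s spectrogram it matches best. The toy Python sketch below illustrates the idea on synthetic data; the array sizes, the ridge regularizer, and the simulated “cortex” are illustrative assumptions, not the study’s actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all numbers hypothetical): spectrograms for two competing
# speakers, and 64 simulated electrodes whose activity tracks only the
# attended speaker (speaker 0), mirroring the paper's core finding.
T, F, E = 500, 8, 64                            # time bins, freq bands, electrodes
spec = rng.standard_normal((2, T, F))           # the two speakers' spectrograms
W_enc = rng.standard_normal((F, E)) * 0.5       # toy cortical encoding weights
neural = spec[0] @ W_enc + 0.1 * rng.standard_normal((T, E))

# Fit a linear stimulus-reconstruction decoder (ridge regression) that maps
# neural activity back to a spectrogram, trained on the first half of the data.
lam = 1.0
X, Y = neural[: T // 2], spec[0, : T // 2]
W_dec = np.linalg.solve(X.T @ X + lam * np.eye(E), X.T @ Y)

# Reconstruct the "heard" spectrogram from held-out neural data, then ask
# which speaker's actual spectrogram correlates with it better.
recon = neural[T // 2 :] @ W_dec

def corr(a, b):
    a, b = a.ravel() - a.mean(), b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

scores = [corr(recon, spec[k, T // 2 :]) for k in range(2)]
attended = int(np.argmax(scores))
print(attended)  # the decoder identifies speaker 0, whom the simulated cortex tracked
```

Because the simulated electrodes encode only speaker 0, the reconstruction correlates far more strongly with that speaker’s spectrogram, so the correlation test recovers the attended talker, which is the logic behind decoding where a listener’s attention is directed.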
The new findings show that the representation of speech in the cortex reflects not the entire external acoustic environment but only what we actually want or need to hear.
They represent a major advance in understanding how the human brain processes language, with immediate implications for the study of impairment during aging, attention deficit disorder, autism and language learning disorders.
In addition, Chang says that this technology may someday be used in neuroprosthetic devices that decode the intentions and thoughts of paralyzed patients who cannot communicate.
An understanding of how our brains are wired to favor some auditory cues over others may inspire new approaches to how voice-activated electronic interfaces filter sounds in order to reliably detect verbal commands.
The method by which the brain so effectively focuses on a single voice is an area of significant interest to companies that develop electronic devices with voice-activated interfaces.
While the voice recognition technologies that enable such interfaces as Apple’s Siri have come a long way in the last few years, they are nowhere near as sophisticated as the human speech system. For example, an average person can walk into a noisy room and have a private conversation with relative ease — as if all the other voices in the room were muted.
Speech recognition, said Mesgarani, an engineer with a background in automatic speech recognition research, is “something that humans are remarkably good at, but it turns out that machine emulation of this human ability is extremely difficult.”
The research article appears in the journal Nature.