In a recent study, a team of neuroengineers at Columbia University’s Zuckerman Mind Brain Behavior Institute developed a system intended to help people who have lost the ability to speak due to accidents, trauma, or disease. The system combines speech synthesizers and artificial intelligence to monitor brain signals and reconstruct clear, intelligible speech. Scientists believe that it could offer new ways for computers to communicate with the brain. Paralyzed patients or those recovering from a stroke could benefit from a neuroprosthetic device that recreates speech from neural activity, ultimately helping them regain their ability to communicate with others.
The research paper was published this week in Scientific Reports. According to Dr. Nima Mesgarani, the paper’s senior author and an investigator at Columbia University, “losing the power of one’s voice due to injury or disease is so devastating. In this study, we’ve shown that with the right technology, these people’s thoughts could be decoded and understood by any listener.”
Challenges faced during the research
Like any innovation, this one came with its own challenges. Initially, Dr. Mesgarani’s team focused on simple computer models that analyzed spectrograms (visual representations of sound frequencies). However, this approach failed to produce anything resembling intelligible speech. The team then turned to a vocoder, a computer algorithm that can synthesize speech after being trained on recordings of people talking.
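To make the idea of a spectrogram concrete, here is a minimal sketch in Python. It is not the Columbia team's code. The function name `spectrogram`, the frame size, the sample rate, and the 1000 Hz test tone are all assumptions chosen for illustration. The sketch slices a signal into short overlapping frames and measures how much energy each frame holds at each frequency.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per overlapping frame."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Taper the frame with a Hann window to reduce spectral leakage.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Naive discrete Fourier transform; keep only non-negative frequencies.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Hypothetical input: a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = spectrogram(signal)
bin_hz = sample_rate / 64  # width of one frequency bin for frame_size=64
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print("strongest frequency in first frame:", peak_bin * bin_hz, "Hz")
```

Plotting the returned rows as an image, with time on one axis and frequency on the other, gives the familiar spectrogram picture. Real speech fills many bins at once, which hints at why mapping brain activity to this representation is difficult.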
Hitting the milestone
To train the vocoder to interpret brain activity, Dr. Mesgarani teamed up with Dr. Ashesh Dinesh Mehta, a neurosurgeon at Northwell Health Physician Partners Neuroscience Institute who treats patients with epilepsy. Mesgarani’s team asked epilepsy patients who were undergoing brain surgery to listen to sentences spoken by different people while the team measured the patterns of the patients’ brain activity. These neural patterns were used to train the vocoder.
After several rounds of training and testing, it became evident that the system could reproduce the sounds the patients had listened to earlier with a high level of accuracy. When the recorded brain signals were run through the vocoder, it produced sounds that were in turn cleaned up by neural networks, a kind of AI that imitates the structure of neurons in the human brain.
“The sensitive vocoder and the powerful neural networks represented the sounds patients had originally listened to with surprising accuracy,” says Dr. Mesgarani. The research team plans to test more complicated words and sentences and to run the same tests on brain signals produced while a person speaks or imagines speaking. Their ultimate goal is to build the system into an implant, similar to those worn by some epilepsy patients, that translates the wearer’s thoughts directly into words.