AI algorithm used to unpack neuroscience of human language

Scientists have used a type of artificial intelligence, called a large language model, to uncover new insights into how the human brain understands and produces language. (Image credit: Yuichiro Chino/Getty Images)

Using artificial intelligence (AI), scientists have unraveled the intricate brain activity that unfolds during everyday conversations.

The tool could offer new insights into the neuroscience of language, and someday, it could help improve technologies designed to recognize speech or help people communicate, the researchers say.

Based on how an AI model transcribes audio into text, the researchers behind the study could map the brain activity that takes place during conversation more accurately than they could with traditional models, which encode specific features of language structure, such as phonemes (the simple sounds that make up words) and parts of speech (such as nouns, verbs and adjectives).

The model used in the study, called Whisper, instead takes audio files and their text transcripts as training data and uses them to learn a statistical mapping from the audio to the text. It then applies that mapping to predict the text of new audio files that it hasn't previously heard.

Related: Your native language may shape the wiring of your brain

As such, Whisper works purely through these statistics, with no features of language structure built into its original settings. Nonetheless, in the study, the scientists showed that those structures still emerged in the model once it was trained.
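To make this concrete, here is a minimal Python sketch showing how the open-source Whisper model can transcribe an audio clip and expose the internal embeddings it builds along the way. The file name is hypothetical, and this is only an illustration of the kind of learned representations involved, not the study's own pipeline.

```python
# Minimal sketch using the open-source openai-whisper package (pip install openai-whisper).
# "conversation_clip.wav" is a hypothetical file name.
import torch
import whisper

model = whisper.load_model("base")

# 1) Predict text from audio the model has never heard, using only its learned statistics.
result = model.transcribe("conversation_clip.wav")
print(result["text"])

# 2) Peek at the internal audio embeddings that emerge from training.
audio = whisper.load_audio("conversation_clip.wav")
audio = whisper.pad_or_trim(audio)                         # Whisper works on 30-second windows
mel = whisper.log_mel_spectrogram(audio).to(model.device)  # audio as a mel spectrogram
with torch.no_grad():
    embeddings = model.encoder(mel.unsqueeze(0))           # shape: (1, n_frames, n_features)
print(embeddings.shape)
```

Nothing in this setup names phonemes or parts of speech; any such structure has to emerge from the statistics of the training data.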

The study sheds light on how these types of AI models — called large language models (LLMs) — work. But the research team is more interested in the insight it provides into human language and cognition. Identifying similarities between how the model develops language processing abilities and how people develop these skills may be useful for engineering devices that help people communicate.

"It's really about how we think about cognition," said lead study author Ariel Goldstein, an assistant professor at the Hebrew University of Jerusalem. The study's results suggest that "we should think about cognition through the lens of this [statistical] type of model," Goldstein told Live Science.

Unpacking cognition

The study, published March 7 in the journal Nature Human Behaviour, featured four participants with epilepsy who were already undergoing surgery to have brain-monitoring electrodes implanted for clinical reasons.

With consent, the researchers recorded all of the patients' conversations throughout their hospital stays, which ranged from several days to a week. In total, they captured more than 100 hours of audio.

Each of the participants had 104 to 255 electrodes implanted to monitor their brain activity.

Most studies that use recordings of conversations take place in a lab under very controlled circumstances over about an hour, Goldstein said. Although this controlled environment can be useful for teasing out the roles of different variables, Goldstein and his collaborators wanted "to explore the brain activity and human behavior in real life."

Their study revealed how different parts of the brain engage during the tasks required to produce and comprehend speech.

Goldstein explained that there is ongoing debate as to whether distinct parts of the brain kick into gear during these tasks or if the whole organ responds more collectively. The former idea might suggest that one part of the brain processes the actual sounds that make up words while another interprets those words' meanings, and still another handles the movements needed to speak.

Under the alternative theory, these different regions of the brain instead work in concert, taking a "distributed" approach, Goldstein said.

The researchers found that certain brain regions did tend to correlate with some tasks.

For example, areas known to be involved in processing sound, such as the superior temporal gyrus, showed more activity when handling auditory information, and areas involved in higher-level thinking, such as the inferior frontal gyrus, were more active for understanding the meaning of language.

They could also see that the areas became active sequentially.

For example, the region most responsible for hearing the words was activated before the region most responsible for interpreting them. However, the researchers also clearly saw areas activate during activities they were not known to be specialized for.

"I think it's the most comprehensive and thorough, real-life evidence for this distributed approach," Goldstein said.

Related: New AI model converts your thought into full written speech by harnessing your brain's magnetic signals

Linking AI models to the inner workings of the brain

The researchers used 80% of the recorded audio and accompanying transcriptions to train Whisper so that it could then predict the transcriptions for the remaining 20% of the audio.

The team then looked at how Whisper internally represented the audio and transcriptions and mapped those representations onto the brain activity recorded by the electrodes.

After this analysis, they could use the model to predict what brain activity would accompany conversations that had not been included in the training data. The model's accuracy surpassed that of a model based on features of language structure.
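For readers who want a feel for what such an analysis looks like, below is a minimal sketch of a regularized linear "encoding model" of the kind described above, using scikit-learn and randomly generated stand-ins for the Whisper embeddings and electrode recordings. The array names, sizes and the specific regression choice are assumptions for illustration, not the study's exact pipeline.

```python
# Minimal encoding-model sketch: map model features to electrode activity,
# then test on held-out data. All data here are random stand-ins.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_features, n_electrodes = 1000, 512, 150      # hypothetical sizes
embeddings = rng.standard_normal((n_samples, n_features))        # Whisper features per time window
brain_activity = rng.standard_normal((n_samples, n_electrodes))  # electrode signals per window

# Hold out 20% of the data, mirroring the article's 80/20 split.
X_train, X_test, y_train, y_test = train_test_split(
    embeddings, brain_activity, test_size=0.2, random_state=0
)

# Fit one regularized linear map from model features to all electrodes at once.
encoder = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_train, y_train)

# Evaluate by correlating predicted and recorded activity on the held-out data.
pred = encoder.predict(X_test)
per_electrode_r = [np.corrcoef(pred[:, i], y_test[:, i])[0, 1] for i in range(n_electrodes)]
print(f"mean held-out correlation across electrodes: {np.mean(per_electrode_r):.3f}")
```

With random stand-in data the correlations hover near zero; the point of the sketch is the structure of the analysis, in which a model is judged by how well it predicts brain activity it has never seen.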

Although the researchers didn't program what a phoneme or word is into their model from the outset, they found those language structures were still reflected in how the model worked out its transcripts. So it had extracted those features without being directed to do so.
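One common way to check for this kind of emergent structure (not necessarily the study's own method) is a simple linear "probe": train a small classifier to recover, say, phoneme labels from the model's internal embeddings. The sketch below uses random stand-in data purely to show the shape of that analysis.

```python
# Minimal probing sketch: can phoneme identity be read out of frame embeddings?
# Features and labels are random stand-ins for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_frames, n_features, n_phonemes = 2000, 512, 40           # hypothetical sizes
frame_embeddings = rng.standard_normal((n_frames, n_features))
phoneme_labels = rng.integers(0, n_phonemes, size=n_frames)

probe = LogisticRegression(max_iter=1000)
scores = cross_val_score(probe, frame_embeddings, phoneme_labels, cv=5)
print(f"cross-validated probe accuracy: {scores.mean():.3f} "
      f"(chance is about {1 / n_phonemes:.3f})")
```

If a probe like this performs well above chance on real embeddings, that is evidence the model has picked up the feature on its own, even though it was never told the feature exists.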

The research is a "groundbreaking study because it demonstrates a link between the workings of a computational acoustic-to-speech-to-language model and brain function," Leonhard Schilbach, a research group leader at the Munich Centre for Neurosciences in Germany who was not involved in the work, told Live Science in an email.

However, he added that, "Much more research is needed to investigate whether this relationship really implies similarities in the mechanisms by which language models and the brain process language."

"Comparing the brain with artificial neural networks is an important line of work," said Gašper Beguš, an associate professor in the Department of Linguistics at the University of California, Berkeley who was not involved in the study.

"If we understand the inner workings of artificial and biological neurons and their similarities, we might be able to conduct experiments and simulations that would be impossible to conduct in our biological brain," he told Live Science by email.

Anna Demming
Live Science Contributor

Anna Demming is a freelance science journalist and editor. She has a PhD from King’s College London in physics, specifically nanophotonics and how light interacts with the very small. She began her editorial career working for Nature Publishing Group in Tokyo in 2006. She has since worked as an editor for Physics World and New Scientist. Publications she has contributed to on a freelance basis include The Guardian, New Scientist, Chemistry World, and Physics World, among others. She loves all science generally, but particularly materials science and physics, such as quantum physics and condensed matter.
