Machine Learning Can Help Predict Psychosis Via Language Analysis

A new machine-learning method can predict with 93 percent accuracy whether a person at risk for psychosis will go on to develop the disorder.

With the method, which they developed, scientists at Emory University and Harvard University found that higher-than-normal usage of words related to sound, combined with a higher rate of using words with similar meaning, signaled that psychosis was likely on the horizon.

Even trained clinicians had not noticed that people at risk for psychosis use more sound-related words than average, although abnormal auditory perception is an early warning sign.

“Trying to hear these subtleties in conversations with people is like trying to see microscopic germs with your eyes,” says Neguine Rezaii, first author of the paper. “The automated technique we’ve developed is a really sensitive tool to detect these hidden patterns. It’s like a microscope for warning signs of psychosis.”

The onset of schizophrenia and other psychotic disorders typically occurs in the early 20s, with early warning signs — known as prodromal syndrome — beginning around age 17. Around 25 to 30 percent of young people with prodromal syndrome will eventually develop schizophrenia or another psychotic disorder.

Currently, there is no cure for psychosis. Through structured interviews and cognitive tests, trained clinicians can predict psychosis with about 80 percent accuracy in those with a prodromal syndrome.

Now, research with machine learning, a form of artificial intelligence that can uncover hidden patterns, is one of many ongoing efforts to streamline diagnostic methods, identify new variables, and improve the accuracy of predictions.

“It was previously known that subtle features of future psychosis are present in people’s language, but we’ve used machine learning to actually uncover hidden details about those features,” says senior author Phillip Wolff, a professor of psychology at Emory. Wolff’s lab focuses on language semantics and machine learning to predict decision-making and mental health.

For the study, the researchers first used machine learning to establish “norms” for conversational language. They fed a computer software program the online conversations of 30,000 users of Reddit, a social media platform where people have informal discussions about a range of topics.

The software program, known as Word2Vec, uses an algorithm to change individual words to vectors (a mathematical term referring to the position of one point in space relative to another). In other words, the program assigned each word to a location in a semantic space based on its meaning. Words with similar meanings were positioned closer together than those with very different meanings.
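The idea that words with similar meanings sit closer together can be illustrated with a minimal sketch. The three-dimensional vectors below are hand-picked for illustration and are not Word2Vec output; the word choices and the `cosine_similarity` helper are assumptions made for this example. Cosine similarity is one common way to measure closeness between word vectors.

```python
import math

# Hypothetical 3-dimensional vectors chosen by hand for illustration.
# Real Word2Vec embeddings are learned from text and have many more dimensions.
TOY_VECTORS = {
    "whisper": (0.9, 0.1, 0.0),
    "murmur": (0.8, 0.2, 0.1),
    "table": (0.0, 0.1, 0.9),
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Compare a pair of related words with a pair of unrelated words.
similar = cosine_similarity(TOY_VECTORS["whisper"], TOY_VECTORS["murmur"])
different = cosine_similarity(TOY_VECTORS["whisper"], TOY_VECTORS["table"])
print(f"whisper vs murmur: {similar:.2f}")
print(f"whisper vs table:  {different:.2f}")
```

In this toy space the two sound-related words score much higher than the unrelated pair, which is the property the article describes.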

The Wolff lab also developed a computer program to perform “vector unpacking,” or analysis of the semantic density of word usage. Vector unpacking allowed the researchers to quantify how much information was packed into each sentence.

After generating a baseline of “normal” data, the researchers applied the same techniques to diagnostic interviews of 40 young people at high risk for psychosis. The automated analyses of the participant samples were then compared to the normal baseline sample.
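Comparing a sample against a baseline can be sketched roughly as follows. This is not the researchers' pipeline: the `SOUND_WORDS` list, the example sentences, and the use of a z-score are all assumptions made for illustration, and the baseline here is four made-up sentences rather than a large corpus.

```python
from statistics import mean, stdev

# Hypothetical list of sound-related words; the study's actual word set is not given here.
SOUND_WORDS = {"sound", "voice", "whisper", "loud", "noise", "hear", "heard"}

def sound_word_rate(text):
    """Fraction of whitespace-separated tokens that appear in SOUND_WORDS."""
    tokens = [t.strip(".,!?;:\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(t in SOUND_WORDS for t in tokens) / len(tokens)

def z_score(value, baseline_values):
    """Number of baseline standard deviations that value lies above the baseline mean."""
    return (value - mean(baseline_values)) / stdev(baseline_values)

# Made-up baseline texts standing in for the large "normal" sample.
baseline_texts = [
    "I went to the store and bought some bread.",
    "The game last night was loud but fun.",
    "We could hear the rain while we cooked dinner.",
    "My cat sleeps on the couch all day.",
]
baseline_rates = [sound_word_rate(t) for t in baseline_texts]

# Made-up sample text that is deliberately dense in sound-related words.
sample = "I heard a voice, then a whisper, then a loud noise."
sample_rate = sound_word_rate(sample)
sample_z = z_score(sample_rate, baseline_rates)
print(f"sample rate: {sample_rate:.2f}, z-score vs baseline: {sample_z:.2f}")
```

A positive z-score means the sample uses sound-related words more often than the baseline does; how large a deviation would matter clinically is not something this sketch can say.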

The results showed that higher-than-normal usage of sound-related words, along with a higher rate of using words with similar meaning, meant that psychosis was likely to occur.

Strengths of the study include the simplicity of using just two variables, both of which have a strong theoretical foundation; the replication of the results in a holdout dataset; and the high accuracy of its predictions, above 90 percent.
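Checking a two-variable rule against a holdout dataset can be sketched as below. Everything here is assumed for illustration: the records are made up and are not study data, the midpoint-threshold rule is a stand-in rather than the researchers' model, and the field order (sound-word rate, similar-meaning rate, outcome) is hypothetical.

```python
from statistics import mean

# Made-up (sound_word_rate, similar_meaning_rate, converted) records.
# These numbers are illustrative only and are not data from the study.
train = [
    (0.02, 0.30, False), (0.03, 0.35, False), (0.01, 0.28, False),
    (0.08, 0.60, True), (0.09, 0.55, True),
]
holdout = [
    (0.02, 0.33, False), (0.07, 0.58, True), (0.03, 0.31, False),
    (0.10, 0.62, True), (0.06, 0.40, True),
]

def fit_thresholds(records):
    """Place each threshold midway between the two class means of the training data."""
    thresholds = []
    for i in range(2):
        pos = mean(r[i] for r in records if r[2])
        neg = mean(r[i] for r in records if not r[2])
        thresholds.append((pos + neg) / 2)
    return tuple(thresholds)

def predict(record, thresholds):
    """Predict conversion only when both variables exceed their thresholds."""
    return record[0] > thresholds[0] and record[1] > thresholds[1]

def accuracy(records, thresholds):
    """Share of records whose predicted outcome matches the recorded outcome."""
    correct = sum(predict(r, thresholds) == r[2] for r in records)
    return correct / len(records)

# Thresholds come from the training split only; the holdout split is used just for scoring.
thresholds = fit_thresholds(train)
holdout_accuracy = accuracy(holdout, thresholds)
print(f"holdout accuracy on toy data: {holdout_accuracy:.2f}")
```

The point of the holdout split is that the thresholds never see those records, so the score reflects how the rule behaves on new cases rather than on the data used to set it.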

“In the clinical realm, we often lack precision,” Rezaii says. “We need more quantified, objective ways to measure subtle variables, such as those hidden within language usage.”

Rezaii and Wolff are now gathering larger data sets and testing the application of their methods on a variety of neuropsychiatric diseases, including dementia.

“This research is interesting not just for its potential to reveal more about mental illness, but for understanding how the mind works — how it puts ideas together,” Wolff says. “Machine learning technology is advancing so rapidly that it’s giving us tools to data mine the human mind.”

Co-author Elaine Walker, Emory professor of psychology and neuroscience, says, “If we can identify individuals who are at risk earlier and use preventive interventions, we might be able to reverse the deficits.”

The findings are published in the journal npj Schizophrenia.

Source: Emory Health Sciences

Traci Pedersen

Traci Pedersen is a professional writer with over a decade of experience. Her work consists of writing for both print and online publishers in a variety of genres including science chapter books, college and career articles, and elementary school curriculum.

APA Reference
Pedersen, T. (2019). Machine Learning Can Help Predict Psychosis Via Language Analysis. Psych Central. Retrieved on November 26, 2020, from
Scientifically Reviewed
Last updated: 28 Jun 2019 (Originally: 28 Jun 2019)
Last reviewed: By a member of our scientific advisory board on 28 Jun 2019
Published on Psych All rights reserved.