How We Figure Out Which Tweets Are Credible

Researchers at the Georgia Institute of Technology have developed a new language model showing which words and phrases positively or negatively influence the credibility of world events reported on Twitter.

The study, which scanned 66 million tweets regarding nearly 1,400 real-world events, suggests that the words of millions of people on social media may offer considerable information about an event’s credibility, even when an event is still in progress.

“There have been many studies about social media credibility in recent years, but very little is known about what types of words or phrases create credibility perceptions during rapidly unfolding events,” said Tanushree Mitra, the Georgia Tech Ph.D. candidate who led the research.

The team studied tweets regarding world events in 2014 and 2015, including the emergence of Ebola in West Africa, the Charlie Hebdo attack in Paris and the death of Eric Garner in New York City.

The researchers asked people to judge the posts on their credibility (from “certainly accurate” to “certainly inaccurate”). Then the team fed the words into a model that divided them into 15 different linguistic categories. The classifications included positive and negative emotions, hedges and boosters, and anxiety.
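To make the categorization step concrete, here is a minimal sketch of how words in a tweet could be tallied by linguistic category. It is not the researchers' model. The tiny lexicon is hypothetical and holds only a few example words quoted in this article, the category label "mocking" is an illustrative name, and the function name `category_counts` is invented for this example.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: a handful of example words from the article,
# grouped under illustrative category names. The study used 15 categories;
# this sketch shows only four and is not the researchers' word list.
LEXICON = {
    "booster": {"undeniable"},
    "positive_emotion": {"eager", "terrific"},
    "hedge": {"suspects"},
    "mocking": {"ha", "grins", "joking"},
}


def category_counts(tweet: str) -> Counter:
    """Count how many words in a tweet fall into each lexicon category."""
    # Lowercase the text and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", tweet.lower())
    counts = Counter()
    for token in tokens:
        for category, words in LEXICON.items():
            if token in words:
                counts[category] += 1
    return counts


example = category_counts("Undeniable proof, and police suspects are named. Ha!")
print(dict(example))
```

Counts like these could then serve as input features for a classifier. Multi-word phrases such as "certain level" would need phrase matching, which this single-token sketch leaves out.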

The model then examined the words to judge whether the tweets were credible. Its judgments matched the human ratings about 68 percent of the time, significantly higher than the random baseline of 25 percent.
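The comparison between agreement and chance can be illustrated with a short sketch. It assumes the 25 percent baseline reflects uniform random guessing among four credibility classes, which the article does not state outright. The class labels "A" through "D" and the toy ratings are made up for illustration and are not data from the study.

```python
def agreement_rate(predicted, human):
    """Fraction of items where the model's label equals the human label."""
    if len(predicted) != len(human) or not human:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == h for p, h in zip(predicted, human))
    return matches / len(human)


def chance_baseline(num_classes):
    """Expected agreement when guessing uniformly among num_classes labels."""
    return 1 / num_classes


# Toy labels only (hypothetical): four credibility classes named A to D.
human_labels = ["A", "B", "A", "C", "D", "B", "A", "C"]
model_labels = ["A", "B", "C", "C", "D", "A", "A", "C"]

rate = agreement_rate(model_labels, human_labels)
baseline = chance_baseline(4)
print(f"agreement: {rate:.2f}, chance baseline: {baseline:.2f}")
```

In this toy data the model agrees with the human label on six of eight items. The study's reported figures play the same roles: about 68 percent agreement against a 25 percent baseline.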

“Tweets with booster words, such as ‘undeniable,’ and positive emotion terms, such as ‘eager’ and ‘terrific,’ were viewed as highly credible,” said Mitra. “Words indicating positive sentiment but mocking the impracticality of the event, such as ‘ha,’ ‘grins’ or ‘joking,’ were seen as less credible. So were hedge words, including ‘certain level’ and ‘suspects.’”

Higher numbers of retweets were associated with lower credibility scores, while longer replies and retweets were perceived as more credible.

“It could be that longer message lengths provide more information or reasoning, so they’re viewed as more trustworthy,” she said. “On the other hand, a higher number of retweets, which was scored lower on credibility, might represent an attempt to elicit collective reasoning during times of crisis or uncertainty.”

Although the model isn’t deployable yet, the researchers say they may eventually develop an app that can calculate the perceived trustworthiness of an event as it unfolds on social media.

“When combined with other signals, such as event topics or structural information, our linguistic result could be an important building block of an automated system,” said Dr. Eric Gilbert, Mitra’s advisor and an assistant professor in Georgia Tech’s School of Interactive Computing.

“Twitter is part of the problem with spreading untruthful news online. But it can also be part of the solution.”

Source: Georgia Institute of Technology