Just how uncivil is Election 2016? MIT’s Media Lab has some charts you should see.

April 28, 2016

By Timothy McGrath

A supporter of Republican presidential candidate Donald Trump yells at a demonstrator after Trump canceled his rally at the University of Illinois at Chicago on March 11, 2016.
Kamil Krzaczynski/Reuters

We’ve heard a lot of … let’s say, colorful rhetoric this election cycle from US presidential candidates and their supporters.

Now, a team of data scientists at MIT say they can actually quantify just how toxic our political and public discourse has become — at least on Twitter.

Welcome to Tonar, an index of incivility brought to you by the MIT Media Lab and more than 300 million Twitter users.

Simply put, Tonar is a new data analysis tool developed by researchers at the Laboratory for Social Machines (LSM). It sorts through Twitter’s fire hose of tweets (roughly 500 million per day), figures out which ones are about the US presidential election (around 250,000 per day right now), and then takes that smaller election sample and identifies all the tweets that include profanity, insults, pejorative racial or sexual language, or threats of violence.
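For readers who want a concrete picture of that pipeline, here is a minimal sketch in Python. It is not the LSM's code: the article does not describe Tonar's actual model or word lists, so the keyword sets, function names, and sample tweets below are all hypothetical stand-ins. The sketch filters tweets down to election-related ones, flags those containing a listed insult, and reports the uncivil share per day.

```python
from collections import defaultdict

# Hypothetical keyword lists for illustration only; Tonar's real
# classifier and vocabulary are not published in this article.
ELECTION_TERMS = {"trump", "clinton", "sanders", "cruz", "kasich", "election", "primary"}
UNCIVIL_TERMS = {"idiot", "moron", "scum"}


def tokens(text):
    """Lowercase the words in a tweet and strip surrounding punctuation."""
    return {word.strip(".,!?#@\"'").lower() for word in text.split()}


def is_election_related(text):
    return bool(tokens(text) & ELECTION_TERMS)


def is_uncivil(text):
    return bool(tokens(text) & UNCIVIL_TERMS)


def uncivil_share_by_day(tweets):
    """Given (day, text) pairs, return {day: percent of election tweets flagged uncivil}."""
    totals = defaultdict(int)
    uncivil = defaultdict(int)
    for day, text in tweets:
        if not is_election_related(text):
            continue  # step 1: keep only election-related tweets
        totals[day] += 1
        if is_uncivil(text):
            uncivil[day] += 1  # step 2: count the uncivil ones
    # step 3: express the uncivil count as a share of that day's election tweets
    return {day: 100.0 * uncivil[day] / totals[day] for day in totals}


# Made-up sample tweets, not real data.
sample = [
    ("2016-03-12", "Trump rally canceled in Chicago"),
    ("2016-03-12", "Anyone voting for Cruz is an idiot!"),
    ("2016-03-12", "Nice weather today"),
    ("2016-03-13", "Sanders and Clinton debate tonight"),
]
shares = uncivil_share_by_day(sample)
print(shares)
```

A real system would presumably need far more than word matching to catch insults, slurs, and threats in context, but the three steps (filter, flag, compute a share) mirror the process described above.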

You might not be shocked to learn that Election 2016 is producing uncivil discourse. This cohort of presidential hopefuls seems to have brought new meaning to the term “negative campaigning.” Our Facebook feeds are filled with friends accusing one another of racism, sexism and more when it comes to their political choices. And we know that the wider internet is filled with trolls who reserve some of their worst vitriol for the campaign season.

As Andrew Heyward, the former president of CBS News and now a researcher at the Media Lab (among other things), said in a Medium post about Tonar:

“You don’t have to be an MIT scientist to know that the Presidential campaign has often been lewd, crude, and rude. But it helps.”

Tonar helps in that it can actually measure all this uncivil political discourse, sort it into categories and monitor how it changes over time.

Take this graph, for example.

Share of election-related tweets (Y axis) that Tonar identified as uncivil. Analytics and visualization by Soroush Vosoughi and Prashanth Vijayaraghavan, researchers at the Laboratory for Social Machines. Credit: Courtesy of the Laboratory for Social Machines

It shows the percentage of election-related tweets that the Tonar algorithm identified as uncivil between March 2015 and mid-April 2016. As you can see, things got cruder when voting and caucusing started in February. In the months before that, the ugly share of election tweets spiked to between 5 and 11 percent; by March and April, those spikes were between 10 and 20 percent.

The LSM also uses Tonar to compare conversations related to particular candidates.

Here’s the percentage of tweets related to the three remaining Republican candidates that included profanity:

Share of Twitter conversations (Y axis) about the current Republican candidates that involve profanity since Nov. 2015. Analytics and visualization by Soroush Vosoughi and Prashanth Vijayaraghavan, researchers at the Laboratory for Social Machines. Credit: Courtesy of the Laboratory for Social Machines

And the Democrats:

Share of Twitter conversations (Y axis) about the current Democratic candidates that involve profanity since Nov. 2015. Analytics and visualization by Soroush Vosoughi and Prashanth Vijayaraghavan, researchers at the Laboratory for Social Machines. Credit: Courtesy of the Laboratory for Social Machines

Tweets that refer to GOP frontrunner Donald Trump tend to be more profane than tweets about Ted Cruz and John Kasich, although Cruz and Kasich have plenty of spikes. On the Democratic side, there’s not much to separate Hillary Clinton and Bernie Sanders.

You’ll also notice some significant spikes in profanity. The LSM tried to correlate those spikes with events that might have inspired them.

In his Medium post, Heyward notes that a huge Trump spike happened on March 12, after violence among protesters and supporters in Chicago led the Trump campaign to cancel a rally there. Clinton’s spike on Feb. 19 could be connected to a tense exchange about immigration during a Las Vegas town hall with Sanders. And Sanders’ spike on Feb. 29 and March 1 seems related to Super Tuesday.

The LSM isn’t limiting itself to tracking profanity. Tonar is just one application of its big data analysis engine, Electome, which the LSM built to identify election issues that matter to the public — beyond just a focus on leading candidates.

“We think it’s time to listen more closely to all those citizen voices, to better understand what they’re saying about the big questions of our time, and to see if the candidates and the journalists are responding to those concerns,” William Powers, a journalist and researcher at LSM, wrote on the John S. and James L. Knight Foundation’s blog. “We plan to explore how three separate forces — the campaign journalism, the messaging of the candidates, and the public’s response in the digital sphere — converge to shape the presidential election’s most important narratives as well as its outcome.”

There are some limits to what Tonar and Electome can tell us. Only 20 percent of Americans use Twitter, so Electome is capturing a big slice of the public conversation, but certainly not all of it, and not from a representative sample. But finding differences between what matters to Twitter users and what matters to Americans as a whole can itself tell an interesting story, as the Washington Post has demonstrated.

The election season is long, and MIT scientists are smart, so expect more from Tonar and Electome in the coming months.

“As the campaign swings into the final stages of primary season,” Heyward writes, “we and our media partners will look not just at the Incivility Index for conversations about specific issues and candidates, but also for each candidate’s supporters.”

“Tonar will tell us whether the election conversation gets more ‘presidential’ — or even uglier.”

Correction: A previous version of this story misspelled the last name of Andrew Heyward.