Research & Insights

By Renette Youssef, December 19, 2014

Why Media Bias Has Nowhere to Run and Hide from Data Science

 

When you want to see the face of biased reporting in online news, you may not have to look further than the satirical news site The Onion. Titles such as “Media Reports of Bear Attacks May Be Biased”, “Weather Channel Accused of Pro-Weather Bias”, and “Media Criticized for Hometown Sports Reporting” can make us laugh, but they can also get us thinking about what real biased reporting may look like.

In writing news, what form does bias take? In the narrowest terms, and as you can see from the experiment described later in this article, you can find bias in word choice. Does the reporter use positive or negative terms, or highly emotional words? These are all clues: they hint not only at how the author of an article may feel about the topic, but also at the conscious or unconscious pressure their organization puts on them, not to mention the societal norms the author lives by.

What kind of articles attract biased reporting? Even when handled by professional journalists (who try to take precautions against it), articles covering controversial topics are the most at risk of being biased. What is a controversial topic? They are usually topics involving social, political, or ideological issues, and, here is the big clue: they provoke arguments!

(We have written about media bias before here at CrowdFlower. A few months ago, we used our own ‘people-powered data-enrichment platform’ to analyze 250 articles from Google News to see how a controversial topic was being covered by the press. You can read it here: Did the Media Blame Israel or Hamas for the Collapsed Humanitarian Ceasefire in Gaza.)

Here Comes the Experiment and the Data:

When talking about bias, why look only at the small picture of word choice? The answer is simple: these kinds of traits are quantifiable, and that makes them the ripest material for data analysis.
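To make that concrete, here is a minimal sketch (ours, not the study’s code) of how word choice can be quantified: count how often words from small positive, negative, and highly emotional lexicons appear in an article. The three word lists below are illustrative placeholders, not the lexicons the researchers used.

```python
import re
from collections import Counter

# Illustrative placeholder lexicons, not the lexicons used in the study.
POSITIVE = {"improve", "success", "benefit", "support", "progress"}
NEGATIVE = {"crisis", "failure", "attack", "threat", "blame"}
EMOTIONAL = {"outrage", "shocking", "disaster", "horrific", "furious"}

def word_choice_profile(text):
    """Return the rate of positive, negative, and emotional words in text."""
    words = Counter(re.findall(r"[a-z']+", text.lower()))
    total = max(sum(words.values()), 1)  # guard against empty text
    def rate(lexicon):
        return sum(n for w, n in words.items() if w in lexicon) / total
    return {
        "positive_rate": rate(POSITIVE),
        "negative_rate": rate(NEGATIVE),
        "emotional_rate": rate(EMOTIONAL),
    }

print(word_choice_profile(
    "Critics blame the policy for the crisis, calling the outcome shocking."
))
```

Simple rates like these are exactly the kind of signal that can be computed over millions of articles and then compared across topics and outlets.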

Researchers Yelena Mejova and Carlos Castillo from the Qatar Computing Research Institute, Nicholas Diakopoulos from the University of Maryland, and Amy X. Zhang from MIT CSAIL sought a “data-driven approach to understand how controversy interplays with emotional expression and biased language in the news,” and they may have found it. They published their results in a paper titled ‘Controversy and Sentiment in Online News’.

They focused on 15 major U.S. news outlets:

  • CNN

  • Reuters

  • USA Today

  • Los Angeles Times

  • Washington Post

  • Chicago Tribune

  • Newsday

  • Minneapolis Star Tribune

  • Houston Chronicle

  • Philadelphia Inquirer

  • Honolulu Star-Advertiser

  • Huffington Post

  • New York Times

  • ProPublica

  • Talking Points Memo

In the study, the data science and research team compared 21 million articles (sourced from NewsCred) of both controversial and non-controversial nature. The articles were recent, collected over a span of seven months. Beyond the clever algorithms, 25 human annotators, working through the CrowdFlower data-enrichment platform, helped identify controversial topics and words.
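As a rough illustration of how crowdsourced judgments like these are often combined, here is a sketch of simple majority voting, one common aggregation approach (the paper’s exact procedure may differ, and the judgments below are hypothetical).

```python
from collections import Counter

def majority_label(judgments):
    """Return the label most annotators chose for a topic."""
    return Counter(judgments).most_common(1)[0][0]

# Hypothetical annotator judgments, for illustration only.
annotations = {
    "gun control": ["controversial"] * 4 + ["not controversial"],
    "local weather": ["not controversial"] * 5,
}

for topic, judgments in annotations.items():
    print(topic, "->", majority_label(judgments))
```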

Here’s what they found:

  1. Controversial articles had fewer positive words, but more negative words.

  2. Highly emotional terms were less likely to appear in articles on controversial topics (a possible sign that authors were trying not to sound biased by avoiding certain words).

For a closer look at the words used to identify controversial articles, see the list below:

[Figure: list of words used to identify controversial articles (controversy1.png)]

You can also download the list of controversial words here.
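If you wanted to put a word list like this to work yourself, one simple (hypothetical) approach is to flag an article as likely controversial when the share of its words found in the list crosses a threshold. Both the words and the threshold below are stand-ins, not values from the study.

```python
import re

# Illustrative stand-ins: a tiny controversial-word list and an arbitrary
# threshold, not the actual list or any cutoff from the study.
CONTROVERSIAL_WORDS = {"abortion", "immigration", "gun", "vaccine", "ceasefire"}
THRESHOLD = 0.01  # flag if at least 1% of words are on the list

def looks_controversial(text):
    """Flag text when enough of its words appear in the controversial list."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return False
    hits = sum(1 for w in words if w in CONTROVERSIAL_WORDS)
    return hits / len(words) >= THRESHOLD

print(looks_controversial("The new immigration bill sparked a heated debate."))
```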

How did some of the publications fare in this test? Unfortunately, because the study was not done for entertainment value, this is hard to decipher from the data. However, if you’re curious, you can see below a comparison of Huffington Post, CNN, and Reuters.

[Figure: comparison of Huffington Post, CNN, and Reuters (controversy2.png)]

As you can see, data is giving us a new peek into the world of online news. In the future, this way of analyzing articles could become the norm: news organizations could use it to build automatic style-guide checkers, journalists could use it to stay more objective in their writing, and readers too could benefit from software like this to steer clear of biased reports. In short, this study may have found a way to encourage the world of reporting to bite its lip and stick to the facts!

Do you see other ways this research could affect the world of yellow journalism? Let us know in the comments.