By Tyler Schnoebelen, June 14, 2016

How does the media cover AI?

If you could read everything on Google News that mentioned artificial intelligence, what would you find?

You’d see a lot of stuff about humans and Google.

Instead of reading everything article-by-article, let’s do what a computer would and zoom out to find the major patterns. This job is made a lot easier since Mark Davies and BYU made the NOW corpus available last month. Scripts run nightly, so they get about 10,000 news articles each day and you can find results back to 2010. For the purpose of this post, I’m going to focus on the last year and a half.

Reading thousands of articles at a distance

From January 1, 2015 to June 1, 2016, there were 2,920 mentions of “artificial intelligence” across about 1,800 articles published in about 650 different publications. In case you’re worried, during that same time period, the word “human” occurred about 39 times as often (112,458 total occurrences).

If you did a word cloud of the contexts immediately surrounding “artificial intelligence”, it would look something like this:

[Figure: AI word cloud]

But word clouds are terrible. They lump everything together and all you have is size to indicate frequency. If you see two big words, you don’t really know whether they always appear together or never do.

Topic modeling, on the other hand, gives you structure: it attempts to group words that tend to co-occur.
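To make the word cloud problem concrete, here's a minimal sketch with two toy sets of contexts (made up for illustration, not taken from the NOW corpus). Both sets have identical word frequencies, so they would produce the same word cloud, but the words only appear together in one of them. Counting pairs is not topic modeling; it's just the simplest way I can think of to show the co-occurrence information that frequency alone throws away. The function names are my own.

```python
from collections import Counter
from itertools import combinations

def word_counts(contexts):
    """Count every word across all contexts (what a word cloud sees)."""
    return Counter(word for context in contexts for word in context.split())

def pair_counts(contexts):
    """Count how many contexts contain each unordered pair of distinct words."""
    pairs = Counter()
    for context in contexts:
        unique_words = sorted(set(context.split()))
        for a, b in combinations(unique_words, 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical toy data: same words, same frequencies, different structure.
together = ["google wins go", "google wins go"]
apart = ["google wins", "wins go", "google", "go"]

print(word_counts(together) == word_counts(apart))  # True
print(pair_counts(together)[("go", "google")])      # 2
print(pair_counts(apart)[("go", "google")])         # 0
```

A topic model works with much richer statistics than this, but the underlying signal is the same: which words show up in the same contexts.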

If we put these articles into 10 clusters, we see a number of themes emerge. These are also basically what you get when you topic model with 12 or 15 clusters.


These are the major themes in press mentions of artificial intelligence for the last year and a half:

Is Google winning AI?

This year, an AI system bested a human expert at the complex game of Go. Go is considered even more complex than chess, and Google’s AlphaGo team captured an enormous amount of press from the win.

One way to understand how a term is getting used is to look at its collocates: which other words and phrases keep recurring around a target term, compared with everything else in the corpus? We’ve been looking at a 25-word window but let’s narrow the field a bit and see what appears within a four-word span of “artificial intelligence”. The NOW corpus tool reports this in co-occurrence counts as well as through mutual information (MI), a measure of how much more often two words appear near each other than you’d expect by chance; higher scores mean a tighter association. For reference, the MI in this corpus between salt and pepper is 7.34 and the MI between computer and laptop is 3.21.
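If you want a feel for what a mutual information score is doing, here's a minimal sketch of pointwise mutual information computed from raw counts. It compares how often two words actually co-occur with how often you'd expect them to if they were scattered independently. The function name, the span adjustment, and the toy counts are my assumptions for illustration; the exact formula behind the NOW corpus tool may differ, and none of these numbers come from the corpus.

```python
import math

def mutual_information(pair_count, node_count, collocate_count, corpus_size, span=1):
    """Pointwise mutual information (log base 2) from raw counts.

    pair_count:      times the collocate appears within the window of the node word
    node_count:      total occurrences of the node word
    collocate_count: total occurrences of the collocate
    corpus_size:     total number of words in the corpus
    span:            window size in words (assumed adjustment: a wider window
                     gives chance co-occurrence more opportunities)
    """
    counts = (pair_count, node_count, collocate_count, corpus_size, span)
    if min(counts) <= 0:
        raise ValueError("all counts must be positive")
    # Co-occurrences expected if the two words were independent of each other.
    expected = node_count * collocate_count * span / corpus_size
    return math.log2(pair_count / expected)

# Hypothetical counts, chosen so the arithmetic comes out round.
print(mutual_information(4, 128, 256, 2**20))          # 7.0
print(mutual_information(4, 128, 256, 2**20, span=8))  # 4.0
print(mutual_information(1, 1024, 1024, 2**20))        # 0.0
```

A score of 0 means the pair shows up about as often as chance predicts, and each additional point means a doubling relative to chance. Note that rare pairs can get high scores from just a handful of co-occurrences, which is why it helps to read MI alongside the raw counts.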



Not surprisingly, a lot of journalists use the full term “artificial intelligence” to define AI so they can go ahead and use that nice, shorter term in the rest of the article. Comparing news about AI with news about everything else, there’s much more talk about Google, robotics, machine learning, automata, and prophetic. You also see prominent themes within AI like virtual (mostly reality but some assistants), robots, and automation.

Google dominates the immediate context of artificial intelligence. It appears within four words of the term 150 times. The next closest organization is Microsoft with 26 mentions in that window. MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) occurs in the tiny window 16 times; IBM appears 13 times.

As people following the field know, Google really is prominent in artificial intelligence work. However, its dominance isn’t actually as strong as it appears in this data.

The main thing working in Google’s favor is that it’s had a great 2016 with DeepMind/AlphaGo winning against Lee Sedol. If we go back to just 2015, Google doesn’t appear as an important collocate with artificial intelligence.

This is despite the fact that Google has had plenty of other news around AI, like releasing Parsey McParseface. But other topics haven’t gotten nearly the same press traction or the same tight association with artificial intelligence. This is not really a criticism, just an explanation that their great press can be traced back to a single (important) effort.

This probably doesn’t (and shouldn’t) worry Google. Discussions around artificial intelligence tend to be fairly high-level or buzzwordy. One of the major themes that doesn’t get its own cluster is the competition between tech giants (and others) to win developers over to their tools for AI.

To the degree that these AI platforms are vying for technical eyeballs, it’s probably more important to win on other keywords like machine learning and specific techniques. If we look at mutual information for machine intelligence, Google is much lower than it is for artificial intelligence—29 mentions, MI of 3.04. Microsoft is still in second place, with 16 mentions and 3.77 MI. Happily for Microsoft, these mentions tend to be around Azure Machine Learning and other products and not about Tay.

The Humans

About 16% of the documents mention human/humans/humanity/person/people within 25 words of artificial intelligence. There’s a fair amount of diversity in this cluster and in many ways it is really just a superset of all the other issues. But broadly, one of the preoccupations of journalists, companies, and researchers is how AI systems perform relative to humans, which is of course the story of the Turing test, self-driving cars, and AlphaGo.
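Here's a minimal sketch of how a share like that could be computed: for each document, check whether any of the human-related words falls within 25 words on either side of the target phrase. The tokenizer, the function names, and the toy documents are my assumptions for illustration, not the actual pipeline behind the figure above.

```python
import re

HUMAN_TERMS = {"human", "humans", "humanity", "person", "people"}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def has_term_near(text, phrase, terms, window=25):
    """True if any term occurs within `window` words of the phrase."""
    tokens = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            left = tokens[max(0, i - window):i]
            right = tokens[i + n:i + n + window]
            if terms.intersection(left) or terms.intersection(right):
                return True
    return False

def share_with_term_near(docs, phrase, terms, window=25):
    """Fraction of documents with at least one term near the phrase."""
    if not docs:
        return 0.0
    hits = sum(has_term_near(doc, phrase, terms, window) for doc in docs)
    return hits / len(docs)

# Hypothetical toy documents.
docs = [
    "Artificial intelligence may help people drive more safely.",
    "Artificial intelligence beat a champion at Go.",
    "People like tea. " + "filler " * 30 + "Then came artificial intelligence.",
]
print(share_with_term_near(docs, "artificial intelligence", HUMAN_TERMS))
```

Only the first toy document counts: the second never mentions a human-related word, and in the third “people” sits more than 25 words away from the phrase.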

AI is often said to approximate human cognitive structures. Neural nets and deep learning are commonly described in terms of modeling the human brain. This is part of what’s behind IBM’s brand focus on cognitive computing, which they tend to use in favor of artificial intelligence.

Whether or not you want to claim that AI systems are anything like human brains, it is clearly the case that you want to affect humans. And in general AI efforts are meant to help people (e.g., Facebook captions for the blind or Google working on daily tasks). This is part and parcel of the big current focus on chatbots/virtual assistants, which also get their own cluster.

AI’s ability to reduce human error is an inherent part of the coverage of self-driving cars and of what happens to humans in jobs where AI performs as well or better.

And while there are those who look forward to AI replacing particular kinds of chores and jobs, there is of course a whole cluster of articles devoted to DOOM and/or the attempt to pacify fears of robot monsters replacing us all entirely.

In his 1986 book, The Society of Mind, Marvin Minsky described intelligence in terms of lots of smaller, diverse parts. The cluster around virtual assistants is partly about helping with quotidian tasks with lots of different “agents” doing different things. But it’s also about the competition to own the platforms that developers use to build even more services.

The themes from the last 18 months show us what’s possible with greater amounts of data and processing power. Next week, we’ll zoom out to put these trends in historical context. AI researchers talk about AI winters in which they wouldn’t even call what they were doing artificial intelligence. Are we in an AI springtime? Is winter coming?