By Tyler Schnoebelen, October 13, 2016

U.S. Presidential Debates Through the Eyes of a Computer

This post wraps up a series I’ve been doing on using machine learning models to understand recent American political debates (here and here). By taking all the transcripts of the debates since last year, I show which words and phrases most distinguish debaters’ styles and issues. Training a computer to identify speakers is usually thought of as a way of doing forensics or personalization. But here, I’m interested in something closer to summarization. If you could pick one section of talk for each candidate from the last debate, which moments are most consistent with everything they’ve said up to then?

Machine learning behind the presidential debates

First, let’s begin with Hillary Clinton at her most Clintonesque. The theme of families and children is big in her campaign.

Mr. Carter, I have tried my entire life to do what I can to support children and families. You know, right out of law school, I went to work for the Children’s Defense Fund. And Donald talks a lot about, you know, the 30 years I’ve been in public service. I’m proud of that. You know, I started off as a young lawyer working against discrimination against African-American children in schools and in the criminal justice system. I worked to make sure that kids with disabilities could get a public education, something that I care very much about. I have worked with Latinos—one of my first jobs in politics was down in south Texas registering Latino citizens to be able to vote. So I have a deep devotion, to use your absolutely correct word, to making sure that an every American feels like he or she has a place in our country.

Historically, Clinton is in a league of her own when it comes to mentioning children: 107 times across all the debates she’s been in this year and in the past. If you put together all of the Bushes (George H.W., George W., and Jeb), you still wouldn’t equal that. Al Gore got up to 69 mentions in all his debate appearances through the years. Similarly, Clinton has mentioned families 92 times over the years. Obama and Gore are in second and third place with 87 and 65 mentions, respectively. And while Bill Bradley and Al Gore mentioned discrimination the most of everyone since 1976, in our current election cycle it’s Clinton who has used this word the most.

One thing that did shift in this debate is the use of our country. That’s a phrase that Trump has used a great deal, but here Clinton used it 20 times to Trump’s 15 (they were basically matched for the use of our, btw).

This excerpt also mentions Latinos, but she doesn’t use the determiner the. Lynne Murphy has a nice post about why it sounds racist to refer to people as the Latinos or the African-Americans. Outside of the debates, Trump has also talked about his relationship with “the Blacks.” My own quick survey of debates since 1976 suggests that Republicans have a much stronger tendency to use definite articles with minority groups.

The model knows only specific words and phrases, though, so it doesn’t have a notion of what classes of words like determiners or demonstratives might mean. Nor does it really know that Clinton is addressing the question-asker as Mr. Carter and returning to his word, devoted. To a human watching (or at least to David Brooks), this attempt at connection seems like an important contrast. Certainly, data scientists and computational linguists can imagine how to build new features around this: features for referring to questioners by name and for using their own words. Matching the style of people you are friendly with is known as ‘accommodation’ and seems to be part of what we react to positively in others.

But let’s return to what the machine learning model does know. Within the debate, it selects the following as the most Trumpian moment for Donald Trump, the one it is most confident was voiced by him and no one else:

And my whole concept was to make America great again. When I watch the deals being made, when I watch what’s happening with some horrible things like Obamacare, where your health insurance and health care is going up by numbers that are astronomical, 68 percent, 59 percent, 71 percent, when I look at the Iran deal and how bad a deal it is for us, it’s a one-sided transaction where we’re giving back $150 billion to a terrorist state, really, the number one terror state, we’ve made them a strong country from really a very weak country just three years ago.

Here you have Trump explicitly bringing up his very well known slogan, “Make America Great Again.” He also talks about deals—historically, deals have only been mentioned 171 times in debates but Trump accounts for 49 of those times.

In this election cycle, Trump is also big on talking about Obamacare, though not-too-many-years-ago, Michele Bachmann and Mitt Romney used Obamacare even more than Trump has. The model doesn’t know anything as complex as the interesting way that Trump lists numbers as a crescendo, as he does in this excerpt. But it does know that he likes to talk about who is strong and weak.

And of course, there’s a horrible in there—historically, Trump is responsible for 30% of all horribles ever used by candidates or moderators in all the debates since 1976. Other strongly characteristic affective words include the 11 tremendouses of this debate and the 17 disasters.
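Shares like that 30% are simple tallies over the transcripts. This isn't the code behind the post, just a minimal sketch of the idea, assuming the transcripts have already been split into (speaker, text) turns. The function name `word_share` and the toy `turns` are made up for illustration and are not the debate data.

```python
import re
from collections import Counter

def word_share(turns, word):
    """Return each speaker's share of all uses of `word` across the turns."""
    counts = Counter()
    # Whole-word, case-insensitive match; plurals such as "horribles" are not counted.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    for speaker, text in turns:
        counts[speaker] += len(pattern.findall(text))
    total = sum(counts.values())
    if total == 0:
        return {}
    # Drop speakers who never used the word.
    return {speaker: n / total for speaker, n in counts.items() if n}

# Hypothetical toy turns, not real transcript data.
turns = [
    ("TRUMP", "It is a horrible deal. Horrible."),
    ("CLINTON", "That was a horrible thing to say."),
    ("COOPER", "Thank you both."),
]
shares = word_share(turns, "horrible")
```

On the toy turns, the speaker who says the word twice gets two thirds of the share and the moderator, who never says it, drops out.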

If Trump really does feel unshackled, the final debate is likely to have even more extreme language…as you can tell by the use of the metaphor of shackles, which is a hobbling and enslaving mechanism rarely seen on self-described billionaires.

Some notes on method

The data is all of this year’s presidential and primary debates from The American Presidency Project, with only the most recent town hall debate between Trump and Clinton held out. The training data contains the 13 major speakers this year (Bush, Carson, Christie, Cruz, Kasich, Rubio, Trump; Sanders, O’Malley, Clinton; Blitzer, Cooper and Tapper). The goal is to predict whether Clinton or Trump was speaking in the town hall debate.

The best model turns out to be a logistic regression model, which in a 13-way classification task correctly gets Clinton and Trump 207 out of 299 times (the second-best guess is right another 39 times). The precision scores are very high, meaning that when the model guesses Clinton or Trump, it’s almost always right. The recall is not as awesome, which is to be expected given that we allowed it to guess 11 other possible speakers. If we only let it guess Trump or Clinton without otherwise changing the training data, it’d be right 86.6% of the time.
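To make those numbers concrete: 207 of 299 is about 69.2% accuracy on the first guess, and counting the second-best guess brings it to 246 of 299, about 82.3%. Restricting the guess to Trump or Clinton doesn't require retraining, either; one way is to look only at those two speakers' probabilities. Here is a minimal sketch of both ideas, assuming the model outputs a probability per speaker for each turn. The function names and the probabilities are invented for illustration, and this is not necessarily how the 86.6% figure was computed.

```python
def top_k_accuracy(ranked_guesses, truths, k):
    """Fraction of items whose true speaker is among the top k guesses."""
    hits = sum(truth in guesses[:k] for guesses, truth in zip(ranked_guesses, truths))
    return hits / len(truths)

def restrict_to(probs, allowed):
    """Keep only the allowed speakers and renormalize so the probabilities sum to 1."""
    kept = {speaker: p for speaker, p in probs.items() if speaker in allowed}
    total = sum(kept.values())
    return {speaker: p / total for speaker, p in kept.items()}

# Figures reported in the post.
top1 = 207 / 299          # first guess is right
top2 = (207 + 39) / 299   # first or second guess is right

# Invented model output for one turn (only a few of the 13 speakers shown).
probs = {"SANDERS": 0.40, "CLINTON": 0.35, "TRUMP": 0.15, "RUBIO": 0.10}
two_way = restrict_to(probs, {"CLINTON", "TRUMP"})
best = max(two_way, key=two_way.get)
```

In the invented example the 13-way guess would have been Sanders, but once only the two people on stage are allowed, the same probabilities point to Clinton.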

US Pres debate table

When the model is wrong, it’s usually because it’s guessing Trump or Sanders for what is really Clinton or it’s guessing Rubio or Sanders for what is really Trump. It makes sense for you to sound like someone from your own party as well as for you to sound like someone who was an earlier rival of your present opponent (you’ll bring up many of the same issues). The weirdness is Clinton getting classified as Trump. There are 11 of these errors.

The logistic regression model has an average confidence of 64.7% for the things it gets right. The 11 Trump-for-Clinton errors have an average confidence of 35.6%, meaning that the model doesn’t assert itself too much. Part of the reason is that these are short. On average, the correct items are over three times as long as the wrong ones. For example, one of the wrong ones is simply, No. That was Clinton but it is actually a good guess for Trump, who has one of the strongest tendencies towards negation of any candidate.
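A comparison like this is easy to reproduce once each prediction is stored alongside its text. The following is a rough sketch, assuming each record holds the turn's text, the true speaker, the model's guess, and the probability of that guess; the field names and the three records are invented and are not the actual predictions.

```python
from statistics import mean

def summarize(predictions):
    """Average confidence and length in words, split by whether the guess was right."""
    summary = {}
    for label, keep in (("correct", True), ("wrong", False)):
        rows = [p for p in predictions if (p["guess"] == p["truth"]) == keep]
        summary[label] = {
            "n": len(rows),
            "mean_confidence": mean(p["confidence"] for p in rows) if rows else None,
            "mean_words": mean(len(p["text"].split()) for p in rows) if rows else None,
        }
    return summary

# Invented records; "confidence" is the probability of the top guess.
predictions = [
    {"text": "I have tried my entire life to support children and families",
     "truth": "CLINTON", "guess": "CLINTON", "confidence": 0.9},
    {"text": "It is a horrible deal for our country",
     "truth": "TRUMP", "guess": "TRUMP", "confidence": 0.7},
    {"text": "No", "truth": "CLINTON", "guess": "TRUMP", "confidence": 0.3},
]
summary = summarize(predictions)
```

Short turns give the model very few words to work with, so seeing low confidence and short length go together among the errors is the pattern you would expect.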

The longest item that the model mistakes for Trump is Clinton talking about China multiple times (but the model is only 30.6% confident). If you’ve followed the campaign, you know that Trump is normally the candidate talking about China.

And actually—well, that was very interesting. First of all, China is illegally dumping steel in the United States and Donald Trump is buying it to build his buildings, putting steelworkers and American steel plants out of business. That’s something that I fought against as a senator and that I would have a trade prosecutor to make sure that we don’t get taken advantage of by China on steel or anything else.

Next, let’s look at the most confident misclassification. It’s pretty obvious to us that the model shouldn’t be 82.1% confident that this is Trump when it’s clearly Clinton:

And the final thing I would say, this is the 10th or 12th time that he’s denied being for the war in Iraq. We have it on tape. The entire press corps has looked at it. It’s been debunked, but it never stops him from saying whatever he wants to say.

The model is trained on the 2016 debates, and as far as they go, Trump is a major user of Iraq. He is also very likely to talk about what people say and to talk about wants. In this election, he is also a big user of thing. Historically, the only people to rival his use of thing are John Edwards and Mike Huckabee.

When it comes to understanding something as deeply human as politics, we can use computers and statistics to guide us but there are always humans in the loop. At the ends of the process, it’s humans who produce this kind of data and it’s humans who are represented or disenfranchised.

And of course, there are usually humans in the middle, trying out new features and challenging their own algorithmic assumptions. So while we can say that Trump is a big user of monosyllables like thing, it’s harder for a computer to automatically know that minimal-thing-user Mike Pence made a very memeable moment with his comment to Tim Kaine, “Senator, you’ve whipped out that Mexican thing again.” Donald Trump’s campaign began with that Mexican thing and here at the denouement of his campaign we can use our eyes and ears to guide our machine learning as much as we let the machine learning tell us what’s worth a second look.