Bradley Hayes, a postdoctoral associate at MIT who does robotics research, just turned Donald “Drumpf” Trump into a robot. He programmed a recurrent neural network — an artificial intelligence — to study and emulate the Republican-ish candidate’s speeches.
Hayes’s “day job,” he says, is “research focused on human-robot teaming: designing algorithms that let robots work together with and learn from humans so that humans can be safer, more efficient, more effective at their jobs.” @DeepDrumpf is a “side project.” He drew inspiration, in part, from John Oliver’s “fantastic sketch”. (“Hopefully he’ll see this — hopefully he’ll see this and appreciate it.”)
Inverse spoke with Hayes about this patriotic endeavor.
What else inspired you to make @DeepDrumpf?
It came about from a lunchtime conversation with some colleagues of mine who also do robotics research and deal with machine learning. We were talking about various statistical modeling techniques that were actually relevant for our research. It turns out that the same technique that is behind DeepDrumpf works in a lot of robotics domains, because it’s a modeling technique that tries to learn the structure of sequential information, or sequential data. Natural language is a great example of sequential data, where the structure of the sentence is fairly consistent: there are rules, and there is underlying structure to all the data that you’re getting.
A different researcher out at Stanford wrote a course on neural networks, and, in particular, published an article titled “The Unreasonable Effectiveness of Recurrent Neural Networks.” So, he wrote up this fantastic introduction to this statistical modeling technique, and a bunch of people have shown that it has this unreasonable power to represent structure in this kind of free-form text data.
I saw an article that was comparing the speech complexity of the various political frontrunners. The article was saying how Trump’s using more simplistic language, and it’s a huge hit with his voting demographic and his fans. From a political perspective, that’s really great, because it makes your message clear and within grasp of the widest possible audience; from a machine-learning standpoint, that means that this might be the most tractable model that we can make.
Had you heard of a coding language called “Make Python Great Again”?
You know, I saw it yesterday. TrumpPython or something like that? I did see that. I read an article about it, I went to their GitHub page, but I haven’t had any time to play with it yet. But it looks great.
Can we learn anything about Trump’s linguistic tendencies, or anything like that, from your A.I.?
Yeah, it’s possible in the sense that, if you look at the output from the model, it’s indicative of the structure that the model has learned from the data. So the kinds of repetition, the kinds of things that come out of the model, will tell you — potentially — about certain things that are inherent to his speaking patterns and his message.
You wouldn’t necessarily be able to get that from the Twitter account itself, mostly because Twitter only gives you 140 characters to work with. And, because there’s not a whole lot of data that has gone into the model, and also partially because the transcripts are from debates — where the candidates (and especially Trump) tend to interrupt themselves — it makes for these discontinuities in the output.
There’s still a little bit of manual work required to basically sample a wall of text from this model and then go through it and pick out the best contiguous 140-character nugget, and then post that.
So it’s not very hands-off at this point?
It effectively learns a probability distribution, and you can sample from it. What that means is — you have your model and you can ask it for a letter. And, if you ask it for enough letters in a row, it’ll give you things that resemble English. Or, even better, some of them resemble things that Trump might actually have said — because it was trained on him. So, the general process I’ve been following is: I would sample, say, 500 or 1,000 characters from it. It would just give me a wall of text with 500 or 1,000 characters’ worth of, I guess, ramblings, and then, from within that, I’ll just pick the best 140-character block that makes sense. Or the best sentence that comes out of it that seems kind of relevant.
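To make the "ask it for a letter" idea concrete, here is a minimal Python sketch of sampling a wall of text one character at a time from a learned distribution. It is not Hayes's code: a simple bigram count model stands in for the recurrent neural network, the training text is a toy string built from the slogan quoted later in this interview, and the names `train_bigram`, `sample_text`, and `candidates` are hypothetical. The final choice of which block to post stays manual, as Hayes describes; the helper only lists sentence-like pieces that would fit in a tweet.

```python
import random
import re
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each character, which characters follow it in the text."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def sample_text(counts, start, n_chars, rng):
    """Draw n_chars characters, each sampled given the previous character."""
    out = [start]
    for _ in range(n_chars - 1):
        followers = counts.get(out[-1])
        if not followers:
            # Character with no known successor: restart from a known one.
            followers = counts[rng.choice(sorted(counts))]
        chars = sorted(followers)
        weights = [followers[c] for c in chars]
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


def candidates(wall, limit=140):
    """Split the wall of text into sentence-like pieces that fit the limit."""
    pieces = re.split(r"(?<=[.!?])\s+", wall)
    return [p.strip() for p in pieces if p.strip() and len(p.strip()) <= limit]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    toy_corpus = "make america great again. " * 3  # toy stand-in for transcripts
    model = train_bigram(toy_corpus)
    wall = sample_text(model, "m", 500, rng)
    for piece in candidates(wall)[:5]:
        print(piece)  # a human would still choose which piece to post
```

A bigram model only looks one character back, so its output is far rougher than what a trained recurrent network would produce; the point is only the sample-then-select workflow.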
For example, last night I was using it to kind of live-tweet the debate. And so, one of the things you can do with a model like this is you can prime it. So, because the model’s only giving you one character at a time, it has this dependence on the characters that’ve come before it — the letters that it output previously. That’s how it learns words, that’s how it captures sentence structure and certain elements of grammar.
Say I start my sentence with ‘Romney is’ and then ask it for the next thousand characters. We call that priming. It’ll give whatever output it wants, but it’ll set the initial part of the sequence to that ‘Romney is…’
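Priming can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the DeepDrumpf implementation: a fixed-context Markov model (each character depends on the previous `ORDER` characters) stands in for the recurrent network, the training text is a neutral toy corpus invented for the example, and `train`, `generate`, and `ORDER` are hypothetical names. The only thing the sketch is meant to show is that the prime is placed at the start of the sequence and every later character is sampled conditioned on what came before.

```python
import random
from collections import Counter, defaultdict

ORDER = 3  # hypothetical context length: how many previous characters matter


def train(text, order=ORDER):
    """Map each run of `order` characters to counts of the character after it."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        model[text[i:i + order]][text[i + order]] += 1
    return model


def generate(model, prime, n_chars, rng, order=ORDER):
    """Start the output with `prime`, then sample n_chars more characters."""
    out = prime
    for _ in range(n_chars):
        followers = model.get(out[-order:])
        if not followers:
            break  # context never seen in training: this toy model stops
        chars = sorted(followers)
        weights = [followers[c] for c in chars]
        out += rng.choices(chars, weights=weights, k=1)[0]
    return out


if __name__ == "__main__":
    # Toy corpus (an assumption for illustration), repeated so that the end
    # of the text wraps around to the beginning.
    toy_corpus = "the model is small. the data is small. the output is short. " * 2
    model = train(toy_corpus)
    print(generate(model, "Romney is", 60, random.Random(0)))
```

The prime only works here because its last three characters, " is", occur in the toy corpus. A recurrent network carries a learned hidden state instead of a fixed window, so it is not limited to contexts it has seen verbatim.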
Is that referencing those tweets with bracketed phrases?
One of the things that I’m hoping to do, once the process is a little bit cleaner — and that’s just going to come with more data — is to start having it interact with the other candidates. If you look at the Twitter account, it’s following the other primary candidates. Eventually, it’s hopefully going to start responding to them and maybe challenging them. But that’s more of a weekend-project kind of thing.
Can you explain what a recurrent neural network is in simplistic, unspecialized language?
Sure — we’ll try. A neural net, in general, is taking in some input, then it’s doing some math in the middle, and it gives you an output. In general, it’s just a classifier. So, given some input, it will tell you what class that input corresponds to. A popular example for a basic neural net would be: you give it a picture of a cat, and you want it to tell you whether it’s a cat, a dog, a plane, or a car. You want it to say, “Okay — with high confidence — this is a cat that you just gave me.”
So that’s the high-level classification task. This is a similar concept, but instead of being cat, dog, car, the classes are the individual letters of the alphabet and punctuation. So it’s taking an input, and then it’s doing math to it based on what it’s learned — so all of the learning happens ‘in the middle,’ we’ll call it — and it gives you a classification at the end. So, like, this letter.
The thing that makes it a recurrent neural net is that the output from previous steps gets fed into the next step as part of the model. The fact that the model gave me an ‘M’ will feed into the next run-through of the model. So then it might give you an ‘a,’ and then a ‘k,’ and then an ‘e,’ because it’s trying to put out ‘Make America great again,’ because that’s represented in the data a lot.
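The feedback loop described here can be sketched as a single recurrent step written in plain Python. This is a rough illustration, not the network behind @DeepDrumpf: the weights are random and untrained, so the output is gibberish, biases are omitted, and `ALPHABET`, `HIDDEN`, `init_params`, `step`, and `generate` are names assumed for the example. What it does show is the two things carried from one step to the next: the hidden state and the character that was just emitted.

```python
import math
import random

ALPHABET = list("abcdefghijklmnopqrstuvwxyz ")
HIDDEN = 8  # hypothetical hidden-state size


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_params(rng):
    """Random, untrained weights: input-to-hidden, hidden-to-hidden, hidden-to-output."""
    n = len(ALPHABET)
    return (rand_matrix(HIDDEN, n, rng),
            rand_matrix(HIDDEN, HIDDEN, rng),
            rand_matrix(n, HIDDEN, rng))


def softmax(logits):
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def step(params, hidden, char_index):
    """One recurrent step: (previous hidden state, current character) ->
    (new hidden state, probability of each possible next character)."""
    w_xh, w_hh, w_hy = params
    # The input is one-hot, so its contribution is one column of w_xh.
    new_hidden = [
        math.tanh(w_xh[j][char_index]
                  + sum(w_hh[j][k] * hidden[k] for k in range(HIDDEN)))
        for j in range(HIDDEN)
    ]
    logits = [sum(w_hy[c][j] * new_hidden[j] for j in range(HIDDEN))
              for c in range(len(ALPHABET))]
    return new_hidden, softmax(logits)


def generate(params, first_char, n_chars, rng):
    """Emit characters one at a time, feeding each output back in as input."""
    hidden = [0.0] * HIDDEN
    idx = ALPHABET.index(first_char)
    out = [first_char]
    for _ in range(n_chars - 1):
        hidden, probs = step(params, hidden, idx)
        idx = rng.choices(range(len(ALPHABET)), weights=probs, k=1)[0]
        out.append(ALPHABET[idx])
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(0)
    print(generate(init_params(rng), "m", 24, rng))
```

Training would adjust the three weight matrices so that, after an "m", the probabilities favor the characters that actually follow it in the transcripts.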
Are you particularly proud of any DeepDrumpf tweets thus far?
Yeah, actually. I have a couple that I haven’t actually posted yet, but —
[Laughs] Exactly. Of the ones that are posted, I’m particularly happy with ‘I’m what ISIS doesn’t need.’
Let’s see … I did seed it with ‘I’m not racist, but…’ and the continuation of that was ‘…believe it,’ which I thought was pretty excellent. I was going to save that one for when it became relevant, if it became relevant.
Nothing good ever comes after those words.
Would you rather vote for Donald Trump or vote for @DeepDrumpf?
I think there are tradeoffs with each of those choices.