In this post I introduce Trump Bot, a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) that generates new Donald Trump speeches. Trump Bot is trained on a compilation of transcripts of speeches Trump delivered over the past year. Though prone to grammar mistakes and rare but sudden changes of opinion, Trump Bot’s speeches bear a strong resemblance to those of its namesake.
…we’re going to take our party. We’re not going to make bad jobs. We’re going to create a wall. We don’t win. We have a plan. So we’re going to have a strong border. - Trump Bot
- Evaluation: Trump Bot vs. Donald Trump
- Selected Trump Bot samples
- Credits, inspiration and similar projects
The model and the task
LSTMs are known for their ability to produce some pretty amazing results across a variety of different tasks. LSTMs are trained to predict the next item in a sequence, given the preceding items in that sequence. Once trained, LSTMs can be used to generate new sequences if provided with an initial kick-start, called a seed. I trained two LSTM models for this post:
- Character-level model: predicts the next character in a sequence
- Word-level model: predicts the next word in a sequence
The word-level model has the advantage of not needing to learn how to spell, as each prediction is constrained to a finite vocabulary. As a result, it can spend its time learning higher-level concepts, like grammar and sentence composition. A second advantage of the word-level model is that it can be initialized with pre-trained word embeddings, giving it a head start in the training process [1]. I am not aware of any direct scientific support for these assertions, but experimental testing for this post broadly supports them (additional details can be found below).
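To make the difference between the two prediction tasks concrete, here is a minimal sketch of how the same text turns into (context, next item) training pairs at each level. It is an illustration only, not the code used for this post: the helper name `next_item_pairs`, the whitespace tokenization and the sequence lengths are all assumptions.

```python
def next_item_pairs(items, seq_len):
    """Return (context, target) pairs: each context is seq_len consecutive
    items and the target is the item that immediately follows it."""
    return [
        (items[i:i + seq_len], items[i + seq_len])
        for i in range(len(items) - seq_len)
    ]


text = "we are going to build a wall"

# Character-level: the sequence is made of single characters (spaces included).
char_pairs = next_item_pairs(list(text), seq_len=5)

# Word-level: the sequence is made of whitespace-separated words
# (a simplifying assumption; real tokenization is usually more careful).
word_pairs = next_item_pairs(text.split(), seq_len=3)

print(char_pairs[0])  # (['w', 'e', ' ', 'a', 'r'], 'e')
print(word_pairs[0])  # (['we', 'are', 'going'], 'to')
```

The character-level pairs force a model to work out spelling one letter at a time, while the word-level pairs only ever ask it to choose among whole words.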
While these characteristics make the word-level model a good candidate for generating meaningful paragraphs, I found the character-level model to be better at completing individual sentences. As a result, Trump Bot blends the two models: it uses the character-level model to complete the first sentence based on the initial seed, and uses the resulting sentence as the seed for the word-level model, which then generates the remainder of the speech.
On the human side of the equation, Trump’s speaking style has been described by linguists as aphasic and unstructured, repetitive, grammatically simple, and even feminine. Mark Liberman points out that these traits are more pronounced in his speech transcripts than in audio recordings of the same speeches. While Trump’s speaking style presents certain modeling challenges, like getting stuck in syntactic loops, his relatively simple grammar and vocabulary make his speeches a good candidate for language modeling.
For both the character-level model and the word-level model I scripted a training grid to vary the LSTM’s RNN size, number of layers, dropout rate and sequence length. I selected the top model based on cross-validation results, manually checked the sampled text quality, and then extended training with additional epochs as necessary.
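A training grid of this kind could be scripted roughly as follows. This is a sketch under stated assumptions rather than the script used here: the hyperparameter values are invented for illustration, the helper names are hypothetical, and `fake_train_and_validate` is a placeholder standing in for a real training run that would return a validation loss.

```python
import itertools

# Hypothetical search space: the post does not list the values that were tried.
grid = {
    "rnn_size": [128, 256],
    "num_layers": [1, 2],
    "dropout": [0.0, 0.5],
    "seq_length": [25, 50],
}


def iter_configs(grid):
    """Yield one dict per combination of hyperparameter values."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def select_best(configs, train_and_validate):
    """Score every config and return (loss, config) for the lowest loss.

    train_and_validate(config) is assumed to return a validation loss,
    where lower is better.
    """
    scored = [(train_and_validate(c), c) for c in configs]
    return min(scored, key=lambda pair: pair[0])


def fake_train_and_validate(config):
    # Placeholder for a real training run; the score below is made up.
    return (
        abs(config["rnn_size"] - 256) / 256
        + abs(config["dropout"] - 0.5)
        + 1.0 / config["num_layers"]
    )


configs = list(iter_configs(grid))
best_loss, best_config = select_best(configs, fake_train_and_validate)
print(len(configs), best_loss, best_config)
```

Automatic selection only narrows the field; the manual check of sampled text described above still decides whether the winning configuration is worth training further.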
Additionally, I trained one set of word-level models initialized with pre-trained GloVe word embeddings and one set without. All else being equal, the models initialized with the embeddings achieved better results in less time.
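For readers unfamiliar with the idea, initializing from pre-trained embeddings usually means building an embedding matrix with one row per vocabulary word, copying in the pre-trained vector where one exists and falling back to small random values otherwise. The sketch below assumes GloVe's plain-text layout (a token followed by space-separated floats on each line); the function names, the toy vectors and the random-initialization range are illustrative assumptions, not details from this post.

```python
import random


def parse_glove_lines(lines):
    """Parse GloVe-style text lines: a token followed by its float components."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Return (matrix, hits): one row per vocab word, plus the number of
    words that were found in the pre-trained vectors."""
    rng = random.Random(seed)
    matrix, hits = [], 0
    for word in vocab:
        if word in pretrained:
            matrix.append(list(pretrained[word]))
            hits += 1
        else:
            # Out-of-vocabulary word: small random values (assumed range).
            matrix.append([rng.uniform(-0.05, 0.05) for _ in range(dim)])
    return matrix, hits


# Toy stand-in for a real GloVe file, which would be read line by line.
toy_glove = ["wall 0.1 0.2 0.3", "border -0.4 0.0 0.5"]
vocab = ["wall", "border", "bigly"]

matrix, hits = build_embedding_matrix(vocab, parse_glove_lines(toy_glove), dim=3)
print(hits, "of", len(vocab), "words initialized from pre-trained vectors")
```

The resulting matrix would then be loaded into the model's embedding layer before training starts, which is where the head start comes from.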
I found that the character-level model was better at completing a sentence from a short seed phrase, while the word-level model was better at composing realistic, semi-coherent sentences and paragraphs. As a result, I wrote a short Python script to blend samples from the two models together, per my description above.
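The blending step can be sketched as below. This is not the actual script: `blend`, `first_sentence` and the two toy generator functions are hypothetical stand-ins, and the sketch assumes each generator returns its seed followed by the generated continuation.

```python
import re


def first_sentence(text):
    """Return text up to and including the first '.', '!' or '?'."""
    match = re.search(r"[.!?]", text)
    return text[: match.end()] if match else text


def blend(seed, char_generate, word_generate, n_words=200):
    """Complete the seed's first sentence with the character-level model,
    then hand that sentence to the word-level model as its seed."""
    opening = first_sentence(char_generate(seed))
    return word_generate(opening, n_words)


# Toy stand-ins for the two trained models (assumptions for illustration).
def toy_char_generate(seed):
    return seed + " wall. We don't win."


def toy_word_generate(seed_text, n_words):
    continuation = ["We're", "going", "to", "build", "a", "wall."]
    return seed_text + " " + " ".join(continuation[:n_words])


speech = blend("I will build a", toy_char_generate, toy_word_generate, n_words=6)
print(speech)  # I will build a wall. We're going to build a wall.
```

Truncating the character-level output at the first sentence boundary keeps its strength (finishing a partial phrase) while leaving the longer-range composition to the word-level model.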
Evaluation: Trump Bot vs. Donald Trump
Below are samples from two speeches: one from Trump Bot and one from Donald Trump.
Sample A
Now here's Trump. Now Trump is president. Trump, Trump. Trump is now president. Trump. Trump. So, president Trump. I owe them - all I - you know who I owe I - here - this is the group I owe. I owe these people. Wow. I owe these people. So I didn't take any of their money, and by the way you know it's sort of adverse to what I do.
These people are coming up especially I've been in first place practically since I announced, right for like six months I've been in first place. Do you know how many people have come up? Darn, I'd love to contribute to your campaign. I said I'm not taking money. They said but we'd love to make a major contribution, because if I do you know what's going to happen it's just psychologically even if - it's not - it - deal or any - it's just a guy gives you five million bucks and he's representing a company or he's representing China or he's rep - you know you sort of feel obligated. I'm - I still really don't think it - but I'm a very loyal person, so I just do it the easy way I don't take it.
And it's very hard for me to say no, because all my life I take, I take money, I love money. I take money. Now I'm telling these people I don't want your money. I don't want your money, because I know what happens. So now they come to me and I'll get a call from the head of Ford, nice guy by the way. I think, who the hell knows right but I think he's a - wrote me a beautiful letter and he'll say to me "Mr. President, we're doing a wonderful thing." I said, "Why is it wonderful that you're building a plant in Mexico, why can't you build that plant in the United States? Ideally in Michi - you know ideally I want it in Michigan. But why can't you even if it's anywhere in the United States right, but why can't you build that plant in Michigan?"
Sample B
I will build a wall. She's got the greatest victory that I've ever seen. He had a big problem. I think it was no beautiful. But I said, Let's go to the wall and the people are going to say, We want to have a show that they're going to pay for the wall. We're going to build a wall. We're going to build a wall. We're going to go. It's very hard.
I love everybody. I'm not doing a great job for the vets. I'm one of the most important, you know, in the whole world, that are going to be more than what's going on. And I want to make sure that's the best that I've ever seen negotiated. I've gotten a lot of money outside of the world - the primary two days - I'm not a conservative on my own, and we're going to make America great again. I'm a messenger until we're going to be the good people in the world. I think it's amazing to say, I'm a conservative. I love you.
I will tell you, I will say, I will not tell you I'm going to win because we're going to win. I'm going to make a lot of money. But everybody, that's a big story, that was the wall just have read it because I was the right. I've always been a very good person. But I don't want to use it…
I'd love to thank you to see what happens because I'm not going to win this. I'm going to tell you very well. But I'm a messenger. I'm a conservative conservative. I'm the only one that had a tremendous people that I've ever seen for a long time. A couple of weeks ago, I'm a very conservative. This is the most of the greatest. And I was so honored by the way, the most thing we need. You understand it.
Sample A comes from Trump’s December 21st speech in Grand Rapids, MI (sourced from Mark Liberman), and Sample B was produced by Trump Bot, seeded with the phrase “I will build a”. Sample A was not in the training set for the Trump Bot models. I added line breaks to both samples at natural points. Below, I outline a few of the most notable similarities and differences between the two.
Trump Bot’s style is remarkably similar to that of the real speech in many ways. Here are a few examples with supporting excerpts:
|Observation|Example from Sample A|Example from Sample B|
|Phrases are frequently repeated| | |
|Thoughts are expressed in short clauses| | |
Similarity of style alone is not difficult to achieve: I could easily generate speeches similar in style to the original transcripts by copying and pasting. What is impressive is that Trump Bot also generates entirely new phrases and sentences: 53% of all consecutive three-word sequences and 79% of all consecutive four-word sequences in its output do not appear in the transcripts it was trained on (see Figure 1 for more details).
Figure 1: Generated sequence novelty
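A novelty figure like this can be computed by collecting every consecutive n-word sequence in the training text and checking what share of the generated sequences is absent from that set. The sketch below is one plausible way to do it, not necessarily the calculation behind Figure 1: it assumes whitespace tokenization and counts every occurrence of a generated sequence rather than only unique ones, and the example sentences are made up.

```python
def ngrams(tokens, n):
    """All consecutive n-token sequences, in order, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def novelty(generated_tokens, training_tokens, n):
    """Fraction of generated n-grams that never occur in the training tokens."""
    seen = set(ngrams(training_tokens, n))
    candidates = ngrams(generated_tokens, n)
    if not candidates:
        return 0.0
    unseen = sum(1 for gram in candidates if gram not in seen)
    return unseen / len(candidates)


# Made-up toy texts, split on whitespace (an assumption).
training = "we are going to build a wall and we are going to win".split()
generated = "we are going to build a great wall".split()

for n in (3, 4):
    print(n, round(novelty(generated, training, n), 2))
```

Longer sequences are less likely to have been seen verbatim, which is consistent with the four-word figure being higher than the three-word figure.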
Lastly, Trump Bot more often than not produces content that Trump would undoubtedly support, if he hasn’t done so already. Here are a few examples:
- “I think it’s amazing to say, I’m a conservative.”
- “I’ve gotten a lot of money…”
- “We’re going to build a wall.”
- “…they’re going to pay for the wall.”
- “I want to make sure that’s the best that I’ve ever seen negotiated.”
In spite of these successes, Trump Bot still has a lot to learn. There are a few notable differences between Sample A and Sample B, the largest of which is continuity. Sample A (the real speech) is clearly building towards something throughout, though it’s not necessarily clear what that something is until the third paragraph or so. There are many tangents and asides [3], but there is a clear and central point to it all: Trump can bring jobs back to America. Trump Bot, however, doesn’t develop a consistent argument throughout Sample B; instead, it produces a jumbled, rambling speech that, on average, more closely resembles the “typical” Trump speech.
A second difference is that Trump Bot makes grammatical mistakes more frequently than Trump. For example, in Sample B, it mistakenly uses the word no instead of not: “I think it was no beautiful”. Trump’s speech transcripts often contain similar mistakes (though arguably to a lesser extent), which is undoubtedly one of the reasons Trump Bot makes them; after all, Trump Bot learns solely from these transcripts.
The last difference worth noting is that Trump Bot sometimes just gets things backwards. Here are two of the most egregious examples:
- “…I’m not going to win this.”
- “I’m not doing a great job for the vets.”
Potential future improvements
Future work might focus on enhancing Trump Bot by testing the following:
- Document-specific embeddings: add a unique document ID to the input layer to capture speech-specific themes
- Speech normalization: optimize text normalization to retain important features (e.g., quotation marks), while minimizing unnecessary noise
- Additional hyperparameter tuning: test the impact of varying additional hyperparameter combinations on prediction accuracy and sample quality
- Expand training data: add data from recent speeches and potentially from other forms of media (e.g., Twitter)
Selected Trump Bot samples
I’ve selected a few of my favorite Trump Bot samples and shared them below. I underlined the most realistic quotes and highlighted the most memorable quotes for ease of reading.
Similar to the sample analyzed above, these samples tend to capture the style and key ideas of the real speeches, while often lacking in continuity and clarity. One thing that is more apparent in these samples is that Trump Bot has started to learn the importance of storytelling, often recounting previous conversations with (imaginary) characters; key phrases are often preceded by “I said”, or “He said”.
Sample 1: seeded with "Obamacare is a d"
Obamacare is a disaster. The big poll in the history of the world. It is a big estate problem. But it's playing real numbers. I'm going to be all of the money. I think you're going to do the job. You know, I think somebody has a great job and we're going to make sure I have a great relationship for the vets. And the people are being treated in the real estate state in the history of the world that has like the politicians. I'm not going to be able to take care of our vets. But we're going to take our party. We're not going to make bad jobs. We're going to create a wall. We don't win. We have a plan. So we're going to have a strong border.
We have a very special interests. And we have to be a big hotel. We're going to save. We have the best people in the history of Iran. We're going to work with stupid business. We're going to repeal and replace Obamacare. We're not going to win. We're going to bring everybody into the Second Amendment at the world, by this country, and you have to say, You know, a time that make America great again, you're going to be by the best. And if we're going to be very strong, we're going to make the education rich again. I will tell you, right, we're going to get rid of it. We're going to look very much. A lot of one - time, we're going to have that country. We're going to terminate it.
We're going to spend new trade deals. I'm going to build a wall. I'm going to go to the wall. We're going to stop it. We're going to have strong borders, and we have to do it. We're going to win. We're going to be great again and we're going to go to that a wall. We have total control for people. And a lot of people want to know that about our leaders -- who doesn't do that. There is a problem, the way like me. We have a president. But we're going to work. We're going to make our military and strong again.
Sample 2: seeded with "I am a very "
I am a very simple thing, the problem is a lot of people that are going to pay for the wall. No, I hasn't seen it. He said, Donald Trump was a big guy. He said, It's about a couple of time. He's a great person. I mean, he's a little tough. I've gotten the greatest business in the world. It's not never. So I'm a smart person. And he's a totally good guy. He talks about his Bible. He said we're going to be a good guy. I'm a messenger and I say, I'm on my heart. I've seen an amazing guy and when I said it's a movement of winning. I'm not a conservative person. I know the Make the world. So I'm a messenger of the candidate.
I had a great relationship with the people of the New York Times and I'm going to knock into our country. They're not going to get it in. They're going to get a lot or they're going to be strong. I mean, it's so good. You know I love I'm going to be a great president. And I will tell you, I will say, I will say that if I don't win and we're going to win. I'm going to make a lot of money. But he's a very, very successful citizen. You know, you don't know who they feel - and I love - I've been in our country.
I'm good with the trade deals. And I think I'm going to be very proud of that. I'm not going to win the border. I'm going to say that to be a big day. And if they don't get the nomination, believe me. I'm not doing this. If I'm not proud of it. So a Trump Administration is the most beautiful dishonest deal. But we're not going to be so proud of. And I was running for president.
Sample 3: seeded with "I will destroy ISIS."
I will destroy ISIS. She's going to destabilize this crowd. I'm going to do that. But every time and we're going to get it done. And if we don't win with the military, right? I don't know what they're doing. They have no idea. They don't know if they're going to do something. But I tell you that. I want to give a lot of money up. I don't know if they don't want to have to do it. But we're going to take the wall.
So I think you're going to say, You know we're going to give him a right. I said, He's a very good guy. And it wasn't like it's not even supposed to be run. That I was a baby. I guarantee you - I think it -- and you know, the last thing - it's how a little bit of your poll. And the people I said, I said, Oh, I think I've been very successful. I'm a conservative. I love you. I want to see - I would have been great with the press. But I see the ads with the press, I had a wonderful conservative.
He's a nice guy… He's a one - the Democrats - the Hispanics. You know, you had to do to be tough and then I president, but I'm telling you by the way, maybe I'm the only person that I'm the messenger, I'm a very nice person. I'm going to do great with them. I'm going to tell you this person. I'm a conservative. I'm not building against the country. I don't want to be talking about the beginning. You know, the drugs are so incredible, I've done a little bit fairly.
Credits, inspiration and similar projects
- Auto-Generating Clickbait With Recurrent Neural Networks
- DeepDrumpf: Twitterbot
- RoboTrumpDNN: Generating Donald Trump Speeches with Word2Vec and LSTM
Individual training runs may yield slightly different models due to randomness introduced as part of the training (e.g., minibatch selection, random weight and bias initialization).
Such as the claim that the CEO of Ford is a “nice guy”, which was immediately followed by a retraction of that very claim.