MACHINE LEARNING TEXT ANALYSIS

We processed and analyzed multiple rally speeches and addresses from three modern political figures: President Joe Biden, President Donald Trump, and President Barack Obama. You can find our analyses below.

SEMANTIC ANALYSIS

In our initial analysis, we used a semantic dataset to build a dictionary of 76,224 words, giving each positive word a weight of 1 and each negative word a weight of -1. We then used this dictionary to explore the sentiment of Trump’s speeches and, following the same process, to compare Trump’s rally speeches with a Biden rally speech and an Obama rally speech. Our results are shown below:

PUA.png
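
To make the dictionary-based scoring concrete, here is a minimal sketch in Python. It is not the code behind the results above: the six-word LEXICON is a made-up stand-in for the 76,224-word dictionary, and the function names and the per-word normalization are assumptions added for illustration.

```python
import re

# Hypothetical miniature lexicon; the real dictionary described above
# has 76,224 entries. Positive words weigh 1, negative words weigh -1.
LEXICON = {"great": 1, "win": 1, "strong": 1,
           "bad": -1, "terrible": -1, "lose": -1}

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())

def sentiment_score(text, lexicon=LEXICON):
    """Sum the weights of every word that appears in the lexicon."""
    return sum(lexicon.get(word, 0) for word in tokenize(text))

def normalized_score(text, lexicon=LEXICON):
    """Average weight per word (an assumed normalization), so that
    speeches of different lengths can be compared."""
    words = tokenize(text)
    return sentiment_score(text, lexicon) / len(words) if words else 0.0
```

Words that are not in the lexicon contribute 0, so a speech's score depends only on the balance of its positive and negative words.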

TEXT PREDICTION WITH MARKOV CHAINS

Next, we used Markov Chains for text prediction. We built a Markov Chain from the 35 Trump rally speeches, which produces a dictionary with each unique word in the speeches as a key. The value attached to each key is a list of the words that follow that key in the speeches. We did the same for phrases of two and three words. With this dictionary, we can provide a word and find the words that most commonly follow it in the speeches. We looped this process to generate sentences from a single word.
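
The dictionary described above can be sketched in a few lines of Python. This is an illustrative sketch, not our original code: the function names, the whitespace tokenization, and the choice to always follow the single most common next word are assumptions.

```python
from collections import Counter, defaultdict

def build_chain(texts, order=1):
    """Map each run of `order` consecutive words to the list of words
    that follow it anywhere in the given texts."""
    chain = defaultdict(list)
    for text in texts:
        words = text.lower().split()
        for i in range(len(words) - order):
            key = tuple(words[i:i + order])
            chain[key].append(words[i + order])
    return chain

def most_common_followers(chain, key, n=3):
    """Return up to n words that most often follow `key`."""
    return [word for word, _ in Counter(chain.get(key, [])).most_common(n)]

def generate(chain, seed, max_words=10):
    """Repeatedly append the most common follower of the latest key."""
    key = tuple(seed)
    out = list(key)
    for _ in range(max_words):
        followers = most_common_followers(chain, key, n=1)
        if not followers:
            break  # this key never had a follower in the texts
        out.append(followers[0])
        key = tuple(out[-len(key):])
    return " ".join(out)
```

Calling build_chain with order=2 or order=3 gives the two-word and three-word versions. Because generate always takes the most common follower, it can fall into a repeating loop; sampling a follower at random from the list is a common alternative.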

Screen Shot 2021-03-23 at 1.31.17 PM.png

TEXT PREDICTION WITH RNN

The Markov Chain could only provide us with text that had already been said; it could not create novel text. So instead of using the Markov Chain to write a new speech, we used a character-level recurrent neural network (Char-RNN).

Screen Shot 2021-03-23 at 1.42.35 PM.png

TOPICS CLUSTERING

Finally, we clustered the words in the Trump speeches to separate the speeches by topic.

Screen Shot 2021-03-23 at 1.47.49 PM.png

©2020 by Amy Huddell. Proudly created with Wix.com
