computational_linguistic_analysis's Introduction

Political Speech Analysis with Natural Language Processing (NLP)

Datasets

This is my Master of Science in Business Analytics (MSBA) capstone project from spring 2023. The primary dataset is large-scale text transcribed from 194 hours of Democratic National Convention (DNC) and Republican National Convention (RNC) speeches from 2004 to 2020. The text was loaded into a SQLite database with 3,470 rows and 9 columns: year, party, day, speaker, speaker count, time, text, text length, and text source.
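As a rough illustration of the table described above, the following sketch builds an in-memory SQLite table with the nine listed columns. The table name `speeches`, the snake_case column names, the column types, and the sample row are all assumptions made for this example; the project's actual schema may differ.

```python
import sqlite3

# Hypothetical column names and types, derived from the nine columns listed above.
SCHEMA = """
CREATE TABLE speeches (
    year          INTEGER,
    party         TEXT,
    day           INTEGER,
    speaker       TEXT,
    speaker_count INTEGER,
    time          TEXT,
    text          TEXT,
    text_length   INTEGER,
    source        TEXT
)
"""

def create_speech_db(rows):
    """Create an in-memory database and load (year, ..., source) tuples into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO speeches VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn

# A made-up placeholder row, not taken from the real dataset.
sample_text = "Thank you all for being here tonight."
sample_rows = [
    (2004, "DNC", 1, "Example Speaker", 1, "00:00:00",
     sample_text, len(sample_text), "example source"),
]

conn = create_speech_db(sample_rows)
column_names = [info[1] for info in conn.execute("PRAGMA table_info(speeches)")]
rows_per_party = conn.execute(
    "SELECT year, party, COUNT(*) FROM speeches GROUP BY year, party"
).fetchall()
print(column_names)
print(rows_per_party)
```

Grouping by `year` and `party` in this way is one plausible starting point for comparing the two conventions across election cycles.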

An extended dataset, used for permutation testing, contains 1,038 presidential speeches delivered by 45 U.S. presidents from 1789 to 2021, from George Washington to Joe Biden. Of these speeches, 445 were delivered by 19 Republican presidents and 513 by 16 Democratic presidents.

Methods

We used two research approaches in this project: topic modeling and permutation tests. The Python code for topic modeling was written in a Jupyter Notebook. The R code for permutation tests was written in R Markdown and knitted to HTML.

  • Topic modeling: to track the evolution of topics from 2004 to 2020.
  • Permutation tests: to compare fine-grained linguistic features of speeches between the two parties.
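The idea behind a permutation test can be sketched in a few lines. The project's own tests were written in R; this Python version is only an illustration of the general technique, not the project's code. The function name `permutation_test`, the choice of the difference in means as the test statistic, and the per-speech rates below are all assumptions made up for this example.

```python
import random

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the observed difference and an estimated p-value.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the party labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point rounding on ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Made-up per-speech feature rates for two groups (not real results).
party_a = [0.031, 0.028, 0.035, 0.030, 0.033, 0.029]
party_b = [0.022, 0.025, 0.020, 0.024, 0.021, 0.023]

observed, p_value = permutation_test(party_a, party_b)
print(f"observed difference: {observed:.4f}, p-value: {p_value:.4f}")
```

Because the test only reshuffles group labels, it makes no assumption about the distribution of the feature, which suits skewed quantities such as per-speech word rates.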

Results

Our topic modeling identified topics that gained or lost favor over time, as well as topics that consistently reflected the core values of the two parties. Our permutation tests showed statistically significant differences between the two parties in past-tense usage in both corpora, and in first-person singular and plural pronoun usage in convention speeches.
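To make the pronoun features concrete, here is a minimal sketch of how first-person singular and plural pronoun rates might be computed for a single speech. The word lists, the regex tokenizer, and the function name `pronoun_rates` are assumptions for illustration; the project may have defined these features differently. Past-tense usage would typically need a part-of-speech tagger and is not covered here.

```python
import re

# Assumed word lists; the project may have used different ones.
FIRST_SINGULAR = {"i", "me", "my", "mine", "myself"}
FIRST_PLURAL = {"we", "us", "our", "ours", "ourselves"}

def pronoun_rates(text):
    """Return first-person singular and plural pronoun rates per token."""
    normalized = text.lower().replace("\u2019", "'")  # unify curly apostrophes
    # A token is a run of letters, optionally followed by one contraction suffix.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", normalized)
    if not tokens:
        return {"singular": 0.0, "plural": 0.0}
    # Strip contraction suffixes so "we're" counts as "we".
    heads = [token.split("'")[0] for token in tokens]
    singular = sum(head in FIRST_SINGULAR for head in heads)
    plural = sum(head in FIRST_PLURAL for head in heads)
    return {"singular": singular / len(heads), "plural": plural / len(heads)}

# A made-up sentence, not taken from the dataset.
example = "I believe we can do this. We're ready, and my family is with us."
rates = pronoun_rates(example)
print(rates)
```

Rates like these, computed per speech, are the kind of values that could then be compared between the two parties with a permutation test.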

Selected visuals

  • Topic modeling with Python (image)

  • Permutation tests with R (two images)

Data sources
