Analysis of Political Corpus

Project Overview

The "Analysis of Political Corpus" project applies Natural Language Processing (NLP) techniques to the Political Debates Dataset. The dataset covers debates on six prominent US political topics: healthcare, god, guns, gay rights, abortion, and creation. Each debate category contains discussion threads, and each thread contains posts that argue either pro or con. The project covers a range of NLP tasks, including text statistics, sentiment analysis, and named-entity recognition, to uncover patterns in the data.

Project Tasks

The project is organized into several tasks, each focusing on different aspects of the political debates. The tasks are divided into two Jupyter notebooks:

  • Tasks 1 to 4: Contained in one Jupyter notebook.

    • Download the dataset, organize it for easy manipulation, and save it to an Excel file.
    • Generate statistics on the debates, including the number of threads per category and average message count per thread.
    • Calculate the argument text length and display the distribution.
    • Identify key persons and organizations involved in shaping arguments using spaCy named-entity recognition.
  • Tasks 5 to 10: Contained in another Jupyter notebook.

    • Perform sentiment analysis using SentiStrength, then compute the Pearson correlation coefficient and its p-value for each category.
    • Assess the relationship between thread and category titles using pre-trained word2vec models.
    • Analyze the coherence of discussion posts within threads using Empath.
    • Examine pro and con reasoning by identifying modal verbs and their context.
    • Test for the presence of negation operators in argument texts and create a histogram showing the percentage of arguments with negation operators.
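Two of the tasks above, the per-category thread statistics and the negation test, can be sketched with the standard library alone. This is a minimal illustration, not the notebooks' actual code: the record layout in `POSTS`, the sample texts, the `NEGATIONS` word list, and the function names are all assumptions made for the example, and the real dataset and negation lexicon may differ.

```python
import re
from collections import defaultdict

# Hypothetical records: (category, thread_id, stance, text).
# The real dataset layout may differ; these rows are made up for illustration.
POSTS = [
    ("guns", "t1", "pro", "Citizens should be able to defend themselves."),
    ("guns", "t1", "con", "More guns do not make anyone safer."),
    ("guns", "t2", "con", "Background checks are never a burden."),
    ("abortion", "t3", "pro", "The decision must stay with the patient."),
]

# Small illustrative negation list (an assumption, not an exhaustive lexicon).
NEGATIONS = {"not", "no", "never", "none", "nobody", "nothing",
             "neither", "nor", "cannot"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (keeps apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def has_negation(text):
    """Return True if any token is a negation word or an n't contraction."""
    return any(t in NEGATIONS or t.endswith("n't") for t in tokenize(text))


def category_stats(posts):
    """Compute thread counts, average posts per thread, average post length
    in tokens, and the percentage of posts containing a negation."""
    per_thread = defaultdict(lambda: defaultdict(int))  # category -> thread -> posts
    negated = defaultdict(int)                          # category -> negated posts
    lengths = defaultdict(list)                         # category -> token counts
    for category, thread_id, _stance, text in posts:
        per_thread[category][thread_id] += 1
        negated[category] += has_negation(text)
        lengths[category].append(len(tokenize(text)))

    stats = {}
    for category, threads in per_thread.items():
        n_posts = sum(threads.values())
        stats[category] = {
            "threads": len(threads),
            "avg_posts_per_thread": n_posts / len(threads),
            "avg_tokens_per_post": sum(lengths[category]) / n_posts,
            "pct_with_negation": 100.0 * negated[category] / n_posts,
        }
    return stats


STATS = category_stats(POSTS)

if __name__ == "__main__":
    for category, row in STATS.items():
        print(category, row)
```

The `pct_with_negation` values are the numbers a per-category histogram would plot. A word-list match is the simplest possible negation test; it ignores scope and will miss negations written with typographic apostrophes.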

How to Use

To run the project tasks, you need Python and Jupyter Notebook installed. Make sure to install the required Python libraries mentioned in the Jupyter notebooks.

  1. Clone the repository to your local machine.
  2. Open and run the Jupyter notebooks for tasks 1 to 4 and tasks 5 to 10.
  3. Follow the instructions and code comments in the notebooks to complete each task.

Dependencies

The project relies on several Python libraries, including spaCy and SentiStrength. The full list of dependencies is in the Jupyter notebooks and the requirements.txt file.

Contributors

justinseby