roweyerboat / twitter_hashtag_analysis

This repository analyzes tweets that used the #BLM hashtag. It examines their sentiment and word usage, and applies topic modeling with Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) to pull out the main themes that appear when #BLM is used.

Home Page: https://roweyerboat.github.io/analyzing_a_single_hashtag

Jupyter Notebook 100.00%
natural-language-processing sentiment-analysis topic-modeling hashtags tweets tweets-classification

BLM Hashtag Analysis

Background Information

Over the years, the Black Lives Matter (BLM) movement has gained attention across many platforms. Twitter in particular has been a major place where the message of BLM has been articulated. With the events in May and June of 2020, following the deaths of Ahmaud Arbery, Breonna Taylor, and George Floyd, BLM has been in the spotlight more than ever before. With that, there has been a variety of messages about what BLM means and what it really accomplishes. Some see it as a new wave of the Civil Rights Movement of the 1960s. Others consider BLM a dangerous terrorist organization. To sort out the message of BLM, I scraped Twitter data from the past 7 years, using twint to pull tweets that contain the hashtag BLM.
I recognize and acknowledge that those who use the hashtag might not support Black Lives Matter and could even be critics of it. I felt it was important to gather all that I could to see what rose to the surface when analyzing the text. I also recognize that people can delete tweets, so I do not have a full collection of every tweet.
With those acknowledgements, I was able to obtain 220,504 tweets from over 140,000 Twitter users.

Libraries Needed

Twint
Nest_asyncio
Vader
Textblob
Wordcloud
Nltk
Sklearn

The Data

Since the data was too large to upload to GitHub, here are links to the raw data and the clean data:
Raw data
Clean data

picture of a graph showing tweets over the years with the hashtag blm

Scraping Notebook

The scraping notebook was also too large for GitHub, so here is a link to it.

Repo Files

Tweet Cleaning Notebook - Notebook for cleaning the raw data
EDA and Visualization Notebook - Exploratory data analysis and visualizations
Time Series Notebook - Notebook looking at the hashtag over time
Latent Semantic Analysis - Notebook of the LSA process
Latent Dirichlet Allocation - Notebook of the LDA process
BLM_Hashtag_Analysis - Summarization of the whole project
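
The cleaning notebook itself is not reproduced here. As a rough illustration of what a tweet-cleaning step of this kind can look like, here is a minimal sketch using only Python's standard library. The function name `clean_tweet`, the regular expressions, and the sample tweet are all assumptions made for this example and may differ from what the notebook actually does.

```python
import re

# Hypothetical patterns for this sketch; the project's notebook may use others.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, mentions, and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @user mentions
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    text = NON_ALPHA_RE.sub(" ", text)  # drop digits, punctuation, emoji
    return " ".join(text.split())       # collapse repeated whitespace


# Invented sample tweet, not taken from the data set.
sample = "Justice for #BreonnaTaylor! https://t.co/abc123 @someuser #BLM"
print(clean_tweet(sample))
```

Keeping the hashtag word while dropping the `#` symbol is a design choice in this sketch: it lets hashtags contribute to word counts and topic models like any other token.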

Findings

Sentiment analysis was inconclusive, as a tweet scored as "negative" wasn't necessarily against the BLM movement. Likewise, a "positive" sentiment didn't mean the message was for BLM or promoting its values. Therefore, I chose to look deeper at the words through topic modeling with two methods, LSA and LDA.
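
To make the gap between sentiment and stance concrete, here is a toy sketch. It does not use VADER or TextBlob; it swaps in a tiny hand-made word list so that it runs with the standard library alone. The word lists, the example texts, and the stance labels are all invented for illustration and are not drawn from the project's data.

```python
import re

# Hypothetical mini-lexicons for illustration only; VADER and TextBlob rely on
# much larger lexicons and additional rules.
POSITIVE = {"great", "love", "hope", "proud"}
NEGATIVE = {"angry", "killed", "unjust", "hate"}


def polarity(text: str) -> str:
    """Label text by counting positive words minus negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented texts with hand-assigned stance labels (assumptions, not data).
examples = [
    ("Angry that another unarmed man was killed", "supportive"),
    ("Great, more blocked roads, love it", "critical"),
]

for text, stance in examples:
    print(f"{polarity(text):>8} | {stance:>10} | {text}")
```

In this sketch the first text scores as negative although its stance is supportive, and the sarcastic second text scores as positive although its stance is critical, which is the kind of mismatch described above.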

LSA topic counts
LDA 4 topics

Conclusion

Based on the analysis and on what continued to rise to the top, the main themes of tweets with the hashtag BLM are about protecting Black lives. Antifa was not mentioned as much as mainstream news outlets assume. However, this is definitely a limited study. It was interesting to see that sentiment analysis isn't always helpful. Future work with this and other data sets would look at how BLM is portrayed by major news media and whether the words used are similar. One application of this project is understanding the context behind sentiment analysis results.

Blog and Video

Blog post about the project
Video of final presentation
