Team 33 - Blue Cat

Members: Max Beeson, Liangqi Cai, Divya Rajasekhar, YuYu Madigan

Here is the link to the Shiny app

Project Description

Data Sets and APIs Used

We will be working with the New York Times API as well as a Kaggle dataset of tweets relating to Charlottesville. The New York Times maintains its own API, so the article data is pulled directly from the source. We will mostly be using data from August 2017, the time of the Charlottesville protests. The Kaggle dataset allows us to analyze related tweets posted between August 15 and 18. It provides data on users, such as their username, follow counts, retweets, and likes, as well as the text of each tweet, its hashtags, and more. The dataset is a random sample of 50,000 tweets. Additionally, we will use either a Google Maps API or a similar resource to analyze the geography of the tweets in relation to Charlottesville.

Target Audience

Our target audience is anyone interested in revisiting the events, feelings, and discourse of the Charlottesville protests. By exploring both respected national journalism (NYT) and public discourse (Twitter), we hope to build a complete picture of how people felt. Additionally, although it is not the main focus, we will examine how people's reporting on Twitter differs from that of the NYT.

Related Questions to Explore

  • How does the sentiment of those reacting (both the NYT and Twitter users) change day to day over the four-day period?
  • Where did the tweets originate geographically, and how did they spread?
  • Does linking an article affect one's engagement on Twitter?
  • Does a newly published article relate to the quantity of tweets?
  • Are there any other factors or trends in tweeting?

Technical Description

The data will be read from an API and a static .csv file. We will filter the data significantly to reduce its size for processing, and we will transform both the CSV and the JSON data into R data frames. We will most likely need to create small test files as well. We will use the following packages, as well as additional ones yet to be determined.

  • SentimentAnalysis - takes a string and returns its sentiment
  • NLP - part-of-speech (POS) tokenizing and string analysis
  • plotly - plotting
  • ggplot2 - plotting and longitude/latitude data
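As a language-neutral illustration of the read, filter, and transform step described above, here is a minimal Python sketch (the project itself plans to do this in R). Everything specific in it is an assumption made for the example: the CSV column names (`user_name`, `created_at`, `full_text`, `retweet_count`), the shape of the JSON payload, the sample rows, and the helper names `load_tweets` and `load_articles`. The real Kaggle file and NYT responses may look different.

```python
import csv
import io
import json
from datetime import date, datetime

# Tiny stand-in for the Kaggle file. Column names are hypothetical.
SAMPLE_CSV = """user_name,created_at,full_text,retweet_count
alice,2017-08-15 13:05:00,Thinking of Charlottesville today,12
bob,2017-08-19 09:00:00,Unrelated later tweet,3
carol,2017-08-17 21:40:00,Read this article,40
"""

# Tiny stand-in for an article search response. The layout is an assumption.
SAMPLE_JSON = """{"response": {"docs": [
  {"headline": {"main": "Example headline A"}, "pub_date": "2017-08-16T10:00:00+0000"},
  {"headline": {"main": "Example headline B"}, "pub_date": "2017-09-02T08:30:00+0000"}
]}}"""

# The four-day window covered by the tweet sample.
START, END = date(2017, 8, 15), date(2017, 8, 18)


def in_window(day):
    """Return True when the date falls inside the study window (inclusive)."""
    return START <= day <= END


def load_tweets(csv_text):
    """Parse CSV text and keep only in-window tweets as uniform records."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        day = datetime.strptime(row["created_at"], "%Y-%m-%d %H:%M:%S").date()
        if in_window(day):
            rows.append({"source": "twitter", "date": day, "text": row["full_text"]})
    return rows


def load_articles(json_text):
    """Parse JSON text and keep only in-window articles as uniform records."""
    rows = []
    for doc in json.loads(json_text)["response"]["docs"]:
        # The first ten characters of the timestamp are the YYYY-MM-DD date.
        day = date.fromisoformat(doc["pub_date"][:10])
        if in_window(day):
            rows.append({"source": "nyt", "date": day, "text": doc["headline"]["main"]})
    return rows


# Both sources end up in one list of records with the same three fields.
records = load_tweets(SAMPLE_CSV) + load_articles(SAMPLE_JSON)
for record in records:
    print(record["source"], record["date"], record["text"])
```

Giving both sources the same fields early on means later steps, such as daily sentiment or tweet counts, can treat tweets and articles alike. The inline sample strings also play the role of the small test files mentioned above.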

We hope to understand how distance from Charlottesville affects the quantity and content of tweets. We will also look for relationships between article publication and tweet quantity. We expect many challenges with this project, especially the learning curve of the new technology: we will need to learn how to use natural language processing tools and how the NYT API works. Once we get over those hurdles the work should go smoothly, but they require a significant initial time investment.
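One way to quantify "distance from Charlottesville" is the great-circle (haversine) distance between the city and each tweet's coordinates. The Python sketch below is only an illustration under stated assumptions: the coordinates are approximate, the distance bands are arbitrary, and the names `haversine_km` and `distance_band` are hypothetical helpers, not part of the project.

```python
import math

CHARLOTTESVILLE = (38.0293, -78.4767)  # approximate (latitude, longitude)
EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(origin, point):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, point)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_band(km):
    """Assign a distance to a coarse band (cut-offs are arbitrary examples)."""
    if km < 100:
        return "under 100 km"
    if km < 1000:
        return "100 to 1000 km"
    return "over 1000 km"


# Hypothetical tweet locations, given as approximate city coordinates.
sample_points = {
    "Washington, DC": (38.9072, -77.0369),
    "Seattle": (47.6062, -122.3321),
}

for name, coords in sample_points.items():
    km = haversine_km(CHARLOTTESVILLE, coords)
    print(f"{name}: {km:.0f} km ({distance_band(km)})")
```

Counting tweets per band, rather than working with raw distances, would give a simple first look at whether tweet volume falls off with distance.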

Project Set Up

Our project can be found at: https://github.com/madigan-99/blue-cat-33

Our project description can be found here
