spark_twitter

A playground for experimenting with the Twitter Streaming API and Apache Spark.

It answers the question: what are the currently most popular hashtags in tweets mentioning Toronto?

  1. Run twitter_client.py: it creates a socket on localhost:50001 and listens.
  2. Run spark_client.py: it connects to localhost:50001.

twitter_client.py connects to the Twitter API, picks up tweets that mention Toronto, and sends each tweet's hashtags over the socket to the Spark stream.
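As a rough illustration of the hashtag hand-off described above, here is a minimal sketch using only the standard library. The helper names `extract_hashtags` and `send_hashtags`, the regex-based extraction, and the newline-delimited wire format are assumptions made for this example; the repository's twitter_client.py may do this differently (for instance, by reading hashtags from the tweet metadata returned by the API).

```python
import re
import socket

# Assumption: a hashtag is '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the hashtags found in tweet text, lowercased and without '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def send_hashtags(conn, text):
    """Send the hashtags of one tweet over a connected socket.

    Each hashtag goes out as its own newline-terminated line, so a
    line-oriented reader on the other end sees one hashtag per record.
    Returns the list of hashtags that were sent.
    """
    tags = extract_hashtags(text)
    if tags:
        payload = "\n".join(tags) + "\n"
        conn.sendall(payload.encode("utf-8"))
    return tags
```

Here `conn` would be the connection accepted on localhost:50001 once spark_client.py has connected. One hashtag per line is a convenient choice because a text socket stream is typically consumed line by line.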

Spark keeps a running count of the hashtags, updating it as new tweets arrive, and the 20 most popular hashtags are displayed in the console as a data frame.
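The counting logic can be sketched without Spark. The example below is a plain-Python stand-in, not the project's Spark code: it uses `collections.Counter` in place of Spark's stateful stream processing, and the function names `update_counts` and `top_hashtags` are hypothetical. It only illustrates the idea of folding each incoming batch into running totals and then taking the top 20.

```python
from collections import Counter


def update_counts(totals, batch):
    """Fold one batch of hashtags into the running totals (in place)."""
    totals.update(batch)
    return totals


def top_hashtags(totals, n=20):
    """Return up to n (hashtag, count) pairs, most frequent first."""
    return totals.most_common(n)


if __name__ == "__main__":
    totals = Counter()
    # Two made-up batches standing in for successive stream intervals.
    update_counts(totals, ["toronto", "job", "toronto"])
    update_counts(totals, ["toronto", "hiring"])
    for hashtag, count in top_hashtags(totals):
        print(hashtag, count)
```

In the Spark version, the running totals live in the streaming state rather than in a local `Counter`, but the shape of the computation is the same: accumulate per-hashtag counts, sort by count, and keep the first 20 rows.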


Spoiler: #toronto is usually around 10 times more popular than any other hashtag.

Example (proof):

hashtag        hashtag_count
toronto        240
job            20
hiring         18
blackpanther   16
canada         15
sales          11
news           11
topoli         11
ontario        10
danaigurira    9
bobmarleyday   9
careerarc      9
hamilton       8
fake           8
mapleleafs     8
mississauga    8
nhl            7
design         7
cats           7
winter         7
