tngaspar / twitter-stream-mongo

Dockerized realtime twitter streaming to MongoDB. Data synced with Elasticsearch and Kibana. Flask webapp created to search and display tweets.

Home Page: http://twittersearch.tgaspar.com

License: Apache License 2.0

Python 89.96% Dockerfile 10.04%
docker mongodb python docker-compose elasticsearch kibana twitter-api twitter-stream monstache flask

Twitter Stream

Check out a live demo of the search engine webapp, integrated with Elasticsearch, at http://twittersearch.tgaspar.com.

Table of Contents

  1. Features
  2. Project Components
  3. Requirements
  4. Installation
  5. Kibana
    1. Kibana Dashboard
    2. Kibana Search
  6. Search Webapp
    1. Main Page & Dashboard
    2. Search Output

Features

  • Dockerized realtime tweet streaming to MongoDB based on search rules. Tweepy is used to connect to the Twitter API.
  • The MongoDB collection is continuously synced with an Elasticsearch index using Monstache.
  • MongoDB can be queried with Mongo Express, a web-based MongoDB admin interface.
  • Kibana is used to visualize and search tweets.
  • Flask search webapp served by nginx.

Project Components

All components of the project are dockerized. The Streaming Client is initiated by twitter_stream/Dockerfile and the Search Webapp by flask_search/Dockerfile. All remaining containers are created from DockerHub images.
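
As a rough illustration of what the Streaming Client does, the sketch below connects Tweepy's filtered stream to a MongoDB collection. It is not the repository's actual code: the helper `tweet_to_document`, the class `MongoStream`, and the choice to use the tweet id as MongoDB's `_id` are assumptions made for this example. The environment variable names are the ones from the .env file described under Installation.

```python
import os


def tweet_to_document(tweet_data: dict) -> dict:
    """Copy the raw tweet payload and reuse the tweet id as MongoDB's _id.

    Reusing the id is an assumption of this sketch, made so that a tweet
    delivered twice is stored only once.
    """
    doc = dict(tweet_data)
    doc["_id"] = doc["id"]
    return doc


def run_stream() -> None:
    """Stream tweets matching SEARCH_RULE into MongoDB (blocks until stopped)."""
    # Third-party imports are local so the helper above can be used
    # without tweepy or pymongo installed.
    import pymongo
    import tweepy

    client = pymongo.MongoClient(os.environ["MDB_HOST_NAME"])
    database = client[os.environ["MDB_DATABASE_NAME"]]
    collection = database[os.environ["MDB_COLLECTION_NAME"]]

    class MongoStream(tweepy.StreamingClient):
        def on_tweet(self, tweet):
            # tweet.data holds the raw JSON dict returned by the API.
            doc = tweet_to_document(tweet.data)
            collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)

    stream = MongoStream(os.environ["BEARER_TOKEN"])
    stream.add_rules(tweepy.StreamRule(os.environ["SEARCH_RULE"]))
    stream.filter()
```

Calling `run_stream()` inside the streaming container would start the loop. The upsert keeps the write idempotent if the same tweet arrives more than once.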

Requirements

Installation

  1. Clone the repo:
$ git clone https://github.com/tngaspar/twitter-stream-mongo.git
  2. Create a .env file in the project root folder with the following parameters:
API_KEY=[Twitter API key]
API_SECRET_KEY=[Twitter API secret key]
BEARER_TOKEN=[Twitter API bearer token]
MDB_HOST_NAME=mongodb://root:[Password]@mongo:27017/
MDB_DATABASE_NAME=tweetdb
MDB_COLLECTION_NAME=tweets
SEARCH_RULE=[Twitter Filtered Stream rule]
MONGODB_ROOT_PASSWORD=[choose Password]
MONGODB_REPLICA_SET_KEY=[choose ReplicaKey]

Replace all fields between brackets. See the Twitter documentation on Filtered Stream rules for the SEARCH_RULE syntax. By default, lang:en, -is:retweet and -is:reply are applied implicitly, so there is no need to add these operators.
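
Before starting the containers, it can help to confirm that no template placeholder was left in the .env file. The following is a minimal sketch using only the standard library. The function names `parse_env` and `find_problems` are hypothetical, and the parser handles only plain `KEY=value` lines, not quoting or variable expansion.

```python
import re

# Keys listed in the .env template above.
REQUIRED_KEYS = [
    "API_KEY",
    "API_SECRET_KEY",
    "BEARER_TOKEN",
    "MDB_HOST_NAME",
    "MDB_DATABASE_NAME",
    "MDB_COLLECTION_NAME",
    "SEARCH_RULE",
    "MONGODB_ROOT_PASSWORD",
    "MONGODB_REPLICA_SET_KEY",
]

# Match only the template's own placeholders. A generic "[...]" pattern
# would wrongly flag rule operators such as bounding_box:[...].
PLACEHOLDER = re.compile(r"\[(?:Twitter|choose|Password)[^\]]*\]")


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_problems(values: dict) -> list:
    """Return one message per missing or unreplaced setting."""
    problems = []
    for key in REQUIRED_KEYS:
        if not values.get(key):
            problems.append(f"{key}: missing")
        elif PLACEHOLDER.search(values[key]):
            problems.append(f"{key}: placeholder not replaced")
    return problems
```

A typical use would be `find_problems(parse_env(open(".env").read()))`, which returns an empty list when every key is set.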

  3. Add the password to mongo-url in monstache/monstache.config.toml:
mongo-url = "mongodb://root:[Password]@mongo:27017" 

Replace fields between brackets.
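
One detail the steps above do not mention: if the chosen password contains characters such as `@`, `:` or `/`, MongoDB connection strings require them to be percent-encoded. The sketch below builds the URL with the standard library. The function name `mongo_url` is hypothetical, and the defaults (`root`, `mongo`, `27017`) are taken from the values shown above.

```python
from urllib.parse import quote


def mongo_url(password: str, user: str = "root",
              host: str = "mongo", port: int = 27017) -> str:
    """Build a MongoDB URL with percent-encoded credentials."""
    # safe="" also encodes "/", which quote() leaves untouched by default.
    return f"mongodb://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


print(mongo_url("p@ss:w/rd"))
# prints mongodb://root:p%40ss%3Aw%2Frd@mongo:27017
```

The same encoded password would go into both MDB_HOST_NAME and the Monstache mongo-url. A password made only of letters and digits needs no encoding.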

  4. In the project root directory, run docker-compose:
$ docker-compose up -d

After this, all containers should be up and running, and streaming should have started.

If running locally, you can check MongoDB through Mongo Express at localhost:8081 and search gathered tweets in Kibana at localhost:5601. The search webapp should also be up and accessible at 0.0.0.0 and localhost (port 80).
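
A quick way to confirm that the three local endpoints are listening is a plain TCP connection test. This is a minimal sketch: the names `port_open` and `check_services` are hypothetical, and the ports are the ones listed above. An open port only shows that something is accepting connections, not that the service behind it is healthy.

```python
import socket

# Ports taken from the paragraph above.
SERVICES = {"Mongo Express": 8081, "Kibana": 5601, "Search webapp": 80}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, is_up in check_services().items():
        print(f"{name}: {'listening' if is_up else 'not reachable'}")
```

Kibana and Elasticsearch can take a while to start, so a port that is not reachable right after `docker-compose up -d` may simply need more time.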

Kibana

Kibana allows search and analysis of tweet data from Elasticsearch.

Kibana Dashboard:

This dashboard may be imported to Kibana by navigating to Stack Management > Saved Objects > Import and importing the file doc/kibana_dashboard.ndjson.

Kibana Search:

Kibana uses syntax from Apache Lucene to query and filter data. Find out more here.
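
To make the Lucene syntax concrete, the sketch below composes query strings of the kind that can be pasted into the Kibana search bar. The helper names `phrase`, `all_of` and `any_of` are hypothetical, and the field name `text` is an assumption based on the tweet payload; check the index mapping in your own instance.

```python
def phrase(field: str, value: str) -> str:
    """Build a fielded phrase query such as text:"software engineer"."""
    # Inside a quoted phrase, backslash and double quote must be escaped.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def all_of(*clauses: str) -> str:
    """Require every clause to match (Lucene AND)."""
    return " AND ".join(f"({clause})" for clause in clauses)


def any_of(*clauses: str) -> str:
    """Require at least one clause to match (Lucene OR)."""
    return " OR ".join(f"({clause})" for clause in clauses)


print(all_of(phrase("text", "software engineer"), phrase("text", "remote")))
# prints (text:"software engineer") AND (text:"remote")
```

Wrapping each clause in parentheses keeps operator precedence explicit when AND and OR groups are nested.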

Here's a simple example:

Search Webapp

The flask_search webapp provides a user interface to Elasticsearch's search functionality. It acts as a search engine over the records in the index.
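
The kind of request such a webapp might send can be sketched with the standard library alone. This is not the repository's implementation. The base URL `http://localhost:9200` and the index name `tweetdb.tweets` are assumptions (Monstache commonly names the index after the MongoDB `database.collection` namespace, but verify this in your instance), and `build_search_body` and `search` are hypothetical names. The `query_string` query is part of the Elasticsearch Query DSL and accepts Lucene-style syntax.

```python
import json
import urllib.request


def build_search_body(user_query: str, size: int = 10) -> dict:
    """Build an Elasticsearch request body around a query_string query."""
    return {"query": {"query_string": {"query": user_query}}, "size": size}


def search(user_query: str,
           base_url: str = "http://localhost:9200",  # assumed address
           index: str = "tweetdb.tweets") -> list:   # assumed index name
    """POST the query to the _search endpoint and return the matching documents."""
    request = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(build_search_body(user_query)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        payload = json.load(response)
    # Each hit carries the indexed document under "_source".
    return [hit["_source"] for hit in payload["hits"]["hits"]]
```

With the stack running and Elasticsearch reachable at the assumed address, `search("data engineer")` would return up to ten matching tweet documents.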

Main Page & Dashboard:

The main page shows the search bar and a snapshot of the Kibana Dashboard.

Search Output:

Search example with tweets gathered using software engineer, data, jobs and other related keywords as the streaming search rule.
