Twitter Toolbox

Welcome to the Twitter Toolbox, a suite of tools for acquiring, preprocessing, and analyzing data from Twitter. The project was built in response to the recent changes in Twitter's API and front end, which left several existing libraries unmaintained or out of date. It aims to give data analysts, researchers, marketers, and developers a data extraction workflow that still works.

Features

The Twitter Toolbox offers a broad spectrum of functionalities, including:

Data Acquisition: Extract data from Twitter by streaming or scraping real-time data, making API calls, and hydrating or dehydrating tweets.

Preprocessing: Refine your dataset with data cleaning, language filtering, data labeling, and group generation so that later analyses are more accurate and reliable.

Natural Language Processing (NLP): Analyze the content of tweets with sentiment analysis, emotion analysis, topic analysis, and named entity recognition.

Each of these capabilities is designed to help you make the most out of Twitter data, whether you're exploring public sentiment, detecting emotional trends, identifying key themes, or recognizing named entities such as organizations or individuals.
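To give a feel for what the data cleaning step can involve, here is a minimal sketch in Python. It is not taken from the toolbox itself: the function name `clean_tweet` and the choice to drop URLs and @mentions are assumptions made for illustration only.

```python
import re

# Hypothetical cleaning rules; the toolbox's own rules may differ.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip URLs and @mentions, then collapse runs of whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()
```

Hashtags are left in place in this sketch, since they often carry topical information that later analysis steps can use.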

Articles

I have written a series of articles to explain how to use the Twitter Toolbox. You can find them here:

Data Acquisition

Collect data from Twitter using scraping, streaming, and the Twitter API.

Learn more about the data collection here.

Preprocessing

In progress...

NLP

In progress...

Future Developments

The Twitter Toolbox is an evolving project. We plan to continue adding new features as they are developed. Stay tuned for regular updates and improvements!

Contributions and Feedback

This toolbox is designed to grow with the contributions and feedback from the community. You are welcome to suggest new features, report any issues, or even submit pull requests. Let's collaborate to create the most valuable Twitter Toolbox possible!

Disclaimer

Please note that any use of the Twitter API, and of data retrieved through this toolbox, should comply with the Twitter Terms of Service, Developer Agreement, and Developer Policy, including Twitter's privacy policy. To support this, the project includes a dehydration script that reduces a dataset to its tweet_id values, so that only the IDs are shared. Always de-identify the information and respect user privacy when sharing or publishing data.
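The idea behind dehydration can be sketched in a few lines of Python. This is an illustrative sketch, not the project's actual script: the function name `dehydrate_csv` is hypothetical, and it assumes the input CSV has a header row with a `tweet_id` column.

```python
import csv
import io


def dehydrate_csv(src, dst, id_column="tweet_id"):
    """Copy only the ID column from one CSV stream to another.

    `src` and `dst` are open text file objects. Returns the number of
    tweet IDs written. The column name is an assumption.
    """
    reader = csv.DictReader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow([id_column])
    count = 0
    for row in reader:
        # Every field except the ID (text, user, ...) is discarded.
        writer.writerow([row[id_column]])
        count += 1
    return count


# Usage with in-memory data; real files opened with open() work the same way.
source = io.StringIO("tweet_id,text,user\n1,hello,alice\n2,world,bob\n")
target = io.StringIO()
written = dehydrate_csv(source, target)
```

Anyone who receives the dehydrated file can rehydrate the IDs through the Twitter API, subject to their own access level and to the policies above.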

Structure

The project is structured as follows:

├── data (not stored in the repository)
├── src
│   ├── dataAcquisition
│   ├── preprocessing
│   └── nlp
└── docs

Data is stored in the following structure:

data
├── <scraping> (scraped by user, hashtag, or keyword)
│   ├── <user>
│   │   ├── <user>_<start>_<end>.csv
│   │   ├── <user>_<start>_<end>.csv
│   │   └── ...
│   ├── <user>
│   │   ├── <user>_<start>_<end>.csv
│   │   ├── <user>_<start>_<end>.csv
│   │   └── ...
│   └── ...
├── <sample-stream> (1% sample stream of tweets)
│   ├── <date>.csv
│   ├── <date>.csv
│   └── ...
└── <covid-github> (scraped from GitHub and rehydrated)
    ├── <date>.csv
    ├── <date>.csv
    └── ...
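The naming scheme above can be expressed as a pair of small path helpers. This is a sketch under assumptions rather than code from the repository: the function names are hypothetical, and the date format used for `<start>`, `<end>`, and `<date>` is not specified in this README, so the ISO dates below are only an example.

```python
from pathlib import Path


def scraping_path(data_root, source, user, start, end):
    """Build <data_root>/<source>/<user>/<user>_<start>_<end>.csv."""
    return Path(data_root) / source / user / f"{user}_{start}_{end}.csv"


def dated_path(data_root, source, date):
    """Build <data_root>/<source>/<date>.csv for stream or rehydrated data."""
    return Path(data_root) / source / f"{date}.csv"


# Example values are illustrative only.
user_file = scraping_path("data", "scraping", "alice", "2023-01-01", "2023-01-31")
stream_file = dated_path("data", "sample-stream", "2023-01-01")
```

Keeping the path logic in one place means the scraping, streaming, and rehydration steps all write files where later steps expect to find them.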

Contributors

sferez
