
tweep


Tweep is an advanced Twitter scraping tool written in Python that allows for scraping Tweets from Twitter profiles without using Twitter's API.

Tweep uses Twitter's search operators to let you scrape Tweets from specific users, scrape Tweets relating to certain topics, hashtags, and trends, or pick out sensitive information from Tweets, such as e-mail addresses and phone numbers. The operators can also be combined for more targeted searches.
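To illustrate the search-operator idea, here is a minimal sketch of how a query string could be assembled from the kinds of inputs Tweep accepts. The function name `build_search_query` and its parameters are hypothetical and are not taken from Tweep's source; `from:`, `since:`, and `until:` are standard Twitter search operators, but the way Tweep builds its queries internally may differ.

```python
def build_search_query(username=None, phrase=None, since=None, until=None):
    """Combine Twitter search operators into one query string (illustrative only)."""
    parts = []
    if username:
        # from:<user> restricts results to Tweets sent by that account.
        parts.append("from:{}".format(username.lstrip("@")))
    if phrase:
        # Quote multi-word phrases so they are matched as an exact phrase.
        parts.append('"{}"'.format(phrase) if " " in phrase else phrase)
    if since:
        # since:/until: take dates in YYYY-MM-DD form.
        parts.append("since:{}".format(since))
    if until:
        parts.append("until:{}".format(until))
    return " ".join(parts)


if __name__ == "__main__":
    print(build_search_query(username="username", phrase="pineapple", since="2015-12-20"))
```

Keeping the operators in a list and joining them at the end makes it easy to add further filters without touching the existing ones.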

tl;dr Benefits

Some of the benefits of using Tweep over Twitter's API:

  • Can fetch almost all Tweets (Twitter's API is limited to the most recent 3,200 Tweets)
  • Fast initial setup
  • Can be used anonymously and without sign up
  • No rate limitations

Requirements

  • Python 3.5/3.6
  • pip3 install -r requirements.txt

Usage

  • -u The user's Tweets you want to scrape.
  • -s Search for Tweets containing this word or phrase.
  • -g Retrieve Tweets by geolocation. The argument format is lat,lon,range, for example 48.01009,36.09876,0.5km. The range is given in kilometres and the km suffix must be included.
  • -o Save output to a file.
  • --year Filter Tweets before the specified year.
  • --since Filter Tweets sent since the specified date (format: YYYY-MM-DD).
  • --fruit Display Tweets with "low-hanging-fruit".
  • --tweets Display Tweets only.
  • --verified Display Tweets only from verified users (Use with -s).
  • --users Display users only (Use with -s).
  • --csv Write as a .csv file.
  • --hashtags Extract hashtags.
  • --userid Search from Twitter user's ID.
  • --limit Number of Tweets to pull (Increments of 20).
  • --count Display the number of Tweets scraped at the end of the session.
  • --stats Show number of replies, retweets, and likes.
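The `-g` argument format described above (lat,lon,range with a mandatory km suffix) can be checked before a scrape is started. The following is a small sketch of such a check; `parse_geo` and `GeoFilter` are hypothetical names introduced for this example and are not part of Tweep, and the bounds checks are an assumption about what counts as valid input.

```python
from collections import namedtuple

# Hypothetical container for a parsed -g argument.
GeoFilter = namedtuple("GeoFilter", ["lat", "lon", "radius_km"])


def parse_geo(arg):
    """Parse 'lat,lon,<range>km' into a GeoFilter; raise ValueError if malformed."""
    parts = [p.strip() for p in arg.split(",")]
    if len(parts) != 3:
        raise ValueError("expected lat,lon,range but got: {!r}".format(arg))
    lat_s, lon_s, radius_s = parts
    if not radius_s.endswith("km"):
        # The usage notes say the km suffix has to be included.
        raise ValueError("range must end in 'km', e.g. 0.5km")
    lat, lon = float(lat_s), float(lon_s)
    radius = float(radius_s[:-2])
    if not -90 <= lat <= 90 or not -180 <= lon <= 180:
        raise ValueError("latitude or longitude out of range")
    if radius <= 0:
        raise ValueError("range must be positive")
    return GeoFilter(lat, lon, radius)


if __name__ == "__main__":
    print(parse_geo("48.880048,2.385939,1km"))
```

Validating the argument up front gives a clear error for a missing km suffix instead of a failed or empty search later on.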

Low-Hanging Fruit

The --fruit feature will display Tweets that might contain sensitive info such as:

  • Profiles from leaked databases (Myspace or LastFM)
  • Email addresses
  • Phone numbers
  • Keybase.io profiles
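As a rough illustration of how Tweet text could be screened for some of the categories above, the sketch below uses regular expressions. The patterns and the `find_fruit` helper are assumptions made for this example, not Tweep's actual matching rules, and they will produce false positives (for instance, any long run of digits looks like a phone number).

```python
import re

# Illustrative patterns only; Tweep's real matching rules may differ.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "keybase": re.compile(r"keybase\.io/[A-Za-z0-9_]+", re.IGNORECASE),
}


def find_fruit(text):
    """Return a dict mapping each category to the matches found in text."""
    found = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found


if __name__ == "__main__":
    print(find_fruit("mail me at bob@example.com or see keybase.io/bob"))
```

A Tweet with no matches yields an empty dict, so a caller can simply skip falsy results.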

Basic Examples and Combos

A few simple examples to help you understand the basics:

  • python3 tweep.py -u username - Scrape all the Tweets from the user's timeline.
  • python3 tweep.py -u username -s pineapple - Scrape all Tweets from the user's timeline containing pineapple.
  • python3 tweep.py -s pineapple - Collect every Tweet containing pineapple from everyone's Tweets.
  • python3 tweep.py -u username --year 2014 - Collect Tweets that were tweeted before 2014.
  • python3 tweep.py -u username --since 2015-12-20 - Collect Tweets that were tweeted since 2015-12-20.
  • python3 tweep.py -u username -o file.txt - Scrape Tweets and save to file.txt.
  • python3 tweep.py -u username -o file.csv --csv - Scrape Tweets and save as a csv file.
  • python3 tweep.py -u username --fruit - Show Tweets with low-hanging fruit.
  • python3 tweep.py -s "Donald Trump" --verified --users - List verified users that Tweet about Donald Trump.
  • python3 tweep.py -g 48.880048,2.385939,1km -o file.csv --csv - Scrape Tweets within a 1 km radius of a place in Paris and export them to a csv file.

Example String

955511208597184512 2018-01-22 18:43:19 GMT <now> pineapples are the best fruit
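Output saved with `-o` can be post-processed by splitting each line into its fields. The sketch below assumes every line follows the layout of the example string above (Tweet ID, date, time, timezone, username in angle brackets, then the text); `parse_line` and `Tweet` are hypothetical names for this example, and Tweets containing line breaks would need extra handling.

```python
import re
from collections import namedtuple

# Hypothetical record type for one line of Tweep output.
Tweet = namedtuple("Tweet", ["tweet_id", "date", "time", "timezone", "username", "text"])

LINE_RE = re.compile(
    r"^(?P<tweet_id>\d+) "
    r"(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<timezone>\S+) "
    r"<(?P<username>[^>]+)> "
    r"(?P<text>.*)$"
)


def parse_line(line):
    """Return a Tweet for a line in the example format, or None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    return Tweet(**match.groupdict())


if __name__ == "__main__":
    sample = "955511208597184512 2018-01-22 18:43:19 GMT <now> pineapples are the best fruit"
    print(parse_line(sample))
```

All fields are kept as strings; the ID in particular is best left unconverted if it is only used as a key.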

Changelog

2/21/18

  • Added new features:
    • --userid feature allowing a user to search Tweets from a Twitter user's user-id.
    • --limit feature allowing a user to specify how many Tweets get scraped (increments of 20).
    • --count feature to display the total number of Tweets collected at the end of a Tweep session.
    • --stats feature to display the number of replies, retweets, and likes.
    • -g feature to scrape Tweets within a radius of a GPS location.
  • Fixed:
    • Error handling - Moved to a separate function and better organized.

1/21/18

  • Added:
    • Updated to Python 3 and rewritten using asyncio. Fetching Tweets should be much faster as a result.
    • Output can be saved.
    • Replies are now visible in the scrapes.
  • Removed:
    • Pics feature. I'll re-add this at a later date.

Contact

Shout me out on Twitter: @now

Contributors

haccer, hpiedcoq, sshinol
