Chirps

Predicate Paraphrases From Twitter

This is the code used in the paper:

"Acquiring Predicate Paraphrases from News Tweets"
Vered Shwartz, Gabriel Stanovsky and Ido Dagan. *SEM 2017.


The steps performed to create the resource:

We run the script get_daily_news_stream.sh, which performs the whole job automatically. If you want a detailed step-by-step explanation:

  1. Obtain news tweets:
    Query the Twitter Search API for news:

    get_news_tweets_stream.py --consumer_key=<consumer_key> --consumer_secret=<consumer_secret>
         --access_token=<access_token> --access_token_secret=<access_token_secret> [--until=<until>]
    

    where consumer_key, consumer_secret, access_token and access_token_secret are obtained by registering an app with the Twitter API. The optional argument until is a date in the format YYYY/MM/dd; set it to retrieve only tweets created up to that date. The Search API only goes back up to one week. If this argument is not specified, the script retrieves current tweets. The script saves the tweets in a file named after the date they were created.

    Important note: we downloaded TwitterSearch and changed the code to add the news filter to the search URL. If you want to get news tweets, you should do the same.

  2. Extract propositions:

    prop_extraction --in=[tweet_folder] --out=[prop_folder]
    

    Note: You can also install our proposition extraction as a stand-alone tool.

  3. Generate positive instances:

    get_corefering_predicates.py [tweets_file] [out_file]
    
  4. Package the resource:

    cat news_stream/positive/* | cut -f1,2,4,5,6,7,8,10,11,12,13,14 > resource
    python -u package_resource.py resource [repository_dir]
    

    where news_stream/positive/ is the directory holding all the positive-instance files. cut removes the tweets, to comply with Twitter policy. package_resource.py updates the resource file under [repository_dir]\resource and pushes the changes.
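For readers who want to do the column filtering of step 4 in Python rather than with cut, here is a minimal sketch. The kept field numbers are copied from the cut command above. The function name drop_tweet_columns is hypothetical and not part of this repository, and reading the dropped fields (3 and 9) as the tweet text is an inference from the description, not something the code checks.

```python
# Field numbers kept by the cut command in step 4 (1-indexed, as in cut -f).
KEPT_FIELDS = (1, 2, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14)


def drop_tweet_columns(line, kept=KEPT_FIELDS):
    """Return a tab-separated line reduced to the kept fields.

    Hypothetical helper: it mirrors `cut -f1,2,4,...` on one line.
    """
    fields = line.rstrip("\n").split("\t")
    # Like cut, skip field numbers that lie beyond the end of a short line.
    return "\t".join(fields[i - 1] for i in kept if i <= len(fields))


# Example with a made-up 14-field row: fields 3 and 9 are dropped.
row = "\t".join("f%d" % i for i in range(1, 15))
print(drop_tweet_columns(row))
```

The sketch only reproduces the cut part of the pipeline; packaging and pushing the resource is still done by package_resource.py.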

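Returning to step 1: since the Search API only goes back about one week, it can help to check the until value before starting a run. The sketch below is an illustration only. The function name parse_until and the constant SEARCH_WINDOW are hypothetical and not part of get_news_tweets_stream.py, and it assumes that YYYY/MM/dd means a four-digit year, a two-digit month and a two-digit day.

```python
from datetime import date, datetime, timedelta

# Assumption: "up to one week ago" is taken as exactly 7 days.
SEARCH_WINDOW = timedelta(days=7)


def parse_until(value, today=None):
    """Parse a YYYY/MM/dd string and reject dates older than the search window.

    Hypothetical helper, not part of the repository's scripts.
    """
    today = today or date.today()
    # strptime raises ValueError if the string does not match the format.
    until = datetime.strptime(value, "%Y/%m/%d").date()
    if today - until > SEARCH_WINDOW:
        raise ValueError(
            "until=%s is more than %d days before %s"
            % (value, SEARCH_WINDOW.days, today.isoformat())
        )
    return until


# Example: a date two days before the reference day is accepted.
print(parse_until("2017/05/03", today=date(2017, 5, 5)))
```

Passing today explicitly keeps the check reproducible; when it is omitted, the current date is used.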