Emissary

An intelligence utility / test for researchers, programmers and generally carnivorous primates who want personally curated news archives. Emissary is a web content extractor that has a RESTful API and the ability to run pre-store scripts. Emissary stores the full text of linked articles from RSS feeds or URLs containing links.

Documentation lives here.



Installation requires the Python interpreter headers, plus the libevent, libxml2 and libxslt headers.
Optional article compression requires libsnappy.
All of these can be obtained on Debian-based systems with:
sudo apt-get install -y zlib1g-dev libxml2-dev libxslt1-dev python-dev libevent-dev libsnappy-dev

You're then ready to install the package for all users:
sudo python setup.py install


 Usage: python -m emissary.run 

  -h, --help            show this help message and exit
  -c, --crontab         Crontab to parse
  --config              (defaults to emissary.config)
  -a, --address         (defaults to 0.0.0.0)
  -p, --port            (defaults to 6362)
  --export              Write the existing database as a crontab
  --key                 SSL key file
  --cert                SSL certificate
  --pidfile             (defaults to ./emissary.pid)
  --logfile             (defaults to ./emissary.log)
  --stop                
  --debug               Log to stdout
  -d                    Run in the background
  --run-as              (defaults to the invoking user)
  --scripts-dir         (defaults to ./scripts/)


Some initial setup has to be done before the system will start.
Communication with Emissary is mainly done over HTTPS connections
and for that you're going to need an SSL certificate and a key:

user@host $ openssl genrsa 4096 > key
user@host $ openssl req -new -x509 -nodes -sha256 -days 365 -key key > cert

To prevent your API keys ever getting put into version control for all
the world to see, you need to put a database URI into the environment:

export EMISSARY_DATABASE="sqlite://///home/you/.emissary.db"

Protip: Put that last line in your shell's rc file.
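
As a rough illustration of why the URI lives in the environment, the sketch below shows how a Python program can pick it up at start-up without the value ever appearing in a tracked file. The function name `database_uri` is hypothetical and is not taken from Emissary's source; only the variable name EMISSARY_DATABASE comes from the text above.

```python
import os


def database_uri(environ=None):
    """Return the database URI stored in EMISSARY_DATABASE.

    `environ` defaults to the process environment; passing a plain dict
    makes the function easy to exercise without touching real settings.
    This helper is illustrative only, not Emissary's own code.
    """
    environ = os.environ if environ is None else environ
    uri = environ.get("EMISSARY_DATABASE", "").strip()
    if not uri:
        # Fail early with a clear message instead of falling back to a default.
        raise RuntimeError("EMISSARY_DATABASE is not set; export it first")
    return uri


# Example (assumes the export line above has been run in this shell):
#   print(database_uri())
```

Failing loudly when the variable is missing is a design choice of this sketch; a silent default would make it easy to write to the wrong database.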

Start an instance in the foreground to obtain your first API key:

user@host $ python -m emissary.run --cert cert --key key
14/06/2015 16:31:30 - Emissary - INFO - Starting Emissary 2.0.0.
e5a59e0a-b457-45c6-9d30-d983419c43e1
^That UUID is your Primary API key. Add it to this example crontab:

user@host $ cat feeds.txt
apikey: your-api-key-here

# url                                                 name            group            minute  hour    day     month   weekday
http://news.ycombinator.com/rss                       "HN"            "HN"             */15    *       *       *       *
http://phys.org/rss-feed/                             "Phys.org"      "Phys.org"       1       12      *       *       *
http://feeds.nature.com/news/rss/most_recent          "Nature"        "Nature"         30      13      *       *       *

user@host $ python -m emissary.run -c feeds.txt
Using API key "Primary".
Primary: Creating feed group HN.
Primary: HN: Creating feed "HN"

Emissary supports multiple apikey directives in one crontab.
Subsequent feed definitions are associated with the previous key.
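
To make the "previous key" rule concrete, here is a minimal sketch of how a crontab in the format shown above could be read. It is not Emissary's parser: the function name `parse_crontab` and the dictionary layout are assumptions made for illustration, and the eight-field layout (url, name, group, then five schedule fields) is inferred from the example header row.

```python
import shlex


def parse_crontab(text):
    """Group feed definitions under the apikey directive that precedes them.

    Illustrative only; returns {api_key: [feed_dict, ...]}.
    """
    feeds_by_key = {}
    current_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith("apikey:"):
            current_key = line.split(":", 1)[1].strip()
            feeds_by_key.setdefault(current_key, [])
            continue
        if current_key is None:
            raise ValueError("feed defined before any apikey directive")
        # shlex.split strips the double quotes around name and group.
        fields = shlex.split(line)
        if len(fields) != 8:
            raise ValueError("expected 8 fields, got %d: %r" % (len(fields), line))
        url, name, group = fields[:3]
        feeds_by_key[current_key].append({
            "url": url,
            "name": name,
            "group": group,
            # minute, hour, day, month, weekday
            "schedule": tuple(fields[3:]),
        })
    return feeds_by_key
```

Called on a file with two apikey directives, the result has two entries, and each feed line lands in the list of whichever key appeared most recently above it.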

Start an instance in the background and connect to it:
user@host $ python -m emissary.run -d --cert cert --key key
user@host $ python -m emissary.repl
Emissary 2.0.0
Psybernetics 2015

(3,204) > help

If the prospect of creating an NSA profile of your reading habits is something that rightfully bothers you, then my advice is to subscribe to many things and use Emissary to read the ones that really interest you.


emissary's Issues

When it is time to run, it crashes

pascal@MBP:~/Emissary $ cat feeds.txt

apikey: mykey

# url                                                 name            group            minute  hour    day     month   weekday
http://news.ycombinator.com/rss                       "HN"            "HN"             15!     *       *       *       *
http://phys.org/rss-feed/                             "Phys.org"      "Phys.org"       1       12      *       *       *
http://feeds.nature.com/news/rss/most_recent          "Nature"        "Nature"         30      13      *       *       *

Then ran

python -m emissary.run -c feeds.txt

And

python -m emissary.run -d --cert cert --key key

without errors.

When it was time to run it produced:

 "Python quit unexpectedly while using the greenlet.so plug-in."

I tried relaunching within the same minute, and it immediately hit the same error again.

Python 2.7.10
greenlet.version == '0.4.9'
