MN 2012 General Election Results

Scraper for the 2012 general election results in Minnesota. The main results page and the results download page can be found here:

http://electionresults.sos.state.mn.us/enr/ENR/Home/1
http://electionresults.sos.state.mn.us/ENR/Select/Download/1

Scraper on Scraperwiki

https://box.scraperwiki.com/zzolo/mn-2012-election-results

Local setup

Create and activate a virtualenv, then install the local requirements:

pip install -r requirements_local.txt

requirements_local.txt is used locally to get around some bugs in the scraperwiki libraries.

Independent Deployment

On election night the scraper needs to run at least once every 10 minutes. Hosted on ScraperWiki it was not fast enough, and there were some issues with SQLite performance, so the scraper can also be deployed on its own server.

These instructions were carried out on EC2's quick-launch Ubuntu 12 install.

Libraries and prerequisites

sudo apt-get install git-core git python-pip python-dev build-essential python-xml sqlite3 nginx fcgiwrap 
sudo pip install --upgrade pip 
sudo pip install --upgrade virtualenv 

Install codebase

We assume this is the only thing running on the server, so these steps do not use a virtualenv, but feel free to use one. All relative paths below are relative to the repo directory.

git clone https://github.com/MinnPost/minnpost-scraper-2012-general-elections.git
cd minnpost-scraper-2012-general-elections
sudo pip install -r requirements_local.txt

Setup webserver/API

sudo git clone https://github.com/zzolo/dumptruck-web.git /var/www/dumptruck-web
sudo chown -R www-data:www-data /var/www/dumptruck-web

Configure fcgiwrap to use more children. First check whether /etc/default/fcgiwrap already exists; if it does, just copy the repo's version over it.

ls /etc/default/fcgiwrap
sudo cp deploy/fcgiwrap /etc/default/fcgiwrap
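
The check-then-copy step above overwrites whatever is already in /etc/default/fcgiwrap. A minimal sketch of a safer variant that keeps a backup first, assuming a POSIX shell; the function name install_with_backup and the .bak suffix are illustrative and not part of the repository:

```shell
# Hypothetical helper (not part of the repo): copy SRC over DEST,
# keeping a backup of DEST first if it already exists.
install_with_backup() {
  src="$1"
  dest="$2"
  if [ -e "$dest" ]; then
    # Preserve the existing file so it can be restored later.
    cp -p "$dest" "$dest.bak"
  fi
  cp "$src" "$dest"
}

# On the server this would need root privileges, for example from a root shell:
#   install_with_backup deploy/fcgiwrap /etc/default/fcgiwrap
```

If the new settings cause problems, the packaged default can be restored by copying the .bak file back and restarting fcgiwrap.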

Configure nginx.

sudo cp deploy/nginx-scraper-api /etc/nginx/sites-available/nginx-scraper-api
sudo ln -s /etc/nginx/sites-available/nginx-scraper-api /etc/nginx/sites-enabled/nginx-scraper-api
sudo rm /etc/nginx/sites-enabled/default

Restart services.

sudo service fcgiwrap restart
sudo service nginx restart

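API setup. If for some reason you need a publish token, update scraperwiki.json as needed. Note that ln -s creates each link in the current directory, so the two link commands need to be run from the directory the web API reads from (presumably the dumptruck-web checkout), not from the repo directory, where scraperwiki.json already exists.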

echo "{ \"database\": \"scraperwiki.sqlite\" }" > scraperwiki.json
ln -s /home/ubuntu/minnpost-scraper-2012-general-elections/scraperwiki.json scraperwiki.json
ln -s /home/ubuntu/minnpost-scraper-2012-general-elections/scraperwiki.sqlite scraperwiki.sqlite
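
The config and link steps above can also be wrapped so that the target directory is explicit. This is only a sketch, assuming a POSIX shell with an ln that supports -n (GNU and BSD do); the function name link_api_files is hypothetical, and the directory the links belong in is not stated here, so it is passed as a parameter:

```shell
# Hypothetical helper (not part of the repo): write the one-key config in
# REPO_DIR and symlink it, together with the SQLite database, into WEB_DIR.
link_api_files() {
  repo_dir="$1"
  web_dir="$2"
  # Single quotes avoid the backslash escaping needed with echo "...".
  printf '%s\n' '{ "database": "scraperwiki.sqlite" }' > "$repo_dir/scraperwiki.json"
  # -f replaces an existing link at the destination; -n avoids following
  # a destination that is itself a symlink to a directory.
  ln -sfn "$repo_dir/scraperwiki.json" "$web_dir/scraperwiki.json"
  ln -sfn "$repo_dir/scraperwiki.sqlite" "$web_dir/scraperwiki.sqlite"
}

# Example call (the second argument is an assumption about where the API looks):
#   link_api_files /home/ubuntu/minnpost-scraper-2012-general-elections /var/www/dumptruck-web
```

The database link may dangle until the scraper has run once and created scraperwiki.sqlite; that is expected.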

Cron

Install the scheduled jobs. Note that this replaces the current user's existing crontab.

crontab deploy/crontab
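
Because the scraper is scheduled at short intervals, a run that takes longer than the interval could overlap with the next one and compete for the SQLite file. A minimal sketch of one way to guard against that, assuming a POSIX shell; run_exclusive, the lock path, and the wrapped command are hypothetical and are not taken from deploy/crontab:

```shell
# Hypothetical wrapper (not part of the repo): run a command only if no
# previous run still holds the lock directory.
run_exclusive() {
  lock_dir="$1"
  shift
  # mkdir is atomic: it fails if the directory already exists.
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "previous run still active, skipping" >&2
    return 0
  fi
  status=0
  "$@" || status=$?
  # Release the lock whether or not the command succeeded.
  rmdir "$lock_dir"
  return "$status"
}

# Example (the lock path and command are placeholders):
#   run_exclusive /tmp/scraper.lock echo "scrape would run here"
```

A script called from cron could source this function so that a slow run is skipped rather than stacked on top of the previous one. If a run is killed before it releases the lock, the lock directory has to be removed by hand.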
