russtat: Python / PostgreSQL access to the Russian Federal statistics

russtat uses Python to download and process the massive public data releases published by the Russian Statistics Office on the EMISS website. The original XML-formatted datasets (close to 7,000 in total) are parsed and saved as JSON files, which can then be loaded into a ready-made PostgreSQL database for querying and further analysis. Alternatively, you can use the JSON files with your own software, in any way you like!
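
To make the XML-to-JSON step more concrete, here is a minimal sketch using only the Python standard library. It is not russtat's actual code: the XML layout (the `dataset`, `name`, `unit`, `agency` and `obs` elements) and the helper names `to_float`, `parse_dataset` and `to_json` are assumptions made up for illustration, and the real EMISS schema is considerably richer.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified dataset; NOT the real EMISS schema.
SAMPLE_XML = """
<dataset>
  <name>  Population  </name>
  <unit>thousand persons</unit>
  <obs period="2019" value="146780.7"/>
  <obs period="2020" value="n/a"/>
  <obs period="2021"/>
</dataset>
"""


def to_float(text, default=None):
    """Convert text to float; return `default` if it is missing or malformed."""
    try:
        return float(text.replace(",", ".").strip())
    except (AttributeError, ValueError):
        return default


def parse_dataset(xml_text):
    """Parse one dataset into a plain dict without raising on bad input."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return {"error": str(err)}
    return {
        # Missing elements fall back to defaults; strings are trimmed.
        "name": (root.findtext("name") or "").strip(),
        "unit": (root.findtext("unit") or "").strip(),
        "agency": (root.findtext("agency") or "").strip() or "unknown",
        "observations": [
            {"period": obs.get("period", ""), "value": to_float(obs.get("value"))}
            for obs in root.iter("obs")
        ],
    }


def to_json(dataset):
    """Serialize a parsed dataset; keep non-ASCII (e.g. Cyrillic) text readable."""
    return json.dumps(dataset, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    print(to_json(parse_dataset(SAMPLE_XML)))
```

The resulting JSON text can be written to a file and consumed by any other tool, independently of the PostgreSQL database.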

Download

The russtat source code and documentation are hosted on GitHub.

Features:

  • parallel non-blocking data retrieval and processing using the multiprocessing library
  • fool-proof XML parsing with default values for missing data, string trimming, type conversions and exception handling
  • ready-made SQL script to create the PostgreSQL database from scratch
  • database routine handling
  • extensive API documentation with Doxygen
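
The parallel processing idea from the list above can be sketched with `multiprocessing.Pool`. This is an illustration under stated assumptions, not the project's implementation: `process_dataset` and `process_all` are hypothetical names, and the worker body is a placeholder for the real "download and parse one dataset" step. Each worker catches its own exceptions, so one bad dataset does not abort the whole run.

```python
import multiprocessing as mp


def process_dataset(dataset_id):
    """Worker: placeholder for downloading and parsing a single dataset.

    Returns a (dataset_id, result, error) tuple instead of raising, so that
    failures are reported to the parent process rather than crashing the pool.
    """
    try:
        if dataset_id < 0:
            raise ValueError("invalid dataset id")
        # Hypothetical result; real code would fetch and parse XML here.
        result = {"id": dataset_id, "title": f"dataset {dataset_id}"}
        return dataset_id, result, None
    except Exception as err:
        return dataset_id, None, str(err)


def process_all(dataset_ids, workers=4):
    """Process datasets in parallel, collecting results as workers finish."""
    with mp.Pool(processes=workers) as pool:
        # imap_unordered yields each result as soon as it is ready,
        # regardless of submission order.
        results = list(pool.imap_unordered(process_dataset, dataset_ids))
    ok = {i: res for i, res, err in results if err is None}
    failed = {i: err for i, res, err in results if err is not None}
    return ok, failed


if __name__ == "__main__":
    ok, failed = process_all([1, 2, 3, -1])
    print("processed:", sorted(ok))
    print("failed:", failed)
```

The `if __name__ == "__main__":` guard matters on platforms that start worker processes by re-importing the main module (Windows, and macOS by default).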

License and Copyrights

All source data are taken from the EMISS website at https://fedstat.ru/. No data is modified by this application.

The EMISS website publishes all materials under the Creative Commons Attribution 3.0 license. The copyright owners are the Russian Federal State Statistics Service (ROSSTAT) and the Russian Ministry of Digital Development, Communications and Mass Media (MINKOMSV'AZ').

Installation

Requirements

You must have the following applications / packages installed on your system:

  • Python 3.7+ (the app was written and tested with Python 3.8)
  • Python packages:
    • pip
    • psycopg2
  • Git (pre-installed on most modern Linux and Mac systems; otherwise install it from the Git website)
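
If you want to verify the Python-side requirements before going further, a small check along these lines may help. It is only a sketch: `check_requirements` is a hypothetical helper that is not part of russtat, and it only looks at the interpreter version and whether the listed packages can be found (it does not check Git).

```python
import importlib.util
import sys


def check_requirements(min_version=(3, 7), packages=("pip", "psycopg2")):
    """Return a list of human-readable problems (empty if everything is fine)."""
    problems = []
    if sys.version_info < min_version:
        problems.append(
            "Python %d.%d+ required, found %d.%d"
            % (min_version[0], min_version[1],
               sys.version_info[0], sys.version_info[1])
        )
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems


if __name__ == "__main__":
    issues = check_requirements()
    print("\n".join(issues) if issues else "All requirements satisfied.")
```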

1. Clone repo

To get the latest version from GitHub, run:

git clone https://github.com/S0mbre/russtat .

2. Install the required packages

I recommend (as many do) installing packages into a Python virtual environment, using either virtualenv or the built-in venv module.

To create a new virtual environment (assuming your projects root folder is 'myprojects'):

Linux / Mac

cd myprojects
python3 -m venv russtat
cd russtat
. ./bin/activate

Windows

cd myprojects
python -m venv russtat
cd russtat
Scripts\activate.bat

This step is, of course, optional. You can skip it if you don't want to use virtual environments for some reason or other.

Then just run:

cd russtat
python -m pip install -r requirements.txt

If you're using a virtual environment, you can leave it after closing the app by running deactivate.

Usage

Run python russtat.py to start the application. Please modify the main() function in that file first to suit your purpose!

See the documentation in russtat/doc/ref to find out more!
