pull_facebook_data_for_good

Imitate an API for downloading data from Facebook Data for Good.

This library uses Selenium WebDriver to imitate the behaviour of an API for downloading the full time series of a data collection.

This library is developed and tested in Python 3.8.

Disclaimer: This download routine will only work for those with access to the Facebook Geoinsights platform, and will only function for datasets to which the user has been granted access. This tool is not developed by or associated with Facebook. It is simply a utility to automate downloading data from the Geoinsights platform.

Installation

From a clone:

To develop this project locally, clone it onto your machine:

git clone https://github.com/hamishgibbs/pull_facebook_data_for_good.git

Enter the project directory:

cd pull_facebook_data_for_good

Install the package with:

pip install .

From GitHub:

To install the package directly from GitHub run:

pip install git+https://github.com/hamishgibbs/pull_facebook_data_for_good.git

Usage

Currently functional for TileMovement datasets only.

Use the CLI from the directory where you would like data to be downloaded:

cd path/to/downloaded/data

The CLI follows the format:

pull_fb --dataset_name --area

For example, to pull the TileMovement dataset for Britain:

pull_fb --dataset_name TileMovement --area Britain

or:

pull_fb --d TileMovement --a Britain

The area name must exactly match the name stored in the .config file. For multi-word names, the words are separated by '_', e.g. New_Zealand.
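The naming rule and the CLI format above can be sketched from Python as follows. This is a minimal illustration, not part of the package: `to_config_area_name`, `build_pull_fb_command` and `run_pull_fb` are hypothetical helpers, and the sketch assumes that `pull_fb` is installed and on your PATH.

```python
import subprocess
from typing import List


def to_config_area_name(area: str) -> str:
    """Join the words of an area name with '_' (hypothetical helper)."""
    return "_".join(area.split())


def build_pull_fb_command(dataset_name: str, area: str) -> List[str]:
    """Build the argument list for the documented `pull_fb` CLI format."""
    return [
        "pull_fb",
        "--dataset_name", dataset_name,
        "--area", to_config_area_name(area),
    ]


def run_pull_fb(dataset_name: str, area: str, download_dir: str) -> None:
    """Run pull_fb with download_dir as the working directory.

    Setting cwd mirrors the `cd path/to/downloaded/data` step. Standard
    input is inherited, so credentials can still be typed manually.
    """
    subprocess.run(
        build_pull_fb_command(dataset_name, area),
        cwd=download_dir,
        check=True,
    )


print(build_pull_fb_command("TileMovement", "New Zealand"))
# ['pull_fb', '--dataset_name', 'TileMovement', '--area', 'New_Zealand']
```

The helper only formats the name. The result still has to match an entry in the .config file exactly, including capitalisation.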

Please Note:

If the .config file is missing variables for a given dataset, please alter the .config file and open a pull request to share with others.

Chrome Web Driver

To download data, this library relies on Selenium and ChromeDriver.

This requires a chromedriver executable, which can be downloaded from the ChromeDriver downloads page. Make sure that your chromedriver version matches your installed Chrome version.

pull_facebook_data_for_good assumes that the chromedriver executable is located at Applications/chromedriver. To supply a different path, use the argument --driver_path or -driver from the command line.
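Before starting a download, it can help to confirm that the driver path points at an executable file. The sketch below is one way to do that with the standard library only. `resolve_driver_path` is a hypothetical helper and is not how the package itself validates the path.

```python
import os
from typing import Optional

# Default location stated in this README.
DEFAULT_DRIVER_PATH = "Applications/chromedriver"


def resolve_driver_path(driver_path: Optional[str] = None) -> str:
    """Return a chromedriver path that exists and is executable.

    Raises FileNotFoundError if nothing exists at the path and
    PermissionError if the file is not marked executable.
    """
    path = driver_path or DEFAULT_DRIVER_PATH
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No chromedriver executable found at {path!r}")
    if not os.access(path, os.X_OK):
        raise PermissionError(f"{path!r} exists but is not executable")
    return path


if __name__ == "__main__":
    try:
        print("Using chromedriver at", resolve_driver_path())
    except OSError as err:
        # FileNotFoundError and PermissionError are both OSError subclasses.
        print("Driver check failed:", err)
```

If the check fails, pass the correct location with --driver_path when calling the CLI.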

Credentials

Credentials must be input manually on each download.

Credentials are not stored on your computer and are passed directly to the Facebook login page by the web driver.

Tests

This project is tested with tox.

To run unit tests:

tox

Contributions

Issues:

To request a feature or report an issue with this tool, please open an issue.

Adding a dataset:

Dataset attributes are stored in the .config file.

Each time you use the library, pull_facebook_data_for_good will look for dataset configuration variables here.

To add the ability to download another dataset, alter the .config file with two pieces of information:

  1. The dataset ID, embedded in the URL of the Geoinsights download page. For example, the dataset ID for the collection stored at https://www.facebook.com/geoinsights-portal/downloads/?id=243071640406689 is 243071640406689.

  2. The date origin of the dataset (the earliest date of data publication), in the format year_month_day(_hour), e.g. 2020_01_01_00.
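Both values can be checked before they go into the .config file. The sketch below extracts the ID from a download page URL and parses a date origin string. The function names are hypothetical and the code does not read or write the .config file itself.

```python
from datetime import datetime
from urllib.parse import parse_qs, urlparse


def dataset_id_from_url(url: str) -> str:
    """Return the value of the `id` query parameter in a download page URL."""
    query = parse_qs(urlparse(url).query)
    if "id" not in query:
        raise ValueError(f"No 'id' parameter found in {url!r}")
    return query["id"][0]


def parse_date_origin(origin: str) -> datetime:
    """Parse a date origin in the format year_month_day(_hour)."""
    parts = [int(part) for part in origin.split("_")]
    if len(parts) not in (3, 4):
        raise ValueError(f"Expected year_month_day(_hour), got {origin!r}")
    # datetime validates the ranges (e.g. month 1-12, hour 0-23).
    return datetime(*parts)


url = "https://www.facebook.com/geoinsights-portal/downloads/?id=243071640406689"
print(dataset_id_from_url(url))            # 243071640406689
print(parse_date_origin("2020_01_01_00"))  # 2020-01-01 00:00:00
```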

Please open a pull request to share the config variables for a new dataset with everyone.

Other contributions:

Other contributions are welcome.

Please look for open issues with the Help Wanted tag.

Contributors

hamishgibbs, alebitetto
