kpu_data_extractor's Introduction

Indonesian Presidential Election 2024 Data

This repository contains the Indonesian Presidential Election 2024 dataset: the final vote results per polling station.

Files

The dataset can be obtained here

Voting Results File

The voting results are stored in raw_district_counts_<date>.json, where <date> is the date the data were extracted. The file contains multiple lines of JSON, one object per line. Each object follows this structure:

{
	"url": URL_TO_KPU_API, 
	"data": {
		"chart": {
			"null": null,
			"100025": VOTE_COUNT_PASLON_1,
			"100026": VOTE_COUNT_PASLON_2,
			"100027": VOTE_COUNT_PASLON_3
		}, 
		"images": [C1_FORM_1_LINK, C1_FORM_2_LINK, C1_FORM_3_LINK], 
		"administrasi": {
			"suara_sah": 136,
			"suara_total": 142,
			"pemilih_dpt_j": 156,
			"pemilih_dpt_l": 74,
			"pemilih_dpt_p": 82,
			"pengguna_dpt_j": 140,
			"pengguna_dpt_l": 65,
			"pengguna_dpt_p": 75,
			"pengguna_dptb_j": 0,
			"pengguna_dptb_l": 0,
			"pengguna_dptb_p": 0,
			"suara_tidak_sah": 6,
			"pengguna_total_j": 142,
			"pengguna_total_l": 66,
			"pengguna_total_p": 76,
			"pengguna_non_dpt_j": 2, 
			"pengguna_non_dpt_l": 1, 
			"pengguna_non_dpt_p": 1
		}, 
		"psu": null, 
		"ts": "2024-02-16 09:30:28", 
		"status_suara": true, 
		"status_adm": true
	}
}

Some of the values might be None or null, which indicates that the voting results for that polling station have not yet been uploaded.
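As a minimal sketch of how such a file could be read, the snippet below parses one JSON object per line, sums the three candidate counts, and skips records whose counts are missing. The function name `tally_votes` and the two in-memory records are made up for illustration; only the key names come from the structure above, and a literal None written into a file (rather than null) would need extra handling that is not shown here.

```python
import io
import json

# Candidate IDs as they appear under "chart" in the sample record.
CANDIDATE_KEYS = ("100025", "100026", "100027")


def tally_votes(lines):
    """Sum per-candidate votes over an iterable of JSON lines.

    Returns (totals, skipped), where skipped counts the records whose
    vote counts are missing or incomplete.
    """
    totals = {key: 0 for key in CANDIDATE_KEYS}
    skipped = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        # "data" or "chart" may be null when nothing has been uploaded yet.
        chart = (record.get("data") or {}).get("chart") or {}
        counts = [chart.get(key) for key in CANDIDATE_KEYS]
        if any(count is None for count in counts):
            skipped += 1
            continue
        for key, count in zip(CANDIDATE_KEYS, counts):
            totals[key] += count
    return totals, skipped


# Two made-up records for illustration only (not real election data).
sample = io.StringIO(
    '{"url": "a", "data": {"chart": {"null": null, "100025": 10, "100026": 20, "100027": 6}}}\n'
    '{"url": "b", "data": {"chart": null}}\n'
)
totals, skipped = tally_votes(sample)
print(totals, skipped)
```

With a downloaded file, the same function can be called as `tally_votes(open(path, encoding="utf-8"))`, since a file object iterates line by line.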

The API also provides links to the scanned C1 form images under the images key.
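The counts in the sample administrasi block appear to be internally consistent, which suggests a simple sanity check for each record. The sketch below rests on an assumed reading of the field names that the source does not spell out: suara_sah and suara_tidak_sah as valid and invalid ballots, and the suffixes _l, _p, and _j as male, female, and total. The helper `check_administrasi` is hypothetical and not part of this repository.

```python
PREFIXES = ("pemilih_dpt", "pengguna_dpt", "pengguna_dptb",
            "pengguna_non_dpt", "pengguna_total")


def check_administrasi(adm):
    """Return a list of inconsistencies found in one "administrasi" dict.

    An empty list means every check passed. The checks encode an assumed
    interpretation of the fields, not a documented KPU rule.
    """
    if adm is None or any(value is None for value in adm.values()):
        return ["missing values"]

    problems = []

    def expect_equal(label, left, right):
        if left != right:
            problems.append(f"{label}: {left} != {right}")

    # Assumed: valid + invalid ballots add up to the total ballots.
    expect_equal("suara_sah + suara_tidak_sah vs suara_total",
                 adm["suara_sah"] + adm["suara_tidak_sah"], adm["suara_total"])
    # Assumed: the _l and _p breakdowns add up to the _j total.
    for prefix in PREFIXES:
        expect_equal(f"{prefix}_l + {prefix}_p vs {prefix}_j",
                     adm[f"{prefix}_l"] + adm[f"{prefix}_p"], adm[f"{prefix}_j"])
    # Assumed: the number of voters who voted matches the ballots counted.
    expect_equal("pengguna_total_j vs suara_total",
                 adm["pengguna_total_j"], adm["suara_total"])
    return problems


# The values from the sample record above.
sample_adm = {
    "suara_sah": 136, "suara_total": 142, "suara_tidak_sah": 6,
    "pemilih_dpt_j": 156, "pemilih_dpt_l": 74, "pemilih_dpt_p": 82,
    "pengguna_dpt_j": 140, "pengguna_dpt_l": 65, "pengguna_dpt_p": 75,
    "pengguna_dptb_j": 0, "pengguna_dptb_l": 0, "pengguna_dptb_p": 0,
    "pengguna_total_j": 142, "pengguna_total_l": 66, "pengguna_total_p": 76,
    "pengguna_non_dpt_j": 2, "pengguna_non_dpt_l": 1, "pengguna_non_dpt_p": 1,
}
print(check_administrasi(sample_adm))  # prints []
```

A record that fails a check is not necessarily wrong; it may only mean that the assumed relationship does not hold for that polling station.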

URLs

We also provide the list of all API URLs used to obtain the above KPU dataset in urls.json.

Installation

Create a virtualenv

python -m venv env
source env/bin/activate

Install dependencies

pip install -r requirements.txt

Download files

You can download the above files by following the link or by using gdown. To use gdown, find the file ID in the gdrive link and run the following command:

 gdown https://drive.google.com/uc\?id\=<FILE_ID>

E.g., for district_counts_18_2_2024.json, the gdrive link is https://drive.google.com/file/d/11gsCs8npDV_2Eanaw94rGGna9wzXTV8A/view?usp=drive_link, so the command is:

 gdown https://drive.google.com/uc\?id\=11gsCs8npDV_2Eanaw94rGGna9wzXTV8A
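The file ID is the path segment after /file/d/ in the gdrive link. A small sketch of that extraction, using only the standard library, is shown below; the helper name `gdrive_download_url` is made up for illustration and is not part of gdown or this repository.

```python
import re


def gdrive_download_url(share_link):
    """Turn a gdrive ".../file/d/<FILE_ID>/view" link into a uc?id= URL."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_link)
    if match is None:
        raise ValueError(f"no file id found in {share_link!r}")
    return f"https://drive.google.com/uc?id={match.group(1)}"


link = "https://drive.google.com/file/d/11gsCs8npDV_2Eanaw94rGGna9wzXTV8A/view?usp=drive_link"
print(gdrive_download_url(link))
# prints https://drive.google.com/uc?id=11gsCs8npDV_2Eanaw94rGGna9wzXTV8A
```

The resulting URL can be passed to the gdown command. The backslashes in the shell examples only escape ? and = for the shell, so they are not part of the URL itself.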

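For urls.json (this file is necessary to run the data extractor below), the gdrive link is https://drive.google.com/file/d/1cUWd0OyBprz6TNJHxceyEBLRCstzU1VY/view?usp=drive_link, so the command following the same pattern is:

 gdown https://drive.google.com/uc\?id\=1cUWd0OyBprz6TNJHxceyEBLRCstzU1VY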

Obtain the dataset

If you wish to re-collect the most recent data, run this command to call the API and obtain the dataset:

python main.py --task collect-data --target_file <TARGET_FILE>

Obtain C1-form scanned image

Run this command to download the scanned C1-form:

python main.py --task extract-image  --url <URL> --target_path <TARGET_PATH> 

It will download the C1-form images for a particular URL (polling station) to the target path.

Run this command to download the scanned C1-form images for all polling stations:

python main.py --task extract-image --target_path <TARGET_PATH> 

Data Statistics

We also provide data statistics and some insights comparing the data extracted on 18 and 20 February 2024 in notebooks/statistics.ipynb.

Contributors

seelenbrecher