
index_debat_gemist's Introduction

Indexing Tweede Kamer Debates

Indexes information from the search results of https://debatgemist.tweedekamer.nl into a CSV or Excel dataset.


The project

What is this project about?

This project scrapes data from debates in ‘de Tweede Kamer’ (the Dutch Parliament/House of Representatives). It does so through the search option of the website ‘Debat Gemist – Tweede Kamer’. After you insert a search term in the Python script, the code scrapes the data (only the fields requested in the code) from the debates returned by the website's own search query. The results are exported to a .csv file, which a data scientist can then analyse. The scraping is done with a Python library named BeautifulSoup.
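As a rough illustration of the scrape-then-export flow described above, here is a minimal sketch. It assumes the `beautifulsoup4` package is installed. The HTML snippet, the class names (`result`, `date`), the column names, and the helper functions `parse_results` and `rows_to_csv` are all hypothetical; the real markup of the website and the fields collected by this project will differ.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a search results page.
# The real markup on the website will look different.
SAMPLE_HTML = """
<ul>
  <li class="result"><a href="/debatten/voorbeeld-1">Debat over accijns</a>
      <span class="date">2020-03-12</span></li>
  <li class="result"><a href="/debatten/voorbeeld-2">Vragenuur</a>
      <span class="date">2020-04-07</span></li>
</ul>
"""


def parse_results(html):
    """Extract one dict per search result from the (assumed) markup."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.find_all("li", class_="result"):
        link = item.find("a")
        date = item.find("span", class_="date")
        rows.append({
            "title": link.get_text(strip=True),
            "url": link["href"],
            "date": date.get_text(strip=True),
        })
    return rows


def rows_to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "url", "date"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_results(SAMPLE_HTML)
csv_text = rows_to_csv(rows)
print(csv_text)
```

To write a file instead of a string, the same `csv.DictWriter` can be pointed at a file opened with `open("results.csv", "w", newline="", encoding="utf-8")`.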

Why did I start this project?

I wrote my Bachelor thesis on the quantitative relationship between transparency and accountability in the European Parliament. For my research I used a fairly recent database of all the debates, linked to information about the members of parliament. This database was accessible through SPARQL, which is a query language for RDF data. While I, a social sciences (‘gamma’) student of Public Administration, had some programming know-how, my tutor didn’t. When I told him about the database, he was very enthusiastic about the scope of the data hidden in it, but was put off by the required programming experience. This, and my spare time during the coronavirus pandemic, motivated me to start this project. I wanted to make the available data of the House of Representatives (Tweede Kamer) easily accessible to data scientists. Most students I know, in sociology and Public Administration, have experience with tools like SPSS, but not with programming languages such as Python. So, out of a desire to make more data accessible for research, I set out on this journey.

Who is this project meant for?

Any data enthusiast who speaks or reads Dutch can tinker with the data retrieved by running the code. The code produces a .csv dataset, which can then be imported into well-known applications like SPSS. The main focus of this project, however, is data research (quantitative or qualitative) on the House of Representatives (Tweede Kamer). So anyone who wants easy access to the debates in the Dutch House of Representatives can use this code. The project also aims to show that such a tool can be created for any parliament with open access to its debates.

How can I use this project?

At this stage, you can run the code by inserting a search term in the Python file and running the file with Python. The variable ‘Searchterm’ is already defined, so you only have to change the word between the quotation marks. In the future, I plan to write a more detailed guide on how to set up Python and the necessary libraries.

    Searchterm = "bier"  # I like to use "bier" as my test search term



index_debat_gemist's Issues

Invalid URL '?start=500'

MissingSchema: Invalid URL '?start=500': No schema supplied. Perhaps you meant http://?start=500?

This error is raised when an invalid statement_url is found.
An if statement should be added to check that the URL is valid before it is requested.
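One possible shape for such a check, as a sketch only: the helper name `is_absolute_http_url` and the example list are hypothetical, and this is not the fix used in the project. It relies on the standard library's `urllib.parse.urlparse` to test whether a URL has both a scheme and a host before it is passed on to the request.

```python
from urllib.parse import urlparse


def is_absolute_http_url(url):
    """Return True only if the URL has an http(s) scheme and a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical list of scraped values for statement_url.
statement_urls = [
    "https://debatgemist.tweedekamer.nl/",
    "?start=500",  # the value from the error message above
]

valid_urls = []
for statement_url in statement_urls:
    if is_absolute_http_url(statement_url):
        valid_urls.append(statement_url)
    else:
        # Skip (or log) values that would trigger MissingSchema.
        print(f"Skipping invalid statement_url: {statement_url!r}")
```

Whether an invalid value should be skipped, as here, or joined onto the site's base URL depends on what `?start=500` is meant to point at, which the issue does not say.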
