bi-p02-thesis-topic-evaluation

Caution

WARNING: Use of Dummy Passwords

⚠️ The project's Docker Compose file contains dummy passwords. These are suitable for local development and testing only. Using them in a live production environment or pushing them to a public repository (like GitHub) poses a significant security risk.

Action Required: Before pushing your code to a repository or deploying it to production, replace all dummy passwords with strong, unique credentials. Failure to do so can lead to serious security breaches.

Init

docker compose up -d

Database Schema + Creation

The dataset contains one entry per day, and each entry holds two relevant JSON files:

  • db-topics
    • titel
    • abschlussarbeitstyp
    • studiengaenge
    • art_der_arbeit
    • ansprechpartner
    • status
    • erstellt
    • action
    • topic_id
    • url_topic_details
  • db-topics-additional
    • titel
    • beschreibung
    • heimateinrichtung
    • art_der_arbeit
    • abschlussarbeitstyp
    • autor
    • status
    • aufgabenstellung
    • erstellt
    • topic_id
    • voraussetzung
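
The two files share the topic_id field, so their rows can be joined on it. The following is a minimal sketch of that join, not the code from importer.py. It assumes each file parses into a list of dicts keyed by the field names listed above; the function name merge_topics is hypothetical.

```python
from typing import Any


def merge_topics(
    topics: list[dict[str, Any]],
    additional: list[dict[str, Any]],
) -> dict[str, dict[str, Any]]:
    """Join db-topics rows with db-topics-additional rows on topic_id.

    Assumption: both inputs are lists of dicts that carry a "topic_id" key.
    """
    merged: dict[str, dict[str, Any]] = {}
    for row in topics:
        merged[row["topic_id"]] = dict(row)
    for row in additional:
        target = merged.setdefault(row["topic_id"], {})
        for key, value in row.items():
            # Fields already present from db-topics are kept;
            # detail fields only fill in what is missing.
            target.setdefault(key, value)
    return merged
```

Keeping the overview value when both files carry the same field (for example titel or status) is a design choice of this sketch, not something the dataset prescribes.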

I created the following database schema: DatabaseSchema.png

To create the database, I used a DDL SQL script. This script runs automatically when Docker Compose starts up: DDL-SQL-File

ETL: Importer

For the ETL process, I wrote a Python script: importer.py

To run it:

python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt
python3 importer.py

Data cleaning: For 24.05.2023, the same data items appear on different pages, caused by either a crawler issue or a Stud.IP problem. To fix this, run the CLEANING-SQL-File. It overwrites the data for 24.05.2023 with the data from 23.05.2023.
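
The cleaning step itself is done in SQL. As an illustration of the same idea in Python, here is a minimal in-memory sketch. It assumes the daily data is held in a dict keyed by date; the function name overwrite_day and that layout are hypothetical and not taken from the CLEANING-SQL-File.

```python
import copy
from datetime import date


def overwrite_day(snapshots: dict, broken: date, source: date) -> dict:
    """Return a new mapping in which `broken` holds a copy of the data from `source`.

    Assumption: `snapshots` maps a date to the list of rows crawled that day.
    The input mapping is left unchanged.
    """
    if source not in snapshots:
        raise KeyError(f"no snapshot for {source}")
    fixed = dict(snapshots)
    # Deep copy so later edits to one day do not leak into the other.
    fixed[broken] = copy.deepcopy(snapshots[source])
    return fixed
```

For the case described above, the call would be overwrite_day(snapshots, date(2023, 5, 24), date(2023, 5, 23)).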

Visualisation

Go to localhost:3000, create an account and log in to Metabase.

NOTE: When adding the database connection, use host.docker.internal instead of localhost as the host.

I added a dashboard for the tasks/KPIs. The SQL statements are in the file tasks.sql

The final dashboard is here: Dashboard.pdf

Tasks

  • T1 - How many thesis topics are published in a week, in a month, in a year?
  • T2 - Which supervisor has the most thesis topics to offer?
  • T3 - Which department has the most thesis topics to offer?
  • T4 - How many thesis topics are "removed from the list" in a week, in a month, in a year?
  • T5 - Create 1 task/business question of your own and answer this question: Which Thesis Type is Most Common?
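
The actual answers are computed with the statements in tasks.sql. To make the logic behind T1 and T4 concrete, here is a minimal Python sketch. It assumes each topic has one creation date and that the topic IDs seen on two consecutive days are available as sets; both function names are hypothetical.

```python
from collections import Counter
from datetime import date


def published_per_period(created: dict[str, date]) -> dict[str, Counter]:
    """T1 sketch: count topics by ISO week, by month and by year of creation.

    Assumption: `created` maps topic_id to the date from the erstellt field.
    """
    counts = {"week": Counter(), "month": Counter(), "year": Counter()}
    for day in created.values():
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts["week"][(iso[0], iso[1])] += 1
        counts["month"][(day.year, day.month)] += 1
        counts["year"][day.year] += 1
    return counts


def removed_between(previous: set[str], current: set[str]) -> set[str]:
    """T4 sketch: topic IDs listed on the previous day but missing today."""
    return previous - current
```

Weeks are keyed by ISO year and ISO week so that days around New Year fall into the correct week.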

KPIs

  • KPI 1 - Unique thesis topics published every month
  • KPI 2 - Average thesis topics for each department
  • KPI 3 - Create 1 KPI of your own, describe how to calculate it and put it on the dashboard: This KPI gives an overview of the distribution of thesis topics across different thesis types.
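
One possible reading of KPI 2 and KPI 3 is sketched below in Python. The dashboard values come from tasks.sql, so this is only an illustration. It assumes that heimateinrichtung identifies the department and abschlussarbeitstyp the thesis type; the function names are hypothetical.

```python
from collections import Counter, defaultdict


def average_topics_per_department(rows: list[dict]) -> float:
    """KPI 2 sketch: mean number of distinct topics per department.

    Assumption: heimateinrichtung stands for the department.
    """
    topics_by_department = defaultdict(set)
    for row in rows:
        topics_by_department[row["heimateinrichtung"]].add(row["topic_id"])
    if not topics_by_department:
        return 0.0
    total = sum(len(ids) for ids in topics_by_department.values())
    return total / len(topics_by_department)


def thesis_type_shares(rows: list[dict]) -> dict[str, float]:
    """KPI 3 sketch: share of distinct topics per thesis type."""
    # Deduplicate by topic_id so a topic crawled on several days counts once.
    type_by_topic = {row["topic_id"]: row["abschlussarbeitstyp"] for row in rows}
    counts = Counter(type_by_topic.values())
    total = sum(counts.values())
    return {thesis_type: n / total for thesis_type, n in counts.items()}
```

Both functions deduplicate by topic_id because the same topic appears in the data for every day it stays listed.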

Presentation

The presentation slides can be found here: Slides.pdf

Contributors

kenoc1