
lottonumberarchive's Introduction

API for the lotto numbers of the German lottery (1955-2023)

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Background

This repo provides the German lotto numbers from 1955 to today in one single file. Anyone interested in data analysis, or just in “calculating” their chances of winning the lottery, is invited to use the data.

Two JSON files are given: choose the one you can work with :-)
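
As a small worked example of “calculating” the chances, the sketch below counts the possible combinations for a 6-out-of-49 draw. It assumes that the order of the numbers does not matter and that the Superzahl is a single digit from 0 to 9; the variable names are illustrative only.

```python
from math import comb

# Number of ways to choose 6 distinct numbers out of 49 (order does not matter).
combinations_6_of_49 = comb(49, 6)

# Assumption: the Superzahl is one digit from 0 to 9, independent of the six numbers.
superzahl_digits = 10

p_six_correct = 1 / combinations_6_of_49
p_six_plus_superzahl = 1 / (combinations_6_of_49 * superzahl_digits)

print(f"6 correct: 1 in {combinations_6_of_49:,}")
# prints "6 correct: 1 in 13,983,816"
print(f"6 correct + Superzahl: 1 in {combinations_6_of_49 * superzahl_digits:,}")
# prints "6 correct + Superzahl: 1 in 139,838,160"
```

These odds are fixed by the rules of the game, so the historical frequencies analysed below do not change them.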

Data analysis examples

The data are provided as a JSON file, which can be read by all modern programming languages. Two examples follow, in R and Python.
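
To illustrate how such a file can be read without any third-party package, here is a minimal sketch using Python's standard library. It assumes a tidy layout in which every record carries at least the fields `date`, `variable` and `value` (the fields the examples in this README rely on). The sample records are made up for illustration and are not real draw results.

```python
import json
from collections import defaultdict

# Hypothetical sample in the assumed tidy layout; these are not real draw results.
sample_json = """
[
  {"date": "01.01.2000", "variable": "Lottozahl", "value": 3},
  {"date": "01.01.2000", "variable": "Lottozahl", "value": 17},
  {"date": "01.01.2000", "variable": "Superzahl", "value": 5}
]
"""

records = json.loads(sample_json)

# Group the values by the kind of number they describe.
values_by_variable = defaultdict(list)
for record in records:
    values_by_variable[record["variable"]].append(record["value"])

print(dict(values_by_variable))
# prints {'Lottozahl': [3, 17], 'Superzahl': [5]}
```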

R

With R, the tidyverse package makes it quick to analyze the data.

In the next chunk, all data are read, filtered (keeping only the lotto numbers), grouped by value, and counted by number of appearances. We can see that lotto number 6 is the most frequent number. Because three numbers tie for fifth place, top_n(5) returns seven rows instead of five.

library(tidyverse)
library(jsonlite)
library(lubridate)

data <- fromJSON("https://johannesfriedrich.github.io/LottoNumberArchive/Lottonumbers_tidy_complete.json")

lottonumbers_count <- data %>% 
  filter(variable == "Lottozahl") %>% 
  group_by(value) %>% 
  summarise(count = n())
lottonumbers_count %>% 
  arrange(desc(count)) %>% 
  top_n(5)
## Selecting by count
## # A tibble: 7 × 2
##   value count
##   <int> <int>
## 1     6   646
## 2    49   632
## 3    32   620
## 4    31   615
## 5    22   614
## 6    26   614
## 7    33   614
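
The output lists seven rows although five were requested, because ties at the cutoff are kept. A minimal Python sketch of that tie-inclusive selection follows. The helper `top_n_with_ties` is a hypothetical name, and the counts are simply the rows shown in the R output above, not the full table.

```python
def top_n_with_ties(counts, n):
    """Return (value, count) pairs whose count is at least the n-th largest count."""
    # Sort by count (descending), then by value so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]
    return [(value, count) for value, count in ranked if count >= cutoff]

# Counts copied from the R output above (only the rows it shows).
shown_counts = {6: 646, 49: 632, 32: 620, 31: 615, 22: 614, 26: 614, 33: 614}

top = top_n_with_ties(shown_counts, 5)
print(len(top))
# prints 7
```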

Next, we plot how often each number from 1 to 49 has appeared.

library(ggplot2)

ggplot(lottonumbers_count, aes(value, count)) +
  geom_bar(stat = "identity") +
  labs(x = "Lottonumber", title = "Lottonumbers in Germany since 1955")

In addition to the six lotto numbers, a single digit from 0 to 9 called the “Superzahl” is drawn every Wednesday and Saturday. The following graph shows the distribution of the Superzahl since 2001, split by draw day.

superzahl <- data %>% 
  filter(variable == "Superzahl") %>% 
  mutate(date = dmy(date),
         Day = weekdays(date),
         year = year(date)) %>% 
  filter(year >= 2001) %>% 
  group_by(value, Day) %>% 
  summarise(count = n())
## `summarise()` has grouped output by 'value'. You can override using the
## `.groups` argument.
ggplot(superzahl, aes(value, count, fill = Day)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_x_continuous(breaks = c(0:9)) +
  labs(x = "Superzahl", title = "Superzahl since 2001")
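
The grouping step in the R chunk above (filter on `Superzahl`, keep draws from 2001 onwards, count per value and weekday) can be sketched in plain Python as follows. The function name, the sample records and the date format `%d.%m.%Y` are assumptions made for illustration; adjust the format to whatever the file actually contains.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical records; the values are made up, only the calendar dates are real.
sample_records = [
    {"date": "04.01.2023", "variable": "Superzahl", "value": 7},
    {"date": "07.01.2023", "variable": "Superzahl", "value": 7},
    {"date": "11.01.2023", "variable": "Superzahl", "value": 2},
    {"date": "11.01.2023", "variable": "Lottozahl", "value": 7},
    {"date": "30.12.2000", "variable": "Superzahl", "value": 7},
]

def count_by_value_and_weekday(records, variable, first_year, date_format="%d.%m.%Y"):
    """Count (value, weekday) pairs for one variable, from first_year onwards."""
    counts = Counter()
    for record in records:
        if record["variable"] != variable:
            continue
        drawn_on = datetime.strptime(record["date"], date_format)
        if drawn_on.year < first_year:
            continue
        # date.weekday() returns 0 for Monday, which matches the WEEKDAYS list.
        counts[(record["value"], WEEKDAYS[drawn_on.weekday()])] += 1
    return counts

superzahl_counts = count_by_value_and_weekday(sample_records, "Superzahl", 2001)
for (value, day), count in sorted(superzahl_counts.items()):
    print(value, day, count)
```

Using a fixed list of weekday names instead of `strftime("%A")` keeps the result independent of the system locale.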

Which numbers were drawn most often in 2023?

data %>% 
  filter(variable == "Lottozahl") %>% 
  mutate(date = dmy(date),
         year = year(date)) %>% 
  filter(year == 2023) %>% 
  group_by(value) %>% 
  summarise(count = n()) %>% 
  slice_max(count, n = 5)
## # A tibble: 8 × 2
##   value count
##   <int> <int>
## 1    19    19
## 2    22    18
## 3    33    18
## 4    25    17
## 5    23    16
## 6    28    16
## 7    42    16
## 8    43    16

Python

In Python, the pandas module is very handy for analysing data. The following code repeats the first analysis shown above, counting how often each lotto number has appeared.

import pandas as pd

data = pd.read_json("https://johannesfriedrich.github.io/LottoNumberArchive/Lottonumbers_tidy_complete.json")
## <string>:2: UserWarning: Parsing dates in DD/MM/YYYY format when dayfirst=False (the default) was specified. This may lead to inconsistently parsed dates! Specify a format to ensure consistent parsing.

res = data[data.variable == "Lottozahl"].groupby("value")["value"].count().sort_values(ascending = False)

print(res.head(5))
## value
## 6     646
## 49    632
## 32    620
## 31    615
## 33    614
## Name: value, dtype: int64
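
The date-parsing warning above can be avoided by telling pandas not to guess the date format and then parsing the column explicitly. The sketch below also repeats the per-year count in pandas. It runs on a small made-up sample instead of the real file, and it assumes that dates are stored as DD.MM.YYYY strings; neither the sample values nor the format is confirmed by the source.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample in the assumed tidy layout; the values are not real draw results.
sample_json = """
[
  {"date": "04.01.2023", "variable": "Lottozahl", "value": 19},
  {"date": "07.01.2023", "variable": "Lottozahl", "value": 19},
  {"date": "07.01.2023", "variable": "Lottozahl", "value": 22},
  {"date": "07.01.2023", "variable": "Superzahl", "value": 4},
  {"date": "31.12.2022", "variable": "Lottozahl", "value": 22}
]
""".strip()

# Keep the date column as plain strings so pandas does not guess the format.
data = pd.read_json(StringIO(sample_json), orient="records", convert_dates=False)

# Assumption: dates are stored as DD.MM.YYYY; adjust the format if they are not.
data["date"] = pd.to_datetime(data["date"], format="%d.%m.%Y")

# Keep only the lotto numbers drawn in 2023 and count each value.
lotto_2023 = data[(data["variable"] == "Lottozahl") & (data["date"].dt.year == 2023)]
counts_2023 = lotto_2023["value"].value_counts()

print(counts_2023.head(5))
```

With the real file, the `StringIO` object would be replaced by the URL used above; the rest of the code stays the same as long as the date format assumption holds.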


lottonumberarchive's Issues

Consider order of each draw

I need to know the chronological order of each draw (the order in which the numbers were drawn). Could you add this information or update the given "complete" file?
