Comments (6)
That's interesting - do you mind me asking what dataset you're using? Is it the same number of teams/games (and therefore the same number of parameters)?
from regista.
Hi,
This is the test dataset I tried it with. I tried to subset it to 380 observations, like your Premier League dataset.
I am using data from over two years; could this be the problem?
My test dataset
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 520 obs. of 7 variables:
$ date : chr "2019-08-09" "2019-08-10" "2019-08-10" "2019-08-10" ...
$ home : Factor w/ 445 levels "Aberdeen","Accrington",..: 243 433 66 77 118 429 403 234 283 260 ...
$ away : Factor w/ 445 levels "Aberdeen","Accrington",..: 289 259 365 372 147 73 31 438 28 98 ...
$ hgoal : num 4 0 1 3 0 0 3 0 0 4 ...
$ agoal : num 1 5 1 0 0 3 1 0 1 0 ...
$ result: Factor w/ 3 levels "A","D","H": 3 1 2 3 2 1 3 2 1 3 ...
$ hfa : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
I was also wondering if there is a way for your model to predict outcomes more definitively:
- You already have percentages for H, D and A, but is it possible to predict a single definite outcome ("H", "D" or "A"), where it selects one of the three?
- Is there also a way to test the accuracy of these predictions?
Thanks a lot, by the way, for sharing this package :)
PS:
I see hfa is set to TRUE for all of the matches. Is this a default setting, or something that the model calculates for you? :)
from regista.
> Factor w/ 445 levels
I think this is what's slowing it down - are you trying to estimate the strengths of all 445 teams at once?
I would suggest splitting things out into "connected" leagues (only estimating for teams that can play each other at one time). For example, estimating the Scottish leagues separately from the English leagues.
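One way to find such "connected" groups is to treat every match as a link between two teams and merge the groups that any match joins. The following is only a base R sketch of that idea: `connected_groups` is a hypothetical helper written for this example, not part of regista, and the team names are toy data.

```r
# Hypothetical helper: label each team with the connected group it belongs to.
# Two teams share a label if a chain of matches links them.
connected_groups <- function(home, away) {
  home <- as.character(home)
  away <- as.character(away)
  teams <- unique(c(home, away))
  group <- setNames(seq_along(teams), teams)
  for (i in seq_along(home)) {
    h <- group[[home[i]]]
    a <- group[[away[i]]]
    # Merge the away team's group into the home team's group
    if (h != a) group[group == a] <- h
  }
  group
}

# Toy data: two leagues that never play each other
home <- c("Arsenal", "Celtic", "Everton")
away <- c("Chelsea", "Rangers", "Arsenal")
groups <- connected_groups(home, away)

# Teams grouped by league, ready to be modelled separately
split(names(groups), groups)
```

Each element of the resulting list is a set of teams that can be estimated together; matches would then be filtered to one set at a time before fitting.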
> I was also wondering if there is a way for your model to predict outcome more definitely
Yes! :)
The `predict` method has an argument `type`, which determines whether to get scorelines, outcomes, or goalscoring rates for each match: http://regista.statsandsnakeoil.com/reference/predict.dixoncoles.html
See here for a more worked through example: http://www.statsandsnakeoil.com/2018/07/15/what-a-diff-rence-xg-makes/
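Once outcome probabilities are available, a definite prediction can be taken as the most probable of the three outcomes, and accuracy as the share of matches where that pick equals the observed result. A minimal base R sketch, assuming the probabilities have already been reshaped into one column per outcome; the column names `p_H`, `p_D`, `p_A` and the numbers are invented for illustration and are not regista's output format:

```r
# Toy outcome probabilities (invented values) with the observed results
preds <- data.frame(
  p_H = c(0.55, 0.20, 0.30),
  p_D = c(0.25, 0.30, 0.40),
  p_A = c(0.20, 0.50, 0.30),
  result = factor(c("H", "A", "H"), levels = c("A", "D", "H"))
)

# Pick the outcome with the highest probability in each row
outcomes <- c("H", "D", "A")
prob_cols <- as.matrix(preds[, c("p_H", "p_D", "p_A")])
preds$predicted <- outcomes[max.col(prob_cols, ties.method = "first")]

# Share of matches where the pick equals the observed result
accuracy <- mean(preds$predicted == as.character(preds$result))
```

Bear in mind that draws are rarely the single most likely outcome, so a "pick the maximum" rule will seldom predict "D"; scoring the probabilities themselves (for example with log loss) often says more than accuracy alone.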
from regista.
> I see HFA is set as HFA TRUE on all of the matches. Is this a default setting, or something that the model calculates for you? :)

I think by default, the `dixoncoles` function does this for you. But you can be more flexible with `dixoncoles_ext` if you need to.
from regista.
Hey, thank you. The issue is resolved now. The problem was that my factors contained levels for all teams from the European leagues, so no wonder it was slow. Thank you for your help!
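For anyone hitting the same problem, unused factor levels can be dropped after subsetting. This is a base R sketch on toy data, assuming columns named `home` and `away` as in the dataset above; the exact fix used in this thread was not posted.

```r
# Toy subset whose factors still carry levels for teams that never appear
all_teams <- c("Aberdeen", "Accrington", "Arsenal", "Chelsea")
matches <- data.frame(
  home = factor(c("Arsenal", "Chelsea"), levels = all_teams),
  away = factor(c("Chelsea", "Arsenal"), levels = all_teams),
  hgoal = c(2, 1),
  agoal = c(0, 1)
)
nlevels(matches$home)  # 4

# Rebuild both factors on the teams actually present, with shared levels
teams <- sort(unique(c(as.character(matches$home),
                       as.character(matches$away))))
matches$home <- factor(matches$home, levels = teams)
matches$away <- factor(matches$away, levels = teams)
nlevels(matches$home)  # 2
```

Rebuilding both columns from the same `teams` vector keeps `home` and `away` on identical levels, which `droplevels()` alone does not guarantee when a team appears only at home or only away.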
from regista.
No problem!
from regista.