Comments (5)
Thanks!
Point 1 has been corrected. Point 2 has been changed to:
Identifying observations the regression model does not fit well can help find information relevant to our specific research context.
For point 3, I added,
Currently, most lm models are supported (with the exception of glmmTMB, lmrob, and glmrob models), as long as they are supported by the underlying functions stats::cooks.distance() (or loo::pareto_k_values()) and insight::get_data() (for a full list of the 225 models currently supported by the insight package, see https://easystats.github.io/insight/#list-of-supported-models-by-class).
(addressing the other points in a second reply)
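As a rough illustration of the support condition described above, the following base R sketch tests whether stats::cooks.distance() can be computed for an object. The helper name has_cooks_distance() is hypothetical and not part of performance or insight; it only probes the Cook's distance side of the requirement, not loo::pareto_k_values() or insight::get_data().

```r
# Hypothetical helper (not part of performance or insight): TRUE when
# stats::cooks.distance() can be computed for the given object.
has_cooks_distance <- function(model) {
  tryCatch(
    {
      stats::cooks.distance(model)
      TRUE
    },
    error = function(e) FALSE
  )
}

fit_lm <- lm(disp ~ mpg * hp, data = mtcars)
fit_glm <- glm(am ~ mpg, data = mtcars, family = binomial())

# A made-up class with no cooks.distance method, used as a negative case
not_a_model <- structure(list(), class = "unsupported_model")

has_cooks_distance(fit_lm)
has_cooks_distance(fit_glm)
has_cooks_distance(not_a_model)
```

For the insight::get_data() side, the insight package appears to offer insight::is_model_supported(), which may be a more direct check than trial and error.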
from performance.
Table 1 is very helpful; consider adding a column for the R function usage. Right now the thresholds column is very hard to read. Are the recommended thresholds the function defaults?
I clarified that the recommended thresholds are the default thresholds, but I find it difficult to format the table properly in the rmarkdown PDF/JOSE paper. I tried using "\n" or adding empty rows to air out the content a bit, but that does not seem to work.
Also, originally the table was larger, but with four columns the text starts overlapping, even when not using code formatting inside the table. I think it is because the function call (check_outliers(model, ) is a single word with no spaces, so it does not wrap to the next line until the end of the string. For now, I have removed the in-table code formatting in the hope of improving readability and added a column showing the function usage. See below for what I mean.
I currently use knitr::kable but will try to see if other packages can yield a better outcome for the .md format.
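The wrapping problem described above can be illustrated by analogy with base R's strwrap(), which, like most line-breaking routines, only breaks at whitespace. This is a sketch of the general principle, not of how LaTeX or knitr::kable lay out table cells, and the call strings below are illustrative only (the exact call used in the table is not shown in this thread).

```r
# Illustrative strings only; the real table entries may differ
call_no_spaces <- "check_outliers(model,method=\"cook\")"
call_with_spaces <- "check_outliers(model, method = \"cook\")"

# strwrap() breaks only at whitespace, so a token without spaces
# stays on one line even when it exceeds the requested width
wrapped_no_spaces <- strwrap(call_no_spaces, width = 20)
wrapped_with_spaces <- strwrap(call_with_spaces, width = 20)

length(wrapped_no_spaces)
length(wrapped_with_spaces)
```

If the table engine behaves similarly, adding spaces after commas in the function call may give it a place to break the line.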
Ok, using the flextable package, I think we have a much better result now...
Extending this to work well in the tidyverse or adding a section exhibiting that it already does might increase the adoption of your package. Consider compatibility with tidymodels specifically where the outlier detection can be part of the recipe for the data analysis.
I confirm that although check_outliers() works with the pipe and insight::get_data(), it does not currently work with tidymodels because of stats::cooks.distance():
library(performance)

# Create some artificial outliers and an ID column
data <- rbind(mtcars[1:4], 42, 55)
data <- cbind(car = row.names(data), data)

lm(disp ~ mpg * hp, data = data) |>
  check_outliers() |>
  which()
#> [1] 31 34

suppressWarnings(library(tidymodels))

linear_reg() %>%
  fit(disp ~ mpg * hp, data = data) %>%
  check_outliers()
#> Error in UseMethod("cooks.distance"): no applicable method for 'cooks.distance' applied to an object of class "c('_lm', 'model_fit')"

# insight::get_data() works
linear_reg() %>%
  fit(disp ~ mpg * hp, data = data) %>%
  insight::get_data()
#>                    disp  mpg  hp
#> Mazda RX4         160.0 21.0 110
#> Mazda RX4 Wag     160.0 21.0 110
#> Datsun 710        108.0 22.8  93
#> Hornet 4 Drive    258.0 21.4 110
#> Hornet Sportabout 360.0 18.7 175
#> Valiant           225.0 18.1 105

# stats::cooks.distance() doesn't
linear_reg() %>%
  fit(disp ~ mpg * hp, data = data) %>%
  stats::cooks.distance()
#> Error in UseMethod("cooks.distance"): no applicable method for 'cooks.distance' applied to an object of class "c('_lm', 'model_fit')"
Created on 2023-10-04 with reprex v2.0.2
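The error above is an S3 dispatch failure: no cooks.distance method exists for the wrapper class. The base R sketch below reproduces this with a mock object, assuming (as is the case for parsnip fits, to my knowledge) that the engine model is stored in the `$fit` element. The `wrapped` object is a stand-in built by hand, not a real tidymodels fit, and whether check_outliers() works on an unwrapped engine fit has not been verified here.

```r
# Rebuild the data from the reprex (34 rows: 32 cars plus 2 artificial outliers)
data <- rbind(mtcars[1:4], 42, 55)
engine_fit <- lm(disp ~ mpg * hp, data = data)

# Mock stand-in for a parsnip fit (assumption: the engine model sits in `$fit`)
wrapped <- structure(list(fit = engine_fit), class = c("_lm", "model_fit"))

# Dispatch on the wrapper fails because neither class has a cooks.distance method
dispatch_error <- tryCatch(
  stats::cooks.distance(wrapped),
  error = function(e) conditionMessage(e)
)

# Dispatch on the unwrapped lm object succeeds
cooks_d <- stats::cooks.distance(wrapped$fit)
length(cooks_d)
```

With a real tidymodels fit, parsnip::extract_fit_engine() is the documented way to get the engine model, so a possible interim workaround might be to extract it first and pass the result on.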
I think it is worth thinking about adding support for those long-term, but for now, I have added tidymodels to the list of unsupported models within the paper, while mentioning that the pipe operator is supported.
Also note that although check_outliers() supports the pipe operators (|> or %>%), it does not support tidymodels at this time.
With this, I think I have addressed all the points, so I am closing this issue for now, but do not hesitate to comment again if more things come up. Thanks!
Even though
output:
  rticles::joss_article:
    journal: "JOSE"
renders the PDF correctly, it seems like the JOSE rendering process of the .md file does not like the LaTeX commands introduced by using flextable to produce the table... I suppose I'll just have to take a screenshot and leave it at that...