hturner / gnm Goto Github PK

View Code? Open in Web Editor NEW

16.0 16.0 8.0 11.72 MB

gnm package for generalized nonlinear models in R

License: GNU General Public License v2.0

R 96.63% C 1.42% TeX 1.96%

generalized-linear-models generalized-nonlinear-models r statistical-models

gnm's People

Contributors

Stargazers

Watchers

Forkers

guhjy ganzeb daissi karthy257 wdkrnls hadley russianimperialscott octantbio

gnm's Issues

eliminate = NULL in gnm

Calling gnm with 'eliminate = NULL' leads to failure despite the fact that NULL is the default value of eliminate. The reason is the code snippet

if (!missing(eliminate)) {
eliminate <- modelData$(eliminate)
if (!is.factor(eliminate))
stop("'eliminate' must be a factor")

Wouldn't it be better to replace the first line with 'if (!is.null(eliminate))'?

extend Dref to continuous covariates?

Dref and DrefWeights designed for dependence on factors, can this be extended to continuous covariates?

(c.f. email query Xiaoguang Fan )

Error in if (abs(dev - devold)/(0.1 + abs(dev)) < control$epsilon)

Hi,

I'm currently using gnm for constructing conditional poisson models. I had the model specified as
gnm(event_count ~ factor(exposure) + cov1 + ... + cov n, data = data, family = 'poisson', offset = logtime, eliminate = factor(analytical_unit))

I encountered the error message as Error in if (abs(dev - devold)/(0.1 + abs(dev)) < control$epsilon) { : missing value where TRUE/FALSE needed

However, when I tried the formula specified differently, using a subset of covariates, it would run. I also tried to run the same formula on a dataset with same structure, the function would run as well. Additionally, I checked the data that I feed into the function had no missing values.

I could not find documentation about this error on stackoverflow or CRAN. Do you know where the problem is? Thanks you so much!

Best,
Monica

conditional quasi Poisson regression

Hi,
I'm wondering how do you estimate the dispersion parameter when doing conditional Poisson regression? I called gnm() and eliminate the strata (i.e. group, class) and set family = quasipoisson. My guess is this function first do conditional Poisson (which is a multinomial regression) and get the estimates of covariates' effects, then estimate the dispersion parameter (assumed common, for all stratae?) using the moment estimator. Can you clarify the algorithm, or provide any references? Thank you!

Release gnm 1.1-5

Prepare for release:

Submit to CRAN:

usethis::use_version('patch')
devtools::submit_cran()
Approve email

Wait for CRAN...

Accepted 🎉
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)

speed up summary.gnm?

summary() takes a while on gnm models when there are many parameters, compared to glm(), perhaps due to parameter identification checks. Can this be improved?

(c.f. email query from @nalimilan)

Add reference to Diagonal Reference Models paper (van der Waal et al)

Consider adding a reference to the paper by van der Waal et al discussing the usefulness of Diagonal Reference Models compared to other standard types of analysis in Public Health: https://link.springer.com/article/10.1007/s00038-017-1018-x

Does GNM support nonlinear parameter boundary?

Hi there,
I'm wondering whether users are allowed to set upper and lower boundary for nonlinear parameters when fitting a GNM model.
Thanks,
Xiaotong

Using confint.gnm() inside a function

I believe this is related to issue 1. Consider

library(gnm)
f <- function(d) {
    fit <- gnm(Freq ~ vote, eliminate=class, family=poisson, data=d)
    confint(fit)
}
f(as.data.frame(cautres))

For me, this fails because d could not be found - presumably because some function downstream from update.gnm() (called by profile.gnm()) does not search in the correct environment. I'm sorry I'm not able to locate the exact source of the error, but I don't fully understand what's going on with environments in the following evaluated calls to gnm(), gnmTerms() etc.

Thanks for quickly fixing the previous issue and for your ongoing efforts!

Possible bug in confint.profile.gnm()

I believe there is a subsetting bug in confint.profile.gnm() which is triggered when there is only 1 covariate. Consider

library(gnm)
d <- as.data.frame(cautres)
fit <- gnm(Freq ~ vote, eliminate=class, family=poisson, data=d)
confint(fit)

For me, this returns an error message which appears to be due to lines 30-31 in file confint.profile.gnm.R:

std.err <- attr(object, "summary")$coefficients[, "Std. Error"]
parm <- parm[!is.na(std.err)[parm]]

After replacing these lines with

std.err <- attr(object, "summary")$coefficients[, "Std. Error", drop=FALSE]
parm <- parm[!is.na(std.err)[parm, ]]

The error disappears.

Many thanks for developing gnm and making it freely available! Conditional Poisson regression is very useful in epidemiology.

Better handling of non-full-rank model specifications with predict.gnm

Similar issue to here: hturner/BradleyTerry2#7

How to extract the result from gnm object by using broom::tidy/stats::confint within a forloop?

I had posted the same question in Stackoverflow.
https://stackoverflow.com/questions/77269879/how-to-extract-the-result-from-gnm-object-by-using-broomtidy-statsconfint-wi/77270019?noredirect=1#comment136222463_77270019

jared_mamrot https://stackoverflow.com/users/12957340/jared-mamrot and I had attempted different methods, including
for loop,lapply(), Map() and purrr::map(), and still could not extract the result from broom::tidy.

I know that broom::tidy.gnm just directly uses the profile.gnm to obtain the estimate and confidence intervals from the gnm object. jared_mamrot suspects that the error related to how the function handles the args.

Here is the sample dataset and the code I used:

library(tidyverse)
library(gnm)
library(broom)
library(haven)

data = read_dta('londondataset2002_2006.dta')

data$ozone10 <- data$ozone/10

# GENERATE MONTH AND YEAR
data$month  <- as.factor(months(data$date))
data$year   <- as.factor(format(data$date, format="%Y") )
data$dow    <- as.factor(weekdays(data$date))
data$stratum <- as.factor(data$year:data$month:data$dow)

data <- data[order(data$date),]

# FIT A CONDITIONAL POISSON MODEL WITH A YEAR X MONTH X DOW STRATA
modelcpr = vector(mode = 'list',length = 12)

for(i in 1:12){
  
  
  modelcpr1 <- gnm(numdeaths ~ ozone10 + 
                               lag(temperature,i), data=data, family=poisson,
                               eliminate=factor(stratum))
  
  
  modelcpr[[i]] = broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>% 
    select(estimate, conf.low, conf.high) %>% as.matrix %>% unname
}

#vs
#only for i = 1

modelcpr1 <- gnm(numdeaths ~ ozone10 + 
                             lag(temperature,1), data=data, family=poisson,
                             eliminate=factor(stratum))

#broom::tidy
broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>% 
            select(estimate, conf.low, conf.high) %>% as.matrix %>% unname

The dataset and part of the code are from this paper:

https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-14-122#Sec13

dataset:
https://static-content.springer.com/esm/art%3A10.1186%2F1471-2288-14-122/MediaObjects/12874_2014_1140_MOESM2_ESM.zip

code:
https://static-content.springer.com/esm/art%3A10.1186%2F1471-2288-14-122/MediaObjects/12874_2014_1140_MOESM1_ESM.docx

After running the forloop, the following error popped up:

modelcpr = vector(mode = 'list',length = 12)

for(i in 1:12){
  
  
  modelcpr1 <- gnm(numdeaths ~ ozone10 + 
                               lag(temperature,i), data=data, family=poisson,
                               eliminate=factor(stratum))
  
  
  modelcpr[[i]] = broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>% 
    select(estimate, conf.low, conf.high) %>% as.matrix %>% unname
}

Error in profile.gnm(object, which = parm, alpha = 1 - level, trace = trace) : 
  profiling has found a better solution, so original fit had not converged
In addition: Warning message:
In sqrt((deviance(updated) - fittedDev)/disp) : NaNs produced

where i = 1.

However, when I run the for loop one by one:

> modelcpr1 <- gnm(numdeaths ~ ozone10 + 
+                              lag(temperature,1), data=data, family=poisson,
+                              eliminate=factor(stratum))
> 
> #broom::tidy
> broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>% 
+             select(estimate, conf.low, conf.high) %>% as.matrix %>% unname
         [,1]      [,2]     [,3]
[1,] 1.000446 0.9988817 1.002013

I can obtain the result without any error. Is there any thing that I missed?

move `svygnm` from logmult package to gnm?

@nalimilan implemented this for gnm in general so perhaps could be incorporated in gnm (with appropriate credit of course).

Release gnm 1.1.3

Prepare for release:

Submit to CRAN:

usethis::use_version('patch')
devtools::submit_cran()
Approve email

Wait for CRAN...

Accepted 🎉
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)

compatibility with broom?

In general, better support for getting results out, e.g. with stargazer and similar packages. Not sure what would be most useful here.

Fortran issues

Please see the problems shown on
https://www.stats.ox.ac.uk/pub/bdr/BLAS/gnm.out

For reproduction details see
https://www.stats.ox.ac.uk/pub/bdr/BLAS/README.txt .

The Symm() function produces NA's with more than two factors

The function Symm() cannot handle more than two factors. This means that the information on the help page of Symm() as well as in the vignette, viz. where it is said that ... take take more than two factors, is misleading.
An illustration comes from trying to implement a Complete Symmetry model which Agresti (2010: 242-246) describes. The data are Table 8.5 in Agresti (2010: 242) and can be created as follows:

df8_5 <- data.frame(
  A = rep(1:3, each = 9),
  B = rep(rep(1:3, each = 3), times = 3),
  C = rep(1:3, times = 9),
  Freq = c(6, 4, 5, 3, 13, 10, 1, 8, 14,
           2, 3, 2, 1, 3, 1, 2, 1, 2,
           1, 0, 2, 0, 0, 0, 1, 1, 0)
)

The Symm() function works properly when used with two factors:

with(df8_5, Symm(A, B))
 [1] 1:1 1:1 1:1 1:2 1:2 1:2 1:3 1:3 1:3 1:2 1:2 1:2 2:2 2:2 2:2 2:3 2:3 2:3 1:3 1:3
[21] 1:3 2:3 2:3 2:3 3:3 3:3 3:3
Levels: 1:1 1:2 1:3 2:2 2:3 3:3

with(df8_5, Symm(A, C))
 [1] 1:1 1:2 1:3 1:1 1:2 1:3 1:1 1:2 1:3 1:2 2:2 2:3 1:2 2:2 2:3 1:2 2:2 2:3 1:3 2:3
[21] 3:3 1:3 2:3 3:3 1:3 2:3 3:3
Levels: 1:1 1:2 1:3 2:2 2:3 3:3

with(df8_5, Symm(B, C))
 [1] 1:1 1:2 1:3 1:2 2:2 2:3 1:3 2:3 3:3 1:1 1:2 1:3 1:2 2:2 2:3 1:3 2:3 3:3 1:1 1:2
[21] 1:3 1:2 2:2 2:3 1:3 2:3 3:3
Levels: 1:1 1:2 1:3 2:2 2:3 3:3

However, when used with three factors, the Symm() function returns a factor with the wrong levels and only containing NA values:

with(df8_5, Symm(A, B, C))
 [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
[17] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
Levels: 1:1 1:2 1:3 2:2 2:3 3:3

Release gnm 1.1-2

Prepare for release:

Submit to CRAN:

usethis::use_version('patch')
devtools::submit_cran()
Approve email

Wait for CRAN...

Accepted 🎉
usethis::use_github_release()
usethis::use_dev_version()

add reference to `logmult` in vignette

Should be referred to for easier handling of association models.

environment issue

When gnm is used in a function it looks in the global environment for it rather than inside the function environment.

nonlinear fails with exact parameters

Hello,
I am one of the authors of the VFP package on CRAN which imports gnm.
Occasionally, using a custom nonlin model, the gnm fails with the error:
Error: no valid set of coefficients has been found: please supply starting values
This happens even if the exact parameters are provided as starting values.
This failure does not occur with all datasets. E.g., the following code runs, if the
data frame is restricted to lines 1:10.
I suppose it has to do with gamma models with identity link not being defined if the estimator becomes <0.
Of course this can be avoided sometimes using another parametrization, however, I want to get the variance estimates and the like on the original scale.

Here is a minimal example (using a linear model to get a reference):
Mean <- c(1237.1, 5605.55, 801.45, 2713.55, 570.4, 193.1, 97.2, 11.05,
202.5, 7031.75, 2679.6, 1252.55, 735.6, 4088.05, 9818.7, 4486.45,
3104.85, 1189.3, 217.6, 603.2, 28.45)
VC <- c(33696.08, 296681.045, 24842.205, 31777.205, 1705.28, 950.48,
2693.78, 244.205, 3026.42, 17578.125, 3.92, 8281.845, 1280.18,
76479.605, 4665.78, 130101.005, 0.125, 9800, 18355.28, 1152,
1618.805)
dat <- data.frame(Mean=Mean,VC=VC)
coeffs <- c(beta1 = 4792.94726157285, beta2 = 0.00366035757993686)
fitglm <- gnm(VC ~ I(Mean^2) , family = Gamma(link = "identity"), data = data.frame(dat), start=coeffs, trace=TRUE)
print(fitglm$coefficients)
powfun <- function(x)
{
list(
predictors=list(beta1 = 1, beta2 = 1),
variables=list(substitute(x)),
term=function(predictors,variables){
paste( predictors[1],"+",predictors[2],"*",variables[1],"^2")
}
)
}
class(powfun) <- "nonlin"
form <- VC ~ powfun(Mean)-1
fitgnm <- gnm(formula = form, family = Gamma(link = "identity"), data = data.frame(dat), start=coeffs, trace=TRUE,iterMax=100)

Kind regards
Florian

Upkeep for gnm (2023)

usethis::use_roxygen_md()
usethis::use_pkgdown_github_pages()
usethis::use_tidy_description()
usethis::use_package_doc()
Consider letting usethis manage your @importFrom directives here. usethis::use_import_from() is handy for this.
usethis::use_testthat(3) and upgrade to 3e, testthat 3e vignette
Align the names of R/ files and test/ files for workflow happiness. The docs for usethis::use_r() include a helpful script. usethis::rename_files() may be be useful.
usethis::use_code_of_conduct()
Add alt-text to pictures, plots, etc; see https://posit.co/blog/knitr-fig-alt/ for examples

Set up or update GitHub Actions.
Updating workflows to the latest version will often fix troublesome actions:

usethis::use_github_action('check-standard')
usethis::use_github_action('test-coverage')

^{Created on 2023-08-26 with usethis::use_upkeep_issue(), using usethis v2.2.0}

make it easier to extract coefficients from getContrasts?

getContrasts returns a "qv" object so maybe a coef method for "qv" objects?

hturner / gnm Goto Github PK

gnm's People

Contributors

Stargazers

Watchers

Forkers

gnm's Issues

Recommend Projects

Recommend Topics

Recommend Org