Giter VIP home page Giter VIP logo

gnm's People

Contributors

daissi avatar davidfirth avatar hturner avatar wdkrnls avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

gnm's Issues

eliminate = NULL in gnm

Calling gnm with 'eliminate = NULL' leads to failure despite the fact that NULL is the default value of eliminate. The reason is the code snippet

if (!missing(eliminate)) {
eliminate <- modelData$(eliminate)
if (!is.factor(eliminate))
stop("'eliminate' must be a factor")

Wouldn't it be better to replace the first line with 'if (!is.null(eliminate))'?

Error in if (abs(dev - devold)/(0.1 + abs(dev)) < control$epsilon)

Hi,

I'm currently using gnm for constructing conditional poisson models. I had the model specified as
gnm(event_count ~ factor(exposure) + cov1 + ... + cov n, data = data, family = 'poisson', offset = logtime, eliminate = factor(analytical_unit))

I encountered the error message as Error in if (abs(dev - devold)/(0.1 + abs(dev)) < control$epsilon) { : missing value where TRUE/FALSE needed

However, when I tried the formula specified differently, using a subset of covariates, it would run. I also tried to run the same formula on a dataset with same structure, the function would run as well. Additionally, I checked the data that I feed into the function had no missing values.

I could not find documentation about this error on stackoverflow or CRAN. Do you know where the problem is? Thanks you so much!

Best,
Monica

conditional quasi Poisson regression

Hi,
I'm wondering how do you estimate the dispersion parameter when doing conditional Poisson regression? I called gnm() and eliminate the strata (i.e. group, class) and set family = quasipoisson. My guess is this function first do conditional Poisson (which is a multinomial regression) and get the estimates of covariates' effects, then estimate the dispersion parameter (assumed common, for all stratae?) using the moment estimator. Can you clarify the algorithm, or provide any references? Thank you!

Release gnm 1.1-5

Prepare for release:

  • git pull
  • Check current CRAN check results
  • Polish NEWS
  • urlchecker::url_check()
  • devtools::build_readme()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • git push

Submit to CRAN:

  • usethis::use_version('patch')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted ๐ŸŽ‰
  • usethis::use_github_release()
  • usethis::use_dev_version(push = TRUE)

speed up summary.gnm?

summary() takes a while on gnm models when there are many parameters, compared to glm(), perhaps due to parameter identification checks. Can this be improved?

(c.f. email query from @nalimilan)

Using confint.gnm() inside a function

I believe this is related to issue 1. Consider

library(gnm)
f <- function(d) {
    fit <- gnm(Freq ~ vote, eliminate=class, family=poisson, data=d)
    confint(fit)
}
f(as.data.frame(cautres))

For me, this fails because d could not be found - presumably because some function downstream from update.gnm() (called by profile.gnm()) does not search in the correct environment. I'm sorry I'm not able to locate the exact source of the error, but I don't fully understand what's going on with environments in the following evaluated calls to gnm(), gnmTerms() etc.

Thanks for quickly fixing the previous issue and for your ongoing efforts!

Possible bug in confint.profile.gnm()

I believe there is a subsetting bug in confint.profile.gnm() which is triggered when there is only 1 covariate. Consider

library(gnm)
d <- as.data.frame(cautres)
fit <- gnm(Freq ~ vote, eliminate=class, family=poisson, data=d)
confint(fit)

For me, this returns an error message which appears to be due to lines 30-31 in file confint.profile.gnm.R:

std.err <- attr(object, "summary")$coefficients[, "Std. Error"]
parm <- parm[!is.na(std.err)[parm]]

After replacing these lines with

std.err <- attr(object, "summary")$coefficients[, "Std. Error", drop=FALSE]
parm <- parm[!is.na(std.err)[parm, ]]

The error disappears.

Many thanks for developing gnm and making it freely available! Conditional Poisson regression is very useful in epidemiology.

How to extract the result from gnm object by using broom::tidy/stats::confint within a forloop?

I had posted the same question in Stackoverflow.
https://stackoverflow.com/questions/77269879/how-to-extract-the-result-from-gnm-object-by-using-broomtidy-statsconfint-wi/77270019?noredirect=1#comment136222463_77270019

jared_mamrot https://stackoverflow.com/users/12957340/jared-mamrot and I had attempted different methods, including
for loop,lapply(), Map() and purrr::map(), and still could not extract the result from broom::tidy.

I know that broom::tidy.gnm just directly uses the profile.gnm to obtain the estimate and confidence intervals from the gnm object. jared_mamrot suspects that the error related to how the function handles the args.

Here is the sample dataset and the code I used:

library(tidyverse)
library(gnm)
library(broom)
library(haven)

data = read_dta('londondataset2002_2006.dta')

data$ozone10 <- data$ozone/10

# GENERATE MONTH AND YEAR
data$month  <- as.factor(months(data$date))
data$year   <- as.factor(format(data$date, format="%Y") )
data$dow    <- as.factor(weekdays(data$date))
data$stratum <- as.factor(data$year:data$month:data$dow)

data <- data[order(data$date),]

# FIT A CONDITIONAL POISSON MODEL WITH A YEAR X MONTH X DOW STRATA
modelcpr = vector(mode = 'list',length = 12)

for(i in 1:12){
  
  
  modelcpr1 <- gnm(numdeaths ~ ozone10 + 
                               lag(temperature,i), data=data, family=poisson,
                               eliminate=factor(stratum))
  
  
  modelcpr[[i]] = broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>% 
    select(estimate, conf.low, conf.high) %>% as.matrix %>% unname
}

#vs
#only for i = 1

modelcpr1 <- gnm(numdeaths ~ ozone10 + 
                             lag(temperature,1), data=data, family=poisson,
                             eliminate=factor(stratum))

#broom::tidy
broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>% 
            select(estimate, conf.low, conf.high) %>% as.matrix %>% unname

The dataset and part of the code are from this paper:

https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-14-122#Sec13

dataset:
https://static-content.springer.com/esm/art%3A10.1186%2F1471-2288-14-122/MediaObjects/12874_2014_1140_MOESM2_ESM.zip

code:
https://static-content.springer.com/esm/art%3A10.1186%2F1471-2288-14-122/MediaObjects/12874_2014_1140_MOESM1_ESM.docx


After running the forloop, the following error popped up:

modelcpr = vector(mode = 'list',length = 12)

for(i in 1:12){
  
  
  modelcpr1 <- gnm(numdeaths ~ ozone10 + 
                               lag(temperature,i), data=data, family=poisson,
                               eliminate=factor(stratum))
  
  
  modelcpr[[i]] = broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>% 
    select(estimate, conf.low, conf.high) %>% as.matrix %>% unname
}

Error in profile.gnm(object, which = parm, alpha = 1 - level, trace = trace) : 
  profiling has found a better solution, so original fit had not converged
In addition: Warning message:
In sqrt((deviance(updated) - fittedDev)/disp) : NaNs produced

where i = 1.

However, when I run the for loop one by one:

> modelcpr1 <- gnm(numdeaths ~ ozone10 + 
+                              lag(temperature,1), data=data, family=poisson,
+                              eliminate=factor(stratum))
> 
> #broom::tidy
> broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>% 
+             select(estimate, conf.low, conf.high) %>% as.matrix %>% unname
         [,1]      [,2]     [,3]
[1,] 1.000446 0.9988817 1.002013

I can obtain the result without any error. Is there any thing that I missed?

Release gnm 1.1.3

Prepare for release:

  • git pull
  • Check current CRAN check results
  • Polish NEWS
  • urlchecker::url_check()
  • devtools::build_readme()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • git push

Submit to CRAN:

  • usethis::use_version('patch')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted ๐ŸŽ‰
  • usethis::use_github_release()
  • usethis::use_dev_version(push = TRUE)

compatibility with broom?

In general, better support for getting results out, e.g. with stargazer and similar packages. Not sure what would be most useful here.

The Symm() function produces NA's with more than two factors

The function Symm() cannot handle more than two factors. This means that the information on the help page of Symm() as well as in the vignette, viz. where it is said that ... take take more than two factors, is misleading.
An illustration comes from trying to implement a Complete Symmetry model which Agresti (2010: 242-246) describes. The data are Table 8.5 in Agresti (2010: 242) and can be created as follows:

df8_5 <- data.frame(
  A = rep(1:3, each = 9),
  B = rep(rep(1:3, each = 3), times = 3),
  C = rep(1:3, times = 9),
  Freq = c(6, 4, 5, 3, 13, 10, 1, 8, 14,
           2, 3, 2, 1, 3, 1, 2, 1, 2,
           1, 0, 2, 0, 0, 0, 1, 1, 0)
)

The Symm() function works properly when used with two factors:

with(df8_5, Symm(A, B))
 [1] 1:1 1:1 1:1 1:2 1:2 1:2 1:3 1:3 1:3 1:2 1:2 1:2 2:2 2:2 2:2 2:3 2:3 2:3 1:3 1:3
[21] 1:3 2:3 2:3 2:3 3:3 3:3 3:3
Levels: 1:1 1:2 1:3 2:2 2:3 3:3
with(df8_5, Symm(A, C))
 [1] 1:1 1:2 1:3 1:1 1:2 1:3 1:1 1:2 1:3 1:2 2:2 2:3 1:2 2:2 2:3 1:2 2:2 2:3 1:3 2:3
[21] 3:3 1:3 2:3 3:3 1:3 2:3 3:3
Levels: 1:1 1:2 1:3 2:2 2:3 3:3
with(df8_5, Symm(B, C))
 [1] 1:1 1:2 1:3 1:2 2:2 2:3 1:3 2:3 3:3 1:1 1:2 1:3 1:2 2:2 2:3 1:3 2:3 3:3 1:1 1:2
[21] 1:3 1:2 2:2 2:3 1:3 2:3 3:3
Levels: 1:1 1:2 1:3 2:2 2:3 3:3

However, when used with three factors, the Symm() function returns a factor with the wrong levels and only containing NA values:

with(df8_5, Symm(A, B, C))
 [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
[17] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
Levels: 1:1 1:2 1:3 2:2 2:3 3:3

Release gnm 1.1-2

Prepare for release:

  • Check current CRAN check results
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • rhub::check(platform = 'ubuntu-rchk')
  • rhub::check_with_sanitizers()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md

Submit to CRAN:

  • usethis::use_version('patch')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted ๐ŸŽ‰
  • usethis::use_github_release()
  • usethis::use_dev_version()

environment issue

When gnm is used in a function it looks in the global environment for it rather than inside the function environment.

nonlinear fails with exact parameters

Hello,
I am one of the authors of the VFP package on CRAN which imports gnm.
Occasionally, using a custom nonlin model, the gnm fails with the error:
Error: no valid set of coefficients has been found: please supply starting values
This happens even if the exact parameters are provided as starting values.
This failure does not occur with all datasets. E.g., the following code runs, if the
data frame is restricted to lines 1:10.
I suppose it has to do with gamma models with identity link not being defined if the estimator becomes <0.
Of course this can be avoided sometimes using another parametrization, however, I want to get the variance estimates and the like on the original scale.

Here is a minimal example (using a linear model to get a reference):
Mean <- c(1237.1, 5605.55, 801.45, 2713.55, 570.4, 193.1, 97.2, 11.05,
202.5, 7031.75, 2679.6, 1252.55, 735.6, 4088.05, 9818.7, 4486.45,
3104.85, 1189.3, 217.6, 603.2, 28.45)
VC <- c(33696.08, 296681.045, 24842.205, 31777.205, 1705.28, 950.48,
2693.78, 244.205, 3026.42, 17578.125, 3.92, 8281.845, 1280.18,
76479.605, 4665.78, 130101.005, 0.125, 9800, 18355.28, 1152,
1618.805)
dat <- data.frame(Mean=Mean,VC=VC)
coeffs <- c(beta1 = 4792.94726157285, beta2 = 0.00366035757993686)
fitglm <- gnm(VC ~ I(Mean^2) , family = Gamma(link = "identity"), data = data.frame(dat), start=coeffs, trace=TRUE)
print(fitglm$coefficients)
powfun <- function(x)
{
list(
predictors=list(beta1 = 1, beta2 = 1),
variables=list(substitute(x)),
term=function(predictors,variables){
paste( predictors[1],"+",predictors[2],"*",variables[1],"^2")
}
)
}
class(powfun) <- "nonlin"
form <- VC ~ powfun(Mean)-1
fitgnm <- gnm(formula = form, family = Gamma(link = "identity"), data = data.frame(dat), start=coeffs, trace=TRUE,iterMax=100)

Kind regards
Florian

Upkeep for gnm (2023)

  • usethis::use_roxygen_md()
  • usethis::use_pkgdown_github_pages()
  • usethis::use_tidy_description()
  • usethis::use_package_doc()
    Consider letting usethis manage your @importFrom directives here. usethis::use_import_from() is handy for this.
  • usethis::use_testthat(3) and upgrade to 3e, testthat 3e vignette
  • Align the names of R/ files and test/ files for workflow happiness. The docs for usethis::use_r() include a helpful script. usethis::rename_files() may be be useful.
  • usethis::use_code_of_conduct()
  • Add alt-text to pictures, plots, etc; see https://posit.co/blog/knitr-fig-alt/ for examples

Set up or update GitHub Actions.
Updating workflows to the latest version will often fix troublesome actions:

  • usethis::use_github_action('check-standard')
  • usethis::use_github_action('test-coverage')

Created on 2023-08-26 with usethis::use_upkeep_issue(), using usethis v2.2.0

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.