leifeld / texreg
Conversion of R Regression Output to LaTeX or HTML Tables
Hi there,
Great package, I've been using it to do some gnls tables recently... If I have the tidyr package loaded before I define my extract.gnls method (using setMethod to define), I run into an error as extract is a standard generic already defined and exported by tidyr.
Is there a good workaround for this?
Thanks,
Dan
Comment from user an ri on R-Forge:
I find myself manually adding a custom footnote after the tabular environment. The built in function custom.note puts the note into a multicolumn environment, but my annotations are often long, and the table proportions get distorted if I put it into this environment. So I add a custom note below the tabular environment using "\footnotesize". I imagine many other people do so as well.
[message also sent directly to p. leifeld]
I think there may be something wrong with the
way texreg calculates stars on these objects. Here's
the result of a survreg estimation run:
Call:
survreg(formula = Surv(N_Days, censDecSummer) ~ 1 + chief + CrimX +
CivLibX + EconX + FirstAmX + Total_Amicus + DuePX + AttysX +
UnionsX + JudPowX + FedismX + TaxX + cluster(term) + percuriam +
nDissents + nConcurs + nCresults + nytSalience + declarationUncon,
data = timing, subset = nz, dist = "weibull")
Value Std. Err (Naive SE) z p
(Intercept) 4.2156 0.11722 0.07189 35.962 3.23e-283
chiefRehnquist -0.2313 0.04735 0.02478 -4.883 1.04e-06
chiefRoberts -0.2146 0.05736 0.03802 -3.741 1.83e-04
chiefVinson -0.1704 0.09280 0.03114 -1.836 6.63e-02
chiefWarren -0.3470 0.06112 0.02423 -5.677 1.37e-08
CrimXTRUE -0.0705 0.09471 0.06245 -0.745 4.57e-01
CivLibXTRUE -0.0609 0.08270 0.06318 -0.737 4.61e-01
EconXTRUE -0.0878 0.08948 0.06223 -0.981 3.27e-01
FirstAmXTRUE 0.0577 0.10986 0.06924 0.525 6.00e-01
Total_Amicus 0.0195 0.00408 0.00257 4.775 1.79e-06
DuePXTRUE 0.0823 0.11426 0.07440 0.720 4.71e-01
AttysXTRUE 0.0455 0.11492 0.10697 0.396 6.92e-01
UnionsXTRUE -0.1243 0.11907 0.07111 -1.044 2.97e-01
JudPowXTRUE -0.0410 0.09329 0.06360 -0.440 6.60e-01
FedismXTRUE -0.1513 0.08744 0.07121 -1.731 8.35e-02
TaxXTRUE -0.0961 0.12706 0.07250 -0.756 4.50e-01
percuriam -0.3020 0.07135 0.03308 -4.232 2.31e-05
nDissents 0.1637 0.00905 0.00611 18.097 3.34e-73
nConcurs 0.1084 0.01498 0.01187 7.237 4.60e-13
nCresults 0.1583 0.01771 0.01083 8.937 3.98e-19
nytSalience 0.2927 0.05540 0.02876 5.284 1.26e-07
declarationUncon 0.0743 0.02610 0.01972 2.846 4.43e-03
Log(scale) -0.4851 0.03711 0.00985 -13.074 4.66e-39
Scale= 0.616
Weibull distribution
Loglik(model)= -28011.6 Loglik(intercept only)= -29009.1
Chisq= 1995.02 on 21 degrees of freedom, p= 0
(Loglikelihood assumes independent observations)
Number of Newton-Raphson Iterations: 5
n=7387 (9 observations deleted due to missingness)
and here's what happens when I run this through screenreg:
(etr1 is the result of extract.survreg)
> screenreg(etr1,single.row=T,digits=3,custom.model.names="Model 1")
========================================
Model 1
----------------------------------------
(Intercept) 4.216 (0.117)
chiefRehnquist -0.231 (0.047) ***
chiefRoberts -0.215 (0.057) ***
chiefVinson -0.170 (0.093) ***
chiefWarren -0.347 (0.061) ***
CrimXTRUE -0.071 (0.095) ***
CivLibXTRUE -0.061 (0.083) ***
EconXTRUE -0.088 (0.089) ***
FirstAmXTRUE 0.058 (0.110)
Total_Amicus 0.019 (0.004)
DuePXTRUE 0.082 (0.114)
AttysXTRUE 0.045 (0.115)
UnionsXTRUE -0.124 (0.119) ***
JudPowXTRUE -0.041 (0.093) ***
FedismXTRUE -0.151 (0.087) ***
TaxXTRUE -0.096 (0.127) ***
percuriam -0.302 (0.071) ***
nDissents 0.164 (0.009)
nConcurs 0.108 (0.015)
nCresults 0.158 (0.018)
nytSalience 0.293 (0.055)
declarationUncon 0.074 (0.026)
Log(scale) -0.485 (0.037) ***
----------------------------------------
AIC 56069.214
Log Likelihood -28011.607
========================================
*** p < 0.001, ** p < 0.01, * p < 0.05
The stars look wrong to me, at least if they're being
calculated as coeff/std-error, which appears to be represented
as pvalue in the extract.survreg result. To check this, I did:
> c1<-etr1@coef # extract coeffs
> se1<-etr1@se # extract se s
> t1<-c1/se1 # calculate tstats
> p1<-etr1@pvalues # extract pvalues, for comparison
> cbind(c1,se1,t1,p1)
c1 se1 t1 p1
(Intercept) 4.21563707 0.117223289 35.9624535 35.9624535
chiefRehnquist -0.23125223 0.047354249 -4.8834526 -4.8834526
chiefRoberts -0.21459294 0.057360414 -3.7411331 -3.7411331
chiefVinson -0.17039978 0.092795972 -1.8362842 -1.8362842
chiefWarren -0.34695761 0.061116736 -5.6769655 -5.6769655
CrimXTRUE -0.07052040 0.094711720 -0.7445795 -0.7445795
CivLibXTRUE -0.06094230 0.082695655 -0.7369468 -0.7369468
EconXTRUE -0.08775338 0.089477503 -0.9807313 -0.9807313
FirstAmXTRUE 0.05768771 0.109862080 0.5250921 0.5250921
Total_Amicus 0.01948112 0.004079518 4.7753480 4.7753480
DuePXTRUE 0.08227394 0.114257493 0.7200748 0.7200748
AttysXTRUE 0.04548361 0.114924076 0.3957710 0.3957710
UnionsXTRUE -0.12430357 0.119071832 -1.0439377 -1.0439377
JudPowXTRUE -0.04100121 0.093290039 -0.4395026 -0.4395026
FedismXTRUE -0.15133097 0.087437920 -1.7307247 -1.7307247
TaxXTRUE -0.09606226 0.127062616 -0.7560230 -0.7560230
percuriam -0.30197049 0.07136025 -4.2324781 -4.2324781
nDissents 0.16370191 0.009045561 18.0974865 18.0974865
nConcurs 0.10842433 0.014982691 7.2366394 7.2366394
nCresults 0.15831111 0.017713210 8.9374602 8.9374602
nytSalience 0.29272654 0.055398539 5.2840120 5.2840120
declarationUncon 0.07429407 0.026103381 2.8461475 2.8461475
Log(scale) -0.48509865 0.037105383 -13.0735386 -13.0735386
Note that the last two columns are the same. What I don't see
is why chiefRehnquist, with a pvalue of -4.88 gets 3 stars, while
nCresults, with a pvalue of 8.94 gets none. Or why nDissents,
with a pvalue of 18.10 also gets none. Am I missing something
here? (Now that I look at it, it seems that only negative
pvalues are being starred: could there be a problem
with positive values?).
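The pattern in the table is what one would expect if the pvalues slot holds the test statistics themselves: every negative value lies below every threshold and gets three stars, while positive values of 0.05 or more get none. A minimal sketch in base R of how the p-values could be derived instead, assuming a two-sided Wald test against the standard normal (which reproduces the p column of the survreg summary). The variable names are illustrative and not taken from extract.survreg:

```r
# z statistics copied (rounded) from the survreg summary above
z <- c(chiefRehnquist = -4.883, nDissents = 18.097, FirstAmXTRUE = 0.525)

# Two-sided p-value for a z statistic under the standard normal
p <- 2 * pnorm(-abs(z))

# Star assignment using the thresholds printed below the screenreg table
stars <- ifelse(p < 0.001, "***",
         ifelse(p < 0.01, "**",
         ifelse(p < 0.05, "*", "")))

print(data.frame(z = z, p = p, stars = stars))
```

With these p-values, both the large negative and the large positive statistics receive stars, and FirstAmXTRUE receives none.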
First, thank you so much for this awesome package. It's been really helpful.
The main texreg() function contains some code that checks to see if the file output is a string, e.g.:
else if (!is.character(file)) {
stop("The 'file' argument must be a character string.")
}
However, on Unix systems it is possible to copy directly to the clipboard. On Mac OS, it is as simple as file=pipe('pbcopy').
Would you accept a pull request enabling direct copy of the outputted code to the clipboard?
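A minimal sketch of what a relaxed check could look like, assuming the output is a character vector of lines. The helper write_table_output is hypothetical and not part of texreg; it relies on writeLines(), which accepts either a file name or a connection:

```r
# Hypothetical helper (not texreg code): accept a file name or a connection.
write_table_output <- function(output, file) {
  ok <- (is.character(file) && length(file) == 1) ||
    inherits(file, "connection")
  if (!ok) {
    stop("The 'file' argument must be a character string or a connection.")
  }
  # writeLines() opens an unopened connection for the duration of the call
  writeLines(output, con = file)
  invisible(output)
}

# Usage with a file connection; pipe("pbcopy") on macOS is passed the same way
target <- tempfile(fileext = ".tex")
con <- file(target)
write_table_output(c("\\begin{tabular}{l c}", "\\end{tabular}"), con)
close(con)
print(readLines(target))
```

Keeping the character-string branch means existing calls with a file name behave as before.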
In general, I think it is a good idea to look for partial matching throughout the package.
Just run your tests with:
options(warnPartialMatchAttr = TRUE,
warnPartialMatchDollar = TRUE,
warnPartialMatchArgs = TRUE)
options(showWarnCalls = TRUE) # gives more hints where the partial matching occurs
There do not seem to be test files in the package, but I assume you have some available locally?
Comment by user felix_h (Felix Haas) on R-Forge:
My second relates to custom additional rows. Sometimes it is desirable to indicate whether some set of variables was included in a model but not shown (e.g. country fixed-effects), see e.g. the "Time Dummies" row in this table:
Is it possible to add a similar functionality to texreg?
Again thanks for such a great package and keep up the good work!
Best
Felix
There is a small issue with the extract.ols() function where the option include.rsquared = F with simultaneous include.adjrs = T throws an error because the rs object is only defined within the if (include.rsquared == TRUE) condition. I modified the code so that if (include.adjrs == TRUE) directly pulls the R2 from the model stats, and added a pull request.
Comment by user ashenkin (Alexander Shenkin) on R-Forge:
Hi there,
The predictors in my models often have underscores. This seems to lead to illegal TeX, as the underscores don't get escaped by texreg. Is there any way to get texreg to automatically escape such characters when it produces the table, or will I have to make custom names?
thanks,
alex
my_df = data.frame(a = runif(100), b = runif(100), c_d = rep(c("a","b"),50))
my_mod = lmer(a ~ b + (1|c_d), data = my_df)
texreg(my_mod)
\begin{table}
\begin{center}
\begin{tabular}{l c }
\hline
& Model 1 \\
\hline
(Intercept) & $0.45^{***}$ \\
& $(0.05)$ \\
b & $0.12$ \\
& $(0.10)$ \\
\hline
AIC & 45.63 \\
BIC & 56.05 \\
Log Likelihood & -18.82 \\
Num. obs. & 100 \\
Num. groups: c_d & 2 \\
Variance: c_d.(Intercept) & 0.00 \\
Variance: Residual & 0.08 \\
\hline
\multicolumn{2}{l}{\scriptsize{$^{***}p<0.001$, $^{**}p<0.01$, $^*p<0.05$}}
\end{tabular}
\caption{Statistical models}
\label{table:coefficients}
\end{center}
\end{table}
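Until escaping is automatic, names can be escaped by hand before they are used as custom names. A minimal sketch in base R; escape_latex is a hypothetical helper, not a texreg function, and it only covers three common special characters:

```r
# Hypothetical helper: escape a few LaTeX special characters in labels.
# fixed = TRUE treats both pattern and replacement as literal strings.
escape_latex <- function(x) {
  x <- gsub("_", "\\_", x, fixed = TRUE)
  x <- gsub("%", "\\%", x, fixed = TRUE)
  x <- gsub("&", "\\&", x, fixed = TRUE)
  x
}

escaped <- escape_latex(c("Num. groups: c_d", "b"))
cat(escaped, sep = "\n")
```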
Comment by mdanese (Mark Danese) on R-Forge:
It would also be nice to use the single.row option to include the estimate and the standard error on the same row, and then use ci.test to report the confidence interval on the second line. That way one can provide a complete report that doesn't require the reader to choose between SE and CI. Since both arguments report two values, it should probably not hurt the table formatting. And wrapping the CI in parentheses or brackets would be good.
2.50 (0.50)
[1.50, 3.00]
It would be great to have an option (or default) to put commas in between numbers that are integers.
4,234,332
Comment from user an ri on R-Forge:
I find myself manually stacking different texreg-tables "on top of each other". That is, I split the table with strsplit, "grep" it from the beginning to the "Num Obs" field, and place the content of a second texreg-table directly below this. I wish there was an easier way of doing this. Something along the lines of an rbind-method for texreg-tables.
The possibility of customizing tables in latex is nearly infinite, so it should not be expected that all the possible options will ever be added to texreg. Several open issues relate to this kind of enhancements: #2, #3, or #25.
An easier approach would be to add an option (e.g., tabular = FALSE) for exporting just what is inside the tabular environment: the coefficients and GOF. Experienced LaTeX users each have their own habits when building tables and it may be faster for them to copy/paste the table and tabular environments from previous tables than to understand from texreg options how to generate tables in their own way.
I did it for my own use in https://github.com/christophe-gouel/texreg. Let me know if you want a pull request.
Bug report by e-mail from Christopher Gandrud on 17 March 2016:
Hi Philip
I noticed in texreg 1.36.4 that users basically can’t (or can’t easily) omit coefficients and provide custom coefficient names for the remaining coefficients.
This is because in:
m <- customnames(m, custom.coef.names) #rename coefficients
m <- rearrangeMatrix(m) #resort matrix and conflate duplicate entries
m <- as.data.frame(m)
m <- omitcoef(m, omit.coef) #remove coefficient rows matching regex
omitcoef comes after customnames has been applied. In the attached file I simply moved the omitcoef call before customnames. Now I can omit coefficients and rename them.
Best
Christopher
Attached revised version of texreg: texreg.R
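The effect of the ordering can be illustrated without texreg. A minimal sketch in base R, assuming a toy named vector stands in for the coefficient rows; it omits by the original names first and renames afterwards, so only the remaining coefficients need custom names:

```r
# Toy coefficients standing in for the rows of the table matrix
coefs <- c("(Intercept)" = 1.2, x1 = 0.5, x2 = -0.3)
omit.coef <- "x2"                                   # regex on original names
custom.coef.names <- c("Intercept", "First predictor")

# Step 1: drop rows whose original name matches the regex
kept <- coefs[!grepl(omit.coef, names(coefs))]

# Step 2: rename what is left; the lengths must now agree
stopifnot(length(kept) == length(custom.coef.names))
names(kept) <- custom.coef.names
print(kept)
```

With the opposite order, the custom name vector has to cover all three coefficients and the regex has to match the custom names.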
Comment by user jonkeane (Jonathan Keane) on R-Forge:
I noticed that there is a typo in the stars note for html format when there are 4 levels. This results in an extra tag being shown in the html output.
line 976 in texreg.R currently is :
star.symbol, star.symbol, "</sup", css.sup, ">p < ", st[2],
but it looks like it should be:
star.symbol, star.symbol, "</sup>p < ", st[2],
I've tried this change, and the output looks good to me.
Often in published work we need to write 12345 as 12,345.
The way I do this by hand is with prettyNum: prettyNum(12345, big.mark = ','). But as it stands I have to clean this up in post-processing when using texreg. I suppose it's possible to add a prettyNum argument to extract methods to handle this.
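A minimal sketch in base R of how integer and decimal statistics could be formatted differently, assuming a logical vector like the gof.decimal used in the extract methods marks which values are decimals. The example values are illustrative:

```r
# Single value, as in the request above
print(prettyNum(12345, big.mark = ","))

# Illustrative statistics: one decimal value and two integer counts
gof <- c(0.7183, 42970, 4234332)
gof.decimal <- c(TRUE, FALSE, FALSE)

formatted <- character(length(gof))
# Decimals: fixed number of digits
formatted[gof.decimal] <- formatC(gof[gof.decimal], format = "f",
                                  digits = 2, big.mark = ",")
# Integers: no decimals, thousands separated by commas
formatted[!gof.decimal] <- formatC(gof[!gof.decimal], format = "d",
                                   big.mark = ",")
print(formatted)
```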
Comment by user Kevin Tappe on R-Forge:
I looked around a while for an option to print SE and p values in the table (in separate rows, preferably). It seems like, it is not possible to print the p values at all (other than by indicating significance by stars or by doing some dirty substitution tricks to substitute SEs for p values)?
Would be nice to have an option for that. (stargazer has the report option, e.g. report = c*sp to print coefficients (starred), SE, and p values).
I'm trying to use texreg to create outputs for results from the Johansen test results generated by the urca package. The problem I run into is that the function from the package returns an object of class summary.lm rather than the lm itself.
Since essentially all the information needed to render the table is contained in the summary object, it should be easy to use the summary object. However, I had a little bit of difficulty and it may be possible to change the code to make it easier.
summary.lm objects using texreg
The native extract.lm method uses almost exclusively data from the summary objects, so we are almost there. However, there is one bit missing in the summary: the number of observations. To get this information, the easiest way appears to be to simply count the length of the residuals (which are contained in the summary). This leads to the following extract.summary.lm function which is almost identical to the original one:
extract.summary.lm <- function (model, include.rsquared = TRUE, include.adjrs = TRUE,
include.nobs = TRUE, include.fstatistic = FALSE, include.rmse = TRUE,
...)
{
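s <- model  # the object passed in is already a summary.lm
names <- rownames(s$coefficients)  # full element name avoids partial matching
co <- s$coefficients[, 1]
se <- s$coefficients[, 2]
pval <- s$coefficients[, 4]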
rs <- s$r.squared
adj <- s$adj.r.squared
n <- length(s$residuals)
gof <- numeric()
gof.names <- character()
gof.decimal <- logical()
if (include.rsquared == TRUE) {
gof <- c(gof, rs)
gof.names <- c(gof.names, "R$^2$")
gof.decimal <- c(gof.decimal, TRUE)
}
if (include.adjrs == TRUE) {
gof <- c(gof, adj)
gof.names <- c(gof.names, "Adj. R$^2$")
gof.decimal <- c(gof.decimal, TRUE)
}
if (include.nobs == TRUE) {
gof <- c(gof, n)
gof.names <- c(gof.names, "Num. obs.")
gof.decimal <- c(gof.decimal, FALSE)
}
if (include.fstatistic == TRUE) {
fstat <- s$fstatistic[[1]]
gof <- c(gof, fstat)
gof.names <- c(gof.names, "F statistic")
gof.decimal <- c(gof.decimal, TRUE)
}
if (include.rmse == TRUE && !is.null(s$sigma[[1]])) {
rmse <- s$sigma[[1]]
gof <- c(gof, rmse)
gof.names <- c(gof.names, "RMSE")
gof.decimal <- c(gof.decimal, TRUE)
}
tr <- createTexreg(coef.names = names, coef = co, se = se,
pvalues = pval, gof.names = gof.names, gof = gof, gof.decimal = gof.decimal)
return(tr)
}
setMethod("extract", signature = 'summary.lm', definition = extract.summary.lm)
After this modification, it is possible to use the summary method to render the table:
> model <- lm(mpg ~ disp , data = mtcars)
> screenreg(list(model, summary(model)))
=================================
Model 1 Model 2
---------------------------------
(Intercept) 29.60 *** 29.60 ***
(1.23) (1.23)
disp -0.04 *** -0.04 ***
(0.00) (0.00)
---------------------------------
R^2 0.72 0.72
Adj. R^2 0.71 0.71
Num. obs. 32 32
RMSE 3.25 3.25
=================================
*** p < 0.001, ** p < 0.01, * p < 0.05
summary.lm objects natively
It would be a great help if the extract.lm function would also accept summary.lm objects - then this entire workaround would not be needed. The change would also appear to be quite minor.
Thanks for awesome package!
Sometimes, it is convenient to change the rows containing the coefficients and goodness of fit statistics with the columns containing the models. This is for example interesting if you compare many models which means you are focusing first of all only on gofs and you omit all specific coefs.
Model | L2 | df | p | BIC |
---|---|---|---|---|
NA | 42,970 | 64 | .000 | 42,225 |
Q0 | 1,500 | 61 | .000 | 790 |
Qx | 956 | 46 | .000 | 421 |
... | ... | ... | ... | ... |
Would it be possible to produce such a table with texreg?
PS: Results from the package gnm follow the same pattern as glm. Thus you could add setMethod("extract", signature = className("gnm", "gnm"), definition = extract.glm) to extract.R. Often, one would produce some additional results using gnm's function getContrasts. I am not sure but possibly the easiest way to implement these additional results is to manually adjust the texreg object.
<<copy-pasted from e-mail>>
I use the program LyX for formatting my documents with LaTeX. I find it very convenient for taking care of most of the annoying formatting bits that can be a pain to handle in raw LaTeX.
The problem is that copy-pasting single-spaced TeX into LyX screws up the formatting.
A workaround that I've discovered is simply double-spacing the code to be copy-pasted, a workaround which can be accomplished in R by simply changing cat(...,collapse="\n") to cat(...,collapse="\n\n") (basically, doubling the newline separator whenever it comes up works).
As such it should be simple to integrate into the function itself by allowing a lyx option with default FALSE, used like: cat(...,collapse=if (lyx) "\n\n" else "\n").
<<end>>
I don't use LyX so much any more, but I do think it can be a useful tool, and the fix is simple.
I'll file a PR sometime today.
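A minimal sketch of the proposed switch in base R, assuming the table is held as a character vector of lines that is joined with paste() before printing (collapse is an argument of paste(), whereas cat() uses sep). The function join_lines and its lyx argument are hypothetical names for illustration:

```r
# Hypothetical helper: join output lines, double-spaced when lyx = TRUE
join_lines <- function(lines, lyx = FALSE) {
  paste(lines, collapse = if (lyx) "\n\n" else "\n")
}

tex <- c("\\begin{tabular}{l c}", "\\hline", "\\end{tabular}")
cat(join_lines(tex, lyx = TRUE), "\n")
```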
I have a use case where I batch submit regressions and then save the serialized output (rds of the returned value of extract(lm) ). This is much better than stargazer (need to save lm, which carries around residuals) or tidy (cannot be entered into an outreg).
So, when I need to look through hundreds of robustness checks, I simply call a function that filters regressions on dependent variables and independent variables. Maybe a strange use case. Wondering if it would have general value to add something like this to the package. The beneficiaries would be anyone who has to run batch analysis they need to store for later review.
Also, in one set of interactions between client and server, I can only receive textual output. I have had to save a string representation of the list that contains a call to tidy and glance from broom, and then write my own bootleg texreg function. Perhaps it would aid batch analysis as well to have a string representation of an extract object. This might violate S4 principles, so it would require creating something like toString for texreg which could then be parsed and converted back.
Comment from user an ri on R-Forge:
I find myself manually adding a multicolumn-header to the output that indicates that two or more regressions belong to a certain group. For instance, if I have four regression models, I may want the word "Women" to appear above the first two regressions, and "Men" above the last two.
mlogit would benefit from having a beside option to govern the ordering of coefficient output without as much need for manual interference:
data("Fishing", package = "mlogit")
Fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode")
texreg(mlogit(mode ~ 0 | income, data = Fish))
output:
\begin{table}
\begin{center}
\begin{tabular}{l c }
\hline
& Model 1 \\
\hline
boat:(intercept) & $0.74^{***}$ \\
& $(0.20)$ \\
charter:(intercept) & $1.34^{***}$ \\
& $(0.19)$ \\
pier:(intercept) & $0.81^{***}$ \\
& $(0.23)$ \\
boat:income & $0.00^{*}$ \\
& $(0.00)$ \\
charter:income & $-0.00$ \\
& $(0.00)$ \\
pier:income & $-0.00^{**}$ \\
& $(0.00)$ \\
\hline
AIC & 2966.30 \\
Log Likelihood & -1477.15 \\
Num. obs. & 1182 \\
\hline
\multicolumn{2}{l}{\scriptsize{$^{***}p<0.001$, $^{**}p<0.01$, $^*p<0.05$}}
\end{tabular}
\caption{Statistical models}
\label{table:coefficients}
\end{center}
\end{table}
Prefer to have an option to produce instead:
\begin{table}
\begin{center}
\begin{tabular}{l c }
\hline
& Model 1 \\
\hline
boat:(intercept) & $0.74^{***}$ \\
& $(0.20)$ \\
boat:income & $0.00^{*}$ \\
& $(0.00)$ \\
charter:(intercept) & $1.34^{***}$ \\
& $(0.19)$ \\
charter:income & $-0.00$ \\
& $(0.00)$ \\
pier:(intercept) & $0.81^{***}$ \\
& $(0.23)$ \\
pier:income & $-0.00^{**}$ \\
& $(0.00)$ \\
\hline
AIC & 2966.30 \\
Log Likelihood & -1477.15 \\
Num. obs. & 1182 \\
\hline
\multicolumn{2}{l}{\scriptsize{$^{***}p<0.001$, $^{**}p<0.01$, $^*p<0.05$}}
\end{tabular}
\caption{Statistical models}
\label{table:coefficients}
\end{center}
\end{table}
This facilitates creating three groups (boat, charter, and pier).
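The requested ordering can be computed from the coefficient names alone. A minimal sketch in base R, assuming the alternative is the part of each name before the first colon; the resulting index vector is the kind of input a reordering argument would take:

```r
# Coefficient names in the order texreg currently prints them
coef.names <- c("boat:(intercept)", "charter:(intercept)", "pier:(intercept)",
                "boat:income", "charter:income", "pier:income")

# Alternative = text before the first colon
alternative <- sub(":.*$", "", coef.names)

# Group by alternative in order of first appearance; ties keep their order
idx <- order(match(alternative, unique(alternative)))
reordered <- coef.names[idx]
print(reordered)
```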
The documentation for 1.36.23, (texreg/omit.coef) says: "The 'omit.coef' argument is processed after the 'custom.coef.names' argument, so the regular expression should refer to the custom coefficient names."
This is incorrect as demonstrated by the following, where the omission is based on the original variable name rather than the custom name:
library(texreg)
y <- 1:20 + rnorm(20)
x1 <- rnorm(20)
x2 <- rnorm(20)
reg <- list()
reg[[1]] <- lm(y ~ x1)
reg[[2]] <- lm(y ~ x2)
reg[[3]] <- lm(y ~ x1 + x2)
texreg(reg) # three coefficients
texreg(reg, custom.coef.names=c('a', 'b3', 'b5'),
omit.coef='2') ## this omits "b5"
texreg(reg, custom.coef.names=c('a', 'b3', 'b5'),
omit.coef='b3') ## output is unchanged
A basic version was provided by Yves Croissant on 17 March 2017 by e-mail:
extract.mhurdle <- function (model, include.nobs = TRUE, ...){
s <- summary(model, ...)
names <- rownames(s$coefficients)
class(names) <- "character"
co <- s$coefficients[, 1]
se <- s$coefficients[, 2]
pval <- s$coefficients[, 4]
class(co) <- class(se) <- class(pval) <- "numeric"
n <- nobs(model)
lik <- logLik(model)
gof <- numeric()
gof.names <- character()
gof.decimal <- logical()
gof <- c(gof, n, lik)
gof.names <- c(gof.names, "Num. obs.", "Log Likelihood")
gof.decimal <- c(gof.decimal, FALSE, TRUE)
tr <- createTexreg(coef.names = names, coef = co, se = se, pvalues = pval,
gof.names = gof.names, gof = gof, gof.decimal = gof.decimal)
return(tr)
}
setMethod("extract", signature = className("mhurdle", "mhurdle"), definition = extract.mhurdle)
The code needs to be checked and added, together with entries in the help files, vignette, and NAMESPACE.
If I try to get an output from the table command with texreg I get an error
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘extract’ for signature ‘"table"’
texreg(table(c(2,5,5,5,5,7,7,7,7,NA),c(1,5,2,2,2,2,7,7,NA,NA)))
How can I get it?
I would like to get the left title rotated, as in this picture:
It is a great pleasure to use the texreg package, but recently I have needed to use speedglm to run Poisson regressions. Since it was developed recently, it is sadly not supported by texreg. Would you consider adding it? Thank you!
Comment from user yennywebbv (Yenny Webb Vargas) on R-Forge:
Hi!
Thanks for the great implementation!
I have an important suggestion: add an option to use a t distribution, with appropriate degrees of freedom taken from the model. As it is implemented, the confidence intervals do not match the lm confint ones. This is due to texreg using a standard normal distribution, as opposed to a t distribution that pulls the error degrees of freedom from the model.
I live in a small-sample world, and the degrees of freedom really make a big difference.
I attach code for a possible implementation of this. You would need to update the getdata function to output the degrees of freedom of the model, and have access to it in the S4 object (models[[i]]@df.residual).
Thanks!
-Y
ciforce = function (models, ci.force = rep(FALSE, length(models)), ci.level = 0.95,tdist=FALSE)
{
if (class(ci.force) == "logical" && length(ci.force) == 1) {
ci.force <- rep(ci.force, length(models))
}
if (class(ci.force) != "logical") {
stop("The 'ci.force' argument must be a vector of logical values.")
}
if (length(ci.force) != length(models)) {
stop(paste("There are", length(models), "models and",
length(ci.force), "ci.force values."))
}
for (i in 1:length(models)) {
if (ci.force[i] == TRUE && length(models[[i]]@se) > 0) {
if(tdist==TRUE){
quant <- qt(1 - ((1 - ci.level)/2), models[[i]]@df.residual)
}else{
quant <- qnorm(1 - ((1 - ci.level)/2))
}
upper <- models[[i]]@coef + (quant * models[[i]]@se)
lower <- models[[i]]@coef - (quant * models[[i]]@se)
models[[i]]@ci.low <- lower
models[[i]]@ci.up <- upper
models[[i]]@se <- numeric(0)
models[[i]]@pvalues <- numeric(0)
}
}
return(models)
}
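The difference between the two quantiles can be checked against confint() in base R. A minimal sketch, assuming an ordinary lm fit on the built-in mtcars data; it does not use texreg and only illustrates why the degrees of freedom matter:

```r
model <- lm(mpg ~ disp, data = mtcars)
est <- coef(model)
se <- sqrt(diag(vcov(model)))

ci.level <- 0.95
prob <- 1 - (1 - ci.level) / 2
quant.norm <- qnorm(prob)                          # standard normal quantile
quant.t <- qt(prob, df = df.residual(model))       # t quantile, residual df

ci.norm <- cbind(lower = est - quant.norm * se, upper = est + quant.norm * se)
ci.t <- cbind(lower = est - quant.t * se, upper = est + quant.t * se)

print(ci.norm)
print(ci.t)
print(confint(model, level = ci.level))            # agrees with ci.t
```

The t-based interval is wider than the normal-based one, and the gap grows as the residual degrees of freedom shrink.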
The code below runs correctly with texreg_1.36.18 ("logSigma" is omitted from the output), but with texreg_1.36.23 it generates this error:
Error in customnames(m, custom.coef.names) :
There are 3 coefficients, but you provided 4 custom names for them.
This can be sourced and run under the two versions of texreg.
library(texreg)
library(censReg)
df <- data.frame(y=sample(0:5, replace=TRUE, size=50), x=rnorm(50),
z=rnorm(50))
reg <- censReg(y ~ x + z, data=df, left=0, right=10)
reg <- lm(y ~ x + z, data=df)
custom.name <- c('Intercept', 'X', 'Z', 'logSigma')
f <- texreg(reg, custom.coef.names=custom.name,
omit.coef='logSigma')
print(f)
In some of the extract.* functions partial matching is done due to the $ operator.
This can be checked by setting these options and running the code:
options(warnPartialMatchAttr = TRUE,
warnPartialMatchDollar = TRUE,
warnPartialMatchArgs = TRUE)
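A minimal sketch in base R of what such a partial match looks like and how to avoid it, using a summary.lm object from the built-in mtcars data as an example:

```r
old <- options(warnPartialMatchDollar = TRUE)

s <- summary(lm(mpg ~ disp, data = mtcars))

# s$coef silently resolves to s$coefficients unless the warning is enabled
msg <- tryCatch(s$coef, warning = function(w) conditionMessage(w))
print(msg)

# Exact alternatives: the full name, or [[ ]], which matches exactly by default
full <- s$coefficients
exact <- s[["coef"]]          # NULL, because no element is named "coef"

options(old)                  # restore the previous setting
```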
Just a quick suggestion to incorporate objects of class lagImpact in texreg. Spatial lag regression models return direct, indirect and total effects for each regression coefficient and their level of significance. It would be great if texreg could cover these models, organizing the output with information about direct, indirect and total effects.
I've left a question on SO about this: https://stackoverflow.com/questions/45971419/texreg-table-for-impacts-of-spatial-durbin-model
Here is an example that reproduces the problem
library(nlme)
library(texreg)
library(multilevel)
data <- univbct[,c(1,8,18:22)]
data <- data[complete.cases(data),]
data$BTN <- as.factor(data$BTN)
data$SUBNUM <- as.factor(data$SUBNUM)
mlm1 <- lme(JSAT ~ 1,
random = ~ 1 | BTN/SUBNUM,
data = data, method = "REML")
screenreg(list(mlm1))
The problem itself is that the extract.lme function does not properly recognize that "number of groups" can receive multiple values if there is more than one grouping level in the model.
Here is a log that demonstrates the issue. In the data, there are 16 level 3 groups and 486 level 2 groups.
> extract(mlm1)
Tracing createTexreg(coef.names = coefficient.names, coef = coefficients, .... on entry
Called from: eval(expr, p)
Browse[1]> n
debug: {
new("texreg", coef.names = coef.names, coef = coef, se = se,
pvalues = pvalues, ci.low = ci.low, ci.up = ci.up, gof.names = gof.names,
gof = gof, gof.decimal = gof.decimal, model.name = model.name)
}
Browse[2]> gof.names
[1] "AIC" "BIC" "Log Likelihood" "Num. obs."
[5] "Num. groups"
Browse[2]> gof
SUBNUM BTN
3352.390 3373.219 -1672.195 1350.000 486.000 16.000
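One way to keep the names and values aligned is to create one label per grouping level. A minimal sketch in base R using the values from the log above; the label format "Num. groups: <name>" is an assumption modelled on the lmer output shown elsewhere on this page, not the actual extract.lme code:

```r
# Group counts as they appear in the log, one entry per grouping level
ngroups <- c(SUBNUM = 486, BTN = 16)

gof <- c(3352.390, 3373.219, -1672.195, 1350, ngroups)
gof.names <- c("AIC", "BIC", "Log Likelihood", "Num. obs.",
               paste("Num. groups:", names(ngroups)))
gof.decimal <- c(TRUE, TRUE, TRUE, FALSE, rep(FALSE, length(ngroups)))

# All three vectors must have the same length
stopifnot(length(gof) == length(gof.names),
          length(gof) == length(gof.decimal))
print(data.frame(name = gof.names, value = unname(gof)))
```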
Further explanation by Shahram Amini via e-mail on 11 November 2016:
Thank you for replying to my email and for your help. Sometimes the user wants to stretch out the table to fit the width of the text. This task is accomplished by the tabularx environment. This environment takes all the regular "l", "r", and "c" arguments but it has an extra column type called "X". When "X" is used, the table gets stretched out according to a scale passed to the table. The environment looks like this:
\begin{tabularx}{\textwidth}{XXX}
......
\end{tabularx}
The same task can be accomplished by the tabular* environment but I did not see support for that one either.
Sometimes certain tests require considering significance levels beyond 5%. The stars parameter makes it easy to do this.
However, the symbol that texreg uses by default for the 10% level, cdot, is nearly invisible. It looks like dirt on the screen or a toner smudge on the page. While you may think that a result at the 10% level does not deserve more recognition, please consider changing the symbol anyway.
Suggestion: dagger is used by other packages and is more visible.
Allow coefficients to be re-ordered by a vector of regex:
Eg:
reorder.coef = c("^log", "^GDP", "^Population")
would put all coefficients that start with "log" on top, then "GDP" and then "Population" coefficients.
Motivation: I routinely have models with some fixed effects and interactions, so I have to count the dummy variables that were created before I can reorder the coefficients by position.
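A minimal sketch in base R of how a vector of regular expressions could be turned into a positional ordering. The function reorder_by_regex is hypothetical; it assumes the first matching pattern decides a coefficient's position and that unmatched coefficients go last:

```r
# Hypothetical helper: return an index vector that sorts names by the
# first pattern they match; names matching no pattern are placed last.
reorder_by_regex <- function(coef.names, patterns) {
  rank <- rep(length(patterns) + 1L, length(coef.names))
  # Loop from the last pattern to the first so that earlier patterns win
  for (i in rev(seq_along(patterns))) {
    rank[grepl(patterns[i], coef.names)] <- i
  }
  order(rank)
}

coef.names <- c("Population", "(Intercept)", "log(x)", "GDP", "GDP:year")
idx <- reorder_by_regex(coef.names, c("^log", "^GDP", "^Population"))
print(coef.names[idx])
```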
Comment by user snthsnht on R-Forge:
It would be great to have support for short captions in texreg. For example
\caption[short one for list of tables]{Long elaborate caption which explains everything in the table but is too long for the list of tables.}
I recently came across this package and it feels way more responsive than stargazer (which seems to have been inactive in development). However there is one feature that seems missing, which is the option to automatically display dependent variables associated with each model.
Here is a minimum example:
library("tidyverse")
library("stargazer")
library("texreg")
set.seed(1234)
N = 300
data <- tibble(var1 = rnorm(N), var2 = rnorm(N),
var3 = 0.3 * var1 + 0.4 * var2 + rnorm(N),
var4 = 0.8 * var1 + 0.2 * var2 + rnorm(N))
model <- list(formula(var3 ~ var1 + var2),
formula(var4 ~ var1 + var2)) %>%
map(function(formula, ...) lm(formula, ...),
data = data)
model %>% splice(., type = "text") %>% invoke(stargazer, .)
## ===========================================================
## Dependent variable:
## ----------------------------
## var3 var4
## (1) (2)
## -----------------------------------------------------------
## var1 0.325*** 0.842***
## (0.054) (0.056)
##
## var2 0.408*** 0.185***
## (0.053) (0.055)
##
## Constant -0.023 -0.055
## (0.055) (0.057)
##
## -----------------------------------------------------------
## Observations 300 300
## R2 0.247 0.444
## Adjusted R2 0.242 0.440
## Residual Std. Error (df = 297) 0.944 0.979
## F Statistic (df = 2; 297) 48.650*** 118.400***
## ===========================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
model %>% invoke(screenreg, l = .)
## ===================================
## Model 1 Model 2
## -----------------------------------
## (Intercept) -0.02 -0.05
## (0.05) (0.06)
## var1 0.32 *** 0.84 ***
## (0.05) (0.06)
## var2 0.41 *** 0.19 ***
## (0.05) (0.05)
## -----------------------------------
## R^2 0.25 0.44
## Adj. R^2 0.24 0.44
## Num. obs. 300 300
## RMSE 0.94 0.98
## ===================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
I am aware that we can override model names, however it is still handy to automatically display dependent variables, when there are dozens to hundreds of candidate models to evaluate and we want to compare their performance by grouping them into several panels.
Thank you,
Yi
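Until this is automatic, the dependent variable names can be pulled from the fitted models and supplied as model names. A minimal sketch in base R, assuming lm fits on the built-in mtcars data; the resulting vector is what would be passed to custom.model.names:

```r
models <- list(lm(mpg ~ disp, data = mtcars),
               lm(hp ~ disp, data = mtcars))

# The left-hand side of each model formula, as a string
dep.names <- vapply(models,
                    function(m) deparse(formula(m)[[2]]),
                    character(1))
print(dep.names)
```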
Comment by user ignacio82 (Ignacio Martinez) on R-Forge:
Thanks for the great work!
I’m reproducing this stata example:
regress api00 acs_k3 acs_46 full enroll, cluster(dnum)
Regression with robust standard errors Number of obs = 395
F( 4, 36) = 31.18
Prob > F = 0.0000
R-squared = 0.3849
Number of clusters (dnum) = 37 Root MSE = 112.20
------------------------------------------------------------------------------
| Robust
api00 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
acs_k3 | 6.954381 6.901117 1.008 0.320 -7.041734 20.9505
acs_46 | 5.966015 2.531075 2.357 0.024 .8327565 11.09927
full | 4.668221 .7034641 6.636 0.000 3.24153 6.094913
enroll | -.1059909 .0429478 -2.468 0.018 -.1930931 -.0188888
_cons | -5.200407 121.7856 -0.043 0.966 -252.193 241.7922
------------------------------------------------------------------------------
In R:
library(readstata13)
library(texreg)
library(sandwich)
library(lmtest)
clustered.se <- function(model_result, data, cluster, level = 0.95) {
  model_variables <-
    intersect(colnames(data), c(colnames(model_result$model), cluster))
  model_rows <- rownames(model_result$model)
  data <- data[model_rows, model_variables]
  cl <- data[[cluster]]
  M <- length(unique(cl))
  N <- length(model_result$residuals)
  K <- model_result$rank
  dfc <- (M / (M - 1)) * ((N - 1) / (N - K))
  uj <-
    apply(estfun(model_result), 2, function(x)
      tapply(x, cl, sum))
  vcovCL <- dfc * sandwich(model_result, meat = crossprod(uj) / N)
  standard.errors <- coeftest(model_result, vcov. = vcovCL)[, 2]
  p.values <- coeftest(model_result, vcov. = vcovCL)[, 4]
  a <- 1 - (1 - level) / 2
  df <- M - 1
  coeff <- model_result$coefficients
  lb <- coeff - qt(p = a, df = df) * standard.errors
  ub <- coeff + qt(p = a, df = df) * standard.errors
  clustered.se <-
    list(vcovCL = vcovCL,
         standard.errors = standard.errors,
         p.values = p.values,
         lb = lb, ub = ub)
  return(clustered.se)
}
Run the code:
elemapi2 <- read.dta13(file = 'elemapi2.dta')
lm1 <-
  lm(formula = api00 ~ acs_k3 + acs_46 + full + enroll,
     data = elemapi2)
clustered_se <-
  clustered.se(model_result = lm1,
               data = elemapi2,
               cluster = "dnum")
screenreg works as expected:
screenreg(
  lm1,
  override.se = clustered_se$standard.errors,
  override.p = clustered_se$p.values,
  override.ci.low = clustered_se$lb,
  override.ci.up = clustered_se$ub,
  digits = 7
)
##
## ========================================
## Model 1
## ----------------------------------------
## (Intercept) -5.2004067
## [-252.1930390; 241.7922255]
## acs_k3 6.9543811
## [ -7.0417337; 20.9504960]
## acs_46 5.9660147 *
## [ 0.8327565; 11.0992729]
## full 4.6682211 *
## [ 3.2415297; 6.0949125]
## enroll -0.1059909 *
## [ -0.1930931; -0.0188888]
## ----------------------------------------
## R^2 0.3848830
## Adj. R^2 0.3785741
## Num. obs. 395
## RMSE 112.1983218
## ========================================
## * 0 outside the confidence interval
htmlreg outputs the SE instead of the CI:
htmlreg(
  lm1,
  override.se = clustered_se$standard.errors,
  override.p = clustered_se$p.values,
  override.ci.low = clustered_se$lb,
  override.ci.up = clustered_se$ub,
  star.symbol = "\\*",
  digits = 7
)
ci.force = TRUE produces the wrong CI:
htmlreg(
  lm1,
  override.se = clustered_se$standard.errors,
  override.p = clustered_se$p.values,
  ci.force = TRUE,
  override.ci.low = clustered_se$lb,
  override.ci.up = clustered_se$ub,
  star.symbol = "\\*",
  digits = 7
)
For example, the CI for acs_k3 is [-6.5715605; 20.4803228], but it should be [-7.0417337; 20.9504960].
Thanks for your great work!
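The gap between the two intervals is consistent with the bounds under ci.force being built from a normal quantile rather than from the t quantile with M - 1 = 36 degrees of freedom used in clustered.se. This is an inference from the numbers reported above, not something checked against the texreg source. A minimal check in base R, using the acs_k3 estimate and robust standard error from the Stata output:

```r
# Values for acs_k3 copied from the Stata output above.
est <- 6.954381
se  <- 6.901117
level <- 0.95
a <- 1 - (1 - level) / 2

# Interval from a t quantile with M - 1 = 36 degrees of freedom (37 clusters),
# as in clustered.se().
ci_t <- est + c(-1, 1) * qt(a, df = 36) * se

# Interval from a standard normal quantile (assumed here to be what
# ci.force does; not verified against the texreg code).
ci_z <- est + c(-1, 1) * qnorm(a) * se

print(round(ci_t, 4))
print(round(ci_z, 4))
```

Up to rounding of the inputs, ci_t matches the interval shown by screenreg and ci_z matches the interval reported for htmlreg with ci.force = TRUE.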
dcolumn is ripe for replacement by siunitx. Any plans to at least add it as an option? In my fork I've simply replaced
Lines 401 to 402 in 4129fee
with
coldef <- paste0(coldef, "S[table-format=", dl, separator, dr, "]", margin.arg, " ")
and it works in my use cases.
I also needed to replace star.prefix = "^{", with star.prefix = "\\sym{", and add
# \sym definition
string <- paste0(string, "\\def\\sym#1{\\ifmmode^{#1}\\else\\(^{#1}\\)\\fi}")
right after
Line 408 in 4129fee
Comment by user dan on R-Forge:
Dear Philip,
I often have models with hundreds of coefficients (many of which get thrown away). I was about to write my own functions for renaming, reordering, and omitting variables before passing them to texreg when I discovered this blog post that already contains code to do it:
http://conjugateprior.org/2013/10/call-them-what-you-will/
It has a really nice interface: just define a "name map" that gives the order and new names of any variables you want and omits anything you want to omit, and it automatically builds the parameters to pass to custom.coef.names, reorder.coef, and omit.coef. (And it's robust to including variables that aren't in the model, so it's reusable throughout a project with different models.) This is unbelievably useful and would make a great addition to texreg itself.
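A rough sketch of the name-map idea in base R follows. It is not the code from the blog post and not a texreg function: apply_name_map and its return values are hypothetical names used for illustration only. The map is a named character vector whose names are the original coefficient names, whose values are the display names, and whose order is the desired display order. Coefficients missing from the map are treated as omitted, and map entries that are not in the model are ignored.

```r
# Hypothetical helper (not part of texreg): resolve a "name map" against
# the coefficient names of one model.
apply_name_map <- function(coef_names, name_map) {
  keep <- coef_names %in% names(name_map)   # coefficients mentioned in the map
  kept <- coef_names[keep]
  # The position of each kept coefficient in the map gives its display order.
  ord <- order(match(kept, names(name_map)))
  list(
    omitted   = coef_names[!keep],              # rows to drop
    old_names = kept[ord],                      # kept rows, in display order
    new_names = unname(name_map[kept[ord]])     # labels for the kept rows
  )
}

# Illustrative map; "not_in_model" does not occur in the model and is ignored.
name_map <- c("x2" = "Income", "x1" = "Age", "not_in_model" = "Unused")
res <- apply_name_map(c("(Intercept)", "x1", "x2", "z"), name_map)
print(res)
```

How the three pieces translate into custom.coef.names, omit.coef, and reorder.coef depends on the order in which texreg applies those arguments, so that step is deliberately left out of the sketch.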
I didn't see texreg on GitHub or I'd send a pull request. I'm honestly not sure how to submit a patch through R-Forge, but the code is already there in the link above, so it shouldn't be hard to add.
This also satisfies a feature request from a couple years ago in this thread:
https://r-forge.r-project.org/forum/forum.php?thread_id=28268&forum_id=4325&group_id=1420
Thanks for your work on texreg!
Best,
Dan
Hi
I would like to show the point estimates and CIs on the odds ratio scale, but cannot find an option for this. I am currently using my own customized extract function for R2mlwin (Mlwin), where the original models are multilevel logistic models. The extract function currently collects and transforms the log-odds coefficients and SEs in the following manner:
> # model summary (model is a R2mlwin object)
> s <- summary(model)
>
> # Convert coefficients to odds ratios (just remove this part if you want to stay on the log-odds scale)
> coef <- exp(s@FP)
> se <- coef*se # Conversion of the log-odds scale standard error to the odds scale standard error.
> # It is an approximate method: exponentiate the log-odds and multiply the result by the log-odds standard error
> # se.odds.scale = exp(log-odds)*se.log.odds.scale
> # http://stats.stackexchange.com/questions/158481/how-to-convert-the-standard-error-of-the-log-odds-ratio-to-the-odds-ratio-standa?rq=1
However, when I use the ci.force option, it seems to assume that the SE of the odds ratio is additive, and it therefore produces incorrect upper and lower bounds:
htmlreg(tr, file = "mytable1.Std.doc",
ci.force = TRUE,
ci.force.level = 0.95, bold = 0.05, ci.test = 1,
inline.css = FALSE, doctype = TRUE, html.tag = TRUE, head.tag = TRUE, body.tag = TRUE)
Any ideas for a workaround to get the correct CI?
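One common way to get an interval on the odds ratio scale is to build it on the log-odds scale first and then exponentiate both bounds, instead of adding and subtracting a transformed standard error around the odds ratio. A minimal sketch in base R follows; the coefficients and standard errors are made-up numbers and the normal quantile is an assumption, so this illustrates the arithmetic and is not an extract method for R2mlwin objects.

```r
# Illustrative log-odds estimates and standard errors (made-up numbers).
b  <- c(x1 = 0.50, x2 = -0.20)
se <- c(x1 = 0.20, x2 = 0.10)

level <- 0.95
z <- qnorm(1 - (1 - level) / 2)

# Build the interval on the log-odds scale, then exponentiate each bound.
or    <- exp(b)
lower <- exp(b - z * se)
upper <- exp(b + z * se)

print(round(cbind(OR = or, lower = lower, upper = upper), 3))
```

The resulting bounds are asymmetric around the odds ratio and always positive. They could then be supplied through override.ci.low and override.ci.up, with ci.test = 1 as in the call above, rather than derived by ci.force from a standard error.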
Hi, I would like to add multiple rows to my texreg output indicating things such as fixed effects, Year-Location, and potentially many such rows. Here is an example of output with such rows: example.pdf.
Is there an option for this?
Comment by user sdaza (Sebastian Daza) on R-Forge:
I get a weird error with texreg but not with screenreg:
This works perfectly:
screenreg(models,
          custom.model.names = mnames,
          custom.coef.names = ror$ccn,
          groups = groups,
          omit.coef = ror$oc,
          reorder.coef = ror$rc
)
But when I use texreg, I get:
texreg(models,
       custom.model.names = mnames,
       custom.coef.names = ror$ccn,
       omit.coef = ror$oc,
       reorder.coef = ror$rc,
       groups = groups)
Error in reorder(m, reorder.coef) :
Error when reordering matrix: there are 20 rows, but you provided 9 numbers.
So the problem occurs when omitting variables. This is what I am trying to omit:
"(Intercept)|rfemale|z_rage|rrace_2|rrace_3|rmstatus_1|rmstatus_3|rmstatus_4|rworking_2|rworking_3|z_radla|z_nkids|kfemale|z_kage|stepchild|parent|kworking|chwealth"
Any ideas about what could be the problem?
Thanks, Sebastian
When outputting multinom models via texreg(), I get the same model fit statistics for all levels of the model. If you show one level at a time, this doesn't occur, but if you want to show multiple levels of the dependent variable, it is problematic.
Make it easier to add rows to the GOF block. Currently, users have to save the results in an intermediate texreg object, manipulate the gof, gof.names, and gof.decimal slots, and then hand over the updated object to texreg. This could be done more easily using dedicated arguments, such as add.decimal.gof and add.integer.gof, which would expect a named list of values.
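The current workaround can be sketched as a small helper that appends one row to the three slots named above. add_gof_row is a hypothetical name, not a texreg function, and toy_texreg is a stand-in S4 class defined only so that the example runs without texreg installed. With a real model, the object returned by texreg::extract(model) would take its place.

```r
library(methods)

# Hypothetical helper (not a texreg function): append one row to the GOF block
# of an object that has gof, gof.names, and gof.decimal slots.
add_gof_row <- function(tr, name, value, decimal = TRUE) {
  tr@gof.names   <- c(tr@gof.names, name)
  tr@gof         <- c(tr@gof, value)
  tr@gof.decimal <- c(tr@gof.decimal, decimal)
  tr
}

# Stand-in class with the same three slots (assumption for illustration).
setClass("toy_texreg", representation(gof.names = "character",
                                      gof = "numeric",
                                      gof.decimal = "logical"))
tr <- new("toy_texreg",
          gof.names = c("R^2", "Num. obs."),
          gof = c(0.38, 395),
          gof.decimal = c(TRUE, FALSE))

# Add an integer-valued row, e.g. the number of clusters.
tr <- add_gof_row(tr, "Num. clusters", 37, decimal = FALSE)
print(data.frame(name = tr@gof.names, value = tr@gof, decimal = tr@gof.decimal))
```

The proposed add.decimal.gof and add.integer.gof arguments would amount to calling such a helper with decimal = TRUE or decimal = FALSE for every entry of the named list.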
Reported by David Hugh-Jones by e-mail on 24 June 2016:
Hi Philip
For those using knitr and rmarkdown, something like this could be quite helpful. Tell me if you are interested.
flexreg <- function(...) {
  # knitr's output format; NULL when not knitting (e.g., at the console)
  of <- knitr::opts_knit$get('out.format')
  if (is.null(of)) {
    screenreg(...)
  } else {
    switch(of,
           'latex' = texreg(...),
           'sweave' = texreg(...),
           'markdown' = texreg(...),
           'html' = htmlreg(...),
           screenreg(...)  # fallback for any other format
    )
  }
}
David
Update 28 June 2016:
Not many users will know the subtle opts_knit option that the code checks. I think it is quite common for R users to use rmarkdown via RStudio, running chunks at the command line (where screenreg is needed) and then knitting the whole document. For ordinary tables, knitr::kable does the job, but for regressions... Just my thoughts.
D
This would likely be a lot of work, but I'm wondering if it's possible to use a package like xlsx or reporteR to print a texreg table directly to a file. I can experiment with it. If anyone has an easy idea for how to implement this, that would be awesome.
Using example #2 from https://www.rdocumentation.org/packages/VGAM/versions/1.0-4/topics/vglm
library(VGAM)
pneumo <- transform(pneumo, let = log(exposure.time))
fit <- vglm(cbind(normal, mild, severe) ~ let, multinomial, data = pneumo)
Running texreg::extract(fit) produces the following error:
Error: $ operator is invalid for atomic vectors
sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C
[5] LC_TIME=Swedish_Sweden.1252
attached base packages:
[1] splines stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] VGAM_1.0-4
loaded via a namespace (and not attached):
[1] compiler_3.4.2 tools_3.4.2 yaml_2.1.14 texreg_1.36.23
At the moment, the option reorder.gof seems broken. In ?texreg, it is described as taking a numeric vector as input. However, this produces an error for me.
> library(texreg)
> ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
> trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
> group <- gl(2,10,20, labels = c("Ctl","Trt"))
> weight <- c(ctl, trt)
> lm.D9 <- lm(weight ~ group)
> screenreg(lm.D9)
======================
Model 1
----------------------
(Intercept) 5.03 ***
(0.22)
groupTrt -0.37
(0.31)
----------------------
R^2 0.07
Adj. R^2 0.02
Num. obs. 20
RMSE 0.70
======================
*** p < 0.001, ** p < 0.01, * p < 0.05
> screenreg(lm.D9, reorder.gof = c(4,2,3,1))
Error in matrix(nrow = nrow(gofs), ncol = ncol(gofs) + 1) :
non-numeric matrix extent