estimatr: Fast Estimators for Design-Based Inference

estimatr is an R package that provides a range of commonly used linear estimators, designed for speed and ease of use. Users can easily recover robust, cluster-robust, and other design-appropriate estimates. We include two functions that implement means estimators, difference_in_means() and horvitz_thompson(), and three linear regression estimators, lm_robust(), lm_lin(), and iv_robust(). In each case, users can choose an estimator that reflects a cluster-randomized, block-randomized, or block-and-cluster-randomized design. The Getting Started Guide describes each estimator provided by estimatr and how it can be used in your analysis.
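
As a rough orientation, the sketch below calls each of the five estimators once on simulated data. The data frame and its columns (x, z, p, d, y) are made up for illustration and do not come from the package documentation; the example assumes simple (Bernoulli) random assignment with probability 0.5.

```r
library(estimatr)

set.seed(1)
n <- 200

# Hypothetical experiment: x is a pre-treatment covariate, z is the assigned
# treatment, d is the treatment actually received (one-sided noncompliance),
# and p is the known probability of assignment to treatment.
dat <- data.frame(x = rnorm(n), z = rbinom(n, 1, 0.5), p = 0.5)
dat$d <- ifelse(runif(n) < 0.8, dat$z, 0)
dat$y <- 0.5 * dat$d + 0.3 * dat$x + rnorm(n)

# Means estimators
dim_fit <- difference_in_means(y ~ z, data = dat)
ht_fit  <- horvitz_thompson(y ~ z, data = dat, condition_prs = p)

# Linear regression estimators
ols_fit <- lm_robust(y ~ z + x, data = dat)            # OLS with robust SEs
lin_fit <- lm_lin(y ~ z, covariates = ~ x, data = dat) # centered, interacted covariates
iv_fit  <- iv_robust(y ~ d | z, data = dat)            # 2SLS, z instruments for d

# Each result can be turned into a data frame of estimates
tidy(iv_fit)
```

For blocked or clustered designs, the same calls accept blocks and clusters arguments where the estimator supports them.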

You can also get regression tables out of estimatr in several ways, using commonly used R packages such as texreg and stargazer. Fast estimators also enable fast simulation of research designs to learn about their properties (see DeclareDesign).
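
A minimal sketch of the texreg route, assuming texreg is installed. The model on the built-in mtcars data is only a placeholder. The include.ci = FALSE option asks for standard errors rather than confidence intervals in the table.

```r
library(estimatr)

# Placeholder model on a built-in dataset
fit <- lm_robust(mpg ~ wt + hp, data = mtcars)

# Print a plain-text regression table if texreg is available
if (requireNamespace("texreg", quietly = TRUE)) {
  texreg::screenreg(fit, include.ci = FALSE)
}
```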

Installing estimatr

To install the latest stable release of estimatr, please ensure that you are running version 3.4 or later of R and run the following code:

install.packages("estimatr")

If you would like to use the latest development release of estimatr, please ensure that you are running version 3.4 or later of R and run the following code:

install.packages("estimatr", dependencies = TRUE,
                 repos = c("http://r.declaredesign.org", "https://cloud.r-project.org"))
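
If you are unsure whether your R installation meets the version requirement, a quick check along these lines can be run before installing. The variable names are illustrative.

```r
# Compare the running R version against the minimum stated above
required_version <- "3.4.0"
r_ok <- getRversion() >= required_version

if (!r_ok) {
  warning("estimatr needs R ", required_version, " or later; found ",
          as.character(getRversion()))
}
```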

Easy to use

Once the package is installed, getting appropriate estimates and standard errors is both fast and easy.

library(estimatr)

# sample data from cluster-randomized experiment
library(fabricatr)
library(randomizr)
dat <- fabricate(
  N = 100,
  y = rnorm(N),
  clusterID = sample(letters[1:10], size = N, replace = TRUE),
  z = cluster_ra(clusterID)
)

# robust standard errors
res_rob <- lm_robust(y ~ z, data = dat)
# tidy dataframes on command!
tidy(res_rob)
#>          term estimate std.error statistic p.value conf.low conf.high df
#> 1 (Intercept)     0.27      0.16       1.7   0.089   -0.041     0.580 98
#> 2           z    -0.42      0.21      -2.0   0.044   -0.833    -0.012 98
#>   outcome
#> 1       y
#> 2       y

# cluster robust standard errors
res_cl <- lm_robust(y ~ z, data = dat, clusters = clusterID)
# standard summary view also available
summary(res_cl)
#> 
#> Call:
#> lm_robust(formula = y ~ z, data = dat, clusters = clusterID)
#> 
#> Standard error type:  CR2 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper   DF
#> (Intercept)    0.269      0.164    1.64     0.20   -0.255    0.793 2.99
#> z             -0.422      0.250   -1.69     0.14   -1.027    0.182 6.30
#> 
#> Multiple R-squared:  0.041 , Adjusted R-squared:  0.0312 
#> F-statistic: 2.86 on 1 and 9 DF,  p-value: 0.125

# matched-pair design learned from blocks argument
data(sleep)
res_dim <- difference_in_means(extra ~ group, data = sleep, blocks = ID)

The Getting Started Guide has more examples and uses, as do the reference pages. The Mathematical Notes provide more information about what each estimator is doing under the hood.
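
To give a feel for what happens under the hood, the sketch below recomputes HC2 standard errors by hand from the textbook sandwich formula, (X'X)^-1 X' diag(e_i^2 / (1 - h_ii)) X (X'X)^-1, and compares them with lm_robust(). The simulated data and variable names are assumptions made for this example, and the formula is the standard HC2 definition rather than a quotation from the Mathematical Notes.

```r
library(estimatr)

set.seed(42)
n <- 200
sim <- data.frame(x = rnorm(n))
sim$y <- 1 + 0.5 * sim$x + rnorm(n) * (1 + abs(sim$x))  # heteroskedastic errors

fit <- lm_robust(y ~ x, data = sim, se_type = "HC2")

# Manual HC2: bread %*% meat %*% bread
X       <- model.matrix(~ x, data = sim)
XtX_inv <- solve(crossprod(X))                  # (X'X)^-1
beta    <- XtX_inv %*% crossprod(X, sim$y)      # OLS coefficients
e       <- as.vector(sim$y - X %*% beta)        # residuals
h       <- rowSums((X %*% XtX_inv) * X)         # leverages h_ii
meat    <- crossprod(X * (e / sqrt(1 - h)))     # X' diag(e^2 / (1 - h)) X
se_manual <- sqrt(diag(XtX_inv %*% meat %*% XtX_inv))

# The two sets of standard errors should agree up to numerical tolerance
cbind(manual = se_manual, estimatr = fit$std.error)
```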

Fast to use

Getting estimates and robust standard errors is also faster than it used to be. Compare our package to using lm() and the sandwich package to get HC2 standard errors. More speed comparisons are available here. Furthermore, with many blocks (or fixed effects), users can pass the fixed_effects argument to lm_robust() with HC1 standard errors to greatly improve estimation speed. More on fixed effects here.

dat <- data.frame(X = matrix(rnorm(2000*50), 2000), y = rnorm(2000))

library(microbenchmark)
library(lmtest)
library(sandwich)
mb <- microbenchmark(
  `estimatr` = lm_robust(y ~ ., data = dat),
  `lm + sandwich` = {
    lo <- lm(y ~ ., data = dat)
    coeftest(lo, vcov = vcovHC(lo, type = 'HC2'))
  }
)
Estimator        Median run-time (ms)
estimatr         22
lm + sandwich    43
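
The fixed_effects argument mentioned above can be sketched as follows. The grouped data are simulated for illustration, and the comparison model with explicit factor(group) dummies is included only to show that both approaches target the same coefficient on z.

```r
library(estimatr)

set.seed(2)
n_groups <- 50
dat_fe <- data.frame(group = rep(seq_len(n_groups), each = 20))
dat_fe$z <- rbinom(nrow(dat_fe), 1, 0.5)
dat_fe$y <- 0.3 * dat_fe$z + rnorm(n_groups)[dat_fe$group] + rnorm(nrow(dat_fe))

# Absorb the group effects instead of estimating one dummy per group
fit_fe <- lm_robust(y ~ z, data = dat_fe,
                    fixed_effects = ~ group, se_type = "HC1")

# Equivalent specification with explicit dummies, usually slower with many groups
fit_dummies <- lm_robust(y ~ z + factor(group), data = dat_fe, se_type = "HC1")

# The coefficient on z should match across the two fits
c(absorbed = unname(fit_fe$coefficients["z"]),
  dummies  = unname(fit_dummies$coefficients["z"]))
```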

This project is generously supported by a grant from the Laura and John Arnold Foundation and seed funding from Evidence in Governance and Politics (EGAP).

Contributors

lukesonnet, graemeblair, nfultz, acoppock, aaronrudkin, lilymedina, nick-rivera, rvlenth, jaspercooper, katrinleinweber, kuriwaki, vincentarelbundock
