
mice's Introduction

The mice package implements a method to deal with missing data. The package creates multiple imputations (replacement values) for multivariate missing data. The method is based on Fully Conditional Specification, where each incomplete variable is imputed by a separate model. The MICE algorithm can impute mixes of continuous, binary, unordered categorical and ordered categorical data. In addition, MICE can impute continuous two-level data, and maintain consistency between imputations by means of passive imputation. Many diagnostic plots are implemented to inspect the quality of the imputations.

Installation

The mice package can be installed from CRAN as follows:

install.packages("mice")

The latest version can be installed from GitHub as follows:

install.packages("devtools")
devtools::install_github(repo = "stefvanbuuren/mice")

Overview

The mice package contains functions to

  • Inspect the missing data pattern
  • Impute the missing data m times, resulting in m completed data sets
  • Diagnose the quality of the imputed values
  • Analyze each completed data set
  • Pool the results of the repeated analyses
  • Store and export the imputed data in various formats
  • Generate simulated incomplete data
  • Incorporate custom imputation methods
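
As a minimal sketch of the first and seventh items, the snippet below tabulates the missing data pattern of the `nhanes` example data that ships with mice, and then generates simulated incomplete data with `ampute()`. The simulated data frame, its column names, the seed, and the 30% missingness proportion are illustrative assumptions. The `plot = FALSE` argument assumes mice 3.0 or later.

```r
library(mice)

# Tabulate the missing data pattern of the bundled nhanes data.
# Each row is one pattern (1 = observed, 0 = missing).
pattern <- md.pattern(nhanes, plot = FALSE)
print(pattern)

# Simulate a complete numeric data set (hypothetical columns x, y, z)
set.seed(2024)
complete_data <- data.frame(x = rnorm(200), y = rnorm(200), z = rnorm(200))

# Make roughly 30% of the rows incomplete under a MAR mechanism
amp <- ampute(complete_data, prop = 0.3, mech = "MAR")
incomplete_data <- amp$amp

# Inspect the pattern of the generated missing data
md.pattern(incomplete_data, plot = FALSE)
```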

New feature in version 2.41: User may choose which cells to impute.
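
The choice of cells is passed through the `where` argument of `mice()`, a logical matrix with the same dimensions as the data. The sketch below is one possible use, assuming the bundled `nhanes` data: it imputes every missing cell and, in addition, overimputes all observed values of `bmi`. The choice of variable, `m = 2`, and the seed are arbitrary.

```r
library(mice)

# Default behaviour: impute exactly the missing cells
where <- is.na(nhanes)

# Illustrative choice: also create imputations for the observed bmi values
where[, "bmi"] <- TRUE

imp <- mice(nhanes, where = where, m = 2, seed = 1, printFlag = FALSE)

# One row of imputations per selected cell in bmi, one column per imputation
dim(imp$imp$bmi)
```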

Main functions

The main functions in the mice package are:

Function name   Description
mice()          Impute the missing data m times
with()          Analyze completed data sets
pool()          Combine parameter estimates
complete()      Export imputed data
ampute()        Generate missing data
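
A typical session chains the first four functions. The following is a minimal sketch on the bundled `nhanes` data. The regression model, the number of imputations, and the seed are illustrative choices, not recommendations.

```r
library(mice)

# Impute the missing data m = 5 times
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# Fit the same linear model to each of the 5 completed data sets
fit <- with(imp, lm(bmi ~ age + chl))

# Combine the 5 sets of estimates into one pooled result
pooled <- pool(fit)
summary(pooled)

# Export the first completed data set as an ordinary data frame
completed1 <- mice::complete(imp, 1)
head(completed1)
```

The call is written as `mice::complete()` because other packages, for example tidyr, also export a function named `complete()`.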

One-day course in mice

If you are new to mice, consider studying the materials of our one-day course in Winnipeg. The materials are at https://stefvanbuuren.github.io/Winnipeg.

miceVignettes

miceVignettes is a detailed series of vignettes that walk you through solving realistic inference problems with mice. These vignettes overlap with the one-day course.

We suggest going through these vignettes in the following order:

  1. Ad hoc methods and the MICE algorithm
  2. Convergence and pooling
  3. Inspecting how the observed data and missingness are related
  4. Passive imputation and post-processing
  5. Imputing multilevel data
  6. Sensitivity analysis with mice

Related packages

Packages that extend the functionality of mice include:

  1. ImputeRobust: Multiple Imputation with GAMLSS
  2. countimp: Incomplete count data
  3. miceadds: Functions for multilevel imputation
  4. micemd: Functions for multilevel imputation
  5. CALIBERrfimpute: Another random forest method
  6. smcfcs: Addressing incompatibility in selected models
  7. parlMICE: Parallel MICE imputation wrapper
  8. fancyimpute: MICE in Python for ordinal data

Further reading

The mice software was published in the Journal of Statistical Software (Buuren and Groothuis-Oudshoorn 2011). The first application of the method concerned missing blood pressure data (Buuren, Boshuizen, and Knook 1999). The term Fully Conditional Specification was introduced in 2006 to describe a general class of methods that specify the imputation model for multivariate data as a set of conditional distributions (Buuren et al. 2006). Details about imputing mixes of numerical and categorical data can be found in Buuren (2007). Wulff and Ejlskov provide a comprehensive overview of MICE. Many more details and applications can be found in the book Flexible Imputation of Missing Data (Buuren 2012), available from Chapman & Hall/CRC or Amazon.

References

Buuren, S. van. 2007. “Multiple Imputation of Discrete and Continuous Data by Fully Conditional Specification.” Statistical Methods in Medical Research 16 (3): 219–42.

———. 2012. Flexible Imputation of Missing Data. Boca Raton, FL: Chapman & Hall/CRC Press.

Buuren, S. van, and K. Groothuis-Oudshoorn. 2011. “MICE: Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software 45 (3): 1–67.

Buuren, S. van, H. C. Boshuizen, and D. L. Knook. 1999. “Multiple Imputation of Missing Blood Pressure Covariates in Survival Analysis.” Statistics in Medicine 18 (6): 681–94.

Buuren, S. van, J. P. L. Brand, C. G. M. Groothuis-Oudshoorn, and D. B. Rubin. 2006. “Fully Conditional Specification in Multivariate Imputation.” Journal of Statistical Computation and Simulation 76 (12): 1049–64.
