Rfer

Team

Group Member	Github
Gabriel Bogo	@GabrielBogo
Yuwei Liu	@liuyuwei169
Weifeng (Davy) Guo	@DavyGuo
Mohamad Makkaoui	@makka3

A reimplementation of the infer R package, which offers a tidy approach to statistical inference built on top of the Tidyverse.

The infer package streamlines resampling (permutation and bootstrapping), calculating summary statistics and confidence intervals, and performing hypothesis tests. It does this through a small set of composable functions that emphasize clear, expressive code and follow the grammar of statistical inference, so the code itself explains how each value is calculated and how each test is evaluated.

With infer as the inspiration, the first iteration of rfer will provide four main functions: specify, generate, calculate, and get_ci. Given a data frame and a chosen response variable, these functions calculate summary statistics and confidence intervals for that variable. Each function is described in detail below.

Where does rfer fit into the R ecosystem?

In the R ecosystem, infer already implements the functionality described here, and does it well. For the initial iterations we will nevertheless develop similar functions ourselves, with the expectation of later adding functions that extend what infer offers.

Functions

`specify`

Function Description: Choose the specific columns that feed the rest of the pipeline.

Inputs:

data: Dataframe.
response: String. The name of the column in data to use as the response variable.

Output:

Dataframe containing one column for response variable and zero or more columns for the explanatory variables. The first column is always the response.
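As a rough illustration of the behaviour described above, here is a minimal base R sketch. It is not the package's implementation. The function name `specify_sketch` and the `explanatory` argument are assumptions made for this example: the inputs listed above name only data and response, while the output mentions optional explanatory columns.

```r
# Hypothetical sketch, not the rfer implementation.
# `explanatory` is an assumed argument; the README lists only `data` and `response`.
specify_sketch <- function(data, response, explanatory = character(0)) {
  stopifnot(is.data.frame(data), is.character(response), length(response) == 1)

  # Fail early if any requested column is absent
  missing_cols <- setdiff(c(response, explanatory), names(data))
  if (length(missing_cols) > 0) {
    stop("Columns not found in data: ", paste(missing_cols, collapse = ", "))
  }

  # Response first, then explanatory columns; drop = FALSE keeps a data frame
  data[, c(response, explanatory), drop = FALSE]
}

# Usage with the built-in mtcars data set
spec <- specify_sketch(mtcars, response = "mpg", explanatory = "cyl")
names(spec)
```

Placing the response in the first column means later steps can locate it by position rather than by name.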

`generate`

Function Description: Generate bootstrap or permutation resamples.

Inputs:

data: Dataframe generated by the specify function.
n_samples: Integer. Number of resamples.
type: "Bootstrap" (default), or "Permutation".

Output:

Dataframe containing all resamples stacked vertically. It keeps every column from the input data and adds a sample_id column that identifies each resample.
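The resampling step could look roughly like the base R sketch below. The name `generate_sketch` is hypothetical. The README does not say how a permutation is formed, so shuffling only the first (response) column while leaving the other columns fixed is an assumption made here.

```r
# Hypothetical sketch, not the rfer implementation.
generate_sketch <- function(data, n_samples, type = "Bootstrap") {
  stopifnot(is.data.frame(data), n_samples >= 1)
  type <- match.arg(type, c("Bootstrap", "Permutation"))
  n <- nrow(data)

  resamples <- lapply(seq_len(n_samples), function(i) {
    if (type == "Bootstrap") {
      # Draw n rows with replacement
      out <- data[sample.int(n, n, replace = TRUE), , drop = FALSE]
    } else {
      # Assumption: permute only the response (first column)
      out <- data
      out[[1]] <- out[[1]][sample.int(n)]
    }
    out$sample_id <- i
    out
  })

  # Stack all resamples vertically and reset the row names
  result <- do.call(rbind, resamples)
  rownames(result) <- NULL
  result
}

# Usage: three bootstrap resamples of the mpg column of mtcars
set.seed(123)
boot <- generate_sketch(mtcars["mpg"], n_samples = 3)
table(boot$sample_id)
```

Each resample has as many rows as the input, so the result has `n_samples * nrow(data)` rows.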

`calculate`

Function Description: Calculate a summary statistic for each resample.

Inputs:

data: Dataframe generated by the generate function.
stat: String. The summary statistic to compute: "mean" (default) or "median".

Output:

Dataframe of summarized data. Each row contains the summary statistic for one resample.
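A minimal base R sketch of this per-resample summary follows. The name `calculate_sketch` and the output column name `stat` are assumptions for illustration, as is the convention that the response sits in the first column of the input.

```r
# Hypothetical sketch, not the rfer implementation.
calculate_sketch <- function(data, stat = "mean") {
  stopifnot(is.data.frame(data), "sample_id" %in% names(data))
  stat <- match.arg(stat, c("mean", "median"))
  stat_fun <- switch(stat, mean = mean, median = median)

  # Assumption: the response variable is the first column
  response <- data[[1]]

  # Apply the statistic within each resample
  values <- tapply(response, data$sample_id, stat_fun)

  data.frame(
    sample_id = as.integer(names(values)),
    stat = as.numeric(values)
  )
}

# Usage with two tiny hand-made resamples
resamples <- data.frame(
  response = c(1, 2, 3, 10, 20, 60),
  sample_id = c(1, 1, 1, 2, 2, 2)
)
calculate_sketch(resamples, stat = "mean")    # stat column: 2 and 30
calculate_sketch(resamples, stat = "median")  # stat column: 2 and 20
```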

`get_ci`

Function Description: Return the bootstrap confidence interval for a point estimate.

Inputs:

data: Dataframe generated by the calculate function.
interval: Float. Significance level, given as a percentage (0-100).

Output:

Dataframe containing one row, with columns for the statistic (point estimate), the significance level, the lower bound, and the upper bound.
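One way such an interval could be computed is the percentile method, sketched below in base R. Several things here are assumptions rather than facts from this README: the percentile method itself, the name `get_ci_sketch`, the output column names, the default of 95, the reading of interval as the central coverage in percent (so 95 leaves 2.5% in each tail), and the use of the mean of the resampled statistics as a stand-in point estimate.

```r
# Hypothetical sketch, not the rfer implementation.
get_ci_sketch <- function(data, interval = 95) {
  stopifnot(is.data.frame(data), "stat" %in% names(data),
            interval > 0, interval < 100)

  # Assumption: `interval` is the central coverage in percent
  alpha <- 1 - interval / 100

  # Percentile method: cut alpha / 2 from each tail of the resampled statistics
  bounds <- quantile(data$stat, probs = c(alpha / 2, 1 - alpha / 2),
                     names = FALSE)

  data.frame(
    # Assumption: mean of the resampled statistics as the point estimate
    point_estimate = mean(data$stat),
    interval = interval,
    lower_bound = bounds[1],
    upper_bound = bounds[2]
  )
}

# Usage with an artificial set of 101 evenly spaced statistics
stats <- data.frame(sample_id = 1:101, stat = 0:100)
get_ci_sketch(stats, interval = 90)
```

With these evenly spaced values, the 90% interval runs from approximately 5 to approximately 95, which makes the tail trimming easy to check by eye.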
