functionsnashid

During data exploration, developers often write the same code repeatedly to summarize a dataset by different categories. This package aims to remove that boilerplate: its count_by_category function counts the number of observations per category in a given dataset and returns the top categories by count.
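As an illustration of the kind of boilerplate this replaces, here is a minimal sketch of how such a helper could be written with dplyr. It is not the package's actual source: the name count_by_category_sketch, the parameter names category, n and descending, and the toy data are all hypothetical. Only the output column name count and the argument order follow the usage shown in this README.

```r
library(dplyr)

# Hypothetical re-implementation for illustration only; the real
# count_by_category() in functionsnashid may be written differently.
count_by_category_sketch <- function(data, category, n = 5, descending = TRUE) {
  # One row per distinct value of `category`, with its frequency in `count`
  counts <- data %>%
    count({{ category }}, name = "count")

  # Order by frequency, most common first unless descending = FALSE
  if (descending) {
    counts <- arrange(counts, desc(count))
  } else {
    counts <- arrange(counts, count)
  }

  # Keep at most n rows
  head(counts, n)
}

# Toy data (made up for this sketch)
toy <- tibble(fruit = c("apple", "pear", "apple", "kiwi", "apple", "pear"))

count_by_category_sketch(toy, fruit, 2)         # apple (3), pear (2)
count_by_category_sketch(toy, fruit, 1, FALSE)  # kiwi (1)
```

The `{{ category }}` syntax lets the caller pass a bare column name, which is how the examples in this README call count_by_category.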

Installation

This package is not on CRAN yet. You can install the development version of functionsnashid from the GitHub repository with:

devtools::install_github("stat545ubc-2021/functionsnashid")

Basic Example

Please check ?count_by_category for a more detailed explanation of the function. Now we demonstrate the basic usage of the function. In the following example, we get the number of games per genre from the steam_games dataset.

  1. Results in descending order by default:
suppressMessages(library(tidyverse))
suppressMessages(library(datateachr))
library(functionsnashid)

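# Optional: split comma-separated genre strings into one row per genre.
# Note: `games` is not used by the calls below. They count the raw
# `steam_games` rows, so combined strings such as "Action,Indie" appear
# as their own categories in the output.
games <- steam_games %>%
  select(id, name, genre, publisher, developer, original_price, release_date, all_reviews) %>%
  separate_rows(genre, sep = ",", convert = TRUE)

# Top 5 genre strings by number of games
count_by_category(steam_games, genre, 5)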
#> # A tibble: 5 × 2
#>   genre                  count
#>   <chr>                  <int>
#> 1 Action                  2386
#> 2 Action,Indie            2129
#> 3 Casual,Indie            1732
#> 4 Action,Adventure,Indie  1585
#> 5 Adventure,Indie         1520
  2. Results in ascending order (fourth argument set to FALSE):
count_by_category(steam_games, genre, 5, FALSE)
#> # A tibble: 5 × 2
#>   genre                                                                    count
#>   <chr>                                                                    <int>
#> 1 Accounting,Animation & Modeling,Audio Production,Design & Illustration,…     1
#> 2 Accounting,Education,Software Training,Utilities,Early Access                1
#> 3 Action,Adventure,Casual,Early Access                                         1
#> 4 Action,Adventure,Casual,Free to Play                                         1
#> 5 Action,Adventure,Casual,Free to Play,Early Access                            1
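The genre column stores several genres in one comma-separated string, which is why combinations such as "Action,Indie" are counted as separate categories above. The sketch below uses a small made-up tibble (toy_games is hypothetical and not part of any package) and plain dplyr and tidyr calls to show how splitting the strings with separate_rows changes the counts.

```r
library(dplyr)
library(tidyr)

# Made-up data: three games, one of which has two genres in one string
toy_games <- tibble(
  name  = c("A", "B", "C"),
  genre = c("Action,Indie", "Action", "Indie")
)

# Without splitting: three distinct strings, each counted once
unsplit_counts <- toy_games %>%
  count(genre, name = "count", sort = TRUE)

# With splitting: one row per (game, genre) pair, so "Action" and "Indie"
# are each counted twice and the combined string disappears
split_counts <- toy_games %>%
  separate_rows(genre, sep = ",") %>%
  count(genre, name = "count", sort = TRUE)

unsplit_counts
split_counts
```

Whether to split first depends on the question: counting raw strings answers "which genre combinations are most common", while counting after the split answers "which individual genres are most common".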

More Examples with Different Datasets

Here we demonstrate how count_by_category can be used to explore different datasets:

Get the count of trees per genus in the vancouver_trees dataset.

We see that Acer, the genus of maple trees, is the most common in Vancouver.

count_by_category(vancouver_trees, genus_name, 5)
#> # A tibble: 5 × 2
#>   genus_name count
#>   <chr>      <int>
#> 1 ACER       36062
#> 2 PRUNUS     30683
#> 3 FRAXINUS    7381
#> 4 TILIA       6773
#> 5 QUERCUS     6119

Get the count of apartment buildings per property type in the apt_buildings dataset.

count_by_category(apt_buildings, property_type, 5)
#> # A tibble: 3 × 2
#>   property_type  count
#>   <chr>          <int>
#> 1 PRIVATE         2888
#> 2 TCHC             327
#> 3 SOCIAL HOUSING   240

Which heating types are most common in the apt_buildings dataset?

count_by_category(apt_buildings, heating_type, 5)
#> # A tibble: 3 × 2
#>   heating_type   count
#>   <chr>          <int>
#> 1 HOT WATER       2789
#> 2 FORCED AIR GAS   315
#> 3 ELECTRIC         265

functionsnashid's Issues

B2 Feedback

Great job: the function runs well, and I was able to download it and run the examples.

small issue:

  • in testthat function, all the tibbles you created had to be removed: rm()
