The cohorts from peerchristensen

Release cohorts 1.1.0

Prepare for release:

Submit to CRAN:

usethis::use_version('minor')
devtools::submit_cran()
Approve email

Wait for CRAN...

Years as a cohort interval

Thank you for creating the cohorts package! I was looking for something to help me analyze donor retention across years, and this package and its explanation were far and away the simplest thing I found.

However, I wanted to look at years rather than months, so I created a new function with very minor changes that uses lubridate instead of zoo:

cohort_table_year <- function(df, id_var, date) {
  # Work on a lazy data.table, as the monthly version does
  dt <- dtplyr::lazy_dt(df)

  dt %>%
    dplyr::rename(id = {{ id_var }}) %>%
    dplyr::group_by(id) %>%
    # Bucket each row by calendar year instead of by month
    dplyr::mutate(year = lubridate::year({{ date }})) %>%
    # A user's cohort is the first year in which they appear
    dplyr::mutate(cohort = min(year)) %>%
    dplyr::group_by(cohort, year) %>%
    dplyr::summarise(users = dplyr::n_distinct(id)) %>%
    tidyr::pivot_wider(names_from = year, values_from = users) %>%
    dplyr::ungroup() %>%
    # Replace the cohort years with sequential cohort numbers
    dplyr::mutate(cohort = 1:dplyr::n_distinct(cohort)) %>%
    tibble::as_tibble()
}

Feel free to add this (and/or rewrite with zoo instead).
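
To illustrate what a yearly cohort table looks like, here is a minimal sketch on made-up data. It uses plain dplyr and tidyr rather than dtplyr and lubridate, so it is not the function above, only the same idea under simpler assumptions. The `donations` data and its column names are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: donor 1 gives twice in 2019 and once in 2020
donations <- tibble::tibble(
  donor_id  = c(1, 1, 2, 2, 3, 3, 1),
  gift_date = as.Date(c("2019-03-01", "2020-05-10", "2019-07-15",
                        "2021-01-20", "2020-02-02", "2021-11-30",
                        "2019-09-09"))
)

yearly <- donations %>%
  # Extract the calendar year of each gift
  mutate(year = as.integer(format(gift_date, "%Y"))) %>%
  group_by(donor_id) %>%
  # A donor's cohort is the first year in which they gave
  mutate(cohort = min(year)) %>%
  group_by(cohort, year) %>%
  # Count each donor once per cohort/year cell
  summarise(users = n_distinct(donor_id), .groups = "drop") %>%
  # One row per cohort, one column per year
  pivot_wider(names_from = year, values_from = users)

print(yearly)
```

Each row is a cohort (first year of giving) and each column is a calendar year. Cells for years before a cohort's first year come out as NA.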

change from counting n to n_distinct(var_id)

@PeerChristensen many thanks for this package. I used it for one of my analyses. One thing I noticed in the source code is that the values being calculated seem to be a count of rows (n) rather than n_distinct(var_id). I believe that to get the number of users we need to count the distinct user IDs?

i.e. instead of
dplyr::summarise(users = dplyr::n()) %>%

we have

dplyr::summarise(users = dplyr::n_distinct({{ id_var }})) %>%
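
The difference between the two counts can be seen in a small sketch. The `events` data below is hypothetical and is not taken from the package. It only shows how `dplyr::n()` and `dplyr::n_distinct()` diverge when a user has more than one row in the same period.

```r
library(dplyr)

# Hypothetical event log: user "a" appears twice in January
events <- tibble::tibble(
  user_id = c("a", "a", "b", "c"),
  month   = c("2021-01", "2021-01", "2021-01", "2021-02")
)

counts <- events %>%
  group_by(month) %>%
  summarise(
    rows  = n(),                  # counts every row in the group
    users = n_distinct(user_id),  # counts each user once per group
    .groups = "drop"
  )

print(counts)
```

For January, `rows` is 3 while `users` is 2, so the two only agree when each user has at most one row per period.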

How can I get historical data?

A great job with the package. It makes pulling Twitch data pretty simple.
I do have one request: being able to pull down historical data (i.e., not just the current month) would be really useful.

TX
David

cohorts converted to data.table/tidytable....

Hello Peer,

Thanks for your package. I find it very useful and convenient.

Since I am working with a large customer base, computing the monthly or daily cohorts takes a little while, so I converted your code to data.table (with a little bit of tidytable) and I already see an improvement. Even for the small online_cohorts dataset included in the package, I see a 2x speedup.

If you are interested, I can send you the code so you can extend your package with it.

Thanks,
Carlos.
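
For readers curious what a data.table version might look like, here is a rough sketch of a monthly cohort table. It is not Carlos's code, which is not shown in this thread, and the function name `cohort_table_month_dt`, its arguments, and the `orders` data are all hypothetical.

```r
library(data.table)

# Hypothetical order data
orders <- data.table(
  id   = c(1, 1, 2, 2, 3),
  date = as.Date(c("2021-01-05", "2021-02-10", "2021-01-20",
                   "2021-03-01", "2021-02-15"))
)

cohort_table_month_dt <- function(dt, id_col = "id", date_col = "date") {
  # Copy the two needed columns so the input is not modified by reference
  x <- data.table(id = dt[[id_col]],
                  month = format(dt[[date_col]], "%Y-%m"))
  # "YYYY-MM" strings sort chronologically, so min() gives the first month
  x[, cohort := min(month), by = id]
  # Distinct users per cohort/month cell
  counts <- x[, .(users = uniqueN(id)), by = .(cohort, month)]
  # One row per cohort, one column per month
  dcast(counts, cohort ~ month, value.var = "users")
}

result <- cohort_table_month_dt(orders)
print(result)
```

Whether this is faster than the dtplyr-based version would need to be benchmarked on real data. The sketch only shows the shape of the computation.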
