feasts's Introduction

feasts

Overview

feasts provides a collection of tools for the analysis of time series data. The package name is an acronym of its key features: Feature Extraction And Statistics for Time Series.

The package works with tidy temporal data provided by the tsibble package to produce time series features, decompositions, statistical summaries and convenient visualisations. These features are useful in understanding the behaviour of time series data, and integrate closely with the tidy forecasting workflow used in the fable package.

Installation

You can install the stable version from CRAN:

install.packages("feasts")

You can install the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("tidyverts/feasts")

Usage

library(feasts)
library(tsibble)
library(tsibbledata)
library(dplyr)
library(ggplot2)
library(lubridate)

Graphics

Visualisation is often the first step in understanding the patterns in time series data. The package uses ggplot2 to produce customisable graphics to visualise time series patterns.

aus_production %>% gg_season(Beer)

aus_production %>% gg_subseries(Beer)

aus_production %>% filter(year(Quarter) > 1991) %>% gg_lag(Beer)

aus_production %>% ACF(Beer) %>% autoplot()
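Because these plotting functions build on ggplot2, the usual ggplot2 layers can be added to their output. The sketch below assumes that `gg_season()` returns a ggplot object (as the `+ ylab()` usage in the package's issue tracker suggests); the title and axis label are illustrative choices, not package defaults.

```r
library(feasts)
library(tsibbledata)
library(dplyr)
library(ggplot2)

# Build the seasonal plot, then customise it with ordinary ggplot2 layers.
# The title and y-axis label are illustrative only.
p <- aus_production %>%
  gg_season(Beer) +
  labs(title = "Seasonal plot: Australian beer production",
       y = "Beer production") +
  theme_minimal()

p
```

The same pattern should apply to any of the plot helpers that return a single ggplot object.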

Decompositions

A common task in time series analysis is decomposing a time series into some simpler components. The feasts package supports two common time series decomposition methods:

  • Classical decomposition
  • STL decomposition

dcmp <- aus_production %>%
  model(STL(Beer ~ season(window = Inf)))
components(dcmp)
#> # A dable: 218 x 7 [1Q]
#> # Key:     .model [1]
#> # :        Beer = trend + season_year + remainder
#>    .model                           Quarter  Beer trend season_year remainder season_adjust
#>    <chr>                              <qtr> <dbl> <dbl>       <dbl>     <dbl>         <dbl>
#>  1 STL(Beer ~ season(window = Inf)) 1956 Q1   284  272.        2.14     10.1           282.
#>  2 STL(Beer ~ season(window = Inf)) 1956 Q2   213  264.      -42.6      -8.56          256.
#>  3 STL(Beer ~ season(window = Inf)) 1956 Q3   227  258.      -28.5      -2.34          255.
#>  4 STL(Beer ~ season(window = Inf)) 1956 Q4   308  253.       69.0     -14.4           239.
#>  5 STL(Beer ~ season(window = Inf)) 1957 Q1   262  257.        2.14      2.55          260.
#>  6 STL(Beer ~ season(window = Inf)) 1957 Q2   228  261.      -42.6       9.47          271.
#>  7 STL(Beer ~ season(window = Inf)) 1957 Q3   236  263.      -28.5       1.80          264.
#>  8 STL(Beer ~ season(window = Inf)) 1957 Q4   320  264.       69.0     -12.7           251.
#>  9 STL(Beer ~ season(window = Inf)) 1958 Q1   272  266.        2.14      4.32          270.
#> 10 STL(Beer ~ season(window = Inf)) 1958 Q2   233  266.      -42.6       9.72          276.
#> # … with 208 more rows
components(dcmp) %>% autoplot()
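The example above covers STL only. A classical decomposition can be sketched in the same way, assuming the installed version of feasts exports `classical_decomposition()` with a `type` argument; the object names `classical` and `classical_components` are chosen here for illustration.

```r
library(feasts)
library(tsibbledata)
library(dplyr)

# Fit an additive classical decomposition to the quarterly Beer series.
classical <- aus_production %>%
  model(classical_decomposition(Beer, type = "additive"))

# Extract the components as a decomposition table (dable).
classical_components <- components(classical)
classical_components

# Plot the components, as with the STL decomposition.
classical_components %>% autoplot()
```

Unlike STL, the classical method estimates the trend with a centred moving average, so the trend is expected to be missing for the first and last few observations.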

Feature extraction and statistics

Extract features and statistics across a large collection of time series to identify unusual/extreme time series, or find clusters of similar behaviour.

aus_retail %>%
  features(Turnover, feat_stl)
#> # A tibble: 152 × 11
#>    State             Indus…¹ trend…² seaso…³ seaso…⁴ seaso…⁵ spiki…⁶ linea…⁷ curva…⁸ stl_e…⁹ stl_e…˟
#>    <chr>             <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 Australian Capit… Cafes,…   0.989   0.562       0      10 5.15e-5   227.    48.5   0.281   0.187 
#>  2 Australian Capit… Cafes,…   0.993   0.629       0      10 9.73e-5   342.    77.8   0.320   0.218 
#>  3 Australian Capit… Clothi…   0.991   0.923       9      11 4.23e-6   131.    17.4   0.262   0.152 
#>  4 Australian Capit… Clothi…   0.993   0.957       9      11 1.29e-5   195.    19.3   0.262   0.193 
#>  5 Australian Capit… Depart…   0.977   0.980       9      11 2.21e-5   130.   -43.9  -0.254   0.119 
#>  6 Australian Capit… Electr…   0.992   0.933       9      11 2.68e-5   233.    -9.07  0.308   0.207 
#>  7 Australian Capit… Food r…   0.999   0.890       9      11 2.24e-4  1264.   199.    0.0866  0.268 
#>  8 Australian Capit… Footwe…   0.982   0.944       9      11 3.69e-6    64.0    1.95  0.152   0.176 
#>  9 Australian Capit… Furnit…   0.981   0.687       9       1 4.09e-5   141.   -21.6   0.200   0.0812
#> 10 Australian Capit… Hardwa…   0.992   0.900       9       4 1.32e-5   173.    45.1   0.102   0.0796
#> # … with 142 more rows, and abbreviated variable names ¹​Industry, ²​trend_strength,
#> #   ³​seasonal_strength_year, ⁴​seasonal_peak_year, ⁵​seasonal_trough_year, ⁶​spikiness, ⁷​linearity,
#> #   ⁸​curvature, ⁹​stl_e_acf1, ˟​stl_e_acf10

This allows you to visualise the behaviour of many time series (where the plotting methods above would show too much information).

aus_retail %>%
  features(Turnover, feat_stl) %>%
  ggplot(aes(x = trend_strength, y = seasonal_strength_year)) +
  geom_point() +
  facet_wrap(vars(State))

Most of Australia’s retail industries are highly trended and seasonal in all states.

It’s also easy to extract the most (and least) seasonal time series.

extreme_seasonalities <- aus_retail %>%
  features(Turnover, feat_stl) %>%
  filter(seasonal_strength_year %in% range(seasonal_strength_year))
aus_retail %>%
  right_join(extreme_seasonalities, by = c("State", "Industry")) %>%
  ggplot(aes(x = Month, y = Turnover)) +
  geom_line() +
  facet_grid(vars(State, Industry, scales::percent(seasonal_strength_year)),
             scales = "free_y")
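Several feature sets can also be computed in a single call. The following is a minimal sketch, assuming that `features()` accepts a list of feature functions and that `feat_acf` is available alongside `feat_stl` in the installed version; `retail_features` is a name introduced here for illustration.

```r
library(feasts)
library(tsibbledata)
library(dplyr)

# Compute STL-based and autocorrelation-based features together.
# The result has one row per series and one column per feature.
retail_features <- aus_retail %>%
  features(Turnover, list(feat_stl, feat_acf))

# Inspect a few of the combined features.
retail_features %>%
  select(State, Industry, trend_strength, seasonal_strength_year, acf1)
```

Combining feature sets like this keeps every feature in one table, which is convenient for later filtering, plotting or clustering.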

feasts's People

Contributors

andrewkinsman, davidtedfordholt, long39ng, mitchelloharawild, pursuitofdatascience, robjhyndman, teunbrand

feasts's Issues

tsfeatures

Perhaps pull in the features that are implemented in tsfeatures?

Also, I think STL should probably be moved to tsibblestats rather than live in fable.

Finally, perhaps tsibblestats should have features in its name? Perhaps feasts (FEatures And Statistics for Time Series)

gg_season and max_cols

I'm not sure the current defaults are right. For example

tsibbledata::PBS %>%
  filter(ATC2 == "A10") %>%
  summarise(Cost = sum(Cost)/1e6) %>%
  gg_season(Cost, labels = "both") +
    ylab("$ million") +
    ggtitle("Seasonal plot: antidiabetic drug sales")

is black and white, but there is no reason it should not be full colour given the legend is suppressed.

Fix NSE in features()

cc @robjhyndman

library(feasts)
#> 
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#> 
#>     X11
as_tsibble(USAccDeaths) %>% 
  mutate(diff = difference(value)) %>% 
  features(diff, unitroot_kpss)
#> Error in mutate(., diff = difference(value)): could not find function "mutate"

Created on 2019-04-30 by the reprex package (v0.2.1)

Re-factor plot labels to return minimal format type

It's possible to have inconsistent labels across facets, and it would be better to have appropriate labels for axis/legends.

library(feasts)
#> 
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#> 
#>     X11
co2 %>% 
  as_tsibble() %>% 
  gg_season(value, facet_period = "10 year", labels = "both")

Created on 2019-05-24 by the reprex package (v0.2.1)

keys missing

library(feasts)
#> 
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#> 
#>     X11
USAccDeaths %>% as_tsibble %>% STL(value ~ trend(window = 10))
#> Error in .f(.x[[i]], ...): object 'keys' not found

Created on 2019-04-01 by the reprex package (v0.2.1)

namespace issue with pillar_shaft

Installing tsibblestats fails with the following error:

 object 'pillar_shaft' not found whilst loading namespace 'tsibblestats'

It fails regardless of whether pillar is updated via CRAN or to the latest development version (‘1.3.0.9000’).

This also means I can't install fable.

cheers

Add title to gg_tsdisplay

Is there a way to add a title? The function does not return a ggplot2 object, so + ggtitle() does not work.

Generalise ACF family of functions to accept `...`

With @earowang:
Rather than accepting values, these functions should accept ..., which computes the ACF/PACF for each column with the resulting column names matching the input names (rather than just acf)

Details for the type of cf used will be stored as an attribute for printing and plotting.

For CCF, it should evaluate all pairwise cross-correlations, with resulting names joined by *

Inconsistent missing values in gg_lag

library(tsibbledata)
library(feasts)
#> 
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#> 
#>     X11

aus_production %>% gg_lag(Bricks)
#> Warning: Removed 20 rows containing missing values (geom_path).

aus_production %>% gg_lag(Bricks, geom='point')
#> Warning: Removed 180 rows containing missing values (geom_point).

Created on 2019-03-15 by the reprex package (v0.2.1)

Unnecessary warning

> aus_production %>% gg_lag(Beer)
Warning message:
Removed 0 rows containing missing values (gg_lag). 

autoplot() for decompositions: default black for one key

Can we default to black line when only one key?

library(feasts)
library(tsibble) 
#> 
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
pedestrian %>% 
  filter(Sensor == "Southern Cross Station") %>% 
  STL(Count) %>% 
  autoplot()

Created on 2019-02-25 by the reprex package (v0.2.1)

Curl error when trying to install via install_github

I get an error when trying to install tsibblestats using install_github
I have tried in different networks and got the same issue.

> devtools::install_github("tidyverts/tsibblestats")
Error in curl::curl_fetch_memory(url, handle = h) : 
  Failed to connect to api.github.com port 80: Timed out

ACF() error message

Can you please give a more human-readable message?

library(feasts)
tsibbledata::elecdemand %>% ACF()
#> Error in value[[1]]: subscript out of bounds

Created on 2019-02-25 by the reprex package (v0.2.1)

ndiffs inconsistency with forecast

library(feasts)
#> 
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#> 
#>     X11
as_tsibble(WWWusage) %>% 
  features(value, unitroot_ndiffs)
#> # A tibble: 1 x 1
#>   value_ndiffs
#>          <int>
#> 1            0
forecast::ndiffs(WWWusage)
#> [1] 1

Created on 2019-04-29 by the reprex package (v0.2.1)

Improve seasonal labelling for weekly data

library(feasts)
#> 
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#> 
#>     X11
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:tsibble':
#> 
#>     interval, new_interval
#> The following object is masked from 'package:base':
#> 
#>     date
gasoline <- tsibble(week = yearweek(ymd("1991-2-2") + weeks(seq_along(fpp2::gasoline))),
                    value = fpp2::gasoline, index = week)
gasoline %>% autoplot(value)

gasoline %>% gg_season(value, labels = "both")

Created on 2019-03-25 by the reprex package (v0.2.1)
