tsframes.jl's People

Contributors

asinghvi17, ayushpatnaikgit, chiraganand, codetalker7, doganmehmet, emmanuel-r8, frank-iii, harsharora21, matcauthon49, san-ath, shrutirdalvi, siddjain444, sumeetsuley, tumon2001, valentinkaisermayer, viralbshah

tsframes.jl's Issues

`apply` fails to correctly resample using `Dates.Week`

TS.apply() fails to correctly resample weekly data. Consider the following dataset:

julia> ts_daily_1
(360 x 1) TS with Date Index

 Index       data      
 Date        Float64   
───────────────────────
 2007-01-01  -0.790931
 2007-01-02   1.45561
 2007-01-03  -0.496326
 2007-01-04  -2.00011
     ⋮           ⋮
 2007-12-23   0.459265
 2007-12-24   0.744704
 2007-12-25   0.583233
 2007-12-26   0.104833
       352 rows omitted

The following operation outputs an incorrect result:

julia> ts_weekly = apply(ts_daily_1, Dates.Week(12), first)
(5 x 1) TS with Date Index

 Index       data_first 
 Date        Float64    
────────────────────────
 2007-01-01   -0.790931
 2007-01-29   -1.81454
 2007-04-23   -0.427515
 2007-07-16   -0.678125
 2007-10-08   -0.236422

Instead of each bucket spanning 12 weeks, the first bucket spans only 4 weeks (2007-01-01 to 2007-01-29).
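
One possible reading of the expected behaviour is that buckets are counted from the first timestamp in the series. The sketch below illustrates that reading on a plain vector of dates. `bucket_starts` is a hypothetical helper, not part of the package, and the issue does not say how `apply` actually anchors its buckets.

```julia
using Dates

# Hypothetical helper: label each date with the start of its fixed-width
# bucket, counting buckets from the first date of the (sorted) input.
function bucket_starts(dates::AbstractVector{Date}, period::Week)
    origin = first(dates)
    width = 7 * Dates.value(period)            # Week(12) -> 84 days
    return [origin + Day(fld(Dates.value(d - origin), width) * width) for d in dates]
end

# Same date range as the example above: 360 daily observations.
dates = collect(Date(2007, 1, 1):Day(1):Date(2007, 12, 26))
starts = unique(bucket_starts(dates, Week(12)))
```

Under this assumption the 360 daily dates fall into five buckets starting on 2007-01-01, 2007-03-26, 2007-06-18, 2007-09-10 and 2007-12-03, so every bucket except possibly the last spans a full 12 weeks.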

`TS` does not display Index as the first column

When a TS object is viewed, if ts.index != 1 then the Index is not the first column displayed. If the TS contains a large enough number of columns, then the index is not displayed at all in the terminal.

`ts.getindex()` only retrieves first valid value when indexing by date

The function Base.getindex() when accepting arguments of the form (ts, row::T) where {T<:TimeType} or (ts, row::AbstractVector{T}) where {T<:TimeType} only returns the first value corresponding to the given date. This seems like incorrect behaviour considering that the TS object is designed to allow duplicate indices.

Example:

julia> ts
(20 x 1) TS with Date Index

 Index       data  
 Date        Int64 
───────────────────
 2008-01-01      1
 2008-01-01      2
 2008-01-01      3
 2008-01-01      4
     ⋮          ⋮
 2008-01-01     17
 2008-01-01     18
 2008-01-01     19
 2008-01-01     20
    12 rows omitted

julia> ts[Date(2008,1,1), :data]
1

Ideally it should return

julia> ts[Date(2008,1,1), :data]
20-element Vector{Int64}:
  1
  2
  3
  4
  5
  6
  ⋮
 16
 17
 18
 19
 20

Proposed fix:

Consider the function

function Base.getindex(ts::TS, dt::T, j::AbstractVector{Int}) where {T<:TimeType}
    idx = findfirst(x -> x == dt, index(ts))
    ts[idx, j]
end

and replace findfirst with findall, which returns a vector of all positions where the date occurs in the index.
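
The difference between the two lookups can be seen on plain vectors. This is only a sketch: `idx` and `data` stand in for `index(ts)` and a data column, and are not the TS internals.

```julia
using Dates

# Stand-ins for the index and one data column of the example above.
idx = fill(Date(2008, 1, 1), 20)
data = collect(1:20)

dt = Date(2008, 1, 1)
first_match = findfirst(==(dt), idx)   # one position (or `nothing` if absent)
all_matches = findall(==(dt), idx)     # every matching position (possibly empty)

single = data[first_match]   # a scalar: only the first row for that date
every = data[all_matches]    # a vector with all rows for that date
```

Note that `findall` returns an empty vector when the date is absent, whereas `findfirst` returns `nothing`, so callers that check for a missing date would need to be adjusted accordingly.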

Implement Broadcasting

Currently, log is automatically broadcasted. It's best not to do this and just let users call log.(ts) instead, since special-casing log doesn't deal with other transformations, for instance when a user wants to take sqrt(ts).

Implement rename!()

Rename columns in place while protecting the Index column. Also, the requested names must not include a column named Index.
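
The two constraints could be checked before any renaming happens. The following is a minimal sketch on plain vectors of symbols; `check_new_names` is a hypothetical helper, and it assumes that `Index` is the first column and that one new name is supplied per non-Index column.

```julia
# Hypothetical validation helper; name and error messages are illustrative.
# Assumes `current` starts with :Index and `requested` names the other columns.
function check_new_names(current::Vector{Symbol}, requested::Vector{Symbol})
    length(requested) == length(current) - 1 ||
        throw(ArgumentError("expected one name per non-Index column"))
    :Index in requested &&
        throw(ArgumentError("`Index` is reserved and cannot be used as a column name"))
    return vcat(:Index, requested)   # Index stays in place, the rest is renamed
end

new_names = check_new_names([:Index, :x1, :x2], [:open, :close])
```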

apply is appending the function name to the column names

The function name gets added to the column names.

julia> ts_monthly = apply(ts, Month, last)
(15 x 1) TS with Date Index

 Index       value_last
 Date        Float64
────────────────────────
 2007-01-01    10.5902
 2007-02-01     8.85252
 2007-03-01     8.85252
 2007-04-01     9.04647
 2007-05-01     9.04647
 2007-06-01     8.26072
 2007-07-01     8.26072
 2007-08-01     8.26072
 2007-09-01     9.95546
 2007-10-01     9.95546
 2007-11-01     7.88032
 2007-12-01     7.88032
 2008-01-01     7.88032
 2008-02-01    10.6328
 2008-03-01     8.85252

Can we have a parameter to turn this off?
In my case, I had to do:

rename!(ts.coredata, replace.(names(ts.coredata), "_last" => ""))

I ran this at the end and then constructed the ts object again.

Implement Broadcasting

Since apply is really aggregation or downsampling it would be nice to supply some way of applying a function to a column which would keep the index intact.

ts = TS(rand(10))

sin.(ts[:x1]) .+ 2

I think this is already possible with rollapply, but it is kind of abusing the API:

rollapply(x->sin(x) + 2, ts, :x1, 1)
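
The idea of transforming values while keeping the index intact can be sketched with a toy type. `ToySeries` and `mapvalues` are hypothetical names used only for illustration; they are not the package's TS type or API.

```julia
using Dates

# Toy stand-in for a one-column time series; NOT the package's TS type.
struct ToySeries
    index::Vector{Date}
    values::Vector{Float64}
end

# Apply an elementwise function to the values and reuse the index unchanged.
mapvalues(f, s::ToySeries) = ToySeries(s.index, f.(s.values))

s = ToySeries(collect(Date(2020, 1, 1):Day(1):Date(2020, 1, 3)), [0.0, 1.0, 2.0])
t = mapvalues(x -> sin(x) + 2, s)
```

Unlike an aggregation, the result has exactly as many rows as the input, and each value stays attached to its original timestamp.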

Add consistency checks to expensive operations

Since the TS object can be manipulated externally we need to perform consistency checks to make sure certain conditions are not violated before any expensive operation (such as apply(), rollapply(), or even print()).

Conditions which should be checked:

  1. The index should always be sorted.
  2. Duplicate values in the index should be checked if the object doesn't support them. Whether the object supports duplicate values or not can be stored inside some sort of a metadata object without touching the TS struct. This is until DataFrames.jl starts supporting metadata.
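
A minimal sketch of the two checks on a plain index vector follows. `check_index` is a hypothetical helper, and the `allow_duplicates` keyword stands in for the metadata flag described above.

```julia
using Dates

# Hypothetical consistency check; `allow_duplicates` plays the role of the
# per-object metadata flag.
function check_index(idx::AbstractVector{<:TimeType}; allow_duplicates::Bool = true)
    issorted(idx) || throw(ArgumentError("index is not sorted"))
    allow_duplicates || allunique(idx) ||
        throw(ArgumentError("index contains duplicate values"))
    return true
end

ok = check_index([Date(2020, 1, 1), Date(2020, 1, 1), Date(2020, 1, 2)])
```

Both `issorted` and `allunique` make a single pass over the index (the latter with extra memory for a set), so the checks should be cheap relative to the operations they guard.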

Implement Method for Resampling

I think this is one of the most important methods for time series data: being able to interpolate and aggregate.

I like the interface of Grafana, i.e. being able to specify not just one of the interpolation or aggregation methods but both at the same time.

This is useful if you have measurement data at, e.g., roughly 5 min intervals with some holes in it and want a clean vector with an equidistant sample time of 15 min. Where there is good data it has to be aggregated, and where there are holes it has to be interpolated.

For interpolation, common methods would be

  • previous
  • next
  • nearest
  • linear
  • fill, typically with NaN or missing

And for aggregation

  • mean
  • median
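
To make the combination concrete, here is a minimal sketch that pairs `mean` aggregation with `previous` interpolation on plain vectors. `resample_mean_previous` is a hypothetical helper, it assumes the timestamps are sorted, and it is written for clarity rather than speed.

```julia
using Dates, Statistics

# Hypothetical helper: average the values inside each fixed-width bin and
# carry the previous bin's result forward into empty bins.
# Assumes `times` is sorted, so the first bin is never empty.
function resample_mean_previous(times::Vector{DateTime}, vals::Vector{Float64}, width::Minute)
    grid = floor(first(times), width):width:floor(last(times), width)
    out = Vector{Float64}(undef, length(grid))
    for (i, t) in enumerate(grid)
        inbin = [v for (tm, v) in zip(times, vals) if t <= tm < t + width]
        out[i] = isempty(inbin) ? out[i - 1] : mean(inbin)
    end
    return collect(grid), out
end

# Roughly 5-minute data with a hole between 00:10 and 00:50.
times = DateTime(2020, 1, 1) .+ Minute.([0, 5, 10, 50, 55])
vals = [1.0, 2.0, 3.0, 10.0, 20.0]
grid, out = resample_mean_previous(times, vals, Minute(15))
```

Here the 00:00 bin is the mean of three samples, the 00:15 and 00:30 bins are empty and take the previous value, and the 00:45 bin is the mean of the last two samples.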

Allow joins of two or more TS objects

Currently, TSx.join() can only merge two TS objects. DataFrames.innerjoin() et al. support joining two or more objects, so it should be possible to replicate that behaviour in the TSx join methods.

Keep a single primary method for renaming columns

Currently, there are two methods for rename!() and both contain the same function body. One of them should call the other to reduce the cost of maintaining two methods.

rename!(ts::TS, colnames::AbstractVector{String})
rename!(ts::TS, colnames::AbstractVector{Symbol})
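
The delegation pattern could look like the sketch below. `ToyTable` and `rename_cols!` are hypothetical stand-ins for TS and rename!(); which of the two methods should be the primary one is a design choice the issue leaves open.

```julia
# Toy container standing in for TS; only the delegation pattern matters here.
mutable struct ToyTable
    names::Vector{Symbol}
end

# Primary method: does the actual work.
function rename_cols!(t::ToyTable, colnames::AbstractVector{Symbol})
    length(colnames) == length(t.names) ||
        throw(ArgumentError("wrong number of names"))
    t.names = collect(colnames)
    return t
end

# Secondary method: converts the strings and delegates to the primary method.
rename_cols!(t::ToyTable, colnames::AbstractVector{<:AbstractString}) =
    rename_cols!(t, Symbol.(colnames))

tbl = ToyTable([:a, :b])
rename_cols!(tbl, ["x", "y"])
```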

Implement doing regressions using `rollapply()`

It is currently not possible to run multiple regressions using rollapply() because RollingFunctions.jl does not support rolling over tables.

Example code:

using GLM, Statistics

# Fit one regression; return the usdchf coefficient and the residual standard deviation.
function regress(data)
    ll = lm(@formula(inrchf ~ usdchf + eurchf + gbpchf + jpychf), data)
    co = coef(ll)[coefnames(ll) .== "usdchf"]
    sd = Statistics.std(residuals(ll))
    return Dict("coeff" => co, "sd" => sd)
end

rollapply(regress, returns, 200) # doesn't work

Implement `endpoints()` method

To locate the positions in the Index that correspond to a particular frequency.

Interface to be similar to R xts:

endpoints(ts::TS, on::Union{String, Symbol}, k::Int=1)
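
As a rough illustration of the idea, the sketch below finds the position of the last observation in each period on a plain vector of dates. `endpoint_positions` is a hypothetical helper that takes a labelling function instead of the `on` and `k` arguments proposed above, so it is not the proposed interface itself.

```julia
using Dates

# Hypothetical sketch: positions of the last observation in each period.
# `key` maps a date to its period label (e.g. Dates.yearmonth for months).
function endpoint_positions(dates::AbstractVector{Date}, key::Function)
    labels = key.(dates)
    return [i for i in eachindex(labels)
            if i == lastindex(labels) || labels[i] != labels[i + 1]]
end

dates = collect(Date(2007, 1, 1):Day(1):Date(2007, 3, 15))
ep = endpoint_positions(dates, Dates.yearmonth)
dates[ep]   # last available date of each month in the input
```

For this input the positions point at 2007-01-31, 2007-02-28 and 2007-03-15, the last of which is simply the final observation rather than a month end.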

`TimeArray` conversion

We should be able to make a TS object into a TimeArray whenever possible, and vice versa.
For example:

using Dates
dates = collect(Date(2008):Year(1):Date(2010))
ts = TS(1:3, dates)
TimeArray(ts) # Doesn't work 

Perhaps TimeArray(ts::TS) can be put in TimeSeries.jl and TS(ts::TimeArray) can be in TSx.

Allow subsetting TS for specific dates

This requires creating a getindex method which takes Vector{Date} as input.

Output should be like:

julia> dates = [Date(2007, 1, 1), Date(2007, 2, 1)]
julia> ts[dates]
(2 x 1) TS with Date Index

 Index       value
 Date        Float64
─────────────────────
 2007-01-01  10.8087
 2007-02-01  8.7392
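
The row lookup behind such a method can be sketched on plain vectors. `idx` and `vals` are stand-ins for the index and a value column; the two values for 2007-01-15 and 2007-03-01 are made up for illustration.

```julia
using Dates

# Stand-ins for the index and the value column (two of the values are invented).
idx = [Date(2007, 1, 1), Date(2007, 1, 15), Date(2007, 2, 1), Date(2007, 3, 1)]
vals = [10.8087, 9.5, 8.7392, 7.1]

wanted = [Date(2007, 1, 1), Date(2007, 2, 1)]
rows = findall(in(wanted), idx)   # positions whose date is one of `wanted`

subset_index = idx[rows]
subset_vals = vals[rows]
```

This keeps the rows in index order and silently ignores requested dates that are absent; whether a missing date should raise an error instead is a design decision the issue does not settle.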

Fix getindex testing error

Log from CI build:

getindex(): Test Failed at /home/runner/work/TSx.jl/TSx.jl/test/getindex.jl:36
  Expression: unique(Dates.yearmonth.(TSx.index(ts[y, m]))) == [(2007, 3)]
   Evaluated: Tuple{Int64, Int64}[] == [(2007, 3)]
Stacktrace:
 [1] top-level scope
   @ /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Test/src/Test.jl:445
 [2] include(fname::String)
   @ Base.MainInclude ./client.jl:451
 [3] macro expansion
   @ ~/work/TSx.jl/TSx.jl/test/runtests.jl:15 [inlined]
 [4] macro expansion
   @ /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Test/src/Test.jl:1283 [inlined]
 [5] top-level scope
   @ ~/work/TSx.jl/TSx.jl/test/runtests.jl:15
getindex(): Error During Test at /home/runner/work/TSx.jl/TSx.jl/test/runtests.jl:14
  Got exception outside of a @test
  LoadError: MethodError: no method matching test_types(::Float64)
  Closest candidates are:
    test_types(!Matched::TS) at ~/work/TSx.jl/TSx.jl/test/getindex.jl:3
  Stacktrace:
    [1] top-level scope
      @ ~/work/TSx.jl/TSx.jl/test/getindex.jl:47
    [2] include(fname::String)
      @ Base.MainInclude ./client.jl:451
    [3] macro expansion
      @ ~/work/TSx.jl/TSx.jl/test/runtests.jl:15 [inlined]
    [4] macro expansion
      @ /opt/hostedtoolcache/julia/1.7.3/x64/share/julia/stdlib/v1.7/Test/src/Test.jl:1283 [inlined]
    [5] top-level scope
      @ ~/work/TSx.jl/TSx.jl/test/runtests.jl:15
    [6] include(fname::String)
      @ Base.MainInclude ./client.jl:451
    [7] top-level scope
      @ none:6
    [8] eval
      @ ./boot.jl:373 [inlined]
    [9] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:268
   [10] _start()
      @ Base ./client.jl:495
  in expression starting at /home/runner/work/TSx.jl/TSx.jl/test/getindex.jl:47

Can `apply` work for all user provided functions?

The fun argument may be a function which works only for scalars or only for vectors (AbstractVector). The implementation of apply() needs to figure out whether or not to use the broadcasting operator. Alternatively, separate apply() methods could be created for the two cases.

isregular() method for checking regularity of TS object

Methods

  • isregular(ts::TS)
  • isregular(ts::TS, unit::T) where {T<:Dates.Period}
  • isregular(timestamps::T) where {T<:AbstractVector{<:TimeType}}
  • isregular(timestamps::V, unit::T) where {V<:AbstractVector{<:TimeType}, T<:Dates.Period}

Rules

  1. The input is considered regular if the times are in a sequence that is strictly monotone (either increasing or decreasing) with a unique time step. REF: Matlab isregular()
  2. For methods where the second argument (unit <: Dates.Period) is provided, the input is checked against rule 1 using that period as the time step.
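
The two rules could be sketched on plain timestamp vectors as follows. `isregular_sketch` is a hypothetical name, and treating inputs with fewer than two timestamps as irregular is an assumption made here, not something the rules specify.

```julia
using Dates

# Rule 1: strictly monotone with a single step, i.e. all consecutive
# differences are equal and non-zero.
function isregular_sketch(timestamps::AbstractVector{<:TimeType})
    length(timestamps) < 2 && return false   # assumption: no step to speak of
    steps = diff(timestamps)
    return Dates.value(first(steps)) != 0 && all(==(first(steps)), steps)
end

# Rule 2: every timestamp plus `unit` must give the next timestamp.
function isregular_sketch(timestamps::AbstractVector{<:TimeType}, unit::Period)
    length(timestamps) < 2 && return false
    return all(timestamps[i] + unit == timestamps[i + 1] for i in 1:length(timestamps)-1)
end

daily = collect(Date(2022, 1, 1):Day(1):Date(2022, 1, 10))
monthly = collect(Date(2022, 1, 1):Month(1):Date(2022, 6, 1))
```

The one-argument version compares fixed differences, so month starts (28 to 31 days apart) are not regular under it, while `isregular_sketch(monthly, Month(1))` treats them as regular because it steps by calendar months.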

Ref: #48

Incorrect output of `getindex(ts::TS, y::Year, m::Month, w::Week)`

julia> ts[Year(2022), Month(8), Week(1)]
(0 x 1) TS with Date Index

But, the following returns correct output:

julia> ts[Year(2022), Month(8), Week(32)]
(7 x 1) TS with Date Index

 Index       x1
 Date        Float64
───────────────────────
 2022-08-08   0.647277
 2022-08-09   0.800605
 2022-08-10   0.698464
 2022-08-11   0.868943
 2022-08-12   0.510194
 2022-08-13   2.4704
 2022-08-14  -0.86813

This is because Dates.week() returns the week number within the year (the Nth week of the year), not within the month. This getindex() method should take just the year and the week as arguments so that the functionality is clear.

function getindex(ts::TS, y::Year, w::Week)
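
The week numbering and a year-plus-week filter can be illustrated on a plain vector of dates. `select_week` is a hypothetical helper, not the proposed getindex method.

```julia
using Dates

dates = collect(Date(2022, 8, 1):Day(1):Date(2022, 8, 31))

# Dates.week gives the ISO week of the year, so August 2022 starts in week 31.
w_first = Dates.week(Date(2022, 8, 1))
w_second = Dates.week(Date(2022, 8, 8))

# Hypothetical year + week filter on a plain vector of dates.
select_week(dates, y::Year, w::Week) =
    filter(d -> Dates.year(d) == Dates.value(y) && Dates.week(d) == Dates.value(w), dates)

picked = select_week(dates, Year(2022), Week(32))   # 2022-08-08 through 2022-08-14
```

One caveat of this sketch is that ISO weeks can straddle a year boundary (2022-01-01, for example, belongs to the last ISO week of 2021), so combining `Dates.year` with `Dates.week` needs care for dates in early January or late December.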

Allow subsetting TS using Date and specific columns

Essentially, a getindex method like: getindex(::TS, ::Vector{Date}, ::T) where {T<:Union{String, Symbol, Int}}. Another method should be created which takes a scalar Date and calls the previous method internally with [Date()] (related to #19).
