fraunhoferportugal / tsfel

An intuitive library to extract features from time series.

Home Page: https://tsfel.readthedocs.io

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

classification colab-notebook data-science feature-engineering feature-extraction time-series

tsfel's Issues

Tsfel on google Colab

Hi. Thanks for the amazing work.
I would like to use tsfel on Colab.
Could you please tell me how to access that Colab file?

Additional TSFEL examples

It would be nice to have additional examples of TSFEL apart from HAR.

A priori 'feature vector size'

Dear all,

Playing with your tool I wanted to obtain the spectral features for a given signal

cfg = tsfel.get_features_by_domain(domain='spectral')
len(cfg['spectral'].keys())
26

26 spectral features, nice!

But, when I calculated those features

#Fs previously obtained from data
X = tsfel.time_series_features_extractor(cfg, data,fs=Fs)
X.size
335

335 elements!

I would love to iterate over several time series and obtain a feature matrix. I would like to know a priori the size of the X features, so that I can create an array of the right shape to store the values.
I know that certain features of the signal are computed in time slots (such as FFT_mean_coeff), but it is really time-consuming to work out by hand how many results per feature I should expect.

Thus, is there any option to know a priori how many elements will be in the X series?
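
One way to avoid needing the size in advance is to collect one feature row per series in a list and stack the rows at the end. The sketch below is a minimal illustration under stated assumptions: `extract_row` is a hypothetical stand-in that returns a fixed-length vector per signal and is not a TSFEL function; with TSFEL, each row would instead come from the extractor's output.

```python
import numpy as np

def extract_row(signal):
    # Hypothetical stand-in for a feature extractor; NOT part of TSFEL.
    signal = np.asarray(signal, dtype=float)
    return np.array([signal.mean(), signal.std(), signal.min(), signal.max()])

def build_feature_matrix(signals, extractor=extract_row):
    # Collect one row per series, then stack once at the end.
    # No preallocation is needed, so the row length can be discovered lazily.
    rows = [extractor(s) for s in signals]
    return np.vstack(rows)

signals = [np.arange(10), np.arange(20), np.ones(5)]
feature_matrix = build_feature_matrix(signals)
print(feature_matrix.shape)
```

Stacking once at the end is usually cheaper than growing an array inside the loop, and it works for series of different lengths as long as every row has the same number of features.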

Progress bar still shows up with verbose = 0

Hi, unless I misunderstood how to fully disable the progress bar, time_series_features_extractor() still displays one. It stays empty, but it still prints something, which causes me trouble when trying to integrate tsfel with streamlit.

Feature Reduction

Thanks for the useful library.
I would like to use it to extract features from a time-series dataset composed of experiments characterized by 30 synchronous time series, each collected at 1 Hz.
By running your code I get 11,700 features.
Are there any means to automatically reduce the features to the subset your tool evaluates as the most important?

Since I have 400 experiments, running a feature selection process afterwards does not seem a good option.

Thanks

Sampling Frequency as Integer

Hello! First of all, thank you for developing this great library.
I've been using it lately, and the dataset I'm working on has a low, non-integer sampling frequency of 1/60 Hz. Since tsfel assumes the sampling frequency is an integer, it truncates that value to zero, which makes it impossible to extract some features because a "ZeroDivisionError" occurs.

Correlation Feature Selection

Hello there,

Not an issue, just a suggestion.

I saw that you created a function in signal_processing called correlated_features, which identifies highly correlated features given a threshold and then returns a list of features to drop. There is an example of how to use it in the notebook TSFEL_HAR_Example.ipynb, but you don't use it in a pipeline.

I've used that function as a base to create a class that can be used in pipelines. Please note that I think you should update your function to take the absolute value of the correlation, as currently you are only dropping highly positively correlated features (with correlation greater than the positive threshold). However, correlations have range [-1, 1], so I would have thought you also want to drop highly negatively correlated features (i.e. with correlation smaller than the negative threshold). See my code below.

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class CorrelationThreshold(BaseEstimator, TransformerMixin):
    
    """Feature selector that removes all correlated features.

    This feature selection algorithm looks only at the features (X), not the
    desired outputs (y), and can thus be used for unsupervised learning.
    
    Parameters
    ----------
    threshold : float, default=0.95
        Features whose absolute training-set correlation with an earlier
        feature exceeds this threshold will be removed.

    Attributes
    ----------
    to_drop : list
        Names of the features removed because of high correlation.
    to_keep : list
        Names of the selected (retained) features.

    """

    def __init__(self, threshold = 0.95):
        self.threshold = threshold
        self.to_drop = None
        self.to_keep = None

    def fit (self, X, y = None ): 
        '''
        Parameters
        ----------
        X : {array-like, sparse matrix}, shape (n_samples, n_features)
            Sample vectors from which to compute correlations.
        y : any, default=None
            Ignored. This parameter exists only for compatibility with
            sklearn.pipeline.Pipeline.
        Returns
        -------
        self
        '''
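        corr_matrix = X.corr().abs()
        # Keep only the strict upper triangle so each pair is checked once.
        # (np.bool was removed from recent NumPy releases; use the builtin bool.)
        upper = corr_matrix.where(np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1))
        self.to_drop = [column for column in upper.columns if any(upper[column] > self.threshold)]
        # Preserve the original column order (a set difference would not).
        self.to_keep = [column for column in X.columns if column not in self.to_drop]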
        return self
        
    def transform(self, X, y = None):
        X_selected = X[self.to_keep]
        return X_selected
    
    def get_support(self):
        return self.to_keep

Please note that this is a topic under discussion in the sklearn community too: #13405, #14698. One thing that should be further discussed, analysed and improved is how to choose which of the highly correlated variables should be dropped. I think some criterion based on the relationship with the y label (for example an F-value) could be appropriate, although it would only apply to supervised learning problems.
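
To illustrate why the absolute value matters, the sketch below builds a small frame in which one column is the exact negative of another. The helper `columns_to_drop` is a hypothetical function written for this example (it is not TSFEL's correlated_features); it applies the same upper-triangle rule with and without taking the absolute correlation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = pd.DataFrame({"a": a, "neg_a": -a, "noise": rng.normal(size=200)})

def columns_to_drop(frame, threshold=0.95, use_abs=True):
    # Hypothetical helper for illustration only.
    corr = frame.corr()
    if use_abs:
        corr = corr.abs()
    # Strict upper triangle: each pair of columns is inspected once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    return [col for col in upper.columns if (upper[col] > threshold).any()]

signed = columns_to_drop(X, use_abs=False)    # misses the correlation of -1
absolute = columns_to_drop(X, use_abs=True)   # flags "neg_a"
print(signed, absolute)
```

Without the absolute value, the perfectly anti-correlated column is kept even though it carries no additional information.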

Overlapping windows

Hello! Thank you in advance for providing this useful package.

I am currently using your package (tsfel) to extract features from a dataframe with physiological signals as columns.
I have a very specific question regarding the overlap of the moving window. I was wondering how the overlap is rounded when an odd window size is used. For example, if I have a window size of 1281 and an overlap of 0.7, my window will shift (1-0.7)*1281 = 384.3 samples down in my dataframe for each feature extraction step. Is this rounded up or down (ceil or floor), i.e. 385 or 384?
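
The arithmetic in question can be written out as below. This is only a sketch of the candidate rounding rules; which one TSFEL actually applies is not stated here and would need to be checked in its source.

```python
import math

window_size = 1281
overlap = 0.7

# Fractional step implied by the overlap (about 384.3 samples).
exact_step = (1 - overlap) * window_size

step_floor = math.floor(exact_step)  # round down
step_ceil = math.ceil(exact_step)    # round up
step_trunc = int(exact_step)         # truncation, same as floor for positive values

print(exact_step, step_floor, step_ceil, step_trunc)
```

Over a long recording the two choices drift apart by one sample per window, so the number of windows produced can differ between them.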

Kind regards,

Maarten

FFT Frequency Bins (np.linspace vs np.fftfreq)

I've noticed a small inconsistency between the usage of np.fft.fft and the numpy documentation.
Namely in tsfel/feature_extraction/features_utils.py, from line 52 onwards (calc_fft).

The numpy documentation at https://numpy.org/doc/stable/reference/generated/numpy.fft.fft.html#numpy.fft.fft
uses fftfreq to extract the center of each frequency bin. While the TSFEL implementation uses np.linspace.

The output of both functions differs slightly, likely due to how the zero-frequency component is accounted for.
Example:

fs = 10
signal_length = 50
a = np.linspace(0, fs // 2, signal_length // 2) # used in calc_fft
b = np.fft.rfftfreq(signal_length, d=1/fs) # numpy documentation
a, b, a.shape, b.shape

yields:

(array([0.        , 0.20833333, 0.41666667, 0.625     , 0.83333333,
        1.04166667, 1.25      , 1.45833333, 1.66666667, 1.875     ,
        2.08333333, 2.29166667, 2.5       , 2.70833333, 2.91666667,
        3.125     , 3.33333333, 3.54166667, 3.75      , 3.95833333,
        4.16666667, 4.375     , 4.58333333, 4.79166667, 5.        ]),
 array([0. , 0.2, 0.4, 0.6, 0.8, 1. , 1.2, 1.4, 1.6, 1.8, 2. , 2.2, 2.4,
        2.6, 2.8, 3. , 3.2, 3.4, 3.6, 3.8, 4. , 4.2, 4.4, 4.6, 4.8, 5. ]),
 (25,),
 (26,))

The two parts would be equivalent when using np.linspace(0, fs // 2, signal_length // 2 + 1).

As noted the output differs slightly. If I'm reading this correctly, the linspace is calculated for 25 frequencies including the zero-frequency component, while the rfftfreq is calculated for 25 frequencies plus zero-frequency component.
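
The equivalence claimed above can be checked numerically. The sketch assumes an even signal length and an even integer fs, as in the example; the variable names are chosen for illustration.

```python
import numpy as np

fs = 10
signal_length = 50  # even length assumed

# Frequency axis as built in the reported calc_fft code (25 points).
linspace_bins = np.linspace(0, fs // 2, signal_length // 2)

# Same call with one extra point (26 points).
adjusted_bins = np.linspace(0, fs // 2, signal_length // 2 + 1)

# Bin centres according to NumPy (26 points, including 0 Hz and fs/2).
reference_bins = np.fft.rfftfreq(signal_length, d=1 / fs)

bins_match = np.allclose(adjusted_bins, reference_bins)
print(linspace_bins.shape, reference_bins.shape, bins_match)
```

With the extra point the spacing becomes fs / signal_length = 0.2 Hz, which is the resolution of the FFT bins for this length.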

Extract relevant features from different domain.

Extracting all domain features takes a long time, so how can I extract only the relevant features from all three domains?
Please help!

time series with variable length?

Hi, I am curious whether we could use this to extract features from multi-dimensional time series with variable lengths?

a problem with correlation_report

In correlation_report, please provide a parameter for removing highly correlated features instead of asking the question through an input prompt.

rolling over timeseries

Hi folks! Thanks for this fantastic contribution. I'm excited to test the capabilities of this package.

I am having a hard time extracting the features constructed by tsfel for a univariate time series rolled by date. For example, I have a pandas dataframe with m dates and n features, and I want to compute the tsfel feature set for a fixed window size. As a result, I should get a dataframe with m rows (dates) and n times y columns (y being the number of variables derived by tsfel). Any comments are welcome.

Thanks in advance!

tsfel.dataset_features_extractor checks

I am trying to use tsfel.dataset_features_extractor on the data file https://github.com/numenta/NAB/blob/master/data/realAWSCloudwatch/ec2_cpu_utilization_24ae8d.csv

The following message is printed:
Features files saved in:
But there are no results in the directory. I expect some sort of error message or warning if nothing is produced. It is not obvious how I can debug this.

Minimal sample window size

I'm using:

X_train = tsfel.time_series_features_extractor(cfg_file, X_train_sig, fs=fs, window_size=window_size)

Is there a minimal sample window size of 12?

If I use a smaller window_size then I get errors like below (I'm running through Julia so the messages are not so nice)

File "..../tsfel/feature_extraction/calc_features.py", line 297, in time_series_features_extractor for i, feat in enumerate(features):

Typo in doc

hypotenuse

tsfel/tsfel/feature_extraction/features.py

Line 201 in 4e07830

using the hipotenusa between 2 datapoints.

Get different number of columns in extracted features

I use time_series_features_extractor to extract features from multiple time series, given as a dataframe with columns [ts1, ts2]. I have several such tables; I run time_series_features_extractor on each and want to concatenate the results along axis=1. But I find that the numbers of extracted features differ. Can you tell me in what situations a feature will be dropped from the result? It seems 780 features are extracted by default.

multiple timeseries feature extraction

Hi,

I am using tsfel to analyze 564 time series. I want to extract the features of each time course and get a dataframe containing the features for all time courses (they should have the same columns, and each row should represent a specific time course).

So I used a loop for this. My dataset has NaN values in some time courses.

My code looks like this, but it only shows that feature extraction started; it never finishes and does not return the features.

Any suggestions on this?

Many thanks!

Using feature extraction on a frequency response function

Good day

I have experimental results from modal hammer testing of a bolted beam structure. These results are in the form of a frequency response function (FRF), with amplitude as the y axis and frequency as the x axis.

I would love to use your code in the spectral domain for feature extraction of my signals. The issue that I'm facing is that your code calculates the fft within the functions. Is there a way to use your spectral domain code without calculating the fft in the function codes? (As in, allow me to input an fft as the input signal to the code)

In essence, if I use your code as is, I would be calculating the fft of an fft.

I would greatly appreciate feedback.
Thank you

Installation problem using Anaconda

Hello,

I am using Anaconda,
Do you have any idea why I can't install the packages using pip?

Requirements for input dataframe

Hi,
I have another question related to the input format of the data for tsfel.

Let's say I have the following dataframe (time_df):

id  timestamp  ch0  ch1  ch2
0   1          0.5  0.8  0.9
0   2          0.9  0.9  0.8
...
0   100        0.8  0.8  0.8
1   1          0.9  0.1  0.1
...
1   50         0.9  0.9  0.9
2, 3, etc.

Where id is the measurement number, and ch0, ch1, ch2 are the channels recorded.

Let's say for measurement 0, I had 100 points per channel; for measurement 1, I had 50 points, and for measurement 2, I had 70 points.

Just to put it another way:
for measurement 0 - an array of 3x100
for measurement 1 - an array of 3x50
for measurement 2- an array of 3x70

The sampling frequency is the same ( let's say 1 sample/second)

When I used tsfresh to generate features, I just provided the columns with IDs and timestamps:
ts_features = extract_features(time_df, column_id='id', column_sort='timestamp')

It boils down to the question:
How do I need to reshape the data frame to use tsfel to generate each measurement's features, regardless of the measurement lengths (they may be the same, or may be different)?
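
One possible way to prepare such data is to split the long-format frame into one frame per measurement with pandas, so that each measurement can be processed on its own whatever its length. The sketch below uses a small made-up frame with the same column layout; it only shows the reshaping step. Passing each resulting frame to `tsfel.time_series_features_extractor`, one call per measurement, is an assumption about usage and is not executed here.

```python
import numpy as np
import pandas as pd

# Toy data with the layout described above: three measurements of lengths 4, 2 and 3.
time_df = pd.DataFrame({
    "id": [0] * 4 + [1] * 2 + [2] * 3,
    "timestamp": [1, 2, 3, 4, 1, 2, 1, 2, 3],
    "ch0": np.linspace(0.1, 0.9, 9),
    "ch1": np.linspace(0.9, 0.1, 9),
    "ch2": np.full(9, 0.5),
})

channels = ["ch0", "ch1", "ch2"]

# One (n_samples x n_channels) frame per measurement, sorted by time.
measurements = {
    int(meas_id): group.sort_values("timestamp")[channels].reset_index(drop=True)
    for meas_id, group in time_df.groupby("id")
}

lengths = {meas_id: len(frame) for meas_id, frame in measurements.items()}
print(lengths)
```

Each value in `measurements` is an independent table, so the per-measurement feature rows can be concatenated afterwards with the measurement id as index.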

RuntimeWarning: divide by zero encountered in double_scalars

There is an unhandled exception in line 770 in features.py. The corresponding function has the following code:

@set_domain("domain", "statistical")
def ecdf_slope(signal, p_init=0.5, p_end=0.75):
    """Computes the slope of the ECDF between two percentiles.
    Possibility to return infinity values.

    Feature computational cost: 1

    Parameters
    ----------
    signal : nd-array
        Input from which ECDF is computed
    p_init : float
        Initial percentile
    p_end : float
        End percentile

    Returns
    -------
    float
        The slope of the ECDF between two percentiles
    """
    signal = np.array(signal)
    # check if signal is constant
    if np.sum(np.diff(signal)) == 0:
        return np.inf
    else:
        x_init, x_end = ecdf_percentile(signal, percentile=[p_init, p_end])
        return (p_end - p_init) / (x_end - x_init)

Unfortunately, the case x_end - x_init == 0 is not handled and thus occasionally results in:

PathToAnaconda\lib\site-packages\tsfel\feature_extraction\features.py:770: RuntimeWarning: divide by zero encountered in double_scalars
  return (p_end - p_init) / (x_end - x_init)
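
A guarded version could look like the sketch below. It is an illustration rather than the library's fix: `np.quantile` is used as a stand-in for TSFEL's `ecdf_percentile` (the two may not return identical values), and `safe_ecdf_slope` is a hypothetical name.

```python
import numpy as np

def safe_ecdf_slope(signal, p_init=0.5, p_end=0.75):
    signal = np.asarray(signal, dtype=float)
    # Stand-in for ecdf_percentile: values at the two requested percentiles.
    x_init, x_end = np.quantile(signal, [p_init, p_end])
    # Guard the denominator directly instead of testing for a constant signal.
    if x_end == x_init:
        return np.inf
    return (p_end - p_init) / (x_end - x_init)

# Not constant overall, but flat between the two percentiles.
flat_region = safe_ecdf_slope([0, 1, 1, 1, 1, 1, 1, 2])
regular = safe_ecdf_slope(np.arange(101))
print(flat_region, regular)
```

The first signal shows why checking only for a constant input is not enough: the signal varies, yet both percentiles fall on the same value.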

UnicodeEncodeError: 'charmap' codec can't encode character '\u2588' in position 12: character maps to <undefined>

I'm using TSFEL on a Windows 10 machine and end up with the following error message whenever I enable the progress bar for feature extraction:

UnicodeEncodeError: 'charmap' codec can't encode character '\u2588' in position 12: character maps to <undefined>

I'm not really sure why this problem occurs but this Stackoverflow thread provides some suggestions on how to mitigate the issue.

If I set the verbose parameter to 0, everything works as expected.

Working with unevenly sampled data

I have a question about the following usage scenario of tsfel:
I have a pandas data frame with the following structure:

time           value
2020-01-01       1.2
2020-01-02       1.3
2020-01-04       1.1
2020-01-07       1.0
2020-01-08       1.5

As you may see, the sampling frequency is not constant (sometimes it is once per day, sometimes it is once per few days).
Can you please let me know if tsfel can handle this kind of data to extract time-series features?
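
Whether tsfel accepts irregular sampling directly is not answered here. A common preprocessing step, sketched below with pandas only, is to resample the series onto a regular grid and interpolate the gaps before extracting features. The daily grid and time-based linear interpolation are assumptions chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime(
        ["2020-01-01", "2020-01-02", "2020-01-04", "2020-01-07", "2020-01-08"]
    ),
    "value": [1.2, 1.3, 1.1, 1.0, 1.5],
})

# Put the observations on a regular daily grid; missing days become NaN.
daily = df.set_index("time")["value"].resample("D").mean()

# Fill the gaps by linear interpolation in time.
regular = daily.interpolate(method="time")
print(regular)
```

After this step the series has a constant sampling interval of one day, so a single sampling frequency describes it.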

Get features for each line

Hello,

I would like to generate features for each observation of my time series and not only window by window.

Does this possibility exist in tsfel, and do you know how to do it?

Thanks in advance

Updates to the pip package

Hello! thank you for the tsfel package.

Do you plan to push recently made updates to the PyPI? I see that there have been a number of changes to tsfel since 14th Feb.

If not, would you recommend that we use the development branch or stick with v0.1.4?

How to extract certain feats from a domain

Hi, thanks for this tool! It's a huge help. I'm struggling with how I go about extracting some but not all features from the spectral domain.

To extract all we use something like this:
cfg_file = tsfel.get_features_by_domain('spectral')
data = tsfel.time_series_features_extractor(cfg_file, data, fs=fs)

Which function can we use to extract a list of chosen features?
Thanks!
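
One approach is to edit the configuration dictionary before passing it to the extractor. The sketch below assumes the dictionary has the layout {domain: {feature name: {"use": "yes" or "no", ...}}}, which matches the 'use' field of features.json mentioned in another report on this page; `keep_only` is a hypothetical helper and the toy configuration is illustrative, not the real TSFEL default.

```python
import copy

def keep_only(cfg, domain, wanted):
    # Hypothetical helper: returns a copy in which only `wanted` features are enabled.
    cfg = copy.deepcopy(cfg)
    for name, settings in cfg[domain].items():
        settings["use"] = "yes" if name in wanted else "no"
    return cfg

# Toy configuration mimicking the assumed layout.
toy_cfg = {
    "spectral": {
        "Fundamental frequency": {"use": "yes"},
        "Max power spectrum": {"use": "yes"},
        "Spectral centroid": {"use": "yes"},
    }
}

selected = keep_only(toy_cfg, "spectral", {"Spectral centroid"})
print({name: s["use"] for name, s in selected["spectral"].items()})
```

Working on a deep copy leaves the original configuration untouched, so several feature subsets can be derived from the same starting dictionary.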

How to edit features configuration dictionary (JSON)

Hello, what would be the process for editing the dictionary?

Originally posted by @espjose in #89 (comment)

tsfel.negative_turning, tsfel.positive_turning returns zero in spyder 5.0 numpy 1.20.2

I think this is due to recent updates in numpy.

diff_sig = np.diff(signal,axis=0)

fixes the issues

How to edit the params for peak detection to get only prominent peaks

Not an issue, more of a question about classes

What should the dataset look like in order to create features based on a class of the data? Let's say I have this data, where the class is 1 or 2:

   energy.current  energy.power  energy.powerFactor  class  time
0           0.080          12.5                0.67      1     1
1           0.081          12.6                0.67      1     2
2           0.083          12.7                0.66      1     3
3           0.083          12.7                0.66      1     4
4           0.080          12.5                0.67      2     1
5           0.081          12.6                0.67      2     2
6           0.083          12.7                0.66      2     3
7           0.083          12.7                0.66      2     4

How do I provide this data to the library for it to be able to generate features by class?

Notebook and Google Sheet

I would like to get a simple demo of tsfel running, so I downloaded the notebooks and ran into:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-14-0cfaaee0f6b7> in <module>
      2 googleSheet_name = "MH Copy of Features_dev"
      3 # Extract excel info
----> 4 cfg_file = tsfel.extract_sheet(googleSheet_name)
      5 
      6 # Get features

~/Library/Python/3.6/lib/python/site-packages/tsfel/utils/gSheetsFilters.py in extract_sheet(gsheet_name, **kwargs)
    111 
    112     assert len(list_of_features) <= (len_json), \
--> 113         "To insert a new feature, please add it to data/features.json with the code in src/utils/features.py"
    114 
    115     # adds a new feature in Google sheet if it is missing from features.json

AssertionError: To insert a new feature, please add it to data/features.json with the code in src/utils/features.py

I'm not sure the first demo should use Google sheets - maybe that table could be inserted into the Notebook?

[Feature]Verbose option

Dear all,
Thank you for your incredible work.

I am giving a try with your software, analyzing some neural recordings that I have.
Because of limited computational power, I am running my code through Google's Colaboratory. However, since this tool has a fixed running time limit, a verbose option would really help when running

tsfel.time_series_features_extractor(cfg,data)

as it may help to estimate how long a given feature calculation should take.

Determining correct sampling frequency?

Hi,

First, thank you for such a wonderful library; it's a real life saver. I wanted to clarify the sampling frequency: say that in my time series dataset I have an observation timestamped every 10 minutes. Should I then set the sampling frequency 'fs' to about 1/600, i.e. roughly 0.001667?
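
The conversion from a sampling interval to a frequency in Hz is just the reciprocal of the interval in seconds. A minimal sketch follows; the helper name `sampling_frequency_hz` is made up for this example.

```python
def sampling_frequency_hz(interval_seconds):
    # fs = 1 / T, with T the time between consecutive samples in seconds.
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return 1.0 / interval_seconds

fs_10min = sampling_frequency_hz(10 * 60)  # one sample every 10 minutes
print(fs_10min)

# Casting such a value to an integer truncates it to zero, which is how
# a division by zero can arise downstream if fs is treated as an integer.
fs_as_int = int(fs_10min)
print(fs_as_int)
```

Keeping fs as a float preserves sub-hertz rates like this one.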

How to use this library with a timeseries dataframe ?

Would you be able to share an example of how to use this library with a time-series dataframe?

The library is great; it is just that I am trying to get my head around the fs and window_size parameters to feed in my hourly data.

Basically, how do I have to map the data from time to frequency?

My schema looks like the one below:
hourly recording :
id, time, feature1, feature2 , target

Thanks!

Incorrect number of features for lpcc function

Hey! I really like the many feature functions that are implemented in this library 😄
When toying around with your library, I found a bug (some unexpected behavior).

When calling lpcc on some data with default arguments (i.e., n_coeff=12) it returns an array of length 13.

import numpy as np
from tsfel.feature_extraction import lpcc
# Calculate lpcc feature on some dummy data
len(lpcc(np.arange(500)))  # returns 13

However, when calling the get_number_features function, it returns that only 12 features will be returned.

from tsfel import get_features_by_domain, get_number_features
# Get number of features for lpcc feature configuration
feat_dict = get_features_by_domain("spectral")
get_number_features({"dummy": {"dummy": feat_dict["spectral"]["LPCC"]}})  # returns 12

This difference arises because the get_number_features function uses the n_coeff parameter to determine the number of features. But in the lpc function the output is prepended with one value ([1]), making the output of lpc length 13 and thus also the output of lpcc.

So the number of features for lpcc will always be n_coeff + 1. Is there a way to encode this in the feature configuration (dictionary)?

Understanding outputs: what's the difference between 0_Mean and 1_Mean?

Hi guys,
When running the following, it extracts two sorts of statistical features. For example, I get 0_Mean but also 1_Mean. It seems that the first is the regular mean, but I cannot figure out from the documentation what the latter mean is. The same question applies to all other features, which are returned as 0_featureName and 1_featureName.

Could you clarify?

import tsfel
cfg_file = tsfel.get_features_by_domain("statistical")
X_train = tsfel.time_series_features_extractor(cfg_file, dataset, fs=1, window_size=10)

Best, Patrick

Repeated calculation slow down feature calculation

I apologize in advance if this is not the right place for this post. I am new to Github.
TSFEL repeats basic calculations, such as FFT, for different types of features, which slows down the feature calculation unnecessarily. I work with time series with over 160 million data points and calculating FFT costs 2 minutes each time. Wouldn't it be better to store and retrieve the FFT for the following features?
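
The idea of computing the transform once and sharing it can be sketched as follows. The functions `spectrum`, `spectral_centroid` and `peak_frequency` are hypothetical examples written for this illustration and are not TSFEL's implementations.

```python
import numpy as np

def spectrum(signal, fs):
    # Compute the one-sided FFT magnitude and its frequency axis once.
    signal = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(signal.size, d=1 / fs)
    magnitude = np.abs(np.fft.rfft(signal))
    return freqs, magnitude

# Feature functions take the precomputed spectrum instead of the raw signal.
def spectral_centroid(freqs, magnitude):
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))

def peak_frequency(freqs, magnitude):
    return float(freqs[np.argmax(magnitude)])

fs = 100
t = np.arange(100) / fs
signal = np.sin(2 * np.pi * 5 * t)  # 5 Hz sine, one second long

freqs, magnitude = spectrum(signal, fs)  # the expensive step runs once
features = {
    "centroid": spectral_centroid(freqs, magnitude),
    "peak": peak_frequency(freqs, magnitude),
}
print(features)
```

For very long signals the saving grows with the number of spectral features, since each additional feature only reuses arrays that already exist.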

Regarding ECDF Slope Feature removal

Hey guys,

Thanks for this awesome library. I saw that in this commit, ECDF slope feature extraction functionality was removed from the library. It would be great if you guys can explain what was the issue with it.

Creating/Sampling synthetic time series based on TSFEL extracted features?

Hi, I found it to be a great library for time-series feature extraction, and a lightweight one at that.
I was wondering: if one wants to create/sample synthetic series based on TSFEL features extracted from a univariate real series, what would be the way forward, and is this going to be a feature in a future release? Any suggestions/recommendations in this regard will be highly appreciated.

Many Thanks.

Progress bar during feature extraction

It would be nice if TSFEL had a progress bar during the feature extraction process, so that we can monitor the estimated time to complete the feature extraction.

<RuntimeWarning: invalid value encountered in sqrt>

Thank you very much for this great library, first of all!

The following code in the "distance" function in features.py:

    diff_sig = np.diff(signal)
    return np.sum([np.sqrt(1 + diff_sig ** 2)])

should be changed to:

    diff_sig = np.diff(signal)
    diff_sigFloat = diff_sig.astype(float)
    return np.sum([np.sqrt(1 + diff_sigFloat ** 2)])

The reason for that is that otherwise an integer overflow might occur for larger numbers, which results in negative numbers, which, in turn, results in an "invalid value" error.
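
The overflow can be reproduced with a small 32-bit integer array. This is a sketch of the mechanism only; the dtype is chosen explicitly here, whereas in practice it depends on the input data and platform.

```python
import numpy as np

signal = np.array([0, 50000], dtype=np.int32)
diff_sig = np.diff(signal)  # still int32

# 50000 ** 2 = 2_500_000_000 exceeds the int32 maximum (2_147_483_647),
# so the integer result wraps around to a negative number.
wrapped = 1 + diff_sig ** 2

with np.errstate(invalid="ignore"):
    broken = np.sum(np.sqrt(wrapped))  # sqrt of a negative value gives nan

# Converting to float before squaring avoids the wrap-around.
safe = 1 + diff_sig.astype(float) ** 2
distance = float(np.sum(np.sqrt(safe)))
print(wrapped, broken, distance)
```

The float conversion costs little and makes the result independent of the integer width of the input.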

X_train reduced to one row after tsfel.time_series_features_extractor(cfg, X_train, fs=fs)

The X_train data was reduced from 208 rows to just 1 row, which causes an error in the further execution of the code. What could be going wrong?

Here is the code:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
print(X_train)

cfg = tsfel.get_features_by_domain()
# Get features
X_train = tsfel.time_series_features_extractor(cfg, X_train, fs=fs)
X_test = tsfel.time_series_features_extractor(cfg, X_test, fs=fs)
print(X_train)
print(X_test)

corr_features = tsfel.correlated_features(X_train)
X_train.drop(corr_features, axis=1, inplace=True)
X_test.drop(corr_features, axis=1, inplace=True)

Feature extractor doesn't run

Hi everyone,

I'm looking to extract features from 3 IMUs, each containing a 3-axis accelerometer and gyroscope. I have created a dataframe to combine the data from all of them: 3 IMUs x 2 sensors (Acc, Gyr) x 3 axes (xyz) = 18 columns + Timestamp.

I started out calling tsfel.get_features_by_domain on the entire dataframe, but that never progressed from 0% Complete. Then I greatly reduced the problem:

features.json: I copied the original file from the 'feature_extraction' folder and changed 'use' from 'yes' to 'no' so that it only calculates [Min, Max, Mean, Median]
data: I selected only one column and 100 rows of float numbers

cfg = tsfel.get_features_by_domain(domain='statistical',
                                   json_path='features.json')

data = df.loc[s_times[0]:f_times[0], 'Neck.Acc.X'][:101].to_list()

df_tsfel = tsfel.time_series_features_extractor(
    # configuration file with features to be extracted
    dict_features=cfg,
    # dataframe window to calculate features window on
    signal_windows=data,
    # sampling frequency of original signal
    fs=100,
    # sliding window size
    window_size=100
)

It surely can't get any simpler than this, and still it doesn't leave 0%.

There must be something wrong with the way I set things up. Can someone help, please? I'm currently writing a paper and will need to give up using this package if I don't manage to sort this out...

Some extra info:

Using tsfel==0.1.4
Windows 10
Sample data attached: sample.zip

Histogram warning

When I am extracting features, the histogram feature constantly raises a warning. Could you check what is happening?

Detailed handbook for users or GUI?

Hi,

I am wondering if there is a gui version of tsfel.
Also, what should the input data look like? Is this able to compute multiple timeseries at the same time and extract their features and cluster them based on the features?

Question: Comparison tsfel and tsfresh

Hi tsfel developers!
Nice package! Congratulations on this.
I am one of the authors of tsfresh. I would be very much interested in understanding if there is something we can learn from each other? Is there a functionality tsfresh could provide to tsfel? Or the other way round? Would you even think it makes sense to combine efforts?
Happy to hear your opinion :-)

System Message: WARNING While opening List of features in docs.

While opening the Feature list from docs of tsfel , it's showing errors in place of code-area.

System Message: WARNING/2 (/home/docs/checkouts/readthedocs.org/user_builds/tsfel/checkouts/latest/docs/descriptions/feature_list.rst, line 6)
failed to import tsfel.feature_extraction.features
System Message: WARNING/2 (/home/docs/checkouts/readthedocs.org/user_builds/tsfel/checkouts/latest/docs/descriptions/feature_list.rst, line 6)

toctree references unknown document ‘descriptions/_generated/tsfel.feature_extraction.features’

Low sample rate

I have a series with a sampling interval of 5 minutes, so the sample rate is about 0.0033 Hz:
X_train = tsfel.time_series_features_extractor(cfg_file, X_train_sig, fs=0.0033, window_size=12)

but I get an error:

Name: value, dtype: float64
*** Feature extraction started ***
/Users/markh/Library/Python/3.6/lib/python/site-packages/scipy/signal/spectral.py:1800: RuntimeWarning: divide by zero encountered in double_scalars
  scale = 1.0 / (fs * (win*win).sum())

Must the sample rate be > 1?

Problem with the number of extracted features samples

Dear authors,
First of all, congratulations on this great project, which is very helpful for the whole community.
I have an issue related to the number of extracted feature samples.
I execute this call:
X = ts.time_series_features_extractor(cfg, tmp_data, fs = 32, window_size=32, overlap=0, verbose = 0)

on my accelerometer data frame of dimension 160 x 3 (160 samples and three columns ['X','Y','Z']).
From this call, X has a dimension of 1 x 789: it returns a single row of features for all 160 x 3 accelerometer samples.
However, this does not seem right. Since window_size = 32 (1 second of data), it should return an X with dimension 5 x 789.
How is this possible?
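
The expectation of 5 rows can be checked independently of the library by splitting the data manually. The sketch below uses a hypothetical `split_windows` helper and assumes the step between windows is the window size times (1 - overlap), truncated to an integer; it does not reproduce TSFEL's internal windowing.

```python
import numpy as np

def split_windows(data, window_size, overlap=0.0):
    # Hypothetical helper: fixed-size windows, dropping any incomplete tail.
    step = max(1, int(window_size * (1 - overlap)))
    starts = range(0, len(data) - window_size + 1, step)
    return [data[start:start + window_size] for start in starts]

data = np.zeros((160, 3))  # 160 samples, three axes
windows = split_windows(data, window_size=32, overlap=0)

n_windows = len(windows)
print(n_windows, windows[0].shape)
```

With 160 samples and non-overlapping windows of 32 samples, five complete windows are obtained, so one feature row per window would give the 5 x 789 shape expected above.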