Time Series Analysis Project

Overview

This project involves time series analysis for sales data of different clothing categories (Tshirt, Shirt, Jeans, Pant). The analysis includes data preprocessing, dataset generation, and forecasting using various methods like Exponential Smoothing, Moving Average, and Simple Average.

Getting Started

Prerequisites

  • Python (version 3.x)
  • Pandas
  • Statsmodels
  • Matplotlib
  • Scikit-learn

Theory

Let’s assume we are dealing with the additive model, that is, a series made up of a linear trend plus a seasonal cycle whose frequency (width) and amplitude (height) stay constant over time. For the multiplicative model, you just need to replace the additions with multiplications and the subtractions with divisions.

Trend component

Trend is calculated using a centered moving average of the time series. The moving average is calculated using a window length corresponding to the frequency of the time series. For example, we would use a window of length 12 for monthly data.

Smoothing the series with such a moving average has some disadvantages. First, we are “losing” the first and last few observations of the series (half a window at each end). Second, the MA tends to over-smooth the series, which makes it less reactive to sudden changes in the trend or jumps.
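The centered moving average described above can be sketched as follows. This is a minimal illustration, not the code from the notebooks: the function name centered_moving_average is hypothetical, and for an even window (such as 12) it assumes the common "2 x period" weighting, where the two end points get half weight so that the window is truly centered.

```python
import numpy as np
import pandas as pd


def centered_moving_average(series: pd.Series, period: int) -> pd.Series:
    """Estimate the trend with a centered moving average (hypothetical helper)."""
    if period % 2 == 0:
        # Even period: use period + 1 points, with half weight on both ends.
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period

    values = series.to_numpy(dtype=float)
    if len(values) < len(weights):
        raise ValueError("series is shorter than the averaging window")

    half = len(weights) // 2
    trend = np.full(len(values), np.nan)  # the edges stay NaN ("lost" observations)
    trend[half:len(values) - half] = np.convolve(values, weights, mode="valid")
    return pd.Series(trend, index=series.index, name="trend")


# Synthetic monthly example: linear trend plus a 12-month sine cycle.
t = np.arange(36)
demo = pd.Series(
    100 + 2 * t + 10 * np.sin(2 * np.pi * t / 12),
    index=pd.date_range("2020-01-01", periods=36, freq="MS"),
)
demo_trend = centered_moving_average(demo, period=12)
```

With a window of 12, the first six and last six values of demo_trend are NaN, which is the loss of observations mentioned above.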

Seasonal component

To calculate the seasonal component, we first need to detrend the time series. We do it by subtracting the trend component from the original time series (remember, we divide for the multiplicative variant).

Having done that, we calculate the average values of the detrended series for each seasonal period. In the case of months, we would calculate the average detrended value for each month.

The seasonal component is simply built from the seasonal averages repeated for the length of the entire series. This is one of the arguments against using the simple seasonal decomposition: the seasonal component is not allowed to change over time, which can be a very strict and often unrealistic assumption for longer time series.

On a side note, in the additive decomposition the detrended series is centered at zero, as adding zero makes no change to the trend. The same logic is applied in the multiplicative approach, with the difference that it is centered around one. That is because multiplying the trend by one also has no effect on it.
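The seasonal steps above (detrend, average per season, repeat) could look roughly like this for the additive case. It is a sketch under assumptions: the function name seasonal_component is hypothetical, the trend is taken as already estimated, the season of each observation is derived from its position in the series, and the seasonal averages are shifted so that they average to zero.

```python
import numpy as np
import pandas as pd


def seasonal_component(series: pd.Series, trend: pd.Series, period: int) -> pd.Series:
    """Additive seasonal component from per-season averages (hypothetical helper)."""
    detrended = series - trend  # divide instead for the multiplicative variant
    positions = np.arange(len(series)) % period  # 0..period-1, e.g. month slot
    # Average the detrended values of each season; NaN edges are skipped.
    seasonal_means = detrended.groupby(positions).mean()
    # Center the seasonal averages around zero (additive convention).
    seasonal_means = seasonal_means - seasonal_means.mean()
    # Repeat the averages over the full length of the series.
    repeated = seasonal_means.to_numpy()[positions]
    return pd.Series(repeated, index=series.index, name="seasonal")


# Synthetic monthly example with a known trend and a known 12-month pattern.
t = np.arange(36)
pattern = 10 * np.sin(2 * np.pi * t / 12)
index = pd.date_range("2020-01-01", periods=36, freq="MS")
demo = pd.Series(100 + 2 * t + pattern, index=index)
known_trend = pd.Series(100.0 + 2 * t, index=index)
known_trend.iloc[:6] = np.nan   # mimic the edges lost by a centered moving average
known_trend.iloc[-6:] = np.nan
demo_seasonal = seasonal_component(demo, known_trend, period=12)
```

Because the same twelve averages are tiled across the series, the result cannot capture a seasonal pattern that changes over time.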

Residuals

The last component is simply what is left after removing (by subtracting or dividing) the trend and seasonal components from the original time series.
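As a small illustration of this last step, the helper below removes the trend and seasonal components for either model. The function name residuals and its model argument are hypothetical and only serve the example.

```python
import numpy as np
import pandas as pd


def residuals(series: pd.Series, trend: pd.Series, seasonal: pd.Series,
              model: str = "additive") -> pd.Series:
    """Return what is left after removing trend and seasonality (hypothetical helper)."""
    if model == "additive":
        resid = series - trend - seasonal
    elif model == "multiplicative":
        resid = series / (trend * seasonal)
    else:
        raise ValueError("model must be 'additive' or 'multiplicative'")
    return resid.rename("resid")


# Tiny made-up numbers, only to show both variants.
additive_resid = residuals(
    pd.Series([10.0, 12.0, 14.0]),
    pd.Series([9.0, 10.0, 11.0]),
    pd.Series([0.5, 1.0, 2.0]),
)
multiplicative_resid = residuals(
    pd.Series([12.0, 20.0]),
    pd.Series([10.0, 10.0]),
    pd.Series([1.2, 1.0]),
    model="multiplicative",
)
```

Where the trend is NaN (at the edges of a centered moving average), the residual is NaN as well.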

That is all for the theory. Let’s code!

Installation

To install the required dependencies, run:

pip install pandas statsmodels matplotlib scikit-learn

Project Structure

1. dataset_generator.py: Python script to generate a synthetic time series dataset with multiple categories, trends, and seasonality.

2. tsa.ipynb: Jupyter notebook for forecasting analysis using Exponential Smoothing, Moving Average, and Simple Average.

3. time_series_dataset.csv: Generated synthetic time series dataset.

4. manual.ipynb: Jupyter notebook for manual analysis of the time series components (trend, seasonality, and residual/noise) using the additive approach.

Usage

  1. Generate Dataset: To generate a synthetic time series dataset, run:
python dataset_generator.py

This script generates a synthetic time series dataset and saves it as time_series_dataset.csv.

The repository already includes the generated dataset, so running this script is optional.
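For readers who want a feel for what such a generator involves, here is a rough sketch of building a synthetic dataset with a trend, a yearly cycle, and noise for each category. It is not the contents of dataset_generator.py: the function name, the "Date" index name, and all base levels, slopes, amplitudes, and noise levels are made-up assumptions.

```python
import numpy as np
import pandas as pd


def make_sales_dataset(periods: int = 48, seed: int = 0) -> pd.DataFrame:
    """Build monthly synthetic sales per category (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range("2020-01-01", periods=periods, freq="MS")
    t = np.arange(periods)

    # Hypothetical (base level, monthly slope, seasonal amplitude) per category.
    params = {
        "Tshirt": (200.0, 2.0, 30.0),
        "Shirt": (150.0, 1.5, 20.0),
        "Jeans": (180.0, 1.0, 25.0),
        "Pant": (120.0, 0.8, 15.0),
    }

    columns = {}
    for name, (base, slope, amplitude) in params.items():
        trend = base + slope * t                          # linear trend
        seasonal = amplitude * np.sin(2 * np.pi * t / 12)  # 12-month cycle
        noise = rng.normal(0.0, 5.0, size=periods)         # random noise
        columns[name] = trend + seasonal + noise

    return pd.DataFrame(columns, index=pd.Index(dates, name="Date"))


sales = make_sales_dataset()
# sales.to_csv("my_synthetic_sales.csv")  # optional; file name is only an example
```

Fixing the seed keeps the generated numbers reproducible between runs.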

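  2. Forecasting Analysis: To run the manual decomposition and the forecasting analysis (Exponential Smoothing, Moving Average, and Simple Average), open the notebooks in Jupyter and run all cells. A notebook cannot be executed with the python command; assuming Jupyter is installed, it can be run from the command line with:
jupyter nbconvert --to notebook --execute manual.ipynb
jupyter nbconvert --to notebook --execute tsa.ipynb

The notebooks display results with plots for each category, along with the manual analysis of the pant sales.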

Results

The manual.ipynb notebook outputs the trend, seasonality, and noise for the pant category, computed manually without using the seasonal_decompose function from statsmodels. The forecasting analysis results, including plots and RMSE values, can be found in the output of the tsa.ipynb notebook.
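To make the three forecasting methods and the RMSE metric concrete, here is a minimal sketch in plain NumPy. It is not the code in tsa.ipynb: the function names, the window size, and the smoothing factor alpha are illustrative assumptions, and the exponential smoothing shown is the simple (level-only) variant, with no trend or seasonal terms.

```python
import numpy as np


def simple_average_forecast(train, horizon):
    """Forecast every future step as the mean of all past values."""
    return np.full(horizon, np.mean(train))


def moving_average_forecast(train, horizon, window=3):
    """Forecast every future step as the mean of the last `window` values."""
    return np.full(horizon, np.mean(train[-window:]))


def simple_exp_smoothing_forecast(train, horizon, alpha=0.3):
    """Forecast with simple exponential smoothing (level only)."""
    level = train[0]
    for value in train[1:]:
        # Recent observations get weight alpha; older ones decay geometrically.
        level = alpha * value + (1 - alpha) * level
    return np.full(horizon, level)


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    errors = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(errors ** 2)))


# Made-up numbers: fit on the first four points, evaluate on the last two.
train, test = [10.0, 12.0, 14.0, 16.0], [18.0, 20.0]
scores = {
    "simple average": rmse(test, simple_average_forecast(train, len(test))),
    "moving average": rmse(test, moving_average_forecast(train, len(test), window=2)),
    "exp smoothing": rmse(test, simple_exp_smoothing_forecast(train, len(test), alpha=0.5)),
}
```

A lower RMSE on the held-out points indicates a better fit, so the scores dictionary gives a direct way to compare the methods on the same split.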

Explanation

The model is also explained in my Medium article, which you can visit for an in-depth explanation. It is a three-part exploration:

  1. Part-1 Uncover the theory behind Time Series analysis.

  2. Part-2 Implement practical solutions using Python to dissect datasets and identify trends.

  3. Part-3 Evaluate forecasting models' effectiveness - Simple Average, Exponential Smoothing, and Moving Average.

Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your changes.

License

This project is licensed under the MIT License.

Contributors

sampurn44
