Sales_Forecast-Time-Series-Forecast

Overview

Forecasting is one of the most common applications of machine learning in the real world, and sales forecasting is an important problem in the e-commerce industry. Forecasting item sales helps businesses make better-informed decisions. Here we work with the sales data of a grocery retailer in Ecuador. The model aims to accurately predict the unit sales of thousands of items sold across all of its stores. Accuracy matters because if the model predicts slightly too high, the stores are overstocked with perishable goods, and if it predicts slightly too low, items sell out quickly and customers are left disappointed.

Dataset

This dataset is taken from the Kaggle competition hosted by Corporación Favorita. The training data includes dates, stores, product information, whether the item was being promoted, and the sales numbers. Supporting files provide data on holiday events across the year and on oil prices for the country over several years. This information helps build a robust forecasting model that accounts for the trend and seasonality in four years of historical data. This is a multivariate time series problem.

Trend derived from the Sales data

  • Trend captured by taking a moving average over a 365-day window:

image
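A minimal sketch of how such a trend line can be computed with pandas. The series below is synthetic, and both the `sales` name and the choice of a centered window are assumptions made for illustration, not details taken from the project.

```python
import numpy as np
import pandas as pd

# Synthetic daily sales with a slow upward drift (illustrative data only).
dates = pd.date_range("2013-01-01", periods=4 * 365, freq="D")
rng = np.random.default_rng(0)
sales = pd.Series(
    100 + 0.05 * np.arange(len(dates)) + rng.normal(0, 5, len(dates)),
    index=dates,
    name="sales",
)

# 365-day moving average. With center=True each value averages the
# surrounding year, so the first and last 182 days have no full window
# and are left as NaN.
trend = sales.rolling(window=365, center=True).mean()
```

Averaging over a full year smooths out weekly and yearly seasonality, so what remains is the slow-moving level of the series.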

Seasonality captured from the Sales data

  • Impact of the different kinds of seasonality derived from the dataset:

image

Autocorrelation Plot for the Sales data

image
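For readers unfamiliar with the quantity being plotted, here is a small sketch of the sample autocorrelation at a single lag, written with NumPy. The `autocorr` helper and the toy weekly pattern are hypothetical and only meant to show what a spike at lag 7 means.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    # Covariance between the series and its lagged copy,
    # normalised by the overall variance of the series.
    return float(np.dot(centered[:-lag], centered[lag:]) / np.dot(centered, centered))

# Toy series that repeats the same 7-day pattern 20 times.
weekly = np.tile([10, 12, 11, 13, 15, 30, 28], 20)

lag7 = autocorr(weekly, 7)  # strong positive: same weekday, one week earlier
lag3 = autocorr(weekly, 3)  # negative: weekend days line up with weekdays
```

A high value at lag 7 is the kind of signal that motivates adding a 7-day lag feature.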

Partial Autocorrelation for the Sales data

image

Design

Based on the plots above, we extracted calendar features such as week, month, quarter, and year to capture the time series trend, along with other important features. Lag features were added to capture autocorrelation (also called serial dependence), which is the correlation between the current value of a time series and its past values. Moving averages over different windows were also used as features to capture the trend in the data. All of these features were combined with those already present in the dataset to train the model.
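The feature engineering described above could look roughly like the following pandas sketch. The column names (`date`, `unit_sales`), the lag values, and the window sizes are assumptions chosen for illustration. On the real data the lags would also need to be computed per store and item (for example with `groupby`), which is omitted here.

```python
import pandas as pd

# Tiny illustrative frame: 60 consecutive days for a single store/item.
df = pd.DataFrame({
    "date": pd.date_range("2017-01-01", periods=60, freq="D"),
    "unit_sales": list(range(60)),
})

# Calendar features extracted from the date.
df["week"] = df["date"].dt.isocalendar().week.astype(int)
df["month"] = df["date"].dt.month
df["quarter"] = df["date"].dt.quarter
df["year"] = df["date"].dt.year

# Lag features: the value observed `lag` days earlier.
for lag in (1, 7, 14):
    df[f"lag_{lag}"] = df["unit_sales"].shift(lag)

# Moving averages over past days only; shift(1) keeps the current
# day's sales out of its own feature.
for window in (7, 30):
    df[f"ma_{window}"] = df["unit_sales"].shift(1).rolling(window).mean()
```

The first rows of each lag and moving-average column are NaN because no history is available yet; they are usually dropped or imputed before training.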

Three models were trained and compared against one another:

  1. ARIMA

The model was trained on moving averages of sales computed over different window sizes: 7 (weekly), 30 (monthly), and 365 (annual) days.

  2. LGBM

To train the LGBM model, the data was split into batches, and the batches were determined by the type of training window passed to the model. Two window types were defined: a sliding window and an expanding window. An example of each is given below:

image
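The difference between the two window types can be sketched with a small index generator. This is an illustrative stand-in, not the project's batching code; the function name `window_splits` and its parameters are hypothetical.

```python
def window_splits(n_samples, train_size, test_size, mode="sliding"):
    """Yield (train_indices, test_indices) pairs in chronological order.

    sliding:   the training window keeps a fixed length and moves forward.
    expanding: the training window always starts at 0 and grows.
    """
    if mode not in ("sliding", "expanding"):
        raise ValueError("mode must be 'sliding' or 'expanding'")
    start, end = 0, train_size
    while end + test_size <= n_samples:
        train_start = start if mode == "sliding" else 0
        yield range(train_start, end), range(end, end + test_size)
        # Advance both window edges by one test block.
        start += test_size
        end += test_size

sliding = [(list(tr), list(te)) for tr, te in window_splits(10, 4, 2, "sliding")]
expanding = [(list(tr), list(te)) for tr, te in window_splits(10, 4, 2, "expanding")]
```

In both modes the test block always lies strictly after the training block, so no future data leaks into training.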

  3. LSTM

The TimeseriesGenerator class was used to create the training and test sequences. The network is a 3-layer LSTM model, with a Dense layer as the last layer, that predicts the sales for any given day. It was trained with the Keras model's fit_generator method, which consumes the batches produced by TimeseriesGenerator. The evaluation metric was MSE (mean squared error).
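To make the sequence construction concrete, here is a plain NumPy sketch of the (input window, next value) pairs that a generator of this kind produces when the same series is used as both data and targets. It is a stand-in for illustration, not the Keras class itself; `make_sequences` and the window length of 3 are assumptions.

```python
import numpy as np

def make_sequences(values, length):
    """Turn a 1-D series into (X, y) pairs for a sequence model.

    X[i] holds `length` consecutive values and y[i] is the value that
    immediately follows them.
    """
    values = np.asarray(values, dtype=float)
    windows = [values[i:i + length] for i in range(len(values) - length)]
    X = np.stack(windows)[..., np.newaxis]  # shape: (samples, length, 1)
    y = values[length:]                     # shape: (samples,)
    return X, y

series = np.arange(10)
X, y = make_sequences(series, length=3)
```

The trailing axis of size 1 is the feature dimension that an LSTM layer expects, in the form (samples, timesteps, features).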

Performance

The performance of the different models tried is given below:

  1. ARIMA

RMSE obtained from each window size is shown below -

  2. LGBM

RMSE scores for both sliding and expanding type of window training -
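For reference, RMSE is the square root of the mean squared error between actual and predicted values. A minimal sketch of the metric in plain Python follows; the function name `rmse` and the sample numbers are illustrative and are not results from this project.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up values, for demonstration only.
score = rmse([3, 5, 7], [2, 5, 9])
```

Because the errors are squared before averaging, large misses weigh more heavily than small ones, and the result is expressed in the same units as the sales figures.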
