abhikgupt / bike-rental-prediction-based-on-weather-and-season

Building a model to predict demand of shared bikes. It will be used by the management to understand how exactly the demands vary with different features. They can accordingly manipulate the business strategy to meet the demand levels.

Jupyter Notebook 100.00%
exploratory-data-analysis linear-regression modelevaluation regression-models rfe

Bike-Rental-Prediction-based-on-weather-and-season

Problem Statement:

A bike-sharing system is a service in which bikes are made available to individuals for shared use on a short-term basis, for a price or for free. Many bike-share systems let people borrow a bike from a computer-controlled "dock": the user enters payment information and the system unlocks a bike. The bike can then be returned to another dock belonging to the same system.

A US bike-sharing provider, BoomBikes, has recently suffered considerable dips in revenue due to the ongoing Corona pandemic and is finding it very difficult to sustain itself in the current market. It has therefore decided to prepare a business plan that lets it accelerate revenue as soon as the ongoing lockdown ends and the economy returns to a healthy state. To that end, BoomBikes wants to understand the demand for shared bikes across the nation once the Covid-19 quarantine ends.

They have planned this so that they can cater to people's needs once the situation improves, stand out from other service providers, and make huge profits. They have contracted a consulting company to understand the factors on which the demand for shared bikes depends, specifically in the American market. The company wants to know:

  • Which variables are significant in predicting the demand for shared bikes.
  • How well those variables describe the bike demand.

Based on various meteorological surveys and people's styles, the service provider has gathered a large dataset on daily bike demand across the American market.

Business Goal

You are required to model the demand for shared bikes using the available independent variables. Management will use the model to understand how demand varies with the different features, so that they can adjust the business strategy to meet demand levels and customers' expectations. The model will also help management understand the demand dynamics of a new market.

Dataset characteristics

day.csv has the following fields:

  • instant: record index
  • dteday : date
  • season : season (1:spring, 2:summer, 3:fall, 4:winter)
  • yr : year (0: 2018, 1: 2019)
  • mnth : month (1 to 12)
  • holiday : whether the day is a holiday or not (extracted from http://dchr.dc.gov/page/holiday-schedule)
  • weekday : day of the week
  • workingday : 1 if the day is neither a weekend nor a holiday, otherwise 0
  • weathersit :
    • 1: Clear, Few clouds, Partly cloudy
    • 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
    • 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
    • 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
  • temp : temperature in Celsius
  • atemp: feeling temperature in Celsius
  • hum: humidity
  • windspeed: wind speed
  • casual: count of casual users
  • registered: count of registered users
  • cnt: count of total rental bikes including both casual and registered
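
As a rough illustration of the field list above, the sketch below decodes the integer codes for season, yr and weathersit into labels and checks that cnt equals casual + registered. It uses only the standard library. The sample rows are made up for illustration (they are not taken from day.csv), only a subset of columns is shown, the date format is assumed, and the short weather labels and the helper names `decode_row`, `load_rows` and `predictors` are hypothetical.

```python
import csv
import io

# Made-up rows in a day.csv-like layout (subset of columns), for illustration only.
SAMPLE = """instant,dteday,season,yr,mnth,weathersit,casual,registered,cnt
1,2018-01-01,1,0,1,2,100,400,500
2,2018-07-15,3,0,7,1,900,3100,4000
3,2019-12-20,4,1,12,3,50,1450,1500
"""

# Code-to-label maps taken from the field descriptions.
SEASON = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}
YEAR = {0: 2018, 1: 2019}
# Short labels are an assumption; the README lists longer descriptions.
WEATHER = {1: "clear", 2: "mist", 3: "light_precip", 4: "heavy_precip"}


def decode_row(row):
    """Return a copy of one CSV row with coded fields turned into labels."""
    out = dict(row)
    out["season"] = SEASON[int(row["season"])]
    out["yr"] = YEAR[int(row["yr"])]
    out["mnth"] = int(row["mnth"])
    out["weathersit"] = WEATHER[int(row["weathersit"])]
    for key in ("casual", "registered", "cnt"):
        out[key] = int(row[key])
    return out


def load_rows(text):
    """Parse CSV text and decode every row."""
    return [decode_row(r) for r in csv.DictReader(io.StringIO(text))]


def predictors(row):
    """Drop the index, the raw date, the target, and the two target components."""
    dropped = {"instant", "dteday", "casual", "registered", "cnt"}
    return {k: v for k, v in row.items() if k not in dropped}


rows = load_rows(SAMPLE)
for r in rows:
    # cnt is defined as the sum of casual and registered users.
    assert r["cnt"] == r["casual"] + r["registered"]
```

Because cnt is the sum of casual and registered, keeping either of them as a predictor would leak the target into the model, which is why `predictors` leaves them out in this sketch.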

Approach Taken

  • Model interpretability is the key requirement, and EDA showed a roughly linear relationship between the independent variables and the target, so Linear Regression (LR) was used.
  • Performed coarse feature selection with RFE (recursive feature elimination), followed by manual tuning: built successive LR models, checked that the coefficient p-values were below 0.05, and removed multicollinearity by checking VIF (variance inflation factor) at each step. The 7th and final LR model had 8 features.
  • The Durbin-Watson statistic and residual analysis indicated that the LR assumptions were met. Adjusted R-squared: 0.82. The R2 score is nearly the same on the train and test datasets, so no overfitting was identified.
  • Identified the top features, their beta coefficients, and the fitted equation to interpret the final Linear Regression model.
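
The diagnostics named in the steps above (VIF, adjusted R-squared, Durbin-Watson) can be sketched with plain NumPy. This is a minimal illustration of the formulas and not the notebook's code. The data is synthetic, and the variable names `temp`, `atemp` and `hum` are borrowed from the field list only to make a near-duplicate feature pair easy to recognize. All helper names are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, residuals)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, y - A @ beta


def r_squared(X, y):
    """R^2 = 1 - SS_res / SS_tot for an OLS fit of y on X."""
    _, resid = fit_ols(X, y)
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(X, y):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n, p = X.shape
    return 1.0 - (1.0 - r_squared(X, y)) * (n - 1) / (n - p - 1)


def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 regresses column j on the others."""
    values = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        values.append(1.0 / (1.0 - r_squared(others, X[:, j])))
    return np.array(values)


def durbin_watson(resid):
    """Sum of squared successive differences over sum of squared residuals."""
    diff = np.diff(resid)
    return float((diff @ diff) / (resid @ resid))


# Synthetic data (an assumption for illustration, not the bike dataset).
rng = np.random.default_rng(0)
n = 200
temp = rng.normal(size=n)
atemp = temp + 0.05 * rng.normal(size=n)  # nearly a copy of temp
hum = rng.normal(size=n)
y = 3.0 * temp - 1.0 * hum + rng.normal(scale=0.5, size=n)

X_all = np.column_stack([temp, atemp, hum])
vif_all = vif(X_all)              # the temp and atemp entries are inflated

X_reduced = np.column_stack([temp, hum])  # drop one of the collinear pair
vif_reduced = vif(X_reduced)

_, resid = fit_ols(X_reduced, y)
r2 = r_squared(X_reduced, y)
adj_r2 = adjusted_r_squared(X_reduced, y)
dw = durbin_watson(resid)         # close to 2 when residuals are uncorrelated
```

Dropping one member of a highly collinear pair and re-checking VIF mirrors the manual tuning loop described above. A Durbin-Watson value near 2 is usually read as little evidence of autocorrelation in the residuals.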
