In this project, we download a dataset on the fuel consumption and carbon dioxide emissions of cars. We then split the data into training and test sets, create a model using the training set, evaluate the model using the test set, and finally use the model to predict unknown values.
Data : FuelConsumption.csv
We have downloaded a fuel consumption dataset, FuelConsumption.csv, which contains model-specific fuel consumption ratings and estimated carbon dioxide emissions for new light-duty vehicles for retail sale in Canada.
Data Source : [Data Source](http://open.canada.ca/data/en/dataset/98f1a129-f628-4ce4-b24d-6f16bf24dd64)
Environment Set up :
- Importing Needed packages
- Understanding the Data
- Data Exploration
- Visualize Data
- Creating train and test dataset
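
The train/test split step can be sketched roughly as follows. This is a minimal illustration, not necessarily the exact code used in the project: the helper `split_train_test`, the 80/20 ratio, the column names `ENGINESIZE` and `CO2EMISSIONS`, and the synthetic numbers are all assumptions made so the example runs without the CSV file.

```python
import numpy as np
import pandas as pd


def split_train_test(df: pd.DataFrame, train_frac: float = 0.8, seed: int = 0):
    """Randomly split the rows of df into disjoint train and test frames.

    Hypothetical helper: the name, ratio, and seed are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random(len(df)) < train_frac  # True -> row goes to the training set
    return df[mask], df[~mask]


# Synthetic stand-in for the real file. With the actual data one would
# load it instead, e.g. df = pd.read_csv("FuelConsumption.csv").
rng = np.random.default_rng(42)
engine_size = rng.uniform(1.0, 6.0, size=200)
df = pd.DataFrame(
    {
        "ENGINESIZE": engine_size,  # assumed column name
        "CO2EMISSIONS": 125.0 + 39.0 * engine_size + rng.normal(0.0, 10.0, size=200),
    }
)

train, test = split_train_test(df)
print(len(train), len(test))
```

Splitting with a random mask keeps the two sets disjoint, so the test rows are never seen during training.
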
### About linear regression
Linear regression fits a linear model with coefficients B = (B1, ..., Bn) to minimize the residual sum of squares between the observed values of the dependent variable y in the dataset and the values predicted from the independent variables x by the linear approximation.
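
Written out, the quantity being minimized looks like the following. The intercept term B_0 and the symbol m for the number of observations are assumptions added here for illustration; they do not appear in the text above.

$$
\hat{y}_i = B_0 + B_1 x_{i1} + \dots + B_n x_{in},
\qquad
\mathrm{RSS}(B) = \sum_{i=1}^{m} \left( y_i - \hat{y}_i \right)^2
$$

Here y_i is the observed value for the i-th vehicle and \hat{y}_i is the value predicted by the linear approximation, so each term of the sum is one squared residual.
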
Three types of model are covered:
- Simple Regression Model
- Non-linear regression
- Multiple Regression Model
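
As a rough sketch, the three model types differ mainly in the feature matrix passed to the regressor. The example below uses synthetic data with hypothetical features (engine size and cylinder count), and it assumes polynomial features as the non-linear variant; the project may have used a different non-linear form.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data; the features and coefficients are made up for illustration.
rng = np.random.default_rng(0)
n = 300
engine = rng.uniform(1.0, 6.0, n)                 # hypothetical feature 1
cylinders = rng.integers(3, 13, n).astype(float)  # hypothetical feature 2
co2 = (
    100.0 + 50.0 * engine - 3.0 * engine**2 + 5.0 * cylinders
    + rng.normal(0.0, 5.0, n)
)

# Simple regression: a single independent variable.
X_simple = engine.reshape(-1, 1)
simple = LinearRegression().fit(X_simple, co2)

# Multiple regression: several independent variables.
X_multi = np.column_stack([engine, cylinders])
multi = LinearRegression().fit(X_multi, co2)

# Non-linear (here: degree-2 polynomial) regression. The model is still
# linear in its coefficients; only the features are transformed.
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X_simple)
poly = LinearRegression().fit(X_poly, co2)

print(simple.coef_.shape, multi.coef_.shape, poly.coef_.shape)
```

Because the simple model's feature is contained in both of the other feature matrices, its fit on the training data can never be better than theirs.
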
For each model we performed the following tasks :
- Train Data Distribution
- Modeling using sklearn package
- Plot Output
- Evaluation : Evaluate model with test data
- Plot Output
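
The modeling and evaluation steps could look roughly like this with sklearn. It is a sketch on synthetic data rather than the project's actual code: the true slope and intercept, the noise level, the 80/20 split, and the choice of metrics (MAE, MSE, R²) are assumptions. Plotting is left out to keep the example short.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic single-feature data standing in for the real dataset.
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 6.0, size=400)
y = 125.0 + 39.0 * x + rng.normal(0.0, 10.0, size=400)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    x.reshape(-1, 1), y, test_size=0.2, random_state=0
)

# Train on the training set only.
model = LinearRegression().fit(X_train, y_train)

# Evaluate on the held-out test set.
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("coefficient:", model.coef_[0], "intercept:", model.intercept_)
print("MAE:", mae, "MSE:", mse, "R2:", r2)
```

Reporting metrics on the test set rather than the training set gives a more honest estimate of how the model behaves on unseen vehicles.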