Capstone Project - Olist Customer Recommendation

Data Set

Final Reports

Summary

Using a dataset of over 100k real world orders provided by Olist, the largest department store in Brazilian market places, the code provided in Jupyter Notebooks constructs a product recommendation engine.

This engine can take a user's unique id and a minimum similarity score and return product ids the user has not purchased yet, ranked by similarity to the user. Three groupings were used to categorize the data:

Customer id vs Product Category Group
Seller id vs Product Category Group
Customer id vs Product Id

And the code generates a Category Group csv file (containing Group Ids percentage of purchases across categories, and number of users in the Group) and a cosine similarity csv file (containing Group Ids and the similarity score between each group)

Using these files, product recommendations can be generated using the following methods:

Using a customer's id and a Group id, return products of its group and similar groups using similarity table
Using customer's id and list of orders, determine similarity between the user and existing groups in Category table. Return product ids of groups similar to this user
Using customer's id and list of orders, use Agglomerative clustering to cluster the categories and the user. Return product ids of the user's cluster

Each method can also consider RFM scores of Buyers or Sellers when considering the quality of the recommendation. Code has also been provided for updating the Category Group and Similarity Score tables based on new orders.

Notebooks

1 - Data Wrangling.ipynb

Investigation into several Data Sets, including Olist
Data grouping of several Olist tables into a single orders DataFrame

2 - EDA.ipynb

RFM Score Generation
Tableau Workbook Links
- RFM Score Investigation
- Mapping of Olist Orders

3 - Similarity Score Generation.ipynb

Generation of Category Group and Similarity Score Tables
Investigation into Similiarity Score's performance based on Silhouette Scores
- Euclidean Distance
- Modified Euclidean Distance
- Cosine Similarity

4 - Product Recommendation Engine.ipynb

Methods to generate ranked recommendations
Method to update Category and Similarity tables based on new orders

ai-noid / springboard-capstoneproject_olistcustomerrecommendation Goto Github PK

springboard-capstoneproject_olistcustomerrecommendation's Introduction

Capstone Project - Olist Customer Recommendation

Data Set

Final Reports

Summary

Notebooks

springboard-capstoneproject_olistcustomerrecommendation's People

Contributors

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent