rajkulk16 / cltv-compass

The main objective of this project is to forecast the Customer Lifetime Value (CLTV) using user and policy data.

cltv-prediciton deep-neural-networks elasticnet-regression lightgbm linear-regression random-forest statistical-analysis

cltv-compass's Introduction

Optimizing Customer Acquisition and Retention with CLTV Prediction

  • The main goal of this project is to forecast the Customer Lifetime Value (CLTV) using user and policy data.

  • CLTV: a metric that indicates how much revenue a company can anticipate from a customer over their entire relationship.

    • It aids businesses in understanding the long-term worth of their customers and guides decisions regarding customer acquisition and retention investments.
    • Essentially, CLTV assists businesses in determining the value of each customer over time.

Dataset Overview

  1. Description, Features, and Attributes:
    • This dataset contains information on customers and policies, such as the customers' highest qualification, annual income, employment status, chosen policy, policy type, and the target variable indicating the total CLTV.
    • Some key features include:
      • id: Unique customer identifier
      • gender: Customer's gender
      • area: Customer's location
      • qualification: Customer's highest qualification
      • claim_amount: Total amount claimed by the customer (in rupees)
      • vintage: Number of years since the first policy date
      • num_policies: Total number of policies issued to the customer

Code Related

  • Utilized a conda environment with Python version 3.9 (GPU and CUDA enabled).
  1. For local setup:

    • Clone the repository: <repo_name>
    • Activate the environment: conda activate venv
  2. Libraries that might require re-installation:

    • tensorflow
    • tensorflow-directml-plugin (GPU-based)
    • fast_ml
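
A quick way to see which of the libraries above actually need re-installing is to test whether they can be found for import in the active environment. The following is a minimal sketch using only the standard library. The helper name `missing_libraries` is hypothetical, and the import names in `LIBRARIES` are assumptions (the DirectML plugin is left out because its import name is not stated here).

```python
import importlib.util

# Import names for the libraries listed above. These are assumptions made
# for illustration; pip package names can differ from import names.
LIBRARIES = ["tensorflow", "fast_ml"]


def missing_libraries(names):
    """Return the top-level module names in `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_libraries(LIBRARIES)
    if missing:
        print("Re-installation may be needed for:", ", ".join(missing))
    else:
        print("All listed libraries can be located.")
```

Running this inside the activated conda environment reports only what the interpreter can locate; it does not verify GPU support.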

Project Flow

  1. Data Analysis, Preprocessing, and Engineering:

    • Handled duplicates, missing values, null values, etc., to ensure minimal impact on business analysis.
    • Conducted thorough analysis, including exploring gender ratios to diversify product offerings, investigating factors contributing to higher average claims in urban areas, etc.
    • Used a statistical test (one-way ANOVA) to compare mean CLTV across the levels of each categorical variable and understand how CLTV varies.
    • Implemented logical data cleaning and transformation techniques.
  2. Models Employed for Prediction:

    • Linear Regression (complemented with Elastic Net using an equal L1/L2 regularization ratio)
    • Random Forest Regressor (complemented with LightGBM)
      • vanilla
      • hyperparameter tuned
    • Deep Neural Network
      • vanilla
      • hyperparameter tuned
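
The one-way ANOVA comparison mentioned in step 1 can be illustrated with a small, dependency-free sketch. It computes only the F statistic (between-group variance over within-group variance). The function name `one_way_anova_f` and the sample values are hypothetical and are not taken from the project's data or code. In practice a library routine such as `scipy.stats.f_oneway` would likely be used, since it also returns the p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    group_means = [mean(g) for g in groups]
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


if __name__ == "__main__":
    # Made-up CLTV values grouped by a categorical variable (illustration only)
    cltv_by_category = {
        "group_a": [1, 2, 3],
        "group_b": [2, 3, 4],
        "group_c": [5, 6, 7],
    }
    print(one_way_anova_f(list(cltv_by_category.values())))
```

A large F value suggests that mean CLTV differs between the categories by more than the spread within each category would explain.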

Observations

  1. None of the models performed exceptionally well; the Random Forest Regressor slightly outperformed the others.
  2. R² and adjusted R² scores were prioritized as the evaluation metrics for gauging the models' usefulness to the business module.
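
For reference, the two metrics can be computed as follows. This is a minimal sketch of the standard definitions, not the project's own evaluation code; the function names and the sample numbers are hypothetical. Adjusted R² penalizes R² for the number of features, so it only rises when an added feature improves the fit more than chance would.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Adjust R² for the number of features used by the model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


if __name__ == "__main__":
    # Made-up targets and predictions (illustration only)
    y_true = [1, 2, 3, 4, 5]
    y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]
    r2 = r_squared(y_true, y_pred)
    print(r2, adjusted_r_squared(r2, n_samples=len(y_true), n_features=2))
```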

Results

  • EDA and model development show that the "PLATINUM" policy is favored by a majority of customers in both rural and urban areas, so retention efforts can focus there.

  • Although the vanilla Random Forest Regressor performed poorly, the tuned model improved on it significantly.

  • The low metrics are most likely attributable to data quality. Even with thorough preprocessing and feature engineering, poor-quality data is expected to yield unfavorable results.

    • A deeper look into the data source showed that all attempts at modeling this dataset produced similar metrics, underscoring the data's quality issues.
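
The vanilla-versus-tuned Random Forest comparison noted above could be reproduced along the following lines. This is a sketch under stated assumptions: it uses scikit-learn (the library the project used is not stated here), synthetic data from `make_regression` in place of the real dataset, and a small hypothetical parameter grid. The function name `compare_vanilla_and_tuned` is also hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split


def compare_vanilla_and_tuned(random_state=0):
    """Fit a default and a grid-searched Random Forest; return test R² for both."""
    # Synthetic stand-in data; the project's real features and target differ.
    X, y = make_regression(
        n_samples=300, n_features=6, noise=20.0, random_state=random_state
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state
    )

    # "Vanilla" model: library defaults apart from the fixed seed
    vanilla = RandomForestRegressor(random_state=random_state)
    vanilla.fit(X_train, y_train)

    # Tuned model: cross-validated search over a small, assumed grid
    param_grid = {"max_depth": [4, None], "min_samples_leaf": [1, 5]}
    search = GridSearchCV(
        RandomForestRegressor(random_state=random_state),
        param_grid,
        scoring="r2",
        cv=3,
    )
    search.fit(X_train, y_train)

    return {
        "vanilla_r2": r2_score(y_test, vanilla.predict(X_test)),
        "tuned_r2": r2_score(y_test, search.best_estimator_.predict(X_test)),
        "best_params": search.best_params_,
    }


if __name__ == "__main__":
    print(compare_vanilla_and_tuned())
```

On synthetic data the tuned model is not guaranteed to beat the default one; the point is only to show the shape of the comparison on a held-out test split.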
