Churn Prediction with Text and Interpretability

Customer churn, the loss of current customers, is a problem faced by a wide range of companies. When trying to retain customers, it is in a company’s best interest to focus its efforts on the customers who are most likely to leave, which requires a way to detect those customers before they have decided to leave. Users prone to churn often leave clues to their disposition in their behavior and in customer support chat logs, and these clues can be detected and understood using Natural Language Processing (NLP) tools.

Here, we demonstrate how to build a churn prediction model that leverages both text and structured data (numerical and categorical), which we call a bi-modal model architecture. We use Amazon SageMaker to prepare, build, and train the model. Detecting customers who are likely to churn is only part of the battle; finding the root cause is an essential part of actually solving the issue. Since we are interested not only in the likelihood of a customer churning but also in the driving factors, we complement the prediction model with an analysis of feature importance for both text and non-text inputs.
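To make the bi-modal idea concrete, here is a minimal sketch of one way text and structured inputs could be fused into a single feature vector. It is an illustration only, not the architecture used in this repository: the vocabulary, the field names (`chat_log`, `day_minutes`, `service_calls`, `plan`), and the bag-of-words encoder are all assumptions, and a real model would typically use a learned text encoder instead.

```python
from collections import Counter

# Hypothetical vocabulary and category list, chosen only for illustration.
VOCAB = ["cancel", "bill", "slow", "thanks"]
PLANS = ["basic", "premium"]


def text_features(text):
    """Bag-of-words counts over VOCAB (a stand-in for a learned text encoder)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def structured_features(record):
    """Numeric values as-is, followed by a one-hot encoding of the plan."""
    numeric = [float(record["day_minutes"]), float(record["service_calls"])]
    one_hot = [1.0 if record["plan"] == plan else 0.0 for plan in PLANS]
    return numeric + one_hot


def fuse(record):
    """Concatenate both modalities into one input vector for a classifier."""
    return text_features(record["chat_log"]) + structured_features(record)


# Hypothetical customer record mixing text and structured fields.
example = {
    "chat_log": "I want to cancel my bill is too high",
    "day_minutes": 180.5,
    "service_calls": 4,
    "plan": "basic",
}
fused = fuse(example)
print(fused)
```

Concatenation keeps each modality's features identifiable by position, which is convenient when feature importance is later reported separately for text and non-text inputs.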

The categorical and numerical data is from Kaggle: Customer Churn Prediction 2020 and was combined with a synthetic text dataset we created using GPT-2.

Blog Post

Medium / Towards Data Science blog post

Installation

git clone https://github.com/aws-samples/churn-prediction-with-text-and-interpretability.git
conda create -n py39 python=3.9
conda activate py39
cd churn-prediction-with-text-and-interpretability
pip install -r requirements.txt

Download categorical/numerical data and combine with synthetic text data

  1. Download the categorical/numerical data from Customer Churn Prediction 2020 (may require a Kaggle account). Download train.csv and store it in the data folder.

  2. Run the script (in ../scripts) to combine the categorical/numerical data with the synthetic text data:

    python create_dataset.py
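The combining step can be pictured roughly as follows. This is a sketch under stated assumptions, not the logic of create_dataset.py: the column names (`id`, `total_day_minutes`, `churn`, `chat_log`) and the matching of rows to text by an `id` key are hypothetical, and the data are tiny in-memory stand-ins.

```python
import csv
import io

# Tiny in-memory stand-ins for the two inputs; column names are hypothetical.
structured_csv = "id,total_day_minutes,churn\n1,180.5,no\n2,95.0,yes\n"
texts = {"1": "Thanks, that fixed my issue.", "2": "I want to cancel my plan."}


def combine(structured_file, text_by_id):
    """Attach a chat_log column to each structured row, matched by id."""
    combined = []
    for row in csv.DictReader(structured_file):
        row = dict(row)
        # Rows without matching text get an empty string rather than failing.
        row["chat_log"] = text_by_id.get(row["id"], "")
        combined.append(row)
    return combined


rows = combine(io.StringIO(structured_csv), texts)
for row in rows:
    print(row)
```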
    

Run in Notebook

An example notebook that runs the entire pipeline and prints/visualizes the results is included in ../notebook.

Run in Terminal

The Python scripts to prepare the data, train and evaluate the model, and interpret the model are stored in ../scripts. The parameters used for training and interpreting the model are stored in ../model/params.yaml.

  1. Prepare the data:
    python preprocess.py
    
  2. Train and evaluate the model:
    python train.py
    
  3. Interpret the trained model (text):
    python interpret.py --churn 1 --speaker Customer
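This README does not say which attribution method interpret.py uses, so the sketch below substitutes a simple occlusion approach to show what token-level text importance can look like. The scoring function is a toy stand-in for a trained model, the transcript format is hypothetical, and reading `--speaker Customer` as "only the customer's utterances" is an assumption.

```python
def toy_churn_score(tokens):
    """Stand-in for a trained model: fraction of tokens that are 'risky'."""
    risky = {"cancel", "switch", "expensive"}
    if not tokens:
        return 0.0
    return sum(token in risky for token in tokens) / len(tokens)


def occlusion_importance(tokens, score_fn):
    """Score drop when each token is removed; larger means more influential."""
    base = score_fn(tokens)
    importances = []
    for i, token in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]
        importances.append((token, base - score_fn(reduced)))
    return importances


# Hypothetical transcript as (speaker, utterance) pairs.
transcript = [
    ("Agent", "how can I help"),
    ("Customer", "i want to cancel now"),
]

# Keep only the tokens spoken by the customer.
customer_tokens = [
    token
    for speaker, utterance in transcript
    if speaker == "Customer"
    for token in utterance.split()
]

# Rank tokens from most to least influential on the toy score.
ranked = sorted(
    occlusion_importance(customer_tokens, toy_churn_score),
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked[0][0])
```

Occlusion needs only the ability to re-score modified inputs, which makes it easy to apply to any model, at the cost of one extra scoring call per token.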
    

Credits

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

Contributors

amazon-auto, danielhkt
