
CPIMS

Devs

  1. Dedan Okware
  2. Rebecca Cheptoek
  3. Emmanuel Kigen
  4. Stanley Njoroge

Business Understanding

I. Problem

  • CPIMS service desk receives a high volume of repetitive requests from users, negatively impacting the efficiency and effectiveness of the support team and creating a negative user experience.

  • A virtual assistant is needed to reduce the volume of these requests and improve the user experience.

  • The key services that end users expect the virtual assistant to perform must be identified.

II. Client Engagement Process

A user-centered virtual assistant is developed through a user engagement process consisting of the following steps:

  • Defining target audience

  • Understanding user needs

  • Defining virtual assistant purpose

  • Creating a user-friendly virtual assistant

  • Providing excellent services

  • Continuously improving the virtual assistant

  • Measuring user engagement

III. Objectives

  • To develop a user support Virtual Assistant for the CPIMS system.

  • To deploy the virtual assistant for the CPIMS system into a web interface to enhance user experience.

  • To develop an online virtual assistant using machine learning that can answer user inquiries and queries quickly and efficiently.

Data Acquisition

Sources of Data

  • WhatsApp chats from 5 different groups based on regions

  • CPIMS website documentations and frequently asked questions

Data Acquisition Process

  • Extraction, Transformation, and Loading (ETL) tool used

  • Extraction of data from text files

  • Transformation of data into JSON format

  • Loading of data into system for model training
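The ETL steps above can be sketched in Python. This is a minimal illustration, not the project's actual pipeline: the file name intents.json, the intent label, and the sample messages are all hypothetical.

```python
import json

# Extract: keep non-empty message lines from a raw chat export (text file).
def extract(lines):
    return [line.strip() for line in lines if line.strip()]

# Transform: group messages under an intent label in the JSON structure
# used for model training (tag / patterns / responses).
def transform(messages, tag, response):
    return {"tag": tag, "patterns": messages, "responses": [response]}

# Load: write the intents to a JSON file the training step can read.
def load(intents, path):
    with open(path, "w") as f:
        json.dump({"intents": intents}, f, indent=2)

raw = ["How do I reset my password?", "", "Password reset not working"]
intent = transform(extract(raw), "password_reset",
                   "Use the 'Forgot password' link on the CPIMS login page.")
load([intent], "intents.json")
```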

Exploratory Data Analysis

I. Introduction

  • Explanation of EDA and its purpose in this project
  • The main variable of interest in the data

II. EDA Techniques Used

  • Description of the exploratory visualizations used to analyze the frequently asked questions
  • Explanation of data cleaning and preprocessing
  • Identification of patterns and relationships

III. Results of EDA

  • Distribution of user intentions
  • Feature selection and most frequent phrases used to train the model
  • Visualizations including histograms and box plots to examine the distribution of message types across different intents
  • Word frequency analysis to identify the most common words and phrases used in different questions
  • Conclusion on the most common topics and trends in the chat conversations
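A word-frequency analysis like the one described can be sketched with the standard library. The sample questions and the stopword list below are hypothetical placeholders.

```python
from collections import Counter
import re

# Hypothetical sample questions standing in for the chat data.
questions = [
    "How do I reset my CPIMS password?",
    "I forgot my password, how do I reset it?",
    "How do I register a new case in CPIMS?",
]

STOPWORDS = {"how", "do", "i", "my", "a", "it", "in", "the"}

def word_frequencies(texts):
    """Lowercase, strip punctuation, drop stopwords, and count words."""
    words = []
    for text in texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in STOPWORDS:
                words.append(token)
    return Counter(words)

freq = word_frequencies(questions)
print(freq.most_common(3))
```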

IV. Conclusion

  • Outcome of EDA and its usefulness in understanding the data
  • Importance of EDA in the development of the chatbot model

Data Cleaning

Introduction

Briefly introduce the topic of data cleaning and its importance in the data science process

Data Cleaning Process

  • Explain the process of data cleaning which involved:
  • Identifying the intents of each chat such as password reset
  • Categorizing the data into different intents
  • Creating a JSON file and tagging each intent with pattern and response
  • Removing duplicated patterns and responses, and unnecessary characters
  • Formatting the data
  • Extracting key information related to CPIMS from the entire dataset
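The deduplication and character-removal steps can be sketched as follows. This is an illustrative sketch rather than the project's actual cleaning script; the sample patterns and the exact set of characters kept are assumptions.

```python
import re

def clean_patterns(patterns):
    """Drop unnecessary characters and case-insensitive duplicates."""
    seen, cleaned = set(), []
    for p in patterns:
        # Keep only letters, digits, whitespace, and basic punctuation.
        p = re.sub(r"[^\w\s?.!',-]", "", p).strip()
        key = p.lower()
        if p and key not in seen:
            seen.add(key)
            cleaned.append(p)
    return cleaned

print(clean_patterns(["Reset password??", "reset password??", "Reset password?? ###"]))
```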

Data Cleaning Outcomes

Describe the outcomes of the data cleaning process, which included:

  • Acquiring a dataset that only contained CPIMS related issues
  • Creating a JSON intents file that was used for model training
  • Improving data quality by removing errors, inconsistencies, and irrelevant data

Conclusion

Summarize the importance of data cleaning in ensuring high quality data for analysis and modeling purposes.

Feature Engineering

I. Introduction

  • Definition of feature engineering

  • Importance of feature engineering in training a machine learning model

II. Feature Engineering Process

  • Data cleaning and transformation

  • Converting data to lowercase

  • Tokenizing of data

  • Removing punctuation

  • OneHotEncoding

  • Removing emojis

  • Lemmatization

  • Limiting each question to a length of 50 words
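A minimal sketch of these preprocessing steps using only the standard library. Lemmatization is omitted here for self-containedness (it could be done with e.g. NLTK's WordNetLemmatizer), the one-hot encoder is a plain-Python stand-in for a library encoder, and all sample inputs are hypothetical.

```python
import re

MAX_LEN = 50  # each question is limited to 50 tokens

def preprocess(text):
    """Lowercase, drop punctuation and emojis, tokenize, truncate to MAX_LEN."""
    text = text.lower()
    # Keeping only ASCII letters/digits also removes punctuation and emojis.
    tokens = re.findall(r"[a-z0-9]+", text)
    return tokens[:MAX_LEN]

def one_hot(labels):
    """One-hot encode intent labels; classes are sorted alphabetically."""
    classes = sorted(set(labels))
    return [[1 if lbl == c else 0 for c in classes] for lbl in labels]

print(preprocess("How do I RESET my password? 🔑"))
print(one_hot(["password_reset", "login_issue", "password_reset"]))
```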

III. Features Used

  • Intents

  • Patterns

  • Responses

IV. Conclusion

  • Summary of feature engineering process and features used

  • Importance of feature engineering in training chatbots

Model Development

This project uses a supervised learning approach for model development. The approach involves providing the computer with labeled data, which consists of input data and corresponding desired output. The computer then learns a model from this data, which can be used to map new input data to the desired output. The model can also classify data into different categories and make predictions on unseen data.
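The supervised idea can be illustrated with a toy word-overlap classifier. This is only a sketch of learning a mapping from labeled input patterns to intents; the training pairs are hypothetical, and the real project would train a proper machine learning model.

```python
from collections import defaultdict

# Labeled data: (input pattern, desired output intent) -- hypothetical examples.
TRAINING = [
    ("how do i reset my password", "password_reset"),
    ("i forgot my password", "password_reset"),
    ("how do i register a new case", "case_registration"),
]

def train(pairs):
    """Learn a bag of words per intent from the labeled examples."""
    vocab = defaultdict(set)
    for text, intent in pairs:
        vocab[intent].update(text.split())
    return vocab

def predict(model, text):
    """Classify unseen input by the intent with the largest word overlap."""
    words = set(text.split())
    return max(model, key=lambda intent: len(words & model[intent]))

model = train(TRAINING)
print(predict(model, "help me reset my password"))
```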

Justification for Model Used

The chosen model is useful because it is capable of making predictions on unseen data and classifying data into different categories. It is an effective way of training a machine learning algorithm using labeled data. The supervised learning approach used in this project ensures accuracy and reliability since the computer is given both the input data and the corresponding desired output.

Model Evaluation

Machine learning models are evaluated using metrics, which are measures of performance that can be used to track and compare progress. Metrics provide an objective way to measure and compare progress, allowing for informed decision-making and identification of areas for improvement.

Metrics Used

The following metrics were used to evaluate the model:

Precision:

measures how many of the model's positive predictions were actually correct (true positives divided by all positive predictions). Higher precision indicates fewer false positives and more reliable predictions by the model.

F1-score:

combines precision and recall into a single score to assess the overall performance of the model. It takes into account both precision and recall to give an overall measure of how well the model is performing. Higher F1-score indicates better overall performance of the model.

Accuracy

Accuracy is a metric that measures the proportion of correct predictions made by the model out of all predictions made. This metric was used to assess the overall performance of the model. The higher the accuracy, the better the performance of the model.

Results from Different Metrics

The model was evaluated using the precision, F1-score, and accuracy metrics described above. The results of the evaluation are presented in this section.

Justification for Metrics Used

The precision, F1-score, and accuracy metrics are useful for assessing the performance of a model. Precision measures the correctness of positive predictions, accuracy measures overall correctness, and F1-score combines precision and recall (a measure of completeness) into a single score. Using these metrics together allows for a more comprehensive assessment of the model's performance, helping to identify areas for improvement.
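These metrics can be computed directly from true and predicted labels. A minimal sketch with hypothetical labels, treating one intent as the positive class:

```python
def metrics(y_true, y_pred, positive):
    """Compute precision, recall, F1-score, and accuracy for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = correct / len(y_true)
    return precision, recall, f1, accuracy

# Hypothetical evaluation labels for illustration only.
y_true = ["reset", "reset", "other", "other"]
y_pred = ["reset", "other", "reset", "other"]
print(metrics(y_true, y_pred, "reset"))
```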

Model Deployment

Model deployment is the process of integrating a trained machine learning model into a production environment, where it can be used to make predictions or perform other tasks in real-time. The following outlines the deployment method used and the process of model deployment:

Deployment Method Used

We used Flask for model deployment because it is a lightweight web application framework commonly used for serving machine learning models. Flask allows a trained model to be exposed through a web page that different users can access online, and it works together with HTML, CSS, and JavaScript to build interactive web pages.

Process of Model Deployment

The process of deploying the model involved the following steps:
  1. Installed Flask using pip: pip install flask
  2. Created a Flask application: We did this by creating a new Python file, importing the Flask module, creating a new instance of the Flask class, and defining a route for the application.
  3. Created a new Python file and imported the necessary modules such as pandas and scikit-learn for the machine learning model. We defined the model and loaded the necessary data.
  4. Created a new route in the Flask application that used the machine learning model to make predictions. We created a route that takes input data from a POST request and returns a JSON response with the predicted value.
  5. Saved the Python files and ran the Flask application from the command prompt: first export FLASK_APP=app.py, then flask run.
  6. Tested the model: Used an HTTP client to test the model by sending a POST request to the /predict endpoint with the input data in JSON format.
  7. The Flask application receives the request, uses the machine learning model to make a prediction, and returns a JSON response with the predicted value.
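The deployment steps above can be sketched as a minimal Flask application, assuming Flask is installed. The predict_intent function is a hypothetical placeholder for the trained model, and the route mirrors the /predict endpoint described in step 6.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def predict_intent(message):
    """Hypothetical stand-in for the trained model's prediction."""
    return "password_reset" if "password" in message.lower() else "unknown"

@app.route("/predict", methods=["POST"])
def predict():
    # Read input data from the POST body, predict, return a JSON response.
    data = request.get_json()
    return jsonify({"intent": predict_intent(data["message"])})
```

To serve it, set export FLASK_APP=app.py and run flask run, then send a POST request with a JSON body such as {"message": "I forgot my password"} to the /predict endpoint.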

Challenges

  • Cleaning the data

  • Inadequate dataset

  • Language barrier

  • Inadequate time

Contributors

  • softcysec
