
CPIMS

Devs

  1. Dedan Okware
  2. Rebecca Cheptoek
  3. Emmanuel Kigen
  4. Stanley Njoroge

Business Understanding

I. Problem

  • CPIMS service desk receives a high volume of repetitive requests from users, negatively impacting the efficiency and effectiveness of the support team and creating a negative user experience.

  • A virtual assistant is needed to reduce the volume of these requests and improve the user experience.

  • The key services that end users expect the virtual assistant to perform must be identified.

II. Client Engagement Process

A user-centered virtual assistant is developed through a user engagement process consisting of the following steps:

  • Defining target audience

  • Understanding user needs

  • Defining virtual assistant purpose

  • Creating a user-friendly virtual assistant

  • Providing excellent services

  • Continuously improving the virtual assistant

  • Measuring user engagement

III. Objectives

  • To develop a user support Virtual Assistant for the CPIMS system.

  • To deploy the virtual assistant for the CPIMS system into a web interface to enhance user experience.

  • To develop an online virtual assistant using machine learning that can answer user inquiries and queries quickly and efficiently.

Data Acquisition

Sources of Data

  • WhatsApp chats from 5 different groups based on regions

  • CPIMS website documentations and frequently asked questions

Data Acquisition Process

  • Extraction, Transformation, and Loading (ETL) tool used

  • Extraction of data from text files

  • Transformation of data into JSON format

  • Loading of data into system for model training
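The ETL steps above can be sketched in Python. This is a minimal illustration, not the project's actual pipeline: the file name intents.json, the intent label, and the sample messages are all hypothetical.

```python
import json

# Extract: keep non-empty message lines from a raw chat export (text file).
def extract(lines):
    return [line.strip() for line in lines if line.strip()]

# Transform: group messages under an intent label in the JSON structure
# used for model training (tag / patterns / responses).
def transform(messages, tag, response):
    return {"tag": tag, "patterns": messages, "responses": [response]}

# Load: write the intents to a JSON file the training step can read.
def load(intents, path):
    with open(path, "w") as f:
        json.dump({"intents": intents}, f, indent=2)

raw = ["How do I reset my password?", "", "Password reset not working"]
intent = transform(extract(raw), "password_reset",
                   "Use the 'Forgot password' link on the CPIMS login page.")
load([intent], "intents.json")
```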

Exploratory Data Analysis

I. Introduction

  • Explanation of EDA and its purpose in this project
  • The main variable of interest in the data

II. EDA Techniques Used

  • Description of the exploratory visualizations used to analyze the frequently asked questions
  • Explanation of data cleaning and preprocessing
  • Identification of patterns and relationships

III. Results of EDA

  • Distribution of user intentions
  • Feature selection and most frequent phrases used to train the model
  • Visualizations including histograms and box plots to examine the distribution of message types across different intents
  • Word frequency analysis to identify the most common words and phrases used in different questions
  • Conclusion on the most common topics and trends in the chat conversations
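A word-frequency analysis like the one described can be sketched with the standard library. The sample questions and the stopword list below are hypothetical placeholders.

```python
from collections import Counter
import re

# Hypothetical sample questions standing in for the chat data.
questions = [
    "How do I reset my CPIMS password?",
    "I forgot my password, how do I reset it?",
    "How do I register a new case in CPIMS?",
]

STOPWORDS = {"how", "do", "i", "my", "a", "it", "in", "the"}

def word_frequencies(texts):
    """Lowercase, strip punctuation, drop stopwords, and count words."""
    words = []
    for text in texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in STOPWORDS:
                words.append(token)
    return Counter(words)

freq = word_frequencies(questions)
print(freq.most_common(3))
```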

IV. Conclusion

  • Outcome of EDA and its usefulness in understanding the data
  • Importance of EDA in the development of the chatbot model

Data Cleaning

Introduction

Briefly introduce the topic of data cleaning and its importance in the data science process

Data Cleaning Process

  • Explain the process of data cleaning which involved:
  • Identifying the intents of each chat such as password reset
  • Categorizing the data into different intents
  • Creating a JSON file and tagging each intent with pattern and response
  • Removing duplicated patterns and responses, and unnecessary characters
  • Formatting the data
  • Extracting key information related to CPIMS from the entire dataset
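The deduplication and character-removal steps can be sketched as follows. This is an illustrative sketch rather than the project's actual cleaning script; the sample patterns and the exact set of characters kept are assumptions.

```python
import re

def clean_patterns(patterns):
    """Drop unnecessary characters and case-insensitive duplicates."""
    seen, cleaned = set(), []
    for p in patterns:
        # Keep only letters, digits, whitespace, and basic punctuation.
        p = re.sub(r"[^\w\s?.!',-]", "", p).strip()
        key = p.lower()
        if p and key not in seen:
            seen.add(key)
            cleaned.append(p)
    return cleaned

print(clean_patterns(["Reset password??", "reset password??", "Reset password?? ###"]))
```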

Data Cleaning Outcomes

Describe the outcomes of the data cleaning process, which included:

  • Acquiring a dataset that only contained CPIMS related issues
  • Creating a JSON intents file that was used for model training
  • Improving data quality by removing errors, inconsistencies, and irrelevant data

Conclusion

Summarize the importance of data cleaning in ensuring high quality data for analysis and modeling purposes.

Feature Engineering

I. Introduction

  • Definition of feature engineering

  • Importance of feature engineering in training a machine learning model

II. Feature Engineering Process

  • Data cleaning and transformation

  • Converting data to lowercase

  • Tokenizing of data

  • Removing punctuation

  • OneHotEncoding

  • Removing emojis

  • Lemmatization

  • Limiting each question to a length of 50 words
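A minimal sketch of these preprocessing steps using only the standard library. Lemmatization is omitted here for self-containedness (it could be done with e.g. NLTK's WordNetLemmatizer), the one-hot encoder is a plain-Python stand-in for a library encoder, and all sample inputs are hypothetical.

```python
import re

MAX_LEN = 50  # each question is limited to 50 tokens

def preprocess(text):
    """Lowercase, drop punctuation and emojis, tokenize, truncate to MAX_LEN."""
    text = text.lower()
    # Keeping only ASCII letters/digits also removes punctuation and emojis.
    tokens = re.findall(r"[a-z0-9]+", text)
    return tokens[:MAX_LEN]

def one_hot(labels):
    """One-hot encode intent labels; classes are sorted alphabetically."""
    classes = sorted(set(labels))
    return [[1 if lbl == c else 0 for c in classes] for lbl in labels]

print(preprocess("How do I RESET my password? 🔑"))
print(one_hot(["password_reset", "login_issue", "password_reset"]))
```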

III. Features Used

  • Intents

  • Patterns

  • Responses

IV. Conclusion

  • Summary of feature engineering process and features used

  • Importance of feature engineering in training chatbots

Model Development

This project uses a supervised learning approach for model development. The approach involves providing the computer with labeled data, which consists of input data and corresponding desired output. The computer then learns a model from this data, which can be used to map new input data to the desired output. The model can also classify data into different categories and make predictions on unseen data.
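The supervised idea can be illustrated with a toy word-overlap classifier. This is only a sketch of learning a mapping from labeled input patterns to intents; the training pairs are hypothetical, and the real project would train a proper machine learning model.

```python
from collections import defaultdict

# Labeled data: (input pattern, desired output intent) -- hypothetical examples.
TRAINING = [
    ("how do i reset my password", "password_reset"),
    ("i forgot my password", "password_reset"),
    ("how do i register a new case", "case_registration"),
]

def train(pairs):
    """Learn a bag of words per intent from the labeled examples."""
    vocab = defaultdict(set)
    for text, intent in pairs:
        vocab[intent].update(text.split())
    return vocab

def predict(model, text):
    """Classify unseen input by the intent with the largest word overlap."""
    words = set(text.split())
    return max(model, key=lambda intent: len(words & model[intent]))

model = train(TRAINING)
print(predict(model, "help me reset my password"))
```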

Justification for Model Used

The chosen model is useful because it is capable of making predictions on unseen data and classifying data into different categories. It is an effective way of training a machine learning algorithm using labeled data. The supervised learning approach used in this project ensures accuracy and reliability since the computer is given both the input data and the corresponding desired output.

Model Evaluation

Machine learning models are evaluated using metrics, which are measures of performance that can be used to track and compare progress. Metrics provide an objective way to measure and compare progress, allowing for informed decision-making and identification of areas for improvement.

Metrics Used

The following metrics were used to evaluate the model:

Precision:

measures how many of the model's positive predictions were actually correct (true positives divided by all positive predictions). Higher precision indicates fewer false positives and more reliable predictions by the model.

F1-score:

combines precision and recall into a single score to assess the overall performance of the model. It takes into account both precision and recall to give an overall measure of how well the model is performing. Higher F1-score indicates better overall performance of the model.

Accuracy

Accuracy is a metric that measures the proportion of correct predictions made by the model out of all predictions made. This metric was used to assess the overall performance of the model. The higher the accuracy, the better the performance of the model.

Results from Different Metrics

The model was evaluated using the precision, F1-score, and accuracy metrics described above. The results of the evaluation are presented in this section.

Justification for Metrics Used

The precision, F1-score, and accuracy metrics are useful for assessing the performance of a model. Precision measures the correctness of positive predictions, accuracy measures overall correctness, and F1-score combines precision and recall (a measure of completeness) into a single score. Using these metrics together allows for a more comprehensive assessment of the model's performance, helping to identify areas for improvement.
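These metrics can be computed directly from true and predicted labels. A minimal sketch with hypothetical labels, treating one intent as the positive class:

```python
def metrics(y_true, y_pred, positive):
    """Compute precision, recall, F1-score, and accuracy for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = correct / len(y_true)
    return precision, recall, f1, accuracy

# Hypothetical evaluation labels for illustration only.
y_true = ["reset", "reset", "other", "other"]
y_pred = ["reset", "other", "reset", "other"]
print(metrics(y_true, y_pred, "reset"))
```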

Model Deployment

Model deployment is the process of integrating a trained machine learning model into a production environment, where it can be used to make predictions or perform other tasks in real-time. The following outlines the deployment method used and the process of model deployment:

Deployment Method Used

We used Flask for model deployment because it is a lightweight web application framework commonly used for serving machine learning models. Flask allows a trained model to be exposed through a web page that different users can access online, and it works together with HTML, CSS, and JavaScript to build interactive web pages.

Process of Model Deployment

The process of deploying the model involved the following steps:
  1. Installed Flask using pip: pip install flask
  2. Created a Flask application: We did this by creating a new Python file, importing the Flask module, creating a new instance of the Flask class, and defining a route for the application.
  3. Created a new Python file and imported the necessary modules such as pandas and scikit-learn for the machine learning model. We defined the model and loaded the necessary data.
  4. Created a new route in the Flask application that used the machine learning model to make predictions. We created a route that takes input data from a POST request and returns a JSON response with the predicted value.
  5. Saved the Python files and ran the Flask application from the command prompt: first export FLASK_APP=app.py, then flask run.
  6. Tested the model: Used an HTTP client to test the model by sending a POST request to the /predict endpoint with the input data in JSON format.
  7. The Flask application receives the request, uses the machine learning model to make a prediction, and returns a JSON response with the predicted value.
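The deployment steps above can be sketched as a minimal Flask application, assuming Flask is installed. The predict_intent function is a hypothetical placeholder for the trained model, and the route mirrors the /predict endpoint described in step 6.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def predict_intent(message):
    """Hypothetical stand-in for the trained model's prediction."""
    return "password_reset" if "password" in message.lower() else "unknown"

@app.route("/predict", methods=["POST"])
def predict():
    # Read input data from the POST body, predict, return a JSON response.
    data = request.get_json()
    return jsonify({"intent": predict_intent(data["message"])})
```

To serve it, set export FLASK_APP=app.py and run flask run, then send a POST request with a JSON body such as {"message": "I forgot my password"} to the /predict endpoint.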

Challenges

  • Cleaning the data

  • Inadequate dataset

  • Language barrier

  • Inadequate time

Contributors

  • softcysec
