Reinforcement-Learning-chatbot

• Team Neurons

Abhishek Prabhudesai

Pooja Agarwal

Rohini Shimpatwar

Saurabh Aggarwal

References:
- https://github.com/KGSands
- https://learning.oreilly.com/library/view/deep-reinforcement-learning/9781788834247/ch12.html
- https://github.com/PacktPublishing/Deep-Reinforcement-Learning-Hands-On/tree/master/Chapter12

We would like to thank Keelan KGSands for providing a baseline for our project. The foundation of this chatbot implementation was inspired by his work, and we adapted it to our need: giving users of our application (crowd counting at public places such as malls, stores, the DMV, etc.) an interface that provides a crowd footfall estimate using a DQN-trained chatbot.

The application comprises a chatbot that aims to give users appropriate answers to their queries. A learning agent uses a deep Q-learning neural network to choose, at each point in the conversation, the reply with the maximum expected reward.

The application is trained on a conversation CSV file that contains all the possible conversations between the chatbot and the user. The input to the neural network is a vector [S1, S2] of size 2. S1 holds the value of the current sentence or question entered by the user. To maintain context, S2 stores the previous question entered by the user.
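The state encoding described above can be sketched as follows. This is a minimal illustration, not the project's code: the `QUESTION_IDS` table, the `NO_PREVIOUS` placeholder, and the `build_state` helper are hypothetical names, and the sample questions are made up, since the contents of the conversation CSV are not shown here.

```python
# Hypothetical question-to-index table; the real project reads its
# questions from a conversation CSV whose contents are not shown here.
QUESTION_IDS = {
    "hello": 0,
    "how crowded is the mall": 1,
    "what about the dmv": 2,
}

NO_PREVIOUS = -1  # assumed placeholder for "no earlier question yet"


def build_state(current_question, previous_question=None):
    """Return the size-2 state vector [S1, S2]."""
    s1 = QUESTION_IDS[current_question]
    s2 = NO_PREVIOUS if previous_question is None else QUESTION_IDS[previous_question]
    return [s1, s2]


first = build_state("how crowded is the mall")
follow_up = build_state("what about the dmv", "how crowded is the mall")
print(first)      # [1, -1]
print(follow_up)  # [2, 1]
```

Carrying the previous question in S2 lets the same current question lead to different answers depending on what was asked just before it.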

The action space is the array [0, 1, ..., 21], and each action is selected from this space. Each action corresponds to one of the possible answers [A0, ..., A21]. During the testing phase, the answer with the maximum Q-value among [Q0, ..., Q21] is selected. The Q-value gives an estimate of the aggregated future reward. The reward is an integer value that is sent to the agent after it takes an action.
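Selecting the answer with the maximum Q-value can be sketched as below. The Q-values are made up for illustration, and `greedy_action` is a hypothetical helper name rather than a function from the project.

```python
NUM_ACTIONS = 22  # actions 0..21, one per candidate answer A0..A21


def greedy_action(q_values):
    """Return the index of the largest Q-value (ties go to the lowest index)."""
    if len(q_values) != NUM_ACTIONS:
        raise ValueError("expected one Q-value per action")
    return max(range(NUM_ACTIONS), key=lambda a: q_values[a])


# Made-up Q-values for illustration only.
q_values = [0.0] * NUM_ACTIONS
q_values[7] = 2.5
q_values[13] = 1.0

best = greedy_action(q_values)
print(best)  # 7
```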

• Environment

The main purpose of the environment class is to generate random user conversations for training the chatbot agent and to provide a reward function that takes the current question and the action as input and returns a reward as output.

These randomly generated conversations will be used for training the neural network.
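A toy version of such an environment might look like the sketch below. The `CORRECT_ANSWER` mapping, the `Environment` class layout, and the +1/-1 reward scheme are assumptions made for illustration; the source only states that the reward is an integer computed from the current question and the action.

```python
import random

# Hypothetical mapping from question index to the index of its correct
# answer; the real mapping would come from the conversation CSV.
CORRECT_ANSWER = {0: 0, 1: 4, 2: 9}


class Environment:
    """Toy environment: random user questions plus an integer reward."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.previous_question = -1  # assumed marker for "no previous question"

    def next_state(self):
        """Draw a random user question and return the state [S1, S2]."""
        current = self.rng.choice(list(CORRECT_ANSWER))
        state = [current, self.previous_question]
        self.previous_question = current
        return state

    def reward(self, current_question, action):
        """Assumed scheme: +1 for the correct answer, -1 otherwise."""
        return 1 if CORRECT_ANSWER[current_question] == action else -1


env = Environment(seed=0)
conversation = [env.next_state() for _ in range(5)]
print(env.reward(1, 4))  # 1
print(env.reward(1, 5))  # -1
```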

• Replay Memory

The replay memory stores historical data so that it can be reused during exploitation.

Several 2-dimensional arrays are maintained that hold this historical data as a mapping from actions to Q-values.

Q-values are updated with the formula: State Value = Reward + Discount_Rate * Max(Q-value of next state)
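The update formula and a bounded store of past transitions can be sketched as follows. This is not the project's implementation: it uses a `deque` of transitions rather than the 2-dimensional arrays mentioned above, and the `ReplayMemory` class, the `q_target` helper, and the discount rate of 0.9 are assumptions chosen for illustration.

```python
import random
from collections import deque

DISCOUNT_RATE = 0.9  # assumed value; the source does not state it


class ReplayMemory:
    """Fixed-size store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest entries drop out first

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        """Return up to batch_size transitions chosen at random."""
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def q_target(reward, next_q_values, discount_rate=DISCOUNT_RATE):
    """Reward + Discount_Rate * Max(Q-value of next state)."""
    return reward + discount_rate * max(next_q_values)


memory = ReplayMemory(capacity=2)
memory.push([1, -1], 4, 1, [2, 1])
memory.push([2, 1], 9, 1, [0, 2])
memory.push([0, 2], 3, -1, [1, 0])  # evicts the oldest transition

target = q_target(reward=1, next_q_values=[0.5, 2.0, -1.0])
print(len(memory))  # 2
```

With a reward of 1 and a best next-state Q-value of 2.0, the target works out to roughly 1 + 0.9 * 2.0 = 2.8.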

• Neural Network

Structure of the neural network required for estimating the Q-values of the actions:

Input layer: 2 neurons corresponding to the state (current state and previous state)

Hidden layer 1: Dense layer of 20 neurons with ReLU as the activation function

Hidden layer 2: Dense layer of 20 neurons with ReLU as the activation function

Output layer: Layer with 22 neurons
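A forward pass through a network of this shape can be sketched in plain Python. The random weight initialization and the linear (no activation) output layer are assumptions, since the source does not specify them, and no training step is shown; a real implementation would more likely use a deep learning framework.

```python
import random

LAYER_SIZES = [2, 20, 20, 22]  # input, hidden 1, hidden 2, output


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an assumed initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def forward(state, layers):
    """Map a state [S1, S2] to 22 Q-value estimates."""
    activations = [float(x) for x in state]
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return dense(activations, weights, biases)  # linear output layer (assumed)


rng = random.Random(0)
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]
q_values = forward([1, -1], layers)
print(len(q_values))  # 22
```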

• Agent

The main purpose of the agent is to initialize the environment, train the neural network, and take actions, that is, to choose the correct reply to the user's question.

All hyperparameters required for training the neural network, such as the learning rate, the exploration parameter epsilon, and the replay memory size, are initialized in the agent class.
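One common way an exploration parameter epsilon is used is epsilon-greedy action selection, sketched below. Whether the project's agent uses exactly this rule is an assumption, and every hyperparameter value shown is illustrative rather than taken from the project.

```python
import random

# All values below are illustrative assumptions, not the project's settings.
LEARNING_RATE = 0.001
EPSILON = 0.1
REPLAY_MEMORY_SIZE = 1000
NUM_ACTIONS = 22


def choose_action(q_values, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: random action
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


rng = random.Random(0)
q_values = [0.0] * NUM_ACTIONS
q_values[5] = 1.0

always_greedy = choose_action(q_values, epsilon=0.0, rng=rng)
always_random = choose_action(q_values, epsilon=1.0, rng=rng)
print(always_greedy)  # 5
```

Setting epsilon to 0 gives the pure exploitation used at test time, while larger values make the agent try answers it currently rates lower.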
