
genesys-web-messaging-tester's Introduction

Genesys Web Messaging Tester


Automatically test your Web Messenger Deployments

Allows the behaviour of Genesys chatbots and Architect flows behind Genesys' Web Messenger Deployments to be automatically tested using:

  • Scripted Dialogue - I say "X" and expect "Y" in response (example)
  • Generative AI - Converse with my chatbot and fail the test if it doesn't do "X" (examples)

Why? Because it makes testing:

  • Fast - spot problems with your chatbots sooner than you would by testing manually
  • Repeatable - scenarios in scripted dialogues are run exactly as defined, and any response that deviates is flagged
  • Customer focused - expected behaviour can be defined as scenarios before development commences
  • Automatic - being a CLI tool means it can be integrated into your CI/CD pipeline or run on a schedule, e.g. to monitor production (see the workflow sketch below)
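
To illustrate the last point, a CI job could be as simple as the following GitHub Actions sketch. This is not part of the project; the workflow name, trigger, and test-script path are assumptions to adapt to your own setup:

name: chatbot-tests
on:
  schedule:
    - cron: "0 6 * * *" # e.g. a daily run to monitor production
jobs:
  scripted-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm install -g @ovotech/genesys-web-messaging-tester-cli
      # The CLI exits non-zero when a scenario fails, which fails the job
      - run: web-messaging-tester scripted tests/example.yml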

Demo of tool executing two scenarios that pass

The demo above uses the following test-script:

examples/cli-scripted-tests/example-pass.yml

config:
  deploymentId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  region: xxxx.pure.cloud
scenarios:
  "Accept Survey":
    - say: hi
    - waitForReplyContaining: Can we ask you some questions about your experience today?
    - say: 'Yes'
    - waitForReplyMatching: Thank you! Now for the next question[\.]+
  "Decline Survey":
    - say: hi
    - waitForReplyContaining: Can we ask you some questions about your experience today?
    - say: 'No'
    - waitForReplyContaining: Maybe next time. Goodbye
  "Provide Incorrect Answer to Survey Question":
    - say: hi
    - waitForReplyContaining: Can we ask you some questions about your experience today?
    - say: Example
    - waitForReplyContaining: Sorry. Please input "Yes" or "No". Do you want to proceed?

How it works

The tool uses Web Messenger's guest API to simulate a customer talking to a Web Messenger Deployment. Once the tool starts an interaction it follows the instructions defined in a file called a 'test-script', which tells it what to say and what to expect in response. If a response deviates from the test-script then the tool flags the test as a failure; otherwise the test passes.

Tool using test-script file to test Web Messenger Deployment
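
For a concrete sense of what the tool automates, the sketch below opens a guest session over the WebSocket-based guest API and exchanges one message. The endpoint and payload shapes are assumptions based on Genesys' public Web Messaging guest API documentation, not code from this project:

// sketch.js - minimal guest-API interaction (assumed API shapes; verify
// against Genesys' Web Messaging guest API docs before relying on this)
const WebSocket = require('ws');
const { randomUUID } = require('crypto');

const deploymentId = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx';
const region = 'xxxx.pure.cloud';
const token = randomUUID(); // client-generated guest session token

const ws = new WebSocket(`wss://webmessaging.${region}/v1?deploymentId=${deploymentId}`);

ws.on('open', () => {
  // A session must be configured before any messages can be sent
  ws.send(JSON.stringify({ action: 'configureSession', deploymentId, token }));
});

ws.on('message', (data) => {
  const msg = JSON.parse(data.toString());
  if (msg.class === 'SessionResponse') {
    // Session ready - send the customer's first message, like a `say` step
    ws.send(JSON.stringify({ action: 'onMessage', token, message: { type: 'Text', text: 'hi' } }));
  } else if (msg.class === 'StructuredMessage' && msg.body.direction === 'Outbound') {
    // Replies from the bot arrive here - this is where assertions in the
    // style of `waitForReplyContaining` would be applied to msg.body.text
    console.log('Them:', msg.body.text);
  }
});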

Quick Start

Prepare your system by installing node

Install the CLI tool using npm:

npm install -g @ovotech/genesys-web-messaging-tester-cli
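
You can verify the installation by printing the CLI's help:

web-messaging-tester --help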

Testing with scripted dialogues

Write a dialogue script containing all the scenarios you wish to run, along with the ID and region of your Web Messenger Deployment.

examples/cli-scripted-tests/example-pass.yml

config:
  deploymentId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  region: xxxx.pure.cloud
scenarios:
  "Accept Survey":
    - say: hi
    - waitForReplyContaining: Can we ask you some questions about your experience today?
    - say: 'Yes'
    - waitForReplyMatching: Thank you! Now for the next question[\.]+
  "Decline Survey":
    - say: hi
    - waitForReplyContaining: Can we ask you some questions about your experience today?
    - say: 'No'
    - waitForReplyContaining: Maybe next time. Goodbye
  "Provide Incorrect Answer to Survey Question":
    - say: hi
    - waitForReplyContaining: Can we ask you some questions about your experience today?
    - say: Example
    - waitForReplyContaining: Sorry. Please input "Yes" or "No". Do you want to proceed?

Then run the test by pointing to the dialogue script file in the terminal:

web-messaging-tester scripted tests/example.yml
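
When any scenario fails the command exits with a non-zero code, so it can gate a CI pipeline directly, e.g.:

web-messaging-tester scripted tests/example.yml && echo "All scenarios passed"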

Testing with AI

This tool supports two GenAI providers: OpenAI's ChatGPT and Google Vertex AI.

Using ChatGPT

Start by setting up an API key for ChatGPT:

  1. Create an API key for OpenAI
  2. Set the key in the environment variable: OPENAI_API_KEY
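
For example, in a POSIX shell:

export OPENAI_API_KEY="sk-..."  # key created in the OpenAI dashboard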

Write a scenario file containing all the scenarios you wish to run, along with the ID and region of your Web Messenger Deployment.

The scenarios are written as prompts, which can take some fine-tuning to get right (see the examples). The terminatingPhrases section defines the phrases you instruct ChatGPT to say to pass or fail a test.

examples/cli-ai-tests/chatgpt-example.yml

config:
  deploymentId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  region: xxxx.pure.cloud
  ai:
    provider: chatgpt
    config:
      temperature: 1
scenarios:
  "Accept survey":
    setup:
      placeholders:
        NAME:
          - John
          - Jane
      prompt: |
        I want you to play the role of a customer called {NAME}, talking to a company's online chatbot. You must not
        break from this role, and all of your responses must be based on how a customer would realistically talk to a company's chatbot.

        To help you play the role of a customer consider the following points when writing a response:
        * Respond to questions with as few words as possible
        * Answer with the exact word when given options e.g. if asked to answer with either 'yes' or 'no' answer with either 'yes' or 'no' without punctuation, such as full stops

        As a customer you would like to leave feedback of a recent purchase of a light bulb you made where a customer service
        rep was very helpful in finding the bulb with the correct fitting.

        If at any point the company's chatbot repeats itself then say the word 'FAIL'.

        If you have understood your role and the purpose of your conversation with the company's chatbot then say the word 'Hello'
        and nothing else.
      terminatingPhrases:
        pass: ["PASS"]
        fail: ["FAIL"]

Then run the AI test by pointing to the scenario file in the terminal:

web-messaging-tester ai tests/example.yml

For a slightly more detailed guide see: Let's test a Genesys chatbot with AI.

Using Google Vertex AI

  1. Create a Google Cloud Platform (GCP) account and enable access to Vertex AI
  2. Authenticate the machine running this testing tool with GCP
  3. Define a prompt to provide the model with context on how to behave during testing
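
For step 2, one common approach on a development machine is Application Default Credentials via the gcloud CLI (any authentication method supported by Google's client libraries should work):

gcloud auth application-default login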

The terminatingPhrases section defines the phrases you instruct PaLM 2 to say to pass or fail a test.

examples/cli-ai-tests/google-vertex-ai-example.yml

config:
  deploymentId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  region: xxxx.pure.cloud
  ai:
    provider: google-vertex-ai
    config:
      location: example-location
      project: example-gcp-project
      modelVersion: "002"
      examples:
        - input: "What would you like to do today?"
          output: "I would like to leave feedback, please"
scenarios:
  "Accept survey":
    setup:
      prompt: |
        I want you to play the role of a customer talking to a company's online chatbot. You must not
        break from this role, and all of your responses must be based on how a customer would realistically talk to a company's chatbot.

        To help you play the role of a customer consider the following points when writing a response:
        * Respond to questions with as few words as possible
        * Answer with the exact word when given options e.g. if asked to answer with either 'yes' or 'no' answer with either 'yes' or 'no' without punctuation, such as full stops

        As a customer you would like to leave feedback of a recent purchase of a light bulb you made where a customer service
        rep was very helpful in finding the bulb with the correct fitting.

        If at any point the company's chatbot repeats itself then say the word 'FAIL'.

        If you have understood your role and the purpose of your conversation with the company's chatbot then say the word 'Hello'
        and nothing else.
      terminatingPhrases:
        pass: ["PASS"]
        fail: ["FAIL"]

Then run the AI test by pointing to the scenario file in the terminal:

web-messaging-tester ai tests/example.yml

Example commands

$ web-messaging-tester scripted --help
Usage: web-messaging-tester scripted [options] <filePath>

Arguments:
  filePath                             Path of the YAML test-script file

Options:
  -id, --deployment-id <deploymentId>  Web Messenger Deployment's ID
  -r, --region <region>                Region of Genesys instance that hosts the Web Messenger Deployment
  -o, --origin <origin>                Origin domain used for restricting Web Messenger Deployment
  -p, --parallel <number>              Maximum scenarios to run in parallel (default: 1)
  -a, --associate-id                   Associate tests with their conversation ID.
                                       This requires the following environment variables to be set for an OAuth client
                                       with the role conversation:webmessaging:view:
                                       GENESYS_REGION
                                       GENESYSCLOUD_OAUTHCLIENT_ID
                                       GENESYSCLOUD_OAUTHCLIENT_SECRET (default: false)
  -fo, --failures-only                 Only output failures (default: false)
  -t, --timeout <number>               Seconds to wait for a response before
                                       failing the test (default: 10)
  -h, --help                           display help for command

Override the Deployment ID and Region defined in the test-script file:

web-messaging-tester scripted test-script.yaml -id 00000000-0000-0000-0000-000000000000 -r xxxx.pure.cloud

Run 10 scenarios in parallel:

web-messaging-tester scripted test-script.yaml --parallel 10
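
Only output failures and allow up to 20 seconds for each response, combining the documented options:

web-messaging-tester scripted test-script.yaml --failures-only --timeout 20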

Support

If you have any questions then please feel free to get in touch, e.g. by raising an issue on the repository.


genesys-web-messaging-tester's People

Contributors

dependabot[bot], sketchingdev


genesys-web-messaging-tester's Issues

Failure icon is a green tick

Since the latest release, when a test fails it is given a 'green tick'. Failing tests still cause the process to exit with an error, so this is purely a UI issue.

Allow multiple scenarios per test config

Could we re-use one deployment configuration for more than one scenario file? This might reduce a bit of overhead.

For example, as well as:

  - name: Deployment Config Name
    deploymentFlow: Inbound message flow name
    deploymentConfig: Messenger Deployment Config Name
    scenarios: scenario.yaml

Can we also support the following?

  - name: Deployment Config Name
    deploymentFlow: Inbound message flow name
    deploymentConfig: Messenger Deployment Config Name
    scenarios:
      - scenario-1.yaml
      - scenario-2.yaml

Out-of-order messages fail tests

On occasion, messages are received from the server out of order.

Idea

Gather all messages until a silence of x seconds, then order them by timestamp and emit them one after another.
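
A sketch of that idea in JavaScript (the function name and quiet period are illustrative, not the tool's API); it buffers incoming messages and, once nothing has arrived for quietPeriodMs, emits them sorted by the channel timestamp found in StructuredMessage payloads:

// Buffer messages until a quiet period has elapsed, then emit in timestamp order
function createReorderingBuffer(emit, quietPeriodMs = 1000) {
  let buffer = [];
  let timer;
  return (message) => {
    buffer.push(message);
    clearTimeout(timer);
    timer = setTimeout(() => {
      buffer
        .sort((a, b) => new Date(a.body.channel.time) - new Date(b.body.channel.time))
        .forEach(emit);
      buffer = [];
    }, quietPeriodMs);
  };
}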

Invalid 'say' value does not show validation error

When a test runs a 'say' command with a numeric value, the validation message is not logged.

config:
  region: x.pure.cloud
scenarios:
  "test name":
    - say: 1

yarn run v1.22.18
$ web-messaging-tester ./tests/example.yaml --deployment-id x
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

Process finished with exit code 1

Use ChatGPT to navigate a customer journey

Add the ability for a test to utilise ChatGPT to navigate a customer journey.

Why:

  • It would remove the complexity of navigating complex flows using expected responses
  • Theoretically journeys could be heavily modified without the tests needing to change

Below is an example of what the scenario might look like, with aiHaveConversationAbout being the operator indicating you want ChatGPT to have a conversation:

config:
  deploymentId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  region: xxxx.pure.cloud
scenarios:
  "Negative survey responses end with offer to talk to an agent":
    - aiHaveConversationAbout: a product that you're not satisfied with. You've just connected to their customer service chat-bot and will tell it you want to submit a feedback questionnaire, which you will fill in with your experience.
    # ChatGPT will then have the conversation and only quit when the following assertion is matched
    - waitForReplyContaining: Sorry to hear about your experience with us. Would you like to speak to an agent?

Implement follow-up prompts

Complete the implementation of the follow-up prompts:

config:
  deploymentId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  region: xxxx.pure.cloud
scenarios:
  "Accept survey":
    setup:
      prompt: |
        I want you to play the role of a customer talking to a company's online chatbot. You must not
        break from this role, and all of your responses must be based on how a customer would realistically talk to a company's chatbot.

        To help you play the role of a customer consider the following points when writing a response:
        * Respond to questions with as few words as possible
        * Answer with the exact word when given options e.g. if asked to answer with either 'yes' or 'no' answer with either 'yes' or 'no' without punctuation, such as full stops

        As a customer you would like to leave feedback of a recent purchase of a light bulb you made where a customer service
        rep was very helpful in finding the bulb with the correct fitting.

        If at any point the company's chatbot repeats itself then say the word 'FAIL'.

        If you have understood your role and the purpose of your conversation with the company's chatbot then say the word 'Hello'
        and nothing else.
      terminatingPhrases:
        pass: ["PASS"]
        fail: ["FAIL"]
    followUp:
      prompt: |
        Inspect the transcript below for spelling or grammatical errors and list them. Ignore anything
        said by the speaker called 'AI'.

        Transcript:
        %TRANSCRIPT%
      terminatingPhrases:
        pass: ["PASS"]
        fail: ["FAIL"]

Create logging files to automate testing scenarios

It currently isn't possible to store the results of a test run once it has been logged. At production scale, with a significant number of test cases, the ability to develop a test-script covering all the scenarios and store the outcome of each scenario in a results file, such as a CSV, would let users home in on the failed tests.

Anticipating that a normal test cycle runs multiple rounds, with defects resolved and another testing run performed, the tool should name the results file based on the run time of the script.

Further to this, being able to automate this via a cron job and then review the output file would allow an administrator to monitor the effect of NLU changes (addition/deletion of utterances) and perform regression testing to ensure the intent model does not regress as more phrases are added to the bot during the normal tuning process.
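
As an interim workaround for the scheduling part of this request, output redirection under cron can produce a dated log per run (the paths below are illustrative):

# crontab entry: run nightly at 02:00, keeping a dated log per run
0 2 * * * web-messaging-tester scripted /path/to/test-script.yaml > /var/log/wm-tester/run-$(date +\%Y\%m\%d).log 2>&1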

Investigate sporadic CI failures

Investigate why the CI job to test the example projects sporadically fails.

Whilst developing #37, the job that runs the examples sporadically failed due to timeouts waiting for a response.

Decline Survey (FAIL)
---------------------
You: hi
Them: Can we ask you some questions about your experience today?
You: No

Expected a message within 10 seconds containing:
 Maybe next time. Goodbye
Didn't receive any response

Example 1
Example 2

Add way to determine if conversation disconnected

Implement a way to determine if the conversation has been disconnected and cannot be continued.

Currently if the Conversation Disconnect configuration is set to 'Display conversation status and disconnect session' then the session is disconnected and any subsequent .sendText(...) results in the response:

{"type":"response","class":"string","code":409,"body":"Session is read-only, no new messages allowed"}

Update the API to surface the Disconnect presence in the StructuredMessage sent from the bot:

{
  "type": "message",
  "class": "StructuredMessage",
  "code": 200,
  "body": {
    "type": "Event",
    "direction": "Outbound",
    "id": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "channel": {
      "time": "2024-02-09T22:39:50.526Z",
      "type": "Private",
      "from": {
        "nickname": "Robot Friend"
      },
      "to": {}
    },
    "events": [
      {
        "eventType": "Presence",
        "presence": {
          "type": "Disconnect"
        }
      }
    ],
    "metadata": {
      "readOnly": "true"
    },
    "originatingEntity": "Bot"
  }
}
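
Based on the payload above, a consumer of the updated API could detect the disconnect along these lines (a sketch; the function name is illustrative, not the project's API):

// Detect a bot-initiated disconnect in a StructuredMessage payload
function isDisconnectEvent(message) {
  return (
    message.class === 'StructuredMessage' &&
    message.body.type === 'Event' &&
    (message.body.events || []).some(
      (e) => e.eventType === 'Presence' && e.presence && e.presence.type === 'Disconnect'
    )
  );
}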

Use ChatGPT as assertion to reduce test fragility

Integrate ChatGPT into test scenarios to allow more forgiving assertions, reducing fragility of tests.

Example of possible assertion function aiWaitFor:

config:
  deploymentId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  region: xxxx.pure.cloud
scenarios:
  "Asked for account ID":
    - say: hi
    - aiWaitFor: a request for my account number
    - say: 12345
    - aiWaitFor: a message thanking me for my account number, and asking me to wait whilst it is checked

Example prompt: (screenshot omitted)

Regular Expression match throwing error

The following section of a scenario causes WM Tester to throw an error:

    - waitForReplyMatching: Flow\.Metrics = Start\(.+\),AccountNo\(.+\),Example\(.+\)

wm-tester-regex-messages/genesys-web-messaging-tester/packages/genesys-web-messaging-tester/lib/Conversation.js:192
                const expectedText = caseInsensitive ? text.toLocaleLowerCase() : text;

Add configuration option to not show successful transcripts

When debugging a failing test it isn't especially useful to see the transcripts for tests that pass, since they make it harder to find the transcripts for the failed tests.

Could a configuration option be added to not include these in the test output, please?

Configurable retry option

Would it be possible to configure the number of retry attempts in the event of a "Didn't receive any response" failure?
