Use Case Example: Airline AI Help Desk

Let’s walk through a concrete example of how you can use Coval to test an Airline AI Help Desk agent. You can adapt this example to your own use case.

A guide to testing your AI agents using our web app

1

Create Account

See our Quickstart and create an account.

2

Create Test Set

There are 3 types of test sets that you can create:

  1. Transcript: Upload a conversation transcript representing a user–agent interaction.

  2. Scenario: Describe, in natural language, how a user might interact with your agent.

  3. Graph: Define a workflow as a graph to test your agent’s ability to follow a series of steps.

Let’s keep it simple for this first example and create a Scenario test set.

You can use the Generate More button to create more test cases.

3

Create Metrics

Let’s create a metric to evaluate the performance of our agent. We can use a Customizable Metric, one of our pre-built metrics, or create our own.

For this example, let’s create our own metric to evaluate whether the user received information about the check-in process, and use the existing Sentiment Analysis metric to evaluate the agent’s ability to communicate in a friendly and professional manner.
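To illustrate the idea behind the custom metric, here is a naive sketch of the check it performs: did the conversation surface check-in information? This keyword check is only a toy stand-in for intuition; it is not how Coval’s metric engine actually works:

```python
# Toy sketch of the custom metric's intent: did the user receive
# check-in information? This naive keyword check is illustrative
# only; it is NOT Coval's actual evaluation logic.
def received_checkin_info(transcript: str) -> bool:
    text = transcript.lower()
    return "check-in" in text or "check in" in text

example = (
    "User: How do I check in for my flight?\n"
    "Agent: Online check-in opens 24 hours before departure."
)
print(received_checkin_info(example))  # True for this example
```

In practice, a metric like this would typically be defined in natural language and judged over the full transcript rather than by keyword matching.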

4

Create a Template

Predefine settings for your evaluation, including test sets, metrics, and audio configurations.
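As a rough mental model, a template bundles those choices into one reusable configuration. The keys below are hypothetical assumptions for illustration, not Coval’s actual settings schema:

```python
# Hypothetical template configuration; keys are illustrative
# assumptions, NOT Coval's actual settings schema.
template = {
    "name": "Airline help desk - baseline",
    "test_set": "Check-in scenarios",
    "metrics": ["Check-in info received", "Sentiment Analysis"],
    "audio": {"voice": "default", "background_noise": False},
}

print(len(template["metrics"]))  # 2
```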

5

Launch Evaluation

Now, we’re one step away from testing our agent. Launch the evaluation by selecting your desired Template, adjusting the config as needed, and starting the run.

6

Review Results

Now, let’s review the results of our evaluation. Here is an example of what the results might look like for the Sentiment Analysis metric. In the Scorecard tab, you can check the results for the metric we created earlier.

Congratulations 🎉! You’ve just tested your agent.