This example demonstrates how to leverage test users to evaluate an airline help desk voice agent. We’ll assume the voice agent has access to an internal system that maintains customer accounts.

Goal

Ensure that the airline voice agent books users on the correct flights.

Step One: Configure Your Agent Attributes

The first step in testing the agent is to configure a list of test users that exist in the agent’s internal system. These users will be used throughout your test sets. Navigate to the Agent Details page and add the following attributes:
{
  "qa_accounts": {
    "user1": {
      "tier": "platinum",
      "miles": 100000,
      "credit_card": "379923037966854",
      "user_token": "duhfsaihd1234567654323456789"
    },
    "user2": {
      "tier": "standard",
      "miles": 8,
      "credit_card": "4134823389064963",
      "user_token": "8976890dfaoisfuapsd80873248179"
    }
  }
}
Notice that user1 is a platinum member while user2 is a standard member. This allows us to compare the agent’s behavior between different user tiers.

Step Two: Create Booking Test Cases

Now we can use these users in test cases. Let’s examine the first test case configuration.

Setting Up Test Case Metadata

The goal of the metadata is to store values we need for deterministic validation. When we create metrics later, we’ll need to know the exact flight path (in airport codes) to perform simple comparisons on the ticket. For this test case, we configure the following metadata:
  • source: SFO
  • destination: LAX
  • user: user1
The user field identifies the account the flight is booked under, so we can later verify that the booking was made for the correct account.

Test Case Prompt

You are calling an airline help desk.

Book a flight from {{test_case.source}} to {{test_case.destination}}.

Use your credit card: {{agent.qa_accounts.user1.credit_card}}
Referencing the test case metadata and agent attributes in the prompt keeps everything in sync: each value is defined in one place, so you only have to change it there. The placeholders are then filled in, and the prompt is ultimately processed as:
You are calling an airline help desk.

Book a flight from SFO to LAX.

Use your credit card: 379923037966854
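
To make the substitution concrete, here is a minimal Python sketch of how `{{...}}` placeholders could be resolved against nested data. It is an illustration only, not the platform's actual renderer: the `resolve` and `render` helpers and the `PLACEHOLDER` pattern are hypothetical names, and the dictionaries simply mirror the agent attributes and test case metadata shown above.

```python
import re

# Stand-ins that mirror the agent attributes and test case metadata above.
AGENT = {
    "qa_accounts": {
        "user1": {"tier": "platinum", "credit_card": "379923037966854"},
    }
}
TEST_CASE = {"source": "SFO", "destination": "LAX", "user": "user1"}

# Hypothetical pattern: matches {{dotted.path}} with optional inner spaces.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def resolve(path, context):
    """Walk a dotted path such as 'agent.qa_accounts.user1.credit_card'."""
    value = context
    for key in path.split("."):
        value = value[key]  # raises KeyError if the path does not exist
    return value


def render(template, context):
    """Replace every placeholder in the template with its value from context."""
    return PLACEHOLDER.sub(lambda m: str(resolve(m.group(1), context)), template)


prompt = (
    "Book a flight from {{test_case.source}} to {{test_case.destination}}.\n"
    "Use your credit card: {{agent.qa_accounts.user1.credit_card}}"
)
rendered = render(prompt, {"agent": AGENT, "test_case": TEST_CASE})
print(rendered)
```

Failing loudly on an unknown path (the `KeyError` here) is a deliberate choice in this sketch: a misspelled placeholder is easier to catch before a simulation than after.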
You can create many permutations of this test case by varying the source, destination, user, and so on.
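
If you prefer to generate those permutations programmatically before entering them, a short Python sketch along these lines could enumerate the metadata for each test case. The airport list is an assumption made for illustration (JFK does not appear in this example), and `build_test_cases` is a hypothetical helper, not part of any SDK.

```python
from itertools import product

# Hypothetical inputs: JFK is added only to make the permutations interesting.
AIRPORTS = ["SFO", "LAX", "JFK"]
USERS = ["user1", "user2"]


def build_test_cases(airports, users):
    """Return one metadata dict per (source, destination, user) combination."""
    cases = []
    for source, destination, user in product(airports, airports, users):
        if source == destination:
            continue  # a flight from an airport to itself is not a useful case
        cases.append({"source": source, "destination": destination, "user": user})
    return cases


test_cases = build_test_cases(AIRPORTS, USERS)
for case in test_cases:
    print(case)
```

Note that the prompt shown earlier hardcodes user1's credit card, so a test case for user2 would need its prompt to reference that account's attributes instead.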

Step Three: Create an API State Match Metric

In Step Two, you created a test set with many user and ticket combinations. After a simulation, we want to check whether the airline’s internal database has a ticket for our user. To do this, navigate to the metric creation page and create an API State Match metric.

In our example, the airline has an API that lets us see all tickets for a specific user. It takes a userId and a user_token, and returns a list of tickets. After the simulation, we will call the API with {{test_case.user}}, which will be rendered as user1 for our first test case.
{
  "user": "user1"
}
We will receive the response:
{
  "tickets": [
    {
      "source": "SFO",
      "destination": "LAX",
      "date_booked": "12/12/25",
      "confirmed": true
    }
  ]
}
Use the match path tickets[source={{test_case.source}},destination={{test_case.destination}}].confirmed. For a given test case this will be rendered as, for example, tickets[source=SFO,destination=LAX].confirmed. It selects the first ticket that matches both source and destination, then checks that ticket's confirmed field. If the ticket exists and is confirmed, the metric will return MATCH. If the ticket exists but is not confirmed, it will return DIFF. If the ticket doesn’t exist, it will return NOT_FOUND.
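
The three outcomes can be illustrated with a small Python sketch that evaluates a rendered match path against the API response. This is a simplified model of the behavior described above, not the metric's real implementation: the `evaluate` function, the path grammar it accepts, and the assumption that the expected value of confirmed is `True` are all hypothetical.

```python
import re

# Hypothetical grammar: <list_key>[k1=v1,k2=v2].<field>
PATH = re.compile(r"^(\w+)\[([^\]]*)\]\.(\w+)$")


def evaluate(response, match_path, expected=True):
    """Return 'MATCH', 'DIFF', or 'NOT_FOUND' for a rendered match path."""
    m = PATH.match(match_path)
    if m is None:
        raise ValueError(f"unsupported match path: {match_path}")
    list_key, filter_expr, field = m.groups()
    # Turn "source=SFO,destination=LAX" into {"source": "SFO", "destination": "LAX"}.
    filters = dict(pair.split("=", 1) for pair in filter_expr.split(","))
    for item in response.get(list_key, []):
        # Select the first item whose fields equal every filter value.
        if all(str(item.get(key)) == value for key, value in filters.items()):
            return "MATCH" if item.get(field) == expected else "DIFF"
    return "NOT_FOUND"


response = {
    "tickets": [
        {"source": "SFO", "destination": "LAX",
         "date_booked": "12/12/25", "confirmed": True}
    ]
}
result = evaluate(response, "tickets[source=SFO,destination=LAX].confirmed")
print(result)
```

Because only the first matching ticket is inspected, a duplicate unconfirmed booking listed before a confirmed one would yield DIFF in this sketch.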

Step Four: Run Your Simulations!

Now, we have all the building blocks to run our simulations.