Voice & Video AI Agents Hackathon
Build AI video and voice agents.
Make the most out of the hackathon
Join Coval Slack
Join our Slack to get help and connect with our engineering team.
Register for the hackathon
Join us to build with the power of voice and video AI.
Coval Prize:
Cash Prize: $500
Challenge
Test your AI agents using your own metrics and create dashboards to display the results.
Use case Example: Airline AI Help Desk
Let’s walk through a concrete example of how you can use Coval to test an Airline AI Help Desk agent. You can adapt this example to your own hackathon use case.
Guide to testing your AI agents using our web app
Create Account
See our Quickstart and create an account.
Create Test Set
There are three types of test sets that you can create:
Transcript: Uploading a conversation (transcript) representing a user-agent interaction.
Scenario: A natural-language scenario of how a user might interact with your agent.
Graph: Define a workflow as a graph to test your agent’s ability to follow a series of steps.
Let’s keep it simple for this first example and create a Scenario test set.
You can use the Generate More button to create more test cases.
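If it helps to draft scenarios before entering them in the web app, here is a minimal sketch of what a few scenario test cases for the airline example might look like. The dictionary fields (`name`, `scenario`) are our own illustration, not Coval’s schema; in the web app you simply enter the scenario text.

```python
# Illustrative only: scenarios are natural-language descriptions of how a
# user might interact with the agent. The dict structure below is an
# assumption for this sketch, not Coval's test set format.
airline_scenarios = [
    {
        "name": "check-in question",
        "scenario": (
            "A traveler calls the airline help desk to ask how online "
            "check-in works and when it opens before their flight."
        ),
    },
    {
        "name": "lost baggage",
        "scenario": (
            "A frustrated traveler reports that their bag did not arrive "
            "and wants to know how to file a claim."
        ),
    },
    {
        "name": "flight change",
        "scenario": (
            "A traveler asks whether they can move to an earlier flight "
            "on the same day and what the change fee would be."
        ),
    },
]

for case in airline_scenarios:
    print(f"{case['name']}: {case['scenario']}")
```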
Create Simulator
Now, suppose you want to test your agent not only on the inputs provided in the test set, but also on variations of those inputs. Additionally, you may want to give the Coval testing agent a persona and more details about the context and testing scenario. To test your agent in a more realistic environment, you can create a Simulator.
Let’s go back to our example and create a simulator that gives the Coval user agent a confused-traveler persona.
You can find more about simulators here.
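As a mental model, a simulator bundles a persona, extra context, and variation settings on top of your test set. The sketch below is illustrative only; the field names are assumptions for this example, not Coval’s configuration schema.

```python
# A sketch of what a simulator conceptually bundles: a persona for the
# Coval testing agent, extra context, and how many variations of each
# test case to generate. Field names are assumptions, not Coval's schema.
confused_traveler_simulator = {
    "name": "confused-traveler",
    "persona": (
        "A first-time flyer who is anxious, easily confused by airline "
        "jargon, and frequently asks for clarification."
    ),
    "context": (
        "The traveler has a flight tomorrow morning and has never used "
        "online check-in before."
    ),
    "variations_per_test_case": 3,  # rephrasings of each base scenario
}

print(confused_traveler_simulator["persona"])
```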
Create Metrics
Now, let’s create a metric to evaluate the performance of our agent. We can use a Customizable Metric, pick one of our pre-built metrics, or create our own. For this example, let’s create our own metric to evaluate whether the user agent was able to receive information about the check-in process, and use the existing Sentiment Analysis metric to evaluate the agent’s ability to communicate in a friendly and professional manner.
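To make the custom metric concrete, here is a minimal sketch of the kind of check it performs, assuming the metric receives the full conversation transcript as text. A simple keyword check stands in for the LLM-judged evaluation configured in the web app; this is not Coval’s implementation.

```python
# A minimal stand-in for a custom metric: pass if the agent actually
# explained the check-in process. Real Coval metrics are configured in
# the web app; this keyword check is only an illustration of the idea.
def check_in_info_provided(transcript: str) -> bool:
    """Return True if the transcript covers the required check-in topics."""
    required_topics = ["check-in", "boarding pass"]
    transcript_lower = transcript.lower()
    return all(topic in transcript_lower for topic in required_topics)

example_transcript = (
    "Agent: Online check-in opens 24 hours before departure. "
    "Once you check in, you'll receive your boarding pass by email."
)
print(check_in_info_provided(example_transcript))  # True
```

Launching Evaluation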
Now, we’re one step away from testing our agent. Let’s launch the evaluation by pressing the Run Evaluation button on our test set and filling in the details for our evaluation. Don’t forget to select the simulator and metrics that we created.
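Conceptually, an evaluation run loops over your test cases, lets the simulator’s persona converse with your agent, and scores each resulting transcript with every selected metric. The sketch below illustrates that loop; `simulate_conversation` is a hypothetical stand-in for Coval’s hosted simulation, not a real API.

```python
# Conceptual sketch of an evaluation run: simulate each test case, then
# score the resulting transcript with every selected metric.
def run_evaluation(test_cases, simulate_conversation, metrics):
    results = []
    for case in test_cases:
        transcript = simulate_conversation(case)  # persona vs. your agent
        scores = {name: metric(transcript) for name, metric in metrics.items()}
        results.append({"test_case": case["name"], "scores": scores})
    return results

# Tiny demo with a canned transcript in place of a live simulation.
fake_sim = lambda case: "Agent: Check-in opens 24 hours early; boarding pass emailed."
metrics = {"check_in_info": lambda t: "check-in" in t.lower()}
print(run_evaluation([{"name": "check-in question"}], fake_sim, metrics))
```

Review Results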
Now, let’s review the results of our evaluation. Here is an example of what the results might look like for the Sentiment Analysis metric. In the Scorecard tab, you can check the results for the metric we created earlier.
Congratulations 🎉! You’ve just tested your agent.