A test set is a structured collection of test cases designed to evaluate specific functionalities, workflows, or scenarios in a project.

To define a test set, specify the following attributes:

  • Test Set Name

  • Test Set Owner

  • Test Set Description

  • Type of Test Set:

Type | Description | Example Test Case
Scenario | Natural language descriptions of user intents and goals. | Call agent to order a pizza
Transcript | Conversation history between user and assistant to re-simulate. Needs to be in OpenAI transcript format, structured as a JSON array of objects. | [{"role": "assistant", "content": "Welcome to X Restaurant. How may I assist you today?"}, {"role": "user", "content": "I would like to order some pizza."}]
Workflow | Graph showing the workflow you would like to test. | Greeting the customer -> Ask for Pizza -> Confirm request
Stream | Audio stream | -
Default | Generic test set type. | Default value.
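
The Transcript format described in the table can be checked locally before uploading. The sketch below is illustrative only: `validate_transcript` is a hypothetical helper, and the checks (a JSON array of objects with string `role` and `content` fields) are assumptions drawn from the example above, not a statement of what the service enforces.

```python
import json


def validate_transcript(raw: str) -> list:
    """Parse a transcript string and check its basic shape.

    Assumed shape (taken from the table example): a JSON array of
    objects, each with string "role" and "content" fields.
    """
    # json.loads raises json.JSONDecodeError (a ValueError) on invalid JSON,
    # for example when curly quotes are used instead of straight quotes.
    turns = json.loads(raw)
    if not isinstance(turns, list):
        raise ValueError("transcript must be a JSON array")
    for i, turn in enumerate(turns):
        if not isinstance(turn, dict):
            raise ValueError(f"turn {i} must be a JSON object")
        for key in ("role", "content"):
            if not isinstance(turn.get(key), str):
                raise ValueError(f"turn {i} needs a string '{key}' field")
    return turns


transcript = validate_transcript(
    '[{"role": "assistant", "content": "Welcome to X Restaurant. How may I assist you today?"},'
    ' {"role": "user", "content": "I would like to order some pizza."}]'
)
print(len(transcript), "turns")
```

Note that JSON requires straight double quotes; a transcript pasted with typographic quotes fails at the parsing step.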

Each test case consists of:

  • Input: Uses the format of the test set's type (see the examples in the table above).

  • Expected Output (Optional): Specify the desired test output if you plan to run metrics for result comparison.

Create a Test Set

curl --request POST \
--url "https://api.coval.dev/dataset" \
--header "x-api-key: <COVAL_API_KEY>" \
--header "Content-Type: application/json" \
--data '{
  "dataset_name": "Restaurant transcripts",
  "dataset_metadata": {
    "additional_info": "Some additional info."
  },
  "description": "Transcript of ordering a pizza from our restaurant.",
  "test_set_type": "Transcript"
}'
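
The same request can be assembled in Python. This is a minimal sketch using only the standard library: `build_create_request` and `TEST_SET_TYPES` are hypothetical names, and only the `"Transcript"` value is confirmed by the request above; the other type values are assumed to match the names in the table. The request is built but not sent, so the payload can be inspected first.

```python
import json
import urllib.request

API_URL = "https://api.coval.dev/dataset"

# "Transcript" appears in the curl example; the remaining values are
# assumed to match the type names listed in the table.
TEST_SET_TYPES = {"Scenario", "Transcript", "Workflow", "Stream", "Default"}


def build_create_request(api_key, name, description, test_set_type, metadata=None):
    """Build (but do not send) a POST request that creates a test set."""
    if test_set_type not in TEST_SET_TYPES:
        raise ValueError(f"unknown test set type: {test_set_type!r}")
    payload = {
        "dataset_name": name,
        "description": description,
        "test_set_type": test_set_type,
    }
    if metadata is not None:
        payload["dataset_metadata"] = metadata
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request(
    "<COVAL_API_KEY>",
    "Restaurant transcripts",
    "Transcript of ordering a pizza from our restaurant.",
    "Transcript",
    {"additional_info": "Some additional info."},
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```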

Add a Test Case

curl --request POST \
--url "https://api.coval.dev/dataset/<TEST_SET_ID>" \
--header "x-api-key: $COVAL_API_KEY" \
--header "Content-Type: application/json" \
--data '{
  "body": {
    "action": "append",
    "input_strs": [
      {"role": "assistant", "content": "Welcome to X Restaurant. How may I assist you today?"},
      {"role": "user", "content": "I would like to order some pizza."}
    ],
    "output_strs": [
      {"role": "assistant", "content": "Welcome to X Restaurant. How may I assist you today?"},
      {"role": "user", "content": "I would like to order some pizza."}
    ]
  }
}'
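
The request body above can also be generated programmatically, which helps when adding many test cases. The sketch below mirrors the `body`, `action`, `input_strs`, and `output_strs` fields from the curl example. `build_append_payload` is a hypothetical helper, and leaving out `output_strs` when no expected output is given is an assumption based on that field being described as optional.

```python
import json


def build_append_payload(input_turns, expected_output=None):
    """Build the JSON body for appending one test case to a test set."""
    body = {"action": "append", "input_strs": list(input_turns)}
    if expected_output is not None:
        # Assumption: the key can simply be omitted when there is
        # no expected output to compare against.
        body["output_strs"] = list(expected_output)
    return {"body": body}


payload = build_append_payload(
    [
        {"role": "assistant", "content": "Welcome to X Restaurant. How may I assist you today?"},
        {"role": "user", "content": "I would like to order some pizza."},
    ]
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be passed to curl with `--data`, or sent with any HTTP client using the same headers as the request above.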

See the API documentation for more details.