Launch a new simulation run to evaluate an agent against test cases using a persona.
API key for authentication
The agent to test. Must be owned by the authenticated organization.
Example (22 characters): "gk3jK9mPq2xRt5vW8yZaBc"
The simulated persona to use for testing.
Example (22 characters): "hL4kL0nQr3ySt6vX9zAcDd"
The test set containing test cases to run.
Example (8 characters): "aB1cD2eF"
Optional list of metric IDs to evaluate. If omitted, the agent's default metrics are used.
Example (22-character IDs): ["iM5lM1oRs4zTu7wY0aBdEe", "jN6mN2pSt5aUv8xZ1bCeFf"]
Single mutation ID to run in addition to the base agent.
Mutually exclusive with mutation_ids. The base agent always runs.
Example (26 characters): "01ARZ3NDEKTSV4RRFFQ69G5FAV"
List of mutation IDs to run in addition to the base agent.
Mutually exclusive with mutation_id. Max 100 mutations.
The base agent always runs alongside all mutations.
Total simulations = test_cases × iterations × (1 + len(mutation_ids))
Example (up to 100 IDs, 26 characters each): ["01ARZ3NDEKTSV4RRFFQ69G5FAV", "01ARZ3NDEKTSV4RRFFQ69G5FAW"]
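The mutation rules and the simulation-count formula above can be checked client-side before launching a run. The following is a minimal sketch, not part of the API: the helper name `total_simulations` and its `test_cases` and `iterations` arguments are hypothetical, and it assumes a single `mutation_id` counts as one mutation alongside the base agent.

```python
from typing import Optional, Sequence

MAX_MUTATIONS = 100  # limit stated in the mutation_ids description


def total_simulations(
    test_cases: int,
    iterations: int,
    mutation_id: Optional[str] = None,
    mutation_ids: Optional[Sequence[str]] = None,
) -> int:
    """Hypothetical helper: estimate how many simulations a run will launch."""
    # The two fields are documented as mutually exclusive.
    if mutation_id is not None and mutation_ids is not None:
        raise ValueError("mutation_id and mutation_ids are mutually exclusive")

    if mutation_ids is not None:
        mutations = list(mutation_ids)
    elif mutation_id is not None:
        # Assumption: a single mutation behaves like a one-element list.
        mutations = [mutation_id]
    else:
        mutations = []

    if len(mutations) > MAX_MUTATIONS:
        raise ValueError(f"at most {MAX_MUTATIONS} mutations are allowed")

    # The base agent always runs, hence the "1 +" term.
    return test_cases * iterations * (1 + len(mutations))
```

For instance, 10 test cases with 3 iterations and the two mutation IDs from the example give 10 × 3 × (1 + 2) = 90 simulations.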
List of metric names that should evaluate the persona instead of the agent. Each entry should be the base metric name (e.g., 'latency', not 'persona:latency').
["latency", "interruption_rate"]
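Because entries must be base metric names, a caller holding prefixed names may want to normalize them first. This is a small sketch under that assumption; the helper `to_base_metric_names` is hypothetical and only strips a leading `persona:` prefix, as in the 'persona:latency' example above.

```python
from typing import Iterable, List

PERSONA_PREFIX = "persona:"  # prefix form mentioned in the field description


def to_base_metric_names(names: Iterable[str]) -> List[str]:
    """Hypothetical helper: strip a leading 'persona:' prefix from each name."""
    return [
        name[len(PERSONA_PREFIX):] if name.startswith(PERSONA_PREFIX) else name
        for name in names
    ]
```

Names that are already in base form pass through unchanged.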
Execution options for launching simulations.
Metadata for tracking and organization purposes.
Run launched successfully