Run Management
list_runs
List evaluation runs with filtering and pagination.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `page_size` | number | No | Results per page (1-100, default: 50) |
| `page_token` | string | No | Pagination token from previous response |
| `order_by` | string | No | Sort order (e.g., `-create_time` for newest first) |
| `filter` | string | No | Filter expression (e.g., `status="COMPLETED"`) |

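
Paging through `list_runs` can be sketched as below. This is a minimal sketch under stated assumptions, not an official client: `call_tool` is a hypothetical callable standing in for however your client invokes a tool, and the response keys `runs` and `next_page_token` are assumptions, since the response shape is not documented in this section.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical transport: takes a tool name and an arguments dict, returns a response dict.
CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_runs(
    call_tool: CallTool,
    page_size: int = 50,
    order_by: Optional[str] = None,
    filter_expr: Optional[str] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield runs one at a time, following page tokens until none is returned."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")

    args: Dict[str, Any] = {"page_size": page_size}
    if order_by is not None:
        args["order_by"] = order_by
    if filter_expr is not None:
        args["filter"] = filter_expr

    while True:
        response = call_tool("list_runs", args)
        # ASSUMPTION: results are under "runs" and the token under "next_page_token".
        yield from response.get("runs", [])
        token = response.get("next_page_token")
        if not token:
            break
        # Keep the same filter and ordering; only the page token changes.
        args = {**args, "page_token": token}
```

The same loop should apply to the other `list_*` tools, which share the `page_size`, `page_token`, `order_by`, and `filter` parameters; only the tool name and the assumed result key would change.
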
get_run
Get detailed information about a specific run.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `run_id` | string | Yes | The unique run ID |

create_run
Launch a new evaluation run.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `agent_id` | string | Yes | Agent ID from `list_agents` |
| `persona_id` | string | Yes | Persona ID from `list_personas` |
| `test_set_id` | string | Yes | Test set ID from `list_test_sets` |
| `metric_ids` | string[] | No | Specific metrics to evaluate |
| `options.iteration_count` | number | No | Iterations per test case (1-10, default: 1) |
| `options.concurrency` | number | No | Parallel simulations (1-5, default: 1) |
| `metadata` | object | No | Custom metadata for tracking |

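
The documented ranges for `create_run` can be checked before the call is made. The helper below is an illustrative sketch; `build_create_run_args` is a hypothetical name, and it assumes that the dotted parameter names (`options.iteration_count`, `options.concurrency`) map to a nested `options` object.

```python
from typing import Any, Dict, List, Optional


def build_create_run_args(
    agent_id: str,
    persona_id: str,
    test_set_id: str,
    metric_ids: Optional[List[str]] = None,
    iteration_count: int = 1,
    concurrency: int = 1,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a create_run argument dict, enforcing the documented ranges."""
    required = (
        ("agent_id", agent_id),
        ("persona_id", persona_id),
        ("test_set_id", test_set_id),
    )
    for name, value in required:
        if not value:
            raise ValueError(f"{name} is required")
    if not 1 <= iteration_count <= 10:
        raise ValueError("options.iteration_count must be between 1 and 10")
    if not 1 <= concurrency <= 5:
        raise ValueError("options.concurrency must be between 1 and 5")

    args: Dict[str, Any] = {
        "agent_id": agent_id,
        "persona_id": persona_id,
        "test_set_id": test_set_id,
        # ASSUMPTION: the dotted names in the table denote a nested "options" object.
        "options": {"iteration_count": iteration_count, "concurrency": concurrency},
    }
    # Optional fields are only included when the caller supplies them.
    if metric_ids:
        args["metric_ids"] = list(metric_ids)
    if metadata:
        args["metadata"] = dict(metadata)
    return args
```

For example, `build_create_run_args("agent-1", "persona-1", "set-1", iteration_count=3)` returns a dict with `options` set to `{"iteration_count": 3, "concurrency": 1}`; the IDs here are placeholders, not real resources.
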
Agent Management
list_agents
List all configured agents.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `page_size` | number | No | Results per page (1-100, default: 50) |
| `page_token` | string | No | Pagination token |
| `order_by` | string | No | Sort order |
| `filter` | string | No | Filter by `model_type`, `display_name`, etc. |

get_agent
Get detailed configuration for a specific agent.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `agent_id` | string | Yes | Agent ID from `list_agents` |

create_agent
Create a new agent configuration.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `display_name` | string | Yes | Human-readable name (1-200 chars) |
| `model_type` | string | Yes | Agent type (see below) |
| `phone_number` | string | No | E.164 format for voice agents |
| `endpoint` | string | No | Webhook or WebSocket URL |
| `prompt` | string | No | System prompt/instructions |
| `metadata` | object | No | Custom metadata |

Supported `model_type` values:

- `MODEL_TYPE_VOICE` - Inbound voice
- `MODEL_TYPE_OUTBOUND_VOICE` - Outbound voice
- `MODEL_TYPE_CHAT` - Chat/text
- `MODEL_TYPE_SMS` - SMS messaging
- `MODEL_TYPE_WEBSOCKET` - WebSocket

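
A client-side check of `create_agent` arguments might look like the sketch below. It validates `model_type` against the values listed above, the 1-200 character limit on `display_name`, and the E.164 shape of `phone_number`. The helper name `build_create_agent_args` is hypothetical, and the regular expression is a common approximation of E.164 (a `+` followed by at most 15 digits, the first non-zero), not something this reference specifies.

```python
import re
from typing import Any, Dict, Optional

# Values copied from the model type list above.
MODEL_TYPES = frozenset({
    "MODEL_TYPE_VOICE",
    "MODEL_TYPE_OUTBOUND_VOICE",
    "MODEL_TYPE_CHAT",
    "MODEL_TYPE_SMS",
    "MODEL_TYPE_WEBSOCKET",
})

# ASSUMPTION: approximate E.164 check, "+" then 2 to 15 digits with a non-zero first digit.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def build_create_agent_args(
    display_name: str,
    model_type: str,
    phone_number: Optional[str] = None,
    endpoint: Optional[str] = None,
    prompt: Optional[str] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a create_agent argument dict after basic validation."""
    if not 1 <= len(display_name) <= 200:
        raise ValueError("display_name must be 1-200 characters")
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model_type: {model_type!r}")
    if phone_number is not None and not E164_PATTERN.match(phone_number):
        raise ValueError("phone_number must be in E.164 format, e.g. +14155550100")

    args: Dict[str, Any] = {"display_name": display_name, "model_type": model_type}
    optional = {
        "phone_number": phone_number,
        "endpoint": endpoint,
        "prompt": prompt,
        "metadata": metadata,
    }
    # Only send optional fields that were actually provided.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args
```

Validating locally keeps obviously malformed requests from reaching the service, though the service remains the authority on what it accepts.
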
update_agent
Update an existing agent configuration.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `agent_id` | string | Yes | Agent to update |
| `display_name` | string | No | New name |
| `phone_number` | string | No | New phone number |
| `endpoint` | string | No | New endpoint URL |
| `prompt` | string | No | New system prompt |
| `metadata` | object | No | New metadata |

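
Because every field except `agent_id` is optional, an update can carry only the fields being changed. The sketch below builds such a partial payload; `build_update_agent_args` is a hypothetical helper, and it assumes (the reference does not say so explicitly) that omitted fields are left unchanged by the service.

```python
from typing import Any, Dict

# Fields the update_agent table lists as changeable; model_type is not among them.
UPDATABLE_FIELDS = ("display_name", "phone_number", "endpoint", "prompt", "metadata")


def build_update_agent_args(agent_id: str, **changes: Any) -> Dict[str, Any]:
    """Build an update_agent argument dict containing only the supplied changes."""
    if not agent_id:
        raise ValueError("agent_id is required")
    unknown = set(changes) - set(UPDATABLE_FIELDS)
    if unknown:
        raise ValueError(f"fields not updatable: {sorted(unknown)}")

    args: Dict[str, Any] = {"agent_id": agent_id}
    # ASSUMPTION: fields left out of the payload keep their current values.
    args.update({key: value for key, value in changes.items() if value is not None})
    if len(args) == 1:
        raise ValueError("no fields to update")
    return args
```

For example, `build_update_agent_args("agent-1", prompt="Be concise.")` produces a payload with just `agent_id` and `prompt`, where `agent-1` is a placeholder ID.
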
Test Set Management
list_test_sets
List all test sets available for evaluation.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `page_size` | number | No | Results per page (1-100, default: 50) |
| `page_token` | string | No | Pagination token |
| `order_by` | string | No | Sort order |
| `filter` | string | No | Filter expression |

get_test_set
Get detailed information about a test set.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `test_set_id` | string | Yes | Test set ID from `list_test_sets` |

create_test_set
Create a new test set.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `display_name` | string | Yes | Test set name (1-100 chars) |
| `slug` | string | No | URL-friendly ID (auto-generated if omitted) |
| `description` | string | No | Test set description |
| `test_set_type` | string | No | `DEFAULT`, `SCENARIO`, `TRANSCRIPT`, or `WORKFLOW` |
| `test_set_metadata` | object | No | Configuration metadata |
| `parameters` | object | No | Test parameterization |

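
The four `test_set_type` values lend themselves to an enumeration. The following sketch, with the hypothetical names `SetType` and `build_create_test_set_args`, checks the type and the 1-100 character name limit; it leaves `slug` out unless the caller provides one, relying on the documented auto-generation.

```python
from enum import Enum
from typing import Any, Dict, Optional


class SetType(str, Enum):
    """The test_set_type values listed in the table above."""
    DEFAULT = "DEFAULT"
    SCENARIO = "SCENARIO"
    TRANSCRIPT = "TRANSCRIPT"
    WORKFLOW = "WORKFLOW"


def build_create_test_set_args(
    display_name: str,
    slug: Optional[str] = None,
    description: Optional[str] = None,
    test_set_type: Optional[str] = None,
    test_set_metadata: Optional[Dict[str, Any]] = None,
    parameters: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a create_test_set argument dict after basic validation."""
    if not 1 <= len(display_name) <= 100:
        raise ValueError("display_name must be 1-100 characters")

    args: Dict[str, Any] = {"display_name": display_name}
    if test_set_type is not None:
        # SetType(...) raises ValueError for anything outside the four known values.
        args["test_set_type"] = SetType(test_set_type).value
    optional = {
        "slug": slug,  # omitted when None so the service can generate one
        "description": description,
        "test_set_metadata": test_set_metadata,
        "parameters": parameters,
    }
    args.update({key: value for key, value in optional.items() if value is not None})
    return args
```
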
Test Case Management
list_test_cases
List test cases with optional filtering by test set.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `test_set_id` | string | No | Filter by test set ID |
| `page_size` | number | No | Results per page (1-100, default: 50) |
| `page_token` | string | No | Pagination token |
| `order_by` | string | No | Sort order |
| `filter` | string | No | Filter expression |

get_test_case
Get detailed information about a test case.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `test_case_id` | string | Yes | Test case ID from `list_test_cases` |

create_test_case
Create a new test case in a test set.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `test_set_id` | string | Yes | Test set to add the case to |
| `display_name` | string | Yes | Test case name |
| `description` | string | No | Test case description |
| `input` | object | No | Input data for the test |
| `expected_output` | object | No | Expected output for validation |
| `metadata` | object | No | Custom metadata |

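
Since `input`, `expected_output`, and `metadata` are typed as objects, it can help to confirm they are JSON-serializable dictionaries before sending them. The sketch below does only that; `build_create_test_case_args` is a hypothetical helper, and the schema inside each object is not documented here, so the function makes no assumption about their keys.

```python
import json
from typing import Any, Dict, Optional


def build_create_test_case_args(
    test_set_id: str,
    display_name: str,
    description: Optional[str] = None,
    input_data: Optional[Dict[str, Any]] = None,
    expected_output: Optional[Dict[str, Any]] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a create_test_case argument dict, checking the object-typed fields."""
    if not test_set_id:
        raise ValueError("test_set_id is required")
    if not display_name:
        raise ValueError("display_name is required")

    args: Dict[str, Any] = {"test_set_id": test_set_id, "display_name": display_name}
    if description is not None:
        args["description"] = description

    objects = (
        ("input", input_data),
        ("expected_output", expected_output),
        ("metadata", metadata),
    )
    for key, value in objects:
        if value is None:
            continue
        if not isinstance(value, dict):
            raise TypeError(f"{key} must be an object (dict)")
        # json.dumps raises TypeError for values JSON cannot represent (e.g. a set).
        json.dumps(value)
        args[key] = value
    return args
```

The parameter is named `input_data` rather than `input` only to avoid shadowing the Python built-in; the resulting payload still uses the documented `input` key.
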
update_test_case
Update an existing test case.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `test_case_id` | string | Yes | Test case to update |
| `display_name` | string | No | New name |
| `description` | string | No | New description |
| `input` | object | No | New input data |
| `expected_output` | object | No | New expected output |
| `metadata` | object | No | New metadata |

Metrics
list_metrics
List available evaluation metrics.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `page_size` | number | No | Results per page (1-100, default: 50) |
| `page_token` | string | No | Pagination token |
| `order_by` | string | No | Sort order |
| `filter` | string | No | Filter expression |
| `include_builtin` | boolean | No | Include built-in metrics |

get_metric
Get detailed configuration for a specific metric.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `metric_id` | string | Yes | Metric ID from `list_metrics` |

Personas
list_personas
List available simulated personas for testing.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `page_size` | number | No | Results per page (1-100, default: 50) |
| `page_token` | string | No | Pagination token |
| `order_by` | string | No | Sort order |
| `filter` | string | No | Filter expression |

get_persona
Get detailed configuration for a specific persona.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `persona_id` | string | Yes | Persona ID from `list_personas` |

