Overview

This guide documents how to upload transcripts to Coval for conversation evaluation. Coval supports multiple transcript formats.

Required Fields

Essential transcript fields:
  • role: Must be one of "user", "assistant", "system", or "tool"
  • content: The actual message content (string)
  • beginning: Index position in the conversation (number)
  • end: End position in the conversation (number)

Optional Fields

  • start_timestamp: Unix timestamp for when the message started (number)
  • end_timestamp: Unix timestamp for when the message ended (number)
  • error: Error message if transcription failed (string)
  • transcriptionError: Boolean flag indicating transcription error
  • name: Name identifier for the message (string)
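As a rough illustration of the fields listed above, the sketch below assembles a single message dictionary. `make_message` is a hypothetical helper, not part of any Coval SDK, and rejecting unknown optional keys is a choice made for this sketch rather than documented behavior.

```python
import json

ALLOWED_ROLES = {"user", "assistant", "system", "tool"}
# Optional field names taken from the list above.
OPTIONAL_FIELDS = {"start_timestamp", "end_timestamp", "error", "transcriptionError", "name"}


def make_message(role, content, beginning, end, **optional):
    """Build one transcript message from the required fields plus any optional ones."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        # Assumption for this sketch: unknown keys are treated as mistakes.
        raise ValueError(f"unknown optional fields: {sorted(unknown)}")
    message = {"role": role, "content": content, "beginning": beginning, "end": end}
    message.update(optional)
    return message


# Illustrative values only.
message = make_message("user", "Where is my order?", beginning=0, end=1, name="caller")
print(json.dumps(message, indent=2))
```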

Supported Formats

Tool Call Messages

For tool call messages, the content field should contain a JSON string that can be parsed to extract tool information.

Tool Call Content Examples

{
  "role": "tool",
  "content": "{\"tool\": \"waiting_on_customer\"}",
  "start_time": 12.0,
  "end_time": 12.5,
  "beginning": 3,
  "end": 4
}

Alternative Tool Call Formats

The system supports these formats in the content field:
  1. Function format: {"function": "name", "arguments": {...}}
  2. Tool format: {"tool": "name", ...} (other fields become arguments)
  3. Custom backend format: {"tool_call": "name", "arguments": {...}}
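The three shapes above could be reduced to a single (name, arguments) pair along the following lines. This is a minimal sketch, not Coval's actual parser: `parse_tool_call` is a hypothetical name, and it assumes all three shapes arrive as valid JSON with quoted keys.

```python
import json


def parse_tool_call(content):
    """Return (tool_name, arguments) for a tool-call content string, or None."""
    try:
        data = json.loads(content)
    except (TypeError, ValueError):
        return None  # not a string, or not valid JSON
    if not isinstance(data, dict):
        return None
    if "function" in data:
        # Function format: explicit "arguments" object.
        return data["function"], data.get("arguments", {})
    if "tool_call" in data:
        # Custom backend format: explicit "arguments" object.
        return data["tool_call"], data.get("arguments", {})
    if "tool" in data:
        # Tool format: every other key becomes an argument.
        return data["tool"], {k: v for k, v in data.items() if k != "tool"}
    return None
```

For the example message shown earlier, `parse_tool_call("{\"tool\": \"waiting_on_customer\"}")` yields the tool name with an empty argument dictionary.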

Validation Rules

Content Limits

Important limits to keep in mind:
  • Individual message content: Maximum 1,000 characters
  • Total transcript size: Maximum 40MB
  • Number of messages: Maximum 1,000 messages per transcript
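A pre-upload check against these limits might look like the sketch below. The names are hypothetical, and two details are assumptions: that 40MB means 40 × 1024 × 1024 bytes, and that transcript size is measured on the UTF-8 encoded JSON. The guide does not specify either.

```python
import json

MAX_MESSAGE_CHARS = 1_000
MAX_MESSAGES = 1_000
MAX_TRANSCRIPT_BYTES = 40 * 1024 * 1024  # assumption: binary megabytes


def check_limits(messages):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if len(messages) > MAX_MESSAGES:
        problems.append(f"{len(messages)} messages exceeds the limit of {MAX_MESSAGES}")
    for index, message in enumerate(messages):
        length = len(str(message.get("content", "")))
        if length > MAX_MESSAGE_CHARS:
            problems.append(f"message {index}: content is {length} characters")
    # Assumption: size is measured on the serialized JSON payload.
    size = len(json.dumps(messages).encode("utf-8"))
    if size > MAX_TRANSCRIPT_BYTES:
        problems.append(f"transcript is {size} bytes")
    return problems
```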

Role Validation

  • Only "user", "assistant", "system", and "tool" roles are accepted
  • Each message must have role, content, start_time, and end_time fields
  • start_time and end_time must be float values representing seconds
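The rules above can be sketched as a per-message check. `validate_message` is a hypothetical helper; whether integer values are accepted for `start_time` and `end_time` is not stated in this guide, so the sketch takes the strict reading and accepts floats only.

```python
ALLOWED_ROLES = {"user", "assistant", "system", "tool"}
REQUIRED_KEYS = ("role", "content", "start_time", "end_time")


def validate_message(message):
    """Return a list of validation errors for one message (empty if valid)."""
    errors = [f"missing field: {key}" for key in REQUIRED_KEYS if key not in message]
    if "role" in message:
        role = message["role"]
        if not isinstance(role, str) or role not in ALLOWED_ROLES:
            errors.append(f"unsupported role: {role!r}")
    for key in ("start_time", "end_time"):
        # Strict reading (an assumption): only float is accepted, not int.
        if key in message and not isinstance(message[key], float):
            errors.append(f"{key} must be a float number of seconds")
    return errors
```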

Role Normalization

For monitoring and evaluation purposes, roles may be normalized:
  • "system" messages with tool call content may be treated as "tool" for display purposes
  • Tool calls in "system" role are automatically detected and parsed
  • The UI will display tool calls with appropriate icons and formatting regardless of the original role
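One way to picture this normalization is a function that picks the role used for display. This is only a sketch of the behavior described above, not Coval's implementation; `display_role` and the key list are assumptions based on the tool call formats in this guide.

```python
import json

# Keys that mark a tool call in the content formats described in this guide.
TOOL_KEYS = ("function", "tool", "tool_call")


def display_role(message):
    """Return "tool" for system messages whose content parses as a tool call."""
    role = message.get("role")
    if role != "system":
        return role
    try:
        data = json.loads(message.get("content", ""))
    except (TypeError, ValueError):
        return role  # plain-text system prompt
    if isinstance(data, dict) and any(key in data for key in TOOL_KEYS):
        return "tool"
    return role
```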

Timing Validation

  • beginning and end values should be sequential integers
  • start_timestamp and end_timestamp should be valid Unix timestamps
  • If timestamps are provided, end_timestamp should be greater than start_timestamp
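A hedged sketch of these timing checks follows. "Sequential" is read here as: `beginning` and `end` are integers, `end` is not before `beginning`, and `beginning` never decreases from one message to the next. That reading is an assumption, and `check_timing` is a hypothetical name.

```python
def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_timing(messages):
    """Return a list of timing problems found across a transcript."""
    problems = []
    previous_beginning = None
    for index, message in enumerate(messages):
        beginning, end = message.get("beginning"), message.get("end")
        if not (_is_int(beginning) and _is_int(end)):
            problems.append(f"message {index}: beginning and end must be integers")
        else:
            if end < beginning:
                problems.append(f"message {index}: end is before beginning")
            if previous_beginning is not None and beginning < previous_beginning:
                problems.append(f"message {index}: beginning is out of order")
            previous_beginning = beginning
        start, stop = message.get("start_timestamp"), message.get("end_timestamp")
        # Timestamps are optional; compare them only when both are present.
        if start is not None and stop is not None and stop <= start:
            problems.append(f"message {index}: end_timestamp must be greater than start_timestamp")
    return problems
```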

Audio Requirements

Audio files must be stereo (2 channels). Mono audio files are not supported and will be rejected.
When uploading audio files, the channel mapping determines speaker roles:
Channel      Position   Role
Channel 0    Left       Agent
Channel 1    Right      User
The system uses channel position to assign roles during transcription: channel 0 (left) is always treated as the agent, and channel 1 (right) is always treated as the user.
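Before uploading, the channel count can be checked locally. The sketch below assumes the audio is a WAV file, since Python's standard `wave` module reads only that format; this guide does not say which audio formats are accepted. `check_stereo` is a hypothetical helper.

```python
import wave

# Channel-to-role mapping as described in this guide.
CHANNEL_ROLES = {0: "agent", 1: "user"}


def check_stereo(path):
    """Raise ValueError unless the WAV file at `path` has exactly two channels."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
    if channels != 2:
        raise ValueError(f"expected 2 channels (stereo), found {channels}")
    return CHANNEL_ROLES
```

If the agent and user were recorded on the opposite channels, they would need to be swapped before upload, since the mapping is fixed by channel position.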