Overview
Twilio ConversationRelay lets you connect a Twilio Programmable Voice call to a WebSocket server that handles STT → LLM → TTS in real time. This guide covers how to:
- Build an OTel span tree from ConversationRelay events and export it to Coval
- Correlate traces with Coval simulation runs despite Twilio PSTN stripping SIP headers
For a complete working implementation, see the coval-examples Twilio agent on GitHub.
The PSTN limitation
When Coval places a simulation call to your agent, it normally passes the simulation output ID as a custom SIP header:
X-Coval-Simulation-Id: <simulation-id>
This works for agents using SIP trunking (Telnyx, custom SBCs) where the SIP signaling layer is preserved end-to-end. Twilio Programmable Voice, however, routes calls through the public telephone network (PSTN). PSTN carriers strip non-standard SIP headers, so X-Coval-Simulation-Id never reaches your application.
Solution: pre_call_webhook_url
Coval supports an alternative correlation mechanism for agents where SIP headers are unavailable. Configure pre_call_webhook_url on your agent and Coval will POST the simulation output ID to your agent before dialing, giving it a chance to stash the ID before the call connects.
The webhook is called once per simulation, immediately before the outbound call is placed. It receives:
{
"simulation_output_id": "<sim-output-id>"
}
Your agent queues this ID, then pops it when the next call arrives.
Coval agent configuration
In the Coval dashboard, open your agent’s settings and set the following in the agent metadata:
{
"pre_call_webhook_url": "https://your-agent.fly.dev/register-simulation",
"pre_call_webhook_headers": {"x-api-key": "<your-agent-api-key>"}
}
| Field | Description |
|---|
pre_call_webhook_url | The URL Coval will POST to before each simulation call |
pre_call_webhook_headers | Optional headers to include — use this to authenticate Coval’s request to your agent |
Use COVAL_API_KEY (your Coval API key) as the value for x-api-key and validate it in your /register-simulation handler. This prevents other callers from pre-registering IDs.
Agent implementation
/register-simulation endpoint
Add an endpoint that accepts Coval’s pre-call notification and queues the simulation ID:
import time
from collections import deque
from typing import Optional
from fastapi import FastAPI, Header, HTTPException, Request
from fastapi.responses import JSONResponse
app = FastAPI()
COVAL_API_KEY = os.environ.get("COVAL_API_KEY", "")
# FIFO queue of (simulation_output_id, registered_at) tuples
_pending_sim_ids: deque[tuple[str, float]] = deque()
_SIM_ID_TTL_SECONDS = 300 # expire after 5 minutes
def _pop_pending_sim_id() -> Optional[str]:
"""Return the oldest non-expired pending simulation ID, or None."""
now = time.time()
while _pending_sim_ids:
sim_id, registered_at = _pending_sim_ids[0]
if now - registered_at > _SIM_ID_TTL_SECONDS:
_pending_sim_ids.popleft() # expired, discard
else:
break
if _pending_sim_ids:
sim_id, _ = _pending_sim_ids.popleft()
return sim_id
return None
@app.post("/register-simulation")
async def register_simulation(
request: Request,
x_api_key: str = Header(default=""),
):
if not COVAL_API_KEY or x_api_key != COVAL_API_KEY:
raise HTTPException(status_code=401, detail="Invalid API key")
body = await request.json()
simulation_output_id = body.get("simulation_output_id", "")
if not simulation_output_id:
raise HTTPException(status_code=400, detail="simulation_output_id is required")
_pending_sim_ids.append((simulation_output_id, time.time()))
return JSONResponse({"status": "ok", "queued": len(_pending_sim_ids)})
Reading the simulation ID on call arrival
In your ConversationRelay WebSocket handler, pop the pending ID when the "setup" event arrives:
@app.websocket("/ws")
async def conversationrelay_websocket(websocket: WebSocket):
await websocket.accept()
simulation_id: Optional[str] = None
async for raw_message in websocket.iter_text():
event = json.loads(raw_message)
event_type = event.get("type", "")
if event_type in ("setup", "connected"):
# Pop the pre-registered simulation ID for this call
simulation_id = _pop_pending_sim_id()
elif event_type == "prompt":
voice_prompt = event.get("voicePrompt", "")
# ... call LLM, stream response back to Twilio ...
Exporting traces after the call
When the WebSocket closes, build OTLP spans from your turn log and POST them to Coval:
import httpx
COVAL_TRACES_URL = "https://api.coval.dev/v1/traces"
def _send_spans(spans: list[dict], simulation_id: str) -> None:
payload = {
"resourceSpans": [
{
"resource": {
"attributes": [
{"key": "service.name", "value": {"stringValue": "twilio-voice-agent"}}
]
},
"scopeSpans": [
{
"scope": {"name": "twilio-voice-agent"},
"spans": spans,
}
],
}
]
}
httpx.post(
COVAL_TRACES_URL,
json=payload,
headers={
"x-api-key": COVAL_API_KEY,
"X-Simulation-Id": simulation_id,
},
timeout=30,
)
Call _send_spans in the finally block of your WebSocket handler, after the call ends:
finally:
if simulation_id and turns:
call_duration_seconds = time.time() - call_start_epoch_seconds
spans = _build_spans_from_turns(turns, call_start_epoch_seconds, call_duration_seconds)
_send_spans(spans, simulation_id)
Trace limitations
ConversationRelay abstracts STT and TTS away from your application code entirely — you receive transcribed text in "prompt" events and send text tokens back; Twilio handles the rest. This means several span values cannot be measured and must be approximated. These are architectural constraints of the ConversationRelay model, not implementation choices.
The following trace values are synthetic when using Twilio ConversationRelay. Do not use them for latency analysis, benchmarking, or metric thresholds.
| Value | Why it must be synthetic |
|---|
stt → metrics.ttfb | Twilio performs speech recognition internally. Your application only receives the final transcribed text in a "prompt" WebSocket event — there is no timestamp for when speech started or when transcription completed. |
stt → stt.confidence | Twilio does not expose per-utterance ASR confidence scores through the ConversationRelay WebSocket API. Fixed at 0.95. |
tts → metrics.ttfb | Twilio converts your text tokens to audio internally. Your application has no visibility into when audio playback begins at the caller’s end. Fixed at 0.1s. |
The one value that is real: llm → metrics.ttfb. Because your application makes the LLM API call directly, you can measure wall-clock time from when the "prompt" event arrives to when the first response token is sent back. This is the only latency signal from ConversationRelay traces worth trusting.
Practical implication: Coval’s built-in STT TTFB and TTS TTFB latency metrics will not reflect real performance for Twilio ConversationRelay agents. LLM TTFB metrics will. If you need real STT/TTS timing data, consider a framework where you control the STT and TTS API calls directly (e.g., Pipecat or LiveKit), which expose those timings to your instrumentation code.
Span schema
| Span | Key attributes | Notes |
|---|
conversation | call.duration_seconds | Root span |
stt | stt.transcription, metrics.ttfb (synthetic), stt.confidence (synthetic 0.95) | One per user turn |
stt.provider.twilio | stt.providerName, stt.confidence, metrics.ttfb | Child of stt |
llm | metrics.ttfb (real), llm.finish_reason | One per assistant turn |
tts | metrics.ttfb (synthetic 0.1s) | One per assistant turn |
tool_call | tool.name, tool.call_id, tool.arguments | When tools are invoked |
tool_call_result | tool.name, tool.call_id, tool.result | Status = ERROR if tool returned an error |
Viewing traces
After a simulation completes, an OTel Traces card appears in the metric grid on the result page. Click View Traces to open the trace viewer.
If no traces appear, check:
pre_call_webhook_url is set on the Coval agent and points to the correct URL
- Your
/register-simulation endpoint is publicly accessible and returning 200 OK
- The
COVAL_API_KEY in the pre_call_webhook_headers matches what your agent expects
COVAL_API_KEY is set in the agent environment (needed to export spans)
Full example
See the complete working implementation in coval-examples/voice-agents/twilio, which includes:
- ConversationRelay WebSocket handler with interrupt support
- Agentic LLM loop (tool calls → re-enter loop until
finish_reason = stop)
- Full span builder with real LLM TTFB measurement
- Fly.io deployment configuration