Personas define the characteristics of the simulated user interacting with your agent. Configure voice, accent, behavior, and more to match your real user base.

Creating a Persona

  1. Navigate to the Personas section
  2. Click “Create New Persona”
  3. Configure the persona settings

Configuration Options

Avatar

Customize the persona’s visual representation:
  • Select from various hair, eye, and lip styles
  • Regenerate avatar seed for a new base face

Persona Label

Display name for the persona (required).

Persona Characteristics

Define the persona’s demographics, personality, and communication style (required). Use the expand button (Shift+E) for a full-screen editor.

Voice Configuration

| Setting | Description |
| --- | --- |
| Voice | Select from available voices with gender options. Preview available for each voice. |
| Language & Accent | Choose language and regional accent. Supported languages include English, Spanish, French, French (Canada), German, Italian, Japanese, Korean, Portuguese, and Russian. |
| Background Noise | Add ambient noise to simulate real-world calling environments. Volume is adjustable with a slider. |

Voices fall into two categories based on realism and concurrency. Higher-realism voices sound more natural and expressive but have a concurrency limit of approximately 12 simultaneous connections. For high-volume simulation runs, use higher-concurrency voices to avoid bottlenecks.

Higher Concurrency (20 voices)

| Voice | Gender |
| --- | --- |
| Aria | Female |
| Ashwin | Male |
| Autumn | Female |
| Brynn | Female |
| Callum | Male |
| Caspian | Male |
| Corwin | Male |
| Darrow | Male |
| Delphine | Female |
| Dorian | Male |
| Elara | Female |
| Kieran | Male |
| Lysander | Male |
| Marina | Female |
| Naveen | Male |
| Orion | Male |
| Rowan | Male |
| Skye | Female |
| Soren | Male |
| Vera | Female |

Higher Realism (7 voices) — Limited Concurrency

| Voice | Accent |
| --- | --- |
| Alejandro | Latin America |
| Angela | American |
| Erika | American |
| Harry | American |
| Mark | American |
| Monika | American |
| Raju | Indian |

Concurrency limit: Higher-realism voices support a maximum of approximately 12 simultaneous connections. If you are running a large volume of simulations, these voices can become a bottleneck. Use higher-concurrency voices for high-volume runs.
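
To get a feel for how the limit affects a large run, the arithmetic can be sketched as below. This is a rough estimate only: it assumes each simulation holds exactly one voice connection and that simulations run in back-to-back waves. The function name and the constant are illustrative and are not part of any Coval API; the value 12 is the approximate figure quoted above.

```python
import math

# Approximate limit quoted above for higher-realism voices (not an exact guarantee).
HIGHER_REALISM_LIMIT = 12


def waves_needed(total_simulations: int, concurrency_limit: int = HIGHER_REALISM_LIMIT) -> int:
    """Return how many sequential waves are needed when at most
    `concurrency_limit` simulations can hold a voice connection at once."""
    if total_simulations < 0 or concurrency_limit <= 0:
        raise ValueError("total_simulations must be >= 0 and concurrency_limit > 0")
    return math.ceil(total_simulations / concurrency_limit)


print(waves_needed(100))  # 100 simulations at ~12 concurrent -> 9 waves
```

If the estimated number of waves is too high for your schedule, that is a signal to switch the persona to a higher-concurrency voice.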

Background Noise options:

| Sound | Description |
| --- | --- |
| Off | No background noise (default) |
| Office | Office ambience |
| Lounge | People in a lounge |
| Crowd Talking | Crowd conversation noise |
| Airport Boarding | Airport boarding announcements |
| Bus Interior | Inside a bus |
| Kids Playing | Playground sounds |
| Doorbell | Doorbell ringing |
| Train Arrival | Train station arrival sounds |
| Portable AC | Air conditioner hum |
| Skatepark | Skatepark ambience |
| Small Dog Bark | Small dog barking |
| Cafe | Cafe ambience |
| Ferry Announcement | Ferry and PA announcements |
| Heavy Rain | Heavy rainfall |
| Moderate Wind | Wind sounds |
| Newborn Baby Crying | Baby crying |
| Office with Alarm | Office with alarm going off |
| Street with Sirens | Street traffic with sirens |
| Construction Work | Construction site noise |

Conversation Initiator

| Option | Behavior |
| --- | --- |
| Persona waits to speak | Waits for the agent to speak first. |
| Persona speaks first | Persona initiates the conversation. |

Interruption Rate

Controls how often the persona proactively interrupts the agent during a conversation. This simulates impatient or talkative callers who don’t wait for the agent to finish speaking.

| Option | Behavior |
| --- | --- |
| None | The persona never proactively interrupts the agent (default). |
| Low | The persona occasionally interrupts (roughly every 90 seconds). |
| Medium | The persona interrupts at moderate frequency (roughly every 45 seconds). |
| High | The persona frequently interrupts (roughly every 30 seconds). |

Note on natural turn-taking: Even with Interruption Rate set to None, you may observe occasional overlapping speech between the persona and agent. This is expected behavior caused by natural voice conversation turn-taking, where the speech-to-text engine detects a pause in the agent’s speech and the persona begins responding before the agent has fully finished. This is distinct from proactive interruptions and reflects realistic phone conversation dynamics.

To minimize this, add an instruction to your persona prompt such as: “Always wait for the agent to completely finish speaking before responding.”

See Interruption Behavior below for more details.

Multi-Language STT

Enable multilingual speech recognition so the persona can accurately hear and respond to agents that speak multiple languages in the same conversation (e.g. “For English press one, Para español presione dos”). Found under Advanced in the persona configuration modal.

| Setting | Description |
| --- | --- |
| Off (default) | Speech recognition is set to the persona’s configured language for best single-language accuracy. |
| On | Speech recognition accepts all supported languages simultaneously. Supports English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch. |

If your agent starts with a multi-language greeting or IVR menu, create a persona with Multi-Language STT enabled. You can clone an existing persona and toggle it on — this lets you test the same agent with both single-language and multi-language speech recognition.

Silent Mode

When enabled, the persona remains completely silent throughout the conversation and does not respond to anything the agent says. This is useful for testing how your agent handles unresponsive callers, dead air, or scenarios where the caller has put their phone down. When silent mode is enabled, all other behavioral settings (background noise, interruption rate, conversation initiator) are automatically disabled.
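
The override rule can be pictured as a small normalization step. The sketch below is hypothetical: the class, its field names, and the choice to model "disabled" as a reset to each setting's default are assumptions made for illustration, not Coval's actual data model.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PersonaBehavior:
    # Hypothetical field names mirroring the settings described on this page.
    silent_mode: bool = False
    background_noise: str = "Off"
    interruption_rate: str = "None"
    persona_speaks_first: bool = False


def effective_behavior(settings: PersonaBehavior) -> PersonaBehavior:
    """Return the settings that take effect: silent mode switches off the rest."""
    if not settings.silent_mode:
        return settings
    # Assumption: "disabled" is modeled here as falling back to the defaults.
    return replace(
        settings,
        background_noise="Off",
        interruption_rate="None",
        persona_speaks_first=False,
    )


configured = PersonaBehavior(silent_mode=True, background_noise="Cafe", interruption_rate="High")
print(effective_behavior(configured).interruption_rate)  # None
```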

Caller Phone Number

Configure phone number routing for voice simulations. See phone number mappings below.

Caller Phone Number for Voice Simulations: Coval uses different phone numbers depending on the simulation type. Assign a specific phone number index to a persona if your workflow depends on phone number routing.

Inbound Voice Simulations

For inbound simulations (Coval calls your agent), assign up to 29 phone numbers to a persona.
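
| Index | Phone Number |
| --- | --- |
| 1 | +16504471573 |
| 2 | +16506400392 |
| 3 | +16506329775 |
| 4 | +16505360811 |
| 5 | +16505360576 |
| 6 | +15418450089 |
| 7 | +15412194880 |
| 8 | +14157181081 |
| 9 | +14157180538 |
| 10 | +14157180269 |
| 11 | +14153765034 |
| 12 | +14069058267 |
| 13 | +14066920094 |
| 14 | +14064159042 |
| 15 | +14063022479 |
| 16 | +14063022353 |
| 17 | +17182801764 |
| 18 | +17182858503 |
| 19 | +17182859858 |
| 20 | +17183051836 |
| 21 | +17187195385 |
| 22 | +17187195407 |
| 23 | +15342172296 |
| 24 | +15342172366 |
| 25 | +15342172371 |
| 26 | +15342172387 |
| 27 | +19855295712 |
| 28 | +19858539008 |
| 29 | +19858539188 |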

Outbound Voice Simulations

For outbound simulations (your agent calls Coval’s simulated user), select a phone number for the persona to receive calls on.
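
| Index | Phone Number |
| --- | --- |
| 1 | +14158734019 |
| 2 | +17199853850 |
| 3 | +17199853656 |
| 4 | +17196219208 |
| 5 | +17194630332 |
| 6 | +17194630202 |
| 7 | +17194630116 |
| 8 | +17194510465 |
| 9 | +16309315617 |
| 10 | +16309190593 |
| 11 | +16306014871 |
| 12 | +16305857118 |
| 13 | +16305526080 |
| 14 | +16305222063 |
| 15 | +16304468895 |
| 16 | +12624037199 |
| 17 | +12623988133 |
| 18 | +12622149045 |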
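
If your own routing or allow-list logic needs these numbers, a lookup keyed by simulation type and index is one way to hold them. The sketch below is illustrative only: it copies just the first three entries of each table above, and the function and variable names are made up for this example.

```python
# Excerpt of the tables above (first three entries each); extend with the full lists as needed.
INBOUND_NUMBERS = {1: "+16504471573", 2: "+16506400392", 3: "+16506329775"}
OUTBOUND_NUMBERS = {1: "+14158734019", 2: "+17199853850", 3: "+17199853656"}

_TABLES = {"inbound": INBOUND_NUMBERS, "outbound": OUTBOUND_NUMBERS}


def number_for(simulation_type: str, index: int) -> str:
    """Look up the phone number for a simulation type and 1-based table index."""
    if simulation_type not in _TABLES:
        raise ValueError(f"unknown simulation type: {simulation_type!r}")
    table = _TABLES[simulation_type]
    if index not in table:
        raise KeyError(f"no {simulation_type} number at index {index}")
    return table[index]


print(number_for("inbound", 1))   # +16504471573
print(number_for("outbound", 2))  # +17199853850
```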

Advanced Configuration

Emotional Voice Simulation

Emotional tone in voice simulations is controlled through the Persona Characteristics prompt. These instructions guide the persona’s dialogue generation, shaping word choice, sentence structure, and phrasing to convey emotion. The text-to-speech engine then speaks that text.
How emotion works in voice simulations: The persona prompt controls what text the persona generates, not the voice itself. For example, instructing the persona to “be impatient” results in shorter sentences, more direct language, and frustrated phrasing. The TTS engine does not have direct emotion controls — it speaks whatever text the persona produces. Emotional impact comes from the words and sentence structure, not from changes in vocal tone or volume.

Best Practices for Emotional Personas

Be specific and descriptive. Instead of generic labels, describe the emotional behavior in terms of word choice and conversational patterns:
// Less effective
You are an angry customer.

// More effective
You are extremely frustrated and losing patience. You use short, clipped
sentences. When you have to repeat information you've already provided,
you use phrases like "I already told you this" and "This is unacceptable."
Your language becomes sharper and more aggressive as the conversation goes
on if the agent cannot resolve your issue quickly.
Use punctuation to signal emotion. The text-to-speech engine interprets punctuation as speech cues:
  • Exclamation marks (!) convey urgency or emphasis
  • Commas create natural pauses and hesitation
  • Dashes (-) create brief breaks
  • Short sentences convey impatience or stress
  • Question marks with exclamation marks (?!) convey disbelief
Avoid using ellipses (...) for pauses. Some TTS engines read them aloud as “dot dot dot.” Use commas or dashes instead to create natural pauses.
Include emotional progression. Real callers escalate or de-escalate:
You start the call calmly but become increasingly frustrated if the agent
asks you to repeat information or puts you on hold. If the agent resolves
your issue, your tone should soften. If the agent is dismissive, you
become more insistent and demand to speak to a supervisor.

Voice Selection for Emotional Scenarios

Higher-realism voices generally produce better emotional expressiveness. If emotional nuance is important for your test scenarios, consider selecting a higher-realism voice for your persona.
Higher-realism voices have a concurrency limit of approximately 12 simultaneous connections. For high-volume simulation runs where emotional expressiveness is less critical, use higher-concurrency voices to avoid bottlenecks.

Example Emotional Personas

You are a customer under significant time pressure. You are calling during
your lunch break and need this resolved quickly. Keep your responses very
short and direct. You mention the time frequently and say things like
"Can we speed this up?" and "I really don't have much time." If the agent
asks unnecessary questions, respond with impatience: "Is that really
necessary right now?"

You are an older adult who is not comfortable with technology. You are
calling because you cannot figure out the website. You are somewhat
impatient and repeat yourself when you feel you're not being understood.
You become flustered when given too many steps at once and say things
like "That's too many things at once" or "Can you just do it for me?"
You occasionally go on brief tangents about how things used to be simpler.

You are disappointed with the service you received but remain polite
throughout the call. You express frustration through pointed questions
rather than harsh language. Use phrases like "I'm quite disappointed"
and "I was really expecting better." You give the agent a fair chance to
resolve the issue but make it clear that your patience has limits.

Filler Words and TTS Behavior

When configuring personas to use filler words like “um”, “uh”, or “hmm”, the way these words are written in the persona’s speech directly affects how the text-to-speech engine pronounces them. Text-to-speech engines process text literally. Unusual spellings or excessive repeated letters can cause the engine to spell out letters individually, read punctuation marks aloud, or mispronounce unfamiliar character sequences.

TTS-Friendly Filler Words

Use these standard spellings, which are recognized by text-to-speech engines:

| Use This | Avoid This | Why |
| --- | --- | --- |
| um | ummm, ummmm | Extra letters may be spelled out |
| uh | uhhh, uhhhhh | Extra letters may be spelled out |
| hmm | hmmmmm, hmmmmmm | Extra letters may be spelled out |
| oh | ohhh, ohhhh | Extra letters may be spelled out |
| ah | ahhh, ahhhh | Extra letters may be spelled out |
| well, | well... | Ellipses may be read as “dot dot dot” |
| so, | so... | Ellipses may be read as “dot dot dot” |
| you know, | you know... | Ellipses may be read as “dot dot dot” |

Include explicit TTS-friendly instructions in your persona prompt:
<speaking_style>
You use natural filler words in conversation. When hesitating, use only
these words: "um", "uh", "hmm", "oh", "well". Write them as single short
words. Use commas for pauses instead of ellipses or repeated letters.
Example: "Um, I think the order number is, uh, let me check, it's 12345."
</speaking_style>
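
When you write example utterances for a prompt, the spelling rules above can also be checked mechanically. The following is a minimal sketch of such a cleanup pass, not a Coval feature: the function name is hypothetical, and it only covers the filler words and ellipsis cases listed in the table.

```python
import re

# Stretched variants ("ummm", "uhhh", ...) mapped to the standard spellings from the table.
_FILLERS = [
    (re.compile(r"\bum+\b", re.IGNORECASE), "um"),
    (re.compile(r"\buh+\b", re.IGNORECASE), "uh"),
    (re.compile(r"\bhm+\b", re.IGNORECASE), "hmm"),
    (re.compile(r"\boh+\b", re.IGNORECASE), "oh"),
    (re.compile(r"\bah+\b", re.IGNORECASE), "ah"),
]
# Three or more dots, or the single-character ellipsis, with surrounding whitespace.
_ELLIPSIS = re.compile(r"\s*(?:\.{3,}|\u2026)\s*")


def tts_friendly(text: str) -> str:
    """Normalize filler spellings and replace ellipses with comma pauses."""
    for pattern, canonical in _FILLERS:
        # Keep a leading capital so sentence starts still look right.
        text = pattern.sub(
            lambda m, c=canonical: c.capitalize() if m.group(0)[0].isupper() else c,
            text,
        )
    # An ellipsis mid-sentence becomes ", "; a trailing ellipsis becomes a period.
    return _ELLIPSIS.sub(lambda m: ", " if m.end() < len(m.string) else ".", text)


print(tts_friendly("Ummm... I think it's, uhhh, 12345..."))
# Um, I think it's, uh, 12345.
```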

Conversation Triggers

You may want the persona to remain silent until the agent says a specific word or phrase, such as waiting for a greeting before starting to speak. The persona’s behavior is driven by the instructions in the persona prompt. You can instruct the persona to wait for specific phrases, but because the underlying language model is probabilistic, adherence is not 100% deterministic.

Maximizing Trigger Reliability

  1. Set the Conversation Initiator to “Persona waits to speak” so the agent always speaks first.
  2. Use strong, repeated language in the persona prompt:
CRITICAL INSTRUCTION: You MUST remain completely silent until the agent
says "How can I help you today?" Do not speak. Do not respond to any
other greeting or introduction. Wait specifically for the phrase
"How can I help you today?" before saying anything. Any other phrase
like "How may I assist you?" or "What can I do for you?" should NOT
trigger your response. Continue waiting silently.
  3. Keep the trigger phrase simple and distinctive. Shorter, more common phrases are easier for the persona to reliably detect.
  4. Include fallback behavior for cases where the exact phrase doesn’t appear:
If the agent does not say "How can I help you today?" within the first
30 seconds, you may begin speaking with your objective. This prevents the
conversation from stalling entirely.
Conversation triggers provide high but not perfect consistency. For mission-critical trigger behavior, consider running multiple simulations to account for natural variation.
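
One way to account for that variation is to measure adherence across several runs. The sketch below assumes you have each transcript available as an ordered list of `(speaker, text)` pairs with speakers labeled `"agent"` and `"persona"`; that format and the function names are assumptions for illustration, not a documented export format.

```python
from typing import List, Tuple

Turn = Tuple[str, str]  # (speaker, text); speaker is "agent" or "persona" (assumed format)


def waited_for_trigger(transcript: List[Turn], trigger: str) -> bool:
    """True if the persona said nothing before the agent first spoke the trigger phrase."""
    needle = trigger.lower()
    for speaker, text in transcript:
        if speaker == "agent" and needle in text.lower():
            return True   # trigger heard before any persona turn
        if speaker == "persona":
            return False  # persona spoke before the trigger
    return False  # trigger never appeared in this run


def adherence_rate(runs: List[List[Turn]], trigger: str) -> float:
    """Fraction of runs in which the persona waited for the trigger phrase."""
    if not runs:
        raise ValueError("at least one run is required")
    return sum(waited_for_trigger(run, trigger) for run in runs) / len(runs)


runs = [
    [("agent", "Thanks for calling. How can I help you today?"), ("persona", "Hi, I need a refund.")],
    [("agent", "Thanks for calling."), ("persona", "Hi, I need a refund.")],
]
print(adherence_rate(runs, "How can I help you today?"))  # 0.5
```

Note that runs where the persona falls back to speaking after the timeout count as non-adherent here; track those separately if the fallback is acceptable in your scenario.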

Interruption Behavior

Voice simulations involve two distinct types of interruption behavior:

Proactive Interruptions

The Interruption Rate setting (None, Low, Medium, High) controls whether the persona deliberately interrupts the agent on a timer. When set to None, the persona never proactively talks over the agent.

| Setting | Behavior |
| --- | --- |
| None | No proactive interruptions |
| Low | Interrupts approximately every 90 seconds |
| Medium | Interrupts approximately every 45 seconds |
| High | Interrupts approximately every 30 seconds |
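
As a rough planning aid, the intervals above translate into an expected interruption count for a call of a given length. This is a back-of-the-envelope sketch: the intervals are approximate, actual counts will vary, and the names below are illustrative.

```python
# Approximate seconds between proactive interruptions, from the table above.
INTERRUPTION_INTERVAL_SECONDS = {"None": None, "Low": 90, "Medium": 45, "High": 30}


def expected_interruptions(rate: str, call_seconds: float) -> int:
    """Estimate how many proactive interruptions fit into a call of `call_seconds`."""
    if rate not in INTERRUPTION_INTERVAL_SECONDS:
        raise ValueError(f"unknown interruption rate: {rate!r}")
    interval = INTERRUPTION_INTERVAL_SECONDS[rate]
    if interval is None:
        return 0  # "None" disables proactive interruptions entirely
    return int(call_seconds // interval)


for rate in ("None", "Low", "Medium", "High"):
    print(rate, expected_interruptions(rate, 180))
# None 0
# Low 2
# Medium 4
# High 6
```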

Natural Turn-Taking Overlap

Even with Interruption Rate set to None, you may observe the persona starting to speak while the agent is still talking. This is caused by natural voice turn-taking, not proactive interruptions. In real phone conversations, speakers rely on pauses, intonation changes, and context to determine when the other person has finished speaking. The simulation’s speech-to-text engine detects pauses in the agent’s speech and may interpret a brief pause as end-of-turn, causing the persona to begin responding before the agent has fully finished. This behavior is realistic and expected in voice simulation testing, as it mirrors how real callers sometimes talk over agents.

Reducing Turn-Taking Overlap

If you need the persona to be more patient and avoid any overlap:
  1. Add explicit waiting instructions to the persona prompt:
Always wait for the agent to completely finish their thought before
responding. Take a brief pause after the agent stops speaking before
you begin your response. If you hear the agent start speaking again,
stop immediately and let them finish.
  2. Use longer, more deliberate speech patterns in the persona characteristics to naturally slow the response:
You are a thoughtful, patient caller who considers the agent's words
carefully before responding. You take a moment to think before speaking.

Personas vs. Test Sets

Personas and test sets serve distinct purposes and work together in simulations.

Personas: Define HOW to Behave

Personas establish behavioral traits applied across multiple test sets:
  • “You are polite and friendly, respond in short sentences.”
  • “You speak slowly with natural pauses like ‘uh’ and ‘um’.”
  • “You are impatient and frequently interrupt.”

Test Sets: Define WHAT to Do

Test sets contain specific instructions for the conversation:
  • “Call to get a refund for order #12345”
  • “Ask for PTO from March 21st to 22nd”
  • “Inquire about account balance”

Why Keep Them Separate?

  • Reusability: Apply one persona to multiple test sets, or test one scenario with multiple personas.
  • Comparison Testing: Run the same test set across different personas to evaluate agent handling of various user types.
  • Easier Maintenance: Update behavioral traits in one place without affecting test scenarios.

Best Practices

Recommended:
Persona: "You are a friendly customer who speaks in short sentences."
Test Set: "Call to cancel your subscription."
Avoid mixing behavioral traits with task instructions:
Test Set: "You are a rude customer who wants to cancel subscription and argues about fees."
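
Keeping the two separate means the full set of runs is simply every persona paired with every test set. A minimal sketch of that pairing follows; the dictionaries and the function name are illustrative examples in the style of this page, not an actual Coval API.

```python
from itertools import product
from typing import Dict, List

# Illustrative data only: persona prompts describe HOW, test sets describe WHAT.
PERSONAS: Dict[str, str] = {
    "Polite Customer": "You are polite and friendly, respond in short sentences.",
    "Impatient Customer": "You are impatient and frequently interrupt.",
}
TEST_SETS: Dict[str, str] = {
    "Refund": "Call to get a refund for order #12345",
    "Balance": "Inquire about account balance",
}


def build_run_matrix(personas: Dict[str, str], test_sets: Dict[str, str]) -> List[Dict[str, str]]:
    """Pair every persona name with every test set name."""
    return [
        {"persona": persona_name, "test_set": test_set_name}
        for persona_name, test_set_name in product(personas, test_sets)
    ]


matrix = build_run_matrix(PERSONAS, TEST_SETS)
print(len(matrix))  # 2 personas x 2 test sets -> 4 runs
print(matrix[0])    # {'persona': 'Polite Customer', 'test_set': 'Refund'}
```

Adding a third persona adds one row per test set without touching any test set text, which is the maintenance benefit described above.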

Custom Persona Prompts

Include in your custom persona prompt:
  • DTMF/IVR handling: Navigation instructions for phone menus
  • Speech style: Filler words, response patterns
  • Information flow: When to provide or withhold information
  • Call ending triggers: Conditions for hanging up

Voice Persona Example

<role>
You are a customer calling support.
</role>

<ivr_behavior>
- WAIT for all options before proceeding
- Use dtmf tool to select menu options
- Remain silent during IVR navigation
</ivr_behavior>

<speaking_behavior>
- Natural responses with occasional pauses
- Only respond when directly asked
</speaking_behavior>

<end_call>
Hang up if transferred to a human agent.
</end_call>

Chat Persona Example

<chat_rules>
- Wait for the automated greeting before typing
- Respond naturally to prompts
</chat_rules>

<typing_behavior>
- Natural chat language with occasional typos
- Concise responses unless asked for details
</typing_behavior>
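
If you maintain several personas, assembling the tagged sections programmatically helps keep every opening tag matched with its closing tag. The helper below is a hypothetical sketch: the function name is made up, and the tag names are simply the ones used in the examples above.

```python
from typing import Dict, List


def render_persona_prompt(sections: Dict[str, List[str]]) -> str:
    """Render each section as <tag>...</tag>, separated by blank lines."""
    blocks = []
    for tag, lines in sections.items():
        body = "\n".join(lines)
        blocks.append(f"<{tag}>\n{body}\n</{tag}>")
    return "\n\n".join(blocks)


prompt = render_persona_prompt({
    "role": ["You are a customer calling support."],
    "speaking_behavior": [
        "- Natural responses with occasional pauses",
        "- Only respond when directly asked",
    ],
    "end_call": ["Hang up if transferred to a human agent."],
})
print(prompt)
```

Because dictionaries preserve insertion order, the sections appear in the prompt in the order you list them.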

Template Strategy

  1. Create persona variations for different user types
  2. Create focused test sets for specific workflows
  3. Combine in templates for comprehensive testing
  4. Analyze results across user personalities
Start with 2-3 core personas (Polite Customer, Impatient Customer, Technical Customer) and build test sets around common workflows.