Agent Skills are modular knowledge packages that teach your AI coding agent how to evaluate effectively. They follow the open Agent Skills standard and work with Claude Code, Cursor, Windsurf, Codex, and 40+ other agents.
Install
npx skills add coval-ai/coval-external-skills
This installs all Coval skills into your agent’s skills directory. Skills are loaded on demand: only the name and description of each skill are held in memory until that skill is activated.
Skills vs MCP vs CLI
| | Skills | MCP Server | CLI |
|---|---|---|---|
| What it provides | Knowledge (how to evaluate well) | Tools (execute operations) | Operations (run from terminal) |
| Install | npx skills add coval-ai/coval-external-skills | npx coval-mcp | brew install coval-ai/tap/coval |
| Use when | Agent needs to design evaluations | Agent needs to run evaluations natively | Scripting, CI/CD, any terminal |
| Works with | Any agent supporting skills | MCP-compatible clients | Any shell environment |
We recommend Skills + CLI for the most complete experience. Skills teach your agent what to create, and the CLI executes it with structured JSON output.
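Structured JSON output is what makes the CLI easy to script against. The sketch below shows one way to consume JSON from a command in Python. This page does not list the CLI's subcommands or output schema, so the example uses a stand-in command (the Python interpreter printing a small JSON object); `run_json` and the `status` field are hypothetical names chosen for illustration.

```python
import json
import subprocess
import sys
from typing import Any


def run_json(cmd: list[str]) -> Any:
    """Run a command and parse its standard output as JSON.

    Raises subprocess.CalledProcessError if the command exits non-zero,
    and json.JSONDecodeError if the output is not valid JSON.
    """
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


# Stand-in for a real CLI call: a child Python process that prints JSON.
# Swap this list for an actual command once you know its arguments.
stand_in = [sys.executable, "-c", "import json; print(json.dumps({'status': 'ok'}))"]

result = run_json(stand_in)
print(result["status"])
```

Passing the command as a list avoids shell quoting problems, and `check=True` turns a failed command into an exception instead of a silent parse error.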
Available Skills
Onboarding
onboard: Interactive guided setup for your first evaluation. Walks through connecting an agent, creating personas, building test cases, selecting metrics, and launching a run.
Runs
| Skill | Description |
|---|---|
| launch-run | Launch an evaluation run against an AI agent |
| watch-run | Monitor a run’s progress with live status updates |
| quick-eval | Full workflow: launch, watch, and summarize results in one go |
Simulations
| Skill | Description |
|---|---|
| get-results | Retrieve and analyze simulation results from a run |
| download-audio | Download audio recordings from voice simulations |
Resources
| Skill | Description |
|---|---|
| coval-resources | Complete reference for all Coval resources, their hierarchy, relationships, API endpoints, and ID formats |
Dashboards
| Skill | Description |
|---|---|
| create-dashboard | Create a new dashboard and populate it with metric widgets |
| add-widget | Add a chart, table, or text widget to a dashboard |
| manage-dashboard | Get, update, or delete a dashboard |
| manage-widgets | List, update, resize, or delete widgets |
| list-dashboards | List all dashboards with filtering |
Test Cases
| Skill | Description |
|---|---|
| huggingface-import | Import datasets from HuggingFace and convert them to Coval test sets |
Migrations
| Skill | Description |
|---|---|
| migrate-bluejay | Migrate configuration from the Bluejay voice AI testing platform to Coval |
Human Review
| Skill | Description |
|---|---|
| review-llm-annotations-and-improve-prompt | Calculate agreement between human and machine labels, then propose improved metric prompts |
How Skills Work
Skills use progressive disclosure to stay lightweight:
At startup (~100 tokens per skill): Only the name and description are loaded
When activated (under 5000 tokens): The full skill instructions load when your agent detects a relevant task
On demand: Reference files (templates, examples) load only when needed
This means having all Coval skills installed adds minimal overhead to your agent’s context.
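To put rough numbers on that overhead, here is a back-of-the-envelope estimate. It uses the approximate per-skill figures quoted above and a count of 15 skills tallied from the tables on this page. The function names are illustrative, and the activated figure is an upper bound that leaves out reference files loaded on demand.

```python
# Approximate figures taken from the progressive-disclosure description.
STARTUP_TOKENS_PER_SKILL = 100   # name and description only
MAX_ACTIVE_SKILL_TOKENS = 5000   # upper bound for one activated skill

# Tallied from the "Available Skills" tables on this page.
INSTALLED_SKILLS = 15


def startup_overhead(installed: int) -> int:
    """Approximate tokens used before any skill is activated."""
    return installed * STARTUP_TOKENS_PER_SKILL


def upper_bound_overhead(installed: int, active: int) -> int:
    """Upper-bound estimate with `active` skills fully loaded.

    Reference files that load on demand are not included.
    """
    if not 0 <= active <= installed:
        raise ValueError("active must be between 0 and installed")
    return startup_overhead(installed) + active * MAX_ACTIVE_SKILL_TOKENS


print(startup_overhead(INSTALLED_SKILLS))         # 1500
print(upper_bound_overhead(INSTALLED_SKILLS, 1))  # 6500
```

Under these assumptions, the full set costs about 1,500 tokens at startup, and activating a single skill adds at most another 5,000.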
Skill Structure
Each skill follows the Agent Skills spec:
skill-name/
├── SKILL.md # Instructions (required)
├── references/ # Templates, detailed docs (optional)
├── scripts/ # Executable code (optional)
└── assets/ # Static resources (optional)
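As an illustration of the layout above, the following sketch checks a directory against it: `SKILL.md` must exist, and the three subdirectories are reported when present. It looks only at the file and directory names shown in the tree and does not validate the contents of `SKILL.md`. The helper name `check_skill_dir` and the report format are hypothetical.

```python
import tempfile
from pathlib import Path

REQUIRED_FILE = "SKILL.md"
OPTIONAL_DIRS = ("references", "scripts", "assets")


def check_skill_dir(skill_dir: Path) -> dict:
    """Report which parts of the skill layout exist in `skill_dir`."""
    return {
        # A skill is considered valid here if the required file exists.
        "valid": (skill_dir / REQUIRED_FILE).is_file(),
        # Optional directories are listed in the order shown in the tree.
        "optional_present": [d for d in OPTIONAL_DIRS if (skill_dir / d).is_dir()],
    }


# Usage: build a throwaway skill folder and inspect it.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "skill-name"
    (skill / "references").mkdir(parents=True)
    (skill / REQUIRED_FILE).write_text("# Instructions\n")
    report = check_skill_dir(skill)

print(report)  # {'valid': True, 'optional_present': ['references']}
```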
Source Code
All skills are open source: github.com/coval-ai/coval-external-skills