Multi-Run Analysis lets you bring multiple runs together into a single view so you can spot regressions, compare agent variants, and track metric trends across evaluations. Instead of flipping between individual run pages, you see all the data in one place—with color-coded grouping, aggregated statistics, and a shareable URL.

When to use it

Multi-Run Analysis is useful when you want to:
  • Compare agent versions — run the same test set against different agent builds and see which performs better across every metric
  • Evaluate persona impact — test the same agent against multiple personas and understand how user behavior affects outcomes

Accessing Reports

To open the Reports page, click Reports in the left sidebar.

Creating a Report

There are two ways to start a new report.

From the Runs list

  1. Go to Runs in the sidebar.
  2. Enable select mode and check the runs you want to analyze (up to 50 at a time).
  3. Click Multi-Run Report — this opens the report builder pre-loaded with those run IDs in the URL (?run_ids=...).
Alternatively, while the runs list has filters applied, click Multi-Run Report with filters to open a report that dynamically loads up to 50 runs matching the current filter state. The filter parameters are encoded in the URL, so the same link will resolve to the same runs when shared.
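The run-ID link described above can be sketched in a few lines of Python. This is only an illustration, not Coval's implementation: the base URL, the /reports/new path, and the comma-separated encoding of the IDs are assumptions. Only the run_ids parameter name and the 50-run limit come from this page.

```python
from urllib.parse import urlencode

MAX_RUNS = 50  # limit stated on this page


def build_report_url(run_ids, base="https://app.example.com/reports/new"):
    """Return a report-builder URL for the given run IDs.

    `base` and the comma-separated ID format are hypothetical placeholders.
    """
    if not run_ids:
        raise ValueError("select at least one run")
    if len(run_ids) > MAX_RUNS:
        raise ValueError(f"at most {MAX_RUNS} runs per report")
    # urlencode percent-escapes the commas; the result is still a valid query string.
    return f"{base}?{urlencode({'run_ids': ','.join(run_ids)})}"
```

Because the selection lives entirely in the query string, anyone who opens the same link gets the same set of runs.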

From the Reports page

On the Reports page, click New Report, then add run IDs manually. You can also reach the report builder through the Runs list flow described above.

Compare By

The Compare by dropdown (top-right of the report) is the core analytical tool. It segments the data by a dimension you choose, then color-codes each segment so patterns are immediately visible in both the metric cards and the results table.
Option       What it segments by
None         No grouping; all rows shown together
Agent        Groups rows by the agent that ran the simulation
Persona      Groups rows by the persona used
Test case    Groups rows by the specific test case input
Metadata     Groups rows by a custom metadata key you specify
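Conceptually, Compare By partitions the result rows by one field. The sketch below shows that idea under stated assumptions: the row shape and the field names (agent, persona, test_case, metadata) are hypothetical and do not describe Coval's actual data model.

```python
from collections import defaultdict


def group_rows(rows, compare_by, metadata_key=None):
    """Group simulation rows by a Compare By dimension.

    `rows` is a list of dicts; the field names used here are hypothetical.
    """
    if compare_by is None:
        # "None" option: no grouping, every row in one bucket.
        return {"all": list(rows)}
    groups = defaultdict(list)
    for row in rows:
        if compare_by == "metadata":
            key = row.get("metadata", {}).get(metadata_key)
        else:
            key = row.get(compare_by)
        groups[key].append(row)
    return dict(groups)


rows = [
    {"agent": "v1", "persona": "impatient", "score": 0.7},
    {"agent": "v2", "persona": "impatient", "score": 0.9},
    {"agent": "v1", "persona": "polite", "score": 0.8},
]
by_agent = group_rows(rows, "agent")      # two groups: "v1" and "v2"
by_persona = group_rows(rows, "persona")  # two groups: "impatient" and "polite"
```

Each resulting group corresponds to one color in the report.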

View Modes

Once a Compare By dimension is selected, you can switch between two view modes:

Row view

Each simulation output appears as an individual row, color-coded by its Compare By group. This is the default. Use it when you want to inspect individual conversations or find outliers within a group.

Grouped view

Rows are collapsed into one row per group. Each group row shows aggregated metric scores for all simulations in that group. Use this when you want a high-level comparison across groups without the noise of individual results. The grouped view toolbar lets you toggle between five aggregation modes:
Mode       What it shows
Average    Mean score across all simulations in the group
Median     Middle value; less sensitive to outliers than the average
P95        95th percentile; useful for understanding worst-case performance on metrics where a higher value is worse
Min        Lowest score in the group
Max        Highest score in the group
Click a group row in the grouped view to expand it and see the individual simulation rows within that group.
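To make the five aggregation modes concrete, here is a minimal sketch that computes each one for a single group's scores. The percentile convention is an assumption: this example uses the nearest-rank method, and the product may interpolate differently.

```python
import math
import statistics


def p95(scores):
    """95th percentile by the nearest-rank method (one of several conventions)."""
    ordered = sorted(scores)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


AGGREGATIONS = {
    "Average": statistics.mean,
    "Median": statistics.median,
    "P95": p95,
    "Min": min,
    "Max": max,
}


def aggregate(scores, mode):
    """Collapse one group's scores into a single value for the chosen mode."""
    if not scores:
        raise ValueError("group has no scores")
    return AGGREGATIONS[mode](scores)


scores = [0.2, 0.9, 0.8, 0.85, 0.95]  # one low outlier
summary = {mode: aggregate(scores, mode) for mode in AGGREGATIONS}
```

With these illustrative scores the median is 0.85 while the average is roughly 0.74, which shows how a single outlier pulls the average down but leaves the median almost untouched.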

Filtering by Metric

Clicking a metric card in the left pane filters the results table to show only the column for that metric, making it easier to focus on one score at a time. Click All Metrics in the breadcrumb to return to the full table.

Saving a Report

An unsaved report (opened from the runs list) shows a Save Report button in the header. Click it to save the current set of run IDs and view configuration (Compare By setting, view mode, and color overrides). After saving, the report gets a permanent ID and appears in the Reports list.
On a saved report, the Save button is only active when the view configuration has changed from what was last saved. Click it to persist your latest configuration changes.
To rename a saved report, click the pencil icon next to the report title and type a new name. Press Enter or click away to save.
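The Save button behavior described above amounts to comparing the current view configuration with the last saved one. The following is a rough sketch of that logic; the ViewConfig class and its field names are hypothetical and only mirror the three settings named on this page.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ViewConfig:
    """Hypothetical container for the settings a report saves."""
    compare_by: Optional[str] = None
    view_mode: str = "row"  # "row" or "grouped"
    color_overrides: dict = field(default_factory=dict)


def save_enabled(current, last_saved):
    """Unsaved reports can always be saved; saved ones only when the view changed."""
    if last_saved is None:
        return True
    # Dataclass equality compares all fields, so any changed setting enables Save.
    return current != last_saved


saved = ViewConfig(compare_by="agent")
edited = ViewConfig(compare_by="agent", view_mode="grouped")
```

Here save_enabled(edited, saved) is true because the view mode changed, while an identical configuration leaves the button inactive.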

Sharing a Report

Saved reports can be published for external sharing.
  1. Open a saved report.
  2. Click the Share button in the header.
  3. Click Publish shareable link — this publishes all runs in the report and generates a public URL at /shared/reports/{report_id}.
  4. Copy the link from the popover and share it.
Anyone with the link can view the report without a Coval account. Published reports show a Public badge in the Reports list. To revoke access, open the Share popover and click Unpublish all.
Copying a report link from the Reports list actions menu shows a warning if the report is still private: the link won't be accessible until the report is published.
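The relationship between the public URL and the publish state can be sketched as follows. The origin and the warning text are placeholders invented for this example; only the /shared/reports/{report_id} path comes from this page.

```python
def share_link(report_id, published, origin="https://app.example.com"):
    """Return (url, warning) for a report's public link.

    `origin` is a hypothetical placeholder for the real host.
    """
    url = f"{origin}/shared/reports/{report_id}"
    # A private report still has a well-formed link, but it will not resolve for viewers.
    warning = None if published else "Report is private; publish it before sharing this link."
    return url, warning
```

The link has the same form in both states, which is why a copied link for a private report needs an explicit warning.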

Deleting a Report

From the Reports list, open the actions menu (three-dot icon) on any report row and select Delete. You’ll be asked to confirm before the report is permanently removed. Deleting a report does not delete the underlying runs.