
DatasetsManager.compareExperiments()

Added in: @mastra/core@1.4.0

Compares two or more experiments, producing per-item and per-scorer comparisons. Requires at least two experiment IDs.

Usage example

import { Mastra } from '@mastra/core'

const mastra = new Mastra({
  /* storage config */
})

const comparison = await mastra.datasets.compareExperiments({
  experimentIds: ['exp-baseline', 'exp-new'],
  baselineId: 'exp-baseline',
})

console.log(`Baseline: ${comparison.baselineId}`)

for (const item of comparison.items) {
  console.log(`Item ${item.itemId}:`)
  console.log(`  Input: ${JSON.stringify(item.input)}`)

  for (const [expId, result] of Object.entries(item.results)) {
    if (result) {
      console.log(
        `  ${expId}: output=${JSON.stringify(result.output)}, scores=${JSON.stringify(result.scores)}`,
      )
    }
  }
}

Parameters

experimentIds:

string[]
Array of experiment IDs to compare. Must contain at least two IDs.

baselineId?:

string
ID of the baseline experiment. Defaults to the first ID in `experimentIds`.
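
The two parameter rules above (at least two IDs, baseline defaulting to the first ID) can be checked before calling the method. The sketch below is only an illustration of those rules: `normalizeCompareArgs` and `CompareArgs` are hypothetical names, not part of `@mastra/core`, and the helper throws a plain `Error` rather than the `MastraError` the library raises.

```typescript
// Hypothetical argument shape, mirroring the documented parameters.
interface CompareArgs {
  experimentIds: string[]
  baselineId?: string
}

// Hypothetical helper: applies the documented rules up front.
// This is not Mastra's internal implementation.
function normalizeCompareArgs(args: CompareArgs): Required<CompareArgs> {
  if (args.experimentIds.length < 2) {
    throw new Error('compareExperiments requires at least two experiment IDs')
  }

  // Documented default: the first ID in `experimentIds` is the baseline.
  const baselineId = args.baselineId ?? args.experimentIds[0]

  return { experimentIds: args.experimentIds, baselineId }
}
```

The normalized object could then be passed straight to `mastra.datasets.compareExperiments()`, which makes the baseline explicit at the call site instead of relying on the default.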

Returns

Throws MastraError if fewer than 2 experiment IDs are provided.

result:

Promise<object>
Comparison results.

baselineId:

string
ID of the baseline experiment used for comparison.

items:

Array<object>
Per-item comparison data.

itemId:

string
ID of the dataset item.

input:

unknown | null
Input data for the item.

groundTruth:

unknown | null
Ground truth for the item.

results:

Record<string, { output: unknown; scores: Record<string, number | null> } | null>
Results keyed by experiment ID. Each entry contains the output and scorer results for that experiment.
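
Because `results` is keyed by experiment ID and each score may be `null`, summarizing a comparison takes a little care. The following is a minimal sketch of one way to do it, assuming the return shape documented above: the `Comparison`, `ComparisonItem`, and `ExperimentResult` types are declared locally for illustration rather than imported from `@mastra/core`, and `meanScoreDeltas` is a hypothetical helper, not a library function.

```typescript
// Local types copied from the documented return shape (assumed, not imported).
type ExperimentResult = { output: unknown; scores: Record<string, number | null> } | null

interface ComparisonItem {
  itemId: string
  input: unknown | null
  groundTruth: unknown | null
  results: Record<string, ExperimentResult>
}

interface Comparison {
  baselineId: string
  items: ComparisonItem[]
}

// Hypothetical helper: for every non-baseline experiment, the mean of
// (experiment score - baseline score) per scorer, across all items.
function meanScoreDeltas(comparison: Comparison): Record<string, Record<string, number>> {
  const sums: Record<string, Record<string, { total: number; count: number }>> = {}

  for (const item of comparison.items) {
    const baseline = item.results[comparison.baselineId]
    if (!baseline) continue // this item has no baseline result

    for (const [expId, result] of Object.entries(item.results)) {
      if (expId === comparison.baselineId || !result) continue

      for (const [scorer, score] of Object.entries(result.scores)) {
        const base = baseline.scores[scorer]
        // Skip scorers that are missing or null on either side.
        if (score === null || base === null || base === undefined) continue

        const perExperiment = (sums[expId] ??= {})
        const acc = (perExperiment[scorer] ??= { total: 0, count: 0 })
        acc.total += score - base
        acc.count += 1
      }
    }
  }

  const means: Record<string, Record<string, number>> = {}
  for (const [expId, scorers] of Object.entries(sums)) {
    const perScorer: Record<string, number> = {}
    for (const [scorer, { total, count }] of Object.entries(scorers)) {
      perScorer[scorer] = total / count
    }
    means[expId] = perScorer
  }
  return means
}
```

Calling `meanScoreDeltas(comparison)` on the value returned by `compareExperiments()` would give an object keyed by experiment ID and then by scorer name. A positive number means that experiment scored higher than the baseline on average for that scorer; items where either side has no score are left out of the average.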