DatasetsManager.compareExperiments()
Added in: @mastra/core@1.4.0
Compares two or more experiments, producing per-item and per-scorer comparisons. Requires at least two experiment IDs.
Usage example
import { Mastra } from "@mastra/core";

const mastra = new Mastra({ /* storage config */ });

const comparison = await mastra.datasets.compareExperiments({
  experimentIds: ["exp-baseline", "exp-new"],
  baselineId: "exp-baseline",
});

console.log(`Baseline: ${comparison.baselineId}`);

for (const item of comparison.items) {
  console.log(`Item ${item.itemId}:`);
  console.log(`  Input: ${JSON.stringify(item.input)}`);
  // An entry is null when that experiment has no result for this item.
  for (const [expId, result] of Object.entries(item.results)) {
    if (result) {
      console.log(`  ${expId}: output=${JSON.stringify(result.output)}, scores=${JSON.stringify(result.scores)}`);
    }
  }
}
Parameters
experimentIds:
string[]
Array of experiment IDs to compare. Must contain at least two IDs.
baselineId?:
string
ID of the baseline experiment. Defaults to the first ID in `experimentIds`.
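The parameter rules above (at least two IDs, baseline defaulting to the first) can be mirrored in a small pre-flight check on the caller's side. The sketch below is illustrative only: `resolveCompareArgs` and `CompareArgs` are hypothetical names, not part of `@mastra/core`, and the plain `Error` stands in for whatever the library throws internally.

```typescript
// Hypothetical local helper; NOT part of @mastra/core.
// It only mirrors the documented parameter rules so bad input is caught early.
interface CompareArgs {
  experimentIds: string[];
  baselineId: string;
}

function resolveCompareArgs(experimentIds: string[], baselineId?: string): CompareArgs {
  // Documented rule: at least two experiment IDs are required.
  if (experimentIds.length < 2) {
    throw new Error(
      `compareExperiments needs at least 2 experiment IDs, got ${experimentIds.length}`,
    );
  }
  // Documented rule: the baseline defaults to the first ID in experimentIds.
  return { experimentIds, baselineId: baselineId ?? (experimentIds[0] as string) };
}
```

The returned object has the same shape as the arguments in the usage example, so it could be passed straight to `compareExperiments`.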
Returns
Throws MastraError if fewer than 2 experiment IDs are provided.
result:
Promise<object>
Comparison results.
object
baselineId:
string
ID of the baseline experiment used for comparison.
items:
Array<object>
Per-item comparison data.
object
itemId:
string
ID of the dataset item.
input:
unknown | null
Input data for the item.
groundTruth:
unknown | null
Ground truth for the item.
results:
Record<string, { output: unknown; scores: Record<string, number | null> } | null>
Results keyed by experiment ID. Each entry contains the output and scorer results for that experiment.
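One way to consume this return shape is to compute, for each item and scorer, how far every other experiment's score is from the baseline. The following is a minimal sketch under stated assumptions: the `Comparison`, `ComparisonItem`, and `ScoreDelta` types and the `scoreDeltas` function are hypothetical local definitions that follow the fields documented above, not types exported by `@mastra/core`.

```typescript
// Local types modeled on the documented return shape (assumed, not imported).
type ExperimentResult = { output: unknown; scores: Record<string, number | null> } | null;

interface ComparisonItem {
  itemId: string;
  input: unknown | null;
  groundTruth: unknown | null;
  results: Record<string, ExperimentResult>;
}

interface Comparison {
  baselineId: string;
  items: ComparisonItem[];
}

interface ScoreDelta {
  itemId: string;
  experimentId: string;
  scorer: string;
  baseline: number;
  candidate: number;
  delta: number; // candidate minus baseline
}

// Hypothetical helper: per-item, per-scorer differences against the baseline.
function scoreDeltas(comparison: Comparison): ScoreDelta[] {
  const deltas: ScoreDelta[] = [];
  for (const item of comparison.items) {
    const base = item.results[comparison.baselineId];
    if (!base) continue; // no baseline result for this item
    for (const [experimentId, result] of Object.entries(item.results)) {
      if (experimentId === comparison.baselineId || !result) continue;
      for (const [scorer, candidate] of Object.entries(result.scores)) {
        const baseline = base.scores[scorer];
        // Skip scorers that are missing or null on either side.
        if (typeof baseline !== "number" || typeof candidate !== "number") continue;
        deltas.push({
          itemId: item.itemId,
          experimentId,
          scorer,
          baseline,
          candidate,
          delta: candidate - baseline,
        });
      }
    }
  }
  return deltas;
}
```

Items with a null baseline entry and scorers with null values are skipped here instead of being treated as zero, so that missing data does not show up as a regression.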