# Datasets overview

**Added in:** `@mastra/core@1.4.0`

Datasets are collections of test cases that you run experiments against to measure how well your agents and workflows perform. Each mutation creates a new version, so you can reproduce past experiments exactly. Pair datasets with [scorers](https://mastra.ai/docs/evals/overview) to track quality across prompts, models, or code changes.

## Usage

### Configure storage

Configure storage in your Mastra instance. Datasets require a storage adapter that provides the `datasets` domain:

```typescript
import { Mastra } from '@mastra/core'
import { LibSQLStore } from '@mastra/libsql'

export const mastra = new Mastra({
  storage: new LibSQLStore({
    id: 'my-store',
    url: 'file:./mastra.db',
  }),
})
```

### Accessing the datasets API

All dataset operations are available through `mastra.datasets`:

```typescript
const datasets = mastra.datasets

// Create a dataset
const dataset = await datasets.create({ name: 'my-dataset' })

// Retrieve an existing dataset
const existing = await datasets.get({ id: 'dataset-id' })

// List all datasets
const { datasets: all } = await datasets.list()
```

> **Info:** Visit the [`DatasetsManager` reference](https://mastra.ai/reference/datasets/datasets-manager) for the full list of methods.

## Studio

You can also manage datasets in [Studio](https://mastra.ai/docs/studio/overview).

After opening Studio, select **Datasets** from the sidebar to see all your available datasets or create a new one. To get started, select **Create Dataset** and set a name, description, and optional schemas. After confirming, you'll see the dataset details page with two tabs: **Items** and [**Experiments**](https://mastra.ai/docs/evals/datasets/running-experiments).

In the **Items** view you can add, update, and delete items, and view version history. Select **Add Item** to insert a new item with JSON editors for input and ground truth. From this view you can also import items in bulk from a CSV or JSON file.
When importing, map each column to the corresponding dataset field.

Select **Versions** to see the full history of changes to the dataset. After selecting **Compare Versions**, choose any two versions and select **Compare** to see a side-by-side diff of all items that were added, changed, or removed between those versions.

## Creating a dataset

Call [`create()`](https://mastra.ai/reference/datasets/create) with a name and optional description:

```typescript
import { mastra } from '../index'

const dataset = await mastra.datasets.create({
  name: 'translation-pairs',
  description: 'English to Spanish translation test cases',
})

console.log(dataset.id) // auto-generated UUID
```

### Defining schemas

You can enforce the shape of `input` and `groundTruth` by passing Zod schemas. Mastra converts them to JSON Schema at creation time:

```typescript
import { z } from 'zod'
import { mastra } from '../index'

const dataset = await mastra.datasets.create({
  name: 'translation-pairs',
  inputSchema: z.object({
    text: z.string(),
    sourceLang: z.string(),
    targetLang: z.string(),
  }),
  groundTruthSchema: z.object({
    translation: z.string(),
  }),
})
```

Items that don't match the schema are rejected at insert time.
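Because non-matching items are rejected at insert time, it can help to pre-check a batch locally before sending it. The following is a minimal sketch in plain TypeScript that uses neither Zod nor any Mastra API. `TranslationInput`, `isTranslationInput`, and `partitionBySchema` are hypothetical names introduced for illustration; they only mirror the `inputSchema` shown above.

```typescript
// Hypothetical shape mirroring the inputSchema above (not a Mastra type).
interface TranslationInput {
  text: string
  sourceLang: string
  targetLang: string
}

// Type guard: true only when all three required fields are strings.
function isTranslationInput(value: unknown): value is TranslationInput {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  return (
    typeof v.text === 'string' &&
    typeof v.sourceLang === 'string' &&
    typeof v.targetLang === 'string'
  )
}

// Split candidates into inputs that match the shape and inputs that do not.
function partitionBySchema(candidates: unknown[]): {
  valid: TranslationInput[]
  invalid: unknown[]
} {
  const valid: TranslationInput[] = []
  const invalid: unknown[] = []
  for (const candidate of candidates) {
    if (isTranslationInput(candidate)) valid.push(candidate)
    else invalid.push(candidate)
  }
  return { valid, invalid }
}

const { valid, invalid } = partitionBySchema([
  { text: 'Hello', sourceLang: 'en', targetLang: 'es' },
  { text: 'Goodbye', sourceLang: 'en' }, // missing targetLang
])

console.log(`valid: ${valid.length}, invalid: ${invalid.length}`)
```

A pre-check like this only reports problems earlier; the schema attached to the dataset remains the source of truth for what is accepted.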
## Adding items

Use [`addItem()`](https://mastra.ai/reference/datasets/addItem) for a single item or [`addItems()`](https://mastra.ai/reference/datasets/addItems) to insert in bulk:

```typescript
// Single item
await dataset.addItem({
  input: { text: 'Hello', sourceLang: 'en', targetLang: 'es' },
  groundTruth: { translation: 'Hola' },
})

// Bulk insert
await dataset.addItems({
  items: [
    {
      input: { text: 'Goodbye', sourceLang: 'en', targetLang: 'es' },
      groundTruth: { translation: 'Adiós' },
    },
    {
      input: { text: 'Thank you', sourceLang: 'en', targetLang: 'es' },
      groundTruth: { translation: 'Gracias' },
    },
  ],
})
```

## Updating and deleting items

[`updateItem()`](https://mastra.ai/reference/datasets/updateItem), [`deleteItem()`](https://mastra.ai/reference/datasets/deleteItem), and [`deleteItems()`](https://mastra.ai/reference/datasets/deleteItems) let you modify or remove existing items by `itemId`:

```typescript
await dataset.updateItem({
  itemId: 'item-abc-123',
  groundTruth: { translation: '¡Hola!' },
})

await dataset.deleteItem({ itemId: 'item-abc-123' })

await dataset.deleteItems({ itemIds: ['item-1', 'item-2'] })
```

## Listing and searching items

[`listItems()`](https://mastra.ai/reference/datasets/listItems) supports pagination and full-text search:

```typescript
// Paginated list
const { items, pagination } = await dataset.listItems({
  page: 0,
  perPage: 50,
})

// Full-text search
const { items: matches } = await dataset.listItems({
  search: 'Hello',
})

// List items at a specific version
const v2Items = await dataset.listItems({ version: 2 })
```

## Versioning

Every mutation to a dataset's items (add, update, or delete) bumps the dataset version. This lets you pin experiments to a specific snapshot of the data.
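To make these semantics concrete, here is a toy in-memory model of "every mutation bumps the version". It is a conceptual sketch only, not Mastra's storage implementation: `VersionedDataset` and its methods are hypothetical, and it assumes versions start at 0 for an empty dataset and increase by one per mutation, which may differ from what Mastra does internally.

```typescript
type Item = { id: string; input: unknown; groundTruth?: unknown }

// Toy model: keeps one immutable snapshot of the items per version.
class VersionedDataset {
  // Index 0 is the empty dataset; each mutation appends a new snapshot.
  private snapshots: Map<string, Item>[] = [new Map()]

  get version(): number {
    return this.snapshots.length - 1
  }

  // Return the snapshot for a version, or throw if it never existed.
  private snapshotAt(version: number): Map<string, Item> {
    const snapshot = this.snapshots[version]
    if (!snapshot) throw new Error(`Unknown version: ${version}`)
    return snapshot
  }

  // Store a new snapshot and return the version number it received.
  private commit(next: Map<string, Item>): number {
    this.snapshots.push(next)
    return this.version
  }

  addItem(item: Item): number {
    const next = new Map(this.snapshotAt(this.version))
    next.set(item.id, item)
    return this.commit(next)
  }

  updateItem(id: string, patch: Partial<Omit<Item, 'id'>>): number {
    const next = new Map(this.snapshotAt(this.version))
    const existing = next.get(id)
    if (!existing) throw new Error(`Unknown item: ${id}`)
    next.set(id, { ...existing, ...patch })
    return this.commit(next)
  }

  deleteItem(id: string): number {
    const next = new Map(this.snapshotAt(this.version))
    if (!next.delete(id)) throw new Error(`Unknown item: ${id}`)
    return this.commit(next)
  }

  // Defaults to the latest version; pass a number to read a past snapshot.
  listItems(version: number = this.version): Item[] {
    return Array.from(this.snapshotAt(version).values())
  }
}

const ds = new VersionedDataset()
ds.addItem({ id: 'a', input: 'Hello', groundTruth: 'Hola' }) // version 1
ds.addItem({ id: 'b', input: 'Goodbye', groundTruth: 'Adiós' }) // version 2
ds.updateItem('a', { groundTruth: '¡Hola!' }) // version 3
ds.deleteItem('b') // version 4

console.log(ds.listItems(2).length) // items as they existed at version 2
console.log(ds.listItems().length) // items at the latest version
```

Copying the full item map on every mutation keeps the sketch short; a real store would more likely record per-item changes against a version number.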
### Listing versions

Use [`listVersions()`](https://mastra.ai/reference/datasets/listVersions) to see the paginated history of versions:

```typescript
const { versions, pagination } = await dataset.listVersions()

for (const v of versions) {
  console.log(`Version ${v.version} - created ${v.createdAt}`)
}
```

### Viewing item history

See how a specific item changed across versions by calling [`getItemHistory()`](https://mastra.ai/reference/datasets/getItemHistory) with the `itemId`:

```typescript
const history = await dataset.getItemHistory({ itemId: 'item-abc-123' })

for (const row of history) {
  console.log(`Version ${row.datasetVersion}`, row.input, row.groundTruth)
}
```

### Pinning to a version

Fetch the exact items that existed at a past version:

```typescript
const items = await dataset.listItems({ version: 2 })
```

You can also pin experiments to a version; see [running experiments](https://mastra.ai/docs/evals/datasets/running-experiments).

> **Info:** Visit the [`Dataset` reference](https://mastra.ai/reference/datasets/dataset) for the full list of methods and parameters.

## Related

- [Running experiments](https://mastra.ai/docs/evals/datasets/running-experiments)
- [Scorers overview](https://mastra.ai/docs/evals/overview)
- [DatasetsManager reference](https://mastra.ai/reference/datasets/datasets-manager)
- [Dataset reference](https://mastra.ai/reference/datasets/dataset)