Metadata Filters

Mastra provides a unified metadata filtering syntax across all vector stores, based on MongoDB/Sift query syntax. Each vector store translates these filters into its native format. Operator support varies by store, as listed under each operator below.

Basic Example

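import { PgVector } from "@mastra/pg";

// Assumption: the connection string is read from an environment variable.
const connectionString = process.env.POSTGRES_CONNECTION_STRING!;
const store = new PgVector({ connectionString });

// Placeholder embedding; use a vector from your embedding model
// with the same dimension as the index.
const queryVector = [0.1, 0.2, 0.3];

const results = await store.query({
  indexName: "my_index",
  queryVector,
  topK: 10,
  filter: {
    category: "electronics", // Simple equality
    price: { $gt: 100 }, // Numeric comparison
    tags: { $in: ["sale", "new"] }, // Array membership
  },
});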

Supported Operators

Basic Comparison

$eq
Matches values equal to the specified value
{ age: { $eq: 25 } }
Supported by: All except Couchbase
$ne
Matches values not equal to the specified value
{ status: { $ne: 'inactive' } }
Supported by: All except Couchbase
$gt
Greater than
{ price: { $gt: 100 } }
Supported by: All except Couchbase
$gte
Greater than or equal
{ rating: { $gte: 4.5 } }
Supported by: All except Couchbase
$lt
Less than
{ stock: { $lt: 20 } }
Supported by: All except Couchbase
$lte
Less than or equal
{ priority: { $lte: 3 } }
Supported by: All except Couchbase

Array Operators

$in
Matches any value in the array
{ category: { $in: ["A", "B"] } }
Supported by: All except Couchbase
$nin
Matches none of the values
{ status: { $nin: ["deleted", "archived"] } }
Supported by: All except Couchbase
$all
Matches arrays containing all elements
{ tags: { $all: ["urgent", "high"] } }
Supported by: Astra, Pinecone, Upstash, MongoDB
$elemMatch
Matches array elements meeting criteria
{ scores: { $elemMatch: { $gt: 80 } } }
Supported by: LibSQL, PgVector, MongoDB

Logical Operators

$and
Logical AND
{ $and: [{ price: { $gt: 100 } }, { stock: { $gt: 0 } }] }
Supported by: All except Vectorize, Couchbase
$or
Logical OR
{ $or: [{ status: "active" }, { priority: "high" }] }
Supported by: All except Vectorize, Couchbase
$not
Logical NOT
{ price: { $not: { $lt: 100 } } }
Supported by: Astra, Qdrant, Upstash, PgVector, LibSQL, MongoDB
$nor
Logical NOR
{ $nor: [{ status: "deleted" }, { archived: true }] }
Supported by: Qdrant, Upstash, PgVector, LibSQL, MongoDB

Element Operators

$exists
Matches documents that contain the field
{ rating: { $exists: true } }
Supported by: All except Vectorize, Chroma, Couchbase

Custom Operators

$contains
Text contains substring
{ description: { $contains: "sale" } }
Supported by: Upstash, LibSQL, PgVector
$regex
Regular expression match
{ name: { $regex: "^test" } }
Supported by: Qdrant, PgVector, Upstash, MongoDB
$size
Array length check
{ tags: { $size: { $gt: 2 } } }
Supported by: Astra, LibSQL, PgVector, MongoDB
$geo
Geospatial query
{ location: { $geo: { type: "radius", ... } } }
Supported by: Qdrant
$datetime
Datetime range query
{ created: { $datetime: { range: { gt: "2024-01-01" } } } }
Supported by: Qdrant
$hasId
Vector ID existence check
{ $hasId: ["id1", "id2"] }
Supported by: Qdrant
$hasVector
Vector existence check
{ $hasVector: true }
Supported by: Qdrant
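
Because several operators are limited to specific stores, it can help to check a filter before sending it. The following is a minimal sketch, not part of Mastra: the table copies only the explicit "Supported by" lists above (operators documented as "All except ..." are left out), and the helper name unsupportedOperators is hypothetical.

```typescript
// Explicit allow-lists copied from the "Supported by" lines above.
const explicitSupport: Record<string, readonly string[]> = {
  $all: ["Astra", "Pinecone", "Upstash", "MongoDB"],
  $elemMatch: ["LibSQL", "PgVector", "MongoDB"],
  $not: ["Astra", "Qdrant", "Upstash", "PgVector", "LibSQL", "MongoDB"],
  $nor: ["Qdrant", "Upstash", "PgVector", "LibSQL", "MongoDB"],
  $contains: ["Upstash", "LibSQL", "PgVector"],
  $regex: ["Qdrant", "PgVector", "Upstash", "MongoDB"],
  $size: ["Astra", "LibSQL", "PgVector", "MongoDB"],
  $geo: ["Qdrant"],
  $datetime: ["Qdrant"],
  $hasId: ["Qdrant"],
  $hasVector: ["Qdrant"],
};

// Hypothetical helper: walk a filter and collect the restricted operators
// that the given store is not listed for.
function unsupportedOperators(filter: unknown, store: string): string[] {
  const found = new Set<string>();
  const visit = (node: unknown): void => {
    if (Array.isArray(node)) {
      node.forEach(visit);
    } else if (node !== null && typeof node === "object") {
      for (const [key, value] of Object.entries(node)) {
        const stores = explicitSupport[key];
        if (stores !== undefined && !stores.includes(store)) found.add(key);
        visit(value);
      }
    }
  };
  visit(filter);
  return Array.from(found);
}
```

For example, calling it with a filter that uses $all and $regex and the store name "Pinecone" reports only $regex, since Pinecone is listed for $all but not for $regex.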

Common Rules and Restrictions

  1. Field names cannot:

    • Contain dots (.) unless referring to nested fields
    • Start with $ or contain null characters
    • Be empty strings
  2. Values must be:

    • Valid JSON types (string, number, boolean, object, array)
    • Not undefined
    • Properly typed for the operator (e.g., numbers for numeric comparisons)
  3. Logical operators:

    • Must contain valid conditions
    • Cannot be empty
    • Must be properly nested
    • Can only be used at top level or nested within other logical operators
    • Cannot be used at field level or nested inside a field
    • Cannot be used inside an operator
    • Valid: { "$and": [{ "field": { "$gt": 100 } }] }
    • Valid: { "$or": [{ "$and": [{ "field": { "$gt": 100 } }] }] }
    • Invalid: { "field": { "$and": [{ "$gt": 100 }] } }
    • Invalid: { "field": { "$gt": { "$and": [{...}] } } }
  4. $not operator:

    • Must take an object as its value
    • Cannot take an empty object
    • Can be used at field level or top level
    • Valid: { "$not": { "field": "value" } }
    • Valid: { "field": { "$not": { "$eq": "value" } } }
  5. Operator nesting:

    • Logical operators must contain field conditions, not direct operators
    • Valid: { "$and": [{ "field": { "$gt": 100 } }] }
    • Invalid: { "$and": [{ "$gt": 100 }] }
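
Rules 1, 3, 4 and 5 above can be sketched as a small standalone validator. This is an illustration under stated assumptions, not Mastra's own validation code: the function name validateFilter is hypothetical, only $and, $or and $nor are treated as logical operators, and store-specific operators such as $hasId are passed through unchecked.

```typescript
const LOGICAL = new Set(["$and", "$or", "$nor"]);
const COMPARISON = new Set(["$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin"]);

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// True if a logical operator appears anywhere inside a field's value.
function containsLogical(v: unknown): boolean {
  if (Array.isArray(v)) return v.some((item) => containsLogical(item));
  if (!isPlainObject(v)) return false;
  return Object.entries(v).some(([k, inner]) => LOGICAL.has(k) || containsLogical(inner));
}

// Hypothetical validator: returns a list of problems, empty when none are found.
function validateFilter(filter: Record<string, unknown>): string[] {
  const problems: string[] = [];
  const visit = (node: Record<string, unknown>): void => {
    for (const [key, value] of Object.entries(node)) {
      if (LOGICAL.has(key)) {
        // Rule 3: logical operators need a non-empty array of conditions.
        if (!Array.isArray(value) || value.length === 0) {
          problems.push(`${key} must be a non-empty array of conditions`);
          continue;
        }
        for (const item of value) {
          if (isPlainObject(item)) visit(item);
          else problems.push(`${key} entries must be objects`);
        }
      } else if (key === "$not") {
        // Rule 4: $not takes a non-empty object.
        if (isPlainObject(value) && Object.keys(value).length > 0) visit(value);
        else problems.push("$not must be a non-empty object");
      } else if (COMPARISON.has(key)) {
        // Rule 5: a bare operator cannot stand where a field condition is expected.
        problems.push(`${key} must be attached to a field`);
      } else if (key === "" || key.includes("\0")) {
        // Rule 1: field names cannot be empty or contain null characters.
        problems.push("field names cannot be empty or contain null characters");
      } else if (containsLogical(value)) {
        // Rule 3: logical operators cannot be nested inside a field or operator.
        problems.push(`logical operator nested inside field "${key}"`);
      }
    }
  };
  visit(filter);
  return problems;
}
```

The valid and invalid examples listed in the rules above make convenient test inputs: the valid ones should produce an empty list and each invalid one should produce at least one problem.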

Store-Specific Notes

Astra

  • Nested field queries are supported using dot notation
  • Array fields must be explicitly defined as arrays in the metadata
  • Metadata values are case-sensitive

ChromaDB

  • Where filters only return results where the filtered field exists in metadata
  • Empty metadata fields are not included in filter results
  • Metadata fields must be present for negative matches (e.g., $ne won't match documents missing the field)
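
The effect of the last note can be illustrated with a small in-memory predicate. This is only a sketch of the described behavior, not ChromaDB's implementation; the name matchesNe and the Metadata type are hypothetical.

```typescript
type Metadata = Record<string, string | number | boolean>;

// Mirrors the note above: a record that lacks the field never matches $ne.
function matchesNe(metadata: Metadata, field: string, value: string | number | boolean): boolean {
  if (!(field in metadata)) return false;
  return metadata[field] !== value;
}

const withStatus: Metadata = { status: "active" };
const withoutStatus: Metadata = { title: "untitled" };

const a = matchesNe(withStatus, "status", "inactive"); // field present and different
const b = matchesNe(withoutStatus, "status", "inactive"); // field missing
```

Here a is true and b is false, so a query such as { status: { $ne: "inactive" } } would skip records that have no status field at all.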

Cloudflare Vectorize

  • Requires explicit metadata indexing before filtering can be used
  • Use createMetadataIndex() to index fields you want to filter on
  • Up to 10 metadata indexes per Vectorize index
  • String values are indexed up to first 64 bytes (truncated on UTF-8 boundaries)
  • Number values use float64 precision
  • Filter JSON must be under 2048 bytes
  • Field names cannot contain dots (.) or start with $
  • Field names limited to 512 characters
  • Vectors must be re-upserted after creating new metadata indexes to be included in filtered results
  • Range queries may have reduced accuracy with very large datasets (~10M+ vectors)
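
Some of these limits can be checked locally before a query is sent. The sketch below is an assumption-laden illustration rather than anything Mastra or Cloudflare ships: checkVectorizeFilter is a hypothetical name, and it treats every top-level key as a field name, which matches the operator tables above (Vectorize is not listed for $and or $or).

```typescript
// Limits quoted from the notes above.
const MAX_FILTER_BYTES = 2048;
const MAX_FIELD_NAME_LENGTH = 512;

// Hypothetical pre-flight check; returns a list of problems, empty when none are found.
function checkVectorizeFilter(filter: Record<string, unknown>): string[] {
  const problems: string[] = [];

  // Measure the serialized filter in UTF-8 bytes, not in characters.
  const bytes = new TextEncoder().encode(JSON.stringify(filter)).length;
  if (bytes >= MAX_FILTER_BYTES) {
    problems.push(`filter JSON is ${bytes} bytes; it must be under ${MAX_FILTER_BYTES}`);
  }

  for (const field of Object.keys(filter)) {
    if (field.includes(".") || field.startsWith("$")) {
      problems.push(`field name "${field}" contains a dot or starts with $`);
    }
    if (field.length > MAX_FIELD_NAME_LENGTH) {
      problems.push(`field name exceeds ${MAX_FIELD_NAME_LENGTH} characters`);
    }
  }
  return problems;
}
```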

LibSQL

  • Supports nested object queries with dot notation
  • Array fields are validated to ensure they contain valid JSON arrays
  • Numeric comparisons maintain proper type handling
  • Empty arrays in conditions are handled gracefully
  • Metadata is stored in a JSONB column for efficient querying

PgVector

  • Full support for PostgreSQL's native JSON querying capabilities
  • Efficient handling of array operations using native array functions
  • Proper type handling for numbers, strings, and booleans
  • Nested field queries use PostgreSQL's JSON path syntax internally
  • Metadata is stored in a JSONB column for efficient indexing

Pinecone

  • Metadata field names are limited to 512 characters
  • Numeric values must be within the range of ±1e38
  • Arrays in metadata are limited to 64KB total size
  • Nested objects are flattened with dot notation
  • Metadata updates replace the entire metadata object
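
To make the flattening note concrete, here is a generic sketch of what dot-notation flattening looks like. It is an illustration only: the function flattenMetadata is hypothetical, and the exact behavior of the Pinecone integration (for example, how arrays or null values are treated) is not specified in this document.

```typescript
type Metadata = Record<string, unknown>;

// Hypothetical illustration: collapse nested objects into dot-separated keys.
// Arrays and primitives are copied as-is.
function flattenMetadata(input: Metadata, prefix = ""): Metadata {
  const out: Metadata = {};
  for (const [key, value] of Object.entries(input)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      Object.assign(out, flattenMetadata(value as Metadata, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}

const flat = flattenMetadata({ product: { brand: "acme", specs: { weight: 2 } }, tags: ["sale"] });
// flat has the keys "product.brand", "product.specs.weight" and "tags".
```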

Qdrant

  • Supports advanced filtering with nested conditions
  • Payload (metadata) fields must be explicitly indexed for filtering
  • Efficient handling of geo-spatial queries
  • Special handling for null and empty values
  • Vector-specific filtering capabilities
  • Datetime values must be in RFC 3339 format
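
A simple way to satisfy the RFC 3339 requirement from JavaScript is Date.prototype.toISOString(), whose output is a valid RFC 3339 timestamp. The sketch below builds a $datetime condition in the shape shown in the operator table; the helper name createdAfter is hypothetical.

```typescript
// Hypothetical helper: build a $datetime range condition from a Date.
function createdAfter(date: Date) {
  // toISOString() always returns UTC in the form YYYY-MM-DDTHH:mm:ss.sssZ.
  return { $datetime: { range: { gt: date.toISOString() } } };
}

const datetimeFilter = { created: createdAfter(new Date(Date.UTC(2024, 0, 1))) };
// datetimeFilter.created.$datetime.range.gt is "2024-01-01T00:00:00.000Z"
```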

Upstash

  • 512-character limit for metadata field keys
  • Query size is limited (avoid large $in arrays, which become large IN clauses)
  • No support for null/undefined values in filters
  • Translates to SQL-like syntax internally
  • Case-sensitive string comparisons
  • Metadata updates are atomic

MongoDB

  • Full support for MongoDB/Sift query syntax for metadata filters
  • Supports all standard comparison, array, logical, and element operators
  • Supports nested fields and arrays in metadata
  • Filtering can be applied to both metadata and the original document content using the filter and documentFilter options, respectively
  • filter applies to the metadata object; documentFilter applies to the original document fields
  • No artificial limits on filter size or complexity (subject to MongoDB query limits)
  • Indexing metadata fields is recommended for optimal performance

Couchbase

  • Metadata filters are not currently supported. Filter client-side after retrieving results, or use the Couchbase SDK's Search capabilities directly for more complex queries.
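
Client-side filtering can be sketched as a small matcher applied to the returned results. This is a minimal illustration, not a Mastra API: the QueryResult shape is an assumption, the helper names are hypothetical, and only simple equality plus the basic comparison and $in/$nin operators are handled.

```typescript
// Assumed result shape: an id, a similarity score, and optional metadata.
interface QueryResult {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

// Evaluate one field condition against one metadata value.
function matchesField(value: unknown, condition: unknown): boolean {
  if (condition === null || typeof condition !== "object" || Array.isArray(condition)) {
    return value === condition; // simple equality
  }
  return Object.entries(condition).every(([op, operand]) => {
    const bothNumbers = typeof value === "number" && typeof operand === "number";
    switch (op) {
      case "$eq": return value === operand;
      case "$ne": return value !== operand;
      case "$gt": return bothNumbers && (value as number) > operand;
      case "$gte": return bothNumbers && (value as number) >= operand;
      case "$lt": return bothNumbers && (value as number) < operand;
      case "$lte": return bothNumbers && (value as number) <= operand;
      case "$in": return Array.isArray(operand) && operand.includes(value);
      case "$nin": return Array.isArray(operand) && !operand.includes(value);
      default: throw new Error(`Operator not handled in this sketch: ${op}`);
    }
  });
}

// Keep only results whose metadata satisfies every field condition (implicit AND).
function filterResults(results: QueryResult[], filter: Record<string, unknown>): QueryResult[] {
  return results.filter((r) =>
    Object.entries(filter).every(([field, cond]) => matchesField(r.metadata?.[field], cond)),
  );
}
```

Since filtering happens after retrieval, it may be worth requesting a larger topK than ultimately needed so that enough results remain once non-matching ones are dropped.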

Amazon S3 Vectors

  • Equality values must be primitives (string/number/boolean). null/undefined, arrays, objects, and Date are not allowed for equality. Range operators accept numbers or Date (Dates are normalized to epoch ms).
  • $in/$nin require non-empty arrays of primitives; Date elements are allowed and normalized to epoch ms. Array equality is not supported.
  • Implicit AND is canonicalized ({a:1,b:2} becomes {$and:[{a:1},{b:2}]}). Logical operators must contain field conditions, use non-empty arrays, and appear only at the root or within other logical operators (not inside field values).
  • Keys listed in nonFilterableMetadataKeys at index creation are stored but not filterable; this setting is immutable.
  • $exists requires a boolean value.
  • undefined/null/empty filters are treated as no filter.
  • Each metadata key name is limited to 63 characters.
  • Total metadata per vector: Up to 40 KB (filterable + non-filterable)
  • Total metadata keys per vector: Up to 10
  • Filterable metadata per vector: Up to 2 KB
  • Non-filterable metadata keys per vector index: Up to 10
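
The canonicalization of implicit AND and the normalization of Date values to epoch milliseconds, both described above, can be sketched as follows. This is an illustration, not the store's actual code: the function names are hypothetical, a single-condition filter is assumed to stay unwrapped, and the sketch does not check that Dates appear only under range operators or $in/$nin.

```typescript
type Filter = Record<string, unknown>;

// Replace every Date in the filter with its epoch-millisecond value.
function normalizeDates(value: unknown): unknown {
  if (value instanceof Date) return value.getTime();
  if (Array.isArray(value)) return value.map((item) => normalizeDates(item));
  if (value !== null && typeof value === "object") {
    const out: Filter = {};
    for (const [k, v] of Object.entries(value)) out[k] = normalizeDates(v);
    return out;
  }
  return value;
}

// Rewrite {a: 1, b: 2} as {$and: [{a: 1}, {b: 2}]}.
// Assumption: filters with zero or one top-level key are returned unwrapped.
function canonicalize(filter: Filter): Filter {
  const normalized = normalizeDates(filter) as Filter;
  const entries = Object.entries(normalized);
  if (entries.length <= 1) return normalized;
  return { $and: entries.map(([key, value]) => ({ [key]: value })) };
}

const canonical = canonicalize({ a: 1, b: 2 });
// canonical is { $and: [{ a: 1 }, { b: 2 }] }
```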