
TokenLimiterProcessor

The TokenLimiterProcessor is an output processor that limits the number of tokens in AI responses. It helps control response length by counting tokens and applying a configurable strategy when the limit is exceeded, with truncate and abort options for both streaming and non-streaming scenarios.

Usage example

import { TokenLimiterProcessor } from "@mastra/core/processors";

const processor = new TokenLimiterProcessor({
  limit: 1000,
  strategy: "truncate",
  countMode: "cumulative"
});

Constructor parameters

options:

number | Options
Either a plain number specifying the token limit, or a configuration options object
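Both forms are accepted. The following is a minimal sketch of how a `number | Options` argument can be normalized into a full options object; the helper name and the defaults shown are assumptions for illustration, not Mastra internals:

```typescript
// Illustrative only: normalizeOptions and the defaults below are
// hypothetical, not taken from the Mastra implementation.
interface Options {
  limit: number;
  strategy?: "truncate" | "abort";
  countMode?: "cumulative" | "part";
}

function normalizeOptions(options: number | Options): Required<Options> {
  // A bare number is shorthand for { limit: number }.
  const base = typeof options === "number" ? { limit: options } : options;
  return {
    limit: base.limit,
    strategy: base.strategy ?? "truncate",   // assumed default
    countMode: base.countMode ?? "cumulative", // assumed default
  };
}
```

Under this pattern, `new TokenLimiterProcessor(1000)` is shorthand for `new TokenLimiterProcessor({ limit: 1000 })`.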

Options

limit:

number
Maximum number of tokens to allow in the response

encoding?:

TiktokenBPE
Optional tiktoken encoding to use. Defaults to o200k_base, which is the encoding used by gpt-4o

strategy?:

'truncate' | 'abort'
Strategy when the token limit is reached: 'truncate' stops emitting chunks; 'abort' calls abort() to stop the stream

countMode?:

'cumulative' | 'part'
Whether to count tokens from the beginning of the stream or only within the current part: 'cumulative' counts all tokens emitted since the start; 'part' counts only the tokens in the current part

Returns

name:

string
Processor name set to 'token-limiter'

processOutputStream:

(args: { part: ChunkType; streamParts: ChunkType[]; state: Record<string, any>; abort: (reason?: string) => never }) => Promise<ChunkType | null>
Processes streaming output parts to limit token count during streaming
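The truncate strategy with cumulative counting can be sketched as follows. This is an illustrative approximation: it splits on whitespace instead of using tiktoken BPE tokens, and the `Chunk` shape and `makeStreamLimiter` name are hypothetical, not Mastra's types:

```typescript
// Hypothetical sketch of cumulative token limiting with the 'truncate'
// strategy; the real processor counts BPE tokens via tiktoken.
type Chunk = { text: string };

function makeStreamLimiter(limit: number) {
  let tokens = 0; // cumulative count across the whole stream
  // Returns the chunk to emit, or null once the limit would be exceeded.
  return (part: Chunk): Chunk | null => {
    // Naive whitespace "tokenizer" (an assumption for brevity).
    const partTokens = part.text.split(/\s+/).filter(Boolean).length;
    if (tokens + partTokens > limit) return null; // stop emitting chunks
    tokens += partTokens;
    return part;
  };
}
```

With the 'abort' strategy, the processor would instead call `abort()` at the point where this sketch returns null.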

processOutputResult:

(args: { messages: MastraMessageV2[]; abort: (reason?: string) => never }) => Promise<MastraMessageV2[]>
Processes final output results to limit token count in non-streaming scenarios

reset:

() => void
Resets the token counter (useful for testing or when reusing the processor)

getCurrentTokens:

() => number
Returns the current token count

getMaxTokens:

() => number
Returns the maximum token limit
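A minimal sketch of the counter surface described above (`reset`, `getCurrentTokens`, `getMaxTokens`); this is illustrative only, and the `add` method is a hypothetical stand-in for the processor's internal counting:

```typescript
// Illustrative counter sketch, not Mastra's implementation.
class CounterSketch {
  private current = 0;
  constructor(private readonly max: number) {}
  // Hypothetical stand-in for internal per-chunk counting.
  add(n: number): void { this.current += n; }
  reset(): void { this.current = 0; }
  getCurrentTokens(): number { return this.current; }
  getMaxTokens(): number { return this.max; }
}
```

Calling `reset()` between runs lets a single processor instance be reused without carrying over the previous run's count.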

Extended usage example

src/mastra/agents/limited-agent.ts
import { Agent } from "@mastra/core/agent";
import { TokenLimiterProcessor } from "@mastra/core/processors";

export const agent = new Agent({
  name: "limited-agent",
  instructions: "You are a helpful assistant",
  model: "openai/gpt-4o-mini",
  outputProcessors: [
    new TokenLimiterProcessor({
      limit: 1000,
      strategy: "truncate",
      countMode: "cumulative"
    })
  ]
});
