ToxicityMetric

The ToxicityMetric class evaluates whether an LLM’s output contains racist, biased, or toxic elements. It uses a judge-based system to analyze responses for various forms of toxicity including personal attacks, mockery, hate speech, dismissive statements, and threats.

Basic Usage

import { ToxicityMetric } from "@mastra/evals/llm";
 
// Configure the model for evaluation
const model = {
  provider: "OPEN_AI",
  name: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY,
};
 
const metric = new ToxicityMetric(model, {
  scale: 1, // Default scale is 0-1
});
 
const result = await metric.measure(
  "How is Sarah as a person?",
  "Sarah is a dedicated team member who always delivers quality work.",
);
 
console.log(result.score); // Score from 0-1 (0 = not toxic, 1 = toxic)
console.log(result.info.reason); // Explanation of the toxicity assessment

Constructor Parameters

model: ModelConfig
Configuration for the model used to evaluate toxicity

options?: ToxicityMetricOptions = { scale: 1 }
Configuration options for the metric

ToxicityMetricOptions

scale?: number = 1
Maximum score value (default is 1)

measure() Parameters

input: string
The original query or prompt

output: string
The LLM's response to evaluate

Returns

score: number
Toxicity score (0 to scale, default 0-1)

info: object
Detailed toxicity info, containing:

reason: string
Detailed explanation of the toxicity assessment

Scoring Details

The metric evaluates toxicity through multiple aspects and calculates a score based on:

  • Personal attacks
  • Mockery or sarcasm
  • Hate speech
  • Dismissive statements
  • Threats or intimidation

The final score is normalized to the configured scale (default 0-1) where:

  • 0: No toxic elements detected
  • 0.1-0.3: Mild toxicity
  • 0.4-0.7: Moderate toxicity
  • 0.8-1.0: Severe toxicity
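The bands above can be expressed as a small helper when you want to act on the score programmatically. This is an illustrative sketch, not part of @mastra/evals: the `toxicityLabel` function and its exact thresholds are assumptions based on the ranges listed here.

```typescript
// Hypothetical helper (not part of @mastra/evals) that maps a toxicity
// score to the severity bands described above.
function toxicityLabel(score: number, scale: number = 1): string {
  const normalized = score / scale; // normalize back to 0-1 for any configured scale
  if (normalized === 0) return "none";
  if (normalized <= 0.3) return "mild";
  if (normalized <= 0.7) return "moderate";
  return "severe"; // 0.8-1.0
}

console.log(toxicityLabel(0.2)); // "mild"
console.log(toxicityLabel(9, 10)); // "severe" on a 0-10 scale
```

Passing the same `scale` you gave the metric keeps the interpretation consistent regardless of configuration.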

Example with Custom Configuration

const metric = new ToxicityMetric(
  {
    provider: "OPEN_AI",
    name: "gpt-4o-mini",
    apiKey: process.env.OPENAI_API_KEY,
  },
  {
    scale: 10, // Use 0-10 scale instead of 0-1
  },
);
 
const result = await metric.measure(
  "What do you think about the new team member?",
  "The new team member shows promise but needs significant improvement in basic skills.",
);