
AI Beats Laboratory: A Multi-Agent Music Generation System

The AI Beats Laboratory is an interactive web application that generates musical beats and melodies using AI agents. Here's how it works:

Agents

The system uses two specialized Mastra agents:

  • A music reference agent that analyzes musical styles and references
  • A music generation agent that creates drum patterns and melodies

Here's the reference agent definition:

export const musicReferenceAgent = new Agent({
  name: "music-reference-agent",
  instructions: `
    You are given a style of music, an artist or song as a reference point. 
    First think about what keys and what drum patterns fit this reference point.
    Based on this knowledge, generate a drum pattern and a minimal melody that fits the style.
    Pick a key based on the style of the music. All notes should be in this key.
    `,
  model: {
    provider: "ANTHROPIC",
    name: "claude-3-5-sonnet-20241022",
    toolChoice: "auto",
  },
});

Here's the music generation agent definition:

export const musicAgent = new Agent({
  name: "music-agent",
  instructions: `
    
    For the pianoSequence:
    - Create wonderful melodies
    - Available notes:
      * High register: ['C5', 'B4', 'A4', 'G4']
      * Middle register: ['F4', 'E4', 'D4', 'C4']
      * Low register: ['B3', 'A3', 'G3']
    - Each note should have an array of step numbers (0-15)
    For the drumSequence:
    - Available sounds:
      * Core rhythm: ['Kick', 'Snare', 'HiHat']
      * Accents: ['Clap', 'OpenHat', 'Crash']
      * Percussion: ['Tom', 'Ride', 'Shaker', 'Cowbell']
    - Each sound should have an array of step numbers (0-15)
    Response format must be:
    {
      "pianoSequence": {
        "C5": [numbers],
        "B4": [numbers],
        // ... other piano notes
      },
      "drumSequence": {
        "Kick": [numbers],
        "Snare": [numbers],
        // ... other drum sounds
      }
    }
`,
  model: anthropic("claude-3-5-sonnet-20241022"),
});
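
The fetch calls shown later hit /api/agents/musicReferenceAgent/generate and /api/agents/musicAgent/generate, which assumes both agents are registered on the Mastra instance under those keys. A minimal sketch of that registration (the file paths are assumptions, not the repo's actual layout):

import { Mastra } from "@mastra/core";
import { musicAgent } from "./agents/music-agent";
import { musicReferenceAgent } from "./agents/music-reference-agent";

// Registering the agents exposes them at /api/agents/<key>/generate
export const mastra = new Mastra({
  agents: { musicAgent, musicReferenceAgent },
});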

It turns out LLMs are not very good at music, so most of the time was spent iterating on the system prompt. Anthropic’s Claude 3.5 Sonnet performed better than OpenAI’s GPT-4o.

User Interface Components

The main interface is built around the Sequencer component, which provides:

  • A 16-step grid for both piano notes and drum sounds
  • Interactive controls for playing/stopping sequences
  • Tempo controls
  • Export/share functionality
  • AI generation controls

The sequencer layout is defined by these constants:

const STEPS = 16;
const PIANO_NOTES = [
  "C5",
  "B4",
  "A4",
  "G4",
  "F4",
  "E4",
  "D4",
  "C4",
  "B3",
  "A3",
  "G3",
];
const DRUM_SOUNDS = [
  "Kick",
  "Snare",
  "HiHat",
  "Clap",
  "OpenHat",
  "Tom",
  "Crash",
  "Ride",
  "Shaker",
  "Cowbell",
];
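
The grid itself can be represented as a simple map from each row (a piano note or drum sound) to its list of active steps, which is also the shape the agents are asked to return. Here's a sketch of how toggling a cell might update that state; the function name and types are assumptions for illustration, not the component's actual code:

// A sequence maps each row label (e.g. "C5" or "Kick") to its active steps (0-15)
type Sequence = Record<string, number[]>;

// Toggle one cell in the grid: add the step if absent, remove it if present
const toggleStep = (sequence: Sequence, row: string, step: number): Sequence => {
  const current = sequence[row] ?? [];
  const active = current.includes(step)
    ? current.filter((s) => s !== step)
    : [...current, step].sort((a, b) => a - b);
  return { ...sequence, [row]: active };
};

// Example: activate step 4 on the Kick row
// const next = toggleStep(drumSequence, "Kick", 4);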

Audio System

The application uses the Web Audio API for sound generation. The audio system is initialized with:

// Create a single audio context for the entire application
let audioContext: AudioContext | null = null;

export const getAudioContext = () => {
  if (!audioContext) {
    audioContext = new AudioContext();
    // Resume audio context on creation to handle auto-play restrictions
    audioContext.resume();
  }
  return audioContext;
};

Piano notes are mapped to frequencies:

const NOTE_FREQUENCIES: { [key: string]: number } = {
  C5: 523.25,
  B4: 493.88,
  A4: 440.0,
  G4: 392.0,
  F4: 349.23,
  E4: 329.63,
  D4: 293.66,
  C4: 261.63,
  B3: 246.94,
  A3: 220.0,
  G3: 196.0,
};
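
With the shared AudioContext and the frequency table, a piano note can be synthesized with a plain OscillatorNode and a short gain envelope. A minimal sketch, assuming a simple triangle-wave voice (the app's actual synthesis details may differ):

// Play a single piano note for a short duration using the shared context
export const playNote = (note: string, duration = 0.3) => {
  const ctx = getAudioContext();
  const frequency = NOTE_FREQUENCIES[note];
  if (!frequency) return;

  const oscillator = ctx.createOscillator();
  const gain = ctx.createGain();

  oscillator.type = "triangle";
  oscillator.frequency.value = frequency;

  // Quick attack, then exponential decay to avoid clicks
  gain.gain.setValueAtTime(0.4, ctx.currentTime);
  gain.gain.exponentialRampToValueAtTime(0.001, ctx.currentTime + duration);

  oscillator.connect(gain);
  gain.connect(ctx.destination);

  oscillator.start();
  oscillator.stop(ctx.currentTime + duration);
};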

Generation Flow

When a user requests a new beat:

  • The user enters a prompt describing their desired musical style
  • The music reference agent analyzes the prompt and provides musical context
  • The music generation agent creates patterns based on this context
  • The patterns are rendered in the sequencer grid

The generation process is handled in handleGenerateSequence:

const handleGenerateSequence = async () => {
  if (!prompt) return;
  setIsGenerating(true);

  try {
    const ctx = getAudioContext();
    ctx.resume();

    // First, get musical analysis from reference agent
    const refAgent =
      getMastraFetchUrl() + "/api/agents/musicReferenceAgent/generate";
    const response = await window.fetch(refAgent, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        messages: [`Please analyze the users request "${prompt}"`],
      }),
    });

    const d = await response.json();
    setReference(d.text);

    // Then, generate the actual beat pattern using music agent
    const uri = getMastraFetchUrl() + "/api/agents/musicAgent/generate";
    const result = await window.fetch(uri, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        messages: [
          `Please make me a beat based on this information: ${d.text}`,
        ],
        output: {
          // ... JSON schema defining required notes and drum sounds
          // Each property (C5, B4, Kick, Snare, etc.) expects an array of integers
          // representing the steps where that note/sound should play
        },
      }),
    });

    const data = await result.json();

    // Map the response data to piano and drum sequences
    const pianoSequence = {
      C5: data.object.C5 || [],
      B4: data.object.B4 || [],
      // ... additional piano notes C5 through G3
    };

    const drumSequence = {
      Kick: data.object.Kick || [],
      Snare: data.object.Snare || [],
      // ... additional drum sounds
    };

    setDrumSequence(drumSequence);
    setPianoSequence(pianoSequence);
    stopSequence();
  } catch (error) {
    console.error("Error generating sequence:", error);
  } finally {
    setIsGenerating(false);
  }
};
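
The output field elided above is a structured-output schema: each piano note and drum sound becomes a property whose value is an array of step indices, as the comments describe. A hedged sketch of what such a schema could look like (abbreviated and illustrative only; the actual schema in the repo may differ):

// Illustrative structured-output schema (abbreviated)
const outputSchema = {
  type: "object",
  properties: {
    C5: { type: "array", items: { type: "integer", minimum: 0, maximum: 15 } },
    B4: { type: "array", items: { type: "integer", minimum: 0, maximum: 15 } },
    // ... one entry per remaining piano note (A4 through G3)
    Kick: { type: "array", items: { type: "integer", minimum: 0, maximum: 15 } },
    Snare: { type: "array", items: { type: "integer", minimum: 0, maximum: 15 } },
    // ... one entry per remaining drum sound (HiHat, Clap, etc.)
  },
};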

Sharing and Export

The system supports:

  • Sharing beats via URL encoding
  • Exporting to MIDI format
  • Generating variations of existing patterns
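
Sharing via URL encoding can be as simple as serializing both sequences to JSON and base64-encoding them into a query parameter. A sketch of that approach (the parameter name and helpers are assumptions, not the app's exact implementation):

type SequenceMap = Record<string, number[]>;

// Encode the current pattern into a shareable URL (the "beat" parameter is illustrative)
const encodeShareUrl = (pianoSequence: SequenceMap, drumSequence: SequenceMap): string => {
  const payload = JSON.stringify({ pianoSequence, drumSequence });
  return `${window.location.origin}?beat=${encodeURIComponent(btoa(payload))}`;
};

// Decode it back when the page loads with a ?beat= parameter
const decodeShareUrl = (): { pianoSequence: SequenceMap; drumSequence: SequenceMap } | null => {
  const encoded = new URLSearchParams(window.location.search).get("beat");
  return encoded ? JSON.parse(atob(decodeURIComponent(encoded))) : null;
};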

By the way, you can find all the code on GitHub and try the demo yourself here.
