Chat with YouTube Videos 🎬

Want to talk to YouTube videos? This template builds a RAG (Retrieval-Augmented Generation) system that lets you chat with the Mastra AI YouTube channel using AI! Ask questions, get summaries, find topics - just like having a conversation with video transcripts.

🎯 Built for Mastra AI: This template works with the Mastra AI YouTube channel, but you can use it with any YouTube channel!

What This Does 🤖

This template creates a RAG chatbot that knows the content of whatever YouTube videos you process:

🎯 Chat with Videos: Ask "What did they say about workflows?" and get smart answers from Mastra AI video transcripts.

📚 Learn Mastra: Build something fun while learning Mastra concepts like workflows, agents, and tools.

🔄 Use Any Channel: Want different videos? Just change the video IDs.

How It Works ✨

  1. Process Videos: Download and transcribe Mastra AI YouTube videos (or any videos!)
  2. Extract & Store: AI pulls out speakers, topics, and key insights, stores them as searchable chunks
  3. RAG Chat: When you ask questions, it retrieves relevant transcript chunks first
  4. Get YouTube Data: Uses the video IDs from the retrieved chunks to fetch metadata, thumbnails, and analytics via MCP

Why This is Cool 🚀

  • 🎬 Talk to Videos: Get instant answers about Mastra AI content
  • 🔍 Smart Search: Find videos by topic, speaker, or just ask in plain English
  • 📚 Learn by Doing: Pick up Mastra skills while building something useful
  • 🛠️ Ready to Use: Built for real use, not just demos

Quick Start 🏃‍♂️

What You Need 📋

  • Node.js 20.9.0+
  • PostgreSQL with pgvector
  • OpenAI API key
  • Deepgram API key
  • Smithery.ai account

Get Started 🔧

# Get the code
git clone <repository-url>
cd chat-ytchannel
pnpm install

# Add your API keys
cp .env.example .env
# Edit .env with your keys

# Set up database
pnpm db:push

# Start it up!
pnpm dev

Your API Keys 🔑

Put these in your .env file:

# Database
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/postgres

# AI Services
OPENAI_API_KEY=your_openai_key_here
DEEPGRAM_API_KEY=your_deepgram_key_here

# YouTube Access (you'll need a YouTube Data API key when setting up Smithery)
SMITHERY_API_KEY=your_smithery_key_here
SMITHERY_PROFILE=your_profile_here

💡 Note: When setting up your Smithery account, you'll need a YouTube Data API key to access YouTube videos and their metadata.
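
If you want the app to fail fast when a key is missing, a small startup check helps. This isn't part of the template - just a minimal sketch using zod (which the template already uses for schemas), with a hypothetical config.ts file name:

// config.ts (hypothetical) - validate required environment variables at startup
import { z } from "zod";

const envSchema = z.object({
  DATABASE_URL: z.string().min(1),
  OPENAI_API_KEY: z.string().min(1),
  DEEPGRAM_API_KEY: z.string().min(1),
  SMITHERY_API_KEY: z.string().min(1),
  SMITHERY_PROFILE: z.string().min(1),
});

// Throws a descriptive error at startup if anything is missing
export const env = envSchema.parse(process.env);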

How to Use It 🎮

1. Process Some Videos 🍿

First, let's teach the RAG system about some Mastra videos:

  1. Start the server: pnpm dev
  2. Go to the Mastra playground at http://localhost:4111/workflows
  3. Find the transcript-workflow and run it
  4. Provide these inputs:
    • videoId: Any YouTube video ID (e.g., "dQw4w9WgXcQ")
    • keywords: Array of relevant terms like ["AI", "Mastra", "workflows"]
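
For example, the input you paste into the playground form might look like this (the video ID below is just a placeholder - use a real video from the channel you care about):

{
  "videoId": "dQw4w9WgXcQ",
  "keywords": ["AI", "Mastra", "workflows"]
}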

This will:

  • Download the video's audio
  • Turn speech into text with Deepgram
  • Extract topics, speakers, and summaries with AI
  • Split into chunks and store with vector embeddings for RAG
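
Under the hood, the chunk-and-embed step works roughly like the sketch below. This is not the template's actual code: chunkTranscript is a made-up helper, and the embedding column on the chunks table is an assumption (the template may keep embeddings in a separate Mastra-managed vector table). It just shows the idea using the openai and pg packages:

// Simplified sketch: split a transcript into chunks, embed them, store for pgvector search
import OpenAI from "openai";
import { Pool } from "pg";

const openai = new OpenAI(); // reads OPENAI_API_KEY
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Hypothetical helper: break the transcript into overlapping character windows
function chunkTranscript(transcript: string, size = 2000, overlap = 200): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < transcript.length; i += size - overlap) {
    chunks.push(transcript.slice(i, i + size));
  }
  return chunks;
}

export async function embedAndStore(videoId: string, transcript: string) {
  const chunks = chunkTranscript(transcript);

  // Embed every chunk in a single request
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: chunks,
  });

  // Store each chunk next to its embedding so similarity search can find it later
  for (let i = 0; i < chunks.length; i++) {
    await pool.query(
      "INSERT INTO chunks (id, videoId, data, embedding) VALUES ($1, $2, $3, $4::vector)",
      [`${videoId}-${i}`, videoId, { text: chunks[i] }, JSON.stringify(data[i].embedding)]
    );
  }
}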

2. Start Chatting! 💬

Once you've processed some videos, go to http://localhost:4111/agents/youtubeAgent/chat and try asking:

For Mastra AI videos:

  • "What did they say about building agents?"
  • "Find videos about workflows"
  • "Summarize the latest tool tutorial"
  • "Who talks about memory in the videos?"
  • "Show me the most popular video about agents"

The RAG system works like this:

  1. Retrieves transcript chunks that match your question from the vector database
  2. Gets video IDs from those chunks
  3. Fetches YouTube metadata (titles, descriptions, thumbnails, view counts) via MCP
  4. Combines everything to give you complete answers with context
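
In code, the retrieval half of that flow looks roughly like this. Again, a sketch rather than the template's implementation: searchChunks stands in for videoSearchTool, getVideoMetadata stands in for the MCP-backed YouTube lookup, and the embedding column is the same assumption as in the earlier sketch:

// Simplified sketch: embed the question, find matching chunks, then enrich with YouTube data
import OpenAI from "openai";
import { Pool } from "pg";

const openai = new OpenAI();
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// 1. Embed the question and pull the closest transcript chunks (pgvector cosine distance)
async function searchChunks(question: string, limit = 5) {
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question,
  });
  const { rows } = await pool.query(
    `SELECT videoId AS "videoId", data
       FROM chunks
      ORDER BY embedding <=> $1::vector
      LIMIT $2`,
    [JSON.stringify(data[0].embedding), limit]
  );
  return rows as { videoId: string; data: { text: string } }[];
}

// Placeholder for the MCP-backed lookup (titles, thumbnails, view counts)
async function getVideoMetadata(videoId: string): Promise<{ videoId: string }> {
  return { videoId }; // the real template calls the YouTube MCP server here
}

// 2-4. Collect video IDs from the chunks, fetch their metadata, and hand it all to the agent
export async function buildAnswerContext(question: string) {
  const chunks = await searchChunks(question);
  const videoIds = [...new Set(chunks.map((c) => c.videoId))];
  const videos = await Promise.all(videoIds.map(getVideoMetadata));
  return { transcriptContext: chunks.map((c) => c.data.text), videos };
}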

💡 Note: You need to process at least one video first using the workflow above, or you won't have any data to chat about!

3. Use Different Channels 🌟

Want to build RAG on other YouTube channels? Just process their videos using the same workflow in the playground with any YouTube video ID and relevant keywords.

What's Inside? 🔧

This template shows you how to use key Mastra features:

The Tech Stack 🛠️

  • Workflows: Multi-step processes that handle errors and keep going
  • Tools: Functions the AI can call to search and analyze videos
  • Agents: Chatbots that remember conversations and use tools smartly
  • MCP: Connect to external APIs (like YouTube) seamlessly

The Main Parts ✨

  • transcriptWorkflow: Processes videos behind the scenes
  • youtubeAgent: Your chat buddy who knows about videos
  • videoSearchTool: Vector search over the stored transcript chunks, so answers stay fast and relevant
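
Roughly, those parts wire together like the sketch below. Treat it as paraphrased pseudocode against Mastra's API: import paths and option names can differ between Mastra versions, and the instructions, IDs, and stubbed tool body here are made up rather than copied from the template.

// Rough sketch of how the tool and agent fit together (Mastra APIs paraphrased)
import { Agent } from "@mastra/core/agent";
import { createTool } from "@mastra/core/tools";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// videoSearchTool: vector search over the stored transcript chunks
const videoSearchTool = createTool({
  id: "video-search",
  description: "Find transcript chunks relevant to a question",
  inputSchema: z.object({ query: z.string() }),
  execute: async ({ context }) => {
    // The real tool queries pgvector; this stub just echoes the query
    return { chunks: [] as string[], query: context.query };
  },
});

// youtubeAgent: the chat agent that calls the tool when it needs video context
export const youtubeAgent = new Agent({
  name: "youtubeAgent",
  instructions: "Answer questions using the transcript chunks returned by your tools.",
  model: openai("gpt-4o"),
  tools: { videoSearchTool },
});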

What Powers It ⚡

  • Database: PostgreSQL + pgvector for storing video transcripts and embeddings
  • AI: OpenAI for chat + Deepgram for transcription
  • YouTube: Access via Smithery.ai MCP servers (gets metadata, thumbnails, analytics)
  • Framework: Mastra with TypeScript

Database 🗄️

Simple schema that stores everything you need:

-- Videos: transcripts and AI insights
CREATE TABLE videos (
  id TEXT PRIMARY KEY,              -- YouTube video ID
  fullTranscript TEXT,              -- Complete transcript
  metadata JSONB,                   -- AI insights (summary, speakers, topics)
  createdAt TIMESTAMP,
  updatedAt TIMESTAMP
);

-- Chunks: searchable pieces
CREATE TABLE chunks (
  id TEXT PRIMARY KEY,
  videoId TEXT REFERENCES videos(id),
  data JSONB,                       -- Chunk content + metadata
  createdAt TIMESTAMP
);

Customize It 🎨

Add More Metadata

// In transcript-workflow.ts, add more fields
import { z } from "zod";

const videoDataSchema = z.object({
  summary: z.string(),
  sentiment: z.enum(["positive", "negative", "neutral"]),
  difficulty: z.enum(["beginner", "intermediate", "advanced"]),
  keyTopics: z.array(z.string()),
  speakers: z.array(z.string()),
});

Add Search Filters

// In retrieval-tool.ts, add date filtering
const queryVideos = async (filters?: {
  speaker?: string;
  tag?: string;
  dateRange?: { start: Date; end: Date };
}) => {
  // Your filtering logic
};
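
Calling it with a date filter might then look like this (just illustrating the shape - queryVideos above is still a stub):

// Example: only videos tagged "workflows" from the last 30 days
const recent = await queryVideos({
  tag: "workflows",
  dateRange: {
    start: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000),
    end: new Date(),
  },
});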

Commands 💻

# Development
pnpm dev                    # Start with playground
pnpm build                  # Build for production
pnpm start                  # Run production

# Database
pnpm db:push               # Update schema
pnpm db:studio             # View data

Common Issues 🔧

"Failed to download video"

  • Check the video ID is correct
  • Some videos (private, age-restricted, or region-locked) can't be downloaded
  • Try a different video

"Database connection failed"

  • Make sure PostgreSQL is running
  • Check your DATABASE_URL
  • Install pgvector extension

"No search results"

  • Process some videos first
  • Check your API keys work
  • Make sure the database has data

What Makes This Special ✨

Learn by Building

  • Real Example: Not just theory - you build something useful
  • Mastra Patterns: See how workflows, tools, and agents work together
  • Best Practices: Learn the right way to structure Mastra apps

Production Ready

  • Error Handling: Things break, but the app keeps working
  • Type Safety: TypeScript catches bugs before they happen
  • Performance: Built to handle lots of videos efficiently

Easy to Extend

  • Any Channel: Works with any YouTube channel
  • Add Features: Easy to add new search filters or metadata
  • Integration: Connect to other services via MCP

Ideas for More 💡

This template can become:

  • Multi-channel processor - Handle entire YouTube channels
  • Real-time updates - Auto-process new videos
  • Analytics dashboard - Track trends and engagement
  • Slack bot - Ask questions from your team chat
  • API service - Power other apps with video intelligence


Contributing 🤝

This is a learning template! Feel free to:

  • Fork it and make it your own
  • Submit improvements
  • Share what you built with it

Built with ❤️ using Mastra - The AI framework that makes building actually fun.