🔐 Mastra Governed RAG Template
A production-ready template for building secure, governed RAG (Retrieval-Augmented Generation) applications using Mastra's multi-agent orchestration framework. This template demonstrates enterprise-grade access control, document classification, and policy enforcement in AI applications.
📋 Table of Contents
- Why Governed RAG?
- Architecture
- Quick Start
- Demo Scenarios
- CLI Usage
- Project Structure
- Security Features
- Configuration
- How It Works
- UI Features
- API Reference
- Development
- Testing
- Deployment
- Performance
- Troubleshooting
- Contributing
- License
🎯 Why Governed RAG?
Traditional RAG systems retrieve and use any available document to answer questions. In enterprise settings, this is a critical security risk. Our Governed RAG solution ensures:
- 🛡️ Role-Based Access Control: Users only see documents they're authorized to access
- 🏷️ Document Classification: Automatic enforcement of public/internal/confidential classifications
- 👤 Identity Verification: JWT-based authentication with claims validation
- 🔍 Security-First Retrieval: Filters applied at the vector database level, not post-retrieval
- ✅ Answer Verification: Multi-agent validation ensures no data leakage
🏗️ Architecture
graph LR
  A[User Query + JWT] --> B[Identity Agent]
  B --> C[Policy Agent]
  C --> D[Retrieve Agent]
  D --> E[Rerank Agent]
  E --> F[Answer Agent]
  F --> G[Verifier Agent]
  G --> H[Secure Answer]
Multi-Agent Pipeline
- Identity Agent: Validates JWT and extracts user claims (roles, tenant, step-up status)
- Policy Agent: Converts claims into access filters based on security policies
- Retrieve Agent: Queries vector database with security filters applied
- Rerank Agent: Orders retrieved contexts by relevance
- Answer Agent: Generates response using ONLY authorized contexts
- Verifier Agent: Validates answer hasn't leaked unauthorized information
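The hand-off between the six agents can be read as a chain of typed functions. The sketch below is illustrative only: every type and function name in it (`Stages`, `runPipeline`, `AccessFilter`, and so on) is hypothetical and not taken from the template's source, and each stage is injected so that real agents or stubs can be plugged in.

```typescript
// All type and function names here are hypothetical, chosen for illustration.
interface Claims { roles: string[]; tenant: string; stepUp: boolean }
interface AccessFilter { tenant: string; roles: string[]; allowConfidential: boolean }
interface Context { docId: string; text: string; score: number }
interface Answer { text: string; citations: string[] }

// One function per agent, in the order shown in the diagram.
interface Stages {
  identity(jwt: string): Promise<Claims>;
  policy(claims: Claims): Promise<AccessFilter>;
  retrieve(query: string, filter: AccessFilter): Promise<Context[]>;
  rerank(query: string, contexts: Context[]): Promise<Context[]>;
  answer(query: string, contexts: Context[]): Promise<Answer>;
  verify(draft: Answer, contexts: Context[]): Promise<Answer>;
}

// Each stage only sees the output of the previous one, so the answer stage
// never receives a context that the security filter did not let through.
async function runPipeline(stages: Stages, jwt: string, query: string): Promise<Answer> {
  const claims = await stages.identity(jwt);
  const filter = await stages.policy(claims);
  const contexts = await stages.retrieve(query, filter);
  const ranked = await stages.rerank(query, contexts);
  const draft = await stages.answer(query, ranked);
  return stages.verify(draft, ranked);
}
```

Because retrieval runs before generation and takes the filter as an argument, an unauthorized document has no path into the answer stage in this arrangement.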
🚀 Quick Start
Prerequisites
- Node.js 20+
- Docker & Docker Compose
- OpenAI API key
Setup
- Clone the repository
git clone <repo-url>
cd mastra-governed-rag-template
npm install
- Configure environment
cp .env.example .env
Edit .env with your configuration:
# Required
OPENAI_API_KEY=your_openai_api_key_here
JWT_SECRET=your_jwt_secret_here

# Optional - customize as needed
OPENAI_MODEL=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
QDRANT_URL=http://localhost:6333
QDRANT_COLLECTION=governed_rag
TENANT=acme
- Start Qdrant vector database
docker-compose up -d
- Verify services are running
# Check Qdrant health (Qdrant exposes /healthz, /livez and /readyz)
curl http://localhost:6333/healthz

# Check Docker containers
docker ps
- Index sample documents
npm run build-cli
npm run cli index
- Start the development server
npm run dev
# Visit http://localhost:3000
🎮 Demo Scenarios
Scenario 1: Finance Employee
// JWT Claims: { roles: ["finance.viewer"], tenant: "acme" }
// Can access: Finance policies, public documents
// Cannot access: HR confidential data, engineering details
Scenario 2: Engineering Manager
// JWT Claims: { roles: ["engineering.admin"], tenant: "acme" }
// Can access: Engineering handbook, public documents
// Cannot access: Finance policies, HR confidential data
Scenario 3: HR Admin with Step-Up Auth
// JWT Claims: { roles: ["hr.admin"], tenant: "acme", stepUp: true }
// Can access: ALL documents including confidential
// Step-up authentication enables confidential access
🔧 CLI Usage
The template includes a powerful CLI for managing your governed RAG system:
# Index documents with security tags
npm run cli index

# Query with JWT authentication
npm run cli query "<jwt-token>" "What is the expense policy?"

# Run interactive demo
npm run cli demo

# Show help
npm run cli help
📁 Project Structure
├── src/
│   ├── mastra/
│   │   ├── agents/        # AI agents with specific responsibilities
│   │   ├── tools/         # JWT auth and vector query tools
│   │   ├── workflows/     # Orchestrated pipelines
│   │   ├── schemas/       # Zod validation schemas
│   │   └── config/        # OpenAI and logger configuration
│   ├── app/               # Next.js application
│   └── cli/               # Command-line interface
├── corpus/                # Sample documents with classifications
├── docker-compose.yml     # Qdrant vector database setup
└── .env.example           # Environment configuration template
🔐 Security Features
Document Classification
- Public: Accessible to all authenticated users
- Internal: Requires specific department roles
- Confidential: Requires admin role + step-up authentication
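The three classification rules can be expressed as a small function over the JWT claims. This is a sketch under stated assumptions, not the template's policy code: the names `allowedClassifications` and `isAdmin` are hypothetical, and treating any role that ends in `.admin` as an admin role is an assumption based on the role names used in this README.

```typescript
type Classification = "public" | "internal" | "confidential";

interface Claims {
  roles: string[];
  stepUp: boolean;
}

// Assumption: an "admin" role is any role ending in ".admin" (e.g. "hr.admin").
function isAdmin(claims: Claims): boolean {
  return claims.roles.some((role) => role.endsWith(".admin"));
}

// Returns the classification levels a caller may read under the rules above.
// Internal documents additionally need a matching department role, which is
// checked separately against the document's role tags.
function allowedClassifications(claims: Claims): Classification[] {
  const levels: Classification[] = ["public"];
  if (claims.roles.length > 0) levels.push("internal");
  if (isAdmin(claims) && claims.stepUp) levels.push("confidential");
  return levels;
}
```

For example, `allowedClassifications({ roles: ["hr.admin"], stepUp: false })` leaves out `confidential` until step-up authentication has been completed.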
Access Control Tags
- role:* - Role-based access (e.g., role:finance.viewer)
- tenant:* - Multi-tenant isolation (e.g., tenant:acme)
- classification:* - Document sensitivity level
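Tags follow a `prefix:value` convention, so they can be grouped by prefix before any comparison is made. The helper names below (`parseSecurityTags`, `matchesTags`) are hypothetical and only illustrate the idea; the template may represent tags differently internally.

```typescript
interface ParsedTags {
  roles: string[];
  tenants: string[];
  classifications: string[];
}

// Splits tags such as "role:finance.viewer" on the first colon and groups
// them by prefix. Tags with an unknown prefix or no colon are ignored here.
function parseSecurityTags(tags: string[]): ParsedTags {
  const parsed: ParsedTags = { roles: [], tenants: [], classifications: [] };
  for (const tag of tags) {
    const idx = tag.indexOf(":");
    if (idx <= 0) continue;
    const prefix = tag.slice(0, idx);
    const value = tag.slice(idx + 1);
    if (prefix === "role") parsed.roles.push(value);
    else if (prefix === "tenant") parsed.tenants.push(value);
    else if (prefix === "classification") parsed.classifications.push(value);
  }
  return parsed;
}

// Tag-level match only: the tenant must agree and at least one role must overlap.
// Classification rules (for example, public documents) are a separate check.
function matchesTags(tags: string[], roles: string[], tenant: string): boolean {
  const parsed = parseSecurityTags(tags);
  const tenantOk = parsed.tenants.indexOf(tenant) !== -1;
  const roleOk = parsed.roles.some((role) => roles.indexOf(role) !== -1);
  return tenantOk && roleOk;
}
```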
JWT Token Structure
{
  "sub": "user@example.com",
  "roles": ["finance.viewer", "engineering.admin"],
  "tenant": "acme",
  "stepUp": false,
  "exp": 1234567890
}
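Once a token's signature has been verified, the decoded payload is still untyped data and is worth checking against the structure above. The project lists Zod for runtime validation; to stay dependency-free, this sketch uses a plain type guard instead. `GovernedClaims`, `isGovernedClaims`, and `isExpired` are hypothetical names.

```typescript
interface GovernedClaims {
  sub: string;
  roles: string[];
  tenant: string;
  stepUp: boolean;
  exp: number; // seconds since the Unix epoch, as defined for JWT "exp"
}

// Narrows an untrusted (but already signature-verified) payload to the expected shape.
function isGovernedClaims(value: unknown): value is GovernedClaims {
  if (typeof value !== "object" || value === null) return false;
  const { sub, roles, tenant, stepUp, exp } = value as Record<string, unknown>;
  return (
    typeof sub === "string" &&
    Array.isArray(roles) &&
    roles.every((role) => typeof role === "string") &&
    typeof tenant === "string" &&
    typeof stepUp === "boolean" &&
    typeof exp === "number"
  );
}

// A token must not be accepted on or after its expiration time.
function isExpired(claims: GovernedClaims, nowMs: number = Date.now()): boolean {
  return claims.exp * 1000 <= nowMs;
}
```

Note that the sample `exp` value of 1234567890 lies in 2009, so a token carrying it would be rejected as expired.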
🛠️ Configuration
Environment Variables
Variable | Description | Default |
---|---|---|
OPENAI_API_KEY | OpenAI API key | Required |
OPENAI_MODEL | LLM model to use | gpt-4o-mini |
EMBEDDING_MODEL | Embedding model | text-embedding-3-small |
QDRANT_URL | Qdrant database URL | http://localhost:6333 |
QDRANT_COLLECTION | Collection name | governed_rag |
JWT_SECRET | JWT signing secret | Required |
TENANT | Default tenant ID | acme |
📊 How It Works
1. Document Indexing
Documents are chunked and embedded with security metadata:
{
  text: "Expense reports must be submitted within 30 days",
  docId: "finance-policy-001",
  securityTags: ["role:finance.viewer", "tenant:acme", "classification:internal"],
  classification: "internal"
}
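One way to produce records of this shape is to split the text first and then copy the document's security metadata onto every chunk, so that no chunk can be stored without its tags. The function name `chunkWithMetadata`, the word-packing strategy, and the `maxChars` default are all assumptions made for illustration; the template's actual chunker may work differently.

```typescript
type Classification = "public" | "internal" | "confidential";

interface SecureChunk {
  text: string;
  docId: string;
  securityTags: string[];
  classification: Classification;
}

// Packs whole words into chunks of at most `maxChars` characters.
// A single word longer than `maxChars` is kept whole as its own chunk.
function chunkWithMetadata(
  text: string,
  docId: string,
  securityTags: string[],
  classification: Classification,
  maxChars: number = 200
): SecureChunk[] {
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const pieces: string[] = [];
  let current = "";
  for (const word of words) {
    const candidate = current === "" ? word : current + " " + word;
    if (candidate.length > maxChars && current !== "") {
      pieces.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current !== "") pieces.push(current);
  // Every chunk inherits the parent document's security metadata.
  return pieces.map((piece) => ({
    text: piece,
    docId,
    securityTags: [...securityTags],
    classification,
  }));
}
```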
2. Query Processing
When a user queries the system:
- JWT is verified and claims extracted
- Claims are converted to security filters
- Vector search applies filters at database level
- Only authorized documents are retrieved
- Answer is generated from filtered contexts
- Final verification ensures no data leakage
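The step from claims to database-level filters could look like the following. The output uses Qdrant's filter syntax (`must`, `should`, `must_not`, with `match.value` and `match.any`), while the payload keys `securityTags` and `classification` are taken from the indexing example above and assumed to be stored as top-level payload fields. `buildFilter` and the `.admin` suffix convention are hypothetical.

```typescript
interface Claims { roles: string[]; tenant: string; stepUp: boolean }

// The subset of Qdrant's filter syntax used in this sketch.
type Condition =
  | { key: string; match: { value: string } }
  | { key: string; match: { any: string[] } }
  | QdrantFilter;

interface QdrantFilter {
  must?: Condition[];
  should?: Condition[];
  must_not?: Condition[];
}

function buildFilter(claims: Claims): QdrantFilter {
  // Assumption: admin roles end in ".admin", as in "hr.admin".
  const isAdmin = claims.roles.some((role) => role.endsWith(".admin"));
  const filter: QdrantFilter = {
    must: [
      // Tenant isolation always applies.
      { key: "securityTags", match: { value: `tenant:${claims.tenant}` } },
      // Either the document is public, or it carries one of the caller's role tags.
      {
        should: [
          { key: "classification", match: { value: "public" } },
          { key: "securityTags", match: { any: claims.roles.map((role) => `role:${role}`) } },
        ],
      },
    ],
  };
  // Confidential documents stay hidden unless an admin has completed step-up auth.
  if (!(isAdmin && claims.stepUp)) {
    filter.must_not = [{ key: "classification", match: { value: "confidential" } }];
  }
  return filter;
}
```

The resulting object would be passed as the `filter` field of a Qdrant search request, so excluded points are never returned to the application in the first place.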
3. Security Enforcement
Security is enforced at multiple levels:
- Database Level: Qdrant filters prevent unauthorized retrieval
- Agent Level: Each agent validates and enforces policies
- Answer Level: Verifier ensures response compliance
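At the answer level, one deterministic check that could complement an agent-based verifier is to confirm that every citation points at a context that actually passed the security filter. This is a sketch of that idea only, not a description of how the template's Verifier Agent is implemented; `verifyCitations` and `Verdict` are hypothetical names.

```typescript
interface Context { docId: string; text: string }
interface Answer { text: string; citations: { docId: string }[] }
interface Verdict { ok: boolean; unauthorizedCitations: string[] }

// Flags any citation whose docId was not among the authorized contexts.
function verifyCitations(answer: Answer, authorized: Context[]): Verdict {
  const allowed = new Set(authorized.map((context) => context.docId));
  const unauthorizedCitations = answer.citations
    .map((citation) => citation.docId)
    .filter((docId) => !allowed.has(docId));
  return { ok: unauthorizedCitations.length === 0, unauthorizedCitations };
}
```

A check like this catches fabricated or leaked source references, but it cannot detect leaked content that is paraphrased without a citation, which is where an LLM-based verifier adds value.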
🎨 UI Features
- Modern Dark Theme: Clean, professional interface
- Real-time Streaming: See answers generated in real-time
- Security Badges: Visual indicators for document classification
- Role Selector: Demo different access levels easily
- Citation Display: Transparent source attribution
- JWT Authentication: Built-in token management and role switching
- Responsive Design: Works on desktop and mobile devices
📚 API Reference
Chat API
POST /api/chat
Processes a user query through the governed RAG pipeline.
Request Body
{
  "message": "What is the expense policy?",
  "jwt": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}
Response
{
  "message": "The expense policy requires...",
  "citations": [
    {
      "docId": "finance-policy-001",
      "source": "finance-policy.md"
    }
  ]
}
Status Codes
- 200 - Successful response
- 401 - Invalid or missing JWT
- 403 - Insufficient permissions
- 500 - Internal server error
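A client can branch on these status codes before it tries to read the response body. The sketch below shows one possible mapping; the outcome names and the suggested reaction to each code (for example, retrying on 500) are assumptions rather than behavior defined by the API.

```typescript
type ChatOutcome =
  | { kind: "ok" }
  | { kind: "reauthenticate" }
  | { kind: "forbidden" }
  | { kind: "retry" }
  | { kind: "unexpected"; status: number };

// Serializes the request body in the shape documented for POST /api/chat.
function buildChatBody(message: string, jwt: string): string {
  return JSON.stringify({ message, jwt });
}

// Maps a documented status code to a client-side reaction.
function interpretStatus(status: number): ChatOutcome {
  switch (status) {
    case 200:
      return { kind: "ok" };
    case 401:
      return { kind: "reauthenticate" }; // invalid or missing JWT
    case 403:
      return { kind: "forbidden" }; // valid JWT, insufficient permissions
    case 500:
      return { kind: "retry" }; // server-side failure
    default:
      return { kind: "unexpected", status };
  }
}
```

Keeping 401 and 403 separate matters in the UI: the first calls for a new token, the second for a different role.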
Indexing API
POST /api/index
Indexes documents into the vector database with security metadata.
Request Body
{
  "documents": [
    {
      "content": "Document content here...",
      "metadata": {
        "docId": "doc-001",
        "source": "example.md",
        "classification": "internal",
        "securityTags": ["role:finance.viewer", "tenant:acme"]
      }
    }
  ]
}
Response
{
  "indexed": 1,
  "skipped": 0,
  "errors": []
}
🛠️ Development
Code Style
The project uses TypeScript with strict mode enabled. Follow these guidelines:
- Use explicit types; avoid any
- Prefer interfaces over types for object shapes
- Use Zod schemas for runtime validation
- Follow existing naming conventions
Adding New Agents
- Create agent file in src/mastra/agents/
- Define input/output schemas
- Implement agent logic with tools
- Add to workflow in src/mastra/workflows/
- Update tests
Example agent structure:
import { createAgent } from "@mastra/core";
import { z } from "zod";

export const myAgentSchema = z.object({
  input: z.string(),
  output: z.string()
});

export const myAgent = createAgent({
  name: "myAgent",
  instructions: "Your agent instructions here",
  model: {
    provider: "openai",
    name: "gpt-4o-mini"
  }
});
Environment Variables
Variable | Required | Default | Description |
---|---|---|---|
OPENAI_API_KEY | ✅ | - | OpenAI API key |
OPENAI_BASE_URL | ❌ | https://api.openai.com/v1 | Custom API endpoint |
OPENAI_MODEL | ❌ | gpt-4o-mini | LLM model name |
EMBEDDING_MODEL | ❌ | text-embedding-3-small | Embedding model |
QDRANT_URL | ❌ | http://localhost:6333 | Qdrant database URL |
QDRANT_COLLECTION | ❌ | governed_rag | Vector collection name |
JWT_SECRET | ✅ | - | JWT signing secret |
TENANT | ❌ | acme | Default tenant ID |
LOG_LEVEL | ❌ | info | Logging level |
🧪 Testing
Running Tests
Currently, the project doesn't include automated tests. To add testing:
- Install testing dependencies
npm install --save-dev jest @types/jest ts-jest
- Create test configuration
// jest.config.js
module.exports = {
  preset: 'ts-jest',
  testEnvironment: 'node',
  roots: ['<rootDir>/src', '<rootDir>/tests'],
  testMatch: ['**/__tests__/**/*.ts', '**/?(*.)+(spec|test).ts'],
  collectCoverageFrom: [
    'src/**/*.ts',
    '!src/**/*.d.ts',
  ],
};
- Add test scripts
{
  "scripts": {
    "test": "jest",
    "test:watch": "jest --watch",
    "test:coverage": "jest --coverage"
  }
}
Manual Testing
Use the CLI to test different scenarios:
# npm only forwards flags to the script after a "--" separator

# Test authentication
npm run cli -- auth --role finance.viewer

# Test document indexing
npm run cli -- index --file corpus/finance-policy.md

# Test querying with different roles
npm run cli -- query --role engineering.admin "What is our git workflow?"
Integration Testing
Test the full pipeline:
- Start services: docker-compose up -d
- Index documents: npm run cli index
- Test web interface: npm run dev
- Test API endpoints with curl or Postman
🚢 Deployment
Docker Deployment
docker-compose up --build
Production Considerations
- Use managed Qdrant Cloud or self-host with persistent storage
- Implement proper JWT issuer with your identity provider
- Set up monitoring and audit logging
- Configure rate limiting and DDoS protection
- Use environment-specific secrets management
Cloud Deployment Options
Vercel (Recommended for Next.js)
# Install Vercel CLI
npm i -g vercel

# Deploy
vercel --prod
Set environment variables in Vercel dashboard:
- OPENAI_API_KEY
- JWT_SECRET
- QDRANT_URL (use Qdrant Cloud)
AWS/GCP/Azure
Use the provided docker-compose.yml with your cloud provider's container services.
Self-Hosted
# Production build
npm run build
npm start

# With PM2 process manager
npm install -g pm2
pm2 start npm --name "governed-rag" -- start
⚡ Performance
Optimization Tips
- Vector Search Performance
  - Use HNSW index in Qdrant for faster similarity search
  - Optimize embedding dimensions (1536 for text-embedding-3-small)
  - Consider quantization for large datasets
- Caching Strategy
  - Enable Redis for JWT token caching
  - Cache frequently accessed document embeddings
  - Implement query result caching for common questions
- Scaling Considerations
  - Horizontal scaling: Multiple Next.js instances behind load balancer
  - Database: Use Qdrant cluster for high availability
  - Queue: Add message queue for async document indexing
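Before introducing Redis, the caching idea can be tried with a small in-process cache that expires entries after a fixed time. The class below is a minimal sketch, not part of the template: `TtlCache` is a hypothetical name, and the injectable clock exists only to make the expiry behavior easy to test.

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or has expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  // Stores a value that stays valid for `ttlMs` milliseconds from now.
  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

If validated JWT claims are cached this way, the TTL should stay shorter than the token lifetime so that an expired or revoked token is not served from cache.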
Performance Metrics
Operation | Typical Latency | Optimization |
---|---|---|
JWT Validation | < 10ms | In-memory cache |
Vector Search | 50-200ms | HNSW index |
LLM Generation | 1-3s | Streaming response |
Document Indexing | 100-500ms per doc | Batch processing |
🐛 Troubleshooting
Common Issues
1. Qdrant Connection Failed
# Check if Qdrant is running
docker ps | grep qdrant

# Check Qdrant logs
docker logs governed-rag-qdrant

# Restart if needed
docker-compose restart qdrant
2. JWT Authentication Errors
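import * as jose from "jose";

// Verify JWT secret matches
console.log('JWT_SECRET length:', process.env.JWT_SECRET?.length);

// Check token structure (decodeJwt does NOT verify the signature)
// `token` is the JWT string under inspection
const decoded = jose.decodeJwt(token);
console.log('JWT claims:', decoded);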
3. No Documents Retrieved
# Check if documents are indexed (the count endpoint expects POST)
curl -X POST http://localhost:6333/collections/governed_rag/points/count \
  -H "Content-Type: application/json" \
  -d '{"exact": true}'

# Verify security tags
curl http://localhost:6333/collections/governed_rag/points/scroll \
  -H "Content-Type: application/json" \
  -d '{"limit": 5, "with_payload": true}'
4. OpenAI API Errors
- Verify API key is valid
- Check rate limits and quotas
- Ensure model name is correct (gpt-4o-mini)
Debug Mode
Enable detailed logging:
# Set environment variable
export LOG_LEVEL=debug

# Or in .env file
LOG_LEVEL=debug
Performance Issues
- Slow Vector Search
  - Check Qdrant index status
  - Monitor memory usage
  - Consider index optimization
- High Memory Usage
  - Implement batch processing for large document sets
  - Use streaming for large responses
  - Monitor vector storage efficiency
- API Timeouts
  - Increase timeout limits in Next.js config
  - Implement request queuing
  - Add circuit breaker pattern
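For the circuit breaker suggestion, a minimal state machine is enough to stop hammering a failing upstream such as the LLM API. This is a generic sketch of the pattern, not code from the template: `CircuitBreaker`, its thresholds, and the injectable clock are all hypothetical. The caller asks `canRequest()` before each call and reports the result afterwards.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True when a call may be attempted. Once the cooldown has elapsed, the
  // breaker moves to half-open and lets trial calls through again.
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  // A success closes the breaker and resets the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  // A failure during a trial, or reaching the threshold, opens the breaker.
  recordFailure(): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.threshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }
}
```

In a request handler, a `false` from `canRequest()` would typically translate into an immediate error response instead of waiting for another timeout.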
Monitoring
Add observability to your deployment:
// Example: Custom metrics
const metrics = {
  queries_total: 0,
  auth_failures: 0,
  avg_response_time: 0
};

// Log important events
// (claims, startTime, and contexts come from the surrounding request handler)
console.log(JSON.stringify({
  timestamp: new Date().toISOString(),
  event: 'query_processed',
  user_role: claims.roles,
  response_time_ms: Date.now() - startTime,
  documents_retrieved: contexts.length
}));
🤝 Contributing
We welcome contributions! Here's how you can help:
Getting Started
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes
- Add tests if applicable
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
Contribution Guidelines
- Follow the existing code style and conventions
- Add documentation for new features
- Include tests for bug fixes and new functionality
- Update the README if needed
- Ensure all CI checks pass
Areas for Contribution
- Security Enhancements: Additional authentication methods, policy engines
- Performance Optimizations: Caching, indexing, query optimization
- New Agents: Specialized agents for different domains
- Documentation: Tutorials, examples, API docs
- Testing: Unit tests, integration tests, e2e tests
- UI/UX: Interface improvements, accessibility features
Reporting Issues
Please use GitHub Issues to report bugs or request features. Include:
- Clear description of the issue
- Steps to reproduce
- Expected vs actual behavior
- Environment details (Node.js version, OS, etc.)
Code of Conduct
This project follows the Contributor Covenant Code of Conduct.
📄 License
MIT License - See LICENSE for details.
🙏 Acknowledgments
Built with:
- Mastra - Multi-agent orchestration framework
- Qdrant - Vector database
- OpenAI - Language models
- Next.js - React framework
- Tailwind CSS - Styling
🌟 Use Cases
This template is perfect for organizations that need secure AI applications:
Healthcare
- Patient record access control
- HIPAA compliance for medical queries
- Role-based access to sensitive health data
Financial Services
- Regulatory compliance (SOX, PCI-DSS)
- Customer data protection
- Risk assessment document access
Legal
- Confidential case file management
- Attorney-client privilege enforcement
- Document privilege classification
Government
- Classified information handling
- Clearance-level access control
- Multi-agency data sharing
Enterprise
- HR policy and confidential data
- Intellectual property protection
- Competitive intelligence security
🔮 Roadmap
- Multi-tenancy improvements: Better tenant isolation
- Advanced policies: Time-based access, geo-restrictions
- Audit trails: Comprehensive logging and compliance reports
- Performance optimizations: Advanced caching, query optimization
- Additional LLM providers: Anthropic Claude, Azure OpenAI
- Fine-grained permissions: Document-level and field-level access
- Real-time updates: Live document synchronization
- Analytics dashboard: Usage metrics and security insights
📞 Support
- Documentation: Check this README and CLAUDE.md
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Discord: Mastra Community
Built for the Mastra Template Hackathon 🏆
This template showcases how to build secure, enterprise-ready AI applications with proper governance and access control. Perfect for industries requiring strict data security: healthcare, finance, legal, and government.