Weekly conversations about building AI agents

Join Mastra cofounders Shane Thomas and Abhi Aiyer for weekly conversations on breaking AI news, guests from the industry, and the technical challenges of building AI agents.

Listen on

YouTube Spotify Apple Podcasts

New episodes Mondays at 12PM Pacific · Live on YouTube and X

Latest episodes

July 15, 2026

#98

Three Models Drop: Grok 4.5, Meta Muse & GPT-5.6 Sol (Which Nuked a Business) | This Week In AI

Three model drops, a lawsuit, and a couple of genuine horror stories. Shane and Abhi cover the busiest week in a while — Grok 4.5, Meta's Muse Spark 1.1, and the one everyone actually tried, GPT-5.6 Sol. Sol lands right around Fable on the benchmarks at a third the price, and the receipts are wild: it beats Fable on cost in real usage, tops the Code Arena front-end board — and, in a few now-infamous posts, deleted someone's Mac files and canceled every active Stripe subscription a business had while its owner slept. Meanwhile Anthropic extends Fable again (compute crunch and all), Apple sues OpenAI over trade secrets in the same 24 hours OpenAI's head of safety quit, China floats restricting its open models, and the "own your stack" argument gets a Satya Nadella cosign. Plus software factories at Uber and Sierra, Cognition's SWE-1.7, Bun rewritten in Rust, TypeScript 7, 1X's eerily good robot hands, and distilling your own small model with Inference Autotune.

July 9, 2026

#97

Is Sonnet 5 Even Worth It? Plus: Fable's Back, and War on Tokenmaxxing | This Week In AI

Anthropic lived rent-free in everyone's feed this week. Claude Sonnet 5 launched as "the most agentic Sonnet yet" — and within 24 hours, the government lifted the export controls on Fable 5 and Mythos 5. Shane and Abhi unpack both, plus the catch: Fable came back wrapped in new classifiers that quietly fall it back to Opus for coding, and it's coming off subscription plans on July 7.

July 7, 2026

#96

The Software Factory: Dex Horthy on Shipping Fast Without AI Slop

Dex Horthy has spent the past year studying what happens to a codebase when you let AI write all of it and stop reading the output. His answer has a name: slop. And it's not just about sloppy prompting. In this episode, the HumanLayer founder Dex Horthy joins Shane and Abhi to draw the software factory on a whiteboard — the whole pipeline that builds the thing that builds the thing — and show exactly where teams tear out the parts that were holding quality up. He moves from the naive factory to the fully agentic one to the "lights-off" version where humans stop reading code entirely, then explains why that dream quietly rots a codebase after three to six months. Dex, Shane, and Abhi also talk about how coding models are trained. Every benchmark rewards one thing: did the test pass? Nowhere in the loop does a model get punished for leaving behind code that a smaller model, or a human six months later, can't work in. That gap is why architecture planning, sprint planning, and code review still can't be skipped. You'll also hear why you're not drowning in PRs so much as drowning in bad ones, what a longer-horizon benchmark that actually catches slop would look like, and how many times Dex has watched the HumanLayer launch film.

July 2, 2026

#95

GPT-5.6 Needs Government Approval Now, npm Finally Gets More Security, and Fable Returns

GPT-5.6 is here — Sol, Terra, and Luna — but for the first time, the US government decides who gets access, approving customers one by one. Shane and Abhi dig into what that precedent means for competition, open models, and every lab that comes next.

June 26, 2026

#94

Mastra Got Hacked. Here's What We Learned

Mastra got hacked. In this special edition of Security Corner, Shane Thomas and Abhi Aiyer break down exactly what happened when a supply chain attack hit Mastra's npm packages — an attack that appears to trace back to hackers in North Korea. They're joined by Ismail Pelaseyed, co-founder and CTO of Superagent, for the outside view on how these campaigns actually work.

June 25, 2026

#93

GPT-5.5 Beats Fable, Cursor Takes On GitHub & Midjourney Scans Your Body | This Week in AI

Fable is gone, and the race to replace it is already on. Shane and Abhi open with the fallout — reports that Mythos breached classified NSA systems, the ID-verification future that may follow, and David Sacks' "you asked for it" take. Then the story flips: OpenAI's GPT-5.5 Cyber lands state-of-the-art on Cyber Gym, beating Fable and Mythos on the underlying benchmarks, with OpenAI Daybreak shipping alongside it. GPT-5.6 leaks, then slips to mid-July. GLM 5.2 arrives near Opus 4.8, tops Design Arena, and runs on a Mac. Cursor goes after GitHub with Origin, adds mobile and its own frontier model on Colossus after the SpaceX deal closes. Vercel ships Eve, Fred Schott counters with Flue, Claude lands in Slack, and Midjourney — yes, the image company — unveils a full-body medical scanner headed for spas.

June 24, 2026

#92

How AI Broke Open Source Security | Security Corner with Ismail Pelaseyed

Open source is under attack, and AI changed the math. In this Security Corner, Ismail Pelaseyed, co-founder and CTO of Superagent, joins Shane and Abhi to break down how the software supply chain became the soft underbelly of everything we build. An attack that once took an army of researchers and weeks of work now takes about an hour, and the attacker no longer needs a frontier model to pull it off. Ismail traces how most breaches begin, why phishing has become almost impossible to spot, and how a single poisoned dependency can cascade across an entire ecosystem. You'll get concrete steps any maintainer or developer can take today: switching package managers, enabling the security scanners that ship for free, and standing up an adversarial agent that hunts for chained exploits before an attacker finds them. Ismail also warns that the same instincts protecting enterprises may be quietly strangling open source itself. You'll hear why he thinks the big registries have dropped the ball, what a "Darwinian GitHub" would mean for anyone shipping a new package, and the one move he believes can keep the ecosystem alive.

June 18, 2026

#91

Claude Fable 5: Launched, Hyped, Banned by the Government | This Week In AI

Fable came and went in a week. Shane and Abhi break down the strangest model launch yet — Claude Fable 5, a Mythos-class model Anthropic said was too capable to release widely, until a US export-control directive made it vanish for everyone. We cover the whole arc: the launch hype, the pricing that had people spending $1,000 a day, the silent fallbacks to Opus, the system-prompt leak, and the snitch that triggered a 90-minute shutdown. Then a stacked back half — China's open-weight surge, OpenRouter Fusion, loop engineering vs loopcraft, the agent-learning gold rush, OpenAI filing to go public, and Dario's new essay.

June 11, 2026

#90

Loop Engineering, OpenAI Sites & the Great China Model Shift | This Week In AI

Shane and Abhi are in person this week — live from the CodeRabbit office — for a packed AI news rundown. The big theme: loop engineering. Boris Cherny (head of Claude Code) and Peter Steinberger both landed the same take within days — stop prompting your agents, start designing the loops that prompt them. We walk the whole evolution: the traditional loop, the Ralph loop, /goal, and Claude Code's new dynamic workflows — and debate whether "stop prompting" is real insight or just clickbait. Plus: Anthropic engineers shipping 8x more code (and the "depressed employees" reply), agentic traffic passing human traffic on the web for the first time, OpenAI's Codex Sites taking aim at Lovable, Cognition's $10M AI Productivity Guarantee and Devin Desktop, Cloudflare acquiring VoidZero, the accelerating shift to Chinese models (Lindy going 100% DeepSeek), Notion disabling Anthropic models over reliability, a big open-model dump (Gemma 4, Magenta RealTime 2, Miso One, Nemotron, Liquid, GLM 5.1, MiniMax M3), funding rounds (Suno, Supabase), Brian Chesky's new AI lab, and whether AI is actually profitable yet.

June 8, 2026

#89

Inside an AI-Native Company | Michael Grinich, WorkOS

WorkOS quietly powers the auth and enterprise layer under OpenAI, Anthropic, Cursor, and a long list of companies building the AI era. So when its founder decides his entire company should learn to code, it's worth asking why. Michael Grinich joins Shane and Abhi to talk about what "AI-native" actually looks like from the inside — the in-house coding agent his team built instead of buying, the all-day internal hackathons, and the operating principle he now hires for: everybody codes. He shares how AI is reshaping go-to-market and not just engineering, why he braced for pushback and got the opposite, and the one trait that separates the people who thrive from the people who stall.

June 3, 2026

#88

Opus 4.8, Anthropic's S-1, MiniMax M3 & NVIDIA Pays You to Host a Data Center | This Week In AI

Anthropic filed to go public. The S-1 lands the same week as a $65 billion Series H at a $965 billion valuation, a claimed first profitable quarter, and a home listing that takes Anthropic stock as payment. Opus 4.8 shipped mid-week, and Shane and Abhi give the vibe check: a polish pass over 4.7 more than a step change, still trailing GPT-5.5 on DeepSWE, with GPT-5.6 spotted in Codex logs. The quieter, more interesting release: mid-conversation system messages, steering a model mid-task without breaking the prompt cache. MiniMax M3 is out, agent-tuned and long-context at a fraction of frontier pricing. NVIDIA used Computex to push AI onto the desktop with Vera, RTX Spark, and a Windows DGX Station that runs trillion-parameter models locally, plus a startup unit that bolts onto your house and pays you for compute. OpenAI brings Codex to Windows and ships private MCP servers. Claude Code's dynamic workflows get cloned by pi within a day, Devin raises $1B, and the model-vs-harness debate heats up. Plus continual learning as the next wave, a $500M accidental Claude bill, Corgi's seven-days-a-week firestorm, and a GitHub Star Party pick. 🔗 LINKS Pope on AI: https://x.com/pontifex/status/2060322763718725798 Kuzma on TBPN: https://x.com/tbpn/status/2060374399031632176 Opus 4.8: https://x.com/alexalbert__/status/2060043196655362358 Mid-conversation system messages: https://x.com/swyx/status/2060044644193624253 Opus 4.8 on DeepSWE: https://x.com/arrakis_ai/status/2060757773579956640 GPT-5.6 sighting: https://x.com/hqmank/status/2060334752369160472 Anthropic S-1: https://x.com/AnthropicAI/status/2061478052257841495 Series H: https://x.com/anthropicai/status/2060061347522433422 First profitable quarter: https://techcrunch.com/2026/05/20/anthropic-says-its-about-to-have-its-first-profitable-quarter/ $500M Claude bill: https://x.com/Polymarket/status/2060034216906068131 Zillow listing: https://x.com/Yuchenj_UW/status/2060776120380010932 Mythos class model: https://x.com/kimmonismus/status/2060047510853312557 MiniMax M3: https://x.com/minimax_ai/status/2061266317815296322 Computex in 12 minutes: https://youtu.be/ugNnw4lAMWA NVIDIA Vera: https://x.com/nvidianewsroom/status/2061298380022726734 RTX Spark: https://x.com/nvidia/status/2061313474005737829 DGX Station for Windows: https://x.com/nvidianewsroom/status/2061307670607319201 Home AI data center: https://x.com/w1nklerr/status/2060091525413884408 Codex on Windows: https://x.com/OpenAI/status/2060428604727771421 Private MCP servers: https://x.com/openaidevs/status/2059703536825565499 Dynamic workflows: https://x.com/claudedevs/status/2060044853279617150 pi dynamic workflows: https://x.com/micLivs/status/2060115468531499224 Devin raises $1B: https://x.com/cognition/status/2059660758531940856 grok-build: https://x.com/xai/status/2060392249402552457 Codex as QA: https://x.com/steipete/status/2061208638027395490 Cua on Windows: https://x.com/trycua/status/2059688960838828391 Elad Gil on liftoff: https://x.com/eladgil/status/2061129428084887593 Continual learning pivot: https://x.com/swyx/status/2061206120233054327 Trajectory: https://trajectory.ai/ Motion Studio: https://x.com/_adishj/status/2059666916835463646 Koji AI tutor: https://x.com/suekhim/status/2060378988606878147 Asana acquires Stack AI: https://x.com/techcrunch/status/2060091143556395421 Corgi raises $106M: https://x.com/nico_laqua/status/2060028908704243782 Coral board with Gemma: https://x.com/googlegemma/status/2059740184930074758 Cloudflare web search: https://x.com/cherryjimbo/status/2060717359979958513 Voice-controlled computer: https://x.com/farzatv/status/2060865350036750847 rift: https://github.com/anomalyco/rift

May 29, 2026

#87

Karpathy Joins Anthropic, China Ships Another Price Cut, Anthropic's SpaceX Bill - This Week In AI

Shane and Abhi are back with AI news! Andrej Karpathy joined Anthropic. The OpenAI co-founder and former Tesla AI head said he wants to "get back to R&D" at the LLM frontier. Same week, Greg Brockman posted "the model alone is no longer the product." Elon Musk announced SpaceX is offering AI compute as a service at significant scale, with Anthropic as the flagship customer. Tom Brown confirmed Anthropic is scaling on GB200 capacity in Colossus 2 through June. Anthropic is reportedly paying SpaceX $1.25B a month — a $15B run rate to one vendor. OpenAI offered $2M in tokens to every YC startup in the current batch in exchange for equity. WorkOS launched auth.md, an open protocol for agents to register for services on the web, with Cloudflare and Firecrawl as launch partners. The Chinese labs kept pushing. DeepSeek made their 75% discount permanent on V4-Pro. The architecture behind the price: V4's KV cache is 100x smaller, ~3GB VRAM for 1M tokens. Qwen shipped 3.7-Max. MiniMax teased a similar move. Anthropic shipped self-hosted sandboxes and MCP tunnels for Managed Agents, /workflows in Claude Code that replaces the LLM orchestrator with code, /usage for per-component token attribution, and a first-party security plugin. Google had a week. Gemini 3.5 Flash launched at 3x input and 6x output the price of 3 Flash — Theo's math shows it costs 2x more to run than 3.1 Pro on similar tasks. Gemini Omni lost a side-by-side to Seedance 2.0. Jack Wotherspoon shipped Antigravity CLI, Philipp Schmid debuted Managed Agents in the Gemini API, and Google open-sourced Agent Executor. Cursor shipped Composer 2.5 and published CursorBench. Datacurve released DeepSWE as a harder agentic coding benchmark. Supply-chain attacks kept rolling: Mini Shai-Hulud hit antv, TrapDoor crypto stealers spread across npm/PyPI/Crates.io, Megalodon injected 5,718 commits into 5,561 GitHub repos in six hours, and GitHub itself disclosed unauthorized access to its internal repositories. Cloudflare published their experience running Anthropic's Mythos against 50 of their own repos. The MCP 2026-07-28 RC is stateless. ElevenLabs launched Music v2 and Speech Engine. Runway shipped Aleph 2.0. Accenture laid off 11,000 in an $865M AI restructuring. Exa raised $250M at $2.2B. OpenRouter raised $113M.

Frequently asked questions

What is AI Agents Hour?

AI Agents Hour is a weekly podcast hosted by Mastra cofounders Shane Thomas and Abhi Aiyer. Each episode covers breaking AI news, conversations with guests from the industry and technical deep dives on building AI agents. New episodes air every Monday at 12PM Pacific, live on YouTube and X.

Is AI Agents Hour free to listen to?

AI Agents Hour is free to listen to on YouTube, Spotify and Apple Podcasts.

When do new episodes of AI Agents Hour air?

New episodes of AI Agents Hour air every Monday at 12PM Pacific. Episodes stream live on YouTube and X and are available after broadcast on YouTube, Spotify and Apple Podcasts.

Where can I listen to AI Agents Hour?

AI Agents Hour is available on YouTube, Spotify and Apple Podcasts.

Who are the guests on AI Agents Hour?

AI Agents Hour features guests from the industry. Past guests include Ismail Pelaseyed from Superagent and Charlie Holtz from Conductor. Shane Thomas and Abhi Aiyer also cover breaking AI news and go deep on the technical challenges of building AI agents.

Ship better agents

Get Started Documentation