
Is TypeScript the King of AI? Plus CopilotKit, TSAI Preview, and Mastra Agent Studio

November 5, 2025

Is TypeScript the new king of AI? Today we talk about Mastra Agent Studio, discuss the AI news (including a bunch of new models), chat with Atai from CopilotKit, and then preview TSAI Conf (in-person tickets are sold out, but you can attend virtually; go to tsconf.ai to register).

Guests in this episode

Atai Barkai

CopilotKit

Uli Barkai

CopilotKit

Episode Transcript

4:15

Hello everyone and welcome to AI Agents Hour. What up? We are in a different location than normal today. So, we're in person clearly. We're hanging out and we

4:27

thought it'd be fitting, because we had a whole bunch of the Mastra team in town, to have the live stream at a bar. So, we've joked about it, we talked about it, and now it's happening. This might be a one-time experiment. We may never do it again. Or maybe this is just the new vibe. We'll see.

4:47

A lot of people from our team are literally over there working right now. So, they may drop in. The sound is probably going to be a little lower quality than normal. You might have some background noise. Let us know

5:01

in the chat how bad it is. We can try to do a few things on our end to improve it, but we're not going to be able to completely fix it. But we have a pretty sweet show lined up today. We're going to be talking AI news like always. We're

5:14

going to talk about a recent Mastra launch. We're going to have Atai from CopilotKit coming on. And then we're going to talk about the TSAI conference, which is this week, in a few days. 100%.

5:25

It's going to be pretty sick. We are at a place called Victory Hall. If you're in SF, you should check it out. Usually

5:32

no one ever comes here, which is why we come here, but today of all days it is popping in here. So it's a busy day. Yeah. They knew we were coming, I guess. Yeah. So if you're

5:45

interested in finding us, now you know. We just doxed ourselves. But hey, come say hi, have a drink. But we're going to start with talking about, yeah, something that I saw which

5:58

I thought was kind of interesting. Before we dive into everything else, saw this tweet come by from Theo. Okay, TypeScript is officially the number one language on GitHub. And

6:17

that's cool. And, if you want to read the actual full post, you can see, yeah, "AI leads TypeScript to #1." That's literally the title of the blog post from Octoverse 2025, and if you go down in here, you know, you can read the whole post, but it basically mentions that

6:39

TypeScript is now on top. Look at that curve too. Yeah, it essentially talks about how TypeScript has overtaken Python and JavaScript to become the most used language on GitHub, reflecting how developers are reshaping their toolkits.

6:58

This marks the most significant language shift in more than a decade. Damn. It's almost like we talked about this many times. Almost how we imagined it. Hey, Ryan Carson's here. Hey, Ryan. Hey, Ryan.

7:09

Tell us how the sound quality is because we're a little nervous about it, but you know, we're having fun. Having fun. We're in SF. We're at a bar. We're just live streaming from here. All right. So, yeah. So, that's pretty

7:22

cool. TypeScript is clearly becoming more and more popular. TypeScript ships.

7:29

This is kind of a lead-in to what we'll be talking about later when we talk about TSAI Conf. TypeScript ships. But I wanted to talk through quickly, before we go into the news for today, a little bit about a recent launch we had last week. So, one of the things we've been hearing over and over: if you've used Mastra, you've probably seen we have this really nice playground

7:53

that, actually, funny story, for the first version of Mastra, we were building this playground and we basically pulled it out. We're like, it's not good enough to ship with. It doesn't work. So, the first

8:04

version of Mastra didn't have it, but we built it. We just pulled it out right at the last minute because it just sucked. Yeah. But we knew it was important, or at least we had the idea that

8:15

it would be important, because getting into AI, you know, for ourselves, we've kind of come from the web world. We've had like web admins. We've had this idea of wanting to visually see the progress you're making, and test, and iterate. And so we added it back in. We've been iterating on it a little bit, but it was kind of

8:32

always something that we had but never really focused on. But what we kept hearing, over and over, is people just wanted to figure out, like, how do I deploy the playground? How do I use this playground in production? We

8:44

heard that over and over. And so we pretty quickly realized that the playground was a little bit bigger than just a small testing environment. And so we've rebranded it to Studio, or Agent Studio. And now you can just

8:58

deploy it on Mastra Cloud for free and collaborate with others on your team. Ship your agents to the rest of your team to iterate on more quickly. And yeah, so that came out last week.

9:10

Dude, Studio is something that we thought was cool. We didn't realize how useful it was until you get very deep into your agent career, let's just say. And then people without the Studio always tell us, I wish I had a Studio where I was. So, um,

9:32

yeah, I'm glad. Oh, we have a bunch of comments. Yeah, Ruben, looking forward to the conference this week. I'll be watching online. We'll be watching you. Yeah, thanks for tuning in. We'll be here.

9:44

Mauricio from Brazil. And then this one's a good one. Oh, Ro, this is exactly why we built it, because we kept hearing: oftentimes you'll have multiple technical people, you know, multiple engineers, multiple developers, and then you'll have someone who's kind of like a technical PM or engineering manager that wants to be able to collaborate and see the progress. And so, yeah, you

10:09

basically nailed the use case there, which is: how do we get those slightly less technical people that aren't touching code involved early in the project? Also, a fun story: Ro and I have been to the Dogpatch Saloon together. So, thanks for calling in.

10:25

All right. Okay. So, it's time. We do this every

10:30

week. We dive into some of the news. Yep. And you know, today we have, you know,

10:37

we'll try to go through this kind of quickly just because we do have a guest coming up. There's also some drama, too. Okay, let's start with the drama. Yesterday's timeline was crazy: drama between Sam Altman and Elon Musk.

10:53

Yeah. Um, dude, it was crazy. Like, and we're talking about OpenAI, of course, today. Yeah. This is it. This is on the list. Yeah. Let's talk about the drama first. Okay.

11:04

So, I'm going to do a play-by-play. Okay. So, Sam Altman posted this picture. It's like a tale in three

11:16

steps, where he bought a Tesla Roadster, and then he asked to cancel, or he asked to get refunded, and then the third picture is, like, you know, a "not found" email, which is hilarious. Okay, we have to also establish that that's pretty funny as a roast, right? Then Elon gets all defensive, and he's just like, "Hey, well, someone made a nonprofit

11:44

turn into a for-profit," and all this nonprofit stuff about OpenAI. And now, you guys, go watch it. Go look it up on the timeline. It's hilarious, dude. But that was yesterday. Yeah. Well, I mean, Elon was

11:55

recently on the All-In podcast talking about the lawsuit that he has against OpenAI. And, yep, Ruben brings up, yeah, there's been some recent OpenAI health or legal drama, where I've heard, just as of today, I don't know, I haven't validated this, but maybe ChatGPT is not going to give you medical advice anymore. So, there's a whole bunch of

12:15

it's always a little bit of drama. They also released the court transcript of Ilya's, uh, board thing. I read it. It's hilarious, dude.

12:28

This is so funny. Yeah, it feels like watching an episode of, like, Suits or something, you know? Honestly, at some point there will be, you know, a Silicon Valley show or a Social Network movie made around this time. You just know it will be, and it feels like we're kind of living it, which is fun, because you get that

12:47

entertainment while we're all shipping stuff. I watched the Uber movie with Joseph Gordon-Levitt as, whatever that guy is, um, Travis. That movie is tight, but it's also, like, being in startups and stuff, you know, it's not for us, it's for the general masses, and

13:06

This drama is hilarious, though. All right, so let's talk about some other news. Uh, I did see something that maybe teased OpenAI might be planning to IPO. They figured out how to go from a

13:18

nonprofit to a for-profit in some way. I don't know all the legal machinations, but they've figured it out. They found some legal maneuvering to make it happen. So that is, you know, likely

13:29

coming, at an estimated one trillion. That's a lot of money, you know, if you work at OpenAI. Yeah. Good for you. But they have a bunch of other things

13:40

they've released as well. So they're doing a collaboration with AWS, and, you know, there's a lot of information around how they're going to be using a lot of AWS resources. But in the post they talk about not just buying an allotment of GPUs but buying CPUs as well. So that's interesting. Like, they want it all.

14:04

OpenAI wants it all. What are those CPUs for? I wonder.

14:10

Yeah, that is kind of interesting. They've released something called Aardvark. It's an agent that finds and fixes security bugs. Uses GPT-5. So pretty sick. Yeah, it's nice that

14:23

they are, you know, building some of this stuff. They did release something open source as well. So, last but not least, as far as OpenAI things they've released in the last week: gpt-oss-safeguard, which is essentially like guardrails, right? And we've been talking about

14:40

guardrails for quite a bit. It comes as a slightly larger and a pretty small model, and you can write your own custom safety policies, and then it'll try to be the guardrail that enforces those safety policies. Yeah, this is really cool. And I think this is telegraphing what we're going to see in 2026: an SLM movement where you

15:02

have smaller models doing specific things that make sense. We had the homies from Superagent on a couple weeks ago for the episode that will forever be known as the

15:17

condom episode, but this is kind of that same vibe. Yeah. And, you know, we talked with Osmosis, we talked with Andy, about reinforcement learning. And so I think this idea of smaller models, well-tuned toward your

15:31

specific use case, could be better in certain situations. It's not going to replace these large models. The large models have a place, but I think we'll continue to see that more and more. Yeah, model routing will be very

15:44

interesting in 2026. You know, Ruben has an idea. Maybe we should do that. We'll make a video about using open models and small language models with Mastra. Yeah, we should do that.
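As a quick aside for readers, the safeguard idea mentioned a moment ago, where you author custom safety policies and a small model enforces them, can be sketched in plain TypeScript. This is a toy illustration only: the real safeguard models interpret free-text policies with an LLM judge, whereas here a trivial keyword matcher stands in, and the policy names and phrases are made up.

```typescript
interface SafetyPolicy {
  name: string;
  // Illustrative stand-in for an LLM judge: phrases this policy disallows.
  disallowedPhrases: string[];
}

interface PolicyVerdict {
  allowed: boolean;
  violatedPolicies: string[];
}

// Check text against every policy and collect the ones it violates.
function enforcePolicies(text: string, policies: SafetyPolicy[]): PolicyVerdict {
  const lower = text.toLowerCase();
  const violated = policies
    .filter(p => p.disallowedPhrases.some(ph => lower.includes(ph.toLowerCase())))
    .map(p => p.name);
  return { allowed: violated.length === 0, violatedPolicies: violated };
}

// Example custom policy, echoing the ChatGPT medical-advice chatter earlier.
const policies: SafetyPolicy[] = [
  { name: "no-medical-advice", disallowedPhrases: ["diagnose", "prescribe"] },
];
```

The point of the shape is that policies are data, so a small dedicated model (or, here, a trivial matcher) can be swapped in to judge text against them without touching the main agent.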

15:56

That's a good idea. You know how we always wanted to break down every model? Maybe we should do that in a workshop or something. Yeah, because the Chinese are coming, as we

16:07

tell you all the time. Yeah. Yeah, we will at some point also just kind of go through various models. It's on the

16:14

short list of topics, where we cover all the major Chinese model providers as well as their lists of models. All right. So, thank you, Nick. You support that. Continuing on here, we do have

16:36

Stagehand has released kind of an update. They did a rewrite: basically, they dropped Playwright and they're running directly on CDP. Obby, what's CDP, dude? Honestly, it's just the Chrome DevTools Protocol, a way

16:55

to control the browser. Um, yeah. And Playwright, dude, like, having been in web dev for so long, Playwright is actually, let's say, the latest evolution of this, right? Because we have been using different tools. Whoever used Cypress, let's pour one out for

17:15

Cypress real quick, you know, things like that. And then we also had Mocha and all this. So Playwright was the coolest one in this evolution of browser testing tools. And, you know, fittingly, they don't need it. Stagehand is an agentic version of

17:38

this. You could do this stuff already. If you were going to use a Chromium browser, why do you need Playwright? I get it. It's tight. I mean, it made sense at the beginning because it was a way to get there faster, but

17:50

then I think they've realized they actually have more flexibility if they just get rid of that dependency. It's, you know, I think that's a pretty common trend. You kind of like move fast, you're dependent on certain things, and then over time maybe you realize the dependency wasn't, you know, wasn't actually needed or isn't needed in the long term. Hey, Tony's here. Get over here. All right.
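For readers wondering what "running directly on CDP" means in practice: the Chrome DevTools Protocol is JSON-RPC-style messages sent over a WebSocket that the browser exposes. Here is a minimal sketch of just the command framing. The method names are real CDP methods, but actually driving a browser would require connecting this to the browser's DevTools WebSocket endpoint, which this sketch deliberately does not do.

```typescript
// Monotonically increasing message id, as CDP expects per session.
let nextId = 0;

// Build one CDP command frame: { id, method, params } serialized as JSON.
function cdpCommand(method: string, params: Record<string, unknown> = {}): string {
  nextId += 1;
  return JSON.stringify({ id: nextId, method, params });
}

// What a Stagehand-style tool would send over the DevTools WebSocket
// instead of going through a Playwright layer:
const nav = cdpCommand("Page.navigate", { url: "https://example.com" });
const shot = cdpCommand("Page.captureScreenshot", { format: "png" });
```

Dropping Playwright means speaking this wire format yourself; the trade-off discussed above is exactly that extra flexibility versus maintaining the dependency.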

18:09

So, we are taking a break here. Let's uh What's up, man? You guys know Tony? Hold on. We're gonna stop. You can't

18:21

see. There we go. There it is. Here. Hello. We got a whole

18:27

lot of the team together. Not everybody, but it's an exciting week here in San Francisco. You just rolled in, Tony. Yeah. Just got here. Tony rolled in from another country.

18:39

Yeah. I I rolled in just, you know, 20 minutes ago from another state, but legit different country. Um, I rolled in here out of bed, so that's what I did. All right. Well, yeah, the audience

18:54

says, "What's up, Tony?" What's up, Tony? What are you excited about most for this week? Maybe that's a good segue.

19:03

The conference. Yeah, the conference. Anything specific about the conference? Any speakers you're excited to see?

19:08

all the speakers. There's a very good lineup. Yeah. Yeah. We're going to be talking about every single one. So,

19:14

okay. We uh Yeah, we're going to be talking about the conference here in a little bit and we'll be previewing some of the speakers, but I agree. The lineup is sick. Super sick. And if you have if you're not already

19:25

registered, you can't buy tickets for in person, unless maybe you DM us. We might have one spot we can find for a dedicated audience member in SF if you didn't get tickets. But otherwise, you can attend virtually.

19:39

So, please go sign up at tsconf.ai. We'll talk more about it.

19:44

It's been a while since we've all seen each other. We work remote. Mastra is a remote company, but we've all been homies for a long time. Like, we all came together, and so these things are special, to meet up and stuff. Yeah,

19:56

we'll try to bring on some more Mastra folks at the end of the episode, because I know Atai is coming soon, which reminds me, I gotta send him some information so we can actually join this thing. All right. Did you get a beer yet? Not yet. It's time, dude. It's time. All right.

20:12

Good seeing everyone. I'm going to go and get something to drink. Yeah. I'm thirsty, dude. You get us something. I have a tab

20:18

open. So, put it on my tab. Uh give me some kind of cider. Yeah,

20:24

I'll take a second. That sounds good. All right, everybody. Yeah, we're just we're having a good time. So, you want

20:31

to uh talk about this next one and then I'm going to get our guest on. Okay. So, so as I was saying, and I keep saying every week, the Chinese are coming. Okay.

20:44

And there was actually a comment in the live stream that I'll answer right now. Do you have a favorite model outside of the US? Yes, it's Kimi, aka the thing that we're talking about right now.

20:56

So, Kimi Linear just came out. We do model highlights all the time. Um, the cool thing is we're going to have this Hugging Face adapter soon. So,

21:09

if you're a JavaScript dev and you want to try out these models and stuff, it becomes a lot easier. Um, 'cause we obviously really like them ourselves. So, just another Kimi one. So,

21:22

that's cool. And then let's talk about another one. I told you, dudes, the Chinese are coming. Okay. Qwen, another one of our favorites internally here at

21:34

Mastra: Max Thinking. I actually tried this one. It's kind of slow, dude.

21:39

To be honest, it's kind of slow, and I don't think max thinking in general matters to me, but that's just my opinion, you know. It's not like it pondered something that I would never think of myself. Like, it's kind of whack to even spend time on that, but you know. Yeah. But I will say, though, it's not even released yet, but if it's true on

22:03

the benchmarks, it's actually pretty good. 100% on two different benchmarks, AIME and HMMT, which would be a first. I guess we can pull up, we do have those links here, we can pull up the benchmarks so you can kind of see.

22:22

And I think with a lot of these it's hard to know: is it just benchmark-maxing, where it's basically trained on the benchmark, or did it actually solve the real problem? That's why these benchmarks usually come and go, because eventually you've set a benchmark, all the models train towards it, they do a really good job, and then they move on. They need a harder benchmark or a different benchmark. So you can see

22:46

there hasn't been any 100% yet on this AIME 2025. So that in its own right is a little bit interesting. And then, you know, on this HMMT 2025 benchmark there have been a few, but the only one that's been 100% is GPT-5 Pro. So as

23:10

you can see, basically the statement is it's going to be up there with GPT-5 Pro as far as reasoning is concerned. So it is interesting. It always makes you at least stop to think, because that's a pretty bold claim.

23:22

So we'll see when it actually comes out. But overall it looks promising for, you know, kind of an open model. And we're going to pause. We

23:34

have some more AI news, some more models to talk about, some more releases, but we have some guests as well. And these are some people we've also shared beers with at bars. And, you know, we're just going to be sharing

23:49

a conversation. So let's bring them on. We have Atai from CopilotKit.

23:56

Hey Shane. Hey. Yeah. And Uli. Uli's here. So, I'm gonna bring him on, too. Why not?

24:03

Oh, wow. Hey, guys. Hey, guys. Surprise guest.

24:09

How's it going? It's going well. We are actually in a bar right now.

24:15

So, we got a whole bunch of the team. Yeah. And we are just co-working. You know, we

24:20

didn't get a co-working space for today because everyone's just getting in at different times. So, we are just co-working from a bar in San Francisco. We always meet in bars, whether in person or virtually. Yeah, it's like, that's where we hung out

24:32

last time. Is that a good sign or a bad sign? I don't know. Seems pretty great to me. Yeah, it seems like a good sign to me.

24:39

But yeah, we are trying to make some moves, and while we're here, we're also having some fun. So, yeah. But we were excited to bring you on, obviously, to talk a little bit about CopilotKit, and of course say thanks for being part of TSAI Conf, which we're going to talk about in a while as well. But maybe, I'm assuming

24:59

most people know about CopilotKit if they're watching this, but just in case they haven't heard, in case they've been living under a rock, tell us a little bit about CopilotKit. I don't know, give a quick pitch. Essentially, CopilotKit is the agentic applications company, right? We make the connective tissue

25:16

between agentic backends, which is your guys' world, and agentic frontends, right? User-facing agentic applications. We're also the company behind the AG-UI protocol, the Agent-User Interaction protocol, which is a general-purpose connective tissue between, again, an agentic frontend and an agentic backend. From a technical standpoint, you know, for the last 30 years of the internet,

25:41

everything has been running on the request-response paradigm, right? Some client makes a request to a server, gets some data, renders that data, and the interaction is over. But agents, you know, have a long list of qualities that make them incompatible with that. So that's the technical source for all this. And yeah, we at

25:58

this point power, you know, millions and millions of agent-user interactions every single week, growing exponentially. We have over 180,000 installs of our packages every week, all the way from, you know, Fortune 100s to startup unicorns to just startups and everybody in between. And yeah, very excited to be working with you guys on the Mastra integrations.
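Atai's request/response point is the technical crux: an agent run arrives as a stream of events rather than a single response. Here is a hedged sketch of what an event union and a frontend reducer for it could look like. The event names are loosely modeled on the ideas discussed (streaming text, tool calls, shared-state updates), not copied from the actual AG-UI spec.

```typescript
// An agent run emits many small events over time instead of one reply.
type AgentEvent =
  | { type: "RUN_STARTED"; runId: string }
  | { type: "TEXT_MESSAGE_CONTENT"; delta: string }
  | { type: "TOOL_CALL"; tool: string; args: unknown }
  | { type: "STATE_DELTA"; patch: Record<string, unknown> }
  | { type: "RUN_FINISHED"; runId: string };

interface UiState {
  running: boolean;
  transcript: string;
  toolCalls: string[];
  shared: Record<string, unknown>;
}

// Fold one event into the frontend's view of the run.
function reduce(state: UiState, ev: AgentEvent): UiState {
  switch (ev.type) {
    case "RUN_STARTED":
      return { ...state, running: true };
    case "TEXT_MESSAGE_CONTENT":
      // Streaming text accumulates delta by delta.
      return { ...state, transcript: state.transcript + ev.delta };
    case "TOOL_CALL":
      return { ...state, toolCalls: [...state.toolCalls, ev.tool] };
    case "STATE_DELTA":
      // Shared state between frontend and backend, as in the recipe demo.
      return { ...state, shared: { ...state.shared, ...ev.patch } };
    case "RUN_FINISHED":
      return { ...state, running: false };
  }
}

const initial: UiState = { running: false, transcript: "", toolCalls: [], shared: {} };
```

A reducer like this is why the old request/response framing breaks down: the UI has to stay live and re-render on every event, not once at the end.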

26:19

Yeah. Yeah, definitely. We know it's been a problem a lot of our users have been trying to solve: how do we build rich user interfaces for these agents? And so it's cool to see all the progress you've been making, but also to have tools that users can kind of

26:37

pull off the shelf and customize to make it fit whatever they want to build, right? It kind of gives them the flexibility but also lets them move a lot faster. Yeah, exactly. I mean, there's a whole new set of building blocks that you need when you build these agentic applications, all the way from, obviously, the basics, just chat, to generative

26:55

UIs, where there's different flavors of those, but then all the way to, you know, shared state, where you can have state synchronization between agents and applications. There's different flavors of those too: there's read-write, and then there's read-only in either direction. You have frontend tool execution. So, anyways, this list of building blocks,

27:15

it's not infinitely long, but it's not that short either. And we essentially want to take care of that, so that application developers don't have to worry about it, and then also you guys can give all of your ecosystem these functionalities, hopefully relatively easily, at least more easily than building it

27:34

yourself, by kind of integrating with a protocol. So, like, if I'm a front-end developer and I want to get into generative UI, what does that mean to a person not in AI yet? Like, how does your life change as a front-end dev getting into generative AI through the UI? Yeah. Yeah. And you know what? I could also potentially use this as a

27:59

springboard to screen share a little bit. We love that. We love seeing demos.

28:04

All right, let's do that. Let's see how that works. All right, can you guys see this? We can. All right. Great. But maybe I'll

28:17

start here. But I can share some CLI stuff too, to show people how they can actually get this type of stuff working on their own computers in a few minutes, really. But this is the AG-UI dojo. It's just kind

28:33

of a showcase of a bunch of the building blocks that we have. And, you know, just to show: you have chat, obviously, so you can just chat with the agent, get something back with streaming, and so on. This also showcases support for front-end tool calls, with this silly example of,

28:51

you know, changing the background color, right? You can do a lot of really interesting things with the agent being able to execute front-end tools, but these are just a bunch of hello-world snippets to kind of get to the core. There's backend tool rendering, so that's one type of generative UI, right? When an agent calls some tool in the

29:08

background, you want to be able to show both the inputs, the arguments to those tools, and the outputs, what it comes up with. So backend tool rendering is one of these examples. And then of course you have other flavors too, including front-end tools. So these are

29:30

tools that live just on the front end. We have here a haiku generator. We see here it, you know, generates a nice haiku for us. And again, these are kind of simple little examples, but they showcase the ability to not just

29:44

have chat in the experience. And then this is another kind of neat example, which is shared state between the agentic front end and the agentic back end. So here we have a recipe builder, and let's say we want to make a spicy pasta dish. We'll see here that as the agent works on it, it's going to update this shared state that now exists between the front end and the back end, and let's hope the demo

30:09

gods are with us. Guess the gods are not with us today. Try that one more time. I'll switch to one of the Hold on. Let

30:17

me see. Uh, make a spicy pasta dish. This is definitely Yeah. Okay, there we go. So, we have here

30:30

update it. Anyways, you need the demo gods to be on your side. That's always important. Okay, we got

30:37

this update here coming. So this is some of the things that you can do with shared state. The list actually goes on a lot more. I realize we need to complete our dojo, but all these features are working in

30:50

integrations. We just don't have all of them in the dojo. But maybe to talk a little bit about generative UI. So,

30:58

you know, generative UI is about, again, things in the agentic chat that are not just text, right? They're actual components you can use. And there's a really big spectrum, even on this just one part of the agentic front end. There's a big spectrum of possibilities for generative UIs.
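The static end of that spectrum can be pictured in code: the frontend keeps a finite registry of components it knows how to render pixel-perfectly, and the agent can only reference those by name. Everything below (component names, string renderers standing in for real JSX) is made up for illustration.

```typescript
type Renderer = (props: Record<string, unknown>) => string;

// The five-or-ten things the app renders most often, per the discussion.
const staticRegistry: Record<string, Renderer> = {
  recipeCard: p => `RecipeCard(${String(p["title"])})`,
  haiku: p => `Haiku(${String(p["lines"])})`,
};

// The agent may only name a registered component; anything else falls back
// to plain JSON text, which is the high-coupling trade-off of this flavor:
// the agent cannot invent new UI here.
function renderGenerativeUi(component: string, props: Record<string, unknown>): string {
  const render = staticRegistry[component];
  return render ? render(props) : JSON.stringify(props);
}
```

The open-ended and declarative flavors described next relax exactly this constraint, trading pixel-perfect control for agent flexibility.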

31:16

There's what we call static generative UIs, and that's where you have a maybe long but finite list of things that the front end specifically supports. And the advantage there is, you know, the designers can get these pixel-perfect for the five or ten things shown most commonly in

31:32

your application. The disadvantage is that there's pretty high coupling between the front end and the back end, like you're used to, you know, in the old world, pre-AI. But there's other flavors of generative UI that make different trade-offs here. So there's, for example, fully open-ended generative UIs, which

31:50

allow the agent to bring any UI it wants at all, either through an iframe or through raw HTML. You have some examples of that in some generative UI formats that AG-UI supports, for example MCP-UI. And then there's this middle of the road, which I think is really interesting, which we call declarative generative UI. And that

32:12

allows the front end and the back end to agree on a spec, essentially, of semi-open UIs, and then the agent can specify these cards, and these cards themselves are not fixed. They can be very different from one another, but they use the same building blocks. So, let's say, in terms of input

32:32

forms, you can have address forms or date forms, maybe with autocomplete for the address and for the date, whatever you want, and then you have output fields: you can have graphs and charts and so on. So there's a big spectrum of possibilities there. And, yeah, we essentially try to be agnostic. We

32:52

explain to people the different benefits of each one, but we make all of them available for developers to use as they see fit. Yeah. I love that recipe demo, because it just shows the beauty of shared state. Like, I

33:10

mean, most things are schema-driven, right? As we've known, we've always gravitated towards schema-driven development. The only problem then is you have to make sure the agent actually does what you want, and that becomes the next problem, right? Yeah. And I can show, so, you

33:28

know, these are the building blocks that you can use, but then what applications do you actually build with it? And this is maybe the bonus at the end. This is just the CLI command for getting started with CopilotKit. But, you know, there's really two reasons why people build these agentic applications. One is just to, you know,

33:46

simplify complex SaaS. So this is an example of a SaaS copilot. Everything we're doing is developer-facing, so this is like a developer platform where you can manage developers, and you have testers and lab admins and so on. So I can ask, you know, show me the breakdown of work by platform,

34:07

and you can see here, okay, here it is: front end, back end, docs, and so on. How about the breakdown by contributor, right? So nobody can hide. We see here, you know,

34:24

John is pretty active. Cersei is not so much, right? So you can have this interaction. But we can also have it in-app,

34:36

right? So I can ask it, you know, show me all the PRs that you think are at high risk for merging by the end of today. And now it can use actual judgment and show me these PRs. All right. So it can use actual

34:57

judgment. Yeah, that is tight, dude. Yeah. No, it's nice. And I

35:02

mean, this is very soon going to become the norm in all the SaaS that we use, probably for the benefit of all of us. And then the other reason people use these agentic applications is to just accelerate core work, right? And obviously, as developers, we're used to that a lot in our day-to-day engineering workflow now, with Claude Code and Cursor and all these

35:20

applications, but the same is really coming to every vertical. So this is a really simple one. I think we built this together for the first, uh, what was it, the event that we did, the workshop, yeah, there we go. And this is kind of a mini Linear, right, a project management software, but now it has a copilot in it. We can ask it, you know, create a project to redesign my

35:46

website from scratch, make all the decisions yourself, don't consult me, and assign tickets to different team members based on their strengths, right? So, this is a bit of an artificial example to keep the demo short. And, you know, this is something that we built during the workshop in, what, like a couple

36:04

hours, maybe. So this is very easy to build, this stuff. So, we got all these to-dos here. Say, you know,

36:11

break the tickets up a lot more, this is too high-level. And we see here the example of, you know, the gen-UI view. You can see what's happening behind the scenes here. Again, this is kind of a workshoppy

36:26

example, so pretty straightforward. But now we see it's done here, right? That's the power of these tools. You

36:32

know, to fully automate your project manager, we're probably not quite there yet, but to give your project manager this superpower, like, you know, Claude Code for everything? We're here. So that's kind of neat, dude. I love this, man. So sick. I love it. Yeah. Yeah, I really think, too, there's

36:52

even another level that we're starting to see more and more people experiment with, which is: not only is the copilot helping you collaborate on something, but the copilot could also, if you take it a step further, be educating you on how to use the product. If it's a new product, it could be exposing different parts of the UI to you as you

37:12

get further along with onboarding of the product. There's multiple levels where AI injected into a SaaS application could really help not only the onboarding experience but also the collaboration experience move a lot faster. Yeah, it's really taming complexity, like

37:30

you now have almost like a person that ships with your app, knows everything about it, knows everything about the user, can help the user use it. So it's both in terms of, you know, managing the complexity and managing the actual work, like we can hand off some work to this person, and, uh, you

37:48

know, it's not like you're out of the loop. You're still very much in it; there's this whole big discussion we can have on autonomy versus not autonomy. Maybe an interesting angle here actually is when you flip the script, where now the agents start to learn from the user as opposed to the user directing the agent, and I think

38:08

that's something we're also very excited about, kind of the next thing slightly around the corner: taking all these interactions between agents and users and essentially building RL data sets from them. And then you can use those. And just for clarity, for anybody watching, it's not us stealing the data or taking the data;

38:27

it's building infrastructure to allow any customer to make this data useful and actionable. So that's something that I think is going to be big. You know, RL is the main thing that works to improve these agents at scale; that's what took the original GPT autocomplete to ChatGPT. The models are really good

38:52

at coding because there are so many RL data sets, and the leaders here are kind of already collecting all these interactions. I like to ask people, who has ever, like, trained on RL data sets? And nobody raises their hand. It's like, who's Professor Andy? Yeah, it's funny that you bring this

39:10

up because we literally had a conversation with our friend Andy from Osmosis, who's doing the same thing. I do believe that you're on to something and he's on to something, and we've all been talking about it a little bit: RL is going to be huge in the next year and beyond. But I do think it's

39:28

going to have a place in 2026 where a lot of people who maybe heard of fine-tuning, didn't really care about it, thought it was too hard, and now it's like, okay, well, RL is maybe a little bit more approachable. But you kind of have to collect the data. You have to be able to then curate that data and then know what you're doing to

39:44

actually do something with it once you have it. Yeah. Yeah. there's a lot of these bottlenecks to actually be able to use

39:50

these systems. Like, just the infrastructure is a big one, but that's making a lot of progress really quickly, both by the model companies, OpenAI and Anthropic have better and better fine-tuning APIs, and also, you know, Thinking Machines Lab and all those. But data is still a really expensive bottleneck; you have to pay, you know, Scale AI, like, I don't know, many, many

40:15

thousands of dollars for a little bit of RL. And it's also ultimately not that specific to your application, right? Because you can't hire people who can RL your sales copilot or your application, whatever that is. But if you flip the script, you can now start having your users be the trainers. Because every act of interaction with an

40:39

agent is a nudge: did it work well or did it not work well? And that flips the script there. But, you know, that's the next thing, a little bit around the corner. I think it might be here, dude, honestly,

40:54

like I think it's coming like not even around the corner like tomorrow. Tomorrow you know. Yeah. Well I mean yeah I don't know.

41:00

Is that Professor Andy at Osmosis? Well, I mean, I don't want to put words in his mouth, but, you know, the writing is on the wall. Yeah, people are working on it for sure. I mean, yeah, the writing is on the wall. Yeah, 100%. He mentioned that they threw an RL

41:14

event and they had a capacity for 200 people, but 800 people signed up. So there's a lot of interest. Seems like a hype thing too. Yeah, I mean, there's some hype, of course, but look, we've seen it working,

41:26

and that's the tool that the model companies reach for when they want to enter the application world. It's, okay, let's RL it for this application, and it works. It works really well, like really, really well. The question is how do you tie all these pieces together so that you can bring it to any team at any company, and get RL agents

41:43

customized per agent, per customer. I think that's coming sooner rather than later, but, you know, if you could self-serve that, then it's game over, right? That's the dream. Yeah, you can self-serve, yeah, because you're typically using smaller models that are more hyper-tuned toward what you need. That's often what we've

42:05

been seeing, or what people are talking about a lot more. There's a lot of token-efficiency gains you can get from it as well. And quality, quality is always number one; people are always asking how do we increase the quality, but there's other gains you can get from it too. That's true. Yeah. Cost is definitely part of the equation,

42:22

especially when people actually roll through to production. But typically cost is kind of one of those things where it's actually going to cost you a lot more in the beginning, probably. So it's actually more expensive in the beginning, but at a certain level of

42:34

scale, maybe it actually decreases cost. That's right. And there's also, by the way, what people call in-context RL. So

42:39

that's not even fine-tuning. It's essentially prompt optimization, but you encode in the prompt, you know, all the things that worked well and did not work well, all the user journeys through an application to get to some good end result. There's a lot of information there. You can use it on the model weights, but just making that

42:57

accessible to the LLM at generation time, in the right way, the right format, is also really, really powerful. So yeah, I mean, that's just like another tool for context engineering, right? It's like, how do you improve the prompt in a way that gets you the best results you can get out of the existing model, and then if you need to

43:16

go further, you can actually pull something out that changes the model weights. How much time do we have? We got like five more minutes or so. Five minutes. Do you want to do a quick showcase of the Mastra getting-started CLI with Copilot Kit?
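The "in-context RL" idea described here, fold what worked and what didn't into the prompt at generation time instead of touching model weights, can be sketched in plain TypeScript. Everything below is invented for illustration; it is not CopilotKit or Mastra API, just the shape of the technique:

```typescript
// Sketch of "in-context RL": encode past user feedback into a prompt suffix
// at generation time, rather than updating model weights. All names invented.
type Interaction = { userGoal: string; agentAction: string; workedWell: boolean };

// Build a system-prompt suffix that encodes what worked and what did not.
function feedbackToPrompt(history: Interaction[], maxExamples = 5): string {
  const good = history.filter((i) => i.workedWell).slice(-maxExamples);
  const bad = history.filter((i) => !i.workedWell).slice(-maxExamples);
  const lines: string[] = [];
  if (good.length > 0) {
    lines.push("Strategies that worked well for past users:");
    for (const i of good) lines.push(`- Goal: ${i.userGoal} -> Action: ${i.agentAction}`);
  }
  if (bad.length > 0) {
    lines.push("Strategies to avoid:");
    for (const i of bad) lines.push(`- Goal: ${i.userGoal} -> Action: ${i.agentAction}`);
  }
  return lines.join("\n");
}

const history: Interaction[] = [
  { userGoal: "split a ticket", agentAction: "created subtasks per feature", workedWell: true },
  { userGoal: "split a ticket", agentAction: "deleted the original ticket", workedWell: false },
];
const suffix = feedbackToPrompt(history);
```

In a real system the `history` would come from thumbs-up/down signals on agent turns, and `suffix` would be appended to the agent's instructions on the next generation.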

43:34

Sure. Yeah. Yeah, let's see it. All right, let's let's do it. Um I think I'll just share my entire screen, which

43:39

is risky, but I'll risk it. Don't send me anything funny. All right. So, this is really,

43:50

really cool. Shout out to Tyler, who built this CLI, and also shout out for the integration build, Obby. You are honestly a big pusher behind that. You and Ward, alongside

44:03

Marcus on our side. So good job guys. Thank you. Thank you.

44:08

All right, here. So, npx copilotkit@latest. We're going to use Mastra. Let's call it, uh,

44:15

hello. Hello again, Mastra. That's really all there is to it.

44:22

Now, I cheated and cloned it in advance so that I don't have to install, and I already put my OpenAI key in. But that's all it is. This is the exact same folder, and I just run npm run dev.

44:35

And if I now go to localhost:3000, what you get at the end of this is this hello-world kind of environment that has everything already loaded in it from many of these building blocks, right? So you have generative UI to get the weather in San Francisco. We have front-end tool calls, and these are just suggestions to kind of have a guided

44:57

experience. We have set the theme to green. So now we have a green theme. Human in the loop. Please go to the

45:03

moon. Do we want to launch or abort? Let's launch. And what if we aborted?

45:09

We'd have aborted. Um, there's some shared state here. So this is a bunch of proverbs. Copilot Kit may be new, but it's

45:16

the biggest last beard. Okay, that's nice. Let's write a proverb about AI.

45:22

And it comes up with: AI is like a compass. It guides but does not decide. That's deep.

45:29

I live my life by that one. Yeah. And we can update this shared state, right? So remove one random proverb.
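Stripped of the framework, the shared-state dance in this demo, the UI holds state, the agent sends small updates, and both sides read the result, is just a reducer over a state object. A framework-free sketch with invented names (in CopilotKit itself the synchronization is handled for you by useCoAgent):

```typescript
// Framework-free picture of agent <-> UI shared state: the UI holds state,
// the agent emits small deltas, and a reducer applies them so both sides
// stay in sync. All names are invented for illustration.
type ProverbState = { proverbs: string[] };

type Delta =
  | { kind: "add"; proverb: string }
  | { kind: "remove"; index: number };

function applyDelta(state: ProverbState, delta: Delta): ProverbState {
  switch (delta.kind) {
    case "add":
      return { proverbs: [...state.proverbs, delta.proverb] };
    case "remove":
      return { proverbs: state.proverbs.filter((_, i) => i !== delta.index) };
  }
}

// The demo's flow: start with one proverb, the agent adds one, then removes one.
let state: ProverbState = { proverbs: ["AI is like a compass."] };
state = applyDelta(state, { kind: "add", proverb: "Ship small, learn fast." });
state = applyDelta(state, { kind: "remove", index: 0 });
```

Because deltas are small and the reducer is pure, streaming them over the wire is cheap, which is the efficiency point made about the framework's state sync.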

45:35

Let's see what it chooses to remove. Okay, the AI-compass one remains here, and we can read this shared state. So

45:41

what are the proverbs? Yeah, it can see it. So that's nice; the agent can read it. This

45:47

is a showcase of some of these things. From a code standpoint, this is extremely straightforward. So if I look here, let me zoom in a bit: we have the Mastra weather agent, and we give it the

46:01

weather tool to go fetch the thing. There's memory using a LibSQL store, and that's really all there is to it. And then on the front-end side, there's examples. This is a front-end action.
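Conceptually, a front-end action is a named handler the client registers and the agent is allowed to call, with an allowlist gating which tool calls actually execute. A framework-free simulation of that dispatch (every name below is invented; in CopilotKit the registration is done with the useCopilotAction hook):

```typescript
// Framework-free simulation of front-end tool dispatch: the client registers
// named handlers, the agent emits tool calls, and an allowlist decides which
// calls actually run. All names are invented for illustration.
type ToolCall = { name: string; args: Record<string, unknown> };
type Handler = (args: Record<string, unknown>) => string;

const handlers = new Map<string, Handler>();
handlers.set("setTheme", (args) => `theme set to ${String(args.color)}`);
handlers.set("deleteProject", () => "project deleted");

// Only tools on the allowlist may execute, even if a handler exists.
const allowlist = new Set(["setTheme"]);

function dispatch(call: ToolCall): string {
  if (!allowlist.has(call.name)) return `blocked: ${call.name}`;
  const handler = handlers.get(call.name);
  if (!handler) return `unknown tool: ${call.name}`;
  return handler(call.args);
}

// The "set the theme to green" call from the demo goes through; a destructive
// call that was never allowlisted is blocked.
const ok = dispatch({ name: "setTheme", args: { color: "green" } });
const blocked = dispatch({ name: "deleteProject", args: {} });
```

The allowlist/denylist split mirrors the concern raised in the conversation: the agent proposes tool calls, but the client decides what it is permitted to run.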

46:15

So we just use this useCopilotAction: define the name of this tool, what arguments we're expecting, and then a handler. So this is now going to get passed to the agent. And by the way, if folks are concerned about what the agent can execute, can it execute every tool? If you want, you can allowlist certain tools and, you know, denylist

46:35

some other ones. We have the shared state, which is kind of neat. This is all it takes: you now have state and setState with useCoAgent. Just like in React you have useState, now you have

46:47

useCoAgent, and all the state synchronization, with streaming, with delta computation, so it's really efficient, which is really important, all of it is taken care of. This is to render backend tool calls. So this is the weather tool call,

47:04

and human in the loop. Finally, we have this renderAndWaitForResponse, where you have this respond callback and you just call it when the user did something. And that's really all it takes. So it's really easy to build the demos we showed before. It's really on the order

47:21

of a day of work to build these things pretty quickly. Dude, it's great. A lot of improvements, too. I love it. Yeah, a lot of improvements and a lot more coming, too. Yeah, we've seen

47:34

all the stuff, you know, at different points, but it's cool to kind of look back at, like, six months ago, what it looked like compared to what it does today. Six months ago, I don't think we had anything yet, and AG-UI, I think, was still Yeah, we launched AG-UI, and

47:52

I remember you and I talking about the name because we went back and forth about what you were deciding to name it. Yeah, you hated it, right? And I will say, I still kind of do, actually, but I think it landed. The other name that I thought was arguably better in my opinion might not have landed, because this one definitely did. So I'm sure, like, if

48:16

you're listening to this, if you're tuning in watching this live: have you heard of AG-UI? I'm guessing a lot of you have, and maybe you didn't know. The nice thing about it is it's just a protocol, right? So, yes, of course Copilot Kit supports it, but it's meant to be an open protocol that Yeah. That's right. Yeah. any front-end framework should be able to

48:33

support and use. And if we can agree on the right standards, it makes it easier for the back end and the front end to communicate. Yeah, that's right. And, you know, we make React and Angular clients, but there's now community clients in

48:46

Kotlin and Rust, and .NET coming soon from interesting sources, and there's good work being done in the space. In addition, there's a Slack client in the talks, and messaging-based clients. So you can put these things on, you know, WhatsApp or RCS, that's another type of protocol. Anyways, yeah,

49:14

the cool thing is Mastra is integrated with all of them just by being compatible with AG-UI, which is cool. Like, you can have Mastra agents running in your glasses, as one guy in our community demos. That's cool. He built an AG-UI smart-glasses client, which is kind of neat. Yeah,

49:36

that shit's tight. Yeah. Yeah. Um, well, all right, guys. So, what's the best way for people to follow what

49:42

you're doing? Yeah, that's you. Uh, copilot kit on Twitter has it all.

49:49

All right. So, go follow copilot kit. Uh, go to tinkers, right?

49:55

Tinkers. Yeah. Shout out. Go to tinkers. Dev tool tinkers. Yeah.

50:00

Or we'll obviously see you in a couple days at TSAI. Excited to hang out there. And yeah. Yeah. Well, thanks for coming on the show. You're always welcome back if you have cool new stuff to to showcase when you launch new things.

50:12

Awesome. Yeah, we have 1.50 coming really soon, which has a few things, maybe I'll mention briefly, and we should definitely do another one of these when that comes out. It's got all-new internals, but no breaking changes. So all the existing developer interfaces continue to work. There's new features, like a lot of stateful features, with

50:29

full support for self-hosting on anybody's cloud or on-prem. And then there's all-new developer interfaces, too. So a lot of what I showed here, these interfaces, there's shiny new variants. But, uh, yeah. What beer are you drinking, Shane?

50:50

I don't know. They just brought me I told them to bring me some kind of cider. You know, it's two o'clock, you know, trying to keep it take it take it easy right now. But this is my seventh beer. I'm just kidding. I believe it. So, it's your

51:00

seventh. We'd hear the real thoughts. If you think he has no filter now, wait till we get a couple more beers. Wait till Wait till the seventh. How long is the stream going? Is it uh

51:12

still going? No, we'll be closing soon. We're going to hang out with the team. So we got like 10, 15 more minutes to close

51:18

out and wrap some things up. But we appreciate you both coming on, and yeah, we'll talk to you again in a couple days. Thanks for having us. See you guys.

51:30

Yeah. See you guys. All right, now it's just us. All right,

51:36

back to just the two of us. All right, well, that was cool. That was fun. I always like

51:43

to see live demos. If you're tuning in now, this is AI Agents Hour. I'm Shane.

51:48

This is Obby. We do this every week. We're not normally in a bar, but today we are. Uh, normally we record this at

51:56

or do this live at about noon Pacific time. Today we started a little late because I was traveling, but we talked a little bit about Mastra Agent Studio. We did some AI news. We talked with Uli and Atai from Copilot Kit. We have a few

52:09

more news topics we want to cover, and then we're going to talk about the TSAI Conf. Yeah, I found myself in this bar. There are some interesting comments we can share. So,

52:21

Joseph says, "If I have built an agent using Vercel's AI SDK, can I bolt on Mastra to help make my existing agent more resilient to errors, persistent, and get better evals and tracking for the agent?" Yes. Yes. So, typically the way we see people

52:35

do that is: where you normally are making generate or stream calls with the AI SDK, you just swap in a Mastra agent for that. Use the Mastra class, get the agent, call our generate or stream, and it should just work. There's obviously a few things you can piece together there, but you get memory, observability, you can run

52:54

evals, you get all the stuff, and it's pretty easy. I mean, Mastra is kind of built on a lot of different pieces of the AI SDK as well, so we often have that where you run both together. So you can kind of do it

53:09

piece by piece, swapping out different parts of it. Like, maybe you don't want to replace your entire application; you just want to swap in a Mastra agent for this one thing, see if you like it, and then eventually you just end up swapping out all those generate calls.
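The swap described above can be pictured with a stand-in interface. The real agent type, with its generate and stream methods, comes from @mastra/core, so treat the specifics below as assumptions for illustration rather than the actual API surface:

```typescript
// Stand-in interface mirroring the shape of an agent's generate call; the
// real types live in @mastra/core. Treat the specifics as assumptions.
interface AgentLike {
  generate(prompt: string): Promise<{ text: string }>;
}

// The call site keeps the same shape it had with a direct SDK generate call;
// routing it through an agent is what layers on memory, tracing, and evals.
async function answer(agent: AgentLike, prompt: string): Promise<string> {
  const result = await agent.generate(prompt);
  return result.text;
}

// A fake agent is enough to show that the call site compiles and runs.
const fakeAgent: AgentLike = {
  async generate(prompt) {
    return { text: `echo: ${prompt}` };
  },
};
```

Because only the object behind the interface changes, you can migrate one call at a time and leave the rest of the application untouched.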

53:21

All right. And I don't know how to pronounce this name. "Hey, thanks for the awesome new update that lets me share the dev server with my team and convince them, or let them decide, why Mastra is killer in the AI agent framework race." I appreciate you. I like you, bro. Yeah, thank you.

53:40

All right, let's talk some more AI news. We didn't get to everything. So, we talked about Qwen. Let's talk about

53:45

Cognition. You pull that up, and then I'll pull up the next one. Last week, or what felt like last week, was the week of coding-agent releases. Dude, it was last week.

53:58

Literally, it feels like it was a month ago. I know it was last week because I've been using Composer 1. This is not even what we're talking about. I've been using Composer 1 and it's tight. And then you got Devin over

54:09

here, or sorry, Cognition over here, introducing their own agent as well. You can just build your own model. That's what they're kind of saying, right? So they basically built their own agent model, specific

54:21

for, I imagine, the types of tasks that they're seeing, right? Yeah. So they partnered with Cerebras. It's available in Windsurf, which still exists. What even is Windsurf, dude? I

54:36

remember when we used to be wind surfers. We used to be wind surfers, dude. Why? But like

54:41

Yeah, it's cool though. But like if it's available in Windsurf, I feel like no one's going to use it. Yeah. I wonder is it only available in Windsurf though? I bet it's available, you know, you I'm sure you can use it in

54:54

other places, but I doubt it's in Cursor or anything yet. But yeah, that's not the only model that was released. Along the same lines, we had Cursor introducing Cursor 2.0, their first coding model and the best way to code with

55:17

agents. It is. So it's called Composer, right? Yep. Composer 1. Dude, I've been using

55:23

that nonstop. I love it because it's not that smart, but it's so fast. If you know what you're doing, you just tell it and it does it so quickly. You don't have

55:36

to wait. But if you ask it to think for itself, it kind of sucks from that perspective. So it's still just an agent in the sidebar. Yeah, it's just another model. Like,

55:48

we've been doing a lot, like we're going to be releasing a 1.0 soon. So I'm deleting a bunch of stuff, which is something that I don't want to do by hand, but I also want to do it kind of agentically, because I don't want to destroy something I didn't mean to. And you shouldn't spend money on

56:04

tasks like that. I'm just using Composer because it's so fast, because, you know, I know what I'm doing, right? But then I tried to build something new It failed so hard, dude. It was not good. Okay. So, anecdotal evidence says

56:21

Composer is great if you just need it to do things for you and you want to guide it and have it make moves really quickly. If you need it to solve your problems for you, you might want to stick with 4.5. Yeah, stick with 4.5 or Claude Code. You know what? Whatever.

56:38

But that's cool. So I'm a Cursor fan. Stan. Fan, stan, all of

56:44

them. So, let's talk about we've been mentioning 2026 is the year of small models. We've also been mentioning it's the year of agent memory, indeed. So we have

57:03

the homies once again. Yes. So Mem0 they raised a bunch of money, and, you know, they've had 14 million downloads, a ton of GitHub stars.

57:13

They talk a little bit about, you know, you can watch the video. Of course, we're not going to watch it here, but there's a lot of excitement around agent memory and what that means. So, congrats to Mem0. You know, come on the show.

57:26

Where you at? Taranjeet's a legend, dude. That guy in our YC batch, everyone was like, "Yo, Taranjeet, you know Taranjeet?" I'm like, "Not yet." They're like, "That guy's a legend." And I met him. He

57:38

is a legend. He is a legend. All right. Yeah. Taranjeet, open invitation: come on the show.

57:43

We'll get him on the show. But congrats to Mem0. And last bit of news, I thought this one was pretty cool, mainly because All right. I thought this was pretty cool, mainly because I've used a

58:01

lot of voice-related models. Specifically, I've used a lot of ElevenLabs, and this demo was pretty sick. We're not going to watch the whole thing, but I want to play it just because it was a pretty cool demo. So, Cartesia announced Sonic 3, and

58:20

basically, you know, it's just a good it's a good model from from the demos. Again, haven't actually tested it, but it appears really pretty awesome. I don't know why we can't hear this. They're so great. Even Elon loved our

58:34

last model. I'm excited to show you our latest Sonic 3. Now, can you tell which one's the real Elon? I I I don't really have a business plan.

58:45

What's the business plan? The answer is neither. Both are Sonic 3.

58:52

Here's the thing. Even my voice was AI generated this entire time. That's amazing.

58:57

Sonic 3 is the best model for real-time conversation. Let's see it in action. Hi, this is Sarah calling back from Saffron Kitchen. Another robot call.

59:10

I get that a lot. So, were you looking to make a reservation? Yeah, tonight for 2 at 8.

59:15

8:00. Got it. Oh, wait. Actually, honey, tonight

59:23

Oh, 8:30 then. Actually, wait. Do you take walk-ins?

59:30

Uhhuh. But it gets pretty packed. Better to have the res. Got it. Yeah, it really works.

59:38

Sounds great. So to confirm, your phone number is That's crazy. So that's the demo. Again, that's

59:44

dope. That's the cherry-picked demo. Speaking Hindi and everything. Yeah, it's multi-language,

59:51

you know. So overall, though, Sonic 3 is very promising, especially if you're familiar with ElevenLabs and other real-time voice models, you know, OpenAI's real-time voice.

1:00:02

Yeah. They do talk a little bit about some of the details of it. So you can check out that thread and learn a little bit more about it if you're building real-time voice apps. It's always promising to see new breakthroughs. I don't know if it's considered a

1:00:18

breakthrough, but it does seem better than a lot of the other demos that I've seen. So seems very realistic. Seems like the latency is pretty good. So if

1:00:26

the demo holds up, it's pretty awesome. All right. So, last, let's talk a little bit about the TSAI Conf. So,

1:00:37

indeed, we're going to talk through and highlight the agenda. So, if you're going to be there, you know what you're in for. If you're going to show up, come say hi. We'll be there. But if you're not, you know, you can of

1:00:49

course register and watch it virtually. We'll talk about what you can expect to see. And, you know, we don't really know exactly what's going to happen, but we'll give you the the rundown. All right. So, tsconf.ai.

1:01:02

Go there, sign up if you have not already. You can still get a virtual ticket. So, first we get the opening keynote. Sam, me and Obby are all going to chat.

1:01:15

You'll see what that is about. Yeah, we're still planning this, very much like we handle the planning for this show. Sometimes we throw it together. But one thing you'll notice, that I actually really like, about

1:01:28

the way we're running this conference, and the way we even do guests on this show, is we bring on people for a very short amount of time and it's like, give us the best stuff. Yeah. You don't have 45 minutes. This is not a 45-minute keynote. You get like 15 to 20

1:01:40

minutes. Like give us the best stuff you got. Yeah. And pack it into something that's actually like action-packed something

1:01:47

people can actually take something away from, rather than these really long drawn-out talks where you just shill your product. Yeah. So it should feel like you're getting a lot of value in a very short amount of time. That's the whole goal. So you can see, literally, the

1:01:59

opening keynote is 15 minutes, for three people. So I'm going to talk for like six seconds.

1:02:05

Same, 30 seconds maybe. And then you got Paul. He's talking about lessons learned building an AI-first browser automation framework. Interesting. Yeah. Of course, you know, TypeScript. It's

1:02:17

15 minutes long. You got swyx. He's got his 15 minutes to talk about the impossible triangle of LLM infra. Then

1:02:24

we have a short agent-tooling panel with Sean Pool and Brian Holt. So that's cool. Then a little break, then a talk with David Cramer. He's got his 15 minutes to talk about

1:02:39

what if bugs fixed themselves. Stay for that one. Then we have Luis from Replit talking about Replit's quest for autonomy. Then we have Nico from Vercel talking about building AI applications with the AI SDK. Sick. We got lunch, and Sam's gonna sign some

1:02:57

books. We got, you know, maybe maybe a new book. We'll see. You know, like there's some Easter eggs out there

1:03:02

already. Show up and, yeah, you may have seen it. We got a new book. We'll be, you know,

1:03:07

you'll be able to grab a copy. Then we got a whole bunch of lightning demos. You saw Atai was just here; he's in that list. Dan from Boppy,

1:03:15

Sergio from Arcade, Dustin Goose is on the loose. Yeah, Goose is gonna be on the loose on Thursday Dustin from Postman, Simon from Assistant UI, Zach from Circuit and Chisel. And then we have a future-of-observability panel with Marc from Langfuse and Aparna from Arize, then auth for MCP and agents with Michael Grinich from WorkOS.

1:03:34

That's going to be dope. Writing TypeScript with Codex, Neil from OpenAI. And then we have the closing remarks, which is you get me, Sam, and Obby again, just wrapping things up. And then we'll probably say peace.

1:03:46

Peace. We will definitely say peace. And then of course happy hour networking. And if you haven't seen it, there is an

1:03:52

afterparty. There is an afterparty. So that's the conference. It's going to

1:03:58

be exciting. If the whole day wasn't enough to hang out with us, you can come to the afterparty as well. Yes, thrown by our homies at Neon and CodeRabbit and, you

1:04:09

know. Yeah. Yeah. The funny thing is, the afterparty has twice as many

1:04:16

registrants as the in-person conference. People just want to come hang out and have drinks. Yeah. I think there's like

1:04:21

It's at Southern Pacific. Yeah. Obviously the conference is sold out, but I think we have just over 300 in the conference. We let a

1:04:28

few extra people in, but I think there's like 500 registered for the afterparty. So it's going to be an event. Come to the afterparty. Should we get some of the team on? Yeah.

1:04:40

Yeah. All right. Well, Obby is maybe going to go recruit some Mastra team members to come say hello and talk about what they're working on. Uh,

1:04:55

yeah. Yeah. We're gonna bring on some people from the Mastra team because why not? You know, this is our live stream. We're

1:05:01

gonna have fun. Daniel's gonna come on. Why not? Come on over. Come on. Step into the

1:05:08

office. I'm coming right up. Coming right up. All right. We got Daniel. Daniel, tell

1:05:15

people what you work on, or some of the things you've been working on in Mastra. Oh, I mean, recently a bunch of processor-related things. Did a workshop last week on that. Yeah, tell us about the guardrails workshop. Um, it was

1:05:30

pretty fun. I would definitely go check it out on the YouTube page or the X channel. Yeah, definitely. Oh, you looking for a way to drop that

1:05:43

right now? Oh, nice. Go sign up. Go reg and subscribe while you're there. We're trying to get our

1:05:48

YouTube count up. If you're watching on YouTube, make sure you subscribe. He's dressed like he plays golf. Get over here, Grayson. Yeah, we're gonna All right. You kind of look like a high

1:06:00

school science teacher. Let's get the science teacher in here. Send it up to Grayson, y'all. All right, that's that's Grayson. Grayson, tell people who haven't met you

1:06:12

what you're doing at Mastra. Um, I'm helping keep customers unblocked, informing the framework about what customers need in the real world, and making sure it all works. And, you

1:06:25

know, you've built some agents before. What's something you were doing before this? Uh, yeah, building agents in the finance and manufacturing space. So they were internal agents that helped the business run more effectively. You

1:06:38

know, automated invoice ingestion, where it gets the invoice, matches it to a vendor, and helps accounting reconcile. Helping source products to manufacture. So, you know, I want to make sunglasses made in India, and it would pull back manufacturers that could do that. All kinds of things like that.

1:07:01

Cool. Well, yeah, we're excited to have you. Cool. Thanks, guys. All right, science teacher. He dude, you're looking good. You're

1:07:08

looking good. Nick. Nick, come up here. This is Nick. He's the guy. We

1:07:14

call him the king of RAG. There's another one. What else? I'll think of it. We had some

1:07:21

names. The emissary of evals. The emissary of evals. The king of RAG. Um, yeah, Nick is

1:07:28

obvious. But also tell us a little bit about what what you've been working on recently. Lately, it's been a lot of uh working on storage, working on workflows, making those better. Um getting everything

1:07:41

ready for our 1.0 release. So hopefully everyone has a good time with the 1.0. All right. Next up, we got to bring Eric in here.

1:07:54

If you've been playing around with our playground and looking at our observability, you've definitely seen some of the improvements. So, Eric, tell us a little bit about yourself and what you've been working on at Mastra. I started at Mastra about three months ago. I've been building up an entire new

1:08:10

tracing system. Before this, I worked at DataStax, worked on the graph system there. Yeah. And so, yeah, if you've used the new AI tracing, you've seen and probably

1:08:22

very much appreciated, from what I've heard from people, Eric's work. So, Eric's the guy. Yeah. So, thanks, Eric.

1:08:29

Marvin, you can't not get in here. You're a fan favorite. You've been here before. This is Marvin.

1:08:37

Marvin, what's some of the stuff you've been working on recently? Oh, wow. A lot of things. I'm mostly working on the

1:08:43

studio. Sorry, it's not Playground, it's Studio now. Working on the local studio and the cloud studio, and recently we

1:08:49

have released a new cloud studio service. If you have tried this one, give us feedback, that would be nice. Yeah, we talked about that earlier, so if you've tried the Mastra Agent Studio in cloud, please let us know. And we need to get the next person on. You know, we got at least one more. Tony was already on. All right, we got the man himself. Oh.

1:09:16

All right. We got Sam. So, the three co-founders in one frame. Boom. All right.

1:09:22

Yeah. So, tell us a little bit. So, we teased the new book. So, tell

1:09:28

us just a tiny sliver of what they can get if they come to TSAI Conf and get your new book on Thursday. So, at TSAI Conf we're going to be sharing the new book with everybody there. So this, the first book, it's in frame, is Principles of Building AI Agents. You probably read it. It

1:09:48

kind of goes through the basic primitives: agents, workflows, RAG, evals, tracing. But the second book is called Patterns for Building AI Agents, and that goes through the deeper stuff, like how do you do context engineering, how do you architect an agent. So, Shane, how many whiteboarding sessions do you think the three of us have done with people? 50,

1:10:16

60? Dozens upon dozens. It might not be a hundred yet, but it's got to be close. It's got to be close. I mean, you know, a lot of the sort of, how do I do

1:10:28

multi-agent? Should I have a, you know, supervisor? Like, how you think about all that. How you think about

1:10:33

grouping agent functionality together? And then all the different pieces of context engineering, like memory, and tuning that so that your application's performance improves. But we close it

1:10:46

out with evals. And there's just a lot of patterns that have emerged over the summer. Yes, there's the actual how-you-write-an-eval part, but a lot of it is thinking through what are the metrics for success

1:11:05

of your agent. Is a false positive more important? What about false negatives? And then, you know, what are the failure modes that are driving that metric up or down? And how do you

1:11:17

think about that? You know, Principles is really a guide to getting started. But Patterns is a guide to starting to improve. Yeah, it's a Chamber of Secrets. Chamber of Secrets. It's a sequel. It's how you're

1:11:30

starting to get your application to higher accuracy and towards production. Oh, funny thing about evals from the book though: Yujohn took an eval course and he learned nothing, which is good. That means we already knew what we were doing. Yeah. All right. And Yujohn isn't here, but

1:11:50

he'll be here soon. So, yeah. And for all of you other Mastra maestros that are not here, we

1:11:57

miss you. Yeah. And for everyone tuning in, this is actually a tough thing, because a lot of our team members could come, but a lot of them couldn't because of visa concerns or important family events.

1:12:11

And so we're happy to get together, but the hardest part of a remote team gathering is always the people who couldn't make it. Yeah, absolutely. That's a good way to end it. We

1:12:24

appreciate you all for tuning in. Make sure you're following me and Obby and the rest of us. We did get a new X handle, @mastra. So, yeah, @mastra now. So, follows to Shane, like he

1:12:37

thanked it. So, follow us there. And on YouTube, we do like the subscribe numbers to keep going up. So, please go follow, like, and subscribe. Like and subscribe. Like and subscribe.

1:12:49

Subscribe. We'll see you at the Chamber of Secrets. And we hope to see you all at TSAI Conf on Thursday. If you can't attend, register anyway. You

1:13:02

can watch it remotely. We'll be here on the live stream. You'll see us and a lot of other really cool speakers. And with

1:13:08

that, let's sign off. Peace. Peace.