Mastra Templates Hackathon Awards, GPT-5, OpenAI OSS models, and AI news
Today we review some Mastra Templates Hackathon submissions and give out some awards! We will discuss the OpenAI launches last week (GPT-5 and OSS models) and provide some hot takes. Finally, we chat about all the other AI news.
Guests in this episode

Shreeda Segan
Episode Transcript
Welcome to AI Agents Hour. I'm Shane, and as you will notice, with me today is not Abby. Unfortunately, Abby is sick, but I do have Shreeda here with me because we have a jam-packed day today. For those of you that have never watched this
before, welcome. Thank you for watching AI Agents Hour. Today, we're going to be talking about the Mastra.build
templates hackathon. We're going to review a whole bunch of submissions. Unfortunately, we can't review them all.
We had a lot of good ones, and there are a lot of good ones that we won't even be able to show today, but we have some updates and special news, and maybe we can get to them in the future. We'll be going through some of the hackathon awards after that. So, we'll be doing the awards ceremony, handing out some prizes. Then, if you
haven't been paying attention, you might have noticed last week there were just maybe a few AI news updates, just a couple. So, we will be spending a lot of time as well talking AI news. I'll talk through the GPT-5 launch, the open source models, or open weight models, from OpenAI, and a whole bunch of other AI news. How you doing, Shreeda?
Oh, you are muted, but now you're not. Thank you. I'm doing well. I'm
pretty excited. I already see some of the hackathon participants chiming in to say hi. Hi Jelp and hi Vesh. Honestly, I think everyone's just waiting for the drop.
Yeah. And for those of you tuning in, it's going to take a while before we get to the awards because we wanted to highlight some of the great submissions we got. So, we'll be playing some of the submission videos, we'll
look through some of the readmes, and just talk through some of the hackathon submissions in general. Again, if you are here and yours is not highlighted, that doesn't mean we didn't appreciate it. It means that we got a lot of good ones, and so we just had to be
very picky. I guess we had a lot of good choices to choose from.
Agreed. Maybe it's worth saying a little bit about what this hackathon is, for anyone who's joining in. So we just
ran, I guess, the first part of the Mastra.build templates hackathon. If you're not familiar with what a Mastra template is, just go to mastra.ai/templates.
But they're kind of like, yeah, thanks Jane, they're kind of like a starting place for someone who wants to build a project and very easily get started. It's pre-written boilerplate code that you can then very easily customize. And we wanted to get the community's help in coming up with
way more creative templates than our team has currently had the time to think of. We're going to be adding more, and we're always going to be accepting community submissions. But this hackathon had
a handful of good categories. And the good news is that even if you're just joining, it's actually not too late to join. We're going to be doing an encore week for some remaining categories. So, I'm sure
we'll get to all of that, but I just wanted to give a little bit of context for anyone just joining.
Yeah, great context. And really, this has been a really exciting hackathon for me, looking at some of these submissions. Of course, I'm one of the co-founders of Mastra, but I just know that when I'm building agents in
Mastra, I feel like I have to start from scratch every time, and having some of these to learn from helps. I honestly have learned different patterns from some of the submissions that I wouldn't have thought of myself. And I
think that's a really cool way to share with the community, and it allows other people to share their creations and their ideas and the patterns that are working for them in building agents. And if you are just getting started, it provides you a good starting place, either just for education or potentially to build a project off of. And so that
that's what's really exciting. You might have noticed on the Mastra templates page right now we just have what we call our official templates. These are the ones that we manage and maintain, but we are going to be releasing our community templates, and hopefully by this time next week we can showcase them on the live stream. So we'll slowly
start to take some of the hackathon submissions and put them on the website, rolling them out over time, and as we get more we'll continue adding. So, if you want some inspiration, if you're starting a project and you maybe don't want to start from ground zero, you might be able to go to
the templates page and see some of ours and some of the community ones as well. So, this is live. So, Shabir says Mastra plus Cursor plus GPT-5 equals smiley face with hearts. Nice. And we'll be talking about GPT-5. I haven't used it yet for
coding, so if you're watching when we get to that section, I'll be interested in your feedback if you have actually used it for coding tasks. I've heard good things, but I have not
necessarily done it yet. Erin, welcome. Thanks for joining. Well, Shreeda, should we...
Yeah, let's get into it. Shane, let's get into it.
All right, enough with the preamble. Let's dive in.
Okay, so we have, as you can see here, a spreadsheet, or a slideshow. I'm going to go through it because why not? Mastra.build templates awards. So, thank you
to all those that participated. Here are some of the awards that we are giving out: $512 Amazon gift cards for category winners, a spotlight in our templates library, which I mentioned, and a bunch of other prizes you can potentially win just for submitting. We're going to be giving those out soon as well. As we mentioned previously,
mastra.ai/templates is where you can see examples of what's already there, and you'll have the link where you will see the community templates once we get those dropped later this week. Shreeda teased it a little bit: we are also having an encore week. What we
realized is we had a lot of categories, I think 13 different categories, and some of them were very popular. Productivity especially: we had, I don't know, almost 30 submissions in the productivity category alone. And some of the categories didn't get very many submissions. So, for some of those
categories that didn't get a lot of submissions, we're giving you one more week. And that means even if you already submitted one, you could submit another template for one of these other categories. So, that's going on until this Friday.
So, you have one more week of Mastra template building. The schedule going forward: right now, this is the 11th, and we're live streaming the award ceremony. The 15th at 8:00 a.m. Pacific is the deadline for our
encore week submissions, and we'll talk about the categories that are up for that. Then on August 18th we will be live streaming kind of a part two of this. So next week at this time we'll continue. So here are all the categories.
If you remember from last week, as we showed or talked about before, there are 13 categories, but let's talk a little bit about the awards. Here are the seven categories we're awarding today. We're going to talk agent network, we're going to do tool provider, productivity, coding agent, eval. I get to pick my
favorite. That's, you know, lucky me. And then Abby, who's not here, unfortunately, as I mentioned before, he's feeling a little under the weather today. We do have the funniest, so we'll still award that one in his
honor, and then he'll talk about it next week, I'm sure.
Shane, don't go to the next slide, because that's going to reveal the winners. So, let's look at some of the submissions first.
Yes, good call. All right, so let's pick
some of these submissions. And again, we're not going to be able to watch all of them because we have limited time here, but we will highlight a number of the submissions. And if you're highlighted, you at least have the potential to get one of these awards. We will highlight all the winners at least. So, if you're paying
attention, you'll know that not everyone we're highlighting is going to necessarily win an award, but we're highlighting a lot of the good ones. We only picked a few, but we thought these were pretty cool. So we'll share the video, spend a few minutes talking about it, and go from there. So let me
figure out how to share this video, and hopefully I can get it to come across in a good way. All right. So if I want to share, I think I can share it like this, and hopefully it will share the audio. All right. And Shreeda, stop me if this
doesn't work. Hi, I'm Wayne Akre and my monster template is is a workflow that allows a series of agents to review pull request that you submit to GitHub. You can see an example here. Basically, the idea is you build a series of custom agents that
can review different things like code style, security, etc. And this monster workflow will fire them all off in parallel when you submit a new uh pull request. And then coalate all the each each of those agents will do an independent review and then the all of that will be coalated together, summarized and a review will be posted to the pull request. So let me show you how it works. So I have a pull request
over here. I'm going to go ahead and submit it. I'm primarily a Drupal developer, so this is an example Drupal
module. It has some serious flaws. In particular, it's got a SQL injection vulnerability in it. And so the security
bot... so the template actually just comes with two simple agents by default: a code style agent and a security agent. They're both very generic. They're not intended to be production ready, they're just examples of what you can do. But hopefully the security agent will find this egregious
error that I put in here. So if we come back to the pull request, we can see that it has reviewed and it has found the SQL injection vulnerability. So I will push a new commit to this branch with the fixes for the security issue. Then when I push, it will fire off
another round of review, and the agents have access to all of the comments and the previous reviews. So they have some awareness of what the issues were before. Occasionally I have to yell at it in the comments to say, no, that's not really an issue. If you do stuff like that, they do have access to it. They
don't always listen to you, but you can yell at them through the comments here. And okay, so here we can see, with my second commit, I have resolved the critical SQL injection vulnerability, and otherwise it says that this code follows code standards, etc. And because there were no problems, this bot also approved the pull request. And yeah, so
that's my Mastra template to review pull requests with a multi-agent workflow.
All right. And so if we look at the readme here,
you can see, here is the Mastra PR reviewer template. So, it's a GitHub PR review bot template. It talks about how you can clone it, install it, and run it. And yeah, overall, what did you think, Shreeda?
You know, I felt kind of dumb that when we were sharing some ideas for projects
we'd like to see, we didn't think of a Greptile-like template. But, you know, sometimes it's like, why pay for the B2B SaaS tools if you can just build your own? I still love Greptile, but I don't know, I thought this
was really cool. I like that you can interact just from GitHub itself, put the comments there, and it automatically comes there in its new branch. So, I was pretty impressed.
I was pretty impressed by the fix. I think this has a lot of potential. I know it's just a start, and I
know he mentioned that he doesn't feel like it's production ready per se, but I think it's a good start.
Yeah. And if we look, you can see there's a collection of what seems to be four agents here. So, you
have the base PR agent, which looks like a simple base class, actually. Then they have a code style agent, which just extends that with the prompt. You can see it's using GPT-4.1. I wonder how
much better it'd be if we used GPT-5 on this, you know? That's an easy upgrade, right, to test that out. But I was really impressed as well. I think
this is... sometimes you want really complex templates because you can learn from them, but often if they're too complex, it's hard to actually start a project from them; they're more educational. I think this one's a pretty good balance: it provides a useful integration, right? It's GitHub, so you have this integration,
but it also provides a pretty simple example of how you could structure a PR review bot like this. And by now, I'm sure a lot of us have interacted with them. I know we have Greptile on the Mastra repo. So, being able to build your own is pretty cool.
And providing a starter for that is great. If someone wanted to build their own, they could start from this, learn from it, and of course they'd need to change it, but at least it gives them a good starting spot. So, this is pretty cool. All right, let's continue on. So, that's the PR reviewer.
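As an aside, the fan-out-and-collate pattern Wayne describes (independent reviewers in parallel, one combined summary) can be sketched in plain TypeScript. This is a minimal illustration, not his actual template code; the `ReviewAgent` shape and all the names here are assumptions:

```typescript
// Hypothetical shape of one reviewer: takes a diff, returns its findings.
type ReviewAgent = {
  name: string;
  review: (diff: string) => Promise<string[]>;
};

// Fire every agent in parallel over the same diff, then collate the
// independent findings into one summary comment for the PR.
async function collateReviews(
  agents: ReviewAgent[],
  diff: string
): Promise<string> {
  const results = await Promise.all(
    agents.map(async (a) => ({ name: a.name, findings: await a.review(diff) }))
  );
  const sections = results
    .filter((r) => r.findings.length > 0)
    .map((r) => `## ${r.name}\n` + r.findings.map((f) => `- ${f}`).join("\n"));
  return sections.length > 0
    ? sections.join("\n\n")
    : "No issues found; approving.";
}
```

In the real template each `review` call would be a Mastra agent invocation, and the collated result would be posted back through the GitHub API.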
Now, let's see, this one is called the food recommendation agent network. So, let's open this one up and I'll share it.
Hi there. It's common that a lot of us
spend a lot of time picking food to order online from apps like DoorDash. And for this very same reason, I created an AI-powered meal recommendation system using Mastra. Let's check it out. So this is the Mastra template that I have created. It has five agents.
The first one is the DoorDash meal suggestion agent. The second one is the preference collection agent. Then we have the allocation analysis agent, then the
recommendation synthesis agent, and the last one is for optimizing your budget. And I've used all of these
five agents to create a network. So basically the DoorDash meal suggestion network is what you use. It also uses a tool
called the search restaurants tool to access the OpenTripMap API to get the restaurant data based on the location. And this is the workflow that it uses. It has two steps. The first one is search restaurants and the second
one is generate recommendations. I have a sample query already sent in here for you. It says suggest me a healthy, budget-friendly Chinese food in Seattle, and I also asked it to remember that I'm a vegetarian. As you can see, it takes all my preferences: being vegetarian, being budget-friendly, and being healthy
as well. You can configure this area to show the thought process of the agent, and it tailors the response for you. It has picked vegetarian restaurants for me. It has given the rating, the delivery time,
the average price, and the address, and at the end it gives a summary of why these are perfect for you, and then some tips on money saving and some health tips as well. So yeah, I think this could be a great start for a meal suggestion agent. We can swap in the actual DoorDash API to get better results, and yeah, that's it.
Thank you.
All right. I like that. It's very
relatable. Who among us is not DoorDashing something at least once a week, I guess. Especially if you're working in tech, I feel like you're more
likely to do that. And almost every day I talk to ChatGPT about what I am trying to eat. I do lifting, so right now I'm trying to cut weight. So, I think something like this is really helpful, because ChatGPT doesn't always have that up-to-date information, and it's a financial thing as
well. It's like, how much do you want to spend on a meal? So, I think this addresses a very practical, day-to-day personal productivity use case.
Yeah, I agree with that. I also think
that, because it seems like everyone always starts with either weather or cooking or recipes, food is such a good starting place for any template. So, as you mentioned, it's very relatable for someone to start with. I do think it would have been even more impressive if they did have some kind of
API integration into DoorDash. That would have been really, really amazing. So, I think there's more to maybe even build on this, but it is a good starting spot, and I think it is pretty relatable and a good example of an agent network.
Yeah. I'm almost like, can they resubmit this with a better integration for the MCP category or something next week?
Yeah, I don't know. I'd be kind of surprised if DoorDash had an MCP.
That's really fair.
You could build the DoorDash MCP server, submit it with this, and then consume it in your agent. There is another MCP category, so if you don't win, that's a good option, or even if you do win, you could submit an MCP server for DoorDash and maybe double win. All right, so there's number two.
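For flavor, the filtering the preference and budget agents perform can be written as ordinary deterministic code. In the template an LLM does this reasoning over live restaurant data, so the function and field names below are purely illustrative assumptions:

```typescript
// Illustrative restaurant record; the field names are made up for this sketch.
type Restaurant = {
  name: string;
  rating: number;    // 0 to 5
  avgPrice: number;  // average meal price in dollars
  tags: string[];    // e.g. ["vegetarian", "chinese", "healthy"]
};

// Keep only restaurants under the budget cap that match every required
// tag, then rank the survivors by rating, best first.
function recommend(
  all: Restaurant[],
  requiredTags: string[],
  maxPrice: number
): Restaurant[] {
  return all
    .filter((r) => r.avgPrice <= maxPrice)
    .filter((r) => requiredTags.every((t) => r.tags.includes(t)))
    .sort((a, b) => b.rating - a.rating);
}
```

The interesting part of the submission is that the agents negotiate these constraints conversationally; this sketch just shows what the final filter amounts to.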
Number three is the Stripe subscription business intelligence template. So give me one second to get this one pulled up. I think this is from our friends in Michigan who participated in the last hackathon as well, I believe.
Yeah. Kevin and Brian, right?
Yes. Yeah. All right. So, let me grab the video
here. Share full screen. Hello there. I'm your business
intelligence expert. Just ask away and I'll get you the answers you need. What is our total monthly recurring revenue? Here is the latest data. As you can see, churn has decreased. Our growth is trending upwards. And here are our
at-risk customers. All these questions are answered with the Mastra template. Just grab the template and I'm ready to help. Simple as that. The Mastra template.
But I love it. I mean, you've got to hand it to them for the production value of the video. They put in a little extra effort on the video, so you have to
appreciate that. Let's take a look at the actual template itself. It's forked because they work together on this, I believe, but you can kind of see what's in it. There's a Stripe agent, there's an MCP client and server, and there's a bunch
of tools and a few workflows for getting MRR, active subscribers, and churn rate. So some workflows for doing that calculation. Overall, pretty well done. I think this is great educationally; it's really well structured. So for anyone getting into
building B2B SaaS types of tools, this would be a good starting point. All right, let's continue on. So that was number three. Number four, we have an internet of
things, an IoT template. So this one is a Loom video. Let me get that pulled up.
So, MQTT connection. I should say, on the last video we got some comments: very cool robot, very cool demo video.
Yeah, they definitely optimized the video for sure. A little extra touch.
All right, let's watch the MQTT connection one.
Okay. Hello, my name is Bruce Kennedy, and I'm demonstrating a template for setting up an MQTT broker, having the connections made by the agent, then having a subscriber and a publisher to that broker, and then some reporting, as in stored memory and a voice response. So, I'm using HiveMQ as my
broker. Again, some of the things I will do here today will be more of what the IoT device will be doing, but I just want to simulate it. So, I want to first set up a connection. So, I'm going to go over here and copy this just to get that
going. And so that'll establish a connection to that broker. And now I'll set up three topics. As you can see, it honors the
pound sign (#), which is a kind of wildcard in the MQTT protocol. And then I will go and publish our first event. So I'm going to do this right here, and I'm going to do that on the server. Again, this is just a simulation, but
technically that will be the IoT device in the future. Okay. So I'll send this to the event. So we're
published to that event. So as you can see the event came through and we stored it. And now if I just go ask a quick question I should see the event in our memory.
Okay. So we'll see that come back. It's just what the status was for the living room. Um, now I'll go and publish one
more event just to show the temperature of that room. Okay, so we did that again. It puts it out on the queue. Our agent or tool in
this case picks it up. And again, I'm just going to demonstrate that that new status is out there. So we should see a temperature and a status. So now we got the new temperature out there and in the other
one. And now I'll just quickly do a speak, and basically ask the tool to publish the temperature of that room back on the queue to the user. And I don't know why it called the data store again, but okay. But it calls a response, and it's going to come back
with a kind of cute response. I can work on the prompt there a little bit, but it does do an encoding of that response. So I'm just going to grab that right here to show you all. And as you can see here, it did publish
that back out on the queue. Again, the IoT device will pick that up in the future. So let's see if you guys can hear this through the recording. I'm not sure, but I'm gonna play it.
So that was 70 degrees. So more to come. You'll see some of the instructions and read me. Thank you.
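The pound-sign behavior in the demo comes from MQTT's topic-filter rules: `+` matches exactly one topic level, and `#` matches all remaining levels (and must come last in the filter). Here is a small matcher written from those rules, not taken from Bruce's template:

```typescript
// Match a concrete topic (e.g. "home/livingroom/temperature") against a
// filter that may use "+" (exactly one level) or "#" (the whole remainder).
function topicMatches(filter: string, topic: string): boolean {
  const f = filter.split("/");
  const t = topic.split("/");
  for (let i = 0; i < f.length; i++) {
    if (f[i] === "#") return true;   // "#" swallows everything from here on
    if (i >= t.length) return false; // filter is longer than the topic
    if (f[i] !== "+" && f[i] !== t[i]) return false;
  }
  return f.length === t.length;      // no unmatched topic levels left over
}
```

So `topicMatches("home/#", "home/livingroom/temp")` is true, while `topicMatches("home/+", "home/a/b")` is false, since `+` only covers one level.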
All right. And let's take a look at the GitHub for this one really quick.
I honestly didn't even think of internet of things, but I feel like I haven't seen any cool innovation in that space in a long time, and I'm like, this is exciting. Like, we can
finally do it.
Yeah. I mean, you can basically wire an agent up to your smart home devices, which is crazy.
Crazy. Yeah, I do remember... So, we're going
to diverge and tell a short story here. I do remember Tyler on our team, he wired up, this was probably about a year and a half ago, a camera that if there was motion, it would take a picture and then send it to ChatGPT or an agent, or maybe he had a local model
that could do image recognition, and it would say if it's a dog or a cat, because he has a dog and a cat. If it was the cat, drop food; if it was the dog, don't drop food out of the automated food dispenser. So
that was a pretty cool project, similar to the types of things you could do with agents and internet of things type devices.
He's going to go for a month and just let the agent be the pet sitter.
You almost could. Yeah. Is the dog at the door? Unlock the door.
Open the doggy door. You know, is the cat there? No, don't let the cat in. So again, this is
what struck me about this one: it was pretty detailed. You can see it's a pretty good size system prompt, not huge, but a good 50 lines, so they definitely put some thought into that. You have all the tools connecting to MQTT, which I'm not actually that familiar with;
I haven't done a ton of internet of things. I did a few things with Raspberry Pis years ago, but it's been a long time. And you can see the tools, and there's a scheduled monitoring workflow, but overall a very good quality submission. Next up. So, four down, a lot more to go, but we're gonna
keep going here. We got to keep the pace. All right.
All right. So, this one is kind of like a deep research type agent. So, we'll watch this. I don't know if this one has sound, so we'll watch it a little bit.
Because it doesn't have sound, I'm going to speed it up a little, just because we are trying to get through things pretty quickly.
I think this is another one of those cases, Shane, where, as you said, when people start building agents and workflows for the first time, a study tool is one of those really relatable, accessible first
projects. So, I'm glad this template exists. I think this has good educational value.
Yeah, everyone always wants to build a deep research agent. That's kind of a common starting spot. And so I think this is a good, relatively simple example, right? It's
not overly complex, but it could serve as a good starting spot. We have an example of a deep research agent as well, but there are kind of different approaches for how you might want to do that. And let's pull up the GitHub. We'll just go through this one. I think it's
positioned as more of a study agent. So, it can do deep research and help you as a study tool. Shane, we're seeing the in at Bay Harbor.
Okay, cool. All right, so yeah, if we look at this, it's a Mastra study agent. It can generate questions from plain text, perform real-time search with Brave, and provide summaries with live web context.
So again, it's one of those that's relatively straightforward, not overly complex to implement yourself, but you do have a starting point: implementations of how you could do tools and how you could wire these things up inside of a Mastra agent. Next up. This one is
Khalil's. I feel like he was going after Shane's favorite award on this one, because it's definitely in contention, at least a potential winner for that. Not saying whether it's won or not yet, but I think I had mentioned, maybe even on a previous live stream, that I tried to build
basically an agent that will interact with YouTube videos and YouTube transcripts, because we have this two-hour live stream every week, and at this point, if you count our workshops and live streams, we're close to 100 hours of video content. Being able to have that as data and
just chat with the transcripts was a really cool idea that I wanted. So let's watch this video.
Hello guys. So for the Mastra templates hackathon I built an agent that lets you chat with videos from a specific YouTube
channel, for example the Mastra YouTube channel. And for that I've built a workflow to process the videos. It first checks if the video has already been processed or not. Then it downloads the audio, sends the audio to Deepgram
for transcription, gets the transcription, and extracts some data from it like speakers, topics, stuff like that. We then chunk it and embed it, and we can use it in our agent. You can see here that I have some keywords; these help get a better transcription. So if I ask the agent to tell me about
the Mastra template hackathon, it's going to get data from our vector store, but it also has access to an MCP from Smithery that gets data from the YouTube Data API, and that's how we got this video, which is the kickoff stream. We also got the templates workshop, which also gives us some information about the templates.
Thank you.
So meta. Yeah. Bringing up a past video of the stream in the stream.
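The chunk-and-embed step mentioned in the demo is the standard RAG move: split the transcript into overlapping windows so a sentence cut at a boundary still shows up whole in the next chunk. Here is a minimal character-based chunker; the sizes are arbitrary defaults for illustration, not the ones in Khalil's submission:

```typescript
// Split text into fixed-size chunks with overlap, so retrieval keeps
// context that would otherwise be cut at a chunk boundary.
function chunkTranscript(text: string, size = 800, overlap = 100): string[] {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // final chunk reached
  }
  return chunks;
}
```

Each chunk would then be embedded and written to a vector store, which is what the agent queries at chat time.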
All right. And so here's the actual repo: chat with YouTube videos. This
template builds a RAG system that lets you chat with the Mastra YouTube channel. So you could obviously wire this up to chat with anything. If we want to look really quickly, we can see the transcript workflow that was shown. You
can see there's a YouTube agent. We can take a look at it. Here's the MCP from Smithery.
Shout out to Smithery, a sponsor. Yeah, thanks Smithery for making this YouTube agent possible.
And it just passes the channel ID right here. So that's interesting; that's easy
for you to change if you want to test it out on your own. It gives it a video search tool and these MCP tools, and it has some memory. So overall, not overly complex, but very useful, because it actually shows a good use of RAG, a good use of MCP, and a good use of agents. And for me personally, I
know this is something that I would like to have. Honestly, grand vision, I would like to have chat with our transcripts of YouTube videos on the Mastra website someday. I will build it, or someone on the team, if they beat me to it, which they normally do, will build it. But that would be pretty cool.
While you cue up the next one, I'll also say that I like that it checks to see if the video has already been transcribed. Just a very thoughtful touch, being mindful of compute.
All right, so the next one up, this one is called the Kestra agents.
So let me share the screen here. All right.
Hi everyone. In this demo we will see how the Kestra agent works. To give a
brief about what Kestra is: Kestra is an open-source alternative that helps us in orchestrating different tasks. So in this demo we'll see how this agent works by taking a very simple example: create a Kestra flow which prints a log.
So in this agent network there are two agents present. One is the design agent and another is the execution agent. The design agent is responsible for understanding the user's requirements and generating a Kestra file here. Now it is asking me whether it can automatically test the flow for me. So
I'll say yes. So now, what is actually happening behind the scenes is the context is switched to the execution agent, and the execution agent created a new flow in the Kestra platform and provided links for the user to follow. The user can go to the Kestra platform and make any changes directly if they want. The agent can also
edit a flow which is already created. So let's say, add one more log, let's say "hi Mastra". Now once again the context is switched to the design agent. The design agent
understands that the user wants to edit the YAML file. It generates a new file. Then it once again asks me whether it can automatically test it for me, and I'll say yes. Now once again the
execution agent has taken the responsibility of editing the flow inside Kestra, and it also runs it and tests it. So if I go here, you can see one more log has been added here. Thanks for watching.
All right. And then we will look at the actual template itself. You can see there's a link to the demo video, and it has a good overview. If
you take a look at the code, you can see there's a whole bunch of tools here that were wired up. There's a flow generation workflow. Let's just look at the agents. You can see there are three different agents.
Overall, quite a bit of work went into this. I'm not familiar with Kestra, so this is kind of new to me; I've heard of it, never used it. But I do like that this is a useful
agent, because it connects with a system that I'm assuming the creator uses. They probably use Kestra quite a bit, and now they have an agent that can interact with that system for them. So, it's a good use of an agent network, a multi-agent system
with an integration with another platform. And you can see the pretty detailed system prompts. So there's a lot you could probably learn just from reading through these pretty comprehensive system prompts. Okay, they just import the agents there.
Yeah. Any comments?
I'm like, yeah, maybe we should do a workshop on system prompts, or maybe you've already done one. Just to sharpen my skills.
Yeah. I mean, even look at this. So, if you're not familiar with agent networks, we're still calling them experimental, but we are seeing a lot of good use cases for them in this hackathon. You can see here, this is a 120- or 150-line
system prompt. There's a lot you can learn from that, and I do think if you were to further productionize this thing, that prompt probably balloons to three times that size. But it is a really useful thing to look at and study, seeing which system prompts work for other people. Uh, so there is a question in the chat: what do networks do? So
if we look at this implementation, this is an agent network. An agent network allows you to configure essentially a network of Mastra agents that work together to accomplish a task. So here the system prompt is for the one routing agent that delegates work to the other agents. So if we look, we can see that this agent network has three
different agents: the Kestra flow design agent, the Kestra flow execution agent, and the web summarization agent. And it also has access to this workflow. So this agent can actually pass
information to these other agents. Essentially you can use one agent network to route tasks or messages between the different agents, so you can build more complex multi-agent systems. We're about halfway done, I think. So we're getting close.
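To make the delegation idea concrete, here is a rough TypeScript sketch. This is not the actual Mastra AgentNetwork API (the agent names and routing rule are hypothetical, loosely modeled on the design/execution split in this template); in the real thing an LLM makes the routing choice from each agent's description.

```typescript
// Hypothetical sketch of the routing idea behind an agent network.
// NOT the Mastra API: a minimal illustration of one "router" delegating
// a task to specialized sub-agents based on what the task asks for.

type Agent = {
  name: string;
  description: string;
  run: (task: string) => string;
};

// Two stand-in sub-agents, modeled on the demo's design/execution split.
const designAgent: Agent = {
  name: "design",
  description: "Generates a flow definition from user requirements",
  run: (task) => `flow-definition-for: ${task}`,
};

const executionAgent: Agent = {
  name: "execution",
  description: "Deploys and tests a generated flow",
  run: (task) => `executed: ${task}`,
};

// The router picks a sub-agent. A real network would let an LLM make this
// choice using the agents' descriptions and the conversation so far;
// here a keyword check stands in for that decision.
function route(task: string, agents: Agent[]): Agent {
  return task.includes("test") || task.includes("run")
    ? agents.find((a) => a.name === "execution")!
    : agents.find((a) => a.name === "design")!;
}

const agents = [designAgent, executionAgent];
const first = route("create a flow which prints a log", agents);
console.log(first.run("create a flow which prints a log"));
// prints "flow-definition-for: create a flow which prints a log"
```

The key property is that the sub-agents stay simple and single-purpose; all the "which agent handles this?" logic lives in one place.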
This next one, pulling up another Loom video here, is a personal portfolio builder. So, let's share this video.
Oops, that's not the video. There's a lot of clicking. I didn't realize how much clicking this was going to be.
I know. Next time we're going to have to have it all lined up.
Yeah, I'm going to cue these up
ahead of time. Put one big video together and we just watch the whole thing. All right, so this is a personal portfolio builder.
Hello everyone. Today I'm going to show you the personal portfolio website builder which I built
for the Mastra hackathon. So this is an app which takes in the user's hobbies as inputs and generates dynamic websites. The user can give any input like surfing, yoga, gym, gaming, etc., and it will generate a
personal statistics website tracker. So I'm going to enter "surfing" here. What it does is it calls the agents and the tools, generates the dynamic website, gives us the link, and saves all the files locally. So yeah, you can paste
this link in the browser and then you can see the website. The website has generated a few images, and it also has a diary section and a stats section. If you look at the stats, they're generalized for surfers, and we also have a yearly sessions
calendar and a monthly goal tracker, and we also have a dynamic tab which was created, called "breaks". Here you can see the break stats and top breaks, and it also shows a map of where all the breaks are, and it has a recommendations tab which the user can use to see recommendations. I have also generated a gym dashboard
similar to this. So you can see it's all dynamic, the workouts are different, and there's also a yoga dashboard. So yeah, this is the app I built, and it can be helpful for people to track their own statistics. So yeah, thank you.
So the thing I like about this one: it seems like anytime you have a starter, you always have a portfolio. Previously I was at Gatsby, as many of you know, and Gatsby was always big on personal blogs and personal portfolio sites. So I feel like if you have some kind of template or starter
system, you have to have a portfolio builder. It's just natural. So it brings back memories of those days of building Gatsby starters and using Gatsby starters for my own personal blog, which I did many years ago. So I do like
that. Of course you could argue about the design of the website, you might want it to be better or whatever, but it does just generate you a website off a prompt, which is kind of a cool use case. And it's pretty simple, but like I said, sometimes those are the best.
Yeah, I want to see it connect to my wearables, like a Fitbit or, you know, an Oura Ring, and have it automatically update your activity stats. That would be really cool.
So you can see it's a collection of tools and a collection of agents. Let's just look and see.
So it pulls in all the different agents. I wonder what the dashboard agent is. So it's using Llama. Pretty simple, not overly detailed on some of these system prompts, but a good collection of multiple agents that work together to build the site. It's not using agent networks by the look of it,
but kind of a cool use case. All right, we have, I think, three more. No, I lost track. Four more.
Yeah. Where are we at?
Um, we're at the old maps one.
There we are. Yes. So, I thought this one was pretty funny, but also potentially useful and something to learn from. So, old maps. Have you ever asked for directions on
the side of the road? Well, this is about the same experience. We can say, "Hi there." And then it's going to call
another agent as a tool to get directions and then give them back to us. We are getting our instructions. And this is step by step. So now let's look at how this is actually done. So the agent that we use as a tool is a
directions agent. And the way this works is that it's not calling Google Maps. It's actually just reading Wikipedia pages and finding references to roads and places until it can find a way to go from point A to point B. This is really
cool. It's also wildly inaccurate. We can see it looked at a bunch of pages to be able to generate that set of directions. Now, the really cool thing that we can do with this is to score it.
So, we have two scorers. The first one checks whether we are actually only using results from tools in our directions. And right now we are hallucinating about 71% of our directions, which is not great. And then
the second scorer actually calls Google Maps for the same itinerary and then compares performance. And right now we are about 85% as performant as Google Maps. That's about it. Thank you guys.
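The two-scorer idea from this demo can be sketched in plain TypeScript. This is illustrative only, not the Mastra scorer API; the field names and numbers are hypothetical, and the real template presumably calls the Google Maps API to get the reference distance.

```typescript
// Hedged sketch of the demo's scoring idea: grade an agent's directions
// against tool grounding and against a trusted reference route.

type RouteResult = {
  distanceKm: number;     // total distance of the proposed route
  stepsFromTools: number; // steps traceable to a tool result
  totalSteps: number;     // all steps in the generated directions
};

// Hallucination-style score: what fraction of steps is actually grounded
// in tool results (1 = fully grounded, 0 = fully hallucinated).
function groundednessScore(r: RouteResult): number {
  return r.totalSteps === 0 ? 0 : r.stepsFromTools / r.totalSteps;
}

// Performance score: how close the route is to the reference distance.
// A route as short as (or shorter than) the reference scores 1.
function performanceScore(agentRoute: RouteResult, referenceKm: number): number {
  if (agentRoute.distanceKm <= 0) return 0;
  return Math.min(1, referenceKm / agentRoute.distanceKm);
}

// Hypothetical numbers, chosen to mirror the demo's rough results.
const wikipediaRoute: RouteResult = { distanceKm: 118, stepsFromTools: 5, totalSteps: 17 };
const googleMapsKm = 100; // stand-in for the Google Maps reference

console.log(groundednessScore(wikipediaRoute).toFixed(2)); // prints "0.29"
console.log(performanceScore(wikipediaRoute, googleMapsKm).toFixed(2)); // prints "0.85"
```

The design point worth copying is using an external system as the source of truth: the scorer doesn't need an LLM judge when a ground-truth API can measure quality directly.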
I like this because Mastra scorers are relatively new. So for anyone who's interested in them, they're our next iteration of evals. I think this is a good educational template for how to structure scorers and use them. And I kind of like that for this demo the results are kind of
deliberately bad, so you can actually see how your scorers perform.
Yeah, I thought it was really cool to actually integrate with Google Maps to check the score. It's an interesting use of scorers. So you can see here they create a scorer, a maps scorer, and I
think it fetches data from the Google Maps API to measure distance and then compares how good the directions from Wikipedia were against actual Google Maps, using that as the source of truth, which is a really cool way to go about it. You know, with
scorers being so new, we haven't even put together that many good examples. So having a good example, even if there are only a couple of scorers here, is a great way to learn about evals: how you should think about creating your own scorers, and how to leverage
some of the existing ones, like hallucination, which is a pretty common scorer as well. So overall, I was excited to see someone focus their demo on evals. Next up: the Mastra template evaluator.
This one was also pretty funny. Very meta.
Yeah, another very meta submission.
You've been there, right? I mean, it's 2 a.m. You've got 50 tabs open and you're just drowning...
I think maybe I should change the speed.
Yeah. No, make it slower for this one.
Yeah, he wanted to keep it under a minute, I guess. So, we're going to give him a little more time.
It's 2 a.m. You've got 50 tabs open and you're just drowning in a sea of demo videos and super grandiose readmes.
You're trying to figure out which game-changing AI assistant actually works and which one is just, you know, a slick PowerPoint held together by caffeine and wishful thinking. And that's the core problem, isn't it? The old way of judging is all based on subjective demos and really slick pitches. It can
be kind of unreliable. So, how does this thing work? It's a simple four-step workflow, all automated. First, it just
grabs the code. It clones the project repo. Zero setup needed from you. Then, our AI reads all the documentation, the
readme, the demo script, you name it, and it extracts every single claim the team makes about their project. Next, and this is the important part, it actually runs the project and tests every one of those claims. And finally, it spits out a super clear, data-backed scorecard. Well, let's look at a real-world example. We ran this on a project called the Deep Research Assistant. As you can see, the evaluator confirmed its
main features, the end-to-end research and the learning extraction. They pass with flying colors. But it also caught something that a quick demo might have totally missed: a key feature, multi-source sync, completely failed the test. And look, this isn't about calling people out. It's about getting
to the truth of what was actually built during the hackathon. And this is the final output. You get this beautiful, clean scorecard. It gives you scores on
things like creativity, appeal, the quality of the description. But the real magic, you got to look at the bottom. It automatically analyzes the project's dependencies and suggests which sponsor prizes it might be eligible for. Yeah, imagine that. Are you ready to make
hackathon judging fair, fast, and maybe, just maybe, actually kind of fun again?
Okay, we need to run this project through itself and see which dependencies it uses, because... yeah.
Yeah. Should we see how your project scores against itself? That's a great use case.
So, this one, while being hilarious, because it's so incredibly meta to evaluate the templates for the hackathon... I will admit I didn't run this, but now I wish I had more time, because I probably would have used it to at least run it on different submissions and compare what we picked against it, right? That would be interesting.
Yeah. Yeah.
But I will say this one might make it into our... so at Mastra, on Mastra Cloud, we have kind of a marketing team, and this agent might make it into that marketing team on Mastra Cloud. We might just run this anytime we get a template submission going forward. Maybe we'll tweak it a little bit, but then if someone does submit a form and they have
a really good readme, we can at least run it as an automated check.
Yeah. We could even potentially have it send them those results, so then they could update it or fix it so it passes before we feature it.
Free advertising, guys.
Yeah, exactly. So, I haven't tried it yet. I can't verify that it works the way it says, but
it is pretty cool. I will definitely be trying this one. You can see here's the template reviewer agent; it has access to some tools, and there is a workflow in here. It mentioned a
template reviewer workflow, so let's just see. Probably at the bottom there's a bunch of steps. So yeah, you can see how the workflow is put together. The first step is clone project. Then in parallel it'll
set up the project repo and run a claims extractor. And after that it will essentially combine these two things: it'll run the scoring and then output the results. Very cool use case. I'm excited to try this,
and definitely, if it works, we will just wire it into our template review process, maybe with some small tweaks. So, appreciate that submission for sure. That one could be very useful, and honestly very useful for others. Maybe it could just be a standard for hackathons,
not just our hackathons, but potentially other hackathons. If you're hosting your own hackathon, maybe you can start with that template and build your own AI judge. All right, so now we have a storyboard generator template.
Last two to go. We're getting there and then we get to the actual awards. All right. So let's
So this is a template I made for the Mastra hackathon: the AI storyboard generator template. It can generate a storyboard for a given script with consistent character images. So let's just put this in loop mode and
I'll send a pre-built prompt I pasted here. Just click send. Now, this agent network is a streaming agent network. It consists of five sub-agents: script generation, storyboard
generation, image generation, and other sub-agents. So you can see, for the first prompt, I gave it a character with a yellow raincoat and a talking crow, to check whether it generates a consistent character or not. So it generated a story. It's using the
script generator agent for the first time here. And yeah, it's making the next decision: the routing agent is deciding to run the next step using the storyboard agent, so let's wait and see. And I'm using the Gemini 2.5 Flash model. So yeah, it generated the story with scenes and characters. Now it's thinking
about the next step; it's deciding what needs to happen next. I'll tell you in advance what it does: it generates images, then creates a PDF with the images and the story content, and using the Zapier integration it sends a Slack message saying that the PDF is saved to Google Drive. So let's wait and see.
Yeah. So now it's generating the images. You can see the image path and everything.
Yeah. Generating the images takes some time. And it doesn't have any workflows; the network dynamically selects the sub-agent based on the system instructions.
So it's generated a PDF. It's showing the path where it's generated and the details of the PDF, and it's uploaded it to an S3 bucket, and you can see we got a Slack notification here. When I click into Slack, there's a message here. I just click this link
and you can see this is the storyboard that was generated from the prompt. You can see the consistent characters with the yellow raincoat and the crow here. By default it can generate only five images. Yeah. So that's my template.
Thank you.
I like this. Yeah, I don't even know. Dude, when it got to the Slack notification, I was like, there's more?
Like, that's insane. So rich. Very rich submission.
Yeah. There's a lot of work that went into this, because it was not just one integration. It integrated with Slack, it integrated with Google Drive, and it did image generation as well as text generation. It was a good use of multi-agent systems. So overall, a
pretty great submission. Also, I previously built a storybook generator that wasn't nearly this good, so now I'm like, okay, this is better than what I did earlier. I built the first example that Mastra had, called Fable Frames, which used workflows. But I like the idea
of using a multi-agent network for this rather than a workflow, because it took a ton of time to build that workflow: first you have to do this step, then this step, then this step. If you do it through a multi-agent system, it's in theory at least easier to build and much more flexible. Maybe
slightly less accurate, I don't know. I'd have to actually test it a little bit more. But I did like that you get a PDF at the end. So it
actually even generated a PDF. So really cool. And let's take a quick look.
So you can see in the mastra folder here, you can see
there's an agent network. Now let's see. They even make pretty good use of memory here.
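As a rough illustration of what memory buys you in a setup like this (this is not the Mastra Memory API; the class and its behavior are simplified assumptions): a per-thread window of recent messages that the agents can recall between turns, so for example the storyboard step still sees the script from earlier in the conversation.

```typescript
// Illustrative sketch of thread-scoped agent memory. NOT the Mastra API:
// just the core idea of keeping the last N messages per conversation
// thread so later steps have bounded, relevant context.

type Message = { role: "user" | "assistant"; content: string };

class ThreadMemory {
  private threads = new Map<string, Message[]>();
  constructor(private lastMessages = 10) {}

  add(threadId: string, msg: Message): void {
    const history = this.threads.get(threadId) ?? [];
    history.push(msg);
    // Trim to the most recent N messages to bound the context size.
    this.threads.set(threadId, history.slice(-this.lastMessages));
  }

  recall(threadId: string): Message[] {
    return this.threads.get(threadId) ?? [];
  }
}

const memory = new ThreadMemory(3);
memory.add("story-1", { role: "user", content: "A girl in a yellow raincoat" });
memory.add("story-1", { role: "assistant", content: "Scene 1: ..." });
memory.add("story-1", { role: "user", content: "Add a talking crow" });
memory.add("story-1", { role: "assistant", content: "Scene 2: ..." });

console.log(memory.recall("story-1").length); // prints 3 (oldest message trimmed)
```

Real implementations add persistence and semantic recall on top, but the thread-scoped window is the core mechanic.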
Create an agent network with instructions, using 2.5 Flash. Wow, what a system. Holy...
Multiple agent networks, actually. Looks like they have a storyboard network as well as a story network. Oh, interesting. So there must be an
older one, an older example in there. So, there are multiple agent networks. And then if we look at the agents, I do want to see the image generator. What are they using for image generation?
Create storyboard scenes. There are the instructions. And it uses 2.5 Flash. So, the nice
thing about this is, if you didn't like 2.5 Flash's image generation, you could switch that to use OpenAI but keep the rest of it using Gemini Flash, and you can mix and match these different models to really fine-tune the results to what you want, or at least test out different models with different parts.
I feel like you're gonna do that this
week, and you'll have something to say.
I mean, there are just too many good ones to go through. I'm concerned; I have other things to do, but I just want to review these and try them out. All right, so we have one more and then we will get to the awards. Thanks everyone for
joining and following along. This has been a lot of fun. We're getting close to the end of reviewing some of the submissions. Again, if yours didn't get reviewed, one small caveat: we had a couple of people submit where
we didn't have access to watch their video. We can't really judge those without the video; we want the video plus the code. So we will follow up with you, and of course, as we said before, we're continuing on. We have encore
week, so you're still in the running for prizes, but we do need the videos. And then we'll probably show some more submissions next week, specifically some of those that I'm pretty sure are good quality, but we couldn't access the video for one reason or another. So, we will follow up with you. We'll send you an email, and once we get access, we'll hopefully highlight those next week, and then you'll still be in the
running for some of those prizes that we still have to give out. Okay, last video.
Hi everyone. So this is my submission for the template hackathon from
Mastra, and the purpose of this is to create just enough documentation for coding agents. If you go through my project's readme, you will get to know what problem I'm trying to solve. So when I hit this, it starts the process: it goes to that particular repo, tries to fetch the repo, and processes
it to fetch only the public APIs with their just-enough required documentation. And it's not just specific to JS; it works for other libraries as well. For example, this is a Python one, Django, and this is how the final output looks. Similarly, if you go to this one, this is for React; it prints out everything, like how to use it. And
similarly, here's one we are generating now; because it can take time and the video needs to be short, I'll be showing you this one. So ExcelJS is a JavaScript library for working with Excel. It gives you all the key APIs which are required to work with it in your
project. Thank you very much.
And let's look at the code here.
So this is a documentation generator, or context generator, for different libraries. The idea is that it could be used with Claude Code or Cursor to generate context from an entire GitHub repo, which is pretty cool. It has a multi-phase approach: it mines the documentation, gets the type definitions, and then does some source code analysis.
And you can see that they did a few real-world examples, as shown in the video. What are your thoughts?
Um, I think this is good. I think
it's pretty comprehensive, because it looks at the readme, the docs, and also the type definitions. So independent of what is just in the readme, it might pick up on things that even the person who published the library might not have surfaced as good context.
Yeah. And you can see there are quite a few tools that were used. You can of course look at, you know, what does fetch repo content do? If you want to see what it does, it's really easy to see how
something like this can be put together with a collection of agents and a collection of pretty good tools. And there's even a workflow, which I think is pretty straightforward. Let's see. Yeah, it's just: fetch source, extract APIs, generate final docs. So, pretty simple,
straightforward workflow, but a good example. And with that, we did it. Now, it's time for the awards.
We're just going to speed through this part.
Yeah, we're going to keep going. Thanks everyone for hanging in with
us. That was longer than expected, but we got through it. And we still have more to do. So, we're not done. Let's get on to the awards. All right. So,
first up is the best use of agent network. This was judged by the Mastra team. We chose the AI storyboard generator.
So, that was the one that generated the PDF and had the Slack integration. Here's the link to that GitHub repo, as we showed. Congratulations! You're getting a $512 Amazon gift card for the
best use of agent network. Drum roll, please.
We're going to just keep rapid-firing these. The next one, I
think, is tool use. The best use of a tool provider was the ability to chat with YouTube videos. So again, I think this is a pretty cool use case, because I personally have felt this pain. We're doing this video live, so we're
going to have a really long transcript for this video, and maybe someday we want to ask, "do you remember that video where someone talked about a YouTube template?", and it would be able to actually pull out that data. So, this is a great use of tools, and our friends at Arcade, if we go back, judged
all the submissions for tool providers and selected this. They said they really liked the use case and demo: a very helpful agent and a great use of tools and MCP. So, there you have it.
Thanks for submitting this template, and also thanks, Arcade, for helping us judge.
Next up, best productivity: the IoT integration template. This one was cool because it's kind of your home productivity, or definitely personal productivity. You can chat with an agent to have it set the temperature on your thermostat
or do other things around your house. So it's a really cool example of connecting agents to real-world physical items. And that's the reason we specifically wanted to highlight this one. And if we look, this was one that I
think, you know, you picked this category.
I mean, I'm just like, wow, that's crazy, in our homes, you know. I'm annoyed that whenever I drive and I have my Android phone connected, why do I not just
talk to a real agent? Why is it still this generic, I don't know, Siri from 2008? So, I'm just like, yes, we need to make progress here.
Yeah, I'm still upset. So I have two smart lights in my living room, and I'll say this quietly so she doesn't hear me: if I ask Alexa to turn on the lights, it will sometimes do it, and sometimes it'll ask me "left light or right light?", "which light?" You have more than one light, and it's dumb. So let's make this better. This can be better, and this is maybe a step
in that direction: actually useful agents. All right. Next up, the next category, drum roll, is best coding agent. This was also judged by the...
No, this one was judged by Scrimba.
Oh, yeah. Yes, we should give them a shout out. I think it makes sense because they're all about coding
education, so why not have them judge the best coding agent?
Exactly. Any comments?
This is a really clear use case of getting a concise description of a software library, independent of what the author has published. They liked that it can improve the understanding that an LLM has of a library as well.
And they liked how comprehensive it is. It's not just reading the readme; it's looking everywhere to get that context.
Agreed. I thought it was a really good use case and really useful if you wanted to build a coding
agent. Next up is the best use of evals. This one was judged by Confident AI, and it goes to olde maps, or old maps; I don't know
exactly how you pronounce that. But I really liked how they actually featured scorers as part of the demo. It was really leaning in toward showcasing how you could write evals, and that it isn't actually that hard. The evals aren't overly comprehensive, but they provide a good starting spot and a good learning tool. So, I thought this
was a really good use of evals. Okay. Now, for my favorite two awards: Shane's favorite and the funniest.
All right. Well, my favorite. So, this was one that was kind of
painstaking. Shreeda was on a call with me as I was going through, and I was visibly frustrated with having to choose between so many good options. But I did make a choice. The show must go on. I picked one, and my favorite
was the GitHub PR review bot. I thought, why not? Why not build your own Greptile? You can build it yourself, and you can do it with Mastra. So I thought that was really cool:
connecting to GitHub, handling, and actually commenting on PRs. I think the developer who made this said he works primarily in Drupal, and I think that got you, Shane.
Yeah, it did. It did. That was the tiebreaker. I used to do a lot of Drupal work, and so I
appreciate people that come from the world of Drupal, because that was prior to going to Gatsby. Even at Gatsby, that's what I worked on for a little while: integrations between Gatsby and Drupal, because Gatsby was very integrated with many different CMS systems. So yes, a blast from the past.
All right. And yeah, I see some really good chats, like "cool project" and "time to cry, made a similar template."
Don't cry. Don't cry. There's another week, you know.
Yes, there's more time, and there are
still a bunch of raffle prizes. Even if you didn't make it to an actual award, we can still get your template on the Mastra templates website. We're still gonna shout about it on social media. So, don't worry, it is not over. There's one more prize.
Pull up the comment from extra dip of honey. I just think that's really funny.
Yeah. "Made a similar template. Gunning for Shane's fave. Is it over?"
It's never over. It's never over.
Appreciate you trying to get my favorite. I bet I can tell which one it was, because there were a couple that were very, very close. All right, last but not least: funniest. So, in honor of Abby, who's out sick today but wishes he
was here, Abby's funniest was the hackathon evaluator, because why not? Why not be very meta and build a template that evaluates all the other templates that are part of the hackathon that you're in? That's pretty dang funny.
I agree. And, you know, potentially useful. I
want to try this one out. And with that, as we mentioned before, we're having an encore week, so we're still going to have the best overall. That is still not done. The
best MCP server: if you already submitted a template, you can submit another one, or add to your existing one if you want to be part of that category. So just remember, it's not over yet. I know before we said you could only submit one, but we have these extra
categories, so we want some extra submissions. That one's judged by Smithery, and there's also a bonus award: if you use Smithery, you have the chance to win a Switch, too. The best use of auth, judged
by WorkOS; the best use of web browsing; the best RAG template; and the best crypto agent. So, if you already submitted in those categories, there's a reason we probably didn't showcase you. We didn't showcase very many from those categories because we didn't want to show people what's already there, but you're still in the
running. It just means we didn't quite get enough to call it yet. So, please, if you are watching this, submit to one of those categories this week. Again, the awarded prizes are the $512 Amazon gift cards, and we will be
selecting a large number of them to get featured in the Mastra templates library. We'll be rolling that out throughout the week, and we'll link to your name, your GitHub, all that. Thanks to our sponsors: Recall, Smithery, Scrimba. Please check them out.
Please tell them thank you for putting this hackathon on. It's been a lot of fun. And we'll share out these slides. If you are curious how to get started, go to our Discord. Go to
the mastra.build channel in Discord. We'll send out a chat message with this slideshow, with the links to the form, all that information. Also, go to the website: just mastra.build.
Um, and that'll have the information as well.
Yes, good call out. And we still have more prizes, right? We still have the raffle prize pool. So: a Raspberry Pi, a copy of the AI
Engineering book for one of those. Everyone gets a copy of this book, Principles of Building AI Agents, from my co-founder Sam. Also a mechanical keyboard, mute-me buttons, and a bunch of Scrimba memberships. It was really cool that Scrimba gave us a bunch of free memberships for people. So, a lot of prizes to give away still.
All right, this is live. If you do have any questions, please do ask. We'll try to answer a few, and then we've got to keep moving on, because this is gonna be a really long stream today, and I think I'm gonna kick you off. You've got to go, so go ahead.
Thank you. It was really fun. Happy building, everyone.
Thanks for coming on, Shreeda.
All right. Before we jump into
the news, we got a comment here: "Just realizing I have no audio in my demo. It was a template that builds templates."
Yeah, you should still
send it. I think we realized in your video that you didn't have audio, but it was still a very good submission, one that we may feature next week. If you send a new video with audio, we'll play it on the live stream next week, because that was another very cool template.
It's a very good coding agent example. Xan build said he was facing problems with deployment. Jump into the Mastra Discord channel; I'll try to
help you out, or someone on the team will help you out. And with that, wow, we are over an hour and 15 minutes in, and we just finally got to the third category.
If you were paying attention at all last week, you will know that there was a ton of AI news. And so we're going to go through that. But I thought it was useful to kind of call out OpenAI's launches in their own section rather than just bundling them with the AI news because there's a lot of really interesting big big things that came out of OpenAI last week.
So early in the week, OpenAI released their GPT-OSS models. So let's pull this up and let's actually take a look. So here was the announcement. August
5th, last week they had the gpt-oss-120b and the gpt-oss-20b. So you can read the entire article here. It talks about the architecture, the number of layers, the number of active params, the context length. You can see that they do have how it
compares. So you can see, you know, with tools, without tools, Humanity's Last Exam, the accuracy compared to some of the other models like o3 with tools. So, not as good as o3, which you'd expect from their open model; they're not going to put out their best stuff as their open model. You can see HealthBench,
see some other benchmarks, and overall competition math. I guess I'll give you my quick takes. One, I was incredibly happy to see that OpenAI was actually just releasing an open model. Now, it says OSS, but it's not really open source. It's an open-weight model. So keep that in mind. It's not, you
know, fully open source, but it is an open-weight model. And that's a good thing. It's good that a large company like OpenAI is going to put time into building open-weight models. I think if you're part of the OSS community, if you
want great open models that you can run on your own hardware, this is a good showcase that they at least, you know, they're not forgetting about the open source community. So, I do appreciate that. I will uh, you know, highlight a couple other things and then maybe give you some more of my takes. This was kind of the
announcement post here talking a little bit more. So this is from Romain Huet. So two open-weight models, as we mentioned, and they wanted to build them for developers. So they really talked to the developer community. They learned about, you know, how tool
use and structured outputs matter. They're highlighting some of the benchmarks. The size is uh, you know, pretty good. So, the 120B runs on a
high-end MacBook or on a single H100. I don't really know if the 120B will run on a high-end MacBook very well. So, technically, you can run it. I don't know how well. Um, the 20B runs on
consumer hardware. I ran the 20B on my not super high-end MacBook, a couple years old. Uh, it's an M2 and it ran okay. It was pretty slow, but it did
run. You know, I would imagine if you wanted to run the 20B on a higher-end MacBook, it would run pretty well. I don't know if the 120B would run quickly, but on an H100 it's at least approachable for someone to run. And it does say the evals are remarkable. A little caveat there that we'll talk about, but overall,
you can see more, you can learn about it. Very cool to see that they actually released these models. I think everyone should try them out. If you don't know how to try them out, I did have a thread on X talking about how
you could build an agent with Mastra using these as the actual agent model. So if you, you know, scroll back on my feed a little bit, if you follow me on X, you can find it. But I did show really easily how you can use these models with LM Studio. So really recommend trying out LM Studio if you haven't run open models. Make sure you
have a good enough machine to do so. But give it a shot. It's really easy to do.
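When LM Studio serves a model locally, it exposes an OpenAI-compatible HTTP API. Here's a minimal sketch of calling a gpt-oss model through it from TypeScript; the port (LM Studio's usual default of 1234) and the model id string are assumptions, so check the LM Studio UI for the actual values on your machine:

```typescript
// Minimal sketch, assuming LM Studio's local server is running with its
// OpenAI-compatible API (commonly http://localhost:1234/v1) and a gpt-oss
// model loaded. The port and model id are assumptions -- verify them in
// the LM Studio UI before running.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Pure helper: build the JSON body for a chat-completions request.
function buildChatRequest(model: string, messages: ChatMessage[]): string {
  return JSON.stringify({ model, messages, temperature: 0.7 });
}

// Send the request to the local server (not called here, since it needs
// LM Studio actually running).
async function localChat(model: string, messages: ChatMessage[]): Promise<string> {
  const res = await fetch("http://localhost:1234/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildChatRequest(model, messages),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}

const body = buildChatRequest("openai/gpt-oss-20b", [
  { role: "user", content: "Write a haiku about agents." },
]);
console.log(JSON.parse(body).model); // openai/gpt-oss-20b
```

The same OpenAI-compatible endpoint is what lets a framework like Mastra treat the local model as just another provider by pointing the base URL at localhost.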
There was a little bit of, you know, I guess controversy over how these models were trained. And so I thought this thread was worthwhile to share as well. So Jack Morris here, you can see this got quite a lot of views. He was curious about the training data. So Jack
generated a bunch of examples, did a bunch of deep dives. So here's a map of the embedded generations. So the model loves math and code, which makes sense. I think they were building these models
for the developer community. So it would make sense that they really focused on math and code as, you know, most of the training data. But some of the interesting stuff that came out from doing all of these examples is that none of the generations resemble natural web text,
and they don't look like normal chatbot interactions. So the consensus, or the thought at least from Jack, is that these things were clearly trained through reinforcement learning to think and solve tasks for specific reasoning benchmarks. So basically they were benchmark-maxxing, which essentially
means they wanted to score really well on the benchmarks. They wanted to be able to say they had a really good OSS model that performed well on all these benchmarks. Is that actually true? I
don't know. Jack is pretty convinced it's true. And, you know, I thought this was a funny statement: it truly is a
tortured model. So the model hallucinates a programming problem about dominoes and attempts to solve it, spending 30,000 tokens in the process. Completely unprompted, the model generated and tried to solve this domino problem over 5,000 separate times. So,
it's not perfect. Do I think that OpenAI wanted to get a model out that just did really well on benchmarks and they don't really care beyond that? I'm sure they care about it, but I also know they probably wanted to be able to say they had the best open model, or one competitive with some of the other open models that are out there. So, it wouldn't surprise me. I
still think it's a good option if you are interested in running open models specifically. It seems like it would potentially be a pretty good open model for coding. So, I would recommend trying that out, running it locally, seeing how it does. Maybe you
can, you know, if you're on a plane without Wi-Fi, you can run an open model and still use Cursor or Claude, well, not Claude Code, but you could use Cursor and still have it use that model, or build your own code agent with that model. So, overall happy that they're releasing the open models. I don't know if
they're not necessarily going to change how I am building agents, because I think for the most part most of us need the best models that there are, and these clearly aren't the best. But they do maybe fill some gaps, and they at least showcase that OpenAI is willing to invest some amount of time and effort into
working with the open source community. And the better open models we have, I think the more it pushes the frontier models forward even. And this is another OpenAI announcement: they're providing ChatGPT to the entire US federal workforce. So it basically gives all federal agencies access to ChatGPT Enterprise for $1 for the next year. It doesn't say what it's going to be
the year after, but a dollar for a year of ChatGPT is a pretty good price point. So, you can see that they're trying to get it used within the government, get ChatGPT access into the government. Interesting announcement. I don't know what to make of it, but the government's going to be using ChatGPT.
All right. So, we do have some questions. Uh, does GPT-OSS support image gen? I
actually haven't tried it. I don't know if it supports multimodal. I'm going to actually take a look. I don't see any mention of it. I'm imagining maybe it doesn't. So, not exactly sure.
If someone else in the chat knows, does GPT-OSS support image gen? I didn't ask it to generate an image, but my guess is it probably doesn't. There's a comment here. Rash G put it
best: open AI by OpenAI. That's kind of funny. Uh, Shabir did ask, this is
back, you know, related to the hackathon, not necessarily OpenAI, but what were the judges focused on while judging? It was kind of creativity, the overall code quality of the template, the demo and what it showed. So it was a combination of a lot of different things that were looked at when judging, but it was
kind of those categories: creativity, the actual code and the code structure itself, and then the demo and the readme, the documentation, all that was kind of packaged into that. So across those three categories. So that's the OSS models, but OpenAI wasn't done last week. It was kind of like a launch week for OpenAI, I would say, of sorts.
And let's talk about GPT-5. So, first of all, if you're watching this live, have you used GPT-5 yet? What did you think? I will say when they released it on August 7th, I tried it, and I think
there were some issues with the routing agent that basically routes between the different models. They fixed it, you know, relatively quickly, but they did have some issues during the launch, I think. So, for people that tried it immediately: if you tried it right when it was launched, maybe go back, try it again. Uh, see if it's improved. I think it has.
You can read about everything it can do. One of the cool things it can do is build an app right within ChatGPT. So you can have it build an application.
So if you are a Lovable fan, or Bolt.new, or Replit, well, OpenAI is now, you know, competing with you a little bit. It's definitely not as full-featured, of course, but you can build applications now directly within ChatGPT. You know, we can go through the scores. The one thing I
will note is although it scores very well, this is the first OpenAI launch of a major model where they didn't win on almost every benchmark. They still didn't beat Grok on Humanity's Last Exam. There's a number of other benchmarks that they didn't win. SWE-bench Verified, they didn't beat Opus. So, if you're a Claude or
Anthropic fan, you know, Opus is still king, I believe. So, it's a very good model. I will say, based on the benchmarks, I do really believe it's a good model. It's a
big improvement. But when you go from three to four to five, you kind of expect this huge jump. And I think from three to four, there was a big jump. From four to five, it's more
incremental. And I'll share an article here that summarizes it way better than I can, but when you think about it, three just kind of blew everyone away, right? That was ChatGPT's moment. Four was really about this mixture of experts. So it was like using multiple
experts to kind of build the best response. And then o3 came and you had reasoning, and that was like another step up in what it could do when you added reasoning to the mix. And then with five, I think the big innovation is it's like a mixture of models, where it's not just one model, but it's almost like a routing agent routing between these different models, right? So it's using the best model
based on your request. So rather than having to select that you want it to do research or to use a web browser, you just ask the question, and based on your request it will generate, you know, maybe a quick answer if it's a simple question, or maybe a very detailed, more complex answer if it
needs reasoning, needs to go out and do some browsing, needs to do some deep research. So there's that idea of a single box that just does everything for you. You don't have to click any buttons. I think some people don't like it. But I do think that's the future. And so I think
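The routing idea described above can be sketched in a few lines. To be clear, this is purely illustrative: GPT-5's actual router is a learned model, not a keyword heuristic, and the model names below are made up for the example.

```typescript
// Illustrative sketch only: a real router (like GPT-5's) is learned, not a
// keyword heuristic, and these model names are invented for the example.
type Route = { model: string; reasoning: boolean };

function routeRequest(prompt: string): Route {
  // Cheap signals that a request probably needs a heavier reasoning model.
  const wantsDepth = /\b(prove|analyze|research|step[- ]by[- ]step|debug)\b/i.test(prompt);
  const isLong = prompt.length > 400;
  if (wantsDepth || isLong) {
    return { model: "big-reasoning-model", reasoning: true };
  }
  return { model: "fast-chat-model", reasoning: false };
}

console.log(routeRequest("What's the capital of France?").model); // fast-chat-model
console.log(routeRequest("Prove that the sum of two even numbers is even.").model); // big-reasoning-model
```

The point is the shape of the design: one entry point, with the dispatch decision hidden from the user instead of exposed as buttons.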
what GPT-5 proved to me is a couple things. And some of these are, you know, maybe hot takes, but I think the age of drastic model improvements is temporarily over. This was not, you know,
what everyone was expecting. There was a lot of hype around GPT-5 probably 18 months ago, when it was kind of mentioned that it was going to be this superintelligence. It's not that. It's an improvement on GPT-4, and it improved on some of the benchmarks against some of the other model providers, but it's not even best-in-class in everything. Maybe in certain things it is. So the idea that we
are just going to keep getting, you know, insane improvements on models, I think, is over for now. You know, there'll be some innovation, but we're taking LLMs into at least some kind of reasonable slowdown in actual accuracy, in what they can do, in the quality of the results. But I do think that over time, you know,
we'll probably come up with some new methods, and likely there'll be some, you know, big step changes in the future. But for now, I think we've kind of, not leveled off, because every improvement does unlock new capabilities that weren't possible before, but I don't think we're going to see the big jumps that we were seeing before. And I
think that's actually a good thing, because if you are building in this space, you know, like we are at Mastra, and a lot of you are if you're watching this, it actually means that maybe the ground can settle just a little bit. And because of that, we can actually come up with the right patterns to use and to build upon what we actually
have. Because when things are changing so drastically, it's kind of hard. You're building on a shaky foundation that's going to change and potentially get ripped out with any new innovation that comes out. And while I still think we'll see some of those innovations, I do think that maybe we can start building a little bit
more standards. We can start figuring out what the best practices are and which use cases are actually achievable. Right now it's been largely driven on a lot of hype, a lot of promise and possibility, and now I think we can start to figure out what the models are good at today, what they can actually do. And then every new incremental increase, which will still come when we have new models, can
start to unlock new capabilities that we're close to being able to do today but maybe can't quite get to. And so I think this is actually a good thing. I don't think it was a particularly good launch for OpenAI, in my opinion. I think
it was kind of a flop, but I do still think it was a good model. So it wasn't the game changer that I think people were hoping for or thinking it might be. You know, there was a lot of hype, but overall I still think it's a pretty good model, and I think a lot of people are going to use it, and OpenAI is still very good at product. They have by far
the single search box, or not a search box, the single prompt box. One message routes between all the models. Genius. That's the way it's going to be. People are going to copy that eventually, once they figure out how to do it, and all the other providers I think will take away all the buttons and
just give you that experience where it just does it for you. All right, so we have some messages. Xan build says, "Can I build Mastra templates with AG-UI?" So, a template isn't a front end, but you could easily
build a template for the Mastra agent and then build some kind of front end in a separate repo that you link to, to say, here's the backend Mastra agent and workflows, and here's a separate repo if you want an example of how to build a frontend using AG-UI and CopilotKit. So, it'd be a cool use case, but
technically a template can't ship a frontend with the template. All right, so enough about, uh, I guess one more quick GPT-5 mention. I would recommend reading this article, GPT-5 hands-on and welcome to the stone age. It's from Latent Space.
Just a good, you know, more detailed, much more technical take than I went into today. So I would highly recommend reading through this. It talks about, you know, how it does at coding. It talks
about the mixture of models. I mean, it's very detailed, and I think it might be like a series; there might be a couple posts in this GPT-5 series. They clearly had access ahead of time, and so they
were able to spend quite a bit of time to go through it, review it, and talk about it. So, highly recommend reading this. It's just a good thing to subscribe to as well if you haven't.
Let's talk more generally about AI news. There was a lot more than just OpenAI's launches last week. Anthropic did some things.
First off, they released what they call their framework for developing safe and trustworthy agents. It's more of a policy, released on August 4th, so about a week ago. And it's more of just, like, principles, right? Like how they think about building. It's more of a stance. They're
taking a stance, which is good. You know, their core principles are keeping humans in control while enabling agent autonomy, transparency of agent behavior, aligning agents with human values and expectations, protecting privacy across extended interactions, and securing agents' interactions. All good principles. I
think no one's going to disagree with that. You can definitely read through this. It reads kind of like a, a little bit of an idealist view of what they want when building agents, at least how they're thinking about it, how they're thinking about it with Claude Code. So I'd recommend, you know, reading it. It's good to see. Not much
more to say about it than that, but overall a good article. Take a look. Next, a bit bigger of an announcement was Claude Opus 4.1.
So, if you have not tried this yet, I use Opus in Claude Code. It's very expensive, but it does good work. And I will be using Opus 4.1 as well. And
you can see the SWE-bench Verified results, you know, what the accuracy is between, you know, Sonnet 3.7, Opus 4, and Opus 4.1. So you can see the changes are a little bit more gradual, but it's still an improvement. And overall, I believe, and anyone in the chat correct me if I'm wrong
here, this is now the best-in-class coding model. I don't think GPT-5 beat it out, but again, GPT-5 is a good model. Opus is still a little bit better. The
difference is Opus is much more expensive. So, you might be able to get by with GPT-5, and it's going to be a lot cheaper for you. But Opus is still, in my opinion, based on me playing around with it but also on what I've seen online, quite a bit better.
I am curious, if you are watching this, have you tried out 4.1 yet in Claude Code or in Cursor? Any thoughts, any comments? Drop a comment in the chat and let me know. We have uh more AI news,
this time from Microsoft. Microsoft has a GitHub Copilot announcement. This was on August 5th, an improved pull request review experience. So, we mentioned and kind of joked about Greptile, and we also mentioned the
Mastra template earlier that allows you to build your own GitHub pull request agent, but GitHub Copilot is, you know, of course there. Some of us at Mastra have been testing it, but you can basically talk to CopilotKit, or not CopilotKit, Copilot, getting my Copilots messed up, but you can talk to GitHub Copilot and actually just say, "Hey, @copilot, can you do something for me on
this PR?" And then it'll actually do it. So, you can tag it. It'll go ahead and try to accomplish the task for you.
It'll run in the background and do it. And, I'm curious if anyone's listening, are you using Greptile? Are you using Copilot in your GitHub PRs? How are you
reviewing your PRs? Xan says, "Never tried Opus yet. I'm assuming Opus 4.1. Only been using GPT-5."
That's cool. And let's talk about some more news. Google had a bunch of things going on last week. So, Google wasn't quiet while OpenAI was
trying to steal the headlines. So, this is related to Gemini CLI. So, Gemini CLI GitHub Actions. It's an AI coding teammate for your repo, very similar to
what we just talked about with Copilot, but you can essentially connect Gemini CLI, and you can tag Gemini CLI and have it do things for you. It'll run in the background, and it's a way to integrate what was released pretty recently, right? Gemini CLI is kind of a competitor to Claude Code, right? So, if you haven't used their CLI, I've heard some good things about it.
It obviously uses Gemini models, but you can now wire that up right in your GitHub Actions. You can tag it, you can get it to do things for you in GitHub. So definitely a tool for improving automations if you are using Gemini as your coding model, which I know a lot of people are. Similarly, or somewhat related,
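Wiring an agent into a repo like this typically comes down to a small workflow file. Here's a rough sketch of what such a setup might look like; the action name, trigger, and secret name below are assumptions for illustration, not taken from Google's announcement, so check the Gemini CLI GitHub Actions docs for the real identifiers.

```yaml
# Illustrative sketch only -- the action name, trigger, and secret below are
# assumptions, not the real identifiers from Google's announcement.
name: gemini-agent
on:
  issue_comment:
    types: [created]   # run when someone comments on an issue or PR
jobs:
  respond:
    if: contains(github.event.comment.body, '@gemini-cli')   # only when tagged
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Gemini CLI on the request
        uses: example-org/run-gemini-cli@v1   # hypothetical action name
        with:
          api-key: ${{ secrets.GEMINI_API_KEY }}
```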
Jules, which is their asynchronous coding agent, you know, somewhat comparable to OpenAI's Codex, is now available for everyone. So, it's out of beta. It's launching publicly. It's powered by Gemini 2.5.
And again, this idea of being able to, you know, dispatch tasks to an agent that just works on them and responds with basically a pull request, with the ability for you to go back and forth with it a few times, is a use case that's going to exist. And I honestly love Codex
from the UI perspective. That's exactly what I want. It just hasn't done as well as I want it to. Maybe with
GPT-5, I don't know if I can use that with Codex yet, it'll do even better. That would be great. Um, I haven't tried
Jules yet, but ideally it's pretty similar, right? That's the goal: you can dispatch these asynchronous tasks. Rather than assigning a Linear issue to yourself or someone else on your team, you just send it to Jules or to Codex and it does it
for you. I think this idea of async coding agents eventually needs to connect to your project management system, your Linear or your Jira or whatever. And once it does that, and once it has a really good integration, they'll become even more useful.
But I do think these kinds of async coding agents are going to be big, and we're going to be sending off a lot of our simple tasks that come in. Those things that kind of sit in the backlog where you're thinking, well, I could probably do that in like a half hour or an hour, I just don't have time, and I have to create the PR and get
it reviewed. And this isn't going to do all those steps for you. But if it can take the first pass at it, it might be able to turn that hour task into a 15-minute task, right? It takes away part of that work where you can just dispatch that task. When it's done, you review it. You send it to your team maybe for a review. You know, you have another
agent that reviews its code, of course, and then eventually you have the confidence where you can get that thing merged. So, async coding agents are going to be big. I've been using them quite a bit, you know, whether it's Cursor background agents, whether it's Codex; I haven't used Jules, but even Claude Code I basically use as a
background agent often. So, we'll see more on that front, but good to see that that is now publicly available. Now, this might be my favorite release last week, and I just wish I had access to it. So, if you are someone that has access, I'd be anxious to talk to you or, you know, figure out
how I can get access. I would love to try it out. Google DeepMind introduced Genie 3 last week, so this was August 5th. It's the most advanced world simulator ever created. So, I'm gonna watch this video because I
just think it's cool. If you have not seen it, it's essentially a way for you to create a prompt and then it actually builds a world around that prompt. It is incredible. So, we're going to watch this video, and please let me know in the chat if you've seen this yet, but it might be
my favorite thing that was released in all the news that happened last week. This was the thing that blew me away the most. So, let's watch it.
What you're seeing are not games or videos. They're worlds. Each one of these is an interactive environment generated by Genie 3, a new frontier for world models.
With Genie 3, you can use natural language to generate a variety of worlds and explore them interactively, all with a single text prompt. Let's see what it's like to spend some time in a world. Genie 3 has real-time interactivity, meaning that the environment reacts to your movements and actions. You're not walking through a pre-built simulation.
Everything you see here is being generated live as you explore it. And Genie 3 has world memory. That's why environments like this one stay consistent. World memory even carries over into your
actions. For example, when I'm painting on this wall, my actions persist. I can look away and generate other parts of the world, but when I look back, the actions I took are still there. And Genie 3 enables promptable events, so
you can add new events into your world on the fly. Something like another person or transportation or even something totally unexpected. You can use Genie to explore real world physics and movement and all kinds of unique environments.
You can generate worlds with distinct geographies, historical settings, fictional environments, and even other characters. We're excited to see how Genie 3 can be used for next generation gaming and entertainment. And that's just the beginning. Worlds could help with embodied research, training robotic agents before working in the real world,
or simulating dangerous scenarios for disaster preparedness and emergency training. World models can open new pathways for learning, agriculture, manufacturing, and more. We're excited to see how Genie 3's world simulation can benefit research around the world.
All right. And then it goes into, you know, showing some examples. You can obviously click on the link and read the full article from DeepMind.
Just pretty wild, in my opinion. The idea that from a prompt you can create a simulated world that you can actually interact with. It's not just a video. It's not just, you know, it started with
images, right? Then it was video, and now we're actually able to generate real-world simulations. So this is not available for anyone to use yet, unfortunately. I hope that it will be soon. I would love to get access to it,
because if you've ever played Minecraft and you get to the edge of the world, you've got to wait while it generates a new part of the Minecraft world. It's kind of like that, but actually much more realistic and not just procedurally generated, right? It is actually generating things on the fly, but
maintaining coherence with what you're doing, which is pretty mind-blowing to me. So, if I had to pick my favorite release from last week with all the stuff going on, this is it. This is pretty cool. I'm
anxious to see what comes of it. I know it's just kind of a research preview and it's not available to everyone yet, but hopefully it will be at some point, and we can all build our own worlds and simulate things in relatively realistic, you know, environments, or, you know, change the physics or whatever, and
build whether it's games or entertainment or whatever. You know, they talked about all the different possible use cases. Education. I see so
many possible use cases in education as well. So very cool. We're not done yet. There's more news to talk about. So let's keep going. That's all from Google.
Cursor wasn't quiet though. There was an announcement from Cursor. So let's talk about that.
Cursor is now in your terminal. It's an early beta. Access all models.
Move easily between your CLI and your editor. So now Cursor is competing with Claude Code. They're building their own CLI. They realized that maybe, uh, you know, Claude Code was on to something. You
might want to dispatch things in your CLI. You might want to then pull it into your editor. And yeah, you can do that now in Cursor. So for you Cursor users, and especially those that are Cursor users
like me but also use Claude Code, curious if this is going to, um, make your list. I'm still skeptical that Cursor's CLI is going to be as good as Claude Code yet, but maybe with GPT-5 it's getting closer. I know it can still use Claude under the hood, but curious if Cursor's agent is going to, you know, ever catch up to or be better than
Claude Code. Time will tell. Let me know if you've used it, but try it out.
Cursor wasn't the only one with, you know, announcements last week. Continuing on, Vercel was fairly busy. So, let's talk about Vercel. On August 4th, Vercel MCP went
into public beta. So you can now use Vercel's official MCP server and you can interact with Vercel. You can search and navigate the docs. You can manage projects and
deployments. You can analyze deployment logs. So more and more companies, and I like to see this, are just bundling their documentation into an MCP server. So if you're actually, like, writing code, you
can chat with the documentation. Mastra's had this for, I don't know, four months now, and more and more companies are starting to add it. I think it's a really useful tool. One of the things we hear the most is
that people are just really happy that Mastra has an MCP server that has the docs. But of course, this is more than that. You can actually interact with your Vercel projects and do even more. So very cool to see more and more
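Hooking a hosted MCP server like this into a client usually just means adding an entry to the client's MCP config. As a rough sketch, a Cursor-style `mcp.json` entry might look like the following; the server name and URL here are assumptions, so check Vercel's MCP docs for the real endpoint:

```json
{
  "mcpServers": {
    "vercel": {
      "url": "https://mcp.vercel.com"
    }
  }
}
```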
support for MCP from some of these larger companies. In a similar announcement, kind of continuing on, we're going to bring Cursor back into the mix. I don't know if there was anything special they did. I kind of wonder, you know, they
released their MCP server, but now Cursor is officially supported on Vercel MCP. I think this is one of those marketing things: their MCP server, if they built it right, was usable in Cursor anyways. It was already supported, but, you know, you've got to get the marketing publicity. We released an MCP server. Oh, now you can use it in
Cursor. So now if you search for Vercel and Cursor, you probably get to this page. So, interesting. Uh, but my
question is, did your August 4th version of the MCP server not work in Cursor? Because that would be kind of weird. I assume it just did and you just wanted some extra marketing. But you can use Vercel's MCP in Cursor.
One more. Uh, let's say there was a little contention last week with Vercel. So, we can kind of talk through that. There's a little bit of drama,
and this was with the AI SDK and the AI SDK's provider architecture. There were some comments from people on, you know, how OpenRouter isn't an official AI SDK provider. There's a pretty good post here, though, from Lars explaining that the AI SDK has an open provider architecture. It means that, you know,
they'll have some official providers, but you can obviously submit your own. They're happy to have community provider pages for community-submitted providers. You know, Mastra is an open source company. We can't support, you know, every possible
thing. We have, for instance, workflow engines, right? We support, you know, Temporal and Inngest and all these different workflow engines, Cloudflare, but maintaining all those is a lot of work, and so we can't support everything. So I do understand where Lars is coming from here. You kind of have to draw
the line somewhere and then decide, you know, these are the ones we're going to support, the ones we deem important enough for the core framework or the core business, but these other ones will have to be community maintained, and then you do some stuff to encourage the community to keep those up to date. I think there is a little bit
of a question, you know, like you see the first comment here: "When's OpenRouter?" I think that's the big thing: is OpenRouter important enough? It seems like it is, but maybe it's a little competitive with Vercel, so it's maybe not important enough to Vercel. But overall, I mostly
agree with Lars here. You kind of have to draw the line somewhere, you have to decide where it is, and you have to stick to it. I just thought the little heated drama between OpenRouter and Vercel was kind of funny last week. There's more, uh, Vercel news. Vercel was busy last week. So, Vercel has
released AI Elements: beautifully designed React components for the AI SDK. So, open source and built on shadcn. They're easy to integrate and fully customizable. So the idea is that you can
now build an even better front end, potentially, for whether it's a chatbot or, you know, some kind of agent interface that you're building. So you kind of connect the front end to the back end, uh, with the AI SDK. So, interesting. I'm not surprised,
right? Vercel is very, uh, you know, they've always been big supporters, especially with shadcn, of really beautifully designed interfaces, and so now they want to make that even easier when you're building with AI or building for AI. We're getting close. Thanks to all of you still hanging out with me. We have three more news articles and then we're
going to wrap this thing up. As you know, we talked about the GPT-OSS models. There's a new Qwen model in town.
So, Qwen3 4B Thinking 2507. This was released, uh, I don't remember the date, but it was sometime last week. And it's just, you know, another open source model. It's always nice to see more open models. You know, as we mentioned previously, China seems to be dominating open models; then OpenAI, of course, got into the mix last week, but Qwen wasn't, you know, wasn't
quiet. There's a new model in town. It's a thinking model; it only supports thinking mode. And you can see, you know, how the 4B Thinking compares with some of the other Qwen models on thinking, reasoning, and coding. So, I haven't analyzed it, I haven't really looked into it, but it's always good to see more models. So, if you're a Qwen user, maybe check it out.
Second to last. So, if you're familiar with Zep, Zep is very focused on agent memory, and they've made some updates. Context engineering takes center stage. So, you know, for those of you that have kind of been through it, first
it was prompt engineering; now people are calling it context engineering. I think context engineering is the better term. But they talk through, you know, what context engineering is.
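As a rough illustration of the idea, and to be clear, this is a hypothetical sketch and not Zep's actual API: context engineering means programmatically assembling everything the model sees, instructions, retrieved memory, documents, under a token budget, rather than hand-tuning one prompt string.

```typescript
// Hypothetical sketch of "context engineering": assemble the model's
// input from multiple sources under a rough token budget.
// None of these names come from Zep -- they are illustrative only.

interface ContextPart {
  label: string;    // e.g. "memory", "docs"
  text: string;
  priority: number; // lower number = more important
}

// Crude token estimate: roughly 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function assembleContext(system: string, parts: ContextPart[], budget: number): string {
  const sections: string[] = [system];
  let used = estimateTokens(system);

  // Include the most important parts first; skip anything over budget.
  for (const part of [...parts].sort((a, b) => a.priority - b.priority)) {
    const cost = estimateTokens(part.text);
    if (used + cost > budget) continue;
    sections.push(`## ${part.label}\n${part.text}`);
    used += cost;
  }
  return sections.join("\n\n");
}

const prompt = assembleContext(
  "You are a support agent.",
  [
    { label: "memory", text: "User prefers concise answers.", priority: 1 },
    { label: "docs", text: "Long knowledge-base article... ".repeat(50), priority: 2 },
  ],
  100, // small budget: the long docs section gets dropped
);
console.log(prompt.includes("## memory"), prompt.includes("## docs"));
```

The point of the sketch is the separation: each source is a structured part with a priority, and the final prompt is a function of those parts plus a budget, which is what makes the process testable and tunable.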
They give some examples of why they think context engineering is better than prompt engineering, but then they also talk about how Zep, you know, connects and helps you actually do this context engineering, right? And you kind of see where Zep fits in, at least in their diagram. So
they have, like, a V3, and they made some changes from V2 to V3. So if you're using Zep, there are some improvements, kind of centering around this, uh, this idea of really helping with context engineering. And last but not least, this one's just a fun one. Not much, uh, not much to say
other than I just thought it was interesting. So, if you've been following the space for a while, you've seen that Meta's been hiring, you know, hiring researchers, right, for insane amounts of money. I thought this was interesting. The
days of, like, high-value, uh, contracts aren't necessarily done. So, the team behind Get MCP was basically swallowed up, grabbed by Monday.com.
So, an acquisition for potentially multiple millions of dollars. So even people building MCP servers are potentially candidates for the AI talent wars. So if you're building an MCP server, you know, you never know, your time might come. You might get a call from a big company wanting to, uh, bring you in
for your talent, especially if, as in this case, you know, they were pretty heavily involved in MCP early. And so that's why it makes sense. But, you know, I think it's hope for all of us that if we're just building a simple MCP server, there's still a chance. So, I just thought that was pretty funny. All right, everyone. We've
been doing this for two hours now. Exactly. Look at that. It's like we almost planned it to end at two
hours. I appreciate you all for sticking around. I appreciate anyone that submitted to the hackathon. We went through a whole bunch of hackathon submissions today. We watched a ton of videos. So, if you're just tuning in now, go check out some of
the hackathon submissions. They're incredible. We gave out some hackathon awards. We talked about how you have
until Friday for some additional award categories, some that we want to keep open for the hackathon. So, go to mastra.build and learn more there, and please join the Discord if you have questions. There's a mastra.build channel in our Discord. We talked through the OpenAI launches last week,
specifically GPT-5 and the OSS models. We talked through a whole bunch of AI news, from Anthropic to Microsoft to Google to Cursor to Vercel to Qwen to Zep, and even, you know, the AI talent acquisition war. So, thank you all for tuning in. This has been AI Agents Hour. Please
check out Mastra if you haven't already. Please follow me on X, smthomas3. And if you don't have a copy of this book by my co-founder Sam, you can get a digital copy by going to mastra.ai/book. Learn about building your own AI agent.
Thanks everybody for tuning in and we will see you next week. Goodbye.