
Mastra Model Routing, Claude Sonnet 4.5, Sora 2, and all the other AI news

October 2, 2025

AI Engineer Paris Recap, Mastra Model Routing, and all the crazy AI news from the last two weeks!

Episode Transcript

3:46

Hello everyone and welcome to AI Agents Hour. It's October 2nd. We haven't done

4:13

a normal show at a normal time for a long time. Normally we do this on Monday at noon Pacific, but today's Thursday, and it's still noon Pacific. But good to be back. Hopefully going

4:25

forward we'll be back to our regular schedule. Some of the crazy traveling is over. But that's kind of what we want to talk about first. We were gallivanting all over the world and now we're back. Today we're in SF.

4:38

The real SF: Sioux Falls. Sioux Falls, South Dakota. The real SF, as Obby says. Uh

4:45

I'm sure you all believe that, right? You Yeah, this is the real SF. Also, today's sponsored by Red Bull. Cheers. It's not

4:52

sponsored. Not sponsored. We are not sponsored by Red Bull or any energy drink company yet.

4:58

Yeah. But if you want us to show your product for free, let us know. Yeah. I mean, we, you know, we do need

5:04

to talk about Sentry, their, you know, gold sponsorship, or diamond sponsorship, which now means we pay them. That was funny, though. For those of you that didn't get the joke on X, Obby basically said, "We would sponsor Sentry for free, we'd be so happy. We'd pay just to

5:23

sponsor Sentry." Well, the whole thing was, um, Dax from OpenCode was like, "I'm sick of all these people shilling things on Twitter without any affiliate disclosure in their profiles and stuff." And then I just said, "I'll shill Sentry on the live stream for free." And then David

5:43

asked us, "How much is a gold sponsorship?" Zero. So y'all should use Sentry.

5:48

Yeah. Yeah, we do like Sentry. We like David. We like Replit. Yeah,

5:53

we like Celsius. Yeah, we will disclose if we're ever paid by any of these sponsors, but we're just telling you stuff that we like. And you know, Lawrence likes Red Bull, too. So, congrats. Gives you wings. Gives us wings. Yeah, but we do want to start by just

6:11

kind of recapping. So we were in Paris for AI Engineer. That was last week, and it feels like longer than that, but, you know, what is time. But we do want to talk just a

6:23

little bit about what did we see, what did we learn. If you're in the chat, tell us where you're from, because that's kind of cool. We have people all over the place. Dev's in here. Dev. Yo, what's up?

6:34

Dev, good to see you. For those who don't know Dev, he's prolific, at least in my life, in terms of open source projects I've watched. But yeah, Dev's more of a matcha guy. Yeah, good to know. Uh yeah, so let's talk a little bit

6:53

about AI Engineer Paris. Obby, what were your takeaways? What were your thoughts? What did you think of the

6:58

event? And yeah, and then I'll give you my take. My hot take is it was too French. No, I'm just kidding. Um the Okay, a couple

7:06

things. And then, here's a comment in the chat: "Hello from France." Um yeah, Eric, let us know if you were at AI Engineer or if you missed it this year. So okay, I'll

7:23

just keep it for what it is. The Station F campus that we saw, which is like the YC of France, is beautiful. They have crazy amenities and a crazy cafeteria where we were all having fun and drinking and stuff. The conference itself, outside of Swix's talk or the

7:43

keynotes, I just thought was boring, because everyone was just shilling products as their talks. And I come from the era, and I think a lot of people do, where you go to a conference to learn and see new concepts and stuff. Not to be told, hey, here's this idea, and if you buy this product you get the idea. That's not how it works. So I didn't like it from that

8:08

perspective, but very well done as an event as a whole. Um, one last thing: my brother got food poisoning from it, because they had, you know, hors d'oeuvres, and everyone puts their hands in the hors d'oeuvres, so he got a little sick. But did he? Yeah, unfortunately. But other than that, it was chill.

8:26

Yeah. I mean, I'll agree that Station F was awesome. It's just a cool venue, a cool place. I will agree that there

8:34

were some sponsors, you know, that just had so many talks about similar topics, and they got those talks because they paid money, which was unfortunate. Yeah. So I would agree with that. Um, I didn't

8:47

see that many talks, though, so I can't complain about the content too much. I thought the keynote by Swix was good. The main takeaway from that: a lot of people have been talking about 2025 being the year of agents, right, and his argument is that it's actually going to be the decade of agents. And it's true, right? If you've

9:09

been playing around with any of this stuff, you know we haven't figured out a lot of it. There's a ton more you can do with it, a ton more we're all still learning and improving and trying to make better, and that's going to take a long time. So yeah, that was very cool to see. He had this diagram of this LLM OS, which you

9:28

know is like, he has the LLM, and you have memory, and you have tools and MCP, and all these things connected to the LLM. But it actually just looked like an architecture diagram of Mastra. So I was like, oh, well, I guess maybe we're on to something here. Let me see if I can pull it up too.

9:45

Yeah, see if we can find that. Yeah. So this is the LLM OS. This was what was shared with us, and then, to be a little jokester, when I came back and we shared it with the company, we just did this, and that was fun. Yeah. So it's very much

10:08

close; it matches closely with what we've been building. So that was good to see, you know, feeling like others are seeing the same things. I would say one thing that was great about the conference was the people. Yeah, you don't know what to expect. You know, we

10:25

go to a lot of stuff in SF, and that feels like a little bit of a bubble, isolated from the rest of the world in some ways, because everyone talks about AI all the time in SF, it seems like. But the quality of the individuals, the engineers, the developers that were there: very high. Had great

10:43

conversations. They're all struggling with the same things we struggle with. They have, you know, a lot of the same ideas. Um, you know, a lot of great

10:51

questions, a lot of good conversations. It seems like I never go a week without someone asking, "How do you handle it when your agent has too many tool calls?" That came up multiple times.

11:03

A lot of the same things that we're going to continue to figure out, I think. So if you do have the opportunity to go to an AI Engineer conference, which I know they're now having in London in April, I'd recommend going if you're in that area. And then of course there have been ones in

11:21

New York and San Francisco. There's a code summit coming. I think Dario from Anthropic is like the headliner or whatever. I don't know if that's been announced, but I think that's Oh. You heard it from someone else.

11:34

Uh yeah, nobody watches this, right? Yeah. Nobody's shared it. Maybe

11:39

it has been announced. I don't know. But I won't tell you who's going to be at London. I'll tell you, you know. Um

11:46

What's up, Manny? Hey, Manny. Yeah, and seeing you all over X.

11:52

Um, yeah. So, great event. I'd highly recommend going, especially if you're dealing in this stuff often. Sometimes it's nice just to talk to other people that are going through some

12:04

of the same things, figuring things out. You learn some tips and tricks, of course, you build some relationships, and, you know, you also probably get pitched some products too. But that comes with the territory. I'm curious what the Brits are going to pitch. Probably the same products that they pitched at AI Engineer France.

12:21

One other thing that came up non-stop: too many tool calls, and the "it only happens some percent of the time" pain. Yeah. Story of my life. Very true. Lawrence, there's a new definition of big coming

12:34

out. So we make fun of Big Eval. And I think there's a new concept coming out called Big Memory, because a lot of database companies are pushing memory features and teaching people that you have to use their database to do memory with agents. So I think a new classification of Big

12:54

Memory wants you to buy their products, and we'll probably see that for the rest of the year, because in the Swix presentation, memory had a question mark, which means it's not winner-take-all yet; it's open for everybody. And a lot of other things have been kind of figured out. So we'll see how that progresses from now until

13:13

April. Yeah, I do think we're already seeing a lot of those companies either repositioning or having a product specifically for memory. Yeah, hot take: tool-call memory is fake.

13:30

All right, nice. Um, but yeah, as you can see, this is live. Imagine you had to meditate for an hour before every conversation. Yes. Uh, so as you can see, this is

13:43

live. You know, you might be watching this live on YouTube or LinkedIn or X, so feel free to leave a comment. We'll talk about them. We'll throw it up on the screen. But you might

13:55

also be listening to this later. We're on YouTube. Please go subscribe to our YouTube channel, Mastra AI, if you have not already. Also, we're on Spotify and Apple Podcasts. So, you know, if you

14:08

do like us and you do want to give us a five-star review, we will accept five-star reviews. If you want to give us less than a five-star review, please find something else to do. Go find Jesus or something else. Yeah, or just, you know, build some agents. I don't know. Don't give us a review then. Uh, yeah. Dev

14:27

asks: favorite talk from AI Engineer outside of Swix's keynote? Yeah, I don't really know if I have any. The Mistral one put Wart to sleep. So that just shows you how boring that

14:38

one was. But Swix's was the best for sure. Yeah. Uh yeah, the Mistral one had some

14:44

good content, but it was a little dry. Just dry. And yeah, Dev's gonna find AI Jesus.

14:53

All right, so let's go on to the next topic of the day. Also, Dev, you should come on the show in the future, because that'll be a good hang. Yeah. Yeah, Dev, you're always welcome. All right. So, let's talk a

15:05

little bit about Mastra model routing. Do you want to give a preview of what it is before I pull up the banger tweet that Tyler sent out? Yeah. So, um, months ago now, we were

15:20

working on supporting AI SDK v5 in Mastra. The problem with that was we had a lot of people still using AI SDK v4, and we couldn't really safely break them. Mastra is a framework. We have all these primitives, and some of them are bundled together in the same version, right? So it's hard to say, hey, we're going to break this piece of the code, but you're still using the old one.

15:47

At that point, we started thinking, okay, we should probably own a lot more of this internal infrastructure. So we created our own agentic loop primitive that you don't really worry about. It's just there, under the hood. What that allowed us to do then is try other things. And one of the things

16:06

we wanted to do, now having our own agentic loop, was: can we control the models themselves? Do people have to install models from a package, or can we just do it inherently through endpoints and env vars? And we did. So yeah, let's play Tyler's quick little demo video. Hey folks, so today I'm excited to share that Mastra has a new model router

16:35

feature. What this allows you to do is use 600-plus models from 45 different providers without needing to install any packages. Um, and let's look at my code to see what that looks like. So, I have the weather agent here, just a basic example. Um, you can see I just

16:53

have a string, "openai/gpt-5-mini". So, normally you'd have to install the OpenAI package, import it, configure it, and all that. But you don't have to do that anymore. So, the nice thing is you get TypeScript autocomplete for all

17:06

the different models and providers you can use. So, you can sort of use it as a search to see what you could try. So if we do Anthropic, you know, you could try the new Sonnet 4.5 right here. Or you

17:21

know, if you wanted to search on OpenRouter and see, you know, here's DeepSeek 3.1, or maybe you want the 3.2 version. Um, for now, let's just do OpenAI GPT-5

17:34

Mini. And we'll jump over to the playground and have a look at the changes there. Um, so we always had this little model picker up here, but before it was sort of limited. Now when you select the provider list, there are tons

17:46

and tons of providers to choose from. And you see which ones you have API keys for; those are the green circles. And which ones you don't; those are the red circles. Similarly, you see all the different models you can pick from. So it

17:58

should allow you to very easily try different providers and models. So let's just send a message to GPT-5 Mini, and then let's switch to Anthropic and do the new one, Sonnet 4.5, and we'll ask for Vancouver weather. Mic drop. There we go. We switch between different models. Um, what I really love about this is the ability to explore

18:29

and try different things out that I normally wouldn't try. So for example, we try Cerebras and the Qwen 3 Coder model. I've been really liking this. Maybe we'll get Toronto weather.

18:40

Um, you can see that was extremely fast because it was on Cerebras. A nice thing is this little eye icon here. You click on it, you'll get the documentation for each of the providers. So, you can see the Google docs here, and really

18:56

any of these. Um, and yeah, we're hoping that this will make it a lot easier for you to try different models and use them in your Mastra project. Um, upgrade to the latest Mastra 0.19

19:08

or higher and you'll have access to this. Damn, weekly downloads were nice last week, too. Yeah, it's even more if you go today. So, yeah. And to kind of highlight the

19:20

features, right? You have this nice autocomplete, thanks to TypeScript, in your model selector. You have the nice UI in the playground to make it really easy.
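Conceptually, a model ID string like "openai/gpt-5-mini" is just a provider prefix plus a model name. A minimal sketch of that resolution step (the function and type names here are hypothetical, for illustration only, not Mastra's actual internals):

```typescript
// Illustrative sketch: resolve a "provider/model" string into its parts.
interface ModelId {
  provider: string;
  model: string;
}

function parseModelId(id: string): ModelId {
  // Split only on the first "/" so model names that themselves contain "/"
  // survive (e.g. "openrouter/deepseek/deepseek-chat").
  const slash = id.indexOf("/");
  if (slash === -1) {
    throw new Error(`Expected "provider/model", got "${id}"`);
  }
  return {
    provider: id.slice(0, slash),
    model: id.slice(slash + 1),
  };
}
```

With a known catalog of providers and models behind it, this is also what makes the TypeScript autocomplete possible: the string type can be a union of every valid "provider/model" pair.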

19:33

And you know, Netlify also just announced that they have a new AI gateway. So we have day-one support for that as of yesterday. So definitely a big feature for us. There are some caveats to it. You know, it doesn't support every single model

19:50

provider feature, but the nice thing is, if you've used Mastra for a while, you know that we work really well with AI SDK, so you can just use the AI SDK provider if you do need some specific feature. But for most use cases, this is just going to work out of the box. Um, so there's a good question.

20:11

How are API keys managed? So in this case, you manage the API keys in your environment variables. So it basically tells you what the environment variable needs to be.
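The convention described above can be sketched roughly like this. Assume (and this is an assumption for the sketch, not a documented rule) that each provider maps to an env var of the form PROVIDER_API_KEY; the green/red circles in the playground then reduce to a presence check:

```typescript
// Hypothetical sketch: derive the expected env var name for a provider
// and check whether a key is configured.
function apiKeyEnvVar(provider: string): string {
  // "openai" -> "OPENAI_API_KEY"; dashes become underscores.
  return provider.toUpperCase().replace(/-/g, "_") + "_API_KEY";
}

function hasApiKey(
  provider: string,
  env: Record<string, string | undefined>,
): boolean {
  return Boolean(env[apiKeyEnvVar(provider)]);
}
```

In a real app the `env` argument would just be `process.env`; it is a parameter here so the sketch is testable.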

20:21

If you don't want to manage a bunch of keys, then you can use things like OpenRouter and some of the other gateways. Then you just need the one API key to get all the providers, which is why we kind of support both, right? You have the, you know, kind of other routers. So we kind of think of this as almost like a model router router in a

20:38

way, because it supports the other routers, or the other gateways. And then, um, or you can go directly to the provider if you want to manage all the keys yourself. So great question, though. We always do the meta thing, which is a

20:51

model router router. So yeah, and Brad's comment is very much what we were just saying: the Vercel AI Gateway is primarily useful for managing so many darn keys. Yeah, exactly. So the nice thing is you can just use the Vercel AI Gateway in

21:03

there and then swap between all the different models that you can get through Vercel's AI Gateway. So that's what's really nice about it. Yeah. Or, you know, you can send it to Vercel's AI Gateway or OpenRouter's, and we also announced

21:17

today we have model fallbacks as well. So if one of them fails, if Claude goes down, which sometimes happens, you know, unfortunately, then you could fall back to OpenAI or something as well. So these two things together make the whole model routing, model fallback feature really

21:35

powerful. Um, how was this built, just to throw it out there? So, models.dev is a really nice website. You should go check it

21:46

out. models.dev. They have an open source API that you can hit that gets all the

21:54

models that they've compiled. What we did is we fetch them. We generate TypeScript types. We generate

22:00

everything. So, you're all good to go. And yeah, that's our source of truth for the models. And then obviously we wrote some code to actually run them. Um,
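The codegen step described above, fetch a model catalog and generate TypeScript types from it, can be sketched like this. The catalog shape is an assumption loosely modeled on models.dev-style data; the real generator and field names are almost certainly different:

```typescript
// Illustrative sketch: turn a fetched provider/model catalog into a
// TypeScript union type, which is what powers editor autocomplete.
interface ProviderEntry {
  id: string; // e.g. "openai" -- assumed field name
  models: string[]; // e.g. ["gpt-5-mini"] -- assumed field name
}

function generateModelIdType(providers: ProviderEntry[]): string {
  const ids = providers.flatMap((p) =>
    p.models.map((m) => `"${p.id}/${m}"`),
  );
  // Emit a single union type covering every valid "provider/model" pair.
  return `type ModelId = ${ids.join(" | ")};`;
}
```

In the real pipeline the input would come from fetching the catalog endpoint at build time, and the generated file would be committed or emitted into the package so no network call is needed at runtime.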

22:11

but yeah, this is cool, but what's going to happen next is even cooler. So wait for that. Yeah, definitely. So Manny's comment: I've had issues before when looking for models to plug in, but I'd get an error

22:24

that AI SDK didn't support it, or there was another problem. Yep. So this should open up a lot more model providers. Yeah, which is great. Uh, Dev says the model/MCP router

22:38

is the new GraphQL router, right? Yeah. Very true. And then a banger

22:43

comment: when do we get the agent framework framework? Dude, soon probably. Yeah. Don't give us any

22:49

ideas. Yeah, we probably will do that, honestly. All right. Um

22:58

And then Brad says, you know, I want to fall back from one model host to another, from Fal to Replicate, but keep the model the same. Yeah, you could do that. Yeah, exactly. That's exactly why: that way the model is the

23:10

same, but maybe you go from one provider who's having some issues to the other. Um, and the nice thing about model fallbacks is you can configure how many times you want it to retry before it falls down to the next one. You can have an array; obviously you could have a whole list. You could have like three. And so, of course, you're going to add extra latency as it's trying, but at least it

23:28

might not fail. Yeah. And then, I can't give you a sneak peek yet, but soon.
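The fallback behavior described above, an ordered list of models, each retried a configurable number of times before moving on, can be sketched generically. This mirrors the behavior discussed on the stream, not Mastra's actual implementation; all names are illustrative:

```typescript
// Generic sketch of model fallbacks with per-model retry counts.
interface FallbackModel<T> {
  id: string; // e.g. "anthropic/claude-sonnet-4-5"
  call: () => Promise<T>; // the actual model invocation
  maxRetries: number; // extra attempts before falling to the next model
}

async function callWithFallbacks<T>(models: FallbackModel<T>[]): Promise<T> {
  let lastError: unknown;
  for (const m of models) {
    for (let attempt = 0; attempt <= m.maxRetries; attempt++) {
      try {
        return await m.call();
      } catch (err) {
        lastError = err; // remember the failure, retry, then fall through
      }
    }
  }
  // Every model in the chain failed; surface the last error.
  throw lastError;
}
```

Note the latency trade-off mentioned above: each retry and each fallback adds a full request's worth of delay, so short chains with low retry counts are usually the sane default.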

23:33

We're in the lab. We're in the lab. Um, could we put some simple telemetry on model switch occasions? Uh, that's a good idea, like when it switches or

23:46

like things like that. We don't have a telemetry event for it yet, but Lawrence, that's a great idea. I'm gonna write that down. Yeah. And I think we do tell what model was used in, like, the trace,

23:57

but there's no good way to see that. It does kind of exist, but probably not in the way you'd want yet. Yeah. And that's what we see

24:10

most of the time, Lawrence, when we talk to people: you test with one model and one provider, right? But you might want, for the occasion when Claude's down or a certain provider's down, to fall back to the other one. What would be more interesting in the future is if somehow, magically, the model

24:30

gets picked based on what you're trying to do anyway. Yeah. All right. Now, this part, uh, fortunately or

24:43

unfortunately, it's been a few weeks since we really did a deep dive on all the AI news, and dang, there's been a lot. There's a lot. So, this is going to be a longer segment. Normally we try to organize it, but there's just so much it's kind of hard to organize. So, we'll probably do some rapid fire. We'll show some things, but stick with

25:04

us, because there's quite a bit. So, definitely grab your beverage of choice and tune in, because there's a lot of AI news from the last two weeks. All right. So, first

25:20

and, as always, we'll give you our hot takes on it. Yeah, for sure. Of course. But you can give yours as well, you know. Please use the chat.

25:31

So, Stripe had this announcement. This was a couple days ago. OpenAI is launching commerce in ChatGPT. So, their instant checkout is powered by Stripe. So, yeah, that's cool.

25:45

They're releasing an agentic commerce protocol. And then they're also launching an API for agentic payments. So, Stripe's getting into the agentic payments game. Do you think they named it ACP? Like,

25:58

because there's already ACP from Google, but Google's just trash at everything right now. In terms of, not the models, but these protocols. Actually, Swix was shilling ACP; it's an agent communication protocol where different agent frameworks can talk to each other. But now ACP is agentic commerce protocol, so when you say ACP, it's not going to mean whatever Google wants it

26:24

to mean; it's going to mean, like, you know, this commerce thing. But, it's not a problem, but this blew up the internet, because I think Sam Altman was saying, like, this is how I shop. Like, I'm always on ChatGPT looking for things. And people are always polarized by

26:44

what he says. So that was interesting. And Dev says ACP is technically from Zed. Google collaborated with them. And

26:51

Dev also, like, ruining our, you know, we like to save the best for last around here, usually. So way to bury the lede there, Dev. Appreciate that. But yeah, that's coming. Of course,

27:03

don't, you know, we couldn't have a show without that. So let's go through a whole bunch of different things. I think I'll probably just read these off, because I don't think we have time to go through all of them. So, uh

27:15

and this was, I think, last week: DeepSeek announced that they have a merged thinking/non-thinking architecture. So now they just have one model, called Terminus, that has thinking and non-thinking. So yeah, seems like an industry trend, right? To

27:33

have a mixed model approach, as we always say. So yeah, and then some more releases from DeepSeek: they introduced DeepSeek V3.2-Exp, their latest experimental model. And one of the biggest things:

27:58

DeepSeek was already relatively cheap in comparison, but API prices were cut by more than 50%. So there you go. DeepSeek is making some moves. But they're not the only

28:11

Chinese model company making moves. Qwen released Qwen 3 Omni. It's a 30-billion-parameter model that does full multimodal. So, fully

28:23

multimodal. So, Qwen 3 Omni. And again, we just don't have time to go through all this stuff. So I'm just

28:29

going to share it with you kind of rapid fire, but if it's of interest to you, definitely check it out. We have Qwen still making more moves. Yep. They have Qwen 3 VL. Sharper vision,

28:46

deeper thought, broader action. That's their headline. Okay. Um, it's the

28:52

Qwen 3 VL series, which is a vision language model in the Qwen family, and you can see, of course, everyone has the how-do-we-perform-on-the-different-evals charts, and it obviously looks pretty favorable. Yeah. I mean, we're Qwen fans here.

29:13

Yeah. Uh, this next one's interesting. Next we have Qwen3Guard, which is super interesting: built-in guardrails. Um, so this is interesting because, as

29:27

application developers, we have to build guardrails into the frameworks or the tools that we use, but this model itself has safety built in. Then again, dude, this is a Chinese model, so I don't know how safe it actually is, but it's pretty sick how it works. Like it can detect, um, when we have, like, when the

29:46

instruction, it's almost like two streams going on, and one is proxying to the other all the safe tokens, you know. Pretty cool. Yeah. So that was a really interesting release by them in the last couple weeks. They've been killing it too.
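The "two streams" idea described above, a guard classifier running alongside generation, letting only safe tokens through, can be sketched very loosely. The classifier here is a toy keyword check, purely to show the shape of the pipeline; it says nothing about how Qwen3Guard actually classifies content:

```typescript
// Conceptual sketch: a guard stream filtering a generation stream.
type Verdict = "safe" | "unsafe";

// Toy stand-in for a real guard model's per-token classification.
function classify(token: string, blocklist: string[]): Verdict {
  return blocklist.includes(token.toLowerCase()) ? "unsafe" : "safe";
}

function guardedStream(tokens: string[], blocklist: string[]): string[] {
  const out: string[] = [];
  for (const token of tokens) {
    // Stop emitting as soon as the guard flags something.
    if (classify(token, blocklist) === "unsafe") break;
    out.push(token);
  }
  return out;
}
```

The appeal for application developers is exactly what the discussion points at: the guard runs at the model layer, so the framework doesn't have to bolt it on afterward.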

30:06

know, I don't know about you, but I almost get all these Chinese model companies mixed up, because there's quite a few and they're all just shipping all the time, it seems like. So, I guess that's good, because most of them are open weights or open source. So that, I think, pushes everyone

30:22

to be better. But it's good to see that there's more competition. So another, you know, can't get away from talking about Chinese models: GLM 4.6 from Z.AI

30:38

was released. And this was just a couple days ago as well. So, you know, they advertise advanced agentic reasoning and coding capabilities. I've never used GLM 4.5, but Tyler

30:52

on our team said that he basically goes between, previously, GLM 4.5, Qwen Coder, and Opus, right? Or either Sonnet or Opus, but typically Opus. So that kind

31:04

of puts you in the class of where it is. Obviously, he's using it for coding, so, you know, it has reasoning and other things as well, but at least previously it was pretty good at coding. So, I imagine this one's even better, right? And the context window is bigger, which helps. And

31:22

yeah, we have another Chinese model next, dude. The Chinese companies are shipping. Good job, y'all. Yeah, they're making moves. All right, so this one's sick, though.

31:36

Wan Animate: unified character animation and replacement. All right, let's This is cool. We get to watch a video.

31:50

Dang. Wow. Hey, Shrek. Hi. Whoa. Yeah.

32:05

Huh? Hey, is that a video game? Wukong. Hi.

32:21

This is wild. Yeah. We should do a live stream where we're using that, and then it's just us. It would be cool. Yeah. It would be cool if we could pull it up in a separate window, so

32:39

you could see, yeah, you know, like the character version and us. And I do wonder what the latency is. I haven't tried this. If you're watching this and you have tried it, let me know. But yeah, I'm a big

32:56

fan of a lot of these image and video models. So, they're just fun to play with, right? And they go viral because it's so visceral, like you could touch it; it's so visual and it gives you a visceral reaction. YouTubing will never be the same. RIP Disney animation. Yeah, dude. It is

33:16

pretty wild. And obviously, you know, there's more in this video space, right, that's been released in the last week. Another big Okay, think about this. You know how people spend a lot doing mocap

33:27

now for video games? What if you don't have to do that? You could just act, and then, you know, you use these models later. So you just be an actor, do all the acting stuff, and then boom, now you're animated.

33:42

Yeah, dude. A lot of people lose their jobs if, like Brad said, RIP Disney Animation. My friend works there, actually, so RIP to him, too.

33:53

Yeah. Yeah, or they just get really good at using the tools. They can ship more, better, higher-quality content faster, right? But yeah, you know, it's wild, all the Chinese models that exist. Like, I

34:07

think the layman person in AI doesn't know that these models exist, and even we don't necessarily know all the Chinese model companies. So, we were thinking we should just have a segment on the show talking about who they are, who are the

34:24

players in that space. Because everyone knows who the players are here, right, in America. And in France it's Mistral, I think it's the only one, but maybe there are some other French models and stuff. Yeah, I think there's Yeah, it's hard for the average person to keep track of all these things, and even

34:43

those that do. I know all the companies now, finally, I think. Who knows, there'll be another one tomorrow, I'm sure. But I haven't tried even 20% of the models beyond a basic, you know, does-it-work, try one or two things, seems pretty good. Yeah. And a good hot take in chat: motion capture versus video models

35:06

is like workflows versus agents. True. Yeah, it's a good point.

35:12

Uh, Brad says his brother-in-law works at Disney Animation. They hate AI. I wonder why. I wonder why. All right, so enough with Chinese models. Let's talk uh it wouldn't be

35:24

a it definitely would not be a live stream without talking about Cloudflare and Vercel. So, we can do that. No drama today, though. Well, there was some drama, but it's probably not appropriate

35:35

to talk about. Yeah, we won't talk about that, although you can go find it easily enough. So, a couple things from Cloudflare. And for those of you that are new here, we do try to cover all kinds of things in the AI space: model providers, that's most of what we cover, but we'll even talk about companies like Replit and

36:04

Lovable, and companies like Vercel and Cloudflare and Netlify, and people building in and around the AI space. Yeah. So, I guess I shared this one first. So, we'll start at the bottom and work our way back. Code mode is here. This is the

36:18

one we're going to have the most conversation about. This is interesting. So, what is code mode? Well, you

36:25

tell me, what's code mode? It's going to be tough without talking Okay. Well, okay, I'll be very neutral about it. So, code mode is, as

36:36

opposed to having an agent with tools, like in an MCP server, etc., you essentially give your agent a task, like a prompt, and then it will write the code and then execute it, as opposed to making a tool call that already has the code written. We actually saw this when we were in YC and we used to go to events all the time, and we would, you know,

37:03

see people experimenting with this concept. There was a business analyst who would describe what they wanted in unstructured text, and then the agent would write all the BI queries they need in SQL Server, execute them, then give them the answer. Which makes sense. And so this would be like: it will write all the TypeScript

37:24

you need to achieve that prompt, execute it, and there you go. That's what code mode is. It's pretty cool conceptually. So it's a cool idea. Now, I'll get a little spicy without being too spicy. It's kind of like what we had to do before tool calling was available, right? But it

37:48

opens up, you know, potentially some more security issues. And my other question is, okay, does it work better? Because I would agree that LLM tool call accuracy, especially across models, what we found is it varies wildly. Yeah. Right. We have our own, basically, compatibility layer to try to make

38:08

it better across models, but it still struggles, right? Tool calling's hard. It's not accurate.
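A compatibility layer like the one mentioned here usually comes down to normalizing each provider's tool-call envelope into one internal shape before dispatching. Here's a minimal sketch with two made-up provider shapes; this is not Mastra's actual code, and these are simplified stand-ins, not the real OpenAI or Anthropic schemas:

```typescript
// One internal shape for a requested tool call.
type ToolCall = { tool: string; args: Record<string, unknown> };

// Two simplified, hypothetical provider envelopes.
type StyleA = {
  tool_calls: { function: { name: string; arguments: string } }[];
};
type StyleB = {
  content: { type: string; name?: string; input?: Record<string, unknown> }[];
};

// Normalize either envelope into ToolCall[] so the rest of the agent loop
// never has to care which model produced the response.
function normalizeToolCalls(resp: StyleA | StyleB): ToolCall[] {
  if ("tool_calls" in resp) {
    // In this style, arguments arrive as a JSON string.
    return resp.tool_calls.map((c) => ({
      tool: c.function.name,
      args: JSON.parse(c.function.arguments) as Record<string, unknown>,
    }));
  }
  // In this style, tool calls are typed blocks mixed into a content array.
  return resp.content
    .filter((b) => b.type === "tool_use" && typeof b.name === "string")
    .map((b) => ({ tool: b.name as string, args: b.input ?? {} }));
}

// Both envelopes normalize to the same internal call:
const a = normalizeToolCalls({
  tool_calls: [{ function: { name: "search", arguments: '{"q":"news"}' } }],
});
const b = normalizeToolCalls({
  content: [{ type: "tool_use", name: "search", input: { q: "news" } }],
});
console.log(a[0].tool === b[0].tool); // true
```

The point of a layer like this is that validation, retries, and dispatch get written once against `ToolCall`, whichever model produced the response.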

38:14

Yeah. But I do imagine model providers are pretty invested in making tool calling very accurate. So by releasing this now, you're basically saying you don't think the model providers are going to continue to get better. This is going to because

38:27

they're kind of built around Cloudflare's architecture. Mhm. So the thing that bothers me about this is they say like here's this new way of doing things and oh it only works on this specific product that we have.
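Vendor questions aside, the core of the code-mode idea fits in a few lines. A hedged sketch with hypothetical names: the "sandbox" here is just the `Function` constructor plus an explicit capability allowlist, which is not safe for untrusted code; a real system would run the generated source in an isolated runtime, which is exactly the product being sold.

```typescript
type CodeModeResult = { ok: boolean; value?: unknown; error?: string };

// Execute model-generated source with only the capabilities we pass in.
// The Function constructor is used purely to illustrate the shape; a real
// implementation would evaluate this inside an isolated sandbox.
function runGeneratedCode(
  source: string,
  capabilities: Record<string, unknown>
): CodeModeResult {
  try {
    const fn = new Function(
      ...Object.keys(capabilities),
      `"use strict"; return (${source});`
    );
    return { ok: true, value: fn(...Object.values(capabilities)) };
  } catch (e) {
    return { ok: false, error: String(e) };
  }
}

// Instead of emitting a tool call, the model answers a prompt like
// "sum the order totals" with an expression over the exposed data:
const generated = "orders.reduce((sum, o) => sum + o.total, 0)";
const result = runGeneratedCode(generated, {
  orders: [{ total: 10 }, { total: 32 }],
});
console.log(result.value); // 42
```

Whether this beats native tool calling on accuracy is exactly the kind of thing that needs benchmarks, since the generated-code path skips the provider's tool-calling machinery entirely.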

38:38

Yeah. Well, okay. I guess that's Cloudflare's way of doing it. And also, I don't know that. I want to see

38:44

the benchmarks that actually show it's better. It might actually be, but there's no data yet to prove it. So yeah, we asked for some just because we're curious and we're in the space, but they haven't done them yet. And I'm

38:55

sure it'll be cool once they put it out or not. You know, it may not work, but part of me thinks they're just trying to get you to use their sandbox product. Yeah. And Khalil still needs like a

39:08

sandbox. Yeah. I mean, it kind of runs on their, whatever they call them, right? Sandboxes.

39:13

Yeah. Yeah. Their sandbox things. Um, and then if the benchmarks are good though, every

39:19

sandbox provider will have code mode. Like it'll have some code mode tool. So, it could be interesting for the space, but we'll see. All right. So, here's a hot take question from your mom, guys.

39:33

Is it worth joining Cloudflare as a systems engineer for a year, or, I think, a founding engineer? I got a proposal from them. Are you rich? Then uh become the founding engineer?

39:46

Yeah. I mean, you know, I'm sure it's a cool place to work. Yeah. Yeah. You're gonna They have good engineers. So I I

39:53

don't, you know, we joke, like we'll poke fun at some of these companies, but you know, it's still a good company. It's just, you know, very tight, like everything they do is very tightly coupled to their architecture, right? Which is their mission. So yeah, it's totally uh totally cool, just

40:11

uh and then dev says I don't think it would work better but it will let the uh the agent engineer be lazier. Yeah. Yeah. We're in a different kind of hooks. Yeah. All right. So, let's keep

40:23

going. Another Cloudflare thing. Yeah. Some more Cloudflare.

40:28

This one I thought was, you know, I it's kind of tied, I guess, a little bit to Stripe's payments, right? Cloudflare introduces their own US dollar-backed stablecoin. So, now AI companies are releasing their own stablecoins.

40:42

The comments on this one are funny though. Careful, no useEffect on this one. Secure transactions for the agentic web. So you know now you have companies releasing uh coins that of course like

40:55

you know, Cloudflare is, you know, and this other one kind of ties into it, but around content payments, like getting models and agents to pay for using content. Because a lot of the web uses Cloudflare for their CDN, right? So it's like, if they can now take a tiny percentage of a transaction for an

41:15

agent to access content through their CDN, I'm sure they think those fractional pennies are going to be worth a lot. Which it would be, if you can convince enough people that agents should have to pay for content and you can get enough content providers to turn this on. That's going to be a

41:33

significant uh, let's say, web, not web 3.0. Money is not programmable in a way like this. So like, there's probably so much overhead to give your agent your credit card, and a lot of hoops, but you could convert a dollar into a stablecoin and then give it to your agent, and then it's like, hey, go ahead. There's a lot more metadata

41:56

in stablecoins, too. So this is a good idea. It's just weird coming from them, but it also makes sense. So um Stripe

42:03

and Shopify and all those dudes will probably get into it. Yeah. And uh last thing from Cloudflare, they did open source an AI vibe coding platform, VibeSDK. So you can basically build your own, you know, Lovable or Replit, at least that's what they're advertising

42:20

it as. Uh Vercel has one too, right? So you know, Cloudflare and Vercel kind of go head-to-head. So if one does, you know, one thing, you're gonna usually see the other try to do

42:31

something similar. Yeah. And I believe Guillermo retweeted this with their own thing and called it like, oh, look at this VibeSDK. So there's some drama for the episode today. Yeah. So Cloudflare is making some

42:45

moves. Um we mentioned this earlier, you know, kind of in the same vein of talking about Cloudflare. Netlify had a big launch yesterday. NTL Deploy, that's what the event was. NTL Deploy. Yeah. So good to see that

43:04

they were doing some more stuff around, you know, agents and agentic AI things. I don't know, what did you see? The only thing that I know they released was the AI gateway, because we've been working with them a little bit, but what else did they launch? They had the AI gateway, which is a gateway, but also they have a new feature called agent runners, which is pretty cool. Um, essentially you can

43:29

run agents on your Netlify projects and have them do things. So typical kind of thing, but makes sense. And Dev, does this mean a Vercel coin's coming soon? Well, we're going to talk. They already do. Oh, really?

43:47

It's not a Vercel, not their own stablecoin, but they do support a certain hex address that can then be uh used to exchange crypto or stablecoins. So, I did not know that.

44:00

All the shit's related, man. They're all just one-upping each other. And yeah, but to talk about Vercel, Vercel has zero-config support for FastAPI. So, basically, you can run Python on Vercel is kind of what I

44:14

gathered here. So Vercel's getting out of just the JavaScript/TypeScript game and the Node runtime and allowing you to run Python. But have you met a Python dev? They're

44:26

not going to use this you know. Maybe they will. I don't know. Yeah, we'll see. We'll see. There's there's a lot of uh a lot of people

44:33

using Python for agents. So I guess Vercel is trying to capture that. And more, you know, from Vercel, they announced their Series F, raised a ton of money, ton of money. So they raised uh

44:48

$300 million at almost a $10 billion valuation, 9.3 according to the blog post, and a bunch of people about to get rich. Yeah. Like Yeah. Through the tender offer. So if people

45:02

don't know, like, once your company gets big, you can pay people out of their stock to then take it back, essentially buy it back, for other future rounds, or if you go public you have more, like, money in your company. So if you're an early Vercel employee, look, you stuck around this long, it has been over 10 years, right? So we know some homies from

45:22

Vercel that actually work at Mastra, uh, you have this opportunity to make a pretty penny, and that's good. Yeah, I mean it's good for the ecosystem though to have companies that get big and uh people that are along for the ride or at least contributed, you know, can get some of the upside as well. So that's

45:39

always it's always good. A lot of activity on YouTube today. Yeah, like having y'all here. Yeah, this is this is a lot of very

45:46

active chat. So So Khalil said, "Had to use Edge Functions for AI before. Wasn't that bad." Uh Lauren says, "Cloudflare's AI gateway

45:58

is nice. They use that a lot. Khalil said, "Mastra, make sure you never consider Python."

46:04

Yeah, Python pollution. Um, yeah, we're doing this uh kind of video thing that hopefully we'll share soon, but Obby made this comment that AI was polluted with Python. And that just, like, resonated, polluted with Python. Um, and then Dev says, "I think this AI

46:22

stuff will finally get me to write Python." So, I get the feeling. You think so? My goal is to make

46:28

sure you don't have to. Yeah. The minute you do a pip install, you'll be like, "Never mind." And so Dev says, which I'm curious

46:39

if anyone else in the chat sees it, is he the only one noticing that the video and audio are out of sync? And Manny says it just happened. So uh oh. Uh oh. What's going on? What's going on? Restream.

46:53

Maybe they're using Python. Dude, what if they're using Python? Oh, Grayson, it's coming. Will Mastra have a PHP version?

47:05

All right, we're going to I think if I refresh this thing will I don't know. I don't know. Maybe you should join quick. My computer died. No, that's not good. All right. Well, we're gonna keep we're

47:17

going to try something. We might be right back. We'll see.

47:29

We're back. Hopefully you're all here and uh maybe hopefully let us know if the audio is back in sync. Hopefully everything is uh fixed now.

47:41

Let's keep going. We have more news to talk about. So much more news. All right. Um let's see. Let's talk a

47:48

little bit about OpenAI and Google and Microsoft. Dev says maybe it's vibe coded. Probably, it was probably the 90% of the code written by AI, you know. All right, so we got some things uh from OpenAI, a whole bunch of things, because they were making moves as well. This one, this oh man. All right, so let's first talk about uh what's that Batman quote where it's like, you know, you're the hero but like

48:20

long enough to see yourself as a villain. Same kind of thing. Long enough to sell ads.

48:25

Yeah. You know, so now in preview, ChatGPT Pulse. This is a new experience where ChatGPT can proactively deliver personalized daily updates from your chats, feedback, and connected apps like your calendar. So,

48:37

first of all, this is actually a good idea because I've basically built this five different times with Mastra, with other tools, you know, with Replit, just trying, like, there are some things I want to do every day. You know, a simple one is just give me the top things from Hacker News based

48:54

on my interests. Like that's like the very basic one. But I I feel like if you can figure out a way to get your agent to reach out to you either on like a cadence or like when you need to, that can be a a good thing. Now, I do think

49:11

that was kind of the thing, is now it's proactive. They're reaching out to you. It's going to start delivering ads, too. Yeah. And I don't know if they've actually announced that, but it's got to be

49:21

coming, right? That's the fear-mongering part of me. Yeah. It's got to be coming.

49:27

Uh, so Dev says, "I've been using Pulse and I've enjoyed using it. There's no ads." Yeah. And I don't think there is. Um, but I think it's just like it's getting easier for them to reach out to you

49:40

and you know, if you're already paying for OpenAI or ChatGPT, like, they're probably not going to give you ads, right? But it's just, like, they do have this free tier, and I wonder if there is an ad-supported model that they're trying to work with at some point. swyx was telling us in Paris that they're about to hit 1 billion users, right? That's a lot of ads if you think about

49:59

it. Yeah. And if you can and if they can Yeah. I'm assuming most of those users are on their free tier, right? So, how

50:06

do you how do you monetize that free tier? Yeah. Because compute, you know, your inference isn't cheap, right?

50:12

Yeah. But you know, uh, this whole ambient or, you know, Pulse type of thing uh is a good idea if anyone's building agent type products, because you don't necessarily need to rely on a user prompt for your agent to do things. You can assess the environment, or assess metadata that exists, and then turn that into a prompt and have your agent

50:36

act. Um, or you can, you know, run your agent in a webhook handler. This is all the same type of stuff. Now, like, you're integrating Google or Gmail and all

50:47

these other products into ChatGPT. They're just doing API calls to see what happened, turning it into a prompt, and then giving it to you. So, all right. Still on OpenAI. Uh, last week

51:01

they announced GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. And you know, they kind of walk through what it is, but the thing is, and they probably did this on purpose, right? Like they were very intentional about this, because they probably wanted to introduce a new eval that they can then win at, but

51:24

most of the time companies won't introduce a new eval unless they're the best at it. Yeah. I thought it was very interesting that, you know, Opus scored the best, right? So they're better than everyone else, but they're not in first, right? or they

51:37

so I give them a little bit of credit for still releasing it. Like they they had did this new eval. They were probably hoping they'd be first. They weren't. They still released it anyways. And then now they're going to, you know,

51:50

basically try to hill climb this eval to be the top, right? The way they're spinning it though is that they got to second place way fast. Like, you know, how they got there in such a short time. So it's

52:02

like, oh, imagine if they had more time, they're going to be number one. That's how you kind of market these things, which is cool. Yeah. But good to see that, you know,

52:14

they are willing to not just paint themselves as first all the time. So, like to see that. And continuing on with OpenAI, and this is, you know, kind of tied to some of the agentic payment stuff with Stripe and all that, but Shopify merchants will now be able to sell directly in ChatGPT. And as you

52:36

know, Dev predicted or maybe said with a commerce protocol, it'll probably start recommending products probably. I think it it makes a lot of sense, right? If Yeah. But here's the thing. Let's think about

52:49

that. If it recommends a product I actually want because it really knows me. Yeah.

52:55

Is that even an ad? I mean, someone's getting a kickback. Someone's Yeah, but is it really an ad?

53:01

I mean, buy it. If I like it. It's kind of scary, too, because it's going to know, especially if you've spent tons of time chatting with it through hundreds of threads. It's going to know what you like, what you don't like, what you struggle with, what you, you know, your personal ambitions. It's going to be very This is gonna be the best uh ad recommendation

53:21

or product recommendation platform probably ever created, right? It's so personalized. Yeah. And then Pulse comes in and says, "Hey,

53:26

you forgot to buy your mom a birthday gift." Yeah. Yeah. And here's a one click of what we think your mom would like based on all the times you complained about your mom.

53:40

So, I mean, we're probably a ways away from it being really good, but I gotta imagine, you know, like TikTok's I don't use TikTok, but from everything that I've seen, the product recommendations on TikTok are very good, right? Because they know the types

53:59

of videos you watch and they have very good tracking. So, they can recommend products that you will really like. And this is going to be probably as good, probably even better. Yeah, because like in TikTok and stuff,

54:12

it's all them kind of inferring what you want based on your actions, but in the chat, you're just telling ChatGPT what you're interested in. So, it's pretty cool actually. This is funny, Dev. Uh, I just want Alexa powered by an LLM. Yeah. Why is that so

54:30

hard? You know what? So, someone joked, you know, like, why isn't Siri doing this? But then people that have actually used Apple's

54:40

uh, LLMs, like if you actually get into the model, it's not as bad. It's just that Siri's not using it. Yeah. So it's like, what if you just let Siri use it? And there's probably some liability things they don't want to do

54:51

or whatever, but yeah. Can we just get Alexa or Siri to Alexa powered by Bedrock? Yeah. Using Agent Core and Strands and everything else they

55:04

Yeah. Anyways, it is definitely something that I wish we had, but with Shopify merchants being able to sell directly on ChatGPT, we are going to get some product recommendations. And do you think they'll partner with Claude and them, or is it mainly more like ChatGPT is more consumer-y and then Claude is more developer?

55:27

Developer-y, like Yeah, I mean, if you're going to pick one, pick the one with a billion users. Yeah, that's for sure. You know, it kind of dwarfs all the others, but I imagine that they eventually would work with all providers. Why not? Why would they just pick one provider? Any LLM that wants to sell, they'll

55:44

probably have some kind of, you know, MCP basically that just lets you get product recommendations and do sales. And Dev says there's actually a new version of Alexa called Alexa Plus, but it sucks. It has no MCP. It has no MCP. Oh, man. Exactly the

56:07

opposite of what he wanted. And Brad thinks that this will eventually be one-third of OpenAI's revenue. Yeah. Yeah. All right, we'll keep going. That's enough of OpenAI. Well, dude, there's a big OpenAI thing.

56:19

One more. Probably the biggest one. Yeah. Yeah, we can't miss this one. Sorry,

56:25

we're not There's a better um Why don't you announce it? I'll I'll find it. So Sora 2 came out and broke the internet. You have people saying

56:35

it's going to create a dystopian future for everyone because you don't know what's real. Then you also have the other side saying, you know, we're corrupting our youth because now people are just going to doomscroll Sora. Then you also have people saying, hey, how could this be legal? Like, there's Dragon Ball Z characters fighting One Punch Man. What

56:58

about the copyrights and stuff? And then you have Sam Altman all over the timeline. It's not really him, but he's stealing stuff and, uh, you know, in a toilet and all that type of stuff, playing basketball with Dave. Yeah. Uh yeah. So let's

57:16

let's take a look at the video. Yeah. I'm sure you've all seen it, but let's watch it. The actual video.

57:28

She looks pretty old to me. Ready when you are. 3 2 1 go.

57:48

Dude, you all right? I'm good. Do you mind?

57:54

Yeah. This one. Steady. Damn it.

58:12

So, that wasn't the one I was thinking of, but uh I think you've probably all seen some of the videos. It is I don't know. I found myself on X and I can't stop watching them. Like, I've seen them, like, this is fake, but it still

58:31

doesn't quite register because it does. It is. And there's almost like it's still uncanny valley in most cases. You

58:38

can still kind of tell, but it is very difficult to tell, right? And especially if they if they do a little editing after, it's almost impossible because you can clip out some of the stuff that like starts to feel not real. Yeah. Uh but it is it's crazy good. And I do

58:55

think that it is I do think people are going to spend a lot of time just watching these dumb videos, and like, what are we doing with our time? But it is kind of like, the artist in me, the creator in me, like, I'm a musician. It's like you can create some really cool stories. It's a really

59:12

cool storytelling tool. So I see both sides of, like, I don't know if it's good or not for society as a whole, but it is pretty wild. Did you uh get access to it? I do have

59:23

access to it. I haven't used it yet. Yeah, I did yesterday. Want to let's do a demo. Do you Do you haven't

59:29

My computer's dead, but I do have access to it myself. Yeah, we'll we'll pull it up. Yeah, let's uh we'll do some Sora demos here at the end. We'll come back to Sora and that's a good thing. Yeah, we'll we'll do some demos. We'll see if we can uh

59:43

can uh get it to do some fun things. But, you know, these kinds of things are, you know, they just continue to compound, right? It's like a little bit better. And I do wonder, like, how's it compared to Google's, right? Like I haven't compared them. Yeah. Veo 3. Like Veo 3 is really good

1:00:03

as well. I think this one seems like it does more cuts, which is actually probably better because it feels more cinematic, but I haven't I've used Veo 3 a lot. Yeah. And I've had a lot of good success building some dumb Mastra ads with it

1:00:20

and such, right? So, I will be interested to use Sora 2. And actually, I saw this tweet about, like, okay, where does this go next? If you're, you know, thinking like it's if

1:00:33

you're thinking all the people are going to compete on this. So, Veo 3 will become Veo Tube. Sora will be Sora.

1:00:41

Yeah. And then Grok will make one, and then Vine will come back, and now you have the AI slop things. And you also have this. Oh yeah. So Meta announced So continuing with the

1:00:54

news, Alexandr Wang at Meta said, "Excited to share Vibes, a new feed in the Meta AI app for short-form AI-generated videos." And I believe it's powered with maybe some of the stuff from Midjourney, I think, under the hood. I don't know.

1:01:16

But the idea is, what if, you know, you had a feed where you knew it was AI. So if you're watching this, I really want to know what this is like a little bit. But I think you get the idea, right? What if, you know, you just had a social

1:01:54

feed that was just all AI, which is basically what X has become anyways. This is all Sora, but I imagine it'll die down a little bit. I like that. Or I like the form of AI slop that is creative and not trying to

1:02:07

mimic reality, but create, like But the ones that are just face swapping or whatever, those are. Yeah. So if you have uh tried it, looked at Vibes, curious what you think. Do you think that

1:02:23

people will use things like this? Yeah. Dev says that TikTok and Insta will probably just build one of these into their app. Yeah, they definitely will. I think um

1:02:35

I think the theme of today is that when one big company does it, the rest will also do it. Yeah. It's like yeah, you see one big company and then all of a sudden the the next is pretty quick to follow. They're usually probably working on these things in parallel, right? They probably know

1:02:48

what's coming. We're all colliding on the same path, right? Um, let's talk a little bit more about models. And guys, we have a lot of news to go.

1:02:59

So, I mean, we No, we're getting we're getting there. We're getting there. I mean, it's all good, though. So, please stick around.

1:03:05

Yeah. And if you haven't got a copy of our book, I was trying to shill something, too. Here's an ad for you.

1:03:10

Get this book at mastra.ai/book. Yeah. Yeah. Yeah. There's our ad for the

1:03:16

day. mastra.ai/book. But I'm recommending it to you as a friend, not a company.

1:03:24

Yeah. So, yeah. Go get the book, Principles of Building AI Agents. It's free. We'll even ship you a copy if you

1:03:30

want it. So, it cost us more to ship it to you than it is worth, actually. Yeah. All right. So,

1:03:41

let's talk about Gemini. So, Gemini Flash and Flash-Lite releases, or, they have improved Gemini 2.5 Flash and Flash-Lite releases. And I guess the biggest

1:03:54

takeaway here, it's just a little bit better benchmark scores, a little faster response time. But I mean, it still impresses me that Google's kind of caught up quite a bit, because for a while there it felt like Google was behind, and yeah, they're right in there. We always ask, for those of you new, who do you think is going to be the

1:04:14

top model company, as far as most benchmark scores, at the end of the year? Because that's kind of a fun topic. I think a lot of people think, especially for coding, it's going to be Claude, and we're going to talk about Claude here in a little bit, or Anthropic, I guess. But you know, OpenAI is making

1:04:33

you know, obviously right there. Google's right there. You can't count Elon out, so xAI is right there. The Chinese are coming, the Chinese are right there. I mean, there's a ton. But yeah, it's kind of fun to think about, and I do just think over time there's going to be so many good model providers that Maybe

1:04:51

there's certain things, like maybe for coding Claude still edges out a little bit better, or maybe there's writing that, you know, one model is better for. But it'll be uh yeah, Khalil's going to check the poll, I think. Khalil, last time we asked this question you said Google, or correct me if I'm wrong, but I wonder if you still believe

1:05:09

that if that's what you believe. Dev says Google because they own the hardware. Yeah, that's not a bad guess.

1:05:17

All right. Well, let's keep going. And we're going to talk about Microsoft or I guess GitHub.

1:05:23

And you know, GitHub's testing something. They wanted to throw their hat in the ring with their own uh CLI coding agent. So, GitHub Copilot CLI is in public preview. I never gave it a try. Dude, do you see this in the top right, like, what their description is now? The AI-powered developer platform.

1:05:44

That's interesting. When they made that change could have fooled me. Yeah, I mean, I thought they were, you know, just repositories, code repos, but no, I guess it makes sense. A lot of their money comes from

1:05:56

Copilot now, right? I don't know how much of it, but I bet you I bet you they think that they're going to make more money in AI than they do with the rest of their product. But, you know, there's a lot of these uh CLI tools for writing code. And there now there's another one. And this is,

1:06:12

you know, we have a few others, but let's get to the biggest one. This is the biggest release. I think we've been kind of building up, and you all probably know what it is if you're a developer or an engineer. And it's Claude Sonnet 4.5, the best coding

1:06:31

model in the world. So, I am a big like, I run Opus for everything. And I still think you have to talk to Claude Sonnet 4.5 differently than Opus. So, I still kind of like Opus better, but as far as benchmarks, it's better and it's

1:06:48

cheaper. Yep. Faster. There's, like, all these things. This one, 4.5, by far should be the

1:06:54

best model, at least according to the benchmarks. What do you all think, if you're watching this? At the bar last night, I two-shotted this new thing I'm working on, and I was very impressed. I had a very obscure task that I was trying to

1:07:07

do, and I was using my background agents on my phone, and the plan it gave was exactly what I was thinking, and, like, it just knew what I wanted to do. And that was like the first time I really ever had that experience with a model. That was tight. So it was dope. Yeah. So have you all tried it?

1:07:32

Let us know. They also uh updated the terminal interface. Looks a little more sleek. Looks clean. Yeah. And I think the terminal Yeah, the terminal

1:07:44

UI got upgraded, but also, like, the plugin. Yeah, the plugin in VS Code also looks a little better, for an artistic feel. That's funny. What do you mean, artistic? I don't know. Yeah. Uh,

1:08:07

Lawrence says he thinks it's the best from a half day of playing around with it. All right, let's keep going. Uh, more with Anthropic. I thought this one was

1:08:21

just and it was kind of tied to that release, but this was just pretty cool. It's a research preview called Imagine with Claude. So in this experiment, Claude generates software on the fly. No

1:08:35

functionality is predetermined. No code is pre-written. So essentially it just uh generates the UI and the software as it goes. And

1:08:46

I have not tried it yet, but it's a cool idea. I don't think we're quite there where it's actually going to work. That's probably why they're calling it a research preview. But I could imagine a future where your software is

1:08:58

just generated as you need it. Yeah. Yeah, like you click a button and then bam, like this UI is just served to you and then it just goes away.
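As a hedged sketch of that "served to you and then it just goes away" idea (hypothetical names, with the model call stubbed out rather than a real Claude call): a handler takes the user's intent, asks a model for markup, serves it once, and persists nothing.

```typescript
// For this sketch, a model is just "intent in, markup out".
type UiModel = (intent: string) => Promise<string>;

// Serve a throwaway screen: generate it for this request, return it,
// and store nothing. The next request generates a fresh one.
async function serveEphemeralUi(
  intent: string,
  model: UiModel
): Promise<string> {
  const markup = await model(intent);
  return `<main data-intent="${intent}">${markup}</main>`;
}

// Stub standing in for the real generation step:
const stubModel: UiModel = async (intent) =>
  `<h1>${intent}</h1><button>Done</button>`;

serveEphemeralUi("review invoices", stubModel).then((html) =>
  console.log(html.startsWith("<main")) // true
);
```

The design choice being tested is that nothing is cached, so every screen reflects the current intent, at the cost raised here: you pay for generation on every interaction.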

1:09:09

And if it knows your preferences, it knows what you like, it'll probably generate UI based on how you like to interact. Yeah. Seems like a waste of money though. Yeah, if you ask me.

1:09:22

Yeah, I I don't really see how it works because I do think that there's just too many edge cases to really solve for today. Maybe in the future, but it is a cool concept. It's a cool idea of this like idea of generative UI.

1:09:37

It sounds like the Cloudflare code mode thing. Yeah, it's like, why do you need it? But again, we just share the news and give our hot takes. And this was kind of a

1:09:50

you know, it's one of those "big if true" things. I haven't confirmed it, but we're going to share it, just, why not. Uh, let's see, collapse that and share.

1:10:06

So they said that basically uh Anthropic cut their Opus usage limits on their Max plan without telling anybody. And they basically said you should be using Sonnet 4.5 anyways.

1:10:24

But they've really cut the cap. So if you've been using Opus on your uh Max plan, you might be running out of credits real quickly. Yeah. So Tyler on our team said that he was at like 75% after it just reset, in just like a day or two. So

1:10:44

yeah, if that's if that is true, what are you doing? Anthropic, come on. Give us give people a little heads up. But

1:10:50

this isn't the first time they did that. So it's all about the money, though. Yeah. How about the Benjamins, baby?

1:10:56

They want to make they want to make money. All right. And we're going to skip that one because we just have too many to do. And we're going to talk a little bit about cursor. Cursor released

1:11:15

the ability to control your browser, which Windsurf had a long time ago basically, but now Cursor has it, which makes sense. Your agent can take screenshots, improve the UI, debug issues, and it's, you know, in preview, so you can see I don't think Windsurf could do the debugging and stuff, but yeah, it was a little more limited, I think. Right. It is cool.

1:11:43

So, I kind of expected that it would this something like this would come sooner, but I think it's a good, you know, on once you can connect more tools to your agent, it's obviously going to be able to fix issues and actually get further than it did before. Yeah. Imagine doing like web performance vibe like Vibe fixing web performance

1:12:02

opening the browser, then, you know, looking at the network or the graphs. So, that'd be cool. And more fundraising news. Cerebras Systems

1:12:16

raises 1.1 billion at an 8.1 billion valuation. That's some money. They're

1:12:22

going to be uh they already have really fast inference. So, make it even faster, I guess. You know what's funny about these raises is, uh, I have a friend who works at a foundation model company, and he took some, you know, he took some tender offer, he's got money now, and he wants to do more research, and if he wants to, like

1:12:48

he said the only way he would start a company himself is if his first fundraising round is a billion dollars. That's just what they expect, you know? I don't know. I don't have my invite code to Sora.

1:13:11

All right. I was going to try to get Sora up, but I don't know if I I haven't I haven't used Sora yet, but I did get an invite code. So, I'm trying to figure out where my invite code is.

1:13:25

There it is. I don't know, do I need the... So, I'm going to see if I can get this. That's not right.

1:13:36

That'd be a cool invite code, though. All right. Oh, we're in now. We are hopefully in. Welcome to All right. So, let's share.

1:13:51

We will see. Uh we'll see what happens. All right. So, I'm in Sora. I don't. That's wild.

1:14:02

That was weird. But whatever, let's keep moving. I don't like that UX.

1:14:07

All right. Well, if you are watching this and you haven't seen Sora yet, give us some ideas. Yeah, give us some ideas. Let's do some things. All right. What should we build?

1:14:23

Oh, wow, they have these cameos built in. So, you can already pick Sama.

1:14:30

That's funny. Yeah, that makes sense. And you can obviously add some things. So, you can add images. All right. So,

1:14:41

do you have those images of us from the 70s or something from like a couple weeks ago? I wouldn't know where to find them. That's true. But yes, I do have them. What if we could like

1:14:52

a mascot? We could try that too. A mascot for Mastra, which will just be us from the 70s.

1:15:00

Let me see if I can find those images. Give me a second. It'll be us breakdancing or something.

1:15:06

I know I have this somewhere. I don't know if I ever downloaded them actually, but I can maybe do this. No, that's not going to do it. What? This doesn't support it. I can't just drag an image into here. That's

1:15:36

That's a bug. Yeah, that's wild. And it doesn't really work. We're going to try this. Try to refresh.

1:15:48

The UI has now frozen on me. There's too much video going on at once. Sora can't support it. Well, Sora is now completely frozen, so

1:16:03

we'll uh we'll give it a second. A Mastra flag on the Eiffel Tower. That'd be cool. Getting attacked by an invasive python

1:16:16

sounds pretty good. Yeah, Sora has just, like, completely frozen. Let's try reopening in a new tab here.

1:16:33

Had a JavaScript exception probably. All right. So, I'm going to try to find see if I can add the three of us.

1:16:46

Oh, you can only do one at a time. One at a time. So, I'll share my screen here. I'm just going to try to upload a couple images.

1:16:54

Oh, you can only have one base image. Huh? Yeah. Let's

1:16:59

uh You can use me as a guinea pig. All right, we're going to try it out. Is it being chased by a python?

1:17:24

Man, their UI is, like, super slow. Yeah, let's just see. We'll just start there and see what happens.

1:17:36

Yeah, I don't know, maybe it's because I'm streaming it, but it is uh you got added to the queue. Oh, we can make multiple at a time then. Dude, you should, uh, play, like, football with Sama.

1:17:54

Put your picture in there if it loads. Man, this is... Oh, it's frozen again. Yeah. Page unresponsive. That sounds good.

1:18:06

Where? How do I see my queue? Where's my queue?

1:18:13

Oh, this is so... All right. Wonder if anyone else is having this problem. Yeah. Is this just me? Anyone else have access to Sora?

1:18:25

Maybe that's it. Yeah. Dev says, "I thought Sora was only an app and not a desktop web." Well, I have it on the web.

1:18:32

it had me go through the flow. When I entered my invite code, it asked me if I wanted to upgrade to the new experience, and yeah. I don't know how to even get to my queue. Maybe the top left button? Yeah, if I can get it to actually load. Oh boy, let's stop sharing and see if

1:19:02

I can actually get it to refresh the page. Oh, it went unresponsive. Yeah, exit page. Yeah, basically they uh

1:19:13

just have hella data probably. Uh, I think that bubble thing to the left of it is your... No. Oh, your profile at the bottom, or, like, the one above that thing.

1:19:38

Oh my god. So, I got a message here. So, those of you wondering, we do not support uploads of images containing photorealistic people.

1:19:50

Well, yeah, use the Ghibli version of me. All right, let's do something just so we can get something going. Oh, you have to add your own Cameo here. That's why.

1:20:02

Oh. Oh, so there's the iOS thing. If you want Cameo creation, you need to use the iOS mobile app to create a Cameo. That

1:20:09

makes sense. So, there's your answer, Dev. And it froze again. The desktop experience is

1:20:15

real bad. Um, yeah, I think we just stop here. Yeah, we just stop there. Yeah, it's too frustrating. The desktop experience is no good. Uh

1:20:28

Brad says, "Try Safari." Yeah, who knows? Maybe that's what they tested in. Uh, I'll play around with

1:20:35

it. We'll come up with some cool videos and we'll share them next time, next week, on the live stream. And yeah, that's all we really have for today. It was a big news day. Tons of news. There's stuff we didn't even cover. So,

1:20:46

you probably saw even more that we honestly just couldn't get to. It's already been an hour and twenty minutes. And we do this almost every week, on Mondays usually. This

1:20:57

week's an exception. The last couple weeks have been traveling. Uh, as we mentioned before, we went to AI Engineer Paris, and Obby's been in Sioux Falls, you know, the real SF, as he likes to say. And we'll be in SF, I think,

1:21:09

next week. We might... you might still be here on Monday. We'll see. Yeah. We might be doing it in person, or maybe you'll

1:21:16

be at the airport at the time. We'll have to figure that out. But we do have some guests. So, tons of guests. Uh, we have some guests coming on on

1:21:22

Monday, and we'll have a lot more guests in the future. If you know someone we should talk to, we like to bring on guests that are doing interesting things in the AI world, let them demo what they're doing, ask them questions, and learn from them. Because, you know, we like to talk about the news, of course, but we also like to learn from others. So,

1:21:38

let us know who we should bring on. Also, we'll be in SF next week. The other SF, not Sioux Falls, but San Francisco. So, if you're in SF, there's a bunch of events going on for tech week.

1:21:50

Yeah, there's... Yeah, we have a Mastra meetup on Thursday night. The 9th. Yeah, 996. Nine beers on the 9th at 6 PM.

1:22:03

Yeah. So, come hang out with us if you're in SF. But there's a bunch of other stuff going on that week, too, for tech week. Um, so see you there. Yeah. Thanks everybody. Uh, devs, get

1:22:17

pinged. Yeah, send us your guests, your best guests. Come on. Yeah, or come on

1:22:22

yourself. Yeah, we'll we'll we want guests. Thank you everyone for tuning in to AI Agents Hour. We'll be back again

1:22:28

next week with some guests. We'll talk AI news. If you didn't get your book, mastra.ai/book.

1:22:35

If you uh want to, you should go, you know, give Mastra a star on GitHub if you have not already. Also follow Obby and me on X and other places. You can see our stuff there on the bottom. Yeah. Peace. See you.