
DeepSeek benchmarks, Perplexity Labs, more AI news, upcoming Mastra launches and Joyce from GoatSDK

May 30, 2025

Today we discuss the latest DeepSeek R1 benchmarks, talk about running the Qwen 8B model locally, discuss Perplexity Labs, talk about upcoming Mastra launches, and chat with Joyce from GoatSDK and Crossmint.

Guests in this episode

Tyler Barnes

Mastra

Joyce Lee

Crossmint

Episode Transcript

0:12

hello everyone and welcome to AI Agents Hour i'm Shane this is brought to you by Mastra today we have a few things we're going to be talking about we're going to be going through AI news just like we do every day but there's quite a bit that we're going to be talking through so I'm pretty excited about that and then we

0:31

are going to spend some time talking about, or previewing, some upcoming Mastra launches. You're going to see a sneak peek, if you're watching this live, on some things that we'll be announcing most likely next week, so you'll get almost a week in advance, or at least a weekend in advance, of what some

0:48

of the things that we're working on, and maybe we'll even pull some of it down and run it locally and try some things out. Some of it is already available, just not yet announced, so you know, maybe you can find it. And then we will have Joyce from Crossmint and GoatSDK coming on just to chat, see how things are going, and learn a little bit about some of the stuff that she's working on

1:13

is. With that, you know, we already have some comments in the chat. Abio, how's it going? Good to see you here. And as you can tell, this is live, so if you're watching this on X, you're watching this on LinkedIn, you're watching this on YouTube, just drop a comment. There's going to be some participation options later to hopefully

1:35

win. We'll give away a couple things later when we're talking about the upcoming Mastra launches. And hello to you as well, hello Fabric, and as you can see, Fabric's on X, Abio's on YouTube, so we're all over the place. But let's go ahead and let's actually get into the news. So give me one second and we will pull this up, and I'm actually going to pull a

2:08

wild card because candidly I like many of you keep track of a bunch of different places where I go to keep up with all this information right it feels like you're sometimes drinking from a fire hose all the stuff that's the launches in AI the new models the the new benchmarks all these things are kind of coming all from all over the place i think uh typically

2:31

I spend the most time getting most of my information from X, but you know, we have other sources as well. I would say my favorite source of information is a channel in our Mastra Slack called kindergarten, because we all feel like we're in kindergarten learning new things every

2:49

day uh but Tyler is very active in kindergarten and so Tyler is probably my you know if it is you know it's like X is my main source of news and then Tyler's like right there so I did want to invite him on and he offered to spend a little time talking about some of the AI news because a lot of this was just

3:08

filtered through him first, so he'll have better impressions than I will. So let's bring Tyler on. Tyler, what's up, how's it going? It's going well. Yeah, I was just, you know, telling everyone watching live that most of my news comes through you. Oh, it does? I didn't know that, that's cool. Basically the kindergarten channel in Slack is where I monitor for all

3:33

the, you know, the cool things that people are talking about that somehow don't make it to my X feed. I have to make sure I keep posting there, then. Yes, yes, please do, because you're feeding content into this live stream, you know, at least a few times a week for sure. But we

3:51

do have a lot of news items to cover today and so I'll probably just share my screen and we can kind of talk through it and Tyler I'll be interested in your perspective on some of this stuff because you know you've actually even played around with some of this stuff so the first one is something we talked about on the stream i believe it was

4:09

yesterday, maybe it was two days ago, but DeepSeek R1 V2 came out, and they did release some benchmarks. Yeah, the benchmarks look really good, and I've been running it locally. So, I don't know, Daniel was on, right? Yeah. Can you show like the tool compatibility stuff? Yeah, we talked about tool compatibility,

4:33

we'll, you know, maybe we can even preview that as well if you want. So I ran the tool compatibility benchmark against DeepSeek R1, hosted by DeepSeek. It got 100%, which none of the other models did, not even Claude. Didn't Claude get 100%? It got 99%, like, very close. Okay, actually, you know what, I was

4:56

running the benchmark locally, I didn't run all the models, I was running 3.7, so I might be wrong, maybe Opus or Claude 4 got 100%, possibly. But the crazy thing though, I downloaded the 8B Qwen fine-tune of R1, you can run it on your own MacBook, it's tiny, and it got 100% as well. So I have questions about this

5:23

because, you know, we are talking AI news, but I think this is a helpful diversion, because a lot of people have only used ChatGPT through the API, or maybe, you know, Claude Sonnet through the API, and they hear about these DeepSeek models and other models that are available. But how does one go about

5:42

running the Qwen 8B, the 8 billion parameter model or whatever, on their local device? What are the steps for doing that? So there's a few ways. You could use Ollama or some other things, but my favorite is LM Studio. It's got a nice interface, you can see all the models,

6:00

there's a lot of like you know sliders to tweak different settings like context length or like you know um some different caches and stuff like that um so that's that's what I always recommend to people is LM Studio you can search up models and Yeah yeah i'm gonna find that and pull that up let's talk about that for a minute how does it work

6:19

Um, does it run... It runs locally? Yep, it runs on your machine. Depending on what kind of machine you have, you'll install a different runtime. So there's an Apple Metal runtime, or I think there's another one, I can't remember what it's called. Or I think if you're on a desktop PC with an

6:38

AMD processor, you'll get another runtime. But you just pick the right one for your machine, and then when you search for models you'll see all the models you can actually install, and it'll tell you if it works with your machine. So it's actually that simple, you just download it and then you can run the model. You get an OpenAI-compatible HTTP

6:59

endpoint, and you can point a Mastra agent at it as well. Yeah, so how do you normally test these models when you're running them locally? Are you just using some kind of CLI, posting to it, or what are you doing to test these things locally? So I have a coding agent that I've been working on. I mentioned this to you before.
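Since LM Studio serves a standard OpenAI-style HTTP endpoint, wiring anything up to the local model is just a fetch call. A minimal sketch, where the port (1234 is LM Studio's default) and the model id are assumptions, so check the values your LM Studio instance actually shows:

```typescript
// Sketch of calling a local LM Studio server through its OpenAI-compatible
// endpoint. Port and model id are assumptions; check the LM Studio UI.
const LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions";

// Build an OpenAI-style chat completion request body.
function buildChatRequest(model: string, prompt: string) {
  return {
    model,
    messages: [{ role: "user" as const, content: prompt }],
    temperature: 0.6,
  };
}

// POST the request and pull the assistant's reply out of the response.
async function askLocalModel(prompt: string): Promise<string> {
  const res = await fetch(LM_STUDIO_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest("deepseek-r1-0528-qwen3-8b", prompt)),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Because the endpoint speaks the OpenAI wire format, the same URL can be given as the base URL to any OpenAI-compatible client, which is how you'd point a Mastra agent at it.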

7:21

I'm gonna probably open source it in the next couple weeks, so it's not public yet, but it's a very good coding agent, like I've been really enjoying using it. That's what I test models with, just to get the vibe of how well the model works, because I can point that at LM Studio or, you know, OpenAI or

7:40

Claude or whatever. But have you tested the new Qwen 8B model with your coding agent? I did, yeah. I was running it on my MacBook, so it was a little bit slow. It thinks a lot, it'll think for like 30 seconds and then output a sentence and call a tool. I have a gaming PC with a 4090 graphics card, so later

8:04

today I'm going to run it on that, and hopefully it'll run a lot faster, because from what I saw the quality was really, really good, like editing code and calling tools and all of that. So I might actually use the 8B local one to do some real work, we'll see. But yeah, I just remember we were talking, before you joined Mastra, even before we started Mastra,

8:29

you and I had talked about some of the models you were running locally and so it's it's funny that you know now it always felt like local models and they probably still are maybe to some extent maybe not we'll find out but local models are always just a little bit behind right kind of the open source ones you can run hopefully because they you know they're kind of tapping into I

8:49

think a lot of the learnings from the some of the closed source ones that are putting a lot of money into this but maybe that's not going to be true forever you know I think it already might not be with this new R1 release and I think that's what's uh kind of blowing my mind right now is like we have this tool compatibility benchmark

9:06

that we're running, and even the local R1 Qwen fine-tune gets 100% for every kind of input schema, where all of the OpenAI models will fail on some percent of those tool calls, and Claude will fail on a small percent as well. That's wild, that it can outperform those as such a tiny model. Is

9:29

it just... So the research that we did was all around tool compatibility, like what parameter types tool calls accepted, is that right? Yep, exactly. So you're saying that it supports a wider range of potential parameter types for function calling? Yeah, exactly.
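To make "parameter types" concrete: a tool's inputs are declared as a JSON Schema, and a compatibility benchmark probes how many schema shapes a model can call correctly. A hypothetical tool definition, purely illustrative and not the actual benchmark suite:

```typescript
// Hypothetical tool definition showing the kinds of parameter types a
// tool-compatibility benchmark might probe. Illustrative only.
const searchFlightsTool = {
  name: "search_flights",
  description: "Search for flights between two airports",
  parameters: {
    type: "object",
    properties: {
      origin: { type: "string" },                               // plain string
      maxPrice: { type: "number" },                             // number
      cabin: { type: "string", enum: ["economy", "business"] }, // enum
      stops: { type: ["integer", "null"] },                     // nullable union
      dates: { type: "array", items: { type: "string" } },      // array of strings
    },
    required: ["origin"],
  },
};
```

A model "passes" a shape like this when it emits a tool call whose arguments validate against the schema; the exotic cases Tyler mentions would be schemas constraining a field to `undefined` or `never`, which almost no real tool uses.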

9:50

Actually, it supported all of them, I think. Though it only failed on some schema input types that no one uses, like saying that a field's input type can only be undefined, or typed never, which is actually not that useful. So that's probably why they didn't actually train on those things. But literally every other schema input

10:10

type passed no problem, which is just wild. Very wild. All right, so let's look through some of these. I'm not familiar with all these benchmarks, but there's a few of them here that I am familiar with, and you can see, maybe I'll make it a little bigger so you can see, the new DeepSeek model, and, is it, I don't know, the AIME 2024

10:35

benchmark. It seems to outperform o3 and Gemini Pro. It does have the 235B Qwen 3, and then the original DeepSeek R1, so you can see the gap of where it was. And the funny thing is, this is only like a minor version upgrade, right? We're still waiting for R2, which, who

10:58

knows what that's going to be like, if this is a minor improvement. But you can see that the new DeepSeek gets much closer to a lot of the other providers, in some cases maybe even beating them, but at minimum it closes the gap significantly. Yeah. And so it says, you know, it

11:20

shows significant improvements on handling complex reasoning tasks, from 70% to 87.5% on the AIME 2025 test. You can see the different benchmarks, and you had R1, and now the new version, R1 V2, or 0528, because I suppose that's the date it was released, right? And you can see that in pretty much all instances it has pretty

11:49

significantly improved yeah and I guess they achieved that by some like some reinforcement learning and then just getting it to output more reasoning tokens it looks like I think it said it went from like 12K to 30K or something like that on average oh really so it's like just does deeper reasoning and that's how it got a lot of the improvements then i think it's a

12:13

combination of that and and the additional training that they did um because now it also supports um tool calling which the original R1 didn't so structured outputs and and function calling are supported yeah and there's instructions here on how to run Oh yeah yeah go ahead Tyler well I was going to say I think the the only way that they could add the tool calling is just by the additional training that

12:38

they did so I think it's a mix basically more reasoning more training equals better model talks about the temperature so there's a a lot of information here obviously um has a license information so it's again pretty cool that these models are released to be able to be downloaded and run locally and you even have like a the Quen 8B model which is you know

13:09

significantly smaller, so you can run it on consumer hardware and it still gives you pretty good results, right? It takes, what, about 4 GB of memory to load it up? That's it? That's it, yeah. So you can actually run the 8-bit quant, and I think, it's weird, it's actually even 8 gigabytes,

13:29

just about, up to like 10 gigabytes, depending I guess on what you're doing. But yeah, that's impressive. That quality for the size is pretty wild. All right, so we talked a little bit about DeepSeek, and talked a little bit about how you can use LM Studio if you want to try to run some of these models locally. I have not actually

13:55

run any of these models locally, so I think that's something on my list of things to play around with at some point. Might need to upgrade my system here, but it would probably run, it just might be a little slow. I'm about due. Are you on... if you have like an M-series MacBook or anything like that, it'll be fine on that. Yeah, I have an M2, so... Oh,

14:19

it'll run fine so it's like I think an early M2 if I remember correctly when the M2s were first coming out all right uh so the next thing another thing that will look uh familiar because you shared it today we're launching Perplexity Labs so Labs is for your more complex tasks build anything from analytical reports and presentations to dynamic dashboards now available for all pro

14:48

users. Maybe we should just watch this video. I think I need to... let me know if you can hear this, Tyler. Yep, for me. Hey... Looks like all kinds of really cool things that you can do with this. Yeah. Do you use Perplexity? So I do use it a little bit, but not frequently. I mean, I have the app on my mobile device, I sometimes will use that instead of ChatGPT,

16:34

but I would say it's not something that I use every day. What about you? I think you use it, right? I use it quite a bit, yeah. I really like it. I know some people don't like it, but I find for regular searches it works quite well, like I prefer it to doing a Google search or something like that. I do add a custom little system prompt, just telling it to add

16:57

actual links just output them at the end of the response so I find it annoying having to click like sources and then go find links and stuff like that so that might be part of why I like it more is like it actually lists out the links as text at the end it's a nice little nice little tip or hack you can just edit you

17:15

just edit the system prompt, or add to... Yeah, I think you can add some custom instructions somewhere, I can't remember where it is, but it's in the settings somewhere. I think you might have to have the Pro subscription for it too. But yeah, I like it for regular searches, and then anytime we're doing research at Mastra I will use the

17:34

deep research mode and I won't actually use any of the text that it outputs i'll just use it to find a huge list of relevant sources to whatever info So like the deep research works really well for finding a big list of relevant links to some input prompt that you give it the actual text I found is like pretty

17:54

unreliable, maybe like half of it is right, half is wrong. I'm hoping with this we have a much higher quality in the actual report that it generates. Yeah, very cool. It'd be interesting to see. I use o3 a lot for those deep research things, like a lot of times if I'm not getting my AI news from X, or I'm not getting it from

18:18

you know, you in our Slack channel, I'll use o3 for that same kind of task, and it does pretty good. But I would be curious, maybe I should do an A/B test with this and just see which one I like better. Yeah, I've been trying to do a little bit more, you know, on a lot of these things,

18:38

even like with Codex, and comparing it to Jules, and all this. Like, I want to do the same task in multiple things and just see which one feels more, you know, fits my workflow, and also seems to get better results for the way that I prompt, right? Because it might be different for everyone else, because I probably have a certain style, the way that I do it, the way that I would prompt

18:57

it. So maybe there's certain ones that will work better for me. A cool experiment. I did run one Labs search, and I just asked it to make some graphs about different events that happened at Mastra, and user growth and stuff like that. Yeah, I came prepared, I came prepared. Oh, nice. You know, I thought this was pretty good. Did you read through it? I've read through some

19:23

of it i was going to read through it on the stream you know I don't I don't I'm not saying we're the most professional but sometimes I step up my game and I'm I come prepared today is one of those days apparently um before we jump into this we do have a question from the chat and you are the person to maybe answer this or or say we can't do it um Jem

19:42

says, "Hi, can I use Zep for memory with Mastra?" I think you can. I think right now you have to kind of manually wire it up yourself, but I think you definitely can. Something we've been thinking about is making it easier to make memory adapters. Right now, the way that memory works in Mastra is pretty opinionated, on purpose,

20:04

um but we have some things in the pipeline to maybe open that up a little bit we'll we'll see yeah yeah i mean it's kind of like our our general uh I I would say in some ways we we've kind of done this before with workflows right where it's like workflows we built it one way and then we got a bunch of

20:21

feedback from everybody, and then we eventually made workflows, now, where, we're at least working on it, so it can be supported and run on, you know, Inngest and Temporal and Cloudflare Workers. So maybe memory, you know, time will tell, but maybe memory will follow a similar path, where we start with our opinionated version, learn a lot, and then

20:41

maybe figure out ways to allow you to kind of bring your own memory to Mastra agents. I will say, I think you could use it right now if you use their JavaScript SDK and manually fetch some info from it and put it into the agent's context, and then when you get a response, put that back into Zep. So that will

21:00

probably work. I haven't actually tried it, but I don't think there's any reason it wouldn't. So yeah, I mean, at the end of the day we're just calling LLMs under the hood, so if you can manage the context, you should have the tools to manage the context. It should be possible, but maybe someday it'll be a little easier.
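The manual wiring just described, fetch memory from Zep, put it in the agent's context, write the turn back afterward, might look roughly like this sketch. The `MemoryStore` and `AgentLike` interfaces here are placeholders, not the real Zep JS SDK or Mastra agent surface, so treat all method names as assumptions:

```typescript
// Hedged sketch of manually wiring Zep-style memory around an agent call.
// These interfaces are stand-ins for the real Zep and Mastra APIs.
interface MemoryStore {
  getFacts(sessionId: string): Promise<string[]>;
  addTurn(sessionId: string, user: string, assistant: string): Promise<void>;
}
interface AgentLike {
  generate(prompt: string): Promise<{ text: string }>;
}

// Turn retrieved memory facts into a prompt preamble.
function formatMemoryContext(facts: string[]): string {
  if (facts.length === 0) return "";
  return "Relevant facts about this user:\n" + facts.map((f) => `- ${f}`).join("\n");
}

async function chatWithMemory(
  zep: MemoryStore,
  agent: AgentLike,
  sessionId: string,
  userMessage: string,
): Promise<string> {
  const facts = await zep.getFacts(sessionId);   // 1. fetch memory
  const prompt = `${formatMemoryContext(facts)}\n\nUser: ${userMessage}`;
  const { text } = await agent.generate(prompt); // 2. call the agent with it
  await zep.addTurn(sessionId, userMessage, text); // 3. write the turn back
  return text;
}
```

The point is that nothing in the loop requires framework support: memory in, generate, memory out.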

21:19

All right, so let's read through this Perplexity search that you ran. So what was the prompt? This is the prompt? Yep, up at the top is the prompt: timeline of key events, usage growth, and user sentiment for Mastra AI, the TypeScript AI agent framework. So it pulled some stuff from here, 45 sources, in the top right, I think. Wow. So it did a lot of stuff, a

21:43

lot of looking around this is what I mentioned this is what I typically find very valuable with these searches is you can come in like actually go and start manual manually looking at anything that looks interesting um I found a lot of stuff that I normally just I wouldn't have otherwise uh found so yeah this person spelled this wrong maestro TypeScript framework oh yeah

22:08

um yeah okay that's cool though even even scrape my LinkedIn yep yeah it knows who you are it mentioned you by name in the response yeah it sure did all right yep it's it's very you know it it can do its research it honestly the funny thing here so I'm just looking I'm looking at this for the first time like I I

22:33

scrolled through it before, you know, I didn't come that prepared, so I'm reading through this for the first time. This is actually a lot of our messaging too, though: addressing critical gaps in the developer experience that existing Python-centric frameworks have left unfilled. It definitely got that from our content somewhere, but I know that's kind of

22:53

the sentiment, you know, Tyler, as you coined it: Python trains, TypeScript ships. You know, it's like, TypeScript should have just as good frameworks as Python. There's no reason that you have to build AI in Python. I mean, obviously you can, but it shouldn't be a requirement.

23:13

wow it even knew even is this I think this stuff probably like maybe deep yeah maybe Sam has a blog post about that otherwise it had to have come from like YouTube videos because I don't know like I mean the reference there at the end of the second sentence there's like 1629 yeah okay okay that that's Sam's blog post and then one Yeah one of my blog posts okay or my one of my LinkedIn

23:40

posts all right yeah we I mean we we obviously talk about this but not that often so it's just it's some deep cuts um a timeline this I was a little bit unsure about like the the red dot saying partnership oh oh maybe the colors are just kind of misleading because we have feature partnership recognition those

24:01

are all very close in color there yeah it's kind of hard to tell but this is mostly right you know it's like we started this in October we probably had our first launch at the beginning somewhere in November I would think if I'm just trying to guess at timelines so but that that's seems mostly right um and then obviously you know this was YC we got in and started obviously having

24:25

more frequent releases after we had spent the first month and a half kind of learning and talking to people does it have um I know some of these things like when it generates a graph or like a in infographic like that it'll show a CSV like I wonder it looks like I can download Well I think that's the image

24:44

oh that's the image yeah try try scrolling down a little bit i wonder if there's like a CSV because we could see what kind of data CSV yeah yeah if you Oh it even opens it up oh that's nice oh wow i'm impressed perplexity this is cool yeah this is better than I thought i didn't I didn't see this when I was

25:02

reading through. So yeah, it tells you category: launch, release, funding, feature. It has a bunch of our changelogs, not all of our changelogs, because we release more frequently than this, but it does have a lot of them in there. This is wild for a one-sentence prompt, and then waiting. I think I waited 10 minutes for it to generate all this.

25:23

Some of the features that we released, not all of course, I think there's many that are missing, but... Oh, there's a load more, you see that? Oh, was there? I think when you scroll to the bottom, load more, three kilobytes total, so maybe it didn't miss it. Yeah, the tool calling improvements,

25:45

that was yesterday yeah it's got the the date on it right there dude we're not What are we What are we even doing here like this can do our jobs for us like this would have taken me like a lot of this is like just imagine you were doing some research on on a company right or just like wanted to see that like chat you know let me

26:05

know if you tried tried this out yet i would recommend you try it i could just see like I put myself in the shoes of like I'm searching for a job and I want to just do some background on the company like that would be pretty cool or um if I'm choosing a new technology to use and I'm not sure if how active like this is pretty useful information i don't know that I'd read all of it but

26:24

if I was going to choose Mastra for a project. I think: a simple prompt, let it run, and then just do some validation. I don't think it's good to necessarily ask AI, do you think this is a good idea, because most of the time it tries to make you happy, so if you think it's a good idea, they're

26:43

going to tell you it's a good idea uh but for this like just give me the data and then let me spend a few minutes researching and then I can make a decision yeah it's pretty powerful from what I saw I did read through this whole thing um the only thing that seemed inaccurate was that it called out one other company's blog post as a partnership but I think it was just another company mentioning us was that

27:08

CopilotKit? It wasn't CopilotKit, it was, um, I can't remember what it was, it's somewhere in here. I wish I remembered where it was. And it could have just been a company announcing support or something. Yeah, I think so. Showing active weekly engagements, the frameworks, yeah, user sentiment, oh, I like this. Are you

27:33

reading this? This is doing the sales pitch for us: overwhelmingly positive reception across multiple dimensions of the developer experience. Oh, check that out, it looked at two YouTube videos, I didn't know it could do that. OpenAI Agents SDK versus Mastra AI, okay. Emil sayer, I don't know you, you must have said some good words about us, I appreciate that. I've never seen that video or this

27:57

one so that's cool it's there's no negative sentiment so let's keep that up there's a bit of a weird graph too though like it's just there's like three steps I guess future outlook very positive future feature completeness very positive I would argue we're not feature complete enough but you know we always like to sh we always like to be better so we constantly pushing ourselves ease of use

28:21

that's an important one for us. Learning curve: generally positive. Production readiness: generally positive. Developer experience... ultimately I'd like all of these to be very positive, but I will take that, that's cool. Yeah, and if you are watching this, we are not giving a sales pitch for Mastra, we are simply looking at analysis that

28:47

Perplexity decided to give us on Mastra, and it turns out it's pretty good. It wasn't even cherry-picked, but yeah, this was the only search I did, actually. I haven't had time to do any more. I was just like, this is something we know a lot about, so we can read through and see how much it's hallucinating. Yeah, and it's pretty good. I mean, it's not perfect, but it's

29:09

impressive. Like, if I asked some random person to do research, they would not get it any more correct than this, right? You're only going to know what you find online. So, I don't know, that's pretty incredible. You know, I would like to compare it, maybe I'll run the same thing through o3 and just

29:28

compare the results uh but this is good like this is very good so we're just looking at if you just joined us we're looking at the Perplexity Labs launch which was that was it launched yesterday I think uh yes that was yesterday it was launched yesterday we ran it through this prompt timeline of key events

29:48

usage... timeline of key events, usage growth, and user sentiment for mastra.ai, the TypeScript AI agent framework. And it gave this stellar review, but also just a really well-detailed report: has links, has all the sources it checked, 45 different sources, gives us related questions. This is pretty sick. All right, scroll up a little tiny bit, sorry,

30:09

the one last thing: see the list of links right there? That's the system prompt that I give it, I tell it to always put a list of links at the end. It makes it so much nicer. So if you don't get this list of links, you can update your settings and just tell it to give a list of links at the

30:26

end of the response. That's a little Tyler Barnes tip. All right, so yeah, we got some chat messages here, so let's take a look at a couple of those. "I wrapped Mastra inside my Nest server, will I still be able to access the agent OTel?" You should be able to, yes. I think the important thing is you need to import the Mastra class that you export and

30:58

then call your agents and your workflows through that class, and then you should still get all the telemetry. And then another kind of follow-up question: is that generally the practice for production, or do you recommend the vanilla Hono setup? On this one I would say it depends. We do have a lot of users and customers that go both ways, depending on how they're trying to build things. So oftentimes

31:24

they have, you know, they want to just use it bundled within a Next.js site, and they'll just deploy it all together. Sometimes they'll want to use Nest or Express or whatever, and they'll kind of bundle it in there. And then we do have a lot of people, though, that want to just run their entire backend

31:42

through Mastra, and so they just deploy the Hono server that you get from Mastra. So I would say, I don't know that there's a best practice. I think it just varies depending on the complexity of your setup. The more complex use cases typically either do what you're doing, where you bundle it in with some other backend service, or run the backend

32:01

separately, and then a lot of people that are just getting started, or maybe want to have simpler setups, will just bundle it in with their front end, which is like a Next.js app or whatever they're using within their front end. All right, Tyler, you got a few more minutes? Okay, we got time. All right, so we got a couple more.
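The "bundle Mastra into your own backend" option from the question above might look roughly like this: a framework-agnostic handler that routes everything through the Mastra instance you already export. `getAgent` and `generate` only approximate the real @mastra/core surface here, via minimal stand-in interfaces, so check the Mastra docs for exact signatures:

```typescript
// Sketch of mounting a Mastra agent inside an existing server (Nest,
// Express, ...). The interfaces are stand-ins for the real Mastra API.
interface AgentHandle {
  generate(message: string): Promise<{ text: string }>;
}
interface MastraInstance {
  getAgent(name: string): AgentHandle;
}

// Returns a plain async handler you can wrap in any route. Routing calls
// through the exported instance is what keeps telemetry flowing.
function makeChatHandler(mastra: MastraInstance, agentName: string) {
  return async (message: string): Promise<string> => {
    const agent = mastra.getAgent(agentName);
    const { text } = await agent.generate(message);
    return text;
  };
}
```

In a Nest controller or Express route you'd call the returned handler with the request body and send its result back as JSON.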

32:27

You know, it's good. It's taken longer than I guess I thought, but I knew it was going to be a jam-packed news day. All right, those are some pretty exciting pieces of news we just went through. So yeah, I think these next ones are not quite as exciting, but still exciting, I don't want to undersell it. Okay, so this one just caught

32:47

my eye i don't think it's really that popular yet but I think we just kind of like it piqued our interest I would say it's a cool idea yeah can you explain it Tyler i haven't actually looked into it that much outside of just kind of like reading generally what it is but I guess the idea is this is a framework that

33:05

allows you to add a payment layer onto your custom MCP server so you could distribute your your MCP server and users of it could pay for access to some service through it or something like that that that's what I gathered is that Yeah that's what I gathered just reading through it briefly you know it's it's still new if you go to the GitHub they only you know it's not very active yet i

33:30

think there's like more like a example and I don't know how much they've actually released so time will tell if this becomes anything or if it's just someone kind of getting an idea out there but I do think in general this idea of we have this new protocol MCP when once you have some kind of new protocol people will try to find different ways to use it and and

33:52

monetize it, potentially. And maybe there'll be some kind of Stripe for MCP or something, right? In this case, where you can almost accept payments for people using your MCP servers. So your agents, that either you're building, or, you know, if it's Claude Desktop or something, could basically use these, you know,

34:13

detailed MCP servers, with, you know, somehow you auth to it, and it charges you based on usage or a monthly fee or whatever. So it's just pretty interesting that not only will you be paying subscriptions, your agents will be paying subscriptions for you, on your behalf, potentially. So what a wild world. Yeah, it's wild.

34:32

there's a new part of the MCP spec that's coming as well is is this what you're going to talk about next or is this a different thing hey hey you know I uh I was prepared i told you Nice okay that's what I was going to bring up yeah hey we're on we're on the same page today i like it all right yeah so I you know

34:56

candidly I did not read through this besides like scrolling through quickly so the it's part of the MCP spec you know we're on the model context protocol site it's called elicitation tell me a little bit about it from your understanding what you what you've seen based on what you read yeah I've only briefly read this as well and

35:15

this is just a draft as well so it's not part of the full spec yet but um my understanding is that this will allow the MCP server to define a schema for some data that the client can send to the server so you could imagine um defining a schema for a UI like a login UI or something like that so maybe you need to like put your email and password in to actually authenticate or

35:42

something like that right with the MCP server now this would uh add that capability so a client could display an actual UI to render that that that's I just skimmed it but I think that that's where this is going interesting so yeah it is important to note this is still just a draft i imagine that you know this it's going to go through some

36:05

iterations but if you are trying to follow what's you know what's happening with MCP what how is the spec changing you know looking at the drafts of like what's coming up is one way to do it so it kind of talks through shows you some diagrams on like how the message flow works so it presents the elicitation UI

36:27

provided requested information completes request so yeah there's And the other thing that I always look at with all these things is sometimes it's like they always give you like this is how it could be used and then oftentimes people are interested in finding ways that it might be used for other things like you know

36:45

who knows you know we we have some cool things we're doing with MCP that I might tease here in the next section but uh we'll hopefully be announcing officially next week but so it is interesting like what sometimes new technologies get you know they have the they get used in maybe the intended ways but there's also a lot of unintended you know sometimes

37:04

good sometimes maybe bad but different consequences of like how people interact with how people use it mhm yeah this I feel like this has so many potential like applications it's going to be very interesting even like things like confirming tool calls or adding extra context for a tool call you know like the server could actually ask the

37:25

user during a tool call um oh I'm missing this piece of information you can fill it in or you know things like that so this will be really cool yeah and Valentina says "Love autopiloting finances." I'm assuming you mean like just letting your your agent take care of it for you through MCP yeah that nice
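Since both hosts only skimmed the draft, here is a rough sketch of what an elicitation request might look like — the field names below are guesses based on the draft's described flow (server sends a message plus a schema, client renders a form), not authoritative spec details:

```typescript
// Hypothetical shape of an MCP elicitation request, e.g. the login example
// from the discussion. Method and field names are assumptions from the
// draft spec and may change before it is finalized.
interface ElicitationRequest {
  method: "elicitation/create";
  params: {
    message: string; // prompt shown to the user
    requestedSchema: {
      type: "object";
      properties: Record<string, { type: string; description?: string }>;
      required?: string[];
    };
  };
}

const loginPrompt: ElicitationRequest = {
  method: "elicitation/create",
  params: {
    message: "Please sign in to continue",
    requestedSchema: {
      type: "object",
      properties: {
        email: { type: "string", description: "Account email" },
        password: { type: "string", description: "Account password" },
      },
      required: ["email", "password"],
    },
  },
};

console.log(loginPrompt.params.requestedSchema.required); // ["email", "password"]
```

The key idea is that the schema travels with the request, so a generic client can render an appropriate UI — a login form, a confirmation dialog, a missing-parameter prompt — without knowing anything about the specific server ahead of time.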

37:44

i also love it scares me a little bit but you know I I ultimately think it's it's going to be pretty cool all right so one more thing Tyler and then I I can let you go because I know you probably I kind of sprung this on you we did originally have another guest planned that unfortunately you know had some reason they weren't able to make it but we'll reschedule with them so

38:05

instead we get to spend more time going through the news today which I don't mind nice all right so this one I think is still early as well but I think it's it's just one of those things that's needed so it's this LLM UI you know we're we're good friends with Copilot Kit and Assistant UI of course there's quite a bit of like UI components and things getting built into Vercel's AI SDK especially with V

38:31

you know the V5 that's coming out but there's always going to be more right there's always new people trying to come up with things in different ways and so the idea is I think the ultimate challenge is how do you build really great user experiences where you have maybe these long-running agents or these

38:48

long-running workflows on some server somewhere and how do you build a really interactive experience for your users and so I think this is another attempt at trying to help solve that problem yeah I I I think this is really interesting too that it fixes the markdown syntax as it's streaming in so I don't know if you saw this i posted a

39:07

link to a video by Matt Pocock where he was I think he's made like a a website just showing how janky it is to stream in markdown into a UI because you need the opening and closing tags to render a lot of markdown right so for italics or something you might have like an underscore start outputting some text and then another underscore and then now it's italicized but before you get that

39:31

last underscore you you actually see the first one render as a character and then once the next one comes in all the text changes and becomes italicized it seems like this library or framework or whatever it is um solves that somehow which is kind of cool yeah I'm curious how it must so it renders without
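One plausible way to avoid the flash of raw markdown described above — and this is only a guess at the approach, not how the library actually implements it — is to hold back a trailing unpaired delimiter until its closer arrives:

```typescript
// Illustrative sketch: hide an unclosed italic delimiter while streaming.
// A real renderer would handle bold, code fences, links, escapes, etc.
function visiblePortion(streamed: string): string {
  // Count underscores seen so far; if the count is odd, the last one
  // opens an emphasis span that hasn't closed yet, so withhold it.
  const underscores = (streamed.match(/_/g) ?? []).length;
  if (underscores % 2 === 1) {
    return streamed.slice(0, streamed.lastIndexOf("_"));
  }
  return streamed;
}

console.log(visiblePortion("hello _wor"));    // "hello " — italic not closed yet
console.log(visiblePortion("hello _world_")); // "hello _world_" — safe to render
```

On each incoming token you re-run this over the accumulated text, so the user never sees a bare `_` render as a literal character and then jump to italics a moment later.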

39:54

showing any markdown is that a custom button so it prompts the LLM to let it know it can use buttons like this has code blocks with syntax highlighting for 100 languages matches your display's frame rate text is streaming tokens okay yeah it's obviously we could look you know we do have the access to the GitHub but I would uh if you're interested in if you're trying to solve like really

40:24

interactive UIs this is at least on our radar of things that we are we're looking at because we are we get customers telling us every day they're they're asking questions on how do we make the UIs more uh responsive and we have you know we've spent a lot of time making sure we have workflow streaming and we spent a lot of time working with Assistant UI and Copilot Kit and making

40:45

sure you can just use you know useChat from the AI SDK but there's still yeah it's still more difficult than we want it to be yeah another just probably related to the last topic i've been thinking about the agents economy where agents will work alongside humans in the orgs yes um definitely something I've thought about

41:10

as well i don't know how that's going to look yet but it is interesting and Valentina says "Markdown makes me want to cry." I agree with that actually like you know it's not bad but And then Valentina's making components for chat box chat bots and also says because Markdown I'm assuming it's because Markdown's such a bother agreed maybe this helps a little bit but I guess we

41:36

can uh kind of continue on all right Tyler well that's the news for today so I do appreciate I appreciate you coming on and spending you know 45 minutes with me as we kind Yeah it was fun it's probably the longest new AI news segment we've ever done because there was just even though it wasn't the most like the

41:55

most giant releases it wasn't like a complete new model dropped but I think there was just especially with reviewing R1 v2 and having some like in-depth look at it plus like the MCP draft plus the perplexity there's just enough that we can spend a lot of time jam-packed in there yeah just dig into it a little bit

42:14

more than normal rather than just kind of going over it on a cursory level all right well I probably Yeah I probably won't talk to you again so have a great weekend uh yeah see you later see you all right everybody thanks for joining us this is AI Agents Hour we just spent the last 40 or so minutes going over some AI news we talked about the uh Deepseek R1

42:42

V2 and how it looks pretty incredible to be honest and we had Tyler come in and talk about how he's even been testing the the Qwen 8B model the new Qwen 8B model and getting some really good results locally running it and you know honestly we've just released this tool calling compatibility research that we did and we it's on our it's on the Mastra blog but we tried the new DeepSeek

43:09

uh version and it actually outperforms the other one so at least in this narrow aspect of tool compatibility it seems like it's on par or better than all the other models which is pretty amazing because you can run an eight an 8 billion uh parameter model locally and it just outperforms what you can run

43:29

from the from the big players so nice to see that you know the open source world is still uh still moving forward even if it is a little behind on some benchmarks it's still uh it's not that far behind uh we talked a little bit about MCP uh spec and the elicitation proposal that's in draft status we talked a little bit about Perplexity's uh Perplexity's new launch and we showed

43:54

how it could be used to do some really good deep research so it's called Perplexity Labs and we did some deep research on Mastra and it was very favorable it talked very kindly of of Mastra which I appreciate but we do want to talk next or maybe spend a little time and I'm just going to highlight some we'll call them like community tweets so I'm going to share some

44:21

community tweets that we appreciate around here and so we want to give some of these people some love if you're on X maybe give them a follow or go give give their post a like so this one here from Daniel Trevino says "Taking some time to keep learning diving into Sam's book on AI agents which I have right here the way Mastra breaks it down makes it smooth and easy

44:47

read i'm at the summer house in the Swedish forest beer in hand doesn't get better than this." Amen to that Daniel so I am actually You won't see me as much next week i'm going to be out i've had this uh family vacation planned for well before we started this stream and even well before we started Mastra so I will be

45:06

out you know doing some stuff with the team checking in but not doing these live streams so you'll see other people from the team managing the live streams but I will hopefully have a beer in my hand and I already read the book you know I already read it but there might be a new version coming so we'll talk

45:24

about that in a minute but thanks Daniel for the post appreciate that let's do another one we got three today all right so this one's from Hashim hashim's a friend used to work at Gatsby with me and said "This is our analysis we kind of talked about a little bit about our LLM tool compatibility error rate by model and how

45:51

um kind of the before and the after." So you can see how if we look at this this is from our blog post after we introduced like tool our tool compatibility layer in Mastra how we decreased error rates significantly for all these different models and we don't have uh DeepSeek R1 in there because before a couple days ago it didn't really support tool calling now it does and so when we tested it when Tyler

46:15

tested it it was already really good it didn't need any didn't need any help from us it just did it so really great uh great post thanks Hashim also a good follow and last but not least uh let's see here we got from Ryan ryan Zambrano best AI agent framework right now is Mastra AI giving an MCP for docs is elite thank you my friend appreciate

46:46

that all right so with that we're going to talk about some upcoming launches that we have coming up with Mastra um and we'll kind of tease a few of the things that we're going to be doing most of these will be landing next week assuming everything goes to plan sometimes it doesn't always go to plan but assuming it goes to plan you should be seeing a lot of this stuff uh land

47:14

next week and so don't tell anybody you know this is for only the people watching the stream or that get 47 minutes into this video but a few things that we're going to be launching next week so the first already talked about this book Sam my co-founder right there he just released the second edition it's available on Amazon right now if you

47:39

want the physical copy or if you want the digital copy you can get that for free going there and so you'll get the second edition which is I think this first edition I see if it has the page numbers on it it was like about 90 pages little less than 90 pages for this first edition i think the new version is like

48:00

140 pages so it's a little longer but this version didn't have anything about MCP there's a whole bunch of like just missing information because some of the stuff's changed a lot so we knew writing a book we're going to have to constantly keep it updated we try to we're going to try to update it every couple months because the whole space is changing but

48:19

you can get that copy see you know again it's like 140 pages so it's still a pretty quick read um it's it's kind of the goal is to be like no hype no BS just really simple we we do show you know some code examples with Mastra but it isn't really Mastra-specific per se we just show examples of how you can do it but I

48:39

think it's really educational even if you're not using Mastra to take a look at and if you're just looking to get into you know building AI agents and you want a a pretty good overview that you could probably read in a weekend uh and pretty casually I would I would recommend that it's pretty it's a pretty easy read and because of this uh you know

48:59

launch Sam doesn't even know that I'm announcing this so he doesn't know that I'm teasing this launch before it happens next week but if you want a copy if you want a physical copy of this book but the second edition not this edition the slightly slightly bigger edition of this book please uh drop a comment preferably an X if you can

49:19

because if you do it in YouTube I don't know how I'm going to get your information but I will be picking uh I'll be picking we'll say three or five people depending on how many we get if you want the second edition we will send it to you the only caveat is if I pick you you got to DM me on X with a couple

49:38

couple bits of information i need your name address and I believe I need your phone number and we'll just send it to you we'll just Amazon ship it directly to you um Valent Valentina you got it you DM me that information again I need your name need your like actual shipping information so I can just paste it into

49:56

Amazon and um I think I need your phone number too and then we'll we will get that shipped to you next week and you will get the second edition so if anyone else wants it we'll give out a few more copies uh you know if if you're watching this after the fact we're only going to do it live so I'll give everyone a few

50:13

minutes and we'll uh before we bring on our next guest we'll we'll bring a couple uh we'll we'll give out a few more copies so we're giving out one we'll give out a few more all right uh other interesting things uh that we're going to be launching we are going to be launching a Mastra course so this a lot of people are very similar to the backgrounds of

50:40

the founders of Mastra we we're we're coming from the web world we built websites and web apps and mobile apps and kind of like app development which is why we you know honestly why we're we spend so much time in Typescript and why we're kind of Typescript uh first but we do know that a lot of you are probably just getting into this stuff just like we were you know in in some

51:03

some cases you know like I started kind of working on some of this AI stuff about two years ago um you know Sam and Obby maybe a year ago you know as we're kind of building out Kepler the CRM but you know we're still pretty new to this world as you probably are and if you are interested in learning how to use Mastra specifically we're going to

51:24

have a Mastra course um so Utkar unfortunately and I should have said this India is one of the places we can't easily ship from Amazon to i would love to send you a copy please grab a digital a digital copy of the book but I'm sorry India is one of the there's a handful of countries that we I think there's maybe a dozen countries we've figured out we

51:46

can send to india was one that unfortunately was kind of hard for us to send to so apologies for that but appreciate you and hopefully you can get a digital copy um so yeah so we're releasing a course i am just finishing up a few videos for the course but it's not like a course that you would expect because in a traditional course you are kind of

52:11

learning from an instructor a teacher that probably records a bunch of videos and walks you through something the challenge with courses like that is videos are hard to keep updated so it goes they go out of date kind of quickly and this space is moving pretty quickly so we kind of came with that constraint

52:28

and said how would a course look in this new AI world and so and SUJ sorry again I can't send it to you but if you go to mastra.ai/book you can get a digital copy but we so we came with the constraint of like what are we going to do with how do we make a course that's a little bit easier to keep updated and

52:52

ideally that would just be like text right like text and guides it's a little easier to keep updated but not everyone wants to just read text you know we we could put a whole bunch of guides on the website you know we have the book that if you want to read text you know we we want people learn in different ways so while we do

53:11

have some videos we do stuff like this we try to teach things we do workshops we we thought that the workshops are a better form of like constant updates of information we probably shouldn't do a full video course so instead and you know please don't share this out this is coming out next week um we are basically using MCP to build our course meaning that you are going to

53:34

take the course in your editor so whether you're using Cursor or Windsurf or VS Code any editor that supports MCP you can actually follow along it'll write the code for you tell you what it's doing and explain it to you so I won't be teaching it although there'll be a few videos from me that talk about how it works your agent your code agent

53:52

is going to be your teacher and we'll teach you so that's coming out next week there's teases of it out there now but we'll announce it next week i encourage you all to sign up for it try it out uh it's we're labeling it experimental because not all coding agents are created equal we found so some uh

54:10

editors work better some editors with specific models work better we have found that MCP support on Windows is just not very good so I don't there are some challenges with running just MCP in general even our doc server has struggles on Windows so this stuff's all uh very much changing and hopefully will

54:29

continue to improve but we did want to get something out there that was a little different that allow allows you if you want to get started with Mastra you can kind of figure it out uh by kind of having a coding agent walk you through how Mastra works and building some agents building some adding memory uh building some workflows things like that

54:46

and then over time we'll be able to add more to it so it only has a couple sections to the course right now but we will be continuing adding continually adding more and you can kind of track your progress online and there's a whole bunch of like components of how your code editor maps to like a a course dashboard and all that so more to come

55:03

on that next week uh but we're pretty excited about it and the last uh kind of one last thing maybe we have two last things you know we're going to have announcement around Mastra Cloud you can still request access we're in private beta right now but if you do build a Mastra agent and you want to deploy it somewhere you want to have like

55:22

production level tracing our cloud is in private beta we're going to have some announcements around uh some new things that we're adding to that next week and the final thing that this is not really a you know I wouldn't say it's it's a launch that is going to be hugely uh publicized you know it's not quite as big as we're launching a brand new course but it is such a quality of life

55:44

improvement and I believe it's live now i'm just going to share the video from that Marvin on our team put together around something that's available in workflows and I it might be in alpha it might be in latest i'm not sure where we're at with actually getting it launched but it's very cool and it's

56:02

basically a 4-minute video so I'm going to share this video and then we are going to have Joyce come on here in a little bit as well so give me a second to find it all right so let's see here i think I in order to share the sound I think I need to do this let's see if it comes through some reason I've had trouble getting sound to share depending on the on what we're doing but

56:37

let's see if we get this all right hello folks uh it's Friday so it's demo day and I'm going to showcase you what we've been working on uh with Topi it will be about vNext workflows so thanks a lot to everybody that has been involved and thanks a lot to and shout out to TIG that has made this possible so let me show you a bit what uh the workflows

57:05

look like when Tony is uh showcasing them during a workshop session so you go to the okay not the problem so you go here and you go for example on the concurrency one before this is the before right so you have your graph it looks nice but it does not give you any information and this is a feedback that

57:27

we got last week uh when we interviewed a quite big company and they were like hey this graph looks good but it does not give any information so I'm going to show you how it works right now and basically all the information lives on this right panel right so if you want to have access to something you just watch those running and then you go here and

57:53

you have access to quite a big JSON that you have to manually uh browse so let me show you now a new version of it so I'm going to just go to the whoop the other one API workshop going to start this one and I'm going to show you how it works now with what we did so let me go to the uh good port we'll go here and I'm going to go to the workflows here i'm going to select the same one and now I'm having

58:29

access to almost the same UI right it's not crazy new it's just same thing i have all of this here but let's see what happens now when I trigger a run all right so now when you trigger a run this graph has an importance right it will highlight um the path that has been chosen and it will give you information about the time spent the input of the thing uh so the

59:14

input was Paris the output of this step so now you can see what has been output and what has been provided to this next step it's the same here you have access to the input so this has been given to this one and it has spit out something like that and then you can have access to the full report right below and a map config and all this

59:35

stuff and you can also have access to uh the trace in some situation so for example on this one I have access to the trace and if I click on it I'm directly redirected to the trace of this specific step we've not made it work yet for all the steps like for those nested workflow uh it's a bit different but if you go to the mapping blah blah blah you can go and see the trace that has been uh

1:00:02

created right there directly so that's one thing that we did uh which I think could help people just see the information flowing in their chart um but there's also something else that we've created which is a way to see the previous runs that happen on this workflow so let me try to trigger a bunch of them so San Francisco let's see how it work so San Francisco

1:00:30

it has decided to choose something else depending on the weather and then boom we have some information here and we can see what's the output result Friday blah blah blah i have some stuff with a summary clear sky yeah it's quite cold actually but yeah probably during the night though and if I click on this run button I have access to two different runs and I can see them so this one was

1:00:55

the Paris one and this one is the San Francisco one so we now have we now have a way to see um the history of runs disclaimer right now this will not work if you change the shape of your workflow but Tony and Topi are working on a way to snapshot the the state of the workflow at the moment we trigger a run

1:01:18

so that we can just show it uh as it was during the run itself so I hope you appreciate it and thanks for watching this one and again shout out to Topi and Tony for having made this possible see you later all right so that was that's pretty cool we have uh now the ability to get much more visibility into what's happening when you run workflows so if you want to build a workflow test it in the playground you can see what's happening

1:01:49

you can see the inputs and outputs much more clearly you can keep track of past runs so ultimately a huge quality of life improvement something that we've had like ideas we were going to do for a long time and customers keep asking about it but we just hadn't quite got around to it and now now we did so that

1:02:07

wasn't kind of the last exciting thing I wanted to announce as far as Mastra-related things today okay so we got a few questions in the chat so will the cloud support other server frameworks like Express or Nest so technically we're not going to have direct support for that it probably would just work to be honest because we're just running you know a node

1:02:30

package so you it probably would work um I don't think we've fully tested that yet right now it's kind of built for just running the Hono server but we should test it like there's a very good chance it would work and then over time maybe we will add you know official support for it but right now not officially supported might work though

1:02:50

uh Valentina will your copy be signed unfortunately no we had to go through it will be sent right from Amazon but if you are in San Francisco next week Sam is going to be at the um Sam is going to be at the AI Engineer World's Fair so if you're there or even around this that area we can probably get you we'll have a whole bunch of copies of the new edition of the book we can probably get you a signed copy if you really want it

1:03:16

um one more question is is the cloud open source unfortunately no the cloud is not open source like all the framework is you can build it deploy it run it anywhere but uh the cloud is a commercial offering that we will be talking more about all right i will not be there unfortunately next week but Sam will be so otherwise I will be there uh in a

1:03:37

couple weeks so I'll be back and uh just as a reminder so a bunch of you just jumped in some things that we've done so far we spent a lot of time talking with Tyler from the Mastra team about some AI news about some Deepseek R1 v2 information around how it's pretty good uh Perplexity Labs the new elicitation draft in the MCP spec so we talked a little bit about that and we did just mention some

1:04:04

upcoming master launches specifically we are about to launch you know not publicly yet but I guess it is kind of public here um we are about to launch the new version of Sam's book if you want a copy and you are in kind of the UK area like Europe US Japan if you're in any of those those uh areas just drop a comment and we will try to uh we'll send you a

1:04:31

copy we'll just send you a physical copy through Amazon we can't send it everywhere unfortunately there's a few places we can't easily send to but with that we do have a guest and so I do want to bring on our next guest who I have interacted with quite a bit online but never talked to yet in in person so

1:04:53

let's bring on Joyce and learn a little bit about some of the stuff she's doing with AI hi how's it going hey Joyce nice to officially kind of meet you yeah nice to meet you online um I feel we're not that far away physically i'm in SF as well yeah yeah i'm in So I I kind of split my time between uh SF and Sioux Falls South

1:05:18

Dakota as random as that is but I am in SF quite frequently most you know a minimum of at least one or two weeks a month but yeah tell me a little bit about uh about you and some of the stuff that you're working on and then I I know we'll have questions this is live so if you're listening to this if you have questions drop them in the the comments of YouTube LinkedIn X wherever you're

1:05:40

watching this from yeah um so I'm Joyce and um my Twitter handle is on there i think some of my tweets these few days have been kind of appearing on people's feeds um but I currently do product at Crossman um so I think Crossnet is still um pretty new like or you know not a lot of like traditional AI people kind of know us

1:06:07

but um we're basically a developer platform for um companies and also developers individual developers to bring their apps and also their agents on chain um so I work specifically on agentic payments um with stable coins um meaning that you know if you have an agent and you want your agent to be able to easily own um its own money you can

1:06:36

do that um we can actually help you do that with our obviously APIs um and because agents can't own bank accounts for now um we can help your agents own wallets or you can bring your own wallets and your agents can just simply very simply easily transact with um stable coins so what can agents actually do with stable coins well you know initially when we first started um so I

1:07:03

work on this product called GOAT so GOAT is actually an open-source onchain action toolkit meaning that if you have an agent for example you built an agent in Master and now you're like "Hey I want my agent to actually be able to interact with the real world for example I want my agent go shopping on Amazon

1:07:22

for me." Um but you know Amazon blocks bots and they don't really allow kind of bots to surf their websites or like even pay and your bot can't have a bank account so you come to Crossmint and you have a wallet with us or if you already know crypto you already have a wallet and you would be able to use kind of

1:07:40

some of Crossmint's API with goat um to buy stuff with stable coins on Amazon and the best part is you never need to offer them for your agent um and however you get your stable coins is up to you or even if you just want to pay in like native um tokens on like the chains that you choose so this is a little bit more crypto um but I kind of wanted to share my screen and show you guys what exactly

1:08:06

this go thing is because I know a lot of people initially they're like "Wait this is so confusing." But once I show you it will all make sense so yeah and it Yeah feel free to share your screen and funny enough we did talk about there was someone that has started kind of an open source project around MCP payments and it's very new so but I do think there's

1:08:24

a lot of excitement around and you know may maybe some hesitation but mostly excitement around how can agents transact things on your behalf and so I do think that this is going to be a space that a lot of people are thinking about and maybe this is like it seems to be like this might be a really good use for crypto because then it's not your

1:08:45

like an actual bank account which you do need you know to be a human right to get a bank account but you don't necessarily have to to transact on chain so very cool yeah so that's actually the best part and a lot of people listening me will be like "Okay but what even is Goat?" And so if you're interested to join along

1:09:04

just go to github.com/goat-sdk um so when you come in here basically what you see is the leading agentic finance toolkit for AI agents and we did already make an agent with the Mastra framework so it's already in there um what you see here is if you go into examples and you go into by-framework you see that there is a Mastra folder
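The example agent in that folder is described below as a simple "money transmitter": it can report its address, chain, and balance, and send tokens. A rough, runnable sketch of that idea is shown here — the interfaces and function names are illustrative stand-ins, not the actual GOAT or Mastra APIs, and the wallet is an in-memory fake so nothing touches a real chain:

```typescript
// Illustrative stand-ins only: real GOAT plugins and the Mastra tool
// format differ; this just sketches the "agent with on-chain tools" idea.

interface WalletClient {
  getAddress(): string;
  getChain(): string;
  getBalance(): number; // whole tokens, for simplicity
  send(to: string, amount: number): boolean;
}

// In-memory fake wallet so the sketch runs without a chain or keys.
function makeFakeWallet(address: string, chain: string, balance: number): WalletClient {
  let bal = balance;
  return {
    getAddress: () => address,
    getChain: () => chain,
    getBalance: () => bal,
    send: (_to, amount) => {
      if (amount > bal) return false; // insufficient funds
      bal -= amount;
      return true;
    },
  };
}

// The tool set a "money transmitter" agent would be handed.
function makeTools(wallet: WalletClient) {
  return {
    get_address: () => wallet.getAddress(),
    get_chain: () => wallet.getChain(),
    get_balance: () => wallet.getBalance(),
    send_tokens: (to: string, amount: number) => wallet.send(to, amount),
  };
}

const tools = makeTools(makeFakeWallet("0xabc", "base", 100));
console.log(tools.get_chain());              // "base"
console.log(tools.send_tokens("0xdef", 30)); // true
console.log(tools.get_balance());            // 70
```

In the real repo the wallet client and tools come from GOAT's packages and get wired into the agent through the framework adapter, but the shape of the handoff — a plugin exposing a small set of callable tools — is the same.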

1:09:35

and this is just a super simple agent in this repo that can transact on chain so you can send tokens figure out which wallet you're using which chain you're on and if you're interested in following along you can just go onto this repo and look at how we have set up a money transmitter agent um but

1:10:01

basically what this means is we have all these things called plugins um so if you go into packages you see first and foremost adapters so Mastra is an adapter so for example if you don't want to just use the example code that we built for Mastra you can come in here and see how to use a Mastra agent with GOAT because this is where we show

1:10:26

you the implementation and then for plugins you can see every single app that your agent can use so again this is going to be super crypto-specific but um what's interesting is that all these apps are onchain apps that humans already use and now we just make it super simple for you to just focus on

1:10:48

building the agent and then coming here and seeing which apps you want your agents to be on and once you choose one you literally just need to say import this uh for example if I want to import Etherscan I just say import etherscan from goat-sdk/etherscan that's it um so yeah so this

1:11:08

is just the structure of the repo um I would love to show a Mastra agent in practice kind of just buying something for me from Amazon um so let me just go to my IDE yeah that I want to see that too because I have not seen that yet so that is pretty cool let's see yeah so I actually wanted to

1:11:32

show like a flight agent um but I think Amazon is something that we can see here and now so today you know I'm just going to choose something cheap I just want to get this pack of pens um and I'm going to make my agent buy this for me so what I'm going to do is um for the next demo we're going to have it buy the book no I'm kidding yes actually yeah um

1:12:01

well honestly I could change it to that link if you just send me the link yeah I mean just search for Principles of Building AI Agents the book see if we I know this is a live demo now so I'm throwing some you know yeah no worries let's see and you'll notice that on Amazon it does say second edition so it has been released
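As Joyce notes a moment later, the demo's first step is simply pulling the Amazon product ID (the ASIN) out of the pasted link. A rough sketch of that extraction, assuming the common `/dp/` and `/gp/product/` URL shapes — the real GOAT/Crossmint implementation may handle more:

```typescript
// Sketch only: extract a 10-character ASIN from an Amazon product URL.
// Real-world links come in more shapes than the two handled here.
function extractAsin(url: string): string | null {
  const match = url.match(/(?:\/dp\/|\/gp\/product\/)([A-Z0-9]{10})(?=[/?]|$)/i);
  return match ? match[1] : null;
}

console.log(extractAsin("https://www.amazon.com/dp/B0ABCD1234?ref=x")); // "B0ABCD1234"
console.log(extractAsin("https://example.com/no-product-here"));        // null
```

Once the agent has the ASIN, everything else — wallet, checkout, payment — happens through the headless-checkout plugin, which is why any tool that can produce an Amazon link (a search agent, a browser agent) can feed this flow.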

1:12:24

oh okay yeah and it's about the same price as the pens yeah all right let's do this then okay this is super cool so if it doesn't work we'll go back to the pens but if it does work everyone's going to celebrate it should work because all it does is try to extract the Amazon ID from the link so I would just go in and start running the index file so I'll

1:12:51

just be chatting with the agent through the CLI obviously you know if you want to build on top of this and make your own UI sometimes um I make like a Telegram bot that you can chat with with a little more personality but today I just want to show you the Mastra framework and what we've built on it so I'm just going to say hey

1:13:10

what's my wallet address and as you can see here it's calling the get address tool um because we did import the wallet client and also one of the plugins that we imported is the headless checkout which enables agents to just buy things on Amazon with stable coins um so it knows my wallet address which means that it now knows my wallet

1:13:39

balance so if I say "What's my wallet balance?" Or I can say "What's your wallet balance?" Either way it's the wallet balance uh oops never mind so I can say "Hey could you please buy this for me on Amazon?" And it should say yes so it tried to get my address first one more time

1:14:05

just check what chain it's on um and then it should Oh okay so it extracted the Amazon ID and it already has my address and my name and all that and so it called the buy token tool which basically is just a tool that we imported through the headless checkout plugin and it created an order so what you're seeing here is just

1:14:31

an ID that you can check in your Crossmint console or once you get the receipt on your email and then it said that the order has been paid for um so usually I'm able to show you guys a receipt that I get um but today that system is down so I'm just going to show you that the payment actually did

1:14:53

go through so this is my address and um it actually did pay so how much did it it's $9.67 so together with tax I paid $10.35 um so usually the receipt will look something like this um so if I bought the book it would show me the book today it's down for a little bit um so previously I bought Red Bull so it kind of just showed me so this is the order ID I get a receipt

1:15:28

from Crossmint it shows me exactly how much I paid with how much taxes and shipping is free because it's from Amazon Prime so the book should be here by like tomorrow or latest two days from now um and yeah that's basically it that's the agent that we built with Mastra okay this is epic so I have lots of

1:15:50

questions but first of all because previously you know you had to have the exact Amazon link right it extracts the ID from the Amazon link so I can already see in my head you know we just had Annie from Stagehand on I think it was yesterday or two days ago and Stagehand's a browser agent right so it can go search Amazon for you and like go out get the web

1:16:14

page get the link so I could wire these two things together and make a really cool agent flow I did make uh you know I think I tweeted something when I was first playing around with Stagehand where it added a product to my cart but I didn't want to have to give the agent my credit

1:16:29

card and have it type in credit card information so this does solve that problem so I think I might have to at some point try building this on a live stream where it's like let's actually wire it up so we can have an agent go out search Amazon for you find

1:16:47

a product recommend it ask me if I want to buy it and then make the transaction and then a few days later show it on the stream you know that the stream actually worked but now you're going to get the agent's book it's coming to your house right it's coming to my house and actually um I did post a

1:17:05

video series where I bought Coke for my friend and he put his New York address and by the time he flew back to New York it was by his doorstep waiting for him and we bought it well today's video was about Amazon and that video was on Amazon but we can actually also buy stuff on Shopify and right now we're

1:17:25

testing rolling out flights as well so I had my agent just go search for flights for me because I was like I only want flights under like $70 from SF to LA and it automatically booked a flight for me but I didn't know until I got the receipt from Frontier um so that was a very pleasant surprise but there's so much we can do now because you don't have to worry about giving

1:17:48

your agent a credit card you don't have to worry about um I know for Stripe for example they use one-time payment tokens so every time before they pay you have to go in and put in a code for it of course you know there are other security things you have to think about so for example I don't want my agent to just spend like thousands of dollars or thousands of USDC from my

1:18:08

wallet I want to be able to give it rules so you can do that or if you say "Hey it's never going to buy anything more than like $20 that's fine it can just buy whatever it wants I'll let it run in the wild." Um so there's so many things you can build now I think um I'm just so excited but obviously I'm also

1:18:24

more excited to see how we can start building more reliable agents as well because obviously this is also very controlled very supervised like I go in periodically to check what it's doing yeah I'm just thinking about so I have a talk in San Francisco on June 17th I think it's at Elastic or it's with

1:18:49

Elastic but I'm showing how to build like a personal finance agent so maybe I should just take this example and also say "Oh and by the way my personal assistant so it's not personal finance I'm building a personal assistant agent but also my personal assistant can just buy me stuff on Amazon when I need it." So that's a pretty cool capability that

1:19:08

would be pretty easy to add I think because essentially I imagine we kind of looked at the code but it's essentially you just import from the GOAT SDK and I make it into a tool call and I give that tool call to my agent and you know just as long as I have the API key

1:19:25

it should just work yeah it should just work and I can even give you an API key that just allows you to transact with real money and give your wallet real money as well um and the GOAT package is also just super simple to import even in Python so um but one thing I really like about the Mastra framework is that it makes it really easy to call tools um so building

1:19:49

this whole kind of example probably only took us like not even a day which is amazing because every time we add a new framework it requires us to go in read the docs and understand like oh this framework is different from that framework how do we call these tools but yeah with Mastra it's super

1:20:06

simple yeah well we try to make it as simple as possible so I'm glad that it worked out and was one of the easier ones for you to integrate with and I definitely always like to see these different open source projects kind of starting to tie things together because I do think you know I do dabble in crypto

1:20:27

like a lot of us that are watching this probably do you know but less so than maybe I used to because there's like these natural hype cycles and I think everyone was like oh AI is the new thing crypto is the old thing but what if the worlds collide a little bit and I think that's where there's some really interesting use cases where these two technologies

1:20:46

can really work pretty well together um and you kind of get some of the additional security and visibility that you get with crypto right where everything's on chain but then you also get kind of these new capabilities unlocked by agents that seems like a really cool use case for where I personally could see like yeah

1:21:06

maybe I would rather just have this thing be a stable coin and just use that rather than giving an agent access to my credit card right yeah yeah there is one question from the chat this was probably when you were talking about Shopify just from Valentina how did you find that on Shopify I'm assuming like

1:21:28

the integration to Shopify Valentina I don't know if that's clear I don't know if that makes sense to you Joyce yeah so you know we now support over a billion SKUs on both Amazon and Shopify and then we're in the process of adding like ten or twenty more providers um well I can't really say too much about how the integration works but basically the way this integration

1:21:54

works is that you never have to off-ramp whatever stable coins your agent pays with um so we first started with the biggest two e-commerce websites right and so Shopify and Amazon already work out of the box um some companies have already started using it for their agents so that's actually pretty cool to see um like we're transitioning from just

1:22:20

agents that live in your IDE or like localhost into just things that are in production so yeah I can't comment too much on that um but just know that it works and you never have to give it a bank account or credit card very cool all right anything else that you wanted to talk about today Joyce um no yeah I think that's all I wanted to

1:22:50

show today awesome well you can see Joyce's Twitter handle there on the screen X Twitter whatever we call it I change my mind every day but definitely an entertaining follow so I'd recommend giving her a follow and yeah it was fun to actually get to chat not just over you know Twitter DM right

1:23:15

Valentina says "Epic Share thanks so much thank you for tuning in thanks for having me have a good day." Yeah and like I said I'm going to play around with that idea maybe we can build an agent using it and if I have questions I will definitely know who to tap on the shoulder to help me out yes yes just DM me and I'll help you

1:23:36

all right we will see you later Joyce and thanks for coming on AI Agents Hour thank you for having me bye all right everybody so it's been a pretty good episode today you know sometimes we go really long today I think we're going to keep it kind of tight right an hour and 23 minutes in I think that's a pretty

1:24:01

good live stream for the day so this has been AI Agents Hour we talked AI news you can go back and watch the recording if you missed it it's on YouTube right now and you can get it by following us on YouTube um then you can also follow me and follow Mastra on X if you are on X yourself I'm @smthomas3 as I mentioned before you can get the book

1:24:30

at mastra.ai/book if you want a digital copy of the book go ahead and sign up for that it's also on Amazon or you know we gave away at least one free book if anyone else wants one let me know DM me uh and with that we did talk about some upcoming Mastra launches we talked with Joyce and overall it was a pretty good Friday hope you

1:24:53

all have a great weekend I will be out next week so you'll get other people from the Mastra team Obby will be back he's still in the Europe time zone so our timing of the show might be a little off uh we might have some earlier episodes some later episodes so normally we try to come at you every day around noon Pacific

1:25:11

time give or take a little bit depending on if we're running late which we sometimes do but appreciate you all thanks for tuning in and we'll see you next week