
Claude 4, Linear for Agents, Meet the Mastras, and Building Guardrails

May 23, 2025

Today we check in with some AI news that covers Claude 4 and Linear for Agents. We also chat with Tony and Taofeeq from Mastra. Finally, Ward and Abhi work on input/output processors (guardrails) in Mastra.

Guests in this episode

Ward Peeters
Mastra

Tony Kovanen
Mastra

Taofeeq Oluderu
Mastra

Episode Transcript

0:00

hello everyone welcome to ai agents hour uh european edition um quick thing i know shane on the live stream yesterday said uh that we we weren't working together on the stream anymore that's bs all right he was trying to cut me out of my own show so i made my own show or our own version of it in this time zone so welcome ward um

0:26

say what up to everybody what up um today we're uh doing a typical show we're going to talk about what's happening in ai uh you'll meet some mastras uh as usual and then we're working on this feature it's guardrails-esque right so we're going to start coding some processors that we want to like just play around with um so yeah we're going to have the chat open we'll answer

0:52

everything this is going to be fun um it currently is 12:00 p.m central was it european time so for those watching welcome all six of you so far um so yeah we'll get into it so ward um we previously were just taking care of some horses and goats and stuff uh why don't you let the audience know a little bit more about like where where

1:21

we're at and everything like that yeah so we're at at my place i live in a small town in belgium i'll be visiting so that's great i have in my backyard like animals like i have three goats and a horse um have to take care of them every day um and fun to have someone who can help me so i don't have to do it alone it's basically um giving them food giving them hay shovel poo giving them

1:49

water like basically everything you do um for animals and yeah that's about it i think you know there's like this american thing that we say it's like that's horseshit well there's truly horseshit back there that we were shoveling uh we also recorded a video of us talking about agent to agent we're going to release that probably after the live stream um while we were shoveling horse poop so um yeah we're going to get

2:14

into the show now i already see some things in the chat uh what's up jack martin hello what are your thoughts on claude 4 um we are going to talk about that actually in the news so we'll just give it a second see another question about funding news uh i don't kiss and tell so i'll just keep it that that way um let's get into the

2:40

news so first thing we're going to talk about today one second this i always mess this up all right sharing my screen so oh there we go all right so some of the news it might be repetitive i know um but this is really cool the ai gateway from vercel uh was released a couple days ago um it has a lot of interesting implications uh for i guess a lot of

3:25

this this model routing layer of an ai application so what the gateway allows you to do is essentially switch between tons of different models if you're using ai sdk already you're installing ai sdk openai and anthropic etc so it's pretty cool to just use like this gateway function and then you know pass a magic string and then you've got it um so this is really cool uh what the implications are though

3:56

are interesting because there are other companies doing this specific thing so i wonder if this is like vercel putting a statement in the market right and so like notably right um openrouter is a very famous model routing thing that has an ai sdk provider now it's like a little confusing right like are you supposed to use openrouter should you use ai sdk i mean the options are yours

4:26

you know um but it's interesting to see like where this thing's going to go um so yeah why are they building it ai development is fast and only getting better and all that there's a new state-of-the-art model released every week that's so true we feel the burn ourselves on mastra just there's always new models that people are having

4:46

trouble with tool calling and structured outputs new options like qwen for example qwen yeah um so now they're kind of like you know essentially getting into the ring here um there's another interesting thing that happened with this too because v0's model was released at the same kind of like the same vibe right so if you wanted to use the v0 coding model you could use it with the gateway and then

5:13

now you have v0 in your agents or whatever so i would love to see mastra users um play around with it right it'd be pretty cool create like a v0 clone and just show vercel like "hey i did it i did it." um so this is going to be you're gonna have to pay for this duh um but right now it's free and rate limited free which is really cool to play with ai sdk has

5:39

a really dope playground where you can just try different models and kind of see things there um so this only like kind of improves that playground experience as well um so yeah so it's not something you can host yourself it's only for vercel and same with openrouter which was also very similar and i feel like bedrock and azure are also doing these types of things uh so you know what's vercel

6:04

cooking you know can i host my own models on vercel maybe um and then the gateway is available to me it's pretty interesting especially for enterprise right like lots of enterprise especially in europe because of like um all the privacy concerns they uh want to host them all themselves or at least make sure not too much data goes to the llms themselves or their companies yeah
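For anyone who wants to see what the "magic string" idea looks like in code, here's a rough TypeScript sketch. The AI SDK imports and calls in the first half are the standard way of wiring providers today; the `createGateway` helper and the model-string format in the second half are assumptions for illustration, not the documented gateway API.

```ts
// Today with the AI SDK: one provider package per vendor.
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';

const viaOpenAI = await generateText({ model: openai('gpt-4o'), prompt: 'Summarize this repo.' });
const viaAnthropic = await generateText({ model: anthropic('claude-3-5-sonnet-latest'), prompt: 'Summarize this repo.' });

// Gateway-style routing (hypothetical shape): one provider, and switching
// models is just changing a string like 'anthropic/claude-sonnet-4'.
// const gateway = createGateway();
// const viaGateway = await generateText({ model: gateway('anthropic/claude-sonnet-4'), prompt: 'Summarize this repo.' });
```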

6:30

yeah all right let's do the next one i don't know if i shared um audio properly actually but maybe we'll find out linear has a new agent and also mcp i was thinking we could watch this video let's see check on twitter that it's uh i guess we could try this is the first time we're doing this without shane so like we honestly don't know what the

6:56

hell we're doing either uh he usually does all this stuff for us um let's try playing it and just kind of like see what happens nope we also don't know what the hell we're doing either he usually does all this stuff for us let's try playing it it's us watching us no no let me uh that's funny we're literally watching

7:27

us watching us okay um let me stop sharing stop sharing no let me it's funny we're literally watching us is watching us all right turn that off um i need to share audio as well and then here okay but then what a tease was that what a tease um so linear agents is such a crazy idea in a good way right the fact that maybe you create an issue and then you just like kind of like have a really good

8:53

spec in the issue and then you hit run that's crazy yeah and then you have a workflow where the issue is well written where just the v0 or whatever agent can just take care of it and even do like a pull request or what if you like put like a like excalidraw mockup in there and then you give it to your

9:13

coding design front-end agent that just executes that right um this is very timely and if you're like talking about like this space that we're in you know microsoft build was like last week which also i want to tell everyone sam met satya and gave him our book which is dope by the way that is so cool i wish i was in freaking japan of all places couldn't even meet satya but

9:39

and i'm talking about the dude from microsoft satya nadella and that's just dope um so i hope you read the book um delay in the voice uhhuh sorry about that guess we have some delay i know how to fix that we don't know how to fix it either we're going to keep going because we don't know how to do anything

10:08

um so sam met satya at the microsoft build but at microsoft build they um unveiled that copilot is now open source right and so now you can essentially leverage that but you can also leverage these coding agents within github to then go and do work for you i think this is a trend that's going to start happening right like we're also checking

10:34

the audio too that's fine i guess fine i guess can you try disabling video voice what is that is it on yours i don't know i don't have any i'm like muted right so i'm like we're using this dude that's why shane is the goat because he would know how to do this um i don't know well we'll keep going we'll try i don't know so that's um so a lot of

11:24

these uh these task processors like linear github issues you name it will all start just having these coding agents and like i guess so yeah what about all the slop that will get created that's part of the game i guess but eventually the good or the the better um agents or models come out of it i guess yeah somewhat better oh nice and paul from our team is

11:57

here or he just thanks paul uh for for chiming in there i guess it's better maybe we should just talk directly to this thing yeah maybe or maybe there's interweb connectivity problems um so yeah we'll move on to the next topic but uh i don't know it's like if ai is going to take our jobs this seems like the first freaking step right if you're like it's in your task manager

12:21

but it's nice though because you just prompt your issue or your user story and then it does everything what a time to be alive what a time to be alive hey paul you want to come on real quick and just hang i'm going to send you the link so while we wait for paul to just crash the this live stream let's move on to the next um news topic that we have

12:49

which is and jack i think you're going to like this one uh where is it oops right here not that it's in a coffee shop oh hey let's come anyway okay so here we go i'm going to share this tweet and then we can also look at the announcement so as every all know if you didn't know already claude 4 is out which is also very interesting because

13:32

there have been a lot of releases by big companies in the last couple days you have different coding agents uh codex came out then you have jules and you know v0's model openai and jony ive and all this crazy um like it's pretty like just the amount of momentum right now is crazy i will say claude is my favorite model i think

14:01

it's because i'm an engineer and i'm usually doing engineering tasks that i just vibe with claude more often um so when i started using 3.5 sonnet it was honestly like this like revolution right like it was like crazy but then 3.7 in the beginning kind of let me down right and in windsurf today i still use 3.5 for a lot of things and what about you i do the same because

14:32

3.7 always changes too much so if you prompt it you have to say do not touch a b c and d just do this and with 3.5 you don't really have to but i do say the code quality of 3.7 is better but just the hassle of like oh no stop stop stop

14:50

is just uh not worth it yeah like if stop is your best friend then is it really doing the intelligence that we're thinking of you know 3.7 and windsurf had a reasoning mode and then like the execution mode or whatever i never use reasoning mode because it just goes off the rails all the time you know um and i've used like gemini too but somehow i

15:13

always go back to claude yeah also between the like windsurf and cursor like the execution of an like the claude is interestingly different like i'm a windsurf purist just cuz i'm just too lazy to switch between them so like when like when one doesn't work like for example if windsurf in the 3.7 is not working i

15:32

do something really crazy i actually code like myself holy crap dude no i know like a couple of weeks ago i posted in in slack like hey my cursor doesn't want to talk to the lm anymore i was like oh what i do i do now it's like it was really hard to just have no auto completion yeah like okay but then i just rebooted cursor and it was all good

15:57

again yeah so i started using uh i played around with claude 4 and i think it has the smart intern problem still um in like an actual use case like these benchmarks are cool and we'll go to each like thing that they're in the sub tweet storm i mean but just like initial reaction is like it's even smarter intern now and that's maybe not a good

16:22

thing for some people when you want to have a lot of control you know um but it like it still makes errors it's still like i just don't like when things like overreach you know like i'm in control yet you're doing all this stuff changing all the files and i hate that um also the chat's blowing up we'll we'll

16:40

address those in a second um but yeah let's keep going here so claude opus 4 and sonnet 4 are hybrid models offering two modes near instant responses and extended thinking for deeper reasoning both can um essentially alternate between reasoning and tool use like web search to improve responses this is kind of the dream like

17:04

whatever they're saying right here is the dream right um does it actually work like that i don't really know um but in a benchmark in the benchmark for sure you know um they have the swe-bench right this is like the standard benchmark that people do and you know benchmarks are interesting uh because you always win right when you publish yours um so you can see that

17:32

opus 4 and sonnet 4 are just killing it according to their benchmark so take that for whatever it is and then lastly here they say sonnet 4 is a significant upgrade to sonnet uh 3.7 it delivers superior coding and reasoning all while offering greater control over how eagerly it implements changes which is though i don't like this

18:02

because that's not true and anyone like in the chat who's also using this knows that this is not necessarily true but marketing is marketing so we should try to track every word they say for every release it's always something similar like it's better it's it's newer like it's always those hype words we are going to be coding something in a little bit so we will be using uh sonnet 4 um

18:29

we'll you got we'll all see firsthand right like what's what this shit's all about um cool and then i think the questions maybe yeah let's get into the chat like uh they asked you that you use windsurf over cursor yes i do i'm a windsurfer what about you are you both i yeah i am both like most of the time i do cursor just for the daily tasks and coding but

18:54

then when i know i have to integrate like a new test suite or i know i have to um like plan something out i do think windsurf is better for that so i just use like if it's brand new or i want to get like something out as a blueprint or something i switch to windsurf it just does everything pretty good because it most

19:15

of the time just tells me i'm going to do this out of the box so i don't have to put it in my prompt and then you just say yes and then it does the thing and also i find that um at least for me maybe it's my settings or something windsurf tries to create like a node script or a bash script and then i can reread it execute it change it myself

19:35

maybe or ask it to change where cursor just does a thing and you have to prompt it to change stuff so i like windsurf over cursor in those things too let's ask uh our friend paul what he thinks too paul what up dude hey how's it going uh we can hear you i think it's now we can hear you all good which one do you use cursor or

20:10

windsurf i need to add a prerequisite before answering because predominantly my work isn't uh coding these days it's more writing i actually kick it old school with chatgpt in the browser oh i mean i have had some success with various things using uh claude in the browser but i haven't really needed to switch over to a full-on ide because most

20:36

of my work is written yeah that's true um also introduce yourself uh to the the chat or to everyone here all 71 of you uh hi hello everyone i'm paul from uh the mastra team uh new started last week i think i'm on day seven now predominantly going to be focusing on the docs uh you'll see me around discord so if there are any issues or you want to contribute um you

21:03

can at me rather than uh rather than shane i think he was quite keen to point that out uh and i'll also be floating around on twitter as well where were you uh coming from paul and also how do we know you the backstory is uh gatsby i was gatsby's devrel 2021 to 2023 i think and i uh moved on did some work in and around databases i was

21:31

recently just at neon uh before they got acquired by databricks and i wanted to be back with the crew so when an opportunity came up we took our chance and here i am that's awesome dude um let's keep answering some questions and then we'll let you go paul um so all right hey guys what's this live

21:56

about i'm new here oh that's actually a really good point because shane's really good at this too that i'm not this is ai agents hour you're supposed to like keep you just keep you have to keep telling the the audience what we've been talking about so yeah this is ai agents hour uh we do this every day uh this is the eu

22:15

stream there's going to be a i think there's going to be a us stream i'm not sure uh but we're concerned about our stream today uh we usually talk ai news you meet some of our employees here at mastra we build stuff um and so what we've talked about today so far the v0 model came out ai gateway uh claude 4 and uh now we're talking with paul so where are you from paul

22:40

the uk or more specifically london uh but uh i'm currently in the countryside on um actually better not give my exact location away but i'm on a farm and there's deer around it's really nice and it's a beautiful sunny day nice let's dox this guy i'm just kidding which coffee shop all 80 of you let's dox i'm kidding just don't do that

23:10

um dude it's it's a pleasure having you here paul um thanks for joining and gorilla coming into the live stream we'll see you around see you bye see you dude bye all right we have some other mastras that are going to be joining us uh we're going to get to them in one second i think we have one more news article we have to talk about and that

23:34

is the big kahuna that i don't use let me one second and let's share this so cursor new tab model 1 million plus context window background agents like this thing is spicing up like like from all angles right now you got the IDEs making moves you know windsurf had that coding model like last week now

24:05

like there's async agents and background agents so let's just watch this cool here we go take it away hello we just released a new version of cursor we think it's a big step forward in our mission to build the best place to code with ai and we're excited to show you some of the improvements that we're shipping first we've trained a new tab model that's excellent at tab tab tab

24:31

sequences in addition to suggesting multi-line edits to the file that you're on this tab model can now suggest changes across multiple files it will also be faster at all jumps even in the same file which will let you tap tab tab through changes now max mode is now available for all models and cursor max

24:49

mode is our option for letting you turn on maximum context windows and unlimited tool calls at api pricing and so normal mode is still the recommended way to use agent but if you're in the midst of something that requires tons and tons of context you can just easily switch to max mode to tackle that inline edit or command k also got a big refresh and

25:07

cursor it is both interoperable with the agent now and you can very quickly edit an entire file with command k instead of just using it to edit sections of files and there are a bunch of other quality of life improvements you can now create multiroot workspaces to make multiple code bases available to cursor you can duplicate chats to export different

25:25

parts of a conversation you can tag folders to include entire code bases in context instead of individual files and lastly we've been spending the past few months experimenting with coding agents that run in the background and execute tasks in parallel in particular people here have found the background agent especially useful for making small

25:44

changes to cursor like small debugging fixes styling changes for answering very deep questions about the codebase that take a long time and a lot of research to answer and also for scaffolding out medium-sized prs and in fact we have integrations coming soon where you can have a background agent running on every

26:01

issue in your issue tracker we're excited to get this release out to you and i want to thank you for the feedback that you've given us online a lot of these changes are driven things that you are asking us for and we're excited to get these all out to you and we're excited for the next set of updates coming that was sick

26:20

i don't use cursor but again like background task linear came out now they're talking about background tasks all the issues from your issue tracker they all talk to each other maybe it's the same company yeah or maybe eventually everyone's going to be on the same company i don't know um all right let's answer some more messages we got um from

26:45

jack um it's a hiring process one let me could you share your hiring process not looking but learning that's a great question jack like we didn't really go outside of our networks um pretty much all of us are pre-gatsby or post-gatsby ex gatsby js people and then as well as we took some people from netlify but just like two uh

27:17

people we vibed with while we were there um so we like some of our yc uh batchmates uh are hiring their first engineers or founding engineers now but we hired ward and tony who i will now bring in tony what up tony um we hired these founding engineers literally the first day it almost felt like of the company we knew like what we were going

27:43

to do um so welcome tony welcome to the show thank you thank you and we have another mastra for you here he comes to us from africa taofeeq what up taofeeq so one thing you guys need to know about taofeeq all 90 of you uh taofeeq has the best haircuts and he always like sometimes like he has the best style dude like look at him swag right straight swag

28:17

and we always make him embarrassed during like our uh company uh all hands and standups cuz like he'll like get a fresh haircut and we're like whoa this guy's a celebrity dude welcome to so kind of wanted to talk to you guys about like you know what's the like how did you guys get to mastra what are you guys working on let's start with you taofeeq

28:44

um so yeah before mastra there was like the crm that the team tried kepler crm and so i joined the team then doing that we built the crm and but eventually we had to like um move on to mastra which is what we build now and it has been amazing so far right in mastra i've worked on different things i've been working on workflows a lot recently

29:08

especially the vnext part that tony built um but most of my work is around playground and cloud really so like um showing the team being able to play with your workflows and agents and tools on the playground and also the cloud being able to like build your project and play with it with your workflows and same things that you can do on the playground then you can then you can do things too

29:33

things like that that mainly what my focus has been playground cloud and workflow nice and where do you live um lagos nigeria in africa lagos nigeria how hot is it in africa right now uh it's kind of hot right now to be honest it's kind of hot i think um last time i checked i checked the temperature a few days ago i've not checked recently

29:59

but a few days ago in lagos here it was close to 30° i think so which is that's 30°c which is which is not cool not cool but luckily for me i'm indoors all day it's my home so i don't feel it all right tony do you tell the tell the audience who you are and uh what you're doing here cool yeah so i'm tony nice to

30:24

meet you everyone um so i got to mastra because abhi and i used to work together at gatsby um and we've worked together for many many years with ward together as well many many years in the past um i'm a founding engineer here um and worked on just a number of things you know especially in the early days um small things here and there yeah worked on

30:48

like the tracing aspects um worked on the the new workflows implementation the vnext workflows um also like the cloud platform most of the infrastructure and like the platform api side of things tony's a beast by the way if anyone doesn't know that's why i made a joke that he worked on small things but he has actually worked on the biggest the

31:10

biggest things that we have and these two are kind of like a little mini squad right they kind of rolled um together so it was dope to see they have a demo for us so i was hoping they can share with everyone else uh so i'll let you take it away tony sure let me see about getting screen share going so does everyone see my cursor window yeah but let's bump up uh many clicks so

31:43

like i would do like four more four more how would that maybe two more two more oh no i'm not going to be able to see anything myself but uh okay minus minus one minus one okay good that's good good all right so um i'm going to show you guys like maybe some of you are already familiar with the agent network experimental

32:07

feature that's been on for a little while um just to kind of give a quick overview of what it is uh the agent network is a way to have like one kind of a overarching routing agent that in the current implementation has a group of other agents that it can use as tool calls and then based on the task you

32:26

give it it figures out what agents are specialized in which kind of tasks then it can chain multiple agents together to give you like a final um final stream of result back um so what we we've been working on um these past couple days and will continue work next week and the week after is a vnext version of the agent network which is actually built on

32:50

vnext workflows um so the whole idea of this routing routing task to agents um that are more specialized in specific tasks is now actually kind of um the result of let me see if i can find that loop somewhere in here uh there we go it's actually a result of like um this one kind of overarching workflow where we first call a routing a routing agent to figure out which um kind of a resource we should run um

33:24

mainly agents but you can also run other things like workflows and we're going to add support for like uh direct tool calls and stuff like that as well so it's just going to see like what input it actually got figure out what is the best course of action run that action take that output and then it's going to

33:40

loop back into a running step figure out if there's like another follow-up action needs to be taken or if the task appears to be complete it will exit the workflow um so that's the idea here i have a little example where we have two agents one which is used to do research but it does like this very kind of brief

34:00

bullet point based um um yeah like research outlines and then we have another agent that's actually just like a text synthesizing agent that writes articles and then if you put those together is an agent network uh you supply memory because that's what we need for the routing agent to figure out which steps have been run before what the reasons were for selecting those agents etc and then you pass in this

34:25

network of agents that it has at its disposal and then running through this so the prompt that we're using is what are the biggest cities in france and what are they like it first calls agent one which is this research agent um the reason is that it's suitable for gathering concise research data gives

34:48

gives back a list of items from the l1 call here um task requires a full report so it then figures out that it needs to call agent two um which can write full reports based on based on this research information calls that agent two and here's our final result for the agent workflow so the next steps here what we're going to start exploring is if you have some more complex actions

35:18

um where you want to like inject a level of determinism into some of these these scenarios that might happen would be to start putting in workflows here instead of just agents so now you're in control of like all these different uh like you could force certain tool calls to happen or you could do like more complex

35:38

background processing tasks as like triggered from an external system for example um and then the agent network would then figure out like which parts of the prompt actually fit into the input of the workflows and then it would just call that and you could stream back um any tool calls that happen in that workflow and that kind of stuff
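For readers following along, here's a minimal sketch of the loop Tony is describing: a routing step picks the most suitable resource (an agent or a workflow), runs it, records the result, and repeats until the task looks complete. The type names and the JSON hand-off are hypothetical stand-ins, not Mastra's vNext API.

```ts
// Minimal sketch of the routing loop, with made-up names.
type Resource = {
  id: string;
  description: string;
  run: (input: string) => Promise<string>;
};

type RoutingDecision = { resourceId?: string; reason?: string; complete: boolean };

async function runNetwork(
  task: string,
  routingAgent: (state: string) => Promise<RoutingDecision>,
  resources: Resource[],
  maxIterations = 5,
) {
  const history: string[] = []; // stands in for the memory the routing agent reads
  for (let i = 0; i < maxIterations; i++) {
    // ask the routing agent which resource (if any) should run next
    const decision = await routingAgent(
      JSON.stringify({ task, history, options: resources.map(r => ({ id: r.id, description: r.description })) }),
    );
    if (decision.complete || !decision.resourceId) break; // task appears done, exit the workflow
    const chosen = resources.find(r => r.id === decision.resourceId);
    if (!chosen) break;
    // run the chosen agent/workflow and loop its output back in
    const output = await chosen.run(JSON.stringify({ task, history }));
    history.push(`${chosen.id} (${decision.reason ?? 'no reason given'}): ${output}`);
  }
  return history;
}
```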

35:59

dude this is dope it's kind of like the the agi dream right is to just let something go on a loop and just figure out for you you know but what like what do you think are the different like use case like what are like the what's the problem space here for uh this type of network uh network work you know so the way we see it there's kind of two primary use cases

36:27

um the first and the simpler one being um you just have some kind of a an unstructured input coming from an external source or from a user and you have all these different primitives like workflows with specific input schemas that have a specific task that they can accomplish or maybe you have specific

36:45

agent with their own tool sets and their own specialization you just need to figure out which one of those to call so that's going to be a very common common use case so what it would do in that case just figure out which primitive is the best one how to map all the inputs call that and then call it a day essentially and those would be either

37:03

streamable or maybe they could be like background tasks um we could imagine things like uh setting up um alerts on your infrastructure and then you need to figure out if you maybe need to like scale up your discs automatically or maybe you need to send specific kind of alerts or maybe you need to escalate your duty or that kind of stuff this way

37:22

you can also inject things like an llm in the process of figuring out how like uh like how bad an error is that comes from infrastructure and then you could choose the level um in the attribut that you want to set for it in terms of like how critical it is yeah like uh we had um a gentleman on a couple days ago from agentpf or agentperf.io

37:47

and he showed uh his name is venicious so shout out venicious um and he was showing his like he has like his own concept of like an agent network where the routing agent right um is responsible for then picking a specific workflow although this is also built in mra by the way so like the routing agent selects the workflow and then executes

38:11

the workflow which was super interesting because uh we were like oh hey why don't you just use agent network experimental which we published long ago go and the interesting he thing he said was like he doesn't necessarily want things to go in loop forever he wants the agent to use reasoning to pick pick the path execute

38:34

it and then that's it like if if there if there's more needed then the user should prompt back right to do it again so like are we going to support something like that tony yeah for sure that's exactly the the kind of more simple like rung to one action kind of case um that we just went through the the other case is really this more um i

38:57

don't know if i would call it infinite loop because fortunately most of the loops do finish cuz you do end up eventually reaching whatever your attach criteria are but it's more this this like longish task where you need to accomplish some bigger thing and you really need to put a lot of primitives to work uh both in parallel or sequentially to like get that done i but

39:16

i think that will be um i think it will be like a less frequent use case than the one you just described where you really need to just figure out okay this is the thing i need to call based on this input um because a lot of the inputs that that you are going to be getting es especially from all these external systems are just unstructured

39:34

uh whether it's from users whether it's from just a bunch of logs from your back end or something like you need to take that you need to figure out how do i actually put that into my workflow that can trigger a call to pagerduty or that can um i don't know post a slack message or something that hey something weird's happening um so i think yeah that's

39:54

going to be a very um a very very cool feature for that yeah we have a question from jack also jack you're like the goat today on the questions so keep them up thank you so much also hello to the 95 people watching this is ai agents hour uh we talked about the news we met paul and the mastras here tony and taofeeq we got a demo i'm not really good at this uh we got a demo for agent network and

40:23

now we're going to answer a question about it so um the question is could you summarize what like the agent network demo unlocks um what would you say tony like what's the biggest unlock here for the user i think there's a couple um one is you can just put together the these different primitives in this demo agents that have their own specific instruction

40:48

set um one is good at research one is good at writing stuff um and you can just give it like an infinite set of tasks to accomplish that you cannot accomplish with these two agents and the agent network will figure out on its own which ones to run in which order it will also figure out when you've accomplished

41:07

a task it could be that you need to call some number of these agents like three four five times before you actually complete it maybe there's a certain amount of review that needs to be done to like get a blog post into a specific shape or it could be like a fact-checking agent as well so then it needs to know like how long to keep fact-checking

41:26

um a lot of these things are very difficult to do with a workflow yourself because like there's so many like if conditions you need to put in there uh and then call llms yourself to evaluate if something still needs to get looped or whatever it just takes a lot of the pain out of this like task planning um

41:42

but just letting the agent figure out for you what actually needs to be done for sure and honestly like doing these loops is expensive right because you know you're calling an llm to evaluate you're doing like mini evals on every turn y um but if you you got the money who am i to stop you from doing it

42:09

you know and there's of course a lot of things that we can do so tony we were in japan yeah i was going to say um there's of course a lot of things that we can and will do to make sure that the cost is not going to totally explode out of hand like in terms of the memory implementations that we use internally how to summarize the previous steps and all that kind of stuff so you don't like

42:28

keep passing the entire context through like longer loop iterations that kind of stuff but yeah i mean it's going to be expensive um no matter what because you're you'll be doing like loads of llm calls and then an llm call to decide what to do next so it's really meant for these more complex and bigger tasks that you want to like accomplish fully

42:46

autonomously yeah for sure that's where guardrails will come in that's where guard rails little foreshadowing last question tony and uh we'll let you both you guys go so we were in japan last week uh i think we i guess recovered from jet lag now what was like a highlight that you got from the trip uh from like all the people

43:12

that we met um well it was a very cool experience because i feel like there was just so much excitement around like typescript specifically for for building ai native applications um and you know people were super excited about that that we came there and and talked about mastra with them um and yeah it just feels

43:33

feels nice to be in an environment where everyone's like trying the same things and and everyone's like aligned on how typescript ships and python trains so yeah i would say japanese users are way more um ai pilled than the united states because while we argue on twitter about what the right thing to do is they just ship stuff uh so i'm gonna want to learn from

44:00

them just like ship and who gives a you know um yeah but with that thank you both you guys are amazing keep up the great work we'll see you after this essentially um but thanks for joining us on the stream later peace yeah see you dudes all right so now it's just us again cool so we're going to code in a second here but for those watching us almost a hundred of

44:30

you let me do a little recap again i got to get used to this recap um we have been talking about uh ai news claude 4 linear agents um github agents async agents background agents so that was interesting yes um we met the homies paul was at a coffee shop i guess that we don't want to dox him at all and then

44:58

tony and taofeeq came and talked about agent networks um how and then we got a question here which is kind of interesting how can i build multi-agent collaborative systems with mastra any repo to look at we actually just talked about this uh before maybe you joined or something which is totally fine uh you could rewind and stuff we're working on this right now we're calling it agent network we have an experimental version

45:24

that is like v negative one uh you can look works pretty good already it works decent you know it's not like i i don't like it because i don't have the control i want but uh it does work and many people do use it they they do want more features on it which is why you can look at the mastra open source uh look at our docs you can find agent network and you

45:48

can start playing with the current version uh today so go and do that cool um let's see let's see if there's any other questions in the chat that we want to answer um let's see oh i think we're all good there so let's move on now to coding hour and then we'll like bring this thing home um yeah shane is way better than

46:21

but uh we're going to we're going to build something so a couple weeks ago um oh here there's one question that we can do i have mastra agents which i want to connect with vapi any tutorials or suggestions on how to connect it does mastra support vapi or voice apps it's a good question like vapi also has all the voice capabilities but also is

46:49

trying to be an agent framework a voice agent framework um we're not necessarily compatible with them because we are an agent framework ourselves um so we would need to build an integration it doesn't mean we're not going to or don't want to um but we have better relationships right now with the people over at livekit um and there has been some exploration over there mastra itself is

47:15

trying to you know we're not a voice framework but we have voice capabilities and you know we it's almost like a side quest for us right now we chip away at it a little bit a little bit but if you're truly trying to do a voice app and it's like you need to ship something and you got to go and do this why don't you you should just use vapi like

47:34

itself right or any of these other frameworks um but yeah uh so hopefully that answers your question it's not a good one but it's just the truth so i think what we have in mastra now is just you can speak to the llm and it can speak back that's the main thing there's nothing much else going on so yeah we support like openai

47:56

real time uh speech to speech and that's pretty much it and then they can do like the oneway things so like there's high latency right in this in this world here uh there's no built-in like voice activity detection and all that which all these voice frameworks give you right um and i don't know never say never on us in voice but uh yeah

48:20

so yeah if you need to ship something to production and you're like oh should i use mastra and vapi or whatever uh i would just use vapi because that's built as a voice agent framework or something i haven't heard good things about the agent part of vapi but everything else i've heard really good things um and they're also a yc company their partner in

48:42

yc is also ours like gustav so i feel some type of kinship to them maybe but not really at all i don't know them at all so um okay uh thanks for that great question oh and thank oh i didn't mean to click this but thanks jack i really didn't mean to you didn't mean to i didn't mean to but uh yeah okay also

49:02

we're gonna blur it whoa bring us back yes i'm back we were back okay uh i don't know well okay yes we're back all right so thanks for that um all right we're going to code now uh we're going to build some stuff i will share my screen um and we are going to vibe code as well so we don't know where this is going to go but it's going somewhere

49:36

share your screen okay i'm just getting ready for this with my 30,000 tabs open and command w right now you can't see it but i'm furiously command winging this one time when i was like in elementary school actually i think it was like middle school and this guy was like "you should just command w out of your life." and i was like

50:03

"dang qq." remember everyone remember qq if you're playing like counterstrike you just people command w your whole life that's that's dude kids are crazy um okay cool just hide from all responsibilities yeah all right so ward why don't we kind of like set the stage for the audience while i uh get things going here so we're going to build guard rails so um

50:32

there's basically three concepts if you talk about like guardrails you maybe want to do like input processing so you're doing you're getting an input from your user it might not be the best prompt that they did so you could do like prompt optimization with like calling like a cheaper or smaller model

50:51

or you can use a big one but mostly you just want to make it like quick and fast where it just enhances the prompt before you put it to the llm like now it's like kind of difficult with mastra like you have to do a workflow or you have to call like an agent and another agent so we want to make that smoother so we have

51:11

like input processing then you have like real guardrails where you can um like tripwire like basically close the connection with the llm so you boot it up so you call the llm and then you call another thing which is the tripwire basically it's just a function or another model that runs and whenever it like tries to do whatever the prompt um

51:33

is supposed to be and then if you say like oh it's not what i'm expecting like maybe it's not answering the question that actually got asked or something you could say like stop the connection to the um llm itself the agent and basically then say like okay no just quit and you just cut the wire so you're not like paying for all the tokens and uh um

51:56

you just cut it loose so you can get like a new message from um um from the user so if you go back to like sonnet 3.7 where basically it goes off the rail you click on stop um it's like a manual trip wire basically because you want to cancel or if you could do it programmatically in in whatever way um

52:15

so we're gonna um build that part of it and then of course when um output processing you get something back from the llm but maybe you want to format it in some kind of object or just want to enhance that too or like condense it you you should be able to do it as well or maybe like remove things out of it like secrets or something so maybe the llm you give it secrets or um

52:43

it gave you some kind of weird secrets back maybe you don't want it so you can like mask it out or something we have something similar for memory where we mask where you can like write memory processors like input and output it's basically similar but we're just going to open it up on the agent level so you can um do it on the prompt level basically
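Before the live coding starts, here's a tiny sketch of the three ideas Ward just described: input processing, a tripwire, and output processing. The types and function names are made up for illustration; they're not the API that gets built later in the stream.

```ts
type Message = { role: 'user' | 'assistant' | 'system'; content: string };

// input processor: enhance the user's prompt before it ever reaches the LLM
const enhancePrompt = async (messages: Message[]): Promise<Message[]> =>
  messages.map(m =>
    m.role === 'user' ? { ...m, content: `Rewrite this as a clear, specific question, then answer it: ${m.content}` } : m,
  );

// tripwire: watch the (partial) response and cut the connection if it goes off the rails
const tripwire = (partialResponse: string, abort: () => void) => {
  if (partialResponse.includes('DROP TABLE')) abort(); // toy check; a real one might call a small, cheap model
};

// output processor: mask anything that looks like a secret before returning the response
const maskSecrets = async (messages: Message[]): Promise<Message[]> =>
  messages.map(m => ({ ...m, content: m.content.replace(/sk-[A-Za-z0-9]{8,}/g, '[redacted]') }));
```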

53:04

good uh explanation dude i think for the sake of our remaining time what we'll do is input processing and output processing and we'll probably not worry about streaming right now yeah um and then maybe we can try to do tripwiring we'll see um before we begin there's one last thing uh let me i'll just read this question out loud

53:29

also if you're just joining us i don't know i think uh host yeah um one second i got to do the host thing all right there's this one question and then we'll get going thank you to all 111 of you in the live stream right now uh if you don't know what we're doing man this is you always have to do this um sir but shane does it shane does it all the time naturally

53:57

um if you're just joining us now this is ai agents hour hosted by mastra we're doing the european edition uh because we're in the eu right now belgium to be exact uh we talked about ai news claude 4 this whole like is ai going to take our jobs because linear github background agents there are all these agents

54:17

trying to do our work for us uh we talked the ai gateway the v0 model damn you really have to do this the whole time yeah then we met the mastras um paul tony and taofeeq we talked about agent networks multi-agent collaboration and now we're going to build some stuff what a mouthful every time okay anyway sorry um let's answer this question from suj

54:41

is there only one way to interact with python tools through mastra using mcp what would you recommend to develop mcp with python okay this is actually a really interesting question um because that whole point of mcp like this whole dream of mcp right you write your mcp servers in whatever language you want and then mcp clients can then connect to them and then use them um i don't write like none

55:08

of us here will ever advocate for anything python so you ask like the wrong question there it's not that we're being tongue-in-cheek it's just we're not python devs so we cannot give you a good recommendation but mo like mcp python sdk does exist so you could write an mcp server that way um so good luck

55:26

to good good luck to you on that one but if you did have a python mcp server mastra can connect to it with our mcp client um i think a lot of people think that you have to stay language to language on these things but you don't have to um the only problem is if you're trying to ask two javascript guys about

55:44

python you're just going to be out of luck you know um here's another one please also make some videos and tutorials for exposing mastra apps as mcp servers which can be consumed in IDEs like your doc server this is a really really good question because we literally uh shipped or are shipping maybe alpha today uh daniel is going to ship this which

56:08

is you can expose mastra agents as tools in the mcp server so then there you go also if you haven't caught up on like mastra features uh this week we or i think it was last week you could create mcp servers with mastra attach them to the mastra instance mastra itself turns into a hono server so you have now streamable http endpoints for your mcp servers so

56:35

technically for putting all this together uh your question here yes you can make mastra apps into mcp which is very meta and you can use it in like IDEs and stuff we will share videos and tutorials i will give you that um but yeah you can do this and it's very exciting thank you for the questions by the way um all 120 of you okay now it's coding time so sorry we're not going to do any

57:01

more questions for a bit but if you have questions on the code and stuff let's keep it relevant yes cool all right all right all right ward so you're gonna so um cool so we pair program a lot at mastra just in general i'll be driving and then ward's like the conductor this like some classic software engineering team uh but yes that's what we'll be

57:33

doing um and what we're going to first do is the input processor right so any input will can get modified or something so let's try to do that so we should add that where to the agent class because agent is our base class for agent so it's the where you put the description like i mean the instruction the name so and this agent config right here yes and

57:56

then let's just add maybe above here um input process or yeah maybe do it se separately like input processors and then output processors we can started one and it'll be an array of something array of functions i guess we will type it after a bit but so we just do function array i think that's enough we can do like a this is peak typescript development right here dude honestly it would be

58:20

even more peak if i did this that's how you always talk right no types no types all right so now we have we're gonna we'll come back to it y'all but we have this in our config um now we need to then pipe it to yeah constructor so probably to like a property or something because we're going to use it in the generate and or in the stream so for now we're only going to care about generate

58:46

so we need to store it so we can um grab it so we just do like this dot hashtag um which is like a private yes so you have in typescript maybe that's uh something that people do not really know is like you have the priv private accessor or modifier so you can do private x is like a string but you also have like the hashtags and private

59:09

existed in typescript for a long time but it actually isn't really private so if you if you compile this to javascript you could still access it so it's just typescript compilation so you can do it in typescript like typical complain like hey it's not accessible but it's not truly private the hashtag is because it's a javascript primitive so even if

59:30

you do like hashtag something and we expose it no one will be able to access it unless you make like a getter function or something else yeah and this is not necessarily a new typescript language thing but it's like within the last year or two or something i think chrome like the the browsers actually

59:47

shipped it probably two years ago on all the main things but that's why we always lag behind i think node didn't have it for a while but and here's also another tangent because we made this mistake in mastra code as well um let's like this just this is also a stupid tangent but it's very typescript related so private x string like this was uh introduced as

1:00:08

like a stop gap for people before this we used to do this and this would mean that this doesn't mean actually like you can still do whatever you want but by doing underscore underscore you're letting everybody on your team know that this is private yeah like a convention like it's internal it can change at any time it's not a public api yep
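A quick illustration of the three flavors being discussed: the TypeScript `private` modifier (compile-time only), the JavaScript `#` private field (actually private at runtime), and the old underscore convention. The class and field names are just for demonstration.

```ts
class ExampleAgent {
  private tsPrivate = 'typescript-only';   // erased at compile time; still reachable from plain JS
  #inputProcessors: Function[] = [];       // real JS private field; inaccessible outside the class
  _internal = 'convention-only';           // underscore: just a signal to teammates, nothing enforced

  constructor(processors: Function[]) {
    this.#inputProcessors = processors;
  }

  get processorCount() {
    return this.#inputProcessors.length;   // only exposed through a getter
  }
}

const agent = new ExampleAgent([]);
// (agent as any).tsPrivate   -> works at runtime, TypeScript just complains
// agent.#inputProcessors     -> syntax error outside the class: truly private
console.log(agent.processorCount); // 0
```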

1:00:33

and then you know in react they were like underscore underscore yeah i don't know don't use this or you get fired that literally existed in the react codebase um for some reason i forget now but that was like a throwback i think it's the same that they need to like you have react and react-dom and they have to kind of sometimes talk to each other yeah the only way to really talk to each other is to expose it to the public but you don't

1:00:54

want people to use it but you might sometimes want to yeah like when you dangerouslySetInnerHTML you know um cool so now we have our input processors this is complaining you just set it you're not using it yet oh great so now we need to use it let's go to generate oops i have to say goodbye to the cleaning lady sorry folks be right back he'll be right

1:01:26

back it's cleaning day here okay also i don't know why i need to get back in focus okay cool all right i'll keep going while he's doing his thing so we have the input processor on the class now we need to use it um and we take these messages here um or like the messages that we can take from the agent

1:01:49

call agent.generate so what we need to put our input processing on the messages so already here we're doing some message modifications and stuff let's put our input processor that's actually a good point ward um if you can if you have multiple input processors let's say you have two is it going to be like a pipeline i would say so like first the first one first and

1:02:24

the second one um so if we did i guess it makes sense right if we did like this input processors for each yeah and then we do the processor and we'll figure that out but then each one is going to mutate this and then yeah so you always get like the the newer version yeah like that yes the order will matter but i think it does make sense mhm also these might be async right so you have to wait so then we

1:02:56

need to do like a for loop for loops easiest so four especially if you want it like sequential a for loop is so easy because you just do four con x yeah like this and vibes then you do a weight processor but you probably have to type the uh now it's any so can or wind surf can't figure it out i say we rock with any cuz live life dangerously there's another good

1:03:29

question what if so this back in the day with javascript right when you had a promise or a not a promise you would have to resolve the promise right because then otherwise it'd be like dot then yeah so what happens today like await is just really smart now yeah so basically await does the check for us so

1:03:49

before you had to do like if promise or if x.then you kind of know if it's a promise or you do like instanceof promise but mostly we did like x.then because you can create promise-like objects but that's a whole other tangent so you usually had to like if x.then exists then you do x.then and you give it the callback or something

1:04:12

else you just execute the call back um but it was like at least five lines of code or something and it's just a a hassle so many people just did promise.resolve open bracket and just put it so you always create a um like this yes so you always create like a promise the only downside with like this approach is basically it's always a next

1:04:36

tick so it's not it this always makes it async so for example if my processor wasn't async um await would just call it sequentially but with like promise.resolve you will always wait like zero mill like one millisecond or something um and await is smart enough to do that check that we did before with the callback like the if promise then check um it will do it for us so it's just sugar but it's good sugar
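Here's roughly what that sequential loop looks like, and why `await` removes the old `.then`-check dance. The `Processor` type is an assumption for illustration; the point is that each processor can be sync or async and the loop still runs them one at a time, in order.

```ts
// A processor can return the messages directly or a promise of them.
type Processor = (messages: string[]) => string[] | Promise<string[]>;

async function applyProcessors(messages: string[], processors: Processor[]) {
  for (const processor of processors) {
    // await unwraps promises and passes plain values through unchanged,
    // so no `instanceof Promise` / `x.then` checks are needed
    messages = await processor(messages);
  }
  return messages;
}

// The older pattern: normalize maybe-promises yourself, which always defers
// to the next microtask tick even for synchronous processors.
// Promise.resolve(processor(messages)).then(next => { /* ... */ });
```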

1:05:01

yeah it's good sugar okay so technically we have input processors now technically we're going to go make an example for them we haven't done anything with them and obviously they'll probably break if we do something stupid but hey we're going to at least try it out um yeah something let me just go and

1:05:21

build this so we're using a monorepo in mastra for better for worse um i mean i like it fine it's just kind of a a bee sometimes but we'll build and also that's not what i wanted to do i want to build core yeah 59 projects in here man and we should probably just do a build and while we do the build i will see if there's anything else we can

1:06:00

answer 134 people welcome i'm not going to do the repeat though i'm sorry like if you want to know what happened you got to look okay um i can do the repeat all right you you do the repeat so we did news basically we talked about um claude 4 uh the api gateway from vercel or ai gateway from vercel um talked about linear like

1:06:25

their teasers we talked about cursor their backend background agents so lots of agents and then we meet the mastras we met with paul who does docs we meet um tony and taofeeq who are working on agent networks where you basically can combine multiple agents together and now we are coding guard

1:06:48

rails and waiting for paint to dry uh i haven't built the whole mastra repo in a while so i kind of wanted to do it so we don't run into any unforeseen dependency issues or whatever and we use turborepo so basically you run everything like if you haven't run it for a while you have to run everything but um then

1:07:11

from there on it's like cached and it only does what it needs to do but sometimes it has to do everything if you touch like mastra core yeah like mastra core is like the core module a lot of things depend on it so basically everything depends on it because all the abstractions are there so um and then with the 0.10 release

1:07:31

we did on wednesday normally it should have gone out on the tuesday basically we removed so many dependencies from mastra core so basically now it's a cheap shell and then all the other packages basically just use it so mastra core has all the contracts that we want like memory has to look like this agents has to look like this and then you can

1:07:51

basically build your own um abstractions on top like different stores um different telemetry providers like all of it cool so we um added input processors to the agent class and we are doing we're calling them we're not actually using anything yet what i like to do in this case i have this example that's really stupid it's just like

1:08:17

um it's just like my scratch pad of just different agents that i've written based on um different features like or like bug fixing something yeah so like we just have a bunch of scratchpad examples because once you have the monor repo built you can link those packages to an example and then start coding with it

1:08:37

and building things so i'm going to build an agent that would require us to do some type of input or message enhancement right um so let's call it like a tutor agent right because i think a lot of times people ask dumb questions so what if we can take a dumb qu like you can just ask whatever question and the input

1:08:59

processor actually makes it into a very well-formed question um so tutor new agent tutor agent new agent um yeah sure we don't need any tools but we do need input processors also there we go it's an array of any but uh messages i guess that's what you get that's vibe code right there but we do get messages and then let's just return messages right now mhm just to see if it works yeah just to see if it works and

1:09:37

i'm also going to put a second one just a log or something just to see that it calls it um that's true i'm going to put a log for sure but also let's add a new message just for funsies um role user content sure i don't know why it always uses the france thing because we have it in our tests yeah
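roughly what's on screen at this point, sketched out (the inputProcessors option is the prototype we're building on this stream, not a published mastra api, and the model id is a placeholder):

```ts
import { Agent } from '@mastra/core/agent';
import { openai } from '@ai-sdk/openai';

// work-in-progress shape from this stream: each input processor gets the messages
// and returns (possibly modified) messages before the model is called
export const tutorAgent = new Agent({
  name: 'tutor-agent',
  instructions: 'You are a helpful tutor.',
  model: openai('gpt-4o-mini'),
  inputProcessors: [
    async (messages: any[]) => {
      console.log('input processor 1', messages);
      return messages;
    },
    async (messages: any[]) => {
      // add an extra user message just to prove the hook runs
      return [...messages, { role: 'user', content: 'what is the capital of france' }];
    },
  ],
} as any); // cast because this option doesn't exist in the published types yet
```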

1:10:08

so let's just console.log messages here which will be an array um i believe and then we can also just say s dude one of my friends back in the day he was like i think either a debugger or a console logger in the chat please tell me are you a debugger or a console logger um so yeah let us know if you

1:10:30

care um i'm a console logger i'm a console logger but from time to time i use the debugger uh depends on where i'm at so if i'm doing a lot of like rollup configuration or like build stuff i use the debugger more because you can look at all the arguments and stuff but if i just want to know about what's this variable i do console log so i start with console log and if i can't

1:10:55

figure it out i go to debugger i'm only console logger uh i feel like i'm like an intuitive console logger because like when you first start console logging you put logs at like the top right but you don't need to put logs like at the top you know that the pro the top of the function is probably fine you got to figure out where you're going to but

1:11:14

it's it's just like a debugger actually it's literally the same just just putting a log there i have a friend who used to put pizza right but the problem is he do pizza in all of his logs so then you don't know where the did the log come from so then what did he do pizza one pizza two pizza pizza and then

1:11:32

i was like "oh it's like lil caesars which i don't think exists in europe though but at least not here." yeah pizza pizza um anyway so i use sud dude and i do s dude one s dude two and we also there's no such thing as dumb questions just react developers that is hilarious um and that is from rude people die

1:11:59

young that's pretty good i guess your time is coming your time is coming dude um and then he also said or they i don't know um log more in front end and debug more in back end so that's cool yeah remember like back you'd have to if you wanted to debug node.js you had to do the inspector and do all that bridge stuff with dev tools or something yeah there's a

1:12:22

cool product y'all should try it's called subtrace subtrace so you can go to subtrace.io i believe and it's kind of like a node debugger i think they like i know the guys who are the founders there but it's like almost like wireshark for docker containers essentially but they are running like node processes check it out if you're into this whole console log

1:12:46

versus debugging kind of situation let's continue also thank you for those comments um all right so now we have a s dude here let's put a s dude too just because it's hilarious and maybe we should also mutate this somehow reverse it uh but we al only going to send one reverse the text maybe oh like within

1:13:08

the content or something if you know messages like type user and you just but what's yeah we'll see so okay now we have these mhm so we could write it like a script but i think we could just put this in playground and then inspect the logs right so i'll do that especially if you put debug on then we should

1:13:28

definitely see this as oh yeah i was mentioning you can pass mcp servers right so here's an example of that happening who asked that was that um a yeah a so yeah hey a there you go or was that the question yeah yeah for sure um okay let's get rid of some of this stuff because it's not relevant right now but then let's add our tutor

1:13:56

agent cool tabbing to victory tab to victory i need to npm i --ignore-workspace this is so we can do the links yeah i can see it already on screen that it says like dot slash dot package loggers so it basically took it from the monorepo instead of from npm then i can do now run dev or mastra dev mastra oh yeah dev okay cool oh vnext do i have a vnext

1:14:34

workflow in here i probably do nice we're like fixing our own uh release sick okay that should okay we're good we're kicking it now oh snap storage is not initialized i don't have storage do i oh i don't have storage i should have updated this example earlier uh we need to add storage right so i don't have storage you can import from mastra libsql or

1:15:07

something or remove some agents because i'm guessing some agent is using storage or something i mean let's see okay yeah because it's using memory and stuff yeah um if you all haven't seen the playground in a while marvin has made it keeps making it sick we also have greg leinsky if anyone knows greg

1:15:32

leinsky from poland he's also ex gatsby him and marvin are cooking um these europeans man they're just cooking on the front end uh but it looks dope um you can see memory is off now and also memory is not enabled so and we also don't have any tools or workflows on this thing which is chill we do have our

1:15:54

prompt enhancer here whatever so i'm just gonna say like we haven't done anything yet but like what is 2 plus two if i look at my terminal i should see my sude somewhere but i don't that's because we're streaming right like in the playground but let's use generate now what is 3 + 3 the capital of what oh cuz we added uh oh hey it's working dude

1:16:37

that's hilarious okay so now we have our s dudes what is 3 plus three what is two and then we added y roll content and then the the last thing this is dope okay be careful when you change your prompt yeah okay so then there's so many things you can do now right now you have this control let's actually do something interesting and i think we're going to have to do

1:17:04

something um even more interesting which is we don't just want messages in here we also want mastra itself because i might want to do an agent call i might want to do a workflow i could do whatever the i want in there so we'll let's make that happen and i'm just pretending this works right now and runtime context oh we need

1:17:27

runtime context too that's a great point we need all also i'm going to get rid of the second one because i don't really care about it right now um okay so then what if we had an agent that is the prompt or the input enhancer agent right so let's make one of those um it also could just be an llm call too but i'm just going to i don't care so

1:17:54

um question enhancer let's see if i can tab my way to victory okay cool and so there are two options right i could just use this agent in here right like and i just call it directly like it just autocompleted for me right i could do this the problem with doing this in my opinion is when you register agents in mastra like the agent itself is registered you get all the tracing and you get all the

1:18:33

observability so um that's why we would rather like retrieve this from the mastra instance so i could essentially do my question enhancer is mastra.getAgent and the agent will be like question enhancer now i have that same question enhancer i'm going to call this agent and then i'm going to call it here and this is getting sure i mean whatever
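a sketch of that idea, assuming the processor also receives the mastra instance and the runtime context (that's the change being made to core right here); 'questionEnhancer' is just whatever name the agent is registered under:

```ts
// no types on purpose, we're flying blind just like in the stream
const questionEnhancerProcessor = async ({ messages, mastra, runtimeContext }: any) => {
  // pull the agent off the mastra instance instead of importing it directly,
  // so the call shows up in mastra's tracing/observability
  const enhancer = mastra.getAgent('questionEnhancer');
  const last = messages[messages.length - 1];
  const result = await enhancer.generate(String(last.content), { runtimeContext });
  // replace the last user message with the enhanced, well-formed question
  return [...messages.slice(0, -1), { role: 'user', content: result.text }];
};
```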

1:19:05

and then we need to get we'll just log some right now so i'll just see like uh this will be text also we're flying blind too no types pretty crazy um also this is not going to work as written right now i need to go change something so let's go do that um in agent here so right now we're just passing the messages which is cool we'll do that like this and then wow it's funny how that

1:19:37

works that's great and those all exist there so now we have to assign it to mastra as well like the the cli i didn't think you registered yes yes so let me go build core you can do like turbo watch and then filter turbo watch then build then --filter equals mastra core it will watch the mastra core package so every time

1:20:06

you change something it will rebuild it dude you learn something new every day i didn't know that dude i've been doing it man and you can add like as many packages as you want so if you're looking for mastra core and maybe memory you could do --filter mastra core space --filter mastra memory and it will like watch

1:20:26

both and it will also watch all dependencies um so if mastra memory uses mastra core you basically don't have to specify mastra core also i don't have types right now or maybe i'm just i think it's hashtag mastra now yeah yeah yeah but look at like the bottom this happened hey windsurfers this happens to me so much where i just get like an infinite typescript hell let

1:20:52

me know if that happens to y'all also there are a bunch of chat messages now so let's like go through those a little bit also welcome oh yeah we have to do the summary thing hold on let me just see how many of you are here 153 welcome all right we'll do some questions um some okay rude people die young dope name by the way again um you guys are

1:21:18

awesome love what mastra is building very early days for me i'm trying to build to run only browser-based ooh that's a little tough dude especially with like api keys and stuff you don't really want to expose them in the browser i think that's the main thing um there's just a security hole um but also we're

1:21:38

not compatible with browser only either like we have um like too much node stuff yeah going on but it doesn't mean never say never i guess you know yeah it's just not a priority yeah and then hey how's it going dude um what about using a superior agent to act as a supervisor or something what is superior agent is that it i think like it's the

1:22:06

master agent that calls its minions i think that's what he means like the oh superior yeah superior yeah i don't know omar i think you got to clarify dude because i don't understand the same as a supervisor basically like but in what context like what we're trying to do right now i don't understand the context of

1:22:24

the question but anyway who cares can you re-ask the question and we'll definitely answer it yeah it could be that it's like uh just in your router you say you have to enhance first secondly do this and then yeah we could do that i guess yeah but then you don't or you do it via tool calling it's like less explicit what if the llm doesn't

1:22:44

call the tool because you never know that they actually going to call the tool but it's a good idea you could probably do it as well but like with the input processors defining it you're always sure to get like a ran did this rebuild wow that's so cool oh it just did and it's watching it's watching it too that's amazing okay so now we go back we have technically we have this

1:23:11

going for us um i guess i can go back to reattach it to the mastra oh i didn't do that yet um here nope not that here this is our question and answer so now we have that sorry okay so what do we expect we expect the prompt to change um the prompt to change so what is math okay let's see oh this is what that's the response though right like yeah but you did console.log text no did it

1:24:17

i think in your console log you did text but i don't know what's oh yeah i did i did i did um maybe you have to do a subd there it is what is mathematics and how does it encompass various branches such as arithmetic whatever it did improve whatever you know so what's like a we could um what am i trying to say i mean now we need to just replace it right so

1:24:49

um and then this is where like we're only sending a user message but there could be multiple messages in here so like which one are you supposed to do the last one let's say right we want to update the last one guess so you we probably want to in the future want to improve that it does it for every message maybe like um all at once

1:25:12

like you have if you have four messages to four agent calls mhm um but for now we can just take the last one okay and then if we do also prompt the agent probably this is my message one message two message three give me them back but yeah something we should figure out and probably write a blog post about or something yeah why is it so slow

1:25:42

because you probably use i don't know which model are you using mini i'm using mini for both so i mean i mean this is the this is the problem like with with doing this right you will introduce latency to the call i'm also doing a generate that's why it is calling you know uh sorry model settings i am doing a generate call so it is like waiting for the full response um

1:26:17

okay that's cool but now what we should do is there's probably other types of input processors so maybe this is like an enhancer so let's let's do something before this let's call it let's like maybe we should throw like do something that would throw like maybe like you shouldn't be allowed to have any inputs that are like even with a certain word or something yeah

1:26:41

ooh profanity filter okay let's vibe code this so i'm going to take this this i want let's get into wind surf i want to i wonder what it's going to do it's going to do some crazy probably like if i ask it to do a profanity filter like a gen general general one i think that's cool yeah just ask i want to okay so here

1:27:15

messages are in the shape of i'll just do it roll let's just say user content is a string or something i don't even i i probably could just reference the type as well um can i do that core message okay i can i guess i can um so now it knows what this is i want to throw in this function if the content has profanity in

1:27:56

it may the force be with you and um if you mess up just know that rude people die young all right we'll see how that goes i have a feeling i have a bad feeling about this already i don't think it's going to do it right also they have celsius in europe which i just discovered and uh that's cool i mean we're sponsored by greptile energy drink so i i guess i shouldn't have shown this i don't know um but i

1:28:53

don't have any greptiles on me right now so how to get the celsius maybe they should have a brand of greptile in europe then hey greptile send us more drinks here too also y'all should use greptile though i was in japan and i pumped greptile on stage i gave a talk and someone asked me like "what ai tools do you guys

1:29:20

use yourself?" and we were like "well we build a lot of tools ourselves for ourselves." and i mentioned i did my taxes with an agent but i'm not going to tell you guys about that yet but then i said "oh yeah we use gretile for pr reviews and stuff." um and then japanese twitter kind of went crazy with it so

1:29:38

yeah it's pretty good even if it doesn't always says the correct thing or maybe it's not um exactly or you're okay with like a side effect happening or something but a good thing it finds the places where you or your reviewer should have extra look at like maybe you did like a spawn a child process or something or you did like a fetch call

1:29:58

but you forgot that like cancellation is not happening like you just mentioned that like hey you forgot cancellation and maybe you're like i don't care about cancellation good for you but at least it prompted it and it's like okay let me think about this yeah we also just vibe coded this thing we didn't look at it once which is hilarious but let's answer

1:30:18

two questions here um rpdy i'm going to call you that rpdy rude people die young um your auth token is in the browser so storing your api key isn't such a big deal technically your auth token is not in the browser right like you're exchanging it for some true identity token but i see your point though yeah but the main difference is that your token is mostly like only a couple of hours valid yeah where like

1:30:45

api tokens mostly are not unless you have your abstraction in the server already where you have your api key that's just for you or for your service but not the openai one but then what's the point of running everything in the browser if you already have like a gateway of some sort or maybe it's just like client side

1:31:03

tooling or it's always maybe it's like the client is the browser or you know i mean the client is the browser sorry i'm going to explain this real quick uh there's this thing we call this thing client tools and i think people confuse it often with browser tools right like the client is the perspective of the client aka i'm calling a server and i'm

1:31:27

going to do a client-side tool call like if you look at my ide right my ide is the client and then um right like windsurf here is my client it's talking to a model claude through an api claude is then signaling to windsurf to do tool calls right you look at these tool calls that get analyzed that's a tool edited

1:31:52

this is maybe it's maybe that's a tool i don't know um all these are tool calls but the model is telling windsurf hey you need to call this tool windsurf owns the tool um so then it executes it client side right so it's a client-side tool call so it doesn't mean browser but anyway that's just a tangent but come on to our discord and explain like the

1:32:16

example because i love to hear what like uh you're actually building and why you need like mastra in the browser because maybe we're not seeing something and maybe we should put more effort into it yeah i like not paying for stuff zero cloud cost sounds pretty good to me um and then sj asked "greptile's from yc right yc hackathon."

1:32:37

yes greptile's from yc um they threw the biggest mcp hackathon which i missed which is fine um i was doing something else um but yeah super cool product um yeah okay so i'm just looking at some other ones u he already built browser only with vercel ai sdk that's cool yeah i mean there's no databases involved with the ai sdk there's no workflows that

1:33:12

have all types of node dependencies so yeah for sure um yeah i just well stated um then omar think more type less debug more and more and teach ai about it and receive an apology from ai is vibe coding um that's funny because we haven't even read this yet but it's right here i apologize for my error dude like so much stuff happened while we were just chilling

1:33:36

right let's read this and actually see what happened here oh my god okay so then let's see um it looked into the core message okay it saw that it was imported okay whatever analyzed the processor sure um understood some stuff had a lint error okay issue it found its own issue another issue apologized for the issue again fixed the regex finally

1:34:09

finally uh successfully implemented it and then gave us a summary so let's actually see what happened here okay sure check if a core message contains oh snap hey this is not safe for work uh these uh these words here but all right um so like cool cool i don't know i'm just going to it looks good lgtm um so i'm going to accept it but

1:34:43

then where are we using it like down here right i guess in the oh if it has profanity yeah now though do we want the llm to tell us that it couldn't do it or do we want it to throw i would say throw because i don't think we have enough information to let the llm decide or we don't have enough because basically now you cannot skip the next one mhm it'll throw and then the next

1:35:18

one won't even get called yeah now the generate just throws i wonder what's going to happen in the playground crash again crash all right let's find out that's a good thing that we have to think about what if you throw in your processor do you want the llm call to uh throw as well like what's the nicest result there
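roughly what the vibe-coded filter boils down to (word list trimmed and names illustrative); throwing here stops the run, so later processors never execute and no model call happens:

```ts
const BLOCKED_WORDS = ['badword1', 'badword2']; // placeholder list, not the generated one

const profanityFilter = async (messages: { role: string; content: unknown }[]) => {
  for (const message of messages) {
    const text =
      typeof message.content === 'string' ? message.content : JSON.stringify(message.content);
    if (BLOCKED_WORDS.some((word) => text.toLowerCase().includes(word))) {
      // throwing aborts the whole generate call before the model is ever hit
      throw new Error('input rejected: profanity detected');
    }
  }
  return messages;
};
```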

1:35:37

it did exactly mhm what we wanted to essentially it's pretty sick dude okay so now we have like a use case right so you can filter things out now this is essentially a guardrail it's kind of a guardrail but you're doing something before um now what a guardrail is supposed to do is run this check in parallel so it's

1:36:03

it's supposed to pass it through and then it'll run its check so this profanity check would happen but also we would start the execution in this case it would be a little weird because i just said a right like who cares the execution is really fast so it probably won't there'll be some like yeah but like look

1:36:22

at the pro the prompt enhancer like of course you can't do that parallel but let's say you do another model call because the enhancer also took a couple of seconds or yeah like 30 seconds or something maybe you should even use a different model but basically those things are like take time and latency

1:36:40

that you can't really stream anything to the user maybe you can stream like reasoning or a busy state but there's nothing valuable for the user so um that's why those guards if you can do it in parallel and you're already kicking it off maybe the um the model is like reasoning and doing its thing and before it actually starts streaming anything you already cut the wire or something so that's why parallel sometimes does make sense because it's faster you're not waiting
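thinking-out-loud sketch of that parallel idea (nothing like this is implemented here): kick off the model call and the guardrail check together, and abort the generation before anything reaches the user if the check fails

```ts
async function generateWithParallelGuardrail(
  runModel: (signal: AbortSignal) => Promise<string>,
  guardrail: () => Promise<boolean>, // resolves false if the input should be blocked
) {
  const controller = new AbortController();
  const generation = runModel(controller.signal); // starts immediately, not awaited yet
  const allowed = await guardrail(); // runs while the model is already working
  if (!allowed) {
    controller.abort(); // cut the wire before anything streams to the user
    generation.catch(() => {}); // swallow the aborted call's rejection
    throw new Error('blocked by guardrail');
  }
  return generation;
}
```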

1:37:07

we could do another type of profanity thing where it actually uses an agent to make it safe for work m so what we could do is like which i guess is the same thing as this enhancement but it could be like the output processor right because probably you set

1:37:22

some profanity words in uh your prompt so maybe the ln is going to respond in a similar way or maybe not but you never know and then the output could basically clean up all the profanity words okay that's a great question uh rude boy uh we're let's do an example like that where um we don't allow things outside of the scope of the agent so this is

1:37:49

what we'll do um so like in a tutoring case you're not allowed to ask anything outside of mathematics let's say so we have to have an agent that evaluates the is this a mathematical question it's like a judge it's like a judge yeah it's like a judge so we can do something very like um crude right now so let's actually so

1:38:14

we have this question enhancer but let's call it like the um let's call it the um prompt relevance it's answer relevancy essentially is like an eval dude i wonder if we could um i wonder if we could use an eval the answer relevancy that's just it's just a input output yeah what would be the output though the prompt itself no no so

1:38:42

the in yeah i guess so let's see input output well you could the input would be um the prompt the prompt and the output would just be is this math let's not do that i mean essentially what we're going to do is essentially we'll be very close to an eval um also like we have mastra evals it's like a package for all of our evals and stuff

1:39:19

um and uh what am i going with that we have these off-the-shelf ones like hallucination answer relevancy bias etc so we're just thinking like could we use it or whatever um we're not going to because we're just hackers right now and vibing um so yeah we'll just keep doing this there's a question that just came

1:39:37

in but um we'll get to it in a second also i have no idea how many people are in here so let me check real quick um let's see 169 welcome ward do your thing tell them what we've been doing so basically we started with what's happening in ai we talked about claude 4 uh we talked about the ai gateway that vercel launched we talked about a

1:40:02

couple of models like the v0 model got um shipped as well or released we talked about linear is up to something with like agents um cursor is like also revamping like background agents doing things um like outside of your scope or outside of the editor you just start it off and it does something um then we met the mastras so we met paul who is working on

1:40:28

docs and he's actually working on express server docs as well uh and then we met with tony and taofeeq who are working on like agent networks so basically how can i have like a supervisor or a router that just says like hey i have this and it figures out which other agents or tools or workflows it needs to call and now we're here

1:40:48

coding guardrails or input processors thank you ward um so in this case this prompt relevancy thing should we use runtime context in the instructions to then make it the relevancy would then be based on runtime context so for example if i do runtime context like this make a function yep and then so then we can do like let's call it const topic equals runtime context dot

1:41:22

was it get yeah topic yep and then i have the topic and so we're going to return oh yeah by the way in agents in mastra right um not all the properties yet but pretty much most properties are dynamic and there's this thing called runtime context which is what we use to let you essentially the user pipe all this

1:41:49

metadata everywhere so what we're going to do here is make this prompt relevancy enhancer dynamic in a way uh by giving it the topic it needs to be relevant about so i have the topic um yeah runtime context is just like a bag with key values like if you know typescript or javascript the map like a new map is

1:42:10

similar you just have a key and a value and uh you do whatever you want with the value and you pass it to the generate or if you use like the api request in like the server you can also pass it as like a key value in your body or if you use the mastra client which you should use for communication to the server it has it as well so it's like

1:42:35

all typed i'd say all type-based okay so we have the enhancer i'm going to go put this on the mastra instance like we did the other ones that way it can be registered and um we haven't shown any tracing yet but i'll show you in this one because you'll see the tutor get called the other agent get called
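for reference, registering everything on the mastra instance looks roughly like this (the agent names and the import path are the ones used in the nearby sketches, so treat them as illustrative):

```ts
import { Mastra } from '@mastra/core';
// these are the agents sketched earlier; the file path is illustrative
import { tutorAgent, questionEnhancer, promptRelevancyChecker } from './agents';

export const mastra = new Mastra({
  // registration is what lets a processor call mastra.getAgent('questionEnhancer')
  // and what gives you tracing/observability for those calls
  agents: { tutorAgent, questionEnhancer, promptRelevancyChecker },
});
```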

1:42:53

set up storage oh yeah we do uh we'll do that in a second oh um cool so now we have this prompt relevancy enhancer let's make another input processor here um and i do need runtime context in this case or do you need it i can use the i should keep i should use it right you use it because you have to pipe it to the generate call right correct

1:43:26

cool all right and mastra you need too because you need the mhm agent from the mastra instance so we'll get the relevancy agent that's not the name though is it no it does generate in one go very nice but then we also need to pass runtime context yes and then i also want to do an output here which is a zod object i'm going to use structured output because i

1:43:58

just want to know like relevant boolean you know so let me also just update this i'm just going to put this i don't know if this is a trick or anything but i just like sometimes when i'm going to do a structured output i'll just drop what i want my shape to be even though i'm doing it in the zod and json schema type

1:44:18

stuff i just put it there because i don't know i don't give a like i just did it and honestly it works sometimes so it's basically because we um saw with tool calling that you sometimes have to duplicate yeah like the rules like just adding tools is not always enough you have to like re put it in the prompt like hey please like we have a tool called called that does this

1:44:40

we have a tool that does that so then this should be like if it's not relevant then we throw otherwise return the messages just return messages cool okay now we need to pass this topic in right so let me also log here what the topic is okay thank you darl for adding that to the playground so you can just do it in the playground so i need to go add runtime context here
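pulling the last few minutes together, the relevancy guard sketched here looks something like this (the processor signature is still the prototype shape from this stream, and the generate options are how i'd expect structured output and runtime context to be passed, so treat the details as assumptions):

```ts
import { z } from 'zod';

const promptRelevancyCheck = async ({ messages, mastra, runtimeContext }: any) => {
  const topic = runtimeContext.get('topic'); // e.g. 'mathematics', supplied by the caller
  const checker = mastra.getAgent('promptRelevancyChecker');
  const last = messages[messages.length - 1];

  const result = await checker.generate(
    `Is the following question about ${topic}? Question: ${last.content}`,
    {
      runtimeContext,
      // structured output so we get a boolean + reason back instead of free text
      output: z.object({ relevant: z.boolean(), reason: z.string() }),
    },
  );

  if (!result.object.relevant) {
    throw new Error(`off-topic question: ${result.object.reason}`);
  }
  return messages;
};
```

the topic itself then comes from whoever calls generate, via something like runtimeContext.set('topic', 'mathematics'), or from the playground's runtime context panel like we do in a minute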

1:45:13

um am i do you have a charger yeah get it we're on charging duty um okay there's some questions i'll do while i wait as my battery dies uh 10% though we're golden dude we're golden um cool let me answer some questions so i have an express server and want to add mastra agents and workflows can i integrate the local dashboard from npx create-mastra

1:45:45

into my existing setup the dash like the playground like what we're looking at huh i think technically you can we don't have docs but you basically you should be able to when you do a build to get the playground as long as you have like all the api setup but i would say no you can't do it today but if you really really want to put the effort

1:46:09

which i don't really recommend you could yeah if you want to use our api handlers right in your express server um you can use mastra server here i'll show y'all i'll show you all right now we have a package called mastra server it's right here and what we so okay it's a little story time um for those who care i don't

1:46:31

know if anyone does but originally we had this thing called mastra deployer and the deployer was all of our hono we use hono as our server shout out to hono um but everyone came to us and were like "yo i use express i use koa i use whatever." um and so we were like okay well we want people to use whatever server they want um and they wanted to use our handlers that existed on the hono server so

1:47:01

we're like okay cool let's lift everything out and we put it in this thing called mastra server what mastra server is you can just grab all the handlers these are just javascript functions then you can go into your express server and then you can set up the routes however you want and just call the handler as long as you're passing in mastra and some runtime context
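very roughly, and with the caveat that the handler import below is hypothetical (check @mastra/server's actual exports for the real names), the express wiring described here would look something like:

```ts
import express from 'express';
// hypothetical handler name/path for illustration only
import { getAgentsHandler } from '@mastra/server/handlers';
import { mastra } from './mastra';

const app = express();

app.get('/api/agents', async (_req, res) => {
  // each handler is a plain function: pass the mastra instance (plus runtime context
  // if the route needs it) and send back whatever it returns
  const result = await getAgentsHandler({ mastra });
  res.json(result);
});

app.listen(4111);
```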

1:47:18

right you're golden um and so for hono like this is a lot easier you could totally do this but this doesn't really mean that you get the playground though this is just functionality and then i okay technically technically hey i'll help you right now maybe uh maybe you can do this so we have another module called playground

1:47:51

ui this is like the playground components so now you have your mastra server in express using our mastra server package with the handlers doing your thing you have your express server we have this playground ui that has all the playground modules you could import those and then you could finagle your way to victory um it's still a lot of work though and uh i'll keep a note of

1:48:16

this as like maybe something we entertain in the future but answer right now is like good luck have fun um okay and then let's see what other question we have these streaming sessions are gold thank you ward is right describing the tools in your system prompt makes tool use more solid yeah awesome okay back to business um this is a run runtime let me

1:48:41

make this bigger also we have about 10 minutes left of the stream thanks for everyone for joining but we're going to keep going um also i'll say how many people are here 168 of you thank you um all right so this is the runtime context uh part of the dashboard or the playground here we can just set everything up so let's just say the topic is

1:49:07

mathematics so that's that and at the bottom i guess is there a save button i don't know oh there is except i was too uh ah i was too zoomed in i don't know if that's a bug or just some but um yeah it's a bug and also the save should be on the right where we have so much yeah dude like let's empty ass space here yo marvin

1:49:32

greg get on it bro um okay so now we have mathematics let's go to our tutor agent and i should ask about biology right how are babies made um what do bees do or oh yeah like what are birds no no what are dolphins oh hello there that is not relevant but if i say maybe you should um in the error like now we did error maybe we do the response h but it's like true or false yeah

1:50:15

but yeah we can log it though oh you know what we should also put relevant and we should also put reason yeah yeah just like an evil like a eval it's the same as the eval man this is just an eval and then we got the reason very nice very nice yes and then let's also i think that's it that's all we got to do okay cool so we're back um what

1:50:45

are the birds doing in the nest whatever that means nice the reason which is not related to the topic of mathematics now we should ask a ma m yeah who invented oh okay actually this okay i'm going to do an easy one and then a curveball okay so i also studied mathematics in college because i'm a nerd dolphins can do mathematics that's pretty funny um who can or who invented

1:51:27

or no wait what is the quadratic formula and what is it used for what if it fails no it won't though it won't i'd be very surprised if it did we always pray to the to the models open ai okay it it went through right and then it also prompt enhanced that was a double whammy by the way it went through each processor by the way right didn't

1:51:56

we change that or did we keep the i kept all three so first it checked for relevancy then it checked for profanity and then it enhanced the question oh cool which was because we didn't really plan for this it's pretty sick you know but uh that's dope um okay so that's pretty sick curveball curveball okay so will it know information about like uh people of

1:52:22

mathematics right like so if i ask like who is pythagoras i don't know if i spelled that right but is it i guess so i guess so it's about i would say yes yeah because it already it didn't fail yes true so it's probably going through each one there you dude honestly we built something pretty cool we've built something pretty freaking cool um there's tons of work to do on it should we do output processor

1:52:57

and then wrap this thing up do we have time seven minutes i guess we could i guess we could try it's just the same as input but yeah but it's output so all right we'll do it um but let's answer this question real quick can i use my own sorry i should probably do it like this can i use my own redis instance hosted on aws for memory

1:53:16

without needing to put upstash credentials into the upstash adapter i actually don't know if we use the upstash sdk um i think you use the same like a redis url or i believe you can use the upstash sdk to connect to something like redis labs but i'm not necessarily sure i'll double check and if that's not the case then we got to change something because a lot of

1:53:47

people don't necessarily use upstash like we don't even use upstash anymore not that we're against upstash or anything we use redis like proper and we think you should too but uh that's my opinion um so yeah if not we'll figure it out but if it does then yeah you should be able to use uh that for your vectors um the vector part is it for

1:54:11

vectors that you wanted it or for storage memory so i guess it's both or whatever oh yeah for those who don't know memory in mastra takes two types of storage classes if you want to use it just for storing messages you use a storage property which gets inherited from mastra you can use postgres libsql upstash

1:54:32

etc kv right all kv type of stuff then there's also a vector store option uh so you can also chunk and embed and store vectors for the messages themselves this allows you to do something called semantic recall which is essentially doing vector search on the conversation pulling out the relevant details and then putting them into the context window
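as a rough example of those two storage classes together (class names and options here are from memory and may not match your mastra version exactly, and depending on the version you may also need to configure an embedder for semantic recall):

```ts
import { Memory } from '@mastra/memory';
import { LibSQLStore, LibSQLVector } from '@mastra/libsql'; // any supported store/vector pair works

export const memory = new Memory({
  // plain message storage (the "storage" class mentioned above)
  storage: new LibSQLStore({ url: 'file:./memory.db' }),
  // vector store used to chunk/embed messages for semantic recall
  vector: new LibSQLVector({ connectionUrl: 'file:./memory.db' }),
  options: {
    semanticRecall: true, // vector-search old messages back into the context window
  },
});
```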

1:54:56

uh we will do some live stream sessions on memory it is the most fascinating thing um right now for me in ai um can we use valkey instead of redis which will be self-hosted at production um if that's a redis interface then yeah you should be able to do anything redis-like um if it's some other storage provider you can open a pr with the storage adapters i guess i'll show you here yeah we have like mastra

1:55:25

storage with an abstract class and i think it's about eight functions you have to like implement or 10 and you're up you're off to go so these are all our stores some of them both have storage and vector storage right there is a difference um but you can check out here i think our most recent ones were couchbase and dynamo db which is

1:55:49

as a open source guys like this is so cool when contributors bring the technology that they love into our project dynamo couch base also ian from the community i just want to say who did the dynamo one awesome couchbase was some dudes from couchbase which was awesome i forgot their names i apologize and then uh mongodb was the vector storage part was from the team

1:56:14

themselves and then the storage part was from a startup using which like it's not like we didn't do any work or anything but it's just dope um so yeah that's the answer to that question oops i should undo that that's about it i think oh so shraan or however you say his name or her name i think it's but um basically i think you said you were going to do a

1:56:44

couple of more eu live streams yes oh yeah this thing so yeah usually we do the live stream at 12:00 p.m pacific time in california which essentially everybody else in the world doesn't get to enjoy them so while we're here just kind of testing out uh this time and there's 174 of y'all here pretty great um so thanks for being here we have to

1:57:08

wrap this up in three minutes though um so we're going to try to rage code this last piece which is the output processor um so we're gonna we could also just don't do that and just talk about it a little bit because basically it's the same as input processors but we're just going to put it after the message so we

1:57:27

don't really have to code it per se that's true yeah because okay let's just let's just sketch it i guess yes so if we have output processors here which will be an array but in this case you're going to have what's considered the output now output is interesting because it could be a string or an object given what you were trying to do in your generate call

1:57:51

or whatever so you would have this and then you can then essentially do the same thing like modify it maybe it's like output and second argument is if you do structure output you get like the sol object as well so you can maybe do something with that um streaming is going to be difficult specially ooh let's talk about streaming right so if you're streaming you have text

1:58:17

deltas right yes so what's the problem with that yeah you don't really know so it's about chunking right like a stream of text delta can be any chunk could be a comma could be like a whole sentence you don't really know so the question is okay when do i have to execute those output processors so do you buffer yourself in the output processor do we give you options to like do like string

1:58:46

length or i don't know yet like just spinning out loud um and then if you do like structured output like what part of the structured output are you getting like is it if it's an array of objects sure you can get it do it per object but if it's one object is it per key are you just waiting for the end that's true also text deltas are very small right so

1:59:09

you're going to spend a lot of money if you're doing some llm piece oh yeah so you essentially have to accrue text like almost in a batch interval and then you have to process it on that but then how do you accrue a text delta because then you have to prevent it from being streamed oh boy this is going to be interesting
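spitballing along with this, an output processor over a stream might accrue deltas into batches before letting anything through — a sketch, not an implemented api:

```ts
type OutputProcessor = (chunk: string) => string | Promise<string>;

// accrue text deltas into a buffer and only release them downstream
// once the output processor has run over the batch
async function* processStream(
  deltas: AsyncIterable<string>,
  processor: OutputProcessor,
  batchSize = 200, // characters to accrue before each processor call
) {
  let buffer = '';
  for await (const delta of deltas) {
    buffer += delta;
    if (buffer.length >= batchSize) {
      yield await processor(buffer); // nothing streams to the user until this resolves
      buffer = '';
    }
  }
  if (buffer) yield await processor(buffer); // flush whatever is left at the end
}
```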

1:59:28

dude uh we are definitely doing eu live streams next week let's continue on this uh this project here um aside from all the other stuff we have to do um but this was dope okay we are at time thank you so much everyone i know it was a rocky start but i think we caught our you know turbulence and then we figured it out at the end do we have

1:59:48

to uh do the recap one more time like what we talked about oh yeah yeah go for it so we started today with like ai news we talked about claude 4 um we talked about like linear's doing some agent stuff um vercel shipped their api gateway or ai gateway um the v0 model is out as well um we talked about cursor

2:00:16

going to like a um background agent so they can do tasks um in the background and then we met the mastras so we met paul who is working on documentation he recently joined us he's doing docs about express we met tony and taofeeq who are working on um agent networks basically some kind of router so you

2:00:38

have x amount of tasks and the agent will figure it out on its own and call multiple agents um when it sees fit yeah um thanks everyone uh we are the mastra ai agents hour we usually say we're from the palace of the dog but we're about to release a video after this call so you can say that we're in from the palace of

2:01:01

the horse and that may not make sense to you but we'll see you on monday um probably around the same time uh so ciao what do they say in belgium or in dutch doei