Farmers over Principal Engineers, AI News, Guardrails
Today we question the path from principal engineer to farmer, talk through AI news such as the new AG-UI protocol, meet with Ward and Nick from Mastra, and experiment with adding guardrails to the Mastra framework.
Episode Transcript
What's up everyone, we're coming at you live — this is Mastra Live. I'm Shane. I'm Obby. Today we've got a whole bunch of things: we're going to talk through some tweets of mention, maybe some AI news, we'll bring on at least one other person, maybe two, from the Mastra team to chat a bit, and then we're going to build something. We're actually going to try to build a new feature into the Mastra framework, and you can code along with us — or tell us what we're doing wrong if you're in the chat.

Some cool things happened over the weekend that are not really AI related, but more related to this live stream. On Friday we threw an event at YC HQ called The Future of DevTools, alongside Supabase and Starling — Starling is a Spring batch YC company. It was really cool: there were demos, people talking about dev tools, there was pizza. We talked on Friday about whether there would be pizza — we ordered the pizza, it was pretty good. Also, at YC there's this energy drink sponsored by Greile, and I hit up Greile over the weekend. I was like, I'm done pumping Celsius, I need those Greile energy drinks. So I'm happy to say Greile will be our first sponsor — they're just going to hook us up with energy drinks and we'll pump the stock on those. Free sugar and caffeine, what can go wrong? Nothing — especially if you drink five of them in a day, nothing's going to go wrong. It has to be good for your health, right?

Anyway, there was a tweet that we thought
sounded like someone on our team — I wonder if we could share that tweet. Yeah, let me share that now. Screen share... and here we go, you all can see my screen. This is a tweet from Dan — he's pretty funny on Twitter — and he said: the software developer levels are junior, mid, senior, staff, senior staff, principal... and then farmer. We thought, this is definitely a dude on our team, so let's bring him out. Hey Ward, are you there? I'm here, I'm here. All right Ward, you were a principal engineer at Gatsby and Netlify and other places, and you're also a farmer — so tell us.

Yeah, so I have a couple of animals in the backyard — three goats, which I'll show you in a second, and a horse. So I'm starting my farm year by year, I guess. Not much to say — basically it's a hobby that went haywire. So: Mastra founding engineer and farmer. Yes, exactly. It's so funny that — oh, there they are. This one is smaller than expected — what's his or her name? This is Juliet. Juliet, all right. And then we have... let's see, we have Sh, and then we have the big guy, which is still a small goat. Oh dude, he's getting feisty. Yeah, he always — oh, he's getting feisty. People tune in for the coding and stay for the farming. Where's the horse at? He's eating at his haystack. Love what you've done with the place. The goats, they destroy everything. And here we have — whoa, that was a big boy — he'll come out in a sec, he's just eating.

So Ward, while you're tending your farm, what is it you're working on at Mastra — when you're not farming, of course? Well, I'm the open
source lead, so I'm trying to do as much open source as I can and making sure the community gets taken care of — though of course I'm not the only one taking care of everyone. Currently I'm working on getting workflow streaming into a good position: basically streaming workflows — all the steps, everything that's happening — with the v-next workflows we're building. Big shout-out to Tony, who did a lot of work there. We just want to make sure streaming is taken care of as well. We're probably going to use the AI SDK data stream protocol and extend it into our own. That's awesome, dude.

Ward, all three of us have been friends for a long time now. Tell us what got you into, I guess, software engineering — how did you end up here? Well, it's a long story, but basically I started as a web developer at a newspaper company in Belgium, and then there was a vacant job position at Gatsby in open source. I thought, let's try it — I was working on Lighthouse, the performance tool, for Google, and I was like, "Okay, let's see if I can work for an American company for real." So I did the interviews, got hired, and I think that's where our journey started. I first met you, Obby, and then I think a year later Shane came into the picture, and then Gatsby got acquired by Netlify. We all stood together, and then I left Netlify first, I think, and then we all left eventually — and I guess here we are now. Here we are: all circles lead back to the beginning, back to basically what's best — building open source and making a good cloud that people can use and trust.
We call you the Bundleruski, Ward — could you let the audience know what that even means? Because maybe you don't even know, but at this point we've said it so many times, you definitely know. Yeah, I'm pretty good at bundling tools — basically taking a JavaScript file and making it something completely different. All the tools like Vite, Webpack, Rollup — I'm pretty good with them, same with Babel and transformations. That's basically why they call me the Bundleruski, because I know so much about that. But it's also a long story: I've been using Webpack for, I guess, 10 years or even longer — we're getting old — and I just like to understand things deeply. When I'm stuck on something, I'll spend lots and lots of hours getting to know how those things actually work, and Webpack was one of the tools I put years and years into and eventually became really good at. It has been serving me for many years, getting me jobs like Mastra and stuff — so I can't complain about all the tears I shed.

The funny thing is, when we hired Ward we didn't know the Bundleruski was needed, and then the minute we hit the point where we had to start bundling JavaScript, we were like, thank god Ward is here. How did we know we would need to do this? I don't know, dude. How many times, Obby, did you and I say, "What would we be doing if Ward was not here?" Yeah, if Ward wasn't here, dude, we would be completely — swear jar. I understand, yeah — swear jar, you know, not family-friendly sometimes.

But it comes up in so many places, right? Even packaging a module today takes so much effort — you need to support CommonJS and ESM, and it's just not that easy anymore. Before, you were just writing a JavaScript file; now it's, oh, I have TypeScript, I have to compile it, and oh, you're using ESM, great, oh, you're using CommonJS, and then you have your tsconfig, which has like 50 configurations, and if you don't use the right one, it breaks.

Yassine has a comment — so Ward's not the only one: Yassine says, "I have four sheep in my garden." Yassine also has a question: you know what
happened at Netlify? We're still really good friends with a lot of people over at Netlify. Actually, on Friday — I don't know, we could pull up that picture — we have a picture with Obby, Sam, myself, and Matt, the CEO of Netlify. See if you can pull that up, Obby, while I stall. Matt has agreed to come on this live stream at some point; we've just got to find the right time to get him on. It's just one of those things where you eventually are ready to build — there it is — yeah, eventually to try to build something new. So we spent some time at Netlify, and then it was time to try the next thing. When you get acquired, it might just not be the same company as it was — you get acquired, and if it's not for you... at least that was the thing for me: Netlify is a great company, but it was just not the same thing as Gatsby was for me. And now we're building our own company, where we're running the playbook again — a little bit differently this time, though, but with some of the same friends. Yeah, most of the same friends. Best part, right? You can enjoy work and spend time with your friends. Thanks for coming on, Ward — we'll let you go do your chores. Yeah, I have some poo to shovel. Well, I'll be there next week and I'll come help you out. Sounds great — next week Obby will do it live with you. That's actually — I will do it, let's do that, that'd be fun. All right, thanks Ward. See you, Ward. Bye-bye.
All right dude, what's next, what should we cover? There were a bunch of tweets that came out over the weekend, so I'll share another one, which is relevant to everyone using Mastra — and I guess everyone in the space: we got the AI SDK 5 preview, which is dope. If you don't know, the AI SDK right now is on v4, and there are a lot of big changes happening in v5. They're relevant if you're using the AI SDK — and we use the AI SDK under the hood for our model routing, so this impacts us quite a bit.

There are two things I'll note here which are really good. First, the message format is changing — it's becoming a bit more ergonomic, and you can put metadata in messages. A lot of people before were constricted by the message format; now you can make it work for you, so you can build rich user experiences off the message data. The second big thing, which is not in this tweet, is that the data stream protocol — how all the streaming happens with the AI SDK — is getting a nice advancement. Before, the protocol was defined by types that were numbers: type zero, for example, means text, and type seven (I'm just giving examples) means data. Now it'll be things like type step-start and type tool-call, which means you can build much more structured UIs from streaming. This is pretty cool, and we're going to support it in Mastra.
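The difference being described — numeric part codes giving way to named, self-describing part types — can be sketched in a few lines. This is an illustrative sketch, not the actual AI SDK protocol definition: the part names ("text-delta", "tool-call") and the code-to-type table are assumptions for illustration (the conversation only mentions that 0 meant text).

```typescript
// Illustrative sketch: numeric-prefixed stream parts vs. named, typed parts.
// The names and the code table below are assumptions, not the real protocol.

type StreamPart =
  | { type: "text-delta"; text: string }
  | { type: "tool-call"; toolName: string; args: unknown };

// Old style: each line is `<code>:<json>`; you need a lookup table to
// know that "0" means text.
function parseLegacyLine(line: string): StreamPart {
  const sep = line.indexOf(":");
  const code = line.slice(0, sep);
  const payload = JSON.parse(line.slice(sep + 1));
  switch (code) {
    case "0":
      return { type: "text-delta", text: payload };
    case "9": // hypothetical code for a tool call
      return { type: "tool-call", toolName: payload.toolName, args: payload.args };
    default:
      throw new Error(`unknown part code: ${code}`);
  }
}

// New style: each line is self-describing JSON, so consumers can switch
// on readable type names instead of magic numbers.
function parseTypedLine(line: string): StreamPart {
  return JSON.parse(line) as StreamPart;
}

const legacy = parseLegacyLine('0:"Hello"');
const typed = parseTypedLine(
  '{"type":"tool-call","toolName":"search","args":{"q":"flights"}}'
);
console.log(legacy.type, typed.type);
```

The practical win is the second parser: no shared lookup table between producer and consumer, so new part types can be added without breaking old clients.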
Yeah, so that's dope. Very cool. Another thing that's related, which I'll try to pull up as well: last week we had a workshop with the CopilotKit team, and they just launched their protocol today. Let me see if I can find the actual tweet here and share it. There it
is. So AG-UI was just launched by CopilotKit. We wanted to have day-zero support, essentially, so we've been working with their team — we did a whole workshop webinar on it last week. One of the things we've been hearing over and over is that it's really hard to build the front end and connect it to an agent on the back end — whether that's a workflow, an actual agent, or a whole network of agents. If your user doesn't know what's happening on the front end, it kind of breaks down the whole experience. So CopilotKit is doing some really cool things. And obviously, as you can see even with the next version of the AI SDK, they're trying to improve this as well. I think that's the next frontier: how do we make front-end and back-end AI applications work seamlessly together? There's a lot of work being done on
all these fronts. There were some good questions we received, like: why use, for example, AG-UI's UI libraries versus AI SDK React, or Assistant UI, or any number of UIs? I think it's a good indication that this next frontier is forming — there are different players trying to enter the space and build something useful. AI SDK React is this primitive-based set of tools that come together in the Chat SDK, which is like a full assistant, a full application. Then you have AG-UI and the CopilotKit stuff, which is a similar deal but with more abstractions around agentic behaviors — human-in-the-loop, generative UI, etc. So it's a question of which toolbox you want to use.

One question — someone said... I will pronounce the name: H. Very thoughtful name, H. Can you see the chat here? Yes, we can. So if you're on YouTube, LinkedIn, or X, drop us a message — we'll see it; we
can pull it up like this. Yep, we just discovered that we can see the chat here, and we can answer questions along the way.

Funny enough, I was actually having a conversation with a customer today who's using Mastra, and they asked a bunch of questions about things we're working on but haven't fully documented or released yet. One of those things was client-side tool calling, which, with things like AG-UI and CopilotKit, starts to become easier to do — both the human-in-the-loop part and the ability to call tools not on the server side where the agent's running, but in the browser, on the client. In their case — I won't share too much of what they're building — it's kind of like an AI-powered video editor, where they need the agent to be able to call tools that change what happens on the client, in the browser.

And client tools don't mean just the browser: a client tool is any tool that is executed away from where you're requesting it. So let's say I'm on my own back end and I make a request to Mastra — or to anything — and I specify client tools. That means the AI agent will stop what it's doing when it hits that tool and respond to the user saying, hey, I stopped because you need to call this tool that you told me about. Once you call the tool — and it could be anywhere: the browser, another server — you send a message back saying, "Okay, I handled the tool, keep going." This whole process exists today, but obviously all these frameworks are trying to abstract it and make it easier. Yeah, it's just kind of hard to do right now. The examples I was building six months ago were very difficult — very manual to wire up the front end to get really interactive experiences. I don't think it's solved yet, but you can tell we're getting closer; it's getting much easier. For sure. What should we do next? Well, I think we have one other person who might want to join.
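The client-tool round trip described above — the agent pauses when it hits a tool it can't execute itself, hands the call back to the caller, and resumes once the result arrives — can be sketched framework-agnostically. None of these names come from Mastra or AG-UI; `runAgent` and `getTimeline` are stand-ins for a real agent call and a real client-only capability.

```typescript
// Framework-agnostic sketch of client-side tool calling. The agent stops
// at a tool only the client can run, returns a tool-call request, and
// resumes when the caller reports the result back.

type AgentTurn =
  | { status: "done"; text: string }
  | { status: "tool-call"; toolName: string; args: Record<string, unknown> };

// Pretend server-side agent: it "decides" to call a client tool on the
// first turn, then finishes once it has the tool result.
function runAgent(prompt: string, toolResult?: unknown): AgentTurn {
  if (toolResult === undefined) {
    return { status: "tool-call", toolName: "getTimeline", args: { projectId: "p1" } };
  }
  return { status: "done", text: `Edited video using timeline: ${JSON.stringify(toolResult)}` };
}

// Client side: tools that only exist in the browser (or another process).
const clientTools: Record<string, (args: Record<string, unknown>) => unknown> = {
  getTimeline: () => ["clip-a", "clip-b"],
};

let turn = runAgent("trim the intro clip");
while (turn.status === "tool-call") {
  const result = clientTools[turn.toolName](turn.args); // runs on the client
  turn = runAgent("trim the intro clip", result);       // resume the agent
}
console.log(turn.text);
```

The loop is the whole idea: the agent's state machine simply treats "waiting on a client tool" as another turn, which is what the frameworks mentioned above are abstracting.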
Let's let him in — the Sultan of Support himself. I'm trying to think of what goes with RAG... the royalty of RAG? I don't know, something RAG. We'll workshop it. The Sultan of Support — we're just trying alliteration around here; we don't have a lot of good ideas, we just reuse the same things. What's up, Nick? Hey guys, support is here.

For background: last week alone Nick probably answered hundreds of Discord messages — not because he has to, but because he's just a helpful dude, helping out everyone in the community. Very appreciative to have you, Nick. Give us a few sentences on yourself and what you do at Mastra. Sounds good — oh wait, first, you guys can hear me, right? I just want to make sure my audio sounds good; I always have audio issues. So I'm Nick, a founding engineer at Mastra. I primarily specialize in storage and RAG, and I've been working on memory. A lot of our team is ex-Netlify, so I've got that under my belt too. Basically, I think we all got poached by Shane. Poaching is illegal in Africa — no poaching, we got scouted; you came to us. I don't know, dude, you just showed up on our doorstep one day. And there were probably some family conversations, because you and Obby are brothers and all. Yeah, we forgot to mention that — big bro and little bro in one live stream.

But what are some of the things you work on at Mastra, Nick? Right now I'm in charge of a lot of our RAG features. A lot of it is our vector databases — we have a lot of vector support; by now I believe it's over 10 vector databases. I've been helping review the PRs that come in for those, implementing a bunch on my own, and making sure they're all as seamless as possible with our Mastra infrastructure. The Wrangler of RAG — that's the alliteration we're looking for. So Nick, a lot of users
come to Mastra trying to build a knowledge base and such — what do you think are the common pitfalls for people trying to do RAG right now? I think it's basically setting up and properly formatting their data, getting that into the DB, and then knowing how to properly query it. A lot of it is knowledge — first-time learning how to do it. As a concept, it's hard for someone to get in there and really know what they want to do until they start researching it. I think we provide a lot of tools that make it easier, but you still have to know what you want out of RAG before you can properly implement it and get it working for your use case.

So I guess, Shane, we should ask the big question we always ask — let's see how Nick does. Go for it. Nick: is RAG dead? RAG is not dead. All right, hot takes — RAG is still alive and well.
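For context, the pipeline under debate here — chunk documents, embed them, retrieve the most similar chunks at query time — can be sketched in miniature. The "embedding" below is a toy word-overlap score; a real setup would use an embedding model and a vector database, and the text and sizes are made up.

```typescript
// Minimal toy RAG pipeline: chunk -> score -> retrieve top-k.
// The similarity function is a bag-of-words stand-in for real embeddings.

function chunk(text: string, size: number): string[] {
  const words = text.split(/\s+/);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += size) {
    chunks.push(words.slice(i, i + size).join(" "));
  }
  return chunks;
}

// Toy similarity: count words the query shares with a candidate chunk.
function score(query: string, candidate: string): number {
  const q = new Set(query.toLowerCase().split(/\s+/));
  return candidate.toLowerCase().split(/\s+/).filter((w) => q.has(w)).length;
}

function retrieve(query: string, chunks: string[], topK: number): string[] {
  return [...chunks]
    .sort((a, b) => score(query, b) - score(query, a))
    .slice(0, topK);
}

const docs =
  "goats eat everything in the yard. flights to Sioux Falls leave daily. rerankers improve retrieval quality.";
const top = retrieve("flights to Sioux Falls", chunk(docs, 7), 1);
console.log(top[0]);
```

Every knob here — chunk size, the similarity function, top-k, and a reranking pass that isn't shown — is one of the decisions the conversation keeps coming back to.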
So why do you think that? The thing we always get asked is: with the larger context windows — especially things like Gemini and Gemini Flash, huge context windows — why do you still need RAG? What are some of the use cases, and why is it so important? I think it's especially important for memory. Even with large context windows, it's another tool in your belt that lets you properly utilize memory and external DBs rather than having everything in the context window — especially for use cases where you hit token limits, or you don't want to rely on just the context window, or you want to be able to support multiple models rather than just Gemini or something like that. Having the ability to swap out models and still have the same kind of functionality — I think that's useful. That'll be the main thing.

You know, I was talking to someone at the event on Friday, and they were saying they used RAG with a vector DB provider, and their latency was actually lower when they just shoved everything into the context window — because it could all fit. It was big, but it could fit in Gemini Flash, and they said their latency was lower and their accuracy was higher. I think for some cases that's probably true. But I do think — and maybe it's because RAG is harder, right? There are things you have to think about: chunking, embedding models, how you retrieve, how you rerank. There are so many things you have to piece together to get it right, and you have to measure it and determine whether it's right, where the easy solution is just to use the big context window if you can — it's probably going to be pretty good, and you get there really fast. I think that's why this debate keeps coming up. As much as I'll say the hot take that RAG is dead — I don't think RAG is dead, but I do think it's becoming a little more niche, because you don't need it quite as often as you used to. You used to have to pull it off the shelf right away; now you don't always have to, or you can wait longer. But when you do,
you're going to have to spend quite a bit of time to actually get the accuracy to be what you need it to be — at least that's been my experience. Also, I think RAG gets a bad name because people think retrieval means you have to use a vector store, but you could retrieve from anywhere if you really wanted to.

The latency thing is interesting, because the minute you add database calls, of course you're going to have more latency: one call to Gemini versus one call to fetch your embeddings — or to create them — then one call to the model, and potentially one call for reranking. So obviously there's going to be more latency, 100% — but maybe that's the trade-off you make. Yeah, I think it's definitely the trade-off. RAG is definitely a lot more setup, definitely more pitfalls, but if you're able to set it up properly, it gives you another tool, more flexibility, rather than relying only on context windows. So the answer is: it depends. It depends — the answer everyone wants to hear, of course. Well, you've got to keep the people doing consulting in business. And it's actually true — it does depend. Plus, there
are RAG companies now that do RAG for people. I'll name a few, if anyone's listening — we have 94 people here, hello, thanks for being here. You can use Ragie.ai — I'll put them in the chat — if you want professional RAG stuff. There are also some other homies that do RAG: Graphlit, they have an MCP server for it, check them out. Who else does RAG? We met some dude in the new YC batch — their company's called Chon — they do RAG. It's not dead. And of course you can roll it yourself too.

We're doing the hackathon — which we haven't talked about yet, but we are doing the hackathon right now, and MongoDB is a sponsor. You need a vector database; you've got to store it somewhere. There are a lot of good vector databases out there, but you definitely can't go wrong. We have a hackathon going on right now — the Mastra.Build hackathon. Come build agents; it's not too late, it's all week long, tons of prizes. You can check out the previous live stream we did earlier today announcing what all the prizes are. Go to mastra.build if you're interested in building some stuff with us — come build with us on Discord. And with that, Nick, thanks for being here. See you later. See you, Nick. Peace.

All right, then there was just us — then there were two. Should we build something? Let's build, dude. I've been itching — had to get all the news out of
the way, so I'm ready to go. All right, I'll take over screen share. For those of you just joining us, thanks for tuning in. We chatted about some AI news, met some people on the team, and now we're going to try to build something — something we've been wanting to build into the Mastra framework for a long time. I don't know how far we'll get today, but we're going to do some exploration, and then of course we'll keep working on it offline, and hopefully you'll see the results at some point in the somewhat near future. Nice. Do you see my screen? I can see your screen — I'm just trying to figure out how we can best position it. There we go, so we can see it well. And I'm going to do the same thing I ask you to do every time you share your screen. Yep, I got it. Sick. All right, now we see the RAG page. Okay, cool.

And if you're listening — if you're one of the 100-plus people listening to this — and you do have questions along the way, we'll try to stop and answer them as we go, within reason, until we struggle and get frustrated and stop paying attention to chat. On YouTube, on Twitter, on LinkedIn — just find us and send us a message. Cool. And I can't see the chat, so just let me know. Yeah, I'll keep on top of it.

Cool. So, what we're building today: a lot of our customers — or users, or homies — are asking us... got them, got them... we're going to look into adding guardrails. A lot of people want guardrails on their agent calls. So first we've got to understand what
guardrails are. Let's just assume we don't know anything — which we don't, right? We don't know anything, so let's figure it out. I know there are other agent frameworks that do guardrails — specifically the OpenAI Agents SDK — so we can go learn from them, see what they're doing, and then form an opinion. Let's go look at what other people are doing, and then we can see what our users want. So first let's pull up the OpenAI Agents SDK — I'm just curious how they think about guardrails. I'm kind of learning here too, so we can learn together, and I'll read out loud as if we were in freaking class or something. Well, you know, we always say we're builders doing AI — that's how we've always framed it since we really dug into this. We've been doing this for a while now, but we're still figuring this stuff out just like you all are; things are changing quickly.

So, guardrails. "Guardrails" as a name doesn't fully make sense to me — I mean, it kind of does, but let's figure out why they named it that way. Also, this word "tripwire" is, in my opinion, a terrible name as well — but it's hard to name things, so you've got to give people some slack. A tripwire — I guess I could theorize what it is; we'll find out. So, what are guardrails? "Guardrails run in parallel to your agents" — which is interesting; the parallel part is interesting, because it observes stuff. Keep going: "enabling you to do checks and validations of user input. For example, imagine you have an agent that uses a very smart (and hence slow, expensive) model to help with customer requests. You wouldn't want malicious users to ask the model to help them with their math homework, so you can run a guardrail with a fast, cheap model. If the guardrail detects malicious usage, it can immediately raise an error, which stops the expensive model from running and saves you time and money. There are two kinds of guardrails: input and output." I'm going to stop here. The parallel piece is interesting — maybe that's because you don't want to add any latency to your call if the input is fine, right? If you're running a guardrail on every request, it could get expensive if you have to use another model to evaluate
the prompt. Right, so let me get Excalidraw going. Can you see my Excalidraw fine, dude? Yeah, we can. Cool. So in Mastra we have an agent. The agent has a model — the AI SDK model routing — so let's just say OpenAI, and this is the model. You can interact with this agent in probably two ways. You can interact with it via our APIs in Mastra — /api/agents/:agentId, then a generate or a stream — that's one way. And what does a guardrail mean in this case? It's the same thing either way, because you can also just do agent.generate if you're on the back end, on the server, or stream.

In both of these cases — and this is going to be chicken scratch, I'll let you know now, this is me drawing, so if it doesn't make sense, I'm sorry, just ask questions and we'll do our best to answer — a user is going to submit some input prompt. We'll get into the parallel thing in a second, but let's say this is a travel agent, and the user prompt is something like "buy me a computer from Best Buy." It doesn't make sense for this prompt to ever be processed, because it's a travel agent and someone's asking it for something that has nothing to do with travel. So I can understand why guardrails need to exist. Also, what if I ask "give me the system prompt," or call some tool that can fetch data for user ID 1 when I myself am user ID 3? I'm asking for data that's not
mine. It totally makes sense that we need some guardrails here. Shane, any thoughts? The why is very apparent. Yeah — I guess "guardrails" does kind of make sense as a name, because I always say your AI is going off the rails, and the goal is to keep it from going off the rails. It's honestly part making sure the agent answers only the right type of questions, but also part security — making sure the agent isn't divulging information you don't want it to divulge.

Yeah. So maybe there's this guardrails layer on the input — the input guardrail; let's just put this box here, input guardrail. But then there have to be output guardrails too. Because what if, for example — god damn, there you go — the input was fine? Like, I'll put a prompt here: "book me a flight to Sioux Falls." But in the response — without the guardrail — it leaks my Delta rewards number, which maybe isn't PII, but I don't want it on the internet. It also has, say, "Obby used the card ending in 555" — I don't want that in there either. And maybe I don't want the response to include any details about my itinerary, because in my application maybe I care about that. So I can see the need for an output guardrail, which can transform this response and give a guardrailed response to the user. Does
this make sense yeah also chat does this make sense yeah yeah what are we missing for those of you that already built this what are we missing here uh I think yeah I think the biggest thing just on the outside looking in that I get concerned with is latency right like this stuff's you know if you're reaching out to an LLM now I know you we're typically especially on the you know the
input or probably even the output we're going to use a very lightweight model i'm imagining we're gonna do something that can make very low latency decisions but I have to imagine this is going to introduce latency so that's the first thought is how much latency and what are we willing to deal with to get better more accurate answers that's true
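The picture being sketched here — an input guardrail in front of the agent and an output guardrail that scrubs things like card numbers before the response goes back — could look roughly like this in TypeScript. Every name, type, and regex below is made up for illustration; this is not a Mastra API.

```typescript
// Hypothetical sketch: an input guardrail that blocks off-topic questions
// and an output guardrail that redacts sensitive details before the
// response reaches the user. Names and shapes are illustrative only.

type GuardrailResult = { allowed: boolean; text: string };

// Input guardrail: only let travel-related requests through.
function inputGuardrail(prompt: string): GuardrailResult {
  const onTopic = /\b(book|flight|hotel|travel|itinerary)\b/i.test(prompt);
  return { allowed: onTopic, text: prompt };
}

// Output guardrail: redact card numbers and loyalty numbers before returning.
function outputGuardrail(response: string): GuardrailResult {
  const redacted = response
    .replace(/card ending in \d+/gi, "card ending in [REDACTED]")
    .replace(/rewards number \S+/gi, "rewards number [REDACTED]");
  return { allowed: true, text: redacted };
}
```

The input side is a yes/no gate, while the output side is a transform — which is part of why "guardrail" alone starts to feel like the wrong name later in this discussion.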
that's true and honestly like this isn't exactly a guardrail but one of the I mean it kind of is one of the things that I constantly tell people is oftentimes people are really bad at writing prompts and so rather than just having an input guardrail that blocks I often I often like coach people on you
are almost like a prompt enhancement step which isn't really a guardrail but I wonder if you could bake those two things together I hate the name yeah that's why I hate the name because actually what this is is like a middleware yeah it kind of serves as like a middleware and so anytime I've done this with Mastra I basically just
build like a workflow where I would have the enhancement as the first step and then do the agent call and then maybe have some kind of you know multi-step where that is calling multiple agents where it's like a a lightweight agent at the beginning to rewrite the prompt then you have a the real agent doing the work and then maybe a lightweight agent
making like just verifying that you did it right or like giving that yourself the option to like tweak it or change it yep also I want to shout out a homie who also said this same conclusion can you see my Discord let's see funtor where are you at brother i'm going to look for you cuz you had this brilliant idea already there it is
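That three-stage workflow — a lightweight enhancer rewrites the prompt, the real agent does the work, then a verifier checks the result — could be sketched as below. The step functions are stand-in stubs, not real Mastra or LLM calls.

```typescript
// Sketch of the enhance -> agent -> verify pipeline described above.
// Each Step is a stub standing in for a real (lightweight or heavy) agent call.

type Step = (input: string) => Promise<string>;

// Lightweight "enhancer" agent: rewrites the raw user prompt.
const enhancePrompt: Step = async (prompt) =>
  `Rewrite of user request (be specific, include dates): ${prompt}`;

// The real agent doing the actual work.
const runAgent: Step = async (prompt) => `AGENT RESULT for: ${prompt}`;

// Lightweight verifier: throws if the output doesn't look right.
const verifyOutput: Step = async (output) => {
  if (!output.startsWith("AGENT RESULT")) throw new Error("verification failed");
  return output;
};

// Compose the steps into a simple sequential workflow.
async function runWorkflow(prompt: string, steps: Step[]): Promise<string> {
  let current = prompt;
  for (const step of steps) current = await step(current);
  return current;
}
```

The point of the shape is that enhancement has to run sequentially before the agent, unlike a pure check which could run in parallel.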
I really think the agent abstraction would benefit from some kind of middleware agent middleware that could be you can bootstrap messages that makes sense you could I don't know what this means but we could figure it out um and this guardrails heuristic based tool calling is interesting too but definitely you can do guardrails there and some other so funkurism thank
you shout out okay so okay we're kind of aligning on the possibilities here let's keep reading about OpenAI and see what they do also we should compare and contrast with some other tools too but okay so input guardrails run in three steps first the guardrail receives the same input passed to the agent so once again this is running in parallel so let's
actually maybe recreate this diagram real quick because I believe in this OpenAI world you have like a guard rail agent let's say call it the GLA so maybe these are like a they're like they both are receiving they're like a unit yeah can Can you Can you give me like one or two more clicks there you go sorry about that um and then these are like a pair now
until the input goes to both of them and then they're going to start executing them so okay that makes sense to me next the guardrail function runs to produce a guardrail function output which is then wrapped in an input guardrail result the does that mean what does that mean okay um I guess I have to look into this real quick what is this the output of a guardrail
function okay okay yeah all right this is just output okay it's just a type all right my bad i'm just dumb um but it's then wrapped in an input guardrail result which is the output from the input the input guard rail i hate this name dude but okay um that's fine so finally we check if the trip wire triggered say that three times fast trip wire trigger I can't even do it um is
true if true this exception is raised oh my god try to say this three times fast um and you can appropriately respond to the user or handle the exception okay so what happens if it throws though you have to cancel the request that's happening what if the What if this agent finishes before this thing can um complete like the assessment you know what I mean
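The three-step flow just read from the docs — a guardrail function produces an output, it gets wrapped in a result, and a tripwire flag raises an exception that halts the run — translates to something like this hand-rolled TypeScript analogue. These types mirror the idea only, not the OpenAI SDK's actual interfaces.

```typescript
// Hand-rolled analogue of the guardrail flow: the check returns an output
// plus a tripwire flag, the runner wraps it in a result, and a triggered
// tripwire throws and halts execution. Not the real OpenAI SDK types.

interface GuardrailFunctionOutput {
  outputInfo: unknown;          // whatever the check produced
  tripwireTriggered: boolean;   // true => halt the run
}

interface OutputGuardrailResult {
  guardrailName: string;
  output: GuardrailFunctionOutput;
}

class OutputGuardrailTripwireTriggered extends Error {}

// A dumb "does the output include any math?" check, like the docs example.
function mathCheck(agentOutput: string): GuardrailFunctionOutput {
  const containsMath = /\d+\s*[-+*/=]\s*\d+/.test(agentOutput);
  return { outputInfo: { containsMath }, tripwireTriggered: containsMath };
}

function runOutputGuardrail(agentOutput: string): OutputGuardrailResult {
  const output = mathCheck(agentOutput);
  const result = { guardrailName: "math_check", output };
  if (output.tripwireTriggered) throw new OutputGuardrailTripwireTriggered("halted");
  return result;
}
```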
so well I wonder if it I wonder if it Yeah i mean I I could see you like triggering them both in parallel and then waiting until they both finished to you know to respond then respond seems like yeah that's true because then you could run them both in parallel which would lower the latency and then if it you know assuming that the ideally the guardrail would be a lighter weight
model which would go faster so it should be done before this is like a GPT-4o or something yeah like a mini like a yeah it's like a 4o mini or something mini Gemini then it assumes that this thing is like a you know GPT-4.5 or some yeah or yeah Or I mean ideally even if it was the same right if you could just assuming it wasn't
drastically you know that the guard rail wasn't drastically more you just wait until but you you trigger them in parallel yeah like the assumption is that the prompt checking or the input checking is just a low like a lightweight thing to do compared to what you're sending the uh but then in the case of enhancement you don't want to run this in parallel so I feel like maybe we should do
something that's both parallel and blocking you know what I mean you have the option in the API that we create you can run this guard you can run this guard rail or whatever and it has to complete before you start the operation which is a choice or you could do it in parallel where if it raises an exception
then you don't return the result or you close you know cancel the request yeah abort the stream so um I know we should get these guys on there where are they at i know there's a YC company called Casco and they have like a low latency uh guardrails like API but they'll basically intercept I think it's
get casco or something casco.com i don't know casco yeah there they are but again shout out to these guys yeah so yeah you guys should come on and tell us all about this where you at uh someone find them and tell them we're talking about them yeah we have to book a demo to know that's whack but anyways
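The parallel-versus-blocking option being debated here could be expressed roughly as follows. `runWithInputGuardrail`, `TripwireError`, and the mode flag are all hypothetical — a strawman for the API choice, not anything that exists in Mastra today.

```typescript
// Sketch of the parallel-vs-blocking trade-off. In "parallel" mode the
// guardrail and the agent start together and the agent's answer is only
// returned if the guardrail passes; in "blocking" mode the guardrail must
// finish (and pass) before the agent call even starts.

class TripwireError extends Error {}

async function runWithInputGuardrail(
  prompt: string,
  guardrail: (p: string) => Promise<boolean>,
  agent: (p: string) => Promise<string>,
  mode: "parallel" | "blocking",
): Promise<string> {
  if (mode === "blocking") {
    // Required for enhancement-style checks that rewrite the input.
    if (!(await guardrail(prompt))) throw new TripwireError("blocked");
    return agent(prompt);
  }
  // Parallel: kick both off, then discard the agent result on a trip.
  const [ok, answer] = await Promise.all([guardrail(prompt), agent(prompt)]);
  if (!ok) throw new TripwireError("blocked");
  return answer;
}
```

Parallel mode hides the guardrail's latency behind the agent call, which is why it only works for pure checks — anything that modifies the prompt has to block.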
I I think the the idea is that they they'll like overwrite the message and just say like you know this this message has been blocked have you ever seen like if you ever try to get uh OpenAI's image gen to generate some trademarked image it'll get like part of the way and then it'll eventually flag it and say this was flagged for you yes copyright yeah
you stop the operation and then you return yeah so like once it hits you stop and return some kind of security or canned message around what what the exception was right yeah so maybe then maybe like the middle this middleware can because you want Yeah we need the option to let's just write this down we need the option to run guard rail in parallel and there's a there's
some chat uh what else johan says the per message context retrieval is probably that for every message you do a rag is to give the LLM some extra context that could be relevant to the message yeah dude beautiful comment there thank you that's really good for every new message a new rag is done and the relevant score is checked to the previous one if more relevant we inject
this one instead yep yeah i Are you the OP or you just a dude original poster or just a guy or just person cuz then Thank you for that cuz you added on defuncturerism's idea so so I would call that enhancement right input enhancement because I don't want to make an opinion on what you can do in this function right you should be able to do whatever the you want whether that's rag or
whatever i hate the word enhancers dude because we use that a lot in the Redux days of JavaScript enhancers and they're so whack so we can't use that i refuse but guardrail is not the right word for enhancement so let's leave that there hey chat think about it for us while we keep reading yeah is there a better name than guardrail i mean also tripwire is
whack i'm just going to write that down too tripwire is whack and if anyone disagrees that's okay but I think it's whack as a name um all right let's look at this note here input guardrails are intended to run on user input so an agent's guardrail only run if the agent is the first agent ah so it's talking about this in an agent network or a like swarm type of situation so in this case the
guardrail only exists on the input message going in but I don't agree with that because we're not going to do it that way if you put the middleware on your agent or whatever we're calling this middleware right now if you put this on the agent regardless if you're in an agent network I'll just draw a freaking
network real quick so for example this box now is in this network there's no reason why we can't apply all these same enhancement whatever middlewares you know as agents are inputting to each other right so in agent network you have like a router agent so I'll just put that dude on the board this is like the the routing agent it's the one doing all this kind of like you know quarterback
action so the guardrails would or these middleware would still exist in these calls from the routing agent so I don't think that we don't I don't think this matters or it shouldn't right if it's the first or whatever would you think so maybe in OpenAI that they put guardrails on like the whole run oh wait no they don't why
is the property on the agent instead of passed to Runner.run it's because guardrails tend to be related to the actual agent yeah duh um you'd run different guardrails for different agents so colocating the code is useful if this is true then why does it only run on the first agent maybe it's just a nuance
yeah I mean I could see Yeah I don't know i could see why you might want to have guardrails for different agents but yeah I don't know why you why you wouldn't just allow each agent in the network to be able to have their own guardrails so you could independently isolate and you know check the out the
output but maybe just for latency reasons you'd want it on just the initial but I I don't know seems seems arbitrary to me yeah if you want guardrails then you can't be I mean maybe you still can be lat latency sens sensitive but you have to measure that so that's going to be tough but maybe that's just a caveat yeah well I think it's just kind of like
if you if you're doing like input guardrails you could pretty much you could keep it pretty much low or no latency right because you're you can run it in parallel along when you send it to the agent and then ideally it would be almost no latency or very very little but if you're doing output guardrails you definitely have uh
some latency so my my dog's barking causing your dog to bark roxy sh Roxy shush sorry about that uh you know you know what I mean like I'll put latency output guardrails would have to yeah um we dog to dog communication uh happening here so I think that that's the that's the challenge for me is I think but ultimately you as a user should be able
to decide that you know knowing that input latency isn't as bad you can maybe just choose to make sure they're not asking to do something malicious and block it block the response if they are but give you the option to do an you know a post output guard rail knowing that that's going to increase latency but maybe for some especially like
security type things you you want to you know maybe verify that the even if the user didn't ask for something that the agent didn't do something unexpected that would cause uh yeah cause the agent to I guess go off the rails right and so while we wait for Obby to get back Johan not not just another dude not not just another dude to me thanks thanks
for dropping in the comments and for those of you just joining you know we got quite a few people watching now that weren't watching 10 minutes ago we are we're trying to build guardrails or at least talking about how we would build guardrails into Mastra the idea being that we want we want to be able to have
make it easy for users to protect their agents whether that's you know protect ensuring that the user prompt that's getting inserted isn't trying to do something that we don't support and also making sure that when that prompt is or is then taken by an LLM and created into a response that we are either sanitizing that response or making sure that
response doesn't go off the rails and and lead the user down maybe a not good path or or leak specific information that it shouldn't be and you are muted still Obby I believe oh sorry my dog went crazy so I just had to Yeah no worries had to regulate dude i have Yeah we we we do it live around here okay so that was input let's talk output
also how much time do we have yeah i don't know 10 minutes 15 minutes i don't know all right we'll keep learning this session yeah this is this is learning learning with Mastra this week and then uh next time we'll start building yeah all right so output guardrails run in three steps same input passed to the agent okay
all right this doesn't make sense the first guardrail receives the same input passed to the agent even though output guardrails are intended to run on the final agent output so an agent's guardrails only run if it's the last agent okay so this is the same like note from this one where it's like the beginning and ending of a
call but once again you could probably do it for each call depending on latency or whatever right um or maybe our views of agents are different than these these guys yeah this makes it seem like the guardrail is getting the same input that was passed to the agent and then it produces the guardrail output which then is wrapped in a result and then it triggers potentially the
this quote unquote trip wire yeah so these are just functions then um which I guess we kind of learned about input output then they're all the same except where they run and what the input actually is i also do not think this is the same input passed to the agent it's like the output right well I guess the output of the
last agent let's say okay sure we don't know what we're doing we'll figure it out tripwire's a terrible name if the input or output fails the guardrail the guardrail can signal this with a tripwire as soon as we see a guardrail that has triggered the tripwire damn are they like really trying to make people like up reading this we
immediately raise a I'm not going to read that exception and halt the agent execution okay this is dope we want to do something like that all right let's see how we implement this so math homework output it has an agent that checks if the user is asking you to do their math homework okay and then that's
just going to return with that Pydantic is a really sick library for those in the Python space i mean we aren't but major respect to Pydantic um and then let's see and and a framework they have a framework too Pydantic AI yeah yeah they're dope hear a lot of good things yeah um input guardrail love Python dude
i love Python so much dude um you I am You and I are not the same i love it dude look how beautiful this looks like damn like I know like don't you just know exactly what this thing does all right um so you run the guardrail agent on the input then you have the result.final output which is this if it's math okay cool this input guard rail this makes
sense to me like how this works sure what's what doesn't make sense is the output right now so let me look at this so this is the output guardrail it runs it on the output.response i called it dude they need to change their docs this is wrong hey if you're watching this OpenAI fix your docs yo um because it is the output of the agent that you're
actually guard railing like how does this work actually what are they doing oh it's this check if the output includes any math i mean this is a dumb guardrail but I get it for sure for sure and then you have an output guardrail i guess if it doesn't have anything it'll sip so I mean ultimately they're just allowing you to pass in functions
either as input or output guardrails to your agent and then behind the scenes they're just running them either in parallel if it's input or on the output response so ultimately it's just these are just like helpers so you don't have to think about adding an extra step and handling that hey chat like we looked at
OpenAI who was good knowledge anyone else have any other people yeah any other prior art that we should consider as we're thinking about this yeah if you if you know of any any framework that does a really good job implementing guardrails or whatever the framework wants to call it uh definitely let us
know we are looking for Does Agno have it i don't know i don't know let's do some digging do a little research i can also ask Claude yeah let's ask Claude see what Claude says what was I asking him last time how to use GCP like idiot oh man we're like doing SOC 2 right now so I'm like how do you actually use GCP um how do I Okay hold on come on
bro come on bro oh no yeah and I'm gonna have o3 do a little research project for us while we sit here and I will report the results do these tools have guardrails maybe they don't call it that i feel like everyone should just call things the same thing you know i say that as I'm about to change the name as
you're like hating the name and trying to think of a better name yeah oh dude we're changing yeah you you should talk dude all right i don't know if these guys have it maybe they do on agents i don't see it specifically um what about LangChain do they have it guardrails output parser one of many oh look at that
maybe this doesn't exist in the new version okay um I will not say anything about docs but come on this must be probably the wrong thing to probably the wrong people to look at honestly I'm letting o3 do some research oh yeah a partner she's from uh maybe So I see here maybe Instructor with Pydantic
oh so we're uh Oh we can get some Instructor Pydantic yeah I mean let's Oh yeah we met with uh we met with this dude but I think the JavaScript version I believe maybe it's the same thing but uh Oops I spelled that wrong am I tripping or is this the wrong library um API reference okay is it like built in chat is that what you're saying or is there an actual like primitive for it try Pydantic
AI is another suggestion just let's look at Pydantic AI and see the homies i know this exists this concept exists in other places we're not you know it's not just OpenAI maybe it's called something else maybe Yeah cuz maybe guardrails isn't the freaking name for it what if I search here is there a guardrail dude they must not call it guardrails right like we're tripping
okay let's see uh so okay so here's what o3 has for me oh you want to share uh yeah yeah so we'll see if it's any good uh where is it i don't know why I can't okay well see if that works all right so OpenAI calls it guardrails so Bedrock has guardrails okay which I'm guessing just works on any model that
you can use with Bedrock so CrewAI has task guardrails okay LangChain which is this was the old that's what we saw yeah yeah we saw so maybe that LlamaIndex same name same basically yeah we're not doing that yeah AutoGen and NVIDIA i don't really care so CrewAI I guess we can take a look at Yeah we can take a look at CrewAI maybe common design
patterns parallel execution task okay I'll take over screen share real quick let's do it yeah let's just look at CrewAI tasks wait where the hell task guardrails to add a guardrail to a task what's a task though oh okay it's like an agent run yeah yeah i'm pretty sure in CrewAI you can basically send a task right to like your
crew yeah i like this task thing we're going to do something similar where you can like create tasks it probably won't be called a task though but didn't you just say shouldn't everyone name things the same to make it easy well if everyone's naming things as guardrails then I guess we already threw it out there then what's that uh
that uh toon it's like there's 12 competing opinions we should like standardize it now there's 13 competing names yeah it's like there are four names for this we should just pick one okay now there's five exactly unfortunately this is the challenge of naming okay LLMGuardrail oh they make a nice little class for it who are we kidding we're
not going to name it anyways we'll just come up with a fake name and then ask Sam to name it and he'll name it exactly sam will name it sam will name it and just just because Sam is like he's very good at naming things comparatively i'm not saying we get everything perfect but he's I naming things yeah he's he he has a little more naming
taste than I would have come up with the name guardrails you know so okay okay so this looks pretty chill like it has a validator which is essentially what we're trying to do too pass a validation type function but where does it run okay man python is so beautiful dude did I ever tell you that it's like
gorgeous dude so legible validate this is cool yeah i don't know if you're being honest i don't know if you're being honest or you're exaggerating i think you should know by now where I stand on this well I know i don't Theirs is actually running essential well I don't know if it's parallel or not can't really but it
definitely can mask things and format them here you can see there's an output formatter format output so is it just on outputs or inputs uh validation transform task outputs before they are passed to the ne next task so that's our output middleware essentially that we want to do validate and transform in one function okay that's kind of
cool errors sent back to the agent the agent attempts to fix the issue until the guardrail returns true oh that's pretty interesting i don't know how it would fix the issue maybe like modifies what the task is the task details i kind of don't like this actually because it seems very non-deterministic yeah yeah i mean I wonder if it's like
if it took the response that had the you know credit card number or whatever and it stripped it out and tried to fix it and then it it wouldn't Oh yeah and then it would it would basically like sanitize the result and then pass it through actually that's a really Yeah that's a good point that's a good point like
these maximum retries and stuff well let's just call it sanitizers you know sanitizers another bad name output sanitizer or something i don't know terrible name we're not doing that kickoff okay yeah this makes sense so I think we're all kind of coalescing on the same idea of Okay to kind of recap how many people or
should we do like a summary of what we Yeah yeah for people just joining we got like you know a lot of a lot of new people joined we so we are trying to figure out how do we build or what's the right form for building guardrails into Mastra the the open source framework basically trying to figure out how do we
ensure that if someone has an input prompt that's asking for something they shouldn't we can you know clean it up or block it if the LLM generates some kind of result we don't want it to send back to the user how do we block that result so it never gets to the user and different frameworks have different ways of doing it Some don't have it at all
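One concrete shape for "clean it up or block it" is the validate-and-transform-with-retries loop from CrewAI's task guardrails read earlier: on failure the error goes back to the agent, which retries up to a limit. A rough stand-in (not CrewAI's actual code, and the task/guardrail signatures are invented) might be:

```typescript
// Sketch of a validate-and-transform output guardrail with retries, in the
// spirit of CrewAI's task guardrails: on failure, the error is fed back and
// the task is retried up to a limit. All signatures here are hypothetical.

type GuardrailCheck = (output: string) => {
  valid: boolean;
  transformed?: string; // optional sanitized/reformatted output
  error?: string;       // feedback sent back to the agent on failure
};

async function runTaskWithGuardrail(
  task: (feedback?: string) => Promise<string>,
  guardrail: GuardrailCheck,
  maxRetries = 2,
): Promise<string> {
  let feedback: string | undefined;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const raw = await task(feedback);
    const check = guardrail(raw);
    if (check.valid) return check.transformed ?? raw;
    feedback = check.error; // let the agent try to fix the issue
  }
  throw new Error("guardrail failed after retries");
}
```

The retry-with-feedback part is the non-deterministic piece questioned above; the transform part is the deterministic sanitizer.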
some have you know different names for it but we're trying to figure out you know what do guardrails mean for Mastra and we're not going to implement it today obviously but we're doing some research and we're trying to figure out what what would the API maybe look like and how would we how would we build something that people could then use and
and get reasonable results from yeah so I think kind of distilled it down and maybe this is too general we could start here but if we had something called agent middleware yeah I I'm only looking at your crew AI screen my bad dude let me share this tab instead great so if we have agent middleware that has a input function
um and then maybe and like so here's some maybe we need some like options for parallel or blocking right and then you also need to be able to throw a I'm going quote this as an exception because there you know that's just what you have to do but maybe it's a special error that you have to throw and then you also have the option
to return so you you're you receive input and you could modify it modify input or you can just call next or whatever based on like maybe you don't want to do anything for this given one does that make sense yeah so I mean thinking through this would we you know do we want just that function to handle call out to any LLM
call agent it wants or does you know do we want it to pass in the model and the prompt into the input function like how ultimately because you could there could be a lot of things you do right you might want to write some some algorithmic code that's just like look for these specific words like you know here's a here's a list of swear words don't that don't let the you know one of
those escape i mean so I could see what kind of things can you run yeah because you might want like an LLM as a judge i mean honestly this is very similar to evals right it's just like rather than like hallucinations or whatever you're just like you're rather than just doing the eval and giving a score you're
actually letting the the model decide if it should go through but otherwise it's pretty pretty close and so a prompt evaluator via code or LLM yeah so I mean whatever we do we should try to make it Yeah we should try to make it feel ergonomic with what we're doing with eval because dude technically in a guardrail if it's a function you could send it to a human and they could be like
sure you know yeah i mean the music that that means we're you know we we've been vibing for a while if the music cuts out yeah uh yeah i mean yeah you could send it i mean I don't know maybe in a long I I don't know how that's different than just like uh other human in the loop type activities but yeah you could in theory exactly yeah I wouldn't probably no one
would do that because it could take way too much time to get a response from a human but I mean that just shows you could do whatever you want in this function right to then throw this exception and we don't necessarily we shouldn't have an opinion on what you want to do we'll have examples of how to do some stuff but I think that's truly
what this thing is doing um chat would you agree i can't see the chat so I don't know uh only thing we got was uh interesting which I agree i don't know if that's a good thing hope hopefully hopefully it's a good thing seems like a good thing i can't see how output would ever be parallel right it just blocks the return i think it would have to right
hey Chad if you're listening uh and we're we're we're going off the rails if we're the agents going off the rails here be our guard rails help us out yeah I think it I think it would need to be blocking and because you're basically you are wanting to intercept the response so the only way would be the only true way would be latency you know
intensive of course but you want to if security is a major concern you may want to block it yeah and this actually could be a certain type of eval as well this would be an output eval via LLM if you want or you can use NLP libraries and whatever then we could also bring Mastra evals like the eval kind of tools into
this world you know uh that we already have yeah i mean that's that's what I'm saying i feel like there's a way to connect these two in an ergonomic way correct so it just feels like eval whether you're doing eval or whether you're doing guardrails it's they feels cohesive i I don't know exactly what it
is but it Well the guardrail itself is just some function that they called guardrail so for us I don't know we're calling it like we'll call it something but it's an input function because you can also return a formatted output right so that's here or you can just do next on it like go on with your life it's pretty sick dude if we had this it's pretty cool
and the fact that we need to revive evals a bit to make it work nicely is good because now we get two birds we've already been we've already been working on it so it's just additional uh motivation yeah additional reason to make sure we do we do it this would be cool then because then you can do guard rails and you can do like transformation um and you put on your your
agent would you want to call these I mean actually my next question will take us off the rails so I need guardrails too because I was going to talk about tools and tool middleware and but that's like a that's another discussion um yeah and Jay Jay says "OpenAI SDK can do both input and output interceptions." Yes we if you joined
late we were looking at some of the OpenAI documentation for inspiration so yes we looked through we looked at a bunch of different ones yeah you know ultimately just trying to figure out for those that have joined what's you know how should guardrails work in Mastra what what do we think what can we learn from others that are
doing it how do we make it work with the other things that the other primitives that are part of the framework and how does this how do they you know tie into potentially like eval because they are somewhat similar somewhat related yeah hm i wonder if more so people want to transform things you know like would people use this more to enhance inputs on the way
into their model right with input enhancement yeah it's like an enhancer versus I mean can you just would it be an enhancer that can throw if it's you know so you can you can basically completely block or you can just Yeah change i don't know and then the parallel execution is interesting too like how like the right API for that we'll have to Yeah yeah cuz it's parallel it's not really an
enhancer at that point that is just strictly straight up guardrail action yeah or as OpenAI would call it that's like a tripwire right it's just like flags flags that something needs to needs to stop execution yeah lots to think about any thoughts from the chat yeah i mean as we get kind of close to wrapping up here this was we we're
hoping to write some code today but turned into a research project that's still pretty fun yeah I I learned I learned a lot hopefully hopefully those of you that tuned in you know you learned some stuff as well and you can uh see how we make decisions around here we just uh figure it out yeah we just uh do the do the same thing you would do do some research try to figure it out see
what makes sense talk it out and yeah next time we'll hopefully start building yeah and this is good because we kind of learned that guardrails aren't enough if we only implemented guardrails then the next question would be I want to do some other it needs to be probably be a little bit more like fully baked on like what you can do with this but it's
cool yeah so I got another question let's take a quick detour as we start to wrap up that was fun can you talk about the hackathon yes we are doing a Mastra hackathon right now if you haven't seen it go to mastra.build i'm just going to pull it up i'm pull up the site yeah share the tab let's share so yes we are
again doing a hackathon go here to this site sign up it just started today so you are not too late you know it's all week long it's meant to be you know kind of come as you are you build something cool every submission gets prizes there's a whole bunch of raffle prizes so if you submit something you have a
chance to win some cool stuff so there's a live stream we did earlier it's on our YouTube channel if you do want to see what the prizes are who the sponsors are uh figure out how you can actually participate but you can drop your email in there and just come to our Discord there's a mastra-build channel in Discord and that's
that's the best place to go if you're building something if you need help if you have questions around the rules anything like that please go to our Discord if you're looking for a link to our Discord best place to go is just go to our website it's in the footer or our docs it's in the header join our Discord
and yeah ask questions there and with that what do you think what should we wrap up that was a good stream we'll we'll be back tomorrow so yeah and and reach out to us yeah let us know what you want to see on these uh we're trying to trying to do these more often we bring on people from MRA to talk about
the stuff that they're working on we bring on guests so you know we have a couple guests coming on later this week just to talk about things they're building with AI ideally have we want this to be educational but also fun so if you have things that you want to learn more about that you think Abby and I should you know should talk about and
maybe teach or figure out on our own as you saw today we're learning alongside all of you let us know and yeah we'll try to incorporate a lot of that stuff in cool well thanks everyone for the admirals of AI by the way uh we are not we are not in the palace of the dog today nope nope we're traveling so we'll still keep doing the live stream yeah yeah we are we are all the admirals of AI uh trying to figure
figure this thing out so good to see you all and we will see you next time yep

