Build and deploy your first agent with Mastra

February 13, 2025 · 5:00 PM UTC · 1 hour

2025 might be the year of AI agents but this session is your chance to actually build and deploy an agent yourself.

AI agents are autonomous systems that can perceive their environments, make decisions, and independently take actions to achieve specific goals. With today’s tools and frameworks, it’s never been easier to build them, and Mastra.ai co-founder Shane Thomas will show you how to do it in less than an hour.

Use agents to build things like a personalized travel agent that puts together vacations for you. Or use multiple agents to simulate a ‘team’ of writers and editors that seamlessly collaborate on content.

This event is open to all devs and aspiring AI engineers, regardless of background, so feel free to share the invite link. It’s recommended that you have a code editor and Node.js v20+ installed prior to the session. You should be comfortable with basic JavaScript and the command line.

This isn’t a talk; it’s a live workshop where you’ll walk away having built and deployed your first AI agent.

Workshop Transcript

0:02

and yeah thanks everyone for joining this is meant to be interactive so if you want to try to follow along you can we will move kind of fast so you know if nothing else you can always refer to the recording after the fact and I'll also share the Git repo that we build along the way let me share my screen

0:23

here and we'll do a quick introduction but then we'll spend most of the time actually just building something together so that way we can try to build and deploy an agent uh you know together so a few just kind of introductory slides so our first you know first things first what's the goal let's try to build and deploy our first AI agent you know some of you might have already built some things some of you have not

0:54

and so we'll start from the beginning talk about uh kind of the whole process and then we'll show a little bit how you do it in Mastra and of course you know talk about and maybe have some time for some questions at the end so we're going to learn some common AI engineering concepts and we're going to learn some of the Mastra kind of core

1:11

primitives you know agents workflows evals who am I I'm a founder and chief product officer of Mastra I was in engineering and product at Gatsby and Netlify before I built Audiofeed and I've been you know this dates me a little bit but I've been in open source software for over 15 years now starting uh you

1:32

know when I first started a website development company about 15 years ago working primarily with Drupal so I've been doing a lot of Open Source stuff over the years and some info to connect with me if you have not already but the first thing that I think is worth answering or talking about is what are agents you know there's a lot of debate on what's considered an agent

1:56

I think the easiest definition I've seen is just it's some kind of software with non-deterministic code meaning you can't always predict the result there's something making a decision that's not our traditional you know deterministic code that we write where we know if it gets executed it's going to return the same thing every

2:14

time but it's really a spectrum you know something that's maybe considered less agentic is something that is mostly deterministic but maybe there's a call to some kind of AI system typically an LLM but wouldn't have to be something that's a little bit more agentic would be if you're using LLMs to kind of build

2:33

and execute their own plans and access their own tools and so they're making more decisions and therefore since the LLM is making more decisions it's a little bit more agentic and then kind of like the most agentic you know the examples that we have would be something that's more fully autonomous that runs with very little or no human input an

2:53

example there might be like a self-driving car so you got all the way from like a simple trip planner to maybe a customer service assistant all the way to full self-driving and there's also been a lot of talk around you know do you need a framework for this stuff and I think it's like anything else no you of course don't but I do like this tweet here you

3:19

know I'll let you all kind of read it but you start to as you get into this stuff realize that there's a ton of new concepts ultimately it does follow a lot of the uh software development best practices so if you've been building software before you're going to see that a lot of it's pretty similar but there's a lot of new concepts that make a lot of

3:42

this stuff working together kind of confusing and that's where we really see the opportunity for frameworks like Mastra to come in so Mastra is an open source AI agent framework for TypeScript and what's included with Mastra we have agents that have tools memory tracing we have state machine based workflows so that you can do human in the loop meaning you can wait for the

4:08

human to respond and then continue a workflow we have evals which we will get to show today and then we have storage for RAG pipelines a local development playground and a whole bunch more that's not included but our kind of high level principles is we want to provide something that's opinionated so you

4:26

don't have to make a lot of decisions to try something out but it's flexible that you can piece together and you're not kind of locked in so an example of that would be by default you get you know storage baked into Mastra and you know you get kind of a vector database baked into Mastra but you might want to use Pinecone or pgvector and so you can just swap that out and if we don't have a

4:51

supported provider you can write one yourself so we do want to be opinionated but allow you the flexibility to really piece together the you know AI or agent stack that you want and we also want you to be able to just get further faster so you can actually get something out and deployed quickly and then have time to

5:08

iterate on it rather than worrying about making a ton of tech stack decisions before you even know what you're actually building so this is what we're going to do and then we're actually going to jump into the code and we will see how much we can get down in maybe you know say 30 to 40 minutes and then we'll have some

5:25

time to kind of do some followup and maybe answer a couple questions so we're going to create a new Mastra project we'll create an agent that can tell us the weather so we'll do kind of the simple example we'll run some eval metrics for our agent we'll talk about a workflow that can plan activities based

5:43

on the weather we'll also then create an agent that can answer questions about maybe some personal finance transactions in a Google spreadsheet and then we'll deploy and test our agent in production so we'll actually deploy it somewhere so it can be live on the internet and if we can do that in a half hour I'd call that a

6:02

pretty rapid fire Workshop so let's go ahead and see if we can get started so I'm just going to get started by running I guess while I'm doing this if you can't see anything feel free to either unmute and tell me or you know yell in the chat and you Sam feel free to you know interrupt me but hopefully this is all visible maybe I can make it

6:30

just a touch bigger okay so we're going to run npm create mastra@latest just to make sure we get the actual latest version and it's going to ask me to name it so I'm going to say Feb 13 workshop and this is going to go in and install all the dependencies for Mastra and Mastra is built so it can be run standalone meaning your agent and your

7:06

workflows are all run behind an API and you can deploy that independently of any other application this means you could build an agent that can interact with a mobile app a you know desktop app a website it can also be used and you know we do have people that use it within an actual front-end website like a Next.js site as well so it's meant to be used in either way but in this case

7:29

we'll just show kind of the standalone version kind of running on its own we're going to just add the defaults because it'll give us some boilerplate code to look at and see something working right away we do want our agents to have tools so we'll add tools I am going to just use OpenAI although I kind of like Anthropic better but we'll use OpenAI because that's the most common I will skip putting in my key and

7:56

then we do want to add the example so this is just the weather example we'll go through first and we'll take a look at the code so I'm going to go into this and I'm just going to copy over just my uh environment variables which has my OpenAI API key but typically you would just need to drop that API key

8:27

in I will just drop it in here so the first thing we can do is rather than looking at the code first let's actually run the development environment so Mastra comes with a dev server that allows you to see your agents and your workflows immediately and kind of iterate and test them so we're going to

8:46

run npm run dev and this is going to spin up and run the mastra dev command so you can see it starts it has an OpenAPI uh spec so we can actually interact with the APIs that it gives us and we can also then just open the playground so I will open that up and hopefully you all can see my screen

9:12

we can see the playground so there's this weather agent let's actually chat with it and see what it can do so let's just ask it what it can do here and you can see over here on the right it tells you the instructions that this agent has tells you it's using GPT-4o and it has a tool called get weather let's just ask it the weather so I'm from Sioux Falls so let's say what is the

9:36

weather in Sioux Falls and it says the current weather is -19 Celsius so uh that's pretty cold and because I'm in San Francisco let's just check what the weather is in San Francisco right now much better and you know this is kind of nice but I actually would want this in Fahrenheit because I don't actually know what this feels like I have a general idea let's actually update that and get

10:17

that working so let's see how the code actually works I'm going to go here and I'm using uh so I'm going to be using an editor called Windsurf some of you may be using an editor called Cursor if you haven't used one of those tools I'm going to be kind of showing you how the way I develop has changed a lot over the last probably 18

10:42

months definitely the last 12 months so I'll be showing you uh kind of how I build things and I would encourage you if you haven't tried some of these tools just to spend some time with them because it does uh kind of change how you write code if you've been doing it for a long time and you uh are looking to maybe

11:01

move a little bit faster so let's take a look at what comes out of the box when we created this project the most important thing is there's this src directory we have agents we have tools we have workflows and then we have this index file you can see that we just create or instantiate this Mastra class we pass in a workflow we pass in an agent and we

11:26

basically say that we want a logger attached to it so because this is hopefully pretty self-explanatory we know that our agent probably lives in here the tools the agent uses live in here and the workflow which we didn't look at yet but we will soon lives here so let's go to our agent you can see

11:44

here those instructions we saw I'm going to say um always respond using Fahrenheit convert Celsius to Fahrenheit if necessary so I'm going to save that and then we'll go back and yeah see if it actually listened to that but we can look at the tool as well so there's a lot going on here but this is really the kind of the base of it we actually

12:20

create a tool we're calling it weather tool and it's going to have an output schema of the temperature you know humidity all this information and it just calls this getWeather function which we have written down here it's actually calling out to this API it's fetching the current response based on the location and then it is going to you

12:47

know basically get that information return it in this format and the really cool thing about this is now I have basically been able to wire up OpenAI uh in this case let's see what we're using OpenAI GPT-4o to be able to decide that it wants to run code that's living on my system so that's pretty interesting because now I can change this code give my weather agent

13:17

different tools and it can then execute that code and then make decisions so if we think about like what's happening when I actually do this if we go back here oh I do need to run dev again as I stopped that and you can see it says always respond in Fahrenheit we'll see if it listens but let's just talk about what's

13:43

all happening behind the scenes so I ask what the weather is in Sioux Falls hopefully it does it in Fahrenheit this time and it's still in the negative so yeah pretty cold I am happy I'm not there but let's talk about what actually happened and we can actually dig into it a little bit but it sends the request to

14:09

you know OpenAI and then GPT-4o has decided I need to get the weather so I have these available tools so I'm going to call this tool it comes back to my machine runs the code on my you know on my machine passes the result back to GPT-4o to then analyze that and then kind of synthesize the data and return this response so there's actually a few hops

14:33

that are happening so if we look at tracing we can see that a couple things are happening here so let's just say we're streaming text there's a system prompt that says you're a helpful weather assistant I asked what is the weather in Sioux Falls it did this tool call so it called this weather tool and there's the actual data that was returned so this was you

14:58

know this tool call was run on my machine then it you know passes it back and there's some information here and then last this one should actually have the result which is right there so there's the actual response so you can kind of see we can dig into all the steps that happen in just that one chat message that I sent
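The hops described here (the model asks for a tool, the code runs locally, the result goes back to the model) can be sketched without any framework. This is a minimal illustration, not Mastra's or OpenAI's API: `fakeModel`, `runAgent`, the `tools` map, and the canned weather data are all hypothetical stand-ins, and the "model" is a stub that always requests the weather tool once.

```typescript
// Illustrative sketch only: none of these names are Mastra or OpenAI APIs.
type ToolCall = { tool: string; args: Record<string, string> };
type ModelReply =
  | { kind: "tool_call"; call: ToolCall }
  | { kind: "text"; text: string };
type Message = { role: "system" | "user" | "tool"; content: string };
type Tool = (args: Record<string, string>) => string;

// Stand-in for the hosted model: asks for the weather tool once, then answers
// using whatever the tool returned.
function fakeModel(messages: Message[]): ModelReply {
  const toolResult = messages.find((m) => m.role === "tool");
  if (toolResult) {
    return { kind: "text", text: `Current conditions: ${toolResult.content}` };
  }
  return {
    kind: "tool_call",
    call: { tool: "weatherTool", args: { location: "Sioux Falls" } },
  };
}

// Tools run locally; this one returns canned data instead of calling a weather API.
const tools: Record<string, Tool> = {
  weatherTool: (args) =>
    JSON.stringify({ location: args.location, temperatureC: -19 }),
};

// The loop: send messages, run any requested tool locally, send the result back.
function runAgent(userText: string): { answer: string; trace: string[] } {
  const trace: string[] = [];
  const messages: Message[] = [
    { role: "system", content: "You are a helpful weather assistant." },
    { role: "user", content: userText },
  ];
  for (let hop = 0; hop < 5; hop++) {
    const reply = fakeModel(messages);
    if (reply.kind === "text") {
      trace.push("model:text");
      return { answer: reply.text, trace };
    }
    trace.push(`model:tool_call:${reply.call.tool}`);
    const tool = tools[reply.call.tool];
    if (!tool) throw new Error(`Unknown tool: ${reply.call.tool}`);
    messages.push({ role: "tool", content: tool(reply.call.args) });
    trace.push(`local:ran:${reply.call.tool}`);
  }
  throw new Error("Too many hops without a final answer");
}

const result = runAgent("What is the weather in Sioux Falls?");
console.log(result.trace.join(" -> "));
```

The `trace` array plays the role of the tracing view: one entry per hop, in the order the hops happened.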

15:24

it right but let's go ahead and try to extend this a little bit and let's give it another tool maybe before we do I will show you really quickly what a workflow looks like so we have this workflow here that was pre-created basically it'll fetch the weather and then it'll plan activities based on the weather so we know it's pretty cold in Sioux Falls but

15:50

let's see what activities it plans and you can see it's actually streaming to my terminal here so as it's kind of thinking through it's saying if I want to do outdoor activity dress warmly due to extremely cold temperatures yeah you don't say uh but it'll basically for like four different days tell me the weather you

16:14

know upcoming weather possible activities you know snowman building at McKennan Park sounds a little cold for that but maybe the library is a little bit more appropriate I do have a three-year-old who likes the library so as you can see it basically comes up with uh a response it then eventually will output it all once it's done and if

16:37

we look at what's actually happening in that workflow in this case we just created a new agent that we use just for this workflow but we Define steps so this is that fetch weather step and it's running this code here and so this is very similar to the tool call code we just duplicate it because you know it's a

16:57

it's a simple and very verbose example so you can learn from it and then we have this plan activities step and that plan activities step does some other things and kind of prompts the agent so it calls agent.stream and it passes in this prompt says based on the following weather forecast for in this case it would say Sioux Falls suggest some activities but this is where all the

17:22

magic really happens so we defined those steps above and we create a workflow we say the first step is fetch weather then we plan activities so if you think about that in your head if you say that out loud you're going to come up in your head with something that looks like this first we fetch weather

17:41

then we plan activities in some cases you might want to execute multiple steps at the same time or have dependencies add conditionals you can build a pretty complex workflow and then of course you can kind of see traces for how those steps happen you know as we kind of showed in the agent section
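The "first fetch weather, then plan activities" shape can be sketched as a hand-rolled pipeline. This is not Mastra's workflow API: `fetchWeather`, `planActivities`, and `runWorkflow` are hypothetical names, the forecast is canned data, and a simple temperature rule stands in for the LLM prompt that the real plan step sends.

```typescript
// Illustrative sketch only: a hand-rolled two-step pipeline, not Mastra's workflow API.
type Forecast = { city: string; day: string; tempC: number };

// Step 1 (deterministic): would normally call a weather API; canned data here.
function fetchWeather(city: string): Forecast[] {
  return [
    { city, day: "Thursday", tempC: -19 },
    { city, day: "Friday", tempC: 3 },
  ];
}

// Step 2: in the workshop this step prompts an LLM; a simple rule stands in for it.
function planActivities(forecast: Forecast[]): string[] {
  return forecast.map((f) =>
    f.tempC < 0
      ? `${f.day}: indoor activity (library), ${f.tempC}C in ${f.city}`
      : `${f.day}: outdoor activity (park), ${f.tempC}C in ${f.city}`
  );
}

// The "workflow": run the steps in a fixed order, recording each one for tracing.
function runWorkflow(city: string): { plan: string[]; stepsRun: string[] } {
  const stepsRun: string[] = [];
  const forecast = fetchWeather(city);
  stepsRun.push("fetchWeather");
  const plan = planActivities(forecast);
  stepsRun.push("planActivities");
  return { plan, stepsRun };
}

const run = runWorkflow("Sioux Falls");
console.log(run.stepsRun, run.plan);
```

The order of steps is fixed in code, which is what makes a workflow more deterministic than an agent choosing its own tools; only the plan step would involve a model.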

17:58

so that's workflows we won't spend too much time on that but you can uh think about it as you can build a little bit more deterministic flow and then there's some parts of this that actually call out to LLMs in this case getting the weather uh it does like do some processing it goes to the API I guess in fetching weather and then the planning activities

18:18

calls out to the LLM to take that weather and then synthesize like some kind of plan for you so you can see it's kind of a multi-step and uses you know an LLM call at certain steps of it whereas an agent gets to kind of choose based on the tools it has what uh what is used technically like both of these

18:36

are agentic right it just depends on what you need uh which you know which you use and often it's a combination of both you may have agents that can call workflows you may have workflows that call agents so it's kind of a you have a lot of puzzle pieces you can put together as you need and there's we do

18:52

have a tool section that allows us to execute just any individual tool and I'll show that here in a second let's go ahead and just add a new agent so we have this weather agent and we have this weather tool but I actually want to create a new tool I have if I can pull in here I have this spreadsheet with fake transaction data think of this

19:22

as like if I were to pull my credit card statement or my bank account statement and I want to be able to then ask questions about this so the first thing I need to do is I'm just going to make this public I think I already did that so I'm going to just make this public and I'll just make it a CSV mainly so I don't have to deal with the authentication parts but just like any other if you've ever interacted with

19:48

an API we could go through the auth steps and make sure that you know my code could actually correctly authenticate to this so I wouldn't have to make it public but making it public just speeds things up let's go ahead and yeah and if you haven't done this before this might seem a little weird but I'm just going to ask

20:06

Cascade here to write this code for me and then we'll talk about what it's writing and you know how it does I need a new tool called get transactions that will go to a Google Sheet URL and return CSV data please use the createTool example in tools and then I'm going to say also add the tool to a new agent

20:58

similar to the example in agents index okay so what I'm telling it is I'm basically telling it I want it to write a tool and I do need to provide the URL use the following Google Sheets URL to fetch the data okay we'll see how it does it might not get it right but I bet it gets it pretty close so it's going to look at my current code it's going to

21:40

then you know tell me what it's doing it's looking at the tools file it's looking at the agents file it's then going to create a new get transactions tool and then add it to the agent or add it to a new agent so first it's editing the tools index file we should see some code here coming in pretty quickly and you can see it's writing some

22:12

code and it wrote a tool called transactions tool get transactions get transaction data from Google Sheets and this is giving me the output schema as an array which is actually you know that's probably okay if we look at what the actual get transactions does it fetches from this URL this is probably going to work but I just want it a little simpler so

22:38

I'm going to say can we make the tool only return the CSV text don't do any parsing just so we make sure it works so this will update this right now it's parsing the CSV it's a small enough data set that I really don't care about the parsing for this example but you know I would definitely want to test this and make sure it gets it in the right

23:03

format you know having structured data is typically better so normally I wouldn't do this but we'll just keep the example a little simpler so it's a little easier to understand and then it's going to update the agent instructions so you can see there's a transactions agent it's using this transactions tool which it's

23:26

getting from here and we're just going to accept everything and I don't know if this is going to work we do need to come in here and add our agent so we're going to add our transactions agent to our import and to our Mastra class and now if all goes well we should be able to come back to here we see we have a transactions

23:50

agent let's chat with it let's just say how much did I spend at the Apple Store I spent $1,299 must have got something nice let's see if that's true Apple Store $1,299 so if we were to look at uh oops lost my spot here if we were to look at the trace we could see that it's going to make this tool call

24:22

to this transactions tool and it's returning all of the CSV in a string format back to the LLM which is then kind of processing that looking up and seeing that you know here there's a transaction for the Apple Store in this amount could also come back and ask it how much did I spend in my PayPal account and this should be able to take if we look here there's multiple PayPal account payments so hopefully it'll

24:58

do some math and get this right and there you go you can see it takes both those transactions it adds them up and it returns the data so of course I could you know update this transaction list add more transactions uh Google Sheets does cache this data so if you do test this just know that it might not show up right away you might have to unpublish it and

25:20

publish it as well to get it to actually load the data in because it is cached for a while so it might not be immediate but you get the idea an LLM is actually uh running code on my system which pulls data from this spreadsheet and I can ask questions about that data so you can imagine how this could be useful for all different

25:38

types of data whether that's local files on my system whether that's files behind some API somewhere but you could wire up agents to do all kinds of things the last thing I want to do though is I don't want two different agents I want to actually combine these things so let's go ahead and say combine

25:59

the weather agent and transactions agent into a single agent with two tools because typically I would want just a personal assistant that could do multiple things so rather than having to know if I want to ask for the weather or know if I want to ask for transactions I'd like to just have an

26:27

assistant that I could ask either thing and it would just make the decision I don't really like it called combined agent so we'll update that let's call it assistant so this should update it it'll call it assistant and then we'll give it one quick test so we got about maybe 15 minutes so we'll see what we can get done but in the next 15

27:10

minutes hopefully we can add an eval we'll talk a little bit about evals just real briefly and then we'll try to get this thing deployed and actually out in uh in production here let's see did it actually work so we have an assistant you have two main functions you can do weather you can do transactions it tells us about the tools

27:35

it adds the two tools there uh the one thing I would just encourage you if you are using uh something like this make sure you're actually reviewing we're not really reviewing this code very quickly right we're moving pretty fast I would highly encourage you to make sure you know what the code's doing because it's not always right and it can uh lead you astray and you know it

27:55

continues to get better and it continues to impress me every day but just don't don't completely trust it uh so the agent we pass in is the assistant and then of course I think our assistant does have our two tools so great let's actually run this thing and we're going to need to go back we have our assistant let's just see if

28:19

it can tell me the weather what is the weather in Los Angeles all right and how much did I spend at Amazon okay so you can see it can now decide and if I were to look at these traces it's you know in some cases it would use the weather tool some cases it would use the uh transactions tool but it was returning the right results so that's good let's go ahead and add one eval just to show how it works so
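The transaction questions asked so far (Apple Store, PayPal, Amazon) come down to filtering rows of CSV text and adding up amounts, which the LLM is doing on its own from the raw string. A deterministic version is handy for checking its math. The sketch below is an assumption-heavy illustration: the column names (`date`, `merchant`, `amount`) and every row except the $1,299 Apple Store amount are made up, since the actual sheet layout is not shown, and `parseTransactions` and `totalFor` are hypothetical helpers.

```typescript
// Hypothetical CSV in the shape a published sheet might return; the column
// names and every row except the $1,299 Apple Store amount are made up.
const csvText = [
  "date,merchant,amount",
  "2025-01-03,Apple Store,1299.00",
  "2025-01-05,PayPal,25.50",
  "2025-01-09,PayPal,14.50",
  "2025-01-12,Amazon,42.10",
].join("\n");

type Transaction = { date: string; merchant: string; amount: number };

// Naive parser: assumes no quoted fields or embedded commas.
function parseTransactions(csv: string): Transaction[] {
  const [, ...rows] = csv.trim().split("\n");
  return rows.map((row) => {
    const [date, merchant, amount] = row.split(",");
    return { date, merchant, amount: Number(amount) };
  });
}

// Sum in integer cents to avoid floating-point drift, then convert back to dollars.
function totalFor(merchant: string, txns: Transaction[]): number {
  const cents = txns
    .filter((t) => t.merchant.toLowerCase() === merchant.toLowerCase())
    .reduce((sum, t) => sum + Math.round(t.amount * 100), 0);
  return cents / 100;
}

const txns = parseTransactions(csvText);
console.log(totalFor("PayPal", txns));
```

A total computed this way could serve as the expected value when checking an agent's answer to a question like "how much did I spend in my PayPal account".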

29:01

I'm going to go to the Mastra docs because all this stuff uh is kind of in the Mastra docs so I'm going to go down to evals evals are they sound like a brand new concept but if you've ever written software tests it shouldn't be that new that's really what uh what an eval is is it's kind of like a non-deterministic software test you

29:21

basically want to measure if something is working correctly and you might not get just a yes or no like you do in tests sometimes it's almost like a score and so you're measuring that score over time we have a whole bunch of different eval metrics some of them are more useful than others depending on what you're trying to do my

29:39

recommendation is typically you should just try to write a custom metric in our case I would want you know to make sure I would write a metric to make sure it always returned weather in Fahrenheit right if that was part of my my business goals of building this assistant or make sure that uh it would kind of validate or check the math on my transactions if

30:00

I were to ask it something more complicated because sometimes it can get hung up on math so those are the type of metrics I would probably want to build but you can also use some of these off-the-shelf metrics to give you some data right away so you can get that kind of instant feedback the first thing I need to do is run npm install

30:20

@mastra/evals and that's just going to install our evals package and we can go to our agent and we should be able to add this tone consistency metric which we'll talk about what that is in a second but we'll look at some of these different supported evals so we have a whole bunch of evals on accuracy and reliability on how well it can

30:45

understand context and then on output quality so if we look at you know tone consistency it's saying evaluates the emotional tone and sentiment consistency so this is like an NLP or natural language processing metric you can also have whoops uh some metrics that actually use another LLM to actually do the judging so it's often called LLM as a judge something like answer relevancy so
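The custom metric idea mentioned a moment ago (make sure the assistant always reports weather in Fahrenheit) can be sketched as a plain scoring function that returns a value between 0 and 1. This is a standalone illustration, not the `@mastra/evals` metric interface: `fahrenheitScore` and `celsiusToFahrenheit` are hypothetical names, and the regular expression is a rough heuristic for spotting temperatures in a reply.

```typescript
// Illustrative sketch only: a standalone scoring function, not the @mastra/evals interface.
function celsiusToFahrenheit(c: number): number {
  return (c * 9) / 5 + 32;
}

// Scores 1 when every temperature in the reply is in Fahrenheit, 0 when none are,
// and a fraction in between. Replies with no temperatures score 1 (nothing to violate).
function fahrenheitScore(reply: string): number {
  const temps =
    reply.match(/-?\d+(?:\.\d+)?\s*°?\s*(?:F|C|Fahrenheit|Celsius)\b/gi) ?? [];
  if (temps.length === 0) return 1;
  const inFahrenheit = temps.filter((t) => /F(ahrenheit)?$/i.test(t)).length;
  return inFahrenheit / temps.length;
}

console.log(celsiusToFahrenheit(-19).toFixed(1));
console.log(fahrenheitScore("It is -19 Celsius, or -2.2 Fahrenheit"));
```

Like the off-the-shelf metrics, the result is a score rather than a pass or fail, so it can be tracked across prompt changes.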

31:18

in this case we can use in this example we would be using GPT-4o mini to run this metric and then decide how good of a result it was so you're actually using an LLM to judge the output of another LLM which is kind of interesting but it also can be you know much more scalable than asking a human to do that evaluation ultimately you know you need probably a mix of both but in the cases

31:44

where you know humans are very expensive to spend time doing these things and judging results having an LLM act as the judge is kind of useful so we're just going to use this tone consistency metric so you can see how it works but you could of course just copy those code samples and run any of those metrics pretty easily npm run

32:04

dev so we'll go back in here we'll run one more example what is the weather and let's do Sacramento Sacramento how do I spell Sacramento is that right Sam I don't know you live there sometime not anymore not right now but all right we got it way warmer than Sioux Falls so let's take a look at the evals and you can see we now have this tone consistency metric and it scored a 0.95

32:37

out of one so that's not too bad but you know I would encourage you to read through what the eval metrics do so you uh can actually see if they're useful for what you're trying to do and a lot of this stuff accumulates over time so this initial like one eval isn't that important but you might want

32:54

to track it over time and just check in on some of these off the shelf evals to see if you change your system prompt and something degrades or goes down I think that's where it could become even more useful the one other thing is I will show in tools we may want to just test an individual tool before we give it

33:15

to the agent in our case we were doing pretty simple tools so we didn't need to but if I wanted to test the get transactions tool I don't have to fill out any input because it's just returning the same thing every time but you can imagine if I did have filters for date ranges or something it would show up here and then

33:32

I would get the output so if we go to weather you'll see that it asks me to input the location and then it will just run that individual tool and return that data okay so the last thing we'll do and see if we can get this done in about five minutes is to deploy this thing so I'm going to do git init and I'm going

33:56

to deploy this thing to Git so I'm going to add this commit workshop and we'll go to GitHub and we'll just create a new repo really quickly now I'll make it public so we can share it out if you want to try this on your own and this should push up the code and I'll drop this in the chat in case you do want to you know take a look at

34:51

it and let's go ahead and in this part uh you know we haven't actually announced this really publicly yet but we're going to go to Mastra Cloud so it does exist it will be announced you know pretty soon and I'm going to grab this Feb 13 workshop repo that I just created we need the OpenAI API key so I

35:21

need to add my environment variables this is basically just going to spin up and run all of my uh agents and workflows and basically provide me an API that I can use so I could wire it up to any other you know front end application I wanted or you know mobile app or whatever so I do need to drop in this API key so give me just a second to do

35:47

that and then I will deploy the project and what this is doing is it's going to uh take all my code it's going to build it and then it's going to essentially run those APIs so I have you know a scalable way to run my agent especially if you've built applications using you know some of the other front-end tools out there you might have run into like function

36:15

timeouts if you've been using serverless functions so we kind of handle all of that so you don't have to necessarily worry about it well while we let this thing kind of go I'll just pull over here so I can watch it let's go ahead and kind of go through some of the final slides and then we'll come back and we'll show this thing running here in

36:38

production all right so we did all this and then we'll of course do this last step here in a minute but let's talk about things we didn't do we didn't have time to get to we didn't talk about memory memory is the ability for your agent to remember you know conversations so right now every time we chatted with our agent it was just like a new message that it

37:03

didn't have any history of what we asked it previously in Mastra turning on memory is very easy you just you know flip a few switches basically and it just works uh we didn't really talk about human in the loop workflows which would be the fact of maybe I would want you know a validation step in that workflow when I planned activities to ask me do I want you know indoor or outdoor activities or you know do I like

37:29

the direction that it's going and so maybe there'd be some kind of feedback mechanism tools are very important you saw that we had to manually build the tools ourselves there are some kind of tool providers that you can use that will allow you to provide your agents more tools way more rapidly uh MCP is

37:48

called the Model Context Protocol that's an open source kind of movement around building agent tools there are other tool companies one example Composio that kind of try to help solve the tool problem so there's a whole bunch of different ways you can try to solve tools besides just writing tools yourself we didn't show anything about rag which is retrieval augmented

38:10

generation this is important if you have a lot of data that you want to search through the llm can only handle so much in its context at one time so the context window is uh basically think about like the working memory you know a human has that's kind of similar in the context of an llm it only has so much that it can

38:30

keep in its head at one time so rag allows you to kind of do a search and then insert data into that so you can uh better look up information and then use that in your results and then I mentioned briefly custom eval metrics which would be you know writing your own eval metric that kind of aligns with what you're trying to get out of your agent so those would be things the nice thing is in the Mastra docs those are pretty much all

38:57

top level items where you can dig in and start to learn more I have some examples and I will share these slides in a follow-up email after this where these are just some examples we published if you go to the blog post there's links to the actual source and the demos so you can see some slightly more complicated Mastra examples and what we showed is just

39:21

deploying the back end and I'm going to actually check and see if it's done and it looks like it is so great we'll pull that in in just a second but if we wanted to deploy a front end to connect to our back end assistant UI is basically a way to really spin up kind of chat assistants really quickly so we have an example using that you could grab that code just swap out the Mastra API

39:52

URL and point it at your agent and you could have a fully working front end deployed to Netlify to Vercel to you know any other front-end hosting platform pretty quickly with just you know one line of code in an .env file basically so let's go ahead and take a look at our deployed application so we have our

40:23

February 13 Workshop let's take a look here and you can see we have logs that it was built and I'm going to actually go back and go to this one here so we have this URL that it gives you and this URL should actually return an api.json file let's see if it's running here it's still

41:08

loading okay and then this gives me all the available endpoints and the nice thing is I can use something like curl then or Postman and this is just an example that I had used earlier but how much did I spend at Amazon so I can run this and it's actually going to go out to my agent that's sitting out on Mastra Cloud it's going to pass in a message

41:30

how much did I spend and then it's going to return the text of saying you spent a total of this so it's basically doing all of this thing all the things we showed in Dev but it's available you know in a production ready scalable version that I could wire up to any application okay we went through a lot
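A minimal sketch of the request described above, in TypeScript rather than curl. The base URL, the agent id `budgetAgent`, the `/api/agents/<id>/generate` path, and the `text` field in the response are all assumptions made for illustration; the endpoint list your own deployment returns is the source of truth.

```typescript
// Shape of the HTTP call we want to make to a deployed agent.
interface AgentRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a POST request that sends one user message to an agent.
// ASSUMPTION: the path and body layout are illustrative, not confirmed
// by the session; adjust them to the endpoints your deployment lists.
function buildGenerateRequest(baseUrl: string, agentId: string, message: string): AgentRequest {
  const trimmedBase = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: trimmedBase + "/api/agents/" + encodeURIComponent(agentId) + "/generate",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages: [{ role: "user", content: message }] }),
    },
  };
}

// Pull the reply text out of a parsed JSON response.
// ASSUMPTION: the reply lives in a top-level `text` field.
function extractText(payload: unknown): string {
  if (typeof payload === "object" && payload !== null) {
    const text = (payload as { text?: unknown }).text;
    if (typeof text === "string") return text;
  }
  return "";
}
```

In Node 18+ or a browser, the `url` and `init` values could be passed straight to `fetch`, and the parsed JSON body handed to `extractText`.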

41:49

it's 9:45 that's pretty good how uh how's everyone feeling do we have any questions maybe that's it now a good time for some questions I will stop sharing my screen okay so Sam did you have you been you've been answering some it looks like but any other questions that we want to talk through now um it seems like the most interesting thing that people and a

42:19

couple people are interested about sort of how do we manage agent memory so um Shane maybe you can you can take a whack at that yes so there is a you know we kind of have some systems that are built in let me go to the memory docs and kind of walk through some parts of it but when you're thinking about memory there's you you have like what's

42:42

considered like a short-term memory which is maybe a thread of the last X number of messages and then you have the idea of like longer term memory which you want to be able to go and retrieve when someone asks for it and so we have some memory management there was a paper called MemGPT we based a

43:02

lot of what our memory does on that uh with some we'd like to say some improvements on that to make it faster but it's based pretty heavily on the MemGPT paper and let me share my screen so we can just kind of take a look at the docs here so in memory you have this idea of threads and resources so in your code if you do call agent.stream

43:27

which is a way to stream results back from an agent you can pass in a thread ID you can pass in a resource ID which is typically a user ID from your application and now memory will just be managed for you now there are some things you may want to do so you may want to kind of pass in some memory options like you can pass in how many of the last messages you should use

43:52

you can do some semantic search so you can add semantic recall so you can say I want what's top K which is like the top matching results in this case the best 10 results and then the two messages before and after each of those results so you can pass in the context so then in this case we could

44:11

pass in what types of memory or how much memory we want to uh pass into the conversation and then you can also use uh some different vector databases as I mentioned before you're not kind of tied to one vector DB you could use pgvector in this case you could decide which embedding model you want to use when

44:31

you're embedding memories so there's a whole bunch of different configuration options there and the idea is out of the box you can just turn it on and it just kind of starts working but very likely as you get more sophisticated you're going to need to be able to kind of turn the dials and so all the dials are there for you to be able to turn that's a good question though
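The selection logic described above (the last N messages, plus the top K semantic matches and a few messages around each match) can be sketched in plain TypeScript. This is a standalone illustration under stated assumptions, not Mastra's implementation: the option names `lastMessages`, `topK`, and `messageRange` are chosen to mirror the description, the embeddings are toy vectors, and cosine similarity stands in for whatever the vector database does.

```typescript
interface StoredMessage {
  id: number;
  text: string;
  embedding: number[]; // toy vector; a real system would use an embedding model
}

// Hypothetical option names mirroring the dials described in the session.
interface RecallOptions {
  lastMessages: number; // how many recent messages to always include
  topK: number;         // how many semantic matches to pull in
  messageRange: number; // messages to include before and after each match
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Pick the messages to place in the context window for one query.
function selectContext(thread: StoredMessage[], query: number[], options: RecallOptions): StoredMessage[] {
  const keep: boolean[] = thread.map(() => false);

  // 1. Short-term memory: the most recent messages.
  const start = Math.max(0, thread.length - options.lastMessages);
  for (let i = start; i < thread.length; i++) keep[i] = true;

  // 2. Semantic recall: best matches plus their neighbours.
  const ranked = thread
    .map((message, index) => ({ index, score: cosine(message.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, options.topK);
  for (const hit of ranked) {
    const from = Math.max(0, hit.index - options.messageRange);
    const to = Math.min(thread.length - 1, hit.index + options.messageRange);
    for (let i = from; i <= to; i++) keep[i] = true;
  }

  // Return in original order so the conversation still reads chronologically.
  return thread.filter((_, index) => keep[index]);
}
```

Keeping the output in chronological order is a design choice in this sketch; it avoids handing the model a recalled message ahead of the one it was replying to.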

44:58

any other questions that we're running into or getting asked Sam Abhi's here like tackling some questions as well so there's one that uh you might want to sort of call out um Rodrigo is asking how you structure a Next.js app that uses Mastra um he's asking if he should sort of deploy them separately or

45:24

together or or how you know different repo same repo how should that work yeah so that's a really good question and it really does depend on your goals if you are looking and you know that you're only ever going to have a next app and you're never it's never going to extend beyond that and your your use case is say let's say relatively straightforward

45:49

then you can actually add Mastra directly to your Next.js app there are some challenges with uh you know maybe some TypeScript things that you need to make sure you need to check your tsconfig file but you can basically run the mastra init command on your Next.js app we do have a guide that's coming out soon so uh we'll have a little more straightforward instructions for that

46:09

but we do have a lot of people that do that and that's nice because you can keep all your code in one place you can deploy it you know to Vercel Netlify wherever you're deploying your Next.js app and it should all just work and then you don't have to worry about two code bases as your agents and workflows get more

46:25

complex I think there is a lot of value gained from having a separate API that's run you know as its own application and then we do have a client basically just a JavaScript or TypeScript client that you can then just use in your Next.js app to reach out to your uh Mastra agent API so it does depend a little bit uh my preference is separate because I

46:48

like separation of concerns but a lot of people would rather just have one application and if that's all you've ever or all you're planning on doing then uh running it inside your next app can sometimes make sense and then now hu is asking if we have any example of multi-agent yeah so in our examples we do have a hierarchical

47:12

multi-agent system so in this example is I guess this one's not a workflow so this one is more um multiple agents that can call each other as tools we do have in workflows we do have calling an agent and so you could think of you you can build a system where you can call multiple agents in a single workflow so we have a couple different examples here

47:36

here's a multi-agent workflow I guess as well so in this case we Define a workflow that in this case calls a copywriter and then an editor and they're two separate agents but we've kind of defined a a workflow that calls that there's a whole bunch of different patterns you can follow when Building multi-agent Systems the most most common one the most common ones that we've seen is either a more General workflow that

48:03

calls into certain agents in certain steps and then maybe you could always you know think about ways that you might want to uh have those agents talk to each other the other one is using agents as tools that call into other agents so you basically have one agent that's maybe like a supervisor that has tools that can call into these other agents to

48:22

get certain types of information and so you know if you think about like a you know you might have an editor that is responsible for the entire blog publishing process you might have someone that's responsible for generating the ideas in the research you might have an agent that's responsible for actually writing the copy and then you have an agent that's responsible for reviewing and like making changes to

48:42

that copy but you have one that's like the supervisor or manager that calls out to the different tools and maybe you define the order of that or you just let them decide or let the llm decide so couple examples in here of doing that um hi this is Abhi at the end of the day this is all JavaScript right and I think a lot of

49:03

times the AI Frameworks and libraries seem very intimidating um maybe you're not very comfortable with Python or you're new in the space but our goal in design with Mastra is it is just JavaScript right it should be approachable anything that you can think of in JavaScript you probably can do there's not like any limitation there so I'm excited to see what everyone builds
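In that spirit, the supervisor pattern described earlier (one agent that calls other agents as tools, here a copywriter followed by an editor) can be sketched with plain functions. Everything here is hypothetical: the `AgentTool` shape, the stub `run` functions that stand in for LLM calls, and the fixed `plan` array that stands in for the supervisor's own decision about ordering.

```typescript
// A sub-agent exposed to the supervisor as a tool.
interface AgentTool {
  name: string;
  description: string;
  run: (input: string) => string; // stub; a real agent would call an LLM here
}

// Stub copywriter: turns a topic into a draft.
const copywriter: AgentTool = {
  name: "copywriter",
  description: "Writes a first draft for a topic",
  run: (topic) => "Draft: " + topic,
};

// Stub editor: revises an existing draft.
const editor: AgentTool = {
  name: "editor",
  description: "Tightens an existing draft",
  run: (draft) => draft.trim() + " [edited]",
};

// Supervisor: runs the named tools in order, feeding each output to the next.
// In a real system the plan could come from the LLM instead of being fixed.
function runSupervisor(topic: string, tools: AgentTool[], plan: string[]): { output: string; trace: string[] } {
  let current = topic;
  const trace: string[] = [];
  for (const step of plan) {
    const match = tools.filter((tool) => tool.name === step)[0];
    if (!match) throw new Error("Unknown agent tool: " + step);
    current = match.run(current);
    trace.push(step + " -> " + current);
  }
  return { output: current, trace };
}
```

Calling `runSupervisor("agents", [copywriter, editor], ["copywriter", "editor"])` passes the topic through both stubs; keeping a `trace` makes it easier to see which sub-agent produced what when the output is not what you expected.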

49:28

yeah I dropped a link in the chat um that has a nice diagram um for multi-agent um this is something that we share a lot internally uh and so uh what this kind of illustrates right is that there's often like we've been talking about in the chat there are multiple primitives and you can

49:52

rearrange them in interesting ways there's often multiple ways to solve the same problem and you are going to be the one that needs to sort of figure out like maybe test two or three of them out and see which one of them works better um keep in mind right when typically when we're writing software we get used

50:10

to sort of the software is predictable and it always does the same thing and agents are not always that way and so you know it may be that like one arrangement here works better for you um something that we've seen is that often um you know a person working on an AI application or an AI feature can get to a prototype you know

50:38

maybe in a week or a couple weeks uh but then you know then they spend the next month or two months kind of productionizing it which means getting it reliable enough that they feel comfortable shipping it to customers um and uh yeah I think that's like kind of the pattern right first just get something working get it to work and

50:59

then over time you're going to get it right as you improve the quality um and that's through evals uh that's through the feedback loop that that you have um and that's powerful because you know in the beginning you can just focus on hey I'm going to you know get this to work but then over time you're sort of using the scientific method uh to uh test your

51:25

application and just try different configurations um and you're using your understanding of the possibility in the idea space you know to kind of traverse and iterate um towards you know the most predictable the most reliable solution um right and that might be tuning your prompt that might be multi-agent configuration that might

51:51

be you know going from uh how you structure it and how you're doing it from tool calls to workflows um but the fun thing is um and this is one thing that we've done with Mastra right we have intentionally baked in all the primitives that you'll need so you can just kind of switch between the different building blocks within the

52:16

framework yeah great point Sam I did see one other question asked about TypeScript versus Python in writing and building agents so this is a you know I would say it depends on your background and what you're more comfortable with that's kind of like saying should you build you know in PHP or Ruby on Rails you know back in the day or you know like there's different

52:40

languages right if you've built websites before on different web backends and different frameworks that you might use I think Python was the most popular getting started because a lot of this stuff evolved from the machine learning communities like machine learning engineers which were very heavily built on you know research and kind of data analysis which is why Python was the

53:02

initial use case for that's why there's way more python AI Frameworks and agent Frameworks out there right now we're seeing more and more JavaScript usage and JavaScript Frameworks emerge and I think that's because you know Python's really uh useful for many things and that's kind of the first language people learn but often when you get to the point where you're building things a lot

53:25

of the application world is built on you know front-end frameworks you know JavaScript frameworks things like that so you see a lot of TypeScript and a lot of JavaScript and so I think you're going to see more and more applications and agents and uh AI tools being built specifically for the JavaScript and TypeScript communities so I think it

53:44

will shift more to being a little more 50-50 over time even if Python is a little bit more popular right now we're currently in Y Combinator and there's you know 100 plus AI agent companies I would say very close to that many in our batch and probably two-thirds are maybe using Python but more and more are either considering or have started using TypeScript so I think

54:10

it's shifting uh so uh and then um now that most of AI engineering is just built on top of OpenAI and Claude Anthropic you know TypeScript just becomes more and more powerful for a lot of folks to have everything in one language yeah exactly yeah your front end application can be built in the same language as your AI application uh so Brad has a

54:50

question how might you mutate workflows and agents at runtime I'm baking fewer and fewer prompts into my production code so making them configurable so you know stay tuned on we might have some solutions uh for that in the very near future Brad but in general uh you can you because it's just a a class you can kind of pass around agents and update

55:13

agents as needed so you could load something from the database you know change the instructions of the agent and then execute something to that you know execute a stream call or a generate call right to to that agent so it is all possible you would just kind of write your application code where you fetch from the database and then either dynamically build the agent or dynamically update a part of the agent
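A minimal sketch of the "load from the database, then build the agent" idea described above. The `promptStore` object stands in for a database table, and `AgentConfig`, `loadInstructions`, `buildAgentConfig`, and the model name are hypothetical names for illustration; the session does not show the actual constructor options, so the result here is just a plain config object you would hand to whatever agent class you use.

```typescript
// Plain description of an agent; hypothetical shape for illustration.
interface AgentConfig {
  name: string;
  instructions: string;
  model: string;
}

// Stand-in for a database table keyed by agent name.
const promptStore: Record<string, string> = {
  "budget-agent": "You help users understand their spending. Be concise.",
};

// Look up stored instructions, falling back to a default baked into code.
function loadInstructions(store: Record<string, string>, agentName: string, fallback: string): string {
  const hasEntry = Object.prototype.hasOwnProperty.call(store, agentName);
  const stored = hasEntry ? store[agentName] : "";
  return stored.trim() !== "" ? stored : fallback;
}

// Build the config at request time so prompt edits apply without a redeploy.
function buildAgentConfig(
  store: Record<string, string>,
  agentName: string,
  fallback: string,
  model: string
): AgentConfig {
  return {
    name: agentName,
    instructions: loadInstructions(store, agentName, fallback),
    model,
  };
}
```

Building the config per request keeps prompts out of production code, at the cost of a lookup on each call; caching the stored instructions would be a reasonable next step.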

55:37

but yeah it should all be possible but yes we do have something coming where maybe you don't have to do all that yourself and you could just uh use something within Mastra for that because yes it's a very common use case actually of having how do you iterate on prompts quickly how do you measure

55:56

the changes you're making actually are making things better so you have some level of confidence right you're not what people often refer to is you're not just shipping on Vibes but yes okay well I think we're at time so feel free if you have there's some I'll send out the slides feel free to join the Discord if you do have more questions and you want to chat through

56:20

uh some things or of course just send me an email find me on social appreciate you all joining and yeah we we'll see you all around thanks for coming it's been an absolute pleasure see yall
