Voice agents with Roark, AI video editing with Mosaic, YC X25, AI news, we build a livestream agent
Today we talk with Daniel and James from Roark, chat with Adish from Mosaic, discuss YC X25 companies, review some AI news, and try to build a livestream agent to streamline my post livestream workflows.
Episode Transcript
Hey everyone we're going to get started in just a few seconds here all right everyone welcome to AI Agents Hour i'm Shane i'm your host today this is brought to you by Mastra some of you might have seen Abhi doing uh his thing earlier that's cool but we're going to do our thing so
this is live if you are in the chat if you're on YouTube you're on X you're on LinkedIn just please drop a comment and I will try to read a lot of them on the stream we'll pull them up so let me know what you're thinking today we have a pretty uh jam-packed episode we're going to be talking about some AI news we're
going to have some guests from Roark some of my friends we're going to talk a little bit about some YC companies that are kind of in this uh AI space that might be interesting to you things that are going to be uh coming out soon they have their demo day coming up uh we're going to be talking to Adish from
Mosaic and we're actually going to be building an agentic workflow so I'm going to invite someone from the Mastra team on and we're going to try to build something that's going to help me do some things with this stream so ideally you know after the stream is
over I want to have some agents do some work for me process the stream and we want to kind of build a workflow around that so before we get started as always you can follow me on Twitter @smthomas3 i'll put that on the screen here also make sure you're following uh Mastra Mastra AI on X on you know all the
places on YouTube where you can see this live and I'm going to just have to get our guests in here so I'm going to send got to send out an invite link all right and then we're going to cover some AI news so let's go ahead and get into that so the first AI news there's an article that came out on Axios that says
uh behind the curtain a white collar bloodbath okay sounds uh sounds very ominous but let's read a little bit about what it says so it says uh Dario Amodei I don't know if I'm pronouncing that right but the CEO of Anthropic has like a scary warning for the US government the claim is that AI could wipe out half of all entry-level white collar jobs and spike
unemployment to 10 to 20% in the next 1 to 5 years and I he he also mentioned that AI companies and government need to stop sugar coating what's coming the possible mass elimination of jobs across technology finance law consulting and other white collar professions especially entry-level gigs and so why it matters um says that it could reorder society
overnight he's speaking out in hopes of jarring government and fellow AI companies into preparing and protecting the nation and few are paying attention uh so you know I think this is kind of on the back of the recent launch of Claude 4 uh you know it's yeah it's a big claim right but it's not new this has been predicted since you
know LLMs since I would say it became a very you know hot topic when ChatGPT had its moment so I don't think the claims here are new at least not for people that have been paying attention to the AI space but it's pretty bold right it's like very you know sets a timeline sets like the percentage of white collar jobs you know he thinks could go away so I do think that um it's
very interesting i think it's honestly probably to get some headlines i think Claude 4 launched i think he wants to get some headlines do I think it's a concern yes do I think it's on the timeline he's predicting no because I think that most uh model company CEOs have to be overambitious and so it's probably not quite on the timeline
you'd expect do I expect it on some timeline maybe i could see it you know being disruptive long term curious what you all think in the chat do you think we're one or two model iterations away from disrupting you know half of all entry-level white collar jobs i personally think that there's
going to be tons of productivity tools that allow people to be even more productive so as long as you're willing to adopt the new tools I think there will certainly be some job displacement but I think new opportunities will begin to open up yes there could be uh some short-term challenges with that i don't think it's going to be uh quite as big of a deal as what Dario
thinks but who knows so Opera had the you know the yes the browser that's still a thing but Opera has a new AI browser or is promising a new AI browser that's going to write code while you sleep it's called Opera Neon it's the idea is it's going to be the first AI agentic browser but it's not ready yet so I think it's uh they're teasing it trying to get some uh get
some people to pay attention to Opera which has anyone here used Opera beside like I know it was popular for a little while on mobile but I'm curious if anyone in this chat that's watching this has actually used Opera i did at one point for a while I was like I'm I'm going to be an Opera user this was you
know 10 years ago probably maybe maybe 12 years ago but I have uh since not really used them or even thought about Opera but maybe some people still do so curious if you were all uh using Opera or if you're excited about an AI or an agentic browser i think that there will be more uh there's very likely browser disruption waiting to be
had with with agents right there's browser agents that can use existing browsers why wouldn't there be a reimagined browser you know kind of thought of from like an AI first approach i think a lot of things you know will kind of iterate to add AI features and then there'll be new completely new things that pop up that
you know kind of have AI at the core of them and so maybe this agentic browser can be that thing i'm not necessarily confident that Opera is going to be the one to do it but you never know something to keep an eye on so the last kind of news article that we want to talk about today is that Retool has launched agents so the kind of the headlines
retool.com/agents i could probably just share it on the screen here and the the headline here is that you can save hours and headaches with agents that work your way all right uh so let's kind of read a little bit about it but build agents that can do real work in your business so I'm assuming it's the goal is like
Retool is kind of a platform for building admin applications and Retool is pretty cool i've used it I kind of use it for a few things uh I have found you know just personally you know we were building our admin at Mastra and we were just able to vibe code it so Retool was less powerful for that i mean it would have worked but it was pretty easy to just kind of build what we needed and
you know we didn't need anything too fancy we didn't need a ton of access controls so like Retool does have more uh I guess enterprise type features that we would have had to build ourselves that we haven't really got to yet but it is a pretty powerful tool i do believe I saw a post that
said you know Retool kind of already has uh thousands of customers that are already using agents maybe or hundreds of customers but they do have quite a few people it looks like that are already kind of using Retool agents so interesting there's you know more and more people trying to you know add
agents to the mix of things and that's all the AI news for today if you are just joining us thanks for joining AI Agents Hour we come at you live pretty much every day sometimes you get two episodes like today when Abhi's in Europe you know gallivanting around Europe he decides to live stream once in a while i never know when he's going to
do it so today we have two episodes so you get the bonus episode but I'm excited because I have some some friends joining the live stream today so I'm going to go ahead and bring on some special guests and yeah I'm going to bring on James and Daniel hey hey hey guys good to see you all it's been a little bit yeah it's been a while how are you thanks for having us on yeah I'm I'm doing pretty well i'm
doing well i'm uh excited to chat more about Roark are you guys still in Malta we're still in Malta right now yeah right well any idea when the next time you're going to be in San Francisco is i want to line up my travel schedule so we can hang out that's why I ask potentially
maybe in the next 8 to 12 weeks hopefully if all goes well yeah okay well I will I will keep my eye out for that so let me know so we can uh actually hang out in person not just on through a Zoom call or in this case a live stream but but maybe yeah it might be good if one of you two I guess if you both want
to do a quick introduction and then maybe just do a high level overview of what Roark is and then I just want to talk about voice agents and what you're seeing in kind of the voice agent space yeah uh so hey everyone um yeah I'm James i'm one of the co-founders of Roark um yeah Daniel yeah hi everyone uh
Daniel uh also one of the co-founders of Roark and yeah uh thanks for having us on Shane yeah um yeah so just a quick overview of what we do uh so Roark is an end-to-end uh voice agent testing platform we allow people to um run simulations of their customers so you can simulate um different accents different languages different customer journeys that your agents have taken and
make sure that you test your agent before you deploy it right um and then the other side the other product is also we have a live monitoring product where you can run live evaluations on every single call that comes in and we'll highlight issues with them but really the cool part is when you combine both of these features together and then that
lets you create test sets of live calls right so you get your calls in maybe some of them didn't go well and then with a click of a button you can actually add a real call as a test um that you can test for in your CI/CD pipeline or anytime that you're going to update your agent just to make sure that you don't have any regressions there
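[Editor's sketch] The loop James describes here (evaluate every live call, promote the ones that went badly into a test set, then re-run that set whenever the agent changes) could look roughly like the Python below. This is a minimal illustration under stated assumptions, not Roark's API: `Call`, `TestSet`, `addresses_request`, `monitor`, `replay`, and `run_regression` are all hypothetical names, the evaluator is a toy keyword check, and the "agent" is a plain function from customer text to a reply.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Turn = Tuple[str, str]            # (speaker, text); speaker is "customer" or "agent"
Agent = Callable[[str], str]      # hypothetical agent: customer utterance -> reply


@dataclass
class Call:
    call_id: str
    turns: List[Turn]


@dataclass
class TestSet:
    name: str
    cases: List[Call] = field(default_factory=list)

    def add_from_live_call(self, call: Call) -> None:
        # Skip calls that were already promoted to a test case.
        if all(c.call_id != call.call_id for c in self.cases):
            self.cases.append(call)


def addresses_request(call: Call, keyword: str = "root canal") -> bool:
    """Toy evaluator: if the customer mentions `keyword`, the agent must too."""
    asked = any(s == "customer" and keyword in t.lower() for s, t in call.turns)
    answered = any(s == "agent" and keyword in t.lower() for s, t in call.turns)
    return (not asked) or answered


def monitor(calls: List[Call], evaluator: Callable[[Call], bool],
            test_set: TestSet) -> List[str]:
    """Evaluate live calls; add each failing call to the regression test set."""
    failed = []
    for call in calls:
        if not evaluator(call):
            test_set.add_from_live_call(call)
            failed.append(call.call_id)
    return failed


def replay(call: Call, agent: Agent) -> Call:
    """Re-run the customer's side of a saved call against a (new) agent."""
    turns: List[Turn] = []
    for speaker, text in call.turns:
        if speaker == "customer":
            turns.append(("customer", text))
            turns.append(("agent", agent(text)))
    return Call(call.call_id, turns)


def run_regression(test_set: TestSet, agent: Agent,
                   evaluator: Callable[[Call], bool]) -> List[str]:
    """Return ids of saved cases the agent still fails; empty means no regressions."""
    return [c.call_id for c in test_set.cases if not evaluator(replay(c, agent))]
```

In a CI job, a non-empty list from `run_regression` would fail the build. A real setup would replace the keyword check with whatever evaluators the team defines and would replay audio rather than text.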
yeah very cool yeah I do I mean I have been talking to I was talking to someone else actually locally uh shout out to Noah if he's if he's watching this but he's also kind of in the voice agent space working on something and I guess one of the one of the questions I had when talking to him is okay voice agents are like I think voice agents are cool i obviously
think that you know there's there's a lot with you know text to speech and audio like I built a product called audio feed you know almost two years ago I started it so it's like I've I've been like really early into like thinking that audio and and not necessarily voice but like at least text to speech was going to be like really big and so but I
am curious who you know who are building voice agents today like where I know you talk with a lot of customers that have either built voice agents or are like in the the process what types of like companies are building voice agents and how do you see them commonly being used yeah um I think that's a great question
so you know the way I think about like voice AI today is it's very similar to the LLM space maybe a couple years ago so obviously startups are the fast movers they're the ones who want to try this new technology so you see startups um using voice AI as a wedge into replacing legacy products like uh dental
clinic CRM right or car dealership CRM where they're using like voice AI as this way of like creating this voice agent going ahead and trying to sell um you know like a new type of CRM here with an agent that can take all of the calls um and then now as we're seeing voice AI um exploding really
we're seeing larger enterprises whether that's in healthcare um or with um larger um airlines as well or different providers we're seeing much larger enterprises in those spaces um go ahead and try to use that technology there yeah very interesting i mean I think so I know I've talked to some people you know that run small businesses that you know have admins or back office type things i
think one of the the there's still some like hesitance right around voice agents and I think that you know there especially with in small businesses I I do see you know maybe it makes sense for like call centers and and you know larger companies but and you know where they were maybe already outsourcing a
lot of that like customer service but I do think like smaller companies or at least people I've talked to do have some like express like concern you know of just I don't know if I want my customers calling in and talking to AI when people you know they're probably going to be able to tell it's AI uh they're going to
know it's not real they're going to feel like I'm losing you know maybe like that touch point with a customer have you heard similar sentiments have you talked to customers that kind of have some of the same concerns i'm curious we have so I think like every new technology right like um there's this honeymoon period where you know people start using like
it's great you know we can replace 80 to 90% of your calls and then once you deploy it and you start getting feedback you realize that's not really doable i think some of the best customers that we have um maybe automated like anywhere between 40 to 60% of the calls and I think that's like a really like that's a stretch goal um and so what we're seeing now is whereas
like you know your typical founder like the initial voice agent builder would go out they would go and they would try and sell and say hey we're going to replace your entire call center we're going to replace like every single use case now they're being a bit more careful right they're satisfying the use case that
their agent can actually accomplish like pretty well versus saying like it can do everything because to your point um it can't do everything today right there are some use cases where it's great for like appointment scheduling um you know it's still hard but you can get to like a really high quality bar
um and so we're seeing a ton of people focus on you know on different types of um appointment scheduling but then when it comes to maybe legal right where uh stakes are higher where you have a person who called in whose child just died uh we've seen some teams try to deploy these voice agents where they
have to quickly roll them back because uh the technology just isn't there yet but having said that um you know um voice AI today is a much better place than um you know where it was say 10 years ago right with like Siri or Alexa skills or um or previous frameworks and so now um we've sort of hit that
inflection point where it can actually be used in real life settings where it's actually useful where you know if you're calling a if you're calling like an agency of the government right and you usually have to wait like an hour or two hours um that's actually a really great use case to use voice AI um or if you want to quickly book an appointment or
you know for other parts of the world where reserving a table at the restaurant still can't be done over the phone um instead of like creating those APIs the voice agent is the API for these things right because a lot of the back office and and some of that work is still done through voice um and so for those use cases been great um but um I
would say it's still not brilliant for high-stakes use cases and that's where now we're seeing um you know we're seeing the industry settle on like okay let's be real about what's possible and what's not at the same time though things are moving so quickly that you know with the next model you're almost you know tempted to try
this again right and you keep trying and I'm certain that in the next couple of years um you know all of the calls are eventually going to be replaced by voice AI but um right now I'd say um based on use case uh anywhere between like 40 to 60% is still great for the bottom line right because of how much money you're saving um but it's also something that
was just unfathomably like impossible just like two years ago yeah yeah I know that I was you know I was talking to this person and I talked to some other uh people that I know around just because I was curious on who would even consider using voice AI because I think it's you know I'm kind of in a bubble
just like you you know like I'm AI-pilled in a way right so I believe that this stuff's all going to disrupt what we're doing but I think that the rest of the world maybe is less convinced or needs more time to you know to kind of get used to the idea of like you're not talking to necessarily a human but it can do a lot of the same things and so
and actually because of that it can maybe do things better because that human's not distracted or have other things that they're trying to do so I think there's in many cases it can actually perform better one of one of the my kind of suggestion and I I don't know maybe you've seen this but the suggestion for people that are a little
bit you know concerned with okay well one I don't want my employees to think that their job's going away and two I don't want my customers to feel like I'm shortchanging them and don't want to give them a person because I'm trying to save money but my comment was around like what about the off hours if you're a
business you have off hours where you are not manning the phones you know 24/7 necessarily for a lot of businesses what if you could actually rather than like leave a message you could actually try to help those people and you could be very clear that this is AI you don't have to like maybe hide it but you could
tell people like "Hey this is you know an AI assistant but I can try to help you i can help you with these things you know what can I help you with?" Whatever or like the Is there is it Have you seen anyone almost do like failovers like if the receptionist doesn't answer the phone rather than going to a you know a
message machine it goes to like the AI assistant to like be the backup have you seen anything like that uh we have I'd say we've seen the opposite a bit more frequently though where um for people implementing this they they start with the voice agent right uh but then if the use case if it picks up some intent that the use case
uh you know something that they don't manage then they automatically go to a human at the same time we have seen uh this exact pattern where um you know if you're calling in off hours um for legal for example to your point for this company I was mentioning where they did manage the high stakes what they've done now is exactly what you mentioned instead of just leaving a message over
the phone at least the voice agent picks up it takes the details and then it tries to escalate by email or Slack or some other type of notification based on how important it is um but one comment I wanted to make uh because you mentioned call centers and like people um you know people fearing having their jobs replaced what's interesting and I
think Kevin from Leaping mentioned this a few days ago is that um you know most people working at call centers actually don't like working there um the churn rate is incredibly high it's anywhere between like six to 11 months um and so you generally when you're working in a call center I had some my family that
always was um used to work there as well is you generally like tend to have people who are calling yelling at you screaming at you just calling you different names it's actually really really like it's a really hard job like especially emotionally um to um to to be at so um yeah so I'd say it's actually
like a net positive right because now humans don't need to go through that trauma um and then yeah my own so I'm going to push back a little on that because I think I agree with you but also today what that means is all the easy questions you know the quote-unquote softball questions are going to be answered by the AI and the
really hard challenging ones are the ones that are going to get passed on to uh to humans today and then eventually maybe that goes away but um I guess yeah I mean we had I think that is the reality of today having said that at least you're still reducing right some of that load away and I think like an AI is uh can be um less emotional right about getting
someone like you know screaming at it or like just acting frustrated and it can just keep going into like you know a nice scoop um and the humans that are calling are going to get real frustrated when they don't get the reaction they want because I think that's part of it they want to hear someone else like have some kind of emotional reaction to their
you know their issue but uh but the AI can probably be much more empathetic and calm in in a lot of ways so maybe that's a good thing uh yeah and this is kind of ties back to like the first news article i don't know if you all caught it but you know the CEO of Anthropic is kind of predicting that they're going to wipe out half of entry level white collar
jobs i would count call centers is kind of part of that so maybe there is going to be some job displacement in call centers may and I don't think like I don't predict that they're going to go away because I think that there's always going to be the need for human touch on some things at least in the the relatively near future but I could see
less people being needed for those types of jobs yeah definitely i I think it's just definitely going to narrow down how many you have i one interesting on call centers though is that's easy to say for you know for um maybe companies in the US that you know they can hire like an offshore call center company in India
that speaks English right but what's really interesting is these models aren't brilliant for multilingual right so if you want someone to speak German speak French speak Italian do that at a low latency and high quality that's still I'd say like a year behind like a voice agent that can speak English today um and so I
think the displacement won't all happen at the same time but but yes I I do agree that eventually it's it probably will happen maybe not 100% but like 95% would would be my bet 95% huh okay i I think that's a bold prediction what's the timeline what's your timeline five years for call centers i I think call
centers are the are the are are the ones that are getting targeted first uh for too many like valid reasons people don't want to work there um you want to be automated there's clear business uh value in doing it reduces costs it's like a win from honestly from most sides um I would say like three three years probably like a timeline you you heard
it here first in three years 95% of call center jobs are going to be gone according to James from Roark so I don't know we're going to bring you back on in three years James and we're going to be like do you think it hit the number was it 80% was it 95 was it 50 you know um yeah I guess Daniel what do you think about all this yeah I think I
think to both of your points like I think uh the current call center jobs I think they will end up being a little bit more specialized uh since again like anything low stakes a voice agent can do so it will handle most of those uh I do agree with James like I do see most of the call centers
uh going away and like one very good point there is the language uh English is at this point at a very good rate uh to simulate uh you can most of the times you can hardly tell that it's a voice agent uh but when it comes to other language or speaking in certain tricky accents that's when it becomes a little bit tricky like especially if you you
have a particular language that has some very specific dialect that becomes a little bit tricky so I guess like the what what was it James was 90% like the I guess 95 95 90 So so if you are just tuning in we were talking about voice agents and we previously talked about job displacement with AI of you know based on what some
comments the CEO of Anthropic just made uh so the bar the over-under has now been set in three years 95% of call center jobs will be displaced by voice agents Daniel do you take the over or do you take the under on 95% in three years i'll go with 90 to 95 like I I Okay so you're taking the under and I will take
the I will take the under of 90% i think it's going to be maybe like if I were to predict I think it's going to be like 50 but that's my prediction so you know but I but I'm just going to take the under on 90 and now I have that whole space so we will come back in a couple years and we'll see uh who who's right and who uh who's wrong abs absolutely i mean I'm
not going to be wrong by 5% right that's that's fine yeah that's okay but um one thing I want to mention though is um you know as we as we speak about where they're used as well um you know if we think about maybe an obvious one um we go to this example pretty often about like dental clinic right some of these
dental clinics um you know some in the US do have a receptionist who's you know taking calls doing payments and more some don't though so what ends up happening is this one person who's doing like 10 things is now actually really happy that they have like a voice agent taking their calls um because uh they
stop missing calls right and they can actually focus on their job they can actually take payments they can actually help with other tasks um so I don't think it always necessarily relates to like people losing their job necessarily right i think if anything it just eases the practice it helps out in ways um that's just very beneficial to have
versus not but um yeah yeah absolutely um well I always ask people that come on is would you want to demo something you want to show a little bit about Roark is that interesting to you guys i know you know we always like to say like seeing is believing in all this stuff so I know what you all have what you built but it might be cool for others who haven't seen it to see you know
we're talking about voice agents but how does Roark play into that what does Roark do how does it help someone like me if I want to build a voice agent actually do that a little bit better yeah uh definitely i'm happy to give a demo i'll just go ahead and share my screen here yeah and while you do that uh we're talking to
Daniel and James from Roark we're talking about voice agents and they're going to show a little bit about uh what Roark does yeah um so yeah uh so I'll just make sure I'm sharing my screen here okay cool awesome it's showing up there you go um yeah so this is Roark this is our dashboard uh so we do have two products i'm going to start off with the live monitoring product we have a live monitoring product and an end to
end uh testing product for the live monitoring product um essentially we connect to uh different voice agents whether those are running uh you know using your own orchestration framework or you're using something like Retell Pipecat LiveKit or Mastra um now as well i know you guys are getting into voice and we're super excited to be working with
you guys um you know we'll hook into all of these um and we'll get um we'll start taking in all of your calls we'll do the transcription um we'll also create call summaries for you we'll do sentiment emotion analysis uh but we'll also capture stuff like you know where your tool calls were show you the input and
the results um but usually for the live monitoring um situation the most interesting thing are our evaluators and what we do there is we essentially evaluate every single call that comes in right based on some specific criteria that you have um and so in this case we're evaluating for whether a call is
um whether the agents responses are relevant to the customer's request and we start showing anytime there are any form of failures directly in the transcript here so in this case you can see that we highlight a failure so this is a call between an agent um a dental agent a dental clinic agent and a customer and a customer is asking for a specific type of treatment unfortunately the agent starts to hallucinate and says
it doesn't you know um it doesn't support this treatment even though it does and so our evaluators catch that issue we tell you hey the agent you know offered a cavity filling instead of addressing the customer's request for a root canal treatment now um that is our evaluation product i'd love to just show
you guys what it's like to define one of these evaluators actually so if we go on this tab here um this is our evaluators tab this is where you would be able to go in you know define what success means to your business and if you tap on new evaluator we can go ahead and do that um what you're going to get is we have a new set of templates that you can use
based on the industry that you're in um and in this case let's say that we're going to define one to test our um yeah just test I guess let's see response speed and clarity right um and so what this is going to do is it's going to automatically add in a couple of what we call evaluator blocks which essentially are individual
metrics that you care about that you want your um evaluator to extract right and so here we're going to see we're going to check for the average response time and we're going to make sure that we have no um durations are less than one second here um we're going to make sure we have a custom prompt where we're
essentially telling the you know telling our evals just to rate how clearly our agent explains um you know the um the menu for for this you know the menu or anything it gives throughout the call um and then we also capture uh we want to check whether it was polite and what the overall customers um sentiment was um so
with that in mind I'm just going to go to one that's ready and just show a demo of that so we go here we're going to see that we're going to test for whether um you know the answer was relevant whether it offer time slots previous instructions and the latency and what we can do is we have this test evaluator here where you can really quickly just
drag and drop an audio file and it'll essentially run this entire eval on top of it um so yeah um that's a demo of our live evaluation suites um what I'd love to do now is just give a very quick like one minute demo of our test suites as well um so let's say that you are pre-deployment right let's say that you
have your agent and you want to test for something specific um and you want to make sure that as you update your agent's prompts you don't cause any regressions what you can do today with Roark is you can go into our test sets feature you can hit create test set and once you do this you can actually
generate a ton of different scenarios different personas different customer variants whether it's people that have you know maybe someone speaking English with um a German accent right so you can decide whether they are male female or gender neutral you can also decide whether you want to add some background
noise um in the call so this is actually generating like synthetic data but like synthetic audio is that right yes exactly exactly so this is going to be uh you're essentially uh creating a simulation of a customer right that's then going to call your agent um and have that conversation based on the persona that you define here and so you know in
this situation we can actually define someone like hey let's say it's an elderly frustrated customer attempting to book an appointment for a root canal treatment uh once you use the app 10 yeah and what this is going to do is once we just hit add scenario here um you're going to see that it's going to add it as a test case um and you can
keep adding right you can like add 10 20 30 we have some customers who now have like hundreds of these test hundreds of different test cases up and running and then uh let's just give this a name here let's say this is a booking call flow uh the final thing to do here before we can actually start this off is to define how we're going to test whether
something's successful or not and that takes us back to the evaluator that I just showed so for this we're going to define whether something was successful or not by selecting the appointment scheduling and follow-up evaluator that we just created in the previous step and then once that's done you just click next and uh with that um what's happening now is
Roark is calling your agent it's going through those test cases um and what's cool about this is that you can actually not just run a single test case but you can run hundreds of these in parallel at the same time um so because of the nature of it being an actual phone call this will take um you know potentially two to three minutes to
actually complete but you can run multiple of these at the same time and so in the interest of time I'm just going to jump to the result here so once it's complete every test case will get a very um quick pass or fail score you can even open it up and then you'll get a summary of why this failed right um and then you can even hear um the you can
even hear the the customer call that our agent had with your agent here yeah can Can you play it can we hear uh yeah i'm not sure if I'm sharing my sound i think you are just click play okay yeah let's do it i'll I'll tell you nope just kidding i I lied to you uh and it was I lied to you on live on a live stream uh but yeah for some reason I thought it
would just uh maybe you have to click a button who knew uh maybe I can let me see if I can try this again let me see if I can share my what you hear um what input your advance options uh seems like I'm not seeing the option unfortunately oh yeah okay I am cool um live debugging there you go we'll we'll figure this out for those of you that just joined us we're talking to James and Daniel from
Roark we're seeing a demo of of what Roark can do and we're going to hopefully hear hear some audio hello this is Mary from Mary's Dental how can I assist you today hi I'd like to book an appointment for a root canal treatment please preferably on Tuesday great do you have any openings on Tuesday okay could I book it for 3 p.m on Tuesday
sure just to confirm could I have your full name please yes it's Rachel Tyrell thank you Rachel so we have you booked for a root canal treatment on Tuesday at 3 p.m does that sound good yeah um so that's uh an example of essentially the simulation going through i mean the agent didn't sound that friendly in the
middle there sounded kind of like short and to the point i would uh I would say you need to be a little friendlier show a little more empathy agent yeah it so that could be the depending on the prompt that's that's fed in essentially um we do allow to like customize both the emotion and the sentiment for for every single call um
and I think that's probably in this case the main reason this failed was due to the latency issues in between um but that that also affected the call quality here yeah cool very cool um okay so I think Daniel James this you are on X so if you do want to you know keep up with what Daniel and James are doing talking about you can find Roark at roark.ai
anything else you guys want to talk about today uh no I'd love to yeah I'd love to talk about how Mastra's incorporating actually voice AI into your stack we saw you guys released some pretty awesome like start uh you know start speaking as well like methods a couple of weeks ago so yeah so we so we are taking like a we'll
call it like baby steps approach into voice because we know voice is important uh you know ultimately we do believe that agents want to be multimodal right an agent should be able to have you should be able to attach a you know a voice to an agent and talk to the agent and then the voice should or the agent should be able to talk back so that's
really kind of where we started is we wanted to allow you to pass in a speech to text or text to speech and essentially like give an agent a voice um and then from there we've we've slowly you know people have asked okay well how do I now make this tie into calls and like handle the telephony and and all those parts and so we've kind of been like providing customers that
ask for that like guides on how they can do it it is possible it's not you know we don't necessarily have like first class support for everything yet you know we are working on it but it is something that we're kind of like baby stepping into and you know over time I think I think we'll have you know deeper
and deeper integrations we we do integrate with you know like OpenAI's real-time voice you know speech as well so we kind of have this idea like you can use real-time providers you can use you know you can bring your own model and speech to text text to speech and we kind of wire it all up for you we've added you know voice capabilities into
our dev playground so we want that to be multimodal as well and so I do think that you know over time we you know we'll kind of get further along in the like full voice agent support and you know call support we're not quite there yet as far as like where we want to be but yeah it is it is something that's
very interesting to interesting to us we do have quite a few uh of our customers that you know are building agents and they do need a voice aspects and sometimes it's we need we want it on like a web or mobile and we work really well for that sometimes it's oh we need actually like calls and it's like well we can wire it up here's how you do it
but over time we'll we'll continue to kind of improve that and and launch even more things around kind of voice yeah yeah i mean Mastra is already like the you know the best um agent framework out there right so it just makes sense to also become the best voice agent framework out there i like to think we're the best agent framework uh others might disagree but uh my opinion is
we're the best um but so chat if you're watching you know if you have any questions for Daniel or James you have any voice agent questions please drop them in this is live we're on X LinkedIn YouTube um definitely reach out to Daniel and James if you have questions if you want to talk voice agents you want to uh obviously try out Roark if you're building a voice agent uh but
yeah it was great having you guys and we'll definitely want to have you again in the future amazing uh yeah thanks so much for having us um yeah thank you thank you so much all right yeah we'll see you guys all right everyone so that was cool chatting with uh Daniel and James it's been a little while since I I last talked to them but I do think voice agents are
just really interesting space a lot of excitement around it i think like a lot of things every model iteration it unlocks a little bit more uh more and more use cases so like things that weren't possible slowly become possible so even like small improvements do start to unlock use cases that maybe wouldn't wouldn't have worked well before and so
it is kind of interesting like when you when I look back a year ago or two you know two years ago when I was starting to build audio feed and like the text to speech providers and how far the latency has decreased how far the quality has come how how much easier it is to just like implement you know speech and audio
features it it's definitely improved a lot over the last two years and I can I only guess that it's going to keep improving over the next few as well if you do have any uh questions please drop them in a comment on YouTube on X on LinkedIn and I will try to get to them so this uh next segment so I noticed that uh you know if
those of you been following us for a while or watching some of these episodes you'll see that today and even yesterday we had two live streams now you might think that this was planned that we had always planned on having two live streams but the truth you know the honest truth of it is uh we didn't plan
this Obby's gone rogue uh he's in Europe he was in Japan now he's traveling through Europe and he thought it'd be fun to just try to hijack the show well if there is someone if you are watching the chat please tag Obby on X and just like point him to this segment you can't take the show from me i'm here to stay and you know you can
come back on my version of the show if you want or you can keep hosting your own but you know I think we know whose is better but somewhat jokes aside Obby did have a segment today that I'm going to blatantly steal he was talking about YC uh X25 companies and I think that's kind of a cool thing because there's a lot of cool uh new AI especially around like dev tools which if you're watching this
you you're following Mastra you probably care about building AI agents building AI applications and so one of the things I like to do is keep up to date with what's coming out because when you when you see like what's launching you know where you can kind of predict where the space is going you can kind of predict what you know our customers are
going to start asking for and so I I do like to uh just know what what's what's out there what you know are there good companies we can start to integrate with we can partner with um cool new technology trends that we can kind of like tap into in different ways and so I wanted to highlight some of my own
companies that I was especially like impressed with or excited about and you know these kind of are different levels of like you know you know they are still in YC but some of these are like really far along for YC companies so I I'm going to share uh share my screen and we will uh we'll talk through a few of these companies so the first company that I wanted to
highlight is Casco so Casco is agentic red teaming for AI agents and apps so it prevents AI agents and apps from messing up by evaluating them for security safety and accuracy so it says it already secures AI systems deployed in 60% of Fortune 500 companies so that's pretty cool um and I I know I think you know maybe the if I remember right I know Renee and
um yeah I think they're ex AWS so like they dealt with some of these like infrastructure type problems and so I think they have like a really cool low latency solution to driving uh basically like protecting LLM calls right so I think it's like a kind of a very low latency middle middleware that helps protect your LLM at least that's the way that it's kind of explained to
me when I talked to them so was very excited to see this maybe that's How long is this video two minutes let's watch it hopefully you can all hear this if it works for whatever reason when I'm streaming sometimes and it's not really working so we're just going to say we'll read it instead but it is pretty cool uh something that I'm interested in i
would Yeah definitely think security in AI is is kind of a big deal it's more and more uh becoming more and more important especially as like enterprises try to get into AI like security is obviously important for everybody but it becomes even more important when you have a lot more at stake which if you're an enterprise you have thousands of
employees you have you know millions of dollars in in revenue and in customers it becomes much more important to make sure that you're if you're deploying an agent it doesn't mess up or there there's some like stop gaps in place so you have some level of confidence all right next up I want to talk about
Starling so Starling is pretty cool it's kind of like an internal developer portal so it's kind of pulls all of your like developer tools into one kind of like local UI for your team and so you kind of like have one entry point and so one of the founders Daniel led an engineering team at Netflix that built
their IDP it was called Netflix Console and so it was a central hub for all of Netflix's engineers managers all that stuff and it was the highest daily active user internal tool for developers in Netflix and you know Yonas built uh StackShare which is a developer community that had tons of tons of developers using it so if we look at the
problem developers are spending 20% of their time on engineering tasks outside of their code editor yep i I would agree with that like a lot of a lot of engineers are spending time just in project management systems you know writing tests all kinds of things i guess I guess if you're writing tests you're probably in your code editor but
maybe fixing tests or debugging tests uh yeah use Sentry GitHub CircleCI PagerDuty Supabase Vercel a whole bunch more things right like if you think about if you're developing something think of all the tools all the like developer tools you use in an average day if you're like actually shipping code it's quite a bit so they're building a system of AI agents that perform engineering tasks
using a deep context that only exists in an internal developer portal so it's kind of a central command center that houses internal services applications APIs documentation and has AI agents that are continuously improving them so it brings all the tools into one system and kind of looks sounds like it layers on agents to help
make the kind of intersection of those tools much smoother cool so yeah I just thought this was really cool um kind of saw some early demos of it and yeah I was really uh really pretty impressed and last and this one is uh I think auth you know it's it's kind of tied to security but it's more like authentication of agents and how to
handle authentication you know we we're building some things at Mastra to try to help with this and to try to integrate with different auth providers but uh you know Better Auth is the fastest growing auth framework for TypeScript so it definitely has like a very uh steep trajectory of of usage and adoption and I think just auth is is like a
really big problem to try to solve and I like the idea that it's trying to be framework agnostic so it can just kind of integrate neatly with a lot of things um it looks like it has some integrations with or being used by other open source projects 350,000 monthly npm downloads that's super impressive a lot of people on Discord following it so I think auth is
Yeah especially when you get into like agentic and AI applications auth becomes really important and I wish auth were a solved problem an easy to solve problem i don't think it ever will be right there's like a lot of concerns you have to think about so there's so auth will never be easy but anything we can do to make auth easier
because if there's one thing I know implementing auth in the application is always one of the things I look forward to the least so anytime I'm building and you know like for instance when I built audio feed when I built a lot of even like our our example apps that use auth it's like that part of it it's easier
now with you know agents can help you get there but you still have to know what you're doing you still have to protect you know know how to protect the right routes know how you know who should be you know actually authorized to do certain things and anything you can do to kind of like provide a library to make that easier you you have my support so I think this is pretty
cool and so yeah those are the I would say three of the YC companies that I have been kind of specifically interested in from the current batch if you are in the chat you know we have over 100 people watching this right now if you're in the chat let me know there any it doesn't have to be a YC company any new AI tools AI applications that
you've been kind of interested in checking out um definitely want to want to hear from you what what else should we be talking about here on on this stream we we try to highlight anything that's kind of up and coming and we talk through it give our perspectives and uh definitely want to continue to do that so let us know what we're missing
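[Editor's sketch] The auth discussion a moment ago comes down to two checks: which routes need protecting, and who is actually authorized to reach them. The TypeScript below is a minimal, framework-agnostic illustration of that idea. Every name in it (`Role`, `Session`, `RouteRule`, `authorize`, the example paths) is hypothetical and made up for this note; it is not the API of Better Auth, Mastra, or any other library mentioned on the stream.

```typescript
// Hypothetical sketch: none of these names come from a real auth library.

type Role = "viewer" | "editor" | "admin";

interface Session {
  userId: string;
  role: Role;
}

interface RouteRule {
  prefix: string;       // path the rule covers (exact match or any sub-path)
  minRole: Role | null; // null means the route is public
}

type Decision = "allow" | "unauthenticated" | "forbidden";

// Higher number means more privileges.
const ROLE_RANK: Record<Role, number> = { viewer: 1, editor: 2, admin: 3 };

// Example rules with assumed paths. Anything not listed is denied.
const RULES: RouteRule[] = [
  { prefix: "/login", minRole: null },
  { prefix: "/api/feeds", minRole: "viewer" },
  { prefix: "/api/feeds/publish", minRole: "editor" },
  { prefix: "/admin", minRole: "admin" },
];

// Pick the most specific rule: the longest prefix that matches the path.
function matchRule(path: string, rules: RouteRule[]): RouteRule | undefined {
  return rules
    .filter((rule) => path === rule.prefix || path.startsWith(rule.prefix + "/"))
    .sort((a, b) => b.prefix.length - a.prefix.length)[0];
}

function authorize(
  session: Session | null,
  path: string,
  rules: RouteRule[] = RULES,
): Decision {
  const rule = matchRule(path, rules);
  // Deny by default when no rule covers the path.
  if (!rule) return session ? "forbidden" : "unauthenticated";
  if (rule.minRole === null) return "allow";
  if (!session) return "unauthenticated";
  return ROLE_RANK[session.role] >= ROLE_RANK[rule.minRole] ? "allow" : "forbidden";
}
```

Calling `authorize(null, "/api/feeds")` asks the question a route handler would ask before doing any work. The deny-by-default branch is the part that is easy to forget when an agent writes the routes for you, which is why a shared library for this is appealing.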
And we do have a guest coming on shortly but before we get to that I did notice one other thing that you know I stole this segment from Obby obby stole a segment from me where he basically highlighted a open-source GitHub repo in this case it was one that I had already highlighted that you can tell he does
not watch the stream but uh he highlighted Muscle Mem from our friend Eric and I wanted you know let Obby know that just because you went to sushi with Eric does does not negate the fact that he came on the stream with me first so uh just a a fun fact there for for the audience watching if you you kind of get
to pick between Obby's live stream early in the morning because he's in the EU or my live stream uh later in the day but today's been a busy day kind of at Mastra you you've probably seen we we've been posting a lot of content around just doing a lot of live streaming and the reason is because we just ultimately we
have a lot of things that we are talking about internally and we're thinking about when it comes to AI because this stuff I've said this you know hundred times if you've been watching the stream the last few weeks stuff is changing a lot there's always new launches there's new features there's new uh tools that are out there and so we try to stay on
top of it as much as we can we get we learn a lot from just talking to our users and our customers and uh that helps us decide like what are we building how are we thinking about the future in AI and what's to come and so we want to bring a lot of that those learnings and those lessons to you all so as you're thinking about whether you're building an AI agent you're
building an AI application you're considering getting started um maybe you're like a lot of the people that I talk to and you maybe came from like a web background or you know building apps or either web apps or websites and now you're kind of introduced to this idea of AI and you want to you know figure out how you can
use it or how you can release you know AI features in your app whether that's a full-blown agent or just like some calls to an LLM to add some nice uh you know nice user experience enhancements to the app um that's why we're here we want to kind of introduce you to all that and so you know we Obby had a stream earlier today he talked through a bunch of things had had some guests we did have a
workshop at Mastra earlier today as well which was you know part of it was live streamed despite my uh fighting with some technical difficulties you know we're still new to this we're figuring this out so uh you're going to get some technical challenges uh that we run into but we're getting better but there was a a live stream hiccup but besides that
there was a really great uh workshop that we hosted so we typically host one like in-person workshop i say in person on Zoom so it's a little more interactive uh but a Zoom-based workshop every week where we try to teach something very specific often it's it's almost always using something with Mastra so it's a good way if you are
thinking about you know you want to learn some stuff about Mastra you don't know where to get started uh you can go to our website we almost always have it kind of in the the top banner you can go to Luma i'm going to post the link here i'll just share my screen you can find us on Luma so if you go to lu.ma/mastra
you can see we have one event we'll be scheduling some more we typically do like I said about one a week um so you can see our this today we had we covered our new workflows engine which was just released and um so and I'll talk a little bit about that in just a second but we so that was kind of a nice way to get started with what are
agentic workflows how do they work within Mastra why are they important last week we had one on just RAG and how you can use RAG to build production-ready RAG applications and how you can do that with Mastra we did building a full stack agent we did brought some friends from CopilotKit a couple weeks ago we did
one with uh talking about building an agent with Mastra we did a talked a little bit about MCP but mostly it was just me showing Matt Hok what Mastra was and how it can be used to build agents we did a workshop on just vibe coding tips and tricks building voice agents with Roark so we had a whole bunch of uh these workshops
that you can go in and you can register for and we do kind of some hands-on learning and so we always release the recordings they're on YouTube so you can see not all of them but the last few at least have been on YouTube and they will be going forward so if you're interested in learning a little bit more about Mastra you can go there and check that
out um I do want to call out just workflows in general so we recently did release a new workflows engine in Mastra and so that was what the workshop was all about today and the the cool thing about our new workflows engine is we originally built it thinking that you know okay we're going to build this workflows engine
everyone's going to use it they'll just you know it'll be easy to deploy but that wasn't necessarily the case what we realized is people have their own uh they want a simple API to build workflows but they also need you know might want to deploy it to Inngest or Temporal or to Cloudflare Workers or Workflows uh so we
did uh kind of rebuild our workflows engine to one streamline the API but also just make it uh we're slowly adding kind of integrations where you can run our workflows engine on all these different platforms just to make it easy so wherever you want to run your workflows you can and I think one of the questions we get often and you know kind of came up a little bit in and we talked
about a little bit in the workshop today was why do workflows exist why can't I just like tell my agent to do it and give them some tools and let my agent cook and the truth is you can but there's kind of this spectrum of like determinism and so we try to cover that with like three core Mastra primitives that
you can build with so on like the most deterministic side of the spectrum you have workflows because it's relatively the steps are relatively well defined you can have branches you can call out to LLMs for specific things you can wait for human in the loop feedback so you can kind of suspend for as long as you need to and then a human responds and it
resumes and picks up where it left off uh so that's like it's still kind of agentic but it's kind of a little bit more deterministic then you have kind of agents in the middle where you can define agents give them tools give them a system prompt and the agent can kind of go through its loop and call tools and and make decisions on its own and
then you kind of have even further on like the more probabilistic side which is even less deterministic is kind of agent networks and that's something that we kind of have in an experimental state in Mastra today but we're about to kind of we're getting very close to kind of having that ready for an official release so we're we're kind of trying a
few new changes to the APIs and uh I think that's going to be that's going to land here pretty pretty soon and so we using those things you can kind of compose pretty complex uh agentic systems and so that that's I think when you're kind of deciding my always rule of thumb is if you can make it a workflow you should probably try to make
it a workflow because it does um it does give you that little bit more uh level of confidence that the quality of your results is going to be more consistent so and with that if you are joining us thanks for joining AI Agents Hour we're here every day usually it's around noon Pacific unless Obby uh is running his EU version which was earlier
today but we do uh talk about AI news we talk with people from the Mastra team we bring on special guests we talk about building agents and AI applications but let's bring on our next guest all right how's it going hey Shane going well how are you it's It's good it's been a while since I've I've seen you it has it has but we've been communicating on the Slack which is awesome i'm really
excited to see how the live stream goes in real life here yeah yeah you get to experience it firsthand so be we can talk about we can definitely talk about that but I'm uh maybe you should give a little introduction and let the let the you know 100 plus people that are watching this know a little bit about you and and Mosaic as well absolutely
yeah um so my name is Adish i'm the founder and CEO of Mosaic uh we were in the kind of most recent Y Combinator batch with uh Shane and the Mastra team uh what we're building with Mosaic is really an agentic paradigm for video editing so what we give users is a canvas uh a node-based canvas where you can create and run your own uh
multimodal video editing AI agents and it sounds like a mouthful but um I promise like by the end of this demo uh session it'll be very clear what you can do and kind of to your point Shane about um you know agentic workflows and when you can make things workflows you should because it builds that level of confidence that's exactly what you know
Mosaic offers you so you have the ability to kind of come in and predefine these utility based agentic workflows um and yeah have some level of confidence that you know what you're getting is is what you want yeah abs absolutely and I think that was one of the things that I thought was really interesting when I I mean I saw
the first demo or one of the first demos one of the early demos right during YC and it was the idea of okay well I don't just let AI decide I can actually like define a workflow of how I want like my video pipeline to work so it's like it's it's still a little bit more control i have the dials i get to turn the dials a little bit more but I still don't have to but I can kind of wire it up once and
then let it let it keep running right i can just like reuse it multiple times and it's so um and I I know we'll probably get into and we'll demo and whatever but uh so that that was the thing that I thought was uh most exciting was just this idea of like repeatable video production workflows which as you you know if you're watching this you know that we're we do a lot of
video uh videos over here so the idea of being able to wire that up and have like a consistent video pipeline is is really cool and even you know even after this you know when you leave I'm going to try to build an agentic workflow of like how we handle things post stream like creating summaries and descriptions and all that in a workflow and so I know at
some point you know maybe Mosaic gets inserted into a part of that process as well yeah I mean that's exactly it like when we started what we really started with just kind of inspired by things like Cursor was like a a chat-based solution to your traditional nonlinear video editor very similar to what Cursor
did with your traditional code IDE and um we found a couple different you know just problems with it one from just a kind of simple uh UI UX perspective when you have these large video files that you're trying to process and then handle like state changes of um it becomes problematic when you have that like
prompt response UX that's kind of baked into chat and so when you can now define these workflows and have them instead run in the background kind of on autopilot uh you're not sequentially blocked with how long the response time can be when you know you're processing these large video files um and then the second thing is kind of what you were
getting at which is you have these kind of repeated workflows that you're defining for the content that you're producing anyways and so if you can just package these up as these automations that you can then run and trigger programmatically with an API as well um that means that you can 10x your content game instead of having this like single point solution where you're kind of
still bottlenecked by how quickly you can interact with some sort of chat co-pilot right so um the idea yeah is you can define these agents and basically have them uh be very specifically geared towards a certain workflow so maybe you want to create shorts from a longer live stream but then you want to have another agent which is actually just doing some cleanup and maybe another which is optimizing for different social
platforms um and then yeah just like run these kind of in the background on autopilot yeah absolutely and I think that yeah it's it's just an exciting it's an exciting time to be dealing with like video because video editing is such a tedious task it's so time-consuming and I've I would not say I I'm far from a
proficient video editor but I'm I'm good enough to be dangerous like I'm good enough to know what I don't know but also like accomplish quite a bit of like basic level things yeah and yeah it's but it is so time consuming and so I I think like any tools that can help that process and help make video a little bit more you know malleable in like an easy
to use way is like it's like we still haven't it hasn't been figured out yet like it's it's still like too difficult in my opinion i 100% agree i mean I like to think that we're on kind of the forefront of this where you know we're really pushing the boundaries of what is possible i mean multimodal AI is still
very nascent in the sense that you know like compared to inputs of text and image like it is still very very new like I mean just last year right like Google Gemini was starting to make huge strides in terms of its vision and video capabilities and um I think we're at like this crucial pivotal moment where um like not only is content
creation increasingly becoming video but multimodal AI is increasingly becoming better and we like to think we're kind of like riding these two waves and we're going to hit this point where it's just like crazy how good multimodal AI will become that it can like really understand your video content and at that point you don't even have to
predefine these workflows could be like a real time agentic um editor for you uh that has a sense of you know the context of your video as well as your style of video content creation right like what do you like what works for you and personalize uh its memory to to really create um like beautiful aesthetically pleasing content for you um I do think like video AI is is just like it's it's
kind of about to blow up um and there's a lot of different you know companies I think existing ones as well as like completely completely new ones which are in the space and so yeah I'm just excited to see where it goes over the next few months yeah I mean we you know we are using and we have tested lots of different video AI tools for you know this live stream but I've also you know
I kind of built some video AI tools into audio feed you know two years ago it was probably a year and a half ago when I did that you know kind of before we were building Mastra and so like I've been very interested in this space too i think it's a it is kind of ripe for disruption and there's a lot of people doing interesting things but I I do like
definitely like appreciate your take on like the the workflow and like the streams that you can generate and create yeah yeah yeah um I think it'll also be really interesting how to kind of navigate the design challenge of you know people want to automate as much of the content creation pipeline and and basically have this engine which is automatically ingesting their you know live streams
and podcasts and content and then creating these clips for them and even just distributing it out and you know optimizing for different platforms but then at the same time um you still want some level of creative control where you can come in here and say "Okay like at a more granular level I actually want to trim this clip by 5 seconds and I want
to add this B-roll here and I want to add this music here." Um and so kind of navigating that design problem of you know people want automation and want to have this ability to market themselves and their brand and their content at a 10x scale at the same time they want granular level control and so how do you find that middle point um is is a
problem I think that is very interesting and something we're definitely navigating so so now I have a very self-serving question because you know again we're you know at Mastra we are clearly in the market for video tools that help make working with video easier and we've obviously been talking quite a bit on like using Mosaic
for some different use cases and you know definitely interested in in where that's going to go because I I see a lot of promise but I am curious because I you know haven't used the product that much yet only like seen demos I've I've played around with it a little bit and we're trying to figure out like what parts of our like video editing can we can use mosaic for but I I am curious
you know so I defined this mosaic workflow am I so right now I I can come into the canvas and I can you know upload and I'm sure maybe you'll show it here in a little bit but so the canvas is really cool I can upload a video and I can kind of have it process the workflow and I can get the result out at the end right so it's like input a video in does a bunch of things within my
control some of it's AI some of it's whatever um like steps and then I I get a result out do you have on the road map or maybe it's already available um the ability for me to like hit that from the API oh yeah so on Friday we're rolling out an API uh that will be able to trigger these workflows um just programmatically so you could you know um literally just
call our API send us like a a video file get some signed URLs back with uh your edited video content yes okay so so here's what I would like here's my proposal to you you know is so in this next segment when I once I kick you off this stream and I bring in someone else from the team that's going to help me build this we're going to start building
a Mastra workflow where basically afterwards I can send it the link to the YouTube video it'll get the transcript it's going to like write show notes for me it'll schedule a tweet to go out as like a summary so it'll do all like the goal is like automate a bunch of those tasks
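The post-stream automation described here (transcript in, show notes and a short post out) can be sketched as a small chain of steps. This is only a minimal illustration under stated assumptions: every name below is hypothetical, the `summarize` function is a stand-in for whatever model call the real workflow would make, and the 280-character limit is an assumed post length rather than anything stated on the stream.

```typescript
// Hypothetical sketch: these names are illustrative and are not part of
// Mastra, Mosaic, or any YouTube API.

interface TranscriptSegment {
  startSeconds: number;
  text: string;
}

interface StreamArtifacts {
  showNotes: string;
  tweet: string;
}

// Stand-in for an LLM call: takes the transcript plus an instruction.
type Summarize = (transcript: string, instruction: string) => string;

// Assumed post length limit; adjust for the target platform.
const MAX_POST_LENGTH = 280;

// Format a number of seconds as H:MM:SS for timestamped notes.
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.floor(totalSeconds);
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return hours + ":" + pad(minutes) + ":" + pad(seconds);
}

// Step 1: render a timestamped transcript.
// Step 2: ask for show notes. Step 3: ask for a shorter post and cap its length.
function runPostStreamPipeline(
  segments: TranscriptSegment[],
  summarize: Summarize
): StreamArtifacts {
  const transcript = segments
    .map((seg) => "[" + formatTimestamp(seg.startSeconds) + "] " + seg.text)
    .join("\n");
  const showNotes = summarize(transcript, "Write show notes with timestamps.");
  const tweet = summarize(transcript, "Write a short, informal summary post.")
    .slice(0, MAX_POST_LENGTH);
  return { showNotes, tweet };
}
```

Passing the summarizer in as a parameter keeps the sketch independent of any particular model provider, and makes each step easy to swap out once the real workflow exists.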
right because we don't have a production team the production team is literally Obby and me and whoever else can help in that day and you know we have a few other people that we kind of like help out here and there but it's like this is a pretty lean like we're just kind of making it up as we go right but if we can automate a lot of it we can start to
feel like things are even more polished and so ideally like one step in this Mastra workflow would be we have this video okay send it to Mosaic to like create me some shorts or create me you know a clip and maybe my agent can even call that workflow but I can tell it what you know kind of what I want it to do like oh I talked to
you know Adish uh at this time frame like I want to create two shorts for this or whatever call this workflow and it's actually like smart enough to do that and it can read the transcript and know like the timestamps of when I talk to you and so it can like send that off and get much better results so
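The idea of reading the transcript to find when a guest was on, then handing that time range to a clipping step, could look roughly like the following. This is a hedged sketch, not how Mosaic or Mastra does it: the segment shape, the keyword matching, and the padding value are all assumptions made for illustration.

```typescript
// Hypothetical sketch: a naive keyword search over timestamped transcript
// segments. A real agent would likely use a model to locate the segment.

interface TimedSegment {
  startSeconds: number;
  endSeconds: number;
  text: string;
}

interface ClipRange {
  startSeconds: number;
  endSeconds: number;
}

// Return the span from the first to the last segment mentioning the keyword,
// widened by a few seconds of padding, or null if it never appears.
function findMentionRange(
  segments: TimedSegment[],
  keyword: string,
  paddingSeconds: number = 5
): ClipRange | null {
  const needle = keyword.toLowerCase();
  const hits = segments.filter(
    (seg) => seg.text.toLowerCase().indexOf(needle) !== -1
  );
  if (hits.length === 0) {
    return null;
  }
  const start = Math.min(...hits.map((seg) => seg.startSeconds));
  const end = Math.max(...hits.map((seg) => seg.endSeconds));
  return {
    startSeconds: Math.max(0, start - paddingSeconds),
    endSeconds: end + paddingSeconds,
  };
}
```

The resulting range is the kind of context that could be passed along when triggering a clipping workflow, so the clipper does not have to guess which part of a two-hour stream matters.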
that's what I'd like to build that's my dream scenario I know we're not there yet obviously the API doesn't even exist but um and this workflow is like fictional in my head but I'm hoping over the next maybe few weeks on this stream to like build that out so others can see like okay here's some like really cool things you can do so I I will I will be
a a tester of of your API when it's available so and I will happily happily show it on the stream once we kind of get it to a point where we're like confident this thing's gonna going to actually work yeah I mean Friday I'll I'll let you know uh that's kind of the goal for us and to your point about it being able to dynamically produce these
workflows for you like this is definitely possible like I was just playing around with um you know having um Gemini basically produce one of these workflows for me based on just like a short sentence about what I wanted to create so I you know gave it um a video i gave it a sense of like okay what nodes it has access to and just asked it
like hey I want to like create two different shorts which talk about uh these different things and it has the context of my video so it can actually pull out certain moments and configure each of those tiles with the relevant context of my video so like what you're suggesting definitely is possible and in the realm of what we could do um but
yeah as a first step here what we're going to allow with the API is you can come in and define your own workflows in uh the Mosaic UI and then trigger them programmatically um also just configure each of those uh programmatically so you don't have to come in and and say "Hey I want to pull out this moment where Adish and I talk about Mosaic." Um you would
be able to just do that from from the API yeah and I think that's that's like the the goal right now is just I'm totally happy to go in and define the workflow and to you know and then I ideally I can just wire up an agent or a Mastra workflow that you know can maybe wait for some human feedback where I can like send it some information and then it it knows it passes some information
into that workflow or whatever or somehow like you know I can kind of kick off this post stream workflow and it just kind of like I check a few boxes i I put in some information about what I want and it kind of knows it has the tools to kind of do it for me just like if I were to you know if I had a you know a full-time video editor that could
that could do that stuff ideally I would want to like spend a little time to define it and then I just want to be able to kind of like get some of this uh some of these things out because I I do think that we do these really long live streams but a lot of this content like you know as much as I appreciate everyone the 126 people watching right
now I I know that you know you not everyone can always watch every moment of this so there'll be good parts of this me and you talking that I would love to like clips that I can share with people that I think people would want to see but they're not going to spend the two hours to watch this entire stream to get that right they just like you know
some sometimes you have the time and sometimes life gets in the way so I'd like to like have both forms of content for sure yeah i mean I think shorts is really the name of the game these days though um with people's attention spans and also just like how much time someone has um no one's really coming in and watching your you know two three-hour
live stream or podcast unless they've gotten a glimpse or taste of it first from you know a short or a clip and and like said hey I actually want to like go check the rest of this out um so yeah yeah well I mean you've kind of teased it can we see some kind of demo can you show a demo so people can you know can
believe seeing is believing they can actually know that oh this is this is real this exists of course I wouldn't I would not leave you hanging Shane so this is Mosaic um can you see this okay yeah i mean you can maybe do one click of zoom if it if that still looks good but yeah that's a little bit easier to
read for sure um yeah so the idea is you know you can come in here you have uh different templates that you can just start off with but you can also just come in here and define your own uh video editing automation and so the idea is now you kind of get dropped into this into this canvas um you have the ability
to actually go and pick out different tiles that you want so maybe you know uh let's say Shane has uploaded his live stream um into Mosaic and now he wants to pull out a couple different shorts from from that live stream uh it's as simple as just dragging and dropping a couple different tiles here um and we
actually have a template which just does exactly this um so you would be able to say okay now um I want to create a short about this particular um topic um and let's say that topic is you know uh talking about moment they talk about um agentic workflows right um and unlike other clipping tools which kind of just take their best guess you have that ability to be the creative director here
so you can say that's the particular topic I want to create a short about um you could also just be generic you could say something as generic as "Hey find the most viral moments." And the agent can do that as well uh but once you've actually found those clips uh you can take it a step further so you can now say "Okay now that I've found that particular moment they're talking about
agentic workflows I want to go ahead and uh maybe add some some B-rolls to that." Um and then I want to you know use uh maybe the Veo 2 model um I want to use uh sign up for some image models and I can also specify the style of the B-roll that I want so maybe I want smooth and cinematic uh shots uh when they talk about agentic workflows right um and
then I can you know go ahead and enhance the audio i can do all sorts of different things here maybe I want to make this um localized into a couple different languages so uh everyone around the world can learn about agentic workflows right so I can go ahead and dump this into maybe a couple different languages here so I have the option to
you know pick from a wide variety um and then I can add some captions as well if I want to uh go ahead and you know really make sure that it's localized for these different languages and I have the ability to uh choose from a couple different options here um as well as some options that we've defined on our own for example cinematic captions uh
with this uh we're actually doing a saliency analysis of the video and trying to figure out uh okay where are certain things happening in the video where should we position captions relative to those things that are happening where should we emphasize certain words with certain colors or certain font sizes uh just to again create that engagement from the end viewer to not click off and
and be engaged with what's happening in the video uh when it comes to this other branch I can do you know completely different things maybe I want to reframe this into a couple different aspect ratios so I can repurpose it for you know Tik Tok or Instagram or YouTube shorts um and so this is kind of you
know at a high level what you can do um and now again once I've defined this I can go ahead and uh save it maybe I want to call this the Mastra Mosaic um and now I can again reuse this for the next live stream and the next live stream um I'll show you what this kind of looks like in practice with um maybe two agents that
I've kind of already run here so this is an agent where pretty simple workflow um I have this kind of raw footage this is actually from one of our users um is it possible to make $10,000 or more per month with information or skills you already have the answer is a resounding the answer is yes and it's so as you can see
you know this is a talking head style video he's just recording himself kind of reading off a script and then uh looking off camera kind of recalibrating uh there's got some you know it's got some false starts or some bad takes in there and then some good takes interspersed in between and so it's as simple as saying "Okay I want to actually just go ahead and remove all
the bad takes." Um and now the AI can actually go through your entire footage like find all of the good takes stitch them together into one cohesive narrative um and this is what kind of the end result would look like here is it possible to make $10,000 or more per month with information or skills you already have the answer is yes that's because the information and experience
you have is valuable to others who are struggling with what you've already been able to overcome you could be like "Cool." And so it went ahead and did that but now I can actually again go into the editor um and tweak and polish things as needed so you know maybe I felt that one of the cuts was a little bit too uh abrupt and so I can go ahead
and just you know extend that uh and it's as simple as just coming into here and having that granular level control uh that I would want um as the again end creative um so this is a pretty simple flow i'll show you one which is maybe a little bit more um complex where we've actually taken a long form podcast uh in this case this is the podcast which you know Gary and the the Y Combinator partners run uh
called Light Cone um and I've gone ahead and repurposed it into a couple different shorts and so each of these branches here represent a different version of my video edit that you know are all kind of simultaneously running in parallel and so in this first branch here what I've said is hey I want to find and cut out all the moments where
the guy with the glasses is talking uh and because it's a fully multimodal agent you can give it this visual cue uh and it can go and actually analyze your visual content uh find the moments where the guy with the glasses is talking and then cut them out of out of your video um and then I said okay now that I've cut out those moments I want to add some
AI B-roll i want to use these models i want to have these many clips per minute i want these types of smooth and cinematic shots um and now the agent will actually go through the footage like find the moments where it thinks B-roll should be added generate the B-rolls place it in the right moments uh and so it's kind of doing all of this on autopilot um and I'll show you what that
right now if you're building a startup working on like cutting edge AI even if you haven't find the right So Gary the guy with the glasses was the one who you know started the introduction of the podcast um and it you know found and cut all those moments out so it jumped automatically to you know when Harj was talking here idea yeah why give up and
go back to Google or college cool and then it added this you know B-rolls um I think this B-roll should probably go a little earlier you know when he was talking about AI in particular so again I have that ability to come in here and move that around if things don't make sense I can delete them um and so again you have that granular level control uh
that you would want um the other thing I'll show you is what's happening on this branch which is something completely different so in this case we're actually finding the moments where the guy with the glasses is talking but keeping them in the video uh so kind of the inverse of what was happening in that other branch uh and then we go ahead and reframe it into this 9 by 16
aspect ratio uh and we use active speaker detection here so if there's multiple people talking it'll intelligently focus on the right person uh really great way to just you know repurpose your content that already uh exists that you're already creating for different you know social platforms but then we branch off again and we kind of
localize our content into Spanish into French add some captions here um I'll show you what this one looks like this is Gary in Spanish with some captions similar to Gary with the voice cloning that's being used um and again just like being able to distribute your content across different localities uh this is um the other types of captions
that I was mentioning these are cinematic captions where again we've done our own like saliency analysis of the video position captions in different places with different fonts different emphases on uh different words um I'll show you what that looks like something's happening i'm not happy with that let me go all the way to the end go into you know this outside world and uh
from first principles understand the root cause of this and then uh you're going to discover all so again this is like a type of edit which might take you 30 minutes maybe an hour to do for even just a short one minute video uh something now you can kind of run on autopilot and just drives that engagement from from the viewer um I'll
show you this final branch here where in this case we're actually finding and creating a short about when they talk about AI startup school Um we then go ahead and just add a background to this uh you can specify any type of video media uh image media or even just a background color um and then we go ahead
and add some AI music and you can actually see the agent doing some thinking here you know it's analyzing the video's content rhythm emotional moments trying to create a custom soundtrack which fits with the context of the video uh calls it this announcement background soundtrack um and then adds it to to the video and
again you can view this in the timeline maybe I only want this for the first 15 seconds so I can go ahead and you know just go ahead and delete that and now I've got some music for the first 15 seconds as kind of an intro to to the video i have news for you guys yc is throwing our first ever AI startup school in San Francisco on June 16th
cool and I found out you know one particular moment from this longer podcast so um yeah this is kind of the idea with Mosaic you can define this workflow have it kind of run on autopilot again it created all of these variants from just that one input source video and um now yeah I can reuse this across you know the next Light Cone podcast and the next Light Cone podcast trigger it programmatically um and yeah
just 10x you know my content creation game yeah that is very cool was really uh yeah I'm excited i'm excited to try it out a little bit more you know of course we've already been testing it on some things you know that that you of course know the people watching maybe don't but yeah we've been testing it with some of this content that you're seeing today
some of the other video content that we have and yeah definitely excited to see you know like some of the stuff we're going to build be able to build with it and if nothing else like also like I have a feeling we have some ideas that you haven't shipped yet like the the API and and some other things yeah API is coming on Friday um and um when it comes to some of the other kind
of feature requests when it comes to reframing and when it comes to being able to produce like multiple shorts from a single tile um these are features that we're you know shipping soon um in the works right now and excited to get them out to you and have you try them out yeah any any uh parting words i know you got to go uh it's you know your
contact info is there try out Mosaic especially if you're doing videos like you know this live stream or any other videos that you might be doing if you're watching this yeah no just um happy to have had this opportunity to jump on the call and um I'd love to you know take some clips out of this live stream itself and and hand them to you and and
and let's see what we can come up with yeah yeah let's uh let's do it send them my way we're trying to get more shorts out you know like more clips and more content so people who can't watch this whole long live stream can still see the some of the highlights some of the best moments yeah absolutely all right Adish it's great talking to you and yeah we'll chat again soon okay
thanks Shane bye all right everyone we are you know about 90 minutes into this live stream today and excited for this uh kind of next segment because we've been having some great guests but now we're actually going to try to build something and I don't want to build alone so I wanted to invite you know someone else from the team on to try to help me and I I know
if you've been watching this for a while I kind of gave you a sneak peek of what I want to build but Eug John is coming in cold he does not know what I want to build yet so he is in for a surprise but he's going to help me and we're going to either succeed or possibly just really struggle together but we're going to we're going
to do it hey everyone yeah I'm really uh really excited it's been a while i've been uh mainly working on cloud these days a lot of uh monitoring observability and Mastra is growing every every day new releases so so this is the this is the cool thing it's why one of the reasons I wanted to invite you on so I did ask I
asked a bunch of people from the team like if they were interested in coming on and we had a few people that were saying like "Okay yeah I'd come on for a half hour." But the reason I wanted I thought it'd be fun for you to come on is because we're going to be trying some things that I think neither of us have really dove deep into yet so we're going to have fresh eyes
and we're going to maybe run into problems but you know together I feel like we can get through it and we can you know actually build something so so here's my here's what I want to build and uh then we can try to get started and see how far we can get in the next you know half hour i don't know how far along we're going to get but we can always uh we'll continue it you know
I can continue this on and at another live stream because we'll we'll be back you know tomorrow and Friday and you know we'll have plenty of opportunity to keep working on this but the thing that I do want to build is so one of the problems that you know we I currently have and I want to use AI to solve like
my own problems i think that's a good use case to start with is that after these live streams you know we're on these live streams for for two hours and the days that Obby you know hijacks a stream and does his own thing he's on for two hours and we got so we have all this video content but it's you know not very useful to not
have a really good like show notes or transcript of the content so the first thing I want to do is just can we build some kind of workflow that can take a YouTube URL maybe use MCP or something and get the transcript from YouTube and generate some show notes that then we could put into the description on YouTube um we're
we're on Spotify now so like the description on Spotify so it doesn't have to be completely automated yet but if nothing else just an agent or a workflow i mean maybe it can start as an agent where I can give it a URL to the YouTube video it can go out it can generate the transcript give it to me and generate like some kind of summary i
think that seems like a reasonable thing we can accomplish in 30 minutes i think so as well and and then if it works everyone will know because I'll you know I will send out the summary through like a a tweet and add it to the YouTube video and everything so as soon as this stream is done we will test it out and hopefully hopefully if it works if we
can accomplish the mission we will uh we will send it out afterwards all right well I guess let's get started yeah so time is ticking you know we we have 25 minutes on the clock let's see what we can get done uh you want me to drive uh probably be best all right you know you can drive too but no I'll drive
today give me one second to uh pull some things up make Yeah yeah before this call just in case I reinstalled the MCP doc server you know just in case yeah just in case you need to ask questions along the way hey that's that's smart thinking you know like you were you knew we we you know we don't know what we're getting into and we're going to probably need
some help all right so I am going to share my screen and I'm just going to share everything so hopefully I uh close all the things that I don't want people to see which there isn't much there all right so let me share my whole window okay so we're we're here i suppose we just npm create mastra@latest this is still the command right it hasn't changed on
me still good all right i I I joke but sometimes like we we have such a the team is moving so fast that like sometimes we're just shipping stuff and I like "Oh that changed i didn't even know it was changing." Uh all right so let's call this like live stream just call it live stream live stream agents so this is going to be our live stream agent and might might have some workflows maybe we'll start as an agent
and then we'll eventually kind of back it into a workflow potentially would we have a single agent or or ide or yeah I don't know i think we'll start with one and then we'll uh we will decide what to do from there so I think we just you know I mean the first thing is probably just create some kind of
tool or workflow that will run and will generate the transcript summarize so maybe maybe it needs to be a workflow i don't know we'll we'll think about it as we're as we're building this out and then we can eventually just give the workflow to an agent and the workflow would probably just take in a YouTube link as kind of like the input all right so we can create just go
the defaults um sure i'll just install this example and we can use OpenAI why not we'll skip the API key we'll add the example because I like to see what I'm working with and delete it rather than just start from scratch i don't know why um sure i will I'll use cursor and I will install the doc server so we can ask questions if need be
i'm going to copy in an environment file that I have i think this is right uh okay so it should be good so I should be able to run npm run dev and just we'll just confirm that everything works so I should have an OpenAI API key here all right so let's just test it what's the weather in Sioux Falls which is where I'm at called the weather
tool that seems about right okay so it's working now i suppose the next step is we are going to need something so the goal let's let's take a step back the goal is that we create some kind of workflow where I can send a YouTube URL after the video after our live stream is done so I get the YouTube URL I can send it and it will get the transcript because I do
want to have like a timestamp transcript and then also then from that transcript then create like show notes or like a summary of like when important things happened or something so it can provide a summary and then maybe even like a third step is provide like a a tweet summary that's a little bit shorter so like a summary that can be used for like
the YouTube description and the Spotify description but then also like an X post where I can you know maybe it's a little more fun a little bit less uh formal or structured but we can still like highlight what happened in that stream so to me that seems that screams like I should probably just make a workflow that takes in the YouTube URL and does all those steps that I need does that
seem right to you yep that sounds about right all right i'm just really curious how you're going to get the YouTube URL uh streamed we'll find out yeah that'll be so I mean because I think it'll we can just run it on any after uh so ideally like I'd be able to run it on this one that you know Obby did
earlier today or maybe this one that I did so I should be able to like just copy this link this would be like the the link that I want to be able to send in right right so ideally I could actually have it like you know look at the last video and I could just tell it if I if we use an uh some kind of YouTube MCP I could probably just say like hey get the video from May
28th and it would just like see all the last videos and be able to know that oh is this the one you're talking about yes okay call the workflow maybe that that's like step two step one is I'll just give it the URL step two is maybe we'll let it look it up so I think that means we probably need to find like a YouTube MCP that we can use
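For step one, where the workflow is simply handed a YouTube link, the first thing it would probably need is the video ID. The sketch below is an assumption-laden illustration: the function name is hypothetical, and it relies on the common convention that video IDs are 11 characters drawn from letters, digits, `_` and `-`, which is observed behavior rather than a documented guarantee.

```typescript
// Hypothetical helper: pull the video ID out of common YouTube URL shapes.
// Assumes IDs are 11 characters of [A-Za-z0-9_-], which is a convention,
// not a documented guarantee.
function extractYouTubeVideoId(url: string): string | null {
  const patterns: RegExp[] = [
    /[?&]v=([A-Za-z0-9_-]{11})/, // youtube.com/watch?v=ID
    /youtu\.be\/([A-Za-z0-9_-]{11})/, // youtu.be/ID
    /youtube\.com\/(?:live|shorts|embed)\/([A-Za-z0-9_-]{11})/, // /live/ID etc.
  ];
  for (const pattern of patterns) {
    const match = url.match(pattern);
    if (match && match[1]) {
      return match[1];
    }
  }
  return null; // not a recognized YouTube video URL
}
```

Returning `null` for unrecognized input lets the calling workflow stop early and ask for a valid link instead of passing a bad ID to a transcript tool.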
so let's see if let's we'll start with MCP run and see what see what they have for me no YouTube tools i got to sign in let me try that now let's try to YouTube youtube no YouTube it's got to be some It's got to be a YouTube MCP somewhere someone had to do this huh all right well let's see what else we got see
let's check smithery actually let's check the the MCP registry registry here and we can uh we can find we start at the top of the list and look through uh All right let's see if there's a YouTube in here extract YouTube video information all right this promising I didn't want to All right so YouTube MCP server let's actually let's open these
two we need a YouTube API key so I'm going to have to get that so this is using Python and Docker okay like I don't really want to do that i mean I can but I just want to see what these other options this one is not a lot i will look at smithery just to see see what we got here so this one has seemingly more
usage so this one can retrieve transcripts and subtitles from YouTube videos effortless i like effortlessly i I do appreciate you know something that's effortless i'll be the judge of that but you know we can try it okay so we got a couple options youtube MCP server interact with YouTube content seamlessly retrieve video details manage captions and analyze trends okay so let's see it can get transcripts for
multiple videos returns a text content of the video's captions all right I think that's what we want right yep yep and then this one is just like just has get transcripts so we have two options we can try one of the things I've always uh you know you never know if it's actually going to work so you sometimes need to
uh have backup options also it's good to like make sure it has a source code so you don't you know a lot of this stuff is the wild west so you do need to be careful about the stuff that you're installing right so I mean it is better if there's like it has some usage that's why you know some of these registries like Smithery and others that tell you
uh different usage trends do at least help make some kind of better decision right you're not just you're not picking one that has zero usage now that doesn't mean it's safe but at least gives you a little indication that it might be right this also has the search videos you know in case we get to step two so that'd be good yeah all right so how do
I remember how to do this i need to get a YouTube API key okay so why don't I try to do that quickly i'm I'm going to uh do that over here so we can uh not share the API key with everybody so uh now here's a question do you know how to get do do you have any idea how to get an API key from YouTube because I
don't all right i have no idea as well uh I I'm I'm going to start clicking around but if you want to maybe search like if we can find out where you get a YouTube API key actually at I'm in my channel settings let's see well that's Claude we'll see yeah claude has to say "Oh there's got to be way." Chat help chat if you're if you're listening help us out we need a YouTube API key i
got to figure out how to get that hopefully it's a I don't know why but they're redirecting me to Google Console yeah I I I did wonder if you like I needed to use Google Console i was hoping to avoid I like to avoid I like to leave Google console to you and I don't I don't want to use Google console but okay oh yeah I will log into Google Console and see if I can find a way to
get an API key um yeah Google Cloud Services we have dad boss from New England do you do you tell me how I how I can do this let's see so from there it says uh select a project we already have one uh enter enable YouTube Data API should be on APIs and services should I just do this in the Mastra cloud project yeah yeah let's try that um
well the good news is if it if it breaks we have the person here to that can fix it yeah exactly all right all right so I'm in enabled APIs and services do I need to add an API and service uh yeah you should search for YouTube youtube so there's YouTube data API youtube analytics API sounds like So YouTube data so this says we need YouTube's data API key okay so
we're going to add So I'm in Google console i I guess I could probably share this part right so I'm in Google Console i'm going to enable this but I'm gonna I don't want anyone to see if it does give me an API key so I'm going to Okay so it has been enabled it says to use this API you may need credentials so I'm going to click create credentials
yep and I'm going to get user data next I need to have an OAuth consent screen oh boy what's my app name well I would say like this is just an internal thing please why do I why do we have to make this so complicated which I'm kind of interested in what the limits are like so I need to add or remove scopes what What are our optional settings here default language add or
remove scopes um wow what What do I need for scopes um I suppose it's all the YouTube data API view your YouTube account i don't really need to manage the YouTube account I don't think see edit YouTube videos so look at the different scopes that we have here manage YouTube videos sure active channel members i don't really know that I care about that download
would that be something we need yeah yeah probably i mean I suppose I can manage my YouTube account maybe I'm giving I'm giving the agent I'm trusting the agent with a lot link to your channel add remove edit apps you own on your YouTube channel i don't know if that's that important guess let's look and see what are what are some of the things we can get related videos get top videos get video
details do I need to be able to Does it need to be able to see my videos manage my assets it's probably probably does need this I would imagine right as scary as it is permanently delete my YouTube videos but okay let's update sensitive scopes okay save and continue OAuth client ID type web application probably oh man like I don't I just
wanted an API key everybody come on why is it so hard it's always It's always an auth problem we we had this earlier it's always an auth problem okay so I mean like how am I going to I just want to authenticate and get an API key how do I get a you Okay we have one person say web app let's let's follow them maybe yeah is that
I'm looking um Yeah all right all right well we're going to the name of our client okay web app go to OAuth and create API key okay i mean I know I would need like an authorized origin if I was actually setting up a real OAuth client so download credentials i have a client ID all right maybe I can and so it's letting me now create credentials so I'm going to create an API key all right see
if this works i'm not I'm not showing this on the screen i do have an API key this key is unrestricted okay well I'm not going to ins I'm pulling this over so I can just try to insert it here all right so let's try to connect so I have a smithery key now so I'm not going to So I am curious how to use smithery with
uh because I selected obviously just Cursor because I didn't really you know ultimately it should be the same npx -y install with a key all right so I'm going to try to copy this thing and just see what happens here and I'm not going to all right so let's now let's go into our code all this just to uh just to get to our code now let's add it we need to add
Um call a live stream agent and honestly I maybe I should start with I'll just put it here for now uh I need to install MCP right right install @mastra/mcp i think that's right now we need to import MCPClient see I don't know what this is servers it's like command or something and just I'm actually just looking it up myself i think it's like because I know there's
a URL but I I also think there's like a I think I need I think I need the name of the server actually so the name of the server is YouTube and then in here there's like a command or something there it is yeah command or URL wherever it's going we got this okay okay okay so the command is npx and then there's like args which is an array of all these
available args which is going to be a pain but oh would install be oh yeah I guess it would be a separate well yeah I don't know actually if we need to install all first and then run it like I'm not actually sure how this works i don't know why I would need client where is uh where's someone someone tag Henry from Smithery smithery
henry come in and help us trying to set this up so we're going to need the profile here we're going to need a key here but I don't know if it's actually like if this is the right command so this is probably not going to work that's my That's my hunch so this isn't right so I need to pass in the profile need to pass in the the
key and yeah that's what's that's what's next like if we try connect right now would it just error on uh like 404 or something uh yeah I don't I think it's like const tools equals MCP client get tools we need to await it i'm sure this isn't going to work but let's like don't like this autocomplete thing stop autocompleting please all right
and then key is all right so I need to create this Smithery profile the Smithery key so I'm going to do that quickly over here okay and then need to add my Smithery key to this i'm just adding this to my env file and We're going to spend this whole time just trying to get Yeah the MCP it's normally not this hard all
right well to be honest once we get the MCP I feel like we're already halfway there we get transcript yeah it's like yeah just get getting the MCP working is the hard part okay so we got this thing here is this going to work i doubt it but let's try it and really I should just give be giving this these tools to thing fail to
connect look at that client client not specified okay well maybe I can fake it because it's because we're not actually like a a real client here so client cursor well I mean we are a real client we're just might not be like I don't know if this actually matters like Mastra Henry Mastra should be a supported client let's make that
happen it's probably going to say Mastra is not a real client what's the error Mastra valid options yep okay i mean so there's there's another option here we can look up this there we can probably install this MCP server directly so if this doesn't work we're going to just go the direct route and not go through Smithery's install is there a URL from Smithery or that's
what yeah well it it there is actually URL um yeah I guess we could do the URL option too so that's maybe better maybe we should try that this doesn't work yeah that was all right all right we had someone from chat mention that
and uh are you see are you seeing the chat yeah I can see the chat huh this is like a I think there's a Restream problem chat i haven't seen any chat messages i thought I was here alone the whole time how many chat messages have we have we received Yujohn uh we got we got a few okay everyone's trying to help out that's chat i am sorry i don't know what's going on with Restream i had this
weird bug that like the music wouldn't stop playing and so I think there's like some Restream issues it's it's technical uh technical issues that I normally the technical issues are 100% our fault this time seems like it's not our fault i'm sorry if I have not been uh responding i did not uh know that anyone was actually
chatting so Restream normally will pull in all the the chats across all the different channels but it is uh not doing that okay so we we're going to need to process smithery YouTube URL let's do that oh I need to do like new URL something like that there we go okay need to add this ENV file quickly this this URL so I'm just going
to add this to my ENV smithery U YT URL drop in this whole URL thing um all right now moment of truth it didn't error did it work i don't know all right so let's because it's not actually going to be usable let's do this so we're going to grab this we're going to pull this out of here this is not the right spot for it i just wanted to see if I you know
wanted to get the code written don't don't uh follow my example uh so this is going to live here i'm also just going to copy a bunch of stuff from our weather agent because I'm kind of lazy i could ask the doc server I suppose to just write this stuff for me right tools be tools like this it's not a weather agent this is a live stream
agent you help you I'm just going to make this really simple you help process video data after Mastra live streams you can in and say three to three to five sentences i'm going to say no it always does more two to three sentences all right this is like really simple i don't want to use mini i'm going to use something bigger okay it's
gonna What do we think is this going to work um I hope so all right i doubt it but anytime for for those of you that are new I can't read the chat we got technical difficulties but for those of you that are new anytime we ask is do you think this is going to work and it's the first time we try it and we just wrote a bunch of code or something it's uh it is the the AI agents hour drinking
game so whether you have coffee water or you know adult beverage it's it's later in some time whatever it is if we get it right everyone cheers we take a drink we very rarely cheers it's happened once because nothing ever works the first time but we're going to try it we're going to give it a go and we
should just be able to working is just I can see the tools and I could uh I can call the tools directly and my agent can call call the tools it's a big ask are you going to give a URL I guess as well with that uh I will I will ideally give it a URL yes all right so let's go to the Mastra playground do we see tools do we see tools no i gotta get
refresh i didn't This is not This is okay this one doesn't count i didn't uh I didn't add it that's on me live stream agent uh has no exported member okay well I'll fix that because I called it live stream agent so we didn't set it up yet we're not going to count that we're I know it didn't work but we're not counting that all right now we're starting the
server it was throwing an error is it still throwing an error let's find out okay should have a second agent we do this agent should have some tools it does nice things are looking good okay should we just maybe ask it to uh search videos can tell me if we had a live stream on May 27th all right moment of truth it's using the right
tool that No that actually was true because we did have a workshop earlier which was live streamed so that's technically correct what else did it return here it's not the easiest to read but there's there it is is there more it's hard to see what values here come on there's got to be another one our special guest on this free
Mastra live stream i don't know what that is huh i don't know what that is but it did return the value so in that case I'm going to call that that worked cheers everybody cheers it did it did actually answer the question correctly so it's working kind of um well thank you let's see what else what's that that's a thank you chat it was a Yeah was it see clear are you able
to click on it are you able to click on the chat message and show it do you have that capability i don't know if you do i I can only see it i wonder if I I wonder if I refresh the page and come back you stay on the live stream i'm gonna see if I can get back get this chat message and if I leave it's up up to you to figure
the rest out all right i'll keep you all entertained here um well I guess it's just me um I'm not even sure if is chat still working please mention anything um okay nice he came back didn't know what to say oh so yes Clevolt AI I'm sorry the chat wasn't working dude i'm sorry or or a girl I don't know but whatever i I
apologize yes Postman might have it i'm seeing all these chat messages sorry uh sorry all unfortunately we had technical difficulties but I do appreciate all the help along the way so yeah you've been uh yeah yeah they helped out so appreciate appreciate you all in the chat uh and also like you know you can give me a hard time too if I'm not if I'm not responding in the chat normally
I would appreciate you calling me out uh because that is uh that's appreciated around here you can you can call us out we we we don't always you know we don't always do a great job checking these kind of things all right we're back though and we've already went over time but I feel like we're we're very close
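For anyone following along, the server setup worked out earlier in the stream can be sketched roughly as below. This is a minimal sketch, not the exact code from the stream: it only builds the plain "servers" object (either a command with args, or a URL), and the environment variable name `SMITHERY_YOUTUBE_URL`, the helper `buildServers`, and the placeholder URL are all assumptions made for illustration.

```typescript
// Sketch only: the two ways of describing an MCP server that come up in the
// stream (a local command with args, or a hosted URL).
type StdioServer = { command: string; args: string[]; env?: Record<string, string> };
type RemoteServer = { url: URL };
type ServerDefinition = StdioServer | RemoteServer;

// Hypothetical helper: build the "servers" map from environment variables.
// SMITHERY_YOUTUBE_URL is an assumed variable name, not one confirmed on stream.
function buildServers(
  env: Record<string, string | undefined>,
): Record<string, ServerDefinition> {
  const youtubeUrl = env.SMITHERY_YOUTUBE_URL;
  if (!youtubeUrl) {
    throw new Error("SMITHERY_YOUTUBE_URL is not set");
  }
  return {
    // The key is the server name ("youtube" here).
    youtube: { url: new URL(youtubeUrl) },
  };
}

// Placeholder URL; a real one would come from the .env file and stay off screen.
const servers = buildServers({
  SMITHERY_YOUTUBE_URL: "https://example.invalid/youtube/mcp?api_key=PLACEHOLDER",
});
console.log(Object.keys(servers));
```

In the stream, an object shaped like this is what gets handed to the MCP client from `@mastra/mcp`, followed by an awaited call to fetch the tools. Keeping the URL in an environment variable is what lets the key stay out of the shared screen.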
so I just want to see if I can get the transcript to this video all right so we had some videos past live streams AI Agents Hour part two so I want to see if I can get the transcript for AI Agents Hour part two let's see if it can do it transport is closed okay it seems like it failed tool execution so I don't know why transport is closed but let's let's just run it again see if I can get a
connection okay i think I got to put this just so available i was kind of worried about that that it that the transcript wouldn't actually be there because I think in YouTube I think you have to like upload the transcript i don't think it like automatically generates the transcript for you so that is what I was worried about that's why I think this other uh
YouTube transcript server is possibly something we need so I'm going to copy this not going to show my key but I I want to say so I was talking with actually like Ally who did some SEC helps us with some security stuff she was on the security corner we were talking about this and I think she said like the get transcript wasn't working for her because she was doing some stuff with YouTube MCP probably
should have just went and watched that video because I probably could have remembered how she was doing it um I don't remember it was like it was like a it would summarize like YouTube videos and like pull out like it would it was basically like pulling out educational content from YouTube videos like on specifically around like bikes she was
like there's like like people that are into like I think fixing bikes or or something like that like mountain bikes things like that but I don't remember but it is available on our live stream and maybe if we could pull the transcript we can find out so that is the goal that is the current goal all right so I'm going to I'm just going to create another environment variable for this other
URL smithery U transcript URL that's what I'm going to call it and we are going to try again okay so YouTube let's call this search and actually we'll just call it YouTube search and then we're going to do YouTube transcript URL smithery transcript URL but the thing is we don't want to pass in all these tools now i
just want to pass in the right one so it's YouTube and it's going to be get or search videos so let's see yeah so it's going to be you i'm just gonna do this YouTube so we're going to pass in tools youtube search videos i think I don't know if that's going to It's clearly not working well it needs a key right
yeah like the cuz it should I'm getting the tools i'm just not spreading all of them so what am I missing here like the the the key it should be like tool colon tools YouTube search video yeah tool or YouTube tool oh yeah like this whatever huh there we go uh but it's not quite right because it's going to be tools YouTube and because I think it's just it's going to be YouTube
search videos I think and then YouTube and then whatever this one is YouTube get transcripts it's going to be YouTube transcript get transcripts is it plural or not it's plural okay I'm gonna call this YouTube search okay so I should only have two tools which it's always better you know in our experience giving your agent less tools is going to make it more
accurate all right and Cleolt AI says it depends sometimes it takes days before transcripts are available very possible and so I think that's probably why this other you know MCP server which is probably some kind of service that I have no idea what it does uh besides giving me the transcript exists but I could I could try an older uh video as well and
see all right so we're going to copy this same thing just going to refresh the page or start a new chat maybe let's check i don't I only see YouTube search interesting youtube transcript all right let's just add them all here and see what we get see if it shows up it might be erroring out oh also I I changed the MCP server
and I didn't restart so maybe it needed to get a new connection i don't know we'll see let's just see I guess if it has YouTube transcript get transcripts okay what was the video it was AI Agents Hour part two May 27th so search videos we'll see if it did call the right transcript one okay well can you tell me i guess I could look and see
like the transcript okay but where is the transcript text there's the text is it is it actually the whole uh I mean it looks pretty long i don't See you next time goodbye that definitely sounds like me i'm I'm cooked it's the same It's the same greeting and the same ending of every video it it right it's got to be
right uh okay but it did summarize it for me and it's about three you know three sentences so I would say that worked right at the end of the day i think we accomplished what we set out to accomplish uh there's a lot more we obviously need to do here we struggled with uh with MCP getting it set up but once we have it it's pretty easy right we were uh we were able to
very quickly uh get our agent wired up once we had the tools yeah it's pretty cool that you know all you need is that MCP server and you can already start getting transcripts from YouTube honestly the worst the worst part of the whole thing besides like the MCP issue was just getting the API key for YouTube
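The "only give the agent two tools" step from above can be sketched like this. It is a hedged illustration: the tool names assume a `<serverName>_<toolName>` naming scheme and are stand-ins, and `pickTools` is a hypothetical helper, so check the keys your MCP client actually returns before relying on any of them.

```typescript
// Sketch only: keep just the tools the agent needs out of the full tool map.
type ToolMap = Record<string, unknown>;

function pickTools(all: ToolMap, wanted: string[]): ToolMap {
  const picked: ToolMap = {};
  for (const name of wanted) {
    if (!(name in all)) {
      // Fail loudly so a typo (singular vs plural, for example) is obvious.
      throw new Error(
        `Tool not found: ${name}. Available: ${Object.keys(all).join(", ")}`,
      );
    }
    picked[name] = all[name];
  }
  return picked;
}

// Stand-in for what fetching tools from two MCP servers might return.
// These names are illustrative, not confirmed from the stream.
const allTools: ToolMap = {
  youtubeSearch_searchVideos: {},
  youtubeSearch_getVideoDetails: {},
  youtubeTranscript_get_transcripts: {},
};

const agentTools = pickTools(allTools, [
  "youtubeSearch_searchVideos",
  "youtubeTranscript_get_transcripts",
]);
console.log(Object.keys(agentTools));
```

Throwing on a missing name is a deliberate choice here: on the stream the transcript tool silently did not show up until the names and the server restart were sorted out, and an explicit error listing the available keys would surface that kind of mismatch immediately.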
yeah that's true like chat chat chat saved the day and uh yeah chat saved the day here on helping us get that chat saved the day on helping us uh get uh get connected to MCP we probably would have saved it another five minutes if we just would have been paying attention but I can't help it i'm sorry i couldn't
see the chat for some reason there's some Restream bug if you're if you're from Restream and you're watching I don't know what something happened i couldn't see the chat but a quick refresh was all that was need was all that was needed but I'm glad you were here Yujohn because if I would have refreshed and you weren't we
would just never would have happened i never would have seen the chat if you weren't here exactly but yeah thanks for coming on the the show Yujohn thanks for helping out it was fun maybe we'll continue this and try to build more and actually automate some things it'd be nice like the next step is that there's a I know there's a Typefully MCP and we use Typefully to
like schedule out posts because you know sometimes we can write a few posts at once so you can have it post to like X or LinkedIn and so it'd be nice to have like after the after an event to post something to X and say like here's the summary of the of the event with the link to the actual video or something
like that i think we could do something cool like that yeah that would be awesome and it was uh yeah great being on here you know take a little break from work so it's quite kind of nice yeah yeah but happy to be uh you know it's it's always good to to use you know use the other side of the product right you're always so busy on cloud that you got to like play around with the actual
framework side yeah one day maybe we'll deploy to cloud you know yeah yeah that's that's the next thing next thing is going to be once we get it working it's got to live somewhere right all right Yujohn i'll talk to you later man yep take care bye all right everyone thanks for hanging out with us thanks for watching us struggle and thanks for helping out uh
appreciate you all watching you know we had we've had a lot of live stream content today i know you can't consume it all so hopefully we get some of these automations so we can highlight some of the best moments and you can uh keep up if you can't follow along with all the content we're putting out there uh today we talked a little bit about AI news we talked about if AI is going to replace
all white or 50% of white collar jobs in the last three years as the Anthropic CEO has predicted we then talked about Roark and uh you know James from Roark predicted that in the next three years 90% of call centers 95% of call centers are going to be replaced by voice AI i took the under on the on that Daniel was
I think said 90% so I took the under i don't know what you all think but I I think it's going to take a little longer than that but voice AI is really cool and we got to see a demo of Roark we highlighted some YC companies that are kind of in the AI space in the AI dev tools space uh that are just really interesting talked about that we had
Adish from Mosaic come on show us a demo we saw some really cool like video editing things that I do want to start to like the reason I I the reason I wanted to talk to him is because we struggle with this problem of all this video data and we want to be able to repurpose it so people who again are busy can't watch this whole stream you
know we've been on here for two hours and 20 minutes it's a long time so uh yeah definitely was interested i've been interested to try out Mosaic i've tried it out a little bit we're trying to figure out like the right workflows to use there's a few little features that they're working on that I think will kind of unlock some really uh really cool stuff for us um and then yeah we
tried to build this agentic workflow to take these live stream videos ideally we can just give an agent tell the agent the title it goes out it gets the transcript for us it generates a summary it'll schedule eventually schedule some posts for us and then maybe you know once this Mosaic API drops it'll trigger
a Mosaic workflow that'll make some clips for it and do all kinds of cool things but thank you for watching we'll see you next time and yeah thanks for watching AI Agents Hour goodbye
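The post-stream workflow described in the recap (title in, transcript out, summary, then a post draft) might be sketched as follows. This is only an outline under stated assumptions: every external call is injected as a stub, and all names (`Deps`, `processLivestream`, `PostDraft`, the example URL) are hypothetical rather than anything shown on stream. Scheduling and the Mosaic step are left out because neither was built.

```typescript
// Sketch only: search, transcript, and summarizer are injected so the steps
// can be exercised without YouTube, an LLM, or a scheduler.
interface Video { id: string; title: string; url: string }

interface Deps {
  searchVideos(title: string): Promise<Video[]>;
  getTranscript(videoId: string): Promise<string>;
  summarize(transcript: string): Promise<string>;
}

interface PostDraft { videoUrl: string; text: string }

async function processLivestream(title: string, deps: Deps): Promise<PostDraft> {
  // Step 1: find the video by title.
  const videos = await deps.searchVideos(title);
  const video = videos.find((v) =>
    v.title.toLowerCase().includes(title.toLowerCase()),
  );
  if (!video) throw new Error(`No video found for title: ${title}`);

  // Step 2: fetch the transcript; it may not exist right after a stream ends.
  const transcript = await deps.getTranscript(video.id);
  if (transcript.trim().length === 0) {
    throw new Error(`Transcript not available yet for ${video.id}`);
  }

  // Step 3: summarize and build a draft post that links back to the video.
  const summary = await deps.summarize(transcript);
  return { videoUrl: video.url, text: `${summary}\n\nWatch: ${video.url}` };
}

// Stubbed dependencies for a dry run.
const stubDeps: Deps = {
  searchVideos: async () => [
    { id: "abc123", title: "AI Agents Hour part two", url: "https://example.invalid/watch?v=abc123" },
  ],
  getTranscript: async () => "welcome to AI Agents Hour see you next time goodbye",
  summarize: async (t) => `Summary of ${t.split(" ").length} words.`,
};

processLivestream("AI Agents Hour part two", stubDeps).then((draft) =>
  console.log(draft.text),
);
```

Injecting the dependencies keeps the flaky parts (MCP connections, transcript availability) at the edges, so the ordering of steps can be checked on its own before any real server is wired in.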



