
What's next for AI agents ft. LangChain's Harrison Chase - YouTube

Published 2024-03-29 12:47:43

Transcript

I'd like to introduce Harrison Chase. One of the reasons I was really excited to come back today was that I think it was a year ago at this event that I met Harrison, and I thought, boy, if I get to meet super cool people like Harrison, I'm definitely going to come back this year. Quick question: how many of you use LangChain? Yeah, wow, okay, it's almost everyone. If you don't, after the talk, run pip install langchain. If you aren't using LangSmith yet, I'm a huge fan. Harrison works with a massive developer community. If you look at the PyPI download stats, I think LangChain is by far the leading generative AI orchestration platform. And this gives a huge view into a lot of things happening in generative AI, so I'm excited to have him share with us what he's seen with AI agents.

Thanks for the intro, and thanks for having me; I'm excited to be here. So today I want to talk about agents. LangChain's a developer framework for building all types of LLM applications, but one of the most common ones that we see being built are agents. And we've heard a lot about agents from a variety of speakers before, so I'm not going to go into too deep an overview, but at a high level, it's using a language model to interact with the external world in a variety of forms. And so tool usage, memory, planning, and taking actions is kind of the high-level gist. And the simple form of this you can maybe think of as just running an LLM in a for loop. So you ask the LLM what to do, you then go execute that, and then you ask it what to do again, and you keep on doing that until it decides it's done.
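The "LLM in a for loop" picture above can be sketched in a few lines of Python. The `stub_llm` here is a stand-in for a real model call, and the action format is made up for illustration:

```python
def run_agent(llm, tools, task, max_steps=10):
    """Minimal agent loop: ask the model what to do, execute it, repeat
    until the model says it is done (or we hit a step limit)."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        action = llm(history)                      # model picks the next step
        if action["type"] == "finish":
            return action["answer"]
        # execute the chosen tool and feed the observation back in
        observation = tools[action["tool"]](action["input"])
        history.append({"role": "tool", "content": observation})
    return "gave up"

# Stub standing in for a real LLM: looks something up once, then finishes.
def stub_llm(history):
    if len(history) == 1:
        return {"type": "tool", "tool": "search", "input": "weather"}
    return {"type": "finish", "answer": history[-1]["content"]}

result = run_agent(stub_llm, {"search": lambda q: f"results for {q}"}, "check weather")
```

The loop itself is trivial; everything interesting lives in how the model decides the next action, which is what the rest of the talk is about.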

So today I want to talk about some of the areas that I'm really excited about, that we see developers spending a lot of time in, really taking this idea of an agent and making it something that's production-ready and real-world, and really the future of agents, as the title suggests. So there are three main things that I want to talk about, and we've actually touched on all of these in some capacity already, so I think it's a great roundup: planning, the user experience, and memory. So for planning, Andrew covered this really nicely in his talk, but the basic idea here is that if you think about running the LLM in a for loop, oftentimes there are multiple steps that it needs to take. And so when you're running it in a for loop, you're implicitly asking it to reason and plan about what the best next step is, see the observation, and then resume from there and think about what the next best step is right after that.

Right now, at the moment, language models aren't really good enough to do that reliably. And so we see a lot of external papers and external prompting strategies enforcing planning in some method, whether this be planning steps explicitly up front, or reflection steps at the end to see if it's done everything correctly as it should. I think the interesting thing here, thinking about the future, is whether these types of prompting strategies and these types of cognitive architectures continue to be things that developers are building, or whether they get built into the model APIs, as we heard Sam talk a little bit about. And to be clear, for all three of these areas, I don't have answers, I just have questions. One of my questions here is: are these planning prompting things short-term hacks or long-term necessary components?
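The two explicit-planning strategies mentioned above, planning steps up front and reflecting at the end, might be sketched like this. The `stub_llm` and its prompt handling are stand-ins for real model calls, not any particular library's API:

```python
def plan_and_execute(llm, task):
    """Plan explicitly up front, execute each step, then reflect at the end,
    instead of letting the model improvise one step at a time."""
    plan = llm(f"Break this task into steps: {task}")      # explicit planning
    results = [llm(f"Do step: {step}") for step in plan]   # execution
    verdict = llm(f"Check these results: {results}")       # reflection step
    return results, verdict

# Stub model: returns a canned plan, echoes each step, approves the check.
def stub_llm(prompt):
    if prompt.startswith("Break"):
        return ["research", "draft", "edit"]
    if prompt.startswith("Check"):
        return "looks good"
    return prompt.removeprefix("Do step: ") + " done"

results, verdict = plan_and_execute(stub_llm, "write a blog post")
```

The open question from the talk is whether this scaffolding stays in application code or disappears into the model API itself.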

Another aspect of this is just the importance of flow engineering. This is a term I heard come out of the AlphaCodium paper; it basically achieves state-of-the-art coding performance not necessarily through better models or better prompting strategies, but through better flow engineering, so explicitly designing this kind of graph or state-machine type thing. And I think one way to think about this is that you're actually offloading the planning of what to do to the human engineers, who are doing that at the beginning. And so you're relying on that as a little bit of a crutch.
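The "graph or state machine" idea, where human engineers fix the flow up front rather than the model planning at runtime, can be sketched as a node/edge table. The node names and the toy generate-test-retry loop below are invented for illustration, not the actual AlphaCodium flow:

```python
def run_flow(nodes, edges, state, start="draft"):
    """Run a hand-designed state machine: each node transforms the state,
    each edge inspects the state and decides where to go next."""
    current = start
    while current != "done":
        state = nodes[current](state)
        current = edges[current](state)
    return state

# Toy coding flow in the flow-engineering spirit: generate, test, maybe retry.
nodes = {
    "draft": lambda s: {**s, "tries": s.get("tries", 0) + 1,
                        "code": "v" + str(s.get("tries", 0) + 1)},
    "test":  lambda s: {**s, "passed": s["code"] == "v2"},
}
edges = {
    "draft": lambda s: "test",
    "test":  lambda s: "done" if s["passed"] else "draft",
}
final = run_flow(nodes, edges, {})
```

The planning lives entirely in the `edges` table the engineer wrote, which is exactly the "crutch" the talk describes.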

The next thing that I want to talk about is the UX of a lot of agent applications. This is actually one area I'm really excited about. I don't think we've nailed the right way to interact with these agent applications. I think human-in-the-loop is still necessary because they're not super reliable. But if the human is in the loop too much, the agent isn't actually doing that much useful stuff, so there's kind of a weird balance there. One UX thing that I really like from Devin, which came out a week or two ago, and Jordan B put this nicely on Twitter, is the presence of a rewind-and-edit ability. So you can basically go back to a point in time where the agent was, and then edit what it did or edit the state that it's in, so that it can make a more informed decision. And I think this is a really, really powerful UX that we're excited about exploring more at LangChain. And I think this brings a little bit more reliability, but at the same time some steering ability, to the agents.
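A rewind-and-edit UX could be backed by something as simple as checkpointing every intermediate agent state and letting the user restore and correct one. This is a sketch of that idea, not Devin's actual mechanism:

```python
class Checkpoints:
    """Store every intermediate agent state so a user can rewind to any
    point in time, edit the state, and resume from there."""
    def __init__(self):
        self.history = []

    def save(self, state):
        self.history.append(dict(state))       # snapshot a copy
        return len(self.history) - 1           # checkpoint id

    def rewind(self, checkpoint_id, edits=None):
        state = dict(self.history[checkpoint_id])
        state.update(edits or {})              # user corrects the state
        # resuming from here discards everything after the rewind point
        self.history = self.history[:checkpoint_id + 1]
        return state

cp = Checkpoints()
cp.save({"step": 1, "file": "app.py"})
bad = cp.save({"step": 2, "file": "wrong.py"})
fixed = cp.rewind(bad, edits={"file": "right.py"})
```

The human stays out of the loop during normal runs and only intervenes at a checkpoint when something went wrong, which is one way to resolve the balance the talk describes.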

And speaking of steering ability, the last thing I want to talk about is the memory of agents. Mike showed this off a little bit earlier, where he was basically interacting with the bot and teaching it what to do and correcting it. And so this is an example where, in a chat setting, I'm teaching an AI to write a tweet in a specific style. You can see that I'm just correcting it in natural language to get to a style that I want. I then hit thumbs up. The next time I go back to this application, it remembers the style that I want. But I can keep on editing it; I can keep on making it a little more differentiated. And when I go back a third time, it remembers all of that. So this I would classify as procedural memory: it's remembering the correct way to do something. I think another really important aspect is basically personalized memory.
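Procedural memory of this kind, persisting style corrections across sessions, can be sketched as a stored rule list that gets prepended to every later prompt. The class and method names here are illustrative, not LangChain APIs:

```python
class StyleMemory:
    """Remember natural-language corrections ('procedural memory') and
    replay them as instructions in every later session."""
    def __init__(self):
        self.rules = []

    def correct(self, feedback):
        # called when the user gives a correction and, say, a thumbs-up
        self.rules.append(feedback)

    def build_prompt(self, task):
        prefix = "".join(f"- {r}\n" for r in self.rules)
        return f"Follow these style rules:\n{prefix}Task: {task}"

mem = StyleMemory()
mem.correct("keep tweets under 100 characters")
mem.correct("no hashtags")
prompt = mem.build_prompt("write a tweet about agents")
```

A real system would also need to deduplicate and reconcile conflicting rules, but the core idea is just that corrections outlive the session they were given in.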

So remembering facts about a human that you might not necessarily use to do something more correctly, but that you might use to make the experience more personalized. So this is an example of a journaling app that we're building and playing around with for exploring memory. You can see that I mentioned that I went to a cooking class, and it remembers that I like Italian food. And so I think bringing in these personalized aspects, whether procedural or these personalized facts, will be really important for the next generation of agents.
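Personalized memory, extracting facts about the user and recalling the relevant ones later, might look roughly like this. The keyword-matching extraction is a stub for illustration; a real journaling app would ask an LLM to pull facts out of each entry:

```python
class FactMemory:
    """Remember facts about the user from journal entries and surface
    the relevant ones later to personalize responses."""
    def __init__(self):
        self.facts = []

    def ingest(self, entry):
        # Stub extraction: a real app would have an LLM extract facts.
        if "italian" in entry.lower():
            self.facts.append("likes Italian food")
        if "cooking class" in entry.lower():
            self.facts.append("took a cooking class")

    def recall(self, query):
        # Stub retrieval by word overlap; a real app would use embeddings.
        return [f for f in self.facts if any(w in f for w in query.split())]

mem = FactMemory()
mem.ingest("Went to a cooking class today, made Italian pasta.")
relevant = mem.recall("Italian dinner ideas")
```

The recalled facts would then be injected into the prompt, the same mechanical move as procedural memory but carrying preferences rather than instructions.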

That's all I have. Thanks for having me; I'm excited to chat about all of this. If anyone wants to talk about any of it, just grab me afterwards.