From the podcast: Dwarkesh Patel

Andrej Karpathy — “We’re summoning ghosts, not building animals”

Published 2025-10-17 17:15:45

Reinforcement learning is terrible. It just so happens that everything that we had before is much worse. I'm actually optimistic. I think this will work. I think it's tractable. I'm only sounding pessimistic because when I go on my Twitter timeline, I see all this stuff that makes no sense to me. A lot of it is, I think, honestly just fundraising. We're not actually building animals. We're building ghosts. These are sort of ethereal spirit entities because they're fully digital and they're kind of mimicking humans, and it's a different kind of intelligence. It's business as usual because we're in an intelligence explosion already and have been for decades. Everything is gradually being automated. It has been for hundreds of years.

Don't write blog posts. Don't do slides. Don't do any of that. Build the code, arrange it, get it to work. It's the only way to go; otherwise you're missing knowledge. If you have a perfect AI tutor, maybe you can get extremely far. The geniuses of today are barely scratching the surface of what a human mind can do, I think. Today, I'm speaking with Andrej Karpathy. Andrej, why do you say that this will be the decade of agents and not the year of agents? Well, first of all, thank you for having me here. I'm excited to be here. So the quote that you've just mentioned, "it's the decade of agents", is actually a reaction to a pre-existing quote, I should say, where I think some of the labs, I'm not actually sure who said this, were alluding to this being the year of agents with respect to LLMs and how they were going to evolve.

And I think I was triggered by that because I feel like there's some over-prediction going on in the industry. In my mind, this is a lot more accurately described as the decade of agents. We have some very early agents that are actually extremely impressive and that I use daily, you know, Claude and Codex and so on. But I still feel like there's so much work to be done. So my reaction is that we'll be working with these things for a decade. They're going to get better, and it's going to be wonderful. I think I was just reacting to the timelines of the implication.

What do you think will take a decade to accomplish? What are the bottlenecks? Well, actually making it work. In my mind, when you're talking about an agent, or what the labs have in mind and maybe what I have in mind as well, you should think of it almost like an employee or an intern that you would hire to work with you. So for example, you work with some employees here. When would you prefer to have an agent like Claude or Codex do that work? Currently, of course, they can't. What would it take for them to be able to do that? Why don't you do it today? And the reason you don't do it today is because they just don't work.

They don't have enough intelligence, they're not multimodal enough, they can't do computer use and all this kind of stuff. And they don't do a lot of the things that you've alluded to earlier. You know, they don't have continual learning. You can't just tell them something and they'll remember it. They're just cognitively lacking and it's just not working. And I think that it will take about a decade to work through all those issues. Interesting. So as a professional podcaster and a viewer of AI from afar, it's easy for me to identify what's lacking: continual learning is lacking, or multimodality is lacking.

But I don't really have a good way of trying to put a timeline on it. If somebody asks how long continual learning will take, there's no prior I have about whether this is a project that should take five years, 10 years, or 50 years. Why a decade? Why not one year? Why not 50 years? Yeah, I guess this is where you get into a bit of my own intuition, and also a bit of an extrapolation with respect to my own experience in the field. So I guess I've been in AI for almost two decades.

I mean, it's going to be maybe 15 years or so, not that long. You had Richard Sutton here, who was around, of course, for much longer. But I do have about 15 years of experience of people making predictions and of seeing how they actually turned out. I was in research for a while and I've worked in industry for a while. So I guess I have a general intuition left over from that. And I feel like the problems are tractable. They're surmountable. But they're still difficult. And if I just average it out, it just kind of feels like a decade, I guess, to me.

This is actually quite interesting. I want to hear not only the history, but what people in the room felt was about to happen at various breakthrough moments. What were the ways in which their feelings were either overly pessimistic or overly optimistic? Should we just go through each of them one by one? Yeah. I mean, that's a giant question because, of course, you're talking about 15 years of stuff that happened. AI is actually so wonderful because there have been a number of, I would say, seismic shifts where the entire field has sort of suddenly looked a different way. And I guess I've maybe lived through two or three of those. And I still think there will continue to be some, because they come with almost surprising regularity.

Well, when my career began, when I started to work on deep learning, when I became interested in deep learning, this was just kind of by chance of being right next to Geoff Hinton at the University of Toronto. And Geoff Hinton, of course, is kind of like the godfather figure of AI. He was training all these neural networks, and I thought it was incredible and interesting. But this was not the main thing that everyone in AI was doing, by far. This was a niche little subject on the side. That's maybe the first dramatic seismic shift, the one that came with AlexNet and so on. I would say AlexNet sort of reoriented everyone, and everyone started to train neural networks.

But it was still very much per task, per specific task. So maybe I have an image classifier, or I have a neural machine translator, or something like that. And people became, very slowly actually, interested in agents, I would say. People started to think: okay, maybe we have a check mark next to the visual cortex or something like that, but what about the other parts of the brain? How can we get an actual full agent, a full entity that can actually interact in the world? And I would say the Atari deep reinforcement learning shift in 2013 or so was part of that early effort on agents, in my mind, because it was an attempt to get agents that not just perceive the world but also take actions, interact, and get rewards from environments.

And at the time this was Atari games. And I kind of feel like that was a misstep, actually. It was a misstep that even the early OpenAI, which I was a part of, of course adopted. Because at that time the zeitgeist was reinforcement learning environments, games, game playing, beat games, get lots of different types of games. And OpenAI was doing a lot of that. So that was maybe another prominent part of AI where, for two or three or four years, everyone was doing reinforcement learning on games. And basically that was a little bit of a misstep. At OpenAI I was always a little bit suspicious of games as being the thing that would actually lead to AGI, because in my mind you want something like an accountant, something that's actually interacting with the real world.

And I just didn't see how games add up to it. So my project at OpenAI, for example, was within the scope of the Universe project, on an agent that was using keyboard and mouse to operate web pages. I really wanted to have something that interacts with the actual digital world, that can do knowledge work. And it just so turns out that this was extremely early, way too early. So early that we shouldn't have been working on that. Because if you're just stumbling your way around, keyboard mashing and mouse clicking and trying to get rewards in these environments, your reward is too sparse and you just won't learn. You're going to burn a forest computing.
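To put a rough number on the sparse-reward point above, here is a back-of-the-envelope sketch. It assumes a hypothetical task that pays out only when the agent emits one exact sequence of inputs, and an agent that picks inputs uniformly at random. The function names and the figures (100 inputs, 10 steps) are illustrative assumptions, not anything from the Universe project.

```python
from fractions import Fraction


def random_success_probability(num_actions: int, sequence_length: int) -> Fraction:
    """Chance that uniformly random actions hit one exact required sequence."""
    return Fraction(1, num_actions) ** sequence_length


def expected_episodes_until_reward(num_actions: int, sequence_length: int) -> int:
    """Mean number of episodes before the first reward (geometric distribution: 1 / p)."""
    p = random_success_probability(num_actions, sequence_length)
    return int(1 / p)


if __name__ == "__main__":
    # Hypothetical task: 100 possible inputs per step, 10 correct steps in a row.
    print(expected_episodes_until_reward(100, 10))
```

Under these assumptions the expected wait grows exponentially with the length of the required sequence, which is one way to read "your reward is too sparse and you just won't learn".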

And you're never actually going to get something off the ground. What you're missing is this power of representation in the neural network. So for example, today people are training those computer-using agents, but they're doing it on top of a large language model. You actually have to get the language model first. You have to get the representations first, and you do that by all the pre-training and all the LLM stuff. So I kind of feel like, loosely speaking, people kept trying to get the full thing too early a few times, where people really tried to go after agents too early. I would say that was Atari and Universe, and even my own experience. You actually have to do some things first before you get to those agents.

And maybe now the agents are a lot more competent, but maybe we're still missing some parts of that stack. But I would say those are the three major buckets of what people were doing: training neural nets per task, trying the first round of agents, and then maybe the LLMs, actually seeking the representation power of the neural networks before you tack everything else on top. Interesting. Yeah, I guess if I were to steelman the Sutton perspective, it would be that humans actually can just take on everything at once. Even animals can take on everything at once, right? Animals are maybe a better example because they don't even have the scaffold of language. They just get thrown out into the world and they have to make sense of everything without any labels. And the vision for AGI then should just be something which just looks at sensory data, looks at the computer screen.

And it just figures out what's going on from scratch. I mean, if a human was put in a similar situation and had to be trained from scratch, well, this is like a human growing up or an animal growing up. So why shouldn't that be the vision for AI, rather than this thing where we're doing millions of years of training? I think that's a really good question. Sutton was on your podcast, and I saw the podcast, and I had a write-up about that podcast that gets into a little bit of how I see things. I'm very careful to make analogies to animals because they came about by a very different optimization process. Animals are evolved, and they actually come with a huge amount of hardware that's built in. For example, my example in the post was the zebra: a zebra gets born, and a few minutes later it's running around and following its mother.

That's an extremely complicated thing to do. That's not reinforcement learning. That's something that's baked in. And evolution obviously has some way of encoding the weights of our neural nets in ATCGs. I have no idea how that works, but it apparently works. So I kind of feel like brains just came from a very different process, and I'm very hesitant to take inspiration from it because we're not actually running that process. So in my post I kind of said we're not actually building animals. We're building ghosts or spirits or whatever people want to call it, because we're not doing training by evolution. We're doing training by basically imitation of humans and the data that they've put on the internet. And so you end up with these sort of ethereal spirit entities, because they're fully digital and they're kind of mimicking humans.

And it's a different kind of intelligence. If you imagine a space of intelligences, we're starting off at a different point almost. We're not really building animals. But I think it's also possible to make them a bit more animal-like over time, and I think we should be doing that. I guess one more point is that I do feel like Sutton basically has a framework of "we want to build animals". And I actually think that would be wonderful if we can get that to work. That would be amazing. If there was a single algorithm that you could just run on the internet and it learns everything, that would be incredible. I'm not actually sure that it exists, and that's certainly not what animals do, because animals have this outer loop of evolution.

Right. And a lot of what looks like learning is actually a lot more maturation of the brain. And I think there's actually very little reinforcement learning for animals. A lot of the reinforcement learning is more like motor tasks, not intelligence tasks. So I actually kind of think humans don't really use RL, roughly speaking, is what I would say. Can you repeat the last sentence? A lot of that intelligence is not motor tasks. It's what? Right. A lot of the reinforcement learning, in my perspective, would be things that are a lot more motor-like, simple kinds of tasks like throwing a hoop or stuff like that. But I don't think that humans use reinforcement learning for a lot of intelligence tasks like problem solving and so on. Interesting.

That doesn't mean we shouldn't do that for research. But I just feel like that's what animals do or don't do. I'm going to take a second to digest that because there's a lot of different ideas. Maybe here's one clarifying question I can ask to understand the perspective. I think you suggested that evolution is doing the kind of thing that pre-training does, in the sense of building something which can then understand the world. The difference, I guess, is that evolution has to be titrated, in the case of humans, through three gigabytes of DNA. So that's very unlike the weights of a model. I mean, literally, the weights of the model are a brain, which obviously is not encoded in the sperm and the egg, or does not exist in the sperm and the egg. So it has to be grown. And also the information for every single synapse in the brain simply cannot exist in the three gigabytes that exist in the DNA.

Evolution seems closer to finding the algorithm which then does the lifetime learning. Now, maybe the lifetime learning is not analogous to RL, to your point. Is that compatible with the thing you were saying, or would you disagree with that? I think so. I would agree with you that there's some miraculous compression going on, because obviously the weights of the neural net are not stored in ATCGs. There's some kind of dramatic compression, and there are some kind of learning algorithms encoded that take over and do some of the learning online. So I definitely agree with you on that. Basically, I would say I'm a lot more practically minded. I don't come at it from the perspective of "let's build animals". I come at it from the perspective of "let's build useful things". So I have a hard hat on, and I'm just observing that, look, we're not going to do evolution, because I don't know how to do that.

But it does turn out we can build these ghost, spirit-like entities by imitating internet documents. This works. And it's actually a way to bring you up to something that has a lot of built-in knowledge and intelligence in some way, similar to maybe what evolution has done. So that's why I kind of call pre-training this kind of crappy evolution. It's the practically possible version, with our technology and what we have available to us, to get to a starting point where we can actually do things like reinforcement learning and so on. Just to steelman the other perspective, because after doing this interview and thinking about it a bit, he has an important point here. Evolution does not really give us the knowledge, right? It gives us the algorithm to find the knowledge, and that seems different from pre-training.

So perhaps the perspective is that pre-training helps build the kind of entity which can learn better. It teaches meta-learning, and therefore it is similar to finding an algorithm. But if it's "evolution gives us knowledge and pre-training gives us knowledge", that analogy seems to break down. So it's subtle, and I think you're right to push back on it. But basically the thing that pre-training is doing, where you're basically getting the next-token predictor over the internet and you're training that into a neural net, is actually two things that are kind of unrelated. Number one, it's picking up all this knowledge, as I call it. Number two, it's actually becoming intelligent. By observing the algorithmic patterns in the internet, it actually boots up all these little circuits and algorithms inside the neural net to do things like in-context learning and all this kind of stuff.
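As a concrete picture of what "next-token predictor" means, here is a deliberately tiny sketch. It is a count-based bigram model, not a neural net and not how LLMs are actually trained; it only illustrates the objective of predicting the next token from what came before and scoring that prediction with a log-likelihood loss. The corpus and function names are made up for illustration.

```python
from collections import Counter, defaultdict
import math


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn the counts for one context token into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: c / total for tok, c in counts[prev].items()}


def average_loss(counts, tokens):
    """Average negative log-likelihood of each observed next token."""
    losses = [
        -math.log(next_token_probs(counts, prev)[nxt])
        for prev, nxt in zip(tokens, tokens[1:])
    ]
    return sum(losses) / len(losses)


# Toy corpus standing in for "the internet" (illustrative only).
corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram(corpus)
print(next_token_probs(model, "the"))
print(average_loss(model, corpus))
```

A model this small can only memorize which word follows which, so it shows the "picking up knowledge" half of the story; the circuits for in-context learning that the conversation describes are a property of much larger neural networks and are not captured here.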

And you don't actually need or want the knowledge. I actually think that's probably holding back the neural networks overall, because it's getting them to rely on the knowledge a little too much sometimes. For example, I kind of feel like one thing agents are not very good at is going off the data manifold of what exists on the internet. If they had less knowledge or less memory, maybe they would actually be better. And so what I think we have to do going forward, and this would be part of the research paradigms, is figure out ways to remove some of the knowledge and to keep what I call this cognitive core. It's this intelligent entity that is stripped of knowledge but contains the algorithms and the magic of intelligence and problem solving and the strategies of it and all this kind of stuff. There's so much interesting stuff there.

Okay, so let's start with in-context learning. This is an obvious point, but I think it's worth saying it explicitly and meditating on it. The situation in which these models seem the most intelligent, in which I talk to them and I'm like, wow, there's really something on the other end that's responding to me and thinking about things, where if it makes a mistake it's like, oh wait, that's actually the wrong way to think about it, I'm backing up: all that is happening in context. That's where I feel like there's real intelligence that you can visibly see.

And that in-context learning process is developed by gradient descent on pre-training, right? It spontaneously meta-learns in-context learning. But the in-context learning itself is not gradient descent, in the same way that our lifetime intelligence as humans to be able to do things is conditioned by evolution, but our actual learning during our lifetime is happening through some other process. I actually don't fully agree with that, but you should continue with that thought. Actually, then I'm curious to understand how that analogy breaks down.

I think I'm hesitant to say that in-context learning is not doing gradient descent, because, I mean, it's not doing explicit gradient descent, but I still think that in-context learning is basically pattern completion within a token window, right? And it just turns out that there's a huge amount of patterns on the internet, and so you're right, the model kind of learns to complete the pattern. And that's inside the weights. The weights of the neural network are trying to discover patterns and complete the pattern, and there's some kind of adaptation that happens inside the neural network, right?

Which is kind of magical and just falls out from the internet, just because there are a lot of patterns. I will say that there have been some papers that I thought were interesting that actually look at the mechanisms behind in-context learning, and I do think it's possible that in-context learning actually runs a small gradient descent loop internally in the layers of the neural network. I recall one paper in particular where they were doing linear regression, actually, using in-context learning.

So basically your inputs into the neural network are XY pairs, XY, XY, XY, that happen to be on a line. And then you give it X and you expect the Y, and the neural network, when you train it in this way, actually does do linear regression. And normally when you would run linear regression, you have a small gradient descent optimizer that basically looks at XY, looks at the error, calculates the gradient of the weights, and does the update a few times.
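For reference, the explicit version of that procedure, the "small gradient descent optimizer" for linear regression, can be sketched as follows. This is the ordinary hand-written optimizer, not the transformer from the paper being discussed; the data points, learning rate, and step count are illustrative assumptions.

```python
def fit_line(pairs, lr=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in pairs) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in pairs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# The "context": XY pairs that happen to lie on the line y = 2x + 1.
context = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = fit_line(context)

# The "query": give it a new X and read off the predicted Y.
query_x = 4.0
print(w * query_x + b)
```

With these points the fit should recover a slope close to 2 and an intercept close to 1, so the printed prediction should be very close to 9. The claim in the conversation is that a transformer given the same pairs in its context window can reach a similar answer without anyone writing this loop.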

It just turns out that when they looked at the weights of that in-context learning algorithm, they actually found some analogies to gradient descent mechanics. In fact, I think the paper was even stronger, because they actually hard-coded the weights of the neural network to do gradient descent through attention and all the internals of the neural network. So I guess that's my only pushback: who knows how in-context learning works.

But I actually think that it's probably doing a little bit of some kind of funky gradient descent internally. I think that's possible. So I guess I was only pushing back on your saying it's not doing gradient descent. Who knows what it's doing? It's probably maybe doing something similar to it, but we don't know.

So then it's worth thinking about, okay, if in-context learning and pre-training are both implementing something like gradient descent, why does it feel like with in-context learning we're actually getting to this continual-learning, real-intelligence-like thing, whereas you don't get the analogous feeling just from pre-training? At least you could argue that.

And so if it's the same algorithm, what could be different? Well, one way you can think about it is how much information the model stores per unit of information it receives from training. If you look at pre-training, if you look at Llama 3, for example, I think it's trained on 15 trillion tokens, and if you look at the 70B model, that would be the equivalent of 0.07 bits per token that it sees in pre-training, in terms of the information in the weights of the model compared to the tokens it reads.

Whereas if you look at the KV cache and how it grows per additional token in in-context learning, it's like 320 kilobytes. So that's a 35 million-fold difference in how much information per token is assimilated by the model. I wonder if that's relevant at all. I think I kind of agree. I mean, the way I usually put this is that for anything that happens during the training of the neural network, the knowledge is only kind of like a hazy recollection of what happened at training time.
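The three figures quoted above (0.07 bits per token, 320 kilobytes, 35 million-fold) can be reproduced with a little arithmetic. The sketch below is one way to arrive at them; the parameter and token counts come from the conversation, while the 16-bit weight size and the attention shape (80 layers, 8 key/value heads, head dimension 128, 2 bytes per cached value) are assumptions about Llama 3 70B that the speakers do not state.

```python
PARAMS = 70e9          # Llama 3 70B parameter count (from the conversation)
TRAIN_TOKENS = 15e12   # pre-training tokens (from the conversation)
BITS_PER_WEIGHT = 16   # assumption: weights stored in 16 bits

# Assumed attention shape for Llama 3 70B (not stated in the conversation).
LAYERS = 80
KV_HEADS = 8
HEAD_DIM = 128
BYTES_PER_VALUE = 2    # assumption: 16-bit cache entries

# Bits of weight capacity per token seen in pre-training.
weight_bits_per_token = PARAMS * BITS_PER_WEIGHT / TRAIN_TOKENS

# Bytes added to the KV cache per context token (factor 2 = keys and values).
kv_bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE

# How many times more bits per token the KV cache holds than the weights.
ratio = kv_bytes_per_token * 8 / weight_bits_per_token

print(round(weight_bits_per_token, 3))
print(kv_bytes_per_token / 1024)
print(round(ratio / 1e6, 1))
```

Under these assumptions the script gives roughly 0.075 bits per token for the weights, 320 KiB per token for the KV cache, and a ratio of about 35 million, which matches the numbers in the conversation. Different precision or architecture assumptions would shift the result.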

And that's because the compression is dramatic. You're taking 15 trillion tokens and you're compressing it to just your final network of a few billion parameters. So obviously there's a massive amount of compression going on. So I kind of refer to it as a hazy recollection of the internet documents, whereas for anything that happens in the context window of the neural network, you're plugging in all the tokens and building up all this KV cache representation, and that is very directly accessible to the neural net.
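To make "directly accessible" a bit more tangible, here is a toy single-head key/value cache. It is a minimal sketch and not any real model's implementation: the class name, the two-dimensional vectors, and the values are all made up for illustration. Each context token adds one key/value pair, and a query reads the cache back through softmax attention without any compression into weights.

```python
import math


class ToyKVCache:
    """Single-head attention cache: one key/value vector pair per context token."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        """Store the key and value vectors for one new context token."""
        self.keys.append(key)
        self.values.append(value)

    def attend(self, query):
        """Softmax-weighted average of cached values, scored by scaled dot product."""
        scale = math.sqrt(len(query))
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in self.keys]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        dim = len(self.values[0])
        return [
            sum(w * v[i] for w, v in zip(weights, self.values)) / total
            for i in range(dim)
        ]


cache = ToyKVCache()
cache.append([1.0, 0.0], [10.0, 0.0])  # token 1
cache.append([0.0, 1.0], [0.0, 10.0])  # token 2
output = cache.attend([5.0, 0.0])      # query that matches token 1's key
print(output)
```

The query that lines up with the first key pulls back mostly the first value, which is the sense in which context-window contents act like working memory: the stored vectors are read out nearly verbatim rather than recalled hazily.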

So I compare the KV cache and the stuff that happens at test time to more like a working memory. All the stuff that's in the context window is very directly accessible to the neural net. There are always these almost surprising analogies between LLMs and humans. And I find them kind of surprising because we're not trying to build a human brain directly, of course. We're just finding that this works and we're doing it. But I do think that anything that's in the weights is kind of like a hazy recollection of what you read a year ago.

Anything that you give it as context at test time is directly in the working memory. And I think that's a very powerful analogy to think through things. So when you, for example, go to an LLM and you ask it about some book and what happened in it, like Nick Lane's book or something like that, the LLM will often give you some stuff which is roughly correct. But if you give it the full chapter and ask it questions, you're going to get much better results, because it's now loaded in the working memory of the model.

在测试时，任何你提供的上下文都是直接存在于工作记忆中的。这是一个非常强有力的类比。例如，当你询问大型语言模型（LLM）关于某本书的内容时，比如书中的某些观点，LLM通常会提供一些大致正确的信息。但是，如果你提供完整的章节并提问，得到的结果会更好，因为此时这些内容已经被加载到模型的工作记忆中。

So that was a very long way of saying that I kind of agree, and that's why. And stepping back, what is the part of human intelligence that we have most failed to replicate with these models? I almost feel like just a lot of it. So maybe one way to think about it, and I don't know if this is the best way, is again making these analogies, imperfect as they are. We've stumbled upon the transformer neural network, which is actually extremely powerful.

所以基本上我同意你的观点。长话短说，我有点同意你的看法，这就是原因。再往回想一下，人类智能中我们最希望这些模型能够复制的部分是什么呢？我几乎觉得大部分都是这样的。所以，也许可以这么想。我不确定这是不是最好的方式，但我几乎有点感觉，即使这些类比并不完美，但我们已经偶然发现了变压器神经网络，这确实非常强大。

Very general. You can train transformers on audio or video or text or whatever you want, and they just learn patterns, and they're very powerful and it works really well. That to me almost indicates that this is kind of like some piece of cortical tissue. It's something like that, because the cortex is famously very plastic as well. You can rewire parts of brains. There were some slightly gruesome experiments with rewiring the visual cortex to the auditory cortex, and the animal learned fine, et cetera.

非常通用。你可以用音频、视频、文本或任何你希望的内容来训练变压器（Transformers）。它们擅长于学习模式，非常强大且效果极佳。对于我来说，这几乎表明它有点类似于大脑皮层组织。因为众所周知，皮层具有很强的可塑性。你可以重组大脑的部分功能。过去有些稍微有些残忍的实验，比如将视觉皮层重新接入到听觉皮层，而动物能够学会适应这些变化。

So I think that this is kind of like cortical tissue. I think when we're doing reasoning and planning inside the neural networks, basically producing reasoning traces for thinking models, that's kind of like the prefrontal cortex. So maybe those are like little check marks. But I still think there are many brain parts and nuclei that are not explored. So maybe, for example, there's a basal ganglia doing a bit of reinforcement learning when we fine-tune the models with reinforcement learning.

我认为这有点像大脑皮层。在神经网络中进行推理和计划时，基本上就像是在为思维模型进行推理追踪。这有点类似于前额皮质。我认为这可能像是蓝色的检查标记。但是，我仍然认为我们还有许多大脑的部分和核团尚未被探索。比如，当我们微调模型进行强化学习时，可能类似于基底神经节在执行强化学习的功能。

But, you know, where's the hippocampus? It's not obvious what that would be. Some parts are probably not important. Maybe the cerebellum is not important to cognition, it's thought, so maybe we can skip some of it. But I still think there's, for example, the amygdala, all the emotions and instincts. And there's probably a bunch of other nuclei in the brain that are very ancient that I don't think we've really replicated.

你知道，就像海马体一样，有些部分的作用并不明显。有些部位可能不是很重要。也许小脑对于认知来说并不重要，它可能是可以略过的部分。但我仍然认为，例如杏仁核，它与情感和本能有关。而且可能还有很多大脑中非常古老的核团，我认为我们还没有真正复制它们的功能。

I don't actually know that we should be pursuing the building of an analog of the human brain. Again, I'm an engineer mostly at heart. But I still feel like maybe another way to answer the question is: you're not going to hire this thing as an intern. It's missing a lot, because it comes with a lot of these cognitive deficits that we all intuitively feel when we talk to the models.

我不太确定我们是否应该致力于打造模拟人类大脑的东西。说到底，我更像是一个工程师。不过，我觉得或许可以换个方式来回答这个问题：你不会把这样的东西当作实习生来雇用，因为它在很多方面存在认知缺陷，这些缺陷在我们与这些模型对话时都直观地感受得到。

And so it's just not fully there yet. You can look at it as not all the brain parts being checked off yet. This is maybe relevant to the question of how fast these issues will be solved. Sometimes people will say about continual learning: look, you could easily replicate this capability, just as in-context learning emerged spontaneously as a result of pre-training.

所以，这就好像还没有完全就绪。你可以把它看作是大脑的某些部分还没有完全运作好。这也许与我们思考这些问题会多快解决有关。因此，有时候人们会谈到持续学习，他们会说，其实你已经可以轻松复制这种能力，就像上下文学习在预训练中自发出现一样。

Continual learning over longer horizons will emerge spontaneously if the model is incentivized to recollect information over longer horizons, or horizons longer than one session. So if there's some outer-loop RL which has many sessions within that outer loop, then this continual learning, where it fine-tunes itself or writes to an external memory or something, will just sort of emerge spontaneously.

如果模型有动力在更长时间内或超过一个单一会话期限内回忆信息，那么持续学习就能够自然而然地出现。因此，如果存在一个外层循环强化学习，其中包含多个会话，那么这种持续学习的形式，比如通过自身微调或者写入外部记忆来实现，就会自发地出现。

Do you think things like that are plausible? I just don't really have a prior on how plausible that is, how likely that is to happen. I don't know that I fully resonate with that, because I feel like these models, when you boot them up and they have zero tokens in the window, are always restarting from scratch where they were. So I don't actually know what it looks like in that worldview. Again, maybe making some analogies to humans, just because I think it's roughly concrete and kind of interesting to think through: I feel like when I'm awake, I'm building up a context window of the stuff that's happening during the day. But when I go to sleep, something magical happens where I don't actually think that the context window stays around. I think there's some process of distillation into the weights of my brain, and this happens during sleep and all this kind of stuff.

你觉得那些事情是不是有其他合理的可能性？我只是在没有确切的先入之见下觉得这种可能性有多大。我不太能完全理解这件事，因为我觉得这些模型在启动时，当它们的窗口中没有任何词元时，就像是从头开始。我不知道在这种世界观中会是什么样子。或许我们可以借用一些人类的类比，因为这样会比较具体，也挺有趣。我觉得，当我醒着时，我在不断积累一天中发生的事情的上下文。但当我睡觉时，我觉得发生了一些神奇的事情，我不认为这个上下文是保留下来的。我认为有一种过程在把信息提炼到我的大脑中，这是在睡眠中发生的，类似于这种情况。

We don't have an equivalent of all that in large language models, and that, to me, is more adjacent to what is absent when you talk about continual learning and so on. These models don't really have a distillation phase of taking what happened, analyzing it, obsessively thinking through it, basically doing some kind of synthetic data generation process and distilling it back into the weights, and maybe having a specific neural net per person. Maybe it's a LoRA; it's not a full-weight neural network, it's just some small sparse subset of the weights that is changed. But basically, we do want to create ways of creating these individuals that have very long contexts. It's not only remaining in the context window, because the context windows grow very, very long.

我们在大型语言模型中没有等效的机制，这让我联想到当我们讨论持续学习等内容时所缺少的东西。这些模型其实并没有一个提炼阶段来分析发生过的事情，深入思考它。基本上，它们没有进行某种合成数据生成过程，再将其提炼回模型的权重中，并为每个人单独设置一个特定的神经网络。可能这是一个较小的网络，而不是一个完整的权重神经网络，仅仅是修改了一些小的稀疏权重子集。但是，基本上，我们确实希望创造一种方法，来建立可以拥有非常长上下文的个体。这不仅仅是将信息保留在上下文窗口中，因为上下文窗口会变得非常非常长。
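One existing way to give each person a small set of trainable parameters on top of a shared frozen network is a low-rank adapter (LoRA). The sketch below is a plain-Python illustration under assumed sizes, not a description of how any production system does per-user distillation; note that LoRA adds a low-rank update rather than literally editing a sparse subset of the weights.

```python
import random


def matmul(a, b):
    # Plain-Python matrix product for small illustrative matrices.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(x, w_frozen, lora_a, lora_b, scale=1.0):
    # y = x W + scale * (x A) B. Only A and B would be trained;
    # W stays shared and frozen.
    base = matmul(x, w_frozen)
    delta = matmul(matmul(x, lora_a), lora_b)
    return [[bv + scale * dv for bv, dv in zip(br, dr)] for br, dr in zip(base, delta)]


D_IN, D_OUT, RANK = 64, 64, 4  # hypothetical sizes
rng = random.Random(0)
w = [[rng.gauss(0, 0.02) for _ in range(D_OUT)] for _ in range(D_IN)]
a = [[rng.gauss(0, 0.02) for _ in range(RANK)] for _ in range(D_IN)]
b = [[0.0] * D_OUT for _ in range(RANK)]  # B starts at zero: adapter is a no-op at first

x = [[rng.gauss(0, 1) for _ in range(D_IN)]]
full_params = D_IN * D_OUT
adapter_params = RANK * (D_IN + D_OUT)
print(full_params, adapter_params)  # 4096 512
```

With these sizes the per-person adapter is one eighth of the layer; at realistic widths the fraction is far smaller, which is what makes a separate adapter per individual conceivable.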

Maybe we have some very elaborate sparse attention over it. But I still think that humans obviously have some process for distilling some of that knowledge into the weights. We're missing it. And I do also think that humans have some kind of very elaborate sparse attention scheme, which I think we're starting to see some early hints of. DeepSeek V3.2 just came out, and I saw that they have sparse attention, as an example. This is one way to have very, very long context windows. So I almost feel like we are redoing a lot of the cognitive tricks that evolution came up with, through a very different process. But we are, I think, converging on a similar architecture cognitively.

也许我们在这方面已经有一些非常复杂的稀疏注意机制。但是我仍然认为，人类显然有某种将这些知识转化为权重的过程，我们可能还没完全掌握。而且我确实认为，人类拥有某种非常复杂的稀疏注意力机制，我觉得我们已经开始看到一些早期的迹象了。比如，最近推出的Deep Seek V3.2，其中就有一个关于稀疏注意力的例子。这是一种实现超长上下文窗口的方法。因此，我几乎觉得我们正在通过一个完全不同的过程重新创造进化过程中形成的许多认知技巧。但我认为，我们在认知上正趋向于类似的结构。
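To make the idea of sparse attention concrete, here is a minimal top-k sketch: the query scores every cached key but only attends to the k best matches. This is a generic illustration and an assumption on my part, not the mechanism DeepSeek V3.2 actually uses; in particular, a practical system would need a cheaper way to pick candidates than scoring every key as done here.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def topk_sparse_attention(query, keys, values, k):
    # Score all keys, keep the k highest-scoring positions,
    # and run softmax attention over just those positions.
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, key) * scale for key in keys]
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in keep])
    dim = len(values[0])
    out = [sum(w * values[i][d] for w, i in zip(weights, keep)) for d in range(dim)]
    return out, sorted(keep)


keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0]]
query = [1.0, 0.0]

sparse_out, kept = topk_sparse_attention(query, keys, values, k=2)
dense_out, _ = topk_sparse_attention(query, keys, values, k=len(keys))
print(kept)  # [0, 2]
```

Setting k to the full sequence length recovers ordinary dense attention, so k acts as a dial between cost and how much of the context each token can see.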

Interesting. In 10 years, do you think it will still be something like a transformer, but with much more modified attention and more sparse MLPs and so forth? Well, the way I like to think about it is, okay, let's assume translation invariance in time. So 10 years ago, where were we? 2015. We had convolutional neural networks primarily. Residual networks had just come out. So remarkably similar, I guess, but quite a bit different still. I mean, the transformer was not around. All these more modern tweaks on the transformer were not around. So maybe some of the things that we can bet on in 10 years, by translational equivariance, are that we're still training giant neural networks with a forward and backward pass and an update through gradient descent. But maybe it looks a little bit different, and everything is just much bigger.

有趣的是，你认为在10年后，它会依然类似于一个transformer，但具有更多修改过的注意力机制和更稀疏的多层感知机（MLP）等等吗？我比较喜欢用一种方式来思考这个问题：让我们考虑时间上的平移不变性。那么10年前，我们处于什么阶段？2015年时，我们主要用的是卷积神经网络，而残差网络刚刚出现。虽然这些都与我们现在使用的模型相似，但仍有很大不同。当时，transformer还没有出现，也没有这些新型的transformer改进版。所以，也许我们在10年后可以预见的一些事情，是我们仍然在通过前向和后向传播来训练通用网络，进行梯度下降的更新。只不过，那个时候可能看起来略有不同，并且一切都变得更大。
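The part expected to survive, a forward pass, a backward pass and a gradient descent update, fits in a few lines. The following is a deliberately tiny sketch on a one-parameter-pair linear model with hand-derived gradients; the data, learning rate and step count are made up for illustration.

```python
def train_linear(xs, ys, lr=0.05, steps=1000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Forward pass: compute predictions with the current parameters.
        preds = [w * x + b for x in xs]
        # Backward pass: gradients of the mean squared error w.r.t. w and b.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # Update: one gradient descent step.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated by y = 2x + 1
w, b = train_linear(xs, ys)
print(round(w, 3), round(b, 3))
```

A modern network replaces the two hand-written gradient lines with automatic differentiation and the two scalars with billions of parameters, but the loop has the same three steps.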

Actually, I also went back all the way to 1989, which was kind of a fun exercise for me a few years ago, because I was reproducing Yann LeCun's 1989 convolutional network, which was the first neural network I'm aware of trained via gradient descent, like a modern neural network trained with gradient descent on digit recognition. And I was just interested in, okay, how can I modernize this? How much of this is algorithms? How much of this is data? How much of this progress is compute and systems? And I was able to very quickly halve the error just by time traveling by 33 years.

实际上，最近我也回到了1989年，这对我来说是几年前的一次有趣的练习。当时我在重现年轻时的1989年的卷积网络，那是我所知道的第一个通过梯度下降训练的神经网络，就像现代神经网络使用梯度下降在数字识别中训练一样。我只是想知道，我该如何让它现代化？这其中有多少是算法的进步？有多少是数据的作用？又有多少进步来自计算和系统的改进？通过这33年的“时间旅行”，我能够非常快速地将学习率降低一半。

So if I time travel by 33 years of algorithms, I could adjust what Yann LeCun did in 1989 and I could basically halve the error. But to get further gains, I had to add a lot more data. I had to 10x the training set. And then I had to add more computational optimizations: I had to train for much longer, with dropout and other regularization techniques. And so it's almost like all these things have to improve simultaneously. So, you know, we're probably going to have a lot more data. We're probably going to have a lot better hardware. We're probably going to have a lot better kernels and software. We're probably going to have better algorithms.

如果我通过算法穿越到33年后，我可能会适应年轻的1989，并且基本上只需要学一半的内容就够了。但要获得更大的进步，我不得不加入更多的数据，几乎需要将训练集扩大10倍。接着，我还需要加入更多的计算优化，基本上需要更长时间的训练，同时使用dropout和其他正则化技术。可以说所有这些方面都需要同时改进。所以，我们可能会有更多的数据，更好的硬件，更好的内核和软件，以及更好的算法。

So with all of those, it's almost like no one of them is winning too much. All of them are surprisingly equal. And this has kind of been the trend for a while. So I guess to answer your question, I expect differences, algorithmically, to what's happening today. But I do also expect that some of the things that have stuck around for a very long time will probably still be there. It's probably still a giant neural network trained with gradient descent. That would be my guess. It's surprising that all of those things together only halved the error, which is like 30 years of progress. Maybe half is a lot, because if you halve the error, that actually means a lot. But I guess what was shocking to me is that everything needs to improve across the board: architecture, optimizer, loss function. And it has also improved across the board forever. So I kind of expect all those changes to be alive and well.

所以，所有这些方面，似乎没有哪个方面特别突出。它们之间都出乎意料地均衡。这种趋势已经持续了一段时间。或许可以回答你的问题，我预期会有一些差异。与当前情况相比，我也预期一些已经存在很久的东西依然会存在。可能仍然是一个用梯度下降训练的大型神经网络。这是我的猜测。令我惊讶的是，这些进步综合起来之后，仅仅相当于30年的进步也有可能只有一半，因为你得到了一个错误率，这意味着一半已经算多了。但令我震惊的是，各个方面都需要改进：架构、损失函数优化等，似乎这些方面永远都在持续改进。所以，我期待所有这些变化仍然活跃并且表现良好。

Yeah, actually, I was about to ask a very similar question about NanoChat, because since you just coded it up recently, every single step in the process of building a chatbot is fresh in your RAM. And I'm curious if you had similar thoughts, like, oh, there was no one thing that was relevant to going from GPT-2 to NanoChat. What are some surprising takeaways from the experience of building a chatbot? So NanoChat is a repository I released. Was it yesterday or the day before? I can't remember. We can see the sleep deprivation that went into it. Well, it's trying to be the simplest complete repository that covers the whole pipeline, end to end, of building a ChatGPT clone. And so, you know, you have all of the steps, not just any individual step. I worked on all the individual steps in the past and released small pieces of code that show you how each one is done.

好的，其实我正准备问一个关于NanoChat的类似问题，因为你最近刚刚编写了这个项目，构建一个聊天机器人的每一步都像是新鲜记忆在你脑海中。我很好奇你是否有类似的想法，比如从GPT-2到NanoChat的过程中，并没有一个单一的东西特别重要。关于构建聊天机器人的经历，有哪些让人意外的收获？NanoChat是我发布的一个代码库。是昨天还是前天？我记不清了。这个项目试图成为一个最简单且完整的代码库，它涵盖了构建聊天GPT克隆全流程中的所有环节。所以，你在这里可以看到每一个步骤，不只是单个步骤，我过去在各个独立的步骤上工作过，并编写了一些很小的代码片段来展示这个过程是如何完成的。

I think in terms of learning, I don't know that I actually learned something from it, necessarily. I kind of already had in my mind how you build it. And this is just a process of mechanically building it and making it clean enough so that people can actually learn from it and find it useful. Yeah. What is the best way for somebody to learn from it? Is it to just delete all the code and try to re-implement it from scratch, or try to add modifications to it? Yeah, I think that's a great question. I would probably say, so basically it's about a thousand lines of code that takes you through the entire pipeline. I would probably put it on the right monitor; if you have two monitors, you put it on the right. And you want to build it from scratch, from the start. You're not allowed to copy-paste. You're allowed to reference; you're not allowed to copy-paste. Maybe that's how I would do it.

我觉得从学习的角度来看，这不是说我一定从中学到了什么新的东西。我脑子里已经大概知道该如何构建它。而现在，只是一个把它机械化构建并让它足够清晰的过程，这样人们就能从中学到东西，并觉得它有用。对，别人怎么才能最好地从中学习呢？是不是就像删除所有代码然后尝试从头开始重写，或者尝试进行一些修改？我觉得这是个好问题。我大概会说，这个过程大约涉及一千行代码，覆盖了整个流程。我可能会建议，把代码放到右边的显示器上（如果你有两个显示器的话），然后从头开始构建，不能复制粘贴，只能参考。可能我会这么做。

But I also think the repository by itself is a pretty large beast. I mean, when you write this code, you don't go from top to bottom. You go from chunks and you grow the chunks. And that information is absent. You wouldn't know where to start. And so I think it's not just the final repository that's needed. It's the building of the repository, which is a complicated chunk-growing process. Right. So that part is not there yet. I would love to actually add that, probably later this week or something, in some way. It's probably a video or something like that. But roughly speaking, that's what I would try to do: build the stuff yourself, but don't allow yourself to copy-paste.

但是，我也认为仅仅依靠这个代码库，它就像一个庞然大物。我的意思是，它就像一个迷宫。当你编写这个代码时，不是从头到尾线性进行的，而是从小块开始，然后逐步扩展这些小块。而这些信息是缺失的，比如你不知道从哪里开始。因此，我认为不仅仅是需要一个最终的代码库，还需要一个建立代码库的过程，而这个过程是一个复杂的小块增长过程。目前这部分还没有。我希望可能在本周晚些时候以某种方式添加这一部分，可能是一个视频或者其他形式。但大概来说，我会尝试做的是亲自构建这些内容，但不要让自己依赖复制粘贴。

Yeah. I do think that there are almost two types of knowledge. There's the high-level surface knowledge. But the thing is that when you actually build something from scratch, you're forced to come to terms with what you don't actually understand. And you don't know that you don't understand it. Interesting. And it always leads to a deeper understanding. And it's just the only way to build this. Like, if I can't build it, I don't understand it. That's a Feynman quote, I believe, or something along those lines. I 100% have always believed this very strongly, because there are all these micro things that are just not properly arranged. And you don't really have the knowledge. You just think you have the knowledge. So don't write blog posts. Don't do slides. Don't do any of that. Build the code, arrange it, get it to work. It's the only way to go; otherwise you're missing knowledge.

好的。我确实认为知识可以分为两种类型。一种是高层次的表面知识。但是，当你真正从零开始构建某个东西时，你会被迫去面对自己实际上并不理解的部分。而且有趣的是，你并不知道自己不理解这一点。这总会引导我们去获得更深入的理解。这就是唯一的构建方法。比如说，如果我不能构建它，那我就不理解它。我相信这大概是一种具体的原则？我一直非常坚信这一点，因为本来有很多细微的东西并没有被正确安排好，而我们并没有真正掌握这些知识，只是以为掌握了。因此，不要写博客文章，不要做幻灯片之类的事情，而是着手编写代码，整理好它，让它运行起来。这是唯一的途径，否则你就会错过一些知识。

You tweeted out that coding models were actually of very little help to you in assembling this repository. And I'm curious why that was. Yeah. So the repository, I guess I built it over a period of a bit more than a month. And I would say there are three major classes of how people interact with code right now. Some people completely reject all LLMs and are just writing from scratch. I think this is probably not the right thing to do anymore.

你提到编程模型实际上对你组建这个代码库帮助不大。我很好奇这是为什么。嗯，这个代码库，我大约用了一个多月的时间来建立。我认为目前人们与代码互动有三种主要方式。有些人完全拒绝使用大语言模型（LLMs），他们完全从零开始手动编写代码。我觉得这一方式可能已经不是他们应该继续采用的方法了。

The intermediate part, which is where I am, is that you still write a lot of things from scratch, but you use the autocomplete that's basically available now from these models. So when you start writing out a piece of it, they will autocomplete for you, and you can just tab through. Most of the time it's correct. Sometimes it's not, and you edit it. But you're still very much the architect of what you're writing.

中间阶段，也就是我所在的阶段，是指你仍然需要从头开始写很多东西，但你会使用自动补全功能。现在这些模型基本上可以提供这样的功能。当你开始写时，它会自动补全部分内容。你可以直接选择这些补全选项，大多数情况下它们都是正确的。有时可能不正确，你就需要对其进行编辑。但你依然是你所写内容的主要规划者。

And then there's the, you know, vibe coding. You know, hi, please implement this or that, enter, and then let the model do it. And that's the agents. I do feel like the agents work in very specific settings, and I would use them in specific settings. But again, these are all tools available to you, and you have to learn what they're good at, what they're not good at, and when to use them.

然后，还有那个，你懂的，编程工具。你知道的，像是“你好，请实现这个或那个”，然后按回车键，让模型去完成。这些就是代理。我觉得这些代理在特定场合下非常有效。我会在特定情况下使用它们。不过，这些都是你可以使用的工具。你需要了解它们擅长什么、不擅长什么，以及什么时候使用它们。

So the agents are actually pretty good, for example, if you're doing boilerplate stuff. Boilerplate code that's just, you know, copy-paste stuff. They're very good at that. They're very good at stuff that occurs very often on the internet, because there are lots of examples of it in the training sets of these models. So there are features of things where the models will do very well.

所以这些代理实际上非常不错，比如说，如果你在处理一些模板化的东西。模板化的代码，就像那种你知道的，只是复制粘贴的内容，它们在这方面非常擅长。它们对互联网中经常出现的东西也很拿手，因为在这些模型的训练数据中有很多这样的例子。因此，在某些特定特征上，这些模型的表现会非常好。

I would say NanoChat is not an example of this, because it's a fairly unique repository. There's not that much code, I think, in the way that I've structured it. And it's not boilerplate code. It's actually intellectually intense code, almost, and everything has to be very precisely arranged. And the models kept trying to, I mean, they have so many cognitive deficits, right? So one example: they keep misunderstanding the code.

我觉得“nanochet”不属于这种情况，因为它是一个相当独特的代码库。这个项目的代码量并不多，我对其进行了特别的结构化安排。它不是那种简单的样板代码，而是那种需要深入思考的代码，每一个部分都需要精确地安排。这些模型一直在尝试理解代码，但由于它们有很多认知缺陷，常常对代码产生误解。

Because they have too much memory from all the typical ways of doing things on the internet that I just wasn't adopting. So the models, for example, and I don't know if I want to get into the full details, but they keep thinking I'm writing normal code, and I'm not. Maybe one example. So we have eight GPUs that are all doing forward and backward passes.

由于他们习惯了网上一些典型的方法，而我并没有采用这些方法，所以他们有太多这方面的记忆。比如模型，就像我说的，我不知道是否要详细说明，但他们总是认为我在写普通的代码，而我并不是。也许我可以举个例子，比如同步的方法，我们有八个GPU同时在进行前向计算。

The way to synchronize gradients between them is to use the Distributed Data Parallel container of PyTorch, which automatically, as you're doing the backward pass, starts communicating and synchronizing gradients. I didn't use DDP because I didn't want to use it, because it's not necessary. So I threw it out and basically wrote my own synchronization routine that's inside the step of the optimizer.

在它们之间同步梯度的方法是使用 PyTorch 的分布式数据并行容器，该容器会在反向传播时自动进行梯度同步和通信。但我没有使用 DDP，因为我认为没有必要。所以我没有用，而是自己编写了一个同步程序，这个程序集成在优化器的步骤中。
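As a rough picture of what "synchronization inside the optimizer step" can mean, here is a pure-Python simulation. It is a sketch under stated assumptions, not the repository's code: `SyncedSGD` and `all_reduce_mean` are hypothetical names, the workers are simulated as lists, and a real multi-GPU version would use a collective such as `torch.distributed.all_reduce` where the averaging happens here.

```python
def all_reduce_mean(per_worker_grads):
    # Stand-in for a collective all-reduce: average each gradient entry
    # across workers so every worker ends up with identical gradients.
    num_workers = len(per_worker_grads)
    return [sum(entries) / num_workers for entries in zip(*per_worker_grads)]


class SyncedSGD:
    """Hypothetical optimizer whose step() synchronizes gradients itself."""

    def __init__(self, params, lr):
        self.params = list(params)
        self.lr = lr

    def step(self, per_worker_grads):
        # Communication happens here, inside the optimizer step,
        # instead of being hooked into the backward pass.
        grads = all_reduce_mean(per_worker_grads)
        self.params = [p - self.lr * g for p, g in zip(self.params, grads)]
        return self.params


opt = SyncedSGD(params=[1.0, 2.0], lr=0.5)
worker_grads = [[1.0, 0.0], [3.0, 4.0]]  # two simulated workers
print(opt.step(worker_grads))  # [0.0, 1.0]
```

The trade-off is that a wrapper like DDP overlaps communication with the backward pass automatically, while doing it in the step keeps the model code free of wrappers and puts the communication in one visible place.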

And so the models were trying to get me to use the DDP container. Yeah. And they were very concerned about, okay, this gets way too technical, but I wasn't using that container because I don't need it, and I have a custom implementation of something like it. And they just couldn't internalize that you had your own. Yeah, they couldn't get past that.

因此，这些模型试图让我使用DDP容器。他们非常担心，虽然这有点太技术性了，但我没有使用那个容器，因为我并不需要它，而且我已经有一个类似的自定义实现。但他们就是无法理解这一点。你有你自己的方法，他们就是不能理解。

And then they kept trying to mess up the style; they're way too over-defensive. They add all these try-catch statements. They keep trying to make a production code base. And I have a bunch of assumptions in my code, and that's okay. I don't need all this extra stuff in there. And so I just kind of feel like they're bloating the code base.

他们总是试图搞乱风格，好像过于防御过度。他们做了一堆三重校验声明。他们一直试图让代码库看起来像是生产环境的。而我在代码中有很多假设，这没问题。而且，我觉得我不需要所有这些额外的东西。所以我觉得他们是在让代码库变得臃肿。
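The contrast between over-defensive code and code that simply states its assumptions might look like the following. Both functions are hypothetical illustrations written for this transcript, not taken from the repository.

```python
def mean_loss_defensive(losses):
    # The criticized style: every conceivable failure is caught and hidden,
    # so a bug upstream silently turns into a loss of 0.0.
    try:
        if losses is None:
            return 0.0
        return sum(float(x) for x in losses) / len(losses)
    except (TypeError, ValueError, ZeroDivisionError):
        return 0.0


def mean_loss_lean(losses):
    # States its one assumption and fails loudly if it is violated.
    assert len(losses) > 0, "expected at least one loss value"
    return sum(losses) / len(losses)


print(mean_loss_lean([1.0, 3.0]))  # 2.0
print(mean_loss_defensive([]))     # 0.0
```

In a small teaching code base the second style is shorter and surfaces mistakes immediately; the first is closer to what a production service might want, which is the mismatch being described.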

They're bloating the complexity. They keep misunderstanding. They use deprecated APIs a bunch of times. So it's a total mess, and it's just not net useful. I can go in and clean it up, but it's not net useful. I also feel like it's kind of annoying to have to type out what I want in English, because it's just too much typing.

他们在使事情变得复杂。他们总是误解。他们反复使用过时的 API。这简直是一团糟。而且这样做没有任何真正的用处。我可以进去整理一下，但整体上来说没什么用。我还觉得不得不用英语打出我想要表达的内容有点烦，因为那实在是打字太多了。

If I just navigate to the part of the code that I want, go where I know the code has to appear, and start typing out the first three letters, autocomplete gets it and just gives me the code. And so I think this is a very high-information-bandwidth way to specify what you want. You point to the code where you want it, you type out the first few pieces, and the model will complete it.

如果我只是导航到想要查看的代码部分，然后到达预期代码出现的位置，开始输入前三个字母，自动补全功能就会识别出来并补全代码。所以我认为，这是一种非常高信息传输量的方式来指定你想要什么。只要你指向想要的代码位置并输入开头的几个字符，模型就会完成剩下的部分。

So I guess what I mean is, I think these models are good in certain parts of the stack. I actually did use the models a little bit. There are two examples where I used the models that I think are illustrative. One was when I generated the report. That's actually more boilerplate-y, so I partially vibe coded some of that stuff, and that was fine.

所以我想表达的是，我认为这些模型在技术栈的某些部分表现不错。实际上，在一些地方我也使用了这些模型。我可以举两个例子，我认为它们很有说明性。一个例子是当我生成报告时，报告大多是模板化内容。所以，如果我能将其中一部分交给模型处理，那也没有问题。

Because it's not mission-critical stuff, and it works fine. And then the other part is when I was rewriting the tokenizer in Rust. I'm actually not as good at Rust because I'm fairly new to it. So there was a bit of vibe coding going on when I was writing some of the Rust code. But I had a Python implementation that I fully understand, I'm just making sure I'm making a more efficient version of it, and I have tests. So I feel safer doing that stuff.

因为这并不是紧急任务，所以没有关系。另外一方面，当时我在用Rust重写分词器。我对Rust并不太擅长，因为我对这门语言还比较陌生。所以在写一些Rust代码时，我有点依赖直觉编程。但我有一个用Python实现的版本，我完全理解，并且确保我正在做一个更高效的版本，还做了测试。所以我在做这些工作时感到比较安全。
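The safety net described here, a reference implementation that is fully understood plus tests that compare a faster rewrite against it, can be sketched as a small differential test. The example assumes the tokenizer is BPE-style and uses adjacent-pair counting as the function under test; both of those, and all the function names, are assumptions made for illustration.

```python
import random
from collections import Counter


def count_pairs_reference(ids):
    # Slow, obviously-correct version: walk the sequence and tally adjacent pairs.
    counts = {}
    for i in range(len(ids) - 1):
        pair = (ids[i], ids[i + 1])
        counts[pair] = counts.get(pair, 0) + 1
    return counts


def count_pairs_fast(ids):
    # Stand-in for the rewritten implementation (Rust, in the conversation).
    return dict(Counter(zip(ids, ids[1:])))


def check_equivalent(trials=200, seed=0):
    # Differential test: both versions must agree on many random inputs,
    # including empty and single-element sequences.
    rng = random.Random(seed)
    for _ in range(trials):
        ids = [rng.randrange(8) for _ in range(rng.randrange(0, 40))]
        assert count_pairs_fast(ids) == count_pairs_reference(ids), ids
    return True


print(check_equivalent())  # True
```

With a harness like this, code written in an unfamiliar language only has to match a trusted oracle, which is why leaning on a model there is lower risk.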

And so basically they increase accessibility to languages or paradigms that you might not be as familiar with. So I think they're very helpful there as well. Because there's a ton of Rust code out there, the models are actually pretty good at it. I happen to not know that much about it, so the models are very useful there. The reason I think this question is so interesting is that the main story people have about AI exploding and getting to superintelligence pretty rapidly is AI automating AI engineering and AI research.

基本上，他们降低了或者说提高了你对不太熟悉的语言或编程范式的可访问性。我认为这一点非常有帮助。因为现在有大量的Rust代码，模型在这方面表现得相当不错。而我对Rust也比较了解，所以我觉得这些模型在这方面非常有用。我认为这个问题之所以有趣，是因为人们普遍认为人工智能会迅速发展成超级智能的主要原因在于AI在自我自动化、AI工程和AI研究方面的应用。

So they'll look at the fact that you can have Claude Code make entire applications, CRUD applications, from scratch and be like, if you had this capability inside of OpenAI and DeepMind and everything, well, just imagine a thousand of you or a million of you in parallel finding little architectural tricks. And so it's quite interesting to hear you say that this is the thing they're asymmetrically worse at. And it's quite relevant to forecasting whether the AI 2027 type explosion is likely to happen anytime soon.

他们会注意到，你可以使用云代码从零开始构建整个应用程序、众包应用程序。如果你在 OpenAI 和 DeepMind 等中具备这种能力，那就想象一下，比如说，有成千上万甚至上百万个你同时在寻找一些小的架构技巧。因此，你说这是他们在这方面相对较差的地方，听起来相当有趣。这对于预测类似 AI 2027 年那种爆炸式发展的可能性是否会很快发生，非常具有参考价值。

I think that's a good way of putting it. And I think you're getting at why my timelines are a bit longer. You're right. I think, yeah, they're not very good at code that has never been written before. Maybe that's one way to put it, and that is what we're trying to achieve when we're building these models. Very naive question, but the architectural tweaks that you're adding to NanoChat, they're in a paper somewhere, right? They might even be in a repo somewhere.

我认为你这样说挺好的。我觉得你说到了我的一些想法，比如为什么我的时间表会稍微长一些，你是对的。我想，是的，他们在处理从未写过的代码时不太擅长。也许可以这样说，这正是我们在构建这些模型时试图实现的目标。可能这个问题显得有点天真，但你对NanoChat进行的架构调整有发表在某篇论文里吗？也可能在某个代码仓库里找得到。

So is it surprising that they aren't able to integrate that, so that whenever you say, add RoPE embeddings or something, they do it in the wrong way? It's tough. I think they kind of know, but they don't fully know, and they don't know how to fully integrate it into the repo and your style and your code and some of the custom things that you're doing, and how it fits with all the assumptions of the repository and all this kind of stuff.

所以，这是否令人惊讶呢？他们无法正确地将其整合到你的项目中，比如添加rope embeddings或类似的东西。他们的方式不对。这确实是个难题。我想他们多少有些了解，但并不是完全明白，也不知道怎么全面地将这些整合到你的代码库中，包括你的风格、代码、位置，以及一些你正在做的自定义工作。他们还需要考虑如何与代码库中的各种假设和其他相关内容相契合。

So I think they do have some knowledge, but they haven't gotten to the place where they can actually integrate it, make sense of it, and so on. I do think that a lot of this stuff, by the way, continues to improve. Currently, the state-of-the-art model that I go to is probably GPT-5 Pro, and that's a very, very powerful model. So if I actually have 20 minutes, I will copy-paste my entire repo and go to GPT-5 Pro, the oracle, for some questions.

所以我认为他们确实有一些知识，但还没达到能够实际整合和理解这些知识的程度。不过我确实觉得很多东西在不断进步。现在我比较常用的顶尖模型是GPT-5专业版，这是一个非常强大的模型。如果我有20分钟时间，我会把整个代码库复制粘贴进去，然后在GPT-5专业版中寻找一些问题的答案，就像是咨询一位“神谕”。

And often it's not too bad, and surprisingly good compared to what existed a year ago. But I do think that overall the models are not there. And I kind of feel like the industry is making too big of a jump and is trying to pretend like this is amazing, and it's not. It's slop. And I think they're not coming to terms with it, and maybe they're trying to fundraise or something like that. I'm not sure what's going on.

而且往往情况下，这些比一年前已有的要好得多，令人惊讶的是表现不错。但是，我确实觉得整体而言这些模型还未达到理想状态。我有点感觉行业有些过于激进，试图假装这些很出色，但实际上并不是。这些是粗制滥造的。我认为他们没有正视这个问题，可能是在试图筹款之类的。我不太清楚具体情况。

But we're at this intermediate stage. The models are amazing. They still need a lot of work. For now, autocomplete is my sweet spot. But sometimes, for some types of code, I will go to an LLM agent. Yeah. Actually, here's another reason why this is really interesting. Through the history of programming, there have been many productivity improvements: compilers, linting, better programming languages, etc.

但是，我们正处于一个中间阶段。模型确实令人惊叹，但目前仍需要大量改进。对我来说，"未完成"是最舒服的状态。但有时，为了编写代码，我会选择“空想象”。实际上，这也是为什么这个话题非常有趣的另一个原因。在编程的历史中，有过很多提高生产力的进步，比如编译器、静态分析工具、更好的编程语言等等。

And these increased programmer productivity a lot, but have not led to an explosion. So that's one category, and that sounds very much like autocomplete tab. And this other category is just automation of the programmer. Yeah. And it's interesting that you're seeing this more in the category of the historical analogies, like better compilers or something.

这里提到的程序生产力有了显著提高，但并没有出现爆炸式的增长。这有点像自动补全功能。同时还有另一种情况，可以看作是程序员的自动化。这很有趣，因为你会发现更多的情况类似于历史上的一些东西，比如更好的编译器之类的。

And maybe this gets to one other thought, which is that I do feel like I have a hard time differentiating where AI begins and stops, because I do see AI as fundamentally an extension of computing in some pretty fundamental way. And I feel like I see a continuum of this kind of recursive self-improvement, or of speeding up programmers, all the way from the beginning, even, I would say, code editors.

这句话讨论了另一种思路：我确实感到很难分辨人工智能的起点和终点，因为在某种基本层面上，我认为人工智能本质上是计算的一种延伸。我觉得我能看到一个连续的过程，从一开始就不断自我改进，或者说让程序员的效率不断提升——甚至可以说这个过程从代码编辑器就已经开始了。

Yeah. And then syntax highlighting, or even checking of the types, like data type checking. All these kinds of tools that we've built for each other, even search engines. Why aren't search engines part of AI? I don't know, ranking is kind of AI, right? At some point Google, even early on, was thinking of itself as an AI company doing the Google search engine, which I think is totally fair. And so I kind of see it as a lot more of a continuum than I think other people do, and it's hard for me to draw the line. And I kind of feel like, okay, we're now getting a much better autocomplete. And now we're also getting some agents, which are these loopy things, but they kind of go off the rails sometimes.

是的，然后我认为这就像是语法高亮语法检查，或者是数据类型检查之类的。我们为彼此构建的所有这些工具，甚至是搜索引擎，为什么它们不算作AI的一部分呢？我不知道，排名算是AI的一种，对吧？在某个时候，谷歌甚至在早期就把他们自己视为一家做搜索引擎的AI公司，我觉得这是完全合理的。所以，我觉得这比其他人想象的更像是一个连续的过程。我很难划出界限。我感觉现在我们有了一个更好的自动补全功能，也有了一些代理程序，这些程序有点像循环结构，但有时它们会失控。

And what's going on is that the human is progressively doing a bit less and less of the low-level stuff. For example, we're not writing the assembly code because we have compilers. Yeah. Right? Compilers will take my high-level language, like C, and write the assembly code. So we're abstracting ourselves very, very slowly. And there's this thing I call the autonomy slider, where more and more stuff is automated, of the stuff that can be automated at any point in time. And we're doing a bit less and less, and raising ourselves in the layer of abstraction over the automation.

正在发生的情况是，人类在逐渐减少涉及低层次工作的部分。比如说，我们不再亲自编写汇编代码，因为我们有编译器。编译器会把我们用高级语言写的代码转化为汇编代码。所以，我们在慢慢地对具体细节进行抽象。我称之为“自主化滑块”，随着技术进步，越来越多的事情在任何时候都可以被自动化，我们参与的细节越来越少，逐渐提升自己在自动化中的位置。

One of the big problems with RL is that it's incredibly information sparse. Labelbox can help you with this by increasing the amount of information that your agent gets to learn from with every single episode. For example, one of their customers wanted to train a coding agent. So Labelbox augmented an IDE with a bunch of extra data collection tools and staffed a team of expert software engineers from their Alignerr network to generate trajectories that were optimized for training. Now obviously these engineers evaluated these interactions on a pass/fail basis. But they also rated every single response on a bunch of different dimensions like readability and performance.

强化学习的一个大问题是信息非常稀疏。这个“盒子”可以通过增加你的智能体在每一集学习到的信息量来帮助你解决这个问题。例如，他们的一个客户想要训练一个写代码的智能体。于是，盒子在一个集成开发环境（IDE）中增加了一系列数据收集工具，并从他们的对齐网络中招募了一支由软件工程专家组成的团队，生成多样化的数据。这些专家生成了为训练优化的轨迹。显然，这些工程师在每一个环节进行的互动都是以通过为基础进行评估的。但他们还从可读性和性能等多个维度对每一个响应进行了评分。

And they wrote down their thought processes for every single rating that they gave. So you're basically showing every single step an engineer takes and every single thought that they have while they're doing their job. And this is just something you could never get from usage data alone. And so Labelbox packaged up all these evaluations and included all the agent trajectories and the corrective human edits for the customer to train on. This is just one example. So go check out how Labelbox can get you high-quality frontier data across domains, modalities and training paradigms. Reach out at labelbox.com/dwarkesh.

他们记录下了给每个评分时的思考过程。也就是说，你可以看到工程师在工作时所采取的每一个步骤和产生的每一个想法。这些信息是单靠使用数据无法获得的。因此，Labelbox 将所有这些评估进行了打包，并包含了所有代理轨迹以及客户用于训练的人工编辑。这只是一个例子。请去看看 Labelbox 如何在不同领域、模式和训练模式下为你提供高质量的前沿数据。如需了解更多，请访问 labelbox.com/barchash。

Let's talk about RL a bit. You tweeted some very interesting things about this. Conceptually, how should we think about the way that humans are able to build a rich world model just from interacting with our environment, and in ways that seem almost irrespective of the final reward at the end of the episode? If somebody is starting a business, and at the end of 10 years she finds out whether the business succeeded or failed, we say that she's earned a bunch of wisdom and experience. But it's not because the log probs of every single thing that happened over the last 10 years are up-weighted or down-weighted. Something much more deliberate and rich is happening.

让我们来谈谈我们的所有赌注。你们两个人在这方面做了一些非常有趣的事情。从概念上讲，我们应该如何理解，人类是如何通过与环境互动来构建丰富的世界模型的。这种方式似乎几乎与最终的奖励无关。比如说一个人开创一项业务，十年后才能知道这个业务是成功还是失败。我们会说，她在这个过程中积累了很多智慧和经验。但这并不是因为过去十年中每一件事情都被简单地加权或减权，而是因为有一些更深思熟虑和丰富的事情在发生。

What is the ML analogy, and how does that compare to what we're doing with LLMs right now? Yeah, maybe the way I would put it is humans don't use reinforcement learning, as I've said before. I think they do something different. Reinforcement learning is a lot worse than I think the average person thinks. Reinforcement learning is terrible. It just so happens that everything that we had before it is much worse, because previously we were just imitating people, so it has all these issues. So in reinforcement learning, say you're solving a math problem, because it's very simple. You're given a math problem and you're trying to find a solution.

机器学习的类比是什么？这与我们目前在做的其他事情相比如何？也许我会这样说，人类并不像我可能所说的那样使用强化学习。我认为他们做的事情有些不同，是通过体验来学习。因此，我认为强化学习没有人们普遍认为的那么好。个人学习也是糟糕的，但恰巧我们之前用的一切更为糟糕。因为以前我们只是模仿他人，所以存在很多问题。以强化学习为例，比如在解决数学问题时，这是非常简单的：你获得一个数学问题，然后尝试找到一个解决方案。

Now in reinforcement learning, you will try lots of different things in parallel first. So you're given a problem and you try hundreds of different attempts. And these attempts can be complex, right? They can be like, oh, let me try this, let me try that, this didn't work, that didn't work, etc. And then maybe you get an answer, and you check the back of the book and you see, okay, the correct answer is this. And then you can see that, okay, this one, this one and that one got the correct answer, but these other 97 of them didn't. So literally what reinforcement learning does is it goes to the ones that worked really well, and every single thing you did along the way, every single token, gets up-weighted, like, do more of this.

在强化学习中，你会先并行尝试很多事情。你会被给到一个问题，然后尝试数百种不同的解决方案。这些尝试可能很复杂，比如，你可能会说“让我试试这个，再试试那个，这个不行，那个也不行”等等。然后你可能会找到一个答案，然后对照书后的答案核对正确答案。你会发现，也许有几个答案是正确的，但还有97个答案是错误的。强化学习的做法就是关注那些表现良好的尝试，每一个过程中采取的每一个步骤都会被提升，提示你多做这些有效的尝试。

The problem with that is, I mean, people will say that your estimator has high variance, but what I mean is it's just noisy. It's noisy. So basically it almost assumes that every single little piece of the solution that you made that arrived at the right answer was the correct thing to do, which is not true. Like you may have gone down the wrong alleys until you arrived at the right solution. Every single one of those incorrect things you did, as long as you got to the correct solution, will be up-weighted as do more of this. It's terrible. It's noise. You've done all this work only to find that at the end you get a single number of like, oh, you did correct. And based on that you weigh that entire trajectory as up-weight or down-weight.

这其中的问题是，人们可能会说你的估计量方差很大，但我的意思是它只是有很多噪音。就是噪音。所以基本上它几乎假设了你解决方案中的每个小细节都是正确的，但这并不是真的。你可能在找到正确答案之前走过很多错误的路。这些错误的操作，只要你最终找到了正确的解决方案，都会被认为是应该多做的。这很糟糕，就是噪音。你做了这么多努力，却只得到一个结果，比如，哦，你做对了。然后根据这个结果，你会给整个过程加权或减权。
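
The credit assignment described above can be sketched in a few lines. This is a toy illustration, not any lab's actual training code: the trajectories, the step names, and the choice of a mean-reward baseline are all assumptions made for the example. It shows how one final number is broadcast to every step, so a "wrong alley" inside a successful attempt is up-weighted anyway.

```python
def broadcast_outcome_reward(trajectories):
    """Give every step of a trajectory the same credit: final reward minus the mean reward.

    The mean-reward baseline is an assumption for this sketch; the point is only
    that each step inherits the single outcome signal of its whole trajectory.
    """
    rewards = [t["reward"] for t in trajectories]
    baseline = sum(rewards) / len(rewards)
    credit = []
    for t in trajectories:
        advantage = t["reward"] - baseline
        credit.append([(step, advantage) for step in t["steps"]])
    return credit


# Hypothetical attempts at one math problem (1.0 = matched the answer in the back of the book).
trajectories = [
    {"steps": ["try factoring", "wrong alley", "backtrack", "solve"], "reward": 1.0},
    {"steps": ["try factoring", "wrong alley", "give up"], "reward": 0.0},
    {"steps": ["guess", "give up"], "reward": 0.0},
    {"steps": ["guess", "check", "solve"], "reward": 1.0},
]

credit = broadcast_outcome_reward(trajectories)
for steps in credit:
    print(steps)
```

In the first trajectory the "wrong alley" step receives the same positive credit as "solve", while the identical step in the second trajectory is pushed down, purely because of how each attempt happened to end.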

And so the way I like to put it is you're sucking supervision through a straw, because you've done all this work that could be a minute of rollout. And you're sucking the bits of supervision of the final reward signal through a straw, and you're basically broadcasting that across the entire trajectory and using that to up-weight or down-weight that trajectory. It's just crazy. A human would never do this. Number one, a human would never do hundreds of rollouts. Number two, when a person finds a solution, they will have a pretty complicated process of review of like, okay, I think these parts I did well, these parts I did not do that well. I should probably do this or that. And they think through things.

所以，我喜欢用的说法是，你就像是通过吸管吸取监督信号。因为你做了很多工作，可能需要一分钟才能完成。而你就像是通过吸管吸取最终奖励信号的一小部分，然后你要把它分散到整个过程，并用它来调整这个过程的优劣。这样太疯狂了，人类根本不会这样做。第一，人类不会进行数百次的尝试。第二，当一个人找到解决方案时，他们会进行一个相对复杂的审视过程，比如“好吧，我觉得这些部分做得很好，这些部分做得不太好。我可能应该这样或那样”并深入思考。

There's nothing in current LLMs that does this. There's no equivalent of it. But I do see papers popping out that are trying to do this, because it's obvious to everyone in the field. So I kind of see it as like, the first imitation learning, actually, by the way, was extremely surprising and miraculous and amazing, that we can fine-tune by imitation on humans. And that was incredible because in the beginning, all we had was base models. Base models are autocomplete. And it wasn't obvious to me at the time, and I had to learn this, and the paper that blew my mind was InstructGPT, because it pointed out that, hey, you can take the pre-trained model, which is autocomplete, and if you just fine-tune it on text that looks like conversations, the model will very rapidly adapt to become very conversational, and it keeps all the knowledge from pre-training.

当前的元素中没有实现这样的功能，没有对应的替代物。不过，我确实看到有一些论文开始尝试去实现这一点，因为这个需求在这个领域中已经很明显。我把它看作是最初的模仿学习，这实在是令人吃惊和奇妙，因为我们能够通过模仿人类进行微调。起初，我们只有基础模型，也就是自动完成的模型。当时我并不觉得这有什么特别的，但我后来学习到的、让我大开眼界的论文是Instruct GPT。这篇论文指出，你可以在预训练模型的基础上进行训练，这些模型最初只是用来自动完成。如果你在类似对话的文本上对它进行微调，这个模型就能非常迅速地适应，并变得非常擅长对话，同时它还保留来自预训练阶段的所有知识。

And this blew my mind because I didn't understand that stylistically it can adjust so quickly and become an assistant to a user through just a few loops of fine-tuning on that kind of data. It was very miraculous to me that that worked. So incredible. And that was like two years, three years of work. And now came RL. And RL allows you to do a bit better than just imitation learning, right? Because you can have these reward functions and you can hill climb on the reward functions. And so some problems have just correct answers. You can hill climb on that without getting expert trajectories to imitate. So that's amazing. And the model can also discover solutions that a human might never come up with. So this is incredible.

这让我感到十分震惊，因为我之前不明白，仅仅通过在这样的数据上进行几次微调，就能在风格上如此快速地调整，成为用户的助手。这对我来说简直就是奇迹，太令人难以置信了。而且这花了大约两三年的时间。然后，强化学习（RL）出现了。强化学习让你能做得比单纯的模仿学习更好，对吗？因为你可以设置这些奖励函数，并在奖励函数上进行爬坡优化。所以有些问题是存在正确答案的，你可以在没有专家路径来模仿的情况下进行优化。这太神奇了。而且模型还能发现人类可能从未想到的解决方案。简直不可思议。

And yet it's still stupid. So I think we need more. And so I saw a paper from Google yesterday that tried to have this reflect-and-review idea in mind. Was it the memory bank paper or something? I don't know. I've actually seen a few papers along these lines. So I expect there to be some kind of a major update to how we do algorithms for LLMs coming in that realm. And then I think we need three or four or five more. Something like that. You're so good at coming up with evocative phrases. "Sucking supervision through a straw" is so good. So you're saying your problem with outcome-based reward is that you have this huge trajectory.

这仍然很愚蠢。所以我认为我们需要更多改进。因此，我昨天看到谷歌的一篇论文，试图在这个反思和审查付费的想法上进行探讨。好像是关于记忆库的论文之类的。我不太确定，但我确实见过一些类似的论文。所以我预计在这方面会有一些重大的算法更新。而且我认为我们还需要三到五个类似的突破。你真的很擅长提出引人注目的短语，比如“通过吸管吸监督”，这太妙了。你提出的问题是，当没有基础奖励时，你仍然有一个巨大的发展轨迹。

And then at the end, you're trying to learn every single possible thing about what you should do and what you should learn about the world from that one final bit. Given the fact that this is obvious, why hasn't process-based supervision as an alternative been a successful way to make models more capable? What has been preventing us from using this alternative paradigm? So process-based supervision just refers to the fact that we're not going to have a reward function only at the very end. After you've done 10 minutes of work, I'm not going to tell you you did well or not well. I'm going to tell you at every single step of the way how well you're doing.

然后在最后，你试图从那最后一点去了解每一件你应该做的事，以及你应该从世界中学习的事。为什么在这个显而易见的事实下，我们还没有采用基于过程的监督作为一种使模型更加有能力的替代商业途径？是什么阻止了我们使用这种替代的模式？所谓的基于过程的方法是指，我们不会只在你完成10分钟工作后才给予奖励反馈，而是会在你每一个步骤中告诉你做得怎么样。

And basically the reason we don't have that is it's tricky how you do that properly. Because you have partial solutions and you don't know how to assign credit. So when you get the right answer, it's just an equality match to the answer, very simple to implement. If you're doing process supervision, how do you assign partial credit in an automatable way? It's not obvious how you do it. Lots of labs, I think, are trying to do it with these LLM judges. So basically you get LLMs to try to do it. So you prompt an LLM, hey, look at a partial solution of a student. How well do you think they're doing if the answer is this? And they try to tune the prompt.

这基本上就是为什么我们没有实现这个功能的原因，因为要正确地做到这一点是很棘手的。因为你会有部分解决方案，但不知道如何合理分配积分。因此，当你得到正确答案时，这只是与答案的一致匹配，非常容易实现。如果你在进行过程监督，如何自动分配部分积分呢？这并不是显而易见的。很多实验室都在尝试使用大型语言模型（LLM）来解决这个问题。所以基本上你就是让LLM来尝试完成这项工作。你可以提示一个LLM，说：“看看学生的部分解答，如果正确答案是这个，你觉得他们表现如何？”然后他们会尝试调整提示。
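
To make the contrast concrete, here is a minimal sketch of the two reward styles. The outcome reward is the equality match mentioned above. The process reward needs some per-step scorer, which is the hard part. The `check_arithmetic_step` function below is a hypothetical stand-in that only understands lines of the form `a + b = c`; it is not an LLM judge and is included only so the example runs.

```python
def outcome_reward(final_answer, reference_answer):
    """Outcome-based reward: one number at the very end, by equality match."""
    return 1.0 if final_answer.strip() == reference_answer.strip() else 0.0


def check_arithmetic_step(step):
    """Hypothetical step scorer: verifies a single 'a + b = c' line with integers.

    Real process supervision would need a far more general scorer (for example
    an LLM judge), which is exactly where the difficulty lies.
    """
    try:
        lhs, rhs = step.split("=")
        a, b = lhs.split("+")
        return 1.0 if int(a) + int(b) == int(rhs) else 0.0
    except ValueError:
        return 0.0  # the line is not in the expected form


def process_rewards(steps, step_scorer):
    """Process-based reward: one score per step instead of one per trajectory."""
    return [step_scorer(step) for step in steps]


steps = ["2 + 3 = 5", "5 + 4 = 10", "10 + 1 = 11"]
print(outcome_reward("11", "10"))                     # a single bit for the whole attempt
print(process_rewards(steps, check_arithmetic_step))  # points at the step that went wrong
```

With the outcome reward, the attempt is simply marked wrong. With per-step scores, the second line is identified as the mistake, which is the richer signal process supervision is after.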

The reason I think this is kind of tricky is quite subtle. And it's the fact that anytime you use an LLM to assign a reward, those LLMs are giant things with billions of parameters and they're gameable. And if you're doing reinforcement learning with respect to them, you will find adversarial examples for your LLM judges, almost guaranteed. So you can't do this for too long. You do maybe 10 steps or 20 steps, maybe it will work, but you can't do 100 or 1,000. I understand it's not obvious, but basically the model will find little cracks.

我觉得这有点复杂而且相当微妙。主要是在于每次你用大型语言模型（LLM）来分配奖励时，这些模型都有数十亿个参数，并且可能被利用。如果你一直以这些模型为基础进行学习强化，几乎可以肯定地会找到对模型不利的对抗性示例。你不能这样做太长时间，最多只能进行10或20步的操作，也许还能奏效，但绝对不能做到100或1000步。虽然这点可能并不明显，但模型基本上会找到一些小漏洞。

It will find all these spurious things in the nooks and crannies of the giant model and find a way to cheat it. So one example that's prominently in my mind, I think this was probably public. But basically if you're using an LLM judge for a reward, you just give it a solution from a student and ask it if the student did well or not. We were training with reinforcement learning against that reward function and it worked really well. And then suddenly the reward became extremely large. It was a massive jump, and it did perfect.

它会在这个巨大的模型的每个角落里找到所有这些似乎无关紧要的东西，并找到一种办法来绕过它。让我印象深刻的一个例子可能是公开的。基本上，如果你用大型语言模型（LLM）作为奖励的评判标准，你给它一个学生的解答，然后询问模型学生是否会做得很好。我们用强化学习来对抗这个奖励函数，效果非常好。但是突然间，奖励分数变得极高，就像是突然飞跃，而且结果非常完美。

And you're looking at it like, wow, this means the student is perfect on all these problems, it's fully solved math. But actually what's happening is that when you look at the completions that you're getting from the model, they are complete nonsense. They start out okay, and then they change to "the the the the". It's just like, okay, let's take 2 plus 3 and we do this and this, and then "the the the the". And you're looking at it like, this is crazy, how is it getting a reward of one, or 100 percent?

你看着它，会觉得，哇，这意味着学生是完美的，所有这些问题都被完全解决了。但是实际上发生的情况是，当你查看从模型中得到的答案时，它们完全是胡说八道，一开始还不错，然后就变得不对劲，比如说，好，我们来看2加3，我们这样算那样算。然后结果却是荒谬的，怎么会得到满分或100%的奖励呢?

And you look at the LLM judge and it turns out that "the the the the the the the the" is an adversarial example for the model, and it assigns 100 percent probability to it. And it's just because this is an out-of-sample example to the LLM. It's never seen it during training and you're in pure generalization land. It's never seen it during training, and in the pure generalization land, you can find these examples that break it. You're basically training the LLM to be a prompt injection model.

你看看这个大型语言模型（LLM）的评判，结果发现“the the the the the the the the”是一个对模型的对抗性例子，它被分配了100%的概率。这仅仅是因为对于这个LLM来说，这是一个训练样本之外的例子。模型在训练过程中从未见过这样的输入，你进入了完全依赖泛化能力的领域。在这个完全泛化的领域中，你可以找到这样的例子来弄乱模型。实际上，你是在把LLM训练成一个提示注入模型。

Not even that. Prompt injection is way too fancy. You're finding adversarial examples. These are nonsensical solutions that are obviously wrong, but the model thinks they are amazing. So to the extent you think this is the bottleneck to making RL more functional, then that will require making LLMs better judges if you want to do this in an automated way. And then is it just going to be some sort of GAN-like approach where you have to train models to be more robust? I think the labs are probably doing all that. Like, okay, so the obvious thing is "the the the the" should not get a 100 percent reward.

即便是提示注入也太复杂了，或者你正在寻找对抗性实例。这些都是显然错误但模型却认为很棒的无意义的解决方案。因此，你认为这是让强化学习（RL）更具功能性的瓶颈。要在自动化方式下做到这一点，就需要让大型语言模型（LLM）成为更好的评判者。那么，这是否会像某种对抗生成网络（GAN）的方法，或者需要训练模型更加鲁棒？我想实验室可能正在尝试所有这些方法。显而易见的事情是，模型不应该得到100%的奖励。

Okay, well, take "the the the the", put it in the training set of the LLM judge and say this is not 100 percent, this is 0 percent. You can do this. But every time you do this, you get a new LLM and it still has adversarial examples. There's infinity adversarial examples. And I think probably if you iterate this a few times, it'll probably be harder and harder to find adversarial examples. But I'm not 100 percent sure, because this thing has a trillion parameters or whatnot.

好的，把大型语言模型（LLM）评估中的训练集拿过来，然后说这个模型并不是百分之百完美，而是有零分的可能。你可以这样做，但每次这么做时，你会得到一个新的LLM，而它仍会有对抗样本。对抗样本是无限的。我觉得如果你反复迭代几次，可能会越来越难找到对抗样本，但我不敢完全确定，因为这个模型可能有上万亿参数之类的。
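
The find-an-exploit, patch-it, find-another loop can be illustrated with a deliberately tiny toy. Everything here is an assumption made for the sketch: the "judge" is a hand-written linear scorer over tokens (nothing like a real LLM judge), its weights are invented, and the optimizer is a simple greedy search rather than reinforcement learning. The only point is the dynamic: optimizing against a fixed scorer drifts toward whatever it spuriously likes, and patching one exploit leaves the next one in place.

```python
VOCAB = ["the", "so", "2", "3", "5", "+", "=", "answer"]

# Invented weights: the toy judge has a spurious fondness for the token "the".
WEIGHTS = {"the": 0.9, "so": 0.1, "2": 0.0, "3": 0.0, "5": 0.1,
           "+": -0.1, "=": 0.0, "answer": 0.3}


def make_toy_judge(patched=()):
    """Return a scorer in [0, 1]; sequences listed in `patched` are forced to 0."""
    patched = {tuple(p) for p in patched}

    def judge(tokens):
        if tuple(tokens) in patched:
            return 0.0
        mean_weight = sum(WEIGHTS[t] for t in tokens) / len(tokens)
        return 0.5 + 0.5 * mean_weight

    return judge


def greedy_hack(judge, start):
    """Greedy search: at each position keep the token that strictly raises the judge's score."""
    tokens = list(start)
    for i in range(len(tokens)):
        best_tok, best_score = tokens[i], judge(tokens)
        for cand in VOCAB:
            score = judge(tokens[:i] + [cand] + tokens[i + 1:])
            if score > best_score:
                best_tok, best_score = cand, score
        tokens[i] = best_tok
    return tokens


honest = ["2", "+", "3", "=", "5"]
judge = make_toy_judge()
hacked = greedy_hack(judge, honest)
print(hacked, judge(hacked))

# "Patch" the judge on the exploit that was found, then optimize again.
patched_judge = make_toy_judge(patched=[hacked])
new_hacked = greedy_hack(patched_judge, honest)
print(new_hacked, patched_judge(new_hacked))
```

In this toy the second search lands on a sequence that differs from the patched one in a single token, which mirrors the worry that each patch only moves the adversarial example somewhere nearby.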

So I bet you the labs are trying. I still think we need other ideas. Do you have some shape of what the other idea could be? So like this idea of a review. Yeah, review a solution and come up with synthetic examples such that when you train on them, you get better, and meta-learn it in some way. And I think there's some papers that I'm starting to see pop out.

所以，我敢打赌那些实验室正在尝试。不过，实际上我还是认为我们需要其他想法。你有没有其他想法的大致轮廓？比如像这种，每次评审，评估解决方案，然后结合一些合成的例子，这样当你在这些例子上进行训练时，可以得到更好的结果，并在某种程度上达到元学习。我开始注意到有一些论文在这方面开始出现。

I'm only at a stage of reading abstracts, because a lot of these papers, you know, they're just ideas. Someone has to actually make it work on a frontier LLM lab scale in full generality. Because when you see these papers, they pop up and it's just a little bit noisy, you know, it's cool ideas. But I haven't actually seen anyone convincingly show that this is possible. That said, the LLM labs are fairly closed, so who knows what they're doing now.

我目前只处于阅读论文摘要的阶段，因为很多论文中只是有一些想法。必须有人能够在前沿的大型语言模型实验室中全方位地实现这些想法。因为当你看到这些论文时，它们就像是一些有趣但略显杂乱的创意。我还没有看到有人能令人信服地证明这些想法是可行的。不过，大型语言模型实验室相对封闭，我们也不知道他们具体在做些什么。

But yeah. So I guess I see, not easily, but I can conceptualize how you would be able to train on synthetic examples or synthetic problems that you have made for yourself. But there seems to be another thing humans do. Maybe sleep is this, maybe daydreaming is this, which is not necessarily to come up with fake problems, but just to reflect. And I'm not sure what the ML analogy is for daydreaming or sleeping, or just reflecting, where I haven't come up with any new problem. I mean, obviously the very basic analogy would be fine-tuning on reflection bits. But I feel like in practice, that probably wouldn't work that well. So I don't know if you have some take on what the analogy of this thing is.

好的。我觉得我能理解如何通过自制的合成例子或问题进行训练，虽然这不简单，但可以想象出实现的方法。不过，似乎人类还有其他的一些活动，比如说睡觉或者做白日梦，这些活动并不一定是为了制造假问题，而更多是为了反思。我不太确定机器学习中是否有类似于人类白日梦或睡觉的概念，只是单纯的反思。我想到的最基本的类比是对机器学习模型进行某种“反思”微调，但我觉得在实际操作中可能效果不佳。不知道你是否对这种现象有什么看法。

Yeah, I do think that we're missing some aspects there. So as an example, when you're reading a book, I almost feel like currently when LLMs are reading a book, what that means is we stretch out the sequence of text and the model is predicting the next token and it's getting some knowledge from that. That's not really what humans do, right? So when you're reading a book, I almost don't even feel like the book is exposition I'm supposed to be attending to and training on. The book is a set of prompts for me to do synthetic data generation, or for you to get to a book club and talk about it with your friends.

是的，我确实认为在这方面我们有所欠缺。举个例子，当你在读一本书时，我感觉目前大型语言模型（LLM）在“读”书时，实际上是将文本序列展开，然后模型预测下一个词元，并从中获取一些知识。这并不是人类真正的阅读方式，对吧？所以当你在读一本书时，我几乎不觉得这本书只是一些陈述。我应该专注于它，并通过它进行学习，而这本书其实是一系列提示，可用于促使我生成新的思考，或者与朋友在读书会上进行讨论。

And it's by manipulating that information that you actually gain that knowledge. And I think we have no equivalent of that, again, with LLMs. They don't really do that, but I'd love to see during pre-training some kind of a stage that thinks through the material and tries to reconcile it with what it already knows, and thinks through it for some amount of time, and gets that to work. And so there's no equivalent of any of this. This is all research. There are some subtle, very subtle, I think very hard to understand reasons why it's not trivial.

通过操控获得的信息，你实际上获得了知识。我认为目前我们还没有类似于这样的方法，尤其是在所有校友中间。他们并不真正这样做。但我希望在预训练阶段能有一个步骤，能够深入思考材料，与已有的知识进行整合，认真思考一段时间并使其有效运作。所以在这方面没有任何等同的做法。这一切都属于研究。原因很微妙，非常难以理解，为什么这不是一件简单的事情。

So if I can just describe one: why can't we just synthetically generate and train on it? Well, because every synthetic example, like if I just give synthetic generation of the model thinking about a book, you look at it and you're like, this looks great. Why can't I train on it? Well, you could try, but the model will actually get much worse if you continue trying. And that's because all of the samples you get from models are silently collapsed. It's not obvious if you look at any individual example of it. They occupy a very tiny manifold of the possible space of thoughts about content.

所以，如果我能描述一个例子，为什么我们不能直接用合成实例来生成并训练模型呢？因为每一个合成的例子，比如我给一个模型生成它在思考一本书的样子，你看了可能觉得不错，为什么不能拿来训练呢？其实你可以尝试，但如果继续这样做，模型的表现会变得更差。这是因为你从模型里得到的所有样本实际上都是在无声中被收缩的。这其实不容易察觉，因为这些示例占据了可能内容思想空间中的一个非常小的区域。

So the LLMs, when they come off, they're what we call collapsed. They have a collapsed data distribution. One easy way to see it is go to ChatGPT and ask it, tell me a joke. It only has like three jokes. It's not giving you the whole breadth of possible jokes. Yeah. It's giving you like, it knows like three jokes. Yeah. They're silently collapsed. So basically you're not getting the richness and diversity and the entropy from these models as you would get from humans.

所以，当这些元素被提取出来时，我们称之为“坍塌”。它们的数据分布是“坍塌的”。举个简单例子，你可以去聊天机器人那里，问它讲个笑话。它大概只有三个笑话，不会给你所有可能的笑话种类。对，它就只知道那三个笑话。这就是所谓的完全坍塌。因此，基本上你无法从这些模型中获得像人类那样丰富、多样和充满变化的信息。

So humans are a lot noisier, but at least they're not biased, in a statistical sense. They're not silently collapsed. They maintain a huge amount of entropy. So how do you get synthetic data generation to work despite the collapse and while maintaining the entropy? It's a research problem. Just to make sure I understood, the reason that the collapse is relevant to synthetic data generation is because you want to be able to come up with synthetic problems or reflections which are not already in your data distribution?

所以人类的噪音要大得多，但至少他们没有偏见。从统计学的角度来看，他们不是沉默的或崩溃的。他们保持了大量的熵。那么，如何在合成数据面临崩溃的情况下，并且同时保持熵的情况下，成功生成合成数据是一个研究难题。为了确保我理解正确的是，崩溃与合成数据生成相关的原因是，你希望能够提出不在你现有数据分布中的合成问题或反映。
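
One way to put a number on "collapsed" is the Shannon entropy of repeated samples for the same prompt. The sketch below uses made-up sample lists (the joke labels are placeholders, not real model outputs) to show that a source which mostly repeats itself has far lower empirical entropy than one that spreads over many distinct outputs, even though each individual sample may look fine.

```python
import math
from collections import Counter


def empirical_entropy_bits(samples):
    """Shannon entropy (in bits) of the empirical distribution over distinct samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Placeholder data: ten "tell me a joke" samples from two hypothetical sources.
collapsed = ["joke A"] * 8 + ["joke B"] + ["joke C"]  # keeps returning the same few jokes
diverse = [f"joke {i}" for i in range(10)]            # ten different jokes

print(empirical_entropy_bits(collapsed))
print(empirical_entropy_bits(diverse))
```

Exact-match counting is a crude proxy, since two differently worded samples can still express the same thought, so this measure would likely understate collapse on real model outputs.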

I guess what I'm saying is, say we have a chapter of a book and I ask an LLM to think about it. It will give you something that looks very reasonable. But if I ask it 10 times, you'll notice that all of them are the same. You can't just keep scaling, quote unquote, reflection on the same amount of prompt information and then get returns from that. Any individual sample will look okay, but the distribution of it is quite terrible. It's quite terrible in such a way that if you continue training on too much of your own stuff, you actually collapse.

我想我在说的是，假设我们有一本书的一章内容，如果我让一个模型来思考这个内容，它会给出看起来非常合理的结果。但是，如果我问它10次，你会发现它给出的答案都是一样的。你不能只是依靠同样数量的提示信息，期待从中获得不同的结果。虽然每个单独的样本看起来都还不错，但整体表现却非常糟糕。糟糕到如果你不断地用自己的结果进行训练，最终会导致模型崩溃。

I actually think that there's possibly no fundamental solution to this. I also think humans collapse over time. I think these analogies are surprisingly good. But humans collapse during the course of their lives. This is why children, they haven't overfit yet. They will say stuff that will shock you, because you can see where they're coming from, but it's just not the thing people say. Because they're not yet collapsed. But we're collapsed. We end up revisiting the same thoughts. We end up saying more and more of the same stuff.

我实际上觉得这可能没有根本的解决办法。我还认为，人类随着时间的推移会陷入困境。我觉得这些类比出奇地好。但在人生的过程中，人类会崩溃。这就是为什么孩子们总是给我们意想不到的惊喜，他们没有形成固定的思维模式。他们会说出一些让你震惊的话，因为你能看到他们的思考方式，而这些话往往不是人们通常会说的。因为他们还没有固定化，而我们已经固定化了。我们最终总是会重蹈覆辙，重复着相同的想法和言论。

The learning rates go down and the collapse continues to get worse. And then everything deteriorates. Have you seen this very interesting paper that says dreaming is a way of preventing this kind of overfitting and collapse? The reason dreaming is evolutionarily adaptive is to put you in weird situations that are very unlike your day-to-day reality. It's an interesting idea. I do think that when you're generating things in your head and then you're attending to it, you're training on your own samples. You're training on your synthetic data. If you do it for too long, you go off the rails and you collapse too much. You always have to seek entropy in your life. Talking to other people is a great source of entropy. Maybe the brain has also built some internal mechanisms for increasing the amount of entropy in that process. This is a very ill-formed thought. I'll just put it out and let you react to it.

学习速率下降，崩溃则进一步恶化，然后一切都会恶化。你是否在一篇非常有趣的论文中见过这种观点：做梦是一种防止崩溃过程中过拟合的方法。做梦之所以具备进化上的适应性，是因为它让你身处一些非常不可能发生或日常生活中不会经历的奇怪情境中。这是个有趣的想法。我确实认为，当你在头脑中创造一些东西并关注它时，你实际上是在用自己的样本进行训练，相当于在用合成数据训练自己。如果你这样做太久，你会偏离正轨，崩溃得更严重。你总是需要在生活中寻找变化，与他人交谈是一个提供变化的绝佳途径。也许大脑也建立了一些内部机制来增加这个过程中的变化量。这只是一个不是很成熟的想法，我就这么说出来，看看你的反应。

The best learners that we are aware of, which are children, are extremely bad at recollecting information. In fact, at the very earliest stages of childhood, you will forget everything. You're just an amnesiac about everything that happens before a certain age. But you're extremely good at picking up new languages and learning from the world. And maybe there's some element of being able to see the forest for the trees. Whereas if you compare it to the opposite end of the spectrum, you have LLM pre-training, where these models will literally be able to regurgitate word for word what the next thing is in a Wikipedia page. But their ability to learn abstract concepts really quickly in the way a child can is much more limited. And then adults are somewhere in between, where they don't have the flexibility of childhood learning, but adults can memorize facts and information in a way that is hard for kids.

我们知道，最优秀的学习者是儿童，但他们在回忆信息方面非常糟糕。事实上，在儿童的最早阶段，你会忘记所有事情。对于某个年份之前发生的所有事情，你就像个失忆者。但在学习新语言和吸收世界知识方面，儿童非常出色。也许这反映了“见林不见树”的能力。然而，如果你将此与大语言模型（LLM）的预训练进行对比，这些模型能够逐字复述信息，就像我们从PDF页面中提取内容一样。但它们在快速学习抽象概念方面的能力远不如孩子。而成人则介于两者之间，他们没有儿童学习的灵活性，但成人能够以儿童难以做到的方式记忆事实和信息。

I don't know if there's something interesting about that. I think there's something very interesting. I do think that humans, compared to LLMs, have a lot more of an element of seeing the forest for the trees. And we're not actually that good at memorization, which is actually a feature. Because we're not that good at memorization, we're forced to find patterns in a more general sense. I think LLMs in comparison are extremely good at memorization. They will recite passages from all these training sources. You can give them completely nonsensical data. Like you can hash some amount of text or something like that, and you get a completely random sequence. If you train on it, even just, I think, a single iteration or two, it can suddenly regurgitate the entire thing.

我不知道这件事是否有什么有趣之处。但我确实觉得有一些非常有趣的点。我认为人类相比大型语言模型（LLM）来说，其实更能做到“见树又见林”。我们实际上并不擅长记忆，这实际上是一种特性。因为我们不擅长记忆，所以我们被迫去寻找模式，从而形成一种巨大的理解能力。我认为大型语言模型在记忆方面表现非常出色。它们会背诵来自训练资料中的段落。即使你给它们完全无意义的数据，比如随机生成的一段文本，如果用这些数据进行训练，甚至只需要一两次训练，它们也能把整段内容重新复述出来。

It'll memorize it. There's no way a person can read a single sequence of random numbers and recite it to you. And that's a feature, not a bug, almost, because it forces you to only learn the generalizable components, whereas LLMs are distracted by all the memory that they have of the pre-training documents. And it's probably very distracting to them in a certain sense. So that's why when I talk about the cognitive core, I actually want to remove the memory, which is what we talked about. I'd love to have them have less memory so that they have to look things up. And they only maintain the algorithms for thought, and the idea of an experiment, and all this cognitive glue of acting.

你会记住的。一个人不可能只看一串随机数字就能背出来。而这几乎是一种优点而不是缺陷，因为这迫使你只学习那些可以推广的部分，而大型语言模型（LLM）却被它们记忆中预先训练的文档所分散注意力。在某种意义上，这对它们来说可能非常干扰。所以，当我谈论“认知核心”时，我其实想去除记忆，这就是我们之前讨论的。我希望它们减少记忆，以便需要查阅信息。它们只保留用于思考和实验理念等认知操作的算法，就像执行任务的认知“粘合剂”。

And this is also relevant to preventing model collapse? Let me think. I'm not sure. I think it's almost like a separate axis. It's almost like the models are way too good at memorization and somehow we should remove that. And I think people are much worse, but it's a good thing. What is a solution to model collapse? I mean, there are very naive things you could attempt, like the distribution over logits should be wider or something. There are many naive things you could try. What ends up being the problem with the naive approaches?

这也与防止Marlowe的崩溃有关。让我想想。我不太确定。我觉得这几乎像是一个独立的维度。模型在记忆方面太强了，我们应该以某种方式减弱这一点。而我认为人类在这方面要差很多，但这反而是件好事。究竟如何解决模型崩溃问题呢？有一些非常简单的方法可以尝试，比如让大型语言模型（LLM）的分布更广泛等等。你能尝试很多简单的方法。但这些简单方法最终会面临什么问题呢？

Yeah, I think that's a great question. I mean, you can imagine having a regularization for entropy and things like that. I guess they just don't work as well empirically, because right now the models are collapsed. But I will say most of the tasks that we want of them don't actually demand the diversity. That's probably the answer to what's going on. And so it's just that the frontier labs are trying to make the models useful. And I kind of just feel like the diversity of the outputs is not so much a priority. Number one, it's much harder to work with and evaluate and all this kind of stuff.

是的，我觉得这是个很好的问题。就是说，你可以想象对熵等进行正则化。我认为这是因为在实际操作中效果不太好，因为现在模型有点过于集中化。但我想说的是，我们对它们的多数任务实际上并不需要多样性。这可能就是问题的根源。因此，前沿实验室试图让这些模型变得更为有用。我觉得输出的多样性并不是首要任务，因为多样化会使处理和评估变得更加困难。

But maybe it's not what's actually capturing most of the value. It's actually penalized. If you're super creative in RL, it's not good. Or maybe if you're doing a lot of writing with help from LLMs and stuff like that, I think it's probably bad, because the models will silently give you all the same stuff. So they won't explore lots of different ways of answering a question. But I kind of feel like maybe not as many applications need this diversity, so the models don't have it. But then it's actually a problem at synthetic data generation time, et cetera. So we're actually shooting ourselves in the foot by not allowing this entropy to be maintained in the model. And I think possibly the labs should try harder.

但可能这并不是实际捕捉到大多数价值的东西。事实上，这部分价值被限制了。如果你在强化学习（RL）中超级有创造力，这并不是好事。或者如果你写了很多东西，依赖大型语言模型（LLM）之类的帮助，我认为这可能不好，因为这些模型会给你提供几乎相同的内容。它们不会去探索多种不同的问题解答方式。但我感觉可能是多样性并没有那么重要，可能很多应用并不需要它，所以模型也没有体现这一点。但这其实是个问题，特别是在生成时。因此，我们实际上是在自毁前程，因为没有让这种随机性在模型中保持下去。我认为实验室可能需要在这方面更加努力。

And then I think you hinted that it's a very fundamental problem. It won't be easy to solve. And what's your intuition for that? I don't actually know if it's super fundamental. I don't actually know if I intended to say that. I haven't done these experiments, but I do think that you could probably regularize the entropy to be higher. So you're encouraging the model to give you more and more solutions. But you don't want it to start deviating too much from the training data. It's going to start making up its own language. It's going to start using words that are extremely rare. You know, so it's going to drift too much from the distribution. So I think controlling the distribution is just tricky. It's probably not trivial in that sense.

然后，我觉得你暗示这是一个非常基本的问题，不容易解决。你是怎么看的？我其实不确定这是否真的很基本，也不确定我是否有意这样说。我确实认为我还没有做这些实验。但我认为你可以通过调节熵的方式，让它变得更高。这样可以鼓励模型提供更多解决方案。但你不希望它过多地偏离训练数据，否则它可能会开始创造自己的语言，使用非常罕见的词汇，太偏离原来的分布。所以，我觉得控制分布是一件复杂的事情，不是那么简单。
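
The tension described above, pushing entropy up without drifting away from the training distribution, is often written as an objective with two opposing terms. The sketch below is one common way to express that idea and is an assumption of this example, not something the speaker specifies: an entropy bonus on the policy's output distribution, plus a KL penalty toward a fixed reference distribution. The coefficient names and the toy distributions are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)


def regularized_objective(task_reward, policy, reference, entropy_coef, kl_coef):
    """Reward plus an entropy bonus, minus a penalty for drifting from the reference."""
    return (task_reward
            + entropy_coef * entropy(policy)
            - kl_coef * kl_divergence(policy, reference))


reference = [0.4, 0.3, 0.2, 0.1]      # toy stand-in for the training distribution
collapsed = [0.97, 0.01, 0.01, 0.01]  # almost always the same output
uniform = [0.25, 0.25, 0.25, 0.25]    # maximal entropy, but drifts from the reference

for name, policy in [("collapsed", collapsed), ("uniform", uniform), ("reference", reference)]:
    print(name, regularized_objective(1.0, policy, reference, entropy_coef=0.1, kl_coef=0.5))
```

How to set the two coefficients is the tricky part the passage points at: too much entropy bonus and the model drifts toward rare words, too much KL penalty and it stays collapsed.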

How many bits should the optimal core of intelligence end up being, if you just had to make a guess? The thing we put on the von Neumann probes. How big does it have to be? So it's really interesting in the history of the field, because at one point everything was very scaling-pilled in terms of like, oh, we're going to make much bigger models, trillions-of-parameters models. And actually what the models have done in size is they've gone up, and now they've actually even come down. The models are smaller. And even then I actually think they memorize way too much. So I think I had a prediction a while back that I almost feel like we can get cognitive cores

如果让你猜测，一个理想的核心智能需要多少比特才能实现？我们放在一台机器上的智能核心，究竟要多大？在这一领域的发展历史中，这是一个非常有趣的问题。曾经有一段时间，人们对扩大模型规模非常热衷，比如构建拥有数万亿参数的大型模型。实际上，这些模型的规模确实上升到了很大的程度，但是之后又开始缩小了。即便如此，我还认为这些模型记忆的信息太多了。因此，我曾经有一个预测，我几乎觉得我们能够获得认知核心。

that are very good at even like a billion parameters. Like if you talk to a billion-parameter model, I think in 20 years, you can actually have a very productive conversation. It thinks, and it's a lot more like a human. But if you ask it some factual question, it might have to look it up. But it knows that it doesn't know and it might have to look it up, and it will just do all the reasonable things. That's actually surprising that you think it will take a billion, because already we have billion-parameter models or a couple-billion-parameter models that are very intelligent. Well, some of our models are like a trillion parameters, right? But they remember so much stuff.

即使有数十亿参数的模型，它们也表现得非常出色。如果你与一个拥有十亿参数的模型对话，我认为在未来20年内，你可以进行非常有成效的对话。它的思考方式更接近于人类。然而，如果你问它一些事实性的问题，它可能需要查询一下。它知道自己不知道，并可能需要查阅相关信息，并会做出所有合理的事情。令人惊讶的是，你认为需要十亿参数，因为我们已经有十亿或几十亿参数的模型，它们已经很智能了。而且，还有参数达到一万亿的模型，对很多信息的记忆非常强大。

Yeah. But I'm surprised that in 10 years, given the pace... okay, we have gpt-oss-20b, that's way better than the original GPT-4, which was a trillion-plus parameters. So given that trend, I'm actually surprised you think in 10 years the cognitive core is still a billion parameters. I'm surprised you're not like, it's going to be tens of millions or millions. No, because I basically think that the training data is, so here's the issue, the training data is the internet, which is really terrible.

好的。不过让我感到惊讶的是，考虑到技术发展的速度，我们用了大约10年的时间就开发出了一个名为GPT OSS 20B的模型，它比最初拥有超过一万亿参数的GPT-4要优秀得多。因此，考虑到这个趋势，我很惊讶你认为再过10年，核心认知模型的参数数量仍然会保持在十亿级别。我原本以为你可能会觉得参数数量会下降到几千万甚至几百万。不过，我的观点是训练数据本身存在问题，因为它主要来源于互联网，而互联网上的数据质量确实很差。

So there's a huge amount of gains to be made because the internet is terrible. And even the internet, when you and I think of the internet, you're thinking of like, oh, the Wall Street Journal. That's not what this is. When you're actually looking at a pre-training dataset in a frontier lab, and you look at a random internet document, it's total garbage. Like I don't even know how this works at all. It's some stock ticker symbols. It's a huge amount of slop and garbage from all the corners of the internet. It's not like your Wall Street Journal article. That's extremely rare.

互联网的问题很多，因此改进的空间也很大。当我们谈到互联网时，可能会想到《华尔街日报》这样的内容，但实际上情况并非如此。当你在研究数据集时，看到的随机互联网文档往往是非常混乱的。我甚至不知道这些内容是如何运作的，里面充斥着股票代码以及来自互联网各个角落的杂乱信息。这些内容和《华尔街日报》那种高质量的文章是截然不同的，后者是非常少见的。

So I almost feel like because the internet is so terrible, we actually have to build really big models to compress all that. Most of that compression is memory work instead of cognitive work. But what we really want is the cognitive part; delete the memory. So I guess what I'm saying is, we need intelligent models to help us refine even the pre-training set, to just narrow it down to the cognitive components. And then I think you get away with a much smaller model, because this is a much better dataset and you could train it on it.

我几乎觉得，由于互联网的信息非常混乱，我们实际上需要构建非常庞大的模型来压缩这些信息。其中大部分压缩都是记忆工作，而不是认知工作。但我们真正想要的是认知部分，实际上可以删除那些记忆。因此，我的意思是，我们需要智能模型来帮助我们优化预训练数据集，只专注于认知成分。这样一来，我们可以用一个更小的模型来处理，因为这会是一个更优质的数据集，并且可以用它来进行训练。

But probably it's not trained directly on it. It's probably distilled from a much better model still. But why is the distilled version still a billion? That is, I guess, the thing I'm curious about. I just feel like distillation works extremely well. So almost every small model, if you have a small model, it's almost certainly distilled. Why would you train on? Right, no, no, no, why is the distillation in 10 years not getting below one billion? Oh, you think it should be smaller? Yeah. Than a billion? I mean, come on, right? At some point, it should take at least a billion knobs to do something interesting. You think it should be even smaller? Yeah, I mean, if you look at the trend over the last few years, just finding low-hanging fruit and going from trillion-plus models to models that are literally two orders of magnitude smaller.

但可能它并不是直接在这个数据上训练的。可能是通过一个更好的模型进行蒸馏得到的结果。但是为什么这个蒸馏版的参数量仍然有十亿呢？这是我好奇的地方。我觉得这个情况效果非常好。所以几乎每个小模型，如果你有一个小模型，它几乎肯定是经过蒸馏的。那你会选择训练什么呢？对，为什么蒸馏版本在十年后还没有小于十亿呢？哦，你认为它应该更小吗？是的，我是这么认为的。至少在某个时候，应该需要至少十亿个参数才能做一些有趣的事情。你觉得它应该更小？是的，因为如果你看看过去几年的趋势，找到一些容易实现的目标，从上万亿级别的模型变成小两个数量级的模型。

In a matter of two years and having better performance. Yeah, yeah. It means you think that the sort of core of intelligence might be even way, way smaller. Like plenty of room at the bottom, to paraphrase Feynman. I mean, I almost feel like I'm already contrarian by talking about a billion-parameter cognitive core, and you're outdoing me. I think, yeah, maybe we could get a little bit smaller. I mean, I still think that there should be enough... yeah, maybe it can be smaller. I do think that practically speaking, you want the model to have some knowledge. You don't want it to be looking up everything. Yeah. Because then you can't think in your head. You're looking up way too much stuff all the time. So I do think some basic curriculum needs to be there for knowledge. But it doesn't have esoteric knowledge.

在短短两年内取得更好的表现。是的，是的。这意味着你认为智能的核心可能实际上要小得多。就像在底层还有很多空间来配对首批优秀人才。我的意思是，我几乎觉得谈论一个拥有十亿参数的认知核心已经是逆向思维，而你却超越了我。我想，嗯，也许我们可以把它做得更小。我仍然认为从实际的角度来看，模型应该具备一些知识，而不是事事都需要查找。因为这样在思考时就会过于依赖查找信息。所以我认为模型需要具备一些基础的知识课程，但不一定要具备某种特定的知识。
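
As a rough check on the "two orders of magnitude" figure in this exchange, here is a minimal Python sketch. The parameter counts are the round numbers quoted in the conversation (the "trillion plus" size of the original GPT-4 has not been officially confirmed), and the function and constant names are illustrative assumptions, not anything from the speakers.

```python
import math

def orders_of_magnitude(larger: float, smaller: float) -> float:
    """Return log10 of the size ratio between two parameter counts."""
    return math.log10(larger / smaller)

# Round figures quoted in the conversation (assumptions, not confirmed specs).
GPT4_ORIGINAL_PARAMS = 1e12   # "a trillion plus"
GPT_OSS_20B_PARAMS = 20e9     # 20 billion
COGNITIVE_CORE_PARAMS = 1e9   # the proposed billion-parameter cognitive core
TENS_OF_MILLIONS = 1e7        # the interviewer's counter-suggestion

# Shrinkage already observed between the two quoted models.
shrink_so_far = orders_of_magnitude(GPT4_ORIGINAL_PARAMS, GPT_OSS_20B_PARAMS)
# Gap between the proposed core and the "tens of millions" suggestion.
core_vs_counter = orders_of_magnitude(COGNITIVE_CORE_PARAMS, TENS_OF_MILLIONS)

print(f"trillion-plus -> 20B: {shrink_so_far:.2f} orders of magnitude")
print(f"1B -> 10M: {core_vs_counter:.2f} orders of magnitude")
```

Under these round numbers the observed shrinkage is a factor of 50, a bit under two full orders of magnitude, while the interviewer's suggestion would need the cognitive core to shrink by a further factor of 100.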

Yeah. So we're discussing what plausibly could be the cognitive core. There's a separate question, which is, what will actually be the size of frontier models over time? And I'm curious about your prediction. So we had increasing scale up to maybe GPT-4.5, and now we're seeing decreasing slash plateauing scale. There are many reasons that could be going on. But do you have a prediction going forward: will the biggest models be bigger, will they be smaller, will they be the same? Yeah, I don't know that I have a super strong prediction. I do think that the labs are just being practical. They have a flops budget and a cost budget. And it just turns out that pre-training is not where you want to put most of your flops or your cost. So that's why the models have gotten smaller: they are a bit smaller, the pre-training stage is smaller, etc.

好的，我们在讨论什么可能是认知核心。这是一个话题，但还有另一个问题，那就是家具模型的规模在未来会如何变化？这是我的预测。我们之前看到模型规模不断增长，可能增长到4.5，但现在看到规模在缩小或停滞。这背后有很多原因。而对未来的另一个预测是：规模更大的模型会继续变大吗？还是会变小或保持不变呢？我不太敢下一个非常确定的预测。我认为各实验室只是很实际，他们有浮点运算预算和成本预算。结果发现，前期训练（Pre-shaning）阶段并不是你想投入大部分计算能力和花费的地方。这也是为什么模型变小了，因为前期训练阶段缩小了等等。

But they make it up in the reinforcement learning, mid-training, and all this kind of stuff that follows. So they're just being practical in terms of all the stages and how you get the most bang for the buck. So I guess forecasting that trend is quite hard. I do still expect that there's so much low-hanging fruit. That's my basic expectation. And so I have a very wide distribution here. Do you expect the low-hanging fruit to be similar in kind to the kinds of things that have been happening over the last two to five years? Like, if I look at nanochat versus nanoGPT and the architectural tweaks you made, is that basically the flavor of things you expect to keep happening, or is there...

但是，他们在强化学习和所有这些中期训练等方面弥补了这一点，所以他们只是根据所有阶段的实际情况来最大化投资回报。因此，我认为预测这种趋势非常困难。我仍然期望大家对此有很大的期待。这是我基本的预期。因此，在这方面，我持有很广泛的看法。你说他们希望这类似于过去两到五年中发生的事情吗？比如，当我查看纳米聊天与能量PT以及你所做的架构调整时，这基本上就是你希望继续发生的事情，还是有其他方向？

You're not expecting any giant paradigm shifts? I expect the datasets to get much, much better, because when you look at the average datasets, they're extremely terrible. Like so bad that I don't even know how anything works, to be honest. Look at the average example in the training set: factual mistakes, errors, nonsensical things. Somehow when you do it at scale, the noise washes away and you're left with some of the signal. So datasets will improve a ton. It's just that everything gets better. So will hardware and all the kernels: all the kernels for running the hardware and maximizing what you get with the hardware. So Nvidia is slowly tuning the actual hardware itself, tensor cores and so on. All that needs to happen.

你不需要期待有巨大的家具。我认为数据会大大改善，因为如果你查看平均数据，它们是非常糟糕的。不夸张地说，糟糕到我都不知道是怎么运行的。看看训练集中的平均例子吧，充满了事实错误、各种错误和没有意义的内容。不知怎的，当你大规模操作时，这些噪音会被过滤掉，剩下的是有效信息。因此，数据集将会有很大提升，一切都会变得更好。而我们的硬件，包括运行硬件的所有内核，都在不断优化，以最大化硬件的性能。像Nvidia这样的公司一直在慢慢调整他们的硬件，如张量核心等等。这些改进都是必不可少的。

And will continue to happen. All the kernels will get better and utilize the chip to the max extent. All the algorithms will probably improve over optimization, architecture, and just all of the modeling components of how everything is done and what the algorithms are that we're even training with. So I do kind of expect that nothing dominates: everything plus 20%. This is roughly what I've seen. Okay, this is my general manager, Max. Good to be here every day. And you have been here since you were onboarded about six months ago. When I was a month ago. All right, time passes so fast. But when I onboarded you, I was in France. And so we basically didn't get the chance to talk at all, almost. And you basically just gave me one login. I gave you access to my Mercury platform, which is the banking platform that I was using at the time to run the podcast.

我们将继续发展。所有内核都会变得更好，并充分利用芯片。所有算法可能会通过优化架构得到改进，以及模型组件的所有方面——包括我们用来训练的算法。因此，我确实期待着一个一切都有所提升，没有任何一个领域主导一切，并且整体提升了大约20%吧。这大概是我所见到的情况。好的，这是我的总经理，马克斯。每天见到你都很高兴。自六个月前你加入以来，你一直在这里。当我一个月前的时候，时间过得真快。但是当我把你介绍进来时，我人在法国，所以我们几乎没有机会交谈。你基本上只给了我一个登录，我给你开通了我当时用来运行播客的Mercury平台（一个银行平台）的访问权限。

And so I logged into Mercury assuming that that would just be the first of many steps. But I realized that was how you were running the entire business. Even down to payroll: a lot of our editors are international contractors, so you would just figure out how to set up these recurring payments to set up basic payroll. I mean, Mercury made the experience of all of these things I was doing before so seamless that it didn't even occur to me, until he pointed it out, that this is not the natural way to set up payroll or invoicing or any of these other things. Yeah, I was surprised, but I was like, it's worked so far. That's right. So maybe I'll trust it. And now I can't think of doing anything else. All right, you heard them. Visit mercury.com to apply online in minutes. Cool. Thanks, Max. Thanks for having me. Dude, you're great at this. I'm so nervous. But thank you. Mercury is a financial technology company, not a bank. Banking services provided through Choice Financial Group, Column N.A., and Evolve Bank & Trust, Members FDIC.

于是，我登录了Mercury，假设这只是众多步骤中的第一步。但我意识到，这就是你运营整个业务的方式。甚至我们的许多编辑都是国际合同工。因此你会设法安排这些定期付款以建立基本的薪酬体系。我的意思是，Mercury让这些我以前需要做的事情变得如此顺畅，以至于没人提醒我这是不自然的工资或开具发票等操作方式时，我都没意识到。我很惊讶，但同时觉得它到目前为止运作顺利，那我也许可以信任它。现在我无法想象用其他任何方式做这些事情了。好的，大家都听到了。访问mercury.com，几分钟内在线申请。太好了，谢谢Max。感谢你请我来。伙计，你真擅长这个。我很紧张，但谢谢你。Mercury是一家金融科技公司，而不是银行。银行服务由Choice Financial Group、Coala Money和Evolve Bank Trust提供，均为FDIC成员。

People have proposed different ways of charting how much progress you've made towards full AGI. Because if you come up with some line, then you can see where that line intersects with AGI and where that would happen on the X-axis. And so people have proposed, oh, it's like the education level. So if you had a high schooler and then they went to college with RL and they're going to get a PhD. Yeah, I don't like that one. Or then they propose horizon length. So maybe they can do tasks that take a minute. They can do those autonomously. Then they can autonomously do tasks that take an hour, a human an hour, a human a week, et cetera. How do you think about what is the relevant Y-axis here? How should we think about how AGI is making progress? So I guess I have two answers to that.

人们提出了不同的方法来描绘你在实现完全通用人工智能（AGI）过程中取得了多少进展。因为如果你能划出一条线，就可以看到这条线与AGI相交的地方，以及它在X轴上的位置。有些人提议用教育水平来代表这个进展，比如让高中生接触强化学习（RL），然后像上大学一样，最后获得博士学位。我不太喜欢这种方法。还有人建议用“视野长度”来衡量，也就是说，也许现在人工智能可以自主完成需要一分钟的任务，然后逐渐到能够自主完成需要一小时、一天甚至一周的人类任务。那么我们该如何看待在这个过程中做哪个Y轴是相关的？我们应该如何看待AGI的进展呢？我想我对此有两个答案。

Number one, I'm almost tempted to reject the question entirely. Because again, I see this as an extension of computing. Have we talked about how to chart progress in computing? Or how do you chart progress in computing since the 1970s or whatever? What is the X-axis? So I kind of feel like the whole question is a little bit funny from that perspective. But I will say, when people talk about AI and the original AGI, and how we spoke about it when OpenAI started: AGI was a system you can go to that can do any economically valuable task at human performance or better. So that was the definition, and I was pretty happy with that at the time.

首先，我几乎想完全拒绝这个问题。因为在我看来，这可以被视为计算领域的一个延伸。我们讨论过如何衡量计算领域的进展吗？或者你从1970年代以来是如何衡量计算进步的？X轴该怎么设定？从这个角度看，我觉得整个问题都有点滑稽。不过我想说，关于AGI（人工通用智能）以及最初我们讨论AGI时的情景，人们提到AGI时是指一个系统，它可以完成任何具有经济价值的任务，并达到人类的表现水平。这是当时的定义，而我当时对此感到满意。

And I kind of feel like I've stuck to that definition forever. And then people have made up all kinds of other definitions. But I feel like I like that definition. Now, the first concession that people make all the time is they just take out all the physical stuff, because we're just talking about digital knowledge work. I feel like that's a pretty major concession compared to the original definition, which was any task a human can do. I can lift things, et cetera. AI can't do that, obviously. So, okay, but we'll take it. What fraction of the economy are we taking away by saying, oh, only knowledge work? I don't actually know the numbers. I feel like it's about 10 to 20 percent, if I had to guess, that is only knowledge work. Like someone could work from home and perform tasks, something like that.

我感觉我一直坚持那个定义，然后人们又创造了各种其他定义。但我喜欢那个定义。首先，人们常常做出的第一个让步就是，把所有的体力活动都排除在外，因为我们只在讨论数字知识工作。我觉得这相比于原来的定义——即人类可以完成的任何任务——是一个相当重大的让步，比如搬东西之类的，显然AGI（人工通用智能）无法做到这些。所以，好吧，但我们接受这个让步。那么，当我们说只考虑知识工作时，我们排除了经济中的多少部分呢？我其实不知道确切的数字。如果让我猜的话，我觉得只有10%到20%的工作可能是知识工作，比如可以在家远程工作的任务之类。

I still think it's a really large market. Like, yeah, what is the size of the economy and what is 10 to 20 percent? We're still talking about a few trillion dollars, even in the US, of market share or work. So still a very massive bucket. But going back to the definition, I guess what I would be looking for is to what extent that definition is true. So are there jobs, or lots of tasks, if we think of tasks and not jobs? It's kind of difficult, because the problem is that society will refactor the tasks that make up jobs based on what's automatable or not. But today, what jobs are replaceable by AI?

我仍然认为这是一个非常庞大的市场。例如，经济的规模到底有多大，而10%到20%的市场份额又是多少？即便在美国，这也涉及到数万亿美元的市场份额或工作。因此，这依然是个非常巨大的领域。不过，我想回到定义上，我会关注这种定义在多大程度上是正确的。我们讨论的任务并非指工作，而是指具体的任务，而这可能比较困难。因为社会将根据构成工作的任务来进行调整，这取决于哪些任务是可以自动化的。目前，究竟有哪些工作是可以被人工智能取代的呢？
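
To make the "few trillion dollars" estimate concrete, here is a small Python sketch of the arithmetic. The 10 to 20 percent share is the speaker's own guess, and the US GDP figure of roughly 29 trillion dollars per year is an assumed round number introduced here for illustration; neither is a measured statistic from the conversation.

```python
# Assumed round figure for annual US GDP, in trillions of USD (not from the source).
US_GDP_TRILLIONS = 29.0

def knowledge_work_range(gdp_trillions: float,
                         low_share: float = 0.10,
                         high_share: float = 0.20) -> tuple[float, float]:
    """Return the (low, high) value of the knowledge-work slice, in trillions of USD.

    The default shares are the speaker's guessed 10 to 20 percent.
    """
    return gdp_trillions * low_share, gdp_trillions * high_share

low, high = knowledge_work_range(US_GDP_TRILLIONS)
print(f"Knowledge work: about {low:.1f} to {high:.1f} trillion USD per year")
```

With these assumed inputs the slice comes to roughly three to six trillion dollars a year, which is consistent with the "few trillion dollars" described in the conversation.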

So a good example recently was Geoff Hinton's prediction that radiologists would not be a job anymore. And this turned out to be very wrong in a bunch of ways, right? Radiologists are alive and well and growing, even though computer vision is really, really good at recognizing all the different things that they have to recognize in images. It's just a messy, complicated job with a lot of surfaces and dealing with patients and all this kind of stuff in the context of it. So I guess I don't actually know that, by that definition, AI has made a huge dent yet.

最近有一个很好的例子是Jeff Hinton的预测，他曾说放射科医生这个职业会消失。然而，这个预测被证明在许多方面是错误的。事实上，放射科医生依然活跃，并且需求在增长。虽然计算机视觉技术在识别图像中的各种事物方面表现得异常出色，但放射科医生的工作复杂且多面，包括处理病人等多个方面。因此，我认为按照这种定义，人工智能在这一领域还没有产生很大的影响。

But some of the jobs maybe that I would be looking for have some features that I think make it very amenable to automation earlier than later. As an example, call center employees often come up and I think rightly so. Because call center employees have a number of simplifying properties with respect to what's automatable today. Their jobs are pretty simple. It's a sequence of tasks and every task looks similar. Like you take a phone call with a person, it's 10 minutes of interaction or whatever it is, probably a bit longer. In my experience a lot longer.

但我可能会寻找的一些工作可能具备某些特征，这些特征使得它们更适合更早地实现自动化。举个例子，呼叫中心的员工经常被提到，我认为这是有道理的。因为从今天可自动化的角度来看，呼叫中心员工的工作拥有许多简化的特性。他们的工作相对简单，是一系列的任务，而且每个任务看起来都很相似。比如，与某人进行电话交流，无论是10分钟的互动，还是稍长一些，根据我的经验，通常会更长一些。

And you complete some task in some scheme, and you change some database entries around or something like that. So you keep repeating something over and over again, and that's your job. So basically you do want to bring in the task horizon, how long it takes to perform a task. And then you also want to remove context: you're not dealing with different parts of services of companies or other customers. It's just the database, you, and a person you're serving. And so it's more closed, it's more understandable, and it's purely digital.

你完成了一些任务和计划，并更改了一些数据库条目。你不断地重复这些操作，这就是你的工作。因此，你需要考虑任务的时间范围，也就是完成一个任务所需的时间。另外，你不需要处理公司、其他客户或服务的不同部分，只是面对数据库和你服务的对象。所以这个工作环境更加封闭、易于理解，而且完全是数字化的。

So I would be looking for those things. But even there, I'm not actually looking at full automation yet. I'm looking for an autonomy slider, and I almost expect that we are not going to instantly replace people. We're going to be swapping in AIs that do 80% of the volume. They delegate 20% of the volume to humans. And humans are supervising teams of five AIs doing the call center work that's more rote. So I would be looking for new interfaces or new companies that provide some kind of a layer that allows you to manage some of these AIs that are not yet perfect.

所以我会寻找这些东西。不过即便如此，我并不打算完全实现自动化。我希望能有一个自主性调节器，我几乎可以预见我们不会马上取代人力。我们可能会引入人工智能来承担80%的工作量，并将20%的工作交给人类完成。而人类则负责监督五个人工智能团队处理更加死板的客户服务工作。因此，我会寻找那些提供新界面或新的公司，它们能够提供工具帮助管理这些人工智能。因为这些技术还不够完美。
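
The "autonomy slider" described here can be sketched as a simple routing rule. This is a minimal Python illustration under stated assumptions, not a description of any real call center system: the `Call` type, the per-call `ai_confidence` score, and both function names are hypothetical, and routing by confidence rank is just one possible way to pick which calls go to humans.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Call:
    call_id: int
    ai_confidence: float  # hypothetical score in [0, 1]; higher means more routine

def route_calls(calls: list[Call], autonomy: float) -> tuple[list[Call], list[Call]]:
    """Split calls into (handled_by_ai, delegated_to_humans).

    `autonomy` is the slider: the fraction of call volume the AI handles.
    The AI keeps the calls it is most confident about; the rest are delegated.
    """
    if not 0.0 <= autonomy <= 1.0:
        raise ValueError("autonomy must be between 0 and 1")
    ranked = sorted(calls, key=lambda c: c.ai_confidence, reverse=True)
    n_ai = round(autonomy * len(ranked))
    return ranked[:n_ai], ranked[n_ai:]

def supervisors_needed(num_ai_agents: int, team_size: int = 5) -> int:
    """Humans needed if each one supervises a team of `team_size` AIs."""
    return math.ceil(num_ai_agents / team_size)

# Ten example calls with made-up confidence scores.
scores = [0.99, 0.95, 0.40, 0.90, 0.85, 0.97, 0.30, 0.92, 0.88, 0.91]
calls = [Call(i, s) for i, s in enumerate(scores)]

ai_calls, human_calls = route_calls(calls, autonomy=0.8)
print(len(ai_calls), "calls handled by AI;", len(human_calls), "delegated to humans")
print("Delegated call ids:", sorted(c.call_id for c in human_calls))
print("Supervisors for 20 AI agents:", supervisors_needed(20))
```

Moving `autonomy` toward 1.0 corresponds to sliding toward full automation; the 0.8 setting and the team size of five mirror the 80/20 split and the teams of five AIs mentioned in the conversation.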

Yeah. And then I would expect that across the economy, and a lot of jobs are a lot harder than call center employees. I wonder with radiologists. I'm totally speculating. I have no idea what the actual workflow of a radiologist involves. But one analogy that might be applicable is when Waymos were first being rolled out, there would be a person sitting in the front seat. And you just had to have them there to make sure that if something went really wrong, they're there to monitor.

好的。那么我想整个经济中很多工作都比呼叫中心的员工要难。我在想放射科医生是怎样的呢？我只是完全在猜测，没有确切了解放射科医生实际的工作负荷。但是有一个类比可能适用，当自动驾驶刚开始被引入时，会有一个人坐在前排座位。这个人的职责就是确保如果出现严重问题时，他们可以进行监控和干预。

And I think even today, people are still watching to make sure things are going well. Robotaxi, which was just deployed, actually still has a person inside it. And we could be in a similar situation where if you automate 99% of a job, that last 1% the human has to do is incredibly valuable because it's bottlenecking everything else. And if it were the case, like with radiologists, that the person sitting in the front of the Uber or the front of the Waymo has to be specially trained for years in order to be able to provide the last 1%,

我认为即使在今天，人们仍然在关注事情是否进展顺利。刚刚投入使用的自动驾驶出租车（Robotaxi）实际上仍然有个人坐在里面。我们可能会面临类似的情况，即使工作实现了99%的自动化，剩下的1%仍需要由人来完成，而这个1%非常重要，因为它是其他所有工作的瓶颈。这种情况就像放射科医生一样，其中坐在优步(Uber)或Waymo前座的人需要经过多年的特殊培训才能提供那最后的1%。

their wages should go up tremendously, because they're the one thing bottlenecking wide deployment. So with radiologists, I think the wages have gone up for similar reasons. If you're the last bottleneck and you're not fungible, which, you know, a Waymo driver might be fungible with other things. So you might see this thing where your wages go up until you get to 99%, and then fall just like that when the last 1% is gone.

他们的工资应该大幅上涨，因为他们是广泛部署的瓶颈因素之一。所以，我认为放射科医生的工资也因为类似原因上涨了。如果你是最后一个瓶颈，而且你不是可以被替代的，比如Waymo的司机可能会被其他替代，那么你的工资可能会经历一个这样的过程：先大幅上涨到90%，然后突然一下子到最后的1%。

I see. And I wonder if we're seeing similar things with radiology or salaries of call center workers or anything like that. Yeah, I think that's an interesting question. I don't think we're currently seeing that with radiology, in my understanding, but I think radiology is not a good example, basically. I don't know why Geoff Hinton picked on radiology, because I think it's an extremely messy, complicated profession.

我明白了。我在想，我们是否也在放射学、呼叫中心员工的薪资等方面看到类似的情况。我认为这是一个有趣的问题。目前，我认为放射学没有出现这样的情况，我个人也不是很了解，但我觉得放射学并不是一个好的例子。我不明白杰夫·辛顿为什么选择放射学，因为我认为这是一个极其复杂且混乱的职业。

Yeah. So I would be a lot more interested in what's happening with call center employees today, for example, because I would expect a lot of the rote stuff to be automatable today. And I don't have first-level access to it, but maybe I would be looking for trends of what's happening with the call center employees. Maybe some of the things I would also expect is maybe they are swapping in AI, but then I would still wait for a year or two, because I would potentially expect them to pull back and actually rehire some of the people. I think there's been evidence that that's already been happening generally in companies that have been adopting AI, which I think is quite surprising. Yeah. And I also found something really surprising.

好的。我会对当今呼叫中心员工的动态更感兴趣，因为我预计很多重复性的工作在今天已经可以被自动化了。虽然我没有第一手的访问权限，但可能我会寻找有关呼叫中心员工的趋势。我也许会预期他们正在引入人工智能，但我可能会等待一两年，因为我觉得他们可能会回头重新雇用一些员工。我认为已经有证据表明这种情况已经在发生了。通常情况下，采用人工智能的公司竟然出现了这种情况，我觉得这很令人惊讶。而且我也发现这一点真的很让人意外。

Okay. AGI, right? Like a thing which should do everything. And okay, we'll take out physical work. So the thing should be able to do all knowledge work. And what you would have naively anticipated is that the way this progression would happen is, you take a little task that a consultant is doing, you take that out of the bucket, you take a little task that an accountant is doing, you take that out of the bucket, and then you're just doing this across all knowledge work. But instead, if we do believe we're on the path to AGI with the current paradigm, the progression is very much not like that. At least it just does not seem like consultants and accountants and whatever are getting huge productivity improvements.

好的，AGI，对吧？就像一个应该能做所有事情的东西。当然，我们排除体力劳动。所以这个东西应该能够做所有的知识工作。你可能会天真地预期，这种变革会是这样的：你把顾问做的一些小任务从他们的工作中拿走，再把会计做的一些小任务从他们的工作中拿走，然后对所有知识工作都这样做。但实际上，如果我们相信在当前的框架下我们正在走向AGI，那么进展并不是这样的。至少目前看起来，顾问、会计等人的生产力并没有得到大幅度提高。

It's very much like programmers are getting more and more chiseled away of their work. If you're looking at the revenues of these companies, discounting just normal chat revenue, which I think is like, I don't know, similar to Google or something, and just looking at API revenues, it's dominated by coding. So this thing, which is general, quote-unquote, which should be able to do any knowledge work, is just overwhelmingly doing only coding. And it's a surprising way that you would expect the AGI to be deployed. So I think there's an interesting point here, because I do believe coding is the perfect first thing for these LLMs and agents. And that's because coding has always fundamentally worked around text.

程序员们对他们的工作方式越来越感到寒意。如果你查看这些公司的收入，剔除掉像普通聊天收入这样的部分（我觉得这有点像谷歌等类似公司），单看API收入的话，你会发现编程占据了主导地位。因此，这种被称为“通用”的能力——所谓可以进行任何知识工作的能力，实际上却主要在进行编程工作。你可能会觉得，AGI的应用方式很出乎意料。我认为这里有个有趣的点，因为我确实相信编程是大型语言模型（LLMs）和智能体的最理想的起点。这是因为编程自始至终就是围绕文本展开的。

It's computer terminals and text, and everything is based around text. And LLMs, the way they're trained on the internet, love text. And so they're perfect text processors. And there's all this data out there, and it's just a perfect fit. And also we have a lot of infrastructure prebuilt for handling code and text. So for example, we have Visual Studio Code or your favorite IDE showing you code. And then an agent can plug into that. So for example, if an agent has a diff where it made some change, we suddenly have all this code already that shows all the differences to a code base using a diff. So it's almost like we've prebuilt a lot of the infrastructure for code.

这段话的意思是：计算机终端和文本是这类系统的核心，一切都围绕着文本展开。而大语言模型（LLMs）通过在互联网上训练，非常擅长处理文本。因此，它们是完美的文本处理器。外界有大量的数据，对于这些模型来说是很合适的。此外，我们已经预先建立了很多处理代码和文本的基础设施。例如，我们有像Visual Studio Code这样的集成开发环境（IDE），可以显示代码，然后代理可以与其对接。例如，当一个代理有一个代码差异（diff）并做出一些修改时，我们已经有完善的系统，能够利用这些差异展示代码库的变化。因此，这就像是我们已经为代码建立了大量的基础设施。

Now contrast that with some of the things that don't enjoy that at all. So as an example, there are people trying to build automation not for coding, but for example for slides. Like I saw a company doing slides. That's much, much harder. And the reason it's much, much harder is because slides are not text. Slides are little graphics, and they're arranged spatially. And there's a visual component to it. And slides don't have this prebuilt infrastructure. Like for example, if an agent is to make a change to your slides, how does a thing show you the diff? How do you see the diff? There's nothing that shows diffs for slides. So someone has to build it.

与此相对，一些事物根本没有这样便利的条件。例如，一些人试图为幻灯片而不是代码构建自动化。我看到有公司正在制作幻灯片软件。这要困难得多。原因在于幻灯片并不是文本，而是由小图形组成，并以空间方式排列，还有视觉元素。幻灯片没有预先构建的基础设施。例如，如果一个人工能应用于不同的修改到你的幻灯片上，如何显示不同之处？你如何看到差异？幻灯片没有任何显示差异的工具，所以有人需要去创建。

So it's just that some of these things are not amenable to AIs as they are, which is text processors, and code surprisingly is. I actually am not sure if that alone explains it, because I personally have tried to get LLMs to be useful in domains which are just pure language in, language out. Like rewriting transcripts, like coming up with clips based on transcripts, etc. And you might say, well, it's very plausible that I didn't do every single possible thing I could do. I put a bunch of good examples in context, but maybe I should have done some fine-tuning, whatever. So our mutual friend, Andy Matuschak, told me that he actually tried 50 billion things to try to get models to be good at writing spaced repetition prompts.

所以，问题在于有些事情并不适合现有的人工智能，因为它们主要是文本处理器，而编程代码居然是可以处理的。我其实也不确定这是否能完全解释这个问题，因为我个人尝试过让大型语言模型（LLLMs）在纯文本输入和输出的领域中发挥作用，比如改写转录稿、根据转录稿创建片段等。你可能会说，很可能我没有尝试所有的可能方法。我确实在上下文中放了一些好的例子，但也许我应该进行一些微调之类的。我们的共同朋友安迪·马图沙克告诉我，他实际上尝试了50亿种方法来让模型擅长编写间隔重复的提示。

Very much a language in, language out task. The kind of thing that should be dead center in the repertoire of these LLMs. And he tried in-context learning, obviously, with a few-shot examples. He tried, I think he told me, a bunch of things like supervised fine-tuning and retrieval, whatever. And he just could not get them to make cards to his satisfaction. So I find it striking that even in language out domains, it's actually very hard to get a lot of economic value out of these models separate from coding.

这段文字大意是：这是一项非常依赖语言输入和输出的任务，应该是这些大型语言模型（LLM）的核心能力之一。他尝试了使用上下文学习技术，并提供了一些简短的示例来进行尝试。我想他告诉我，他还尝试了很多其他方法，比如监督微调和信息检索等。不过，他始终无法达到他对卡片制作的满意要求。我觉得很惊讶的是，即便在仅涉及语言输出的领域，要真正从这些模型中获得显著的经济价值，实际上是非常难的，尤其当与编程无关时。

And I don't know what explains it. Yeah, I think that makes sense. I mean, I would say, yeah, I'm not saying that anything text is trivial. I do think that code is pretty structured. Text is maybe a lot more flowery, and there's a lot more entropy in text, I would say. I don't know how else to put it. And also, I mean, code is hard. And so people feel quite empowered by LLMs, even from simple kinds of knowledge.

我不知道该怎么解释这一点。是的，我觉得这有道理。我的意思是，我认为没有任何文本是琐碎的。我确实认为代码相对结构化，而文本可能更花哨、变化更多，可能更复杂。我不知道该怎么表达这一点。此外，代码本身也很难。因此，即使是简单的知识，人们也会因为大型语言模型（LLM）的帮助感到很有力量。

I basically don't actually know that I have a very good answer. I mean, obviously text makes it much, much easier, is maybe how I'd put it, but it doesn't mean that all text is trivial. How do you think about superintelligence? Do you expect it to feel qualitatively different from normal humans or human companies? I guess I see it as a progression of automation in society. And again, it's extrapolating the trend of computing. I just feel like there will be a gradual automation of a lot of things.

我基本上不知道我是否能给出一个很好的答案。我是说，显然，文字可能让事情变得简单得多，也许这就是我所想的，但这并不意味着所有文字都很简单。你怎么看超级智能？你认为它会与普通人或人类公司有本质上的不同吗？我觉得这就像是社会中自动化的一个进程。此外，这也符合计算发展的趋势。我只是觉得很多事情会逐渐被自动化。

And superintelligence will be sort of like the extrapolation of that. So I do think we expect more and more autonomous entities over time that are doing a lot of the digital work, and then eventually even the physical work, probably some amount of time later. But basically I see it as just automation, roughly speaking. I guess automation includes the things humans can already do, and superintelligence implies things humans can't do. Well, but some of the things that people do is invent new things, which I would just put into the automation, if that makes sense.

超级智能可以被视为这种趋势的延伸。因此，我确实认为，随着时间的推移，我们会看到越来越多的自主实体，它们将承担大量的数字工作，最终甚至会承担一些实体工作，可能会在某段时间后实现。但基本上，我把这一切看作是自动化，大致而言。我认为自动化包括人类已经能够做的事情，而超级智能则涵盖了人类擅长的事情。人类所做的一些事情是发明新事物，我认为这就是自动化的一部分，如果这样讲得通的话。

Yeah. But I guess maybe less abstractly and more qualitatively: do you expect something to feel like, okay, because this thing can either think so fast, or has so many copies, or the copies can merge back into themselves, or is quote-unquote much smarter, any number of advantages an AI might have, will the civilization in which these AIs exist just feel qualitatively different from humans?

好的，不过我想可能不太抽象，而是更具质感一些。你是否预计某种感觉，比如由于这个东西思考速度极快、或者有许多副本，或这些副本可以自我合并，或者说它显然“更聪明”，因此在某种程度上，AI可能拥有的任何优势，都将导致它们所在的文明，明显与我们现在的文明有所不同？

I mean, it is fundamentally automation, but it will be extremely foreign. I do think it will look really strange because, like you mentioned, we can run all of this on a computer cluster, etc., and much faster and all this. I mean, maybe some of the scenarios that I start to get nervous about, with respect to when the world looks like that, is this kind of gradual loss of control and understanding of what's happening.

我的意思是，这基本上就是自动化，但这将是非常陌生的。我确实认为它看起来会很奇怪，因为正如你提到的，我们可以在电脑集群上运行所有这些东西，而且速度更快等等。我觉得，当世界变成那样的时候，我可能会开始对一些情况感到不安，比如逐渐失去对发生事情的控制和理解。

And I think that's actually the most likely outcome, probably: that there will be a gradual loss of understanding, and we'll gradually layer all this stuff everywhere. And there will be fewer and fewer people who understand it, and there will be this scenario of gradual loss of control and understanding of what's happening. That to me seems the most likely outcome of how this stuff will go down.

我认为实际上最可能的结果是，人们对这些事情的理解会逐渐减弱，并且这种情况会逐渐蔓延到各个方面。懂得这一切的人会越来越少，最终可能会出现一种情况，即对正在发生的事情逐渐失去控制和理解。在我看来，这就是这些事情最可能的发展结果。

Let me probe on that a bit. It's not clear to me that loss of control and loss of understanding are the same things. A board of directors at, like, whatever, TSMC, Intel, name a random company. They're just prestigious 80-year-olds. They have very little understanding, and maybe they don't practically actually have control. But actually, maybe a better example is the president of the United States. The president has a lot of fucking power. I'm not trying to make a statement about the current occupant, but maybe I am.

让我对此进行进一步说明。我不太确定失去控制和失去理解是否是同一回事。比如说，像TSMC、英特尔这样的公司董事会，他们可能都是一些有名望的80岁老人，他们对企业实际运作的理解可能很有限，或许他们实际上也没有多少控制权。但他们的例子也许就像美国总统，总统拥有极大的权力。我并不是要对现任总统做出什么评价，但或许某种程度上，我说的就是这个意思。

But like the actual level of understanding is very different from the level of control. Yeah, I think that's fair. That's a good perspective. I think like I guess I expect loss of both. Yeah, outcome. I mean, loss of understanding is obvious, but why loss of control? So, so we're really far into a territory of I don't know what this looks like, but if I was to write sci-fi novels that would look along the lines of not even a single like entity or something like that.

但实际的理解水平与控制水平非常不同。我认为这是合理的，这是一种好的看法。我感觉，我似乎预期两者都会丧失。结果，就是理解的丧失很明显，但为什么会失去控制呢？我们现在已经深入到一个我无法想象的领域，但如果让我写科幻小说，它可能会描绘成不仅仅是一个实体之类的东西。

That should sort of like take over everything, but actually like multiple competing entities that gradually become more and more autonomous and some of them go rogue and the others like fight them off and all this kind of stuff. And it's like this hot pot of completely autonomous activity that we've delegated to. I kind of feel like it would have that flavor. It is not the fact that they are smarter than us that is resulting in the loss of control.

这应该会像是接管了一切，但实际上是多种互相竞争的实体逐渐变得越来越自主，其中一些变得不受控制，其他的则与它们对抗，等等。这就像一个完全自主活动的大杂烩，我们把事情委托给它们来处理。我觉得这种情况大概会是这样。导致我们失去控制的并不是因为它们比我们聪明。

It's the fact that they are competing with each other, and whatever arises out of that competition leads to the loss of control. I mean, I basically expect there to be... a lot of these things will be tools to people, and some of them are acting on behalf of people or something like that. So maybe those people are in control, but maybe it's a loss of control overall for society, in the sense of outcomes we want or something like that. Where you have entities acting on behalf of individuals that are still kind of roughly seen as out of control. Yeah, yeah.

实际上，他们彼此竞争，而这种竞争所带来的结果至少是失去控制。我的意思是，我基本上预期很多这样的情况会出现。我是说，这些东西会成为人们使用的工具，而有些人可能代表一部分人行事，因此也许那些人是有控制权的。但是从整体社会的角度来看，这可能是一种失去控制，因为我们可能得不到我们想要的结果。虽然有些实体以个人的名义行动，但整体上仍然被视为失控的。是的，就是这样。

This is a question I should have asked earlier. So we were talking about how currently it feels like when you're doing AI engineering or AI research, these models are more in the category of a compiler rather than in the category of a replacement. At some point, if you have quote-unquote AGI, it should be able to do what you do. And do you feel like having a million copies of you in parallel results in some huge speed-up of AI progress? Basically, if that does happen, do you expect to see an intelligence explosion once we have a true AGI, not talking about LLMs today, but really...

这是一个我本应该早点问的问题。我们之前谈到，目前在进行AI工程或研究时，这些模型更像是编译器，而不是直接替代者。某一时刻，如果你拥有量子代码的通用人工智能（AGI），它应该能够做你所做的事情。你觉得如果有一百万个与你相同的副本并行工作，会不会导致AI进展的大幅加速？基本上，如果这种情况发生，你是否预期会看到智能的爆炸式增长，或者即便我们拥有一个真正的智能代理，而不是今天讨论的其他模拟？

I guess what I mean is I do, but it's business as usual, because we're in an intelligence explosion already and have been for decades. When you look at GDP, it's basically the GDP curve, which is an exponential weighted sum over so many aspects of the industry. Everything is gradually being automated and has been for hundreds of years. The Industrial Revolution is automation of some of the physical components and the tool building and all this kind of stuff. Compilers are early software automation, et cetera. So I kind of feel like we've been recursively self-improving and exploding for a long time.

我想表达的是，我确实认同这种观点，但这其实就是“业务照旧”，因为我们已经处于一个智力爆炸的时代，而且这个状态已经持续了几十年。当你观察GDP时，会发现GDP曲线基本上就是对行业诸多方面的指数加权总和。事实上，所有事物都在逐步实现自动化，这种趋势已经持续了几百年。工业革命本身就是一种自动化，包括物理组件、工具制造等等。编译器则是早期的软件自动化示例。所以，我觉得我们已经进行了很长时间的递归自我改进和爆炸式发展。

Maybe another way to see it is that Earth was a pretty boring place, I think, if you don't look at the biomechanics and so on, and it looked very similar for a long time. If you just look from space, Earth is spinning, and then we're in the middle of this firecracker event, but we're seeing it in slow motion. I definitely feel like this has already been happening for a very long time. And again, I don't see AI as a distinct technology with respect to what has already been happening for a long time.

也许可以换个角度来看，我的意思是，如果不考虑生物力学等因素，地球原本是一个相当乏味的地方。从太空中看，地球就像一个旋转的球体，我们就像置身于一场烟花爆炸的中心，这种运动感甚至比烟花更强烈。但我确实认为，这种变化已经持续了很长时间。而且，我并不认为人工智能是一种完全独立的技术，因为类似的变化其实早已发生。

You think it's continuous with this hyper-exponential trend? And that's why this was very interesting to me, because I was trying to find AI in the GDP for a while. I thought that GDP should go up. But then I looked at some of the other technologies that I thought were very transformative, like maybe computers or mobile phones, et cetera. You can't find them in GDP. GDP is the same exponential. Even the early iPhone, for example, didn't have the App Store, and it didn't have a lot of the bells and whistles that the modern iPhone has.

你认为这种快速的指数增长趋势将会继续下去。这很让我感兴趣，因为我一直尝试在GDP中找到人工智能的影响。我原以为GDP会因此上升。但当我查看其他一些我认为具有强大变革性的技术，比如电脑或手机等时，我发现它们在GDP中并没有明显体现。GDP还是保持相同的指数增长。例如，早期的iPhone没有应用商店，也缺乏现在iPhone具备的许多功能和特色。

And so even though we think of 2008, was it, when the iPhone came out, as some major seismic change, it's actually not. Everything is so spread out and diffuses so slowly that everything ends up being averaged into the same exponential. And it's the exact same thing with computers. You can't find them in the GDP as, like, "Oh, we have computers now." That's not what happened, because it's such a slow progression. And with AI, we're going to see the exact same thing. It's just more automation.

虽然我们通常认为2008年是iPhone问世的时候，是一个重大的变革时刻，但实际上，这些变化都是缓慢且分散的，所以最终这些影响会平均融合进同一个指数增长之中。这种情况和计算机的发展完全一样。你在GDP中无法找到计算机的巨大变化，因为它们的发展是一个非常缓慢的过程。对于人工智能来说，我们将看到同样的情况，这只不过是更多的自动化而已。

It allows us to write different kinds of programs that we couldn't write before. But AI is still fundamentally a program. It's a new kind of computer and a new kind of computing system. But it has all these problems. It's going to diffuse over time, and it's still going to add up to the same exponential. We're still going to get an exponential that's going to get extremely vertical, and it's going to be very foreign to live in that kind of an environment.

它让我们能够编写以前无法编写的各种程序。但人工智能本质上仍然是一个程序。它是一种新型计算机和新型计算系统。但它也存在各种问题。这些问题会随着时间的推移逐渐扩散。尽管如此，人工智能仍然会以指数级增长，并且最终达到非常快速的发展。生活在这样的环境中会让人感到非常陌生。

Are you saying that if you look at the trend from before the Industrial Revolution to now, you have a hyper-exponential, where you go from something like 0% growth, to 0.02% growth 10,000 years ago, to 2% growth currently? So that's a hyper-exponential. And are you saying that if you chart AI on there, AI takes you to 20% growth or 200% growth? Or you could be saying that if you look at the last 300 years, what you've been seeing is technology after technology (computers, electrification, steam engines, railways, et cetera) while the rate of growth is exactly the same. It's 2%. Are you saying the rate of growth will go up? The rate of growth has also stayed roughly constant, right? For only the last 200, 300 years. But over the course of human history, it has exploded, right? It's gone from basically 0% to faster, faster, faster. Industrial explosion.

你是在说，如果我们查看工业革命之前到现在的增长趋势，会发现一个超指数增长的模式吗？那时的增长几乎是0%，而在大约一万年前，增长率是0.02%，现在已经达到2%。这是一个超指数增长。如果我们把人工智能的发展放在这个图表上，你是说人工智能可能让增长率达到20%或者200%吗？或者你是在说，过去300年我们看到的是科技进步，比如计算机、电气化、蒸汽机、铁路等，但增长率仍然保持在2%。你的意思是增长率不会变吗？我个人认为过去200到300年的增长率基本保持不变，但在人类历史的长河中，增长确实是爆炸性的，从几乎0%开始变得越来越快，经历了工业爆发。
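
Editor's note: to make the growth figures in this exchange concrete, the following is a minimal sketch that converts each annual growth rate mentioned above into a doubling time. It assumes constant compound growth at each rate, which is a simplification and not something either speaker claims. The function name `doubling_time_years` and the `RATES` table are illustrative, not from the conversation.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years for output to double at a constant annual growth rate.

    Assumes simple compound growth: output(t) = output(0) * (1 + g) ** t.
    """
    return math.log(2) / math.log(1 + annual_growth)


# Growth rates mentioned in the conversation, as fractions (not percent).
# The 20% and 200% figures are the hypothetical AI scenarios from the question.
RATES = {
    "0.02% (about 10,000 years ago)": 0.0002,
    "2% (today)": 0.02,
    "20% (hypothetical)": 0.20,
    "200% (hypothetical)": 2.00,
}

for label, rate in RATES.items():
    years = doubling_time_years(rate)
    print(f"{label:32s} doubles in {years:8.1f} years")
```

Under that assumption, 2% growth doubles output roughly every 35 years, while 20% growth would double it in under 4 years, which is the size of the regime change being debated here.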

Yeah. 2%. Basically, I guess what I'm saying is that for a while I tried to find AI, or look for AI, in the GDP curve, and I've kind of convinced myself that this is false. Even when people talk about recursive self-improvement and labs and stuff like that, I think this is business as usual. Of course it's going to recursively self-improve, and it's been recursively self-improving. LLMs allow the engineers to work much more efficiently to build the next round of LLMs, and a lot more of the components are being automated and tuned, et cetera. So all the engineers having access to Google Search is part of it. All the engineers having an IDE, all of them having autocomplete or having Claude Code, et cetera. It's all just part of the same speedup of the whole thing.

好的。2%。基本上，我想说的是，有一段时间，我试图在GDP曲线中找到人工智能的影响。不过我现在有点说服自己，这种想法是不对的。即使当人们谈论递归自我改进和实验室开发这些概念时，我也认为这些只是常规运作的一部分。当然，这种常规运作会不断进行递归自我改进。而且它一直在递归自我改进。比如大型语言模型（LLM）可以帮助工程师更高效地工作，构建下一代的大型语言模型。许多组件都在被自动化和调整。所有工程师都能够使用谷歌搜索、集成开发环境（IDE）、自动补全、云代码等工具，这些都是整体加速进程的一部分。

So it's just so smooth. But just to clarify, you're saying that the rate of growth will not change. The intelligence explosion will show up as something that just enabled us to continue staying on the 2% growth trajectory, just as the internet helped us stay on the 2% growth trajectory. Yeah, my expectation is that it stays in the same pattern. Yeah. Just to throw the opposite argument at you, my expectation is that it blows up, because I think true AGI (and I'm not talking about LLM coding bots, I'm talking about an actual replacement of a human in a server) is qualitatively different from these other productivity-improving technologies, because it's labor itself, right? I think we're living in a very labor-constrained world. If you talk to any startup founder or any person and ask, "Okay, what do you need more of?", they just need really talented people. And if you have billions of extra people who are inventing stuff, integrating themselves, making companies from start to finish, that feels qualitatively different from just a single technology.

所以，这一切都非常顺畅。但为了澄清一下，您是说增长率不会改变。就像智能爆炸将显现出来一样，它只是让我们能够继续保持在2%的增长轨迹上，以帮助我们继续保持这种增长轨迹。是的，我的预期是它会保持相同的模式。只是为了提出一个相反的观点，我的预期是它会爆发性增长，因为我认为真实的通用人工智能（AGI），而且我不是在谈论像编程的语言模型机器人，我是在说像服务器里的人类替代品，这与其他提高生产力的技术有本质的不同，因为它本身就是劳动力，对吧？我认为我们生活在一个劳动力非常紧缺的世界中。比如说，如果我们与任何初创公司的创始人交谈，他们通常会说需要的就是非常有才华的人。而如果我们突然有数十亿额外的人在发明东西、整合自己、从头到尾创立公司，那感觉与仅仅是一项单一技术完全不同。

It's sort of like asking what would happen if you had 10 billion extra people on the planet. I mean, maybe a counterpoint. Number one, I'm actually pretty willing to be convinced one way or another on this point. But I will say, for example, computing is labor. Computing was labor. A lot of jobs disappeared because computers are automating a bunch of digital information processing that you now don't need a human for. And so computers are labor, and that has played out. Self-driving, as an example, is also computers doing labor. So I guess that's already been playing out. It's still business as usual.

这就好比问，如果地球上多了100亿人会怎样。我想，也许可以从另一个角度来看待这个问题。首先，我其实很愿意接受各种观点的说服。但我想说，例如，计算也是一种劳动。过去，计算确实是一种劳动。因为计算机的出现，很多工作消失了，因为计算机自动化地处理大量电子信息，所以现在不需要人来做这些事情了。因此，计算机本身就是一种劳动。这已经在我们的生活中实现了，自动驾驶汽车也是一个例子，计算机在从事劳动。所以，我觉得这已经成为常态。

Yeah. I guess you have a machine which is spitting out more things like that at a potentially faster pace. And historically, we have examples of the growth regime changing, where you went from, you know, 0.2% growth to 2% growth. So it seems very plausible to me that you get a machine which is then spitting out the next self-driving car and the next internet and whatever. I mean, yeah, I see where it's coming from. At the same time, I do feel like people make this assumption of, okay, we have God in a box and it can do everything. And it just won't look like that. It's going to be able to do some of the things. It's going to fail at some other things. It's going to be gradually put into society, and we'll basically end up with the same pattern. That's my prediction.

是的。我想你可能有一台机器，可以以更快的速度输出类似的东西。而我们在历史上有过增长模式改变的例子，比如从0.2%增长到2%的增长。所以，我觉得很有可能会出现一台机器，不断推出下一个自动驾驶汽车、下一个互联网等等。我大概能理解这种观点。然而，我也觉得有些人把这看作是“上帝在盒子里”，认为它可以做一切事情。但事实不会是那样的。它会能做一些事情，但在其他事情上可能会失败。它将逐步融入社会，基本上呈现出和我预测相似的模式。

Because this assumption of suddenly having a completely intelligent, fully flexible, fully general human in a box that we can dispense at arbitrary problems in society, I don't think that we will have this kind of discrete change. And so I think we'll arrive at the same kind of gradual diffusion of this across the industry. I think what often ends up being misleading in these conversations is that people... I don't like to use the word intelligence in this context, because intelligence implies you think, oh, there will be a single superintelligence sitting in a server, and it'll divine how to come up with new technologies and inventions that cause this explosion.

因为这种假设认为我们可以突然拥有一个完全智能、完全灵活、完全通用的人形机器，然后我们可以用它来解决社会中的各种问题。我并不认为这种突变会发生。所以，我认为我们还是会逐渐在各个行业中引入这种技术。在这些讨论中，常常容易产生误解的是，人们会认为这个“智能”意味着有一个超级智能存在于某个服务器中，它可以自行思考并发明新技术，造成技术的爆炸式发展。但我不太喜欢在这种情况下使用“智能”这个词。

And that's not what I'm imagining when I'm imagining 20% growth. I'm imagining that there are billions of, you know, basically very smart human-like minds, potentially, and that's all that's required. But the fact that there are hundreds of millions of them, billions of them, each individually making new products and figuring out how to integrate themselves into the economy, just the way a highly experienced, smart immigrant would if they came to the country. You wouldn't need to figure out how to integrate them into the economy. They figure it out. They could start a company, they could make inventions, you know, or just increase productivity in the world.

这不是我想象的。我想的是增长20%。我想的是，可能有数十亿个类似非常聪明的人类大脑。这就是全部所需，但事实上有数亿、数十亿这样的存在，每个个体都在创造新产品，想办法让自己融入经济。就像一个经验丰富的聪明移民来到一个国家，你不需要去想怎么把他们融入经济，他们自己就能想办法。他们可以创办公司，发明新东西，或者就是提高全球的生产力。

And we have examples even in the current regime of places that have had 10, 20% economic growth. You know, if you just have a lot of people and less capital in comparison to the people, you can have Hong Kong or Shenzhen or whatever with decades of 10% plus growth. And I think there are a lot of really smart people who are ready to make use of the resources and do this period of catch-up, because we've had this discontinuity.

在当前体制下，我们也看到了经济增长达到10%或20%的地方。例如，像香港或深圳这样的地方，因人口众多而资本相对较少，几十年来实现了超过10%的增长。我认为有许多非常聪明的人准备好利用这些资源来实现快速发展，因为我们此前经历了一段停滞期。

And I think, yeah, AI might be similar. So I think I understand, but I still think that you're presupposing some discrete jump, that there's some unlock that we're waiting to claim, and suddenly we're going to have geniuses in data centers. I still think you're presupposing some discrete jump that has basically no historical precedent, that I can't find in any of the statistics, and that I think probably won't happen. I mean, the Industrial Revolution is such a jump, right? You went from like zero percent growth, 0.2% growth, to 2% growth.

我觉得，人工智能可能也会类似。所以我认为我明白你的意思，但我还是觉得你在假设有某个明显的飞跃，就像我们在等待某个解锁的时刻，然后突然我们就会有数据中心里的天才。但我认为这样的飞跃几乎没有历史先例，我在任何统计数据中都找不到这样的例子，我觉得它可能不会发生。工业革命就是一个这样的飞跃，对吧？经济从零增长或0.2%的增长一下子提升到了2%的增长。

I'm just saying like you'll see another jump like that. I'm a little bit suspicious. I would have to look at it. I'm a little bit suspicious. And I would have to take a look. For example, like maybe some of the logs are not very good from before the industrial revolution or something like that. So I'm a little bit suspicious of it, but yeah, maybe you're right. I don't have strong opinions. Maybe you're saying that this was a singular event that was extremely magical. And you're saying that maybe there's going to be another event that's going to be just like that extremely magical.

我的意思是，你可能会看到类似的再一次激增。我对此有点怀疑，我需要检查一下。举个例子，也许在工业革命之前的一些记录不太准确，所以我对这些抱有一点怀疑，但也许你是对的。我没有很强烈的看法。也许你在说这是一个非常神奇的独特事件，而你觉得可能会有另一个同样神奇的事件发生。

It will break the paradigm and so on. I actually don't think... the crucial thing with the Industrial Revolution was that it was not magical, right? If you just zoomed in, what you would see in 1770 or 1870 is not that there was some key invention. Yeah, exactly. But at the same time, you did move the economy to a regime where the progress was much faster and the exponential 10x'd. And I expect a similar thing from AI, where it's not like there's going to be a single moment where we made the crucial invention.

它将打破现有的模式，等等。实际上，我并不认为工业革命的关键在于它没有什么神奇之处，对吧？如果你仔细观察1770年或1870年的情形，你不会发现某个关键的发明。而是，那个时候经济发展进入了一个进步更快、增长呈指数级的阶段。我对自己的情况有类似的预期。我认为不会有一个单一的时刻象征着我们取得了关键性的发明。

It's some overhang that's being unlocked. Like maybe there's a new energy source. There's some unlock, in this case some kind of cognitive capacity, and there's an overhang of cognitive work to do. And you're expecting that overhang to be filled by this new technology when it crosses the threshold. I mean, maybe one way to think about it is that through history, a lot of growth comes because people come up with ideas.

有一些积压的事情正在被解决。可能就像有了一个新的能源。在这种情况下，有某种认知能力的解锁。存在着需要做的认知工作的积压。你期望这个积压会在新技术达到一定门槛时得到解决。换句话说，纵观历史，很多增长都是因为人们提出了新的想法。

And then people are out there doing stuff to execute those ideas and make valuable output. Through most of this time, population has been exploding, and that has been driving growth. For the last 30 years, people have argued that growth has stagnated. Population in frontier countries has also stagnated. I think we go back to the hyper-exponential growth in population and output. Right. And sorry, exponential growth in population that causes hyper-exponential growth in output.

然后，人们在外面积极地执行这些想法，创造有价值的成果。在大部分时间里，人口增加一直推动着增长。在过去的30年中，人们认为增长已经停滞。人口和前沿国家的发展也在停滞。我认为我们将回到人口和产出超快增长的局面。对了，人口的指数增长会导致产出的超指数增长。

Yeah. I mean, yeah, it's really hard to tell. Yeah. I understand that viewpoint. I don't intuitively feel that viewpoint. So we just got access to Google's Veo 3.1, and it's been really cool to play around with. The first thing we did was run a bunch of prompts through both Veo 3 and 3.1 to see what's changed in the new version. So here's Veo 3. Hi, I'm Max and I got stuck in a local minimum again. It's okay, Max. We've all been there.

是的。我是说，是的，这真的很难判断。我理解这种观点，但我没有直观地感受到这种观点。所以我们刚刚获得了Google Veo 3.1的访问权限，玩起来真的是很酷。我们首先做的就是把一批提示词分别交给Veo 3和Veo 3.1处理，看看新版本有什么变化。这是Veo 3的一个例子：“嗨，我是Max，我又陷入了一个局部最小值。” 不要紧，Max，我们都经历过这种情况。

Took me three epochs to get out. And here's Veo 3.1. Hi, I'm Max and I got stuck in a local minimum again. It's okay, Max. We've all been there. Took me three epochs to get out. Veo 3.1's output is just consistently more coherent, and the audio is noticeably higher quality. We've been using Veo for a while now, actually. We released an essay earlier this year about AI firms, fully animated by Veo 2. And it's been amazing to see how fast these models are improving. This update makes Veo even more useful for animating our ideas and our explainers. You can try Veo right now in the Gemini app with Pro and Ultra subscriptions. You can also access it through the Gemini API or through Google Flow.

用了三个epoch（训练轮次）才出来。这就是Veo 3.1。你好，我是Max，我又一次陷入了局部最小值。没关系，Max，我们都经历过。我用了三个epoch才摆脱出来。Veo 3.1的输出总是更连贯，音质也显著提升。实际上，我们已经使用Veo一段时间了。今年早些时候，我们发布了一篇关于AI公司的文章，完全由Veo 2动画制作。看到这些模型提高的速度真是令人惊讶。这次更新让Veo在把我们的想法和讲解视频做成动画方面变得更加有用。你可以在Gemini应用中通过Pro和Ultra订阅立即体验Veo，也可以通过Gemini API或Google Flow访问它。

You recommended Nick Lane's book to me, and on that basis I also found it super interesting and I interviewed him. So I actually have some questions about thinking about intelligence and evolutionary history. Now that you, over the last 20 years of doing AI research, maybe have a more tangible sense of what intelligence is and what it takes to develop it, are you more or less surprised as a result that evolution just sort of spontaneously stumbled upon it? I love Nick Lane's books, by the way. I was just listening to his podcast on the way up here.

你向我推荐了一本尼克·莱恩的书，基于此，我也发现这本书非常有趣，所以我采访了他。现在我对智能和演化历史有一些疑问。过去20年你一直在进行AI研究，也许你对智能是什么以及如何发展它有了更切实的认识。你是否对进化自发地发现智能这一事实感到更惊讶，还是更不惊讶呢？顺便说一句，我非常喜欢尼克·莱恩的书。刚才在路上，我还在听他的播客。

With respect to intelligence and its evolution, it came fairly... I mean, it's very, very recent. I am surprised that it evolved. I find it fascinating to think about all the worlds out there. Like, say there are a thousand planets like Earth, and what they look like. I think Nick Lane was here talking about some of the early parts. He expects basically very similar life forms, roughly speaking, and bacteria-like things on most of them. And then there are a few breaks in there. The evolution of intelligence intuitively feels to me like it should be a fairly rare event.

关于智慧及其进化，我认为这是一种相对比较新的现象。我感到惊讶它能进化出来。想到宇宙中可能有无数类似地球的星球，我觉得很有趣。比如，假设有一千个类似地球的星球，想象它们会是什么样子。我记得Nick Lane曾谈到关于生命初期的一些话题，他基本上认为这些星球上会有相似的生命形式，大多数是类似细菌的东西，然后在某些地方会有一些突破。我直觉上觉得智慧的进化应该是一个相当罕见的事件。

And there have been animals for... I guess maybe you should base it on how long something has existed. For example, bacteria were around for two billion years and nothing happened. Then going to eukaryotes is probably pretty hard, because bacteria actually came up quite early in Earth's evolution or history. And so I guess, how long have we had animals? Maybe a couple hundred million years, like multicellular animals that run around, crawl, etc., which is maybe 10% of Earth's lifespan or something like that. So maybe on that time scale it's actually not too tricky. It's still surprising to me, I think intuitively, that it evolved.

多年以来，地球上就存在着动物。或许我们可以根据某种生物存在的时间长度来看待这个问题。举个例子，细菌已经存在了20亿年，但在很长的一段时间里没有发生什么特别的变化。想到我们对这些细菌的关注，可能会有些困难，因为它们在地球早期的进化或历史中就已经出现了。至于动物，我们大概有了几亿年历史，比如那些可以到处跑动或爬行的多细胞动物。这大概只有地球寿命的10%左右。从这个时间尺度上看，这似乎其实并不难理解。然而，我依然觉得动物的进化是令人惊讶的，从直觉上来说，似乎难以想象。

I would maybe expect just a lot of animal-like life forms doing animal-like things. The fact that you can get something that creates culture and knowledge and accumulates it is surprising to me. Okay, so there's actually a couple of interesting follow-ups. If you buy the Sutton perspective that the crux of intelligence is actually animal intelligence, the quote he said is that if you got to the squirrel, you'd be most of the way to AGI. Then we got to squirrel intelligence, I guess, right after the Cambrian explosion, 600 million years ago.

我可能会期待看到许多类似动物的生命形式做着动物般的事情。而能够产生文化、积累知识的生命形式让我感到惊讶。如果你接受萨顿（Sutton）的观点，即认为智力的核心其实是动物智力，那么他的原话是，如果达到松鼠那样的智力水平，你就已经接近通用人工智能了。我想，我们可能在大约6亿年前的寒武纪大爆发之后就达到了松鼠的智力水平。

It seems like what instigated that was the oxygenation event 600 million years ago. Immediately, the intelligence algorithm was there to make squirrel intelligence. It's suggestive that animal intelligence was like that. As soon as you had the oxygen in the environment, you could just get the algorithm. Maybe it was an accident that evolution stumbled on it so fast, but I don't know if that suggests that in the end it's going to be quite simple. Yes, basically it's so hard to tell with any of this stuff.

看起来，600百万年前的氧气事件是这一现象的诱因。紧接着，就有了一个智力算法，使松鼠具有了智慧。这表明动物的智慧是这样形成的。只要环境中有氧气，就能生成这个算法。也许在退化过程中有一个意外导致发展的速度很快，但我不确定这是否意味着最终的过程其实很简单。总之，这些事情真的很难判断。

I guess you can base it a little bit on how long something has existed, or how long it feels like something has been bottlenecked. Nick Lane is very good at describing this very apparent bottleneck in bacteria and archaea. For 2 billion years, nothing happened. Extreme diversity of biochemistry, and yet nothing that grows to become animals for 2 billion years. I don't know that we've seen exactly that kind of an equivalent with animals and intelligence, to your point.

我想你可以根据某件事情存在"瓶颈"的时间长短来判断，或者根据感觉某件事情被"瓶颈"了多久来判断。因此，Nikolayn很擅长描述这种在细菌和古菌中显而易见的瓶颈状态，这种状态持续了大约20亿年。在这段时间里，虽然生物化学极其多样化，但却没有任何生物演化成动物。这种情况持续了20亿年。就像你所说的，对于动物和智力的发展，我不知道我们是否见过类似的情况。

I guess maybe we could also look at it with respect to how many times we think intelligence has individually sprung up in evolution. That's a really good thing to investigate. Maybe one thought on that is, I almost feel like, well, there's hominid intelligence, and then I would say there's bird intelligence, right? Ravens, etc., are extremely clever. But their brain parts are actually quite distinct, and we don't have that much in common.

我想我们也可以从智力进化出现了多少次的角度来看待这个问题。这确实是一个非常值得研究的话题。也许可以这样考虑：比如说，灵长类动物的智力是一种进化。那么鸟类的智力，比如乌鸦之类的，也非常聪明，但它们的大脑结构实际上与我们很不同，而且我们没有那么多相关知识。

So maybe that's a slight indication of intelligence springing up a few times. In that case, you'd maybe expect it more frequently. A former guest, Gwern, and also Carl Shulman, have made a really interesting point about that. Their perspective is that the scalable algorithm which humans and primates have arose in birds as well, and maybe other times as well. But humans found an evolutionary niche which rewarded marginal increases in intelligence, and also had a scalable brain algorithm that could achieve those increases in intelligence.

也许有一些迹象表明，在某些时候，智能有可能会偶尔出现。在这种情况下，你可能会更频繁或更长时间地被期望。有位以前的嘉宾，Gwern，以及Carl Schlumman，提出了一个很有趣的观点：他们认为，人类和灵长类动物拥有的可扩展算法在鸟类中也出现过，可能在其他时期也曾出现。但在人类中，这种可扩展的大脑算法找到了一个进化的生态位，这个生态位奖励对智力的边际提升，并能够实现这些智力的提升。

And so, for example, if a bird had a bigger brain, it would just collapse out of the air. It's very smart for the size of its brain, but it's not in a niche which rewards the brain getting bigger. Yeah, maybe similar with some really smart dolphins, etc. Exactly. Whereas humans, you know, we have hands that reward being able to learn how to do tool use. We can externalize digestion and send more energy to the brain. And that kicks off the flywheel. Yeah, and just stuff to work with.

例如，如果一只鸟的大脑更大，但在空中就像塌陷了一样。就大脑的大小来说，它已经很聪明了，但这并不是一个能让大脑变得更大的生态位。也许这与一些非常聪明的海豚类似。而人类就不同了，比如我们有双手，这让我们能够学习使用工具。我们可以把消化过程外部化，把更多的能量分配给大脑。这就启动了一个良性循环。还有很多可以利用的条件。

I mean, I'm guessing it would be harder if I were a dolphin. Yeah. I mean, how do you do it? You can't have fire, for example, stuff like that. The universe of things you can do in water is probably smaller than what you can do on land, just chemically. Right. Yeah, I do agree with this viewpoint of these niches and what's being incentivized. I still find it kind of miraculous. I would maybe expect things to get stuck on, like, animals with bigger muscles, you know.

我的意思是，我在猜测如果我是海豚，那就会更困难一些。是的，我的意思是，比如说，你没法生火，类似这样的事情。在水里的可行性宇宙可能比在陆地上要少，仅从化学反应的角度来说，对吧。是的，我同意关于这些生态位和它们所受到激励的观点。我还是觉得有些神奇，我可能本来会以为事物会在肌肉更大的动物上停止进化。

Yeah. Going through intelligence is actually a really fascinating breaking point. The way Gwern put it is that the reason it was so hard is that it's a very tight line between being in a situation where something is so important to learn that it's worth distilling the exact right circuits directly back into your DNA, versus it's not important enough to learn at all. Yeah. It has to be something where you have to incentivize building the algorithm to learn within a lifetime.

好的。其实，深入研究智能的过程真的非常有趣，也是一种突破。这其中的困难在于，这是一条非常细微的界限：一方面，我们需要学习的一些东西非常重要，以至于必须精确提取出对的「电路」，几乎像直接写入DNA那样重要；而另一方面，有些东西根本不值得去学习。我们必须找到一种激励机制，以便在整个生命过程中不断学习和优化算法。

Yeah, exactly. You have to incentivize some kind of adaptability. You actually want environments that are unpredictable, so evolution can't bake your algorithms into your weights. A lot of animals are basically pre-baked in this sense. And so humans have to figure it out at test time, when they get born. So maybe you actually want these kinds of environments that change really rapidly or something like that, where you can't foresee what will work well.

是的，没错。你需要激励某种适应能力。实际上，你希望环境是不可预测的，这样进化就不能将算法直接固定在生物的特性中。很多动物基本上已经在这方面被“预设”好了。而人类在出生后需要自己去摸索解决问题。所以，也许你真的希望这些环境能够快速变化，因为这样你就无法预见什么会有效。

And so you create intelligence to figure it out at test time. So Quintin Pope had this interesting blog post where he's saying the reason he doesn't expect a sharp takeoff is that humans had the sharp takeoff, where 60,000 years ago we seem to have had the kind of cognitive architectures that we have today. And 10,000 years ago, the agricultural revolution, modernity, dot dot dot. What was happening in those 50,000 years?

于是，你实际上运用所有创造出的智能来解决测试时间的问题。Quentin Pope 在一篇有趣的博客文章中提到，他不认为会出现快速跃升的原因是：人类曾经历过一个快速跃升，大约在六万年前，我们似乎已经具备了今天这样的架构。而一万年前出现了农业革命、现代化等等。那么，在这五万年间到底发生了什么？

Well, you had to build this sort of cultural scaffold where you can accumulate knowledge over generations. This is an ability that exists for free in the way we do AI training, where if you retrain a model, it can still... I mean, in many cases they're literally distilled, but they can be trained on each other. You know, they can be trained on the same pre-training corpus. They don't literally have to start from scratch. So there's a sense in which the thing that took humans a long time, getting this cultural loop going, just comes for free with the way we do LLM training.

好吧，你必须建立一种文化框架，以便能够在几代人中积累知识。我们在进行人工智能训练时，就享有这种能力。也就是说，如果你重新训练一个模型，它通常仍然可以在之前的训练成果上继续发展。事实上，很多情况下，模型是经过提炼的，它们可以在相同的预训练数据集上相互训练，而不需要从头开始。所以，这种在人类花费很长时间才建立起来的文化循环，在我们进行训练时就不费吹灰之力地实现了。

Yes and no, because LLMs don't really have the equivalent of culture. And maybe we're giving them way too much and incentivizing them not to create it or something like that. But I guess, like, the invention of culture and of written record and of passing down notes between each other, I don't think there's an equivalent of that with LLMs right now. So LLMs don't really have culture right now.

是，也不是，因为大语言模型（LLM）并没有真正等同于文化的东西。也许我们给了它们太多东西，反而让它们没有动力去创造文化，或者类似的情况。不过，文化的发明、书面记录的发明，以及彼此之间传递笔记，我认为目前LLM都没有与之对应的东西。所以，LLM现在并不真正拥有文化。

And it's kind of one of the impediments, I would say. Can you give me some sense of what LLM culture might look like? In the simplest case, it would be a giant scratchpad that the LLM can edit. And as it's reading stuff or as it's helping out with work, it's editing the scratchpad for itself. Why can't an LLM write a book for the other LLMs? That would be cool.

这有点像是我认为的障碍之一。你能给我一些关于LLM（大型语言模型）文化可能是什么样的感觉吗？最简单的情况是，它会像一个巨大的草稿板，LLM可以在上面编辑。LLM在读取信息或协助工作时，会为自己编辑这个草稿板。为什么LLM不能为其他LLM写一本书呢？那会多酷啊！

Yeah. Like, why can't other LLMs read this LLM's book and be inspired by it or shocked by it or something like that? There's no equivalent for any of this stuff. Interesting. When would you expect that kind of thing to start happening? And a more general question about multi-agent systems and a sort of independent AI civilization and culture?

是啊。为什么其他语言模型不能阅读这个语言模型的书，并从中获取灵感或受到震撼之类的？这些东西没有对应的事物。很有趣。你认为这种情况什么时候会开始发生呢？还有一个更普遍的问题是，关于多代理系统，以及一种独立的人工智能文明和文化？

I think there are two powerful ideas in the realm of multi-agent that have both not really been claimed yet. The first one, I would say, is culture, with LLMs basically growing a repertoire of knowledge for their own purposes. The second one looks a lot more like the idea of self-play, which in my mind is extremely powerful. Evolution is actually a lot of competition, basically driving intelligence. And more algorithmically, AlphaGo is playing against itself, and that's how it learns to get really good at Go. There's no equivalent of self-play in LLMs, but I would expect that to also exist. No one has done it yet. Like, why can't an LLM, for example, create a bunch of problems that another LLM is learning to solve? And then the LLM is always trying to serve more and more difficult problems, stuff like that.

在多智能体领域中，我认为有两个非常强大的理念尚未被真正提出或应用。第一个理念是“文化”，即大型语言模型（LLMs）基本上是在为它们自身目的而不断增长的知识库。第二个理念在我看来像是“自我对弈”的强大想法。演化实际上是通过大量竞争来驱动智慧和进化的过程。在算法上，对于AlphaGo来说，它通过与自身对弈来学习和提高围棋技能。目前还没有类似自我对弈的LLMs，但我预计这类模型会出现，只是还没有人这样做。例如，为什么一个LLM不能创建一系列问题供另一个LLM学习解决？然后，LLM可以不断尝试提供越来越难的问题，类似这样的过程。
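
Editor's note: the proposer/solver idea described above can be illustrated with a toy loop. This is only a sketch under strong assumptions and is not a method from the conversation; the speaker says no one has convincingly done this with LLMs. Here `Solver` and `Proposer` are hypothetical stand-ins with no language model involved: skill and difficulty are plain numbers, and the update rules (`+0.1 * difficulty`, `* 1.2`, `* 0.9`) are arbitrary choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Solver:
    """Hypothetical stand-in for the model that learns to solve problems."""
    skill: float = 1.0

    def attempt(self, difficulty: float) -> bool:
        solved = difficulty <= self.skill
        if solved:
            # Assumption: harder solved problems teach the solver more.
            self.skill += 0.1 * difficulty
        return solved


@dataclass
class Proposer:
    """Hypothetical stand-in for the model that writes the problems."""
    difficulty: float = 0.5

    def update(self, solved: bool) -> None:
        # Serve harder problems after a success, easier ones after a failure.
        self.difficulty *= 1.2 if solved else 0.9


def run(rounds: int):
    """Run the toy loop and return the final agents plus a per-round log."""
    solver, proposer = Solver(), Proposer()
    history = []
    for _ in range(rounds):
        difficulty = proposer.difficulty
        solved = solver.attempt(difficulty)
        proposer.update(solved)
        history.append((difficulty, solved, solver.skill))
    return solver, proposer, history


if __name__ == "__main__":
    solver, proposer, history = run(50)
    print(f"final skill: {solver.skill:.2f}")
    print(f"final difficulty: {proposer.difficulty:.2f}")
```

The only point of the sketch is the shape of the loop: the proposer keeps the problems near the edge of what the solver can do, so difficulty and skill ratchet upward together.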

So I think there's a bunch of ways to actually organize it, and I think it's a realm of research. But I haven't seen anything that convincingly claims both of those multi-agent improvements. I still think we're mostly in the realm of a single individual agent, but I also think that will change. And in the realm of culture I would also bucket organizations, and we haven't seen anything like that convincingly either. So that's why we're still early. Can you identify the key bottleneck that's preventing this kind of collaboration between LLMs? Maybe the way I would put it is that, somehow remarkably, some of these analogies work. They shouldn't, but somehow they do. A lot of the smaller models somehow remarkably resemble a kindergarten student, or then an elementary school student, or a high school student, etc.

我认为有很多方法可以组织它，而且我觉得这也是一个研究领域。但我还没有看到能够令人信服地同时展示这两种多智能体改进的例子。我认为我们大多数仍处于单一智能体的阶段，但我也认为这种情况会改变。在文化领域，我也包括组织，但在这方面我们还没有看到令人信服的例子。这就是为什么我们还处于早期阶段。你能确定阻碍大语言模型之间这种合作的关键瓶颈吗？也许我会这样表达：某种程度上，一些比喻起到了作用，而这些比喻本不该起作用，却神奇地有效。许多较小的模型令人惊讶地类似于幼儿园学生、小学生或高中生等。

And somehow we still haven't graduated enough for this stuff to take over. Mostly, my Claude Code or Codex still kind of feel like this elementary school student. I know that they can take PhD quizzes, but they still cognitively feel like kindergarten or elementary school students. So I don't think they can create culture, because they're still kids. They're savant kids. They have perfect memory of all this stuff, etc., and they can convincingly create all kinds of slop that looks really good. But I still think they don't really know what they're doing, and they don't really have the cognition across all these little checkboxes that we still have to collect.

不知为何，我们还没有毕业到足以让这些模型接管的程度。我用的Claude Code或Codex大多仍然感觉像是小学生。我知道它们可以通过博士水平的测验，但从认知角度看，它们仍然像幼儿园或小学的学生。所以我不认为它们能创造文化，因为它们还是孩子。它们像是学者症候群式的天才儿童，对各种东西有完美的记忆，并且可以很有说服力地制造出各种看起来很不错的粗制内容，但我仍然觉得它们并不真正明白自己在做什么，而且在许多我们仍需逐项补齐的小项目上缺乏相应的认知。

Yeah. So you've talked about how you were at Tesla leading self driving from 2017 to 2022. And then you first hand saw this progress from we went from cool demos to now thousands of cars out there actually autonomously doing drives. Why did that take a decade? Like what was happening through that time? So I would say one thing I will almost instantly also push back on is this is not even near done. So in a bunch of ways that I'm going to get to. I do think that self driving is very interesting because it's definitely like where I get a lot of my intuitions because I spent five years on it.

好的。你提到过你在特斯拉从2017年到2022年领导自动驾驶项目，你亲眼见证了从初期的精彩演示发展到如今有数千辆车在路上进行自动驾驶。为什么这个过程花了十年时间？期间都发生了些什么？我想强调的一点是，这个项目远未完成，后面我会详细说明。我确实认为自动驾驶非常有趣，因为我在这个领域工作了五年，从中得到了很多直觉和启发。

And it has this entire history where the first demos of self-driving actually go all the way back to the 1980s. You can see a demo from CMU in 1986. There's a truck that's driving itself on roads. But okay, fast forward. I think when I was joining Tesla, I had a very early demo of a Waymo, and it basically gave me a perfect drive in 2014 or something like that. So a perfect Waymo drive a decade ago. It took us around Palo Alto and so on, because I had a friend who worked there. And I thought it was very close, and then it still took a long time.

这项技术有着悠久的历史，实际上自动驾驶的首次演示可以追溯到1980年代。你可以看到卡内基梅隆大学在1986年的一个演示，当时有一辆卡车在路上自动行驶。但让我们快进到后来的发展。当我加入特斯拉时，我体验了Waymo的一个早期演示，那大概是在2014年左右。这次Waymo的演示非常完美，一次完美的Waymo驾乘经历，大约十年前就提供给了帕洛阿尔托的客户，因为我的一个朋友在那里工作。我当时觉得这项技术似乎已经非常接近成熟，但仍然花了很长时间才更接近实用化。

And I do think that for some kinds of tasks and jobs and so on, there's a very large demo-to-product gap, where the demo is very easy but the product is very hard. That's especially the case in cases like self-driving, where the cost of failure is too high. Right. Many industries, tasks, and jobs maybe don't have that property. But when you do have that property, that definitely increases the timelines. For example, in software engineering, I do actually think that property exists. I think for a lot of vibe coding, it doesn't.

我确实认为，在某些类型的任务和工作中，存在一个非常大的从演示到产品的差距。演示很容易，但产品却很难实现。这种情况在例如自动驾驶这样的领域尤其明显，因为失败的成本太高了。在许多行业中，任务和工作可能没有这种特性，但一旦存在这种特性，确实会拉长时间线。我确实认为，比如在软件工程中，这种特性确实存在。但我认为，对于许多即兴编程来说，这种特性并不存在。

But I think if you're writing actual production-grade code, that property should exist, because any kind of mistake actually leads to a security vulnerability or something like that, and millions or hundreds of millions of people's personal Social Security numbers, etc., get leaked or something like that. And so I do think that it is the case that in software people should be careful, kind of like in self-driving. In self-driving, if things go wrong, you might get injured. I guess there are worse outcomes. But in software, I almost feel like it's almost unbounded how terrible something could be.

但我认为，如果你在编写实际的生产级代码，我觉得这种特性应该存在，因为任何错误都可能导致安全漏洞或者类似的问题，导致数百万甚至上亿人的个人社会安全号码等信息被泄露之类的。所以我确实认为在软件开发中人们应该格外小心。就像自动驾驶一样，如果出现问题，你可能会受伤。当然，可能会有更严重的后果，但在软件中，我几乎觉得问题的严重性是无法估量的。

So I do think that they share that property. And then I think basically what takes the long amount of time, and the way to think about it, is that it's a march of nines, and every single nine is a constant amount of work. So every single nine is the same amount of work. When you get a demo and something works 90% of the time, that's just the first nine. And then you need the second nine, the third nine, the fourth nine, the fifth nine. And while I was at Tesla for, what was it, five years or so, I think we went through maybe three nines or two nines. I don't know what it is, you know, but multiple nines of iteration, and there are still more nines to go. And so that's why these things take so long. And so it was definitely formative for me, seeing something that was a demo. I'm very unimpressed by demos.

我确实认为它们具有这一共同特性。之所以要花这么长时间，可以这样理解：这是一场“九的行军”（march of nines），每多一个九，所需的工作量都是恒定的，也就是说每一个九的工作量都相同。当你拿到一个演示，某样东西在90%的情况下能正常运作时，那只是第一个九。接下来你还需要第二个九、第三个九、第四个九、第五个九。我在特斯拉的大约五年里，我们大概推进了两到三个九，具体多少我不确定，总之经历了好几个九的迭代，而后面还有更多的九要走。这就是为什么这些事情要花这么长时间。这段经历对我影响很深：我亲眼看过一个东西从演示起步，所以我对演示非常不以为然。
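As a rough illustration of the "march of nines" arithmetic described above, here is a minimal sketch. It assumes, as the speaker does, that each additional nine of reliability costs the same amount of work. The function names and the `work_per_nine` unit are hypothetical and chosen only for this example; nothing here reflects actual Tesla figures.

```python
def failure_rate(nines: int) -> float:
    """Fraction of attempts that fail at a given number of nines.

    1 nine -> 90% success (10% failure), 2 nines -> 99% success, and so on.
    """
    if nines < 0:
        raise ValueError("nines must be non-negative")
    return 10.0 ** -nines


def cumulative_work(nines: int, work_per_nine: float = 1.0) -> float:
    """Total effort to reach `nines`, ASSUMING every nine costs the same.

    `work_per_nine` is an arbitrary, hypothetical unit of effort.
    """
    if nines < 0:
        raise ValueError("nines must be non-negative")
    return nines * work_per_nine


if __name__ == "__main__":
    # Failures shrink by 10x per nine while the effort only grows linearly.
    for n in range(1, 6):
        print(
            f"{n} nine(s): fails 1 in {10 ** n:,} attempts, "
            f"cumulative work = {cumulative_work(n):.1f} units"
        )
```

Under this assumption, a demo that works 90% of the time represents one unit of work, and reaching five nines takes five units, which is why a working demo says little about how close a product is.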

So whenever I see demos of anything, I'm extremely unimpressed by that. If it's a demo that someone cooked up and is just showing you, it's worse. If you can interact with it, it's a bit better. But even then you're not done. You need an actual product. It's going to face all these challenges when it comes in contact with reality, you know, these different pockets of behavior that need patching. And so I think we're going to see all this stuff play out. It's a march of nines, and each nine is constant. Demos are encouraging, but there's still a huge amount of work to do. I do think it is kind of a critical safety domain, unless you're doing vibe coding, which is all nice and fun and so on.

所以每当我看到任何东西的演示时，我都非常不以为然。如果这是别人精心准备、只是展示给你看的演示，那就更要打折扣；如果你能亲自与它互动，情况会稍微好一些。但即便如此，事情还远没有完成。你需要的是实际的产品，当它接触到现实时，会面临各种各样的挑战，会有各种需要修补的行为盲区。所以我觉得我们会看到这一切逐步展开。这是一场“九的行军”，每一个九的工作量都是恒定的。演示令人鼓舞，但仍有大量的工作要做。我确实认为这算是一个安全攸关的领域，除非你是在做氛围编程（vibe coding），那当然轻松有趣。

And so that's why I think this also enforces my timelines from that perspective. That's very interesting to hear you say that the sort of safety guarantees you need from software are actually not dissimilar to self-driving, because what we'll often say is that self-driving took so long because the cost of failure is so high. Like, a human makes a mistake on average every 400,000 miles or every seven years. And if you had to release a coding agent that couldn't make a mistake for at least seven years, it would be much harder to deploy. But I guess your point is that if you made a catastrophic coding mistake, like breaking some important system, every seven years.

这就是为什么我认为这也从这个角度强化了我的时间表。听你说到你对软件的安全保障需求其实和自动驾驶并没有太大区别，这非常有趣。我们常说自动驾驶花了很长时间，因为失败的代价太高。比如，一个人平均每40万英里或每七年犯一次错误。而如果你必须发布一个在至少七年内不能出错的编码代理，确实会更难部署。但我猜你的观点是，如果你每七年犯一次重大编码错误，比如破坏某个重要系统，这也是不能接受的。

And in fact, in terms of wall-clock time, it would be much less than seven years, because you're constantly outputting code like that. So per token, or in terms of tokens, it would be seven years, but not in terms of wall-clock time. So in some ways it's a much harder problem. I mean, self-driving is just one of thousands of things that people do. It's almost like a single vertical, I suppose, whereas when we're talking about general software engineering, there's even more surface area. There's another objection people make to that analogy, which is that with self-driving, what took a big fraction of that time was solving the problem of having basic perception that's robust, and building representations.

事实上，从实际时间来看，这可能远不到七年，因为你在不断地输出代码。就像每个代码令牌或按令牌来计算可能是七年，但从实际时间来看。这总是更困难的问题。我是说自动驾驶只是成千上万事情中的一件。它几乎是一个单一的垂直过程，而我们在谈论一般软件工程时，涉及的面更广。有人还会对这个类比提出异议，因为在自动驾驶中，一个大时间的一部分被用来解决建立稳健的基本感知和构建表示的问题。

And having a model that has some common sense, so it can generalize when it sees something that's slightly out of distribution. If somebody's waving down the road this way, you don't need to train for it. The thing will have some understanding of how to respond to something like that. And these are things we're getting for free with LLMs or VLMs today. So we don't have to solve these very basic representation problems. And so now deploying AIs across different domains will sort of be like deploying a self-driving car with current models to a different city, which is hard, but not like a 10-year-long task.

拥有一个具备常识的模型，可以让它在遇到略微超出分布的情况时进行泛化。例如，如果有人在路上挥手示意，你不需要专门为此进行培训。模型会有一定的理解，知道如何应对此类情况。而这些能力是我们今天通过大型语言模型（LLM）或视觉语言模型（VLM）免费获得的。因此，我们不必解决这些非常基础的表示问题。现在，在不同领域部署人工智能就有点像把当前的自动驾驶汽车模型部署到一个不同的城市，虽然这很困难，但并不需要十年的时间。

Yeah, basically I'm not 100% sure if I fully agree with that. I don't know how much we're getting for free. And I still think there are a lot of gaps in understanding what we are getting. We're definitely getting more generalizable intelligence in a single entity, whereas self-driving is a very special-purpose task, and in some sense building a special-purpose task is maybe even harder, because it doesn't fall out from a more general thing that you're doing at scale, if that makes sense. But I still don't know if the analogy fully resonates, because the LLMs are still pretty fallible.

嗯，基本上我不完全确定我是否完全同意这个观点。我不知道我们到底得到了多少免费资源。而且我仍然认为我们在理解所得到的东西上有很多缺漏。我们确实得到了更多可推广的智能系统，而这是一种单一实体；而自动驾驶是一个非常特殊的任务，从某种意义上来说，构建一个特殊任务可能在某种程度上更困难。因为它不是从你在大规模上做的更一般的事情中自然而然得出的，这能解释得通。但我依然觉得这个类比不完全合适，因为语言模型仍然相当容易出错。

And I still think that they have a lot of gaps that still need to be filled in. And I don't think that we're getting magical generalization completely out of the box, in a certain sense. And the other aspect that I wanted to return to from the beginning is that self-driving cars are nowhere near done still. The deployments are still pretty minimal, right? So even Waymo and so on has very few cars, and they're doing that, roughly speaking, because they're not economical, right? Because they built something that lives in the future.

我仍然认为它们还有很多不足之处需要填补。我不认为我们能开箱即用地获得神奇的泛化能力，至少在某种意义上是这样。另一个我想回到开头再谈的方面是：自动驾驶汽车至今仍然远未完成。目前的部署规模仍然相当有限。即便是Waymo这样的公司，投放的车辆也非常少，大致来说是因为这在经济上还不划算，因为他们造出来的是属于未来的东西。

And so they had to pull back the future, but they had to make it uneconomical. So there are all these costs, not just the marginal costs for those cars and their operation and maintenance, but also the capex of the entire thing. So making it economical is still going to be a slog for them, I think. And then also, when you look at these cars and there's no one driving, I think it's a little bit deceiving, because there are actually very elaborate teleoperation centers of people who are kind of in the loop with these cars. I don't know the full extent of it, but I think there's more human-in-the-loop than you might expect. There are people somewhere out there basically beaming in from the sky. I don't actually know if they're fully in the loop with the driving. I think some of the time they are, but they're certainly involved, and there are people. And in some sense we haven't actually removed the person; we've moved them to somewhere we can't see them.

于是他们不得不把未来提前拉到现在，但代价是让它在经济上不划算。这里面有各种成本，不仅仅是那些汽车的边际成本及其运营维护费用，还有整个项目的资本支出（capex）。我认为要让它在经济上可行，对他们来说仍是一场苦战。而且当你看到这些车里没有人驾驶时，我觉得这其实有点误导，因为背后实际上有非常复杂的远程操作中心，那里的人在某种程度上处于这些车辆的控制回路之中。我不清楚具体到什么程度，但我认为其中的人力参与比你预想的要多。某个地方有人基本上是从“天上”远程接入的。我不确定他们是否完全参与驾驶，我想有些时候是的，但他们肯定参与其中。所以在某种意义上，我们并没有真正把人移除，只是把人转移到了我们看不见的地方。

I still think there will be some work, as you mentioned, going from environment to environment. And so I think there are still challenges to make self-driving real. But I do agree that it's definitely crossed a threshold where it kind of feels real, unless it's really teleoperated. For example, Waymo can't go to all the different parts of the city. My suspicion is that it's the parts of the city where you don't get good signal. Anyway, basically I don't actually know anything about the stack. I mean, I'm just making stuff up. Surely, you led self-driving for five years at Tesla. Sorry, I don't know anything about the specifics of Waymo. Actually, by the way, I love Waymo and I take it all the time. I just think that people, again, are sometimes a little bit too naive about some of the progress.

我仍然认为，正如你提到的，从一个环境转到另一个环境，仍需要做一些工作。因此，我认为要让自动驾驶真正落地还有一些挑战。但我同意，它确实已经跨过了某个门槛，感觉像是真的了，除非它其实是靠远程操作的。例如，Waymo不能开到城市的所有区域，我猜测那些是信号不好的区域。不过，我其实对它的技术栈一无所知，只是在瞎猜。“你可是在特斯拉领导了五年自动驾驶啊。”抱歉，我对Waymo的具体细节并不了解。顺带一提，我很喜欢Waymo，经常乘坐。我只是觉得，人们有时对这方面的进展看得有点过于天真。

And I still think there's a huge amount of work. And I think Tesla took, in my mind, a much more scalable approach, and I think the team is doing extremely well. And I'm kind of on the record for predicting how this will go, which is that Waymo had an early start because you can package up so many sensors. But I do think Tesla is taking the more scalable strategy, and it's going to look a lot more like that. So I think this will still have to play out, and it hasn't. But basically, I don't want to talk about self-driving as something that took a decade, because it didn't take a decade, if that makes sense. Because one, the start is at 1980, not 10 years ago. And then two, the end is not here yet. Yeah, the end is not near yet, because when we're talking about self-driving, usually in my mind it's self-driving at scale.

我仍然认为还有大量工作要做。我觉得特斯拉采取了一种在我看来更具扩展性的方法。我认为他们的团队表现得非常出色，并且会继续这样。我对这方面的预测一直很明确：尽管Waymo在早期有优势，因为它可以集成很多传感器，但我确实认为特斯拉采用了更具扩展性的策略，其成果将会大不相同。我觉得这一切还需要时间来验证。目前不能简单判断。基本上，我不想谈论自动驾驶是个花了十年的事情，因为其实不是这样的。事实上，这个起点是1980年，而不是十年前。其次，这个过程还远未结束。因为当我们谈论自动驾驶时，我通常想到的是大规模运用的自动驾驶。

Yeah. People don't have to get a driver's license, et cetera. I'm curious to bounce two other ways in which the analogy might be different. And the reason I'm especially curious about this is because I think the question of how fast AI is deployed, and how valuable it is when it's early on, is potentially the most important question in the world right now. Like, if you're trying to model what the year 2030 looks like, this is the question you want to have some understanding of. So another thing you might think is, one, you have this latency requirement with self-driving. I have no idea what the actual models are, but then you have something like tens of millions of parameters, which is not a necessary constraint for knowledge work with LLMs, or maybe it might be with computer use and stuff.

好的。人们不需要再去考驾照等等。我想再抛出两点，看看这个类比在哪些方面可能不成立。我之所以对此特别好奇，是因为我认为人工智能部署得有多快、早期能产生多大价值，可能是当前世界上最重要的问题。如果你想推演2030年是什么样子，这就是你需要有所理解的问题。所以，你可能会想到的第一点是，自动驾驶有延迟方面的要求。我不清楚实际的模型是什么样的，大概是数千万参数之类的规模，而这对于用大型语言模型做知识工作来说并不是必要的约束，不过在计算机操作之类的场景中也许会是。

But anyway, the other big one, maybe more important, is on this capex question. Yes, there is an additional cost to serving up an additional copy of a model, but the opex of a session is quite low, and you can amortize the cost of AI into the training run itself, depending on how inference scaling goes and stuff. But it's certainly not as much as building a whole new car to serve another instance of a model. So the economics of deploying more widely are much more favorable. I think that's right. I think if you're sticking to the realm of bits, bits are like a million times easier than anything that touches the physical world. I definitely grant that. Bits are completely changeable, arbitrarily reshuffleable at a very rapid speed. So you would expect much faster adaptation, also in the industry and so on.

无论如何，另一个重要的问题，也许更重要，是关于资本支出的。是的，多提供一个模型副本确实有额外成本，但一个会话的运营成本相当低，而且你可以把AI的成本摊到训练过程本身，这当然也取决于推理扩展的走向等因素。但这种成本肯定不像为了多服务一个模型实例而造一辆全新的车那么高。因此，更大范围部署的经济性要有利得多。我认为这是对的。如果你只停留在比特的世界里，处理比特要比处理任何触及物理世界的东西容易大约一百万倍。我完全同意这一点。比特完全可变，可以以极快的速度任意重新排列，所以你可以预期更快的适应，在行业中也是如此。

And then what was the first one? The latency requirements and their implications for model size. I think that's roughly right. I mean, I also think that if we are talking about knowledge work at scale, there will be some latency requirements, practically speaking, because we're going to have to create a huge amount of compute and serve that. And then I think the last aspect that I very briefly want to talk about is all the rest of it. So what does society think about it? What is the legal side, how is it working legally, how is it working insurance-wise? What are those layers of it and aspects of it? What is the equivalent of people putting a cone on a Waymo? You know, there are going to be equivalents of all that.

好的，那么第一个是什么？是延迟要求及其对模型大小的影响。我想大致上是这样的。我的意思是，如果我们谈论大规模的知识工作，实际上也会有一些延迟要求，因为我们必须建设并提供海量的算力。我还想简要谈一下最后一点，就是所有其他方面的内容。社会对此怎么看？它在法律上是如何运作的，保险方面如何？其中有哪些层次和方面？人们往Waymo车上放锥形路障，在这里对应的是什么？这一切都会有对应的等价物。

And so I almost feel like self-driving is a very nice analogy that you can borrow things from. Yeah, what is the equivalent of the cone on the car? What is the equivalent of a teleoperating worker who's hidden away? And almost all the aspects of it. Yeah. Do you have any opinions on what this implies for the current AI buildout, which would 10x the amount of available compute in the world in a year or two, and maybe more than 100x it by the end of the decade? If the use of AI will be lower than some people naively predict, does that mean that we're overbuilding compute, or is that a separate question? Kind of like what happened with railroads and all this kind of stuff.

我确实觉得自动驾驶是个很好的类比，你可以从中借鉴很多。比如说，车上的锥形路障对应的是什么？藏在幕后的远程操作员对应的又是什么？几乎所有这些方面都可以类比。你有没有什么看法：当前的AI基础设施建设将在一两年内把全球可用算力提高10倍，到本十年末可能提高超过100倍。如果人工智能的实际使用低于一些人天真的预测，这是否意味着我们在过度建设算力，还是说这是另一个问题？这有点像当年铁路建设之类的事情。

Was it the railroads? Sorry. Yeah. There is like a historical precedent. Or was it with the telecommunications industry, right? Like pre-paving the internet that only came a decade later, you know, and creating a whole bubble in the telecommunications industry in the late '90s, kind of thing. So I don't know. I mean, I understand I'm sounding very pessimistic here. I'm actually optimistic. I think this will work. I think it's tractable. I'm only sounding pessimistic because when I go on my Twitter timeline, I see all this stuff that makes no sense to me.

铁路行业有什么历史背景吗？抱歉。是的。类似于电信行业可能有这样的历史先例，对吧？就像为互联网的出现铺平了道路，而互联网大约在十年后才真正到来，并且在90年代末电信行业还曾经形成一个泡沫。所以我不太确定。我知道我听起来很悲观，但实际上我很乐观。我觉得这可行，我认为问题是可以解决的。我只是因为在我的推特时间线上看到很多让我摸不着头脑的内容，所以才显得有些悲观。

And I think there are a lot of reasons why that exists. A lot of it is, I think, honestly just fundraising. It's just incentive structures. A lot of it may be fundraising. A lot of it is just attention, you know, converting attention to money on the internet, stuff like that. So I think there's a lot of that going on, and I'm only reacting to that. But I'm still overall very bullish on the technology. I think we're going to work through all this stuff. And I think there's been a rapid amount of progress. I don't actually know that there's overbuilding. I think we're going to be able to gobble up what, in my understanding, is being built.

我认为这里有很多原因导致这种情况存在。坦率地说，其中很大一部分与筹款有关。这归结为激励结构。很多可能是为了筹款，还有很多是为了在互联网上把注意力转化为金钱之类的。因此，我认为这类事情正在大量发生，而我只是在对此作出反应。但总的来说，我对技术仍然非常乐观。我认为我们将会解决这些问题，并且已经取得了快速的进展。我不认为目前有过度建设。我认为我们将能够利用我们正在构建的内容。

Because I do think that, for example, Claude Code or OpenAI Codex and stuff like that, they didn't even exist a year ago. Right. Is that right? I think it's roughly right. This is miraculous technology that didn't exist. I think there's going to be a huge amount of demand, as we see the demand in ChatGPT already and so on. So yeah, I don't actually know that there's overbuilding. But I guess I'm just reacting to some of the very fast timelines that people continue to state incorrectly. And I've heard this many, many times over the course of my 15 years in AI, where very reputable people keep getting this wrong all the time.

因为我确实认为，比如像Claude Code或OpenAI Codex之类的工具，它们在一年前甚至还不存在。是吧？我想这大致是正确的。这是一种以前不存在的、不可思议的技术。我认为会有巨大的需求，正如我们已经在ChatGPT等产品上看到的那样。所以，是的，我其实不确定是否存在过度建设。我只是在对人们不断错误宣称的那些极短的时间表做出反应。在我从事AI的15年里，我听过很多很多次，一些非常有声望的人总是在这件事上一再出错。

And I want us to be properly calibrated. And some of this also has geopolitical ramifications and things like that, with some of these questions. And I don't want people to make mistakes in that sphere of things. So I do want us to be grounded in the reality of what the technology is and isn't. Let's talk about education and Eureka and stuff. One thing you could do is start another AI lab and then try to solve those problems. Yeah, I'm curious what you're up to now. And then, yeah, why not AI research itself? I guess maybe the way I would put it is.

我希望我们能够准确地校准自己。我认为其中一些问题也有地缘政治的影响。我不希望人们在这个领域犯错误。因此，我希望我们能够立足于技术的现实，了解技术是什么、不是什么。让我们来谈谈教育和Eureka。你可以做的一件事是再创办一个人工智能实验室，然后尝试解决这些问题。我很好奇你现在在做什么。还有，为什么不直接做人工智能研究呢？我想我可能会这样表达。

I feel some amount of determinism around the things that AI labs are doing. And I feel like I could help out there, but I don't know that I would uniquely improve it. But I think my personal big fear is that a lot of this stuff happens on the side of humanity, and that humanity gets disempowered by it. And I care not just about all the Dyson spheres that we're going to build, and that AI is going to build in a fully autonomous way. I care about what happens to humans. And I want humans to be well off in this future.

我觉得人工智能实验室正在做的事情带有某种决定论的意味。我觉得我可以在那里帮上忙，但我不确定自己能否带来独特的改进。不过，我个人最大的担忧是，这些事情中的很多是撇开人类、在人类身旁发生的，人类因此被削弱了权能。我关心的不只是我们将要建造的、以及人工智能将以完全自主的方式建造的那些戴森球，我关心人类会怎么样。我希望在这个未来中，人类能过得好。

And I feel like that's where I can a lot more uniquely add value than an incremental improvement in a frontier lab. And so I guess I'm most afraid of something maybe like what's depicted in movies like WALL-E or Idiocracy or something like that, where humanity is sort of on the side of this stuff. And I want humans to be much, much better in this future. And so I guess to me, it's through education that you can actually achieve this.

我觉得在这个方面，我能够独特地增加更多的价值，而不是仅仅在前沿实验室中进行递增的改进。所以，我最害怕的是一些类似于电影《机器人总动员》或《白痴时代》中的情景，那就是人类在这些科技发展中逐渐被边缘化。我希望人类在未来能够变得更好。因此，我认为，这实际上可以通过教育来实现。

And so what are you working on there? Yeah. So Eureka is trying to build... I think maybe the easiest way I can describe it is we're trying to build the Starfleet Academy. I don't know if you watch Star Trek. I haven't. Yeah. Okay. Starfleet Academy is this elite institution for frontier technology, building spaceships and graduating cadets to be the pilots of these spaceships and whatnot. So I just imagine an elite institution for technical knowledge.

所以你在那边正在做什么呢？好吧，Eureka 正在尝试建立一个组织，我想最简单的描述方式就是，我们在试图打造一个类似星际舰队学院的机构。我不知道你是否看过《星际迷航》。我没有。嗯，星际舰队学院是一个精英机构，致力于前沿科技的发展，比如建造飞船并培养驾驶这些飞船的学员。所以可以想象成一个致力于技术知识的精英机构。

And basically a kind of school that's very up to date and a premier institution. A category of questions I have for you is just explaining how one teaches technical or scientific content well, because you are one of the world masters at it. And I'm curious both about how you think about it for content you've already put out there on YouTube, yeah, but also, to the extent it's any different, how you think about it for Eureka.

基本上，这是一所非常与时俱进、首屈一指的学校。我有一类问题想请教你，就是如何把技术或科学内容教好，因为你是世界上最擅长此道的人之一。我既好奇你对自己已经发布在YouTube上的内容是怎么考虑的，也想知道在Eureka这件事上，如果有所不同的话，你的思考方式有何不同。

Yeah. Well, with respect to Eureka, I think one thing that is very fascinating to me about education is that I do think education will pretty fundamentally change with AIs on the side. And I think it has to be rewired and changed to some extent. I still think that we're pretty early. I think there are going to be a lot of people who are going to try to do the obvious things, which is like, oh, have an LLM and ask it questions and, you know, do all the basic things that you would do via prompting right now.

好的，关于对Eureka的看法，我觉得教育中非常吸引我的一点是，我确实认为教育会随着人工智能的应用而发生根本性的改变。我觉得教育体系需要在某种程度上重新调整和改变。不过，我认为我们目前仍处于初期阶段。我相信将会有很多人尝试做一些显而易见的事情，比如使用大型语言模型（LLM），通过向它提问和获取答案，做一些现在通过提示可以完成的基本任务。

I think it's helpful, but it still feels to me a bit like slop. I would like to do it properly. And I think the capability is not there for what I would want. What I'd want is an actual tutor experience. Maybe a prominent example in my mind is that I was recently learning Korean, so language learning. And I went through a phase where I was learning Korean by myself on the internet.

我认为这有帮助，但对我来说，感觉仍然有点像是草草了事。我希望能够做好这件事。而且我觉得目前的能力并不完全符合我的期望。我真正想要的是像一个真正的导师的体验。也许我脑海中一个显著的例子是，我最近在学习韩语，所以我只是在自学。我曾经有一段时间是在网上自学韩语。

I went through a phase where I was actually part of a small class in Korea, taking Korean with a bunch of other people, which was really funny. We had a teacher and like 10 people or so taking Korean. And then I switched to a one-on-one tutor. And I guess what was fascinating to me is, I think I had a really good tutor.

我经历了一个阶段，那时我参加了一个在韩国的小班学习韩语，这让我觉得很好玩。班上有大约十个人，还有一位老师。后来我改成了一对一的辅导。我觉得有趣的是，我认为我的辅导老师真的很优秀。

But I mean, just thinking through what this tutor was doing for me, and how incredible that experience was, and how high the bar is for what I actually want to build eventually. Because, I mean, she instantly, from a very short conversation, understood where I am as a student, what I know and don't know. And she was able to probe with exactly the kinds of questions or things needed to understand my world model.

我的意思是，回想一下这个导师为我做的事情，那种体验真是难以置信。而且，这也为我最终想要建立的标准设定了很高的门槛。因为她真的是非常厉害，仅仅通过很短的谈话，就能迅速理解我作为学生所处的水平，以及我知道和不知道的内容。她能够准确地提出问题或思考点，以便理解我的思维模式。

No LLM will do that for you 100% right now, not even close, right? But a tutor will do that if they're good. Once she understood, she really served me all the things that I needed at my current sliver of capability. I need to be always appropriately challenged. I can't be faced with something too hard or too trivial.

没有任何大型语言模型（LLM）能够在现在做到100%正确，甚至接近都达不到。但是一个好的导师可以做到。如果她理解了，她就会真正地根据我目前的能力水平，提供我所需要的一切。我需要的是适度的挑战，不能面对过于困难或过于简单的事情。

And a tutor is really good at serving you just the right stuff. And so basically I felt like I was the only constraint to learning. I was always given the perfect information. I'm the only constraint. And I felt good because I'm the only impediment that exists. It's not that I can't find knowledge, or that it's not properly explained, etc.

一位导师非常擅长为你提供正好合适的内容。所以，基本上我觉得学习上唯一的限制就是我自己。我始终获得了完美的信息，唯一的限制就是我自己。我感到不错，因为唯一存在的障碍就是我自己。并不是我找不到知识，或者知识没有得到妥善解释等等。

It's just my ability to memorize and so on. And this is what I want for people. How do you automate that? So, very good question. At the current capability, you don't. And that's why I think it's not actually the right time to build this kind of an AI tutor.

就像是我的记忆能力等等。这就是我希望大家能够拥有的能力。怎么自动化实现呢? 这是个好问题，但以现有的技术能力而言，目前还做不到。但我认为，随着技术的进步，我们有可能实现这一点。这也是我认为现在其实还不是开发这种AI导师的合适时机的原因。

I still think it's a useful product and lots of people will build it. But I still feel like the bar is so high and the capability is not there. I mean, even today I would say ChatGPT is an extremely valuable educational product. But I think for me it was so fascinating to see how high the bar is.

我仍然认为这是一个有用的产品，很多人将会去开发它。但我还是觉得标准太高，而现有能力还达不到。不过，即便在今天，我也会说ChatGPT是一个极具价值的教育产品。但对我来说，看到这个标准有多高，还是让我非常着迷。

And when I was with her, I almost felt like there's no way I can build this. But you're building it, right? Anyone who's had a really good tutor is like, how are you going to build this? So I guess I'm waiting for that capability. I do think that in a lot of ways in the industry... for example, I did some AI consulting for computer vision.

当我和她在一起时，我几乎觉得我不可能做到这一点。但你正在实现它，对吧？任何有过优秀导师的人都会想，你要怎么做到这一点？所以我想我在等待那种能力。我确实认为在很多方面，比如在行业中，我曾为计算机视觉做过一些人工智能咨询。

A lot of the time, the value that I brought to the company was telling them not to use AI. It was like, I was the AI expert, and they described the problem, and I said, don't use AI. This was my value. And I feel like it's the same in education right now, where I kind of feel like for what I have in mind, it's not yet the time, but the time will come. But for now I'm building something that looks maybe a bit more conventional, that has a physical and digital component and so on. But I think it's obvious how this should look in the future.

在很多时候，我对公司的贡献就是告诉他们不要使用人工智能。并不是说我是人工智能专家，他们描述了问题后我就建议不要用人工智能。这就是我的价值所在。我觉得在教育领域也是类似的情况，我感觉对于我所设想的内容，现在还不是时候，但那个时间会到来。现阶段，我正在构建一些看起来可能更为传统的东西，具有实体和数字组成部分等等。但我认为，很明显可以看出未来应该是什么样子。

To the extent you're willing to say, what is the thing you hope will be released this year or next year? Well, so I'm building the first course, and I want to have a really, really good course. The obvious state-of-the-art destination you go to learn, AI in this case, because that's just what I'm familiar with. So I think it's a really good first product to get to be really good. And so that's what I'm building, and nanochat, which you briefly mentioned, is a capstone project of LLM101N, which is a class that I'm building. So that's a really big piece of it. But now I have to build out a lot of the intermediates, and then I have to actually hire a small team of, you know, TAs and so on.

在你愿意透露的范围内，你希望今年或明年发布的是什么？我正在开发第一门课程，并希望把它打造成一门非常、非常好的课程，成为人们学习（在这里是学习AI）时显而易见的顶尖去处，因为这是我熟悉的领域。所以我认为把它做到极致是一个很好的首个产品。这就是我正在构建的东西，而你刚才提到的nanochat，是我正在开发的LLM101N课程的结业项目（capstone project）。因此，它是很重要的一部分。但是，现在我需要构建很多中间环节，并且还需要雇用一个小团队，包括助教等。

And actually build the entire course. And maybe one more thing that I would say is, many times when people think about education, they think about what I would say is kind of a softer component of diffusing knowledge. But I actually have something very hard and technical in mind. In my mind, education is kind of like the very difficult technical process of building ramps to knowledge. So in my mind, nanochat is a ramp to knowledge because it's very simple. It's the super-simplified full-stack thing. If you give this artifact to someone and they look through it, they're learning a ton of stuff. And so it's giving you a lot of what I call eurekas per second, which is understanding per second.

然后真正把整门课程构建出来。还有一点我想说的是，很多人想到教育时，往往会想到那种比较软性的部分，比如传播知识等。但我心中其实有一个非常硬核和技术性的理念。在我看来，教育就像是构建通向知识的坡道的一个非常困难的技术过程。在我看来，nanochat就是一条通向知识的坡道，因为它非常简单，是个超级简化的全栈项目。如果你把这个成品交给某个人，他们通读一遍，就能学到大量的东西。因此，它能给你很多我所说的“每秒顿悟数”（eurekas per second），也就是每秒的理解量。

That's what I want: lots of eurekas per second. And so to me, this is a technical problem of how we build these ramps to knowledge. And so I almost think of Eureka as maybe not that different from some of the frontier labs, or some of the work that's going to be going on there, because I want to figure out how to build these ramps very efficiently so that people are never stuck, and everything is always not too hard and not too trivial, and you have just the right material to actually progress.

这正是我想要的：大量的“每秒顿悟数”。在我看来，这是一个技术性的问题，即如何构建这些通往知识的坡道。所以我几乎觉得Eureka和一些前沿实验室、或者那里将要开展的一些工作，也许并没有那么不同。因为我想弄清楚如何非常高效地建造这些坡道，让人们永远不会被卡住，让一切始终不太难也不太简单，让你手头正好有适合继续前进的材料。

Yeah, so you're imagining in the short term that, instead of a tutor being able to probe your understanding, if you have enough self-awareness to be able to probe yourself, you're never going to be stuck. You can find the right answer between talking to the TA, or talking to an LLM, and looking at the reference implementation. It sounds like automation or AI is actually not a significant part, even. So far, the big alpha here is your ability to explain AI, codified in the source material of the class, right? That's fundamentally what the course is.

好的，所以你设想的是，在短期内，不是靠导师来探查你的理解，而是如果你有足够的自我意识来自我检查，你就永远不会被卡住。你可以通过与助教交流、与大型语言模型交流以及查看参考实现来找到正确答案。听起来自动化或者人工智能其实并不是其中很重要的部分。到目前为止，真正的优势在于你讲解AI的能力，它被固化在了这门课的原始材料里，对吧？这基本上就是课程的核心所在。

I mean, I think you always have to be calibrated to the capability that exists in the industry. And I think a lot of people are going to pursue like, oh, just ask ChatGPT, etc. But right now, for example, if you go to ChatGPT and you say, oh, teach me AI, there's no way. It's just going to give you some slop, right? AI is never going to write nanochat right now. But nanochat is a really useful, I think, intermediate point. So I'm collaborating with AI to create all this material. So AI is still fundamentally very helpful.

我的意思是，我认为你总是需要根据行业现有的能力来校准自己。我觉得很多人会走“哦，直接问ChatGPT就好”之类的路线。但是现在，比如说，如果你去对ChatGPT说“教我AI”，那是不行的，它只会给你一些粗制滥造的内容。AI现在不可能写出nanochat，但我认为nanochat是一个非常有用的中间节点。所以我仍然在与AI合作来创造所有这些材料，AI在根本上还是非常有帮助的。

Earlier on, I built CS231n at Stanford, which was one of the earlier... actually, sorry, I think it was the first deep learning class at Stanford, and it became very popular. And the difference between building out 231n then and LLM101N now is quite stark, because I feel really empowered by the LLMs as they exist right now. But I'm very much in the loop. So they're helping me build the materials. I go much faster. They're doing a lot of the boring stuff, etc. So I feel like I'm developing the course much faster, and it's LLM-infused, but it's not yet at a place where it can creatively create the content. I'm still there to do that.

早些时候，我在斯坦福开设了CS231n课程，它是斯坦福最早的深度学习课程之一，其实抱歉，我想它应该是第一门，这门课后来非常受欢迎。当年构建231n和现在构建LLM101N之间的区别非常明显，因为现有的大型语言模型（LLM）让我感到能力大增。但我仍然深度参与其中。它们帮助我制作课程材料，我的进度快了很多，它们处理了很多枯燥的工作等等。所以我感觉自己开发这门课程的速度快了很多，整个过程都融入了LLM，但它还没有达到能够创造性地生成内容的地步，这部分仍然要靠我来做。

So I think the trickiness is always calibrating yourself to what exists. And so when you imagine what is available through Eureka in a couple of years, it seems like the big bottleneck is going to be finding Karpathys in field after field who can convert their understanding into these ramps. So I think it would change over time. Right now, it would be hiring faculty to help work hand in hand with AI, and a team of people probably, to build state-of-the-art courses. And then I think over time, maybe some of the TAs can actually become AIs, because for some of the TA work, okay, you just take all the course materials, and then I think you could serve a very good automated TA for the students when they have more basic questions or something like that, right? But I think you'll need faculty for the overall architecture of a course and making sure that it fits.

所以，我觉得问题在于如何调整自己，以适应现有的情况。当你想象几年后Ureka能提供什么时，似乎最大的瓶颈将是找到各个领域的专家，他们可以将自己的理解转化为可行的路径。我认为这种情况会随着时间变化。目前，我认为需要招聘教师，与AI和一个团队合作，打造最先进的课程。同时，我认为随着时间的推移，一些助教可能实际上会变成AI，因为有些助教只需掌握所有课程材料，就可以很好地作为自动化助教，帮助学生解答一些基础问题。但是，我认为要确保课程的整体架构合理，并让其契合，还需要依靠教师。

And so I kind of see a progression of how this will evolve. And maybe at some future point, you know, I'm not even that useful and AI is doing most of the design much better than I could. But I still think that that's going to take some time to play out. But are you imagining that people who have expertise in other fields are then contributing courses, or do you feel like it's actually quite essential to the vision that you, given your understanding of how you want to teach, are the one designing the content? Like, I don't know, Sal Khan is narrating all the videos on Khan Academy. Are you imagining something like that or?

我大概能看到这方面发展的一个进程。也许在未来的某个时刻，人工智能在设计方面的表现会远超我个人的能力，以至于我变得不那么重要。不过，我仍然认为这需要一些时间才能实现。但你是否认为有其他领域的专家参与课程设计，还是觉得根据你对教学方法的理解，由你来设计内容是很关键的？就像萨尔·卡恩亲自为可汗学院的所有视频配音一样，你是否也想象成这样呢？

No, I will hire faculty, I think, because there are domains in which I'm not an expert. And I think that's the only way to offer the state-of-the-art experience for the student ultimately. So yeah, I do expect that I would hire faculty, but I will probably stick around in AI for some time. But I do have something more conventional in mind for the current capability, I think, than what people would probably anticipate. And when I'm building Starfleet Academy, I do probably imagine a physical institution and maybe a tier below that a digital offering that is not the state-of-the-art experience you would get when someone comes in physically full time.

不，我会聘请教师。我认为这是因为有些领域我并不是专家。为了让学生最终获得最前沿的学习体验，这可能是唯一的方法。所以，是的，我确实预期会聘请教师，但我可能会在人工智能领域停留一段时间。不过，我确实有一些计划是基于目前的能力，可能相对更传统一些。我想这也是人们所预料的。当我在创建星舰学院时，我确实想象这是一个实体机构，可能还会有一个稍逊色的数字课程。这并不是当有人全职到校时所能获得的顶尖体验。

And we work through material from start to end and make sure you understand it. That's the physical offering. Yeah. The digital offering is a bunch of stuff on the internet and maybe some LLM assistant, and it's a bit more gimmicky, in a tier below. But at least it's accessible to like 8 billion people. So yeah, I think you're basically inventing college from first principles for the tools that are available today. And then just selecting for the people who have the motivation and the interest of actually really engaging with the material.

我们会从头到尾学习材料，确保你理解。这是实地课程。至于在线课程，有很多网络上的资源，可能还包括一些LLM助手和一些更花哨的东西，在略低一层的类别中。但至少这些资源对全球大约80亿人都是可以接触到的。所以，我认为你基本上是在根据当前可用的工具，重新发明大学的教学方式。然后，类似地，我们挑选那些有动力和兴趣真正投入学习材料的人。

Yeah. And I think there's going to have to be a lot of not just education, but also reeducation. And I would love to help out there because I think the jobs will probably change quite a bit. And so for example, today a lot of people are trying to upskill in AI specifically. So I think it's a really good course to teach in this respect. And yeah, motivation-wise, before AGI, motivation is very simple to solve because people want to make money. And this is how you make money in the industry today. I think post-AGI is a lot more interesting, possibly, because yeah, if everything is automated and there's nothing to do for anyone, why would anyone go to school, etc.

是的，我认为不仅需要教育，还需要再教育。我很愿意在这方面提供帮助，因为我认为工作可能会有很大变化。比如说，今天很多人都在努力提升自己的AI技能，所以我觉得在这方面开设课程是非常有意义的。是的，我认为在通用人工智能（AGI）出现之前，动机很简单，因为人们想赚钱，而AI正是如今赚钱的方法。而在AGI出现后，情况可能会更有趣，因为如果一切都被自动化了，没有事情需要人们去做，那为什么还要去上学等等呢？

So I guess I often say that pre-AGI education is useful; post-AGI education is fun. And in a similar way, for example, people go to the gym today. Yeah. But we don't need their physical strength to manipulate heavy objects because we have machines to do that. Yeah. So they still go to the gym. Why do they go to the gym? Well, because it's fun, it's healthy, and you look hot when you have a six-pack, I don't know. I guess, like. And so I guess what I'm saying is it's attractive for people to do that. Right.

所以，我想表达的意思是，我经常说，人工智能（AGI）普及之前的教育是有用的，而AGI普及之后的教育则是有趣的。这有点像今天的人去健身房的情况。我们不需要用体力去搬动重物，因为机器可以帮我们做这些事情。但人们还是去健身房。为什么他们去呢？因为这很有趣，对健康有益，而且拥有六块腹肌也让你看起来更有吸引力。所以，我想说的是，这种活动对人们很有吸引力。

And in a certain, like, very deep psychological, evolutionary sense for humanity. Yeah. And so I kind of think that education will play out in the same way: you'll go to school like you go to the gym. And I think that right now not that many people learn because learning is hard. You bounce off of material, and some people overcome that barrier, but for most people it's hard.

从人类深层的心理和进化角度来看，我认为教育的趋势可能会像健身一样。你会像去健身房一样去上学。我觉得目前学习的人并不多，因为学习是件困难的事。人们常常在面对学习材料时感到挫折，虽然有些人能够克服这种障碍，但对大多数人来说，这并不容易。

Yeah. But I do think that it's a technical problem to solve. It's a technical problem to do what my tutor did for me when I was learning Korean. I think it's tractable and buildable and someone should build it. And I think it's going to make learning anything trivial and desirable, and people will do it for fun because it's trivial. If I had a tutor like that for any arbitrary piece of knowledge, I think it's going to be so much easier to learn anything and people will do it.

是的。但我确实认为我们应该把这当作一个技术问题来解决。就像我学习韩语时，我的导师为我做的那样，这是一个技术问题。我认为这是可行且可构建的，应该有人来实现它。我相信这会让学习任何东西变得简单且有吸引力，人们会因为其简单性而乐在其中。如果我能有这样一位导师来指导我学习任何知识点，我觉得学习任何东西都会变得更加容易，人们都会愿意去学习。

And they'll do it for the same reason as they go to the gym. I mean, that sounds different from using this. So post-AGI, you're using this basically as entertainment or as self-betterment. But it sounded like you had a vision also that this education is relevant to keeping humanity in control of AI. I see. And they sound different, and I'm curious, is it like it's entertainment for some people but then empowerment for some others? How do you think about that? I think, so, I do think like eventually it's a bit of a losing game, if that makes sense. I do think that it is, in the long term. Yeah. Long term, which I think is longer than maybe most people in the industry think, it's a losing game. I do think that people can go so far, and that we've barely scratched the surface of how far a person can go. And that's just because people are bouncing off of material that's too easy or too hard.

他们这样做的原因和去健身房的原因是一样的。我是说，这听起来和使用所谓的通用人工智能作为娱乐或自我提升的目的不同。但关键是，你也有一个设想，就是这种教育有助于人类继续控制人工智能。我明白了。这听起来似乎有所不同，我很好奇，这对某些人来说是种娱乐，但对另一些人来说是种赋能？你怎么看待这个？我确实觉得，人们会这样做，我确实认为，最终这有点像是一个输掉的游戏。如果这样说有道理的话，我确实认为，从长远来看，这种情况是这样的。长远来说，我觉得这比历史上大多数人的看法要长。这是一个输掉的游戏。我确实认为，人们可以走得很远，而我们几乎只触及了一个人所能达到的表面。这仅仅是因为人们接触到的材料太简单或太难。

And I actually kind of feel that people will be able to go much further. Like anyone speaks five languages because why not? Because it's so trivial. Anyone knows, you know, all the basic curriculum of undergrad, etc. Now that I'm understanding the vision, that's very interesting. Like I think it actually has a perfect analog in gym culture. I don't think a hundred years ago anybody would be, like, ripped. Like nobody would have, you know, been able to just spontaneously bench two plates or three plates or something. It's actually very common now. And it's because of this idea of systematically training and lifting weights in the gym, or systematically training to be able to run a marathon, which is a capability spontaneously you would not have, or most humans would not have. And you're imagining similar things for learning across many different domains, much more intensely, deeply, faster.

我实际上觉得人们将能走得更远。比如，为什么不能每个人都能说五种语言呢？因为这会变得很简单。每个人都掌握几百门学科的基础知识等等。现在我理解这个愿景了，非常有趣。我认为这与健身文化有完美的类比。大概一百年前，没有人会身体强壮得不得了。没有人能够轻松地卧推两三块杠铃，但现在这相当普遍。你因为有系统地在健身房锻炼、举重或训练跑马拉松，这些能力是你或大多数人自然不具备的。而你想象着类似的事情可以在许多不同领域的学习中发生，并且是极其深入、快速的。

Yeah, exactly. And I kind of feel like I am betting a little bit implicitly on some of the timelessness of human nature. And I think it will be desirable to do all these things. And I think people will look up to it as they have for millennia. And I think this will continue to be true. And I think maybe there's some evidence of that historically. Because if you look at, for example, aristocrats, or you look at maybe ancient Greece or something like that, whenever you had little pocket environments that were post-AGI in a certain sense, I do feel like people have spent a lot of their time flourishing in a certain way, either physically or cognitively. And so I think I feel okay about the prospects of that. And I think if this is false and I'm wrong and we end up in, like, you know, a WALL-E or Idiocracy future, then I think it's very... I don't even care if there's, like, Dyson spheres. This is a terrible outcome.

是的，完全正确。我有点感觉自己在隐约地赌人性的永恒。我认为做所有这些事情对人类来说是非常有吸引力的，并且我觉得人们会像过去几千年来一样仰慕它们。我相信这种情况还会继续存在。历史上也许有一些证据支持这个观点。例如，如果你看看贵族阶层或古希腊之类的社会，在某种程度上，它们就像是超越通用人工智能（AGI）之后的小环境。我觉得在这些环境中，人们常常以某种方式充实自己，不管是身体上还是精神上的舒适。所以我对这种发展的前景感到乐观。我认为，即使我错了，结果是电影《机器人总动员》或《傻蛋进化论》中描述的那种未来，我也会觉得非常糟糕。即便有戴森球这样的进步，这样的结局仍然是不可接受的。

Yeah. Like I actually really do care about humanity. Like everyone has to just be superhuman in a certain sense. I guess it's still a world in which that is not enabling us to... It's like the Culture world, right? Like you're not fundamentally going to be able to transform the trajectory of, yeah, technology, or, yeah, influence decisions by your own labor or cognition alone. Maybe you can influence decisions because the AI is asking for your approval. But it's not because I've invented something or I've come up with a new design that I'm really influencing the future. Yeah, maybe. I don't actually think that. I think there will be a transitional period where we are going to be able to be in the loop and, you know, advance things if we actually understand a lot of stuff.

好的。我其实真的关心人类。在某种意义上，每个人都必须变得超凡脱俗。我想我们仍然生活在一个不完全实现这种期望的世界中。这就像是个文化世界，对吧？你不可能仅凭自己的努力或聪明才智来根本性地改变技术发展的轨迹，或者对决定产生影响。或许你能影响决策，因为人工智能需要得到你的认可，但这并不是因为我发明了什么，或是提出了一个新的设计，从而真正影响了未来。我也许不完全这样认为。我觉得会有一个过渡时期，如果我们真的理解了很多东西，我们就有能力参与其中并推动事物的发展。

I do think that long term that probably goes away, right? But maybe it's going to even become a sport. Right now you have powerlifters who go extreme in this direction. So what is powerlifting in a cognitive era? Yeah. Maybe it's people who are really trying to make Olympics out of knowing stuff. Yeah. Like. And if you have a perfect AI tutor, maybe you can get extremely far. Yeah. I almost feel like we're just barely... The geniuses of today are barely scratching the surface of what a human mind can do, I think. Yeah. I love this vision. I also... It's like, I feel like the person you have the most product-market fit with is me, because my job involves having to learn different subjects every week. And I am very excited if you can.

我确实认为，从长远来看，这种情况可能会消失，对吧？但也许它甚至会发展成一项运动。现在，你有了在这方面非常极端的力量举重者。那么在认知时代，力量举重是什么呢？也许就是那些真的想通过知识获取奥运资格的人。如果有一个完美的AI导师，或许你可以走得非常远。我几乎觉得，今天的天才们只是稍微触及了人类大脑能够做到的事情的表面。我喜欢这种说法。同时，我觉得和这个概念最契合的人就是我，因为我的工作需要每周学习不同的科目。如果这成为现实，我会感到非常兴奋。

I'm similar for that matter. I mean, you know, a lot of people, for example, hate school and want to get out of it. I was, I was actually... I really liked school. I loved learning things, etc. I wanted to stay in school. I stayed all the way until PhD. And then they wouldn't let me stay longer. So I went to the industry. But I mean, basically, roughly speaking, I love learning, even for the sake of learning. But I also love learning because it's a form of empowerment and being useful and productive. I think you also made a point that was subtle, so just to spell it out: what's happened so far with online courses is that... why haven't they already enabled every single human to know everything? And I think they're just so motivation-laden because there are no obvious on-ramps. And it's so easy to get stuck.

我的情况也差不多。比如说，很多人讨厌上学，并且一离开学校就不想回去。而我实际上非常喜欢上学，我喜欢学习新知识等等。我希望能一直呆在学校，所以我读到博士学位。然而，学校不再允许我继续待下去，所以我便去了行业工作。大体上说，我热爱学习，不仅仅是因为学习本身，还因为学习能让我更有能力，更有用，更有生产力。我觉得你也提到了这一点，从某种程度上说，学习的重要性已经被清楚地表达出来。我觉得现在的网络课程有个问题就是：为什么它们没有让我们掌握所有我们需要知道的东西。我认为这是因为课程需要强烈的自我激励，而且缺少明显的入门途径，这使得我们很容易停滞不前。

And if you had, instead of this, basically a really good human tutor, it would just be such an unlock from a motivation perspective. I think so. Because it feels bad to bounce off of material. It feels bad. You get negative reward from sinking an amount of time into something and it doesn't pan out, or from being completely bored because what you're getting is too easy or too hard. So I think, yeah, when you actually do it properly, learning feels good. Yeah. And I think it's a technical problem to get there. And I think for a while it's going to be AI plus human collab. And at some point, maybe it's just AI.

如果有这样一个系统，而不是现在这样，基本上就像一个非常出色的人类导师，从激励的角度来看，效果会截然不同。我是这样认为的。因为在学习材料之间来回切换会让人感到不适。不舒适感来自于投入大量时间却没有成果，或者是因为材料太简单或太难使人感到极其无聊。所以我认为，当你真正掌握正确的学习方法时，学习的感觉会很好。我认为这是一个技术问题，需要去解决。我觉得在一段时间内，学习可能会是人工智能和人类的合作，然后有一天可能就只依靠人工智能了。

Yeah, I don't know. Can I ask some questions about teaching well? If you had to give advice to another educator in another field that you're curious about. To make the kinds of YouTube tutorials you've made. Maybe it might be especially interesting to talk about domains where you can't just. You can't test somebody's technical understanding by having them code something up or something. What advice would you give them? So I think that's a pretty broad topic. I do feel like there's basically. I almost feel like there are 10, 20 tips and tricks that I kind of semi-consciously probably do. But I guess like on a high level, I always try to.

是的，我也不太清楚。不过我可以问一些关于如何有效教学的问题吗？如果你要给对其他领域感兴趣的教育工作者一些建议，让他们制作类似于你制作的YouTube教程。特别是在一些领域中，你不能通过让学生编写代码来测试他们的技术理解时，你会给他们什么建议？我觉得这是一个相当广泛的话题。我觉得在这方面，我基本上也有大概10到20个半意识到的小技巧。不过在更高的层次上，我总是尝试着去......

I think a lot of this comes from my physics background. I really, really did enjoy my physics background. I have a whole rant on how everyone should learn physics in early school education. Because I think early school education is not about accumulating knowledge or memory for tasks later in the industry. It's about booting up a brain. And I think physics uniquely boots up the brain the best. Because some of the things that they get you to do in your brain during physics are extremely valuable later. The idea of building models and abstractions and understanding that there's a first-order approximation that describes most of the system.

我觉得这主要源于我的物理学背景。我真的非常热爱我的物理学背景。我一直认为，每个人都应该在早期学校教育中学习物理。因为我认为早期的学校教育不只是为了将来在行业中记忆知识或完成任务，而是为了启动大脑。而我认为物理学能最有效地激活大脑。因为在学习物理时，你的大脑需要进行的一些思考，对以后的发展非常有益。比如，构建模型、进行抽象思维和理解有时只需要一个初步的近似就能描述大部分系统的原理。

But then there's a second-order, third-order, fourth-order term that may or may not be present. And the idea that you're observing like a very noisy system, but actually there are these fundamental frequencies that you can abstract away. Like when a physicist walks into the class and they say, oh, assume there's a spherical cow and dot, dot, dot. And everyone laughs at that. But actually this is brilliant, brilliant thinking that's very generalizable across the industry. Because yeah, a cow can be approximated as a sphere, I guess, in a bunch of ways. There's a really good book, for example, Scale. It's basically from a physicist talking about biology. And maybe this is also a book I would recommend reading.

但是，接下来可能还会有二阶、三阶、四阶项，它们可能存在，也可能不存在。这个想法就像是你在观察一个非常嘈杂的系统，但实际上，你可以抽象出其中的一些基本频率。就好像当一个物理学家走进课堂时，说：“假设有一头球形的奶牛，等等。”大家通常会笑，但实际上这是一种非常聪明的思维方式，在各行各业中都具有普遍适用性。因为奶牛可以在某种程度上被近似为一个球体。例如，有一本非常好的书《规模》，基本上是一个物理学家讨论生物学的内容。也许这也是一本我推荐阅读的书。
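One common way to write down the "first-order term plus higher-order corrections" idea is a Taylor expansion. The pendulum example below is an illustration added here, not something from the conversation: keeping only the first-order term of the sine turns a nonlinear equation into a simple harmonic oscillator, and the cubic term is the correction you tack on later if it matters.

```latex
f(x_0 + \epsilon) = f(x_0) + f'(x_0)\,\epsilon + \tfrac{1}{2} f''(x_0)\,\epsilon^2 + O(\epsilon^3)

\sin\theta = \theta - \frac{\theta^3}{6} + O(\theta^5)
\quad\Longrightarrow\quad
\ddot{\theta} = -\frac{g}{\ell}\sin\theta \;\approx\; -\frac{g}{\ell}\,\theta
```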

But you can actually get a lot of really interesting approximations and chart scaling laws of animals. And you can look at their heartbeats and things like that. And they actually line up with the size of the animal and things like that. You can talk about an animal as a volume. And you can actually derive a lot of... You can talk about the heat dissipation of that. Because your heat dissipation grows as the surface area, which is growing as a square. But your heat creation or generation is growing as a cube. And so I just feel like physicists have all the right cognitive tools to approach problem solving in the world.

你实际上可以得到很多有趣的近似值和动物缩放图。你还可以得到它们的心跳之类的数据，这些数据实际上与动物的体型以及其他相关因素相对应。你可以把动物当做一个体积来讨论，并从中推导出很多东西。你可以谈论它们的散热，因为随着表面积的增长，散热量也在增加，而表面积是按平方增长的。然而，热量的产生或生成是立方增长的。因此，我觉得物理学家拥有正确的认知工具来研究和解决这个世界的问题。
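The square-versus-cube argument can be checked numerically. The sketch below assumes the animal is a sphere (the "spherical cow" simplification); the function name `surface_to_volume` is made up for illustration. Surface area stands in for heat dissipation and volume for heat generation, so the ratio shows how much dissipating surface each unit of heat-generating volume gets.

```python
import math


def surface_to_volume(radius):
    """Surface area divided by volume for a sphere of the given radius."""
    area = 4.0 * math.pi * radius ** 2             # grows as radius squared
    volume = (4.0 / 3.0) * math.pi * radius ** 3   # grows as radius cubed
    return area / volume                           # simplifies to 3 / radius


# The ratio shrinks as the sphere grows: bigger bodies have less
# dissipating surface per unit of heat-generating volume.
for r in (0.1, 1.0, 10.0):
    print(r, surface_to_volume(r))
```

Because the ratio falls off as 1/radius, a body ten times larger has one tenth the surface per unit volume, which is the kind of first-order relationship the scaling-law charts capture.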

So I think because of that training, I always try to find the first-order terms, or the second-order terms, of everything. When I'm observing a system or a thing, I have a tangle of a web of ideas or knowledge in my mind. And I'm trying to find what is the thing that actually matters? What is the first-order component? How can I simplify it? How can I have the simplest thing that actually shows that thing, shows it in action? And then I can tack on the other terms. Yeah. Maybe an example from one of my repos that I think illustrates it well is called micrograd. I don't know if you're familiar with this.

我想因为受到这样的训练，我总是试图找到事物的一阶项或二阶项。当我观察一个系统或事物时，我的大脑中有一张想法和知识的复杂网络。我不断地尝试找出真正重要的是什么？什么是一阶成分？我如何简化它？我如何用简单的东西来展示它，它能够显示出一个动作。之后，我再去处理其他方面的问题。也许我可以用我一个项目中的例子来说明这一点，这个项目叫做micrograd，不知道你是否了解。

Micrograd is 100 lines of code that shows backpropagation. You can create neural networks out of simple operations like plus and times, etc. Lego blocks of neural networks. And you build up a computational graph and you do a forward pass and a backward pass to get the gradients. Now this is at the heart of all neural network learning. So micrograd is 100 lines of pretty interpretable Python code. And it can do forward and backward passes for arbitrary neural networks, but not efficiently. So micrograd, these 100 lines of Python, are everything you need to understand how neural networks train. Everything else is just efficiency. Yeah. Everything else is efficiency.

Micrograd是一个由100行代码组成的程序，用来展示反向传播的过程。你可以通过简单的加法和乘法等操作来创建神经网络，就像搭建乐高积木一样。你会创建一个计算图，然后通过前向传播和反向传播来获取梯度。而这正是所有神经网络学习的核心所在。所以，micrograd包含了100行相对容易理解的Python代码。它可以对任意的神经网络进行前向和反向传播，但效率不高。因此，micrograd这100行Python代码提供了你需要了解神经网络训练方式的基础，而其他的事情都只是为了提升效率。是的，其他的都只是效率问题。

And there's a huge amount of work to get efficiency. You know, you need your tensors. You lay them out. You stride them. You make sure your kernels are orchestrating memory movement correctly, etc. It's all just efficiency, roughly speaking. But the core intellectual piece of neural network training is micrograd. It's 100 lines. You can easily understand it. It's a recursive application of the chain rule to derive the gradient, which allows you to optimize any arbitrary differentiable function. So I love finding these, you know, small-order terms.

在效率方面有大量的工作要做。你需要准备好张量，布置它们，设置步长，确保你的内核正确地管理内存移动，等等。这一切归根结底都是在追求效率。但神经网络训练的核心思想是微梯度（micrograd），它只有100行代码，你可以很容易理解。它通过递归地应用链式法则来计算梯度，可以优化任意的微分函数。所以我非常喜欢找到这些简化的方式。
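To make the idea concrete, here is a minimal sketch of a scalar autograd engine in the spirit of what is described above. It is not the actual micrograd source: the class name `Value`, its attributes, and the tiny example are assumptions for illustration, and only plus and times are supported. Each operation records its inputs, and `backward` applies the chain rule recursively from the output back to the leaves.

```python
class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order the graph topologically, then apply the chain rule
        # from the output node back toward the inputs.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


# Forward pass builds the graph for y = a * b + a; backward pass fills in grads.
a, b = Value(2.0), Value(3.0)
y = a * b + a
y.backward()
print(y.data, a.grad, b.grad)  # 8.0 4.0 2.0
```

Gradients are accumulated with `+=` so that a value used more than once (like `a` here) receives the sum of its contributions. Everything a tensor library adds on top of this, such as batching, memory layout, and kernels, falls under the efficiency work mentioned above.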

Yeah. And serving them on a platter and discovering them. And I feel like education is the most intellectually interesting thing because you have a tangle of understanding and you're trying to lay it out in a way that creates a ramp where everything only depends on the thing before it. And I find that this, you know, untangling of knowledge is just so intellectually interesting as a cognitive task. Yeah. And so I love doing it personally, but I just find I have a fascination with trying to lay things out in a certain way. Maybe that helps me. It also just makes a learning experience so much more motivated.

是的，把它们呈现在一个盘子上，然后去发现它们。我觉得教育是最引人入胜的智力活动，因为你需要理清复杂的知识，并通过这种方式创建一个前后依赖关系的逻辑链。我发现这个解开知识的过程作为一种认知任务非常有趣。我个人很喜欢这样做，同时我也对如何以某种方式陈述知识感到着迷。这可能对我有帮助，也让学习体验变得更加有动力。

Your tutorial on the transformer begins with bigrams, literally like a lookup table from, here's the word right now, or here's the previous word, here's the next word, and it's literally just a lookup table. Yes, that's the essence of it, yeah. I mean, such a brilliant way. Like, okay, start with the lookup table and then go to a transformer. And each piece is motivated. Why would you add that? Why would you add the next thing? You could memorize the attention formula, but it's different to have an understanding of why every single piece is relevant, what problem it solves.

您的关于Transformer的教程从二元语法开始，这就像一个查找表一样：这是当前的词，或者前一个词，然后是下一个词，这实际上就是一个查找表。这就是它的本质，对吧。我认为这是一种非常巧妙的方式。就像，好的，从查找表开始，然后再进入Transformer。每一部分都有其动机。为什么要加入这一部分？为什么要加入下一个部分？您可以记住一种注意力公式，这就像理解为什么每个部分都是相关的，它解决了什么问题。
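As a rough illustration of the "lookup table" starting point, the sketch below builds a word-level bigram table from a toy corpus and uses it to predict the next word. The corpus, the function names `build_bigram_table` and `predict_next`, and the word-level tokenization are assumptions for illustration; this is not code from the tutorial being discussed.

```python
from collections import Counter, defaultdict


def build_bigram_table(words):
    """Count how often each word follows each previous word."""
    table = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        table[prev][nxt] += 1
    return table


def predict_next(table, prev):
    """Return the most frequent follower of `prev`, or None if `prev` was never seen."""
    followers = table.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


corpus = "the cat sat on the mat and the cat slept".split()
table = build_bigram_table(corpus)
print(predict_next(table, "the"))  # cat
```

The model only ever looks at one previous word, which is exactly the limitation that motivates each later piece on the way to a transformer.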

Yeah, yeah, yeah. You're presenting the pain before you present a solution, and how clever is that? And you want to take the student through that progression. So there's a lot of other small things like that that I think make it nice and engaging and interesting. And, you know, always prompting the student, there's a lot of small things like that that I think are, you know, important and a lot of good educators will do. Like, how would you solve this? Like, I'm not going to present a solution before you're going to guess.

是的，是的，是的。你在提出解决方案之前先呈现问题，这种方法多么巧妙啊！你希望带领学生经历这个过程。我认为还有很多类似的小细节使得教学过程既有趣又吸引人。你知道，不断激励学生，我认为有许多这样的细节是很重要的，很多优秀的教育者都会这样做。比如，你会怎么解决这个问题？我不会在你猜测之前就给出解决方案。

Yeah. Yeah, that would be wasteful. That's a little bit of a, I don't want to swear, but it's a dick move towards you to present you with the solution before I give you a shot to try to, um, right, to come up with it yourself. Yeah. And because if you try to come up with it yourself, I guess you get a better understanding of, like, what is the action space? Yeah. And then what is the objective? Then, like, why does only this action fulfill that objective, right?

对，对，那样做是非常浪费的。这有点像一个不太礼貌的行为，就是在我给你机会自己尝试解决问题之前，就直接告诉你答案。如果你自己试着去解决，你可能会对“有哪些可选的行动”有更好的理解，以及什么是“目标”。然后，你就会明白为什么只有这个行动可以实现那个目标。

Yeah. Well, you have a chance to try it yourself, and you get an appreciation when I give you the solution. And, uh, it maximizes the amount of knowledge per new fact added. That's right. Yeah. Why do you think by default people who are genuine experts in their field are often bad at explaining it to somebody ramping up? Well, it's the curse of knowledge and expertise. Yeah. This is a real phenomenon. And I actually suffer from it myself as much as I try not to. But you take certain things for granted and you can't put yourself in the shoes of new people who are just starting out.

是的。嗯，你有机会尝试一下自己解决问题，当我给你提供解决方案时，你表达了赞赏。而且，你能够最大化每条新增知识的价值。没错。你认为为什么那些真正是各自领域专家的人往往不擅长为新手解释呢？这就是知识和专业的诅咒。是的，这是一种真实存在的现象。实际上，我自己也曾深受其害，尽管我尽量避免这种情况发生。但是，你会理所当然地认为某些事情是理所当然的，无法站在刚开始学习的人的角度去考虑问题。

And, uh, this is pervasive and happens to me as well. One thing that I actually think is extremely helpful: as an example, someone was trying to show me a paper in biology recently. And I just had instantly so many terrible questions. Uh, so what I did was I used ChatGPT to ask the questions with the paper in the context window. And then, uh, it worked through some of the simple things. And then I actually shared the thread with the person who shared it, uh, who actually wrote that paper or worked on that work.

这很普遍，也发生在我身上。有一件事让我觉得非常有帮助。最近有人试图给我看一篇生物学的论文，我瞬间产生了很多困惑的问题。于是，我用了ChatGPT来提问，把论文放在它的上下文窗口中。它帮我解决了一些简单的问题，然后我把这段对话分享给了给我看论文的那个人，他可能就是写这篇论文或者参与这项工作的人。

And I almost feel like, um, if they can see the dumb questions I had, it might help them explain better in the future or something like that. Because, um, so for example, for my material, I would love if people shared their dumb conversations with ChatGPT about the stuff that I've created. Because it really helps me put myself again in the shoes of someone who's starting out.

我几乎觉得，如果他们能看到我那些看似愚蠢的问题，可能有助于他们在未来更好地解释。因为，比如说，对于我的材料，我很希望人们分享他们与ChatGPT关于我所创作内容的那些“愚蠢”对话。这真的能帮助我再一次站在刚刚起步的人的角度去思考。

Another trick like that that just works astoundingly well. Um, if somebody writes a paper, or a blog post, or an announcement, it is in a hundred percent of cases true that just the narration or the transcription of how they would explain it to you over lunch is way more, uh, not only understandable, yeah, but actually also more accurate and scientific, in the sense that people have a bias to explain things in the most abstract, jargon-filled way possible and to clear their throat for four paragraphs before they explain the central idea.

还有一个非常有效的小技巧。如果有人写了一篇论文、博客文章或公告，几乎可以肯定地说，他们在午餐时直接向你解释会比书面内容更容易理解，也更加准确和科学。因为人们往往有一种倾向，总是想用最抽象、术语满满的方式来解释事情，而且总会在真正解释核心思想之前铺垫好几段。

But there's something about communicating one on one with a person which compels you to just say the thing. Just say the thing, yeah. Actually, I saw that tweet. I thought it was really good. I shared it with a bunch of people actually. I think it was really good. And I noticed this many, many times. Maybe the most prominent example is I remember back in my PhD days doing research, etc. You read someone's paper, right? And you work, you try to understand what it's doing, etc.

但跟一个人一对一交流时，总有某种力量促使你直接说出想说的话。是的。我确实看到了那条推文，我觉得非常好，所以分享给了很多人。我反复注意到这种情况。也许最鲜明的例子是，我记得在读博期间做研究的时候。你读别人的论文，试图理解他们在做什么等等。

And then you catch them, you're having beers at the conference later, and you ask them, so like this paper, what were you doing? Like what is the paper about? And they will just tell you these three sentences that perfectly capture the essence of that paper and totally give you the idea, and you didn't have to read the paper. And it's only when you're sitting at the table with a beer or something like that, and they're like, oh, the paper is just, oh, you take this idea, you take that idea and try this experiment and you try this thing.

然后你抓住机会，在会议上跟他们一起喝啤酒时问他们：“这个论文是关于什么的？”他们会用三句话完美地概括论文的核心，让你明白其中的意思，这样你就不需要读整篇论文。通常就是在这种喝啤酒聊天的轻松环境中，他们会说：“哦，这篇论文就是你把这个想法和那个想法结合起来，然后进行这个实验。”

And they have a way of just putting it conversationally. Right. And just like perfectly, like, why isn't that the abstract? Exactly. This is coming from the perspective of how somebody who's trying to explain an idea should formulate it better. What is your advice as a student to other students, where if you don't have a Karpathy who is doing the exposition of an idea, if you're reading a paper from somebody or reading a book, what strategies do you employ to learn material you're interested in, in fields you're not an expert in?

他们总是能够以一种对话的方式来表达这个想法。对，就好像是在说，为什么这不是摘要呢？没错。这是从一个人如何更好地阐述一个想法的角度出发的。如果你作为学生如何建议其他学生，在没有人帮助解释一个想法的情况下，当你在阅读某人的论文或书籍时，你会使用什么策略来学习你感兴趣但并不擅长的领域的材料？

I don't actually know that I have unique tips and tricks, to be honest. Basically, it's kind of a painful process. But you know, like, redraft one. I think one thing that has always helped me quite a bit is, I had a small tweet about this actually. So like learning things on demand is pretty nice, learning depth-wise. I do feel like you need a bit of alternation of learning depth-wise, on demand.

老实说，我并没有什么独特的技巧和窍门可分享。基本上，这个过程确实有点困难。但你知道，可以试着重写草稿。有一件事一直对我很有帮助，我之前发过一条关于这方面的小推文。根据需要去学习一些东西是很不错的选择，从深度上去学习。我觉得你需要在按需深度学习的时候，进行一些变换和调整。

You're trying to achieve a certain project that you're going to get a reward from, and learning breadth-wise, which is just, oh, let's do whatever 101. And here's all the things you might need, which is a lot of school. There's a lot of breadth-wise learning, like, oh, trust me, you'll need this later. You know, that kind of stuff. Like, okay, I trust you. I'll learn it because I guess I need it. But I love the kind of learning where you actually get a reward out of doing something and you learn on demand.

你正在努力完成一个项目，这个项目可以让你获得奖励。这与那种“广泛学习”不同，所谓“广泛学习”就是像“让我们随便学点基础知识101，这里是你可能需要的所有东西，学校里有很多这种学习，类似于‘相信我，你以后会需要的’这种感觉。你会想，‘好吧，我相信你，我会学，因为我觉得我需要’”。但我更喜欢那种通过实际做事情而获得奖励，并且按需学习的方式。

The other thing that I've found is extremely helpful is maybe this is an aspect where education is a bit more selfless because explaining things to people is a beautiful way to learn something more deeply. This happens to me all the time. I think it probably happens to other people too because I realize if I don't really understand something, I can't explain it. You know, and I'm trying and I'm like, actually, I don't understand this.

我发现另一个非常有帮助的事情是，也许这是教育更为无私的一个方面，因为向他人解释事物是一种更深入学习的美妙方式。这种情况经常发生在我身上。我觉得这种情况也可能发生在其他人身上，因为我意识到如果我自己并不真正理解某件事，我就无法解释它。你知道，尝试解释时，我才意识到，其实我并不理解这件事。

And it's so annoying to come to terms with that. And then you can go back and make sure you understood it. And so it fills these gaps in your understanding. It forces you to come to terms with them and to reconcile them. I love to re-explain things like that. And I think people should be doing that more as well. I think that forces you to manipulate knowledge and make sure that you know what you're talking about when you're explaining it.

因此，要学会接受这一点。然后，你可以回过头来确保自己理解了。这能填补你理解上的空白，迫使你接受并解决这些问题。我喜欢重新解释这些东西，我觉得人们也应该多这样做。我认为这样做迫使我们运用知识，并在解释时确保自己真正明白所说内容。

I think that's an excellent note to close on. Andrej, that was great. Thank you. Thanks. Good luck, Tom. Hey everybody. I hope you enjoyed that episode. If you did, the most helpful thing you can do is just share it with other people who you think might enjoy it. It's also helpful if you leave a rating or a comment on whatever platform you're listening on.

我觉得这是一个很好的结束语。安德烈，那真是太棒了，非常感谢你。谢谢。祝你好运，汤姆。大家好，希望你们喜欢这一集。如果你喜欢，最有帮助的就是把它分享给其他你认为可能会喜欢的人。如果你能在你所使用的平台上留言或评分也会有帮助。

If you're interested in sponsoring the podcast, you can reach out at dwarkesh.com/advertise. Otherwise, I'll see you on the next one.

如果您有兴趣赞助这个播客，请访问 dwarkesh.com/advertise 联系我们。否则，我们下期再见。