Mark Zuckerberg – Meta’s AGI Plan
Published 2025-04-29 16:07:06
In this podcast episode, Mark Zuckerberg discusses Meta's latest advancements in AI, specifically focusing on Llama 4 and Meta AI's integration across various platforms. He highlights that Meta AI has almost a billion monthly users, indicating significant adoption. Personalization is a key focus, leveraging user context from feeds, profiles, social graphs, and AI interactions to enhance the user experience.
Zuckerberg is optimistic about the open-source AI landscape, noting the increasing competition and innovation beyond just Meta. He acknowledges the specialization happening, with some models excelling in reasoning tasks like math and coding, but emphasizes the importance of latency and intelligence per cost for consumer-facing applications. He states that Meta is also developing a reasoning model to launch eventually.
He touches on the challenges of benchmarking AI models, emphasizing that Meta prioritizes real-world user value within Meta AI products over generic benchmarks, which he believes are easily gameable and often don't align with actual user needs. Zuckerberg states that Meta's North Star is user satisfaction and feedback within its products.
Zuckerberg agrees with the premise that AI will automate software engineering and AI research, potentially leading to an intelligence explosion. Meta is actively developing coding agents and AI research agents to advance Llama research. He predicts that within 12 to 18 months, AI will write most of the code for these efforts, surpassing human capabilities in quality and testing. He believes this coding advancement is an important, but singular, part of the AGI process.
Despite this, Zuckerberg argues that achieving AGI is not solely dependent on coding automation but also requires physical infrastructure, supply chain development, regulatory frameworks, and, critically, user co-evolution. Users learning how to interact with AI, and AI learning from users, will be integral to the process.
He believes that Meta's distribution across its platforms, including WhatsApp, and the upcoming Meta AI app are crucial for gathering training data and building comprehensive AI assistants.
Zuckerberg acknowledges concerns about potential misuse or unhealthy relationships with AI. He argues that prematurely restricting AI use stifles innovation, stating, "people are smart; they know what is valuable in their lives." He believes as AI evolves, society will find ways to articulate its value. He touches on the idea that, on average, people don't have as many friends as they would like. In a future with AGI, AI could potentially bridge that gap, if users accept it.
He addresses worries about the removal of friction for reward hacking by AR technology, clarifying that specific multitasking demos are just illustrations and aren't a final product goal. He emphasizes that the glasses' design priority will be user comfort.
Zuckerberg recognizes China's increasing focus on building physical infrastructure for AI and highlights the need for the US to streamline data center and energy production. He acknowledges the role of export controls in hindering Chinese AI development. He highlights the security and safety vulnerabilities that are embedded into international competitors' models.
He defends the Llama license, stating that it balances open-source access with ensuring Meta can engage in conversations with large cloud companies about commercial use. He states that it is not about stopping other AI labs from using the Llama models.
Touching on AI governance, Zuckerberg emphasizes Meta's responsibility to own its decisions and take community feedback, but not defer too much to outside actors. He believes it is important to have good relationships with the government.
Finally, Zuckerberg touches on unlocking value from AI; he believes there will be business models for all the various uses of AI. He believes Meta's basic value on this is that they want to serve as many people in the world as possible. He expects the cultural and social enrichment from AI will have massive effects.
Summary
Zuck on: * Llama 4, benchmark gaming, open vs. closed source * Intelligence explosion, business models for AGI * DeepSeek/China, ...
Transcript
I would guess that the world is going to get a lot funnier and, like, a lot weirder. If you think that something someone is doing is bad, and they think it's really valuable, most of the time in my experience, they're right and you're wrong. I am worried that we're just removing all the friction between getting totally reward hacked by our technology. We are trying to build a coding agent that advances Llama research. I would guess that sometime in the next 12 to 18 months, we'll reach the point where most of the code that's going towards these efforts is written by AI. I tend to think that, for at least the foreseeable future, this is going to lead towards more demand for people doing work, not less. If you've got the cost of providing that service down to one tenth of what it would have otherwise been, maybe now that actually makes sense to go do.
All right, Mark, thanks for coming on the podcast again. Yeah, happy to do it. Good to see you. You too. Last time you were here, you had launched Llama 3. Yeah. Now you've launched Llama 4? Well, the first version. That's right. What's new? What's exciting? What's changed? Oh, well, I mean, the whole field is so dynamic. So I mean, I feel like a ton has changed since the last time that we talked. Meta AI has almost a billion people using it now, monthly. So that's pretty wild. And I think that this is going to be a really big year on all of this, because especially once you start getting the personalization loop going, which we're just starting to build in now, really, from both the context that all the algorithms have about what you're interested in in your feed and all your profile information, all the social graph information, but also just what you're interacting with the AI about.
I think that's just going to be kind of the next thing that's going to be super exciting. So really big on that. The modeling stuff continues to make really impressive advances too, as you know. The Llama 4 stuff, I'm pretty happy with the first set of releases. You know, we announced four models, and we released the first two, the Scout and Maverick ones, which are kind of like the mid-size models, mid-size to small. Actually, the most popular Llama 3 model was the 8 billion parameter model. So we've got one of those coming in the Llama 4 series too. Our internal code name for it is Little Llama, but that's coming probably over the coming months. But the Scout and Maverick ones, they're good. They're some of the highest intelligence per cost that you can get of any model that's out there, they're natively multimodal, very efficient, run on one host, designed to just be very efficient and low latency for a lot of the use cases that we're building for internally.
And that's our whole thing. We basically build what we want, and then we open source it, so other people can use it too. So I'm excited about that. I'm also excited about the Behemoth model, which is coming up. That's going to be our first model that is sort of at the frontier. I mean, it's like more than two trillion parameters. So, as the name says, it's quite big. So we're trying to figure out how we make that useful for people. It's so big that we've had to build a bunch of infrastructure just to be able to post-train it ourselves. And we're trying to wrap our head around how the average developer out there is going to be able to use something like this. How do we make it so it can be useful for distilling into models that are of a reasonable size to run? Because you're obviously not going to want to run something like that in a consumer model.
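Distillation in this sense usually means training a smaller student model to match the big teacher's output distribution rather than only the ground-truth tokens. Below is a minimal sketch of that idea in PyTorch, assuming generic `teacher` and `student` causal language models; the function, temperature, and weighting are illustrative, not Meta's actual post-training pipeline.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-target KL term (student matches teacher) with the usual
    next-token cross-entropy on the ground-truth labels."""
    # Soft targets: temperature T smooths both distributions; the T*T factor
    # keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard language-modeling loss on the real tokens.
    hard = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)), labels.view(-1)
    )
    return alpha * soft + (1 - alpha) * hard

# Inside a training step (teacher frozen, student being trained):
# with torch.no_grad():
#     teacher_logits = teacher(input_ids).logits
# student_logits = student(input_ids).logits
# loss = distillation_loss(student_logits, teacher_logits, labels)
# loss.backward()
```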
But yeah, I mean, there's a lot to go. I mean, as you saw with the Llama 3 stuff last year, the initial Llama 3 launch was exciting. And then we just kind of built on that over the year. 3.1 was when we released the 405 billion model. 3.2 is when we got all the multimodal stuff in. So we basically have a roadmap like that for this year too. I just want to hear more about it. There's this impression that the gap between the best closed-source and the open-source models has increased over the last year. I know the full family of Llama 4 models is not out yet, but Llama 4 Maverick is around 35th on Chatbot Arena, and on major benchmarks it seems like o4-mini or Gemini 2.5 Flash are beating Maverick, which is in the same class. What do you make of that impression?
Yeah, well, okay, there's a few things. I actually think that this has been a very good year for open source overall. Right, if you go back to last year, what we were doing with Llama was like the only real super innovative open-source model. Now you have a bunch of them in the field. And I think in general, the prediction that this would be the year where open source generally overtakes closed source as the most used models out there, I think, is generally on track to be true. I think the thing that's been sort of an interesting surprise, positive in some ways, negative in others, but I think overall good, is that it's not just Llama. There are a lot of good ones out there. So I think that that's quite good.
Then there's the reasoning phenomenon, which you basically are alluding to with talking about o3 and o4 and some of the other models. I do think that there is this specialization that's happening where, if you want a model that is sort of the best at math problems or coding or different things like that, these reasoning models, with the ability to just consume more test-time or inference-time compute in order to provide more intelligence, are a really compelling paradigm. And we're going to do that too: we're building a Llama 4 reasoning model, and that'll come out at some point.
For a lot of the things that we care about, latency and good intelligence per cost are actually much more important product attributes. If you're primarily designing for a consumer product, people don't necessarily want to wait half a minute for it to think through the answer. If you can provide an answer that's generally quite good in half a second, then that's great and that's a good trade-off. I think that both of these are going to end up being important directions. I am optimistic about integrating the reasoning models with kind of the core language models over time. I think that's sort of the direction that Google has gone in with some of the more recent Gemini models and I think that's really promising.
But I think that there's just going to be a bunch of different stuff that goes on. You also mentioned the whole Chatbot Arena thing, which I think is interesting, and it goes to this challenge around how do you do the benchmarking? Basically, how do you know what models are good for which things? One of the things that we've generally tried to do over the last year is anchor more of our models in our Meta AI product North Star use cases, because the issue with both open-source benchmarks and any given thing like the LM Arena stuff is they're often skewed towards a very specific set of use cases, which are often not actually what any normal person does in your product.
The portfolio of things that they're trying to measure is often weighted differently from what people care about in any given product. Because of that, we've found that trying to optimize too much for that stuff has often led us astray and actually not led towards the highest quality products, the most usage, and the best feedback within Meta AI as people use our stuff. We're trying to anchor our North Star in basically the product value that people report to us, what they say that they want, and what their revealed preferences are, using the experiences that we have.
Sometimes these things don't quite line up, and I think a lot of them are quite easily gameable. I think on the arena you'll see stuff like Sonnet 3.7; it's a great model and it's not near the top. And it was relatively easy for our team to tune a version of Llama 4 Maverick that basically was way at the top, whereas the one that we released, the pure model, actually has no tuning for that at all, so it's further down. I think you just need to be careful with some of the benchmarks, and we're going to index primarily on the products.
Do you feel like there is some benchmark which captures what you see as the North Star of value to the user, which can be sort of objectively measured between the different models, and you're like, I need Llama 4 to come out on top on this? Well, I mean, our benchmark is basically user value in Meta AI. Right, so it's not to compare against other models. Well, we might be able to, because we might be able to run other models in that and be able to tell, and I think that's one of the advantages of open source is basically you have a good community of folks who can poke holes at it: okay, where is your model not good and where is it good.
But I think the reality at this point is that all these models are optimized for slightly different mixes of things. I mean everyone is trying to I think go towards the same, you know I think all the leading labs are trying to create general intelligence right and super intelligence whatever you call it right that basically AI that can lead towards a world of abundance where like everyone has these superhuman tools to create whatever they want and that leads to just dramatically empowering people and creating all these economic benefits. I think that's sort of however you define that I think that that's kind of what a lot of the labs are going for.
But there's no doubt that different folks have sort of optimized towards different things. I think the Anthropic folks have really focused on kind of coding and agents around that. You know, the OpenAI folks I think have gone a little more towards reasoning recently. And I think that there is a space which, if I had to guess, I think will end up probably being the most used one: something that is quick, is very natural to interact with, is very natively multimodal, and fits into the ways that you want to interact with it throughout your day.
And I think you got a chance to play around with the new Meta AI app that we're releasing, and you know, one of the fun things that we put in there is the demo for the full duplex voice. And it's early, right? I mean, there's a reason why we haven't made that the default voice model in the app. But there's something about how naturally conversational it is that I think is just really fun and compelling, and I think being able to mix that in with the right personalization is going to lead towards a product experience where, you know, I would basically just guess that you go forward a few years and we're just going to be talking to AI throughout the day about different things that we're wondering. You'll have your phone, you'll talk to it on your phone, you'll talk to it while you're browsing your feed apps, it'll give you context about different stuff, it'll be able to answer questions, it'll help you as you're interacting with people in messaging apps. Eventually, I think we'll walk through our daily lives and we'll either have glasses or other kinds of AI devices and just be able to kind of seamlessly interact with it all day long.
So I think that that's kind of the North Star, and whatever the benchmarks are that lead towards people feeling like the quality is what they want to interact with, that I think is actually the thing that is ultimately going to matter the most to us. I got a chance to play around with both Orion and also the Meta AI app, and the voice mode was super smooth; that was quite impressive. On the point of what the different labs are optimizing for, to steelman their view, I think a lot of them think that once you fully automate software engineering and AI research, then you can kick off an intelligence explosion where you have millions of copies of these software engineers replicating the research that happened between Llama 1 and Llama 4, that scale of improvement again, in a matter of weeks or months rather than years. And so it really matters to close the loop on the software engineer, and then you can be the first to ASI.
What do you make of that? Well, I personally think that's pretty compelling, and that's why we have a big coding effort too. We're working on a number of coding agents inside Meta because we're not really an enterprise software company. We're primarily building it for ourselves. So again, we go kind of like for the specific goal. We're not trying to build a general developer tool. We are trying to build a coding agent and an AI research agent that basically advance Llama research specifically, and it's like just fully plugged into our tool chain and all this.
I think that that's important and I think is going to end up being an important part of how this stuff gets done. I would guess that like sometime in the next 12 to 18 months we'll reach the point where most of the code that's going towards these efforts is written by AI and I don't mean like autocomplete. Today you have good autocomplete. You start writing something and it can complete the section of code. I'm talking more. You give it a goal. It can run tests. It can improve things. It can find issues. It writes higher quality code than the average very good person on the team already. I think that's going to be a really important part of this for sure.
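The loop he's gesturing at, where you give the agent a goal, let it propose code, run the tests, and feed failures back until they pass, can be sketched in a few lines. This is a simplified illustration of that pattern, not Meta's internal agent; `generate_patch` and `apply_patch` are hypothetical stand-ins for the LLM call and the repository write.

```python
import subprocess

def run_tests() -> tuple[bool, str]:
    """Run the project's test suite and return (passed, combined output)."""
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def coding_agent(goal: str, generate_patch, apply_patch, max_iters: int = 10):
    """Goal-driven loop: propose a patch, run tests, feed failures back."""
    feedback = ""
    for _ in range(max_iters):
        patch = generate_patch(goal, feedback)   # LLM proposes a code change
        apply_patch(patch)                       # write it into the repo
        passed, output = run_tests()
        if passed:
            return patch                         # tests are green: done
        feedback = output                        # otherwise iterate on the failures
    raise RuntimeError("agent did not converge within max_iters")
```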
I don't know if that's the whole game. I think that's going to be a big industry and I think that's going to be an important part of how AI gets developed. I guess one way to think about this is this is a massive space. I don't think that there's just going to be one company with one optimization function that serves everyone as best as possible. I think that there are a bunch of different labs that are going to be doing leading work towards different domains. Some are going to be more kind of enterprise focused or coding focused. Some are going to be more productivity focused. Some are going to be more social or entertainment focused.
Within the assistant space I think there are going to be some that are much more kind of informational or productivity. Some are going to be more companion focused. There's going to be a lot of the stuff that's just like fun and entertaining and like shows up in your feed. I think that there's just a huge amount of space. Part of what's fun about this is going towards this AGI future. There are a bunch of common threads for what needs to get invented, but there are a lot of things at the end of the day that need to get created. I think you'll start to see a little more specialization between the groups if I had to guess.
It's really interesting to me that you basically agree with the premise that there will be an intelligence explosion and something like superintelligence at the end. But if that's the case, and tell me if I'm misunderstanding you, why even bother with personal assistants and whatever? Why not just get to superhuman intelligence first and then deal with everything that is there? Well, I think that that's just one aspect of the flywheel. Part of what I generally disagree with on the fast takeoff thing is it takes time to build out physical infrastructure. If you want to build a gigawatt cluster of compute, that just is going to take some time.
It takes Nvidia a bunch of time to stabilize their new generation of systems, and then you need to figure out the networking around it, and then you need to build the building, you need to get permitting, you need to get the energy, whether that's gas turbines or green energy. There's a whole supply chain of that stuff. We talked about this a bunch the last time that I was on the podcast with you. Some of these are just physical-world, human-time things, where as you start getting more intelligence in one part of the stack, you'll basically just run into a different set of bottlenecks. That's sort of the way that engineering always works. It's like you solve one bottleneck, you get another bottleneck.
Another bottleneck in the system, or another ingredient that's going to make this work well, is basically people getting used to, learning, and having a feedback loop with using the system. These systems don't tend to be the type of thing where something just shows up fully formed and then people magically fully know how to use it, and that's the end. I think that there is this co-evolution that happens where people are learning how to best use these AI assistants on one side. On the other side, the AI assistants are learning what those people care about, and the developers of those AI assistants are able to make the AI assistants better.
Then you're also building up this base of context, and you wake up and you're like a year or two into it. The AI assistant can reference things that you talked about a couple of years ago, and that's pretty cool. You couldn't do that if you just launched the perfect thing on day one. There's no way that it could reference what you talked about two years ago if it didn't exist two years ago. My view is there's this huge intelligence growth, there's a very rapid curve on the uptake of people interacting with the AI assistants, and the learning feedback and data flywheel around that.
Then there is also the build out of the supply chains and infrastructure and regulatory frameworks to enable the scaling of a lot of the physical infrastructure. I think at some level all of those are going to be necessary, and not just the coding piece. One specific example of this that I think is interesting: even if you go back a few years ago, we had a project, I think it was on our ads team, to automate ranking experiments. That's a pretty constrained environment. It's not like writing open-ended code. It's basically: look at the whole history of the company, every experiment that any engineer has ever done in the ad system, look at what worked, what didn't, what the results of those were, and basically formulate new hypotheses for different tests that we should run that could improve the performance of the ad system.
We basically found we were bottlenecked on compute to run tests based on the number of hypotheses. It turns out even with just the humans that we have right now on the ads team, we already have more good ideas to test than you actually have either compute or cohorts of people to test them with. Even if you have like three and a half billion people using your products, each test still needs to be statistically significant. It needs to have some number of, whatever it is, hundreds of thousands or millions of people. There's kind of only so much throughput that you can get on testing through that. We're already at the point, even with just the people we have, that we already can't really test everything that we want. Now just being able to test more things is not necessarily going to be additive to that.
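The throughput ceiling he's describing follows from basic power analysis: detecting a small lift on a binary metric takes a large per-arm sample, so even a few billion users only support a limited number of disjoint, statistically significant experiments at once. A rough back-of-the-envelope sketch using the standard two-proportion sample-size approximation; the baseline rate and lift below are made-up numbers, just to show the order of magnitude.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(p_base, rel_lift, alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect a relative lift on a
    baseline conversion rate with a two-sided two-proportion z-test."""
    p_new = p_base * (1 + rel_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    variance = p_base * (1 - p_base) + p_new * (1 - p_new)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_new - p_base) ** 2)

# Detecting a 1% relative lift on a 2% baseline rate:
n = sample_size_per_arm(0.02, 0.01)
print(n)  # roughly 8 million users per arm
# With ~3.5B users and ~16M per two-arm test, only a couple hundred fully
# disjoint tests of that sensitivity fit at once, far fewer slots than the
# number of good hypotheses the team already has.
```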
We need to get to the point where the average quality of the hypotheses that the AI is generating is better than all the things above the line that we're actually able to test that the best humans on the team have been able to do before it will even be marginally useful for it. I think that there's like, we'll get there. I think pretty quickly. It's not like, okay, cool, the thing can write code. All of a sudden everything is just improving massively. There are these real world constraints that basically it needs to first, it needs to be able to do a reasonable job. Then it needs to be able to have the compute and the people to test. Then over time as the quality creeps up, I don't know, are we here in like five or ten years and it's like no set of people can generate a hypothesis as good as the AI system? I don't know, maybe.
Then I think in that world obviously that's going to be how all the value is created. That's not the first step. Publicly available data is running out. The major AI labs like Meta, Google DeepMind, and OpenAI all partner with Scale to push the boundaries of what's possible. Through Scale's Data Foundry, major labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities. Scale's research team, SEAL, is creating the foundations for integrating advanced AI into society through practical AI safety frameworks and public leaderboards around safety and alignment.
Their latest leaderboards include Humanity's Last Exam, EnigmaEval, MultiChallenge, and VISTA, which test a range of capabilities from expert-level reasoning to multimodal puzzle solving to performance on multi-turn conversations. Scale also just released Scale Evaluation, which helps diagnose model limitations; leading frontier model developers rely on Scale Evaluation to improve the reasoning capabilities of their best models. If you're an AI researcher or engineer and you want to learn more about how Scale's Data Foundry and research lab can help you go beyond the current frontier of capabilities, go to scale.com/dwarkesh.
If you buy this view that this is where intelligence is headed, the reason to be bullish on Meta is obviously that you have all this distribution, which you can also use to learn more things that can be useful for training. You mentioned Meta AI as now at almost a billion active users. Not the app, not the app. The app is a standalone thing that we're just launching now. I think it's fun for people who want to use it. It's a cool experience. We can talk about that. We're kind of experimenting with some new ideas in there that I think are novel and worth talking through.
But I'm talking mostly about our apps. Meta AI is actually most used in WhatsApp. And WhatsApp is mostly used outside of the US. We just passed like 100 million people in the US, but it's not the primary messaging system in the US; iMessage is. So I think people in the US probably tend to underestimate the Meta AI use somewhat. But it's also part of the reason why the standalone app is going to be so important: the US is, for a lot of reasons, one of the most important countries. And the fact that WhatsApp is the main way that people are using Meta AI, and that's not the main messaging system in the US, means that we need another way to build a first-class experience that's in front of people.
And I guess to finish the question, the bearish case would be that if the future of AI is less about just answering your questions and more about being a virtual coworker, it's not clear how Meta AI inside of WhatsApp gives you the relevant training data to make a fully autonomous programmer or remote worker. So yeah, in that case, does it not matter that much who has more distribution right now with LLMs? Well, again, I just think that there are going to be different things. It's like, if you were sitting at the beginning of the development of the internet and it's like, well, what's going to be the main internet thing? Is it going to be knowledge work or is it going to be like massive consumer apps? It's like, I don't know, you get both, right? You don't have to choose one, right?
Now, the world is big and complicated, and does one company build all that stuff? I think normally the answer is no. But yeah, to your question, people do not code in WhatsApp for the most part, and I don't foresee that people starting to write code in WhatsApp is going to be a major, major use case. Although I do think that people are going to ask AI to do a lot of things that result in the AI coding without them necessarily knowing it. So that's a separate thing. But we do have a lot of people who are writing code at Meta and they use Meta AI. We have this internal thing that we call MetaMate, and basically a number of different coding and AI research agents that we're building around that. And that has its own feedback loop, and I think it can get good for accelerating those efforts.
Again, I just think that there are going to be a bunch of things. I think AI is almost certainly going to unlock this massive revolution in knowledge work and code. I also think it's going to be kind of the next generation of search and how people get information and do more complex information tasks. I also think it's going to be fun. I think people are going to use it to be entertained. And a lot of the internet is memes and humor. We have this amazing technology at our fingertips and it is amazing and funny when you think about it, how much of human energy just goes towards entertaining ourselves and pushing culture forward and finding humorous ways to explain cultural phenomenon that we observe.
I think that that's almost certainly going to be the case in the future. If you look at the evolution of things like Instagram and Facebook, if you go back 10, 15, 20 years ago, it was like text, then we all got phones with cameras, and most of the content became photos. Then the mobile networks got good enough that if you wanted to watch a video on your phone, it wasn't just buffering. So that got good. So over the last 10 years, most of the content has moved basically towards video; at this point, most of the time spent in Facebook and Instagram is video. But like, I don't know, do you think in five years, we're just going to be sitting in our feeds and consuming media that's video?
It's like, no, it's going to be interactive. It's like you'll be scrolling through your feed and there will be content that is basically, I don't know, maybe it looks like a Reel to start, but then you talk to it or you interact with it and it talks back, or it changes what it's doing, or you can jump into it like a game and interact with it. And that's all going to be AI. So I guess my point is there's just all these different things, and I guess we're ambitious, so we're working on a bunch of them. But I don't think any one company is going to do all of it.
Okay, so on this point of AI-generated content or AI interactions, already people have meaningful relationships with AI therapists, AI friends, maybe more. And this is just going to get more intense as these AIs become more unique and more personable, more intelligent, more spontaneous and funny and so forth. People are going to have relationships with these AIs; how do we make sure that these are healthy relationships? Well, I think there are a lot of questions that you only really can answer as you start seeing the behaviors. So probably the most important up-front thing is just to ask that question and care about it at each step along the way.
But I think also being too prescriptive up front and saying we think these things are not good often cuts off value. Because I don't know, people use stuff that's valuable for them. One of my core guiding principles in designing products is that people are smart, right? They know what is valuable in their lives. Every once in a while, you know, something bad happens in a product and you want to make sure that you design your products well to minimize that. But if you think that something someone is doing is bad and they think it's really valuable, most of the time in my experience, they're right and you're wrong, and you just haven't come up with the framework yet for understanding why the thing that they're doing is valuable and helpful in their life.
Yeah, so that's kind of the main way that I think about it. I do think that people are going to use AI for a lot of these social tasks. Already one of the main things that we see people using that AI for is kind of talking through difficult conversations that they need to have with people in their life. It's like, okay, I'm having this issue with my girlfriend or whatever, like help me have this conversation or like I need to have this hard conversation with my boss at work. Like how do I have that conversation? That's pretty helpful.
And then I think as the personalization loop kicks in and the AI just starts to get to know you better and better, I think that will just be really compelling. One thing just from working on social media for a long time is there's this stat that I always think is crazy. The average American, I think, has fewer than three friends, three people they'd consider friends. And the average person has demand for meaningfully more. I think it's like 15 friends or something, right? I guess there's probably some point where you're like, all right, I'm just too busy. I can't deal with more people.
But the average person wants more connection than they have. So there's a lot of questions that people ask, like, okay, is this going to replace in-person connections or real-life connections? And my default is that the answer to that is probably no. I think that there are all these things that are better about physical connections when you can have them. But the reality is that people just don't have as much connection as they'd like, and they feel more alone a lot of the time than they would like.
So a lot of these things, today there might be a little bit of a stigma around them. I would guess that over time we will find the vocabulary as a society to be able to articulate why that is valuable, why the people who are doing these things are rational for doing it, and how it is adding value to their lives. But also I think that the field is very early. I think there are a handful of companies doing virtual therapists, and there's virtual girlfriend type stuff. But it's very early. The embodiment in these things is pretty weak.
A lot of them, you open it up and it's just like an image of the therapist or the person you're talking to or whatever. I mean, sometimes there's some very rough animation, but it's not like an embodiment. You've seen the stuff that we're working on in Reality Labs where you have the codec avatars and it feels like it's a real person. I think that's kind of where it's going. You'll be able to basically have an always-on video chat, and the AI will be able to do the gestures too, because gestures are important. More than half of communication when you're actually having a conversation is not the words that you speak. It's all the nonverbal stuff.
Yeah. I did get a chance to take out Orion the other day and I thought it was super impressive. And I'm mostly optimistic about the technology just because generally, as you mentioned, I'm libertarian about it: if people are doing something, they probably think it's good for them. Although I actually don't know if it's the case that if somebody is using TikTok they would say that they're happy with how much time they're spending on TikTok. So I'm mostly optimistic about it. Also in the sense that we're going to be living in this future world of AGI.
In order to keep up with that, humans need to be upgrading our capabilities as well with tools like this. And just generally seeing more beauty in the world, if you can see Studio Ghibli everywhere or something. But I was worried that one of the flagship use cases that your team showed me was: I'm sitting at the breakfast table and on the periphery of my vision is just a bunch of Reels scrolling by, and maybe in the future my AGI girlfriend is on the other side of the screen or something.
And so I am worried that we're just removing all the friction between getting totally reward hacked by our technology. How do we make sure this is not what ends up happening in five years? Again, I think people have a good sense of what they want. That experience you saw was a demo just to show multitasking and holograms. I agree that I don't think that the future is you having stuff that's trying to compete for your attention in the corner of your vision all the time. I don't think people would like that too much.
So one of the things we're really mindful of in designing these glasses is that probably the number one thing glasses need to do is get out of the way and be good glasses. And as an aside, I think that's part of the reason why the Ray-Ban Meta product has done so well: it's great for listening to music and taking phone calls and taking photos and videos, and the AI is there when you want it.
But when you don't, it's like a great, good looking pair of glasses that people like and it kind of gets out of the way well. I would guess that that's going to be a very important design principle for the augmented reality future. Right, the main thing that I see here is, I think it's kind of crazy that for how important the digital world is in all of our lives, the only way we can access it is through these like physical digital screens.
It's like you have a phone, you have your computer, you can put up a big TV; it's like this huge physical thing. It just seems like we're at the point with technology where the physical and the digital worlds should really be fully blended. And that's what the holographic overlays allow you to do. But I agree, I think a big part of the design principles around that are going to be, okay, you'll be interacting with people and you'll be able to bring digital artifacts into those interactions and be able to do cool things very seamlessly, right? It's like if I want to show you something here, like here's a screen, okay, here it is, I can show you, you can interact with it, it can be 3D, we can kind of play with it. You want to play a card game or whatever, it's like, all right, here's a deck of cards, we can play with it. It's like the two of us are here physically, and you have a third friend who's just hologramming in, right? And they can kind of participate too.
But I think that in that world, you know, just like you don't want your physical space to be cluttered, because that kind of wears on you psychologically, I don't think people are going to want the digital kind of physical space to feel that way either. So I don't know, that's more of an aesthetic and one of these norms that I think will have to get worked out. But I think we'll figure that out. Going back to the AI conversation, you're mentioning how big of a bottleneck the physical infrastructure can be. And related to other open-source models like DeepSeek and so forth: DeepSeek right now has less compute than a lab like Meta, and you could argue that it's competitive with a lot of models. If China is better at, you know, physical infrastructure, industrial scale-ups, getting more power and more data centers online, how worried are you that they might beat us here?
I mean, I think it's like a real competition. Yeah. I'm seeing the industrial policies really play out where, yeah, I think China's bringing online more power. And because of that, I think that the US really needs to focus on streamlining the ability to build data centers and build and produce energy, or I think we will be at a significant disadvantage. At the same time, I think some of the export controls on things like chips, you can see how they're clearly working in a way, because there was all the conversation with DeepSeek about how they did all these very impressive low-level optimizations. And the reality is they did, and that is impressive. But then you ask, why did they have to do that when none of the American labs did it? And it's, well, because they're using partially nerfed chips that are the only thing that Nvidia is allowed to sell in China because of the export controls.
So DeepSeek basically had to go spend a bunch of their calories and time doing low-level infrastructure optimizations that the American labs didn't have to do. Now, they produced a good result on text, right? I mean, DeepSeek is text only. So the infrastructure is impressive. The text result is impressive. But every new major model that comes out now is multimodal, right? It's image, it's voice, and theirs isn't. And now the question is, why is that the case? I don't think it's because they're not capable of doing it. I think that they basically had to spend their calories on doing these infrastructure optimizations to overcome the fact that there were these export controls.
When you compare Llama 4 with DeepSeek, I mean, our reasoning model isn't out yet. So I think that the R1 comparison isn't clear yet. But we're basically in effectively the same ballpark on all the text stuff as what DeepSeek is doing, but with a smaller model. So it's much more efficient; the cost per intelligence is lower with what we're doing for Llama on text. And then all the multimodal stuff we're effectively leading at; it just doesn't even exist in their stuff. So I think that the Llama 4 models, when you compare them to what they're doing, are good. And I think generally people are going to prefer to use the Llama 4 models.
But I think that there is this interesting contour where it's clearly a good team that's doing stuff over there. Then I think you're right to ask about the accessibility of power, the accessibility of compute and chips and things like that. Because I think the kind of work that you're seeing the different labs do and play out is somewhat downstream of that. Premium products attract a ton of fake account signups, bot traffic, and free-tier abuse. And AI is so good now that it's basically useless to just have a captcha of six squiggly numbers on your signup page.
Take Cursor. People were going to insane lengths to take advantage of Cursor's free credits, creating and deleting thousands of accounts, sharing logins, even coordinating through Reddit. And all this was costing Cursor a ton of money in terms of inference compute and LLM API calls. Then they plugged in WorkOS Radar. Radar distinguishes humans from bots. It looks at over 80 different signals, from your IP address to your browser to even the fonts installed on your computer, to ensure that only real users can get through. Radar currently runs millions of checks per week.
And when you plug Radar into your own product, you immediately benefit from the millions of training examples that Radar has already seen through other top companies. Previously, building this level of advanced protection in-house was only possible for huge companies. But now with WorkOS Radar, advanced security is just an API call away. Learn more at workos.com/radar. All right, back to Zuck. So Sam Altman recently tweeted that OpenAI is going to release an open-source, state-of-the-art reasoning model. I think part of the tweet was that we will not do anything silly, like say that you can only use it if you have less than 700 million users.
DeepSeek has the MIT license, whereas with Llama, I think a couple of the conditions in the Llama license require you to say "Built with Llama" on applications using it, and any model that you train using Llama has to begin with the word "Llama". What do you think about the license? Should it be less onerous for developers? I mean, look, we've basically pioneered the open-source Llama stuff. I mean, I don't consider the license to be onerous. I kind of think that when we were starting to push on open source, it was this big debate in the industry of like, is this even a reasonable thing to do?
DeepSeek 使用 MIT 许可证,而 Llama 的许可证中有一些条件,要求你在使用 Llama 开发的应用程序中注明"使用 Llama 构建",并且任何使用 Llama 训练的模型名称必须以"Llama"开头。你怎么看待这样的许可证条款?是否应该对开发者更宽松一些?我觉得我们基本上引领了开源大模型的发展。我并不认为这个许可证苛刻。我想当我们开始推动开源时,行业内就有这样一个大的争论——这到底是不是一个合理的做法。
Can you do something that is safe and trustworthy with open source? Like, will open source ever be able to be competitive enough that anyone will even care? And basically, when we were answering those questions — a lot of that hard work came from a lot of the teams at Meta, although there are other folks in the industry too — the Llama models were really the ones that I think broke open this whole open-source AI thing in a huge way. We were very focused on, okay, if we're going to put all this energy into it, then at a minimum, if you're going to have these large cloud companies like Microsoft and Amazon and Google turn around and sell our model, we should at least be able to have a conversation with them before they do that about, basically, okay, what kind of business arrangement should we have?
你能用开源做一些安全和值得信赖的事情吗?开源能否变得足够有竞争力,以至于人们会关心它呢?基本上,当我们在回答这些问题时,我觉得Meta的很多团队付出了大量艰苦的工作,当然,业界也有其他人参与,但真正让整个开源AI领域取得突破的是Llama模型。我们非常专注于这个方向:如果我们要投入大量精力,那么至少当像微软、亚马逊和谷歌这样的云计算巨头要销售我们的模型时,我们应该能够在此之前与他们进行对话,讨论一下应该达成怎样的商业安排。
But our goal with the license isn't — we're generally not trying to stop people from using the model. We just think, okay, if you're one of those companies, or if you're Apple, just come talk to us about what you want to do and let's find a productive way to do it together. So I think that's generally been fine. Now, if the whole open-source part of the industry evolves in a direction where there are a lot of other great options, and if the license ends up being a reason why people don't want to use Llama, then I don't know, we'll have to reevaluate the strategy and what makes sense to do at that point.
我们的许可证目标并不是阻止人们使用模型。我们通常只是希望,如果你是那些公司之一,或者是像苹果这样的大公司,来找我们谈谈你想做什么,我们一起找到一个积极的合作方式。所以我认为现在这个方法总体上还不错。但如果整个开源领域的发展趋向于出现很多其他更好的选择,并且许可证成为人们不愿使用LLMA的一个原因,那么我们可能就需要重新评估策略,看看在那种情况下,什么是合适的做法。
But I just don't think we're there. In practice, that's not a thing we've seen — companies coming to us and saying, we don't want to use this because your license says that if you reach 700 million people, you have to come talk to us. So, at least so far, it's a little bit more of something we've heard from open-source purists: is this as clean of an open-source model as you'd like it to be? And look, I think that debate has existed since the beginning of open source, with all the GPL license stuff versus other things. It's like, okay, does it need to be the case that anything that touches open source has to be open source, or can people just take it and use it in different ways?
我只是觉得我们尚未达到那种程度。实际上,我们还没有看到公司因为许可证上写着"如果你的用户达到7亿,你就得跟我们谈谈"而不愿使用我们的产品。因此,至少到目前为止,这更多是来自开源纯粹主义者的一种声音,比如,他们会问我们的开源模型是否达到他们期望的那种纯粹。而且,我认为这种争论自开源开始以来就存在了,比如关于GPL许可证和其他事情的讨论:是否任何接触开源的东西都必须是开源的,或者人们能否就此拿来并以不同方式使用?
And I'm sure there will continue being debates around this. But I don't know — if you're spending many, many billions of dollars training these models, then asking the other companies that are also huge and similar in size, and can easily afford to have a relationship with us, to talk to us before they use it seems like a pretty reasonable thing. If it turns out that, you know, there are a bunch of good open-source models — so that part of your mission is fulfilled — and maybe other models are better at coding.
我相信关于这个问题的争论会一直继续下去,但如果你花了成百上千亿美金来训练这些模型,那么我认为要求那些同样规模大、财力雄厚的公司在使用前先与我们沟通一下,似乎是个蛮合理的事情。或许最终会发现还有许多优秀的开源模型存在,这也实现了你们的一部分使命,也许其他模型在编程方面更出色。
Is there a world where you just say, look, the open-source ecosystem is healthy, there's plenty of competition — we're happy to just use some other model, whether it's for internal software engineering at Meta or deploying to our apps? We don't necessarily need to build with Llama. Well, again, I mean, we do a lot of things. So it's possible. I guess, let's take a step back. The reason why we're building our own big models is because we want to be able to build exactly what we want, right?
有没有一种可能,你会说,看看,现在已经有一个健康的生态系统了,竞争也很充分,我们可以使用其他模型,不管是在 Meta 的内部软件工程中,还是在我们应用的部署中。我们不一定非得用 Lama 来构建。当然,我们做很多事情,可能性是有的。我们还是要退一步来思考,我们之所以开发自己的大型模型,是因为我们想要能够精确地构建出我们想要的东西。
And none of the other models in the world are exactly what we want. If they're open source, then you can take them and you can fine-tune them in different ways. But you still have to deal with the model architectures, and they make different size trade-offs that affect the latency and inference cost of the models. And at the scale that we operate at, that stuff really matters.
世界上的其他模型都不是完全符合我们需求的。如果它们是开源的,那么你可以使用它们,并通过不同的方法进行优化。但仍然需要处理模型的架构,并且它们做出的不同规模的权衡会影响模型的延迟和推理成本。不过,对于我们运作的规模来说,这些细节真的很重要。
Like, we made the Llama Scout and Maverick models certain sizes for a specific reason: because they fit on a host, and we wanted certain latency, especially for the voice models that we're working on, which we want to basically pervade everything we're doing, from the glasses to all of our apps to the Meta AI app and all this stuff.
我们将Lama Scout和Maverick模型设计成特定大小是有原因的,因为它们适合在主设备上运行,并且我们希望达到特定的延迟表现。特别是对于我们正在开发的语音模型,我们希望这些模型能够广泛应用于我们所做的一切,包括智能眼镜、所有应用程序到Meta AI应用等等。
So I think that there's a level of control of your own destiny that you only get when you build the stuff yourself. That said, there are a lot of things that like AI is going to be used in every single thing that every company does. When we build a big model, we also need to choose which things, which use cases internally we're going to optimize for.
所以我认为,当你亲自构建某样东西时,你就能在某种程度上掌握自己的命运。话虽如此,AI 将用于每个公司所做的每一件事情。当我们构建一个大型模型时,我们也需要选择将在哪些事情上,哪些内部用例上进行优化。
So does that mean that for certain things we might think, okay, maybe Claude is better for building this specific development tool that this team is using — and use that? Fine. Great. We don't want to fight with one hand tied behind our back. We're doing a lot of different stuff.
这是否意味着,对于某些事情,我们可能会觉得"也许Claude更适合打造这个团队正在使用的特定开发工具",那就用它?可以,很好。我们不想绑着一只手去竞争。我们正在做很多不同的事情。
You also asked whether it might matter less because other people are doing open source now. I don't know. On this, I'm a little more worried, because I think you have to ask, for anyone who shows up and is doing open source now that we have done it: would they still be doing open source if we weren't doing it?
你还问到会不会因为其他人都在做开源,所以我们也应该重视这一点?我不太确定。对此,我有些担忧,因为我觉得我们需要问那些现在才开始做开源的人,尤其是在我们完成这些事情之后。如果我们没有在做开源的话,他们还会继续做开源吗?
And I think there are a handful of folks who see the trend that more and more development is going towards open source and think, crap, we kind of need to be on this train or else we're going to lose — we have some closed-model API, and increasingly that's not what a lot of developers want. So I think you're seeing a bunch of the other players start to do some work in open source.
我觉得有一些人已经看到了一个趋势,就是越来越多的开发转向开源。而且,如果我们不赶上这趟列车,就有可能落后。我们有一些封闭的模型API,但越来越多的开发者不想要这种东西。所以,我认为你可以看到其他一些公司也开始在开源方面下功夫。
It's just unclear if it's dabbling or fundamental for them in the way that it has been for us. And, you know, a good example is what's going on with Android, right? Android started off as the open-source thing, and there's not really any open-source alternative. I think over time, Android has just been getting more and more closed.
这不太清楚,他们对待这件事情是仅仅涉猎一下,还是像我们一样视为根本。你知道的,一个很好的例子就是安卓系统。安卓最初是一个开源项目,而现在并没有真正的开源替代品。我认为随着时间的推移,安卓变得越来越封闭。
So I think if you're us, you kind of need to worry that if we stopped pushing the industry in this direction, all these other people may only really be doing it because they're trying to compete with us in the direction that we're pushing things. And they've already shown their revealed preference for what they would build if open source didn't exist — and it wasn't open source, right?
我觉得,如果你是我们中的一员,你可能会担心,如果我们不再推动这个行业往这个方向发展,那些其他人可能只是因为想要在我们推动的方向上与我们竞争才这样做。而实际上,如果开源不存在,他们可能已经表现出他们真正想要建立的东西,而那不是开源项目,对吧?
So I just think we need to be careful about relying on that continued behavior for the future of the technology that we're going to build at the company. I mean, another thing I've heard you mention is that it's important that the standard gets built around American models like Llama.
我只是认为,我们需要谨慎对待对这种持续行为的依赖,因为它关系到我们公司未来要开发的技术。我听你提到过另一件事,那就是围绕像Lama这样的美国模型建立标准,这非常重要。
I guess I wanted to understand your logic there, because it seems like with certain kinds of networks, the Apple App Store just has a big contingency around what's built on it. But it doesn't seem like, if you built some sort of scaffold for DeepSeek, you couldn't have easily just switched it over to Llama 4, especially between generations.
我想了解一下你的逻辑,因为对于某些类型的网络来说,比如苹果应用商店,围绕其构建的东西似乎存在很大的依赖基础。然而,如果你为DeepSeek构建了某种框架,似乎也可以很轻松地切换到Llama 4,尤其是在不同代模型之间。
Like, Llama 3 wasn't MoE; Llama 4 is. So things are changing between generations of models as well. So what's the reason for thinking things will get built out in this contingent way on a specific standard? I'm not sure what you mean by contingent. As in, it's important that people are building for Llama specifically, rather than for models in general, because that will determine what the standard is for the future. Sure. I mean, I think these models encode values and ways of thinking about the world.
就像Llama 3不是MoE,而Llama 4是。因此,不同代模型之间也在发生变化。那么,为什么认为事物会按照某个特定标准以这种依赖的方式构建呢?我不太确定你说的"依赖"是什么意思。意思是,人们专门为Llama开发,而不是笼统地为各种模型开发,这一点很重要,因为这将决定未来的标准是什么。没错,我认为这些模型确实包含了一些价值观和思考世界的方式。
And you know, we had this interesting experience early on where we took an early version of Llama and we translated it — I think it might have been into French or some other language. And the feedback that we got from French people was: this sounds like an American who learned to speak French. It doesn't sound like a French person. It's like, what do you mean? Does it not speak French well? No, it speaks French fine. It's just that the way it thinks about the world seems slightly American.
你知道吗,我们早期有过一个有趣的经历。当时我们把Lama的一个早期版本进行了翻译。我记得是翻译成法语或者其他语言,我觉得应该是法语。我们收到的反馈是,这听起来像一个学会说法语的美国人,而不像一个地道的法国人。我们很疑惑,就问:“是不是说得不好?”结果,他们说:“不,法语说得很好,只是它对世界的思考方式有点像美国人。”
So I think there's like these subtle things that kind of get built into it. Over time as the models get more sophisticated, they should be able to embody different value sets across the world. So maybe that's like a very kind of, you know, not particularly sophisticated example, but I think it sort of illustrates the point. And you know, some of the stuff that we've seen in testing some of the models, especially coming out of China is like they sort of have certain values encoded in them.
所以我认为有一些微妙的东西逐渐融入到了系统中。随着模型变得越来越复杂,它们应该能够体现世界各地的不同价值观。或许这不是一个特别复杂的例子,但我觉得它能说明问题。而且,我们在测试一些模型时,尤其是中国出的那些模型,发现它们似乎在某种程度上编码了特定的价值观。
And it's not just a light fine-tune to get that to feel the way that you want. Now, different kinds of models are different here, right? I think language models — or anything that has a kind of world model embedded into it — have more values baked in. Reasoning, I mean, I guess there are values or ways of thinking about reasoning too. But one of the things that's nice about the reasoning models is they're trained on verifiable problems.
这不仅仅是进行一些浅层的微调就能让它达到你想要的效果。不过,不同类型的模型情况不一样,对吧?我认为语言模型,或者任何内置了某种世界模型的东西,包含更多的价值观。推理嘛,我想推理也有其价值观或思考方式。但推理模型的一个优点是,它们是在可验证的问题上训练的。
So do you need to be worried about cultural bias if your model is doing math? Probably not, right? I think the chance that some reasoning model that was built elsewhere is going to kind of incept you by solving a math problem in some devious way seems low. But there's a whole set of different issues.
那么,如果你的模型在进行数学计算时,你需要担心文化偏见吗?大概不需要吧,对吗?我认为,一种在其他地方构建的推理模型通过一种奇怪的方式来解决数学问题,从而影响你的可能性似乎很低。问题的关键在于其他一系列不同的问题。
Then there's coding, which is the other verifiable domain. There, I think you do need to be worried about waking up one day and finding that a model with some tie to another government can embed all kinds of vulnerabilities in code that the intelligence organizations associated with that government can then go exploit. In some future version, you have a model from some other country that we're using to secure or build out a lot of our systems, and then all of a sudden you wake up and everything is vulnerable in a way that that country knows about but you don't — or it turns on a vulnerability at some point.
接下来是编程,这是另一个可验证的领域。我觉得在这里你确实需要担心的是,有一天醒来发现,一个与其他政府有关联的模型可能会在代码中嵌入各种漏洞,然后与那个政府相关的情报机构就可以利用这些漏洞进行攻击。想象一下,在未来的某个版本中,我们使用来自其他国家的模型来保护或构建我们的大量系统,然后突然之间发现所有东西都存在漏洞,而这些漏洞是那个国家了解但我们不知道的,或者某个时候这些漏洞被激活。
Those are real issues. So what we've basically found is — I mean, I'm very interested in studying this, because I think one of the main things that's interesting about open source is the ability to distill models. You know, for most people, the primary value isn't just taking a model off the shelf and saying, okay, Meta built this version of Llama, I'm going to take it and run it exactly as-is in my application.
这些是真正的问题。所以,我们基本上发现的是,嗯,现在我是说,我对研究这一点非常感兴趣,因为我认为开源最有趣的一点之一就是可以提炼模型。你知道,对于大多数人来说,主要的价值并不仅仅是像从架子上拿下一个模型然后说,好,比方说Meta创建了这个版本的Lama,我就要把它拿过来直接在我的应用程序中运行。
It's like, no — your application isn't doing anything different if you're just running our thing as-is. You're at least going to fine-tune it or try to distill it into a different model. And when we get to stuff like the Behemoth model, the whole value in that is being able to take this very high amount of intelligence and distill it down into a smaller model that you're actually going to run.
如果你只是原样运行我们的模型,你的应用程序就没有任何不同。你至少会对它进行微调,或者尝试把它蒸馏成一个不同的模型。而当我们谈到像Behemoth这样的超大模型时,其全部价值就在于能够把这种非常高的智能蒸馏到一个你真正会运行的更小的模型中。
But this is the beauty of distillation. It's one of the things that I think has really emerged as a very powerful technique in the last year since the last time we sat down. It's worked better than most people would have predicted: you can basically take a model that is much bigger, take probably 90 or 95% of its intelligence, and run it in something that's 10% of the size.
但这就像蒸馏的美妙之处。这一技术在去年以来,尤其是我们上次会面之后,展现出了极大的威力。我认为它比大多数人预期的要有效得多,因为你可以把一个很大的模型的智慧大约90%或95%提取出来,然后在只是它大小10%的情况下运行。
Now, do you get 100% of the intelligence? No, but 95% of the intelligence at 10% of the cost is pretty good for a lot of things. The other thing that's interesting is that now, with this more varied open-source community — it's not just Llama, you have other models — you have the ability to distill from multiple sources.
现在,你能获得100%的智能吗?不能,但用10%的成本得到大约95%的智能,对于很多事情来说已经相当不错了。另一个有趣的方面是,现在有了一个更丰富的开源社区——不仅仅是Llama,还有其他模型——你可以从多个来源进行蒸馏。
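For readers who want a concrete picture of what "distilling a bigger model into a smaller one" means mechanically, here is a minimal sketch of the standard soft-target (logit) distillation objective. This is a generic textbook formulation, not Meta's actual training stack; the temperature, mixing weight, and toy tensors are illustrative.

```python
# Minimal sketch of standard logit distillation (soft targets), not Meta's training code.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-target KL term against the teacher with ordinary cross-entropy on labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                      # rescale so gradients keep a consistent magnitude
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: a 10-class batch where the "teacher" stands in for a larger frozen model.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```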
So now you can basically say, okay, Llama is really good at this — maybe the architecture is really good because it's fundamentally multimodal and fundamentally more inference-friendly and more efficient — but let's say some other model is better at coding. Okay, well, you can distill from both of them and then build something that's better than either of them for your own use case. That's cool. But you do need to solve the security problem of knowing that you can distill it in a way that is safe and secure. And so this is something that we've been researching and have put a lot of time into, and what we've basically come to is: look, anything that's kind of like language is quite fraught, because there are a lot of values embedded in that.
现在你可以说,Lama在这方面表现得非常好。可能是因为它的架构非常优秀,因为它从根本上是多模态的,更适合推理,也更高效。但是,假设说另一个模型在编码方面更好。那么,你可以从这两个模型中提炼优点,来构建一个比它们任何一个都更适合你需求的模型。这很酷。不过,你需要解决安全性的问题,确保提炼过程是安全的。我们在这个问题上投入了大量的研究和时间。我们发现,任何与语言相关的东西都很复杂,因为其中包含了很多价值观。
So unless you don't care about inheriting the values from whatever model you got, you probably don't want to just distill the straight language world model. On reasoning, I think you can get a lot of the way there by limiting it to verifiable domains and running code cleanliness and security filters — whether it's the Llama Guard open source or the Code Shield open source things that we've done — that basically allow you to incorporate different input into your models and make sure that both the input and the output are secure.
所以,除非你不在乎继承你所拿到的那个模型的价值观,否则你可能不会想直接去蒸馏纯语言的世界模型。关于推理,我认为通过将其限制在可验证的领域,并运行代码质量和安全过滤器——比如我们开源的Llama Guard或Code Shield——可以在这方面取得很大进展。这些工具基本上允许你将不同的输入整合到你的模型中,并确保输入和输出都是安全的。
And then just a lot of red teaming to make sure that you have people or experts who are looking at this and asking: after distilling from something, is this model doing anything that isn't what I want? I think with the combination of those techniques, you can probably distill on the reasoning side for verifiable domains quite securely. That's something I'm pretty confident about, and it's something we've done a lot of research around. But I think this is a very big question — how do you do good distillation — because there's just so much value to be unlocked. At the same time, I do think there is some fundamental bias in the different models.
然后进行大量的“红队”测试,以确保你的模型在蒸馏之后没有做出任何你不希望的事情。这样做可以让一些人或专家来检查这项工作。结合这些技术,我认为在可验证的领域中,你可以相对安全地进行推理方面的蒸馏。这是我非常有信心的一点,也是我们进行了大量研究的领域。不过,我认为这确实是一个非常大的问题,就是如何做好蒸馏,因为有很多潜在的价值可以被挖掘。但与此同时,我也认为不同的模型中确实存在一些根本的偏差。
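To make the "verifiable domains plus filters" idea concrete, here is a hypothetical sketch of a gating step that decides whether a teacher-generated coding sample is admitted into a distillation dataset. The filter functions are placeholders standing in for tools like Llama Guard or Code Shield plus your own unit tests; none of these calls are real APIs, and the checks shown are deliberately trivial.

```python
# Hypothetical gating pipeline for distilling coding data from an outside reasoning model.
# passes_safety_filter / passes_code_security_scan are placeholders, not real library calls.
from typing import Callable

def passes_safety_filter(text: str) -> bool:
    # Placeholder: would call a content-safety classifier on the sample.
    return "DROP TABLE" not in text

def passes_code_security_scan(code: str) -> bool:
    # Placeholder: would run static analysis for insecure patterns.
    return "eval(" not in code

def passes_unit_tests(code: str, run_tests: Callable[[str], bool]) -> bool:
    # Verifiable domain: the sample must actually solve the task it claims to solve.
    return run_tests(code)

def keep_for_distillation(sample: dict, run_tests: Callable[[str], bool]) -> bool:
    """Admit a teacher-generated coding sample only if every check passes."""
    code = sample["completion"]
    return (passes_safety_filter(code)
            and passes_code_security_scan(code)
            and passes_unit_tests(code, run_tests))

sample = {"prompt": "reverse a string",
          "completion": "def rev(s):\n    return s[::-1]\n"}
print(keep_for_distillation(sample, run_tests=lambda code: "[::-1]" in code))
```

Red teaming then sits on top of a gate like this: humans probe the distilled model for behavior the automated filters missed.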
Speaking of value to be unlocked, what do you think the right way to monetize AI will be? Because obviously digital ads are quite lucrative, but as a fraction of total GDP they're small in comparison to, like, all the remote work. Even if you just increase that productivity and don't replace the work, that's still worth tens of trillions of dollars. So is it possible that ads might not be it? Yeah, how do you think about this? I mean, like we were talking about before, there are going to be all these different applications, and different applications tend towards different things.
谈到要挖掘的价值,您认为人工智能应该如何正确地盈利呢?显然,数字广告虽然利润丰厚,但与整体GDP相比,它所占的比例很小。即便只是通过提高生产力而不是取代工作,这也能带来数以万亿计的美元价值。那么,是否有可能广告并不是最佳的盈利方式呢?您怎么看呢?就像我们之前谈到的,会有不同的应用场景,而不同的应用倾向于不同的方向。
Ads are great when you want to offer people a free service, right? Because it's free, you need to cover it somehow. Ads solve this problem where a person doesn't need to pay for something and can still get something amazing for free. And also, by the way, with modern ad systems, a lot of the time people think the ads add value to the thing — if you do it well. You need to be good at ranking, and you need to have enough liquidity of advertising inventory.
嗯,当你想为人们提供免费服务时,广告是个不错的选择,对吧?因为它是免费的,你需要以某种方式弥补成本。广告就是这样解决这个问题的:人们无需付费就能获得一些很棒的东西。而且,顺便提一句,借助现代广告系统,很多时候人们认为广告为产品增添了价值。如果你做得好,就是说,你需要擅长广告排序,同时要确保拥有足够的广告库存流通性。
Um, so that way, you know, if you only have five advertisers in the system, no matter how good you are at ranking, you may not be able to show something to someone that they're interested in. But if you have a million advertisers in the system, then you're probably going to be able to find something pretty compelling. If you're good at it, picking out, you know, the different needles in the haystack that that person's going to be interested in.
嗯,这样的话,你懂的,如果系统里只有五个广告商,不管你排名能力多好,你可能都无法展示出用户感兴趣的东西。但如果系统里有一百万个广告商,那么你可能就能找到一些非常吸引人的东西。只要你擅长,从中挑出能够吸引特定用户注意的“针”。
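The "liquidity" point can be made with a toy expected-value calculation: with more candidate advertisers, the best bid-times-relevance match for a given person improves. This is a simple illustration under made-up numbers, not Meta's actual ranking system.

```python
# Toy illustration of ad "liquidity": with more candidates, the best expected-value
# match for a given user improves. Not Meta's actual ranking system.
import random

random.seed(0)

def expected_value(bid: float, p_engage: float) -> float:
    return bid * p_engage   # simple bid x predicted-relevance score

def best_ad_value(num_advertisers: int) -> float:
    candidates = [(random.uniform(0.5, 5.0), random.random())
                  for _ in range(num_advertisers)]
    return max(expected_value(bid, p) for bid, p in candidates)

for n in (5, 1_000, 1_000_000):
    print(n, round(best_ad_value(n), 3))
```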
So I think that definitely has its place. But there are also clearly going to be other business models, including ones where the costs are just higher, so it doesn't even make sense to offer them for free — and by the way, there have always been business models like this. There's a reason why social media is free and ad-supported, but if you want to watch Netflix, or ESPN or something, you need to pay for that.
我认为这种模式确实有其存在的意义。但显然,还有其他的商业模式存在,比如,有些产品或服务的成本较高,因此免费提供就不太现实。事实上,这种有收费的商业模式一直存在。社交媒体之所以能免费使用,是因为有广告支持。而像Netflix或ESPN这样的服务则需要用户付费。
Right — because the content that's going into that, they need to produce it, and that's very expensive for them to produce. They probably could not have enough ads in the service to make up for the cost of producing the content. So basically you just need to pay to access it. Then the trade-off is fewer people do it, right? You're talking about hundreds of millions of people using those instead of billions. So there's kind of a value switch there.
对——因为他们需要制作这些内容,而制作这些内容的成本非常高。他们可能无法通过服务中的广告来弥补制作内容的成本。所以基本上你需要付费才能访问这些内容。这样一来,参与的人就会更少,对吧?就像是他们谈论的是有几亿人使用这些服务,而不是几十亿。所以这里在价值上有一个转换。
I think it's similar here. Not everyone is going to want a software engineer, or a thousand software engineering agents, or whatever it is. But if you do, that's something you're probably going to be willing to pay thousands or tens of thousands or hundreds of thousands of dollars for. So I think this just speaks to the diversity of different things that need to get created — there are going to be business models at each point along the spectrum. And at Meta, yeah, for the consumer piece, we definitely want to have a free thing, and I'm sure that will end up being ad-supported. But I also think we're going to want to have a business model that supports people using arbitrary amounts of compute to do even more amazing things than what it would make sense to offer in the free service. And for that, I'm sure we'll end up having a premium service.
嗯,我觉得这里的情况也很相似,你知道,不是所有人都会需要软件工程师或者成千上万的工程代理之类的东西。但是,如果你需要这样的服务,那你可能会愿意花几千、几万甚至几十万美元来获取。因此,我认为这反映了需要创建的事物多样性,即在这个过程中每个点都会有商业模式。至于消费者方面,我们肯定想提供一个免费的选择,我相信这最终会通过广告来支持。但我也认为,我们会希望有一种商业模式来支持用户使用大量计算资源,以实现一些更惊人的事情,这些是免费服务无法提供的。为此,我相信我们最终将会推出一个高级服务。
But I mean, I think our basic values on this are that we want to serve as many people in the world as we can. Lambda is the cloud for AI developers. They have over 50,000 Nvidia GPUs ready to go for startups, enterprises, and hyperscalers. Compute seems like a commodity, though — so why use Lambda over anybody else? Well, unlike other cloud providers, Lambda's only focus is AI. This means their GPU instances and on-demand clusters have all the tools that AI developers need pre-installed: no need to manually install CUDA drivers or manage Kubernetes. And if you only need GPU compute, you can save a ton of money by not paying for the overhead of general-purpose cloud architectures.
但是,我的意思是,我认为我们在这方面的基本价值观是希望服务于世界上尽可能多的人。Lambda 是面向 AI 开发者的云平台。他们拥有超过5万块 Nvidia GPU,随时可供创业公司、企业和超大规模公司使用。计算资源看起来像是大宗商品,那么为什么要选择 Lambda 而不是其他服务商呢?不同于其他云服务提供商,Lambda 只专注于人工智能。这意味着他们的 GPU 实例和按需集群已经预装了所有 AI 开发者需要的工具,不需要手动安装 CUDA 驱动或管理 Kubernetes。如果你只需要 GPU 计算,你可以不必为通用云架构的额外开销付费,从而节省大量资金。
Lambda even has contracts that let enterprises use any type of GPU in their portfolio and easily upgrade to the next generation. For all of you wanting to build with Llama 4, Lambda has a serverless API without rate limits, built with rapid scaling in mind. Users can scale up their inference consumption without ever having to apply for a quota or even speak to a human. Head to lambda.ai/theforecast for a free trial of their inference API, serving the best open-source models like DeepSeek and Llama 4 at the lowest prices in the industry.
Lambda 甚至还提供合同,让企业可以使用其产品组合中的任何类型的 GPU,并轻松升级到下一代产品。对于所有想使用 Llama 4 进行构建的人,Lambda 提供了一个没有速率限制的无服务器 API,专为快速扩展而设计。用户可以扩大推理用量,无需申请配额,也不用和人工客服交流。访问 lambda.ai/theforecast,可以免费试用他们的推理 API,以业内最低的价格提供 DeepSeek 和 Llama 4 等最佳开源模型。
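For orientation, here is a minimal sketch of what calling a serverless inference endpoint typically looks like, assuming an OpenAI-compatible API. The base URL, model name, and environment variable below are assumptions for illustration only — check the provider's documentation for the real values.

```python
# Sketch of calling a serverless inference endpoint, assuming an OpenAI-compatible API.
# The base URL, model name, and env var are assumptions; consult the provider's docs.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.lambda.ai/v1",   # assumed endpoint
    api_key=os.environ["LAMBDA_API_KEY"],  # assumed env var name
)

resp = client.chat.completions.create(
    model="llama-4-maverick",               # assumed model identifier
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```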
All right, back to Zuck. How do you keep track of all of this? You've got all these different projects, some of which we've talked about today — I'm sure there are many I don't even know about. As the CEO overseeing everything, there's a big spectrum between going to the Llama team with "here are the hyperparameters you should use" and just giving a mandate like "go make the AI better." And there are many different projects. How do you think about the way in which you can best deliver your value-add and oversee all these things?
好的,我们回到扎克这边。你是如何管理所有这些不同的项目的呢?有些项目是我们今天提到的,我相信还有很多是我不知道的。作为负责一切的CEO,你需要在参与细节(比如告诉Llama团队应该使用哪些超参数)和给出总体方向(比如要求团队提升AI性能)之间做出平衡。面对这么多不同的项目,你是如何考虑以最佳方式发挥你的价值,并监督所有这些工作的呢?
Well, I mean, a lot of what I spend my time on is trying to get awesome people onto the teams, right? So there's that. And then there's stuff that cuts across teams. It's like, all right, you build Meta AI and you want to get it into WhatsApp or Instagram — okay, then I now need to get those teams to talk together. And then there are a bunch of questions like, okay, do you want the thread for Meta AI in WhatsApp to feel like other WhatsApp threads, or do you want it to feel like other AI chat experiences? There are different idioms for those.
好的,我的意思是,我花很多时间在为团队招募优秀的人才。然后,还有一些需要跨团队协调的工作。比如说,你开发了Meta AI,你希望把它整合到WhatsApp或Instagram中,我就需要让这些团队一起沟通。接着,还有许多问题需要考虑。例如,你希望Meta AI在WhatsApp上的对话界面看起来像其他的WhatsApp对话,还是想让它更像其他的AI聊天体验?两者之间有不同的风格和特性。
And so I think there are all these interesting questions that need to get answered around how this stuff basically fits into all of what we're doing. Then there's a whole other part of what we're doing, which is basically pushing on the infrastructure. If you want to stand up a gigawatt cluster, then first of all, that has a lot of implications for the way that we're doing infrastructure buildouts. It has political implications for how you engage with the different states where you're building that stuff. It has financial implications for the company, in terms of, all right, there's a lot of economic uncertainty in the world.
所以我认为有很多有趣的问题需要回答,比如这些东西如何融入我们正在做的事情中。此外,我们还有另一个正在进行的部分,那就是推进基础设施。如果你想建立一个千兆瓦级的集群,首先,这会对我们的基础设施建设方式产生很大影响。这还会对你如何与建设地点所在的不同州进行互动产生政治影响。在财务方面,由于世界经济存在不确定性,这也会对公司造成影响。
Do we go double down on infrastructure right now? And if so, what are the trade-offs we want to make around the company? Those are the kinds of decisions that are tough for other people to really make. And then I think there's this question around taste and quality: when is something good enough that we want to ship it? And I do feel like, in general, I'm the steward of that for the company, although we have a lot of other people who I think have good taste as well and who are also filters for different things.
我们现在是否要加大基础设施的投入?如果是的话,有什么利弊?这些都是需要在公司内部做出的决策。但这些决策很难由他人来做。然后,还有涉及品味和质量的问题,就是说,何时产品足够好以至于可以发布?总体来说,我觉得我是公司在这方面的负责人,当然,我们还有很多其他有品味的同事,也是不同方面的把关者。
But yeah, I think those are basically the areas. I think AI is interesting because, more than some of the other stuff that we do, it is more research- and model-led than really product-led. You can't just design the product that you want and then try to build the model to fit into it. You really need to design the model first, and the capabilities that you want, and then you get some emergent properties. Then you end up building some different stuff because it turned out this way. And I think at the end of the day, people want to use the best model, right? So that's partially why, when we're talking about building the most personal AI — the best voice, the best personalization, and also a very smart experience with very low latency — those are the things we basically need to design the whole system to deliver. Which is why we're working on full-duplex voice, and why we're working on personalization: to have good memory extraction from your interactions with the AI, but also to be able to plug into all the other Meta systems.
但是,是的,我认为这些基本上就是那些领域。不过,我觉得人工智能很有趣,因为与我们做的其他一些事情相比,人工智能更依赖于研究和模型,而不是以产品为导向。你不能只是先设计出你想要的产品,然后再试图构建模型来适应它。你真的需要先设计模型及其能力,然后你就会出现一些新特性。这会导致系统构建出一些不一样的东西。最终,人们想要使用最好的模型,对吧?这也是为什么我们在谈论构建最个性化的人工智能,最好的语音识别,最佳的个性化体验,以及非常智能且延迟极低的体验时,必须设计整个系统以实现这些目标。这就是为什么我们在研究全双工语音技术,同时也在完善个性化技术,以便不仅能从与人工智能的互动中提取有用记忆,还能接入其他所有的大系统。
And that's why we design the specific models that we design, with the kinds of size and latency parameters that they have. Speaking of politics, there's been this perception that some tech leaders have been aligning with the president — you and others donated to his inaugural event and were on stage with him, and I think you settled a lawsuit, which resulted in a $25 million payment. I wonder what's going on here? Does it feel like the cost of doing business with the administration, or what's the best way to think about this? My view on this is: he's the president of the United States. Our default as an American company should be to try to have a productive relationship with whoever is running the government. We've tried to offer to support previous administrations as well. I've been pretty public with some of my frustrations with the previous administration — how they basically did not engage with us or the business community more broadly — which, frankly, I think is going to be necessary to make progress on some of these things.
这也是为什么我们把特定模型设计成这样的大小和延迟参数。谈到政治,外界有一种看法,认为一些科技领袖在向总统靠拢——你和其他人为就职活动捐了款,并与他同台。而且我记得你们和解了一场诉讼,支付了2500万美元。我想知道这里究竟发生了什么?这是否感觉像是与政府打交道的成本,或者应该怎样看待这件事?我的看法是,他是美国总统。作为一家美国公司,我们的默认态度应该是努力与执政者保持良好的关系。对于之前的政府,我们也努力提供过支持。我也曾非常公开地表达过对上一届政府的一些不满——他们基本上没有与我们或更广泛的商业界进行接触——说实话,我认为这种接触对于在某些问题上取得进展是必要的。
We're not going to be able to build the level of energy that we need if you don't have a dialogue and they're not prioritizing those things. But fundamentally, look, I think a lot of people want to write the story about what direction people are going. I just think we're trying to build great stuff, and we want to have a productive relationship with people. That's how I see it, and it's also how I would guess most others see it — but obviously, I can't speak for them. You've spoken out about how you've rethought some of the ways in which you engage with and defer to the government on moderation stuff in the past. How are you thinking about AI governance? Because if AI is as powerful as we think it might be, the government will want to get involved. What is the most productive approach to take there?
我们可能无法建立我们所需的能量水平。如果没有对话,且相关方不重视尝试去做这些事情,就很难实现这一目标。根本上,我认为许多人都关心人们的发展方向,但在我看来,我们只是想创造出色的事物,并与人们保持良好的合作关系。我认为大多数人可能也是这样想的,但显然,我不能替他们发言。你曾谈到你重新思考了过去在内容管理方面如何与政府互动和合作的问题。那么,你对人工智能治理有什么看法呢?因为如果人工智能真的像我们所想的那样强大,政府会想要介入。那么,最有效的方法是什么呢?
And what should the government be thinking about here? Yeah. I guess in the past, most of the comments that I made were in the context of content moderation, where it's been an interesting journey over the last 10 years — it's obviously been an interesting time in history. There have been novel questions raised about online content moderation. Some of those have led to productive new systems getting built, like our AI systems to detect nation states trying to interfere in each other's elections. I think we will continue building that stuff out, and that has been positive. Other stuff — we went down some bad paths. Like, I just think the fact-checking thing was not as effective as Community Notes, because it's not an internet-scale solution. There weren't enough fact checkers, and people didn't trust the specific fact checkers. You want a more robust system.
政府在这里应该考虑些什么呢?在过去,我的很多评论都围绕着内容审核这个话题。过去十年在这方面经历了很有趣的发展,这段历史时期无疑是很特别的。在线内容审核提出了一些新颖的问题,其中一些问题促成了建设新系统,例如通过我们的人工智能系统检测国家间相互干预选举的企图。这部分工作持续推进,并且取得了积极的效果。但另一方面,我们也走了一些错误的道路。比如,我认为事实核查的效果没有像社区笔记那样好,因为它不是一个适用于互联网规模的解决方案。事实核查员的数量不足,而且人们也不信任具体的事实核查员。我们需要一个更健全的系统。
So I think what we got with Community Notes is the right approach there. But my point on this was more that, historically, I probably deferred a little bit too much to either the media and their critiques, or to the government, on things they did not really have authority over — just because we were a central figure. Like, I think we tried to build systems so that maybe we wouldn't have to make all the content moderation decisions ourselves. And I guess part of the growth process over the last 10 years is just: okay, we're a meaningful company, we need to own the decisions that we make. We should listen to feedback from people, but we shouldn't defer too much to people who don't actually have authority over this, because at the end of the day, we're in the seat and we need to own the decisions that we make.
我认为我们在社区注释方面取得的成果是正确的。但我的观点是,从历史上看,我可能过多地依赖媒体及其批评,或者政府在一些他们没有真正权威的事情上发言,只是因为他们是中央人物。比如,我们试图建立一些系统,这样我们就不必亲自做出所有内容审核的决定。我想,在过去10年的成长过程中,我们意识到自己是一家有影响力的公司,需要对自己的决策负责。我们应该听取反馈,但不应该过于依赖那些实际上没有权威的人。毕竟,最终我们处于决策的核心位置,必须为自己的决定承担责任。
And so I think it's been a maturation process — in some ways painful — but I think we're probably a better company for it. Will tariffs increase the cost of building data centers in the US and shift buildouts to Europe and Asia? It is really hard to know how that plays out. I think we're probably in the early innings on that, and it's very hard to know. Got it. What is your single highest-leverage hour in a week? What are you doing in that hour? I don't know. I mean, every week is a little bit different. And it's probably the case that the most leveraged thing you do in a week is not the same thing each week — or else, by definition, you should probably spend more than one hour doing that thing every week.
所以,我认为这是一个成熟的过程,在某些方面可能很痛苦,但我认为我们因此成了更好的公司。关税是否会增加在美国建设数据中心的成本,并把建设转移到欧洲和亚洲?这种情况的发展很难预测,我认为我们可能还处于早期阶段,很难确定其走向。明白了。你一周中最有影响力的那一个小时是什么时候?你在那一小时会做什么呢?我不知道,每周的情况都会有些不同。最有影响力的事情每周可能都不一样,否则,按理说你应该每周花不止一小时在那件事情上。
But yeah, I don't know. That's part of the fun of this job, I guess, but also of the industry being so dynamic — things really move around, right? The world is very different now than it was at the beginning of the year, or than it was in the middle of last year. I think a lot has really advanced meaningfully, and a lot of cards have been turned over since the last time we sat down. That was about a year ago, right? Yeah, yeah. But I guess you were saying earlier that recruiting people is a super-high-leverage thing you do. It's very high leverage. Yeah. Yeah.
不过,我也不知道。这可能就是这份工作和整个行业的乐趣吧,因为事情总是在不断变化,对吧?现在的世界和年初或去年的年中都很不一样。我觉得很多事情都有了重要的进展,很多底牌也揭开了。上次我们坐下来讨论应该是一年前,对吧?是的。不过,你之前提到过,招聘是你做的一件极具杠杆效应的事情。确实如此,非常具有杠杆效应。
What would be possible — you talked about these models being mid-level software engineers by the end of the year — what would be possible if, say, software productivity increased 100x in two years? What kinds of things could be built that we can't build right now? What kinds of things? Well, that's an interesting question. I mean, I think one theme of this conversation is that the amount of creativity that's going to be unlocked is going to be massive. If you look at the overall arc of human society and the economy over the last 100 or 150 years, it's basically people going from being primarily agrarian — with most human energy going towards just feeding ourselves — to the things that take care of our basic physical needs becoming a smaller and smaller percent of human energy, which has led to two impacts.
如果到今年年底,这些模型被认为是中级软件工程师,或者假设软件生产力在两年内提高了100倍,会有什么可能性呢?我们现在无法构建的东西,到那时会是什么样子的?这是一个有趣的问题。我认为,这次讨论的主题之一就是将会释放出大量的创造力。如果你回顾过去100或150年人类社会和经济的整体演变,可以发现,人们从主要依靠农业生存,绝大部分人类精力都用于解决温饱问题,逐渐转变为这些基本生存需求占用的精力越来越少。这种变化带来了两个影响。
One is that more people are doing creative and cultural pursuits. And two is that people in general spend less time working and more time on entertainment and culture. I think that is almost certainly going to continue as this goes on. This isn't the one-to-two-year thing of what happens when you have a super-powerful software engineer, but I think over time, everyone is going to have these superhuman tools to create a ton of different stuff, and you're going to get this incredible diversity. Part of it is going to be solving the things we hold up as hard problems — curing diseases, solving different things around science, or just building different technology that makes our lives better.
越来越多的人从事创造性和文化方面的追求。其次,一般来说,更多的人花更少的时间工作,而更多的时间投入在娱乐和文化上。我认为,这种趋势几乎肯定会继续下去。这并不像那种一两年的短期变化,就像当有一个超级强大的软件工程师时会发生的事情。但我认为,随着时间的推移,人人都会拥有这些超级工具来创造各种各样的东西,结果将是我们看到不可思议的多样性。其中一部分将是在我们认为困难的问题上取得突破,比如解决疾病问题、解决各种科学问题,或是发展一些让我们生活更美好的技术。
But I would guess that a lot of it is going to end up being cultural and social pursuits and entertainment. And I would guess that the world is going to get a lot funnier and weirder and quirkier, the way that memes on the internet have over the last 10 years. I think that adds a certain kind of richness and depth as well, and in funny ways I think it actually helps you connect better with people. Because now, I don't know, it's like all day long I just find interesting stuff on the internet and send it in group chats to the people I care about who I think are going to find it funny.
我猜想其中很大一部分将成为一种文化和社交的追求以及娱乐活动。我认为世界将变得更加有趣、怪异和别具一格,就像过去10年在互联网上流行的那些表情包一样。我觉得这为生活增添了一种特别的丰富性和深度,从某种有趣的方式来看,它实际上帮助你更好地与人连接。因为最近,我整天都在网上发现有趣的东西,然后分享到群聊里,和那些我在乎、并且认为他们也会觉得好笑的人分享。
And the media that people can produce today to express very nuanced, specific cultural ideas — I don't know, it's cool. I think that will continue to get built out, and I think it does advance society in a bunch of ways, even if it's not the hard-science way of curing a disease. Yeah. But I guess, if you think about it, the Meta, social-media view of the world is: yeah, I think people are going to spend a lot more time doing that stuff in the future.
如今,人们可以创作出非常细腻的作品来表达特定的文化思想,我觉得这非常酷。我认为这种趋势会继续发展,虽然不像治愈疾病那样属于硬科学的进步,但它确实在许多方面推动了社会的前进。从Meta这种社交媒体的视角来看这个世界,我认为未来人们会花更多的时间在这类事情上。
And it's going to be a lot better, and it's going to help you connect, because it's going to help express different ideas. The world is going to get more complicated, but our technology — our cultural technology — for expressing these very complicated things in a very funny little clip or something is going to get so much better. So that's all great. I don't know about next year, but I guess one other thought that I think is interesting to cover is that, at least for the foreseeable future, I tend to think the following.
这将会变得更好,它会帮助你连接,因为它能表达不同的想法。随着世界变得越来越复杂,我们的科技或文化技术也会提高,能够通过一种有趣的小片段来表达这些复杂的事情。所以这确实是件好事。我不知道明年会怎样,但对于可见的未来,我有一个有趣的想法想要分享。
This is going to lead towards more demand for people doing work, not less. Now, people have a choice of how much time they want to spend working. But I'll give you one interesting example of something we were talking about recently. So you have almost three and a half billion people using our services every day. And one question that we've struggled with forever is: how do we provide customer support?
这将导致对人力工作的需求增加,而不是减少。当然,人们可以选择自己想要花多少时间在工作上。但我来给你一个有趣的例子,这是我们最近谈论的一件事情。我们每天有将近35亿人使用我们的服务。一个我们一直在努力解决的问题是:我们该如何提供客户支持呢?
Right now, today, you can write an email. But we've never seriously been able to contemplate having voice support, where someone can just call in. And I guess that's maybe one of the artifacts of having a free service, right? The revenue per person isn't so high that you can have an economic model where people can just call in. But also, you're talking about three and a half billion people using your service every day.
在今天,你可以写一封邮件。但是我们从未真正考虑过能够提供语音支持,就是让人们可以直接打电话进来。我想这可能是因为提供的是一个免费服务,对吧?每个人带来的收入不高,所以无法建立一个能让大家随意打电话的经济模式。再加上,每天有三十五亿人使用你的服务。
I mean, there'd be a massive, massive number of people — like the biggest call center in the world type of thing. It would be some ridiculous amount, like 10 or 20 billion dollars a year, to staff that. So we've never really thought too seriously about it, because it was always just like, no, there's no way this makes sense. But now, as the AI gets better, you're going to get to this place where the AI can handle a bunch of people's issues — not all of them, right?
我的意思是,你确实需要一个非常庞大的人力队伍,类似于世界上最大的呼叫中心。要维持这样的规模,每年可能需要花费100亿、200亿美元,这样的金额实在是太离谱了。因此,我们从未认真考虑过这种方案,因为一直觉得这不太现实。但现在随着人工智能的进步,情况开始改变了。人工智能将能够处理大量人的问题,虽然可能不是所有问题。
Because maybe 10 years from now or something, it can handle all of them. But when we're thinking about like a three to five year time horizon, it'll be able to handle a bunch kind of like self driving cars can handle a bunch of terrain. But in general, they're not like doing the whole route by themselves yet in most cases, right? It's like people thought truck driving jobs were going to go away.
也许在10年之后,它能够处理所有问题。但当我们考虑三到五年的时间范围时,它可能能够处理很多问题,就像自动驾驶汽车可以在很多地形上行驶一样。但总体而言,它们还不能在大多数情况下完全自主完成整个路线,对吧?就像人们曾认为卡车司机的工作会消失一样。
There's actually more truck driving jobs now than there were like when we started talking about self driving cars, you know, whatever it was almost 20 years ago. And I think for going back to this customer's support thing, it's like, all right, it wouldn't make sense for us to staff out, um, calling for everyone. But let's say the AI can handle 90% of that. Then like, and then if you if it can't handle it, then it kicks it off to a person.
实际上,现在的卡车驾驶工作比我们开始谈论自动驾驶汽车时(大约20年前)还要多。我认为回到客户支持这个话题上,就像这样:好的,我们不可能为每个人都配备客服人员。但是假设AI可以处理90%的客户服务请求,那么当AI不能处理的情况下,再转交给人工客服来处理。
Okay, now if you've gotten the cost of providing that service down to one tenth of what it would otherwise have been, then all right, maybe now that actually makes sense to go do, and that would be kind of cool. So the net result is, I actually think we're probably going to go hire more customer support people, right? The common belief people have is that, oh, this is clearly just going to automate jobs and all these jobs are going to go away.
好的,现在如果你把提供该服务的成本降低到原来的十分之一,那么也许这样做就真的有意义了,而且这还挺不错。最终的结果是,我实际上认为我们可能会招聘更多的客户支持人员。一般的看法或者人们普遍的信念是,这种情况显然会让工作自动化,把所有这些工作都替代掉。
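A quick back-of-the-envelope check on the numbers in this exchange: if full human staffing would cost roughly the $10-20 billion per year mentioned and the AI deflects about 90% of contacts, the residual human cost drops to a tenth, plus whatever inference costs. The per-contact inference cost and annual contact volume below are assumptions purely for illustration.

```python
# Illustrative arithmetic only; the inference cost and contact volume are assumptions.
full_human_staffing_cost = 15e9    # midpoint of the $10-20B/yr figure mentioned
ai_deflection_rate = 0.90          # share of issues the AI resolves on its own
inference_cost_per_contact = 0.05  # assumed; depends heavily on the model served
contacts_per_year = 5e9            # assumed volume

human_cost = full_human_staffing_cost * (1 - ai_deflection_rate)
ai_cost = inference_cost_per_contact * contacts_per_year
print(f"human: ${human_cost/1e9:.1f}B  ai: ${ai_cost/1e9:.2f}B  "
      f"total: ${(human_cost + ai_cost)/1e9:.1f}B vs ${full_human_staffing_cost/1e9:.0f}B fully staffed")
```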
That has not really been how the history of technology has worked. You can create things that take away 90% of the work, and that leads you to want more people, not fewer. Yeah. I mean, to close off the interview, I've been playing devil's advocate on a bunch of points, and I really appreciate you being a good sport about it.
实际上,这并不是技术发展的历程。通常,你会创造出能减轻90%工作的东西,这反而会让你需要更多的人,而不是更少。是的,我是说,在结束采访时,我在很多问题上扮演了唱反调的角色,我非常感谢你对此的宽容和理解。
But I do think there's no upper bound to how much beauty there can be in the world, especially if there are billions of AIs optimizing for beauty — you can see it in the amount of connection you can have, and so forth. So yeah, I'm pretty optimistic about it. Final question: who is the one person in the world today whose advice you most seek?
我确实认为世界上的美是没有上限的,尤其是如果有数十亿的人工智能在不断优化美这个概念。你可以在人与人之间的联系中看到这一点。所以我对未来感到相当乐观。最后一个问题:今天,世界上有哪个人是你最想寻求意见的人?
Oh, man. Well, I feel like part of my style is that I like having a breadth of advisors. So it's not just one person — we've got a great team. There are people at the company, people on our board, and a lot of people in the industry who are doing new stuff.
哦,天哪。嗯,我觉得这算是我的一种风格吧,我喜欢有很多顾问。所以不只是一个人,而是一个很棒的团队。就是说,我认为公司里的人、董事会的人都很出色。而且,还有业内很多人在做新的事情。
There's not a single person. But, I don't know, it's fun. And when the world is this dynamic, just having a reason to work with people you like on cool stuff — to me, that's what life is about.
没什么具体的人,但我知道这很有趣。而且,当世界充满变化时,我觉得和你喜欢的人一起做有趣的事情,这就是人生的意义所在。
Yeah. All right. Great note to close on. Awesome. Thanks for doing this.
好的。太棒了,在这样的基调上收尾很好。非常感谢你来做这期节目。
Yeah. Thank you. I hope you enjoyed this episode. If you did, the most helpful thing you can do is just share it with other people who you think might enjoy it. Send it to your friends, your group chats, Twitter, wherever else. Just let the word go forth.
好的。谢谢你。我希望你喜欢这一集。如果你喜欢,最有帮助的事情就是与其他你认为可能也会喜欢的人分享这集。可以发给你的朋友、群聊,或发到Twitter等任何地方。让更多的人知道吧。
Other than that, it's super helpful if you can subscribe on YouTube and leave a five-star review on Apple Podcasts and Spotify. Check out the sponsors in the description below. If you want to sponsor a future episode, go to dwarkesh.com/advertise.
除此之外,如果你能订阅我们的YouTube频道,并在Apple Podcasts和Spotify上留下五星好评,将会非常有帮助。请查看下方说明中的赞助商信息。如果你想赞助未来的节目,请访问dwarkesh.com/advertise。
Thank you for tuning in. I'll see you in the next one.
感谢收看,我们下次再见!