Chris Lattner: Future of Programming and AI | Lex Fridman Podcast #381
Published: 2023-06-02 21:18:32
Summary
Chris Lattner is a legendary software and hardware engineer who has led projects at Apple, Tesla, Google, SiFive, and Modular AI, including the development of Swift, LLVM, Clang, MLIR, CIRCT, TPUs, and Mojo.
EPISODE LINKS:
Chris's Twitter: https://twitter.com/clattner_llvm
Chris's Website: http://nondot.org/sabre/
Mojo programming language: https://www.modular.com/mojo
Modular AI: https://modular.com/
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
Full episodes playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
Clips playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOeciFP3CBCIEElOJeitOr41
OUTLINE:
0:00 - Introduction
2:20 - Mojo programming language
12:37 - Code indentation
21:04 - The power of autotuning
30:54 - Typed programming languages
47:38 - Immutability
59:56 - Distributed deployment
1:34:23 - Mojo vs CPython
1:50:12 - Guido van Rossum
1:57:13 - Mojo vs PyTorch vs TensorFlow
2:00:37 - Swift programming language
2:06:09 - Julia programming language
2:11:14 - Switching programming languages
2:20:40 - Mojo playground
2:25:30 - Jeremy Howard
2:36:16 - Function overloading
2:44:41 - Error vs Exception
2:52:21 - Mojo roadmap
3:05:23 - Building a company
3:17:09 - ChatGPT
3:23:32 - Danger of AI
3:27:27 - Future of programming
3:30:43 - Advice for young people
Transcript
On one axis, you have more hardware coming in. On the other hand, you have an explosion of innovation in AI. And so what happened with both TensorFlow and PyTorch is that the explosion of innovation in AI has led to, it's not just about matrix multiplication and convolution. These things have now like 2,000 different operators. And on the other hand, you have, I don't know how many pieces of hardware there are. It's a lot. Part of my thesis, part of my belief of where computing goes, if you look out 10 years from now, is that it's not going to get simpler. Physics isn't going back to where we came from. It's only going to get weirder from here on out. And so to me, the exciting part about what we're building is that it's about building that universal platform, on which the world can continue to get weird. Because again, I don't think it's avoidable. It's physics. But we can help lift people, scale, do things with it. And they don't have to rewrite their code every time a new device comes out. And I think that's pretty cool.
The following is a conversation with Chris Lattner, his third time on this podcast. As I've said many times before, he's one of the most brilliant engineers in modern computing, having created the LLVM compiler infrastructure project, the Clang compiler, the Swift programming language, and a lot of key contributions to TensorFlow and TPUs as part of Google. He served as vice president of Autopilot software at Tesla and was a software innovator and leader at Apple. And now he has co-created a new full-stack AI infrastructure for distributed training, inference, and deployment on all kinds of hardware, called Modular, and a new programming language called Mojo that is a superset of Python, giving you all the usability of Python but with the performance of C and C++. In many cases, Mojo code has demonstrated over 30,000x speedup over Python. If you love machine learning, if you love Python, you should definitely give Mojo a try.
This programming language, this new AI framework and infrastructure, and this conversation with Chris are mind-blowing. I love it. It gets pretty technical at times, so I hope you hang on for the ride. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, dear friends, here's Chris Lattner.
It's been, I think, two years since we last talked. And in that time, you somehow went and co-created a new programming language called Mojo. So it's optimized for AI. It's a superset of Python. Let's look at the big picture. What is the vision for Mojo?
For Mojo, well, I think you have to zoom out. So I've been working on a lot of related technologies for many, many years. So I've worked on LLVM and a lot of things, and mobile and servers and things like this. But the world's changing. And what's happened with AI is we have new GPUs and new machine learning accelerators and other ASICs and things like that that make AI go real fast. At Google, I worked on TPUs. That's one of the biggest large-scale deployed systems that exist for AI. And really what you see is, if you look across all of the things that are happening in the industry, there's this new compute platform coming.
And it's not just about CPUs or GPUs or TPUs or NPUs or IPUs or whatever, all the PUs. It's about how do we program these things? And so for software folks like us, it doesn't do us any good if there's this amazing hardware that we can't use. And one of the things you find out really quick is that having the theoretical capability of programming something and then having the world's power and the innovation of all the smart people in the world get unleashed on something can be quite different.
And so really where Mojo came from was starting from a problem: we need to be able to take machine learning, take the infrastructure underneath it, and make it way more accessible, way more usable, way more understandable by normal people and researchers and other folks that are not themselves experts in GPUs and things like this. And then through that journey, we realized, hey, we need syntax for this. We need to do a programming language.
So one of the main features of the language, and I say so fully in jest, is that it allows you to have the file extension be an emoji, the fire emoji, which is one of the first emojis used as a file extension I've ever seen in my life. And then you ask yourself the question, why in the 21st century are we not using Unicode for file extensions? This is, I mean, it's an epic decision. I think clearly the most important decision you made. But you could also just use .mojo as the file extension.
Well, so, okay, take a step back. I mean, come on, Lex, do you think that the world's ready for this? This is a big moment in the world, right? This is where we're releasing this out to the world. This is innovation. I mean, it really is kind of brilliant. Emojis are such a big part of our daily lives. Why are they not in programming? Well, and like you take a step back and look at what file extensions are, right? They're basically metadata, right? And so why are we spending all the screen space on them and all the stuff? Also, you know, you have them stacked up next to text files and PDF files and whatever else. If you're gonna do something cool, you want it to stand out, right? These are colorful. They're visual. They're beautiful.
What's been the response so far? Is there support on, like, Windows, on operating systems, in displaying it, like in File Explorer? Yeah, yeah. The one problem I've seen is that Git doesn't escape it right. And so it thinks that the fire emoji is unprintable, and so it prints out weird hex things if you use the command-line Git tool, but everything else, as far as I'm aware, works fine. And I have faith that Git can be improved. So I'm not worried. GitHub is fine? GitHub is fine, yeah. VS Code, Windows, all this stuff is totally ready because people have internationalization as a normal part of their paths.
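To make the file-extension point concrete, here is a minimal Python sketch. The file name hello.🔥 is a hypothetical example, and the escaping shown is only an illustration of what a tool might print when it treats non-ASCII bytes as unprintable; it is not a reproduction of Git's actual output format.

```python
from pathlib import PurePosixPath

# Hypothetical file name using the fire emoji as its extension.
path = PurePosixPath("hello.🔥")

# The suffix is ordinary Unicode text as far as Python is concerned.
suffix = path.suffix

# On disk the name is stored as bytes; the emoji is four UTF-8 bytes.
raw_bytes = suffix.encode("utf-8")

# A tool that refuses to print non-ASCII bytes might show escapes like this.
escaped = "".join(chr(b) if b < 0x80 else f"\\x{b:02x}" for b in raw_bytes)

print(suffix)
print(raw_bytes)
print(escaped)
```

The emoji is just another Unicode code point in the path, which is why tools that already handle internationalized file names cope with it.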
So let's just take the next step, right? Somewhere between, oh, wow, that makes sense. Cool. I like new things too. Oh my God, you're killing my baby. Like, what are you talking about? This can never be like, I can never handle this. How am I gonna type this? Like all these things. And so this is something where I think that the world will get there. We don't have to bet the whole farm on this. I think we can provide both paths, but I think it'll be great. When can we have emojis as part of the code? I wonder. Yeah, so I mean, lots of languages provide that. So I think that we have partial support for that. It's probably not fully done yet, but yeah, you can do that. For example, in Swift, you can do that for sure.
So an example we gave at Apple was the dogcow. Yeah, so that's a classical Mac heritage thing. And so you use the dog and the cow emoji together and that could be your variable name. But of course the internet went and made pile of poop for everything. Yeah. So if you wanna name your function pile of poop, then you can totally go to town and see how that gets through code review.
Okay, so let me just ask a bunch of random questions. So is Mojo primarily designed for AI or is it a general-purpose programming language? Yeah, good question. So it's AI first. And so AI is driving a lot of the requirements. And so Modular is building and designing and driving Mojo forward. And it's not because it's an interesting project theoretically to build, it's because we need it. And so at Modular, we're really tackling the AI infrastructure landscape and the big problems in AI and the reasons that it is so difficult to use and scale and adopt and deploy and like all these big problems in AI. And so we're coming at it from that perspective. Now, when you do that, when you start tackling these problems, you realize that the solution to these problems isn't actually an AI-specific solution.
And so while we're doing this, we're building Mojo to be a fully general programming language. And that means that you can obviously tackle GPUs and CPUs and like these AI things, but it's also a really great way to build NumPy and other things like that. Or, you know, just if you look at what many Python libraries are today, often they're a layer of Python for the API and they end up being C and C++ code underneath them. That's very true in AI. That's true in lots of other domains as well. And so anytime you see this pattern, that's an opportunity for Mojo to help simplify the world and help people have one thing. So optimize through simplification by having one thing.
So you mentioned Modular. Mojo is the programming language. Modular is the whole software stack? So just over a year ago, we started this company called Modular. Okay, what Modular's about is taking AI and up-leveling it into the next generation.
And so if you take a step back, what's gone on in the last five, six, seven, eight years is that we've had things like TensorFlow and PyTorch. And these other systems come in. You've used them, you know this. And what's happened is these things have grown like crazy. They get tons of users. It's in production deployment scenarios. It's being used to power so many systems. I mean, AI is all around us now. It used to be controversial years ago, but now it's a thing.
But the challenge with these systems is that they haven't always been thought out with current demands in mind. And so you think about it, where were LLMs eight years ago? Well, they didn't exist, right? AI has changed so much. And a lot of what people are doing today is very different than when these systems were built. And meanwhile, the hardware side of this has gotten into a huge mess. There are tons of new chips and accelerators, and every big company's announcing a new chip every day, it feels like. And so between that, you have like this moving system on one side, moving system on the other side, and it just turns into this gigantic mess, which makes it very difficult for people to actually use AI, particularly in production deployment scenarios.
And so what Modular is doing is we're helping build out that software stack to help solve some of those problems so that people can be more productive and get more AI research into production. Now, what Mojo does is it's a really, really, really important piece of that. And so it is part of that engine and part of the technology that allows us to solve these problems. So Mojo is a programming language that allows you to do high-level programming, low-level programming, all kinds of programming in that spectrum that gets you closer and closer to the hardware.
So take a step back. So Lex, what do you love about Python? Oh boy, where do I begin? What is love? What do I love about Python? You're a guy who knows love. I know this. Yes. How intuitive it is? How it feels like I'm writing natural language English? How I can not just write but read other people's code, and how I can understand it faster? It's more condensed than other languages, like ones I'm really familiar with, like C++ and C. There's a bunch of sexy little features. Yeah. We'll probably talk about some of them, but list comprehensions and stuff like this. And don't forget the entire ecosystem of all the packages. Oh yeah, that's probably huge. There's always something. If you want to do anything, there's always a package.
Yeah, so it's not just the ecosystem of the packages, it's the ecosystem of the humans that do it. That's an interesting dynamic. That's good. I think something about the usability and the ecosystem makes the thing viral. It grows and then it's a virtuous cycle. There are many things that went into that. I think that ML was very good for Python. I think that TensorFlow and PyTorch and these systems embracing Python really helped Python grow. But I think that the major thing underlying it is that Python is like the universal connector. It really helps bring together lots of different systems so you can compose them and build out larger systems without having to understand how it works.
But then what is the problem with Python? I guess you could say several things, but probably that it's slow. I think that's usually what people complain about. I mean, other people complain about tabs and spaces versus curly braces or whatever. Those people are just wrong because it is actually just better to use indentation. Wow, strong words. Actually, a small tangent. Let's actually take that. Let's take all kinds of tangents. Oh, come on. You can push me on it. I can take it.
I mean, listen, I've recently left Emacs for VS Code. The kind of hate mail I had to receive. Because on the way to doing that, I also said I've considered Vim and chose not to and went with VS Code. You're touching on deep religions, right? Anyway, tabs is an interesting design decision.
So you've really written a new programming language here. Yes, it is a superset of Python, but you can make a bunch of different interesting decisions here. Totally. And you chose to stick with Python in terms of some of the syntax. Well, let me explain why.
I mean, you can explain this in many rational ways. I think that the indentation is beautiful, but that's not a rational explanation, right? But I can defend it rationally. So first of all, Python, one, has millions of programmers. It's huge. It's everywhere. It owns machine learning. So factually, it is the thing.
Second of all, if you look at it, C code, C++ code, Java, whatever, Swift, curly brace languages also run through formatting tools and get indented. And so if they're not indented correctly, first of all, it will twist your brain around. It can lead to bugs. There are notorious bugs that have happened across time where the indentation was wrong or misleading and it wasn't formatted right. And so it turned into an issue, right?
And so what ends up happening in modern, large-scale code bases is that people run automatic formatters. So now what you end up with is indentation and curly braces. If you're going to have, you know, the notion of grouping, why not have one thing, right? And get rid of all the clutter and have a more beautiful thing, right? Also, you look at many of these languages. It's like, okay, well, you can have curly braces, or you can omit them if there's one statement, or you just enter this entire world of complicated design space that objectively you don't need if you have Python-style indentation.
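A small Python sketch of the "one thing for grouping" argument: in Python the indentation is the block structure, so what the code looks like and what it does cannot disagree. The two function names are hypothetical, and the functions differ only in the indentation of a single line.

```python
def seen_inside_if(values):
    """The second append is indented under the if, so it runs only for negatives."""
    log = []
    for v in values:
        if v < 0:
            log.append("negative")
            log.append("seen")
    return log


def seen_inside_loop(values):
    """The second append is dedented one level, so it runs on every iteration."""
    log = []
    for v in values:
        if v < 0:
            log.append("negative")
        log.append("seen")
    return log


print(seen_inside_if([1, -2]))
print(seen_inside_loop([1, -2]))
```

In a curly-brace language the same visual difference could be purely cosmetic, which is the source of the misleading-indentation bugs mentioned above.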
So yeah, I would love to actually see statistics on errors made because of indentation. Like how many errors are made in Python versus in C++ that have to do with basic formatting, all that kind of stuff. I would love to see that. I think it's probably pretty minor because, like, you use the VS Code editor too. So if you get VS Code set up, it does the indentation for you generally, right? And so, you know, it's actually really nice to not have to fight it. And then what you can see is the editor is telling you how your code will work by indenting it, which I think is pretty cool.
I honestly don't think I've ever, I don't remember having an error in Python because I indented stuff wrong. So I mean, I think that, again, this is a religious thing. And so I can joke about it. And I love to kind of, you know, I realize that this is such a polarizing thing and everyone wants to argue about it. And so I like poking at the bear a little bit, right? But frankly, right, come back to the first point: Python, one, it's huge. It's in AI. It's the right thing. For us, we see Mojo as being an incredible part of the Python ecosystem. We're not looking to break Python or change it or, quote unquote, fix it. We love Python for what it is. Our view is that Python is just not done yet.
And so if you look at, you know, you mentioned Python being slow. Well, there are a couple of different things that go into that, which we can talk about if you want. But one of them is that it just doesn't have those features that you would use to do C-like programming. And so if you say, okay, well, I'm forced out of Python into C for certain use cases, well, then what we're doing is we're saying, okay, well, why is that? Can we just add those features that are missing from Python to Mojo? And then you can have everything that's great about Python, all the things you're talking about that you love, plus not be forced out of it when you do something a little bit more computationally intense or weird or hardware-y or whatever it is you're doing.
Well, a million questions I want to ask. High level, again, is it compiled or is it an interpreted language? So Python is just-in-time compilation. What's Mojo?
So Mojo, the complicated answer, does all the things. So it's interpreted, it's JIT compiled, and it's statically compiled. And so this is for a variety of reasons. So one of the things that makes Python beautiful is that it's very dynamic. And because it's dynamic, one of the things they added is that it has this powerful metaprogramming feature. And so if you look at something like PyTorch or TensorFlow or even a simple use case, you define a class that has the plus method. You can overload the dunder methods like dunder add (__add__), for example, and then the plus operator works on your class. And so it has very nice and very expressive dynamic metaprogramming features.
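As a minimal sketch of the dunder-method overloading described here, in plain Python: the Vec2 class is a hypothetical example and is not taken from PyTorch or TensorFlow.

```python
class Vec2:
    """A tiny 2D vector used only to illustrate operator overloading."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Python calls this method when it evaluates `self + other`.
        return Vec2(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return isinstance(other, Vec2) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return f"Vec2({self.x}, {self.y})"


result = Vec2(1, 2) + Vec2(3, 4)
print(result)
```

The lookup of __add__ happens dynamically at runtime, which is the kind of flexibility the conversation goes on to contrast with embedded processors and GPUs.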
In Mojo, we want all those features to come in. We don't want to break Python, we want it all to work. But the problem is you can't run those super dynamic features on an embedded processor or on a GPU. Or if you could, you probably don't want to, just because of the performance.
And so we entered this question of saying, okay, how do you get the power of this dynamic metaprogramming into a language that has to be super efficient in specific cases? And so what we did was we said, okay, we'll take that interpreter. Python has an interpreter in it. Take that interpreter and allow it to run at compile time. And so now what you get is you get compile-time metaprogramming.
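Python itself has no compile-time phase of this kind, so the following is only a loose analogy and not Mojo's mechanism: an ordinary function is evaluated early, when another function is defined, and the same function is still available for normal runtime calls. The precompute helper and its names are hypothetical.

```python
def precompute(*inputs):
    """Return a decorator that evaluates a function once per input at definition time."""
    def wrap(fn):
        # This dictionary is built when the wrapper is created, before any later call.
        table = {x: fn(x) for x in inputs}

        def lookup(x):
            # Use the precomputed answer if there is one, else compute at runtime.
            return table[x] if x in table else fn(x)

        lookup.table = table
        return lookup
    return wrap


def factorial(n):
    """An ordinary function, written once and usable in both phases."""
    return 1 if n <= 1 else n * factorial(n - 1)


fast_factorial = precompute(0, 1, 5, 10)(factorial)

print(fast_factorial(5))   # answered from the precomputed table
print(fast_factorial(6))   # computed by the same function at call time
```

The point of the analogy is that one piece of ordinary code serves both the early phase and the runtime phase, rather than needing a separate template or macro language.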
And so this is super interesting and super powerful because one of the big advantages you get is you get Python-style expressive APIs. You get the ability to have overloaded operators. And if you look at what happens inside of PyTorch, for example, with automatic differentiation and eager mode and all these things, they're using these really dynamic and powerful features at runtime. But we can take those features and lift them so that they run at compile time.
Because C++ has metaprogramming with templates, but it's really messy. It's super messy. It was accidentally, I mean, different people have different interpretations. My interpretation is that it was made accidentally powerful.
It was not designed to be Turing complete, for example, but that was discovered kind of along the way accidentally. And so there have been a number of languages in the space. And so they usually have templates or code instantiation, code copying features of various sorts.
Some more modern languages, or some newer languages, let's say, that are fairly unknown, like Zig, for example, say, okay, well, let's take all of those things you can do at runtime and allow them to happen at compile time. And so one of the problems with C++, I mean, which is one of the problems with C++... Wrong words.
We're going to offend everybody today. Oh, it's okay. I mean, everybody hates me for a variety of reasons anyways, I'm sure, right? That's just the way they show love. I have written enough C++ code to earn a little bit of grumpiness with C++, but one of the problems with it is that the metaprogramming system, templates, is just a completely different universe from the normal runtime programming world.
And so if you do metaprogramming and programming, it's just like a different universe, different syntax, different concepts, different stuff going on. And so, again, one of our goals with Mojo is to make things really easy to use, easy to learn, and so there's a natural stepping stone.
And so as you do this, you say, okay, well, I have to do programming at runtime. I have to do programming at compile time. Why are these different things? How hard is that to pull it off? Because that sounds, to me, as a fan of metaprogramming in C++ even, how hard is it to pull that off? That sounds really, really exciting because you can do the same style programming at compile time and at runtime. That's really, really exciting.
Yep. And so, I mean, in terms of the compiler implementation details, it's hard. I won't be shy about that. It's super hard. It requires, I mean, what Mojo has underneath the covers is a completely new approach to the design of the compiler itself. And so this builds on these technologies like MLIR, that you mentioned, but it also includes other caching and other interpreters and compilers and other stuff like that.
You have an interpreter inside the compiler? Yes. And so it really takes the standard model of programming languages and kind of twists it and unifies it with the runtime model, which I think is really cool. And to me, the value of that is that, again, many of these languages have metaprogramming features. Like they grow macros or something.
Lisp, right? Yes. I know your roots, right? And this is a powerful thing, right? And so, you know, if you go back to Lisp, one of the most powerful things about it is that it said that the metaprogramming and the programming are the same, right? And so that made it way simpler, way more consistent, way easier to understand and reason about, and it made it more composable.
So if you build a library, you can use it both at runtime and compile time, which is pretty cool. Yeah. And for machine learning, I think metaprogramming, I think we could generally say, is extremely useful. And so you get features, I mean, I'll jump around, but the feature of autotuning and adaptive compilation just blows my mind.
Yeah. Well, so, okay, so let's come back to that. All right.
So what is machine learning? What is a machine learning model? Like you take a PyTorch model off the internet, right?
It's really interesting to me because what PyTorch and TensorFlow and all these frameworks are kind of pushing compute into is this abstract specification of a compute problem, which then gets mapped in a whole bunch of different ways.
Right? And so this is why it became a metaprogramming problem.
You want to be able to say, cool, I have this neural net, now run it with batch size 1000. Right? Do a mapping across batch.
Or, okay, I want to take this problem now running across 1000 CPUs or GPUs. Right?
And so this problem of like describe the compute and then map it and do things and transform it, that's actually very profound. And that's one of the things that makes machine learning systems really special.
Maybe can you describe auto tuning and how do you pull off? I mean, I guess adaptive compilation is what we're talking about as metaprogramming. How do you pull off auto tuning? I mean, is that as profound as I think it is?
It seems like a really like, you know, we'll mention list comprehensions.
To me, from a quick glance at Mojo, which by the way, I have to absolutely like dive in. As I realize how amazing this is, I absolutely must dive in. It does look like just an incredible feature for machine learning people.
Yeah. So what is auto tuning? So take a step back.
Auto tuning is a feature in Mojo. So very little of what we're doing is actually research.
Like many of these ideas have existed in other systems and other places.
And so what we're doing is we're pulling together good ideas, remixing them and making them into hopefully a beautiful system. Right?
And so auto tuning, the observation is that turns out hardware systems algorithms are really complicated.
Turns out maybe you don't actually want to know how the hardware works. Right?
A lot of people don't. Right?
And so there are lots of really smart hardware people. I know a lot of them where they know everything about, okay, that the cache size is this and the number of registers is that.
And if you use this, like the vector, it's going to be super efficient because it maps directly onto what it can do. And like all this kind of stuff, or the GPU has SMs and has a warp size of whatever, right?
All the stuff that goes into these things. Or the tile size of your matrix changes, or an H100 versus a Volta versus a whatever GPU, they have different characteristics, right?
A lot of the algorithms that you run are actually the same, but the parameters, these magic numbers you have to fill in end up being really fiddly numbers that an expert has to go figure out.
And so what auto-tuning does is it says, okay, well, guess what? There's a lot of compute out there, right?
So instead of having humans go randomly try all the things or do a grid search or go search some complicated multi-dimensional space, how about we have computers do that? Right, and so what auto-tuning does is you can say, hey, here's my algorithm.
If it's a matrix operation or something like that, you can say, okay, I'm gonna carve it up into blocks, I'm gonna do those blocks in parallel, and I wanna, this, with 128 things that I'm running on, I wanna cut it this way or that way or whatever.
And you can say, hey, go see which one's actually empirically better on the system. And then the result of that, you cache for that system.
Yep, right.
You save it.
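The loop described here (list some candidate parameters, measure each one on the actual machine, keep the winner, cache it for that system) can be sketched in plain Python. This is only a runtime analogy under stated assumptions: Mojo's feature runs inside the compiler, and the names `blocked_sum`, `autotune`, and `_tuning_cache` are hypothetical, made up for this illustration.

```python
import platform
import time


def blocked_sum(data, block_size):
    """Sum `data` one block at a time; block_size is the tunable parameter."""
    total = 0
    for start in range(0, len(data), block_size):
        total += sum(data[start:start + block_size])
    return total


# Hypothetical cache: (machine type, function name) -> best parameter found.
_tuning_cache = {}


def autotune(func, data, candidates, repeats=3):
    """Time every candidate on this machine and remember the fastest one."""
    key = (platform.machine(), func.__name__)
    if key in _tuning_cache:
        return _tuning_cache[key]  # already tuned on this system
    best_param, best_time = None, float("inf")
    for param in candidates:
        elapsed = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(data, param)
            elapsed = min(elapsed, time.perf_counter() - start)
        if elapsed < best_time:
            best_param, best_time = param, elapsed
    _tuning_cache[key] = best_param
    return best_param


data = list(range(100_000))
candidates = [16, 128, 1024, 8192]
best = autotune(blocked_sum, data, candidates)
print("fastest block size on this machine:", best)
```

Which block size wins depends on the machine, which is the point: the algorithm stays the same and only the "magic number" is chosen empirically.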
And so come back to twisting your compiler brain, right?
So not only does the compiler have an interpreter that it uses to do metaprogramming, that compiler, that interpreter, that metaprogramming now has to actually take your code and go run it on a target machine.
See which one it likes the best and then stitch it in and then keep going, right?
So part of the compilation is machine-specific.
Yeah, well, so I mean, this is an optional feature, so you don't have to use it for everything.
But yeah, if you're, so one of the things that we're in the quest of is ultimate performance. Yes.
Ultimate performance is important for a couple of reasons, right?
So if you're an enterprise, you're looking to save costs and compute and things like this, ultimate performance translates to fewer servers.
Like if you care about the environment, hey, better performance leads to more efficiency, right?
I mean, you could joke and say, like, Python's bad for the environment, right?
And so if you move to Mojo, it's like at least 10x better, just out of the box and then keep going, right?
But performance is also interesting because it leads to better products. And so in the space of machine learning, right, if you reduce the latency of a model, so it runs faster. So every time you query the server running the model, it takes less time. Well, then the product team can go and make the model bigger. Well, that actually makes it so you have a better experience as a customer. And so a lot of people care about that.
So for auto tuning, for like tile size, you mentioned 128 for TPU, you would specify like a bunch of options to try, just in the code, just simple statement. And then you just set and forget and know depending on where it compiles, it'll actually be the fastest. And yeah, exactly. And the beauty of this is that it helps you in a whole bunch of different ways, right?
So if you're building, so often what will happen is that you've written a bunch of software yourself, right? You wake up one day, you say, I have an idea, I'm gonna go code up some code, I get to work, I forget about it, I move on with life, I come back six months or a year or two years or three years later, you dust it off and you go use it again in a new environment. And maybe your GPU is different. Maybe you're running on a server instead of a laptop, maybe whatever, right? And so the problem now is you say, okay, well, I mean, again, not everybody cares about performance, but if you do, you say, okay, well, I wanna take advantage of all these new features. I don't wanna break the old thing though, right?
And so the typical way of handling this kind of stuff before is, if you're talking about C++ templates, or you're talking about C with macros, you end up with #ifdefs, you get like all these weird things that get layered in, make the code super complicated, and then how do you test it, right? It becomes this crazy complexity, multi-dimensional space that you have to worry about. And that just doesn't scale very well.
Actually, let me just jump around before I go on to specific features. Like the increase in performance here that we're talking about can be just insane. You write that Mojo can provide a 35,000x speed up over Python. How does it do that?
Yeah, so I can even do more. But we'll get to that. So first of all, when we say that, we're talking about what's called CPython. It's the default Python that everybody uses. When you type python3, that's like typically the one you use, right?
CPython is an interpreter. And so interpreters, they have an extra layer of like bytecodes and things like this, that they have to go read, parse, interpret, and it makes them kind of slow from that perspective. And so one of the first things we do is we move to a compiler. And so just moving to a compiler, getting the interpreter out of the loop is a 2 to 5 to 10x speed up depending on the code. So just out of the gate, just using more modern techniques, right?
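The "extra layer of bytecodes" is something you can look at directly with the standard library's `dis` module. A minimal sketch, assuming a CPython interpreter (the exact instruction names differ between CPython versions, so no particular output is claimed here):

```python
import dis


def add(a, b):
    return a + b


# Each entry is one bytecode instruction that the interpreter loop
# has to fetch, decode, and dispatch every time add() runs.
instructions = list(dis.get_instructions(add))
for ins in instructions:
    print(ins.opname, ins.argrepr)
```

Even a one-line function expands into several interpreter steps, which is the overhead a compiler removes.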
Now, if you do that, one of the things you can do, you can start to look at how CPython started to lay out data. And so one of the things that CPython did, and this isn't part of the Python spec necessarily, but this is just sets of decisions, is that if you take an integer, for example, it'll put it in an object. So in Python, everything's an object. And so they do the very logical thing of keeping the memory representation of all objects the same. So all objects have a header, they have like payload data. And what this means is that every time you pass around an object, you're passing around a pointer to the data. Well, this has overhead, right?
Turns out that modern computers don't like chasing pointers very much and things like this. It means that you have to allocate the data, means you have to reference count it, which is another way that Python uses to keep track of memory. And so this has a lot of overhead. And so if you say, okay, let's try to get that out of the heap, out of a box, out of an indirection, and into the registers. That's another 10x. So it adds up. If you're reference counting every single thing you create, that adds up.
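The cost of boxing and reference counting can be observed from inside CPython itself. The following is a small sketch, assuming a standard CPython build; the exact byte counts vary by platform and version, so the code only compares them.

```python
import sys
from array import array

# A Python int is a heap object: header (refcount, type pointer) plus payload.
boxed_bytes = sys.getsizeof(12345)

# An array of C "long long" stores each value unboxed in a fixed-size slot.
raw_bytes = array("q", [12345]).itemsize

print("boxed int object:", boxed_bytes, "bytes")
print("unboxed 64-bit slot:", raw_bytes, "bytes")

# Every new reference to an object bumps its reference count.
value = object()
refs_before = sys.getrefcount(value)
alias = value  # a second name pointing at the same object
refs_after = sys.getrefcount(value)
print("refcount went from", refs_before, "to", refs_after)
```

The boxed integer takes several times the space of the raw 64-bit value, and simply binding another name to an object performs a refcount update.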
Yeah, and if you look at, you know, people complain about the Python GIL, this is one of the things that hurts parallelism. That's because of the reference counting. Right, and so the GIL and reference counting are very tightly intertwined in Python. It's not the only thing, but it's very tightly intertwined.
And so then you lean into this and you say, okay, cool. Well, modern computers, they can do more than one operation at a time. And so they have vectors. What is a vector? Well, a vector allows you to take one, instead of taking one piece of data, doing an add or a multiply, and then pick up the next one, you can now do a four, eight, or 16, or 32 at a time. Right, well, Python doesn't expose that because of reasons. And so now you can say, okay, well, you can adopt that. Now you have threads. Now you have additional things, like you can control memory hierarchy.
And so what Mojo allows you to do is it allows you to start taking advantage of all these powerful things that have been built into the hardware over time. And the library gives very nice features. So you can say, just parallelize this. Do this in parallel. So it's very powerful weapons against slowness, which is why people have been having fun, like just taking code and making it go fast, because it's just kind of an adrenaline rush to see how fast you can get things.
Before I talk about some of the interesting stuff with parallelization, all that, let's first talk about, like, the basics. We talked about the indentation. All right, so this thing looks like Python. It's sexy and beautiful like Python, as I mentioned.
Is it a typed language? So what's the role of types? Yeah, good question. So Python has types. It has strings, it has integers, it has dictionaries, and all that stuff. But they all live at runtime. Right, and so because all those types live at runtime in Python, you don't have to spell them. Python also has this whole typing thing going on now, and a lot of people use it. I'm not talking about that. That's kind of a different thing. We can go back to that if you want.
But typically, you just say, I have a def, and my def takes two parameters. I'm going to call them a and b, and I don't have to write a type. So that is great. But what that does is that forces what's called a consistent representation. So these things have to be a pointer to an object with the object header, and they all have to look the same. And then when you dispatch a method, you go through all the same different paths, no matter what the receiver, whatever that type is. So what Mojo does is it allows you to have more than one kind of type. And so what it does is it allows you to say, OK, cool. I have an object. And objects behave like Python does. And so it's fully dynamic, and that's all great.
And for many things, classes, that's all very powerful and very important. But if you want to say, hey, it's an integer, and it's 32 bits, or 64 bits, or whatever it is, or it's a floating point value, it's 64 bits. Well, then the compiler can take that, and it can use that to do way better optimization. And it turns out, again, getting rid of the indirections, it's huge. It means you can get better code completion because you have, because the compiler knows what the type is, and so it knows what operations work on it.
And so that's actually pretty huge. And so what Mojo does is it allows you to progressively adopt types into your program. And so you can start, again, it's compatible with Python. And so then you can add however many types you want, wherever you want them. And if you don't want to deal with it, you don't have to deal with it.
And so one of our opinions on this is that it's not that types are the right thing or the wrong thing. It's that they're a useful thing. Which is kind of optional. It's not strict typing. You don't have to specify a type. Exactly. OK, so starting from the thing that Python's kind of reaching towards right now, trying to inject types into it. Yeah, with a very different approach. But yes. Yeah, what's the different approach?
I'm actually one of the people that have not been using types very much in Python. So why did you say? It just, well, because I know the importance, it's like adults use strict typing. And so I refuse to grow up in that sense. It's a kind of rebellion. But I just know that it probably reduces the amount of errors, even just for forget about performance improvements. It probably reduces errors when you do strict typing.
Yeah, so I think it's interesting if you look at that. And the reason I'm giving it a hard time is that there's this cultural norm, this pressure, there has to be a right way to do things. You know, grownups only do it one way. And if you don't do that, you should feel bad. Right? Some people feel like Python's a guilty pleasure or something. And it's like, when it gets serious, I need to go rewrite it. Exactly.
Well, I mean, cool. I understand history. And I understand kind of where this comes from. But I don't think it has to be a guilty pleasure. Yeah. So if you look at that, you say, why do you have to rewrite it? Well, you have to rewrite it to deploy. Well, why do you want to deploy? Well, you care about performance, you care about profitability, or you want a tiny thing on the server that has no dependencies, or you have objectives that you're trying to attain.
So what if Python can achieve those objectives? So if you want types, well, maybe you want types because you want to make sure you're passing them the right thing, sure, you can add a type. If you don't care, you're prototyping some stuff, you're hacking some things out, you're pulling some random code off the internet. It should just work. And you shouldn't be pressured. You shouldn't feel bad about doing the right thing, or the thing that feels good.
Now, if you're in a team, you're working at some massive internet company, and you have 400 million lines of Python code, well, they may have a house rule that you use types, because it makes it easier for different humans to talk to each other and understand what's going on, bugs at scale. And so there are lots of good reasons why you might want to use types. But that doesn't mean that everybody should use them all the time.
So what Mojo does is it says, cool, well, allow people to use types. And if you use types, you get nice things out of it. You get better performance and things like this. But Mojo is a fully compatible superset of Python. And so that means it has to work without types. It has to support all the dynamic things. It has to support all the packages. It has to support for comprehensions, list comprehensions, and things like this.
And so that starting point, I think, is really important. And I think that, again, you can look at why I care so much about this. And there's many different aspects of that, one of which is the world went through a very challenging migration from Python 2 to Python 3. And this migration took many years. And it was very painful for many teams. There's a lot of things that went on in that. I'm not an expert in all the details. And I honestly don't want to be. I don't want the world to have to go through that. And people can ignore Mojo. And if it's not their thing, that's cool. But if they want to use Mojo, I don't want them to have to rewrite all their code.
I mean, this super set part is so much brilliant stuff here. That definitely is incredible. We'll talk about that. First of all, how's the typing implemented differently in Python versus Mojo? So this heterogeneous flexibility, you said, is definitely implemented. Yeah.
So I'm not a full expert in the whole back story on types in Python. So I'll give you that. I can give you my understanding. My understanding is basically, like many dynamic languages, the ecosystem went through a phase where people went from writing scripts to writing large scale, huge code bases in Python. And at scale, it kind of helps to have types.
People want to be able to reason about interfaces. Do you expect a string or an int? Or like, what are these basic things, right? And so what the Python community started doing is it started saying, OK, let's have tools on the side. Checker tools, right? That go and enforce invariants. Check for bugs. Try to identify things. These are called static analysis tools, generally. And so these tools run over your code and try to look for bugs.
What ended up happening is there are so many of these things. So many different weird patterns and different approaches on specifying the types and different things going on that the Python community realized and recognized, hey, hey, there's the thing here. And so what they started to do is they started to standardize the syntax for adding types to Python.
Now, one of the challenges that they had is that they're coming from kind of this fragmented world where there's lots of different tools. They have different trade-offs and interpretations, and the types mean different things. And so if you look at types in Python, according to the Python spec, the types are ignored. Right. So according to the Python spec, you can write pretty much anything in a type position. And technically, you can write any expression.
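That annotations are not checked by the interpreter, and that any expression is syntactically allowed in a type position, is easy to demonstrate. A minimal sketch in plain Python (the function names `double` and `odd` are made up for this illustration):

```python
def double(x: int) -> int:
    return x * 2


# Any expression is accepted where a type would go; it is evaluated
# and stored as metadata, but nothing enforces it.
def odd(x: "any expression is allowed here" * 2) -> 40 + 2:
    return x


# Passing a str where the annotation says int raises no error at runtime.
text_result = double("ab")
print(text_result)

print(odd.__annotations__["return"])
```

A static checker run on the side would flag `double("ab")`, but the interpreter itself treats the annotation as a hint only.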
OK. Now, that's beautiful because you can extend it. You can do cool things. You can build your own tools. You can build your own in-house linter, or something like that. But it's also a problem because any existing Python program may be using different tools. And they have different interpretations. And so if you adopt somebody's package into your ecosystem, try to run the tool you prefer. It may throw out tons of weird errors and warnings and problems just because it's incompatible with how these things work.
Also because they're added late and they're not checked by the Python interpreter, it's always kind of more of a hint than it is a requirement. Also, the CPython implementation can't use them for performance. And so it's really a big one, right?
So you can't utilize them for the just-in-time compilation. OK. Exactly. And this all comes back to the design principle. They're kind of hints. The definition's a little bit murky. It's unclear exactly the interpretation in a bunch of cases. And so because of that, you can't actually, even if you want to, it's really difficult to use them to say it is going to be an int. And if it's not, it's a problem. A lot of code would break if you did that.
So in Mojo, so you can still use those kind of type annotations. It's fine. But in Mojo, if you declare a type and you use it, then it means it is going to be that type. And the compiler helps you check that and enforce it. And it's safe. And it's not a best effort hint kind of a thing. So if you try to shove a string type thing into an integer. You get an error from the compiler. From the compiler, at compile time. Nice.
OK. What kind of basic types are there? Yeah. So Mojo is pretty hardcore in terms of what it tries to do in the language, which is the philosophy there is that we, again, if you look at Python, Python's a beautiful language because it's so extensible. And so all of the different things in Python, like for loops and plus and all these things can be accessed through these underbar-underbar methods. So you have to say, OK, if I make something that is super fast, I can go all the way down to the metal.
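The "underbar-underbar" (double underscore, or "dunder") methods mentioned here are how a user-defined Python class plugs into `for` loops and `+`. A short sketch, where `Countdown` is a hypothetical class invented for this example:

```python
class Countdown:
    """Counts down from `start` to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Called by `for` loops and comprehensions.
        return iter(range(self.start, 0, -1))

    def __add__(self, other):
        # Called for `self + other`.
        return Countdown(self.start + other.start)


combined = Countdown(2) + Countdown(1)
values = [n for n in combined]
print(values)
```

Because the language routes these operations through ordinary methods, a library type can behave like a built-in one, which is the property the speaker wants to keep while adding static structs.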
Why do I need to have integers built into the language? And so what Mojo does is it says, OK, well, we can have this notion of structs. So you have classes in Python. Now you can have structs. Classes are dynamic. Structs are static. Cool. We can get high performance. We can write C++ kind of code with structs if you want. These things mix and work beautifully together. But what that means is that you can go and implement strings and ints and floats and arrays and all that kind of stuff in the language.
And so that's really cool because to me, as an idealistic compiler language type of person, what I want to do is I want to get magic out of the compiler and put it in the libraries. Because if somebody can, if we can build an integer that's beautiful and has an amazing API and does all the things you'd expect an integer to do, but you don't like it, maybe you want a big integer. Maybe you want a sideways integer. I don't know what all the space of integers are. Then you can do that. And it's not a second class citizen.
And so if you look at certain other languages, like C++, one I also love and use a lot, int is hard-coded in the language. But complex is not. And so isn't it kind of weird that you have this std::complex class, but you have int? And complex tries to look like a natural numeric type and things like this. But integers and floating point have these special promotion rules and other things like that that are magic. And they're hacked into the compiler. And because of that, you can't actually make something that works like the built-in types.
Is there something provided as a standard? Because it's AI first. Numeric types are so important here. So is there something like a nice standard implementation of integer and float? Yeah. So we're still building all this stuff out. So we provide integers and floats and all that kind of stuff. We also provide buffers and tensors and things like that, that you'd expect in an ML context. Honestly, we need to keep designing and redesigning and working with the community to build that out and make that better. That's not our strength right now.
Give us six months or a year. And I think it'll be way better. But the power of putting in the library means that we can have teams of experts that aren't compiler engineers that can help us design and refine and drive us forward. So one of the exciting things we should mention here is that this is new and fresh. This cake is unbaked. It's almost baked. You can tell it's delicious, but it's not fully ready to be consumed. Yep. That's very fair. It is very useful. But it's very useful if you're a super low-level programmer right now.
And what we're doing is we're working our way up the stack. And so the way I would look at Mojo today in May 2023 is that it's like a 0.1. So I think that a year from now, it's going to be way more interesting to a variety of people. But what we're doing is we decided to release it early so that people can get access to it and play with it. And we can build with the community. We have a big roadmap fully published, being transparent about this. And a lot of people are involved in this stuff. And so what we're doing is we're really optimizing for building the thing the right way. And building it the right way is kind of interesting, working with the community, because everybody wants it yesterday. And sometimes there's some dynamics there. But I think it's the right thing.
So there's a Discord also. So the dynamics is pretty interesting. Sometimes the community probably can be very chaotic and introduce a lot of stress. Guido famously quit over the stress of the walrus operator. I mean, that broke. Maybe that broke the camel's back. Exactly. And so it could be very stressful to develop. But can you just add tangent upon a tangent? Is it stressful to work through the design of various features here, given that the community is so virtually involved?
Well, so I've been doing open development and community stuff for decades now. Somehow this has happened to me. So I've learned some tricks. But the thing that always gets me is I want to make people happy. And so maybe not all people, all happy, all the time. But generally, I want people to be happy.
And so the challenge is that, again, we're tapping into some deep-seated, long-standing tensions and pressures, both in the Python world, but also in the AI world, in the hardware world, and things like this. And so people just want to move faster. And so, again, our decision was, let's release this early. Let's get people used to it, or access to it, and play with it. And let's build in the open. We could have had the language monk sitting in the cloister up on the hilltop, beavering away, trying to build something.
But in my experience, you get something that's way better if you work with the community. And so, yes, it can be frustrating. It can be challenging for lots of people involved. And you mentioned our Discord: we have over 10,000 people on the Discord, 11,000 people or something. Keep in mind, we released Mojo two weeks ago. So it's very active. So it's very cool.
But what that means is that 10,000 or 11,000 people will all want something different. And so what we've done is we've tried to say, OK, cool, here's our roadmap. And the roadmap isn't completely arbitrary. It's based on, here's the logical order in which to build these features or add these capabilities and things like that. And what we've done is we've spun really fast on bug fixes.
And so we actually have very few bugs, which is cool. I mean, actually, for a project in this state. But then what we're doing is we're dropping in features very deliberately. I mean, this is fun to watch, because you've got the two gigantic communities: the hardware systems engineers, and then the machine learning Python people that are higher level. And it's just two armies. They've been at war. Yeah. They've been at war.
So here's a Tolkien novel or something. OK. So here's the test. And again, it's super funny for something that's only been out for two weeks, right? People are so impatient, right? But OK, cool. Let's fast forward a year. Like, in a year's time, Mojo will be actually quite amazing, and solve tons of problems, and be very good. People still have these problems, right?
And so you look at this and you say, and the way I look at this at least is to say, OK, well, we're solving big, longstanding problems. To me, again, having worked on many different problems, I want to make sure we do it right. There's a responsibility you feel, because if you mess it up, there are very few opportunities to do projects like this and have them really have impact on the world.
If we do it right, then maybe we can take those feuding armies and actually heal some of those wounds. Yeah. This feels like a speech by George Washington or Abraham Lincoln or something. And you look at this and it's like, OK, well, how different are we? We all want beautiful things. We all want something that's nice. We all want to be able to work together. We all want our stuff to be used, right?
And so if we can help heal that... Now, I'm not optimistic that all people will use Mojo and they'll stop using C++. Like, that's not my goal, right? But if we can heal some of that, I think that'd be pretty cool. Yeah, and we start by putting the people who like braces into the Gulag. No. So there are proposals for adding braces to Mojo. We just tell them, what are you saying? We tell them no.
OK. Well, anyway, so there's a lot of amazing features on the roadmap, and those already implemented. It'd be awesome if I could just ask you a few things. Yeah, go for it.
So the other performance improvement comes from immutability. So what's this var and this let thing that we've got going on? What's immutability? Yeah, so one of the things that is useful, and it's not always required, but it's useful, is knowing whether something can change out from underneath you.
In Python, you have a pointer to an array. And so you pass that pointer to an array around to things. If you pass it into a function, it may take that and squirrel it away in some other data structure. And so you get your array back and you go to use it, and now somebody else is putting stuff in your array. How do you reason about that? It gets to be very complicated and leads to a lot of bugs.
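As a rough illustration of the aliasing problem described here, a minimal Python sketch. The `Registry` class and its method names are hypothetical, made up for this example rather than taken from the conversation:

```python
class Registry:
    """Hypothetical class that stashes whatever list it is handed."""

    def __init__(self):
        self.saved = []

    def remember(self, items):
        # Stores a reference to the caller's list, not a copy.
        self.saved.append(items)


registry = Registry()
my_array = [1, 2, 3]
registry.remember(my_array)

# Later, code that only knows about the registry mutates the stored list.
registry.saved[0].append(99)

# The caller's list changed "out from underneath" it.
print(my_array)                       # [1, 2, 3, 99]
print(registry.saved[0] is my_array)  # True
```

Nothing in the call `registry.remember(my_array)` tells the caller that the list is now shared, which is what makes this class of bug hard to reason about.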
And so one of the things that, again, this is not something Mojo forces on you, but something Mojo enables, is a thing called value semantics. And what value semantics do is they take collections, like arrays, like dictionaries, also tensors and strings and things like this that are at a much higher level, and make them behave like proper values.
And so it makes it look like, if you pass these things around, you get a logical copy of all the data. And so if I pass you an array, it's your array. You can go do what you want to it. You're not going to hurt my array. Now, that is an interesting and very powerful design principle. It defines away a ton of bugs. You have to be careful to implement it in an efficient way. Is there a performance hit that's significant? Generally not, if you implement it the right way, but it requires a lot of very low-level getting-the-language-right bits.
I assumed there'd be a huge performance hit, because the benefit is really nice, because you don't get into the... Absolutely. Well, the trick is you can't do copies. So you have to provide the behavior of copying without doing the copy. Yeah. How do you do that? How do you do that? It's not magic. It's actually pretty cool.
Well, so before we talk about how that works, let's talk about how it works in Python. So in Python, you define a person class, or maybe a person class is a bad idea. You define a database class, right? And a database class has an array of records, something like that, right? And so the problem is that if you pass a record or a class instance into the database, it'll take hold of that object, and then it assumes it has it. And if you're passing an object in, you have to know that that database is going to take it, and therefore you shouldn't change it after you put it in the database, right? You just kind of have to know that. You just have to kind of know that, right?
And so you roll out version one of the database. You just kind of have to know that. Of course, Lex uses his own database, right? Yeah. Right? Because you built it, you understand how this works, right? Somebody else joins the team. They don't know this, right? And so now they suddenly get bugs. You're having to maintain the database. You shake your fist. You argue. The tenth time this happens, you're like, OK, we have to do something different. Right?
And so what you do is you go change your Python code, and you change your database class to copy the record every time you add it. And so what ends up happening is you say, OK, I will do what's called a defensive copy inside the database. And then that way, if somebody passes something in, I will have my own copy of it. And they can go do whatever, and they're not going to break my thing.
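The defensive-copy pattern described here can be sketched in a few lines of Python. The `Database` class below is a hypothetical stand-in for the one in the conversation; only `copy.deepcopy` from the standard library is real API:

```python
import copy


class Database:
    """Hypothetical in-memory database that defensively copies records."""

    def __init__(self):
        self._records = []

    def add(self, record):
        # Defensive copy: the database keeps its own independent record.
        # The cost is paid on every insert, even if the caller never
        # touches the original again.
        self._records.append(copy.deepcopy(record))

    def get(self, index):
        # Copy on the way out too, so callers cannot mutate stored state.
        return copy.deepcopy(self._records[index])


db = Database()
record = {"name": "Ada", "tags": ["admin"]}
db.add(record)

# The caller keeps mutating its own object after the insert.
record["tags"].append("banned")

print(db.get(0))  # {'name': 'Ada', 'tags': ['admin']}
```

This removes the bug, but every `add` and `get` now copies eagerly, which is the cost the lazy approach discussed next tries to avoid.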
These are usually the two design patterns. If you look in PyTorch, for example, this is cloning a tensor. Like, there's a specific thing, and you have to know where to call it. And if you don't call it in the right place, you get these bugs. And this is state of the art.
So, a different approach. It's used in many languages, and I worked with it in Swift. You say, OK, well, let's provide value semantics. And so we want to provide the view that you get a logically independent copy, but we want to do that lazily. And so what we do is you say, OK, if you pass something into a function, it doesn't actually make a copy. What it actually does is it just increments a reference count on it. And if you pass it around, you stick it in your database, it can go into the database; you own it. And then you come back out of the stack. Nobody's copied anything. You come back out of the stack, and then the caller lets go of it.
Well, then you've just handed it off to the database. You've transferred it, and there are no copies made. Now, on the other hand, if your coworker goes and hands you a record, and you pass it in, you stick it in the database, and then you go to town and you start modifying it, what happens is you get a copy lazily, on demand. And so what this does is it gives you copies only when you need them. And so it defines away the bugs, but it also generally reduces the number of copies in practice. But the implementation details are tricky here.
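To make the lazy, copy-on-demand idea concrete, here is a minimal copy-on-write sketch in Python. It is an illustration only, not how Mojo or Swift implement value semantics: `CowList` and its methods are hypothetical names, and the share counter is a simplification of real reference counting.

```python
class CowList:
    """Hypothetical value-semantic list: copies share storage until written."""

    def __init__(self, items=()):
        self._items = list(items)
        # One-element list used as a share counter visible to all handles.
        # Handles that are simply dropped never decrement it, so the counter
        # can only over-estimate sharing (an extra copy, never a missed one).
        self._shared = [1]

    def copy(self):
        # "Logical copy": O(1), shares the buffer and bumps the share count.
        other = CowList.__new__(CowList)
        other._items = self._items
        other._shared = self._shared
        self._shared[0] += 1
        return other

    def _make_unique(self):
        # Copy the buffer only if another handle may still share it.
        if self._shared[0] > 1:
            self._shared[0] -= 1
            self._items = list(self._items)
            self._shared = [1]

    def append(self, value):
        self._make_unique()
        self._items.append(value)

    def to_list(self):
        return list(self._items)

    def shares_storage_with(self, other):
        return self._items is other._items


mine = CowList([1, 2, 3])
yours = mine.copy()                     # no data copied yet
print(mine.shares_storage_with(yours))  # True

yours.append(4)                         # first write triggers the real copy
print(mine.to_list())                   # [1, 2, 3]
print(yours.to_list())                  # [1, 2, 3, 4]
print(mine.shares_storage_with(yours))  # False
```

Passing a `CowList` around costs a counter increment; the data is duplicated only when a shared handle is actually modified.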
Yeah, so this is, yes, something with reference counting, but made performant across a number of different kinds of objects?
Yeah, well, so you need a couple of things. So this concept has existed in many different worlds. And so, again, it's not novel research at all. The magic is getting the design right so that you can do this in a reasonable way.
And so there's a number of components that go into this. One is when you're passing values around. So we were talking about Python and reference counting and the expense of doing that. When you're passing values around, you don't want to do extra reference counting for no good reason.
And so you have to make sure that you're efficient, and that you transfer ownership instead of duplicating references and things like that, which is a very low-level problem. You also have to adopt this, and you have to build these data structures.
And so if you say, you know, Mojo has to be compatible with Python, so of course the default list is a reference-semantic list. It works the way you'd expect in Python. But then you have to design a value-semantic list. And so you just have to implement that, and then you implement the logic within.
And so the role of the language here is to provide all the low-level hooks that allow the author of the type to be able to get and express this behavior, without forcing it into all cases or hard-coding this into the language itself.
But there's ownership, so you're constantly transferring, tracking who owns this thing. Yes. And so there's a whole system called ownership. And so this is related to work done in the Rust community. Also the Swift community has done a bunch of work, and there's a bunch of other languages that have all kind of... C++ actually has copy constructors and destructors and things like that. And so, I mean, C++ has everything. So it has move constructors and has this whole world of things.
And so this is a body of work that's kind of been developing for many, many years now. And so Mojo takes some of the best ideas out of all these systems and remixes them in a nice way, so that you get the power of something like the Rust programming language. But you don't have to deal with it when you don't want to, which is a major thing in terms of teaching and learning and being able to use and scale these systems.
How does that play with argument conventions? What are they? Why are they important? How do the value semantics, how does the transfer of ownership, work with the arguments when they're passed into functions?
So if you go deep into systems programming land, and again, this is not something for everybody, but if you go deep into systems programming land, what you encounter is these types that get weird.
So if you're used to Python, you think about everything as, I can just copy it around. I can go change it and mutate it and do these things. And it's all cool. If you get into systems programming land, you get into these things like, I have an atomic number, or I have a mutex, or I have a uniquely owned database handle, things like this.
So these are types you can't necessarily copy. Sometimes you can't necessarily even move them to a different address.
And so what Mojo allows you to do is it allows you to express, hey, I don't want to get a copy of this thing. I want to actually just get a reference to it. And by doing that, well, you can say, OK, if I'm defining something weird, like an atomic number or something, it has to be...
So an atomic number is an area in memory that multiple threads can access at a time without locks. And so, by the definition of an atomic number, multiple different things have to be poking at it. Therefore, they have to agree on where it is. And so you can't just move it out from underneath one of them, because it breaks what it means.
And so that's an example of a type that you can't copy and you can't move. Once you create it, it has to stay where it was. Now, if you look at many other examples, like a database handle, what happens? How do you copy a database handle?
Do you copy the whole database? That's not something you necessarily want to do. There are a lot of types like that, where you want to be able to say that they are uniquely owned. So there's always one of this thing. Or if I create a thing, I don't copy it.
And so what Mojo allows you to do is it allows you to say, hey, I want to pass around a reference to this thing without copying it. And so it has borrowed conventions.
So you can say, you can use it, but you don't get to change it. Or you can pass it by mutable reference. And so if you do that, then you get a reference to it, but you can change it. And so it manages all that kind of stuff. So it's just a really nice implementation of, like, C++ has references, has pointers. References, pointers.
Yeah, it has more. Different kinds of implementations, and smart pointers, where you can explicitly define this a lot. But you're saying that's more like the weird case versus the common case? Well, it depends on where... I mean, I don't think I'm a normal person, so I'm not one to call other people weird. But if you talk to a typical Python programmer, they're typically not thinking about this.
This is a lower level of abstraction. Now, if you talk to a C++ programmer, certainly if you talk to a Rust programmer, again, they're not weird, they're delightful. These are all good people, right? Those folks will think about this all the time. Right, and so I look at this as, there's a spectrum from very deep, low-level systems, where I'm going to go poke the bits and care about how they're laid out in memory, all the way up to application and scripting and other things like this. And so it's not that anybody's right or wrong. It's about, how do we build one system that scales?
By the way, the idea of an atomic number has been something that always brought me deep happiness, because the flip side of that, the idea that threads can just modify stuff asynchronously, the whole idea of concurrent programming, is a source of infinite stress for me. Well, so this is where you jump into, again, you zoom out and get out of programming languages or compilers and you just look at what the industry has done.
My mind is constantly blown by this, right? And you look at Moore's Law: there's this idea that, for a long time, single-thread performance just got faster and faster and faster for free. But then physics and other things intervened, and power consumption and other things started to matter. And so what ended up happening is we went from single-core computers to multicore, then we went to accelerators. And this trend towards specialization of hardware is only going to continue. And so for years, us programming language nerds and compiler people have been saying, OK, well, how do we tackle multicore?
For a while, it was like, multicore is the future. We have to get on top of this thing. Then it was, multicore is the default. What are we doing with this thing? And then it's like, there are chips with hundreds of cores in them. What happened? And so I'm super inspired by the fact that, in the face of this, those machine learning people invented this idea of a tensor. And what is a tensor? A tensor is an arithmetic, an algebraic concept.
It's like an abstraction around a gigantic, parallelizable data set. And because of that, and because of things like TensorFlow and PyTorch, we're able to say, OK, we'll express the math of the system. It enables you to do automatic differentiation. It enables you to do all these cool things. And it's an abstract representation. Well, because you have that abstract representation, you can now map it onto these parallel machines without having to control, OK, put that right here, put that right there, put that right there. And this has enabled an explosion in terms of AI, compute, accelerators, like all the stuff. And so that's super, super exciting.
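A small sketch of the point being made: the caller states the math over the whole data set, and the placement of the work is decided elsewhere. This uses only the Python standard library; `scale_and_shift` and its `workers` parameter are hypothetical names, and plain lists stand in for tensors. (CPython threads will not make this arithmetic faster; the example shows the mapping, not a speedup.)

```python
from concurrent.futures import ThreadPoolExecutor


def scale_and_shift(values, scale, shift, workers=4):
    """Apply y = scale * x + shift to every element.

    The caller never says which element runs where; the function picks
    the chunking and hands the chunks to a pool of workers.
    """
    chunk = max(1, -(-len(values) // workers))  # ceiling division
    pieces = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with pieces.
        results = pool.map(
            lambda piece: [scale * x + shift for x in piece], pieces
        )
        return [y for piece in results for y in piece]


print(scale_and_shift(list(range(10)), 2, 1))
# [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
```

Changing `workers`, or swapping the thread pool for a different backend, would not change what the caller wrote, which is the property the tensor abstraction is credited with here.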
What about the deployment and the execution across multiple machines? So you write that the Modular compute platform dynamically partitions models with billions of parameters and distributes their execution across multiple machines, enabling unparalleled efficiency (nice use of "unparalleled" in that sentence). Anyway, enabling unparalleled efficiency, scale, and reliability for the largest workloads. So how do you do this abstraction of distributed deployment of large models?
Yeah, so one of the really interesting tensions... So there's a whole bunch of stuff that goes into that. I'll pick a random walk through it. If you go back and replay the history of machine learning, the brief, most recent history of machine learning, because as you know, it's very deep. I knew Lex when he had an AI podcast. Yes. Right, yep.
So if you look at just TensorFlow and PyTorch, which is pretty recent history in the big picture, TensorFlow is all about graphs. PyTorch, I think pretty unarguably, ended up winning. And why did it win? Mostly because of usability. And the usability of PyTorch is, I think, huge. And I think, again, that's a huge testament to the power of taking abstract, theoretical, technical concepts and bringing them to the masses.
Now, the challenge with the TensorFlow versus PyTorch design points was that TensorFlow is kind of difficult to use for researchers, but it was actually pretty good for deployment. PyTorch is really good for researchers. It's kind of not super great for deployment. And so I think that we as an industry have been struggling.
And if you look at what deploying a machine learning model today means, it's that you'll have researchers who are, I mean, wicked smart, of course, but they're wicked smart at model architecture and data and calculus. Like, they're wicked smart in various domains. They don't want to know anything about the hardware or deployment or C++ or things like this.
And so what's happened is you get people who train the model, they throw it over the fence, and you have people that try to deploy the model. Well, every time you have a team A that does X, and they throw it over the fence, and team B does Y, you have a problem. Because of course, it never works the first time. And so you throw it over the fence, and they figure out, OK, it's too slow, it won't fit, it doesn't use the right operator, the tool crashes, whatever the problem is.
Then they have to throw it back over the fence. And every time you throw a thing over a fence, it takes three weeks of project managers and meetings and things like this. And so what we've seen today is that getting models into production can take weeks or months. Like, it's not atypical.
I talk to lots of people, and you talk to, like, a VP of software at some internet company trying to deploy a model. And they're like, why do I need a team of 45 people? It's so easy to train a model. Why can't I deploy it? And if you dig into this, every layer is problematic.
So if you look at the language piece, I mean, this is the tip of the iceberg. It's a very exciting tip of the iceberg for folks. But you've got Python on one side and C++ on the other side. Python doesn't really deploy. I mean, it can, theoretically, technically, in some cases. But often, a lot of production teams will want to get things out of Python because they get better performance and control and whatever else.
So Mojo can help with that. If you look at serving: so you talk about gigantic models. Well, a gigantic model won't fit on one machine. And so now you have this model. It's written in Python. It has to be rewritten in C++. Now it also has to be carved up so that half of it runs on one machine and half of it runs on another machine. Or maybe it runs on 10 machines. Well, so now, suddenly, the complexity is exploding.
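As a toy illustration of what "carving up" a model can mean, the sketch below splits a sequence of layers into contiguous stages with roughly equal parameter counts, one stage per machine. This is a generic greedy heuristic written for this example, not a description of how Modular's platform partitions models; `partition_layers` and the layer sizes are hypothetical.

```python
def partition_layers(layer_params, num_machines):
    """Greedily split layers into contiguous stages of similar size.

    layer_params: parameter count of each layer, in model order.
    Returns a list of stages, each a list of layer indices. It may
    return fewer stages than machines if the layers run out first.
    """
    target = sum(layer_params) / num_machines
    stages = [[]]
    load = 0
    for index, params in enumerate(layer_params):
        # Start a new stage once the current one has reached its share,
        # as long as there is still a machine left to take it.
        if load >= target and len(stages) < num_machines:
            stages.append([])
            load = 0
        stages[-1].append(index)
        load += params
    return stages


# Six layers with made-up sizes (say, in billions of parameters).
print(partition_layers([4, 4, 4, 4, 8, 8], num_machines=3))
# [[0, 1, 2], [3, 4], [5]]
```

Even this toy version ignores activation sizes, network bandwidth between machines, and memory limits per device, which hints at why the real problem gets complicated quickly.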
And the reason for this is that if you look into TensorFlow and PyTorch, these systems, they weren't really designed for this world. They were designed for back in the day, when we were starting out and doing things, when it was a different, much simpler world. Like, you want to run ResNet-50 or some ancient model architecture like this. It was a completely different world. Training on one GPU. Exactly. AlexNet on one GPU. Yeah, AlexNet, right? It was the major breakthrough. And the world has changed.
And so now the challenge is that TensorFlow and PyTorch, these systems, they weren't actually designed for LLMs. That was not a thing. And so TensorFlow actually has amazing power in terms of scale and deployment and things like that. And I think Google is, maybe not unmatched, but they're incredible in terms of their capabilities at gigantic scale. Many researchers are using PyTorch, and PyTorch doesn't have those same capabilities.
And so Modular can help with that. Now, if you take a step back and say, what is Modular doing? So Modular has a bitter enemy that we're fighting against in the industry. And it's one of these things where everybody knows it, but nobody is usually willing to talk about it.
The bitter enemy, the bitter thing that we have to destroy, that we're all struggling with, and it's like fish can't see water, is complexity. Sure, yes, complexity. Right, that was very philosophical. As I said. And so if you look at it, yes, it is on the hardware side. Yes. All these accelerators, all these software stacks that go with the accelerators, all this massive complexity over there.
You look at what's happening on the modeling side: a massive amount of complexity. Like, things are changing all the time. People are inventing; it turns out the research is not done. And so people want to be able to move fast. Transformers are amazing, but there's a ton of diversity even within Transformers. And what's the next Transformer?
And you look into serving: also huge amounts of complexity. It turns out that all the cloud providers have all their very weird, but very cool, hardware for networking and all this kind of stuff. And it's all very complicated. People aren't using that.
You look at classical serving. There's this whole world of people who know how to write high-performance servers with zero-copy networking and all this asynchronous I/O and all these fancy things in the serving community, and very little of that has pervaded into the machine learning world. And why is that? Well, it's because, again, these systems have been built up over many years. They haven't been rethought.
There hasn't been a first-principles approach to this. And so what Modular's doing is we're saying, OK, we've built many of these things. So I've worked on TensorFlow and TPUs and things like that. Other folks on our team have worked on PyTorch core. We've worked on ONNX Runtime. We've worked on many of these other systems.
And so, the Apple accelerators and all that kind of stuff, our team is quite amazing. And so one of the things that roughly everybody at Modular is grumpy about is that when you're working on one of these projects, you have a first-order goal. Get the hardware to work. Get the system to enable one more model. Get this product out the door. Enable the specific workload, or solve this problem for this product team.
And nobody's been given a chance to actually take that step back. And so we as an industry, we didn't take two steps forward. We took 18 steps forward in terms of all this really cool technology across compilers and systems and runtimes and heterogeneous computing, all this kind of stuff.
And all this technology has been, I wouldn't say beautifully designed, but it's been proven in different quadrants. You look at Google with TPUs: massive, huge exaflops of compute strapped together into machines that researchers are programming in Python in a notebook. That's huge. That's amazing. That's incredible. It's incredible. And so you look at the technology that goes into that, and the algorithms are actually quite general.
And so lots of other hardware out there and lots of other teams out there don't have the sophistication, or maybe the years working on it, or the budget, or whatever, that Google does. And so they should be getting access to the same algorithms, but they just don't have that. And so what Modular's doing is we're saying, cool. This is not research anymore.
We've built autotuning in many systems. We've built programming languages. I've implemented C++, implemented Swift, implemented many of these things. And so it's hard, but it's not research. And you look at accelerators. Well, we know there's a bunch of different weird kinds of accelerators, but they actually cluster together.
我们在许多系统中都已经建立了自动调谐功能。我们建立了编程语言,实现了C ++,实现了Swift,实现了许多其他的东西。所以这很困难,但不是一项研究。你看看加速器。好吧,我们知道有各种奇怪的加速器,但它们实际上是聚集在一起的。
And you look at GPUs. Well, there's a couple of major vendors of GPUs, and they maybe don't always get along, but their architectures are very similar. You look at CPUs. CPUs are still super important for the deployment side of things. And you see new architectures coming out from all the cloud providers and things like this. And they're all super important to the world.
看看GPU。GPU有几个主要的供应商,它们也许不总是友好相处,但它们的架构非常相似。看看CPU。CPU对于部署方面仍然非常重要。你会看到来自各个云服务提供商的新架构被推出。它们在整个世界中都非常重要。
But they don't have the 30 years of development that the entrenched people do. And so what Modular can do is we're saying, OK, all this complexity, it's not bad complexity. It's actually innovation. And so it's innovation that's happening, and it's for good reasons. But I have sympathy for the poor software people. I mean, again, I'm generally a software person too. I love hardware.
但是,新人们没有站稳脚跟的人那样经过了30年的发展。因此,模块化的作用就在于,我们要说,这些复杂性并不是坏的复杂性,实际上是创新。这是正在发生的创新,是有好的原因的。但我同情那些可怜的软件人员。我是一个普通的软件人员,我也热爱硬件。
But software people want to build applications and products and solutions that scale over many years. They don't want to build a solution for one generation of hardware with one vendor's tools. Because of this, they need something that scales with them. They need something that works on cloud and mobile. Because their product managers said, hey, I want it to have lower latency, and it's better for personalization, or whatever they decide. Products evolve.
软件开发人员想要构建能够在多年间扩展的应用程序、产品和解决方案。他们不想使用某个供应商的工具为一代硬件构建解决方案。因此,他们需要具有可扩展性的东西。他们需要在云和移动设备上运行的东西。因为他们的产品经理说,嘿,我希望其延迟更低,对个性化效果更好,或者他们决定的其他一些事情。产品也在不断发展。
And so the challenge with the machine learning technology and the infrastructure that we have today in the industry is that it's all these point solutions. And because they're all these point solutions, it means that as your product evolves, you have to switch different technology stacks, or switch to a different vendor. And what that does is that slows down progress.
因此,我们在现今机器学习技术和基础设施领域所面临的挑战是,所有这些解决方案都是点对点的。由于这些解决方案都是点对点的,这意味着随着您的产品不断发展,您必须切换不同的技术堆栈或切换到不同的供应商。而这会减缓进展的速度。
So basically, a lot of the things we've developed in those little silos for machine learning tasks, you want to make that the first class citizen of a general purpose programming language that can then be compiled across all these kinds of hardware. Well, so it's not really about a programming language.
基本上,我们在机器学习任务的小笼子里开发的很多东西,你希望将它变成通用编程语言的一等公民,然后可以在所有这些种类的硬件上编译。那么,这实际上并不是关于编程语言的问题。
I mean, the programming language is a component of the mission. And the mission is, not our literal mission, but our joking mission, to save the world from terrible AI software. Excellent. I love this. So if you look at this mission, you need a syntax. So yeah, you need a programming language. And we wouldn't have to build the programming language if one existed. So if Python was already good enough, then cool, we would have just used it. We're not just doing very large scale expensive engineering projects for the sake of it.
我的意思是,编程语言是这项任务的一部分。而这项任务的意义不是文字本身,而是我们戏谑般的任务,即拯救世界免受可怕的AI软件的威胁。非常好。我喜欢这个。因此,如果你看看这项任务,你需要一个语法。因此,我们需要一种编程语言。如果已经有一种好的语言,比如Python,那么就用它就好了。我们不是毫无必要地进行大规模昂贵的工程项目。
It's to solve a problem. It's also about accelerators. It's also about exotic numerics and bfloat16 and matrix multiplications and convolutions, and this kind of stuff. Within the stack, there are things like kernel fusion. That's an esoteric but really important thing that leads to much better performance and much more general research hackability together. And that's enabled by the ASICs. That's enabled by certain hardware. So it's like, where's the dance between several questions here?
这是解决问题的事情。它也涉及到加速器、异型数字、BFLT16、矩阵乘法和卷积等等。在这个堆栈中,有像内核融合这样的玄学但确实非常重要的事情,它可以带来更好的性能和更广泛的研究可塑性。这是由ASICs实现的,是由特定硬件实现的。因此,这就像在几个问题之间跳舞一样。
Like, how do you add a piece of hardware to the stack? If I have this genius invention of a specialized accelerator, how do I add that to the Modular framework? And also, how does Modular as a standard start to define the kind of hardware that should be developed? Yeah.
就比如说,如果我想把硬件设备加入到整个设备框架中,该怎么做呢?如果我有一个独特的加速卡这种创新性的发明,又该如何把它添加到模块框架中呢?另外,模块化是作为一种标准如何开始定义应该开发的硬件类型的呢?嗯。
So let me take a step back and talk about the status quo. And so if you go back to TensorFlow 1, PyTorch 1, this kind of time frame, and these have all evolved and gotten way more complicated. So let's go back to the glorious simple days. These things basically were CPUs and CUDA. And so what you do is you say, go do a dense layer, and a dense layer has a matrix multiplication. And so when you say that, you say, go do this big operation, a matrix multiplication. And if it's on a GPU, kick off a CUDA kernel. If it's on a CPU, go do an Intel algorithm or something like that with the Intel MKL.
让我倒退一步,谈谈现状。如果回到TensorFlow 1和PyTorch 1这样的时间框架,它们都已经演变并变得更加复杂了。那么让我们回到光辉的简单时代。这些东西基本上都是CPU和CUDA。所以你说,做一个密集层,一个密集层有一个矩阵相乘。当你说出这句话时,你说,去执行这个大操作或者矩阵相乘。如果在GPU上,启动CUDA内核。如果在CPU上,使用Intel MKO之类的算法。
Now, that's really cool if you're either NVIDIA or Intel. But then more hardware comes in. And on one axis, you have more hardware coming in. On the other hand, you have an explosion of innovation in AI. And so what happened with both TensorFlow and PyTorch is that the explosion of innovation in AI has meant it's not just about matrix multiplication and convolution. These things now have like 2,000 different operators. And on the other hand, you have, I don't know how many pieces of hardware there are.
现在,如果你是从事视频或英特尔领域,这是非常酷的。但是,会出现更多的硬件。一方面,硬件会越来越多。另一方面,AI领域的创新会爆炸式增长。因此,TensorFlow和PyTorch都经历了AI创新的爆炸发展,不仅仅局限于矩阵乘法和卷积了,现在有大约2000个不同的运算符。另一方面,硬件数量有多少,我不知道。
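The status quo described above, one hand-written kernel per operator per device, can be sketched as a lookup table. This is only an illustration: the registry, the device names, and `matmul_reference` are made up for the example and do not call CUDA or Intel MKL.

```python
def matmul_reference(a, b):
    """Plain triple-loop matrix multiply on lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

# Hypothetical registry: (operator, device) -> kernel.
# The same Python function stands in for both vendor kernels here.
KERNELS = {
    ("matmul", "cpu"): matmul_reference,  # stand-in for a vendor CPU library call
    ("matmul", "gpu"): matmul_reference,  # stand-in for launching a GPU kernel
}

def dispatch(op, device, *args):
    """Run the hand-written kernel registered for this operator on this device."""
    try:
        kernel = KERNELS[(op, device)]
    except KeyError:
        raise NotImplementedError(f"no kernel for {op!r} on {device!r}") from None
    return kernel(*args)

def missing_kernels(ops, devices):
    """List every (operator, device) pair that nobody has written yet."""
    return [(op, dev) for op in ops for dev in devices if (op, dev) not in KERNELS]
```

With two operators and three devices, `missing_kernels(["matmul", "conv"], ["cpu", "gpu", "npu"])` already reports four gaps. The table grows as operators times devices, which is the scaling problem described here.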
It's a lot. It's not even hundreds. It's probably thousands. And across all of Edge and across all the different things. That are used at scale. Yeah, exactly. I mean, AI is not just like everywhere. Yeah, it's not a handful of TPU alternatives. Correct. It's every phone, often with many different chips inside of it from different vendors. Like, AI is everywhere. It's a thing, right? Why are they all making their own chips? Like, why is everybody making their own thing? Well, so because.
这是一个相当大的数字,甚至不止数百,可能达到千位数,而且覆盖了所有的边缘计算,并涉及所有不同的使用规模。是的,确实如此。我是说,人工智能不仅仅是无处不在。是的,这不仅仅是一些TPU的替代品。每个手机都有,往往内部还有来自不同供应商的许多不同芯片。就像人工智能无处不在一样。这是一个事实,为什么他们都要造自己的芯片呢?为什么每个人都在造自己的东西呢?那么,因为……
Was that a good thing, for sure? So, Chris's philosophy on hardware. Right. So my philosophy is that there isn't one right solution. And so I think that, again, we're at the end of Moore's Law. Specialization happens. If you're building, if you're training GPT-5, you want some crazy supercomputer data center thingy. If you're making a smart camera that runs on batteries, you want something that looks very different. If you're building a phone, you want something that looks very different. If you have something like a laptop, you want something that looks maybe similar, but a different scale.
那是确实是个好事吗?那就是克里斯对硬件的哲学。没错。我的哲学是没有一种绝对正确的解决方案。因此,我认为我们又到了摩尔定律的终点。专业化越来越普遍。如果你在制作GPT-5,你需要一台疯狂的超级计算机数据中心设备。如果你在制作运行在电池上的智能摄像机,你需要的是一个外形截然不同的设备。如果你在制造手机,你需要的是一个外观截然不同的设备。如果你拥有像笔记本电脑这样的设备,你可能需要的是一个相似但不同规模的设备。
And so AI ends up touching all of our lives, robotics, right? And like lots of different things. And so as you look into this, these have different power envelopes. There's different trade-offs in terms of the algorithms. There's new innovations in sparsity and other data formats and things like that. And so hardware innovation, I think, is a really good thing. And what I'm interested in is unlocking that innovation. There's also like analog and quantum, and like all the really weird stuff, right?
因此,人工智能最终影响到我们生活中的所有方面,包括机器人等许多不同事物。因此,当您深入研究时,这些东西具有不同的能量包络。这些算法存在不同的权衡取舍。还有新的创新和稀疏性等数据格式等等。因此,硬件创新是一件非常好的事情。我感兴趣的是解锁这种创新。另外,还有模拟和量子等技术,这些都是一些非常奇怪的东西。
And so if somebody can come up with a chip that uses analog computing and it's 100x more power efficient, think what that would mean in terms of the daily impact on the products we use. That'd be huge. Now, if you're building an analog computer, you may not be a compiler specialist. These are different skill sets, right?
如果有人能够开发出一种使用模拟计算技术的芯片,并且它比现有的芯片节省能源的比例高100倍,那么在我们日常使用的产品上它产生的影响会是非常大的。这将会是一件重大的事情。但是,如果你正在构建一个模拟计算机,你可能不是一个编译器专家。因为这需要不同的技能。
And so you can hire some compiler people if you're running a big company maybe. But it turns out these are a really exotic new generation of compilers. Like this is a different thing, right? So if you take a step back out and come back to what is the status quo? The status quo is that if you're Intel or you're NVIDIA, you keep up with the industry and you chase. And OK, there's 1,900 now. There's 2,000 now. There's 2,100. And you have a huge team of people that are trying to keep up and tune and optimize.
因此,如果您经营着一家大公司,您可能会雇用一些编译器专业人员。但事实证明,这些人是真正奇特的新一代编译器专家。这是不同的东西,对吧?所以,如果你退一步出来,回到现状,现状是什么呢?现状是,如果你是Intelli或者视频产业,你要跟上产业发展并追赶它。现在,可能是1,900,现在又是2,000,再到2,100。您需要一个庞大的团队来努力跟进、调优和优化。
And even when one of the big guys comes out with a new generation of their chip, they have to go back and rewrite all these things, right? So really, it's only powered by having hundreds of people that are all frantically trying to keep up. And what that does is that keeps out the little guys. And sometimes they're not so little guys. The big guys that are also just not in those dominant positions.
即使是大公司发布了新一代的芯片,他们还必须回头改写所有这些东西,对吧?所以,真正推动技术进步的是有成百上千人在拼命跟进。这就排挤了小公司,有时候这些小公司不是那么小,而只是没有处于主导地位的大公司。
And so what has been happening, and so a lot of the talk about the rise of new exotic crazy accelerators, is people have been trying to turn this from a let's go write lots of special kernels problem into a compiler problem. And so we, and I contributed to this as well, we as an industry went into a let's go make this a compiler problem phase, let's call it. Much of the industry is still in this phase, by the way. So I wouldn't say this phase is over.
所以最近正在发生的事情,很多人谈论的新型奇异加速器的兴起,人们正试图将这个问题从一个许多特殊内核的问题变成一个编译器的问题。我们也为此做出了贡献。我们作为一个行业参与其中,让我们把这个问题变成编译器问题。许多行业仍然处于这个阶段,所以我不会说这个阶段结束了。
And so the idea is to say, look, OK, what a compiler does is it provides a much more general, extensible, hackable interface for dealing with the general case. And so within machine learning algorithms, for example, people figured out that, hey, if I do a matrix multiplication and then I do a ReLU, the classic activation function, it is way faster to do one pass over the data and then do the ReLU on the output where I'm writing out the data, because ReLU is just a maximum operation, max with zero. And so it's an amazing optimization. Take matmul and ReLU, squish them together in one operation. Now we have matmul-ReLU.
所以,这个想法是要说,编译器所做的是提供一个更通用、可扩展、可操作的接口来处理一般情况。因此,在机器学习算法中,人们发现,如果我做一个矩阵乘法,我做一个参数,传统的激励函数。在输出数据时进行一次数据遍历,然后在输出数据时进行参数操作,因为参数只是一个最大值操作,最大值是0。这是一个惊人的优化。将Matmoor、Relyu合并在一个操作中。现在我们有了Matmoor Relyu。
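A minimal sketch of the fusion described above, in plain Python rather than anything a real compiler would emit. The unfused path writes the full matrix product and then makes a second pass to apply ReLU. The fused path applies the max with zero at the moment each output element is written. Function names are chosen for this example.

```python
def matmul(a, b):
    """Unfused step 1: write out the full product."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def relu(m):
    """Unfused step 2: a second pass over the whole output."""
    return [[max(x, 0.0) for x in row] for row in m]

def matmul_relu_fused(a, b):
    """Fused: apply max(acc, 0) while writing each output element."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc += a[i][k] * b[k][j]
            out[i][j] = max(acc, 0.0)  # ReLU on the way out, no second pass
    return out
```

Both paths give the same numbers. The fused one never stores the intermediate product and never reads it back.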
Well, wait a second. If I do that now, I just went from having two operators to three. But now I figure out, OK, well, there's a lot of activation functions. What about leaky ReLU? What about like a million things that are out there, right? And so as I start fusing these in, now I get permutations of all these algorithms. And so what the compiler people said is they said, hey, cool. Well, I will go enumerate all the algorithms, and I will enumerate all the pairs, and I will actually generate a kernel for you.
等一下,如果我现在这样做,我的运算符就从两个变成了三个。但现在我意识到,有很多激活函数,比如渗透价值等等,还有许多其他的东西,对吧?所以当我开始把它们融合在一起时,现在我得到了所有这些算法的排列组合。所以编译器的人说,好的,我将罗列所有的算法和所有的配对,然后为你生成一个内核。
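The "enumerate the pairs and generate a kernel" idea can be imitated with a small factory: instead of hand-writing matmul-ReLU, matmul-leaky-ReLU and so on, one function builds the fused kernel for any elementwise activation. This is a toy stand-in for what a compiler does, and the 0.01 leaky slope is an arbitrary choice for the example.

```python
def make_fused_matmul(activation):
    """Build a matmul kernel that applies `activation` as each output is written."""
    def fused(a, b):
        rows, inner, cols = len(a), len(b), len(b[0])
        out = [[0.0] * cols for _ in range(rows)]
        for i in range(rows):
            for j in range(cols):
                acc = 0.0
                for k in range(inner):
                    acc += a[i][k] * b[k][j]
                out[i][j] = activation(acc)
        return out
    return fused

# Illustrative activation set; real frameworks have far more.
ACTIVATIONS = {
    "identity": lambda x: x,
    "relu": lambda x: max(x, 0.0),
    "leaky_relu": lambda x: x if x > 0 else 0.01 * x,
}

# One generated kernel per activation instead of one hand-written kernel each.
FUSED = {f"matmul_{name}": make_fused_matmul(fn) for name, fn in ACTIVATIONS.items()}
```

Adding an activation is then one dictionary entry, not a new kernel per device.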
And I think that this has been very useful for the industry. This is one of the things that powers Google TPUs. PyTorch 2 is rolling out really cool compiler stuff with Triton, this other technology, and things like this. And so the compiler people are coming to the fore and saying, awesome, this is a compiler problem. We'll compiler it. Here's the problem. Not everybody's a compiler person.
我认为这对于产业非常有益。这是Google TPUs、PyTorch 2s等技术的动力之一,也推出了非常酷的Trition编译器等技术。所以编译器的人们现在开始起到了非常大的作用,他们说:“很棒,这是一个编译问题,我们来解决它。问题在这里。”但并不是所有人都是编译器的专家。
I love compiler people. Trust me, right? But not everybody can or should be a compiler person. It turns out that there are people that know analog computers really well, or they know some GPU internal architecture thing really well, or they know some crazy sparse numeric, interesting algorithm that is on the cusp of research, but they're not compiler people.
我爱编译器领域的人。相信我,是吗?但是,并不是每个人都可以或者应该成为编译器领域的人。事实证明,他们是一些对模拟计算机非常熟悉的人,或者他们非常了解某些GPU内部架构,或者他们了解一些疯狂的稀疏数值算法,这是研究的焦点,但他们不是编译器领域的人。
And so one of the challenges with this new wave of technology trying to turn everything into a compiler is, again, it has excluded a ton of people. And so you look at what does Mojo do? What does the Modular stack do? It brings programmability back into this world. Like it enables, I wouldn't say normal people, but a different kind of delightful nerd that cares about numerics or cares about hardware, cares about things like this, to be able to express that in the stack and extend the stack without having to actually go hack the compiler itself.
因此,将一切都变成编译器的新技术浪潮所面临的挑战之一是,它会排除大量人群。那么Mojo是做什么的?Modular Stack又是做什么的?它们使可编程性重新回到了这个世界上。就像它可以让一个不算普通的人,但是一个对数字、硬件等关心的令人愉快的NERD,能够在叠加层中表达并扩展这些东西,而无需直接去更改编译器本身。
Should extend the stack on the algorithm side and then on the hardware side. Yeah, so again, go back to the simplest example of Int. And so what both Swift and Mojo and other things like this did is we said, OK, pull magic out of the compiler and put it in the standard library. And so what Modular is doing with the engine that we're providing and this very deep technology stack, which goes into heterogeneous runtimes and a whole bunch of really cool things, is this whole stack allows that stack to be extended and hacked and changed by researchers and by hardware innovators and by people who know things that we don't know.
应该在算法和硬件两方面扩展堆栈。回到Int的最简单的例子。Swift、Mojo和其他类似的语言都将魔法从编译器中提取出来放在标准库中。Modular所做的是与我们提供的引擎一起,构建一个非常深入的技术堆栈,用于异构运行时和一系列非常酷的事情。整个堆栈允许研究人员、硬件创新者和那些我们不知道的人扩展、破解和更改它。
Because Modular has some smart people, but we don't have all the smart people, it turns out. What are heterogeneous runtimes? Yeah, so what is heterogeneous? So heterogeneous just means many different kinds of things together. And so the simplest example you might come up with is a CPU and a GPU. And so it's a simple heterogeneous computer to say, I will run my data loading and pre-processing and other algorithms on the CPU.
因为模块化拥有一些聪明的人,但并不是所有聪明的人都参与其中。什么是异构运行时?是的,什么是异构?异构只是指许多不同种类的事物混合在一起。所以你可能会想到最简单的例子是CPU和GPU。因此,一个简单的异构计算机可以是这样说的:我将在CPU上运行我的数据加载、预处理和其他算法。
And then once I get it into the right shape, I shove it into the GPU. I do a lot of matrix multiplications and convolutions and things like this. And then I get it back out and I do some reductions and summaries and then shove it across the network to another machine. And so you've got now what are effectively two computers, a CPU and a GPU talking to each other, working together in a heterogeneous system. But that was 10 years ago.
然后一旦我把数据整理好,我会将它发送到 GPU 中。我会进行大量的矩阵乘法和卷积等运算。然后我将数据提取出来,进行一些汇总和简化操作,并将它传输到另一台计算机上。现在你拥有了两台计算机,一台是 CPU,一台是 GPU,它们在异构系统中互相通信,合作工作。但那是10年前的事情了。
You look at a modern cell phone. Modern cell phone, you've got CPUs. And they're not just CPUs, there's big.LITTLE CPUs. And so there's multiple different kinds of CPUs that are working together. They're multi-core. You've got GPUs. You've got neural network accelerators. You've got dedicated hardware blocks for media, so for video decode and JPEG decode and things like this. And so you've got this massively complicated system.
你看看现代的手机。现代手机有中央处理器 (CPU),而且不止是一种,还有大、小两种 CPU,它们可以协同工作。你还有图形处理器 (GPU),神经网络加速器,以及专用硬件块用于媒体解码,例如视频解码和JPEG解码等等。所以你拥有了一个极其复杂的系统。
And this isn't just cell phones. Every laptop these days is doing the same thing. And all these blocks can run at the same time and need to be choreographed. And so again, one of the cool things about machine learning is it's moving things to data flow graphs and higher levels of abstraction and tensors and these things where it doesn't specify, here's how to do the algorithm. It gives the system a lot more flexibility in terms of how to translate or map or compile it onto the system that you have.
不仅是手机,现如今许多笔记本电脑也在做同样的事情。这些程序块需要同时运行且需要协调。机器学习的一个独特之处是,它将事情移动到数据流图和更高抽象层次的张量等等,在这些抽象层次上,它不指定算法应如何完成,在如何将算法映射、编译到你的系统上方面给了系统更多的灵活性。
And so what you need, the bottomest part of the layer there is a way for all these devices to talk to each other. And so this is one thing that I'm very passionate about. I mean, I'm a nerd. But all these machines and all these systems are effectively parallel computers running at the same time, sending messages to each other.
所以,你需要知道的是,在所有这些设备的最底层,它们之间有一种交流的方式。而这正是我非常热衷的一点。我是个书呆子,但所有的这些机器和系统实际上都是并行计算机,同时运行并互相发送信息。
And so they're all fully asynchronous. Well, this is actually a small version of the same problem you have in a data center. In a data center, you now have multiple different machines, sometimes very specialized, sometimes with GPUs or TPUs in one node and sometimes with disks in another node. And so you get a much larger scale heterogeneous computer. And so what ends up happening is you have this like multi-layer abstraction of hierarchical parallelism and hierarchical asynchronous communication.
因此,它们都是完全异步的。这实际上是数据中心中同样问题的一个小版本。在数据中心中,你现在有多个不同的机器,有时非常专业化,有时一个节点上有GPU或TPU,另一个节点上有磁盘。因此,你得到了一个规模更大、异构性更强的计算机。所以最终会出现一种多层抽象的分层并行和分层异步通信。
Again, my enemy is complexity. By getting away from having different specialized systems at every different part of the stack, and having more consistency and uniformity, I think we can help lift the world and make it much simpler and actually get used. But how do you leverage the strengths of the different specialized systems?
让我再说一遍,我的敌人是复杂性,因为在每个不同的堆栈部分使用不同的专门系统会让它变得更加复杂和不一致。我们可以通过更加一致和统一来帮助简化世界,使之更易于使用。但是,如何利用不同专门系统的优势呢?
So looking inside the smartphone, like there's a, what, I don't know, five, six computers essentially inside the smartphone. How do you, without trying to minimize the explicit, making it explicit, which computer is supposed to be for which operation? Yeah, so there's a pretty well-known algorithm and what you're doing is you're looking at two factors. You're looking at the factor of sending data from one thing to another.
在智能手机内部,有五到六个基本上的计算机,如何在不将其明确化的情况下,确定哪个计算机应该用于哪个操作?是的,有一个相当著名的算法。你要做的是看两个因素。你要考虑从一个地方发送数据到另一个地方的因素。
So it takes time to get it from that side of the chip to that side of the chip and things like this. And then you're looking at what is the time it takes to do an operation on a particular block? So take CPUs. CPUs are fully general. They can do anything, right? But then you have a neural net accelerator that's really good at matrix multiplications.
因此,从芯片的一侧到另一侧传输需要时间,等等这些事情。然后,你需要考虑在特定块上执行操作所需要的时间。例如,以CPU为例,CPU是完全通用的,可以执行任何操作,对吗?但是,你也可以使用神经网络加速器来进行矩阵乘法,这将变得更加高效。
Okay, and so you say, okay, well, if my workload is all matrix multiplications, I start up, I send the data over the neural net thing. It goes and does matrix multiplications. When it's done, it sends me back the result. All is good, right? And so the simplest thing is just saying, do matrix operations over there, right?
好的,你的意思是,如果我的工作负载全部是矩阵乘法,我开启它,将数据发送到神经网络设备中进行矩阵乘法运算。完成后,它将结果发送回给我。一切都很好,对吧?因此,最简单的方法就是说,在那里进行矩阵运算,对吧?
But then you realize you get a little bit more complicated because you can do matrix multiplications on a GPU. You can do it on a neural net accelerator. You can do it on a CPU. And they'll have different trade-offs and costs. And it's not just matrix multiplication. And so what you actually look at is you look at, I have generally a graph of compute. I wanna do a partitioning. I wanna look at the communication, the bisection bandwidth and overhead and the sending of all these different things and build a model for this and then decide, okay, it's an optimization problem. Where do I wanna place this compute?
然而,你会发现事情变得有点复杂,因为你可以在 GPU 上进行矩阵乘法,也可以在神经网络加速器上进行,还可以在 CPU 上进行,它们都有不同的权衡和成本。而且,这不仅仅是矩阵乘法。因此,实际上你要看的是,我通常有一个计算图,我要进行分区,要考虑通信、双向带宽和开销以及所有这些不同的元素的发送,并为此建立一个模型,然后决定,好的,这是一个优化问题,我要在哪里放置这些计算。
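The placement question above, weighing the cost of moving data against the cost of computing on each block, can be written as a tiny cost model. This sketch handles only a linear chain of operations and searches every assignment by brute force. The operations, devices, and cost numbers in the usage note are invented for illustration.

```python
from itertools import product

def plan_cost(plan, compute_cost, transfer_cost, start="cpu"):
    """Total cost of a chain: per-op compute plus a fee for each device switch."""
    total, prev = 0.0, start
    for op, device in plan:
        if device != prev:
            total += transfer_cost  # data has to cross to the other block
        total += compute_cost[op][device]
        prev = device
    return total

def best_placement(ops, devices, compute_cost, transfer_cost):
    """Try every assignment of ops to devices and keep the cheapest one."""
    best = None
    for choice in product(devices, repeat=len(ops)):
        plan = list(zip(ops, choice))
        cost = plan_cost(plan, compute_cost, transfer_cost)
        if best is None or cost < best[0]:
            best = (cost, plan)
    return best
```

With made-up costs `{"load": {"cpu": 1, "npu": 50}, "matmul": {"cpu": 100, "npu": 5}, "reduce": {"cpu": 2, "npu": 4}}` and a transfer cost of 10, the cheapest plan moves to the accelerator for the matmul and stays there. With a transfer cost of 1000, everything stays on the CPU. Brute force grows exponentially with the number of operations, which is why this becomes a search problem at realistic sizes.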
So it's the old school theoretical computer science problem of scheduling. And then presumably it's possible to somehow magically include autotune into this? Absolutely. So I mean, in my opinion, and this is an opinion, not everybody would agree with this, but in my opinion, the world benefits from simple and predictable systems at the bottom that you can control.
这是一个关于调度的老派理论计算机科学问题。然后,预计如何将自动生成魔法化地纳入其中呢?绝对可以。我的意见是这样的:在底层使用简单可预测的系统,可以让你掌控,这对世界有益。这是我的观点,不代表赞同。
But then once you have a predictable execution layer, you can build lots of different policies on top of it, and so one policy can be that the human programmer says, do that here, do that here, do that here, do that here and fully manually controls everything and the systems should just do it. Then you quickly get into the mode of like, I don't wanna have to tell it to do it. And so the next logical step that people typically take is they write some terrible heuristic. If it's a matrix multiplication, do it over there. Or if it's floating point, do it on the GPU, if it's an integer, do it on the CPU, like something like that.
一旦你建立了一个可预测的执行层,你可以在其上构建许多不同的策略,其中一个策略可能是人类程序员告诉系统在这里做这件事,在这里做那件事,完全手动地控制一切。然后你会很快进入这样的模式:我不想告诉它要做什么。所以人们通常要采取的下一个逻辑步骤就是编写一些可怕的启发式算法。比如如果是矩阵乘法,就在那里做;如果是浮点数,就在GPU上做;如果是整数,就在CPU上做。
And then you get into this mode of like people care more and more and more, and you say, okay, well, let's actually make the heuristic better, let's get into autotune. Let's actually do a search of the space to decide, well, what is actually better? Well, then you get into this problem where you realize this is not a small space. This is a many dimensional, hyper dimensional space that you cannot exhaustively search. So, do you know of any algorithms that are good at searching very complicated spaces?
接下来你会开始考虑,人们越来越在意,你会说,好吧,我们需要让算法更好,使用自动配备。我们需要搜索这个问题空间,决定哪个更好。但接着你会发现,这个空间非常庞大,而且是一个超大维空间,你无法穷尽其所有可能的情况。所以,请问你知道有哪些算法是擅长搜索非常复杂空间的吗?
Don't tell me you're gonna turn this into a machine learning problem. So then you turn it into a machine learning problem, and then you have a space of genetic algorithms and reinforcement learning and like all these cool things. Can you include that into the stack, into the Modular stack? Yeah, yeah. Where does it sit? Where does it live? Is it a separate thing or is it part of the compilation?
不要告诉我你将把这个问题转变成机器学习问题。那么,你就要将它变成机器学习问题,然后你会有一堆通用算法、强化学习和其他很酷的东西。你能把它们包含在堆栈中吗?是的,可以。它会被放在哪里?它属于一个单独的东西还是属于编译的一部分?
So you start from simple and predictable models, and so you can have full control, and you can have coarse grain knobs, like, no systems, so you don't have to do this. But if you really care about getting the best, you know, the last ounce out of a problem, then you can use additional tools.
所以你从简单和可预测的模型开始,这样你就可以拥有完全的控制,你可以有粗大的旋钮,比如没有系统,这样你就不必这么做。但是,如果你真的关心如何从问题中获得最好的效果,你可以使用额外的工具。
And there the cool thing is you don't wanna do this every time you run a model, you wanna figure out the right answer and then cache it. And once you do that, you can say, okay, cool, I can get up and running very quickly, I can get good execution out of my system, I can decide if something's important, I can throw a bunch of machines at it and do a big expensive search over the space using whatever technique I feel like, it's pretty much up to the problem. And then when I get the right answer, cool, I can just start using it. And so you can get out of this trade-off between, okay, am I gonna spend forever doing a thing, or do I get up and running quickly and get a quality result. These are actually not in contention with each other if the system's designed to scale.
这里的酷炫之处在于,你不需要每次运行模型时都这样做,而是需要找到正确的答案并进行缓存。一旦你这样做了,你就可以快速上手并获得良好的执行效果,你可以确定某些东西是否重要,可以在一堆机器上搜索空间,使用任何你想用的技术进行大规模的昂贵搜索,这完全取决于问题的本质。当你得到正确的答案时,你就可以开始使用它。因此,如果系统被设计为可扩展的话,你就可以摆脱这种平衡,即要么花费很长时间做一件事情,要么快速启动并获得高质量的结果,这两者实际上并不相互矛盾。
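The "search once, then cache the answer" pattern can be sketched in a few lines. This is a generic illustration, not Mojo's autotuning API: `TuningCache`, `wall_clock`, and the cache key layout are assumptions made for the example, and the search here is plain exhaustive minimisation over a short candidate list.

```python
import time

def wall_clock(run, candidate, repeats=3):
    """Best-of-N wall-clock time for run(candidate); one possible cost function."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run(candidate)
        best = min(best, time.perf_counter() - start)
    return best

class TuningCache:
    """Remember the winning configuration per problem so the search runs once."""

    def __init__(self):
        self._best = {}
        self.searches = 0  # how many expensive searches were actually performed

    def lookup(self, key, candidates, measure):
        if key not in self._best:
            self.searches += 1
            # Exhaustive search; any smarter search strategy could be swapped in here.
            self._best[key] = min(candidates, key=measure)
        return self._best[key]
```

A caller might key the cache on something like `("matmul", 512, "cpu")` and pass `lambda tile: wall_clock(run_kernel, tile)` as `measure`, where `run_kernel` is whatever is being tuned. The first lookup pays for the search. Every later lookup with the same key returns the stored answer immediately.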
You started in a little bit of a whirlwind overview of how you get the 35,000X speed up, or more, over Python. Jeremy Howard did a really great presentation about sort of the basic, like, we'll get the code, here's how you get the speed up. Like you said, that's something we could, probably developers can do for their own code to see how you can get these gigantic speed ups.
你开始简要概述了如何让Python获得35,000倍以上的加速。Jeremy Howard做了一个非常棒的演示,介绍了基本的方法,如何获取代码和如何获得加速。正如你所说,这是开发人员可以为自己的代码做的事情,以了解如何获得巨大的加速。
But can you maybe speak to the machine learning task in general, how do you make some of this code fast, and specifically, like, what would you say is the main bottleneck for machine learning tasks? So are we talking about matmul, matrix multiplication, how do you make that fast?
能否总体谈谈机器学习任务,您是如何让其中某些代码快速而具体化的?主要瓶颈在哪里?例如Metmull矩阵乘法,如何使其更快?
So, I mean, if you just look at the Python problem, right, you can say, how do I make Python faster? There've been a lot of people that have been working on the, okay, how do I make Python 2x faster, 10x faster, or something like that, right? And there've been a ton of projects in that vein, right?
所以,我的意思是,如果你只看Python的问题,你可以说,如何让Python运行更快?已经有很多人在研究,怎样让Python速度快两倍、十倍的项目,也有很多类似的项目在进行。
Mojo started from the, what can the hardware do? Like, what is the limit of physics? Yeah. What is the speed of light? What is it? Like how fast can this thing go? And then how do I express that? Yeah. Right. And so it wasn't anchored relatively on make Python a little bit faster, it's saying, cool, I know what the hardware can do, let's unlock that, right?
Mojo从硬件的角度出发,探究硬件能做什么?就像,物理的极限是什么?对,光速是多少?它是什么?就像,三次元能跑多快?那该怎样表达呢?是的。所以,它不是相对于让Python稍微快一些,而是说,“酷啊,我知道硬件能做什么了,让我们解锁它的潜力,对吧?”。
Now, can I just say how gutsy that is, to be in the meeting and, as opposed to trying to see how do we get the improvement, it's like, what can the physics do? I mean, maybe I'm a special kind of nerd, but you look at that, what is the limit of physics? How fast can these things go, right?
现在,当你听到这句话时,它只是在说在会议中表现得多么勇敢,而不是试图看看我们如何改进,就像:物理学能做什么吗?我的意思是,也许我是一种特殊的书呆子,但你看看,物理学的极限是什么?这些东西能有多快?
When you start looking at that, typically it ends up being a memory problem, right? And so today, particularly with these specialized accelerators, the problem is that you can do a lot of math within them, but you get bottlenecked sending data back and forth to memory, whether it be local memory or distant memory or disk or whatever it is. And that bottleneck, particularly as the training sizes get large, as you start doing tons of inferences all over the place, that becomes a huge bottleneck for people, right?
当你开始看那个问题的时候,通常会发现它是一个内存问题,对吧?因此,现在特别是有了这些专业加速器,问题在于你可以在它们内部进行大量的计算,但是你将数据发送到本地内存、远程内存或磁盘等地方,来回传输的瓶颈问题便会出现。特别是在训练数据集变大之后,或者在各个地方进行大量的推理任务时,这个瓶颈问题对人们来说是一个巨大的难题。
So again, what happened is we went through a phase of many years where people took the special case and hand tuned it and tweaked it and tricked it out, and they knew exactly how the hardware worked, and they knew the model, and they made it fast. It didn't generalize. And so you can make, you know, ResNet-50 or AlexNet or something, Inception v1, like you can do that, right, because the models are small, they fit in your head, right? But as the models get bigger, more complicated, as the machines get more complicated, it stops working, right?
所以,我们经历了许多年的阶段,人们处理了特殊情况,手工调整了它,并对其进行了检查,他们完全了解硬件的工作原理,他们知道模型,并使其变快。没有概括性。因此,你可以制作ResNet 50或AlexNet或某些inception V1之类的模型,因为模型很小,它们适合在脑海中,对吧?但是,随着模型变得越来越大、越来越复杂,机器变得越来越复杂,这种方法就行不通了,对吧?
And so this is where things like kernel fusion come in. So what is kernel fusion? This is this idea of saying, let's avoid going to memory, and let's do that by building a new hybrid kernel, a numerical algorithm, that actually keeps things in the accelerator instead of having to write it all the way out to memory.
因此,这就是像内核融合这样的东西的作用。那么什么是内核融合?这个想法是说,让我们避免到内存中读取数据,通过构建一个新的混合内核,一个数值算法,实际上将要处理的数据保留在加速器中,而不是必须全部写入到内存中。
What's happened with these accelerators now is you get multiple levels of memory, like in a GPU, for example, you'll have global memory and local memory and all these things. If you zoom way into how hardware works, the register file is actually a memory. So the registers are like an L0 cache.
现在加速器发生了什么事情,你会得到多个层次的存储器,例如在一个GPU中,你会有全局存储器和本地存储器等等。如果你深入研究硬件是如何工作的,寄存器文件实际上是一种存储器。因此,寄存器就像是L0缓存。
And so a lot of taking advantage of the hardware ends up being about fully utilizing all of its capability. And this has a number of problems, right? One of which is, again, the complexity disaster, right? There's too much hardware.
因此,大量利用硬件的工作最终会完全利用其全部能力。这带来了许多问题,对吧?其中一个问题是灾难的复杂性,对吧?硬件太过复杂。
Even if you just say, let's look at the chips from one line of vendor, like Apple or Intel or whatever it is, each version of the chip comes out with new features. And they change things so that it takes more time or less time to do different things. And you can't rewrite all the software whenever new chip comes out, right?
即使你只看一家供应商(比如苹果或英特尔),每一款芯片的版本都会推出新功能。他们会更改一些东西,以使完成不同任务需要的时间更长或更短。但当新款芯片推出时,你不可能重新编写所有的软件,对吧?
And so this is where you need a much more scalable approach. And this is what Mojo and the Modular stack provide: this infrastructure and system for factoring all this complexity and then allowing people to express algorithms.
因此,您需要一种更具可扩展性的方法。 Mojo和模块化堆栈提供的就是这种基础设施和系统,用于分解所有这些复杂性,并允许人们表达算法。
You talk about auto tuning, for example, express algorithms in a more portable way so that when a new chip comes out, you don't have to rewrite it all. So to me, I kind of joke, what is a compiler?
你谈到了自动调整,比如以更可移植的方式表达算法,这样当一款新芯片出现时,你不必重写所有东西。所以对我来说,我有点开玩笑,编译器是什么?
Well, there's many ways to explain that. You convert thing A into thing B, and you convert source code to machine code. You can talk about many things that compilers do. But to me, it's about a bag of tricks. It's about a system and a framework that you can hang complexity on.
嗯,解释这个问题有很多方法。你将物品A转化为物品B,将源代码转化为机器代码。你可以谈论编译器所做的各种事情。但对我来说,这是一种诀窍的技巧。这是一个系统和一个框架,你可以用它来化繁为简。
It's a system that can then generalize and it can work on problems that are bigger than fit in one human's head, right? And so what that means, what a good stack and what the modular stack provides is the ability to walk up to it with a new problem and it'll generally work quite well.
这是一个系统,它可以推广并解决超出一个人头脑所能承受的大型问题,没错吗?因此,一个好的堆栈和模块化堆栈可以提供的能力是,当你遇到一个新问题时,将其提交给系统,它通常能够解决问题并表现得良好。
And that's something that a lot of machine learning infrastructure and tools and technologies don't have. Typical state of the art today is you walk up, particularly if you're deploying, if you walk up with a new model, you try to push it through the converter and the converter crashes. That's crazy.
这是一些现有的机器学习基础设施、工具和技术没有的东西。现在通常的情况是,如果你要部署一个新模型,你会尝试将其通过转换器推送,但转换器会崩溃。这太疯狂了。
The state of ML tooling today is not anything that a C programmer would ever accept, right? And it's always been this kind of flaky set of tooling that's never been integrated well and it's never worked together because it's not designed together.
目前的机器学习工具状态并不是C程序员能接受的,对吧?一直以来,机器学习工具都是一套不可靠的工具集,从未很好地整合,也从未共同运作,因为它们并不是设计好一起使用的。
It's built by different teams, it's built by different hardware vendors, it's built by different systems, it's built by different internet companies, they're trying to solve their problems, right? And so that means that we get this fragmented, terrible mess of complexity.
这个系统由不同的团队建造,不同的硬件供应商,不同的系统和不同的互联网公司参与,他们试图解决各自的问题,对吧?这意味着我们面对着一个碎片化、复杂而可怕的混乱系统。
So, I mean, the specifics of it, and Jeremy showed this, there's the vectorize function, which I guess is built into Mojo. Vectorize, as he showed, is built into the library. It's in the library. Vectorize, parallelize: vectorize is more low level, parallelize is higher level. There's the tiling thing, which is how he demonstrated the autotune, I think.
So think about this in levels, hierarchical levels of abstraction, right? And so, if you zoom all the way into a compute problem, you have one floating point number, right? And so then you say, okay, I can do things one at a time in an interpreter. It's pretty slow, right? So I can get to doing one at a time in a compiler, like in C.
Then I can get to doing four or eight or 16 at a time with vectors; that's called vectorization. Then you can say, hey, I have a whole bunch of different... you know, what a multi-core computer is, is basically a bunch of computers, right?
So they're all independent computers that can talk to each other and they share memory. And so now what parallelize does is it says, okay, run multiple instances on different computers. And now they can all work together on a problem, right?
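[Editor's sketch, not from the conversation: the "run multiple instances that work together on one problem" idea can be illustrated in plain Python. The function names below are hypothetical, and this is not Mojo's `parallelize`. It also uses threads, which in CPython do not give real multi-core speedups for CPU-bound work; the point is only how the work is split into independent chunks and recombined.]

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """Work done by one worker on its own slice of the data."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Split data into roughly equal chunks, process each chunk on a
    separate worker, then combine the partial results."""
    size = max(1, (len(data) + workers - 1) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))


data = list(range(1000))
serial = sum_of_squares(data)
parallel = parallel_sum_of_squares(data)
print(serial == parallel)
```

The chunks share the same list in memory, which loosely mirrors the "independent computers that share memory" picture of a multi-core machine.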
And so what you're doing is you're saying, keep going out to the next level out. And as you do that, how do I take advantage of this? So tiling is a memory optimization, right? It says, okay, let's make sure that we're keeping the data close to the compute part of the problem, instead of sending it all back and forth through memory every time I load a block. And the size of the block, oh, that's how you get to the autotune, to make sure it's optimized.
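[Editor's sketch, not from the conversation: tiling can be shown as a loop restructuring. The tiled version below does exactly the same arithmetic as the naive matrix multiply, but visits the matrices one `tile` x `tile` block at a time. In pure Python this will not be faster; in a compiled language the block size decides whether the working set stays in cache. The function names and the `tile` parameter are illustrative assumptions.]

```python
def matmul_naive(a, b):
    """Plain triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same arithmetic, but the loops walk over tile-sized blocks so that
    each block of a, b and c is reused while it is 'close' to the compute."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, p, tile):
            for k0 in range(0, m, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, p)):
                        acc = c[i][j]
                        for k in range(k0, min(k0 + tile, m)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
print(matmul_tiled(a, b, tile=2))  # [[58.0, 64.0], [139.0, 154.0]]
```

The block size is the "magic number": the result never changes with `tile`, only the memory access pattern does, which is why it is a natural thing to search over automatically.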
Yeah, well, so all of these, the details matter so much to get good performance. This is another funny thing about machine learning and high performance computing that is very different than C compilers we all grew up with, where, you know, if you get a new version of GCC or a new version of Clang or something like that, you know, maybe something will go 1% faster, right?
And so compiler engineers will work really, really, really hard to get half a percent out of your C code, something like that. But when you're talking about an accelerator or an AI application or you're talking about these kinds of algorithms, and these are things people used to write in Fortran, for example, right? If you get it wrong, it's not 5% or 1%. It could be 2x or 10x, right?
If you think about it, you really want to make use of the full memory you have, the cache, for example. But if you use too much space, it doesn't fit in the cache, now you're going to be thrashing all the way back out to main memory. And these can be 2x, 10x, major performance differences. And so this is where getting these magic numbers and these things right is really actually quite important.
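[Editor's sketch, not from the conversation: the simplest form of autotuning is to time the same routine with several candidate "magic numbers" on the actual machine and keep the fastest. The helper names `sum_in_blocks` and `autotune` are hypothetical, and this is only the concept, not how Mojo's autotune feature is implemented.]

```python
import time


def sum_in_blocks(data, block):
    """Toy kernel whose result is independent of the block size."""
    total = 0
    for start in range(0, len(data), block):
        total += sum(data[start:start + block])
    return total


def autotune(fn, data, candidates, repeats=3):
    """Run fn with each candidate block size and return the fastest one.
    Taking the best of several repeats reduces timing noise."""
    best_block, best_time = None, float("inf")
    for block in candidates:
        elapsed = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            fn(data, block)
            elapsed = min(elapsed, time.perf_counter() - t0)
        if elapsed < best_time:
            best_block, best_time = block, elapsed
    return best_block


data = list(range(10_000))
candidates = [16, 64, 256, 1024]
best = autotune(sum_in_blocks, data, candidates)
print("fastest block size on this machine:", best)
```

The winning value depends on the machine it runs on, which is the point: the search replaces a hand-picked constant that would have to be revisited for every new chip.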
So you mentioned that Mojo is a superset of Python. Can you run Python code as if it's Mojo code? Yes. Yes. And so, and this has two sides of it. So Mojo's not done yet, so I'll give you a disclaimer. Mojo's not done yet. But already we see people that take small pieces of Python code, move it over, they don't change it, and you can get 12x speedups.
Like somebody was just tweeting about that yesterday, which is pretty cool, right? And again, interpreters, compilers, right? And so without changing any code, and also this is not JIT compiling or anything fancy, this is just basic stuff, move it straight over.
Now Mojo will continue to grow out, and as it grows out, it will have more and more and more features. And our North Star is to be a full superset of Python, and so you can bring over, basically, arbitrary Python code and have it just work. And it may not always be 12x faster, but it should be at least as fast and way faster in many cases, this is the goal, right?
Now it'll take time to do that. And Python is a complicated language. There's not just the obvious things, but there's also non-obvious things that are complicated. Like we have to be able to talk to CPython packages, to talk to the C API, and there's a bunch of pieces. So you have to, just to make explicit, the obvious may not be so obvious, and so you think about it, so to run Python code, that means you have to run all the Python packages and libraries. Yeah, yeah.
So that means what? What's the relationship between Mojo and CPython, the interpreter that presumably would be tasked with getting those packages to work? Yep, so in the fullness of time, Mojo will solve for all the problems and you'll be able to move Python packages over and run them in Mojo. Without the CPython. Without CPython, someday.
Yeah, right, it's not today, but someday. And that'll be a beautiful day because then you'll get a whole bunch of advantages and you'll get massive speedups and things like this.
But you can do that one at a time, right? You can move packages one at a time. Exactly, but we're not willing to wait for that. Python is too important. The ecosystem is too broad. We want to both be able to build Mojo out. We also want to do it the right way without intense time pressure. We're obviously moving fast.
But, and so what we do is we say, okay, well, let's make it so you can import an arbitrary existing package. Arbitrary, including you write your own on your local disk, or whatever. It's not like an arbitrary package. And import that using CPython. Because CPython already runs all the packages, right?
And so what we do is we built an integration layer where we can actually use CPython, again, I'm practical, to actually just load and use all the existing packages as they are. The downside of that is you don't get the benefits of Mojo for those packages, right? And so they run as fast as they do in the traditional CPython way. But what that does is that gives you an incremental migration path.
And so if you say, hey, cool, well, here's a, you know, the Python ecosystem is vast. I want all of it to just work. But there are certain things that are really important. And so if I'm doing weather forecasting or something, well, I want to be able to load all the data, I want to be able to work with it. And then I have my own crazy algorithm inside of it.
Well, normally I'd write that in C++. If I can write in Mojo and have one system that scales, well, that's way easier to work with. Is it hard to do that, to have that layer that's running CPython? Because is there some communication back and forth? Yes, it's complicated. I mean, this is what we do. So I mean, we make it look easy, but it is complicated. But what we do is we use the CPython existing interpreter.
So it's running its own byte codes, and that's how it provides full compatibility. And then it gives us CPython objects. And we use those objects as is. And so that way we're fully compatible with all the CPython objects and all the, you know, it's not just the Python part, it's also the C packages. The C libraries underneath them, because they're often hybrid. And so we can fully run and be fully compatible with all that. And the way we do that is that we have to play by their rules. And so we keep objects in that representation when they're coming from that world.
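[Editor's sketch, not from the conversation: the "byte codes" that the CPython interpreter runs can be inspected from Python itself with the standard `dis` module. The function `add` is just an illustrative example, and the exact instruction names vary between CPython versions, so no expected output is shown.]

```python
import dis


def add(a, b):
    return a + b


# The compiled form of the function is a bytes object of CPython bytecode.
raw = add.__code__.co_code

# dis.get_instructions decodes that bytecode into named instructions.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(len(raw), "bytes of bytecode:", opnames)
```

This is the representation the existing interpreter keeps executing when packages are loaded through it unchanged.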
What's the representation that's being used in memory? You'd have to know a lot about how the CPython interpreter works. It has, for example, reference counting, but also different rules on how to pass pointers around and things like this, super low-level fiddly stuff. And it's not like Python, it's like how the interpreter works. And so that gets all exposed out. And then you have to define wrappers around the low level C code.
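[Editor's sketch, not from the conversation: CPython's reference counting is visible from Python through `sys.getrefcount`. This is specific to the CPython interpreter; the absolute numbers are an implementation detail, so the sketch only looks at how the count changes when a second reference is created and dropped.]

```python
import sys

obj = object()
before = sys.getrefcount(obj)    # count with one name bound to obj

alias = obj                      # a second reference to the same object
after = sys.getrefcount(obj)

del alias                        # drop the second reference again
restored = sys.getrefcount(obj)

print(after - before, restored - before)
```

C extension code has to maintain these counts by hand, which is part of the low-level bookkeeping described above.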
And so what this means is you have to know not only C, which is a different world from Python, obviously, and not only Python, but the wrappers: the interpreter and the wrappers and the implementation details and the conventions. And it's just this really complicated mess. And when you do that, now suddenly you have a debugger that debugs Python. It can't step into C code.
So you have this two world problem. And so by pulling this all into Mojo, what you get is you get one world. You get the ability to say, cool, I have untyped, very dynamic, beautiful, simple code. OK, I care about performance for whatever reason. There's lots of reasons you might care. And so then you add types. You can parallelize things. You can vectorize things. You can use these techniques, which are general techniques to solve a problem. And then you can do that by staying in the system.
And if you have that one Python package that's really important to you, you can move it to Mojo. You get massive performance benefits on that. And other advantages: if you like static types, it's nice if they're enforced. Some people like that rather than them being hints. So there's other advantages too. And then you can do that incrementally as you go.
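[Editor's sketch, not from the conversation: in standard Python, type annotations are hints that the interpreter records but does not enforce at run time. The function `double` is an illustrative example; Mojo syntax is deliberately not shown here.]

```python
def double(x: int) -> int:
    """Annotated as int -> int, but CPython does not check this."""
    return x * 2


result_int = double(21)     # matches the hint
result_str = double("ab")   # violates the hint, still runs: str * 2

print(result_int, result_str)       # 42 abab
print(double.__annotations__)       # the hints are stored, not enforced
```

An external checker such as mypy would flag the second call; the interpreter itself will not, which is the "hints" versus "enforced" distinction being drawn.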
One different perspective on this will be why Mojo, instead of making CPython faster, redesigning CPython. Yeah. Well, I mean, you could argue Mojo is redesigning CPython. But why not make CPython faster and better and other things like that? There's lots of people working on that. So actually, there's a team at Microsoft that is really improving. I think CPython 3.11 came out in October or something like that. It was 15% faster, 20% faster across the board, which is pretty huge given how mature Python is and things like this. And so that's awesome. I love it. Doesn't run on GPU. It doesn't do AI stuff. It doesn't do vectors. It doesn't do things.
20% is good. 35,000 times is better. So I'm a huge fan of that work, by the way. And it composes well with what we're doing. And so it's not like we're fighting or anything like that. It's actually just goodness for the world. But it's just a different path. And again, we're not working forwards from making Python a little bit better. We're working backwards from what is the limit of physics.
What's the process of porting Python code to Mojo? Is there what's involved in that in the process? Is there tooling for that? Not yet. So we're missing some basic features right now. And so we're continuing to drop out new features like on a weekly basis. But at the fullness of time, give us a year and a half, maybe two years. Is it an automatable process? When we're ready, it'll be very automatable. Yes. Is it automatable? Is it possible to automate in the general case, the Python to Mojo conversion?
Yeah. Well, if you're saying it's possible. Well, so and this is why, among other reasons, why we use tabs. Yes. Right. So first of all, by being a superset, it's like C versus C++. Can you move C code to C++? Yes. Yeah. And you move C code to C++, and then you can adopt classes. You can adopt templates. You can adopt other references or whatever C++ features you want after you move C code to C++. Like you can't use templates in C.
And so if you leave it as C, fine, you can't use the cool features. But it still works. And C and C++ get to work together. And so that's the analogy. Now, here, there's not a "Python is bad" and a "Mojo is good." Mojo just gives you superpowers. And so if you want to stay with Python, that's cool. But the tooling should be actually very beautiful and simple because we're doing the hard work of defining a superset.
Right. So there's several things to say there. But also the conversion tooling should probably give you hints as to how you can improve the code. And then exactly once you're in the new world, then you can build all kinds of cool tools to say, hey, should you adopt this feature? And we haven't built those tools yet, but I fully expect those tools will exist. And then you can modernize your code or however you want to look at it.
So one of the things that I think is really interesting about Mojo is that there have been a lot of projects to improve Python over the years. Everything from getting Python to run on the Java virtual machine to PyPy, which is a JIT compiler; there's tons of these projects out there that have been working on improving Python in various ways. They fall into one of two camps. So PyPy is a great example of a camp that is trying to be compatible with Python. Even there, not really. It doesn't work with all the C packages and stuff like that. But they're trying to be compatible with Python.
There's also another category of these things where they're saying, well, Python is too complicated. And I'm going to cheat on the edges. And integers in Python can be an arbitrary size integer. If you care about it fitting in and going fast in a register in a computer, that's really annoying. And so you can choose to pass on that. You can say, well, people don't really use big integers that often, therefore, I'm going to just not do it. And it will be fine. Not a Python superset. Or you can do the hard thing and say, OK, this is Python. You can't be a superset of Python without being a superset of Python. And that's a really hard technical problem. But it's, in my opinion, worth it.
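[Editor's sketch, not from the conversation: Python integers grow to arbitrary size, whereas a value held in a 64-bit machine register wraps around. The helper `wrap_u64` is a hypothetical stand-in that emulates unsigned 64-bit wraparound by masking; it is what an implementation that "cheats on the edges" would effectively compute.]

```python
MASK64 = (1 << 64) - 1  # all ones in the low 64 bits


def wrap_u64(n):
    """Emulate an unsigned 64-bit register: keep only the low 64 bits."""
    return n & MASK64


big = 2 ** 64 + 1

print(big)                # 18446744073709551617, exact in Python
print(big.bit_length())   # 65, so it cannot fit in 64 bits
print(wrap_u64(big))      # 1, what a fixed-width register would hold
```

Being a true superset means keeping the first behavior for existing Python code while still letting performance-sensitive code opt into fixed-width integers.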
And it's worth it because it's not about any one package, it's about this ecosystem. It's about what Python means for the world. And it also means we don't want to repeat the Python 2 to Python 3 transition. We want people to be able to adopt this stuff quickly. And so by doing that work, we can help lift people. Yeah, the challenge. It's a really interesting technical, philosophical challenge of really making a language a superset of another language. It breaks my brain a little bit. Well, it paints you into corners.
So again, I'm very happy with Python. All joking aside, I think that the annotation thing is not the actual important part of the problem. But the fact that Python has amazing dynamic metaprogramming features, and they translate to beautiful static metaprogramming features, I think is profound. I think that's huge. And so Python, I've talked with Guido about this. It was not designed to do what we're doing. And that was not the reason they built it this way. But because they really cared and they were very thoughtful about how they designed the language, it scales very elegantly in the space.
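[Editor's sketch, not from the conversation: one small example of Python's dynamic metaprogramming is building a class at run time with the built-in `type`. The factory `make_point_class` and the class it creates are hypothetical. How Mojo maps such features onto static, compile-time metaprogramming is not shown here.]

```python
def make_point_class(name, fields):
    """Create a new class at run time with one attribute per field."""

    def __init__(self, *values):
        if len(values) != len(fields):
            raise TypeError(f"{name} expects {len(fields)} values")
        for field, value in zip(fields, values):
            setattr(self, field, value)

    def __repr__(self):
        args = ", ".join(f"{f}={getattr(self, f)!r}" for f in fields)
        return f"{name}({args})"

    # type(name, bases, namespace) builds the class object dynamically.
    return type(name, (object,), {"__init__": __init__, "__repr__": __repr__})


Point3D = make_point_class("Point3D", ("x", "y", "z"))
p = Point3D(1, 2, 3)
print(p)  # Point3D(x=1, y=2, z=3)
```

Here the "program that writes a class" runs while the program is executing; the static counterpart would do the same kind of work before the program runs.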
But if you look at other languages, for example, C and C++, if you're building a superset, you get stuck with the design decisions of the subset. And so C++ is way more complicated because of C and legacy than it would have been if they had theoretically designed a from-scratch thing. And there's lots of people right now that are trying to make C++ better and re-syntax C++. It's going to be great. We'll just change all the syntax. But if you do that, now suddenly you have zero packages. You don't have compatibility.
So what are the, if you could just linger on that, what are the biggest challenges of keeping that superset status? What are the things you're struggling with? Is it all boiled down to having a big integer? No, I mean, it's a, what are the other things like?
Usually it's the long tail of weird things. So let me give you a war story. So a war story in this space is, you go way back in time, a project I worked on is called Clang. Clang, what it is, is a C++ parser, right? And when I started working on Clang, it must have been like 2006 or something, when I first started working on it, right? It's funny how time flies. Yeah, yeah.
I started that project and I'm like, okay, well, I want to build a C parser, a C++ parser for LLVM. GCC is yucky. You know, this is me in earlier times. It's yucky, it's unprincipled. It has all these weird features, like all these bugs, like, it's yucky.
So I'm going to build a standard-compliant C and C++ parser. It's going to be beautiful. It'll be amazing. Well engineered, all the cool things that an engineer wants to do. And so I started implementing and building it out, building it out and building it out. And then I got to include stdio.h. And all of the headers in the world use all the GCC stuff. Okay.
And so again, come back away from theory, back to reality. Right? I was at a fork in the road. I could have built an amazingly beautiful academic thing that nobody would ever use. Or I could say, well, it's yucky in various ways. All these design mistakes, accidents of history, the legacy. At that point, GCC was like over 20 years old. Which, by the way, yeah. Now LLVM's over 20 years old. Right? Yeah. Time catches up to you, right? And so you say, okay, well, what is easier?
Right? I mean, as an engineer, it's actually much easier for me to go implement long tail compatibility weird features, even if they're distasteful, and just do the hard work and like figure it out, reverse engineer, understand what it is, write a bunch of test cases, like try to understand behavior. It's way easier to do all that work as an engineer than it is to go talk to all C programmers and get, argue with them and try to get them to rewrite their code. Yeah.
Because that breaks a lot more things. Yeah. And you have realities like nobody actually understands how the code works, because it was written by the person who quit 10 years ago. Right?
[Note: that is, changing things can cause more problems, and some code has gone unmaintained for so long that nobody understands how it works anymore.]
And so this software is kind of frustrating that way, but that's how the world works. Yeah, unfortunately, it can never be this perfect, beautiful thing. Well, there are occasions in which you get to build, like you invent a new data structure or something like that. Or there's this beautiful algorithm that just makes you super happy. I love that moment.
But when you're working with people, and you're working with code and dusty deck code bases and things like this, right? It's not about what's theoretically beautiful. It's about what's practical. It's real, what people actually use.
And I don't meet a lot of people that say, I want to rewrite all my code, just for the sake of it. By the way, there could be interesting possibilities, and we'll probably talk about it, where AI can help rewrite some code. That might be a farther-out future, but it's a really interesting one, how that could be a tool in the battle against this monster of complexity that you mentioned.
Yeah. You mentioned Guido, the benevolent dictator for life of Python. What does he think about Mojo? Are you talking to him much about it?
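[Note: Guido van Rossum is the creator of the Python programming language and was long its lead maintainer; Mojo is the new programming language under discussion. The question is whether the two have talked about it.]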
I have talked with him about it. He found it very interesting. We actually talked with Guido before it launched, and so he was aware of it before it went public. I have a ton of respect for Guido for a bunch of different reasons.
You talk about Walrus operator and Guido is pretty amazing in terms of steering such a huge and diverse community and driving it forward. And I think Python is what it is thanks to him. And so to me, it was really important starting to work on Mojo to get his feedback and get his input and get his eyes on this.
Now, a lot of what Guido was and is, I think, concerned about is, how do we not fragment the community? Yeah. We don't want a Python 2 to Python 3 thing. Like, that was really painful for everybody involved. And so we spent quite a bit of time talking about that and some of the tricks I learned from Swift, for example.
So in the migration to Swift, we managed to not just convert Objective-C into a slightly prettier Objective-C, which we did. We then converted, not entirely, but almost an entire community to a completely different language. Right. And so there's a bunch of tricks that you learn along the way that are directly relevant to what we do. And so this is where, for example, you leverage CPython while bringing up the new thing. That approach is, I think, proven, and it comes from experience.
And so Guido was very interested in like, okay, cool. Like, I think that Python is really his legacy. It's his baby. And I have tons of respect for that. Incidentally, I see Mojo as a member of the Python family. We're not trying to take Python away from Guido and from the Python community. And so to me, it's really important that we're a good member of that community.
And so, yeah, I think that, again, you would have to ask Guido this, but I think that he's very interested in this notion of like, cool, Python gets beaten up for being slow. Maybe there's a path out of that. Right. And that, you know, if the future is Python, right? I mean, look at the far outside case on this, right? And I'm not saying this is Guido's perspective, but there's this path of saying like, okay, well, suddenly Python can suddenly go all the places it's never been able to go before. Right. And that means that Python can go even further and can have even more impact on the world.
So in some sense, Mojo could be seen as Python 4.0. I would not say that. I think that would drive a lot of people really crazy. Because of the PTSD of the 3.0, 2.0. I'm willing to annoy people about Emacs versus Vim, or tabs versus spaces. That one, I don't know. That might be a little bit far even for me. Like my skin may not be that thick.
But the point is the step to being a superset and allowing all of these capabilities, I think, is the evolution of a language. It feels like an evolution of a language. So he's interested in the ideas that you're playing with, but also concerned about the fragmentation. So what are the ideas you've learned? What are you thinking about? How do we avoid fragmenting the community? Where the Pythonistas and the...
I don't know what to call the Mojo people. Magicians? The magicians? I like it. Can coexist happily and share code. And basically just have these big code bases that are using CPython and more and more moving towards Mojo.
Well, so again, these are lessons I learned from Swift. And here we face very similar problems, right? In Swift you have Objective-C, super dynamic. They have very different syntax, right? But you're talking to people who have large scale code bases. I mean, Apple's got the biggest, largest scale code base of Objective-C code, right? And so none of the companies, none of the iOS developers, none of the other developers want to rewrite everything all at once, and so you want to be able to adopt things a piece at a time.
And so a thing that I found that worked very well in the Swift community was saying, okay, cool. And this was when Swift was very young. And we said, okay, you have a million-line-of-code Objective-C app. Don't rewrite it all. But when you implement a new feature, go implement that new class using Swift. Right. And so now this turns out to be a very wonderful thing for an app developer. But it's a huge challenge for the compiler team and the systems people that are implementing this, right? And this comes back to what is this trade-off between doing the hard thing that enables scale versus doing the theoretically pure and ideal thing, right?
And so Swift had adopted and built a lot of different machinery to deeply integrate with the Objective-C runtime. And we're doing the same thing with Python, right? Now, what happened in the case of Swift is that Swift as a language got more and more and more mature over time, right? And incidentally, Mojo is a much simpler language than Swift in many ways. And so I think that Mojo will develop way faster than Swift for a variety of reasons. But as the language gets more mature, in parallel with that, you have new people starting new projects.
Right. And so when the language is mature and somebody is starting a new project, that's when they say, OK, cool. I'm not dealing with a million lines of code. I'll just start and use the new thing for my whole stack. Now the problem is, again, you come back to where communities and where people work together: you build a new subsystem or a new feature or a new thing in Swift, or you build a new thing in Mojo, then you want it to end up being used on the other side, right? And so then you need to work on integration back the other way. And so it's not just Mojo talking to Python. It's also Python talking to Mojo, right?
And so what I would love to see, and I don't want to see this next month, right? But what I want to see over the course of time is I would love to see people that are building these packages, like NumPy or TensorFlow, or these packages that are half Python, half C++. And if you say, OK, cool, I want to get out of this Python C++ world into a unified world, and so I can move to Mojo. But I can't give up all my Python clients. Because these libraries get used by everybody, and they're not all going to switch all at once, and maybe never, right? Well, so the way we should do that is we should vend Python interfaces to the Mojo types. And that's what we did in Swift, and it worked great.
我想看到的是,在一段时间内,那些构建NumPy或TensorFlow等半Python半C++软件包的人能够使用Mojo作为一个整合平台。因为这些软件包被广泛使用,不是所有人都能立刻或永远都转移到Mojo上。因此,我们应该将Python接口弯曲使其适用于Mojo类型,这也是Swift所做的,而且效果很好。不过,我并不想在下个月就看到这种变化。
I mean, it was a huge implementation challenge for the compiler people, right? But there's only a dozen of those compiler people, and there are millions of users. And so it's a very expensive, capital intensive, like, skill set intensive problem. But once you solve that problem, it really helps adoption. It really helps the community progressively adopt technologies. And so I think that this approach will work quite well with the Python and the Mojo world. So for a package, port it to Mojo, and then create a Python interface.
我是说,对于编译器人员来说,这是一个巨大的实现挑战吧?但是只有大约十几个编译器人员,而有数百万用户。因此,这是一个非常昂贵的,资本密集型的,技能密集型的问题。但是,一旦你解决了这个问题,它真的有助于采用。它确实有助于社区逐步采用技术。所以我认为这种方法对于Python和Mojo世界将非常有效。因此,可以将一个包移植到Mojo,然后创建一个Python接口。
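The idea of porting a package while keeping its Python interface can be illustrated without any Mojo at all. The following is only a minimal sketch in plain Python, under the assumption that the public function stays fixed while the implementation behind it is swapped; the names `dot`, `register_backend`, and `_dot_reference` are hypothetical and do not come from any real library or from Mojo's interop layer.

```python
# Hypothetical package: all names here are illustrative, not a real API.

def _dot_reference(a, b):
    """Original implementation that existing Python clients already rely on."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return sum(x * y for x, y in zip(a, b))


# The implementation currently backing the public interface.
_backend = _dot_reference


def register_backend(fn):
    """Swap in another implementation (for example a ported, compiled one)."""
    global _backend
    _backend = fn


def dot(a, b):
    """Public Python interface: callers never see which backend is in use."""
    return _backend(a, b)
```

A client that calls `dot([1, 2, 3], [4, 5, 6])` gets the same answer before and after `register_backend` is called with a new implementation, which is the property that lets users migrate progressively rather than all at once.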
Yep. So, just to linger on these packages, NumPy, PyTorch and TensorFlow. Yeah. How do they play nicely together? So is Mojo supposed to be, let's talk about the machine learning ones. Is Mojo's vision to replace PyTorch and TensorFlow, or to incorporate them? What's the relationship in this?
是的。那么你如何留在这些包(NumPy,PyTorch和TensorFlow)上呢?它们如何相互协作?让我们谈谈机器学习方面的Mojo。Mojo是打算取代PyTorch和TensorFlow并将其整合起来的吗?它们之间的关系是什么?
All right. So take a step back. So I wear many hats. So you're angling in on the Mojo side. Mojo's a programming language. And so it can help solve the C, C++, Python feud that's happening. The fire emoji got me, I'm sorry. We should be talking Modular. Yes. Yes.
好的,那么稍微退后一步。我有很多身份。你现在在关注 Mojo 方面。Mojo 是一种编程语言。可以帮助解决 C、C++、Python 中出现的视觉问题。FireMojo,伙计,我很抱歉。我们应该与模块化交谈。是的,没错。
OK. So the fire emoji is amazing. I love it. It's a big deal. The other side of this is the fire emoji is in service of solving some big AI problems. And so the big AI problems are, again, this fragmentation, this hardware nightmare, this explosion of new potential, but it's not getting felt by the industry.
FireMojo很棒,我喜欢它。这是一件大事。另一方面,FireMojo是为了解决一些重大的人工智能问题而存在的。这些重大的AI问题是,再次强调这种分裂、硬件噩梦、新潜力的爆炸,但这并没有被业界所感受到。
And so when you look at how the Modular engine helps TensorFlow and PyTorch, it's not replacing them. In fact, when I talk to people, again, they don't like to rewrite all their code. You have people that are using a bunch of PyTorch, a bunch of TensorFlow. They have models that they've been building over the course of many years.
因此,当你看到模块化引擎如何帮助TensorFlow和PyTorch时,它并不是替代它们。事实上,当我和人们交谈时,他们并不喜欢重写他们的所有代码。你会发现有很多人在使用PyTorch和TensorFlow。他们已经建立了许多模型,这些模型已经使用了多年。
And when I talk to them, there's a few exceptions, but generally, they don't want to rewrite all their code. And so what we're doing is we're saying, OK, well, you don't have to rewrite all your code. What happens is the Modular engine goes in there and goes underneath TensorFlow and PyTorch. It's fully compatible.
在我与他们交谈时,有一些例外,但通常他们不想重写所有的代码。所以我们所做的就是说:“好吧,你不需要重写所有的代码。”模块化引擎进入其中并在 TensorFlow 和 PyTorch 下运行。它完全兼容。
And it just provides better performance, better predictability, better tooling. It's a better experience that helps lift TensorFlow and PyTorch and make them even better. I love Python. I love TensorFlow. I love PyTorch. This is about making the world better because we need AI to go further.
它提供更好的性能,更好的可预测性,更好的工具支持。它提供了更好的体验,帮助提升TensorFlow和PyTorch,让它们变得更好。我喜欢Python,我喜欢TensorFlow,我喜欢PyTorch。这是为了让世界变得更好,因为我们需要AI走得更远。
But if I have a process that trains a model and have a process that performs inference on that model and have the model itself, what should I do with that in the long arc of history? In terms of if I use PyTorch to train it, should I rewrite stuff in Mojo? Would that, if I care about performance?
但是,如果我有一个训练模型的过程,有一个执行模型推断的过程,以及模型本身,那么在历史的长弧线中,我该怎么做呢?如果我使用PyTorch进行训练,那么我应该在Mojo中重写代码吗?如果我关心性能,那么会有什么影响呢?
Oh, so again, it depends. So if you care about performance, then writing in Mojo is going to be way better than writing in Python. But if you look at LLM companies, for example, if you look at OpenAI, rumored, and you look at many of the other folks that are working on many of these LLMs and other innovative machine learning models, on the one hand, they're innovating in the data collection and the model billions of parameters and the model architecture and the RLHF and all the cool things that people are talking about.
哦,所以又要看情况而定。如果你关心性能,使用Mojo编写比Python要好得多。但是,如果你看看一些LLM公司,例如OpenAI,谣言和其他很多从事LLM和其他创新机器学习模型的人,他们一方面在数据收集、数十亿参数的模型和模型架构、RLHF等方面进行创新,人们对这些酷炫的东西充满了讨论。
But on the other hand, they're spending a lot of time writing CUDA kernels. Right? And so you say, wait a second, how much faster could all this progress go if they were not having to hand write all these CUDA kernels? Right. And so there are a few technologies that are out there. And people have been working on this problem for a while. And they're trying to solve subsets of the problem, again, kind of fragmenting the space.
但另一方面,他们说很多时间都花在手写CUDA代码上,是吧?那你会想,如果他们不必手写所有这些CUDA代码,这些进展会快多少呢?因此,有一些技术已经存在,人们一直在尝试解决这个问题。他们试图解决问题的子集,将空间划分为碎片。
And so what Mojo provides for these kinds of companies is the ability to say, cool, I can have a unifying theory. And again, the better together, the unifying theory, the two world problem or the three world problem or the N world problem, this is the thing that is slowing people down. And so as we help solve this problem, I think it'll be very helpful for making this whole cycle go faster.
Mojo为这些类型的公司提供的是能够说:“酷,我有一个统一的理论。”而且,统一的理论越好,可以解决两个世界、三个世界或终极问题,这是拖慢人们步伐的东西。因此,随着我们帮助解决这个问题,我认为它会对加快整个循环的进展非常有帮助。
So obviously, we've talked about the transition from Objective-C to Swift. You've designed this programming language. And you've also talked quite a bit about the use of Swift in the machine learning context. Why have you decided to move away from maybe an intense focus on Swift for the machine learning context versus sort of designing a new programming language that happens to be a super-super-super-super.
显然,我们谈论了从Objective-C到Swift的转变,如果设计这种编程语言。你也谈了很多关于使用Swift进行机器学习的内容。为什么你决定了不再像之前那么强调Swift在机器学习领域的应用,而是设计一种全新的编程语言,它可以是非常,非常,非常,非常优秀的?
As an irrational set of life choices I make. Did you go to the desert? Did you meditate on it? OK. All right. No, it was bold and needed. And I think, I mean, it's just bold. And sometimes to take those leaps is a difficult leap to take.
作为一系列非理性的生活选择,我所做的。你去沙漠了吗?我去了,你在那里冥想了吗?好的,没问题。不,那是大胆而必要的。而且我认为,也就是大胆。有时候,要迈出这种飞跃是很困难的。
Yeah, well, so OK. I mean, I think there's a couple of different things. So actually, I left Apple back in 2017, like January 2017. So it's been a number of years since I left Apple. And the reason I left Apple was to do AI. So and again, I won't comment on Apple and AI. But at the time, I wanted to get into it and understand the technology, understand the applications, the workloads.
好的,所以嗯,我想有几个不同的事情。实际上,我在2017年1月离开了苹果公司,已经有几年了。我离开苹果的原因是为了做人工智能。再次强调,我不会谈论苹果和人工智能。但当时,我想进入并了解技术、应用和工作负载。
And so I'm going to go dive deep into applied AI, and then the technology underneath it. I found myself at Google. And that was like when TPUs were waking up. Exactly. And so I found myself at Google. And Jeff Dean, who's a rock star, as you know, and in 2017, TensorFlow is really taking off and doing incredible things. And I was attracted to Google to help them with the TPUs. And TPUs are an innovative hardware accelerator platform that have now, I think, proven massive scale and done incredible things.
因此,我打算深入研究应用和人工智能,以及它们背后的技术。我在谷歌工作,而那时正是TPUs醒来的时候。正是如此,我发现自己在谷歌工作。而且,正如你所知,杰夫·迪安是一位摇滚明星,在2017年,TensorFlow真正开始起飞并做出了惊人的成就。我被谷歌吸引,要帮助他们处理TPUs。TPUs是一种创新的硬件加速平台,现在已经证明了它的巨大规模和惊人的成就。
And so one of the things that this led into is a bunch of different projects, which I'll skip over, one of which was this Swift for TensorFlow project. And so that project was a research project. And so the idea of that is say, OK, well, let's look at innovative new programming models, where we can get a fast programming language, we can get automatic differentiation into the language. Let's push the boundaries of these things in a research setting.
因此,这导致了许多不同的项目,其中一个是Swift for TensorFlow项目。这个项目是一个研究项目。这个项目的想法是要寻找创新的新编程模型,用于实现快速编程语言,并将自动微分集成到语言中。在研究环境中推动这些事物的边界。
Now, that project, I think, lasted two, three years. There's some really cool outcomes of that. So one of the things that's really interesting is I published a talk at an LLVM conference in 2018 (again, this seems like so long ago) about graph program extraction, which is basically the thing that's in PyTorch 2. And so PyTorch 2, with all this Dynamo thing, is all about this graph program extraction thing from Python bytecodes. And so a lot of the research that was done ended up going out through the industry and influencing things. And I think it's super exciting and awesome to see that.
我认为那个项目持续了两三年,其中有一些非常酷的成果。其中一个非常有趣的事情是我在2018年的一个LLVM会议上发布了一篇演讲,关于图形程序抽象的内容,这基本上就是PyTorch 2中的东西。因此,PyTorch 2在所有这些轮子和事情中都涉及到了从Python字节码中提取出的图形程序抽象。因此,许多所做的研究最终得到了追求并在产业中发挥了影响。我认为这非常令人兴奋和令人惊叹。
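To make "from Python bytecodes" concrete, the standard library's `dis` module can list the bytecode that CPython compiles a function into. This is only a minimal sketch of the raw material such a tool would start from; it is not how PyTorch 2 or Swift for TensorFlow is implemented, and `model` and `bytecode_ops` are made-up names for illustration.

```python
import dis


def model(x, w, b):
    # A tiny "model": one multiply followed by one add.
    return x * w + b


def bytecode_ops(fn):
    """Return the opcode names CPython compiled `fn` into."""
    return [ins.opname for ins in dis.get_instructions(fn)]


ops = bytecode_ops(model)
# Exact opcode names vary between CPython versions, so just print them.
print(ops)
```

The multiply and the add show up as binary-operation instructions in the list, which is the kind of information a graph-capturing tool could use to recover the computation without the author changing their Python code.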
But the Swift for TensorFlow project itself did not work out super well. And so there's a couple of different problems with that, one of which is that you may have noticed Swift is not Python. There's a few people that write Python code. Yes. And so it turns out that all of ML is pretty happy with Python. It's actually a problem that other programming languages have as well, that they're not Python. We'll probably maybe briefly talk about Julia, which is a very interesting, beautiful programming language.
但是Swift for TensorFlow项目本身并不是非常成功。其中有几个不同的问题,其中一个问题是你可能已经注意到,Swift不是Python。有一些人写Python代码。是的。事实证明,所有的ML都非常喜欢Python。其他编程语言也有这个问题,它们不是Python。我们可能会简要谈论一下Julia,这是一种非常有趣、美丽的编程语言。
But it's not Python. Exactly. Well, and so if you're saying I'm going to solve a machine learning problem where all the programmers are Python programmers, and you say the first thing you have to do is switch to a different language, well, your new thing may be good or bad or whatever. But if it's a new thing, the adoption barrier is massive. It's still possible. Still possible? Yeah, absolutely. The world changes and evolves. And there's definitely room for new and good ideas. But it just makes it so much harder.
但这不是Python。没错。如果你说我要解决一个机器学习问题,而所有的程序员都是Python程序员,那么你说你首先要做的事情是转换到不同的语言,那么你的新东西可能是好的或坏的,甚至可能什么都没有。但如果这是一个新的东西,采用的障碍是巨大的。这仍然是可能的。仍然可能?是的,绝对的。世界在变化和发展。而且肯定有新的和好的想法的空间。但这只会使事情更加困难。
And so lesson learned, Swift is not Python. And people are not always in search of learning a new thing for the sake of learning a new thing. And if you want to be compatible with all the world's code, turns out, meet the world where it is. Second thing, a lesson learned, is that Swift, as a very fast and efficient language, kind of like Mojo, but a different take on it still, really worked well with eager mode.
于是,我们得到了这个教训:Swift不是Python。人们并不总是为了学习而学习。如果你想与全世界的代码兼容,那么就要放眼于全球市场。第二个教训是,Swift是一种非常快速和高效的语言,类似于Mojo,但它仍然有其独特之处。在"eager mode"下表现得非常出色。
And so eager mode is something that PyTorch does, and it proved out really well. And it enables really expressive and dynamic and easy to debug programming. TensorFlow at the time was not set up for that, let's say. The timing is also important in this world. Yeah, yeah, TensorFlow is a good thing. And it has many, many strengths. But you could say, Swift for TensorFlow is a good idea, except for the Swift and except for the TensorFlow part.
因此,热切模式是PyTorch使用的一种方法,它被证明效果非常好。它使编程变得非常灵活,易于表达和调试。而在当时,TensorFlow并没有设置好这方面的东西。在这个世界中,时间也很重要。当然,TensorFlow是一个很好的东西,有很多优点。但你可以说,Swift for TensorFlow是一个好主意,除了Swift和TensorFlow这两个部分。
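The contrast between eager mode and a graph-first design can be sketched in a few lines of plain Python. This is a toy illustration under simplifying assumptions, not PyTorch's or TensorFlow's actual machinery; `Node`, `evaluate`, and `f` are hypothetical names. In the eager call each operation runs immediately on real numbers, while in the graph call the same function only records operations to be run later.

```python
class Node:
    """One recorded operation (or input placeholder) in a tiny expression graph."""

    def __init__(self, op, inputs=(), name=None):
        self.op, self.inputs, self.name = op, tuple(inputs), name

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def evaluate(node, env):
    """Run a recorded graph later, given concrete values for its inputs."""
    if node.op == "input":
        return env[node.name]
    left, right = (evaluate(n, env) for n in node.inputs)
    return left + right if node.op == "add" else left * right


def f(x, w, b):
    return x * w + b


# Eager: operations execute right away, so values can be printed or debugged inline.
eager_result = f(2.0, 3.0, 1.0)

# Graph: calling f on placeholders builds a structure; nothing is computed yet.
x, w, b = (Node("input", name=n) for n in "xwb")
graph = f(x, w, b)
graph_result = evaluate(graph, {"x": 2.0, "w": 3.0, "b": 1.0})
```

Both paths produce the same number, but only the eager path lets an ordinary debugger or `print` see intermediate values as the function runs, which is the ease-of-debugging point made above.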
So Swift because it's not Python, and TensorFlow because it wasn't set up for eager mode at the time. Yeah. That was 1.0. Exactly. And so one of the things about that is in the context of it being a research project, I'm very happy with the fact that we built a lot of really cool technology. We learned a lot of things. I think the ideas went on to have influenced some other systems like PyTorch. A few people use that, I hear. And so I think that's super cool. And for me personally, I learned so much from it. And I think a lot of the engineers that worked on it also learned a tremendous amount. And so I think that that's just really exciting to see.
这是因为它不是Python和TensorFlow,因此它并不容易。当时它没有设置为急切模式。是的,那时是1.0版。确切地说,这个项目作为一个研究项目,其中一个好处是我们建立了很多非常酷的技术,我们了解了很多东西。我认为这些想法对其他系统,如PyTorch产生了影响,这里有些人正在使用。所以我认为这非常酷。对于个人而言,我从中学到了很多东西。我认为,许多工程师也从中学到了很多东西。所以我认为这非常激动人心。
And I'm sorry that the project didn't work out. I wish it did, of course. But it's a research project. And so you're there to learn from it.
我很抱歉这个项目没有成功。当然,我希望它能成功。但这是一个研究项目,你们参与其中是为了从中学习。
Well, it's interesting to think about the evolution of programming as we come up with this whole new set of algorithms in machine learning and artificial intelligence. And what's going to win out? Because it could be a new programming language. Yeah. It could be, I mean, I just mentioned Julia. I think there's a lot of ideas behind Julia that Mojo shares. What are your thoughts about Julia in general?
“思考编程的演化”是一件有趣的事情,因为我们正在为机器学习和人工智能这些全新算法集合开发新的编程语言。但是哪一种会最终成为赢家呢?也许会是一种全新的编程语言。我曾经提到过 Julia,它的很多理念与 Mojo 有相似之处。你对 Julia 有什么想法吗?
So I will have to say that when we launched Mojo, one of the biggest things I didn't predict was the response from the Julia community. And so I was not. I mean, OK, let me take a step back. I've known the Julia folks for a really long time. They were an adopter of LLVM a long time ago. They've been pushing state of the art in a bunch of different ways. Julia's a really cool system.
因此我必须说,当我们推出Mojo时,我没有预料到最大的事情之一就是Julia社区的反应。所以我有些惊讶。我的意思是,好的,让我退一步。我认识Julia的人很久了。他们很早就采用了LLVM,一直在不同的方面推动最先进的技术。Julia是一个非常酷的系统。
I had always thought of Julia as being mostly a scientific computing-focused environment. And I thought that was its focus. I neglected to understand that one of their missions is to help make Python work end to end. And so I think that was my error for not understanding that. And so I could have been maybe more sensitive to that.
我一直认为Julia主要是一个以科学计算为重点的环境。我以为这是它的重点。我忽略了它的一个使命是帮助Python从头到尾地工作。我认为这是我的错误,因为我没有理解到这一点。所以我可能应该更敏感些。
But there's major differences between what Mojo's doing and what Julia's doing. So as you say, Julia is not Python. And so one of the things that a lot of the Julia people came out and said is like, OK, well, if we put a ton more energy and a ton more money or engineering or whatever into Julia, maybe that would be better than starting Mojo.
Mojo和Julia之间存在着重要的区别。正如你所说,Julia不是Python。因此,许多Julia的支持者表示,如果我们在Julia上投入大量的精力、更多的金钱或工程资源,可能会比开始开发Mojo更好。
I mean, maybe that's true, but it still wouldn't make Julia into Python. So if you work backwards from the goal of, let's build something for Python programmers without requiring them to relearn syntax, then Julia just isn't there. I mean, that's a different thing. And so if you anchor on, I love Julia and I want Julia to go further, then you can look at it from a different lens. But the lens we were coming at was, hey, everybody is using Python. And syntax isn't broken. Let's take what's great about Python and make it even better. And so it was just a different starting point.
我的意思是,也许那是真的,但这仍然不会把Julia变成Python。因此,如果你从“让Python程序员构建东西而不需要重新学习语法”的目标开始倒推,那么Julia就不符合要求了。我的意思是,这是另一码事。因此,如果你想要支持Julia,让它走得更远,那么你可以从不同的角度来看待它。但我们的出发点是,每个人都在使用Python,而语法没有问题。我们要利用Python的优点,把它变得更好。这只是一个不同的起点。
So I think Julia is a great language. The community is a lovely community. They're doing really cool stuff, but it's just a different, slightly different angle.
我认为Julia是一门很棒的语言。这个社区非常友善,他们正在做非常酷的事情,但只是在一个稍微不同的角度。
But it does seem that Python is quite sticky. Is there some philosophical, almost that you could say about why Python, by many measures, seems to be the most popular programming language in the world?
Python似乎相当有吸引力。关于为什么Python众多方面都被认为是全球最流行的编程语言,是否有一些哲学的理由可以解释呢?
Well, I can tell you things I love about it. Maybe that's one way to answer the question, right? So huge package ecosystem. Super lightweight and easy to integrate. It has very low startup time.
嗯,我可以告诉你我喜欢它的一些东西。也许这是回答这个问题的一种方式,对吧?它拥有一个巨大的软件包生态系统,非常轻量级且易于集成。它启动时间很短。
So what startup time do you mean? Like learning curve or what?
你说的启动时间是指什么?是指学习曲线还是其他的呢?
Yeah, so if you look at certain other languages, you say go, and it just takes, like Java, for example, it takes a long time to just compile all the things. And then the VM starts up and the garbage collector kicks in, and then it revs its engines. And then it can plow through a lot of internet stuff or whatever, right? Python is scripting. It just goes, right? Python has very low compile time. Like, so you're not sitting there waiting. Python integrates into notebooks in a very elegant way that makes exploration super interactive, and it's awesome.
如果你看其他某些语言,比如Java,编译所有的东西就需要很长时间。然后虚拟机启动时,垃圾回收机制开始运行,然后才能启动引擎,最终可以进行云端操作或其他操作。Python是一种脚本语言,直接启动。Python的编译时间非常短,所以你不必等待。Python可以非常优雅地集成到笔记本中,让探索变得超级即时交互和很棒。
Python is also almost the glue of computing, because it has such a simple object representation. A lot of things plug into it. That dynamic metaprogramming thing we were talking about also enables really expressive and beautiful APIs.
Python几乎是计算机领域的粘合剂,因为它拥有如此简单的对象表示法,很多事情都可以与它兼容。我们谈到的动态元编程使得API变得非常具有表现力和优美。
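As a small illustration of how Python's dynamic object model enables expressive APIs, the sketch below uses the standard `__getattr__` hook so that nested dictionary keys read like ordinary attributes. The `Config` class is a hypothetical example written for this purpose, not something taken from the conversation or from a particular library.

```python
class Config:
    """Expose dictionary keys as attributes, resolved dynamically at lookup time."""

    def __init__(self, data):
        self._data = dict(data)

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        try:
            value = self._data[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dictionaries so attribute access can be chained.
        return Config(value) if isinstance(value, dict) else value


cfg = Config({"model": {"layers": 12, "name": "demo"}})
print(cfg.model.layers)
```

Nothing about `model` or `layers` was declared ahead of time; the class decides what an attribute means at the moment it is accessed, which is the kind of hook many Python libraries lean on for concise interfaces.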
So there's lots of reasons that you can look at technical things that Python has done and say, OK, wow, this is actually a pretty amazing thing. And any one of those you can neglect, people will all just talk about indentation and ignore the fundamental things.
有很多理由可以从技术角度来看Python所做的事情,并说,哇,这实际上是一件非常了不起的事情。如果你忽略了其中任何一种,人们都只会谈论缩进并忽略基本的事情。
But then you also look at the community side, right? So Python owns machine learning. Machine learning is pretty big. Yeah, and it's growing. And it's growing in importance, right? And there's a reputation and prestige to machine learning to where, like, if you're a new programmer, you're thinking about which programming language do I use? Well, I probably should care about machine learning, therefore, let me try Python and build and build and build.
然后你也要考虑到社群的方面,对吧?Python主要用于机器学习,机器学习相当重要,而且越来越受到重视。机器学习已经具备声望和声誉,如果你是一名新的程序员,会思考应该使用哪种编程语言。那么,我可能应该关心机器学习,所以让我尝试使用Python,一步步建立。
And you even go back before that. Like, my kids learn Python. Right? Not because I'm telling them to learn Python, but because they were telling us you or what? Oh, no, right. Well, they also learn Scratch, right? And things like this, too. But it's because Python is taught everywhere, right? Because it's easy to learn, right? And because it's pervasive, right?
你甚至可以回溯到更早之前。比如说,我的孩子们学习Python。不是因为我要求他们学Python,而是因为他们告诉我们,Python到底是什么。哦,不对,他们还学习了Scratch之类的东西。但这是因为Python无处不在,它易于学习,广泛使用。
And they're not going to do that. In my day, we learned Java and C++. Yeah. Well, uphill both directions. But yes, I guess Python is the main language of teaching software engineering at schools now.
它们不会这样做。在我那个时代,我们学习Java和C++。是的,当时不论是Java还是C++都很困难。但现在,在学校教授软件工程时,Python是主要的编程语言。
Yeah. Well, if you look at this, there's these growth cycles, right? If you look at what causes things to become popular, and then gain in popularity, there's reinforcing feedback loops and things like this. And I think Python has done, again, the whole community has done a really good job of building those growth loops and helping propel the ecosystem.
嗯,你看这个,有这些生长周期,对吧?如果你看看是什么导致事物变得流行,然后增长流行度,这就涉及到了强化反馈循环等等。我认为Python社区已经很出色地建立了这些生长循环,并帮助推动了整个生态系统的发展。
And I think that, again, you look at what you can get done with just a few lines of code. It's amazing. So this kind of self-building loop is interesting to understand, because when you look at Mojo, what it stands for, some of the features, it seems sort of clear that this is a good direction for programming languages to evolve in the machine learning community. But it's still not obvious that it will, because of this, whatever the engine of popularity, of virality, is there something you could speak to? Like, how do you get people to switch?
我认为,通过几行代码就能完成的功能是非常惊人的。因此,理解这种自建循环是很有趣的。当你看看Mojo,它代表了什么,一些特性,似乎很清楚这是编程语言在机器学习领域发展的一个好方向。但它仍然不是很明显,因为无论是受欢迎程度还是扩散引擎,都存在着一些问题。你能谈谈怎样让人们改变观念吗?
Yeah, well, I mean, I think that the viral growth loop is to switch people to Unicode. Yes. I think that Unicode file extensions are what I'm betting on. I think that's going to be the thing. Yeah. Tell the kids that you could use the fire emoji. And they'd be like, what?
嗯,我的意思是,我认为病毒式的增长循环是让人们转换到Unicode。是的,我认为Unicode文件扩展名是我押注的选择。我认为那将是重要的事情。是的。告诉孩子们你可以使用火焰表情符号。然后他们会说什么?
Exactly. Well, in all seriousness, I think there's really, I'll give you two opposite answers. One is, I hope if it's useful, if it solves problems, and if people care about those problems being solved, they'll adopt the tech. That's kind of the simple answer. And when you're looking to get tech adopted, the question is, is it solving an important problem people need solved? And is the adoption cost low enough that they're willing to make the switch and cut over and do the pain up front so that they can actually do it?
确实。认真地说,我认为有两种截然不同的答案。一种是,我希望如果它有用、能解决问题,而且人们关心这些问题得到解决,他们会采用这项技术。那是一个比较简单的答案。当你想让技术被采用时,问题是,它是否解决了人们需要解决的重要问题?采用成本是否足够低,以至于他们愿意切换并付出一些前期的痛苦,以便他们能够真正做到这一点呢?
And so hopefully, Mojo will be that for a bunch of people. And people building these hybrid packages are suffering. It's really painful. And so I think that we have a good shot of helping people. But the other side is, it's OK if people don't use Mojo. Because it's not my job to say, everybody should do this. I'm not saying Python is bad. I hope Python, CPython, all these implementations.
希望 Mojo能够成为许多人的选择。现在,制作这些混合软件包的人们正在遭受痛苦。我相信我们有机会帮助他们。但是另一方面,如果有人不使用Mojo也没关系。因为我不是要求每个人都必须使用它。我并不是在说Python不好。我希望Python和CPython等所有实现都能够发挥它们的作用。
Because Python ecosystem is not just CPython. It's also a bunch of different implementations with different trade-offs. And this ecosystem is really powerful and exciting as are other programming languages. It's not like TypeScript or something is going to go away. And so there's not a winner-take-all thing. And so I hope that Mojo is exciting and useful to people. But if it's not, that's also fine.
Python 生态系统不仅仅是 CPython,它也包括许多不同的实现,具有不同的权衡取舍。这个生态系统非常强大和令人兴奋,就像其他编程语言一样。这并不意味着 TypeScript 或其他语言将离开,所以并不存在一个赢者通吃的情况。因此,我希望 Mojo 对人们来说是令人兴奋和有用的,但如果不是,那也没关系。
But I also wonder what the use case for why you should try Mojo would be. So practically speaking, it seems like there's entertainment. There's the dopamine hit of saying, holy shit, this is 10 times faster. This little piece of code is sometimes faster in Mojo. Out of the box, before you get to 35,000x. Exactly. I mean, just even that, I mean, that's the dopamine hit that every programmer sort of dreams of. The optimization. It's also the drug that can pull you in and have you waste way too much of your life optimizing and overoptimizing, right?
不过我还在想,你为什么要尝试Mojo的用例是什么。从实际角度来看,似乎只是为了娱乐。有时Mojo会让你感到神奇,代码运行速度会比其他东西快十倍。正常情况下,尚未达到35,000美元的水平,这个小模块的速度就已经很快了。这也是每个程序员梦想中的多巴胺弹出效应,即优化。这也是一种毒品,可能会让你迷失在优化和过度优化中,白白浪费你的生命。
But so what do you see that would be like, this is very hard to predict, of course. But if you look 10 years from now, Mojo is super successful, what do you think would be the thing where people try it and then use it regularly and it kind of grows and grows and grows.
但是你觉得会有什么因素是很难预测的,当然了。但是如果你往后看10年,Mojo非常成功,你认为会是什么样的事情,人们尝试它并且经常使用它,然后它会不断地发展壮大。
Well, so you talk about dopamine hit. And so again, humans are not one thing. And some people love rewriting their code and learning new things and throwing themselves in the deep end and trying out new things. In my experience, most people don't. Like they're too busy. They have other things going on. By number, most people don't say, I want to rewrite all my code. But even those people, the too busy people, the people that don't actually care about the language, that just care about getting stuff done, those people do like learning new things, right?
好的,所以你谈论了多巴胺刺激。人并不是一个统一体,有些人喜欢重新编写代码,学习新的知识,将自己投入到深水区并尝试新的事物。在我的经验中,大多数人并不喜欢这样做。他们太忙了,有其他的事情要处理。从数量上看,大多数人并不喜欢这样做。我想重写我的所有代码。但是即使是那些太忙了的人,那些实际上并不关心语言,只关心完成任务的人,他们也喜欢学习新事物,对吧?
Yeah. And so you talk about the dopamine rush of 10x faster. Wow, that's cool. I want to do that again. Well, it's also like, here's the thing I've heard about in a different domain. And I don't have to rewrite all my code. I can learn a new trick, right? Well, that's called growth. And so one thing that I think is cool about Mojo, and again, this will take a little bit of time, for example, for the blog posts and the books and all that kind of stuff to develop and the language to get further along.
是的。所以你谈论了10倍更快的多巴胺反应。哇,太酷了。我想再试一次。那么,这也就像我在另一个领域听到的东西一样。我不必重写所有的代码,而是可以学习一项新技能,对吧?那就是成长。因此,我认为Mojo很酷,虽然这需要一些时间,比如博客文章和书籍等需要开发,语言要进一步发展。
But what we're doing, you talk about types. Like you can say, look, you can start with the world you already know, and you can progressively learn new things and adopt them where it makes sense. And if you never do that, that's cool. You're not a bad person. If you get really excited about it, want to go all the way in the deep end, and rewrite everything and really, whatever, that's cool, right?
但是我们正在做的事情,你谈论的是类型。就像你可以说,看,你可以从你已经知道的世界开始,然后逐渐学习新的东西,并在有意义的地方采用它们。如果你从未这样做过,那很好,你不是一个坏人。如果你对此感到兴奋,想要尝试所有的深层次内容,重新编写所有内容,那也很好,对吧?
But I think the middle path is actually the more likely one where you come out with a new idea and you discover, wow, that makes my code way simpler, way more beautiful, way faster, way whatever. And I think that's what people like.
但我认为中间的道路实际上更可能,你会发现一个新的想法,让你的代码变得更简单、更漂亮、更快等等。我想人们就是喜欢这种感觉。
Now, if you fast forward and you said like 10 years out, right, I can give you a very different answer on that, which is, I mean, if you go back and look at what computers looked like 20 years ago, every 18 months they got faster for free, right? 2x faster every 18 months, it was like clockwork. It was free, right?
现在,如果你快进10年,我可以给你一个完全不同的答案,因为如果你回头看看20年前的计算机,每18个月它们就会免费地变得更快,对吧?每18个月加速2倍,就像钟表一样稳定。这是免费的,对吧?
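The "2x faster every 18 months" figure compounds quickly. A short calculation, assuming a clean doubling every 18 months (the round number used in the conversation, not a measured trend), shows the scale involved; `speedup` is a helper name introduced here for illustration.

```python
def speedup(years, doubling_months=18.0):
    """Cumulative speedup if performance doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


print(f"{speedup(20):,.0f}x after 20 years")
```

Under that assumption, 20 years of doubling works out to roughly a ten-thousand-fold improvement, all of which arrived without programmers changing their code. That is the contrast with today's multi-core and accelerator hardware, where the gains have to be earned in software.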
You go back 10 years ago, and we entered in this world where suddenly we had multi-core CPUs, and we had GPUs. And if you squint and turn your head, what a GPU is, it's just a many-core, very simple CPU thing, kind of, right? And so, and 10 years ago, it was CPUs and GPUs and graphics.
你回到10年前,我们进入了一个新的世界,在这个世界里,我们突然拥有了多核心CPU和GPU。如果你眯起眼睛、扭头看一看,你会发现GPU就是一个拥有很多核心、非常简单的CPU。就像那样。而且,在10年前,主要的技术是CPU、GPU和图形处理。
Today, we have CPUs, GPUs, graphics, and AI, because it's so important, because the compute is so demanding, because of the smart cameras and the watches and all the different places the AI needs to work in our lives, it's caused this explosion of hardware.
现在我们拥有了CPU、GPU、图形和人工智能,这是非常重要的,因为计算需求非常高,智能摄像头、手表和所有在我们生活中需要人工智能工作的不同场所,导致了硬件的爆炸。
And so, part of my thesis, part of my belief of where computing goes, if you look out 10 years from now, is it's not going to get simpler. Physics isn't going back to where we came from. It's only going to get weirder from here on out, right? And so, to me, the exciting part about what we're building is it's about building that universal platform, on which the world can continue to get weird, because again, I don't think it's avoidable, it's physics, but we can help lift people, scale, do things with it, and they don't have to rewrite their code every time a new device comes out.
因此,在我的论文中,我相信计算的发展方向,如果你往后10年看,它不会变得更加简单。物理学不会回到原点。从现在开始它只会变得更加诡异,对吧?因此,对我来说,我们正在建立的令人激动的部分,是建立这个通用平台,让世界能够继续变得更加诡异,因为再一次地,我认为它是不可避免的,这是物理学,但我们可以帮助人们提升他们的规模,使用它来完成各种事情,而且他们不必每次推出一个新设备就重写他们的代码。
And I think that's pretty cool. And so, if Mojo can help with that problem, then I think that it will be hopefully quite interesting, quite useful to a wide range of people, because there's so much potential, and maybe analog computers will become a thing, or something, right? And we need to be able to get into a mode where we can move this programming model forward, but do so in a way where we're lifting people and growing them instead of forcing them to rewrite all their code and exploding them.
我认为这非常酷。如果Mojo能够解决这个问题,那么我认为它将对许多人来说非常有趣、有用,因为它有很大的潜力,也许模拟计算机将成为一种事物。我们需要能够进入一种模式,推动这种编程模型向前发展,但是要以一种让人们成长的方式来做,而不是强迫他们重写所有代码并让他们崩溃。
Do you think there'll be a few major libraries that go Mojo first? Well, so I mean, the Modular engine is all Mojo. So again, come back to it: we're not building Mojo because it's fun. We're building Mojo because we had to, to solve these accelerators. That's the origin story. But I mean, ones that are currently in Python.
你认为会有几个主要的图书馆采用 Mojo 的方式吗?所以,我的意思是,这个模块化引擎完全是 Mojo。所以我不能回来用它了。我们不是因为好玩而建立 Mojo。我们建立 Mojo 是因为必须解散这些加速器。这就是源头故事。但是我是指当前使用 Python 的图书馆。
Yeah, so I think that a number of these projects will. And so one of the things, again, this is just my best guess. Each of the package maintainers also has, I'm sure, plenty of other things going on. People really don't like rewriting code just for sake of rewriting code. But sometimes people are excited about adopting a new idea.
嗯, 我觉得这些项目中的许多都会。并且其中一件事,再次强调一下,这只是我的最佳猜测。每个软件包维护者肯定也会有很多其他事情要做。人们真的不喜欢仅仅为了重新编写代码而重新编写代码。但是有时人们会对采用新想法感到兴奋。
It turns out that while rewriting code is generally not people's first thing, it turns out that redesigning something while you rewrite it and using a rewrite as an excuse to redesign can lead to the 2.0 of your thing that's way better than the 1.0. And so I have no idea. I can't predict that.
事实证明,尽管重写代码通常不是人们的首选做法,但事实证明,在重写过程中重新设计某些东西,并利用重写作为借口进行重新设计,可以得到比1.0更好的2.0版本。所以我不知道。我无法预测这个结果。
But there's a lot of these places where, again, if you have a package that is half C and half Python, just solve the pain, make it easier to move things faster, make it easier to debug and evolve your tech, adopting Mojo makes sense to start with. And then it gives you this opportunity to rethink these things.
然而,有很多这样的地方,如果你的程序包含了一半的C和一半的Python,就需要解决这些痛点,让东西更快地移动,更容易调试和发展你的技术。采用Mojo是明智的开始,它给你重新思考这些问题的机会。
So the two big gains are that there's a performance gain. And then there's the portability to all kinds of different devices. And there's safety. So you talk about real types. I mean, not saying this is for everybody, but that's actually a pretty big thing. Yeah, types are.
因此,这两个重要的收益是性能提升和可移植性到各种不同的设备上。还有安全性。所以你要谈论真实类型。我的意思不是这适用于每个人,但这其实是非常重要的。是的,类型非常重要。
And so there's a bunch of different aspects of what value Mojo provides. And so I mean, it's funny for me. Like I've been working on these kinds of technologies and tools for too many years now. But you look at Swift. And we talked about Swift for TensorFlow, but Swift as a programming language.
Mojo提供了许多不同的方面价值。对我来说很有趣,因为我已经从事这种技术和工具的工作太多年了。但是你看看Swift。我们谈论过Swift for TensorFlow,但Swift作为一种编程语言。
Swift's now 13 years old from when I started it. Yeah. Because I started in 2010, if I remember. And so that project, and I was involved with it for 12 years or something, right? That project has gone through its own really interesting story arc, right? And it's a mature, successful, used by millions of people's system, right? Certainly not dead yet, right?
Swift现在已经13岁了,从我开始使用它算起。如果我没记错的话,我是在2010年开始使用的。这个项目我参与了大约12年左右,经历了自己很有意思的发展史。现在它已经成熟、成功了,并被数百万人使用,绝对没有死亡的迹象。
But also, going through that story arc, I learned a tremendous amount about building languages, about building compilers, about working with the community, and things like this. And so that experience, like, I'm helping channel and bring directly into Mojo. And other systems, same thing. Like apparently, building and iterating and evolving things.
通过这个故事,我学到了大量有关语言构建、编译器构建、与社群合作等方面的知识。因此,这种经历让我能够在 Mojo 和其他系统中提供帮助和引导。同样的,建立、迭代和发展事物也是如此。
And so you look at this LLVM thing I worked on 20 years ago. You look at MLIR, right? And so a lot of the lessons learned in LLVM got fed into MLIR. And I think that MLIR is a way better system than LLVM was. And Swift is a really good system. And it's amazing. But I hope that Mojo will take the next step forward in terms of design.
因此,你看看我20年前开发的这个LVM的东西。你看看MLIR,对吧?因此,LVM中学到的很多教训都被提供给了MLIR。我认为,MLIR比LLVM系统好得多。Swift是一个非常好的系统,令人惊叹。但我希望Mojo在设计方面能够迈出下一步。
In terms of running Mojo, people can play with it. What's Mojo playground? Yeah. And from the interface perspective and from the hardware perspective, what's this incredible thing running on? Yeah, so right now, so here we are two weeks after launch. We decided that, OK, we have this incredible set of technology that we think might be good, but we have not given it to lots of people yet.
And so we were very conservative and said, let's put it in a notebook so that if it crashes, we can do something about it. We can monitor and track that. And so again, things are still super early, but we're having one person a minute sign up, with over 70,000 people. Two weeks in, it's kind of crazy. So you can sign up for the Mojo playground and you can use it in the cloud, in your browser. And so what's that running on? A notebook? Yeah, what that's running on is cloud VMs.
And so you share a machine with a bunch of other people. But it turns out there's a bunch of them now because there's a lot of people. And so what you're doing is you're getting free compute and you're getting to play with this thing in a limited, controlled way so that we can make sure that it doesn't totally crash and be embarrassing.
So now a lot of the feedback we've got is that people want to download it locally, so we're working on that right now. So that's the goal, to be able to download it locally? Yeah, that's what everybody expects. And so we're working on that right now. So we just want to make sure that we do it right.
And I think this is one of the lessons I learned from Swift also, by the way, is that when we launched Swift, gosh, it feels like forever ago, it was 2014. And it was super exciting. I and we, the team, had worked on Swift for a number of years in secrecy. And four years into this development, roughly, of work on this thing, at that point, about 250 people at Apple knew about it. So it was secret.
Apple's good at secrecy, and it was a secret project. And so we launched this at WWDC, with a bunch of hoopla and excitement, and said, developers, you're going to be able to develop and submit apps to the App Store in three months.
Several interesting things happened, right? So first of all, we learned that, hey, it had a lot of bugs. It was not actually production quality. And it was extremely stressful in terms of trying to get it working for a bunch of people. And so what happened was we went from zero to, I don't know how many developers Apple had at the time, a lot of developers overnight. And they ran into a lot of bugs, and it was really embarrassing. And it was very stressful for everybody involved. It was also very exciting, because everybody was excited about that.
The other thing I learned is that when that happened, roughly every software engineer who did not know about the project at Apple, their head exploded when it was launched, because they didn't know it was coming. And so they're like, wait, what is this? I signed up to work for Apple because I love Objective-C. Why is there a new thing, right? And so now what that meant, practically, is that the push from launch to, first of all, the fall, but then to 2.0 and 3.0 and all the way forward, was super painful for the engineering team and myself.
It was very stressful. The developer community was very grumpy about it, because they're like, OK, well, wait a second. You're changing and breaking my code. And we have to fix the bugs. And it was just a lot of tension and friction on all sides. There's a lot of technical debt in the compiler, because we have to run really fast. And you have to go implement the thing and unblock the use case and do the thing. And you know it's not right, but you never have time to go back and do it, right?
And I'm very proud of the Swift team, because they've come, I mean, we, but they came so far and made so much progress over this time since launch. It's pretty incredible. And Swift is a very, very good thing. But I just don't want to do that again, right? And so I'll iterate more through the development process. And so what we're doing is we're not launching it when it's hopefully 0.9 with no testers.
We're launching it and saying it's 0.1. And so we're setting expectations, saying, OK, well, don't use this for production. If you're interested in what we're doing, we'll do it in an open way. And we can do it together. But don't use it in production yet. Like, we'll get there. But let's do it the right way. And I'm also saying we're not in a race. The thing that I want to do is build the world's best thing. Yeah. Right? Because if you do it right and it lifts the industry, it doesn't matter if it takes an extra two months. Yeah. Like two months is worth waiting.
And so doing it right and not being overwhelmed with technical debt and things like this is like, again, war wounds, lessons learned, whatever you want to say, I think is absolutely the right thing to do. Even though right now, people are very frustrated that you can't download it or it doesn't have feature X or something like this.
What have you learned in a little bit of time since it's been released into the wild that people have been complaining about, feature X or Y or Z? What have they been complaining about? What have they been excited about? Like, almost like detailed things versus a big vision. I think everyone would be very excited about the big vision.
Yeah. Yeah. Well, so I mean, I've been very pleased. In fact, I mean, we've been massively overwhelmed with response, which is a good problem to have. It's kind of like a success disaster, in a sense, right? And so I mean, if you go back in time, when we started Modular, which is just not yet a year and a half ago, so it's still a pretty new company, new team, small but very good team of people.
Like, we started with extreme conviction that there is a set of problems that we need to solve. And if we solve it, then people will be interested in what we're doing, right? But again, you're building in basically secret, right? You're trying to figure it out. The creation's a messy process. You're having to go through a different path and understand what you want to do and how to explain it. Often when you're doing disruptive and new kinds of things, just knowing how to explain it is super difficult, right?
And so when we launched, we hoped people would be excited. But I'm an optimist, but I'm also like, don't want to get ahead of myself. And so when people found out about Modular, I think their heads exploded a little bit, right? And here's, I think, a pretty credible team that has built some languages and some tools before. And so they have some lessons learned and are tackling some of the deep problems in the Python ecosystem and giving it the love and attention that it should be getting. And I think people got very excited about that.
And so if you look at that, I mean, I think people are excited about ownership and taking a step beyond Rust, right? And there's people that are very excited about that. There's people that are excited about, you know, just like, I made Game of Life go 400 times faster, right? And things like that. And that's really cool. There are people that are really excited about the, okay, I really hate writing stuff in C++. Save me. Like systems engineers, they're stepping up like, yeah. So that's me, by the way. Also, I really want to stop writing C++.
But the, I get third person excitement. When people tweet, yeah, I made this code, Game of Life or whatever, it's faster. And you're like, yeah. Yeah, and also like, well, I would also say that, let me cast blame out to people who deserve it. Sure. These terrible people who convinced me to do some of this. Yes. Jeremy Howard. Yes. That guy.
Well, he's been pushing for this kind of thing. He's been pushing for more. He's been pushing for more. He's been pushing for more. Yeah, he's wanted this for a long, long time. He's wanted this for years. And so far, he's like one of the most legit people in the machine learning community. He's a grassroots. He really teaches, he's an incredible educator, he's an incredible teacher, but also legit in terms of a machine learning engineer himself. Yeah, so.
He's been running fast.ai and looking, I think, for exactly what you've done. Exactly. And so, I mean, the first time, so I met Jeremy pretty early on. But the first time I sat up and I'm like, this guy is ridiculous, is when I was at Google and we were bringing up TPUs and we had a whole team of people and there was this competition called DAWNBench of who can train ImageNet fastest. Yes. And Jeremy and one of his researchers crushed Google, not through sheer force of the amazing amount of compute and the number of TPUs and stuff like that, but he just decided that progressive image resizing was the right way to train the model. And you get through an epoch faster and make the whole thing go vroom, right? Yep. And I'm like, this guy is incredible.
So you can say, anyways, come back to, you know, where's Mojo coming from? Chris finally listened to Jeremy. It's all his fault.
Well, there's a kind of very refreshing, pragmatic view that he has about machine learning that, I don't know if it's like this mix of desire for efficiency but ultimately grounded in desire to make machine learning more accessible to a lot of people. I don't know what that is. I guess that's coupled with efficiency and performance. But it's not just obsessed about performance.
So a lot of AI and AI research ends up being that it has to go fast enough to get scale. So a lot of people don't actually care about performance, particularly on the research side, until it allows them to have a bigger data set. And so suddenly now you care about distributed compute and all these exotic HPCs. Like you don't actually want to know about that. You just want to be able to do more experiments faster and do so with bigger data sets, right?
And so Jeremy has been really pushing the limits. And one of the things I'll say about Jeremy, and there's many things I could say about Jeremy because I'm a fanboy of his, but he fits it in his head. And Jeremy actually takes the time, where many people don't, to really dive deep into why is the beta parameter of the Adam optimizer equal to this, right? And he'll go survey and understand what are all the activation functions and the trade-offs, and why is it that everybody that does this model picks that thing. So the why, not just trying different values, like really what is going on here. Right. And so as a consequence of that, like he's always, again, he makes time, but he spends time to understand things at a depth that a lot of people don't. And as you say, he then brings it and teaches people. And his mission is to help lift, you know, his website says, making AI uncool again. Like, it's about, like, forget about the hype. It's actually practical and useful. Let's teach people how to do this, right?
Now the problem Jeremy struggled with is that he's pushing the envelope, right? Research isn't about doing the thing that is staying on the happy path or the well-paved road, right? And so a lot of the systems today have been these really fragile, fragmented things or special case in this happy path. And if you fall off the happy path, you get eaten by an alligator.
So what about, so Python has this giant ecosystem of packages. And it's a package repository. Do you have ideas of how to do that well for Mojo? Yeah. How to do a repository of packages.
Well, so that's another really interesting problem that I knew about, but I didn't understand how big of a problem it was. Python packaging, a lot of people have very big pain points and a lot of scars with Python packaging. Oh, you mean, so there's several things to say. Building and distributing and managing dependencies and versioning and all this stuff. So from the perspective of if you want to create your own package. Yes. And then, or you want to build on top of a bunch of other people's packages and then they get updated and it's like this.
Now, I'm not an expert in this, so I don't know the answer. I think this is one of the reasons why it's great that we work as a team and there's other really good and smart people involved. But one of the things I've heard from smart people who've done a lot of this is that the packaging becomes a huge disaster when you get Python and C together. And so if you have this problem where you have code split between Python and C, now not only do you have to package the C code, you have to build the C code. C doesn't have a package manager. C doesn't have a dependency versioning management system. And so I'm not experienced in the state of the art and all the different Python package managers, but my understanding is that's a massive part of the problem. And I think Mojo solves that part of the problem directly, head on.
Now, one of the things I think we'll do with the community, and this isn't, again, we're not solving all the world's problems at once. We have to be kind of focused to start with. But I think that we will have an opportunity to reevaluate packaging. And so I think that we can come back and say, OK, well, given the new tools and technologies and the cool things we have that we've built up, because we have not just syntax, we have an entirely new compiler stack that works in a new way. Maybe there's other innovations we can bring together, and maybe we can help solve that problem.
So, almost tangent to that question, from the user perspective of packages. It was always surprising to me that it was not easier to sort of explore and find packages with pip install. And it just feels, it's an incredible ecosystem. It's just interesting that it wasn't made, still, I think, not made easier to discover packages, to do, like, search and discovery, as YouTube calls it.
Well, I mean, it's kind of funny because this is one of the challenges of these intentionally decentralized communities. And so I don't know what the right answer is for Python. I mean, there are many people that were, I don't even know the right answer for Mojo. So there are many people that would have much more informed opinions than I do. But it's interesting if you look at this, open source communities.
There's Git. Git is fully decentralized, and they could do it any way they want. But then there's GitHub. And GitHub, a centralized, commercial thing in that case, really helped pull together and helped solve some of the discovery problems and helped build a more consistent community. And so maybe there's opportunities for something like a GitHub for me. Although even GitHub, I might be wrong on this, but the search and discovery for GitHub is not that great. Like I still use Google search.
Yeah, well, I mean, maybe that's because GitHub doesn't want to replace Google search. And I think there is room for specialized solutions to specific problems. I don't know. I don't know the right answer for GitHub either. They can go figure that out. But the point is to have an interface that's usable, that's accessible to people of all different skill levels.
Well, and again, what is the benefit of standards? Standards allow you to build these next-level-up ecosystems, next-level-up infrastructure, and next-level-up things. And so again, I come back to: I hate complexity. C plus Python is complicated. It makes everything more difficult to deal with. It makes it difficult to port, move code around, work with. All these things get more complicated. And so I mean, I'm not an expert, but maybe Mojo can help a little bit by helping reduce the amount of C in this ecosystem and make it therefore scale better. So the kinds of packages that are hybrid in nature would be a natural fit to move to Mojo. Which is a lot of them, by the way. A lot of them, especially the ones doing some interesting stuff, computation-wise.
Let me ask you about some features. Yeah. So we talked about, obviously, the indentation, that it's a typed language, or optionally typed. Is that the right way to say it? It's either optionally or progressively. Or aggressively. I think so. So people have very strong opinions on the right word to use. Yeah. I don't know. I look forward to your letters.
So there's the var versus let. But let is for constants. Var is an optional. Yeah, var makes it mutable, so you can reassign. OK. Then there's a function overloading. Oh, OK. Yeah. I mean, there's a lot of source of happiness for me. But function overloading, that's, I guess, is that for performance? Or is that, why does Python not have function overloading?
So I can speculate. So Python is a dynamic language. The way it works is that Python and Objective-C are actually very similar worlds if you ignore syntax. And so Objective-C is straight-line derived from Smalltalk, a really venerable, interesting language that much of the world has forgotten about, but the people that remember it love it, generally.
And the way that Smalltalk works is that every object has a dictionary in it. And the dictionary maps from the name of a function or the name of a value within an object to its implementation. And so the way you call a method in Objective-C is you say, go look up, the way I call foo is I go look up foo, I get a pointer to the function back, and then I call it. That's how Python works.
And so now the problem with that is that the dictionary within a Python object, all the keys are strings, and it's a dictionary. So you can only have one entry per name. You think it's as simple as that? I think it's as simple as that.
And so now, why do they never fix this? Like, why do they not change it to not be a dictionary? Why do they not change it? Like, I do other things. Well, you don't really have to in Python because it's dynamic. And so you can say, I get into the function.
Now if I got passed an integer, do some dynamic test for it. If it's a string, go do another thing. There's another additional challenge, which is even if you did support overloading, you're saying, okay, well, here's a version of a function for integers and a function for strings. Well, you'd have, even if you could put it in that dictionary, you'd have to have the caller do the dispatch. And so every time you call the function, you'd have to say, like, is it an integer or a string? And so you'd have to figure out where to do that test. And so in a dynamic language, overloading is something you don't have to have.
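The dictionary model described above can be sketched in a few lines of Python. This is only an illustration of the idea as discussed in the conversation, not a description of CPython internals; the names `Greeter`, `greet`, and `describe` are made up for the example.

```python
class Greeter:
    def greet(self, x):
        # First definition, intended for integers.
        return f"int: {x}"

    def greet(self, x):
        # Same name again: this replaces the entry above instead of adding an overload.
        return f"str: {x}"


# The class namespace is a dictionary keyed by name, so only one "greet" survives.
greet_entries = [name for name in vars(Greeter) if name == "greet"]

# Calling a method is roughly: look the name up, get a function back, call it.
g = Greeter()
method = getattr(g, "greet")
result = method(42)  # the second definition runs, even for an integer


def describe(x):
    # Without overloading, the dispatch lives inside the function as dynamic tests.
    if isinstance(x, int):
        return "integer"
    if isinstance(x, str):
        return "string"
    return "something else"
```

Here `result` is `"str: 42"`, because the second `greet` silently replaced the first, and `describe` shows the dynamic test that stands in for overloading.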
So, but now you get into a typed language. And in Python, if you subscript with an integer, then you get typically one element out of a collection. If you subscript with a range, you get a different thing out. And so often, in typed languages, you'll want to be able to express the fact that, cool, I have different behavior depending on what I actually pass into this thing. If you can model that, it can make it safer and more predictable and faster and like all these things. It somehow feels safer, yes, but also feels empowering.
In terms of clarity, like you don't have to design whole different functions. Yeah, well, this is also one of the challenges with the existing Python typing systems, is that in practice, like you take subscript, like in practice, a lot of these functions, they don't have one signature. They actually have different behavior in different cases. And so this is why it's difficult to retrofit this into existing Python code and make it play well with typing. You kind of have to design for that.
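As a rough illustration of the subscript case, here is a small Python sketch where one `__getitem__` has two behaviors, and `typing.overload` is used to describe both signatures to a type checker. The `Row` class is hypothetical, and the "range" mentioned in the conversation is assumed to correspond to a Python `slice`.

```python
from typing import List, Union, overload


class Row:
    """A tiny hypothetical container used only for this illustration."""

    def __init__(self, values: List[float]) -> None:
        self._values = list(values)

    # These two stubs exist only for type checkers; they document the two signatures.
    @overload
    def __getitem__(self, index: int) -> float: ...
    @overload
    def __getitem__(self, index: slice) -> "Row": ...

    # At runtime there is still a single function that tests what it was given.
    def __getitem__(self, index: Union[int, slice]) -> Union[float, "Row"]:
        if isinstance(index, slice):
            return Row(self._values[index])
        return self._values[index]

    def __len__(self) -> int:
        return len(self._values)


row = Row([1.0, 2.0, 3.0, 4.0])
one = row[1]     # integer subscript: a single element
part = row[1:3]  # slice subscript: a new Row
```

The annotations describe two signatures, but the dispatch is still a runtime `isinstance` test inside one function, which is the retrofit difficulty being discussed.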
OK, so there's an interesting distinction that people that program Python might be interested in, which is def versus fn. So it's two different ways to define a function. And fn is a stricter version of def. What's the coolness that comes from the strictness? So here you get into what is the trade-off with a superset?
Yes. OK, so a superset, you really want to be compatible. Like if you're doing a superset, you've decided compatibility with existing code is the important thing. Even if some of the decisions they made were maybe not what you'd choose. Yeah, OK. So that means you put a lot of time in compatibility. And it means that you get locked into decisions in the past, even if they may not have been a good thing.
Now, systems programmers typically like to control things. And they want to make sure that, well, not in all cases, of course. And even systems programmers are not one thing. But often you want predictability. And so one of the things that Python has, for example, as you know, is that if you define a variable, you just say x equals 4. I have a variable named x. Now I say some_long_name equals 17.
Print out some_long_name. Oops, but I typoed it. Well, the compiler, the Python compiler, doesn't know in all cases what you're defining and what you're using. And did you typo the use of it or the definition? And so for people coming from typed languages, again, I'm not saying they're right or wrong, but that drives them crazy because they want the compiler to tell them, you typoed the name of this thing. And so what fn does is it turns on, as you say, it's a strict mode.
And so it says, OK, well, you have to actually intentionally declare your variables before you use them. That gives you more predictability, more error checking and things like this. But you don't have to use it. And this is a way that Mojo is both compatible, because defs work the same way that defs have always worked. But it provides a new alternative that gives you more control. And it allows certain kinds of people that have a different philosophy to be able to express that and get that.
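The typo problem described above is easy to reproduce in plain Python. The sketch below uses made-up function names and shows both variants: a typo in a definition silently creates a new variable, and a typo in a use only fails when that line actually runs.

```python
def total_with_typo():
    some_long_name = 17
    # Typo on the left-hand side: Python happily creates a brand-new variable.
    some_lnog_name = some_long_name + 1
    # The intended update never happened, so this still returns 17.
    return some_long_name


def use_with_typo():
    some_long_name = 17
    # Typo in a use: nothing complains until this line is executed,
    # at which point Python raises NameError.
    return some_lnog_name
```

Calling `total_with_typo()` returns `17` without any warning, and `use_with_typo()` raises `NameError` only at call time. A mode that requires variables to be declared before use, as described for `fn`, is meant to surface both mistakes earlier.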
But usually, if you're writing Mojo code from scratch, you'll be using fn. It depends, again, it depends on your mentality. It's not that def is Python and fn is Mojo. Mojo has both, and it loves both. It really depends on. It depends just straight. Yeah, exactly. Are you playing around and scripting something out? Is it a one-off throwaway script? Cool. Python is great at that. I will still be using fn.
Yeah. Well, so I love strictness. OK, well, so control power. You also like suffering, right? Yes. Go hand in hand. How many pull-ups? I've lost count at this point. So, and that's cool. I love you for that. Yeah. And I love other people who like strict things, right? But I don't want to say that that's the right thing because Python's also very beautiful for hacking around and doing stuff and research and these other cases where you may not want that.
You see, I just feel like maybe I'm wrong with that, but it feels like strictness leads to faster debugging. So in terms of going from even on a small project, from zero to completion, it just, I guess it depends how many bugs you generally, usually. Yeah. Well, so I mean, it's, again, lessons learned in looking at the ecosystem. It's really, I mean, I think it's, if you study some of these languages over time, like the Ruby community, for example.
Now, Ruby is a pretty well-developed, pretty established community, but along their path, they really invested in unit testing. So I think that the Ruby community has really pushed forward the state of the art of testing because they didn't have a type system that caught a lot of bugs at compile time. And so you can have the best of both worlds.
You can have good testing and good types and things like this. But I thought that it was really interesting to see how certain challenges get solved. And in Python, for example, the interactive notebook kind of experiences and stuff like this are really amazing. And if you typo something, it doesn't matter. It just tells you. It's fine. Right.
And so I think that the trade-offs are very different if you're building a large-scale production system versus you're building and exploring in a notebook. And speaking of control, the hilarious thing, if you look at code I write just for myself for fun, it's like littered with asserts everywhere.
Yeah. It's kind of, yeah, you like asserts. It's basically saying in a dictatorial way, this should be true now. Otherwise, everything stops. Well, and that is the sign. Again, I love you, man. That is a sign of somebody who likes control. Yeah.
And so, yes, I think that you'll like fn. I think you'll like Mojo. Therapy session, yes. I definitely will. Speaking of asserts, exceptions are called errors. Why is it called errors?
So I mean, we use the same, we're the same as Python. But we implement it a very different way. And so if you look at other languages, like, we'll pick on C++, our favorite, right? C++ has a thing called zero-cost exception handling.
OK. And this is, in my opinion, something to learn lessons from. It's a nice play with thing. And so zero-cost exception handling, the way it works, is that it's called zero cost because if you don't throw an exception, there's supposed to be no overhead for the non-error code. And so it takes the error path out of the common path.
It does this by making throwing an error extremely expensive. And so if you actually throw an error with a C++ compiler using exceptions, it has to go look up in tables on the side and do all this stuff. And so throwing an error could be like 10,000 times more expensive than returning from a function.
Also, it's called zero cost exceptions, but it's not zero cost by any stretch of the imagination because it massively blows out your code, your binary. It also adds a whole bunch of different paths because of destructors and other things like that that exist in C++. And it reduces the number of optimizations. It has all these effects.
And so this thing that was called zero cost exceptions really ain't.
OK. Now if you fast forward to newer languages, and this includes Swift and Rust and Go, and now Mojo. Well, in Python, it's a little bit different because it's interpreted.
And so it's got a little bit of a different thing going on. But if you look at it, if you look at compiled languages, many newer languages say, OK, well, let's not do that zero cost exception handling thing.
Let's actually treat throwing an error the same as returning a variant, returning either the normal result or an error.
Now, programmers generally don't want to deal with all the typing machinery and pushing around a variant. And so you use all the syntax that Python gives us.
For example, try and catch, functions that raise, and things like this. You can put a raises decorator on your function, stuff like this, if you want to control that. And then the language can provide syntax for it.
But under the hood, the way the computer executes it, throwing an error is basically as fast as returning something. So it's exactly the same from a compiler perspective.
And so this is actually, I mean, it's a fairly nerdy thing, right? Which is why I love it. But this has a huge impact on the way you design your APIs.
So in C++, huge communities turn off exceptions, because the cost is just so high. And so the zero cost cost is so high.
And so that means you can't actually use exceptions in many libraries. And even for the people that do use it, well, OK, how and when do you want to pay the cost? If I try to open a file, should I throw an error?
Well, what if I'm probing around looking for something, and I'm looking up in many different paths? Well, if it's really slow to do that, maybe I'll add another function that doesn't throw an error, returns an error code instead.
And I have two different versions of the same thing. And so it causes you to fork your APIs.
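The API fork described here can be sketched in Python. This is only an illustration: the function names `read_text`, `try_read_text`, and `first_existing` are hypothetical, and Python itself does not have the C++ cost problem. The point is the shape of the duplication: one entry point that raises, and a twin that reports failure through its return value so that probing many paths never throws.

```python
import os
from typing import Iterable, Optional


def read_text(path: str) -> str:
    """Raising version: a missing file surfaces as FileNotFoundError."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


def try_read_text(path: str) -> Optional[str]:
    """Non-raising twin: a missing file is reported as None instead."""
    if not os.path.isfile(path):
        return None
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


def first_existing(paths: Iterable[str]) -> Optional[str]:
    """Probe several candidate paths without raising once per miss."""
    for path in paths:
        text = try_read_text(path)
        if text is not None:
            return text
    return None
```

Both functions do the same job, which is exactly the duplication being criticized: callers now have to know which of the two to pick.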
And so one of the things I learned from Apple, and I still love, is that the art of API design is actually really profound. I think this is something that Python's also done a pretty good job at in terms of building out this large scale package ecosystem.
It's about having standards and things like this. And so we wouldn't want to enter a mode where there's this theoretical feature that exists in the language, but people don't use it in practice.
Now I'll also say one of the other really cool things about this implementation approach is that it can run on GPUs and it can run on accelerators and things like this.
And that standard zero cost exception thing would never work on an accelerator. And so this is also part of how Mojo can scale all the way down to little embedded systems and to run on GPUs and things like that.
Can you actually say, maybe, is there some high level way to describe the challenge of exceptions and how they work in code during compilation?
So it's just this idea of percolating up a thing, an error. Yeah. Yeah, so the way to think about it is think about a function that doesn't return anything, just as a simple case, right?
And so you have function one calls function two calls function three calls function four, and along that call stack there are try blocks, right?
And so if you have function one calls function two, function two has a try block, and then within it it calls function three, right?
Well, what happens if function three throws? Well, actually start simpler. What happens if it returns? Well, if it returns, it's supposed to go back out and continue executing and then fall off the bottom of the try block and keep going and it all is good.
If the function throws, you're supposed to exit the current function and then get into the except clause, right? And then do whatever code is there and then keep going on.
And so the way that a compiler like Mojo works is that the call to that function, which happens in the try block, calls the function.
And then instead of returning nothing, it actually returns a variant between nothing and an error. And so if you return normally, fall off the bottom or do return, you return nothing.
And if you throw an error, you return the variant that is, I'm an error, right?
So when you get to the call, you say, OK, cool. I called a function. Hey, I know locally I'm in a try block, right? And so I call the function and then I check to see what it returns.
Aha, if it's that error thing, jump to the except block. And that's all done for you behind the scenes. Exactly.
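The lowering described above can be written out by hand in Python. This is a sketch of the idea only, not Mojo's actual generated code: the `Error` class and the three function names are hypothetical, and `Optional[Error]` stands in for the "variant between nothing and an error".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Error:
    message: str


def function_three(fail: bool) -> Optional[Error]:
    # Lowered form of a function that returns nothing but may raise.
    if fail:
        return Error("function three failed")  # stands in for `raise`
    return None  # normal return: the "nothing" case of the variant


def function_two(fail: bool, log: List[str]) -> None:
    # Hand-lowered version of:
    #     try:
    #         function_three(fail)
    #         log.append("after call")
    #     except Error as e:
    #         log.append("handled: " + e.message)
    #     log.append("after try")
    result = function_three(fail)
    if result is not None:
        # "If it's that error thing, jump to the except block."
        log.append("handled: " + result.message)
    else:
        log.append("after call")  # rest of the try block
    log.append("after try")


def function_one(fail: bool) -> List[str]:
    log: List[str] = []
    function_two(fail, log)
    return log
```

Under these assumptions `function_one(False)` records the normal path and `function_one(True)` records the handler path, and in both cases the error check is an ordinary branch on a return value rather than a table lookup.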
And so the compiler does all this for you. And I mean, one of the things, if you dig into how this stuff works in Python, it gets a little bit more complicated because you have finally blocks, which now you need to go into, do some stuff.
And then those can also throw and return. Wait, what? Nested. And this stuff matters for compatibility. There's with clauses. And so with clauses are kind of like finally blocks with some special stuff going on.
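The remark that with clauses are "kind of like finally blocks" can be illustrated in plain Python. The `Resource` class and both function names below are made up for the illustration; the two functions are written so that they record the same sequence of events.

```python
from typing import List


class Resource:
    """Minimal context manager that records when it is entered and exited."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    def __enter__(self) -> "Resource":
        self.log.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.log.append("exit")
        return False  # do not swallow the exception


def with_version() -> List[str]:
    log: List[str] = []
    try:
        with Resource(log):
            raise ValueError("boom")
    except ValueError:
        log.append("caught")
    return log


def finally_version() -> List[str]:
    log: List[str] = []
    try:
        log.append("enter")
        try:
            raise ValueError("boom")
        finally:
            log.append("exit")  # cleanup runs even though an error is in flight
    except ValueError:
        log.append("caught")
    return log
```

The "special stuff" is that `__exit__` also receives the exception details and may suppress the error by returning a true value, which a bare finally block cannot do.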
And so there's, nesting in general, nesting of anything, nesting of functions should be illegal. Well, it just feels like it adds a level of complexity. Lex, I'm merely an implementer. And so this is again one of the trade-offs you get when you decide to build a superset is you get to implement a full fidelity implementation of the thing that you decided is good.
And so yeah, I mean, we can complain about the reality of the world and shake our fist. It always feels like you shouldn't be allowed to do that, like to declare functions inside functions. Oh, wait, wait, wait. What happened to Lex, the LISP guy? No, I understand that. But LISP is what I used to do in college. And now I've grown up. We've all done things in college. I love LISP. I love LISP. OK, yeah, I was going to say you're afraid of me irritating the whole internet. I love LISP. It worked as a joke in my head.
And yeah, come on. So nested functions are, joking aside, actually really great for certain things, right? And so these are also called closures. Closures are pretty cool. And you can pass callbacks. There's a lot of good patterns. So speaking of which, I don't think you have nested functions implemented yet in Mojo. We don't have lambda syntax, but we do have nested functions.
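For readers who have not met the pattern, here is a small Python sketch of a nested function acting as a closure and being passed as a callback. The names `make_counter` and `apply_twice` are invented for this example, and it deliberately uses a nested `def` rather than lambda syntax.

```python
from typing import Callable


def make_counter(step: int) -> Callable[[], int]:
    """Return a nested function that remembers `count` between calls."""
    count = 0

    def increment() -> int:
        nonlocal count  # the closure captures and updates the outer variable
        count += step
        return count

    return increment


def apply_twice(callback: Callable[[int], int], value: int) -> int:
    """Accept a function as an argument and call it twice."""
    return callback(callback(value))


def add_three(value: int) -> int:
    return value + 3
```

Each call to `make_counter` produces an independent counter, because every closure captures its own `count`.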
So there's a few things on the roadmap that you have that would be cool to just fly through. Because it's interesting to see how many features there are in a language, small and big, that have to be implemented. So first of all, there's tuple support. And that has to do with some very specific aspect of it. Like whether there are parentheses or not. Yeah, this is just a totally syntactic thing. That's a syntactic thing. OK, but it's cool still.
So keyword arguments in functions. Yeah, so this is where in Python you can say, call a function with x equals 4. Yeah. And x is the name of the argument. That's a nice sort of documenting, self-documenting feature. Yeah, I mean, again, this isn't rocket science to implement, it's just the logic. It's just on the list. The bigger features are things like traits.
So traits are when you want to define abstractions. So when you get into typed languages, you need the ability to write generics. And so you want to say, I want to write this function. And now I want it to work on all things that are arithmetic-like. Well, what does arithmetic-like mean? Well, arithmetic-like is a categorization of a bunch of types. And so, again, you can define it in many different ways. And I'm not going to go into ring theory or something. But you can say it's arithmetic-like if you can add, subtract, multiply, divide it, for example.
And so what you're saying is there's a set of traits that apply to a broad variety of types. And so all these types are arithmetic-like, all these tensors and floating point and integer. And there's this category of types. And then I can define on an orthogonal axis algorithms that then work against types that have those properties. And so this is, again, a widely known thing. It's been implemented in Swift and Rust and many languages. And Haskell, which is where everybody learns their tricks from. But we need to implement that and that'll enable a new level of expressivity.
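Since Mojo's trait syntax is not shown in the conversation, the idea can be approximated with Python's `typing.Protocol` as a stand-in. The `ArithmeticLike` protocol and the `dot` function are hypothetical names for this sketch: the protocol names the operations a type must support, and the algorithm is written once against that category of types.

```python
from typing import Any, Protocol, Sequence, TypeVar


class ArithmeticLike(Protocol):
    """Stand-in for a trait: anything that can be added and multiplied."""

    def __add__(self, other: Any) -> Any: ...

    def __mul__(self, other: Any) -> Any: ...


T = TypeVar("T", bound=ArithmeticLike)


def dot(xs: Sequence[T], ys: Sequence[T]) -> T:
    """Dot product written once for every arithmetic-like element type."""
    if not xs or len(xs) != len(ys):
        raise ValueError("sequences must be non-empty and of equal length")
    total = xs[0] * ys[0]
    for x, y in zip(xs[1:], ys[1:]):
        total = total + x * y
    return total
```

The same `dot` works on integers, floats, or `fractions.Fraction` values because each of those types satisfies the protocol; a static type checker, rather than the runtime, is what enforces the bound in Python.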
So classes? Yeah, classes are a big deal. It's a big deal still to be implemented. Like you said, lambda syntax. And there's detailed stuff like whole module import, support for top level code and file scope, and then global variables also. So being able to have variables outside of a function, at the top level.
And so this comes back to where Mojo came from and the fact that this is 0.1. And so we're building, so Modular is building an AI stack. An AI stack has a bunch of problems working with hardware and writing high performance kernels and doing this kernel fusion thing I was talking about and getting the most out of the hardware. And so we've really prioritized and built Mojo to solve Modular's problems. Now, our North Star is to build out and support all the things. And so we're making incredible progress. By the way, Mojo is only like seven months old.
So that's another interesting thing. I mean, part of the reason I wanted to mention some of these things is there's a lot to do and it's pretty cool how, sometimes you take for granted how much there is in a programming language, how many cool features you kind of rely on. And this is kind of a nice reminder, when you lay it out as a to-do list. Yeah, and so, I mean, but also you look into it, it's amazing how much is also there.
And you take it for granted that a value, if you define it, it will get destroyed automatically. Like that little feature itself is actually really complicated, given the way the ownership system has to work. And the way that works within Mojo is a huge step forward from what Rust and Swift have done.
But can you say that again, when a value, when you define it, gets destroyed automatically? Yeah, so like say you have a string, right? So you define a string on the stack, okay? Or whatever that means, like in your local function, right? And so you say, whether it be in a def, you just say x equals hello world, right? Well, if your string type requires you to allocate memory, then once it's destroyed, you have to deallocate it. So in Python and Mojo, you define that with a __del__ method.
Where does that get run? Well, it gets run sometime between the last use of the value and the end of the program. Like in this, you now get into garbage collection, you get into like all these long debated, you talk about religions and trade offs and things like this. This is a hugely hotly contested world.
If you look at C++, the way this works is that, if you define a variable or a set of variables within a function, they get destroyed in a last in first out order. So it's like nesting, okay? This has a huge problem because if you define, you have a big scope and you define a whole bunch of values at the top, and then you use them, and then you do a whole bunch of code that doesn't use them, they don't get destroyed until the very end of that scope.
Right? And so this also destroys tail calls. So good functional programming, right? This has a bunch of different impacts on, you know, you talk about reference counting optimizations and things like this, a bunch of very low level things. And so what Mojo does is it has a different approach on that from any language I'm familiar with where it destroys them as soon as possible. And by doing that, you get better memory use, you get better predictability, you get tail calls that work, you get a bunch of other things, you get better ownership tracking.
There's a bunch of these very simple things that are very fundamental, that are already built in there in Mojo today, that are the things that nobody talks about generally, but when they don't work right, you find out and you have to complain about. Is it trivial to know what "as soon as possible" is, to delete a thing that's not gonna be used again? Yeah, well, I mean, it's generally trivial, it's after the last use of it. So if you define x as a string, and then you have some use of x somewhere in your code.
Within that scope, I mean, within the scope that's accessible. It's, yeah, exactly. So you can only use something within its scope. And so then it doesn't wait until the end of the scope to delete it, it destroys it after the last use. So there's kind of some very eager machine that's just sitting there and deleting. Yeah, and it's all in the compiler, so it's not at runtime, which is also cool.
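The difference between destroying a value at the end of its scope and destroying it right after its last use can be imitated in CPython. This is a rough analogy, not Mojo: the `Tracked` class and both function names are hypothetical, the explicit `del` plays the role of the destroy call that the Mojo compiler is described as inserting for you, and the exact timing relies on CPython's reference counting.

```python
from typing import List


class Tracked:
    """Value whose destructor records when it runs."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self.log = log

    def __del__(self) -> None:
        self.log.append("destroyed " + self.name)


def scope_end_destruction(log: List[str]) -> List[str]:
    x = Tracked("x", log)
    log.append("last use of " + x.name)
    log.append("lots of code that never touches x")
    return log  # x is only released as the function returns


def eager_destruction(log: List[str]) -> List[str]:
    x = Tracked("x", log)
    log.append("last use of " + x.name)
    del x  # CPython: reference count hits zero, __del__ runs here
    log.append("lots of code that never touches x")
    return log
```

In the first function the value stays alive through all the unrelated work; in the second its memory is given back before that work starts, which is the behavior being described for Mojo.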
And so, yeah, and this is actually non-trivial because you have control flow. Yeah, right, and so it gets complicated pretty quickly, and so getting this right was not... Oh, so you have to insert deletes in a lot of places. Potentially, yeah, exactly. So the compiler has to reason about this, and this is where, again, the experience of building languages and not getting this right comes in. So again, you get another chance to do it and you get basic things like this right. But it's extremely powerful when you do that.
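Why control flow makes this harder can be seen in a small CPython sketch, again with explicit `del` statements standing in for what a compiler would insert. The `Tracked` class and `process` function are hypothetical, and the timing depends on CPython's reference counting. The last use of `x` is at a different point on each path, so each path needs its own destroy point.

```python
from typing import List


class Tracked:
    """Value whose destructor records when it runs."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self.log = log

    def __del__(self) -> None:
        self.log.append("destroyed " + self.name)


def process(use_it: bool, log: List[str]) -> List[str]:
    x = Tracked("x", log)
    if use_it:
        log.append("using " + x.name)  # last use on this path
        del x                          # destroy point for the if branch
        log.append("rest of the if branch")
    else:
        del x                          # never used here: destroy immediately
        log.append("rest of the else branch")
    log.append("after the branches")
    return log
```

With loops, early returns, and error paths added, the number of places that need such a destroy point grows quickly, which is the reasoning the compiler has to do on the programmer's behalf.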
And so there's a bunch of things like that that kind of combine together. And this comes back to the, you get a chance to do it the right way, do it the right way and make sure that every brick you put down is really good so that when you put more bricks on top of it, they stack up to something that's beautiful.
Well, there's also like how many design discussions, do there have to be about particular details like implementation of particular small features? Because the features that seem small, I bet some of them might be like really, require really big design decisions. Yeah.
Well, so, I mean, let me give you another example of this. Python has a feature called async/await. So it's a new feature, I mean, in the long arc of Python history, it's a relatively new feature, right? That allows way more expressive asynchronous programming. Okay. Again, Python's a beautiful thing and they did things that are great for Mojo for completely different reasons.
The reason that async/await got added to Python, as far as I know, is because Python doesn't support threads. Okay. And so Python doesn't support threads, but you wanna work with networking and other things like that that can block. I mean, Python does support threads, it's just not its strength. And so they added this feature called async/await. It's also seen in other languages like Swift and JavaScript and many other places as well.
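As a reminder of what the Python feature looks like, here is a minimal `asyncio` sketch in which two waits overlap on a single thread. The coroutine names `fetch` and `main` are made up, and `asyncio.sleep` stands in for a blocking network call.

```python
import asyncio
from typing import List, Tuple


async def fetch(name: str, delay: float, log: List[str]) -> str:
    log.append("start " + name)
    await asyncio.sleep(delay)  # yields control instead of blocking the thread
    log.append("done " + name)
    return name


async def main() -> Tuple[List[str], List[str]]:
    log: List[str] = []
    # Both coroutines are in flight at once; the slower one does not hold up
    # the faster one.
    results = await asyncio.gather(
        fetch("a", 0.05, log),
        fetch("b", 0.01, log),
    )
    return list(results), log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

`asyncio.gather` returns results in the order the coroutines were passed in, even though the second one finishes first.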
Async/await in Mojo is amazing. Because we have a high performance, heterogeneous compute runtime underneath the covers that then allows non-blocking IO so you get full use of your accelerator. That's huge. It turns out, it's actually a really important part of fully utilizing the machine. You talk about design discussions, that took a lot of discussions, right? And it probably will require more iteration.
And so, my philosophy with Mojo is that we have a small team of really good people that are pushing forward, and they're extremely good at deeply knowing how the compiler and runtime and all the low level stuff work together. But they're not perfect. Same thing as the Swift team, right? And this is one of the reasons we released Mojo much earlier: so we can get feedback. And we've already renamed a keyword due to community feedback.
We used an ampersand and now it's named inout. We're not renaming existing Python keywords because that breaks compatibility, right? We're renaming things we're adding and making sure that they are designed well. We get usage experience, we iterate and work with the community because again, if you scale something really fast and everybody writes all their code and they start using it in production, then it's impossible to change.
And so you wanna learn from people, you wanna iterate and work on that early on. And this is where design discussions are actually quite important. Could you incorporate an emoji into the language, into the main language? Like a... Good. Do you have a favorite one? Well, I really like, in terms of humor, like ROFL, whatever, rolling on the floor laughing. So what would be the use case for that? Like an exception, throw an exception of some sort.
What's your totally final feature request? Or maybe a hard one, it has to be a hard one. People have told me that I'm insane. So this is, I'm liking this. I'm gonna use the viral nature of the internet to actually get this passed. I mean, it's funny you come back to the flame emoji file extension, right? You know, we have the option to use the flame emoji, and just even that concept caused, for example, the people at GitHub to say, now I've seen everything. Yeah, there's something, it's kinda reinvigorating. It's like, oh, that's possible. That's really cool. For some reason that makes everything else really exciting. I think the world is ready for this stuff, right?
And so, you know, when we have a package manager, we'll clearly have to innovate by having the compiled package thing be the little box with the bow on it, right? I mean, it has to be done. It has to be done. Is there some stuff on the roadmap that you particularly stressed about or excited about that you're thinking about a lot? I mean, as of today's snapshot, which will be obsolete tomorrow, the lifetime stuff is really exciting.
And so, lifetimes give you safe references to memory without dangling pointers. And so this has been done in languages like Rust before, and so we have a new approach, which is really cool. I'm very excited about that. That'll be out to the community very soon. The traits feature is really a big deal. And so that's blocking a lot of API design. And so there's that. I think that's really exciting.
A lot of it is these kind of table stakes features. One of the things that is, again, also lessons learned with Swift is that programmers in general like to add syntactic sugar. And so it's like, oh, well, this annoying thing, like in Python, you have to spell it underbar underbar add (__add__), why can't I just use plus? def plus, come on. Why can't I just do that, right? And so it's a trivial bit of syntactic sugar, it makes sense. It's beautiful. It's obvious.
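The "underbar underbar add" spelling refers to Python's `__add__` special method, which is what the `+` operator calls. A minimal sketch, with `Vec2` as an invented example type:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec2:
    x: float
    y: float

    def __add__(self, other: object) -> "Vec2":
        # `a + b` is evaluated as `a.__add__(b)`.
        if not isinstance(other, Vec2):
            return NotImplemented  # lets Python try the other operand or raise
        return Vec2(self.x + other.x, self.y + other.y)
```

The sugar being discussed would let you write something like `def +` instead; the existing spelling is more verbose but already does the job, which is the argument for not prioritizing it.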
We're trying not to do that, for two different reasons. One is that, again, lesson learned with Swift: Swift has a lot of syntactic sugar, which may be a good thing, maybe not, I don't know. But because it's such an easy and addictive thing to do, sugar, like, makes your blood go crazy, right? The community will really dig into that and want to do a lot of that. And I think it's very distracting from building the core abstractions.
The second is we want to be a good member of the Python community, right? And so we want to work with the broader Python community. And yeah, we're pushing forward a bunch of systems programming features and we need to build them out to understand them. But once we get a long ways forward, I want to make sure that we go back to the Python community and say, okay, let's do some design reviews. Let's actually talk about this stuff. Let's figure out how we want this stuff all to work together. And syntactic sugar just makes all that more complicated. So. And yeah, list comprehensions still have to be implemented. And my favorite, I mean, dictionaries. Yeah, there's some basic stuff. 0.1. 0.1. But nonetheless, it's actually still quite interesting and useful.
As you mentioned, Modular is very new. Mojo is very new. It's a relatively small team. Yeah. You're building up this gigantic stack. It's an incredible stack that's going to perhaps define the future of development of our AI overlords. We just hope it will be useful. As do all of us.
So what have you learned from this process of building up a team? Maybe one question is, how do you hire? Yeah. Great programmers, great people that operate in this compiler, hardware, machine learning, software, interface design space. Yeah. And maybe they're a little bit fluid in what they can do. So, okay. So language design too. So building a company is just as interesting, in different ways, as building a language. Like different skill sets, different things, but super interesting. And I've built a lot of teams in a lot of different places. If you zoom in from the big problem into recruiting.
Well, so here's our problem. Okay, I'll just, I'll be very straightforward about this. We started Modular with a lot of conviction that we understand the problems. We understand the customer pain points. We need to work backwards from the suffering in the industry. And if we solve those problems, we think it'll be useful for people. But the problem is that the people we need to hire, as you say, are all these super specialized people that have jobs at big tech, big tech worlds, right? And, you know, I don't think we have product market fit in the way that a normal startup does, or we don't have product market fit challenges. Because right now, everybody's using AI and so many of them are suffering and they want help. And so again, we started with strong conviction. Now, again, you have to hire and recruit the best, and the best all have jobs. And so what we've done is we said, okay, well, let's build an amazing culture. Start with that.
That's usually not something a company starts with. Usually you hire a bunch of people and then people start fighting and it turns into a gigantic mess. And then you try to figure out how to improve your culture later. My co-founder Tim in particular is super passionate about making sure that that's right. And we've spent a lot of time early on to make sure that we can scale. Can you comment, sorry, before we get to the second, what makes for a good culture?
So, I mean, there's many different cultures and I have learned many things from many different people. I've worked with several very unique, almost famously unique cultures. And from some of them I learned what to do and from some of them I learned what not to do. And so, we want an inclusive culture. I believe in amazing people working together. And so, I've seen cultures where you have amazing people and they're fighting each other. I've seen amazing people and they're told what to do. Like, thou shalt line up and do what I say. It doesn't matter if it's the right thing, do it, right? And neither of these is the... and I've seen people that have no direction. They're just kind of floating in different places. And they want to be amazing. They just don't know how. And so, a lot of it starts with having a clear vision. And so, we have a clear vision of what we're doing.
And so, I kind of grew up at Apple in my engineering life, right? And so, a lot of the Apple DNA rubbed off on me. My co-founder, Tim, also is like a strong product guy.
And so, what we learned is, you know, I was taught at Apple that you don't work from building cool technology. You don't work from like, come up with a cool product and think about the features you'll have in the big checkboxes and stuff like this. Because if you go talk to customers, they don't actually care about your product. They don't care about your technology. What they care about is their problems, right? And if your product can help solve their problems, well, hey, they might be interested in that, right?
And so, if you speak to them about their problems, if you understand and you have compassion, you understand what people are working with, then you can work backwards to building an amazing product. So, the vision starts by defining the problem. And then you can work backwards in solving technology.
因此,如果你和人们谈论他们的问题,如果你能够理解并感同身受,了解人们正在处理的问题,那么你就可以逆向思考,构建出一款惊人的产品。因此,愿景始于定义问题。然后,可以逆向解决技术问题。
And at Apple, like it's, I think, pretty famously said that, you know, for every, you know, there's 100 nos for every yes. I would refine that say that there's 100 not yets for every yes, but famously, if you go back to the iPhone, for example, right, iPhone one, I mean, many people laughed at it because it didn't have 3G, it didn't have copy and paste. Right. And then a year later, okay, finally it has 3G, but it still doesn't have copy and paste, it's a joke. Nobody will ever use this product, blah, blah, blah, blah, blah, blah, blah, right?
在苹果公司,就像众所周知的那样,对于每一个肯定的答复,可能会有一百个否定的答复。我认为这种说法需要更加精细的改进,应该是每个肯定答复的背后都有一百个尚未确定的答复。但是,如果我们回顾一下iPhone的历史,就可以看到,比如第一代iPhone,很多人嘲笑它,因为它没有3G也没有复制和粘贴功能。一年后,它终于添加了3G功能,但是依然没有复制和粘贴功能,这仍被人们视为一个笑话。许多人认为这是一个没前途的产品,等等。
Well, year three had copy and paste and people stopped talking about it, right? And so, and so being laser focused and having conviction and understanding what the core problems are and giving the team the space to be able to build the right tech is really important.
好吧,三年级时已经拥有了复制和粘贴功能,人们停止谈论了,对吧?因此,我们需要专注于目标,有信心并了解核心问题,给团队建设正确的技术的空间至关重要。
Also, I mean, you come back to recruiting, you pay well. So we have to pay industry leading salaries and have good benefits and things like this. That's a big piece.
此外,我的意思是,你重新回到招聘领域,你付出了很大的代价。因此,我们必须支付行业领先的工资,并提供良好的福利和其他类似的东西。这是非常重要的一部分。
We're a remote first company and so we have to, so remote first has a very strong set of pros and cons. On the one hand, you can hire people from wherever they are and you can attract amazing talent, even if they live in strange places or unusual places.
我们是一个远程优先的公司,因此我们必须考虑它的优缺点。一方面,你可以雇佣来自任何地方的人才,吸引非常出色的人才,即使他们住在奇怪或不寻常的地方。
On the other hand, you have time zones. On the other hand, you have like everybody on the internet will fight if they don't understand each other. And so we've had to learn how to like have a system where we actually fly people in and we get the whole company together periodically and then we get work groups together and we plan and execute together.
另一方面,你有时区。而另一方面,如果网络上的所有人彼此不理解,就会发生争执。因此,我们不得不学会如何建立一个系统,实际上把人们飞来聚集整个公司,然后我们组成工作小组,一起计划并执行。
And there's like an intimacy to the in-person brainstorming. I guess you lose, but maybe you don't, maybe if you get to know each other well and you trust each other, maybe you can do that.
当面头脑风暴有一种亲密感。虽然你可能会失去某些东西,但也许你不会失去,也许如果你彼此更加了解并且信任彼此,你仍然可以在面对面的情况下进行头脑风暴。
Well, so when the pandemic first hit, I mean, I'm curious about your experience too. The first thing I missed was having whiteboards. Yeah. Right? Those design discussions where you can, at high intensity, work through things, get things done, work through the problem of the day, understand where you're at, figure out and solve the problem and move forward. But we figured out ways to work around that now with all these screen sharing and other things like that that we do.
嗯,当疫情最初出现时,我很好奇你的经历。我最开始错过的是有白板的感觉。对,是吗?这些设计讨论非常高强度,可以快速处理问题,完成任务,解决当天的问题,查明自己的位置,然后解决问题并继续前进。但是现在我们通过屏幕共享和其他方式找到了解决办法。
The thing I missed now is sitting down at a lunch table with the team. Yeah. The spontaneous things like the coffee bar things and the bumping into each other and getting to know people outside of the transactional solve a problem over Zoom.
我现在最怀念的是与团队一起坐在午餐桌旁。是啊,像是咖啡吧这样自然而然的事情,和在Zoom上解决问题外的认识人们的机会。
And I think there's just a lot of stuff that I'm not an expert at this. I don't know who is, hopefully there's some people. But there's stuff that somehow is missing on Zoom. Even with the whiteboard, if you look at that, if you have a room with one person at the whiteboard and there's like three other people at a table, there's, first of all, there's a social aspect of that where you're just shooting the shit a little bit almost like. Yeah.
我觉得有很多东西我不是专家。我不知道谁是,但希望有一些人。然而,在Zoom上总有些东西是缺失的。即使有白板,如果你在一个房间里有一个人在白板前,还有三个人坐在桌子旁边,首先,有一个社交方面的问题,你几乎要闲聊一下。是啊。
And people are just kind of coming in and yeah. That, but also while, like it's a breakout discussion that happens for like seconds at a time, maybe an inside joke or it's like this interesting dynamic that happens that Zoom. And you're bonding. Yeah. You're bonding, but through that bonding, you get the excitement.
人们似乎只是进来了,然后,还有一种突然发生的讨论,持续时间可能只有几秒钟,可能是一个内部笑话或一种有趣的动态,在Zoom上体现。你在建立联系,但是通过这种联系,你也感受到了兴奋。
There's certain ideas that are like complete bullshit and you'll see that in the faces of others that you won't see necessarily on Zoom. And like something, it feels like that should be possible to do without being in person. Well, I mean, being in person is a very different thing.
有些想法完全是胡扯,你会从其他人的表情中感受到,这是在Zoom上不容易看到的。虽然有些东西感觉应该不需要亲自在一起才能完成,但是实际上,亲自在一起是一件非常不同的事情。
Yeah, I don't. It's worth it, but you can't always do it. And so again, we're still learning, we're all still learning as, like, humanity with this new reality, right? But what we found is that getting people together, whether it be a team or the whole company or whatever, is worth the expense because people work together and are happier after that.
是的,我不这么认为。这样做是值得的,但并不总是可行的。因此,我们仍在学习,我们也正在了解这种新现实。但我们发现,让人们聚在一起,无论是团队还是整个公司,都是值得的开销,因为人们在那之后会更加愉快,并能更好地协作。
Like there's a massive period of time where you go out and things start getting frayed, pull people together, and then you realize that we're all working together, we see things the same way, we work through the disagreement or the misunderstanding, we're talking across each other, and then you work much better together. And so things like that, I think are really quite important.
就好像有一个长时间的阶段,你出去做事情,事情开始变得混乱,然后你把人们聚集起来,然后你意识到我们都在一起工作,我们看问题的角度一样,我们处理分歧或误解,我们沟通交流,然后我们一起更加高效地工作。因此,这些事情非常重要。
What about people that are kind of specialized in very different aspects of the stack working together? What are some interesting challenges there? Yeah, well, so I mean, there's lots of interesting people as you can tell I'm hard to deal with too.
那么,对于在技术栈的不同方面具有专长的人一起工作,有什么有趣的挑战?是的,有很多有趣的人,就像你能看出来,我也很难处理。
But you're one of the most lovable people. Yeah, so there's different philosophies in building teams. For me, and so some people say, hire 10x programmers, and that's the only thing, whatever that means, right?
但实际上你是最受人喜爱的人之一。是的,建立团队有不同的哲学。对于我来说,有些人说要雇用10倍程序员,这是唯一的选择,但是这意味着什么呢?
What I believe in is building well-balanced teams, teams that have people that are different in them. Like if you have all generals and no troops, or all troops and no generals, or you have all people that think in one way and not the other way, what you get is you get a very biased, and skewed and weird situation where people end up being unhappy.
我相信的是建立平衡的团队,这些团队有不同的人。例如,如果你只有将军,没有士兵,或者只有士兵,没有将军,或者你有所有人都只想一种方式,而不是另一种方式,那么你会得到一个非常偏见的,扭曲的和奇怪的情况,人们最终会感到不满意。
And so what I like to do is I like to build teams of people where they're not all the same. You know, we do have teams that are focused on like runtime or compiler, GPU or whatever the specialty is, but people bring a different take and have a different perspective. And I look for people that complement each other. And particularly if you look at leadership teams and things like this, you don't want everybody thinking the same way. You want people bringing different perspectives and experiences. And so I think that's really important.
我喜欢组建不同的团队。我们有一些团队专注于运行时、编译器、GPU及其他的特定领域,但我会寻找不同的观点和视角来打造一个具有互补能力的团队。特别是在领导层,我们不希望每个人都想同样的事情,而是希望他们拥有不同的观点和经验。因此,我认为这非常重要。
That's team, but what about building a company as ambitious as Modular? So what are some interesting questions there? Oh, I mean, so many. Like, so one of the things I love about, okay, so Modular is the first company I built from scratch.
那个团队很不错,但是要像Modular一样建立一家雄心勃勃的公司呢?那么有哪些有趣的问题呢?哦,我的意思是有很多。比如说,我喜欢Modular的其中一个原因是它是我从零开始建立的第一家公司。
One of the first things that was profound was I'm not cleaning up somebody else's mess, right? And so if you look at that. That's liberating, just something good. It's super liberating. And also many of the projects I've built in the past have not been core to the product of the company.
一开始给我的深刻感受之一是我不需要整理别人留下的乱七八糟的事情,这很解放人心,是一件好事。很多过去我构建的项目也没有与公司产品有关。
Swift is not Apple's product, right? MLIR is not Google's revenue machine or whatever, right? It's important, but it's like working on the accounting software for the retail giant or something, right? It's like enabling infrastructure and technology. And so at Modular, the tech we're building is here to solve people's problems.
Swift不是苹果的产品,对吧?MLAR也不是谷歌的主要盈利工具或者其他什么,对吧?它们都是重要的,但就像为零售巨头开发会计软件一样,对吧?它们就是为基础设施和技术提供支持的。因此,在模块化公司,我们正在构建的技术是为了解决人们的问题。
Like it is directly the thing that we're giving to people. And so this is a really big difference. And what it means for me as a leader, but also for many of our engineers, is they're working on the thing that matters. And that's actually pretty, I mean, again, for compiler people and things like that, that's usually not the case, right? And so that's also pretty exciting and quite nice.
就好像我们直接把这个东西交给了人们一样。因此,这是一个非常大的区别。对于我作为领导者,以及我们的许多工程师来说,这意味着他们正在从事重要的事情。对于编译器之类的人来说,这通常并不是这样的,这也非常令人兴奋和愉快。
But one of the ways that this manifests is it makes it easier to make decisions. And so one of the challenges I've had in other worlds is it's like, OK, well, community matters somehow for the goodness of the world, or open source matters theoretically, but I don't want to pay for a t-shirt, right? Or some swag. Like, well, t-shirts cost $10 each.
其中之一的表现方式是它使得决策更容易。因此在其他世界中,我所面临的挑战之一是,社区对于世界的良善有所影响,或开源在理论上很重要,但我不想为T恤或其他赠品付款,因为一件T恤的成本是10美元。
You can have 100 t-shirts for $1,000. To a megacorp, $1,000 is uncountably, can't count that low, right? But justifying it and getting a t-shirt... By the way, if you'd like a t-shirt. I would 100% like a t-shirt. Are you joking? You can have a fire emoji t-shirt. I will treasure this. I will pass it down to my grandchildren.
你可以向一家大公司出售价值1000美元的100件T恤衫,1000美元是无法计算的,太低了,对吗?但要证明它并得到一件T恤衫,顺便说一句,如果你想要一件T恤衫,我非常希望能得到一件。你在开玩笑吗?你可以得到一件火焰表情T恤衫。我会珍惜它,并将其传给我的孙子孙女。
And so it's very liberating to be able to decide, I think that Lex should have a t-shirt, right? And it becomes very simple, because I like Lex.
所以,有能力决定的自由非常令人欣慰,我认为Lex应该有一件T恤,对吧?这变得非常简单,因为我喜欢Lex。
This is awesome. So I have to ask you about one of the interesting developments with large language models is that they're able to generate code recently, really well. I guess to a degree that maybe I don't know if you understand, but I struggle to understand, because it forces me to ask questions about the nature of programming, of the nature of thought.
这太棒了。所以我必须问你一个有趣的发展之一,大型语言模型最近能够生成非常好的代码。我大概不知道你是否能理解它的程度,但我很难理解,因为它迫使我提出关于编程和思维本质的问题。
Because the language models are able to predict the kind of code I was about to write so well that it makes me wonder how unique my brain is, and where the valuable ideas actually come from. How much do I contribute in terms of ingenuity, innovation, to code, or design, and that kind of stuff? When you stand on the shoulders of giants, are you really doing anything?
因为语言模型能够非常准确地预测我即将写的代码类型,这让我开始怀疑我的大脑有多独特,以及真正有价值的创意到底是来自哪里。我应该在代码、设计等方面贡献多少独创与创新的内容。当你站在巨人的肩膀上时,并没有真正做出什么贡献。
And what LLMs are helping you do is they help you stand on the shoulders of giants in your programming. There's mistakes, but they're interesting ones that you learn from. I just would love to get your opinion first, at a high level, of what you think about this impact of large language models now that they can generate code.
LLM帮助你做的是让你站在你的课程中巨人的肩膀上。其中存在一些错误,但你可以从中获取有趣的经验教训。我想先听听你对于大型语言模型在生成代码时的影响的高层次看法。
Yeah, well, so I don't know where it all goes. I'm an optimist, and I'm a human optimist. I think that things I've seen are that a lot of the LLMs are really good at crushing LeetCode projects, and they can reverse a linked list like crazy.
嗯,我不知道这一切的去向。我是个乐观主义者,也是个人类乐观主义者。我觉得我所见过的情况是,很多 LLMs 在压制漏洞代码方面非常擅长,而且他们能够疯狂地反转链接列表。
Well, it turns out there's a lot of instances of that on the internet, and it's a pretty stock thing. And so if you want to see standard questions answered, LLMs can memorize all the answers, and that can be amazing. And also, they do generalize out from that, and so there's good work on that.
实际上,网络上有很多这样的例子,这是一个常见的事情。因此,如果你想看到标准问题的答案,法学硕士可以记住所有的答案,这非常棒。而且,他们也可以从中推广出更多的内容,这是非常好的工作。
But I think that, in my experience, building things. Building something like you talk about Mojo, or you talk about these things, or you talk about building an applied solution to a problem, it's also about working with people. It's about understanding the problem.
依据我的经验,我认为建造事物是重要的。无论你谈论的是Mojo,还是其他东西,或者你谈论构建解决问题的应用方案,都需要与人合作。它涉及理解问题。
What is the product that you want to build? What are the use cases? Who are the customers? You can't just go survey all the customers, because they'll tell you that they want a faster horse. Maybe they need a car. And so a lot of it comes into, I don't feel like we have to compete with LLMs. I think they'll help automate a ton of the mechanical stuff out of the way.
你想要开发什么产品?它的使用情况是什么?它的顾客是谁?不要仅仅针对所有顾客开展调查,因为他们可能会告诉你他们想要一匹更快的马,但实际上他们需要的可能是一辆汽车。因此,我认为我们不需要与 LLM 竞争,我认为他们会帮助自动化很多强制性的操作。
And just like, I think we all try to scale through delegation and things like this. Delegating rote things to an LLM, I think, is extremely valuable, and an approach that will help us all scale and be more productive. But I think it's a fascinating companion.
就好像我们都尝试通过委派和其他方式来扩展自己一样,委派书面工作给研究硕士硕士(LLM)是非常有价值的做法。这种方法有助于我们所有人扩大规模并提高生产效率,但我认为它还需要一个迷人的伴侣。
But I'd say I don't think that that means that we're going to be done with coding. But there's power in it as a companion. And from there, I would love to zoom in on Mojo a little bit. Do you think about that? Do you think about LLMs generating Mojo code, and helping sort of like, design a new programming language?
我认为我们并不会因此结束编码,但作为伴侣,编码具有巨大的力量。接下来,我想详细谈谈Temojo。你认为这可行吗?你是否考虑过通过LLMs生成Mojo代码和帮助设计新的编程语言?
It almost seems like, man, it would be nice to sort of almost as a way to learn how I'm supposed to use this thing for them to be trained on some of the Mojo code. So I do lead an AI company. So maybe there will be a Mojo LLM at some point. But if your question is like, how do we make a language to be suitable for LLMs? I think the cool thing about LLMs is you don't have to.
这句话的意思是,好像我们可以考虑让一些人去接受Mojo代码的培训,这样他们就可以了解如何使用这种代码。我领导一家AI公司,所以也许在某个时候我们会开设一个Mojo LLM课程。但如果你的问题是如何让一种编程语言适合LLMs,那么我想LLMs有一个很酷的地方,即你不必让编程语言适合LLMs。
And so if you look at what is English, or any of these other terrible languages that we as humans deal with on a continuous basis, they're never designed for machines. And yet, they're the intermediate representation. They're the exchange format that we humans use to get stuff done. And so these programming languages, they're an intermediate representation between the human and the computer, or the human and the compiler, roughly.
如果你看看英语,或者我们人类经常处理的其他可怕的语言,你会发现它们从未为机器设计。然而,它们是中介表示法,是我们人类用来完成事情的交换格式。所以这些编程语言,它们是人类和计算机,或者人类和编译器之间的一种中介表示法。
And so I think the LLMs will have no problem learning whatever keyword we pick. Maybe the fire emoji is going to break it. It doesn't tokenize. No, the reverse of that will actually enable it. Because one of the issues I could see with being a superset of Python is there would be confusion by the gray area. So it would be mixing stuff.
所以我认为LLMs学习我们选择的任何关键字都不会有问题。也许这个5MoG将会是一个快速的休息。也许它会破坏它。它不会被分词。不,它的反向实际上将使其得以实现。因为我能想象到一个超越Python的问题之一就是会存在灰色地带的混淆。所以它会混合一些东西。
But I'm a human optimist, and also an LLM optimist. I think they'll solve that problem. Yeah, but you look at that and you say, OK, well, reducing the rote thing. Turns out compilers are very particular. And they really want the indentation to be right. They really want the colon to be there on your else or else it will complain.
我是一个积极向上的人,也是一个LLM优化者。我认为这将解决那个问题。但是,你看看它,说,好的,减少机械性的东西。事实证明编译器非常讲究。它们真的希望缩进正确。如果你的else语句中没写好的内容,它们会抱怨的。
I mean, compilers can do better at this, but LLMs can totally help solve that problem. And so I'm very happy about the new predictive coding and Copilot-type features and things like this, because I think it'll all just make us more productive. It's still messy and fuzzy and uncertain, unpredictable.
我的意思是,编译器可以在这方面做得更好,但是LLM可以完全帮助解决这个问题。因此,我非常高兴有了新的预测编码和共同编写等功能,因为我认为这将使我们更加高效。它仍然是混乱和模糊的,不确定和不可预测的。
But is there a future you see given how big of a leap GPT-4 was, where you start to see something like LLMs inside a compiler or no? I mean, you could do that. Yeah, absolutely. I think that would be interesting. Is that wise? Well, I mean, it would be very expensive.
但是,您能看到未来吗?考虑到GPT-4的巨大飞跃,您是否开始看到编译器内部出现类似LLMs的东西?我是说,您可以这样做。是的,绝对可以。我认为这将是有趣的。但是这么做明智吗?嗯,我是说,这将是非常昂贵的。
So compilers run fast and they're very efficient. And LLMs are currently very expensive. There's on-device LLMs, and there's other things going on. And so maybe there's an answer there. I think that one of the things that I haven't seen enough of is that so LLMs to me are amazing when you tap into the creative potential of the hallucinations.
编译器运行速度快且非常高效。而目前,LLM技术的成本非常昂贵。有些设备内置了LLM技术,也有其他相关技术的应用。或许在这方面有些答案。我认为,我们还没有充分发挥幻觉的创造潜力,这是我对LLM技术最为惊奇的地方。
And so if you're doing creative brainstorming or creative writing or things like that, the hallucinations work in your favor. If you're writing code that has to be correct because you're going to ship it in production, then maybe that's not actually a feature. And so I think that there has been research and there has been work on building algebraic reasoning systems and figuring out more things that feel like proofs.
因此,如果你正在进行创意头脑风暴或创意写作之类的事情,幻觉会对你有帮助。但是,如果你正在编写要在生产环境中发布的代码,因为它必须是正确的,则可能不是一个功能。因此,我认为已经进行了研究和工作,建立了代数推理系统,并找出更多像证明一样的东西。
And so I think that there could be interesting work in terms of building more reliable at-scale systems. And that could be interesting. But if you chase that rabbit hole down, the question then becomes, how do you express your intent to the machine? And so maybe you want an LLM to provide the spec, but you have a different kind of net that actually implements the code. Right, so it's a use of documentation and inspiration versus the actual implementation. Yeah, potentially. Since a successful Modular will be the thing that runs, I say jokingly, our AI overlords. But AI systems that are used across, I know it's a cliché term, but Internet of Things, so across. So I'll joke and say like, AGI should be written in Mojo.
因此,我认为建立更可靠的系统是有趣的工作,尤其是在大规模应用方面。但是,如果你深入探究,问题就变成了如何向机器表达你的意图。也许你希望 LLM 提供规范,但是你有不同类型的网络来实现代码。所以,这是一种使用文档和灵感而非实际实现的方式。成功的模块化将成为运行的事物,我开玩笑说它们是 AI 统治者。但是 AI 系统被广泛应用于内外,因此我会开玩笑地说 AGI 应该用 Mojo 来编写。
Yeah, AGI should be in Mojo. You're joking, but it's also possible that it's not a joke. That a lot of the ideas behind Mojo seem like the natural set of ideas that would enable at-scale training and inference of AI systems.
是的,AGI应该在Mojo里面。你在开玩笑,但这也有可能不是开玩笑。很多Mojo背后的想法似乎是自然的一套思路,可以实现AI系统的大规模训练和推理。
So I just have to ask you about the big philosophical question about human civilization. So folks like Eliezer Yudkowsky are really concerned about the threat of AI. Do you think about the good and the bad that can happen with at-scale deployment of AI systems?
我只是想问你一个有关人类文明的大哲学问题。像LEA、Zari和Kowskare这样的人非常担心人工智能的威胁。您是否考虑过在大规模部署人工智能系统时可能出现的好坏之处?
Well, so I've thought a lot about it, and there's a lot of different parts to this problem, everything from job displacement to Skynet, things like this. And so you can zoom into sub parts of this problem. I'm not super optimistic about AGI being solved next year. I don't think that's going to happen personally. So you have a kind of Zen-like calm about it. Is there a nervousness because the leap of GPT-4 seemed so big? Sure.
嗯,我思考了很多关于这个问题的不同部分,包括工作岗位的取代以及Skynut等等。你可以专注于这个问题的某些分支。在关于AGI在明年得以解决上,我并不是非常乐观。我个人认为这不太可能发生。所以你需要一种禅宗一般的平静。但是GPT-4的跨越似乎很大,因此会有些紧张。
It's huge. We're almost, there's some kind of transition period here. You're thinking, or is there a lot of this? So, I mean, there's a couple of things going on there. One is, I'm sure GPT-5 and 7 and 19 will be also huge leaps. They're also getting much more expensive to run. And so there may be a limiting function in terms of just expense, on the one hand, and training. That could be a limiter that slows things down. But I think the bigger limiter, outside of "Skynet takes over", and I don't spend any time thinking about that because if Skynet takes over and kills us all, then I'll be dead. So I don't worry about that. So, you know, I mean, that's just, okay, there are other things to worry about, I'll just focus on those and not worry about that one.
这很巨大。我们几乎到了某种过渡期。你可能在想,或者是否有很多这样的事情?所以,我想,有几件事情正在发生。首先,我相信GPT-5、GPT-7和GPT-19也会有巨大的飞跃。他们的运行成本也越来越高。因此,可能会有一种限制功能,即成本和培训方面的限制,这可能会放缓进展。但我认为,更大的限制因素在于Skynut的接管,我不会花时间去思考这个问题,因为如果Skynut接管并杀死我们所有人,那么我将死亡。所以我不用担心那个。你知道,如果有其他事情让我担心,我会专注于那些事情,而不去担心那个问题。
But I think the other thing I'd say is that AI moves quickly, but humans move slowly, and we adapt slowly. And so what I expect to happen is, just like any technology diffusion, like the promise and then the application takes time to roll out. And so I think that I'm not even too worried about autonomous cars defining away all the taxi drivers. Remember, autonomous was supposed to be solved by 2020? Yeah, boy, do I remember that. So, and so like I think that on the one hand, we can see amazing progress, but on the other hand, we can see that, you know, the reality is a little bit more complicated and it may take longer to roll out than you might expect.
但我认为另一件事情是,人工智能发展速度快,而人类适应速度慢。因此,我预期会发生的就像是任何技术普及一样,需要时间来实现承诺和应用。因此,我甚至不太担心自动驾驶汽车将淘汰所有的出租车司机。还记得自动驾驶汽车应该在2020年解决的吗?是的,我还记得。所以,一方面我们可以看到惊人的进展,但另一方面,我们也可以看到现实情况要更加复杂,可能需要比预期的时间更长才能实现。
Well, that's in the physical space. I do think in the digital space, with the stuff that's built on top of LLMs that runs, you know, the millions of apps that could be built on top of them, and that could be run on millions of devices, millions of types of devices. Yeah. I just think that the rapid effect it has on human civilization could be truly transformative to it. Yeah, well, you don't even know. Well, so that one, and there I think it depends on are you an optimist or a pessimist or a masochist. Just to clarify, optimist about human civilization. Me too. And so I look at that as saying, OK, cool, well, what can AI do? Right? And so some people say, oh my god, it's going to destroy us all. How do we prevent that? I kind of look at it from a, is it going to unlock us all? Right? You talk about coding. It's going to make it so we don't all have to do all the repetitive stuff. Well, certainly that's a very optimistic way to look at it. And you look at what a lot of these technologies have done to improve our lives. And I want that to go faster.
在物理空间中是如此。我认为在数字空间中构建在LLM之上并运行着的数百万应用程序是建立在LLM之上的东西,可以在数百万设备上运行,数百万种设备。是的。我认为它对人类文明的迅速影响可能会真正改变它。是的,你都不知道。所以这个,我认为它取决于你是乐观主义者还是悲观主义者或者是自虐者。澄清一下,乐观看待人类文明。我也是。所以我认为,嗯,我确实这样认为。对于有些人来说,这可能会摧毁我们所有人。我们该如何防止这种情况?我从一个角度看这个问题,即它是否会使我们所有人都能够解锁?你谈论编码。它将使我们所有人都不得不做所有重复的工作。那么,突然间这是一种非常乐观的看法。你看看这些技术为改善我们的生活所做的许多事情。我希望它发展得更快。
What do you think the future of programming looks like in the next 10, 20, 30, 50 years? With LLMs, and with Mojo, with Modular. Like your vision for devices, from the hardware to the compilers to the different stacks of software. Yeah.
你认为未来的编程会在接下来的10、20、30、50年里会是怎样的?它们是可插拔的(modular)且充满魔力(Mojo)。就像你对设备的愿景一样,从硬件到编译器再到不同的软件堆栈。是的。
Well, so what I want, I mean, coming back to my arch nemesis, right? It's complexity, right? So again, me being the optimist, if we drive down complexity, we can make these tools, these technologies, these cool hardware widgets accessible to way more people. Right? And so what I'd love to see is more personalized experiences, more things, the research getting into production instead of being lost at NeurIPS, right? And like these things that impact people's lives by entering products.
嗯,我想说的是,回到我的死敌,对吧?就是复杂性,对吧?所以我身为乐观主义者,如果我们降低了复杂性,我们就能让更多的人获得这些工具、这些技术、这些酷炫的硬件小玩意儿。对吧?所以我希望看到更多个性化的体验,更多研究进入产品而不是丢失在NURIPS中,对吧?像这些能够影响人们生活的事情进入产品中。
And so one of the things that I'm a little bit concerned about is right now, the big companies are investing huge amounts of money and are driving the top line of AI capability forward really quickly. But if it means that you have to have $100 million to train a model or more, $100 billion, right? Well, that's going to make it very concentrated with very few people in the world that can actually do this stuff. I'd much rather see lots of people across the industry be able to participate and use this.
所以我有点担心的是,现在大公司正在投入大量资金,使人工智能能力的顶层不断向前推进。但是,如果这意味着你必须要花费1亿美元或更多来训练模型,甚至1千亿美元,那么这将使得只有很少一部分人能够参与这方面的研究。我更希望看到行业中有更多的人参与使用人工智能技术。
And you look at this. I mean, a lot of great research has been done in the health world and looking at detecting pathologies and doing radiology with AI and doing all these things. Well, the problem today is that to deploy and build these systems, you have to be an expert in radiology and an expert in AI. And if we can break down the barriers so that more people can use AI techniques, and it's more like programming Python, which roughly everybody can do if they want to, right?
你看看这个。我的意思是,医疗领域已经做出了很多伟大的研究,关注于使用人工智能进行病理学检测和放射学,以及其他相关领域的研究。然而,今天的问题是,要部署和构建这些系统,你必须是放射学和人工智能的专家。如果我们能够打破这些障碍,让更多的人可以使用人工智能技术,就像编程 Python 一样容易,那么大众应该可以更好地应用这些技术,对医疗领域做出更有益的贡献。
Then I think that we'll get a lot more practical application of these techniques and a lot more niche, cool, but narrower demands. And I think that's going to be really cool. Do you think we'll have more or less programmers in the world than now? Well, so I think we'll have more programmers, but they may not consider themselves to be programmers. There'd be a different name for it. Right.
然后我认为,这些技术将得到更多实际应用,并且会有更多更加独特、酷但更窄化的需求。我认为这会非常酷。你认为世界上的程序员会增加还是减少?我认为我们会有更多的程序员,但他们可能不认为自己是程序员。那对你来说会有一个不同的名称。对的。
I mean, do you consider somebody that uses, you know, I think arguably the most popular programming language is Excel. Yeah. Right? Yep. And so do they consider themselves to be programmers? Maybe not. I mean, some of them make crazy macros and stuff like that. But what you mentioned, Steve Jobs, it's the bicycle for the mind that allows you to go faster. Right?
我是说,你认为使用最受欢迎的编程语言Excel的人算不算程序员呢?呃,是吧。这样的人会认为自己是程序员吗?也许不会。有些人会编写疯狂的宏和其他内容。但是,像你之前提到的Steve Jobs所说,它是“思维的自行车”,可以让你进一步提升速度。是吧?
And so I think that as we look forward, right, what is AI? I look at it as, hopefully, a new programming paradigm. It's like object-oriented programming. Right? If you want to write a cat detector, you don't use for loops. It turns out that's not the right tool for the job, right? And so right now, unfortunately, because, I mean, it's not unfortunate, but it's just kind of where things are. AI is this weird different thing that's not integrated into programming languages and normal tool chains and all the technology is really weird and doesn't work right. And you have to babysit it. And every time you switch hardware, it's different. It shouldn't be that way.
因此,我认为当我们展望未来时,什么是人工智能?我认为它是一种新的编程范式,就像面向对象编程一样。如果你想写一个猫探测器,你不能使用for循环。这种工具不适合这项工作。目前,不幸的是,因为AI不是集成在编程语言和正常工具链中的奇怪的不同事物,所有技术都很奇怪而且不能正常工作。你要一直看管它,而且每次更换硬件时,都有所不同。这种情况不应该存在。
When you change that, when you fix that, suddenly, again, the tools and technologies can be way easier to use. You can start using them for many more things. So that's why I would be excited about it.
当你改变那个问题,修正那个错误时,工具和技术使用起来突然变得更加容易。你可以开始应用它们进行更多的事情。这就是为什么我会对此感到兴奋的原因。
What kind of advice could you give to somebody in high school right now, or maybe early college, who's curious about programming and feeling like the world is changing really quickly here? Yeah. What kind of stuff to learn? What kind of stuff to work on? Should they finish college? Should they go work at a company? Should they build a thing? What do you think?
你能给现在高中或者早期大学的对编程有兴趣,同时感受到这个世界正在快速变化的人们提供什么建议呢?他们应该学习什么?应该从事什么项目?他们是否应该完成大学学业?是否应该去公司工作?或者应该专注于自己的项目建设?你的想法是什么呢?
Well, so, I mean, one of the things I'd say is that you'll be most successful if you work on something you're excited by. And so don't just get the book and read the book cover to cover and study and memorize and recite and flash card; go build something. Like, go solve a problem. Go build the thing that you want to exist. Go build an app. Go train a model. Go build something and actually use it and set a goal for yourself.
嗯,我是说,如果你做一些自己感兴趣的事情,那么你会更成功。不要只是拿到书,从头到尾地阅读、学习、记忆、背诵、使用闪卡等等,而是去解决一个问题。去建立你想要的东西。去建立一个应用程序。去建立一个模型。去建立一个真正能够使用的东西,并为自己设定一个目标。
And if you do that, then there's a success. There's the adrenaline rush. There's the achievement. There's the unlock that I think is where, if you keep setting goals and you keep doing things, and building things, learning by building is really powerful. In terms of career advice, I mean, everybody's different. It's very hard to give generalized advice. I'll speak as a compiler nerd.
如果你做到了这一点,就会获得成功,感受到肾上腺素的冲击,获得成就感。我认为最重要的是你将打开成功之门,如果你不断为自己设定目标,并且继续做事情,建设自己的知识,通过建设来学习是非常有力的。关于职业建议,每个人都不同,很难给出泛化的建议。我自己来讲,我是一个编译器迷。
If everybody's going left, sometimes it's pretty cool to go right. And so just because everybody's doing a thing, it doesn't mean you have to do the same thing and follow the herd. In fact, I think that sometimes the most exciting paths for life lead to being curious about things that nobody else actually focuses on. And it turns out that understanding deeply parts of the problem that people want to take for granted makes you extremely valuable and specialized in ways that the herd is not.
如果大家都向左走,有时向右走会很酷。所以,仅仅因为大家都在做一件事,不意味着你必须同样这么做,跟随大众。实际上,我认为有时候,最令人兴奋的人生路径是对那些没有人真正关注的事物保持好奇心。而且结果证明,深入理解人们认为理所当然的问题的某些部分,会使你在特定方面变得非常有价值和专业,而这是大众所不具备的。
And so again, I mean, there's lots of room for specialization, lots of room for generalists. There's lots of room for different kinds of parts of the problem. But I think that just because everybody's doing one thing doesn't mean you should necessarily do it. And now the herd is using Python. So if you want to be a rebel, go check out Mojo and help Chris and the rest of the world fight the arch nemesis of complexity. Because simple is beautiful.
因此,我的意思是,有很多专业领域可以探索,也有很多通才领域可以钻研。问题的不同方面都有很大的发展空间。但我认为,仅仅因为大多数人都在做某一件事情并不意味着你一定要去做了。现在,群体中大多数人使用的是Python。因此,如果你想成为一个叛逆者,请去了解Mojo并帮助Chris和世界上其他人与复杂性的大敌作斗争。因为简单才是美。
There you go. Because you're an incredible person. You've been so kind to me ever since we met. You've been extremely supportive. I'm forever grateful for that. Thank you for being who you are, for being legit, for being kind, for fighting this really interesting problem of how to make AI accessible to a huge number of people, huge number of devices. Yeah.
这就对了。因为你是一个不可思议的人。自从我们相遇以来,你一直对我很好。你一直给我很大的支持,我永远感激不尽。感谢你成为你自己,成为一个真实的人,成为一个善良的人,为了解决使人工智能可访问于大量的人和设备这个有趣的问题而努力。是的。
Well, so Lex, you're a pretty special person too. And so I think that one of the funny things about you is that besides being curious and pretty damn smart, you're actually willing to push on things. And I think that you've got an agenda to make the world think, which I think is a pretty good agenda. It's a pretty good one.
好的,所以 Lex,你也是一个相当特别的人。我认为你除了好奇心和相当聪明之外,你还真的愿意努力去追求。我认为你有一个目的,那就是让世界思考,我认为这是一个非常好的目标。这是一个相当不错的目标。
Thank you so much for talking today, Chris. Yeah. Thanks, Lex. Thanks for listening to this conversation with Chris Lattner. To support this podcast, please check out our sponsors in the description.
非常感谢你和我交谈,Chris。是的,谢谢你,Lex。感谢您收听与Chris Latner的对话。为了支持这个播客,请查看说明中的赞助商。
And now let me leave you with some words from Isaac Asimov: "I do not fear computers. I fear the lack of them." Thank you for listening and hope to see you next time.
现在,让我用Isaac Asimov的一些话留给你们。我不害怕电脑,我害怕没有电脑。感谢你们的聆听,希望下次再见。