Good afternoon. My name is David and I'll be your conference operator today. At this time, I'd like to welcome everyone to NVIDIA's first quarter earnings call. Today's conference is being recorded. All lines have been placed on mute to prevent any background noise. After the speakers' remarks, there will be a question and answer session. If you'd like to ask a question during this time, simply press the star key, followed by the number one on your telephone keypad. If you'd like to withdraw your question, press star one once again.
Thank you. Simona Jankowski, you may begin your conference.
Thank you. Good afternoon, everyone, and welcome to NVIDIA's conference call for the first quarter of fiscal 2024. With me today from NVIDIA are Jensen Huang, President and Chief Executive Officer, and Colette Kress, Executive Vice President and Chief Financial Officer.
I'd like to remind you that our call is being webcast live on NVIDIA's Investor Relations website. The webcast will be available for replay until the conference call to discuss our financial results for the second quarter of fiscal 2024. The content of today's call is NVIDIA's property. It can't be reproduced or transcribed without our prior written consent.
During this call, we may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties, and our actual results may differ materially. For a discussion of factors that could affect our future financial results and business, please refer to the disclosure in today's earnings release, our most recent Forms 10-K and 10-Q, and the reports that we may file on Form 8-K with the Securities and Exchange Commission.
All our statements are made as of today, May 24, 2023, based on information currently available to us. Except as required by law, we assume no obligation to update any such statements.
During this call, we will discuss non-GAAP financial measures. You can find a reconciliation of these non-GAAP financial measures to GAAP financial measures in our CFO commentary, which is posted on our website.
And with that, let me turn the call over to Colette.
Thanks, Simona.
Q1 revenue was $7.19 billion, up 19% sequentially and down 13% year on year. Strong sequential growth was driven by record data center revenue, with our gaming and professional visualization platforms emerging from channel inventory corrections.
Starting with data center. Record revenue of $4.28 billion was up 18% sequentially and up 14% year on year, on strong growth of accelerated computing platforms worldwide. Generative AI is driving exponential growth in compute requirements and a fast transition to NVIDIA accelerated computing, which is the most versatile, most energy efficient, and the lowest TCO approach to train and deploy AI. Generative AI drove significant upside in demand for our products, creating opportunities and broad-based global growth across our markets.
Let me give you some color across our three major customer categories: cloud service providers or CSPs, consumer internet companies, and enterprises. First, CSPs around the world are racing to deploy our flagship Hopper and Ampere architecture GPUs to meet the surge in interest from both enterprise and consumer AI applications for training and inference. Global CSPs announced the availability of H100 on their platforms, including private previews at Microsoft Azure, Google Cloud, and Oracle Cloud Infrastructure, upcoming offerings at AWS, and general availability at emerging GPU-specialized cloud providers like CoreWeave and Lambda.
In addition to enterprise AI adoption, these CSPs are serving strong demand for H100 from generative AI pioneers. Second, consumer internet companies are also at the forefront of adopting generative AI and deep learning-based recommendation systems, driving strong growth. For example, Meta has now deployed its H100-powered Grand Teton AI supercomputer for its AI production and research teams.
Third, enterprise demand for AI and accelerated computing is strong. We are seeing momentum in verticals such as automotive, financial services, healthcare, and telecom, where AI and accelerated computing are quickly becoming integral to customers' innovation road maps and competitive positioning. For example, Bloomberg announced it has created a 50 billion parameter model, BloombergGPT, to help with financial natural language processing tasks such as sentiment analysis, named entity recognition, news classification, and question answering. Auto insurance company CCC Intelligent Solutions is using AI for estimating repairs. And AT&T is working with us on AI to improve fleet dispatches so their field technicians can better serve customers. Among other enterprise customers using NVIDIA AI are Deloitte, for logistics and customer service, and Amgen, for drug discovery and protein engineering.
This quarter, we started shipping DGX H100, our Hopper generation AI system, which customers can deploy on-prem. And with the launch of DGX Cloud through our partnership with Microsoft Azure, Google Cloud, and Oracle Cloud Infrastructure, we deliver the promise of NVIDIA DGX to customers from the cloud. Whether the customers deploy DGX on-prem or via DGX Cloud, they get access to NVIDIA AI software, including NVIDIA Base Command and AI frameworks and pre-trained models.
We provide them with a blueprint for building and operating AI, spanning our expertise across systems, algorithms, data processing, and training methods. We also announced NVIDIA AI Foundations, which are model foundry services available on DGX Cloud that enable businesses to build, refine, and operate custom large language models and generative AI models trained with their own proprietary data, created for unique domain specific tasks.
They include NVIDIA NeMo for large language models, NVIDIA Picasso for images, video, and 3D, and NVIDIA BioNeMo for life sciences. Each service has six elements: pre-trained models, frameworks for data processing and curation, proprietary knowledge base vector databases, systems for fine-tuning, aligning, and guardrailing, optimized inference engines, and support from NVIDIA experts to help enterprises fine-tune models for their custom use cases.
ServiceNow, a leading enterprise services platform, is an early adopter of DGX Cloud and NeMo. They are developing custom large language models trained on data specifically for the ServiceNow platform. Our collaboration will let ServiceNow create new enterprise-grade generative AI offerings for the thousands of enterprises worldwide running on the ServiceNow platform, including for IT departments, customer service teams, employees, and developers.
NVIDIA AI is also driving a step-function increase in inference workloads. Because of their size and complexity, these workloads require acceleration. The latest MLPerf industry benchmark released in April showed NVIDIA's inference platforms deliver performance that is orders of magnitude ahead of the industry, with unmatched versatility across diverse workloads.
To help customers deploy generative AI applications at scale, at GTC we announced four major new inference platforms that leverage the NVIDIA AI software stack. These include the L4 Tensor Core GPU for AI video, L40 for Omniverse and graphics rendering, H100 NVL for large language models, and the Grace Hopper Superchip for LLMs, and also recommendation systems and vector databases. Google Cloud is the first CSP to adopt our L4 inference platform with the launch of its G2 virtual machines for generative AI inference and other workloads such as Google Cloud Dataproc, Google AlphaFold, and Google Cloud Immersive Stream, which render 3D and AR experiences.
In addition, Google is integrating our Triton Inference Server with Google Kubernetes Engine and its cloud-based Vertex AI platform. In networking, we saw strong demand at both CSPs and enterprise customers for generative AI and accelerated computing, which require high-performance networking like NVIDIA's Mellanox networking platforms. Demand relating to general purpose CPU infrastructure remains soft.
As generative AI applications grow in size and complexity, high-performance networks become essential for delivering accelerated computing at data center scale to meet the enormous demands of both training and inferencing. Our 400 gigabit per second Quantum-2 InfiniBand platform is the gold standard for AI-dedicated infrastructure, with broad adoption across major cloud and consumer internet platforms such as Microsoft Azure.
With the combination of in-network computing technology and the industry's only end-to-end data center scale optimized software stack, customers routinely enjoy a 20% increase in throughput for their sizable infrastructure investment. For multi-tenant clouds transitioning to support generative AI, our high-speed Ethernet platform with BlueField-3 DPUs and Spectrum-4 Ethernet switching offers the highest available Ethernet network performance.
BlueField-3 is in production and has been adopted by multiple hyperscale and CSP customers, including Microsoft Azure, Oracle Cloud, CoreWeave, Baidu, and others. We look forward to sharing more about our 400 gigabit per second Spectrum-4 accelerated AI networking platform next week at the Computex conference in Taiwan.
Lastly, our Grace Data Center CPU is sampling with customers. At this week's International Supercomputing Conference in Germany, the University of Bristol announced a new supercomputer based on the NVIDIA Grace CPU Superchip, which is 6X more energy efficient than the previous supercomputer. This adds to the growing momentum for Grace with both CPU only and CPU GPU opportunities across AI and cloud and supercomputing applications.
Now, let's move to gaming. Gaming revenue of $2.24 billion was up 22% sequentially and down 38% year on year. Strong sequential growth was driven by sales of the 40 Series GeForce RTX GPUs for both notebooks and desktops. Overall end demand was solid and consistent with seasonality, demonstrating resilience against a challenging consumer spending backdrop.
The GeForce RTX 40 Series GPU laptops are off to a great start, featuring four NVIDIA inventions: RTX path tracing, DLSS 3 AI rendering, Reflex ultra-low latency rendering, and Max-Q energy efficient technologies. They deliver tremendous gains in industrial design, performance, and battery life for gamers and creators. And like our desktop offerings, 40 Series laptops support the NVIDIA Studio platform of software technologies, including acceleration for creative, data science, and AI workflows, and Omniverse, giving content creators unmatched tools and capabilities.
In desktop, we ramped the RTX 4070, which joined the previously launched RTX 4090, 4080, and 4070 Ti GPUs. The RTX 4070 is nearly 3x faster than the RTX 2070 and offers our large installed base a spectacular upgrade. Last week, we launched the 4060 family, the RTX 4060 and 4060 Ti, bringing our newest architecture to the world's core gamers, starting at just $299. These GPUs, for the first time, provide twice the performance of the latest gaming console at mainstream price points. The 4060 Ti is available starting today, while the 4060 will be available in July.
Generative AI will be transformative to gaming and content creation, from development to runtime. At the Microsoft Build developer conference earlier this week, we showcased how Windows PCs and workstations with NVIDIA RTX GPUs will be AI-powered at their core. NVIDIA and Microsoft have collaborated on end-to-end software engineering, spanning from the Windows operating system to the NVIDIA graphics drivers and NeMo LLM framework, to help make Windows on NVIDIA RTX Tensor Core GPUs a supercharged platform for generative AI.
Last quarter, we announced a partnership with Microsoft to bring Xbox PC games to GeForce NOW. The first game from this partnership, Gears 5, is now available, with more set to be released in the coming months. There are now over 1,600 games on GeForce NOW, the richest content available on any gaming service.
Moving to pro visualization. Revenue of $295 million was up 31% sequentially and down 53% year on year. Sequential growth was driven by stronger workstation demand across both mobile and desktop form factors, with strength in key verticals such as public sector, healthcare, and automotive. We believe the channel inventory correction is behind us. The ramp of our Ada Lovelace GPU architecture in workstations kicks off a major product cycle. At GTC, we announced six new RTX GPUs for laptops and desktop workstations, with further rollouts planned in the coming quarters.
Generative AI is a major new workload for NVIDIA-powered workstations. Our collaboration with Microsoft transforms Windows into the ideal platform for creators and designers harnessing generative AI to elevate their creativity and productivity. At GTC, we announced NVIDIA Omniverse Cloud, an NVIDIA fully managed service running in Microsoft Azure that includes the full suite of Omniverse applications and NVIDIA OVX infrastructure. Using this full stack cloud environment, customers can design, develop, deploy, and manage industrial metaverse applications. NVIDIA Omniverse Cloud will be available starting in the second half of this year. Microsoft and NVIDIA will also connect Office 365 applications with Omniverse.
Omniverse Cloud is being used by companies to digitize their workflows, from design and engineering to smart factories and 3D content generation for marketing. The automotive industry has been a leading early adopter of Omniverse, including companies such as BMW Group, Geely Lotus, General Motors, and Jaguar Land Rover.
Moving to automotive. Revenue was $296 million, up 1% sequentially and up 114% from a year ago. Our strong year-on-year growth was driven by the ramp of NVIDIA DRIVE Orin across a number of new energy vehicles. As we announced in March, our automotive design win pipeline over the next six years now stands at $14 billion, up from $11 billion a year ago, giving us visibility into continued growth over the coming years.
Sequentially, growth moderated as some of these customers in China are adjusting their production schedules to reflect slower than expected demand growth. We expect this dynamic to linger for the rest of the calendar year.
During the quarter, we expanded our partnership with BYD, the world's leading manufacturer of NEVs. Our new design win will extend BYD's use of DRIVE Orin to its next-generation, high-volume Dynasty and Ocean series of vehicles set to start production in calendar 2024.
Moving to the rest of the P&L. GAAP gross margins were 64.6% and non-GAAP gross margins were 66.8%. Gross margins have now largely recovered to prior peak levels, as we have absorbed higher costs and offset them by innovating and delivering higher-valued products, as well as products incorporating more and more software.
Sequentially, GAAP operating expenses were down 3% and non-GAAP operating expenses were down 1%. We have held OpEx at roughly the same level over the past four quarters, while working through the inventory corrections in gaming and professional visualization.
We now expect to increase investments in the business while also delivering operating leverage. We returned $99 million to shareholders in the form of cash dividends. At the end of Q1, we have approximately $7 billion remaining under our share repurchase authorization through December 2023.
Let me turn to the outlook for the second quarter of fiscal 2024. Total revenue is expected to be $11 billion, plus or minus 2%. We expect this sequential growth to largely be driven by data center, reflecting a steep increase in demand related to generative AI and large language models. This demand has extended our data center visibility out a few quarters, and we have procured substantially higher supply for the second half of the year.
GAAP and non-GAAP gross margins are expected to be 68.6% and 70%, respectively, plus or minus 50 basis points. GAAP and non-GAAP operating expenses are expected to be approximately $2.71 billion and $1.9 billion, respectively. GAAP and non-GAAP other income and expenses are expected to be an income of approximately $90 million, excluding gains and losses from non-affiliated investments. GAAP and non-GAAP tax rates are expected to be 14%, plus or minus 1%, excluding any discrete items.
Capital expenditures are expected to be approximately $300 million to $350 million. Further financial details are included in the CFO commentary and other information available on our IR website. In closing, let me highlight some of the upcoming events.
Jensen will give the Computex keynote address in person in Taipei this coming Monday, May 29 local time, which will be Sunday evening in the U.S. In addition, we will be attending the BofA Global Technology Conference in San Francisco on June 6, the Rosenblatt Virtual Technology Summit on the Age of AI on June 7, and the New Street Future of Transportation virtual conference on June 12.
Our earnings call to discuss the results of our second quarter of fiscal 2024 is scheduled for Wednesday, August 23. That covers our opening remarks. We will now open the call for questions.
Operator, would you please poll for questions? Thank you.
At this time, I'd like to remind everyone, in order to ask a question, press star, then the number one on your telephone keypad. We ask that you please limit yourself to one question.
We'll take our first question from Toshiya Hari with Goldman Sachs. Your line is open.
Hi, good afternoon. Thank you so much for taking the question and congrats on the strong results and incredible outlook.
Just one question on data center. Colette, you mentioned the vast majority of the sequential increase in revenue this quarter will come from data center. I was curious what the construct is there, if you can speak to what the key drivers are from April to July. Perhaps more importantly, you talked about visibility into the second half of the year. I'm guessing it's more of a supply problem at this point. What kind of sequential growth beyond the July quarter can your supply chain support at this point?
Okay, so a lot of different questions there. So let me see if I can start, and I'm sure Jensen will have some follow-up comments. So when we talk about our sequential growth that is expected between Q1 and Q2, our generative AI large language models are driving this surge in demand, and it's broad-based across our consumer internet companies, our CSPs, our enterprises, and our AI startups.
There is also interest in both of our architectures, both our latest Hopper architecture as well as our Ampere architecture. This is not surprising, as we generally often sell both of our architectures at the same time. This is also a key area where deep recommenders are driving growth, and we also expect to see growth both in our computing as well as in our networking business.
So those are some of the key things that we have baked in when we think about the guidance that we've provided to Q2.
We also surfaced in our opening remarks that we are working on both supply today for this quarter, but we have also procured a substantial amount of supply for the second half.
We have some significant supply chain flow to serve our significant customer demand that we see.
And this is demand that we see across a wide range of different customers. They are building platforms for some of the largest enterprises, but also setting things up at the CSPs and the large consumer internet companies.
So we have visibility right now for our data center demand that has probably extended out a few quarters, and this led us to work quickly on procuring that substantial supply for the second half.
I'm going to pause there and see if Jensen wants to add a little bit more.
No, I think you covered it well. Thank you.
Next, we'll go to C.J. Muse with Evercore ISI. Your line is open.
Yeah, good afternoon. Thank you for taking the question. I guess with data center essentially doubling quarter on quarter, two natural kinds of questions that relate to one another come to mind.
Number one, where are we in terms of driving acceleration into servers to support AI?
And as part of that, as you deal with longer cycle times with TSMC and your other partners, how are you thinking about managing the commitments there with where you want to manage your lead times in the coming years to best match that supply and demand?
Yes, C.J. Thanks for the question. I'll start backwards.
Remember, we were in full production of both Ampere and Hopper when the ChatGPT moment came, and it helped everybody crystallize how to transition from the technology of large language models to a product and service based on a chatbot.
The integration of guardrails and alignment systems with reinforcement learning from human feedback, knowledge vector databases for proprietary knowledge, connection to search, all of that came together in a really wonderful way.
And the reason why I call it the iPhone moment, all the technology came together and helped everybody realize what an amazing product it can be and what capabilities it can have.
And so we were already in full production.
NVIDIA's supply chain flow, and our supply chain, is very significant, as you know.
And we built supercomputers in volume and these are giant systems and we built them in volume.
It includes, of course, the GPUs, but on our GPUs the system boards have 35,000 components, and the networking and the fiber optics and the incredible transceivers and the NICs, the SmartNICs, the switches, all of that has to come together in order for us to stand up a data center.
And so we were already in full production, and when the moment came, we had to significantly increase our procurement for the second half, as Colette said.
And we've done it sufficiently well that almost all of a data center's major applications could be accelerated; you could reduce the amount of energy consumed and the amount of cost for a data center substantially, by an order of magnitude.
It costs a lot of money to do it, because you have to do all the software and everything, and you have to build all the systems and so on and so forth.
But you know we've been at it for 15 years.
And what happened is, when generative AI came along, it triggered a killer app for this computing platform that's been in preparation for some time.
And so now we see ourselves in two simultaneous transitions.
The world's $1 trillion data center is populated nearly entirely by CPUs today.
And, you know, $1 trillion: it's growing at $250 billion a year, of course, but over the last four years, call it $1 trillion worth of infrastructure installed.
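As a quick sanity check on that figure, a rough back-of-the-envelope calculation (illustrative only; the per-year spend is the figure cited above):
\[
4\ \text{years} \times \$250\ \text{billion per year} \approx \$1\ \text{trillion of installed data center infrastructure.}
\]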
And it's all completely based on CPUs and dumb NICs.
It's basically unaccelerated.
In the future, it's fairly clear now, with generative AI becoming the primary workload of most of the world's data centers generating information.
It is very clear now, given that accelerated computing is so energy efficient, that the budget of a data center will shift very dramatically towards accelerated computing.
And you're seeing that now.
We're going through that moment right now as we speak.
Well, while the world's data center capex budget is limited, at the same time we're seeing incredible orders to retool the world's data centers.
So I think you're seeing the beginning of, call it, a 10-year transition to basically recycle or reclaim the world's data centers and build them out as accelerated computing. You'll have this pretty dramatic shift in the spend of a data center from traditional computing to accelerated computing with SmartNICs, smart switches, of course GPUs, and the workload is going to be predominantly generative AI.
Okay, we'll move to our next question. Vivek Arya with BofA Securities. Your line is open. Well, thanks for the question. So I just wanted to clarify, does visibility mean data center sales can continue to grow sequentially in Q3 and Q4, or do they sustain at Q2 levels? I just wanted to clarify that. And then my question is that, you know, given this very strong demand environment, what does it do to the competitive landscape? Does it invite more competition in terms of custom ASICs, or does it invite more competition in terms of other GPU solutions or other kinds of solutions? How do you see the competitive landscape change over the next two to three years?
Yeah, Vivek, thanks for the question. Let me see if I can add a little bit more color. We believe that the supply that we will have for the second half of the year will be substantially larger than H1. So we are expecting not only the demand that we just saw in this last quarter and the demand that we have in Q2 for our forecast, but also planning on seeing something in the second half of the year. We just have to be careful here; we're not here to guide on the second half. But yes, we do plan for supply to increase substantially in the second half compared to the first half.
But regarding competition, we have competition from every direction: startups, really, really well funded and innovative startups, countless of them all over the world. We have competition from existing semiconductor companies. We have competition from CSPs with internal projects, and many of you know about most of these. So we're mindful of competition all the time, and we get competition all the time. But NVIDIA's value proposition at the core is we are the lowest cost solution.
We're the lowest TCO solution, and the reason for that is because accelerated computing is two things that I talk about often. It's a full stack problem; it's a full stack challenge. You have to engineer all of the software and all the libraries and all the algorithms, integrate them into and optimize the frameworks, and optimize it for the architecture of not just one chip but the architecture of an entire data center, all the way into the frameworks, all the way into the models. And the amount of engineering and distributed computing, fundamental computer science work is really quite extraordinary. It is the hardest computing as we know it.
And so number one, it's a full stack challenge and you have to optimize it across the whole thing and across just a mind-blowing number of stacks. We have 400 acceleration libraries. As you know, the amount of libraries and frameworks that we accelerate is pretty mind-blowing. And the second part is that generative AI is a large scale problem and it's a data center scale problem. It's another way of thinking that the computer is the data center or the data center is the computer. It's not the chip. It's the data center. And it's never happened like this before.
And in this particular environment, your networking operating system, your distributed computing engines, your understanding of the architecture of the networking gear, the switches, and the computing systems, the computing fabric, that entire system is your computer. And that's what you're trying to operate. And so in order to get the best performance, you have to understand full stack and understand data center scale. That's what accelerator computing is. The second thing is that utilization, which talks about the amount of the types of applications that you can accelerate and diversity of your architecture, keeps the utilization high.
If you can do one thing, and do only one thing, incredibly fast, then your data center is largely underutilized, and it's hard to scale that out. With our universal GPU, the fact that we accelerate so many stacks makes our utilization incredibly high. And so number one is the full stack challenge; that's a software-intensive problem, a data center architecture problem. The second is the utilization, versatility problem. And the third is just data center expertise. We've built five data centers of our own, and we've helped companies all over the world build data centers.
And we integrate our architecture into all the world's clouds. From the moment of delivery of the product to standing up and deployment, the time to operations of a data center, if you're not good at it and not proficient at it, could take months. Standing up a supercomputer: some of the largest supercomputers in the world were installed about a year and a half ago, and now they're coming online. And so it's not unheard of to see a delivery to operations of about a year. Our delivery to operations is measured in weeks. We've taken data centers and supercomputers and we've turned them into products, and the expertise of the team in doing that is incredible.
And so our value proposition is, in the final analysis, that all of this technology translates into infrastructure with the highest throughput and the lowest possible cost. And so I think our market is, of course, very, very competitive, very large, but the challenge is really, really great.
Next, we go to Aaron Rakers with Wells Fargo. Your line is open. Yeah, thank you for taking the question, and congrats on the quarter.
As we kind of think about unpacking the various different growth drivers of the data center business going forward, I'm curious, Colette, just how we should think about the monetization effect of software, considering that the expansion of your cloud service agreements continues to grow. I'm curious where you think we're at in terms of that approach, in terms of the AI enterprise software suite and other drivers of software-only revenue going forward.
Thanks for the question. Software is really important to our accelerated platforms. Not only do we have a substantial amount of software that we are including in our newest architecture and essentially all products that we have, we now have many different models to help customers start their work in generative AI and accelerated computing. So anything that we have here, from DGX Cloud providing those services, helping them build models, or, as you've discussed, the importance of NVIDIA AI Enterprise, essentially the operating system for AI. So all of this should continue to grow as we go forward, both the architecture and the infrastructure, as well as the availability of the software and the monetization of it. And I'll turn it over to Jensen to add more.
Yeah, we can see in real time the growth of generative AI in CSPs, both for training the models, refining the models, as well as deploying the models. As Colette said earlier, inference is now a major driver of accelerated computing because generative AI is used so capably in so many applications already.
There are two segments that require a new stack of software, and the two segments are enterprise and industrial. Enterprise requires a new stack of software because many enterprises need to have all the capabilities that we've talked about, whether it's large language models, the ability to adapt them for your proprietary use case and your proprietary data, in alignment with your own principles and your own operating domains. You want to have the ability to do that in a high performance computing sandbox, and we call that DGX Cloud, and create that model. Then you want to deploy your chatbot or your AI in any cloud, because you have services and you have agreements with multiple cloud vendors, and depending on the applications, you might deploy it on various clouds.
For the enterprise, we have NVIDIA AI Foundations for helping you create custom models, and we have NVIDIA AI Enterprise. NVIDIA AI Enterprise is the only accelerated stack, GPU-accelerated stack, in the world that is enterprise safe and enterprise supported. There is constant patching that you have to do. There are 4,000 different packages that build up NVIDIA AI Enterprise, and it represents the operating engine of the entire AI workflow, the only one of its kind, from data ingestion to data processing. Obviously, in order to train an AI model, you have a lot of data you have to process, package up, curate, and align, and there's just a whole bunch of stuff that you have to do to the data to prepare it for training.
That amount of data could consume some 40%, 50%, 60% of your computing time, so data processing is a very big deal. The second aspect of it is training the model, refining the model, and the third is deploying the model for inferencing. NVIDIA AI Enterprise supports and patches, with security patches, continuously, all of those 4,000 packages of software. And for an enterprise that wants to deploy their engines, it's like they want to deploy Red Hat Linux.
This is, you know, incredibly complicated software. In order to deploy that in every cloud as well as on-prem, it has to be secure, it has to be supported. And so NVIDIA AI Enterprise is the second part. The third is Omniverse.
Just as people are starting to realize that you need to align an AI to ethics, the same is true for robotics: you need to align the AI for physics. Aligning an AI for ethics includes a technology called reinforcement learning from human feedback. In the case of industrial applications and robotics, it's reinforcement learning from Omniverse feedback. And Omniverse is a vital engine for software-defined robotic applications and industries.
And so Omniverse also needs to be a cloud service platform. And so our three software stacks, AI Foundations, AI Enterprise, and Omniverse, run in all of the world's clouds that we have DGX Cloud partnerships with.
With Azure, we have partnerships on both AI as well as Omniverse. With GCP and Oracle, we have great partnerships in DGX Cloud for AI, and AI Enterprise is integrated into all three of them. So, in order for us to extend the reach of AI beyond the cloud and into the world's enterprises and into the world's industries, you need these two new software stacks in order to make that happen.
And by putting it in the cloud, integrated into the world's CSP clouds, it's a great way for us to partner with the sales and marketing teams and leadership teams of all the cloud vendors.
So the first question is for Jensen, on the InfiniBand versus Ethernet argument. Can you sort of speak to that debate and maybe how you see it playing out? I know you need the low latency of InfiniBand for AI, but can you talk about the attach rate of your InfiniBand solutions to what you're shipping on the core compute side, and maybe whether that's similarly crowding out Ethernet like you are on the compute side?
And then the clarification, Colette, is that there wasn't a share buyback despite you still having about $7 billion on the share repo authorization. Was that just timing?
Thanks. How about you go first? Let me take this question. That is correct. We have $7 billion available in our current authorization for repurchases. We did not repurchase anything in this last quarter, but we do repurchase opportunistically, and we'll consider that as we go forward as well.
Thank you. InfiniBand and Ethernet target different applications in a data center. They both have their place. InfiniBand had a record quarter. We're going to have a giant record year. And our Quantum InfiniBand has an exceptional roadmap. It's going to be really incredible.
The two networks are very different. InfiniBand is designed for an AI factory, if you will. If that data center is running a few applications for a few people for a specific use case, and it's doing it continuously, and that infrastructure costs you, pick a number, $500 million, the difference between InfiniBand and Ethernet could be 15-20% in overall throughput.
And if you spent $500 million on an infrastructure and the difference is 10-20%, that's $100 million; InfiniBand is basically free. That's the reason why people use it. InfiniBand is effectively free. The difference in data center throughput is too great to ignore. And you're using it for that one application.
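To spell out that arithmetic, here is a rough illustrative calculation. The $500 million infrastructure cost and the 10-20% throughput difference are the figures cited above; the assumption that the incremental cost of the InfiniBand fabric is well below that value is implied on the call rather than quantified:
\[
\text{Value of extra throughput} \approx (0.10 \text{ to } 0.20) \times \$500\ \text{million} \approx \$50 \text{ to } \$100\ \text{million per deployment.}
\]
If the InfiniBand premium over Ethernet is small relative to that $50-100 million of recovered throughput, the network effectively pays for itself, which is the sense in which InfiniBand is "basically free."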
However, if your data center is a cloud data center and it's multi-tenant, it's a bunch of little jobs, a bunch of little jobs shared by millions of people, then Ethernet's really the right answer. There's a new segment in the middle where the cloud is becoming a generative AI cloud. It's not an AI factory, per se, but it's still a multi-tenant cloud.
But it wants to run generative AI workloads. This new segment is a wonderful opportunity, and I referred to it at the last GTC. At Computex, we're going to announce a major product line for this segment, which is for Ethernet-focused, generative AI application types of clouds.
But InfiniBand is doing fantastically, and we're doing record numbers quarter on quarter and year on year. Next, we'll go to Stacy Rasgon with Bernstein Research. Your line is open.
Hi guys. Thanks for taking my question. I had a question on inference versus training for generative AI. So you're talking about inference being a very large opportunity.
I guess two sub-parts of that. Is that because inference basically scales with usage, versus training being more of a one-and-done? And can you give us some sense, even if it's just qualitative, whether you think inference is bigger than training or vice versa? If it's bigger, how much bigger is it? Is the opportunity 5x, is it 10x? Anything you can give us on those two workloads within generative AI would be helpful.
Yeah. I'll work backwards. You're never done with training. You're always, every time you deploy, you're collecting new data, when you collect new data, you train with the new data. And so you're never done training. You're never done producing and processing a vector database that augments the large language model. You're never done with vectorizing all of the collected, unstructured data that you have.
And so whether you're building a recommender system, a large language model, a vector database, these are probably the three major applications, the three core engines, if you will, of the future of computing. There's a bunch of other stuff, but obviously these are three very important ones. They're always running. You're going to see that more and more companies realize they have a factory for intelligence, an intelligence factory.
And in that particular case, it's largely dedicated to training and processing data and vectorizing data and learning representations of the data, so on and so forth. The inference part of it are APIs, either open APIs that can be connected to all kinds of applications, or APIs that are integrated into workflows; there will be APIs of all kinds in a company.
Some of them they'll build themselves. Many of them could come from companies like ServiceNow and Adobe that we partner with in AI Foundations. And they'll create a whole bunch of generative APIs that companies can then connect into their workflows or use as an application. And of course, there'll be a whole bunch of internet service companies.
So I think you're seeing, for the very first time, simultaneously a very significant growth in the segment of AI factories, as well as a segment that really didn't exist before but is now growing exponentially, practically by the week, for AI inference with APIs.
The simple way to think about it in the end is that the world has a trillion dollars of data center installed, and it used to be 100% CPUs; in the future, it won't be. We've heard it in enough places, and I think this year's ISC keynote was actually about the end of Moore's Law. We've seen it in a lot of places now: you can't reasonably scale out data centers with general purpose computing, and accelerated computing is the path forward.
And now it's got a killer app. It's generative AI. So the easiest way to think about that is your trillion-dollar infrastructure: every quarter's capex budget will lean very heavily into generative AI, into accelerated computing infrastructure, everywhere from the number of GPUs used in the capex budget to the accelerated switches and accelerated networking chips that connect them all.
The easiest way to think about that is, over the next four or five or ten years, most of that trillion dollars, plus compensating for all the growth in data centers still, will be largely generative AI. So the easiest way to think about that is training as well as inference.
Great, thank you. I wanted to follow up on that. In terms of the focus on inference, it's pretty clear that this is a really big opportunity around large language models. But the cloud customers are also talking about trying to reduce cost per query by very significant amounts. Can you talk about the ramifications for you guys?
That's a great question. You start by building a large language model, and you use that large language model, a very large version, and you could distill it into medium, small, and tiny sizes. The tiny ones you could put in your phone and your PC, and so on and so forth. They all have good capabilities; it seems surprising, but they all can do the same thing.
But obviously, the zero-shot capability or the generalizability of the large language model, the biggest one, is much more versatile, and it can do a lot more amazing things. And the large one would teach the smaller ones how to be good AIs.
And so you use the large one to generate prompts to align the smaller ones, and so on and so forth. And so you start by building very large ones, and then you also have to train a whole bunch of smaller ones. Now, that's exactly the reason why we have so many different sizes of our inference platforms.
You saw that I announced L4, L40, H100 NVL, which also has H100, and then we have H100 multi-node with NVLink. And so you could have model sizes of any kind that you like.
The other thing that's important is these are models, but they're connected ultimately to applications, and the applications could have image in, video out; video in, text out; image in, proteins out; text in, 3D out; video in, and in the future, 3D graphics out. You know, so the input and the output require a lot of pre- and post-processing.
The pre- and post-processing can't be ignored, and this is where most of the specialized chip arguments fall apart, because the model itself is only, you know, call it 25% of the overall processing of inference. The rest of it is about pre-processing, post-processing, security, decoding, all kinds of things like that. And so I think the multi-modality aspect of inference, the multi-diversity of inference, means it's going to be done in the cloud and on-prem. It's going to be done in multi-cloud; that's the reason why we have AI Enterprise in all the clouds. It's going to be done on-prem; that's the reason why we have a great partnership with Dell, which we announced the other day, called Project Helix. It's going to be integrated into third-party services; that's the reason why we have a great partnership with ServiceNow and Adobe, because they're going to be creating a whole bunch of generative AI capabilities.
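A rough Amdahl's-law reading of that point, offered as an illustration and assuming the model is roughly 25% of end-to-end inference processing and a specialized chip accelerates only the model (with speedup factor s) while the remaining work is untouched:
\[
\text{Speedup}_{\text{end-to-end}} = \frac{1}{(1-p) + p/s} \;\le\; \frac{1}{1-p} = \frac{1}{0.75} \approx 1.33\times, \qquad p \approx 0.25.
\]
So even an infinitely fast model-only accelerator caps the end-to-end gain at roughly 1.33x unless the pre-processing, post-processing, and other surrounding work are accelerated as well.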
And so there's all the diversity and the reach of generative AI is so, so broad. You need to have some very fundamental capabilities like what I just described in order to really address the whole space of it.
Next, we'll go to Harlan Sur with JPMorgan. Your line is open. Hi, good afternoon, and congratulations on the strong results and execution. I really appreciate more of the focus, or some of the focus, today on your networking products. I mean, it's really an integral part that sort of maximizes the full performance of your compute platforms. I think the data center networking business is driving about a billion dollars of revenue per quarter, plus or minus. That's two and a half times growth from three years ago, right, when you guys acquired Mellanox. Also very strong growth.
But given the very high attach rate of your InfiniBand and Ethernet solutions to your accelerated compute platforms, is the networking run rate stepping up in line with your compute shipments? And then what is the team doing to further unlock more networking bandwidth going forward, just to keep pace with the significant increase in compute complexity, data sets, requirements for lower latency, better traffic predictability, and so on?
Yeah, Harlan, I really appreciate that. Nearly everybody who thinks about AI, they think about that chip, that accelerator chip, and in fact misses the whole point nearly completely. I've mentioned before that accelerated computing is about the stack, about the software. And networking: remember, we announced very early on this networking stack called DOCA, and we have the acceleration library called Magnum IO. These two pieces of software are some of the crown jewels of our company. Nobody ever talks about it because it's hard to understand, but it makes it possible for us to connect tens of thousands of GPUs. How do you connect tens of thousands of GPUs if the operating system of the data center, which is the infrastructure, is not insanely great?
And so that's the reason why we're so obsessed about networking in the company. And one of the great things that we have: Mellanox, as you know quite well, was the world's highest performance and the unambiguous leader in high performance networking. That's the reason why our two companies are together. You also see that our network expands, starting from NVLink, which is a computing fabric with really super low latency, and it communicates using memory references, not network packets. And then we take NVLink, we connect it inside multiple GPUs, and I've described going beyond the GPU, and I'll talk a lot more about that at Computex in a few days. And then that gets connected to InfiniBand, which includes the NIC, the SmartNIC, BlueField-3, that we're in full production with, and the switches, all of the fiber optics that are optimized end to end. These things are running at incredible line rates.
And then beyond that, if you want to connect the smart AI factory, this AI factory, into your computing fabric, we have a brand new type of Ethernet that we'll be announcing at Computex. So this whole area of the computing fabric, extending, connecting all of these GPUs and computing units together, all the way through the networking, through the switches, the software stack, is insanely complicated. And so we're delighted you understand it, but we don't break it out particularly, because we think of the whole thing as a computing platform, as it should be. We sell it to all of the world's data centers as components so that they can integrate it into whatever style or architecture that they would like, and we can still run our software stack.
That's the reason why we break it up. It's way more complicated the way that we do it, but it makes it possible for NVIDIA's computing architecture to be integrated into anybody's data center in the world, from clouds of all different kinds to on-prem of all different kinds, all the way out to the edge, to 5G. And so this way of doing it is really complicated, but it gives us incredible reach.
And our last question will come from Matt Ramsay with TD Cowen. Your line is open. Thank you very much. Congratulations, Jensen, and to the whole team. One of the things I wanted to dig into a little bit is the DGX Cloud offering. You guys have been working on this for some time behind the scenes, where you sell in the hardware to your hyperscale partners and then lease it back for your own business, and the rest of us kind of found out about it publicly a few months ago. As we look forward over the next number of quarters, as Colette discussed, to high visibility in the data center business, could you talk a little bit about the mix you're seeing of hyperscale customers buying for their own first-party internal workloads, versus their third-party customers, versus what portion of that big upside in data center going forward is systems that you're selling in with the potential to support your DGX Cloud offerings, and what you've learned since you launched it about the potential of that business?
Thanks. Yeah, thanks, Matt. Without being too specific about numbers, the ideal scenario, the ideal mix, is something like 10% NVIDIA DGX Cloud and 90% the CSPs' clouds. And the reason: our DGX Cloud is the NVIDIA stack, it's the pure NVIDIA stack. It is architected the way we like, and it achieves the best possible performance. It gives us the ability to partner very deeply with the CSPs to create the highest performing infrastructure.
Number one. Number two, it allows us to partner with the CSPs to create markets. For example, we're partnering with Azure to bring Omniverse Cloud to the world's industries. And the world's never had a system like that: a computing stack with all the generative AI stuff and all the 3D stuff and the physics stuff, incredibly large databases, and really high speed, low latency networks. That kind of virtual industrial world has never existed before. And so we partnered with Microsoft to create Omniverse Cloud inside Azure Cloud. And so it allows us, number two, to create new applications together and develop new markets together. And we go to market as one team, and we benefit by getting customers on our computing platform, and they benefit by having us in their cloud, number one. But number two, the amount of data and services, security services, and all of the amazing things that Azure and GCP and OCI have, customers can instantly have access to through Omniverse Cloud. And so it's a huge win-win. And for the customers, the way that NVIDIA's cloud works for these early applications, they could do it anywhere.
So one standard stack runs in all the clouds. And if they would like to take their software and run it on the CSP's cloud themselves and manage it themselves, we're delighted by that, because NVIDIA AI Enterprise, NVIDIA AI Foundations, and, longer term, this is going to take a little longer, NVIDIA Omniverse will run in the CSPs' clouds. Okay, so our goal really is to drive architecture, to partner deeply in creating new markets and the new applications that we're doing, and to provide our customers with the flexibility to run NVIDIA everywhere, including on-prem. And so those were the primary reasons for it, and it's worked out incredibly well. Our partnership with the three CSPs that we currently have DGX Cloud in, and their sales and marketing teams, their leadership teams, is really quite spectacular. It works great.
Thank you. I'll now turn it back over to Jensen Huang for closing remarks. The computer industry is going through two simultaneous transitions, accelerated computing and generative AI. CPU scaling has slowed, yet computing demand is strong, and now, with generative AI, supercharged. Accelerated computing, a full-stack and data center scale approach that NVIDIA pioneered, is the best path forward.
There's a trillion dollars installed in the global data center infrastructure based on the general purpose computing method of the last era. Companies are now racing to deploy accelerated computing for the generative AI era. Over the next decade, most of the world's data centers will be accelerated. We are significantly increasing our supply to meet their surging demand.
Large language models can learn information encoded in many forms. Guided by large language models, generative AI models can generate amazing content. And with models to fine-tune, guardrail, align to guiding principles, and ground the facts, generative AI is emerging from labs and is on its way to industrial applications.
As we scale with cloud and internet service providers, we are also building platforms for the world's largest enterprises. Whether within one of our CSP partners or on-prem with Dell Helix, whether on a leading enterprise platform like ServiceNow and Adobe or bespoke with NVIDIA AI Foundations, we can help enterprises leverage their domain expertise and data to harness generative AI securely and safely.
We are ramping a wave of products in the coming quarters, including H100, our Grace and Grace Hopper Superchips, and our BlueField-3 and Spectrum-4 networking platforms. They are all in production. They will help deliver data center scale computing that is also energy-efficient and sustainable computing.