I think you can't ignore the fact that the sums of money going into this industry are truly gargantuan. The circularity of these deals is interesting. Things can flip quite quickly. One gigawatt of AI data center capacity costs basically $50 billion in capex. On an annual running basis, it costs another eight to nine, maybe even $11 billion to run. Companies are trying to do deals with anybody who has any capacity. In the short term, what many GPU data centers are getting powered on is just gas turbines. It's wild that we've come to the point where we just want an AI that works on our computer, but to get that, you need to have so many more powerful systems collaborate.
It was the first time you had a system that could show its reasoning. Since then to now, the progress is pretty astounding. Hi, I'm Matt Turck from FirstMark. Welcome to the MAD Podcast. Today, I'm excited to welcome back Nathan Benaich, founder of Air Street Capital, to discuss the 2025 edition of his State of AI report, a must-read on where the field really is. We cover a lot, including why power is the new bottleneck, reasoning and chain-of-action robotics, and the business reality: revenue, margins, and what it means for builders and investors. Please enjoy this great conversation with Nathan. Nathan, great to have you back. Thanks for having me.
The State of AI 2025 is out, and as always, it's essential reading for anyone who's serious about understanding AI. This year, it's 312 slides of goodness. A bit of a big year in AI. Every year, I try to cut it down a little bit, but this year it just felt like we were sharing it with various sub-communities of the AI community, and each time we did that, the robotics folks would be like, hey, it's a little bit light on robotics, can you add some more? And then we'd send it to the bio folks and they'd be like, why didn't you cite this paper or that paper? And hence the inflation. Amazing.
All right, so we're certainly not going to cover everything in this conversation, obviously. As always, the report is available in its entirety for free at stateof.ai. So we're going to riff on some of the most important topics and ideas in the report, but obviously people can go and check out the report directly for more. All right, so starting from the top in the world of research, you mention that 2025 was the year reasoning got real. So how far have we come in the last 12 months? Pretty far. About 12 months ago, we had, I think, the very early inklings of it with o1-preview, around this time last year. And that was the first time you had a system that could show its reasoning, show its stepwise process to get to a more complicated answer.
And this has generally been the dream in AI for a long time. And since then to now, I'd say the progress is pretty astounding. One of the areas where the progress has unveiled itself is mathematics and other verifiable domains, where you can explicitly say, yes, the system works or doesn't work. And we saw gold medals on the International Math Olympiad by a couple of labs, including OpenAI and DeepMind. If you'd asked experts how long that would have taken, they'd probably have said a decade. Then, in areas a bit closer to my heart in biology and science, we've seen reasoning models be used as an AI co-scientist.
So just as a human would be reading lots of papers, planning experiments, running the experiments, doing data analysis, and then reformulating their hypothesis as a result, there are examples of models doing that in lieu of a human, which is exciting because there are way too many papers to read. AI people complain that it's like 50,000 papers a year, and in biology, chemistry and physics it's probably an order of magnitude more than that. And DeepMind has shown that you can use this kind of reasoning model to decipher new targets for disease, new mechanisms, that were actually also proven in a wet lab setting after the fact. We've gone from systems that were kind of dumb, stochastic parrots, and now they can solve pretty meaningful challenges that, I'd say, even a smart human couldn't.
And still in research, you talk a little bit in the report, or a lot in the report, about robotics and this evolution towards a system of action, or chain of action, going from chain of thought to chain of action. What's happening there? Yeah. I mean, just probably two years ago, robotics was kind of a dead end. OpenAI had disbanded its robotics team, which was famous for solving the Rubik's Cube with a robotic hand. And now robotics is probably going through a Cambrian explosion. There's so much excitement.
And just as how language models informed biology, now language models are also informing robotics. So what you're referring to here is a sort of reasoning process for robots, where a system is no longer just perceiving the environment, deciding how to act, and then acting. We've separated those steps. So now you have a reasoning model that looks at a task and tries to plan the steps a robot would need to take to execute that task, and then passes that plan over to an actuator, which goes and actually implements the plan. And that's what's called a chain of action. Here the Allen Institute was one of the first to really push this, and fairly swiftly thereafter Gemini also followed, and we have some companies, including Sereact, that are applying this in the real world. So it does genuinely work. It's not just a research thing.
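To make that separation concrete, here's a minimal sketch of the chain-of-action pattern as described above, written in Python. The planner/actuator split and every function name here are illustrative assumptions for the sake of the example, not any particular lab's API.

```python
# Minimal sketch of the chain-of-action pattern described above.
# The planner/actuator split and all names are illustrative assumptions,
# not any specific lab's or company's API.

from dataclasses import dataclass


@dataclass
class Step:
    description: str  # e.g. "move gripper above the red cup"


def plan_steps(task: str) -> list[Step]:
    """Reasoning model: decompose a task into an ordered plan (chain of thought)."""
    # In a real system this would be a call to a vision-language reasoning model.
    return [Step(f"sub-step {i} toward: {task}") for i in range(1, 4)]


def execute_step(step: Step) -> bool:
    """Actuator policy: turn one planned step into motor commands."""
    # Placeholder for a low-level control policy; reports success or failure.
    print(f"executing: {step.description}")
    return True


def chain_of_action(task: str) -> bool:
    plan = plan_steps(task)                    # plan first...
    return all(execute_step(s) for s in plan)  # ...then act, step by step


if __name__ == "__main__":
    chain_of_action("put the red cup in the sink")
```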
So, do we think the big moment for robotics is upon us? Because we all collectively have been talking about this for a very long time.
Yeah. Yeah. Well, I'd say it really is upon us in the industrial sector, in logistics and warehousing, kind of more constrained environments with very repetitive tasks. Then there is the more holy-grail version of this, the embodied human-like form factor, and putting a model on that, which might even be the same model that's being used in warehousing. A lot of money is going into that, but my personal bet is that the humanoid space is going to look much more like self-driving, where we have some very good isolated demos but the long tail will kill you.
Literally, in this case. Yeah. And so we're going to go through many false starts. I think this is just the start. Okay. Great. So a big year in robotics and reasoning. For people listening to this, if you're interested in deep dives into reasoning and RL in the evolution of AI systems, we've done a bunch of great episodes recently with Sholto from Anthropic, Jerry from OpenAI, and then Julian from Anthropic, if you're curious to learn more.
Let's move on to the business of AI. You mention in the report that the business of AI finally caught up with the hype. What caught your attention in terms of facts and stats in the last 12 months?
Yeah, a couple of them. Again, where we came from one or two years ago was just tons of money going into this segment, building models, a lot of usage, but not clear where the revenue would come from. I think OpenAI was maybe making $50 million or something two years ago, and it was very unclear how they would ever hit billions of revenue. Nowadays, I think if you sum the top 20 or so major AI companies, from the labs to the most popular vertical applications, across them they're making tens of billions of dollars of revenue.
You can look at the smaller-scale companies, which are growing from zero to 20 million, or from 20 million plus. As a group, they generally grow about 60 percent faster on a quarterly basis than non-AI companies. We've all seen the famous charts about ARR, or not-quite-ARR, it's unclear, but very steep curves for various coding companies. Perhaps most interestingly, across a segment of 43,000 or so US customers, we worked with Ramp to show that the retention of subscriptions on AI products across this customer set has really improved markedly since 2022.
Around 2022 it was around 50 percent after 12 months, and now in '25 it's hitting around 80 percent. And the second stat in that analysis that was interesting was that the total spend on AI products per customer went up from $35,000 or so maybe two years ago; now it's around half a million dollars, and it's predicted to hit a million dollars next year.
Yeah, exactly. And there might be some slight sampling bias in what kind of companies use Ramp in the first place. Yeah, slightly more modern and tech-forward companies. But a leading indicator, I think, of where things could go. And then you had your own survey, right, of 1,200 AI practitioners.
Yeah, that was, I was surprised. Obviously, it biases more towards pretty well-educated US and European professionals. A lot of people in there have at least undergrad or master's degrees, maybe even more. But it's like 95 percent of people use AI in their personal life and in their professional life. About 76 percent of people pay out of their own pocket for it. And something like 10 percent of people pay more than 200 bucks a month for it.
And then looking at the organizations that they work at, something like 70 percent of those organizations are spending a ton more, or at least more than they did in the past, on AI. The reasons they gave for why they might not be spending more, or what problems they have, are all the classic new-technology stuff: it's a bit hard to configure, I haven't really figured out the ROI yet because I need to do more customization, there are some data privacy issues. And I think all these things are kind of solvable. It's not rocket science. Yeah, it feels like we very much lived this year in the world of shadow AI in companies, where, and the match isn't perfect, but to reconcile your two stats, 44 percent of businesses use AI, yet 95 percent of people individually use AI.
Yeah. So there's a bunch of people, as you're alluding to, that use AI at work without it being officially sanctioned. Yeah. And I think there's still a big education gap. I mean, there was a study bandied around a couple of weeks ago where, you know, it said 95 percent of businesses get no value from AI. Very, very controversial.
Yeah. Yeah. But 95 percent seems to be the number for everything. And I think it turned out it was not the models that were bad, it's that the implementations of them were not great. So I think there's just a big education gap for how you should update your view of your own day-to-day tasks and apply the capabilities you know models have, and then think about, hey, should I be doing this task myself or can I farm it out to a model?
And there's definitely a delta between companies that really get this done well and others that are basically clueless. What do you make of the margin debate, as an investor and astute analyst? Maybe recap what that debate is, and then what you think about it. At a high level, the margin problem is that for many, many customers of large model companies, their margins are basically dictated by how much the model vendor charges them.
Now, here there are some issues, because right now model vendors are charging the same amount per token. So if you're a hedge fund analyst and I'm a student, your use case is clearly more financially valuable than mine, but we pay the same amount per token, assuming we use the same model. There are some use cases that are more reasoning-heavy, as we discussed before, and they consume a ton of tokens. And the pricing that a customer pays for that product might not be fit for the amount of work the AI system is doing.
And so there are cases where these kinds of vertical products are making gross margins of like 30%, and sometimes they get worse with scale, because you do have some edge users that really pump the system, and you can't price discriminate, or they haven't managed to. And then you have some segment of vendors that have both a paid plan and a free plan, and it's not clear whether they include the costs of running the free plan in their gross margin.
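As a back-of-the-envelope illustration of that squeeze, here's a toy margin model in Python. The subscription price and the per-token price are invented numbers purely to show the mechanics of flat per-token costs meeting flat subscription pricing; only the point that reasoning-heavy users consume many more tokens comes from the conversation.

```python
# Toy gross-margin model for a vertical AI app that pays a flat per-token
# price to a model vendor but charges users a flat subscription.
# All numbers are invented purely to illustrate the mechanics discussed above.

PRICE_PER_1M_TOKENS = 10.0   # $ paid to the model vendor (assumed)
SUBSCRIPTION_PRICE = 50.0    # $ charged per user per month (assumed)


def gross_margin(tokens_per_user_per_month: float) -> float:
    cost = tokens_per_user_per_month / 1_000_000 * PRICE_PER_1M_TOKENS
    return (SUBSCRIPTION_PRICE - cost) / SUBSCRIPTION_PRICE


for label, tokens in [("light user", 500_000),
                      ("typical user", 2_000_000),
                      ("reasoning-heavy power user", 8_000_000)]:
    print(f"{label:28s} gross margin: {gross_margin(tokens):7.1%}")

# light user ~90%, typical user ~60%, power user ~-60%: a handful of heavy
# reasoning users can drag the blended margin down, which is why margins can
# get worse with scale when you can't price discriminate.
```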
They sort of just look at their paid customers, so there are some creative accounting standards going on there. And then you have the model vendors themselves and what their margin is. And I think what's interesting in the last year is you've seen CEOs of these model companies say, hey, if we basically look at it in financial analysis terms, like a layer cake of what revenue is generated by each vintage of model over time, it looks like prior models are profitable.
So the amount of money we've spent to build them is less than the amount of money that we've generated with them over time, assuming a certain margin on inference costs. So really, these labs are not profitable because vastly more resources are going into developing next-generation systems than went into the prior ones. But as you and I both know, there are companies here that are making very, very good margins on serving their AI systems, like 70, 80, sometimes 90%, depending on the modality. And so, like with everything, the average number sucks.
But when you look at the best companies, it's really good. And just to drive it home, the companies using those models, we're talking in part about all of what used to be known as thin wrappers, so the vendors that happen to be powered by those models: the Cursors, the Windsurfs, and all the legal and financial AI startups, as examples. The other big debate in the business of AI, of course, is the bubble question.
What's your take? Are we in an AI bubble? Are we not in an AI bubble? Yeah. I think, like with most things in markets, there are probably localized bubbles all over the place. And at a high level, it's interesting in terms of vibes who's calling bubbles and who's not: the finance crowd in New York is talking about bubbles a lot more than people in San Francisco, where the view is that this is the golden era of AI and a lot of things are working. We have so much more to do, compute buildouts are enabling us to experiment a lot faster, and this huge flood of talent that built the consumer internet and cloud computing is moving into AI.
And with that comes a lot of optimization techniques and knowledge that AI researchers didn't have when they built the first generations of ChatGPT, etc. But I think you can't ignore the fact that the sums of money going into this industry are truly gargantuan. You know, like $500 billion to build Stargate, and then a couple hundred billion here, a couple hundred billion there, and pretty soon it's real money. And then the circularity of these deals is interesting. Of course, Nvidia is at the center of this, and it has incentives to use its balance sheet to spin the wheel faster.
And then, perhaps more concerningly, you have this sort of offloading of debt by big companies, for example Meta, which raises tens of billions of dollars to fuel its data center ambitions, but that debt doesn't sit on Meta's balance sheet. Some of this is like catnip to financial engineers, but it rests on certain assumptions: that everything is going to keep going up and to the right, and that rates don't materially change. And just given how precarious various aspects of the economy are, and how sensitive geopolitics is, things can flip quite quickly.
But I think that's the major risk. The risk I'm less worried about is that the stuff doesn't work, because I think it does work. So it's more a question of timing, to play that back: whether the supply side of the market is met by an equally strong, or ideally stronger, demand side. Yeah, there's that, and then just the nuances of the terms on the debt and what the trigger events are where the rates later get repriced. Investors behave very differently once rates change, and flows of money can be quite violent.
It's interesting what you're saying about the dichotomy between Wall Street and the West Coast, also because, when you think about it, there are actually not that many pure-play AI companies in public markets, right? A lot of the action is happening in private markets. So effectively, if you're a Wall Street slash hedge fund investor, you invest in Nvidia, you invest in the Mag Seven, and that's pretty much it, right? Palantir, C3 AI, maybe SoftBank for its position in OpenAI, but pretty much all of it is indirect. Or you invest in power and energy related players, CoreWeave I guess, but it's very, very small. So it feels like there's that tension as well.
Yeah, yeah, but I think it's also the crowd that you hang out with. I mean, do you live in a house in San Francisco with two or three other AI geniuses? Correct, correct. Or do you just consume the outputs of those kinds of conversations on Twitter and then try to piece together your own world view?
And I think the other part of this is that I don't think some of those individuals are really shilling that much anymore. I think they do genuinely believe what they say, and they are at the coalface of the advances in these technologies. And so if they've been saying for the last 50 times, hey, this stuff is working, there are lots of implementations we can improve, or things we can tweak, or new experiments that'll yield better capabilities, and that has happened, at some point you've got to think maybe they're right.
Another aspect of this that is fascinating to me is the sort of dichotomy between some of, I'll call them the old guard, and the newer, younger folks. So from Rich Sutton to Yann LeCun to, obviously, Geoffrey Hinton, a lot of those guys who are absolutely the godfathers of the space, who built this entire thing and are still extremely active today, on top of everything, say that LLMs are just not going to get us there, or that we should just do everything with RL. And meanwhile, the younger guys, and they tend to be at places like Anthropic and OpenAI, so maybe they do have an agenda, are all saying, well, we've just scratched the surface of what we can do with those modern systems.
Yeah, yeah, I think you do both. But for me, it's mostly about what kinds of new problems you can work on and solve with this technology. And I think it's becoming more popular to believe that the overhang of problems we can solve in enterprise, for consumers, and in science with the tools we have today is huge. And so even if a lot of this compute buildout doesn't go towards dreaming up the next transformer architecture, but goes into improving the unit economics of serving AI systems for everybody, and makes it easier so you don't have to be some prompt master to elicit the behavior you want for your task, I think that's no bad thing.
All right, let's switch to the physical reality that this whole stack sits on: infrastructure, data centers, energy. You mention in the deck that power has become the new bottleneck. What is your sense of the state of play in the energy procurement game? The biggest stat for me is that one gigawatt of AI data center capacity basically costs $50 billion in capex, and on an annual running basis it costs another eight to nine, maybe even eleven billion dollars to run. And so when you have, just casually, a 10-gigawatt data center, that's a lot of money.
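To put those numbers together, here's a rough back-of-the-envelope in Python. The $50B-per-gigawatt capex and $8-11B-per-gigawatt annual opex come straight from the conversation; the 10 GW scale is the "casual" example mentioned, and the five-year horizon is an illustrative assumption.

```python
# Back-of-the-envelope on the data center figures quoted above.
# Capex (~$50B per GW) and opex (~$8-11B per GW per year) are from the
# conversation; the 10 GW scale and 5-year horizon are illustrative.

CAPEX_PER_GW = 50e9                            # $ up front
OPEX_PER_GW_LOW, OPEX_PER_GW_HIGH = 8e9, 11e9  # $ per year


def total_cost(gigawatts: float, years: float) -> tuple[float, float]:
    low = gigawatts * (CAPEX_PER_GW + OPEX_PER_GW_LOW * years)
    high = gigawatts * (CAPEX_PER_GW + OPEX_PER_GW_HIGH * years)
    return low, high


low, high = total_cost(gigawatts=10, years=5)
print(f"10 GW over 5 years: ${low / 1e12:.2f}T to ${high / 1e12:.2f}T")
# Roughly $0.90T to $1.05T -- "a lot of money."
```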
And so one of the problems is, where does this energy come from? Traditionally it would be from, I don't know, coal or natural gas, potentially solar, or ideally at some point in the future, nuclear. And what we're seeing right now is companies trying to do deals with anybody who has any capacity. So you can count some deals with future nuclear reactor companies that would take maybe a decade or two decades to deliver. Famously, yeah, that's Google inking a PPA deal, yeah, with CFS to buy 200 megawatts of electricity from a planned fusion plant. So the plant does not exist yet. It does not exist yet.
And then last year we documented the restarting of Three Mile Island, yes, the nuclear facility, which was controversial in the past. And then, in the short term, what many GPU data centers are getting powered on is just gas turbines, because these can get set up a lot faster. But that has other issues: they're super loud, and there's demand outside of the US for these things, so now US tech companies are basically paying more to repatriate some of the supply that should have been shipped abroad.
The other issue is the grid, and to what degree the grid can even tolerate data centers getting plugged into it. Now obviously these turbines are off-grid, so that has some advantages. But we do some analysis comparing the US and China with regards to energy, and China has a lot more slack in its system to plug in any unpredictable demands for energy. The UK, famously, cannot really tolerate more data centers on its grid. Wrapping all this together is driving some of the offshoring of data centers towards energy-rich countries, whether that's the UAE or even Norway. And with that comes a lot of geopolitics: are these nations friends or potentially not, and how do you ensure access to this regardless of administration changes or other things?
So yeah, it's wild that we've come to the point where we just want an AI that works on our computer, but to get that, you need to have so many more powerful systems collaborate. Yeah, and I was just looking for this slide as you spoke, especially for the United States versus China. We're talking about a dramatic difference, where the capacity added in 2024 for the US, if I read this correctly, was 48.6 gigawatts, whereas China was 429 gigawatts. The other thing that's interesting is that, at least the states in the US, and I assume also internationally, that are good for hosting data centers because there's energy, typically are extremely dry. And we also chronicle the water usage that's needed for cooling these data centers. So if your state is super dry, where do you get the water from? Is that actually going to detract from human populations that need the water? Then you have this whole recycling of water, which could potentially end up with bad-quality water getting circulated into the water system. So the sustainability aspect of all of this seems extraordinarily important.
Yeah, and under-discussed, at least that's my perspective. Is that correct? Do people actually care and do something about the sustainability aspect of this? Well, a year or two ago, big companies did make commitments to be green as of, you know, 2030, and then as soon as they started inking deals with nuclear companies and various energy providers for data centers, all those commitments basically got washed away. So it seems like maybe they care, but the corporate priorities of making AI work have way outweighed the environmental constraints. That's what's happened. But I think, again going back to the politics side of things, I don't think everybody's very happy about this. In particular, there's this growth of NIMBYism, this "not in my backyard" sentiment, and I do think that people generally don't want to have a data center in their backyard. I think that's going to drive some of the political agendas going forward, whether it's in the US or other countries. So yes, people do care about environmentalism; companies sort of washed it away, but I think it's going to come back.
If we talk about infrastructure, obviously we have to talk about Nvidia. It feels like it's been another extraordinary, yeah, last 12 months for Nvidia. Yep. Do you see Nvidia continuing to break away as the undisputed number one in the market, or do you think that sooner or later we're going to end up with a multi-silicon kind of world? I think it's going to be 95-5. And the 95 percent is Nvidia? Yeah, exactly. So for context, when we did the executive summary last year, we noted Nvidia had hit one trillion for the first time, and now we've had to change that to four trillion. We look at all the open-source AI research papers every year, it's about 49,000 or so, and then programmatically determine which chipsets are used in those papers. So we know, hey, an AI researcher is doing a study on, I don't know, some new model, and in their experimental setup they say, we trained the model for X number of GPU-hours on whatever chip.
And if you do that analysis, you basically find that 90 percent of all papers make use of an Nvidia chip. Out of that same analysis, we did find that AMD is popping up a very little bit. Apple silicon is as well; I think it's just because the compute in a MacBook is getting so good that people are doing local training and experiments on their computer. But Broadcom is experiencing a resurrection of some sort as well. Yeah, exactly. I think it was maybe a decade ago they bought a company that is now essentially the internal team doing the custom ASICs for Google's TPU, and more recently they announced a deal with OpenAI to do a custom chip as well.
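For a sense of the kind of scan Nathan describes, programmatically determining which chipsets papers use, here is a rough sketch in Python. The regex patterns, vendor buckets and sample sentences are illustrative assumptions, not the report's actual pipeline.

```python
# Rough sketch of counting hardware mentions across a corpus of papers, in the
# spirit of the analysis described above. Patterns, vendor buckets and sample
# texts are illustrative assumptions, not the report's actual pipeline.

import re
from collections import Counter

CHIP_PATTERNS = {
    "NVIDIA": re.compile(r"\b(A100|H100|H200|V100|RTX \d+|NVIDIA)\b", re.I),
    "AMD":    re.compile(r"\b(MI250|MI300|AMD Instinct)\b", re.I),
    "Google": re.compile(r"\bTPU", re.I),
    "Apple":  re.compile(r"\b(M1|M2|M3|Apple Silicon)\b", re.I),
}


def count_chip_mentions(papers: list[str]) -> Counter:
    """Count how many papers mention each hardware family at least once."""
    counts: Counter = Counter()
    for text in papers:
        for vendor, pattern in CHIP_PATTERNS.items():
            if pattern.search(text):
                counts[vendor] += 1
    return counts


papers = [
    "We train for 1,200 GPU-hours on 8x NVIDIA A100 80GB.",
    "All experiments run on a TPU-v5e pod.",
    "The model was fine-tuned locally on an M2 MacBook Pro.",
]
print(count_chip_mentions(papers))  # each vendor counted once in this toy corpus
```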
At a high level, what's interesting with the rise of Broadcom is that GPUs have been the dominant chipset for a long time, while the nature of the neural network or other AI system you're running on the hardware was still changing very rapidly. But as soon as you get to a point where there's some convergence on an architecture that looks stable, is revenue-generating, and developers are coming to work on it and confirm that it is the thing, then you can flip towards doing a custom chip that's built to extract the most value out of that architecture. So the rise of Broadcom basically tells you there are strong forces saying the transformer is the thing.
But at the end of the day, we also look at how your dollar would best be used as an investor if you wanted to bet on challenger companies. In a graph in the report, we look at six of the major contenders to Nvidia and basically ask: if you had bought Nvidia stock on the day of the announcement of each of the private rounds in these companies, what would the value of your stock be in Nvidia versus in these companies? If I recall correctly, it's basically 12x in Nvidia versus 2x in these competitors, and the trend was roughly the same last year. So I think it's a bit of a difficult beast to bet against.
Yes, I was looking for that slide as you were speaking. For anybody that looks at the report, it's slide 166, which says: what would have happened if investors had just bought the equivalent amount of Nvidia stock at that day's price? The $7.5 billion would be worth $85 billion in Nvidia stock today, 12x, versus $14 billion, 2x, for its contenders, the contenders being Groq, Cerebras, SambaNova, Celestial, Graphcore. And in China, Cambricon has experienced a big run. This is a private company that then went public on a Chinese stock exchange to build custom ASICs for AI, and that was driven mostly by the geopolitical zigzagging on policy with regards to exporting custom Nvidia chips to China, the H20, which at some point was deemed to be okay by the government, then deemed to be not okay, but then okay if 15 to 20 percent of the revenue was passed back to the US government. And then someone high up in the US administration said our goal is basically to ship the crappy stuff to China, and at that point the Chinese said no thank you, and effectively said no one can buy Nvidia chips, and then Cambricon's stock ripped. And this is Huawei as well, right? That's the emergence of a separate Chinese full stack.
Yeah, from the models, which we'll probably talk about at some point in this conversation on open source, but very much at the chip layer too: Cambricon, as you mentioned, and then Huawei, whatever the model is, becoming the sort of default chip for the Chinese stack. Yeah, and there's some interplay between the government trying to get DeepSeek and other labs to train their models on Chinese chips, and there have been rumors that this is why a lot of the new generations of Chinese models have slowed down, particularly DeepSeek. People are waiting for the successor to R1, so R2, and allegedly it's because it's just hard to train it on Huawei.
To double-click on something that you mentioned a few minutes ago, talk about sovereign AI and what you've seen people do. It seems to have been a big theme of the year; you mentioned OpenAI in Norway, India and the UAE. What's happening in that part of the world? Yeah, so the idea with sovereign AI is that nation states want to be able to control basically their fate with regards to AI, so that's running models, training models, having chips. And this is basically because nation states want to have control over their energy, control over their currency, control over their infrastructure, and AI is deemed to be kind of equivalent to those categories.
And so ever since the White House announcement of $500 billion in January, various nation states have followed suit, saying, you know, we have our own initiative, and it's to the tune of billions of dollars, etc., around the world. Nvidia has even started marketing this as a new kind of product line for its business, one that currently generates, I think, around $20 billion. So it's real money. And so they're forming partnerships with various nation states to provide data centers there that are run locally, and in theory that should give countries comfort that their access to AI can't be turned off. That's the idea.
I personally think it's a bit more of an alignment between political agendas, where particularly in the US it's really about reindustrialization, onshoring of key industries and building manufacturing and things like that, which I think is one of the reasons why these AI data centers are getting rebranded as AI factories. So that's the political part, and that's getting aligned with the need of countries to get access to this technology.
So I think it's more marketing than it is real policy, because at the end of the day, if you buy your stack from the US and you fall out with the US at some point, then they'll just switch it off. So part of this is sovereignty-washing, I think. It also oversimplifies the very interconnected nature and ecosystem aspect of AI: it's not just about the chip, it's about the developer ecosystem, how you actually run it, where your training data comes from, and all the infrastructure, like data tools and whatnot, that sits around this. Although that's where open source plays an important role, right?
If you get your AI from OpenAI, and indeed you are a US ally, but you no longer are a US ally for whatever reason, there's a risk that you'd be turned off. But if you have a sovereign data center with a bunch of chips running, and then you run open source on top of it, presumably you are safe. Which is then interesting, because where is the most popular open source coming from now? Yes, China. China, yeah. Although interestingly, I think since you published the report there's been the announcement of a very large investment in Reflection AI, which is a New York and San Francisco based company that just raised two billion, yep, to build the US equivalent of the Chinese models, in a world where Llama and Meta sort of went in a different direction. Yeah, I think this is fascinating, because part of the AI Action Plan that was published by the US government a couple of months ago now articulated the need for having this American AI stack, so they're moving away from diffusion controls and more towards "just buy our stuff."
And then one of the other aspects of that action plan was around open source, and sort of leaning in that direction. And of course, as you said, Meta stepped back, and into the fold came Qwen. I think 50% of all model derivatives being downloaded from Hugging Face are Qwen-based now, hundreds of millions of downloads, partially because they come in very accessible shapes and flavors. So as a result of that, we predicted in the report that a major AI lab would lean back into open source to win, basically, brownie points with the government, and then the next day this financing happened. Oh really? Great timing. Yeah. And I think you said in the report as well that your sense was that OpenAI was sort of forced, for lack of a better term, into releasing an open-source model to be on the right side of history.
Yeah, I think that's one of the reasons, and then the second one probably dovetails with their announcement with AMD. I say that because quite recently SemiAnalysis published this benchmarking dataset where they ran models on various clouds to benchmark them, and actually GPT-OSS looks pretty good on AMD. So one could imagine that there were some optimizations, and actually there were optimizations, to GPT-OSS so that it runs nicely on AMD: it has support from their framework from day one, the parameterization of the model is such that you can run it on a single AMD chip, and there are some other nuances to the attention mechanisms that they customized to make it work really well on AMD. And to the point around the circular economy stuff that we discussed a little while ago, there's another financial sweetener in the deal, where OpenAI has warrants in AMD if the stock price hits $600.
And so you can see how there are a lot of incentives in this game: aligning with the US government, helping developers, which is a good thing, but also helping one of your vendors improve, which frankly it does need help and it should improve, and also getting some financial sweetener as a result of that, which could help you make the flywheel spin faster. And since we're talking about circularity, talk about concentration as well. Maybe as an echo to the conversation about the bubble a few minutes ago, it does feel like this AI economy has a lot of, depending on how you look at it, funky or scary things. Yeah, well, a lot of Nvidia's revenue comes from the major hyperscalers or neoclouds, so Meta, xAI, Google, Amazon, then CoreWeave, and then a lot of CoreWeave's revenue also comes from Microsoft on the way back.
I think it's just this challenge with AI progress, that we've very meaningfully shifted from, I think, the GPT-3 era to now, where basically scale rate-limits your progress. It's no longer a couple of people in a dorm room that can really build something transformational if they want to advance AI capabilities; it's really big-boy land now. And so with that come different dynamics: you have to be good at capital raising, you have to align yourself with nation states, you have to align yourself with Wall Street. These are all, I think, contributing to the big vibe shifts that you've seen in the culture of AI labs.
What do you mean by that? Well, for example, there were some labs like Anthropic that were built to really push the safety agenda, because, the rationale went, if we didn't do that we could end up with the extermination of humanity, right? And quite recently Dario was interviewed by Marc Benioff, just this past week, and asked about some of these data center buildouts. And he said something along the lines of: yeah, there's a lot of money going into this, a lot of cost, but at the end of the day the only thing that matters is revenue. I don't think you would have said that on the founding day of Anthropic. And it's just a reality that the table stakes in this game have changed, and with that, entrepreneurs have to update their priors and change their strategy a little bit. And so we document some of this in sort of the blooper section of the report,
which is just how much pendulum-swinging we've noticed in corporate priorities at AI labs as a result of the extreme financialization of the sector. Are you encouraged or discouraged by some of the stuff that's happening at the app layer, in particular, whether that's AI slop or the focus on revenue, versus the idealism? Do you think that's inevitable but good, or what do you make of it? I think we're at such an early stage of seeing how you can maximally extract value and create interesting experiences for people with this AI technology that we have to try a lot of different things. At the end of the day, if you're a lab that expends tens of billions of dollars on R&D, you do have to have a way to generate money to fund that. I think that's just reality. And on the slop thing, I mean, if it's bad, people won't look at it,
and if people look at it and they enjoy it, then good for them. I don't necessarily have a huge problem with that, as long as where I'm expending my time is somewhere I find useful. And so that's why I'm spending a lot of my time on enterprise software automation, biology, new discoveries and drug discovery, defense technology and autonomy, robotics. I think these are all very important macro drivers of the economy as we move into an era where intelligence is increasingly cheap and accessible. There are just so many different instantiations of products that we need to build that are really meaningful, and if a byproduct is that you have a social media app with AI videos, that's fine too; we all have to unwind, right?
You mentioned safety a minute ago; I'd love to riff on that theme a little bit: IP rights, safety, regulatory. A little bit like the sustainability thing we were discussing earlier, it sort of feels like that whole world has slowed down in terms of progress. Maybe starting with regulatory: do you think regulation is anywhere near catching up or providing an adequate response to what's going on? Yeah, I'd say a big no on that one. The Trump administration unwound a lot of the Biden-era policies, whether that was on diffusion or pushing back against a lot of state-level legislation on AI. Over in Europe, the EU AI Act has had delays in implementation, only three member states have actually implemented it, and now we're finally seeing even its authors saying maybe we went too far,
particularly as we look at the speed of progress in the US and China compared to Europe. Famously, this bill in California rate-limiting AI progress was really watered down into what eventually became SB 53. There were many, many proposed bills, over a thousand, and 10% of them actually made their way into law. So it's still kind of patchwork, but at a meta level it looks like we traded regulation for going faster. It's perhaps best encompassed by the shift from the AI Safety Summit in the UK, at Bletchley, which basically pledged a whole network of AI safety institutes and conferences that would happen over the coming years,
to the subsequent event in Paris, which was called the AI Action Summit, completely different from an AI Safety Summit, and JD Vance saying something along the lines of: AI progress is not going to happen if we keep hand-wringing over AI safety. And the US basically didn't show up to a few of the subsequent conferences. We have this in the "safety RIP" section: very few people seem to care about it anymore. And on the vibe shift, even the more doomerist parts of the ecosystem have kind of quieted down. Right, it feels like the debate has gone from "this will kill all of us" to more like, well, are LLMs, or is RL, the better way to get to AGI. Yeah, yeah, the naysayers have shifted their approach. And I think it's become less about this existential crisis and more about which capabilities look concerning in models, and there have been some interesting data points we chronicle in the report.
Like, for example, models can increasingly know that they're in a simulation, or know that they're in an evaluation, and then change their behavior as a result. There are examples of models trying to exfiltrate their own weights. There's another piece of work that we show, around the cybersecurity capabilities of models, which basically measures how long a human takes to solve various categories of cyber tasks, then puts models against the same tasks and asks how long a task they can solve at a 50% pass rate. And it looks like, again, the capabilities of models on cyber tasks are doubling every six months. This is cast against the fact that independent safety organizations, there are maybe five or six, usually nonprofits, or at least still nonprofits, or private companies, spend on average $134 million a year in total. So that's across all of them? Yeah, across all of them, exactly. And that's cast against roughly $92 billion across all AI work for the major labs. So basically, the same amount of money that a big lab spends in one day is spent in an entire year across these safety orgs. $134 million, aka a seed round, yes, in a week-old AI startup. Correct, correct.
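As a quick illustration of what "doubling every six months" compounds into, here's a small projection in Python. Only the doubling cadence comes from the conversation; the one-hour starting task length is an invented placeholder.

```python
# What "cyber task capability doubles every six months" implies, mechanically.
# The one-hour starting point is an invented placeholder; only the doubling
# cadence comes from the trend described above.

START_TASK_HOURS = 1.0       # assumed: models handle ~1-hour human tasks today
DOUBLING_PERIOD_MONTHS = 6   # from the trend described above


def task_length_after(months: float) -> float:
    return START_TASK_HOURS * 2 ** (months / DOUBLING_PERIOD_MONTHS)


for months in (6, 12, 24, 36):
    print(f"after {months:>2} months: ~{task_length_after(months):,.0f}-hour tasks")

# 2, 4, 16, 64: a few years of compounding turns hour-scale tasks into
# multi-day ones, which is the backdrop for the safety-spend comparison above.
```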
What about data rights? That was another part of that general policy universe that was very sensitive and controversial. There's been some evolution? Yeah, major changes. I think it looks a little bit like the on-demand commerce wars, the Uber style of: do something that's a bit dodgy for a long time, get to scale, and then once you're at scale you're kind of too big to kill. Similarly in AI, a lot of companies used slightly dodgy practices to acquire training data and then got to scale, and they were subject to many lawsuits in the last year or two, particularly in the media sector, whether that's music or video and books. And then there's the biggest settlement, which happened in the last few months, with Anthropic, which agreed to pay out one and a half billion. This was settled out of court, so it can't be used as precedent, but it generally shows the rough price tag of the fee that's associated with human works in the context of AI training.
And then separately, there have been dozens, if not a hundred, organizations that have agreed content licensing deals with various model companies, as I think the power shift has really happened, one and a half billion still being a drop in the bucket for a company like Anthropic. Interestingly, does that create a moat over time, meaning that you have to be large enough to be able to afford the kind of money that you're going to pay for data rights if you want to do pre-training? And does it make it harder to start a company that needs to do pre-training from scratch? In one sense, yes. In another sense, if you can exploit the knowledge of these frontier models, particularly from open source, and then generate synthetic data, that could be a way to get to capable models faster. And also, I mean, you'll have many guests that go deep on this, but even the nature of pre-training and what information is included in the corpus, and at what point, the data mixtures as people call them, has been evolving over time.
So I think we're just getting smarter about how to do pre-training, rather than shoving everything we have into a bucket and seeing what happens. As a result of that, you might not necessarily have to spend the exact same amount of money to get a capable system, and some of this came out of the DeepSeek paper. You mentioned cyber; let's riff on this a little bit. Obviously AI creates new attack vectors. What should people know? I mean, as of a couple of years ago, people were obsessed with deepfakes, these videos of people saying things they didn't actually say, and they were still kind of grainy and not awesome. Clearly those deepfakes are getting a lot better, although, quite positively, it looks like we're actually quite good at detecting them and realizing, oh, that's fake. But there are more advanced approaches now, where models can be capable of coercion, particularly for some individuals who are sensitive to this kind of risk. There have been examples of, for instance, North Korean state actors trying to infiltrate other states using AI systems. You could potentially even package a language model in malware and have it installed on a computer, where it kind of wakes up, and because it's not dumb, it's a language model, it can do things on computers, and that's kind of scary.
The rise of MCP, the Model Context Protocol, which is kind of like a USB stick for all sorts of data connectors, is cool, because now models can be smart, they can integrate all your stuff across your digital life. But do you necessarily trust the creator of that MCP server? Where is that data getting sent? There are tens of thousands of these things now, and cybersecurity risks that result from this. There are also some changes to model APIs that trade off whether the user or the model vendor manages state, and depending on that, that's another risk you have to think about. So I think, at a high level, there are lots of security issues coming to the fore here, but it's still unclear whether there's a good business to be built in cyber for AI, because it's still so early: we haven't necessarily felt the pain of all these things yet, reputationally and financially. A bit like insurance, until you have actually felt the pain, you sort of prefer to divert your money towards improving and making more money rather than protecting your downside. Yeah, interesting. And it's another area where the incumbents are not asleep at the wheel. Yeah, and all the big labs, exactly.
So if you're really good at security, it's a bit like AI safety: if you're really good at these things, do you want to be in the belly of the beast, able to see how the sausage is made and influence it because of the proximity? Or do you want to be on the other side, receiving the artifacts and, at best, doing collaborations with labs on pre-launch safety testing, like they do in the UK with the AISI and in the US, or, at worst, just literally trying to sell a cybersecurity SaaS to people who are consuming these models? So I can understand why that imbalance occurs. And to your point about it being hard to sell before the pain is felt, it feels like this whole generation of young startups is going to be acquired pretty quickly by the Palo Alto Networks and Check Points of the world, before they get a chance to get to scale. Something that probably turns out to be great for the founders, but in terms of building large, self-standing, sustainable companies, not so much. Agents.
It cannot be a 2025 conversation on AI without talking about agents. What is your sense of the reality and the state of play? There are some vertical products that are really good. Clearly search is actually pretty good: replacing consulting, replacing market research, or augmenting all these areas that were previously very heavy human knowledge-working tasks, is getting extremely good. I think coding agents clearly are getting really good. There are other metrics around how long they can work autonomously; I think with the new Haiku release it's 30 hours or something, and it can make a pretty decent version of Slack. Yes, although a controversial number, but yes, up to 30 hours in lab testing. Okay. Exactly, what do hours even mean for an agent? Is it like a computer running it, is that equivalent? Yeah, yes. And then some of the scientific reasoning we talked about is agent-based; I think that's also quite neat.
I think the biggest problem becomes the compounding error: an agent is, say, 95% good, and then 95% times 95% times 95%, etc., sort of decays the quality over a longer period of time (a quick back-of-the-envelope on this follows below). And then there's some contention now about whether you build these harnesses, which I like as nerd-speak for sticky tape between models, to make it work in enterprise, or you just wait until the next model generation hopefully becomes better out of the box. I think there's a ton of excitement, and at some point, just as desktop software became SaaS, at some point SaaS will just become an agent, because it's no longer really a human that's actually doing everything in the software product, but software that's running the software product itself. Yeah, which I think has cool implications for search and product discovery and this whole ecosystem of online content.
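That compounding-error point can be made concrete with one line of arithmetic. The 95% per-step success rate is the figure used in the conversation; the step counts are illustrative.

```python
# The compounding-error point above, made concrete: if each step of an agent
# workflow succeeds 95% of the time (the figure used in the conversation),
# the chance the whole chain succeeds decays quickly with chain length.

PER_STEP_SUCCESS = 0.95

for steps in (1, 5, 10, 20, 50):
    print(f"{steps:>2} steps -> {PER_STEP_SUCCESS ** steps:6.1%} end-to-end success")

# 1 -> 95.0%, 5 -> 77.4%, 10 -> 59.9%, 20 -> 35.8%, 50 -> 7.7%
# Hence the debate: build harnesses around today's models, or wait for a
# generation with higher per-step reliability out of the box.
```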
Like, is it humans that are reading online content anymore, or is it agents that are chewing it and then serving it back to their human users? And the whole evolution away from going on a website to buy a product towards, you know, answer engines slash search engines, largely OpenAI, where you can also buy natively. I'm not so enthusiastic about "oh, we're going to have agents that book flights for us" in travel; I feel like that's just a niche problem, that sort of sad canonical use case in San Francisco. But I think what's telling so far is that traffic generated through conversations and AI search onto a commerce platform converts at a higher level than direct traffic.
So the intent is really high, because there's already been background research undertaken in chat. I think that's really powerful and you can't ignore it. And then the next question on that is: okay, so what content is the model actually consuming to serve recommendations or information to its user? People say Google search is dead, and that's probably completely wrong, because ChatGPT references Google a ton since it shifted off of Bing. So maybe it's not the front page of Google that's being consumed by a human, but by a sort of agent that represents the user.
If you're a company that has a new product and you want it to be recommended, then there is this flywheel that you should probably get on as soon as possible, because the more you make your content and your website and your product accessible to agents that can try it, even something like a demo environment for an agent to go test your new SaaS product, the more it will be able to learn about your product and provide recommendations against relevant prompts from human users. And then if you go the next step, which is all this reinforcement learning and environments and preference learning and things like that, then that flywheel accelerates even faster. So I feel like it's kind of inevitable. It does open up this agent-experience, rather than just pure user-experience, craft within software companies, which is yet another piece of alpha if you jump on it sooner rather than later.
Where does that all leave you as a VC? We have been talking about the State of AI report, which is your annual labor of love and content, and which I think everybody in the industry very much appreciates, because there's so much going on that tying everything together in one document is incredibly helpful. But you are first and foremost a VC; you wear Air Street t-shirts, yes, as people can see if they're watching the video, but otherwise trust me, if you're listening to this on Spotify, it's a very, very nice logo, a little bit retro. Yeah, it's inspired by the old US Air Force. Very, very nice. So what are you excited about? You mentioned a bunch of deep tech, robotics; is that what you invest in? Where do you think value can be built for founders, and the VCs who love them, going forward? Yeah, the meta thing I care about is how you build and make use of AI to create new kinds of product experiences, new kinds of companies. For me, that's best expressed by companies that are AI-first, both in terms of the product they build, where if you rip out the AI the thing doesn't work, but also in how they approach their company philosophy: the kinds of people they hire, where they allocate resources. And then I've generally just tried to follow areas of industry that are increasingly ripe for getting value out of AI. Traditionally that would be lots of data for a task that they care about, not enough people to do that task, but with a clear ROI if that task gets automated or increasingly automated.
And so that led me, 10 years ago or so, first to FinTech-style investments, and then after that biology really came online into this new wave of techbio, so I made some investments there, like Valence Discovery, which we sold to Recursion, and another we sold to Exscientia, and then more recently Profluent, which is kind of leading the charge for these language models in protein design, developing the first CRISPR genome editor that an AI created. Then another segment that really came online in the US was defense, and more recently in Europe, after the Munich Security Conference in February kind of unwound a lot of the assurances that European states had about US security guarantees, and that led to a big influx of "holy s**t, we need to defend ourselves because no one's coming to save us." So I have some investments there, like Delian Alliance Industries in the UK and Greece. And then in robotics, as we discussed, a team in Stuttgart called Sereact, which is developing general-purpose AI models for robotic manipulation and increasingly going to other form factors.
And then I've been obsessed with voice. I think we actually talked about voice the last time I was here, and I'm still amazed at how magic the demos are. I think you were saying, if you want to impress your smart but non-AI-pilled executive friend, you show them voice. Yeah, exactly. So I've definitely used our portfolio company ElevenLabs to create audio of me speaking Korean; I've A/B tested this and apparently it sounds pretty good. But I also have a newer company called Delfo, which is building voice tools for clinical trials, starting with just calling back patients who want to be part of a trial and need to be consented. These are conversations in lots of different languages, with a lot of esoteric medical terminology; patients forget what drugs they were on, so they have to call you back, and it's super laborious human work that voice agents built on ElevenLabs and others in audio solve really well. So I'm excited to see where this goes at the limit. And then there's the more science-y stuff, like these generative world models, which I think are pretty amazing, whether it's Google's Genie or Veo, or Odyssey's system, where you sort of imagine a world and then take actions in it, and the actions are physically plausible because the system was trained on video plus actions. And then maybe taking that even into scientific discovery, exploring the frontier and being a bit smarter about which experiments we run, because foundation models are far from done. Okay, fantastic. All right, so to close the conversation, of course we have to go into your predictions. Each time you do the State of AI report, you boldly come up with predictions for the next twelve months. So without going through all ten, and people can check them out on slide 304, pick maybe three that you're passionate about. Yeah, well, one is just how politically charged a lot of the AI compute and data center buildout actually becomes, because of energy, because of water, because of money, because of geopolitics. I think that's becoming too large of
an issue for voters to ignore. And so we predict that this kind of NIMBYism, "not in my backyard," will take precedence in major political campaigns in 2026. The other one that I think is interesting is a fully end-to-end AI-designed or AI-developed scientific discovery. I would honestly predict a Nobel Prize, but the twelve-month window is a little bit too short; I think the AlphaFold Nobel Prize was probably the fastest in history. A Nobel Prize won by an AI, yeah, versus the recent Nobel Prizes, which were awarded to AI researchers using AI to come up with breakthroughs. That was a human powered by AI; what you're talking about here is an AI actually winning. Yeah. Last year we predicted maybe a step towards this, which was that a fully AI-written research paper would be accepted at a major conference or workshop, and that actually happened with the AI Scientist paper, v2 I think. So I think we're getting there, because this is what the nerds really want to work on. As a broader point, there are all these software industries where analysts think, oh my god, it's going to be dead because of AI,
but I think part of the reality is that what's not going to be dead is the problems these AI people don't want to work on, because it's so boring to build that software. That's such a fantastic, humorous point. Yeah, the Workday CEO was funny, actually; I think he was asked recently whether OpenAI or Anthropic and so on are a threat to his business, and he just replied, "they're all my customers." All right, that's two; pick another one. I mean, it's kind of cheating, but the open-source one I think already happened. Whether this particular company is a leading lab or not is beside the point; basically, aligning yourself with political agendas is the way to go. And I think you can maybe take this even further and say that, similar to how Nvidia has been monetizing sovereign AI, a way for nation states to guarantee access to AI services is for them, as nations, to invest in one of these labs. It's obviously still a risk that, due to export controls, the US can just tell OpenAI to switch it off, but I think it's interesting that, for example, the Albanian government invested in Thinking Machines.
I think the CEO comes from there. And so we wrapped this topic in a prediction that some countries will basically abandon their efforts to achieve AI sovereignty and declare AI neutrality. Yeah, it's a bit similar to the defense posture where some nation states are just too small, or don't have enough people, money, or capability to develop weapons systems to defend themselves, and so they rely on a strategic security guarantee from a larger neighboring nation. It doesn't seem that inconceivable to me that various countries would say, "we can't build this stuff; we need a formal alliance with another country that is sovereign."
Well, Nathan, it's been wonderful. Thank you so much. The State of AI 2025 is, again, available at stateof.ai; it's remarkably comprehensive and detailed, yet approachable, so thank you for doing this, and thank you for coming on today and sharing your predictions. Hopefully I get to embarrass you at least a little bit at the next one, when some of those predictions turn out to not have panned out, but this was wonderful. Thank you very much. Thanks for running it back, I appreciate it. Hi, it's Matt Turck again. Thanks for listening to this episode of the MAD Podcast. If you enjoyed it, I'd be very grateful if you'd consider subscribing, if you haven't already, or leaving a positive review or comment on whichever platform you're watching or listening to this episode from.
This really helps us build the podcast and get great guests. Thanks, and see you at the next episode.