Why Hardware-Software Co-Design Is AI's Real 100x: Dylan Patel of SemiAnalysis
Summary
Dylan Patel, founder of SemiAnalysis, argues that the biggest gains in AI come not from faster chips but from hardware-software co-design. Optimizing the model, the kernels, and the silicon together turns a 2x here and a 2x there into 100x. He explains why DeepSeek's experts were shaped for Nvidia's Hopper (and why TPUs struggle to run it), why OpenAI's sparser models and Anthropic's denser ones pull them toward different hardware, and why the so-called CUDA moat was never really about CUDA. Dylan breaks down InferenceX, his living benchmark that runs the latest models daily on over $50M of donated hardware and tracks a roughly 60x annual drop in cost per unit of quality. He makes the case that inference will be a bigger market than oil and that the compute crunch persists because models expand the value of useful work faster than compute grows, and he explains why Jensen Huang is bankrolling neoclouds to engineer a multipolar world.
Hosted by Shaun Maguire and Sonya Huang, Sequoia Capital
00:00 Introduction
01:58 Motel Kid Origins
03:11 Xbox Repair Spark
04:23 Internet Forums to Semis
06:42 From Quant to Founder
09:16 Homeless Research Roadtrip
14:04 InferenceX and Benchmarking
34:35 Sparse vs Dense Models
35:08 Interconnect Shapes Architecture
35:48 CUDA Moat Is Shifting
36:46 Ecosystems and Co-Design
38:46 Cerebras Speed and Limits
42:07 ROI Debates and Hot Takes
44:20 Ten Year Tech Bets
50:48 Compute Crunch and NeoClouds
