Beyond Bigger Models: Recursion as the Next Scaling Law in AI

Summary

A 7-million-parameter model outperforming models a thousand times its size on tasks like the ARC Prize: that's what recursive reasoning unlocks. In this episode of Decoded, YC's Ankit Gupta and Francois Chaubard break down two recent papers on recursive AI models, HRMs and TRMs, which achieve state-of-the-art results with a fraction of the parameters of today's largest models. They explain why standard LLMs hit a fundamental ceiling on certain reasoning tasks, how recursion at inference time gives small models the compute depth to break through it (a toy sketch of that loop appears after the chapter list below), and what happens when you combine these ideas with the power of large-scale foundation models.

Apply to Y Combinator: https://www.ycombinator.com/apply
Work at a startup: https://www.ycombinator.com/jobs

00:00 - Intro
00:35 - Model Foundations
01:15 - RNN Limits and LLM Contrast
02:36 - Reasoning Limits and Sorting Analogy
04:22 - HRM Paper Introduction
05:25 - HRM Architecture and Intuition
07:36 - HRM Results and Outer Loop
09:46 - TRM Paper Overview
11:20 - TRM Training and Fixed Point
13:30 - Detailed HRM Summary
20:46 - Comparing HRM and TRM
34:45 - Future Outlook and Outro

Transcript (Chinese & English)