The following is a conversation with Terence Tao, widely considered to be one of the greatest mathematicians in history, often referred to as the Mozart of Math. He won the Fields Medal and the Breakthrough Prize in Mathematics, and has contributed groundbreaking work to a truly astonishing range of fields in mathematics and physics. This was a huge honor for me, for many reasons, including the humility and kindness that Terry showed to me throughout all our interactions. It means the world.
This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description or at lexfridman.com, search sponsors. And now, dear friends, here's Terence Tao. What was the first really difficult research-level math problem that you encountered? One that gave you powers, maybe. Well, in your undergraduate education, you learn about the really hard, famous problems like the Riemann hypothesis, the twin prime conjecture. You can make problems arbitrarily difficult. That's not really the issue. In fact, there are even problems that we know to be unsolvable.
What's really interesting are the problems just at the boundary between what we can do easily and what is hopeless. What are the problems where existing techniques can do, like, 90% of the job, and then you just need that remaining 10%? I think as a PhD student, the Kakeya problem certainly caught my eye. And it just got solved, actually. It's a problem I worked on a lot in my early research. Historically, it came from a little puzzle posed by the Japanese mathematician Sōichi Kakeya, around 1917 or so.
So the puzzle is that you have a needle on a plane. Think of it like driving on a road. You want it to execute a U-turn; you want to turn the needle around. But you want to do it in as little space as possible. You want to use as little area as possible in order to turn it around. And the needle is infinitely maneuverable. You can imagine just spinning it around. Say it's a unit needle. You can spin it around its center. That gives you a disk of area pi over 4.
Or you can do a three-point U-turn, which is what we teach people in driving school to do. And that actually takes area pi over 8. So it's a little bit more efficient than a rotation. And for a while people thought that was the most efficient way to turn things around. But Besicovitch showed that in fact you could turn the needle around using as little area as you wanted. So 0.001. There was some really fancy multi-back-and-forth U-turn thing that you could do.
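For the curious reader: the region swept out by the three-point turn is a deltoid, a three-cusped hypocycloid, and the pi over 8 figure can be checked numerically. The sketch below uses the standard deltoid parametrization with rolling-circle radius 1/4 (the scale at which a unit needle fits); this is an editorial illustration, not anything stated in the conversation.

```python
import math

def deltoid_area(r, steps=100_000):
    """Numerically integrate the area of a deltoid (three-cusped
    hypocycloid) with rolling-circle radius r, using Green's theorem:
    A = (1/2) * integral of (x dy - y dx) around the closed curve."""
    area = 0.0
    dt = 2 * math.pi / steps
    for i in range(steps):
        t = i * dt
        x = 2 * r * math.cos(t) + r * math.cos(2 * t)
        y = 2 * r * math.sin(t) - r * math.sin(2 * t)
        dx = (-2 * r * math.sin(t) - 2 * r * math.sin(2 * t)) * dt
        dy = (2 * r * math.cos(t) - 2 * r * math.cos(2 * t)) * dt
        area += 0.5 * (x * dy - y * dx)
    return abs(area)

# With r = 1/4 the deltoid's area comes out to pi/8, half the pi/4
# disk you get by spinning the needle about its center.
print(deltoid_area(0.25), math.pi / 8)
```

The closed-form area of a deltoid is 2*pi*r^2, so the numerical integral should agree with pi/8 to high precision.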
That you could turn the needle around with. And in so doing, it would pass through every intermediate direction. Is this in the two-dimensional plane? This is in the two-dimensional plane. So we understand everything in two dimensions. So the next question is what happens in three dimensions. So suppose the Hubble Space Telescope is a tube in space, and you want to observe every single star in the universe. So you want to rotate the telescope to point in every single direction.
And here's the unrealistic part: suppose that space is at a premium, which it totally is not. You want to occupy as little volume as possible in order to rotate your needle around, in order to see every single star in the sky. How small a volume do you need to do that? And you can modify the Besicovitch construction. So if your telescope has zero thickness, then you can use as little volume as you want. That's a simple modification of the two-dimensional construction.
But the question is, if your telescope is not zero thickness but just very, very thin, with some thickness delta, what is the minimum volume needed to be able to see every single direction, as a function of delta? So as delta gets smaller, as the needle gets thinner, the volume should go down. But how fast does it go down? And the conjecture was that it goes down very, very slowly, like logarithmically, roughly speaking. And that was proved after a lot of work.
So this seems like a fun puzzle. Why is it interesting? It turns out to be surprisingly connected to a lot of problems in partial differential equations, in number theory, in geometry, in combinatorics. For example, in wave propagation. You splash some water around, you create water waves, and they travel in various directions. But waves exhibit both particle and wave type behavior. So you can have what's called a wave packet, which is a very localized wave, localized in space and moving in a certain direction in time.
And so if you plot it in space and time, it occupies a region which looks like a tube. And so what can happen is that you can have a wave which initially is very dispersed, but it all focuses at a single point later in time. Like, you can imagine dropping a pebble into a pond and the ripples spread out. But if you time-reverse that scenario, and the equations of wave motion are time-reversible, you can imagine ripples that are converging to a single point, and then a big splash occurs, maybe even a singularity. And so it's possible to do that.
And geometrically, what's going on is that there are these light rays. So if this wave represents light, for example, you can imagine this wave as a superposition of photons, all traveling at the speed of light. They all travel on these light rays, and they're all focusing at this one point. So you can have a dispersed wave focus into a very concentrated wave at one point in space and time, but then it defocuses again, and it separates.
But potentially, if the Kakeya conjecture had a negative solution, and what I mean is that there's a very efficient way to pack tubes pointing in different directions into a region of very, very narrow volume, then you would also be able to create waves that start out very, very dispersed, but which would concentrate not just at a single point, but at a lot of points in space and time.
And you could create what's called a blow-up, where the amplitude of these waves becomes so great that the laws of physics that they're governed by are no longer wave equations, but something more complicated and nonlinear. And so in mathematical physics, we care a lot about whether certain equations, wave equations, are stable or not, whether they can create these singularities. There's a famous unsolved problem called the Navier-Stokes regularity problem. The Navier-Stokes equations are the equations that govern fluid flow for incompressible fluids like water.
The question asks, if you start with a smooth velocity field of water, can it ever concentrate so much that the velocity becomes infinite at some point? That's called a singularity. We don't see that in real life. If you splash around water in a bathtub, it's not going to explode on you, or have water leaving at the speed of light. But potentially it is possible. And in recent years, the consensus has drifted towards the belief that in fact, for certain very special initial configurations of, say, water, singularities can form.
But people have not yet been able to actually establish this. The Clay Mathematics Institute has these seven Millennium Prize problems, with a million-dollar prize for solving any one of them. So this is one of them. Of these seven, only one of them has been solved, the Poincaré conjecture. So the Kakeya conjecture is not directly related to the Navier-Stokes problem, but understanding it would help us understand some aspects of things like wave concentration, which would indirectly probably help us understand the Navier-Stokes problem better.
Can you speak to the Navier-Stokes existence and smoothness problem? Like you said, it's a Millennium Prize problem, and you made a lot of progress on it. In 2016, you published a paper, "Finite time blowup for an averaged three-dimensional Navier-Stokes equation." So we're trying to figure out: this thing usually doesn't blow up, but can we say for sure it never blows up? Right. That is literally the million-dollar question.
This is what distinguishes mathematicians from pretty much everybody else. If something holds 99.99% of the time, that's good enough for most things. But mathematicians are the few people who really care whether 100%, really 100%, of all situations are covered. Most of the time, water does not blow up. But could you design a very special initial state that does this? And maybe we should say that this is a set of equations that govern the field of fluid dynamics.
Yeah, trying to understand how fluid behaves, and it's actually really complicated. Fluid is an extremely complicated thing to try to model. Yeah, so it has practical importance. This Clay Prize problem concerns what's called the incompressible Navier-Stokes, which governs things like water. There's something called the compressible Navier-Stokes, which governs things like air.
And that's particularly important for weather prediction. Weather prediction does a lot of computational fluid dynamics. A lot of it is actually just trying to solve the Navier-Stokes equations as best they can, while also gathering a lot of data so that they can initialize the equation. There are a lot of moving parts. So it's very important practically.
Why is it difficult to prove general things about this set of equations, like it never blowing up? The short answer is Maxwell's demon. Maxwell's demon is a concept in thermodynamics. Say you have a box of two gases, oxygen and nitrogen. Maybe you start with all the oxygen on one side and the nitrogen on the other side, but there's no barrier between them. Then they will mix. And they should stay mixed. There's no reason why they should unmix.
But in principle, because of all the collisions between them, there could be some sort of weird conspiracy. Maybe there's a microscopic demon, called Maxwell's demon, such that every time an oxygen and a nitrogen molecule collide, they bounce off in such a way that the oxygen drifts to one side and the nitrogen goes to the other.
And you could have an extremely improbable configuration emerge, which we never see, and which is statistically extremely unlikely. But mathematically, it's possible that this can happen, and we can't rule it out. And this is a situation that shows up a lot in mathematics. A basic example is the digits of pi: 3, 1, 4, 1, 5, and so forth.
The digits look like they have no pattern, and we believe they have no pattern. In the long run, we should see as many 1s and 2s and 3s as 4s and 5s and 6s. There should be no preference in the digits of pi to favor, let's say, 7 over 8. But maybe there's some demon in the digits of pi, that every time you compute more and more digits, it biases one digit over another.
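As an illustration of how one can at least test this empirically, here is a sketch that computes digits of pi via Machin's formula, pi = 16 arctan(1/5) - 4 arctan(1/239), in exact integer arithmetic, and then tallies digit frequencies. The roughly uniform counts are an observation about the first couple thousand digits, not a proof of anything.

```python
from collections import Counter

def pi_digits(n):
    """First n decimal digits of pi (including the leading 3), via
    Machin's formula with guard digits to absorb truncation error."""
    guard = 10
    scale = 10 ** (n + guard)

    def arctan_inv(x):
        # arctan(1/x) * scale, using the alternating Taylor series.
        total, term, k, sign = 0, scale // x, 1, 1
        while term:
            total += sign * (term // k)
            term //= x * x
            k += 2
            sign = -sign
        return total

    pi_scaled = 4 * (4 * arctan_inv(5) - arctan_inv(239))
    return str(pi_scaled)[:n]

digits = pi_digits(2000)
print(digits[:10])      # 3141592653
print(Counter(digits))  # each digit appears roughly 200 times
```

The law of large numbers says a truly random digit string would show counts near 200 each; pi's digits match that so far, but as discussed, no one can prove they must keep doing so.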
And this is a conspiracy that should not happen. There's no reason it should happen. But there's no way to prove it with our current technology. Okay, so getting back to Navier-Stokes: a fluid has a certain amount of energy, and because the fluid is in motion, the energy gets transported around. And water is also viscous.
So if the energy is spread out over many different locations, the natural viscosity of the fluid will just damp out the energy, and it will go to zero. And this is what happens when we actually experiment with water. You splash around, there's some turbulence and waves and so forth, but eventually it settles down, and the lower the amplitude, the smaller the velocity, the more calm it gets.
But potentially there is some sort of demon that keeps pushing the energy of the fluid into smaller and smaller scales. And as it moves to faster and faster speeds, the effect of viscosity becomes relatively less. And so it could happen that it creates some sort of self-similar blowup scenario, where the energy of the fluid starts off at some large scale and then transfers all its energy into a smaller region of the fluid, which then, at a much faster rate, moves into an even smaller region, and so forth.
And each time it does this, it takes maybe half as long as the previous step. And then you could actually converge to all the energy concentrating at one point in a finite amount of time. And that's what's called finite time blowup. In practice, this doesn't happen. Water is what's called turbulent. It is true that if you have a big eddy of water, it will tend to break up into smaller eddies.
But it won't transfer all its energy from one big eddy into one smaller eddy. It will transfer into maybe three or four, and then those ones split up into maybe three or four smaller eddies of their own. And so the energy disperses to the point where the viscosity can then keep things under control. But if you could somehow concentrate all the energy, keep it all together, and do it fast enough that the viscous effects don't have enough time to calm everything down, then this blowup can occur.
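The arithmetic behind "each step takes half as long, so everything concentrates in finite time" is just a geometric series. A tiny sketch (the specific halving ratio is illustrative, not taken from any actual fluid computation):

```python
# Toy cascade: stage k lives at spatial scale 2**-k and lasts half as
# long as the previous stage. Even with infinitely many stages, the
# total elapsed time converges, while the spatial scale goes to zero.
first_stage = 1.0
total_time = sum(first_stage / 2**k for k in range(60))  # approaches 2.0
final_scale = 2.0 ** -60                                 # essentially zero
print(total_time, final_scale)
```

If each stage instead took the same amount of time, the sum would diverge and no finite-time singularity could form this way; the speeding-up is essential to the scenario.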
So there were papers that had claimed, oh, you just need to take into account conservation of energy and carefully use the viscosity, and you can keep everything under control, not just for Navier-Stokes, but for many, many types of equations like this. And in the past there have been many attempts to try to obtain what's called global regularity for Navier-Stokes, which is the opposite of finite time blowup: that the velocity stays smooth.
And they all failed. There was always some sign error, some subtle mistake, and it couldn't be salvaged. So what I was interested in doing was trying to explain why we were not able to disprove finite time blowup. I couldn't do it for the actual equations of fluids, which were too complicated. But what if I could average the equations of motion of Navier-Stokes, basically turning off certain types of ways in which the water interacts with itself, and only keeping the ones that I want?
So in particular, if the fluid could transfer its energy from a large eddy into this small eddy or this other small eddy, I would turn off the energy channel that would transfer energy to this one and direct it only into that smaller eddy, while still preserving the law of conservation of energy. So you try and make it blow up. Yeah, yeah. So I basically engineered a blowup by changing the laws of physics, which is one thing that mathematicians are allowed to do. We can change the equation.
How does that help you get closer to the proof of something? Right, so it provides what's called an obstruction in mathematics. What I did was turn off certain parts of the equation. Usually, when you turn off certain interactions, it makes the equation less nonlinear, more regular, and less likely to blow up. But I found that by turning off a very well-designed set of interactions, I could force all the energy to blow up in finite time.
So what that means is that if you wanted to prove global regularity for Navier-Stokes, for the actual equation, you must use some feature of the true equation which my artificial equation does not satisfy. So it rules out certain approaches. The thing about math is it's not just about finding a technique that is going to work and applying it; you also need to rule out the techniques that are not going to work.
And for the problems that are really hard, often there are dozens of ways that you might think might apply to solve the problem, but it's only after a lot of experience that you realize there's no way that these methods are going to work. So having these counterexamples for nearby problems rules things out. It saves you a lot of time, because you're not wasting energy on things that you now know cannot possibly ever work.
How deeply connected is this to that specific problem of fluid dynamics, or is it some more general intuition you build up about mathematics? Right, yeah. So the key phenomenon that my technique exploits is what's called supercriticality. In partial differential equations, often these equations are like a tug of war between different forces. So in Navier-Stokes, there's the dissipation force coming from viscosity, and it's very honest.
It's linear. It calms things down. If viscosity was all there was, then nothing bad would ever happen. But there's also transport: energy in one location of space can get transported, because the fluid is in motion, to other locations. And that's a nonlinear effect, and that causes all the problems. So there are these two competing terms in the Navier-Stokes equation, the dissipation term and the transport term.
If the dissipation term dominates, if it's large, then basically you get regularity. And if the transport term dominates, then we don't know what's going on. It's a very nonlinear situation; it's unpredictable; it's turbulent. Sometimes these forces are in balance at small scales but not in balance at large scales, or vice versa. And Navier-Stokes is what's called supercritical: at smaller and smaller scales, the transport terms are much stronger than the viscosity terms.
And the viscosity terms are the things that calm things down. So this is why the problem is hard. In two dimensions, the Soviet mathematician Olga Ladyzhenskaya showed in the '60s that there was no blowup. In two dimensions, the Navier-Stokes equations are what's called critical: the effect of transport and the effect of viscosity are about the same strength, even at very, very small scales.
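To make the critical versus supercritical distinction concrete, here is the standard scaling heuristic (textbook material, not something stated verbatim in the conversation). If u solves Navier-Stokes, so does the rescaled field describing the same fluid viewed at spatial scale 1/lambda:

```latex
u_\lambda(x,t) = \lambda\, u(\lambda x, \lambda^2 t), \qquad
E\big(u_\lambda(\cdot,0)\big)
  = \int_{\mathbb{R}^d} \big|\lambda\, u(\lambda x, 0)\big|^2 \, dx
  = \lambda^{2-d} \, E\big(u(\cdot,0)\big).
```

In dimension d = 2 the exponent is zero, so the conserved energy is scale-invariant and the equation is critical, matching the Ladyzhenskaya result. In d = 3 the factor is \lambda^{-1}, so the energy gives weaker and weaker control as you zoom in to finer scales (\lambda \to \infty): that is the supercriticality being described here.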
And we have a lot of technology to handle critical and also subcritical equations and prove regularity. But for supercritical equations, it was not clear what was going on. And I did a lot of work, and then there's been a lot of follow-up, showing that for many other types of supercritical equations, you can create all kinds of blowup examples. Once the nonlinear effects dominate the linear effects at small scales, you can have all kinds of bad things happen.
So this is one of the main insights of this line of work: that supercriticality versus criticality and subcriticality makes a big difference. That's a key qualitative feature that distinguishes some equations as being nice and predictable, like planetary motion. There are certain equations that you can predict for millions of years, or thousands at least.
That's not really a problem. But there's a reason why we can't predict the weather past two weeks into the future: it's a supercritical equation. Lots of really strange things are going on at very fine scales. So whenever there's some huge source of nonlinearity at small scales, that can create a huge problem for predicting what's going to happen.
Yeah, if nonlinearity is somehow more and more featured and interesting at small scales. I mean, there are many equations that are nonlinear, but for many equations you can approximate things by the bulk. So, for example, planetary motion: if you want to understand the orbit of the Moon or Mars or something, you don't really need the microstructure, like the seismology of the Moon or exactly how the mass is distributed. You can just approximate these bodies by point masses, and the aggregate behavior is all that's important. But if you want to model a fluid, like the weather, you can't just say, in Los Angeles the temperature is this and the wind speed is this. For supercritical equations, the fine-scale information is really important.
If we can just linger on the Navier-Stokes equations a little bit. So you've suggested, maybe you can describe it, that one of the ways to solve it, or to negatively resolve it, would be to construct a kind of liquid computer, and then show that the halting problem from computation theory has consequences for fluid dynamics. Can you describe this?
Yeah, so this came out of all this work on constructing this averaged equation that blew up. As part of how I had to do this: so there's a naive way to do it. Every time you get energy at one scale, you push it immediately to the next scale as fast as possible. This is the naive way to force blowup. And it turns out in five and higher dimensions this works. But in three dimensions, there was this funny phenomenon that I discovered when you change the laws of physics so that you always keep trying to push the energy into smaller and smaller scales.
What happens is that the energy starts getting spread out into many scales at once. So you have energy at one scale, you're pushing it into the next scale, and then as soon as it enters that scale, you also push it to the next scale, but there's still some energy left over from the previous scale. You're trying to do everything at once, and this spreads out the energy too much. And it turns out that makes it vulnerable for viscosity to come in and just damp everything out. So it turns out this direct approach doesn't actually work.
There was a separate paper, by other authors, that actually showed this in three dimensions. So what I needed was to program in delays, kind of like airlocks. I needed an equation which would start off with the fluid doing something at one scale; it would push its energy into the next scale, but the energy would stay there until all the energy from the larger scale got transferred. And only after you pushed all the energy in would you open the next gate, and then you push that in as well.
So by doing that, the energy moves forward, scale by scale, in such a way that it's always localized at one scale at a time, and then it can resist the effects of viscosity, because it's not dispersed. In order to make that happen, I had to construct a rather complicated nonlinearity. It was basically like constructing an electronic circuit. I actually thank my wife for this, because she was trained as an electrical engineer.
She talked about how she had to design circuits and so forth. And if you want a circuit that does a certain thing, maybe a light that flashes on and then turns off, and then on and off, you can build it from more primitive components: capacitors and resistors and so forth. And you have a diagram, and you can follow it with your eyeballs and say, oh yeah, the current will build up here, and then it will stop, and then it will do that.
So I knew how to build the analogue of basic electronic components, like resistors and capacitors and so forth. And I would stack them together in such a way that I would create something that would open one gate, and then there would be a clock, and once the clock reached a certain point, it would close it. It's kind of a Rube Goldberg-type machine, but described mathematically. And this ended up working.
So what I realized is that if you could pull the same thing off for the actual equations, if the equations of water support computation. So you can imagine kind of a steampunk, but really waterpunk, type of thing. Modern computers are electronic: they're powered by electrons passing through very tiny wires and interacting with other electrons and so forth. But instead of electrons, you can imagine these pulses of water moving at a certain velocity, and maybe there are two different configurations corresponding to a bit being up or down.
Possibly, if you had two of these moving bodies of water collide, they would come out with some new configuration, which would be something like an AND gate or an OR gate: the output would depend in a very particular way on the inputs. And you could chain these together and maybe create a Turing machine, and then you'd have computers which are made completely out of water. And if you have computers, then maybe you could do robotics through hydraulics and so forth. And so you could create some machine which is a fluid analogue of what's called a von Neumann machine.
So von Neumann proposed: if you want to colonize Mars, the sheer cost of transporting people and machines to Mars is just ridiculous. But if you could transport just one machine to Mars, and this machine had the ability to mine the planet, create some more materials, smelt them, and build more copies of the same machine, then you could colonize the whole planet over time. So suppose you could build a fluid machine, a fluid robot, and its purpose in life, what it's programmed to do, is to create a smaller version of itself, in some sort of cold state; it wouldn't start just yet.
Once it's ready, the big robot, the configuration of water, would transfer all its energy into the smaller configuration and then power down, and then clean itself up. And then what's left is this new state, which would then turn on and do the same thing, but smaller and faster. The equation has a certain scaling symmetry, so once you do that, it can just keep iterating. So this, in principle, would create a blowup for the actual Navier-Stokes, and it's what I managed to accomplish for this averaged Navier-Stokes. So it provided this sort of roadmap to solve the problem.
Now, this is a pipe dream, because there are so many things that are missing for this to actually be a reality. I can't create these basic logic gates; I don't have these special configurations of water. I mean, there are candidates, including vortex rings, that might possibly work. But also, analog computing is really nasty compared to digital computing, because there are always errors; you have to do a lot of error correction along the way. And I don't know how to completely power down the big machine so it doesn't interfere with the running of the smaller machine.
But everything in principle can happen; it doesn't contradict any of the laws of physics. So it's sort of evidence that this thing is possible. There are other groups who are now pursuing ways to make Navier-Stokes blow up which are nowhere near as ridiculously complicated as this. They're actually pursuing something much closer to the direct self-similar model. It doesn't quite work as is, but there could be some simpler scheme than what I just described to make this work. There's a real leap of genius here, to go from Navier-Stokes to a Turing machine.
To go from the self-similar blob scenario, where you're trying to get a smaller and smaller blob, to now having a liquid Turing machine that gets smaller and smaller and smaller, and somehow seeing how that could be used to say something about a blowup: I mean, that's a big leap. Well, there's precedent. The thing about mathematics is that it's really good at spotting connections between what you might think of as completely different problems. If the mathematical form is the same, you can draw a connection.
So there's a lot of previous work on what are called cellular automata, the most famous of which is Conway's Game of Life. You have an infinite discrete grid, and at any given time each cell of the grid is either occupied or empty. And there's a very simple rule that tells you how these cells evolve: sometimes cells live, and sometimes they die. When I was a student, it was a very popular screensaver to actually just have these animations going on. And they look very chaotic. In fact, they look a little bit like a turbulent fluid flow sometimes. But at some point, people discovered more and more interesting structures within this Game of Life. For example, they discovered the glider. A glider is a very tiny configuration of four or five cells which evolves and just moves in a certain direction. And that's like these vortex rings. So this is an analogy: the Game of Life is a kind of discrete equation, and the fluid Navier-Stokes is a continuous equation, but mathematically they have some similar features.
And over time, people discovered more and more interesting things that they could build within the Game of Life. The Game of Life is a very simple system; it only has three or four rules to it. But you can design all kinds of interesting configurations inside it. There's something called a glider gun that does nothing but spit out gliders, one at a time. And then, after a lot of effort, people managed to create AND gates and OR gates for gliders. There's this massive, ridiculous structure: if you have a stream of gliders coming in here and a stream of gliders coming in here, then you may produce a stream of gliders coming out. If both of the input streams have gliders, then there will be an output stream; but if only one of them does, then nothing comes out.
So they could build something like that. And once you can build these basic gates, then just from software engineering, you can build almost anything. You can build a Turing machine. They're these enormous steampunk-type things; they look ridiculous. But then people also generated self-replicating objects in the Game of Life: a massive machine, a von Neumann machine, which over a huge period of time (there are all these glider guns inside doing these very steampunk calculations) would create another version of itself, which could replicate again. That's so incredible. A lot of this was crowdsourced by amateur mathematicians, actually. So I knew about that work, and that's part of what inspired me to propose the same thing for Navier-Stokes.
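The Game of Life rules are simple enough that a minimal sketch fits here: a cell is born with exactly three live neighbors, and survives with two or three. The glider below is the classic five-cell pattern, which reappears shifted one cell diagonally every four generations.

```python
from collections import Counter

def step(alive):
    """One generation of Conway's Game of Life on an unbounded grid,
    with live cells stored as a set of (row, col) pairs."""
    # Count, for every cell adjacent to a live cell, how many live
    # neighbors it has.
    neighbor_counts = Counter(
        (r + dr, c + dc)
        for (r, c) in alive
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return {
        cell
        for cell, n in neighbor_counts.items()
        if n == 3 or (n == 2 and cell in alive)
    }

# The classic glider:
#   . O .
#   . . O
#   O O O
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
state = glider
for _ in range(4):
    state = step(state)
print(state == {(r + 1, c + 1) for (r, c) in glider})  # True
```

Building the AND gates, glider guns, and self-replicating machines described above takes enormously larger initial patterns, but they all evolve under exactly this one rule.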
Now, as I said, analog is much worse than digital, so you can't just directly take the constructions from the Game of Life and port them over. But again, it shows it's possible. There's a kind of emergence that happens with these cellular automata: local rules, maybe similar to fluids, I don't know, but local rules operating at scale can create these incredibly complex dynamic structures. Do you think any of that is amenable to mathematical analysis? Do we have the tools to say something profound about that? The thing is, you can get these emergent, very complicated structures, but only with very carefully prepared initial conditions.
These glider guns and gates and machines: if you just place some cells randomly, you will not see any of these. And that's the analogous situation with Navier-Stokes again: with typical initial conditions, you will not have any of this weird computation going on. But through engineering, by designing things in a very special way, you can make these clever constructions. And it would be wonderful to be able to prove the negative: to basically prove that only through engineering can you ever create something this interesting. This is a recurring challenge in mathematics, what I call the dichotomy between structure and randomness: most objects that you can generate in mathematics are random.
They look random. The digits of pi, we believe, are a good example. But there's a very small number of things that have patterns. Now, you can prove that something has a pattern just by exhibiting it: if something has a simple pattern, like repeating itself every so often, you can verify that. And you can prove, for example, that most sequences of digits have no pattern. If you just pick digits randomly, there's something called the law of large numbers that tells you you're going to get as many 1s as 2s in the long run. But we have a lot fewer tools for the opposite direction: if I give you a specific sequence, like the digits of pi, how can I show that it doesn't have some weird pattern to it?
Some other work that I've spent a lot of time on is proving what are called inverse theorems, which give tests for when something is very structured. So some functions are what's called additive. Say I give you a function from the natural numbers to the natural numbers; maybe 2 maps to 4, 3 maps to 6, and so forth. A function is additive if, when you add two inputs together, the outputs also get added. For example, multiplication by a constant: if you multiply a plus b by 10, that's the same as multiplying a by 10 and b by 10 and adding them together.
So some functions are additive. Some functions are kind of additive, but not completely additive. For example, take a number n, multiply it by the square root of two, and take the integer part of that. So 10 times the square root of two is 14 point something, so 10 maps to 14, and 20 maps to 28. In that case the function looks additive on these inputs: 10 plus 10 is 20, and 14 plus 14 is 28. But because of the rounding, sometimes there are round-off errors, and when you add a plus b this function doesn't quite give you the sum of the two individual outputs, but the sum plus or minus one. So it's almost additive, but not quite additive.
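The almost-additive function just described, the integer part of n times the square root of two, can be checked directly. A small sketch (the range tested is an arbitrary choice):

```python
import math

def f(n):
    """Integer part of n * sqrt(2): almost additive, but not quite."""
    return math.floor(n * math.sqrt(2))

# f(10) = 14 and f(20) = 28, so here f(10 + 10) equals f(10) + f(10).
# In general floor(x) + floor(y) <= floor(x + y) <= floor(x) + floor(y) + 1,
# so the "additivity error" f(a + b) - f(a) - f(b) is always 0 or 1.
errors = {f(a + b) - f(a) - f(b) for a in range(1, 200) for b in range(1, 200)}
print(f(10), f(20), sorted(errors))
```

Every additivity error in the tested range is 0 or 1, which is the "sum plus or minus one" behavior in the transcript.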
So there are a lot of useful results in mathematics, and I've worked a lot on proving things like this, to the effect that if a function exhibits some partial structure like this, then there's a reason why that's true, and the reason is that there's some other nearby function which is actually completely structured, and which explains the partial pattern that you have. And so if you have these inverse theorems, it creates this dichotomy: the objects you study either have no structure at all, or they are somehow related to something that is structured. And in either case, you can make progress.
A good example of this is an old theorem in mathematics called Szemerédi's theorem, proven in the 1970s. It concerns trying to find a certain type of pattern in a set of numbers: the pattern of arithmetic progressions, things like 3, 5, and 7, or 10, 15, and 20. And Szemerédi proved that any set of numbers that is sufficiently big, what's called positive density, has arithmetic progressions in it of any length you wish. So for example, the odd numbers have density one half, and they contain arithmetic progressions of any length. In that case it's obvious, because the odd numbers are really structured.
I can just take 11, 13, 15, 17; I can easily find arithmetic progressions in that set. But the theorem also applies to random sets. If I take the set of odd numbers, flip a coin for each number, and only keep the numbers for which I got heads, so I just randomly throw out half the numbers and keep the other half, that's a set with no patterns at all. But just from random fluctuations, you will still get a lot of arithmetic progressions in that set. Can you prove that there are arithmetic progressions of arbitrary length within a random...
Yes. I mean, one way is the infinite monkey theorem. Usually mathematicians give boring names to theorems, but occasionally they give colorful names. The popular version of the infinite monkey theorem is that if you have an infinite number of monkeys in a room, each with a typewriter, typing out text randomly, almost surely one of them is going to generate the entire script of Hamlet, or any other finite string of text. It will just take some time, quite a lot of time actually. But if you have an infinite number, then it happens.
So basically, the theorem says that if you take an infinite string of digits or whatever, eventually any finite pattern you wish will emerge. It may take a long time, but it will eventually happen. In particular, arithmetic progressions of any length will eventually appear. Would you need an extremely long random sequence for this to happen? I suppose that's intuitive. It's just infinity. Yeah, infinity absorbs a lot of sins.
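The claim that a coin-flip random set still contains long arithmetic progressions can be checked empirically. A brute-force sketch (the interval 1 to 500, the progression length 5, and the RNG seed are arbitrary choices; this naive search is an illustration, not how such results are proved):

```python
import random

def find_progression(s, length, limit):
    """Return (start, step) for an arithmetic progression of the given
    length inside the set s, searching starts and steps up to limit."""
    for start in range(1, limit + 1):
        for step in range(1, limit + 1):
            if all(start + i * step in s for i in range(length)):
                return start, step
    return None

# Structured set: the odd numbers up to 500.
odds = set(range(1, 500, 2))

# Random set: keep each number in 1..499 with probability 1/2.
rng = random.Random(42)
random_set = {n for n in range(1, 500) if rng.random() < 0.5}

print(find_progression(odds, 5, 500))        # found immediately: 1, 3, 5, 7, 9
print(find_progression(random_set, 5, 500))  # a 5-term progression almost surely exists
```

The expected number of 5-term progressions in a density-one-half subset of 1 to 499 is in the hundreds, so the random search essentially always succeeds, exactly as the transcript says.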
Yeah. How are we humans supposed to deal with infinity? Well, you can think of infinity as an abstraction of a finite number for which you do not have a bound. Nothing in real life is truly infinite. But you can ask these age-old questions like, what if I had as much money as I wanted? What if I could go as fast as I wanted? And the way in which mathematicians formalize that is that mathematics has found a formalism to idealize something being extremely large or extremely small as actually being exactly infinite or exactly zero.
And often the mathematics becomes a lot cleaner when you do that. In physics, we joke about assuming spherical cows. Real-world problems have all kinds of real-world effects, but you can idealize, send things to infinity, send something to zero, and the mathematics becomes a lot simpler to work with. I wonder how often using infinity forces us to deviate from the physics of reality.
Yeah, so there are a lot of pitfalls. We spend a lot of time in undergraduate math classes teaching analysis, and analysis is often about how to take limits safely. So for example, a plus b is always b plus a: when you have a finite number of terms, you can add them and swap them, and there's no problem. But when you have an infinite number of terms, there are all sorts of games you can play, where you can have a series which converges to one value, but you rearrange it and it suddenly converges to another value. And so you can make mistakes. You have to know what you're doing when you allow infinity. You have to introduce these epsilons and deltas, and there's a certain type of way of reasoning that helps you avoid mistakes.
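The rearrangement phenomenon mentioned here is Riemann's rearrangement theorem in action. A numerical sketch with the alternating harmonic series (the truncation points are arbitrary; the rearranged order takes two positive terms for each negative one, which converges to three halves of log 2 instead of log 2):

```python
import math

# The alternating harmonic series 1 - 1/2 + 1/3 - 1/4 + ... converges to log 2.
usual = sum((-1) ** (n + 1) / n for n in range(1, 200_001))

# Rearranged: two positive terms, then one negative term, repeating.
# Same terms in a different order, but the sum converges to (3/2) log 2.
rearranged = 0.0
odd, even = 1, 2
for _ in range(100_000):
    rearranged += 1 / odd + 1 / (odd + 2) - 1 / even
    odd += 4
    even += 2

print(usual)       # close to log 2       (about 0.6931)
print(rearranged)  # close to 1.5 * log 2 (about 1.0397)
```

Both sums use exactly the same terms; only the order differs, which is the kind of game that is impossible with finitely many terms.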
In more recent years, people have started taking results that are true in infinite limits and sort of finitizing them. So you know that something is true eventually, but you don't know when; now give me a rate. So, for example, if I don't have an infinite number of monkeys but a large finite number of monkeys, how long do I have to wait for Hamlet to come out? And that's a more quantitative question, and it's something that you can attack by purely finite methods, and you can use your finite intuition. In this case, it turns out to be exponential in the length of the text that you're trying to generate. And this is why you never see the monkeys create Hamlet. You can maybe see them create a four-letter word, but nothing big.
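The exponential waiting time can be made concrete with back-of-the-envelope arithmetic. A sketch (the 26-letter alphabet and the 180,000-character length for Hamlet are rough assumptions on my part; A to the power L is the right order of magnitude for strings with no self-overlap):

```python
import math

# Waiting-time arithmetic for the finitary monkey theorem.  For uniform
# random typing over an alphabet of A symbols, the expected number of
# keystrokes before a fixed length-L string (with no self-overlap)
# first appears is on the order of A**L: exponential in L.

A = 26  # letters only; spaces and punctuation ignored for simplicity

word = A ** 4  # keystrokes expected for a given four-letter word
print(word)    # 456976: feasible for one patient monkey

# For Hamlet, assume roughly 180,000 characters.  A**180000 is far too
# big to print, but its number of digits is 180000 * log10(A), rounded up.
hamlet_digits = math.floor(180_000 * math.log10(A)) + 1
print(hamlet_digits)  # a number with over 250,000 digits
```

So a four-letter word takes under half a million keystrokes on average, while the Hamlet estimate is a number whose mere digit count exceeds a quarter of a million, which is why the finite monkeys only ever manage the four-letter words.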
And so I personally find that once you finitize an infinite statement, it becomes much more intuitive and it's no longer so weird. So even if you're working with infinity, it's good to finitize so that you can have some intuition. The downside is that the finitary proofs are just much, much messier. And so the infinite results are found first, usually decades earlier, and then later on people finitize them. So since we've mentioned a lot of math and a lot of physics, what is the difference between mathematics and physics as disciplines, as ways of understanding and seeing the world? Maybe you can throw engineering in there. You mentioned your wife is an engineer, which gave you a new perspective on circuits.
Those are different ways of looking at the world, and given that you've done mathematical physics, you've worn all the hats. Right. So I think science in general is an interaction between three things. There's the real world. There's what we observe of the real world, our observations. And then there are our mental models of how we think the world works. We can't directly access reality. All we have are the observations, which are incomplete and have errors. And there are many, many cases where we'd want to know, for example, what the weather will be like tomorrow; we don't have the observation yet, and we'd like to predict. And then we have these simplified models, sometimes making unrealistic assumptions, spherical-cow-type things.
Those are the mathematical models. Mathematics is concerned with the models. Science collects the observations and proposes the models that might explain these observations. What mathematics does is stay within the model and ask, what are the consequences of that model? What predictions would the model make of future observations? Or how well does it postdict past observations, fitting the observed data? So there's definitely a symbiosis. I guess mathematics is unusual among disciplines in that we start from hypotheses, like the axioms of a model, and ask what conclusions come out of that model.
In almost any other discipline, you start with the conclusions you want. I want to do this, I want to build a bridge, I want to make money. Then you find the path to get there. There's a lot less speculation of the form: suppose I did this, what would happen? Planning and modeling. Speculative fiction maybe is one other place that does this. But mostly, the things we do in life are conclusions-driven, including physics and science. They want to know where this asteroid is going to go, what the weather is going to be tomorrow.
But physics also has this other direction of going from the axioms. What do you think of this tension in physics between theory and experiment? What do you think is the more powerful way of discovering truly novel ideas about reality? Well, you need both, top-down and bottom-up. It's really an interaction between all these things. Over time, the observations and the theory and the modeling should both get closer to reality. But initially that's not the case; they're always far apart to begin with. And you need one to figure out where to push the other. If your model is predicting anomalies that are not picked up by experiment, that tells the experimenters where to look to find more data to refine the models.
It goes back and forth. Within mathematics itself, there's also a theoretical and an experimental component. It's just that until very recently, theory has dominated almost completely: 99% of mathematics is theoretical mathematics, and there's a very tiny amount of experimental mathematics. People do do it. If they want to study prime numbers or whatever, they can generate large data sets. Once we had computers, we began to be able to do this a little bit.
Although even before computers, Gauss, for example, discovered what's now called the prime number theorem, one of the most basic theorems in number theory, which predicts roughly how many primes there are up to a million, or up to a trillion. It's not an obvious question. Basically, what he did was compute most of these primes, partly by himself, but also with hired human computers, people whose professional job it was to do arithmetic, to compute the first hundred thousand primes or so. He made tables and made a prediction.
That was an early example of experimental mathematics. But until very recently, theoretical mathematics was just much more successful, because doing complicated mathematical computations was simply not feasible. And even nowadays, even though we have powerful computers, only some mathematical things can be explored numerically. There's something called the combinatorial explosion. If you want to study, for example, Szemerédi's theorem, you might want to study all possible subsets of the numbers 1 to 1,000.
There are only 1,000 numbers; how bad could it be? It turns out the number of different subsets of 1 to 1,000 is 2 to the power 1,000, which is way bigger than anything any computer can enumerate, now or ever. So there are certain math problems that very quickly become intractable to attack by direct brute-force computation. Chess is another famous example: the number of chess positions is too large for a computer to fully explore.
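The 2 to the power 1,000 figure is easy to make vivid. A quick sketch (the age-of-the-universe figure of about 4.4 times 10 to the 17 seconds is an outside assumption, not from the conversation):

```python
# The combinatorial explosion: the number of subsets of {1, ..., 1000}.
n_subsets = 2 ** 1000
print(len(str(n_subsets)))  # a 302-digit number

# For scale: a machine checking a billion subsets per second since the
# Big Bang (~4.4e17 seconds) would cover a vanishing fraction of them.
checked = 10 ** 9 * int(4.4e17)
print(checked < n_subsets)  # True, by a factor of more than 10**270
```

This is why brute-force enumeration is a non-starter for problems like Szemerédi's theorem, even on any conceivable hardware.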
But now we have AI. We have tools to explore these spaces, not with 100% guarantees of success, but experimentally. So we can empirically solve chess now. The very, very good AIs don't explore every single position in the game tree, but they have found very good approximations. And people are actually using these chess engines to do experimental chess. They're revisiting all of chess theory: oh, this type of opening, this is a good type of move.
This is not. And they can use these chess engines to refine, and in some cases overturn, the conventional wisdom about chess. And I do hope that mathematics will incorporate a larger experimental component in the future, perhaps powered by AI. We'll of course talk about that. In the case of chess, and there's a similar thing in mathematics, the engine isn't providing a kind of formal explanation of the different positions. It's just saying which position is better, in a way that you can't always intuit as a human being.
And then from that, we humans can construct a theory of the matter. You've mentioned Plato's allegory of the cave. For people who don't know, it's where people are observing shadows of reality, not reality itself, and they believe what they're observing to be reality. Is that, in some sense, what mathematicians, and maybe all humans, are doing: looking at shadows of reality? Is it possible for us to truly access reality?
Well, there are these three ontological things: there's actual reality, there are observations, and there are models. And technically they are distinct, and I think they will always be distinct. But they can get closer over time. And the process of getting closer often means that you have to discard your initial intuitions. Astronomy provides great examples. An initial model of the world is that it's flat, because it looks flat.
And that it's big, and the rest of the universe is not; the sun, for example, looks really tiny. So you start off with a model which is really far from reality, but it fits the observations that you have, so things look good. But over time, as you make more and more observations that bring you closer to reality, the model gets dragged along with it. And so over time, we had to realize that the Earth was round, that it spins.
That it goes around the sun, that the solar system goes around the galaxy, and so on and so forth. And that the universe is expanding, and the expansion itself is accelerating. And in fact, very recently, within the last year or so, there's even evidence that the rate of that acceleration is non-constant. And the explanation behind why that is, is catching up. It's catching up.
I mean, there's still dark matter and dark energy. We have a model that fits the data really well; it just has a few parameters that you have to specify. People say, oh, those are fudge factors, and with enough fudge factors you can explain anything. But the point of a mathematical model is that you want to have fewer parameters in your model than data points in your observational set. If you have a model with 10 parameters that explains 10 observations, that is a completely useless model; it's what's called overfitted. But if you have a model with a handful of parameters and it explains a trillion observations, that's impressive. The standard cosmological model with dark matter, I think it has something like 14 parameters, and it explains petabytes of data that the astronomers have.
One way to think about a physical or mathematical theory is as a compression of the universe, a data compression. You have these petabytes of observations, and you'd like to compress them into a model which you can describe in five pages, with a certain number of parameters to specify, and which fits, to reasonable accuracy, almost all of the observations. The more compression you achieve, the better your theory. In fact, one of the great surprises of our universe and of everything in it is that it's compressible at all. That's the unreasonable effectiveness of mathematics. Einstein had a quote like that: the most incomprehensible thing about the universe is that it is comprehensible. And not just comprehensible; you can compress it down to an equation like E equals m c squared. There is actually a possible mathematical explanation for that: this phenomenon of universality.
Many complex systems at the macroscale emerge out of lots of tiny interactions at the microscale. Normally, because of the combinatorial explosion, you would think that the macroscale equations must be exponentially more complicated than the microscale ones. And they are, if you want to solve them completely exactly. If you want to model all the atoms in a box of air, that's like Avogadro's number of particles, a huge number. If you actually had to track each one, it would be ridiculous. But certain laws emerge at the macroscale that almost don't depend on what's going on at the microscale, or only depend on a very small number of parameters.
So if you want to model a gas of quintillions of particles in a box, you just need to know the temperature and pressure and volume, a few parameters, like five or six, and that models almost everything you need to know about these 10 to the 23 or however many particles. Now, we don't understand universality anywhere near as well as we would like mathematically. But there are much simpler toy models where we do have a good understanding of why universality occurs. The most basic one is the central limit theorem, which explains why the bell curve shows up everywhere in nature: so many things are distributed by what's called a Gaussian distribution, the famous bell curve. There's now even a meme with this curve. And even the meme applies broadly. Universality applies to the meme.
Yes, you can go meta if you like. But there are many, many processes. For example, you can take lots of independent random variables and average them together in various ways; you can take a simple average or a more complicated average. And we can prove in various cases that these bell curves, these Gaussians, emerge, and it is a satisfying explanation. But sometimes they don't. If you have many different inputs and they are all correlated in some systemic way, then you can get something very far from a bell curve showing up, and it's also important to know when that happens, when universality fails. Universality is not a 100% reliable thing to rely on. The global financial crisis was a famous example of this. People thought that mortgage defaults had this sort of Gaussian-type behavior.
That if you take a population of 100,000 Americans with mortgages and ask what proportion of them will default, then if everything was decorrelated, you would get a nice bell curve, and you can manage the risk with options and derivatives and so forth, and it is a very beautiful theory. But if there are systemic shocks in the economy that can push everybody into default at the same time, that's very non-Gaussian behavior, and this wasn't fully accounted for in 2008. Now I think there's more awareness that systemic risk is a much bigger issue, and that just because a model is pretty and nice, it may not match reality. So the mathematics of working out what models do is really important.
But so is the science of validating when the models fit reality and when they don't; you need both. But mathematics can help, because, for example, the central limit theorem tells you that if you have certain axioms, like non-correlation, so that all the inputs are not correlated to each other, then you get this Gaussian behavior and things are fine. It tells you where to look for weaknesses in the model. So if you have an understanding of the central limit theorem and someone proposes to use these Gaussian copulas or whatever to model default risk, if you're mathematically trained, you would say: okay, but what are the systemic correlations between all your inputs? And then you can ask the economists how much risk there is of that, and you can look for it.
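The central limit theorem behavior described in this stretch of the conversation shows up in a few lines of simulation. A sketch (the choice of uniform inputs, the sample sizes, and the seed are all arbitrary):

```python
import random
import statistics

# Average 100 independent uniform random variables, many times over;
# the distribution of the averages is approximately Gaussian.
rng = random.Random(0)
averages = [
    statistics.fmean(rng.random() for _ in range(100))
    for _ in range(20_000)
]

mean = statistics.fmean(averages)
stdev = statistics.pstdev(averages)

# A Gaussian puts about 68% of its mass within one standard deviation.
within = sum(abs(x - mean) <= stdev for x in averages) / len(averages)
print(round(mean, 3), round(stdev, 3), round(within, 2))
```

The one-sigma fraction lands near the Gaussian 68%, even though the raw inputs are uniform, not Gaussian; replacing the independent inputs with systemically correlated ones is exactly how the bell curve breaks down in the mortgage example.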
So there's always this synergy between science and mathematics. A little bit on the topic of universality. You're known and celebrated for working across an incredible breadth of mathematics, reminiscent of Hilbert a century ago. In fact, the great Fields Medal-winning mathematician Tim Gowers has said that you are the closest thing we have to Hilbert. He's a colleague of yours. Anyway, you are known for this ability to go both deep and broad in mathematics, so you're the perfect person to ask: do you think there are threads that connect all the disparate areas of mathematics?
Is there a kind of deep underlying structure to all of mathematics? There are certainly a lot of connecting threads, and a lot of the progress of mathematics can be described as taking two fields of mathematics that were previously not connected and finding connections. An ancient example is geometry and number theory. In the times of the ancient Greeks, these were considered different subjects. Mathematicians worked on both, geometry most famously, but also numbers, yet they were not really considered related. Well, a little bit: you could say that this length was five times that length, because you could line up five copies of the one length against the other, and so forth.
But it wasn't until Descartes that we really realized you could algebraize geometry: you can parameterize the plane, a geometric object, by two real numbers describing every point. And so geometric problems can be turned into problems about numbers. Today this feels almost trivial, there's no content to it, of course the plane is x and y, because that's what we teach and it's internalized. But it was an important development that these two fields were unified. And this process has gone on throughout mathematics over and over again. Algebra and geometry were separate, and now we have the subject of algebraic geometry that connects them. And so on, over and over again.
And that's certainly the type of mathematics that I enjoy the most. I think there are different styles of being a mathematician: hedgehogs and foxes. A fox knows many things a little bit; a hedgehog knows one thing very, very well. And in mathematics, there are definitely both hedgehogs and foxes, and then there are people who can play both roles. I think collaboration between mathematicians benefits from some diversity: a fox working with many hedgehogs, or vice versa. But I identify mostly as a fox, certainly. I like arbitrage, somehow: learning how one field works, learning the tricks of that field, and then going to another field, which people don't think is related, and adapting the tricks.
So, seeing the connections between the fields. There are other mathematicians who are far deeper than I am; they're really hedgehogs. They know everything about one field, and they're much faster and more effective in that field. But I can give them these extra tools. You said that you can be both a hedgehog and a fox, depending on the context and the collaboration. So can you, if it's at all possible, speak to the difference between those two ways of thinking about a problem? Say you're encountering a new problem: searching for the connections versus a very singular focus.
I'm much more comfortable with the fox paradigm. I like looking for analogies, narratives; I spend a lot of time on that. If there's a result I see in one field and I like the result, it's a cool result, but I don't like the proof, say it uses types of mathematics that I'm not super familiar with, I often try to reprove it myself using the tools that I favor. Often my proof is worse, but by the exercise of doing so, I can say: oh, now I can see what the other proof was trying to do. And from that, I can get some understanding of the tools that people use in that field.
So it's very exploratory, doing crazy things in crazy fields and reinventing the wheel a lot. Whereas the hedgehog style is, I think, much more scholarly, very knowledge-based. You stay up to speed on all the developments in your field; you know all the history; you have a very good understanding of exactly the strengths and weaknesses of each particular technique. I think hedgehogs rely a lot more on calculation than on trying to find narratives. I could do that too, but there are others who are extremely good at that.
Let's step back and maybe look at a bit of a romanticized version of mathematics. I think you've said that early on in your life, math was more like a puzzle-solving activity. When did you first encounter a problem or proof where you realized math can have a kind of elegance and beauty to it? That's a good question. When I came to graduate school at Princeton, John Conway was there at the time. He passed away a few years ago.
But I remember one of the very first research talks I went to was a talk by Conway on what he called extreme proofs. Conway just had this amazing way of thinking about all kinds of things in ways you would not normally think of. He thought of proofs themselves as occupying some sort of space. So if you want to prove something, let's say that there are infinitely many primes, there are all these different proofs, and you could rank them along different axes. Some proofs are elegant, some are long, some proofs are elementary, and so forth. So there's this cloud.
The space of all proofs itself has some sort of shape, and he was interested in the extreme points of this shape. Of all these proofs, which one is the shortest at the expense of everything else, or the most elementary, or whatever? He gave some examples of well-known theorems, and then he would give what he thought was the extreme proof along these different axes. I just found it really eye-opening: it's not just that getting a proof of a result is interesting, but once you have that proof, trying to optimize it in various ways is interesting too. Proving itself has some craftsmanship to it.
It has informed my writing style, because when you do your math assignments as an undergraduate, your homework and so forth, you're encouraged to just write down any proof that works. As long as it gets a tick mark, you move on. But if you want your results to actually be influential and be read by people, it can't just be correct. It should also be a pleasure to read, motivated, and adaptable so that it generalizes to other things. It's the same in many other disciplines, like coding. There are a lot of analogies between math and coding.
I like analogies, if you haven't noticed. You can code something in spaghetti code that works for a certain task, and it's quick and dirty and it works. But there are lots of good principles for writing code well, so that other people can use it, build upon it, and so on, and so that it has fewer bugs and whatever. There's something similar in mathematics. First of all, there's so many beautiful things there, and Conway is one of the great minds ever in mathematics, and in computer science. Just even considering the space of proofs.
And asking, okay, what does this space look like, and what are the extremes? You mentioned coding, and the analogy is interesting because there's also this activity called code golf, which I also find beautiful and fun, where people use different programming languages to try to write the shortest possible program that accomplishes a particular task. I believe there are even competitions on this.
Yeah, yeah. And it's also a nice way to stress-test not just the programs, or in this case the proofs, but also the different languages, and maybe which notation is best suited to which task. Yeah, you learn a lot. It may seem like a frivolous exercise, but it can generate all these insights which, if you didn't have this artificial objective to pursue, you might not see.
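To make the code-golf idea concrete, here is a readable solution to a classic toy task next to a golfed one-liner; both are illustrative examples, not from the conversation:

```python
# A readable FizzBuzz and a golfed one-liner computing the same list.
def fizzbuzz(n):
    out = []
    for i in range(1, n + 1):
        if i % 15 == 0:
            out.append("FizzBuzz")
        elif i % 3 == 0:
            out.append("Fizz")
        elif i % 5 == 0:
            out.append("Buzz")
        else:
            out.append(str(i))
    return out

# The golfed version trades all readability for character count.
golfed = lambda n: ["Fizz" * (i % 3 < 1) + "Buzz" * (i % 5 < 1) or str(i) for i in range(1, n + 1)]

print(fizzbuzz(15) == golfed(15))  # True
```

The two agree on every input, which is the point: the golfed version is an extreme point of the "space of programs" for this task, much like Conway's extreme proofs.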
What to you is the most beautiful or elegant equation in mathematics? I mean, one of the things that people often look to for beauty is simplicity, as when you look at E equals m c squared. So when a few concepts come together, that's beautiful. That's why Euler's identity is often considered the most beautiful equation in mathematics. Do you find beauty in that one, in Euler's identity?
Yeah, well, as I said, what I find most appealing is connections between different things that you wouldn't expect. So, e to the pi i equals minus one. People say, oh, it has all the fundamental constants. Okay, that's cute. But to me, the exponential function measures exponential growth: compound interest, or decay, anything which is continuously growing or continuously decreasing. Growth and decay, dilation or contraction, are modeled by the exponential function. Whereas pi comes from circles and rotation: if you want to rotate a needle, for example, 180 degrees, you need to rotate by pi radians. And i, the imaginary unit, represents a rotation from the real axis to the imaginary axis. So, a change in direction.
So the exponential function represents growth and decay in the direction where you already are. When you stick an i in the exponent, instead of motion in the same direction as your current position, the motion is at right angles to your position, so, rotation. And then e to the pi i equals minus one tells you that if you rotate for time pi, you end up facing the opposite direction. So it unifies the geometry of dilation, the dynamics of exponential growth, and rotation, this action of multiplication by i. It connects these different parts of mathematics, the exponential, circles, and the complex numbers, and they become almost next-door neighbors in mathematics because of this identity.
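The growth-versus-rotation reading of Euler's identity can be checked numerically. A sketch:

```python
import cmath
import math

# e**(i*pi) = -1: putting i in the exponent turns growth into rotation.
print(cmath.exp(1j * math.pi))  # approximately (-1+0j)

# Multiplying by e**(i*t) rotates a point in the plane by angle t.
quarter_turn = cmath.exp(1j * math.pi / 2)  # 90 degrees: should be i
half_turn = quarter_turn * quarter_turn     # two quarter turns: -1

print(abs(quarter_turn - 1j))  # ~0, up to floating-point error
print(abs(half_turn - (-1)))   # ~0 as well
```

Two quarter turns compose into a half turn, which is exactly the "rotate for time pi and end up facing the opposite direction" picture.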
Do you think the thing you mentioned is just cute? The collision of notations from these disparate fields: is it just a frivolous side effect, or do you think there is legitimate value when all our old friends come together in one equation? Well, it's a confirmation that you have the right concepts. When you first study anything, you have to measure things and give them names. And initially, sometimes because your model is, again, too far off from reality, you give the wrong things the best names, and you only find out later what's really important. Physicists can do this sometimes, but it turns out okay.
Actually, physics gives a good example with E equals m c squared. One of the big realizations was the E, the energy. When Aristotle first came up with his laws of motion, and then Galileo and Newton and so forth, they worked with the things they could measure: mass, acceleration, force, and so forth. In Newtonian mechanics, for example, F equals m a is the famous Newton's second law of motion. So those were the primary objects, and they were given the central billing in the theory. It was only later, after people started analyzing these equations, that there always seemed to be these quantities that were conserved.
In particular, momentum and energy. And it's not obvious that things have an energy; it's not something you can directly measure the same way you can measure mass and velocity and so forth. But over time, people realized that this was actually a really fundamental concept. Hamilton, eventually, in the 19th century, reformulated Newton's laws of physics into what's called Hamiltonian mechanics, where the energy, now called the Hamiltonian, is the dominant object. Once you know how to measure the Hamiltonian of any system, you can completely predict the dynamics, what happens to all the states.
It really was a central actor, which was not obvious initially. And this change of perspective really helped when quantum mechanics came along. The early physicists who studied quantum mechanics had a lot of trouble trying to adapt their Newtonian thinking, forces acting on particles and so forth, to quantum mechanics, because everything was now a wave, and it just looked really, really weird. Like, what is the quantum version of F equals m a? It's really, really hard to give an answer to that.
But it turns out that the Hamiltonian, which was sort of secretly behind the scenes in classical mechanics, is also the key object in quantum mechanics. There's also an object called the Hamiltonian. It's a different type of object, what's called an operator rather than a function, but again, once you specify it, you specify the entire dynamics. There's Schrödinger's equation that tells you exactly how quantum systems evolve once you have the Hamiltonian. So side by side, they look like completely different objects, you know, one involves particles, one involves waves and so forth.
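The quantum counterpart referred to here is Schrödinger's equation; in standard notation (again the conventional symbols, not from the conversation), the operator Hamiltonian determines the evolution of the state.

```latex
% Schrodinger's equation: once the Hamiltonian operator \hat{H} is
% specified, the time evolution of the quantum state \psi is fixed:
i\hbar \frac{\partial \psi}{\partial t} = \hat{H}\,\psi
```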
But with this centrality, you could start actually transferring a lot of intuition and facts from classical mechanics to quantum mechanics. So for example, in classical mechanics, there's this thing called Noether's theorem. Every time there's a symmetry in a physical system, there is a conservation law. So the laws of physics are translation invariant. If I move ten steps to the left, I experience the same laws of physics as I did here, and that corresponds to conservation of momentum. If I turn around by some angle, again I experience the same laws of physics; this corresponds to the conservation of angular momentum. If I wait for ten minutes, I still have the same laws of physics, so there's time translation invariance, and this corresponds to the law of conservation of energy. So there's this fundamental connection between symmetry and conservation. And that's also true in quantum mechanics, even though the equations are completely different.
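The correspondence just described can be summarized in the standard Hamiltonian language (a sketch using the usual conventions): a quantity Q with no explicit time dependence is conserved exactly when its Poisson bracket with H vanishes.

```latex
% Noether-type correspondence between symmetries and conserved quantities:
%   translation in space -> momentum
%   rotation             -> angular momentum
%   translation in time  -> energy (the Hamiltonian itself)
% A quantity Q with no explicit time dependence is conserved when
\frac{dQ}{dt} = \{Q, H\} = 0
```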
But because they're both coming from the Hamiltonian, the Hamiltonian controls everything. Every time the Hamiltonian has a symmetry, the equations will have a conservation law. So once you have the right language, it actually makes things a lot cleaner. One of the reasons why we can't unify quantum mechanics and general relativity yet is that we haven't figured out what the fundamental objects are. For example, we probably have to give up the notion of space and time being these almost Euclidean-type spaces. We kind of know that at very tiny scales there are going to be quantum fluctuations, a spacetime foam, and trying to use Cartesian coordinates x, y, z is going to be a non-starter. But we don't know what to replace it with. We don't actually have the mathematical concepts, the analogue of the Hamiltonian that sort of organizes everything.
Does your gut say that there is a theory of everything? That it's even possible to unify, to find this language that unifies general relativity and quantum mechanics? I believe so. The history of physics has been one of unification, much like mathematics, over the years. Electricity and magnetism were separate theories, and then Maxwell unified them. Newton unified the motions of the heavens with the motions of objects on the earth, and so forth. So it should happen. It's just that, to go back to this model of observations and theory, part of our problem is that physics is a victim of its own success. Our two big theories of physics, general relativity and quantum mechanics, are so good now.
So together, they cover 99.9% of sort of all the observations we can make. And you have to go to extremely high-energy particle accelerators, or the early universe, or things that are really hard to measure, in order to get any deviation from either of these two theories, to the point where you can figure out how to combine them together. But I have faith that we'll get there. We've been doing this for centuries. We've made progress before. There's no reason why we should stop. Do you think it will be a mathematician that develops the theory of everything? What often happens is that when the physicists need some theory of mathematics, there's often some precursor that the mathematicians worked out earlier.
So when Einstein started realizing that space was curved, he went to a mathematician and asked, is there some theory of curved space that the mathematicians already came up with that could be useful? And the answer was, yes, we might have come up with something. Mathematicians had already developed Riemannian geometry, which is precisely a theory of spaces that are curved in various general ways, and which turned out to be almost exactly what was needed for Einstein's theory. It's connected to Wigner's observation about the unreasonable effectiveness of mathematics. The theories that work well to explain the universe tend to also involve the same mathematical objects that work well to solve mathematical problems.
Ultimately, they're both just ways of organizing data in useful ways. It just feels like you might need to go to some weird land that's very hard to intuit. Like, you have string theory. Yeah, that was a leading candidate for many decades. I think it's slowly falling out of fashion because it's not matching experiment. So one of the big challenges, of course, like you said, is that experiment is very tough, because of how effective both theories are. But the other is that you're not just deviating from space-time, you're going into some crazy number of dimensions.
You're doing all kinds of weird stuff that, to us, well, we've gone so far from the flat earth that we started with. It's very hard to use our limited, descendant-of-apes cognition to intuit what that reality really is like. This is why analogies are so important. So yeah, the round earth is not intuitive because we're stuck on it. But round objects in general, we have pretty good intuition about, and we have intuition about how light works and so forth. And it's actually a good exercise to work out how eclipses and the phases of the moon can be really easily explained by round-earth and round-moon models. And you can just take a basketball and a golf ball and a light source and actually do these things yourself.
So the intuition is there, but you have to transfer it. That was a big leap, intellectually, for us to go from flat earth to round earth, because our life is mostly lived in flatland. Yeah, and we now take it for granted. We take so many things for granted because science has established a lot of evidence for this kind of thing. But, you know, we're on a round rock. Yeah, flying through space. Yeah, yeah. That's a big leap. And you have to take a chain of those leaps the more and more we progress. Right.
Yeah. So modern science is maybe, again, a victim of its own success, in that in order to be more accurate, it has to move further and further away from your initial intuition. And so for someone who hasn't gone through the whole process of science education, it looks more and more suspicious because of that. So we need more grounding. I mean, there are scientists who do excellent outreach, and there are lots of science things that you can do at home.
There are lots of YouTube videos. I did a YouTube video recently with Grant Sanderson, who we talked about earlier, about how the ancient Greeks were able to measure things like the distance to the moon and the size of the earth, using techniques that you could also replicate yourself. It doesn't all have to be fancy space telescopes and very intimidating mathematics. Yeah, I highly recommend that. I believe you have a lecture on it, and you also did an incredible video with Grant. It's a beautiful experience to try to put yourself in the mind of a person from that time, shrouded in mystery.
You know, you're on this planet, you don't know the shape of it, the size of it. You see some stars, you see some things, and you try to localize yourself in this world, and try to make some kind of general statements about distances to places. Yeah, yeah. Change of perspective is really important. They say travel broadens the mind. This is intellectual travel. Put yourself in the mind of the ancient Greeks or some other persons of another time period. Make hypotheses, spherical cows, whatever. Speculate.
And, you know, this is what mathematicians do, and what some artists do, actually. It's just incredible that, given the extreme constraints, you could still say very powerful things. That's why it's inspiring, looking back in history, how much could be figured out when you didn't have much. That's kind of how math works: if you propose axioms, the mathematics lets you follow those axioms to their conclusions, and sometimes you can get quite a long way from your initial hypotheses.
Staying in the land of the weird, you mentioned general relativity. You've contributed to the mathematical understanding of Einstein's field equations. Can you explain this work? And from a mathematical standpoint, what aspects of general relativity are intriguing to you, challenging to you? I have worked on some equations. There's something called the wave maps equation, also called the sigma field model, which is not quite the equation of space-time gravity itself, but of certain fields that might exist on top of space-time.
So, Einstein's equations of relativity describe space and time itself. But then there are other fields that live on top of that. There's the electromagnetic field, there are things like Yang-Mills fields, and there's this whole hierarchy of different equations, of which Einstein's is considered one of the most nonlinear and difficult. But relatively low on the hierarchy was this thing called the wave maps equation. It's a wave which at any given point is fixed to lie on a sphere.
So you can think of a bunch of arrows in space and time, pointing in different directions, but they propagate like waves. If you wiggle an arrow, it will propagate and make all the arrows move, kind of like sheaves of wheat in a wheat field. And I was interested in the global regularity problem again for this equation: is it possible for all the energy to collect at a point? The equation I considered is what's called a critical equation, where the behavior at all scales is roughly the same. And I was able, barely, to show that you couldn't actually force a scenario where all the energy concentrated at one point. The energy would always disperse a little bit, and once it dispersed a little bit, it would stay regular. This was back in 2000. That was part of why I got into Navier-Stokes afterwards.
Actually, yeah, so I developed some techniques to solve that problem. Part of it was that this problem is really nonlinear because of the curvature of the sphere. There was a certain nonlinear effect which was non-perturbative: when you looked at it naively, it looked larger than the linear effects of the wave equation. And so it was hard to keep things under control, even when the energy was small. But I developed what's called a gauge transformation. So the equation is kind of like an evolving field of wheat, with everything bending back and forth, so there's a lot of motion. But imagine stabilizing the flow by attaching little cameras at different points in space, which move in a way that captures most of the motion. Under this sort of stabilized flow, the flow becomes a lot more linear.
I discovered a way to transform the equation to reduce the amount of nonlinear effects, and then I was able to solve the equation. I found this transformation while visiting my aunt in Australia. I was trying to understand the dynamics of all these fields, and I couldn't do it with pen and paper, and I didn't have access to computers to do any computer simulations. So I ended up closing my eyes, lying on the floor, and just imagining myself to actually be this vector field, rolling around to try to see how to change coordinates in such a way that the nonlinear interactions would behave in a reasonably linear fashion. And my aunt walked in while I was doing that, and she asked what I was doing. It's complicated, I said.
And she said, okay, fine, you're a young man, I won't ask questions. I have to ask about how you approach solving difficult problems. If it's possible to go inside your mind when you're thinking, are you visualizing mathematical objects, symbols? What are you visualizing in your mind, usually, when you're thinking? A lot of pen and paper. One thing you pick up as a mathematician is the art of cheating strategically. The beauty of mathematics is that you get to change the problem, change the rules as you wish. You don't get to do this in any other field.
Like, you know, if you're an engineer and someone says, put a bridge over this river, you can't say, I want to build the bridge over here instead, or I want to build it out of paper instead of steel. But in mathematics, you can do whatever you want. It's like trying to solve a computer game where you have unlimited cheat codes available. And so you can say, there's a dimension that's large; I'll set it to one and solve the one-dimensional problem first. Or there's a main term and an error term; I'm going to make a spherical-cow assumption and assume the error term is zero.
And so, the way you solve these problems is not in this iron-man mode where you make things maximally difficult. The way you should approach any reasonable math problem is: if there are ten things that are making it difficult, find a version of the problem that turns off nine of the difficulties and only keeps one of them. So you install nine cheats. If you install all ten cheats, the game is trivial. But you install nine cheats, you solve one problem, and that teaches you how to deal with that particular difficulty. And then you turn that one off and you turn something else on, and then you solve that one.
And after you know how to solve the ten difficulties separately, then you have to start merging them a few at a time. When I was a kid, I watched a lot of these Hong Kong action movies, from my culture. And one thing is that every time there's a fight scene, the hero gets swarmed by a hundred bad-guy goons or whatever, but it'll always be choreographed so that he's only fighting one person at a time, and then he defeats that person and moves on. And because of that, he can defeat all of them. Whereas if they had fought a bit more intelligently and just swarmed the guy at once, it would make for much worse cinema, but they would win.
Are you usually pen and paper? Are you working with computer and LaTeX? Mostly pen and paper, actually. So in my office I have four giant blackboards, and sometimes I just have to write everything I know about the problem on the four blackboards and then sit on my couch and just sort of see the whole thing. Is it all symbols, like notation, or are there drawings? Oh, there's a lot of drawing, and a lot of bespoke doodles that only make sense to me, and a lot of the blackboard gets erased. It's a very organic thing. I'm beginning to use computers more and more, partly because AI makes it much easier to do simple coding things.
If I wanted to plot a function before, something moderately complicated, some iteration or something, I'd have to remember how to set up a Python program, how a for loop works, debug it, and it would take two hours and so forth. And now I can do it in 10, 15 minutes at most. So I'm using computers more and more to do simple explorations. Let's talk about AI a little bit, if we could. So maybe a good entry point is just talking about computer-assisted proofs in general. Can you describe the Lean formal proof programming language, how it can help as a proof assistant, and maybe how you started using it and how it has helped you?
So Lean is a computer language, much like standard languages like Python and C and so forth, except that in most languages the focus is on producing executable code. Lines of code do things: they flip bits, or they make a robot move, or they deliver you text and so forth. Lean is a language that can also do that, it can be run as a standard traditional language, but it can also produce certificates. So a language like Python might do a computation and tell you the answer is seven, that the sum of three plus four is equal to seven. But Lean can produce not just the answer but a proof of how it got the answer, that seven is three plus four, and all the steps involved.
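A minimal Lean 4 illustration of this point (a sketch in the standard syntax, not code from the conversation): the line below doesn't just compute 3 + 4, it states the fact and has the kernel check a proof of it.

```lean
-- The statement 3 + 4 = 7 together with a machine-checked proof.
-- `rfl` asks Lean's kernel to verify that both sides compute to the
-- same value, producing a certificate rather than just the answer 7.
example : 3 + 4 = 7 := rfl
```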
So it creates these more complicated objects, not just statements but statements with proofs attached to them, and every line of code is just a way of piecing together previous statements to create new ones. So the idea is not new. These things are called proof assistants. They provide languages in which you can create quite complicated, intricate mathematical proofs, and they produce these certificates that give a 100% guarantee that your arguments are correct, if you trust the compiler of Lean. But they made the compiler really small, and there are several different independent compilers available for the same language.
Can you give people some intuition about the difference between writing on pen and paper versus using the Lean programming language? How hard is it to formalize a statement? So a lot of mathematicians were involved in the design of Lean. It's designed so that individual lines of code resemble individual lines of mathematical argument. You might want to introduce a variable; you might want to prove by contradiction. There are various standard things that you can do, and it's written so that, ideally, it should be like a one-to-one correspondence.
In fact it isn't, because Lean is like explaining a proof to an extremely pedantic colleague who will point out, did you really mean this? What happens if this is zero? How do you justify this? So Lean has a lot of automation in it to try to be less annoying. For example, every mathematical object has to come with a type. If I talk about x, is x a real number or a natural number or a function or something? If you write things informally, it's clear from context. You say, let x be the sum of y and z; if y and z were already real numbers, x should also be a real number.
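In Lean this bookkeeping is explicit. A small sketch of the same example (assuming Mathlib is available for the real numbers ℝ):

```lean
import Mathlib.Data.Real.Basic

-- Every object carries a type. Given real numbers y and z, Lean
-- infers that their sum is also a real number, so this typechecks:
example (y z : ℝ) : ℝ := y + z
```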
So Lean can do a lot of that. But every so often it says, wait a minute, can you tell me more about what this object is? What type of object is it? You have to think at a more foundational level, not just about the computations you're doing but about what each object actually is, in some sense. Is it using something like LLMs to do the type inference, or is it more rule-based? It's using much more traditional, good old-fashioned AI. You can represent all these things as trees, and there are algorithms to match one tree to another tree. So it's actually doable to figure out whether something is a real number or a natural number. Every object sort of comes with a history of where it came from, and you can trace it back. Oh, I see. Yeah. So it's designed for reliability.
So modern AIs are not used in this; it's a very separate technology. But people are beginning to use AIs on top of Lean. When a mathematician tries to program a proof in Lean, often there's a step: okay, now I want to use the fundamental theorem of calculus to do the next step. So the Lean developers have built this massive project called Mathlib, a collection of tens of thousands of useful facts about mathematical objects. And somewhere in there is the fundamental theorem of calculus, but you need to find it. So a lot of the bottleneck now is actually lemma search. There's a tool that you know is in there somewhere, and you need to find it.
And so there are various search engines specialized for Mathlib that you can use. But there are now these large language models where you can say, I need the fundamental theorem of calculus at this point. For example, when I code I have GitHub Copilot installed as a plug-in to my IDE, and it scans my text and sees what I need, without me even typing it: now I need to use the fundamental theorem of calculus. And then it might suggest, okay, try this. And maybe 25% of the time it works exactly, and another 10 to 15% of the time it doesn't quite work, but it's close enough that I can say, oh, I'll just change it here and here and it will work. And then like half the time it gives me complete rubbish.
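One built-in answer to the lemma-search bottleneck is Mathlib's `exact?` tactic, which searches the library for a lemma that closes the current goal. A small sketch (the specific goal here is an illustration, not an example from the conversation):

```lean
import Mathlib.Tactic

-- `exact?` searches Mathlib for a lemma matching the goal and, on
-- success, reports the exact invocation (here, `exact add_comm a b`).
example (a b : ℝ) : a + b = b + a := by
  exact?
```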
So people are beginning to use AIs a little bit on top. Most of them are at the level of basically fancy autocomplete: you can type half of a line of a proof and it will complete the rest for you. But fancy, especially fancy with a capital F, can remove some of the friction a mathematician might feel when they move from pen and paper to formalizing. Yes. Yeah. So right now I estimate that the time and effort taken to formalize a proof is about ten times the amount taken to write it out. So it's doable, but it's annoying. But doesn't it kill the whole vibe of being a mathematician?
Yeah, having a pedantic co-author, right? If that was the only aspect of it, okay. But there are cases where it was actually more pleasant to do things formally. So there's a theorem I formalized, and there was a certain constant, 12, that came out of it in the final statement. And this 12 had to be carried all through the proof, and everything had to be checked as it goes: all these other numbers had to be consistent with this final number 12. And so we wrote a paper proving the theorem with this number 12.
And then a few weeks later, we said, oh, we can actually improve this 12 to an 11 by reworking some of these steps. And when this happens with pen and paper, every time you change a parameter, you have to check line by line that every single line of your proof still works. And there can be subtle things that you didn't quite realize, some properties of the number 12 that you didn't even realize you were taking advantage of. So a proof can break down at a subtle place. So we had formalized the proof with this constant 12.
And then when this improvement came out, well, it had taken like three weeks and 20 people to formalize the original proof. We said, oh, but now let's update the 12 to an 11. And what you can do with Lean is, you just change the 12 to an 11 in your headline theorem and run the compiler. And of the thousands of lines of code you have, 90 percent of them still work, and there are a couple that are highlighted in red: it can't justify these steps anymore. But it immediately isolates which steps you need to change.
But you can skip over everything that works just fine. And if you program things correctly, with good programming practices, most of your lines will not be red. If you don't hard-code your constants but use smart tactics and so forth, you can localize the things you need to change to a very small portion of the code. So within a day or two we had updated our proof, because it's a very quick process. You make a change; there are ten things that now don't work. For each one you make a change, and now there are five more things that don't work. But the process converges much more smoothly than with pen and paper.
So that's for writing. Are you able to read it? If somebody else has a proof, how does that compare with reading a paper? So the proof is longer, but each individual piece is easier to read. If you take a math paper and you jump to page 27 and you look at paragraph six, at a line of text or math, I often can't read it immediately, because it assumes various definitions which I have to go back and find, maybe defined ten pages earlier, and the proof is scattered all over the place. You basically are forced to read fairly sequentially. It's not like, say, a novel, where in theory you could open it up halfway through and start reading. There's a lot of context.
But with a proof in Lean, if you put your cursor on a line of code, you can hover over every single object there and it will say what it is, where it came from, how it was justified. You can trace things back much more easily than by flipping through a math paper. So one thing that Lean really enables is collaborating on proofs at a really atomic scale that you really couldn't do in the past. Traditionally, with pen and paper, when you want to collaborate with another mathematician, either you do it at a blackboard, where you can really interact, or, if you're doing it by email or something, you basically have to segment it: I'm going to finish section three, you do section four. But you can't really work on the same thing collaboratively at the same time. With Lean, you can be trying to formalize some portion of the proof and say, I got stuck at line 67 here, I need to prove this thing, but it doesn't quite work.
Here are the three lines of code I'm having trouble with. But because all the context is there, someone else can say, oh, okay, I recognize what you need to do; you need to apply this trick or this tool. And you can have extremely atomic-level conversations. So because of Lean, I can collaborate with dozens of people across the world, most of whom I've never met in person. And I may not even know how reliable they are in the proofs they make, but Lean gives me a certificate of trust, so I can trust their mathematics.
So there are so many interesting questions here. You're known for being a great collaborator. What is the right way to approach solving a difficult problem in mathematics when you're collaborating? Are you doing a divide-and-conquer type of thing, or are you focused on a particular part and brainstorming? There's always a brainstorming process first. Yeah, math research projects, sort of by their nature, are such that when you start, you don't really know how to do the problem. It's not like an engineering project where the theory has been established for decades and its implementation is the main difficulty. You have to figure out even what the right path is.
So this is what I said about cheating first. To go back to the bridge-building analogy: first assume you have an infinite budget and an unlimited workforce and so forth. Now can you build this bridge? Okay, now you have an infinite budget but only a finite workforce. Now can you do that? And so forth. Of course, no engineer can actually do this; they have fixed requirements. But yes, there are these jam sessions at the beginning where you try all kinds of crazy things and you make all these assumptions that aren't realistic, but you plan to fix them later. And you try to see if there's even some skeleton of an approach that might work.
And then hopefully that breaks up the problem into smaller subproblems which you don't know how to do, but then you focus on the sub-ones. And sometimes different collaborators are better at working on certain things. So one of the theorems I'm known for is a theorem with Ben Green, which is now called the Green-Tao theorem. It's the statement that the primes contain arithmetic progressions of any length. It built on theorems that already existed. And the way we collaborated was that Ben had already proven a similar result for progressions of length three. He showed that sets like the primes contain lots and lots of progressions of length three, and even certain subsets of the primes do. But his techniques only worked for length-three progressions. They didn't work for longer progressions.
But I had these techniques coming from ergodic theory, which is something I had been playing with and knew better than Ben did at the time. And so, if I could justify certain randomness properties of some set relating to the primes, a certain technical condition, if Ben could supply me that fact, I could conclude the theorem. But what I asked for was a really difficult question in number theory, and he said, there's no way we can prove this. He said, can you prove your part of the theorem using a weaker hypothesis that I have a chance of proving? And he proposed something which he could prove, but it was too weak for me. I couldn't use it.
So there was this conversation going back and forth. So you had different cheats, too. Yeah, I wanted to cheat more, and he wanted to cheat less. But eventually we found a property which, A, he could prove, and B, I could use, and then we could prove our theorem. And so there are all kinds of dynamics. Every collaboration has some story; no two are the same. And then, on the flip side of that, like you mentioned, with Lean programming, it's almost like a different story, because you can create, I think you've mentioned, a kind of blueprint for a problem.
And then you can really do a divide and conquer with Lean, where you're working on separate parts, using the proof checker essentially to make sure that everything is correct along the way. Right, it makes everything compatible and trustable. So currently only a few mathematical projects can be cut up in this way. At the current state of the art, most of the Lean activity is on formalizing proofs that have already been proven by humans. And a math paper basically is a blueprint, in a sense. It takes a difficult statement, a big theorem, and breaks it up into maybe a hundred little lemmas.
But often they're not all written with enough detail that each one can be directly formalized. A blueprint is like a really pedantically written version of a paper, where every step is explained in as much detail as possible, and each step is made as self-contained as possible, depending only on a very specific set of previously proven statements, so that each node of the blueprint graph that gets generated can be tackled independently of all the others. And you don't even need to know how the whole thing works. So it's like a modern supply chain. If you want to create an iPhone or some other complicated object, no one person can build the whole thing.
But you can have a specialist who, if they're given some widgets from another company, can combine them together to form a slightly bigger widget. I think there's a really exciting possibility there, because if you can find problems that can be broken down this way, then you can have thousands of contributors, right? Yes, yes, distributed. So I told you before about the split between theoretical and experimental mathematics. And right now, most mathematics is theoretical, and only a tiny bit is experimental. I think the platform that Lean and other software tools, GitHub and things like that, provide will allow experimental mathematics to scale up to a much greater degree than we can do now.
So right now, if you want to do any mathematical exploration of some mathematical pattern or something, you need to write code for the pattern. And sometimes there are computer algebra packages that help, but often it's just one mathematician coding lots and lots of Python or whatever. And because coding is such an error-prone activity, it's not practical to have other people collaborate with you on writing modules for your code, because if one of the modules has a bug in it, the whole thing is unreliable. So you get this spaghetti code written by mathematicians who are not professional programmers, and it's clunky and slow.
And so because of that, it's hard to really mass-produce experimental results. But I think with Lean, I'm already starting some projects where we are not just experimenting with data but experimenting with proofs. So I have this project called the Equational Theories Project. Basically, we generated about 22 million little problems in abstract algebra. We should back up and tell you what the project is. Okay, so abstract algebra studies operations like multiplication and addition and their abstract properties. Multiplication, for example, is commutative: X times Y is always Y times X, for numbers. And it's also associative: X times (Y times Z) is the same as (X times Y) times Z. But operations can obey some laws and not others. For example, X times X is not always equal to X; that law is not always true.
So given any operation, it obeys some laws and not others. We generated about 4,000 of these possible laws of algebra that operations can satisfy, and our question is which laws imply which other ones. So for example, does commutativity imply associativity? And the answer is no, because it turns out you can describe an operation which obeys the commutative law but doesn't obey the associative law. So by producing an example, you can show that commutativity does not imply associativity. But some laws do imply other laws, by substitution and so forth, and you can write down algebraic proofs. So we look at all the pairs between these 4,000 laws, which gives about 22 million pairs, and for each pair we ask: does this law imply that law? If so, give a proof. If not, give a counterexample.
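The counterexample idea can be sketched in miniature. The following Python snippet is an illustrative toy of my own, not the project's actual Lean code: it enumerates all 16 binary operations on a two-element set and finds one that obeys the commutative law but not the associative law.

```python
from itertools import product

# Toy version of the implication question on a tiny 2-element carrier set.
elems = (0, 1)

def is_commutative(op):
    return all(op[(x, y)] == op[(y, x)] for x, y in product(elems, repeat=2))

def is_associative(op):
    return all(op[(op[(x, y)], z)] == op[(x, op[(y, z)])]
               for x, y, z in product(elems, repeat=3))

# Enumerate all 2^4 = 16 binary operations on {0, 1} as lookup tables.
keys = list(product(elems, repeat=2))
tables = [dict(zip(keys, values)) for values in product(elems, repeat=len(keys))]

# A witness operation that is commutative but not associative: its existence
# shows that commutativity does not imply associativity.
witness = next(op for op in tables if is_commutative(op) and not is_associative(op))
print(witness)
```

The real project does this kind of search over a vastly larger space of laws and carrier sets, with Lean certifying each proof or counterexample.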
So 22 million problems, each one of which you could give to an undergraduate algebra student, and they'd have a decent chance of solving it. Although of these 22 million, there are like 100 or so that are really quite hard. But a lot are easy. And the project was just to determine the entire graph, which laws imply which other ones. That's an incredible project, by the way. Such a good idea, such a good test of the very thing we've been talking about, at a scale that's remarkable.
Yeah, so it would not have been feasible before. The state of the art in the literature was like 15 equations; that's sort of the limit of what a human with pen and paper can do. So you need to scale that up, so you need to crowdsource. But you also need trust: no one person can check 22 million of these proofs, so it needs to be computerized. And so it only became possible with Lean. We were hoping to use a lot of AI as well. So the project is almost complete: of these 22 million, all but two have been settled.
Well, actually, for those two we have a pen and paper proof, and we're formalizing it. In fact, just this morning I was working on finishing it. So we're almost done with this. It's incredible. And the fact that you were able to get about 50 contributors, which in mathematics is considered a huge number. It's a huge number. Yeah, crazy.
Yeah, so the paper has about 50 authors, plus a big appendix of who contributed what. Wow. Here's an interesting question, maybe to speak even more generally about it: when you have this pool of people, is there a way to organize the contributions by the level of expertise of the contributors? Okay. I'm asking you a lot of pothead questions here, but I'm imagining a bunch of humans and maybe, in the future, some AIs.
Yeah. Can there be like an Elo rating type of situation, a gamification of this? The beauty of these Lean projects is that you automatically get all this data. Yeah. Everything is uploaded to GitHub, and GitHub tracks who contributed what. So you could generate statistics at any later point in time; you could say, oh, this person contributed this many lines of code or whatever. I mean, these are very crude metrics. I would definitely not want this to become part of your tenure review or something.
But I mean, already in enterprise computing, people do use some of these metrics as part of the assessment of an employee's performance. Again, this is a direction that's a bit scary for academics to go down. We don't like metrics so much. And yet academics use metrics; they just use old ones, like number of papers.
Yeah. It's true. It feels like this metric, while flawed, is going more in the right direction, right? Yeah. It's interesting. At least it's a very interesting metric. Yeah, I think it's interesting to study. You can do studies of whether these are better predictors. There's this problem called Goodhart's law: if a statistic is actually used to incentivize performance, it becomes gamed, and then it is no longer a useful measure.
Oh, humans always. Yeah. Yeah. I mean, it's rational. So what we've done for this project is self-reporting. There are actually standard categories from the sciences for the types of contributions people make: there's conceptualization, validation, resources, coding, and so forth. There's a standard list of a dozen or so categories, and we just ask each contributor, in this big matrix of all the categories, to tick the boxes where they think they contributed, just to give a rough idea: oh, so you did some coding and you provided some compute, but you didn't do any of the pen and paper verification, or whatever. And I think that works out.
Traditionally, mathematicians just order authors alphabetically by surname, so we don't have this tradition, as in the other sciences, of lead author and second author and so forth. We're proud of that; we make all the authors equal status. But it doesn't quite scale to this size. So a decade ago, I was involved in these things called Polymath projects. It was crowdsourcing mathematics, but without the Lean component. So it was limited: you needed a human moderator to actually check that all the contributions coming in were valid, and this was a huge bottleneck, actually. Still, we had projects with, you know, 10 authors or so. But we had decided at the time not to try to decide who did what, and to use a single pseudonym instead.
So we created this fictional character called D.H.J. Polymath, in the spirit of Bourbaki; Bourbaki is the pseudonym for a famous group of mathematicians in the 20th century. And so the paper was authored under the pseudonym, so none of us got the author credit. This actually turned out to be not so great, for a couple of reasons. One is that if you actually wanted to be considered for tenure or whatever, you could not use this paper among your submitted publications, because it didn't have the formal author credit. But the other thing that we recognized rather later is that when people referred to these projects, they naturally referred to the most famous person who was involved. Oh, this was Tim Gowers's project, this was Tim Gowers's project. And they wouldn't mention the other 19 or whatever people that were involved.
So we're trying something different this time around, where everyone's an author, but we'll have an appendix with this matrix, and we'll see how that works. I mean, both projects are incredible, just the fact that you're involved in such huge collaborations. But I think I saw a talk from Kevin Buzzard about the Lean programming language a few years ago, and he was saying that this might be the future of mathematics. And so it's also exciting that you, one of the greatest mathematicians in the world, are embracing this, what seems like the paving of the future of mathematics.
So I have to ask you here about the integration of AI into this whole process. DeepMind's AlphaProof was trained using reinforcement learning on both failed and successful formal Lean proofs of IMO problems. So this is sort of high-level high school... Oh, very high level, yes. Very high-level high-school mathematics problems. What do you think about the system? And maybe, what is the gap between this system, which is able to prove high-school-level problems, and graduate-level problems?
Yeah, the difficulty increases exponentially with the number of steps involved in the proof; it's a combinatorial explosion. And the thing about large language models is that they make mistakes. So if your proof has got 20 steps and your LLM has a 10% failure rate at each step of going in the wrong direction, it's extremely unlikely to actually reach the end. Actually, just to take a small tangent here, how hard is the problem of mapping from natural language to the formal program?
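The compounding of per-step errors is easy to quantify. A quick back-of-the-envelope check in plain Python, taking the 10% per-step failure rate above as the working assumption:

```python
# If each proof step independently succeeds with probability p,
# an n-step proof is completed without any wrong turn with probability p**n.
def completion_probability(p: float, n: int) -> float:
    return p ** n

# 10% failure per step, 20 steps: only about a 12% chance of reaching the end.
print(round(completion_probability(0.9, 20), 2))  # → 0.12
```

This is the independence assumption at its crudest, but it shows why small per-step error rates are fatal for long proofs.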
Oh, it's extremely hard, actually. Natural language is very fault-tolerant: you can make a few minor grammatical errors, or speak in a second language, and people can still get some idea of what you're saying. But with formal language, if you get one little thing wrong, the whole thing is nonsense. Even formal-to-formal is very hard. There are different, incompatible proof languages: there's Lean, but also Coq and Isabelle and so forth. And even converting from one formal language to another formal language is basically an unsolved problem. That is fascinating.
Okay, so once you have it in formal language, they use their RL-trained model, something akin to the AlphaZero that they used for Go, to then try to come up with proofs. They also have, I believe, a separate model for geometric problems. So what impresses you about the system, and what do you think is the gap? We talked earlier about how things that are amazing over time become kind of normalized, so now, somehow, of course AI can do geometry... Right, that's true, that's true. I mean, it's still beautiful. Yeah, it's great work; it shows what's possible. But the approach doesn't scale currently: three days of Google's server time to solve one high-school math problem. This is not a scalable prospect, especially with the exponential increase in difficulty as complexity increases.
You mentioned that they got a silver-medal performance. The equivalent of... I mean, yeah, the equivalent. So first of all, they took way more time than was allotted, and they had assistance: the humans helped by formalizing the problems. But they did produce formal proofs as the solutions, which are formally verified, so I guess that's fair. There are efforts, there will be a proposal at some point, to actually have an AI math olympiad, where at the same time as the human contestants get the actual competition problems, the AIs will also be given the same problems in the same time period, and the outputs will have to be graded by the same judges, which means they'll have to be written in natural language rather than formal language. I hope that happens.
It probably won't happen at this IMO; the performance is not good enough in the time period. But there are smaller competitions, competitions where the answer is a number rather than a long-form proof. And AI is actually a lot better at problems where there's a specific numerical answer, because it's easy to do reinforcement learning on them: you've got the right answer or you've got the wrong answer; it's a very clear signal. But a long-form proof either has to be formal, and then Lean can give it a thumbs up or thumbs down, or it's informal, but then you need a human to grade it. And if you're trying to do billions of iterations of reinforcement learning, you can't hire enough humans to grade those.
It's already hard enough for the large language models to do reinforcement learning on just the regular text that people generate. But to hire people not just to give thumbs up or thumbs down, but to actually check the output mathematically, that's too expensive. So if we explore this possible future, what is the thing that humans do that's most special in mathematics, that you could see AI not cracking for a while? Inventing new theories, coming up with new conjectures versus proving the conjectures, building new abstractions, new representations, maybe seeing new connections between disparate fields? That's a good question. I think the nature of what mathematicians do has changed a lot over time.
So a thousand years ago, mathematicians had to compute the date of Easter, and those were complicated calculations. But it's all automated; it's been automated for centuries. We don't need that anymore. They used to do spherical trigonometry to navigate, to work out how to get from the Old World to the New. Again, very complicated calculations, and again, they've been automated. Even a lot of undergraduate mathematics was automated before modern AI. Wolfram Alpha, for example, is not a large language model, but it can solve a lot of undergraduate-level math tasks. So on the computational side, verifying routine things: taking a problem and saying, here's a problem in partial differential equations,
could you solve it using any of the 20 standard techniques? And it says, yes, I've tried all 20, and here are the results of the 100 different permutations I attempted. That type of thing, I think, will work very well. That type of scaling, where once you solve one problem you can make the AI attack 100 adjacent problems. The thing that humans still do, where the AI really struggles right now, is knowing when it's made a wrong turn. An AI can say, I'm going to solve this problem, I'm going to split this problem into these two cases, I'm going to try this technique. And sometimes, if you're lucky, it's a simple problem, it's the right technique, and you solve the problem.
And sometimes it will propose an approach which is just complete nonsense, but it looks like a proof. This is one annoying thing about LLM-generated mathematics. We've had human-generated mathematics that's of very low quality, submissions from people who don't have the formal training and so on. But if a human proof is bad, you can tell it's bad pretty quickly; it makes really basic mistakes. AI-generated proofs, though, can look superficially flawless. That's partly because of what the reinforcement learning trains them to do: produce text that looks like it's correct, which for many applications is good enough. So the errors are often really subtle, and then when you spot them, they're really stupid: no human would have made that mistake.
Yeah, it's actually really frustrating in the programming context, because I program a lot. When a human writes low-quality code, there's something called code smell, right? You can tell immediately; there are signs. But with AI-generated code, it all looks fine at first, and then eventually you find an obviously dumb thing hiding in what looks like good code. So it's very tricky, and frustrating for some reason. Yeah, so the sense of smell: this is one thing that humans have, and there's a metaphorical mathematical smell that it's not clear how to get the AI to duplicate. Eventually it may. The way AlphaZero and its successors made progress in Go and chess and so forth is that, in some sense, they developed a sense of smell for Go and chess positions:
that this position is good for white, that one is good for black. They can't articulate why, but just having that sense of smell lets them strategize. So if AIs gain that ability, a sense of the viability of certain proof strategies, then you can say, I'm going to try to break up this problem into two smaller subtasks, and it can say, oh, this looks good: the two subtasks look like they're simpler than your main task, and they've still got a good chance of being true.
So this is good to try. Or: no, you've made the problem worse, because each of the two subproblems is actually harder than your original problem, which is what normally happens if you try a random thing. Normally, it's very easy to transform a problem into an even harder problem; very rarely do you transform it into a simpler one. So if they can pick up that sense of smell, then they could maybe start competing with human mathematicians. So this is a hard question, but not competing, collaborating. If, hypothetically, I gave you an oracle that was able to do some aspect of what you do, and you could collaborate with it...
Yeah, yeah. What would you like that oracle to be able to do? Would you like it to maybe be a verifier, checking your work, telling you: yes, this is correct, this is a promising, fruitful direction?
Yeah, or would you like it to generate possible proofs, and then you see which one is the right one? Or would you like it to maybe generate different representations, totally different ways of seeing this problem? I think all of the above. A lot of it is that we don't know how to use these tools, because it's a paradigm we have not had in the past: assistants that are competent enough to understand complex instructions and can work at massive scale, but are also unreliable, unreliable in subtle ways while producing superficially good output.
It's an interesting combination. You have graduate students you work with who are kind of like this, but not at scale. And we have previous software tools that can work at scale, but are very narrow. So we have to figure out how to use them. I mean, Tim Gowers, you may know, actually foresaw this: in 2000, he was envisioning what mathematics would look like in two and a half decades.
Yeah, he wrote in his article a hypothetical conversation between a mathematical assistant of the future and himself, discussing a math problem. They would have a conversation where sometimes the human would propose an idea and the AI would evaluate it, and sometimes the AI would propose an idea. Sometimes a computation is required, and the AI would just go and say, okay, I've checked the 100 cases needed here, or: you stated the claim for all n, I've checked it for n up to 100 and it looks good so far, or: hang on, there's a problem at n equals 46.
So just a free-form conversation, where you don't know in advance where things are going to go, but where ideas get proposed on both sides and calculations get proposed on both sides. I've had conversations with AI where I say, we're going to collaborate to solve this math problem, and it's a problem I already know the solution to. So I try to prompt it: okay, here's the problem, I suggest using this tool. And it'll find this lovely argument using a different tool, which eventually goes into the weeds, and I say, no, no, no, use this instead. Okay, and it starts using it, and then it drifts back to the tool it wanted before. You have to keep railroading it onto the path you want, and I could eventually force it to give the proof I wanted, but it was like herding cats. And the amount of personal effort I had to take, not just to prompt it but to check its output, because a lot of what it produced looked like it was going to work when I knew there was actually a problem in it, basically arguing with it, was more exhausting than doing it unassisted.
So that's the current state of the art. I wonder if there's a phase shift that happens, where it no longer feels like herding cats, and maybe it'll surprise us how quickly that comes. I believe so. In formalization, I mentioned before that it currently takes about 10 times longer to formalize a proof than to write it by hand. With these modern AI tools, and also just better tooling, the Lean developers are doing a great job adding more and more features and making it user-friendly, that ratio is going from 9 to 8 to 7. Okay, no big deal. But one day it will drop below 1, and that's a phase shift, because suddenly it makes sense, when you write a paper, to write it in Lean first, or through a conversation with an AI which is generating Lean on the fly with you. And it becomes natural for journals to accept it; maybe they'll offer expedited refereeing. If a paper has already been formalized in Lean, the referee just has to comment on the significance of the results and how they connect to the literature, and not worry so much about the correctness, because that's been certified.
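For readers who haven't seen Lean, a formalized result is just machine-checkable source text. A minimal illustrative example in Lean 4, using a lemma from Lean's core library (any real paper formalization would of course be vastly larger):

```lean
-- The checker verifies this mechanically: no referee is needed for correctness.
theorem mul_comm_example (a b : Nat) : a * b = b * a :=
  Nat.mul_comm a b
```

Once a statement like this compiles, its correctness is certified by the proof checker, which is what would let journals focus refereeing on significance rather than verification.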
Papers are getting longer and longer in mathematics, and it's harder and harder to get good refereeing for the really long ones, unless they're really important. It is actually an issue, and formalization is coming in at just the right time for this. And as it gets easier and easier, because of the tooling and all the other factors, you're going to see much more formalized math, and the math libraries will grow, potentially exponentially. It's a virtuous cycle. One phase shift of this type that happened in the past was the adoption of LaTeX. LaTeX is a typesetting language that all mathematicians use now. In the past, people used all kinds of word processors and typewriters and whatever, but at some point LaTeX became easier to use than all the other competitors, and people switched within a few years. It was a dramatic phase shift.
It's a wild out-there question, but what year, how far away are we from an AI system being a collaborator on a proof that wins the Fields Medal, at that level? Okay, well, it depends on the level of collaboration. Yeah, like it contributed enough that it deserves a share of the Fields Medal. Already, I can imagine a medal-winning paper having some AI assistance in the writing of it, just, you know, the autocomplete alone. I use it; it speeds up my own writing. You can have a theorem whose proof has three cases, and I write down the proof of the first case, and the autocomplete suggests the proof of the second case along the same lines, and it was exactly correct. That was great; it saved me like five to ten minutes of typing.
But in that case, the AI system doesn't get the Fields Medal. No. I was talking 20 years, 50 years, 100 years. What do you think? Okay, so I gave a prediction earlier: by about 2026, which is now next year, there will be math collaborations with AI. So not Fields Medal winning, but actual research-level work, published results that are in part generated by AI. Maybe not the ideas, but at least some of the computations, the verifications. I mean, that's already happened. There are problems that were solved by a complicated process: conversing with the AI, which proposes things, and the human goes and tries them, and it doesn't quite work, but it suggests a different idea. It's hard to disentangle exactly. There are certainly math results which could only have been accomplished because there was a combination of a human mathematician and an AI involved.
But it's hard to disentangle credit. I mean, these tools do not replicate all the skills needed to do mathematics, but they can replicate some non-trivial percentage of them, you know, 30, 40 percent, so they can fill in gaps. Coding is a good example: it's annoying for me to code in Python; I'm not a native or professional programmer. But with AI, the friction cost of doing it is much reduced, so it fills in that gap for me.
AI is getting quite good at literature review. I mean, there's still a problem with hallucinating references that don't exist, but this, I think, is a solvable problem, if you train in the right way and so forth, and verify using the internet. In a few years, we should get to the point where, when you have a lemma that you need, you can ask, has anyone proven this lemma before? And it will do basically a fancy web search.
And it says: yeah, there are these six papers where something similar has happened. I mean, you can ask it right now, and it will give you six papers, of which maybe one is legitimate and relevant, one exists but is not relevant, and four are hallucinated. It has a non-zero success rate right now, but there's so much garbage, the signal-to-noise ratio is so poor, that it's most helpful when you already somewhat know the literature,
and you just need to be prompted, to be reminded of a paper that was already subconsciously in your memory. Or it's helping you discover work you were not even aware of but which is the correct citation. Yeah, that it can sometimes do, but when it does, it's buried in a list of options of which the others are bad. Yeah, being able to automatically generate a related-work section that is correct, that's actually a beautiful thing. That might be another phase shift, because it assigns credit correctly.
Yeah. It breaks you out of your silos. Yeah, but there's a big hump to overcome right now. I mean, it's like self-driving cars: the safety margin has to be really high for it to be feasible. So there's a last-mile problem with a lot of AI applications. They can be tools that work 20% or 80% of the time, but that's still not good enough, and in fact, in some ways, worse than not having the tool at all.
I mean, another way of asking the Fields Medal question is: what year do you think you'll wake up and be really surprised? You read the headline, the news, that AI did something, a real breakthrough. It doesn't have to be the Fields Medal or the Riemann hypothesis; it could just be a real AlphaZero moment. Right.
Yeah, this decade, I can see it making a conjecture between two things that people thought were unrelated. Oh, interesting. Generating a conjecture, a beautiful conjecture. Yeah, and one that turns out to be correct and meaningful. Because just generating conjectures is actually kind of doable, I suppose, but generating one that the data actually supports... Yeah, no, that would be truly amazing.
Current models struggle a lot with that. I mean, a version of this is the physicists' dream of getting the AI to discover new laws of physics. The dream is that you just feed it all this data, and it says, here's a new pattern that we didn't see before. But the current state of the art even struggles to discover the old laws of physics from the data.
Or if it does, there's a big concern with contamination: that it only did so because the law is somewhere in its training data, Boyle's law or whatever you're trying to reconstruct. Part of it is that we don't have the right type of training data for this. For laws of physics, we don't have a million different universes to train on; we only have the one set of laws of nature.
And a lot of what we're missing in math is actually the negative space. We have published records of things that people have been able to prove, and of conjectures that end up being verified or having counterexamples produced. But we don't have data on things that were proposed that seemed like a good thing to try, where people quickly realized it was the wrong conjecture, and then changed the claim, modified it in some way, to make it more plausible.
There's a trial-and-error process which is a real integral part of human mathematical discovery, but which we don't record, because it's embarrassing. We make mistakes, and we only like to publish our wins. And the AI has no access to that data to train on. I sometimes joke that basically the AI has to go through grad school: actually take grad courses, do the assignments, go to office hours, make mistakes, get advice on how to correct the mistakes, and learn from that.
Let me ask you, if I may, about Grigori Perelman. You mentioned that you try to be careful in your work and not let a problem completely consume you, where you've really fallen in love with the problem and you cannot rest until you solve it. But you also hastened to add that sometimes this approach can actually be very successful. An example you gave is Grigori Perelman, who proved the Poincaré conjecture and did so by working alone for seven years, with basically little contact with the outside world. Can you explain this Millennium Prize problem that's been solved, the Poincaré conjecture, and maybe speak to the journey that Grigori Perelman has been on?
All right, so it's a question about curved spaces. A good warm-up example is a 2D surface: it could be a sphere, or it could be a torus with a hole in it, or it can have many holes. And there are many different topologies, a priori, that a surface could have, even if you assume that it's bounded and smooth and so forth. We have figured out how to classify surfaces. As a first approximation, everything is determined by the genus: how many holes it has. So the sphere has genus zero, the donut has genus one, and so forth.
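As a standard reference point not stated explicitly in the conversation: the classification by genus is often recorded via the Euler characteristic, which for a closed orientable surface $S$ of genus $g$ is

```latex
\chi(S) = 2 - 2g
```

so the sphere ($g = 0$) has $\chi = 2$ and the torus ($g = 1$) has $\chi = 0$, making the genus a computable invariant that tells these surfaces apart.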
And one way you can tell these surfaces apart is a property the sphere has, called being simply connected. If you take any closed loop on the sphere, like a big closed loop of rope, you can contract it to a point while staying on the surface. The sphere has this property, but a torus doesn't. If you're on a torus and you take a rope that goes around, say, the outer diameter, it can't get through the hole; there's no way to contract it to a point.
So it turns out that the sphere is essentially the only surface with this contractibility property, up to continuous deformations, that is, up to what are called topological equivalences. Poincaré asked the same question in higher dimensions. It becomes hard to visualize: a surface you can think of as embedded in three dimensions, but for a curved three-dimensional space, we don't have good intuition for the four-dimensional space it would live in. And there are also three-dimensional spaces that can't even fit into four dimensions; you need five or six or higher.
But anyway, mathematically you can still pose the question: if you have a bounded three-dimensional space which also has this simply connected property, that every loop can be contracted, can you deform it into a three-dimensional version of the sphere? And weirdly, in higher dimensions, four and five, it was actually easier, so it was solved first in higher dimensions. There's somehow more room to do the deformation; it's easier to move things around into a sphere. But three dimensions was really hard.
So people tried many approaches. There were combinatorial approaches, where you chop up the space into little triangles or tetrahedra and try to argue based on how the faces interact with each other. There were algebraic approaches, using various algebraic objects you can attach to these spaces, like the fundamental group, homology and cohomology and all these very fancy tools. They also didn't quite work. But Richard Hamilton proposed a partial differential equations approach.
So the problem is that you have this object which is secretly a sphere, but it's given to you in a really weird way. So think of a ball that's been crumpled up and twisted, and it's not obvious that it's a ball. But if you have some surface which is a deformed sphere, you could, for example, think of it as the surface of a balloon, and you could try to inflate it. You blow it up, and naturally, as you fill it with air, the wrinkles sort of smooth out, and it turns into a nice round sphere.
Unless, of course, it was a torus or something like that, in which case it would get stuck at some point. If you inflate a torus, there would be a point in the middle, when the inner ring shrinks to zero, where you get a singularity, and you can't blow it up any further. So he created this flow, which is called Ricci flow, which is a way of taking an arbitrary surface or space and smoothing it out to make it rounder and rounder, to make it look like a sphere.
And he wanted to show that either this process would give you a sphere or it would create a singularity, very much like how PDEs either have global regularity or finite-time blowup. Basically, it's almost exactly the same dichotomy. It's all connected. And he showed that for two dimensions, for two-dimensional initial surfaces, if you start with something simply connected, no singularities ever formed. You never ran into trouble, and you could flow, and it would give you a sphere.
And so he got a new proof of the two-dimensional result. But that's a beautiful explanation of Ricci flow and its application in this context. How difficult is the mathematics here, for the 2D case? Yeah, these are quite sophisticated equations, on par with the Einstein equations. Okay. Slightly simpler, but yeah, they would be considered hard nonlinear equations to solve. And there are lots of special tricks in 2D that helped. But in 3D, the problem was that this equation was supercritical. It has the same problem as Navier-Stokes. As you blow up, the curvature could get concentrated in finer and smaller regions, and it looked more and more nonlinear, and things just looked worse and worse.
And there were all kinds of singularities that showed up. Some singularities, these things called neck pinches, where the surface creates a shape like a barbell and it pinches at a point, are simple enough that you can sort of see what to do next. You just make a snip, and then you can turn one surface into two and evolve them separately. But there was the prospect that some really nasty, knotted singularities could show up that you couldn't resolve in any way, that you couldn't do any surgery to. So you needed to classify all the singularities: what are all the possible ways things can go wrong?
So what Perelman did, first of all, was turn the problem from a supercritical problem into a critical problem. I said before how the invention of energy, the Hamiltonian, really clarified Newtonian mechanics. So he introduced what are now called Perelman's reduced volume and Perelman's entropy. He introduced new quantities, kind of like energy, that looked the same at every single scale, and that turned the problem into a critical one, where the nonlinearities suddenly looked a lot less scary than they did before. And then he still had to analyze the singularities of this critical problem.
And that itself was a problem similar to the wave maps problem I worked on, actually, on the level of difficulty. So he managed to classify all the singularities of this problem and show how to apply surgery to each of them, and through that he was able to resolve the Poincaré conjecture. Quite a lot of really ambitious steps, and nothing that a large language model today, for example, could do. At best, I could imagine a model proposing this idea as one of hundreds of different things to try, but the other 99 would be complete dead ends, and you'd only find out after months of work. He must have had some sense that this was the right, tractable route, because it takes years to get from point A to point B.
So you've done, like you said, similarly difficult things, not just strictly mathematically but more broadly in terms of the process. What can you infer about the process he was going through? Because he was doing it alone. What are some low points in a process like that? You've mentioned that AI doesn't know when it's failing. What happens to you, when you're sitting in your office and you realize the thing you did the last few days, maybe weeks, is a failure? Well, for me, I switch to a different problem. I'm a fox, not a hedgehog. But generally, one break you can take is to step away and look at another problem. You can modify the problem, too. You can cheat.
If there's a specific thing that's blocking you, just some bad case that keeps showing up for which your tool doesn't work, you can just assume by fiat that this bad case doesn't occur. You do some magical thinking, but strategically, to see if the rest of the argument goes through. If there are multiple problems with your approach, then maybe you just give up. If this is the only problem, and everything else checks out, then it's still worth fighting. So yeah, you have to do some forward reconnaissance sometimes. That is sometimes productive, to just assume we'll figure it out. Sometimes it's even productive to make mistakes.
So one of the, I mean, there's a project which we actually won some prizes for. We worked on this PDE problem, again a blowup-regularity-type problem. It was considered very hard. Jean Bourgain, another Fields Medalist, worked on a special case of this, but he could not solve the general case. We worked on this problem for two months, and we thought we'd solved it. We had this clever argument that seemed to fit, and we were excited. We were planning a celebration, to get together and have champagne or something.
We started writing it up, and one of us, not me actually, but another coauthor, said, oh, in this lemma here, we have to estimate these 13 terms that show up in this expansion. We estimated 12 of them, but in our notes, I can't find the estimation of the 13th. Can someone supply that? I said, sure, I'll look at this. And we discovered that we'd completely omitted this term, and this term was worse than the other 12 terms put together. In fact, we could not estimate this term. We tried for a few more months, in all different permutations, and there was always this one thing that we could not control.
This was very frustrating, but because we had already invested months and months of effort, we stuck at this. We tried increasingly desperate and crazy things. After two years, we found an approach that was somewhat different, quite a bit different from our initial strategy, which didn't generate these problematic terms, and we actually solved the problem. We solved the problem after two years, but if we hadn't had that initial false dawn of nearly solving the problem, we would have given up by month two or something and worked on an easier problem.
If we had known it would take two years, I'm not sure we would have started the project. Sometimes it actually helps to be mistaken, like Columbus sailing to the New World. He had an incorrect measurement of the size of the Earth. He thought he was going to find a new trade route to India. At least that was how he sold it in his prospectus. It could be that he secretly knew. Just on the psychological element, do you have emotions of self-doubt that just overwhelm you, things like that? This stuff feels like math is so engrossing that it can break you, when you invest so much of yourself in the problem and then it turns out wrong. You could start to...
In a similar way chess has broken some people. I think different mathematicians have different levels of emotional investment in what they do. I think for some people it's a job. You have a problem, and if it doesn't work out, you move on to the next one. The fact that you can always move on to another problem reduces the emotional connection. There are cases, certain problems that are like what are called mathematical diseases, where people just latch on to that one problem and spend years and years thinking about nothing but that one problem. Maybe their career suffers, and so forth.
They think, I'll get this big win, and once I finish this problem, it will make up for all the years of lost opportunity. Occasionally it works, but I really don't recommend it for people without a proven track record. I've never been super invested in any one problem. One thing that helps is that we don't need to commit to our problems in advance. When we do grant proposals, we say we want to study this set of problems. But even then we don't promise that, definitely, in five years I will supply a proof of all these things.
You promise to make some progress or discover some interesting phenomena, and maybe you don't solve the problem, but you find some related problem that you can say something new about, and that's a much more feasible task. But I'm sure for you there are problems like this. You have made so much progress toward the hardest problems in the history of mathematics. Is there a problem that just haunts you? It sits there in the dark corners. Twin prime conjecture, Riemann hypothesis, Goldbach conjecture. Twin prime, that's...
Again, the problem is that the Riemann hypothesis is so far out of reach. You think so? Yeah, there's not even a viable strategy. Even if I activate all the cheats that I know of for this problem, there's still no way to get from A to B. I think it needs a breakthrough in another area of mathematics to happen first, and for someone to recognize that it would be a useful thing to transport into this problem. We should maybe step back for a little bit and just talk about prime numbers.
So they're often referred to as the atoms of mathematics. Can you just speak to the structure of these atoms? So the natural numbers have two basic operations attached to them: addition and multiplication. If you want to generate the natural numbers, you can do one of two things. You can just start with one and add one to itself over and over again, and that generates the natural numbers. So additively, they're very easy to generate: one, two, three, four. Or, if you want to generate them multiplicatively, you can take all the prime numbers, two, three, five, seven, and multiply them together. Together they give you all the natural numbers, except maybe for one. So there are these two separate ways of thinking about the natural numbers: an additive point of view and a multiplicative point of view. Separately, they're not so bad. Any question about the natural numbers that only involves addition is relatively easy to solve.
Any question that only involves multiplication is also relatively easy to solve. But what has been frustrating is when you combine the two together. Suddenly, you get something extremely rich. I mean, we know that there are statements about the natural numbers that are actually undecidable. There are certain polynomials in some number of variables where whether they have a solution in the natural numbers depends on an undecidable statement, like whether the axioms of mathematics are consistent or not. But even the simplest problems that combine something multiplicative, such as the primes, with something additive, such as shifting by two, are hard. Separately, we understand both quite well, but if you ask, when you shift the primes by two, how often can you get another prime? It's been amazingly hard to relate the two.
And we should say that the twin prime conjecture posits that there are infinitely many pairs of prime numbers that differ by two. Now, the interesting thing is that you have been very successful at pushing forward the field in answering these complicated questions of this variety. Like you mentioned, the Green-Tao theorem proves that the prime numbers contain arithmetic progressions of any length. Which is mind-blowing, that you can prove something like that. Right. What we've realized because of this type of research is that different patterns have different levels of indestructibility. So what makes the twin prime conjecture hard? You could take all the primes in the world, three, five, seven, eleven, and so forth. There are some twins in there; eleven and thirteen is a pair of twin primes, and so forth.
But you could easily, if you wanted to, redact the primes to get rid of these twins. The twins show up, and there are infinitely many of them, we believe, but they're reasonably sparse. Initially there are quite a few, but once you get to the millions, trillions, they become rarer and rarer. If someone was given access to the database of primes, they could just edit out a few primes here and there, and they could make the twin prime conjecture false by removing, like, 0.01% of the primes or something, well chosen. And so you could present a censored database of the primes which passes all of the statistical tests of the primes. It obeys things like the prime number theorem and other things about the primes, but it doesn't contain any twin primes anymore.
And this is a real obstacle to the twin prime conjecture. It means that any proof strategy that would actually find twin primes in the actual primes must fail when applied to these slightly edited primes. And so it must use some very subtle, delicate feature of the primes that you can't get just from statistical analysis. Okay, so that's twin primes. Yeah. On the other hand, arithmetic progressions have turned out to be much more robust. You can take the primes and eliminate 99% of them, keeping any 1%, and it turns out, and this is another thing we proved, that you still get arithmetic progressions. Arithmetic progressions are much more robust; they're like cockroaches. Of arbitrary length. Yes. That's crazy.
Yeah. So for people who don't know, an arithmetic progression is a sequence of numbers that differ by some fixed amount. Yeah. But it's again an infinite monkey type phenomenon. For any fixed size of set, you don't get arbitrarily long progressions; you only get quite short progressions. But you're saying twin primes are not an infinite monkey phenomenon. I mean, it's a very subtle one. It's still an infinite monkey phenomenon. If the primes were really genuinely random, if the primes were generated by monkeys, then yes, in fact, the infinite monkey theorem would give you the twins. You're saying that with twin primes you can't use the same tools. It's like the primes don't appear random, almost.
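To make the notion concrete, here is a small brute-force search for arithmetic progressions of primes. This is purely illustrative and has nothing to do with how the Green-Tao theorem is actually proved; the function names are just ones chosen for this sketch.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for small searches."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def prime_ap(length: int, limit: int = 1000):
    """Brute-force search for an arithmetic progression of primes
    of the given length, with first term and step below `limit`."""
    for start in range(2, limit):
        if not is_prime(start):
            continue
        for step in range(2, limit):
            if all(is_prime(start + k * step) for k in range(length)):
                return [start + k * step for k in range(length)]
    return None

print(prime_ap(5))  # [5, 11, 17, 23, 29], a length-5 progression with step 6
```

Of course, such a search only ever exhibits one short progression; the theorem is about progressions of every length inside the infinite set of primes.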
Well, we don't know. Yeah. We believe the primes behave like a random set. The reason why we care about the twin prime conjecture is that it's a test case for whether we can genuinely say, with 0% chance of error, that the primes behave like a random set. Random versions of the primes, we know, contain twins, at least with 100% probability, or probability tending to 100% as you go out further and further.
Yeah. So the reason why the primes are indestructible with respect to arithmetic progressions is that regardless of whether the primes look random or look structured, like periodic, in both cases arithmetic progressions appear, but for different reasons. And this is basically how it goes: there are many proofs of these sorts of arithmetic progression theorems, and they're all proven by some sort of dichotomy, where your set is either structured or random, and in both cases you can say something, and then you put the two together.
But with twin primes, if the primes are random, then you're happy, you win. If the primes are structured, they could be structured in a specific way that eliminates the twins, and we can't rule out that one conspiracy. And yet you're able to make astonishing progress on the k-tuples version. Right.
Yeah. So the one thing about conspiracies is that any one conspiracy theory is really hard to disprove. If you believe the world is run by lizards, and I show you evidence that it's not, well, that evidence was planted by the lizards. Yeah. You may have encountered this kind of phenomenon.
Yeah. So there's almost no way to definitively rule it out, and the same is true in mathematics. A conspiracy totally devoted to eliminating twin primes, you know, it would have to infiltrate other areas of mathematics as well, but it could be made consistent, at least as far as we know.
But there's a weird phenomenon: you can use one conspiracy to rule out other conspiracies. So, you know, if the world is run by lizards, it can't also be run by aliens. Right. So one unreasonable thing is hard to disprove, but for more than one, there are tools.
So yeah. So for example, we know there are infinitely many pairs of primes that differ by at most 246; that's the current record. So there's a bound. Yes. Right. So like the twin primes, there are what are called cousin primes, which differ by four, and there are sexy primes, which differ by six. What are sexy primes? Primes that differ by six. The name is, of course, much less exciting than it suggests.
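These small prime gaps are easy to explore numerically. Here is a quick sketch; the helper names `primes_up_to` and `pairs_with_gap` are just ones chosen for this illustration.

```python
def primes_up_to(n: int):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(range(p * p, n + 1, p))
    return [i for i, ok in enumerate(sieve) if ok]

def pairs_with_gap(limit: int, gap: int):
    """All pairs (p, p + gap) with both members prime and <= limit."""
    prime_set = set(primes_up_to(limit))
    return [(p, p + gap) for p in sorted(prime_set) if p + gap in prime_set]

# twin (gap 2), cousin (gap 4), and sexy (gap 6) prime pairs below 100
for gap, name in [(2, "twin"), (4, "cousin"), (6, "sexy")]:
    print(name, pairs_with_gap(100, gap))
```

The bounded-gap theorem says that for at least one gap of at most 246, lists like these never stop growing as the limit increases.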
So you can make a conspiracy that rules out one of these. But once you have, like, 50 of them, it turns out that you can't rule out all of them at once. It just requires too much energy, somehow, in this conspiracy space. How do you do the bound part? How do you develop a bound for the differences between primes?
Okay. So it's ultimately based on what's called the pigeonhole principle. The pigeonhole principle is the statement that if you have a number of pigeons, and they all have to go into pigeonholes, and you have more pigeons than pigeonholes, then one of the pigeonholes has to have at least two pigeons. So there have to be two pigeons that are close together.
So for instance, if you have 101 numbers and they all range from one to a thousand, then two of them have to be at most ten apart, because you can divide up the numbers from one to a thousand into one hundred pigeonholes of ten numbers each. With 101 numbers, two of them have to belong to the same pigeonhole, so they have to be at distance less than ten apart. It's a basic principle in mathematics.
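That counting argument can be made concrete in a few lines. The following is a minimal sketch; the function name and bucket layout are just choices for this illustration.

```python
import random

def find_close_pair(nums, lo: int, hi: int, holes: int):
    """Pigeonhole made concrete: split [lo, hi] into `holes` equal
    buckets; with more numbers than buckets, two numbers must share
    a bucket, and numbers in one bucket are close together."""
    width = (hi - lo + 1) // holes
    seen = {}
    for x in nums:
        bucket = (x - lo) // width
        if bucket in seen:
            return seen[bucket], x  # two numbers in the same bucket
        seen[bucket] = x
    return None  # only possible if len(nums) <= holes

# 101 distinct numbers from 1..1000 always contain two less than 10 apart:
# 100 buckets of width 10, 101 pigeons.
sample = random.sample(range(1, 1001), 101)
a, b = find_close_pair(sample, lo=1, hi=1000, holes=100)
print(a, b, abs(a - b))
```

No matter which 101 numbers are drawn, the returned pair differs by at most nine, exactly as the principle guarantees.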
So it doesn't quite work with the primes directly, because the primes get sparser and sparser as you go out; fewer and fewer numbers are prime. But it turns out that there's a way to assign weights to numbers. There are numbers that are almost primes. They don't have no factors at all other than themselves and one, but they have very few factors. And it turns out that we understand almost primes a lot better than we understand primes. For example, it was known for a long time that there are twin almost primes; this has been worked out. Almost primes are something we can understand. So you can actually restrict attention to a suitable set of almost primes, and whereas the primes are very sparse overall, relative to the almost primes they're actually much less sparse.
You can set up a set of almost primes where the primes have density, like, 1%. And that gives you a shot at proving, by applying some sort of pigeonhole principle, that there are pairs of primes only a bounded distance apart. But in order to prove the twin prime conjecture, you need to get the density of primes inside the almost primes up to above 50%. Once you get above 50%, you would get twin primes. But unfortunately, there are barriers. We know that no matter what set of almost primes you pick, the density of primes can never get above 50%.
It's called the parity barrier. And one of my long-term dreams is to find a way to breach that barrier, because it would open up not only the twin prime conjecture; the Goldbach conjecture and many other problems in number theory are currently blocked because our current techniques would require going beyond this theoretical parity barrier. It's like going past the speed of light.
Yeah, so we should say the twin prime conjecture is one of the biggest problems in the history of mathematics; the Goldbach conjecture also. They feel like next-door neighbors. Have there been days when you felt you saw the path? Oh, yeah. Sometimes you try something and it works super well. Again, with the sense of mathematical smell we talked about earlier, you learn from experience when things are going too well, because there are certain difficulties that you sort of have to encounter.
I think the way a colleague might put it is: if you are on the streets of New York and you're put in a blindfold and put in a car, and after some hours the blindfold comes off and you're in Beijing, that was too easy somehow. There was no ocean being crossed. Even if you don't know exactly what was done, you suspect that something wasn't right. But is it still in the back of your head? Do you return to the prime numbers every once in a while to see? Yeah, when I have nothing better to do, which is less and less these days. I'm busy with so many things, as you say.
But yeah, when I have free time and I'm too frustrated to work on my standard research projects, and I also don't want to do my administrative stuff or errands for my family, I can play with these things for fun. And usually you get nowhere. You have to just say, okay, fine, once again nothing happened, I will move on. Very occasionally, one of these problems I actually solve. And sometimes, as you say, you think you've solved it, and then you're euphoric for maybe 15 minutes, and then you think, I should check this, because this is too easy.
It could be true, and usually it isn't. What does your gut say about when these problems will be solved? Twin prime and Goldbach, I think we'll keep getting more partial results. It does need at least one breakthrough; this parity barrier is the biggest remaining obstacle. There are simpler versions of the conjecture where we are getting really close. So I think in ten years we will have many more, much closer results. We may not have the whole thing.
So, twin primes is somewhere close. The Riemann hypothesis, I have no idea. I mean, it would have to happen by accident, I think. So the Riemann hypothesis is a kind of more general conjecture about the distribution of the prime numbers, viewed multiplicatively. For questions only involving multiplication, no addition, the primes really do behave as randomly as you could hope. There's a phenomenon in probability called square root cancellation: if you want to poll, say, America on some issue, and you ask only one or two voters, you may have sampled a bad sample, and you get a really imprecise measurement of the full average.
But if you sample more and more people, the accuracy gets better and better, and it improves like the square root of the number of people you sample. So if you sample a thousand people, you can get like a two or three percent margin of error. In the same sense, if you measure the primes in a certain multiplicative sense, there's a certain type of statistic you can measure, called the Riemann zeta function, and it fluctuates up and down. But in some sense, as you keep averaging more and more, as you sample more and more, the fluctuations should go down, as if they were random. And there's a very precise way to quantify that, and the Riemann hypothesis is a very elegant way to capture this. But as with many other problems in mathematics, we have very few tools to show that something genuinely behaves like it's really random.
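The square root improvement in polling is easy to see in a simulation. This is a toy sketch of the polling analogy only, not of anything zeta-function related; the function name and parameters are choices made for the illustration.

```python
import random

def poll_error(n_voters: int, true_p: float = 0.5,
               trials: int = 2000, seed: int = 0) -> float:
    """Average absolute error when polling n_voters people whose true
    'yes' fraction is true_p. By square root cancellation, this should
    shrink roughly like 1 / sqrt(n_voters)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        yes = sum(rng.random() < true_p for _ in range(n_voters))
        total += abs(yes / n_voters - true_p)
    return total / trials

# quadrupling the sample size roughly halves the error
for n in (100, 400, 1600):
    print(n, round(poll_error(n), 4))
```

Each fourfold increase in sample size cuts the average error roughly in half, the signature of square root cancellation.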
And this is not just a little bit random. It asserts that the primes behave as randomly as an actually random set, with this square root cancellation. And we know, because of things related to the parity problem, that most of our usual techniques cannot hope to settle this question. The proof has to come out of left field. But what that is, no one has any serious proposal. And there are various ways in which, as I said, you can modify the primes a little bit and destroy the Riemann hypothesis. So the proof has to be very delicate. You can't use anything that has huge margins of error; it has to just barely work. And there are all these pitfalls that you have to dodge very adeptly. The prime numbers are just fascinating.
Yeah. What to you is most mysterious about the prime numbers? That's a good question. Conjecturally, we have a good model of them. I mean, as I said, they have certain patterns; the primes are usually odd, for instance. But apart from these obvious patterns, they behave very randomly. There's something called the Cramér random model of the primes: that after a certain point, primes just behave like a random set. And there are various slight modifications of this model, but it has been a very good model. It matches the numerics. It tells us what to predict. Like, I can tell you with complete certainty that the twin prime conjecture is true.
The random model gives overwhelming odds that it's true; I just can't prove it. Most of our mathematics is optimized for solving things with patterns in them, and the prime numbers have this anti-pattern, as does almost everything, really, but we can't prove that. Yeah, I guess it's not mysterious that the primes behave somewhat randomly, because there's no reason for them to have any kind of secret pattern. But what is mysterious is: what is the mechanism that really forces the randomness to happen? That is just absent. Another incredibly, surprisingly difficult problem is the Collatz conjecture. Oh yes. Simple to state, beautiful to visualize in its simplicity, and yet extremely difficult to solve.
And yet you have been able to make progress. Paul Erdős said about the Collatz conjecture that mathematics may not be ready for such problems. Others have stated that it is an extraordinarily difficult problem, completely out of reach, this was in 2010, of present-day mathematics. And yet you have made some progress. Why is it so difficult to make progress? Can you actually even explain what it is? Oh yeah. So it's a problem you can explain. It helps with some visual aids. But you take any natural number, like 13, and you apply the following procedure to it: if it's even, you divide it by two, and if it's odd, you multiply it by three and add one.
So even numbers get smaller, odd numbers get bigger. So 13 would become 40, because 13 times three is 39, and adding one gives you 40. So it's a simple process: for odd numbers and for even numbers, both are very easy operations, and when you put them together, it's still reasonably simple. But then you ask what happens when you iterate it. You take the output that you just got and feed it back in. So 13 becomes 40. 40 is even, divide by two, that's 20. 20 is still even, divide by two to get 10, then 5, and then 5 times three plus one is 16, and then 8, 4, 2, 1. And then 1 becomes 4, 2, 1, 4, 2, 1; it cycles forever.
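The iteration just described takes only a few lines to implement; the helper name `hailstone` is just a label chosen for this sketch.

```python
def hailstone(n: int):
    """Collatz iteration: halve if even, 3n + 1 if odd, stopping at 1.
    (That it always stops is exactly the unproven conjecture.)"""
    seq = [n]
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        seq.append(n)
    return seq

print(hailstone(13))  # [13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
```

Trying other starting values, say `hailstone(27)`, shows the wild up-and-down excursions before the sequence finally falls to one.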
So this sequence I just described, 13, 40, 20, 10, and so forth, these are also called hailstone sequences, because there's an oversimplified model of hailstone formation, which is not actually quite correct, but it's taught to high school students as a first approximation: a little nugget of ice, an ice crystal, forms in a cloud, and it goes up and down because of the wind. Sometimes when it's cold it gains a bit more mass, and maybe it melts a little bit, and this process of going up and down creates this partially melted ice, which eventually is called a hailstone, and eventually it falls down to the earth.
So the conjecture is that no matter how high you start up, like you take a number in the millions or billions, this process, which goes up if you're odd and down if you're even, eventually falls down to the earth every time. No matter where you start, with this very simple algorithm, you end up at one. And you might climb for a while. Right. Yeah. If you plot these sequences, they look like Brownian motion. They look like the stock market. They just go up and down in a seemingly random pattern.
And in fact, usually that's what happens. If you plug in a random number, you can actually prove, at least initially, that it will look like a random walk, and actually a random walk with a downward drift. It's like you're always gambling at a casino with odds slightly weighted against you. Sometimes you win, sometimes you lose, but in the long run you lose a bit more than you win, and so normally your wallet will go to zero if you just keep playing over and over again. So statistically, it makes sense.
Yes. So the result that I proved, roughly speaking, asserts that statistically, like 99% of all inputs will drift down, maybe not all the way to one, but to be much, much smaller than where you started. So it's like if I told you that if you go to a casino, most of the time, if you keep playing long enough, you end up with a smaller amount in your wallet than when you started. That's kind of like the result that I proved.
So why can't you continue down that thread to prove the full conjecture? Well, the problem is that I used arguments from probability theory, and there's always this exceptional event. So in probability we have the law of large numbers, which tells you things like, if you play a game at a casino with a losing expectation, over time you are guaranteed, or almost surely, with probability as close to 100% as you wish, guaranteed to lose money.
But there's always this exceptional outlier. It is mathematically possible that, even if the game is not in your favor, you could just keep winning slightly more often than you lose. Very much like how in Navier-Stokes, most of the time your waves can disperse, but there could be just one outlier choice of initial conditions that leads to blowup; here, there could be one outlier choice of special number that you stick in that shoots off to infinity, while all other numbers crash down to one.
In fact, there are some mathematicians, Alex Kontorovich for instance, who have proposed that these Collatz iterations are like cellular automata. If you look at what happens in binary, they do look a little bit like these Game of Life type patterns. And in analogy to how the Game of Life can create these massive self-replicating objects and so forth, possibly you could create some sort of heavier-than-air flying machine, a number which is actually encoding this machine, whose job it is to create a version of itself which is larger.
A heavier-than-air machine encoded in a number that flies forever. So Conway, in fact, worked on this problem as well. Conway, so similar; in fact, he was one of the inspirations for the Navier-Stokes project. Conway studied generalizations of the Collatz problem, where instead of multiplying by three and adding one, or dividing by two, you have more complicated branching rules. Instead of having two cases, maybe you have 17 cases, and then you go up and down. And he showed that once your iteration gets complicated enough, you can actually encode Turing machines, and you can actually make these problems undecidable, and do things like this.
In fact he invented a programming language for these kinds of fractional linear transformations. He called it FRACTRAN, as a play on FORTRAN, and he showed that it was Turing-complete: you could make a program such that, if the number you inserted was encoded as a prime, it would sink to zero, it would go down, and otherwise it would go up, and things like that. So the general Collatz-type problem is really as complicated as all of mathematics. Some of the mystery of the cellular automata that we talked about, having a mathematical framework to say anything about cellular automata, maybe the same kind of framework is required. Exactly. Yeah, if you want to do it not statistically, but you really want 100% of all inputs to fall to Earth.
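A FRACTRAN interpreter fits in a few lines, which is part of the language's charm. A program is just a list of fractions; at each step you multiply the current integer by the first fraction that yields an integer, and halt when none does. The one-instruction "adder" below is a standard textbook example, my choice of illustration rather than anything discussed here.

```python
from fractions import Fraction

def fractran(program, n, max_steps=10_000):
    """Run a FRACTRAN program: repeatedly multiply n by the first fraction
    in the program that gives an integer result; halt when none applies
    (or after max_steps, as a safety cap)."""
    fracs = [Fraction(p, q) for p, q in program]
    for _ in range(max_steps):
        for f in fracs:
            m = n * f
            if m.denominator == 1:  # integer result: take this step
                n = int(m)
                break
        else:
            break  # no fraction applies: the program halts
    return n

# The classic adder: starting from 2^a * 3^b, the single-fraction program
# [3/2] halts at 3^(a+b), since each step trades one factor of 2 for a 3.
print(fractran([(3, 2)], 2**3 * 3**1))  # 81 == 3^(3+1)
```

Conway's PRIMEGAME, a fourteen-fraction program whose powers of two are exactly those with prime exponents, runs unmodified in the same interpreter, which is what makes the Turing-completeness claim above so striking.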
Yeah, so what might be feasible is showing 99% of inputs go to one, but everything, yeah, that looks hard. What would you say, out of these within-reach famous problems, is the hardest problem we have today? Is it the Riemann hypothesis? Riemann is up there. P equals NP is a good one, because that's a meta-problem: if you solve it in the positive sense, if you can find a P equals NP algorithm, then potentially that solves a lot of other problems as well. And we should mention, some of the conjectures we've been talking about, a lot of stuff is built on top of them now.
There are a lot of ripple effects. P equals NP has more ripple effects than basically anything else, right. If the Riemann hypothesis is disproven, that would be a big mental shock to the number theorists, but it would also have follow-on effects for cryptography, because a lot of cryptography uses number theory, uses number-theoretic constructions involving primes and so forth, and it relies very much on the intuition that number theorists have built over many, many years of which operations involving primes behave randomly and which ones don't. In particular, encryption methods are designed to turn text information into text which is indistinguishable from random noise.
And hence, we believe, almost impossible to crack, at least mathematically. But if something in our beliefs about this is wrong, it means that there are actual patterns in the primes that we're not aware of, and if there's one, there are probably going to be more, and suddenly a lot of our cryptosystems are in doubt. But then how do you say stuff about the primes? Yeah, you're going towards the question of conspiracies again, because you want it to be random, right? Yes.
So more broadly, I'm just looking for more tools, more ways to show that things are random. But how do you prove a conspiracy doesn't happen? Is there any chance, to you, that P equals NP? Can you imagine a possible universe where it is possible? I mean, there are various scenarios. There's one where it is technically true but in practice never actually implementable. The evidence is sort of slightly pushing in favor of no, that probably P is not equal to NP. It seems like one of those cases, similar to the Riemann hypothesis.
I think the evidence is leaning pretty heavily on the no, certainly more on the no than on the yes. The funny thing about P equals NP is that we also have a lot more obstructions than we do for almost any other problem: while there's evidence, we also have a lot of results ruling out many, many types of approaches to the problem. This is the one thing that computer scientists are actually very good at, saying that certain approaches cannot work: no-go theorems. It could be undecidable; we don't know.
There's a funny story I read that when you won the Fields Medal, somebody from the internet wrote to you and asked, you know, what are you going to do now that you've won this prestigious award? And you just quickly, very humbly, said that this shiny medal is not going to solve any of the problems I'm currently working on, so I'll just keep working on them. First of all, it's funny to me that you would answer an email in that context, and second of all, it just shows your humility. But anyway, maybe you could speak to the Fields Medal. It's also another way for me to ask about Grigori Perelman, and what you think about him famously declining the Fields Medal and the Millennium Prize, which came with a million dollars of prize money. He stated, "I'm not interested in money or fame. The prize is completely irrelevant for me. If the proof is correct, then no other recognition is needed."
Yeah, well, he's somewhat of an outlier, even among mathematicians, who do tend to have somewhat idealistic views. I've never met him. I think it'd be interesting to meet him one day, but I never had the chance. I know people who've met him. He's always had strong views on certain things. It's not like he was completely isolated from the math community; he would give talks and write papers and so forth. But at some point he just decided not to engage with the rest. He was disillusioned or something, I don't know, and he decided to peace out and collect mushrooms in Saint Petersburg or something, and that's fine, you can do that. That's the flip side: of the problems that we solve, some of them do have practical applications, and that's great, but you can also stop thinking about a problem. He hasn't published since in this field, but that's fine. There are many, many other people who've done so as well.
Yeah, so I guess one thing I didn't realize initially with the Fields Medal is that it sort of makes you part of the establishment. Most mathematicians are just career mathematicians: you focus on publishing your next paper, maybe getting promoted one rank, starting a few projects, maybe taking some students or something. But then suddenly people want your opinion on things, and you have to think a little bit about things that before you might just have said foolishly, because no one was going to listen to you; it's more important now. Is it constraining to you? Are you able to still have fun and be a rebel and try crazy stuff and play with ideas?
I have a lot less free time than I had previously, mostly by choice. I always have the option to decline; I do decline a lot of things, and I could decline even more. Or I could acquire a reputation of being so unreliable that people wouldn't even ask anymore. I love the different algorithms here, this is great. That's always an option. But I don't spend as much time as I did as a postdoc, just working on one problem at a time, or fooling around. I still do that a little bit. But as you advance in your career, the softer skills come in; math somehow front-loads all the technical skills to the early stages of your career. As a postdoc it's publish or perish, and you're incentivized to focus on proving very technical theorems, to prove yourself as well as to prove the theorems.
But then as you get more senior, you have to start mentoring, and giving interviews, and trying to shape the direction of the field, both research-wise and otherwise, and sometimes you have to help with administrative things. It's kind of the right social contract, because you need to have worked in the trenches to see what can help mathematicians. The other side of being part of the establishment, sort of the really positive thing, is that you get to be a light, an inspiration, to a lot of young mathematicians and young people that are just interested in mathematics. Yeah, that's just how the human mind works. This is where I would probably say that I like the Fields Medal, that it does inspire a lot of young people somehow. This is just how human brains work.
Yeah. At the same time, I also want to give respect to somebody like Grigori Perelman, who is critical of awards. In his mind, those are his principles, and any human who is able to stand by their principles, to do the thing most humans would not be able to do, it's beautiful to see. Some recognition is necessary and important, but yeah, it's also important to not let these things take over your life, and only be concerned about getting the next big award or whatever. Again, you see people try to only solve the really big math problems and not work on things that are less sexy, if you wish, but actually still interesting and instructive. As you say, it's the way the human mind works: we understand things better when they're attached to humans, and in particular when they're attached to a small number of humans. The way our human minds are wired, we can comprehend the relationships between 10 or 20 people.
But once you get beyond about a hundred people, there's a limit, I forget the name for it, beyond which it just becomes the other, and you have to simplify the vast masses: 99.9% of humanity becomes the other, and often these simplified models are incorrect, and this causes all kinds of problems. So, to humanize a subject, you identify a small number of people and say, these are representative people of the subject, role models for example. That has some role, but too much of it can be harmful. I'll be the first to say that my own career path is not that of a typical mathematician: I had a very accelerated education, I skipped a lot of classes.
I had very fortunate mentoring opportunities, and I think I was at the right place at the right time. Just because someone doesn't have my trajectory doesn't mean they can't be a good mathematician; they may be just as good, but in a very different style, and we need people with different styles. And sometimes too much focus is given to the person who takes the last step to complete a project, in mathematics or elsewhere, that has really taken centuries or decades, building on a lot of previous work. But that's a story that's difficult to tell if you're not an expert; it's easy to just say one person did this one thing. It makes for a much simpler history.
I think on the whole it is a hugely positive thing to talk about Steve Jobs as a representative of Apple, when of course everybody knows about the incredible design and engineering teams, the individual humans on those teams (they're not "a team", they're individual humans on a team), and there's a lot of brilliance there. But it's just a nice shorthand. Like pi. Yeah, Steve Jobs as pi, as a starting point, as a first approximation. That's how you read some biographies, and then you look much deeper. A first approximation, yeah, that's right. So you were at Princeton at the same time as Andrew Wiles? Oh yeah, he was a professor there.
It's a funny moment, how history is all interconnected. At that time he announced that he had proved Fermat's Last Theorem. What did you think, maybe looking back now with more context, about that moment in math history? Yeah, so I was a graduate student at the time. I vaguely remember there was press attention, and we all had our pigeonholes in the same mailroom, so we all collected our mail there, and suddenly Andrew Wiles's mailbox exploded to overflowing. That's a good metric. Yeah. We all talked about it at teatime and so forth. Most of us didn't understand the proof; we understood sort of the high-level details.
There's an ongoing project to formalize it in Lean. Kevin Buzzard's project, actually, yeah. Can we take that small tangent? How difficult is that? Because as I understand it, the proof of Fermat's Last Theorem involves some super complicated objects. Yeah, really difficult to formalize, no? Yeah, you're right. The objects that they use, you can define them; they've been defined in Lean. So just defining what they are can be done. That's really not trivial, but it's been done. But there are a lot of really basic facts about these objects that have taken decades to prove, and they're in all these different math papers, and so lots of these have to be formalized as well.
Kevin Buzzard's goal, actually, he has a five-year grant to formalize Fermat's Last Theorem. His aim is, he doesn't think he'll be able to get all the way down to the basic axioms, but he wants to formalize it to the point where the only things that he needs to rely on as black boxes are things that were known by 1980 to the number theorists of the time, and then some other person, or some other future work, can get from there down to the axioms. So it's a different area of mathematics than the type I'm used to. In analysis, which is my area, the objects we study are much closer to the ground: I study things like prime numbers and functions, things that are within the scope of a high-school math education to at least define. But this is the very advanced algebraic side of number theory, where people have been building structures upon structures for quite a while, and it's a very sturdy structure; at the base it rests on extremely well-accepted textbooks and so forth. But it does get to the point where, if you haven't taken those years of study and you want to ask what is going on at, say, level six of this tower...
...you'd have to spend quite a bit of time before you can even get to the point where you can see what the objects are. What inspires you about his journey, which was similar, as we talked about: seven years, mostly working in secret? Yeah, that is romantic. It kind of fits with the romantic image that people have of mathematicians, to the extent they think of them at all, as these kind of eccentric wizards or something, so it certainly accentuated that perspective. It is a great achievement. His style of solving problems is so different from my own, which is great; we need people like that. Can you speak to that? You like the collaborative? I like moving on from a problem if it's giving me too much difficulty, but you need the people who have the tenacity and the fearlessness. I've collaborated with people like that, where I wanted to give up because the first approach that we tried didn't work and the second didn't work, but they were convinced, and they had a third, a fourth, and the fifth one works, and I'd have to eat my words: okay, I didn't think this was going to work, but yes, you were right all along. And we should say, for people who don't know, not only are you known for the brilliance of your work, but for incredible productivity, just the number of papers, which are all of very high quality. So there's something to be said about being able to jump from topic to topic.
Yeah, it works for me. There are also people who are very productive and focus very deeply on one thing. I think everyone has to find their own workflow. One thing which is a shame in mathematics is that there's sort of a one-size-fits-all approach to teaching mathematics: we have a certain curriculum and so forth. Maybe if you do math competitions or something, you get a slightly different experience. But I think many people don't find their native math language until very late, or usually too late, so they stop doing mathematics. They had a bad experience with a teacher who was trying to teach them the one way to do mathematics, and they didn't like it.
My theory is that evolution has not given humans a math center in the brain directly. We have a vision center and a language center and some other centers, which evolution has honed, but we don't have an innate sense of mathematics. Our other centers are sophisticated enough, though, that different people can repurpose other areas of the brain to do mathematics. So some people have figured out how to use the visual center to do mathematics, and they think very visually when they do mathematics. Some people have repurposed their language center, and they think very symbolically.
Some people, if they are very competitive and they like gaming, there's a part of the brain that's very good at solving puzzles and games, and that can be repurposed. When I talk with other mathematicians, I can tell that they're using somewhat different styles of thinking than I am. Not disjoint, but they may prefer the visual; I don't actually prefer the visual so much, though I do need visual aids myself. Mathematics provides the common language, so we can still talk to each other even if we are thinking in different ways. But you can tell there's a different set of subsystems being used in the thinking process: they're very quick at things that I struggle with, and vice versa, and yet we still get to the same goal. That's beautiful.
Yeah, but the way we educate, unless you have a personal tutor or something, education just by nature of scale has to be mass-produced. You have to teach 30 kids, and they have 30 different styles; you can't teach 30 different ways on each topic. What advice would you give to young students who are struggling with math but are interested in it and would like to get better? Is there something you'd say, in this complicated educational context?
Yeah, it's a tricky problem. One nice thing is that there are now lots of sources for mathematical enrichment outside the classroom. In my day there were already math competitions, and there were popular math books in the library. But now you have YouTube, there are forums devoted just to solving math puzzles, and math shows up in other places too. For example, there are hobbyists who play poker for fun, and for very specific reasons they become interested in very specific probability questions. There's actually a community of amateur probabilists in poker, in chess, in baseball. There's math all over the place.
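As an illustration of the kind of probability question poker hobbyists actually compute (my example, not one mentioned in the conversation): the chance of completing a flush draw. After the flop you hold four cards of one suit, so 9 of the 47 unseen cards are "outs", and two more community cards are coming.

```python
from math import comb

# P(hit the flush) = 1 - P(miss on both remaining cards)
#                  = 1 - C(38, 2) / C(47, 2),
# since 38 of the 47 unseen cards are non-outs.
p_hit = 1 - comb(38, 2) / comb(47, 2)
print(round(p_hit, 4))  # ~0.3497, the familiar "about 35%" flush draw
```

This is exactly the sort of small, concrete combinatorics that pulls people into probability theory through a game they already care about.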
And I'm hoping, actually, with these new tools, Lean and so forth, that we can incorporate the broader public into math research projects. This almost doesn't happen at all currently. In the sciences there's some scope for citizen science: in astronomy there are amateurs who discover comets, and in biology there are people who can identify butterflies and so forth. In math there are some smaller activities where amateur mathematicians can discover new primes and so forth, but previously, because we have to verify every single contribution, for most mathematical research projects it would not help to have input from the general public; in fact it would just be time-consuming, because of all the error-checking and everything.
But one thing about these formalization projects is that they are bringing in more people. I'm sure there are high-school students who have already contributed to some of these formalization projects, who have contributed to Mathlib. You don't need to be a PhD holder to work on one atomic thing. There's something about the formalization here that, as a first step, opens it up to the programming community, the people who are already comfortable with programming. It seems like programming is, maybe it's just a feeling, but it feels more accessible to folks than math. Math, especially modern mathematics, is seen as this extremely difficult area to enter, and programming is not. So that could be an entry point: you can actually code, you can get results, and you can put them out into the world pretty quickly.
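For a programmer curious what an "atomic" Lean contribution even looks like, here is about the smallest possible sketch (an illustrative toy, not taken from any actual Mathlib file):

```lean
-- A tiny Lean 4 theorem: commutativity of natural-number addition,
-- proved by appealing to the existing library lemma Nat.add_comm.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A computation the kernel checks definitionally:
example : 2 + 2 = 4 := rfl
```

Real contributions are of course larger, but they are built from exactly these ingredients: stating a fact precisely, and assembling existing lemmas until the checker accepts it.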
Yeah. If programming were taught as an almost entirely theoretical subject, where you just taught the computer science, the theory of functions and routines and so forth, and outside of some very specialized homework assignments you didn't, like, program on the weekend for fun, it would be considered as hard as math.
Yeah, so as I said, there are communities of non-mathematicians who are deploying math for some very specific purpose, like optimizing their poker game, and then math becomes fun for them.
What advice would you give in general to young people, how to pick a career, how to find themselves? Yeah, that's a tough question. There's a lot of uncertainty now in the world. There was this period after the war where, at least in the West, if you came from a good demographic, there was a very stable path to a good career: you go to college, you get an education, you pick one profession, and you stick with it. That's becoming much more a thing of the past.
So I think you just have to be adaptable and flexible. I think people will have to acquire skills that are transferable. Learning one specific programming language, or one specific subject in mathematics, is not by itself a super transferable skill; but knowing how to reason with abstract concepts, or how to problem-solve when things go wrong, these are things which I think we will still need even as our tools get better.
And you'll be working with AI tools and so forth. But actually, you're an interesting case study. You're one of the great living mathematicians, you had a way of doing things, and then all of a sudden you start learning new things. First of all, you kept learning new fields, but you also learned Lean; that's a non-trivial thing to learn. For a lot of people that's an extremely uncomfortable leap to take, right? Yeah. Well, I've always been interested in new ways to do mathematics.
I feel like a lot of the ways we do things right now are inefficient. Like many of my colleagues, I spend a lot of time doing very routine computations, or doing things that other mathematicians would instantly know how to do and we don't, and why can't we just search and get a quick response, and so on? So that's why I've always been interested in exploring new workflows. About four or five years ago, I was on a committee where we had to ask for ideas for interesting workshops to run at a math institute, and at the time Peter Scholze had just formalized one of his new theorems, and there were some other developments in computer-assisted proof that looked quite interesting, and I said, oh, we should have a workshop on this, this is a good idea.
And then I was a bit too enthusiastic about this idea, so I got voluntold to actually run it. So I did, with a bunch of other people, Kevin Buzzard and Jordan Ellenberg and a number of others, and it was a nice success. We brought together a bunch of mathematicians and computer scientists and other people, and we got up to speed on the state of the art, and there were really interesting developments that most mathematicians didn't know were going on. There were lots of nice proofs of concept, just hints of what was going to happen.
This was just before ChatGPT, but even then there was one talk about language models and their potential in the future. So that got me excited about the subject, and I started giving talks about it. Then ChatGPT came out, and suddenly AI was everywhere, and I got interviewed a lot about this topic, and in particular about the interaction between AI and formal proof systems.
And I said, yeah, they should be combined; there's a perfect synergy to happen here. And at some point I realized I had to not just talk the talk but walk the walk. I don't work in machine learning, I don't work in proof formalization, and there's a limit to how much I can just rely on authority, saying, I'm a renowned mathematician, just trust me when I say that this is going to change mathematics, while not doing any of it myself.
So I felt like I had to actually justify it. A lot of what I get into, actually, I don't see in advance how much time I'm going to spend on it, and it's only after I'm waist-deep in a project that I realize that by that point I'm committed. Well, that's deeply admirable, that you're willing to go into the fray, be in some small way a beginner, or have some of the challenges that a beginner would, right? New concepts, new ways of thinking, and also sucking at a thing that others are good at.
I think you said in that talk, you could be a Fields Medal-winning mathematician and an undergrad still knows something better than you. Yeah. Mathematics is so huge these days that nobody knows all of modern mathematics, and inevitably we make mistakes. And you can't cover up your mistakes with just bravado, because people will ask for your proofs, and if you don't have the proofs, you don't have the proofs.
Yeah, so it does keep us honest. It's not a perfect panacea, but I think we do have more of a culture of admitting error, because we're forced to all the time. Big ridiculous question, I'm sorry for it once again: who is the greatest mathematician of all time? Maybe one who's no longer with us. Who are the candidates? Euler, Gauss, Newton, Ramanujan, Hilbert.
So first of all, as I said before, there's some time dependence. But if you rank by cumulative impact over time, for example, Euclid is one of the good contenders. And then maybe some unnamed anonymous mathematicians before that, whoever came up with the concept of numbers. Do mathematicians today still feel the impact of Hilbert? Oh yeah, directly, in everything that's happened in the 20th century.
Yeah, Hilbert spaces; we have lots of things named after him, of course. Just the arrangement of mathematics, and the introduction of certain concepts. His 23 problems have been extremely influential. There's some strange power to declaring which problems are hard, to the statement of open problems. Yeah, you see the bystander effect everywhere: if no one says you should do X, everyone just sort of mills around, waiting for someone else to do something, and nothing gets done.
And one thing that you actually have to teach undergraduates in mathematics is that you should always try something. You see a lot of paralysis in an undergraduate trying a math problem. If they recognize that there's a certain technique that can be applied, they will try it, but there are problems for which they see that none of their standard techniques obviously applies, and the common reaction is then just paralysis: I don't know what to do.
I think there's a quote from The Simpsons: "I've tried nothing and I'm all out of ideas." So the next step then is to try anything, no matter how stupid; in fact, the stupider the better, almost. Pick a technique which is almost guaranteed to fail, but where the way it fails is going to be instructive. Like, it fails because you're not at all taking into account this hypothesis. Oh, so this hypothesis must be useful. That's a clue.
I think you also suggested somewhere this fascinating approach, which really stuck with me; I've been using it, and it really works. I think it's called structured procrastination. Yes: when you really don't want to do a thing, you imagine a thing you want to do even less, something that's worse, and in that way you procrastinate by not doing the thing that's worse. Yeah, it's a nice hack. It actually works.
With anything like this, psychology is really important. You talk to athletes, marathon runners and so forth, and they talk about what the most important thing is: the training regimen, or the diet, and so on. And so much of it is psychology, just tricking yourself into thinking the goal is feasible, so that you're motivated to do it.
Is there something our human mind will never be able to comprehend? Well, okay, it's a bit of a reduction, but there must be some Busy Beaver number, some enormously large number, that we can never compute; that was the first thing that came to mind. But even more broadly: is there something about our mind such that we're going to be limited, even with the help of mathematics? Well, okay.
I mean, it depends how much augmentation you allow. For example, if I didn't even have pen and paper, if I had no technology whatsoever, so I'm not allowed a blackboard, or pen and paper... You're already much more limited than you would be. Incredibly limited. Even language: the English language is a technology, one that's been very internalized. So you're right, the formulation of the problem is incorrect, because there really is no longer just a solo human.
We're already augmented in extremely complicated, intricate ways, right? Yeah, we're already like a collective intelligence. Yes. Humanity, plural, has much more intelligence, in principle, on its good days, than the individual humans put together. It can also have less. Okay. But the mathematical community, plural, is an incredibly superintelligent entity that no single human mathematician can come close to replicating.
You see it a little bit on these question-and-answer sites, like MathOverflow, which is the mathematical version of Stack Overflow. Sometimes you get very quick responses to very difficult questions from the community, and it's a pleasure to watch, actually. Oh, even as a non-expert, I'm a fan, a spectator of that site, just seeing the brilliance of the different people, the depth of knowledge that people have, and the willingness to engage in the rigor and the nuance of the particular questions. It's pretty cool to watch.
It's fun, it's almost just fun to watch. What gives you hope about this whole thing we have going on, human civilization? I think the younger generation is always really creative and enthusiastic and inventive. It's a pleasure working with young students. And the progress of science tells us that problems that used to be really difficult can become trivial to solve. Navigation, for example: just knowing where you were on the planet was this horrendous problem. People died, fortunes were lost, because they couldn't navigate. And now we have devices in our pockets that do it automatically for us; it's a completely solved problem. So things that seem unfeasible for us now could be maybe just homework exercises in the future.

Yeah. One of the things I find really sad about the finiteness of life is that I won't get to see all the cool things we create as a civilization, in the next hundred years, two hundred years. Just imagine showing up in two hundred years.
Yeah, well, already plenty has happened. If you could go back in time and talk to your teenage self: the internet, and now AI. We've been quick to internalize it and say, yeah, of course AI can understand your voice and give reasonably correct answers to any question, but this was mind-blowing even two years ago. In the moment, it's hilarious to watch the drama on the internet and so on. People take everything for granted very quickly, and then we humans seem to entertain ourselves with drama; out of anything that's created, somebody needs to take one opinion, another person needs to take the opposite opinion, and they argue with each other about it. But when you look at the arc of things, even just the progress of robotics, you take a step back and think, wow, it's beautiful the way humans are able to create this. Yeah, when the infrastructure and the culture are healthy, the community of humans can be so much more intelligent and mature and rational than the individuals within it.
Well, one place I can always count on for rationality is the comments section of your blog, which I'm a fan of. There are a lot of really smart people there. And thank you, of course, for putting those ideas out on the blog. I can't tell you how honored I am that you would spend your time with me today. I was looking forward to this for a long time, Terry. I'm a huge fan. You inspire me, you inspire millions of people. Thank you so much for talking today. Thank you, it was a pleasure.

Thanks for listening to this conversation with Terence Tao. To support this podcast, please check out our sponsors in the description or at lexfridman.com/sponsors. And now, let me leave you with some words from Galileo Galilei: "Mathematics is the language with which God has written the universe." Thank you for listening, and hope to see you next time.