Who you are defines how you build. Welcome, YouTube and Stanford communities, to the Entrepreneurial Thought Leaders seminar, brought to you by STVP, the entrepreneurship center in the School of Engineering at Stanford, and BASES, the Business Association of Stanford Entrepreneurial Students. Today we are so honored to have Ilya Sutskever here at ETL.
Ilya is the co-founder and chief scientist of OpenAI, which aims to build artificial general intelligence for the benefit of all humanity. Elon Musk and others have cited Ilya as the foundational mind behind the large language model generative pre-trained transformer three, or GPT-3, and its public-facing product, ChatGPT.
Few product releases have created as much excitement, intrigue, and fear as the release of ChatGPT in November of 2022. Ilya is another example of how the US and the world have been the beneficiaries of amazing talent from Israel and Russia: Ilya was born in Russia, and when he was five he moved to Israel, where he grew up. He spent the first half of undergrad in Israel, and then he transferred to the University of Toronto to complete his bachelor's degree in mathematics.
He went on to get a master's and PhD in computer science from the University of Toronto, and then came over here to the farm and did a short stint with Andrew Ng before returning to Toronto to work at his advisor Geoffrey Hinton's research company, DNNresearch.
Google then acquired DNNresearch shortly thereafter, in 2013, and Ilya became a research scientist as part of Google Brain. In 2015 he left Google to become a director of the then newly formed OpenAI. It's hard to overestimate the impact that ChatGPT has had on the world since its release in November of last year, and while it feels like ChatGPT came out of nowhere to turn the world on its head, the truth is that there's a deep history of innovation that led to that moment.
And as profound as ChatGPT is, Ilya is no stranger to offering discontinuous leaps of innovation in AI. Geoff Hinton has said that Ilya was the main impetus for AlexNet, the convolutional neural network in 2012 that is credited with setting off the deep learning revolution that has led to the moment we are now in.
And of course, it was seven years after the founding of OpenAI that ChatGPT was finally unleashed to the world. Ilya was elected a Fellow of the Royal Society in 2022; he was named to the MIT Tech Review 35 Under 35 list in 2015; he received the University of Toronto's Innovator of the Year award in 2014 and the Google Fellowship from 2010 to 2012. So with that, everybody, please give a warm virtual round of applause and welcome Ilya to the Entrepreneurial Thought Leaders seminar.
So Ilya, imagine lots of applause, and you're always invited back onto the farm physically whenever you are able. Ilya, there's so much to discuss and so little time, and we have quite a broad range of fluency in the audience around ChatGPT and large language models, so I wanted to start off with a quick question on the technology. The key technology underlying OpenAI and generative AI more broadly is large language models.
Can you describe the technology in simple terms, and now that you're at the forefront of the tech, can you share what has surprised you most about what the tech can do that you didn't anticipate? Yeah.
I can explain what the technology is and why it works. I think the explanation for why it works is both simple and extremely beautiful, and it works for the following reason. You know how the human brain is our best example of intelligence in the world, and we know that the human brain is made of a very, very large number of neurons. Neuroscientists have studied neurons for many decades to try to understand precisely how they work, and while the operation of our biological neurons is still mysterious, there was a pretty bold conjecture made by the earliest deep learning researchers back in the 1940s: the idea that an artificial neuron, the kind we have in our artificial neural networks, is kind of, sort of, similar to a biological neuron, if you squint. So there's an assumption there.
And we can just run with this assumption. Now, one of the nice things about these artificial neurons is that they are much simpler, and you can study them mathematically. And a very important breakthrough, made by the very early deep learning pioneers before it was known as deep learning, was the discovery of the backpropagation algorithm, which is a mathematical equation for how these artificial neural networks should learn.
It provides us with a way of taking a large computer and implementing this neural network in code, and then there is an equation that we can code up which tells us how this neural network should adapt its connections to learn from experience.
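To make those two ingredients concrete, here is a toy sketch (my illustration, not anything from the talk or from OpenAI) of a single artificial neuron whose connection strengths are adapted from experience by gradient descent, the update rule that backpropagation extends to networks of many layers. The dataset, learning rate, and target function are all made up:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)   # connection strengths, the rough analogue of synapses
b = 0.0                  # bias term

def neuron(x):
    # weighted sum of inputs passed through a squashing nonlinearity (sigmoid)
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

# Tiny made-up dataset: the target is 1 exactly when the first input is active.
X = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 1]], dtype=float)
y = np.array([1, 1, 0, 0], dtype=float)

lr = 1.0
for _ in range(2000):
    p = neuron(X)                      # forward pass: current predictions
    grad = p - y                       # error signal (sigmoid + cross-entropy)
    w -= lr * (X.T @ grad) / len(X)    # adapt connections down the gradient
    b -= lr * grad.mean()

print(neuron(np.array([1.0, 0.0, 0.0])))   # close to 1: the rule was learned
```

Backpropagation proper is the bookkeeping that computes this same kind of error signal for every connection in a deep stack of such neurons.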
Now, a lot of further progress had to do with understanding just how good and capable this learning procedure is, and the exact conditions under which it works well. Although we do this with computers, it has been a little bit of an experimental science, a little bit like biology: we have something that resembles a biological experiment.
And so a lot of the progress in deep learning basically boils down to this: we can build these neural networks on our large computers, and we can train them on some data; we can train those large neural networks to do whatever it is that the data asks them to do.
Now, the idea of a large language model is that you take a very large neural network (and these neural networks are pretty large now) and train it on the task of guessing the next word from a bunch of previous words in text. So this is the idea of a large language model: you train a big neural network to guess the next word from the previous words in a text, and you want it to guess the next word as accurately as possible.
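As a minimal sketch of what that training setup looks like in data terms (my illustration; real systems use learned subword tokenizers, not whitespace splitting):

```python
# Turn running text into (previous words -> next word) training examples.
text = "the quick brown fox jumps over the lazy dog"
tokens = text.split()   # stand-in for a real subword tokenizer

examples = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

for context, next_word in examples:
    print(" ".join(context), "->", next_word)

# A large language model is trained so that, for each such context,
# it assigns high probability to the word that actually came next.
```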
Now, what happens here is that we need to come back to our original assumption, that maybe biological neurons aren't that different from artificial neurons. So if you have a large neural network like this that guesses the next word really well, maybe it will be not that different from what people do when they speak. And that's what you get.
So when you talk to a neural network like this, it's because it has such an excellent sense of what comes next, what word comes next. It can't see the future, but it can narrow down the possibilities correctly from its understanding. Being able to guess what comes next very, very accurately requires prediction, which is how you operationalize understanding. What does it mean for a neural network to understand? It's hard to come up with a clean answer, but it is very easy to measure and optimize the network's prediction error on the next word.
So we say we want understanding, but we can optimize prediction, and that's what we do. That's how you get these current large language models: neural networks which are large, trained with the backpropagation algorithm, which is very capable. And if you allow yourself to imagine that an artificial neuron is not that different from a biological neuron,
then yeah, our brains are capable of doing a pretty good job at guessing the next word, if you pay very close attention.
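To pin down the quantity that "optimize prediction" refers to, here is a one-step sketch with made-up numbers (my illustration, not from the talk):

```python
import math

# Suppose the network, shown the context "the cat sat on the",
# outputs this (invented) distribution over possible next words:
predicted = {"mat": 0.6, "floor": 0.3, "moon": 0.1}
actual_next_word = "mat"

# The prediction error that training minimizes is the cross-entropy:
# the negative log of the probability assigned to the true next word.
loss = -math.log(predicted[actual_next_word])
print(f"loss = {loss:.3f}")   # ~0.511; a perfect prediction would score 0.0

# Backpropagation computes how to nudge every connection in the network
# to lower this loss, averaged over enormous amounts of text.
```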
I love that, and I just want to make this more concrete, to push that analogy further between the biological brain and these digital neural networks. Before, it was considered untenable for these machines to learn.
Now it's a given that they can learn, or make these predictions of what's going to come next. If a human is at 1x learning, and you have visibility into the most recent ChatGPT models, where would you put the most recent ChatGPT model as a ratio? So humans are at 1x; where is ChatGPT?
You know, it's a bit hard to make direct comparisons between our artificial neural networks and people, because at present, people are able to learn more from a lot less data. This is why neural networks like ChatGPT are trained on so much data: to compensate for their initially slow learning ability. That said, as we train these neural networks and make them better, faster learning abilities start to emerge.
But overall, it is the case that we are quite different here; the way people learn is quite different from the way these neural networks learn. One example: these neural networks are, you know, solidly good at math or programming.
But the number of math books they needed in order to get good at something like calculus is very high, whereas a person needs, you know, two textbooks and maybe 200 exercises, and you're pretty much good to go.
So, just to get an order-of-magnitude sense: if you relax the data constraint, if you let the machine consume as much data as it needs,
do you think it's operating at, like, one tenth of a human right now? You know, it's still quite hard to answer that question. Let me tell you why I hesitate: I think any figure like this would be misleading, and I want to explain why.
Because right now, any such neural network is obviously very superhuman when it comes to the breadth of its knowledge and the very large number of skills these networks have. For example, they're very good at poetry; they can talk eloquently about pretty much any topic; they can talk about historical events and lots of things like that. On the other hand, people can go deep, and they do go deep: you may have an expert, someone who understands something very deeply, despite having read only a small number of documents on the topic.
So because of this difference, I really hesitate to answer the question in terms of a single number. Do you think there is a singularity point where the machines will surpass humans in terms of the pace of learning and adaptation? Yes. And when do you think that point will occur? I don't know when it will occur; I think some additional advances will need to happen, but I absolutely would not bet against this point occurring at some point. Can you give me a range? Is it at some point next month? Next year? You know, I think the uncertainty on this is quite high, because I can imagine these advances taking quite a while; I can also imagine them taking some number of years. It's just very hard to give a calibrated answer.
And I know there's lots to push forward on, so I'm going to ask one more question and then move on to some of the other issues. I read that when you were a child you were disturbed by the notion of consciousness, and I wasn't sure what that word "disturbed" meant there, but I'm curious: do you view consciousness, or sentience, or self-awareness as an extension of learning? Do you think that is also an inevitability that will happen, or not?
Yeah, on the consciousness question: as a child, I would look into my hand and think, how can it be that this is my hand, that I get to see it? Something of that nature; I don't know how to explain it much better. So that's been something I was curious about. It's tricky with consciousness, because how do you define it? It's something that has eluded definition for a long time. And how can you test for it in a system? Maybe there is a system which acts perfectly the way you expect
a conscious system to act, yet maybe it won't be conscious for some reason. But I do think there is an experiment we could run on an AI system, one which we can't run just yet. At some future point, when the AI learns very, very quickly from less data, you could do the following experiment. You very carefully curate the data such that you never, ever mention anything about consciousness. You would only say: here is a ball, and here is a castle, and here is a little toy. Imagine you had data of this sort, and it was very controlled. Maybe you'd have some number of years' worth of this kind of training data; maybe such an AI system would be interacting with a lot of different teachers, learning from them, but very carefully, you never, ever mention consciousness. People don't talk about anything except the most surface-level notions of their experience. And then at some point you sit this AI down and you say: okay, I want to tell you about consciousness. It's this thing that's a little bit not well understood; people disagree about it, but that's how they describe it.
And imagine if the AI then goes and says: oh my god, I've been feeling the same thing, but I didn't know how to articulate it. That would definitely be something to think about. If the AI was trained only on very mundane data about objects and going from place to place, from a very narrow set of concepts where we never, ever mentioned consciousness, and it could somehow eloquently and correctly talk about it in a way that we would recognize, that would be convincing. And do you think of consciousness as something of degree, or is it something more binary? I think it's more a matter of degree. Let's say a person is extremely tired and maybe drunk; perhaps when someone is in that state, their consciousness is already reduced to some degree.
I can imagine that animals have a more reduced form of consciousness. If you imagine going from large primates to maybe dogs and cats, and then eventually you get to mice, and perhaps to an insect, it feels pretty continuous to me. Yeah.
Okay, I want to move on. I would love to keep asking questions along the lines of the technology, but I want to move on to talking about the mission of OpenAI, how you perceive any issues around ethics, and your role as chief scientist: how ethics informs, if at all, how you think about your role. So let me just lay out a couple of foundation points and then have you speak.
As you know, OpenAI's mission is to ensure that artificial general intelligence benefits all of humanity. It started off as a nonprofit and open source, and it is now a for-profit and closed source, with a close relationship with Microsoft. Elon Musk, who I believe recruited you to originally join OpenAI and gave a hundred million dollars when it was a nonprofit, has said that the original vision was to create a counterweight to Google and the corporate world, and that he didn't want to have a world in which AI,
which he and others perceive as a potential existential threat to humanity, is held solely by a for-profit corporation. And now OpenAI is neither open nor exclusively a nonprofit; it's also a for-profit with close ties to Microsoft, and it looks like the world may be headed towards a private duopoly between Microsoft and Google. So I want to ask about the calculus behind the shift from a nonprofit to a for-profit: did you weigh in on the ethics of that decision? Do ethics play a role in how you conceive of your role as chief scientist, or do you view that as something somebody else should handle while you are mainly tasked with pushing the technology forward?
So many parts; let me think about the best way to approach it. There are several parts: there is the question around open source versus closed source; there is a question around nonprofit versus for-profit and the connection with Microsoft, and how to see that in the context of Elon Musk's recent comments.
You also asked a question about how I see my role in this; maybe I'll start with that, because I think that's easier. The way I see my role: I feel direct responsibility for what OpenAI does. My role is primarily around advancing the science, but it is still the case that I'm one of the founders of the company, and ultimately I care a lot about OpenAI's overall impact.
With this context, I want to talk about open source versus closed source and nonprofit versus for-profit. I want to start with open source versus closed source, because I think the challenge with AI is that AI is so all-encompassing, and it comes with many different challenges, many different dangers, which come into conflict with each other.
And I think open source versus closed source is a great example of that. Why is it desirable, what are some reasons for which it is desirable, to open-source AI?
The answer there would be: to prevent concentration of power in the hands of those who are building the AI. If you are in a world where, let's say, only a small number of companies control this very powerful technology, you might say this is an undesirable world, and that AI should be open so that anyone can use it. That is the argument for open source.
But, to state the obvious, there are near-term commercial incentives against open source. There is also a longer-term argument against open source, which applies if one believes that eventually AI is going to be unbelievably powerful.
If we get to a point where your AI is so powerful that you can just tell it: hey, can you autonomously create a biological research lab, autonomously do all the paperwork, rent the space, hire the technicians, aggregate experiments, do all of this autonomously; that starts to get mind-bendingly powerful. Should this be open source also?
So my position on the open source question is that there is a threshold of capability. You can think about these neural networks in terms of capability: how capable they are, how smart they are, how much they can do.
When the capability is on the lower end, I think open source is a great thing. But at some point, and there can be debate about where that point is, the capability will become so vast that it will be obviously irresponsible to open-source these models.
And was that the driver behind closed-sourcing it, or was it driven by a devil's compact, a business necessity to get cash in from Microsoft or others to support the viability of the business? Was the decision to close it actually driven by that line of reasoning, or was it driven by something more?
The way I've articulated my view is that the current level of capability is still not that high where it would be the safety consideration driving the closed-sourcing of this kind of research. In other words, the claim is that it goes in phases: right now it is indeed the competitive phase, but I claim that as the capabilities of these models keep increasing, there will come a day when it will be the safety consideration that is the obvious and immediate driver not to open-source these models.
So that's open source versus closed source. But your question had another part, which is nonprofit versus for-profit, and we can talk about that also.
Indeed, it would be preferable in a certain meaningful sense if OpenAI could just be a nonprofit from now until the mission of OpenAI is complete. However, one of the things worth pointing out is the very significant cost of these data centers. I'm sure you're reading about various AI startups and the amounts of money they are raising, the great majority of which goes to the cloud providers.
Why is that? The reason so much money is needed is that this is the nature of these large neural networks: they need the compute, end of story. You can see something like this in the divide that's now happening between academia and the AI companies.
For many decades, cutting-edge research in AI took place in academic departments at universities. That kept being the case up until the mid-2010s. But at some point, when the complexity and cost of these projects started to get very large, it no longer remained possible for universities to be competitive, and university research in AI now needs to find some other way to contribute. Those ways exist; they're just different from what universities are used to, and different from the way the companies are contributing right now.
Now, with this context: the thing about a nonprofit is that people who give money to a nonprofit never get to see any of it back. It is a real donation, and, believe it or not, it is quite a bit harder to convince people to give money to a nonprofit. So we asked: what's the solution there, what is a good course of action?
We came up with an idea that, to my knowledge, is unique among all corporate structures in the world. The OpenAI corporate structure is absolutely unique. OpenAI is not a for-profit company; it is a capped-profit company, and I'd like to explain what that means. It means that equity in OpenAI can be better seen as a bond than as equity in a normal company.
The main feature of a bond is that once it's paid out, it's done. In other words, OpenAI has a finite obligation to its investors, as opposed to the infinite obligation that normal companies have. And does that include the founders, the founders' equity in OpenAI?
So Sam Altman does not have equity, but the other founders do. And is it capped, or is it unlimited? It's capped. And how does that cap get set? Because the founders, I presume, didn't buy in, unless it's capped at the nominal share value.
I'm not sure I understand the question precisely, but I can answer the part which I do understand. It is different from normal startup equity, but there are some similarities as well: the earlier you joined the company, the higher the cap is, because a larger cap is needed to attract the initial investors. As the company continues to succeed, the cap decreases. And why is that important? It's important because it means that
once all the obligations to investors and employees are paid out, OpenAI becomes a nonprofit again. And you can say: this is totally crazy, what are you talking about, it's not going to change anything. But it's worth looking at what we think AI will be.
We can look at what AI is today, and I think it is not at all inconceivable for OpenAI to pay out its obligations to the investors and employees, becoming a nonprofit at around the time when perhaps the computer will become so capable that the economic disruption will be very big, and when this transition will be very beneficial.
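As a toy illustration of the mechanism he is describing (invented numbers and a hypothetical cap multiple; not OpenAI's actual terms):

```python
# The investor's claim behaves like a bond with a ceiling; anything
# above the cap accrues to the nonprofit. All figures are hypothetical.
def split_proceeds(invested: float, cap_multiple: float, proceeds: float):
    cap = invested * cap_multiple
    to_investor = min(proceeds, cap)          # finite, bond-like obligation
    to_nonprofit = max(0.0, proceeds - cap)   # everything beyond the cap
    return to_investor, to_nonprofit

# Hypothetically: $100M invested at a 10x cap, $5B of attributable proceeds.
investor, nonprofit = split_proceeds(100e6, 10.0, 5e9)
print(f"investor: ${investor/1e9:.1f}B (cap reached); nonprofit: ${nonprofit/1e9:.1f}B")
```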
So that is the answer on capped-profit versus nonprofit. There was a last part of the question; I know I've been speaking for a while, but the question had many parts. The last part is the Microsoft relationship.
Here, the thing that's very fortunate is that Microsoft is thinking about these questions the right way; they understand the potential and the gravity of AGI. For example, on all the investor documents that any investor in OpenAI signs (and by the way, Microsoft is an investor in OpenAI, which is a very different relationship from Google and DeepMind), there is a purple rectangle at the top of the investment document which says that the fiduciary duty of OpenAI is to the OpenAI mission, which means that you run the risk of potentially losing all your money if the mission comes into conflict. This is something that all the investors have signed.
And let me just make this clear for everybody: Google acquired DeepMind, so DeepMind was an asset inside of Google, beholden to Google. You're making the distinction that with OpenAI, Microsoft is an investor, and so beholden to this fiduciary duty to the mission of OpenAI, which is held by the nonprofit, which is a GP or an LP in the for-profit? Something like this, you know. I can't tell you the precise details, but this is the general picture.
So, Steve Wozniak, the co-founder of Apple, and Elon Musk famously signed this very public petition saying that the point of no return has already passed, or is approaching, where it's going to be impossible to rein in AI and its repercussions if we don't halt it now, and they've called for halting AI. You are a world citizen: you were born in Russia, you were raised in Israel, you're Canadian. OpenAI's response to that public petition, I know, was that Sam basically said this wasn't the right way to go about it. Also, in parallel, Sam is on a world tour through many countries, some of which can be antagonistic towards the West. Are there any citizen obligations, ethical obligations, that you think outweigh your technological obligations when it comes to spreading the technology around the world right now through OpenAI? Do you think it should be beholden to regulation or some oversight? You can respond however you want to on that; I know we're coming up on time, so I just want to give you the mic and share everything that's on my mind, and you can decide how you want to handle it.
Thank you. It is true that AI is going to become truly, extremely powerful and truly, extremely transformative, and I do think that we will want to move to a world with sensible government regulations. There are several dimensions to it. We want to be in a world where there are clear rules about, for example, training more powerful neural networks; where there is some kind of careful evaluation, careful prediction, of what we expect these neural networks to be able to do today, and what we expect them to be able to do, let's say, a year from now, or by the time they finish training.
I think all these things will be very necessary in order to, rationally... I wouldn't use the words "slow down the progress"; I would say you want to make it so that the progress is sensible, so that at each step we've done the homework, and indeed we can make a credible story that, okay, the system that we've trained: we are doing this, here are all the steps, and it's been verified, certified. I think that is the world we are headed to, and which I think is correct. And as for the citizen obligation, I'll answer it like this: there are two answers to it. Obviously, you know, I live in the United States and really like it here, and I want this place to flourish as much as possible; I care about that.
But the world is much more than just the US, and I think these are the kinds of questions that feel a little bit outside of my expertise: how these between-country relationships work out. But I'm sure there will be lots of discussions there as well. Yeah. Can I turn a little bit towards strategy? I'm curious: internally, what metrics do you track as your north star? What are the most sacred KPIs that you use to measure OpenAI's success right now? The most sacred KPIs: you know, I think this is also the kind of question where maybe different people will give you different answers, but if I were to really narrow it down, I would say there are a couple of really important KPIs, really important dimensions of progress. One is, undeniably, the technical progress: are we doing good research?
Do we understand our systems better? Can we train them better? Can we control them better? Is our research plan being executed well? Is our safety plan being executed well? How happy are we with it?
I would say that would be my description of the primary KPI: do a good job with the technology. Then there is of course the stuff around the product, which I think is cool, but it is really the core technology that is the heart of OpenAI: its development, its control, its steering.
And do you view ChatGPT right now as a destination? Do you view OpenAI in the future being a destination that people go to, like Google, or will it be powering other applications, used as part of the back-end infrastructure? Is it a destination, or is it going to be more behind the scenes in five to ten years? Yeah, well, things change so fast that I cannot make any claims about five to ten years in terms of the correct shape of the product. I imagine a little bit of both, perhaps, but this kind of question remains to be seen; this stuff is still so new.
Okay, I'm going to ask one more question, then jump to the student questions. If you were a student at Stanford today interested in AI, somebody who wants to be Ilya, what would you focus your time on? And a second question on this: if you're also interested in entrepreneurship, what advice would you give a Stanford undergrad engineer who is interested in AI and entrepreneurship?
So I think on the first one: it's always hard to give generic advice like this. Yeah. But I can still provide some generic advice nonetheless. It is generally a good idea to lean into one's unique predispositions. If you think about the set of, let's say, inclinations or skills or talents that a person might have, the combination is pretty rare, so leaning into that is a very good idea no matter which direction you choose to go in. And then, on AI research:
I would say that there, especially, you want to lean into your own ideas and really ask yourself: is there something that's totally obvious to you that makes you go, why is everyone else not getting it? If you feel like this, that's a good sign; it means that you want to lean into that and explore it and see whether your instinct is borne out. It may not be true, but you know, my advisor Geoff Hinton says this thing which I really like: he says you should trust your intuition, because if your intuition is good, you'll go really far, and if it's not good, then there's nothing you can do. And as far as entrepreneurship is concerned:
this is a place where the unique perspective is even more valuable; maybe I'll explain why I think it's more valuable than in research. Well, in research it's very valuable too, but in entrepreneurship you need to almost pull from your unique life experience, where you say: okay, I see this thing, I see this technology, I see something. Take a very, very broad view and see if you can hone in on something, and then actually just go for it. So that would be the conclusion of my generic advice.
Okay, that's great. I'm going to move on to the student questions. One of the most upvoted questions is: how do you see the field of deep learning evolving in the next five to ten years? Yeah, let's see. You know, I expect deep learning to continue to make progress. There was a period of time where a lot of progress came from scaling, and we saw that in the most pronounced way in going from GPT-1 to GPT-3.
But things will change a little bit. The reason progress in scaling was so rapid is that people had all these data centers which they weren't using for a single training run, so by simply reallocating existing resources you could make a lot of progress. And it doesn't necessarily take that long to reallocate existing resources; someone just needs to decide to do so. It is different now, because the training runs are very big, and scaling is not going to progress as fast as it used to, because building data centers takes time.
But at the same time, I expect deep learning to continue to make progress from other places. The deep learning stack is quite deep, and I expect there will be improvements in many layers of the stack which together will still lead to progress being very robust. If I had to guess, I'm certain we will discover currently unknown properties of deep learning, and those properties will be utilized. I fully expect that the systems of five to ten years from now will be much, much better than the ones we have right now.
But exactly what that will look like is a bit harder to answer, because the improvements will be maybe a small number of big improvements plus a large number of small improvements, all integrated into a large, complex engineering artifact.
And can I ask you: your co-founder Sam Altman has said that we've reached the limits of what we can achieve by scaling to larger language models. Do you agree? And if so, what is the next innovation frontier that you're focusing on?
Yeah, so I don't know exactly what he said, but maybe he meant something like: the age of easy scaling has ended, or something like this. Of course the larger neural nets will be better, but it will take a lot of effort and cost to build them. I think there will be lots of different frontiers, and actually, on the question of how one can contribute in deep learning: identifying such a frontier, perhaps one that's been missed by others, is very fruitful.
Can I go even deeper on that? Because I think there is this debate about vertical focus versus generalist training. Do you think there's better performance to be achieved in a particular domain, such as law or medicine, by training with specialized data sets, or is it likely that generalist training with all available data will be more beneficial?
So, at some point we should absolutely expect specialist training to have a huge impact, but the reason we do the generalist training is just so that we can even reach the point where the neural net can understand the questions that we are asking. Only when it has a very robust understanding can we go into specialist training and really benefit from it. So yeah, I think these are all fruitful directions.
But when do you think we'll be at that point, when specialist training is the thing to focus on? I mean, you know, if you look at people who do open source work, people who work with open source models, they do a fair bit of this kind of specialist training, because they have a fairly underpowered model and they try to get every ounce of performance they can out of it. So I would say this is an example of it happening already, to some degree. It's not binary; you might want to think of it as a continuous spectrum.
But do you think the winning competitive advantage is going to be having these proprietary data sets, or having a much higher-performance large language model, when it comes to these applications of AI in verticals? I think it may be productive to think about an AI like this as a combination of multiple factors, where each factor makes a contribution. Is it better to have special data which makes your AI better at a particular set of tasks? Of course. Is it better to have a more capable base model? Of course, from the perspective of the task. So maybe this is the answer: it's not an either/or.
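As a concrete sketch of the specialist-training pattern discussed above (my illustration using the Hugging Face transformers library; the base model name and the domain corpus file are placeholders):

```python
# Continue next-token training of a general pretrained model on domain-only
# text, e.g. a legal or medical corpus. All names here are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "gpt2"  # stand-in for any general-purpose pretrained model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

data = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_set = data["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="specialist-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=train_set,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # same next-word objective, now on specialist data only
```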
I'm going to move on to the other questions. There's a question on what was the cost of training and developing GPT-3/GPT-4. Yeah, so, for obvious reasons I can't comment on that. But I think even from our research community there's a strong desire to be able to get access to different aspects of OpenAI's technology.
Are there any plans for releasing it to researchers or to other startups, to encourage more competition and innovation? Some of the requests that I've heard are: unfettered interactions without safeguards, to understand the model's performance; model specifications, including details on how it was trained; and access to the model itself, i.e. the trained parameters.
Do you want to comment on any of that? I mean, I can relate it to our earlier question about open versus closed. I think there are some intermediate approaches which can be very fruitful. For example, model access, and various combinations of that, can be very, very productive, because these neural networks already have such a large and complicated surface area of behavior, and studying that alone can be extremely interesting. We have an academic access program; we provide various forms of access to the models, and in fact plenty of academic research labs do study them in this way. So I think this kind of approach is viable and is something that we are doing.
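For readers unfamiliar with what hosted model access looks like in practice, here is a minimal sketch of querying a model through the OpenAI Python client (the client interface has changed over time; this follows the current SDK, and the model name is illustrative):

```python
# Behavioral study via API access: send a prompt, inspect the completion.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model name
    messages=[{"role": "user",
               "content": "Explain backpropagation in one sentence."}],
)
print(response.choices[0].message.content)
```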
And we're coming up on time, so I want to end with just one final question: can you share any unintuitive but compelling use cases for how you love to use ChatGPT that others may not know about? I mean, I wouldn't say it's unknown, but I really enjoy its poem-writing ability. It can write poems, it can rap, it can be pretty amusing.
And do you guys use it? Is it an integrated part of teamwork at OpenAI? I assume it is, but I'm curious whether you have any insights into how it changes team dynamics when you have AI deeply integrated into a human team, and into what we may not know yet but that will come. I would say that today the best way to describe the impact is that everyone is a little bit more productive; people are a little bit more on top of things. I wouldn't say that right now there is a dramatic impact on dynamics, where I could say, oh yeah, the dynamics have shifted in this pronounced way. Do you think it depersonalizes conversations, because it's an AI bot? Maybe it may, but maybe we're not at that point yet. I definitely don't think that's the case now, and I predict that it will not be the case, but we'll see.
Well, thank you, Ilya, for a fascinating discussion; time is always too short. You're always invited back to the farm; we'd love to have you either virtually or in person. Thank you, and thank you to our audience for tuning in to this session of the Entrepreneurial Thought Leaders series. Next week we're going to be joined by the executive chairman and co-founder of Okta,
Frederic Kerrest. You can find that event, and other future events in this ETL series, on our Stanford eCorner YouTube channel, and you'll find even more videos, podcasts, and articles about entrepreneurship and innovation at Stanford eCorner: that's ecorner.stanford.edu. And as always, thank you for tuning in to ETL.