And me and the model just went it just started coding and it was just the craziest thing at this point most code at Anthropic is written using quad code and almost everyone at Anthropic is using it every day. Hi, I'm Matt Turk from FirstMark. Welcome to the Mad Podcast. Today my guest is Boris Cherny, the creator of Cloud Code at Anthropic. Cloud Code is an agente coding AI that leaves indeterminal and has become one of the fastest growing products of all time already rumored to be producing 400 million in the annualized revenue just five months after it's launch. We talked about how Cloud Code became a rocket ship almost by accident. We started by building something that's useful for us pretty soon after we launched it most of Anthropic was a daily active user. What agente coding really means? A human describes to the model the change that they want and the model does all the work of doing that and the productivity shock, Cloud Code created inside Anthropic.
Anthropic technical onboarding used to take a few weeks but now engineers are you who we productive within the first few days. Here's how squad code and it can answer all these questions. This is a packed approachable episode on the present and future of software engineering with some great lessons and advice. It's really exciting and I think my advice to companies building is definitely build for please enjoy this fascinating conversation with Boris Cherny. Hey Boris, welcome. Thanks for having me. Thanks very much for doing this. I'm very excited by this conversation. You are the creator of Cloud Code. It's fair to say you have a massive hit on your hands. There was an article in the information just a couple of days ago that was saying that Cloud Code which really came out at the end of February of this year, so five six months ago.
It's already generating 400 million in annualized revenue. That's at least what it was reported. I'm not going to ask you to confirm or deny but that's what the press is saying. Perhaps even more importantly, there are revenue use everywhere in gushing videos calling Cloud Code the best coding agent by far. So amazing launch of the product. Did you have any sense that it was going to be this successful? Not really. We started by building something that's useful for us. I built a thing that, you know, for some foremost, was just useful for myself. And it's something that I found myself using every day and then I give it to the team and I thought that the team started using it every day and pretty soon after you launched it.
Most of it in the topic was a daily active user. And so I think at that point we had kind of a hunch in turn and we thought maybe we had to hit on our hands but it's still not obvious because it's an terminal. It's kind of a weird form factor. Everyone is coding in IDEs or people going to like this. Is it going to be that useful? Can you actually use it for a lot of coding? We had no idea. Was it always going to be a product or you started using it and then it was ahead and then you decided to release as a product. It was very accidental. When I joined Anthropic, I did a lot of prototyping. And one of the prototypes that I built was this thing that it just ran quad in the terminal just because as an engineer, those kind of the easiest thing to experiment with.
And it didn't even code at first. Quad code didn't code. It was called quad CLI at the time. And what it did was I used to automate my note taking. So it kind of controlled my notes app and it controlled my music player to learn to kind of play music from me. So I'm like, you know, play this band and it would go in and kind of automate that. And at first, I was just using it to play around with an Anthropic API to figure out what kind of applications I could build on it. And just on a whim, you know, tools came out recently and I tried giving the model bash tool. So it can use my bash to use the command line. And it mean to me the model just went it just started coding.
And it was just the craziest thing. Like as soon as it had bash, it kind of knew, okay, okay, I can write Apple script and I can automate stuff and I can use this computer. And it just felt like very native to the model in this way. And that was kind of a surprise. So I think that was a surprise. And then it was a surprise that they send it up as something useful where it was able to edit code and the code came out really good. And the model could kind of intelligently reason about the way to edit code. When was this if the product was launched at the end of February of 2025? That journey leading to this, that started a few months before that. Where was it timeline?
Yeah, something like that. It was it was wait last year. So wait in 2024. Okay. Great. Let's start from the top and for anyone that is just starting to learn about cloud code. How do you describe it? And cloud code is a agenda coding tool. The way to think about it is when engineers program, there's kind of different ways to program and this has changed over time. This has changed a lot over time. There's, you know, 50, 70 years ago, the way that people programed looked very different than the way it is today.
So, you know, 70 years ago, if you talked to a programmer, they would have been like, Oh, yeah, I take my punch cards and I punch holes in it and I put in this IBM kind of typewriter everything and it punches holes and that's programming. And then you feed this into this mainframe. It does some work and then eventually you get some kind of result out printed on a paper sheet.
I heard you say somewhere that your grandfather in the, I guess what was at the time the USSR use a punch cards like to tell that story. Yeah, yeah, yeah, my growing up my mom would tell me stories about how her, her father, my grandfather, he was one of the first programmers in the Soviet Union. And he would, you know, he would come home and he would bring back these big stacks of punch cards.
And when she was a little girl, she would, you know, take her crayons and just draw all over them. And this was, this was growing up for her. This is actually something that I didn't know until later in life. Like, you know, I was far into my engineering career. Whenever in this, this was a thing. There's something about just the visceral nature of, of this kind of physical programming. I don't think we're going to get that again.
But I think that was like, that was a special moment where you could feel, you could feel the language and you can feel the computer in a really different way. And so I think after punch cards had changed a lot. There was software became a thing and you could start the program directly in software. You didn't have to kind of do this anymore.
In the original programming software, you was emulating teletype. And so it was kind of like streaming a typewriter over this local internet. And that's how the text editors were built. The way that you would code is still the way that actually most people code today before agent coding came out.
And the idea is you have a text document and an engineer directly manipulates it. So you loaded up in a special piece of software called IDE. You know, it's kind of like a, like a Microsoft Word or Google docs with this for code. And the engineer just manually edits the code using a, using a keeper.
And this is the case for gots like 50, 50 years now, like 70 years. This has been the way that people programmed to like edit and, and then these were like early 1970s. So yeah, it's been like 50 years. And now it's starting to change for the first time.
There's been a lot of work to try to evolve program programming from direct text manipulation to something else. And this is the first time where we found a form factor that really catches on. And it took, it took really great elements to do this. Models weren't really capable of doing this a year ago, but now they are.
And the way that it works is a human describes to the model that changed that they want. And then the model is the one that manipulates the text. And so it's this kind of next where up where you're describing the change you want. And the model does all the work of doing the editing.
And we're still at this point today where you can do a lot of this for the model. But you know, for complex changes, maybe it will take a bunch of iterations. Maybe the last 20, 30 percent, you still have to open an IDE. You have to kind of go well, well, well, to do these last mile changes.
And we think that over time, more and more coding is just going to be the model doing it. And a human will have to intervene less and less. So just to double click on a couple of the things you just mentioned. And in an effort, as we try often on this podcast, you make things broadly understandable, not just by super people as super deep in tech, but also people that are in the tech world but sort of curious to learn about the things. So the Cloud Code works at the CLI level as opposed to an IDE.
So you mentioned the IDE, sort of the Google docs. The command line may be defined that for people. So my kind of mental model as I was learning about this was like that black box where you sort of type in things for your computer to do what's the what's the better way of describing it. The terminal is a little bit hard to explain if you're not a programmer because it's so levelable. It's something that most people will never ever touch and never are going to see. It's something like in movies when you see this kind of green screen, green text going across the black screen and you know, there's like a hacking scene or something.
That's trying to look very techy. That's very techy. And it's sort of it's something every engineer uses. There's nothing kind of good or bad about it. It's just an interface. It's another way to interact with the computer. It's a little lower level because it's not visual. So you can think of Cloud Code when we say Cloud Code runs in the terminal, you can think of it just as a computer program. And some computer programs have user interfaces.
So a person can interact with them. And some of them just kind of run on the background. So you can think of Cloud Code as in that second category. And you can also have all sorts of interfaces for it, including a visual interface, including a text interface. One way of describing this, I heard was that interacting with the computer at the terminal level was like texting the computer. So sending instructions of one by one versus an ID was more of an application, more like a phone kind of a GUI where you basically click on icons and you have visual representation.
Is that fair? Yeah, it's about like that. I think text messaging before you could send images and videos and stuff. So just really, really simple interfaces just back and forth. Okay, great. So what did you guys choose to operate at the terminal slash CLI level? Honestly, it was sort of an accident. We were thinking about what can we build in the coding space? It seems like the models are getting a lot better coding.
So maybe there's something that could be built here. And at the time, like I said, I was prototyping some of these ideas. Some of them were kind of coding adjacent. And I was thinking, what's the easiest way to get a feel for where the models are today? Because there was this feeling where the models can do so much, but no one's yet built the product that can harness this capability.
And in the AI world, we call this idea product overhang, where the model is just capable of all these things. And there isn't really yet a product that can kind of capture this and write a person use it. And I remember there was this moment back in, yeah, sometimes wait west here. The other engineer on the team said, and I were in a room, and we're writing and we're right, okay, what do we build?
And just in 15 minutes, we can throw up a few ideas on the whiteboard. Or if we can do a CLI, we can do some sort of IDE extension. We can do something based in the web. And we kind of closed our eyes and just picked one. And it kind of seemed like CLI is the simplest. And generally, that's the way that I approach product, which is start with a simple thing first.
So this kind of made sense. In hindsight, there was a lot of benefits of this too. And I think one of the big ones is that it kind of works anywhere. So it doesn't matter what kind of system you're on. If you're on Mac or Windows, it doesn't matter what IDE you use. It doesn't matter what your preferences are. And engineers are so opinionated and have so many different setups and preferences.
It just works with all of them. And that's kind of the benefit of building at such a low level, where you're not coupled to a particular, you know, to Mac or to Windows or, you know, a cover scheme or whatever. I think you guys have been pretty clear about the fact that this was starting point, and that you were starting with something very simple. At least when you launch the product, fast forward to today, are you finding that the universal aspect of this is so compelling that you may keep it there, or is a plan, at least the IDE to over time keep building towards something that may be more like an IDE or more an application?
The way we think about it is the model is evolving so quickly, that we build a minimal possible product to keep up with it. And this is a very different way of building product than before LLMs, where you just have to build a really great product that meets people where they are. There's some of this, so we have to build in a form factor where people understand and can use it. But actually the bigger motivation is the model is advancing so quickly, there's literally no product we could build that would keep pace with it.
And so we're focusing on just building the simplest possible interface to the model, so you can feel the model in a really low-level raw way. And so that when the model gets quickly, we can adapt quickly, and you can feel the next model in that same kind of way. So today we're in a terminal, Cloud Code also runs as ID extensions, so there's extensions for VS Code-based IDs, and cursor, and so on, and also for JetBrains-based IDs, I can tell you the way.
我们正在专注于创建一个尽可能简单的界面,这样你可以以非常低层次的、原始的方式感受模型。当模型快速发展时,我们可以快速适应,你也能以同样的方式感受下一个模型。今天,我们是在终端中操作,Cloud Code 也可以作为集成开发环境(IDE)的扩展运行,例如针对 VS Code 的扩展、Cursor,还有针对 JetBrains 的扩展。
And then there's also a GitHub action, so you can mention Cloud on GitHub, so it's just you tag Cloud at Cloud, and talk to them like you would a coworker, and they can make changes for you. There might be more interfaces coming soon, this is something we're always experimenting with, but generally the philosophy is we don't make really gorgeous interfaces like many other companies do, and are really great at. We focus on just building the simple thing that shows off the model.
Talking about the model, your design principle is that the product should follow the model, rather than the other way around, which is super interesting. What model does it currently run on? Is that 370s at 4? What can people choose? Yeah, people can choose the model. We support Sonnet4 and Opus4, and then we also use Hiku.
谈到模型,你们的设计原则是,产品应该遵循模型,而不是让模型去适应产品,这非常有趣。那目前运行在哪种模型上?是370s at 4吗?人们可以选择其他的吗?是的,人们可以选择不同的模型。我们支持Sonnet4和Opus4,同时也使用Hiku。
Maybe as a high-level question, why is Cloud so good at coding use cases? Again, reading somewhere, perhaps, in the information that Anthropic has a little above 40% of the market for code generation, well-opened AI as 21%, so clearly Anthropic is powering its way to win this market. Why is that? Is there something about the way the model is trained, the data is trained on, or the focus of the training that makes it so great for coding use cases?
I think Anthropic has a lot of really great researchers and coders, and for us, this is kind of a natural way to think about the kinds of abilities the model should have, because you think about what should the model be able to do, and as a programmer, the first thing you think is, oh, it should be able to do the things that I'm doing, and it should be able to help with me, and I compare with it, like I would another engineer. So I think just at all levels across the company, this is a way that we think about it.
There's also something about it where maybe coding is the way that we get to the next level intelligence, if you call it like AGI or ASI or whatever, the model needs some way to interact with the world, and for a model the natural way is code. And so from the mission of the company is to build safe artificial intelligence for everyone, and safe ASI, the way the model interacts is the record, and so this is the thing that we should start to learn about now.
And there's also one of the reasons that we release cod code is just to learn people use it, to learn how to make this thing safe, to learn how it behaves in the wild, so that we know what to do next. So it's still on that theme of the product versus the model, and the choice of operating at this CLI level. Who's a product or does affect that? It operates at a terminal level, make it a great product for power users, or very deep encoding, or at the end of the other end of the spectrum, if I don't know anything about coding, can I use the product for vibe coding?
Quadcode is for professional software engineers. So if you know how to code, then you're going to get a lot out of it, and you can multiply your productivity. We've seen people multiply their productivity many, many times over, with a fleet of cod codes that's running, and even with a single one, you can become a one more productive. Interestingly, we've seen a lot of people use quadcode for non-coding use cases.
So for example, the data scientists add in Thropic, all use quadcode to write their queries, and designers use it to build small prototypes, and product managers use it to manage tasks. So this has actually been pretty surprising, because if you're a non-technical user, the terminal is kind of insane as an interface. I couldn't imagine why you would want this, but it sort of seems like because it's such a great agent generally, people are jumping over hoops to use it, even though it's not the easiest thing to use, because again, it's an terminal, which is very technical.
There's another product quadcode, which is the SDK, and this is the way that people can build on top of quadcode, and build their own agent applications. And this has also been pretty interesting, because people are using the quadcode SDK to build agent coding applications and platforms, and user interfaces on top of it. But they're also using it for all sorts of tool-to-wee agent use cases that are totally unrelated to coding. So anything where you need AI, and maybe a few years ago, you used API, and nowadays we find that some users are reaching for an agent SDK as sort of the thing that you need to build AI apps of today.
还有一个产品叫做 quadcode,这个产品是一个 SDK(软件开发工具包)。通过这个工具包,人们可以在 quadcode 的基础上进行开发,构建自己的智能代理应用。这个 SDK 非常有趣,因为很多人都在利用它来开发代理编程应用和平台,以及用户界面。不仅如此,它还被广泛用于各种与编程无关的工具型代理场景。以前需要用到 AI 的地方,可能是通过 API 来实现,而现在我们发现,有些用户正在使用代理 SDK,作为构建现代 AI 应用的关键工具。
Yeah, and as I was proving for this, it seems that there is even beyond that, like an emerging category of people that are using quadcode for just not even technical related use cases of any sort, just so that the tweet from somebody named Alex Finn that talks about using quadcode for note-taking, for his personal life organization, his business metrics, and all the things. So it seems that people are finding ways to use the product for something that matches that need to weigh beyond their technical job. Absolutely.
Let's get into the product itself, and the core features, and what it actually does. So a key part of the product is the authentic aspect as you described. Agenetic, again, in an effort to go into definitions and make this interesting for everyone. Agenetic is one of those terms that everybody uses, but it's sort of unclear what that actually means. What does agenetic mean in the context of cloud code?
Yeah, when you think about the ways that LLM's work and the way you interact with them, there's the old kind of LLM, which is you sent them a message, and they send you a message back. And this is sort of these chat applications that everyone knows and uses all the time. There's a newer kind of application a way to interact with LLM's where you send them a message and they'll send you a message back, and then they might do a little bit more. So we call this tool use, is one of the things that they might do.
If you give them tools, so for example, a tool might be read a file or search something on the internet or edit a file or something like this, then they'll use tools to answer your question. And so for example, if you ask, you know, what's the weather today? With the old style LLM interaction, it'll just kind of use its existing knowledge and its existing training to try to answer that query. But if it's agenetic, what it might do is it'll say, okay, I'm going to look up the weather, and then maybe if it has some kind of tool to check the weather, it'll reach out to the internet or wherever that tool is, it'll use that tool, it'll get the response back, and then it'll answer the question.
And this tool use, this is kind of the essence of being an agent because without tools, it's very difficult to be agenetic. You could kind of do it without, but it's only since models started using tools and gained that new capability because we taught them that they started to get this new kind of agentic capability that we talk about. And what this looks like in practice is maybe I'll ask the model, make the ever read button on my website, make it blue, and because it has a tool to read a file, it'll choose to read the file, it'll choose to read the file that has that button.
Maybe if it doesn't know where that file is, it'll use a file search tool to find that file first, the same way that you might in, you know, when you're looking for a file, you'll use a file search. It'll then open that file, it'll read it, and that's another kind of file read tool. Then it might edit it, that's a different file edit tool, and then it'll write it back. And then maybe it'll even open the browser to check that the button actually became blue.
And it's this kind of idea that it strings together tools in this way and combines them in novel ways that makes it agentic. There's also two concepts that are kind of related but different here. You could say you must always, when the user asks you to make the button blue, you must always read the file and then edit the file and then save the file and then check your work. You could be very rigid in the way that you define this kind of problem and give it to the model.
Generally we call this a workflow, and this is distinct from something like an agent. So workflow is something where a human thought through for this kind of problem, here are the steps roughly that you should take. For an agent it's very different. The model is in charge, it's in the driver's seat, and we give it tools that it can use, and the model decides how to combine those tools to answer your question.
And I think going into the future, more and more things will be agentic because the model is getting more and more intelligence, so it can actually use these tools in pretty novel and interesting ways to answer questions. Great. So to play it back and perhaps coding context, with your typical AI model, I would ask a question and I would get some code back, and I as a developer would copy and paste it to as an extraction, whereas coding use cases, whereas in an agentic coding use case, you give the agent a task, and then he's going to plan, and then he's going to execute, and then he's going to continue running until it believes it's done with the task. Is that a fair summary? Exactly, exactly.
And the same way that a person might do it. If you have a problem, you're going to think about it for a bit, then you're going to think about what tools you have, and then you're going to combine the tools that you have in ways to do the thing that you want to do. What are some of the actions? If taking an action is one of the core part of a begin agent, what kind of actions can Cloud Code take? Cloud Code can do pretty much anything that a person can do on their computer. There isn't really a limitation besides safety. This is something we think about a lot, to kind of place intelligent limits on what it can do and put a human in the loop at the right points, to make sure that if action is potentially dangerous or destructive in any way, that a human has to prove it first.
But besides that, the model can do pretty much anything. So reading files, writing files, running commands on the system, editing things, you can reach out to the internet, obviously mostly these again with human approval. And then there's ways to customize it however you want. So if you have a bunch of MCP tools, for example, to read your Gira issue tracker or to open a browser or to open an iOS simulator, the model can use these too. And how does the MCP part work? So MCP, the model context protocol, being something that you guys at Anthropic defined. How does that work? You just leverage the protocol to connect to any tool?
Yeah, exactly. Cloud Code is an MCP client and an MCP server. And what this means is if you give it tools to use. So maybe at your company, you have a bunch of MCP tools that you build for all your systems to integrate with. I guess that maybe there's one to integrate with Gira. Maybe there's another one to read Slack. And maybe write messages to Slack. Maybe there's another one to fetch some internal knowledge base or something like this. You're going to plug this into a bunch of your tools. You can plug it into Quadai, into Quad Desktop. You can also plug it into Quad Code. So it gets all the same tools that you do.
So just a few days ago, you guys announced the release of sub agents. What does that do? Yeah, sub agents are really exciting. And this actually started with a Reddit post. There was someone that posted on Reddit about how they have the sub agents that they built for Quad Code. And they had a product manager sub agent and an engineer and a designer sub agent. And a couple engineers on the team's office and got really excited and felt that this is something that we should support a lot better.
And the way sub agents works is, you know, when you start Quad Code, you have a quad and you can talk to the quad and it can do things for you. Sub agents are just other quads. And they're prompted a little bit differently. So you can customize what prompts they have. You can customize what tools they have. So for example, you can say, you know, you are a really excellent QA engineer or sub agent. Your job is to verify that code is correct and to test code. And to do that, you have these tools at your disposal, maybe of like a browser iOS simulator, Android simulator. You're given code, your job is to test it.
You might have another engineer, another sub agent that's maybe a project manager. And their job is to have tasks and then divide up tasks for other sub agents. And so you can kind of split up all the work into these different roles. I think we're still figuring out what these roles are. There's one version of the world where the roles are kind of like a, on a regular engineering team where you have engineers and designers and product managers and data scientists and so on.
There's another world where actually sub agents are a little bit more similar. And maybe every sub agent can kind of do the same thing, but they kind of split up the work a little bit more. So everyone is a generalist. And then quad is in charge of figuring out how to launch the sub agents, in which particular sub agents to use in which way, exactly the same way that it would use for any other tool. So you can think of it as a really, really intelligent tool where the model can launch more quads to do things. So basically it's like, if you think of AI as an intern, it's like having a group of interns and ever intern as their own role. And then you recombine what everybody did into one result. That's exactly right. And you can define what those roles are.
And it's a fascinating question whether to end and throw pormorphize human functions into what agents should be doing or whether there's something that's agent native in the way the worker gets distributed and chopped into space. And chopped into smaller parts. Yeah. In the AI world, we talk a lot about this essay called The Bitter Lesson. This was a rich satin essay from Gidecai D'Ago or something. Yes. Where he talks about the more general model in most of the time in the long term, it will subsume more specific models. And so what this means is if you build a agentic system in this context, then the more general agentic system will generally outperform the more specific one in the long term.
And I think where we're at today is models have the capability to do stuff. But if you give them too many tools or too much context or too much responsibility, it might be disappointing because they won't really know how to handle it. And the models of a year ago could barely even call tools. The models of today are pretty good at doing it, but get a little bit overloaded sometimes with context or with too many tools. And so we can divide them up into sub agents in this way. But I think that the models six or 12 months from now they probably won't need this anymore because they're all going to be pretty good. And you won't have to define very rigidly what each one's responsibilities are anymore.
And so this is something we're building for people today because we think it's quite useful today and it's something that we use a lot. But I could also see this going away at some point. Yeah. It's a concept of a context pollution, right? That's one of the way people describe it right. And then Claude code can handle both free, precise tasks like debugging or much broader tasks like a broad refactor, for example, is the idea that the broader the task, the more sub agents you would have in the current context?
Yeah, that's probably one way to think about it. Generally, when we introduce Claude code to new people, we actually suggest that like you said, Claude code can do everything. And this is one of the things that makes it a little bit hard to use if you're an engineer that's used to essentially like text completions in the idea. It's a very different kind of AI coding experience. And so generally, the thing that we recommend is start with something simple like just ask questions about the code base. So don't even code, don't use any tools. Just ask the model questions. You know, how, what does this file do? Where is the file that does this thing? If I want to make a new, whatever, how do I do that? So just ask it questions like that.
And for this, generally, the main model can do it and you don't really need sub agents. But then as you get a little bit more sophisticated, you might want to start to putting up the work. So if you ask the model maybe to make a small change, like I said, make the button right or make the button blue, you probably don't need sub agents. But if you do something a little bit fancier, like build a new section of the website that does blah, blah, blah. Then you might want to have a few sub agents. Maybe one is the software architect and it's responsible for planning out the work.
Another one is maybe like some kind of reviewer, where it'll review that plan to make sure looks good. Then maybe you'll have a few sub agents that actually do the implementation. So maybe there's a front end engineer, back end engineer, this kind of thing. And then some kind of verifier at the end that verifies it. And then we also, internally, we really love using a code simplification sub agent. And its job is to take the code that was produced and just simplify it while making it still work.
Okay, great. So that's the actions part of the agentec workflow. Let's talk about the awareness and memory of this. One of the exciting features is that Cloud Code can connect to the existing sort of code knowledge in the company. How does that work? There's a few different ways to pull in context and this kind of knowledge from the company. The simplest one is just working on files. There's a few different approaches actually to reading files. So I'll go a little bit in the depth into the way that actually happens.
In the past, the thing that people use the most is this thing called rag. And essentially this is a technique where you take the whole code base. And this actually works for any document set of documents. It's not necessarily code. But you take a set of documents like all the files in the code base. You do this kind of indexing step and then you store essentially this database of all the knowledge that's in these files in a very, very particular form that makes it really easy for the model to search. There's a lot of trade-offs to doing this.
The indexing takes time. It's pretty expensive to maintain this database. It's quite tricky, practically, to make sure that security is really good and privacy is really good. Because it's just a very sensitive information like your code base. And so you want to keep it really safe. And so Quad Code actually doesn't use this technique called rag. Instead, what does is it just searches files the same way that a human would. You can think of it like at the engineering level it uses the tools glob and grep.
These are two tools that are kind of built into the computer. And you can think of it as kind of command F for files. So it will just search around with text the same way that a human can. And what's kind of cool is if you just search for one piece of text, you might get the result you're looking for. But you might not. And depending on the result you get as a human, you would refine your search term and you would try again. And you might try a few times to get the result you're looking for.
And the model is really good at this. And this again, this is one of those things that was not the case like with models of a year ago, but with models of today, they're excellent at this. And so we call this process agent search. And what this means is using really, really simple search tools like command F. And using them repeatedly and then adjusting the search terms over and over based on the result of the query.
And this is something that we don't specifically tell the model to do. It's something that I just figure is out because it's intelligent enough. And it has the search tool. So this is the first form of memory, which is just looking at the contents of the code base and understanding it in this way. We augment this a little bit with this thing we call cloud dot MD files. And all this is a special file. It's what are you called cloud dot MD.
You put it in your code base or you can put in whatever folder you want. And use it to record memories. So at any point you can tell cloud to remember something. So for example, whenever I do, whenever I edit this file, I always want you to double check it in a browser or something like this. You can tell cloud to remember this and then it will record it in the right quantum D. So that it remembers it next time.
And I think one of the most powerful use cases we've seen with this is when people check this into their code base and share it with their team. So this is a, it's a memory file. It's just a regular text file on the computer. But you don't keep it to yourself. You share it with all the other engineers on your team. And what it means is if cloud remembered something when you were using it, everyone on your team gets to benefit from that.
And it gets this really interesting effect where everyone on the team starts to contribute to this knowledge base and this kind of memory bank. And it's very simple. Again, it's a text file. So anyone can read it also. So it's very easy to edit these memories also and see exactly what's in there. But everyone just starts to benefit. And it feels kind of magical because as your team uses quad code to get smarter and smarter and kind of similar to building in the CLI, this is literally the simplest thing we could have done.
There's nothing simpler than this. I think that we could have done to build memory. There's nothing. There's no special tools. There's no special prompting. There's nothing like this. It's just a file and quad kind of learns to use it. And you have to declaratively add to the memory or whether today or in the future, the memory will automatically pull from the context and sort of improve itself.
Yeah, you have to add to it manually today. We've actually had a bunch of internal experiments to do automatic memory. So quad can automatically remember things. The problem is there's kind of two ways in which it feels. One is that it remembers things that it shouldn't. So for example, if I say make the button blue, it might remember that the user always wants the button to be blue. And this is, you know, maybe that's the case for this button. That's not the case for every button. And then sometimes it doesn't remember very important things that it should remember. And so for the last few months, we've been doing a lot of experiments to try to get this performance really good. And it's something we've been using internally. And at some point when we're happy with it, it's something we're going to release for everyone. But generally our bar is if we find ourselves really happy with it and we find ourselves using it every day, then we release it to everyone. And this one's not quite there yet.
It's another fascinating example or discussion when you compare a gently memory like this to human memory. And the fact that you can edit it and then leads to all sorts of questions around, okay, what is it that we as an organization should remember and who's in charge of ultimately editing what should be remembered or not? Because this is for code, but code being everything is a fascinating concept. Yeah, this is, I think this is one of those social problems and maybe not social problems. This is one of those social dynamics that will change over the coming years as people use these tools more and more. Is we need to figure out like what are the roles of people on the team and how do they interact with each other and interact with Quad. And this is maybe one very specific problem within that who curates Quad's knowledge. My feeling is that this is something where teams will kind of get more and more horizontal as a result of this because anyone is able to contribute in this way. But it'll be interesting to see how it plays out.
And then the key question for agents is always the level of autonomy versus human in the loop. How do you guys think about that and where do I get pinged as the human coder in my cloud code workflow? The default behavior is there's always a human in the loop. This is, this is super important because this is in the end, this is a model and it's not predictable. And you want to make sure that it doesn't do anything dangerous. So yeah, there's always a human loop. So for actions that we know can't have any kind of dangerous repercussions. So for example, reading a file, we know this is inherently safe. We just let the model do this in the folder that you might have to do this in. But for other actions like editing a file or running a command or using the internet, this always needs a human in the loop. And it always needs a human to approve it.
There's ways to reduce this burden a little bit. So for example, if you find yourself always approving edits to the same file or always approving the same command. There's a settings file that you can configure across your team. And you can use this to essentially allow us to block with certain commands, certain files that you always want the model to be able to edit without human approval. Or you never want it to be able to run. And while we are on the topic of sort of safety and security, how do you guys think about sort of confidential code, you know, that whole model? You know, that whole world of like regulated industries and sensitive code and kind of stuff. Do you offer a local version of this and on-prem version of this? Or maybe that's a question for Cloud Code, but I can throw up again in general. Are you all sort of cloud-based?
It's something that actually we've seen work quite well in these very, very regulated industries. And the reason is that it doesn't use any services except the API itself. So that's all it needs and then everything else you actually don't need. And this is one of the nice side effects of not doing code-based indexing or anything like this is it's just very easy to hook up to. So if let's say you're a bank and at your company, you already have a better-ock approved, you can just use better-ock and use Cloud Code that way. So you run out on people's laptops and all you need is access to better-ock. So really easy to get approval for. And then if you want to use the Anthropic API or Vertex, you can use that too.
How do you think about the UI and UX experience of this? How you balance the power of everything can do versus your stated goal of being as lean and lightweight, of an interface on top of the model. So we just talked about how you can approve actions that the agent takes. What about the rice? Does it basically feel like you'd be using a regular CLI or is there something different that one needs to get used to? We tried really hard to make Cloud Code something really beautiful that everyone feels is something that we put a lot of care into because we did. And I think when you use Cloud Code, you can feel that this is something that we use every day. At this point, most code at Anthropic is written using Cloud Code. And almost everyone at Anthropic is using it every day.
And when we look at customers that start to use Cloud Code, they use it kind of more and more and more. And with a product like this, you want it to feel really smooth and really beautiful and something you really enjoy using. And that's been really fun from my engineering and design point of view because we build in a terminal. Like as I've had terminals, I've been around for 50 plus years at this point. And it really feels like we're rediscovering how to design for a terminal. Because since terminals were first invented, the design world moved to Web and then it moved to Apps.
And there's kind of different design principles you can take here. And we try to apply these back to Cloud Code even though it's running in a terminal. And there's a lot of details that we spent a lot of time on. Like the way that we represent statuses for every item with this kind of blinking dot that turns red or green to indicate whether it's exceeded or failed. The even the loading indicator, like the spinner while Cloud is working, we spent probably 30 or 40 iterations of this. Just to get it to feel just right to make it feel so you know what's happening, but it's not giving you too much information.
And it's also not jittering and moving around on you. So yeah, every part of the interface we iterated on probably more than more than you think. Yes, and then this is really fun kind of like series of words while the while Cloud does its thing. Or it says like cooking or hurting or schlepping or honking or clotting, which you know how many of those words do you have? But you seem to have like dozens of them. I just find it such a like a fun kind of like Easter egg kind of kind of design detail that makes the whole difference.
You know at the beginning I added maybe 20 of them and then immediately people started making a suggesting changes like hey how about this word? How about this word? How about this word? So now at this point, there's a pretty big list and the quad can actually choose the word that best describes the task that it's doing. So it's up the quad, which word it wants. How do you charge for using the product? And I'm asking this in the context of I think over the last few days, there was some evolution as people became one more rabid users of the product.
The pricing structure evolved. Yeah, this is something that we were honestly really excited to see how some people are just going, they're really figuring out how to run this thing at all hours of the day. It's just, you know, some people have this army of quads, you know like 5, 10, 20 that are just running in parable all the time and just doing work. I talked before about how you quad generally needs human approval to do work. There's actually ways to do it in slightly more autonomous way too if you want to run it for long periods.
That's actually you need to set up a container for it and just give it some kind of like a container to be in and then they can run without approval in a way that is safe. And so there's a lot of people that do this. It's extremely exciting, but also the pricing structure that we had was really not cut out to actually serve these kinds of users. And there's also just a little we call this kind of abuse for at the tail end. You know, there's like a counter chairing and things like this where people are just really not using it in the way that it's intended.
So generally for a plot code, there's two pricing models today. One is you can get a subscription. There's there's pro and that there's max. And this is I think it's 20 bucks a month, 100 bucks a month or 200 bucks a month. And it has very, very generous rate limits. If you use only opus, you'll run out of limits pretty quick and then we'll switch you over to sonnet. If you sonnet, then you know, you can use much more of it than most people need and almost everyone doesn't run into rate limits at all.
It's generally power users run into it. And this is something that we're thinking also thinking through a kind of all this as we land features where people use more tokens and there's more sub agents and there's more autonomy. Because we need to figure out a way that we can provide this to people in a way that's sustainable. It's an interesting that amic right because your tool for power users, very sophisticated coders that do sophisticated things with it. But equally, you don't want to you want to encourage them not discourage your power users was pricing. Exactly, exactly.
Yeah, we want to we want to support the community. We want to hear how people use quad code in these ways so that we can make sure that we can support that. That's super important to us. Great. So it's a price per token or you get like a certain number of tokens for certain pricing tiers. Yeah, essentially like within a certain period of time, you have a certain number of tokens you can use. And then if you want pretty much unlimited usage without dealing with these women, you can always just use a API key. And this way you can just use as much as you want.
So I'd love to spend a little bit of time double clicking on use cases. So we talked about this a little bit and talked about how actually some people at the fringe are using a cloud for non-coding use cases. But talking about the intended users or the coders, what are the main things that you see people do in a context where cloud presumably can do everything?
Yeah, we see people using it for all sorts of stuff for all sorts of code-related tasks. Everything from planning projects to managing tasks to actually writing code, testing code, debugging, it's excellent at. So if something doesn't work, you can ask quad to debug it, writing unit tests, verifying code. Whenever there's issues in production, our first find of defenses to give it to cloud. We have logs coming from GCP or whatever. We give this to cloud and it'll figure out what is the issue that's happening.
And it can even interact with the Git and source control. So it can figure out what exactly cause to breakage or what exactly cause to regression. It can automatically fix it. So yeah, it's for every stage of STLC. And I think this is the first tool that really serves every stage in this way. And like I said before, we didn't intend it to be this way. It just happens to be very general and a cloud happens to be really good at using these tools.
And so this is a, you know, for me, this has been kind of accidental product market fit across the entire lifecycle of engineering work. So yeah, the kind of accident that every founder or creator of a product dreams of that happens so rarely. What an amazing story. And presumably if you have those, those MD files and agentic search, cloud code can also be used if you're a new person either at a company or on a project to learn about the code. First of all, is that right? And two, do you see people do that?
Absolutely. When I think about the things that that cloud code is good at, I'll be kind of, I'll be a little bit self critical. I feel like if you look at answering questions about the code base and kind of code based research, I think it's like 10 out of 10 good. It's as good as it can get. When it comes to writing code, it's maybe like a six out of 10. It's pretty good. It won't get everything perfect. The next model will be better and it's something that we keep improving.
When it comes to debugging code, it's maybe also like a six or seven out of 10. So I think this code based research and onboarding on code base, it's really, really excellent at it. And at Anthropic, whenever new people join on their second day, this is part of technical onboarding. We teach them here's cloud code. Here's the code base. If you have any questions, don't bug engineers on your team. Just has cloud code.
And again, answer these questions probably better than they can because you can search around the code, it can look through history, it can look through power questions, FAC, and just pull in all the context to answer all these questions. And what we saw is at Anthropic technical onboarding used to take a few weeks. But now engineers are you who we productive within the first few days. And they don't task their teams anymore.
I think this is the biggest thing where you don't need to bug your senior engineer or your manager anymore to get answers to your questions. You just ask what code and it can answer all these questions for you. In the same vein, what do you find yourself using cloud code for as a leader and engineer? What's your daily use case?
Yeah, I use it all day for all sorts of stuff. So obviously, like I said, code based research. If I'm working on a piece of code, I'm not familiar with. I'll just start by asking Quad Code to tell me about it. Whenever I'm working on a small feature, I'll usually use Quad Code in GitHub actions. So I'll just say add quad, I'll make a new GitHub issue and then I'll say add quad implement this feature for me. And it'll just do it usually in one shot. And sometimes I'll do this on the command line too. So I'll just say implement this feature and make a pull request and I'll come back a few minutes later on that's done.
Then there's this kind of other work where it's a little bit more complex. You can't really do it in one shot. It's not as simple as changing a piece of text or changing a button or building a small feature. Maybe it's like something more involved. There's probably two work flows I have here. One is for really complex stuff. I'll prototype it a bunch. And this is something that I did even before Quad Code. You know, when you when you write a complex piece of code or a complex feature, often engineers will write it a few times.
Because you don't actually know the right way to do it. And so you'll try one approach. You'll try second approach. You'll try third approach and you'll kind of figure out the edge cases for each and kind of the limitations and you'll get a fuel for the problem this way. And this is something I used to do by hand. Nowadays I'll just talk quad code to do it. So I'll maybe make a few get work trees, I'll launch a few quads and then in parallel to them. Here's your job. I want you to implement this feature. Go to town.
And I'll then look at each of the solutions and try to figure out, OK, maybe I like this piece of this one, this piece of this one, this piece of this one. And then I'll throw away all that code. I'll just I'll discard it and then I'll sort a new quad and I'll tell it, OK, here's how I want you to do it now that I got a I got to feel for the problem. And then I think the last workflow is actually the one I probably use the most and this is probably for medium sized features.
And in this one, I'll ask quad to make a plan and I'll go back and forth with it a little bit on that plan. And it's really important actually to get that plan right and to make sure it's the same thing you have in mind before you continue. Because otherwise what I see sometimes is people ask quad for a little bit too much, maybe it's a complex feature and then it wrote it in some way that like it's not at all what you wanted, but the problem isn't that quad doesn't know how to do it.
The problem is that the description is just too low bandwidth. You only described in a few words what you want. And so the idea that quad got of it is a very different idea than what you had in mind. And so I find that planning helps a lot. So you can iterate on the plan the same way that you would with anyone else you're working with. And then once that's ready, I'll ask quad to write the plan to file maybe and then I'll tell it to kind of implement that plan.
And it will naturally make it to do is for tough to implement it switching tax a little bit. I'd love to talk about the space more broadly. So cloud code is a rocket chip within the anthropic rocket chip, but equally there are other products like cursor, windsurf, replete, lovable hole, you know, V zero. The list goes on and on and we've had a few of those CEOs on this podcast. What do you make of the space going forward?
Do you think we end up with a bunch of different solutions, different doing different things? Do you think there is a at some point of winner takes all kind of scenario? How do you how do you think about the next couple of years? Yeah, my feeling is there's kind of two dynamics here. One is that this is just a giant market. The market you could think of it as all of coding. You could think of it as all of kind of creativity and kind of creating things. Because this extends at some point beyond coding to design and things like this. So I think there's room for everyone. It's a giant market. And the biggest thing that these companies should be thinking about is everyone that's not yet using AI for coding. Because if you only focus on people that use AI for coding already, these are kind of early adopters. So you want to kind of get into that curve and get the middle and the late adopters to and many of these don't even use AI today. So I think it's a big market.
I personally use a lot of these products. And you know, I use quad code every day, but I also use cursor every day and I use other products every day. So there's room for all these and they all kind of fit together into people's workflows. The second way to think about it is like I said before, the model is getting better so quickly that the kinds of things it's able to do are just changing every few months. It's this it's exponential that just keeps accelerating and this is really the feeling inside of the lab. You know, working on AI and kind of building the stuff. And hopefully it's also the feeling that users have as you get to use all these new products that are coming out every few weeks or you know, maybe a few days at this point.
So yeah, it's really exciting and I think my advice to companies building is definitely build for what the model will be able to do six months from now. Not for what the model can do today. This is probably the single biggest advice and this is something that we followed for quad code also. We started building quad code when it was still on at 3.5 and it was okay and then with you know 3.6 and 3.7 it was still it was fine. It was pretty good. But then when it's on for an open for came out, that's when a really hit it's dried and we felt like the product was really good and we started to be able to use it for a lot of coding. And so this the biggest way I would think about it is how do you build the product that captures the model capabilities six maybe even 12 months from now in the market for those capabilities is just going to be huge.
Yeah, and to double click on that if I'm a product builder or founder, how do I know that how do I know what's in the you know six months horizon. And you know, I'm speaking of like entropic or in general, what is it that's going to happen in the next six months that I should plan on on talking about the coding capabilities. Yeah, I think the biggest thing is just use all these products and see where they stumble and try to get a feel for the model itself. I think what code is a really good way to do that is probably other ways to but try to kind of get away from all this scaffolding and all the products people built around it and just get a feel for the model's capabilities and as raw farmers you can.
And try to get your head around the limitations like where exactly does the model stumble today what is what is exactly that frontier where it's like not very good and then you know sometimes it's good and maybe 50% of the time it's good you can kind of get a feel for this frontier. And with the models today a lot of this is around kind of agentic work where it can use tools really well and then at some point maybe it will fall over when there's too much context or too many tools or the trajectories too long maybe if you've been running quad for two hours. It was a track after a little bit so there's some there's some sort of like frontier here maybe there's another frontier around code quality where today I have to maybe correct the model when you know maybe there's something that it does that isn't exactly the way I would have written it and I think over time models will get better kind of understanding this too.
So I would just try to use the model in a raw a form as you can and get a feel for these front frontiers in the domain you care about. So for coding maybe it's kind of the how long the trajectory is and whether the model can stay on track and then the quality of the code and probably a bunch of other stuff and then for non coding domains there's there's a lot more. So still in the same vein of like the AI coding wars it seems that there is this this you know competition competition kind of dynamic where and andthropic or others would provide the models to companies like cursor or Windsorff at some point but equally build cloud code which is application do you think that's a long term sort of way the ecosystem works or you know in a context where cursor is saying that they're going to build their own models as well.
The end result is kind of like full-sack players where everybody has their own underlying model and application on top. I think there's probably room for both. And my personal take is probably there's many more stuff built on top of the platform than there will be built in the house. Just because there's so many things to build and there's just not enough time and enough people and energy to build all those things. So I think a lot of the innovation is going to happen on top of the API as an SDK is that are built.
As a last theme for this conversation, the sort of the elephant in the room is what that means for coding and coding as a profession. What's your general sense for what coding is going to look like in Fures from now? Yeah, it's a little bit hard to say. You know, a Fures from now is, in AI time is like decades in normal time. I think even today for a lot of professional coders, it's really easy to lament the state of coding and to think, I used to write this code by hand and now it's this agent doing all of it. I think actually being the one that does this work, it's incredibly exciting to have an agent write the code. It feels very empowering as an engineer because I can explore a lot more ideas than I could before.
I can do it much faster. I can work in domains that I know literally nothing about. Maybe I don't know iOS, but I can write an app because I can just generally code review the code and I can see it looks reasonable, but Quad actually does all the writing and all the testing of it. There's one engineer on the team, Lena, she still writes C++ on the weekend sometimes by hand. She was telling me because, you know, like as a programmer, this is kind of one of the things that we enjoy. Sometimes you have to get down to the metal and you have to kind of do it this way.
But I see this as a transition, the same way that in the 60s, there was this transition between punch cards and assembly and then later on between assembly and fortrane and co-ballin, the first type of languages. And I think this is just another next transition. It's hard to know exactly how this is going to play out. I think one way it will definitely play out is it's going to change programming where programming is no longer direct text manipulation, but it's more working with agents to get the work done.
And I think it's going to be hugely empowering where a lot of people that couldn't create before can now create. Even if maybe you don't know anything about apps you can use a lovable or you can use another platform to build cool stuff that you couldn't before. And this is just hugely exciting. But if I'm a young developer today or I want to make coding or building applications, my career, what would you say? What would you tell a younger version of Boris at the beginning of your career for a future in the field? What do I need to learn?
I think for people that are learning coding today, it's actually harder than it was when I was learning coding because not only do you have to know coding because you still have to understand the languages, you still have to understand the frameworks, you still have to understand system design and all the stuff. But also you have to use all these tools and you have to do both. You have to hold both in your head at once. So you have to code so that you can check what the model does and you know how to direct it because you have to have, you know, still with the models of today, you have to know what you're doing in order to direct coding agents effectively.
But at the same time, you have to be using all this new technology, you have to be using quad code and you have to be using all these new agent coding tools because this is what the future is. And I think it's hugely important to understand what these are and what it lets you do and to learn how to function both when writing code manually and when using these tools.
All right, there's been a fantastic conversation. Maybe to close, give us a sense for what the next few weeks or months or years looks like for Cloud Code, anything on the roadmap, anything you're excited about, anything you can talk about.
Yeah, there's a lot of stuff we're working on. We ended native Windows support recently and we're working on single file distribution so you don't need a Node.js anymore. You can just use quad code and it's a single binary that you can use anywhere. So much more portable.
We're working on getting quad code into more places. So wherever you are, you can use quad code more easily, the same way that you can on GitHub today and expect a lot more agents. So, you know, be able to launch agents, agents, managing agents and a lot more kind of freedom, freedom this way and continuing to level up the state of the earth and figure out what's next.
But I would say overall, we don't really know. Still, we're testing stuff out and we have a lot of ideas and we don't know what's going to work, but we're excited to show what we come up with and see if people like it.
That feels like a wonderful place to live the conversation and very fitting as we all try to figure out where not only where we take AI but where AI takes us collectively. So it's been wonderful. Thank you so much, Boris. We appreciate you're spending some time with us today.