Life Is But A Stream

Ep 13 - The Future of Coding: How Cursor and WarpStream Power AI Productivity

Episode Summary

Cursor is redefining what coding looks like when AI and humans work side by side. In this episode, Alex Haugland explains how WarpStream and data streaming keep Cursor fast, private, and scalable while powering better AI productivity.

Episode Notes

Software development is changing fast. With Cursor, Anysphere is building an AI-forward IDE that fuses human creativity with machine intelligence. At the heart of this transformation is data streaming—making it possible to train models responsibly, deliver lightning-fast Tab completions, and scale telemetry without breaking engineering velocity.

In this episode, engineer Alex Haugland shares how WarpStream gives Cursor sovereignty over user data, how telemetry and accounting pipelines strengthen product decisions, and why “coding is really just a bug” in how we interact with computers. 

For senior leaders evaluating AI and streaming strategies, Cursor offers a blueprint for balancing innovation, cost, and trust.

About the Guest:
Alex Haugland is an engineer at Anysphere working on Cursor. He is a massive-scale distributed systems performance engineer who is enthusiastic about understanding complex technical systems and organizational systems, and about mentoring junior engineers.

Guest Highlight: 
“Our business is built on top of data, and a lot of it is data that people have been really generous to allow us to use… WarpStream lets us do that in a way where the data sits entirely inside stuff that we control and have access to. Being able to say, ‘All this stuff is locked down on S3; we have control of it,’ is really, really important to us.”

Episode Timestamps: 
*(00:45) - Anysphere, Cursor, and Data Streaming Strategy
*(04:05) - Data Streaming Goodness
*(21:30) - Beyond the Stream
*(32:00) - Quick Bytes
*(34:45) - Joseph’s Top 3 Takeaways

Our Sponsor:  
Your data shouldn’t be a problem to manage. It should be your superpower. The Confluent data streaming platform transforms organizations with trustworthy, real-time data that seamlessly spans your entire environment and powers innovation across every use case. Create smarter, deploy faster, and maximize efficiency with a true data streaming platform from the pioneers in data streaming. Learn more at confluent.io.

Episode Transcription

0:00:00.2 Alex Haugland: That's just like data go in, data go out. It's exactly kind of the abstraction that you want out of Kafka. Like you pay somebody a small amount of money, you get to keep your data. You don't have to think about it.

0:00:10.6 Joseph Morais: That's Alex Haugland, an engineer at Anysphere. But you may know them better as the developers of Cursor, the AI-powered IDE. As always, we're diving deep into data streaming, from WarpStream by Confluent helping Anysphere keep data private to delivering ultra-fast and ever-improving AI-powered user experiences, all while scaling through the success of their product. I'm Joseph Morais, technical champion at Confluent and your host. This is Life Is But A Stream. Let's get started.

[music]

0:00:44.6 Joseph Morais: Well, thanks for being here, Alex. Let's jump right into it. Tell us about Cursor.

0:00:49.3 Alex Haugland: So Cursor is an AI forward IDE, or put differently, it's our attempt to figure out what software development should look like for the next 10 years.

0:00:57.4 Joseph Morais: Nice.

0:00:57.9 Alex Haugland: We're trying to explore this combination of what people can do and what machines can do and how the two of them together can be stronger than just one on their own.

0:01:07.0 Joseph Morais: I like that because I have had a couple of guests that are really focused on AI and one person contextualized it. They're like, we have an AI workforce and a human workforce. And I was like, wow, that really is the future. And Cursor is like a way to bridge that. How do I empower my humans with AI goodness? And Cursor is a great way to do that. So my understanding though, Alex, is that the company's name is actually Anysphere. Is that right?

0:01:32.9 Alex Haugland: That is correct, yes.

0:01:34.1 Joseph Morais: Okay. So we have this interesting relationship where everyone thinks that we're Confluence, which is weird because that's not even the name of the company, it's Atlassian, and that's just one of their products. But we're Confluent and that's our company name.

0:01:45.8 Alex Haugland: Wait, you work for Atlassian?

0:01:47.8 Joseph Morais: No, I don't.

0:01:48.8 Alex Haugland: You just said he worked for Atlassian.

0:01:50.5 Joseph Morais: But that's interesting. So even though your company name is Anysphere, everyone knows you for Cursor, so everyone probably just calls you guys Cursor.

0:01:58.3 Alex Haugland: Yeah. It's like some news outlets will refer to us as Anysphere, but I think that's just because somebody there is a stickler, like we don't actually care.

0:02:06.8 Joseph Morais: Got it. Well, I guess that's the success. That's the result of being so successful. You build a great product, everyone wants to call you by it. So, I think that's a good problem in your case. So tell me, who are Cursor or Anysphere's customers?

0:02:20.0 Alex Haugland: So it's a surprising mix actually. Like, it's a really large cohort of what we call pro users, which are individuals who've chosen to purchase a Cursor subscription and enterprises. So like large enterprises who have some contract with us through sales and all that. There's more enterprise than I would have expected for a company of this size.

0:02:38.1 Joseph Morais: Interesting.

0:02:39.3 Alex Haugland: But also there's a lot of individuals out there. It's just like it's more than I would have expected on both sides, which doesn't really make any sense, but I don't know how else to explain it.

0:02:47.2 Joseph Morais: Well, I think it's exciting, right? It's exciting tech and it's exciting for people that have done it for a while, like your Enterprise customers, and also people that are quote-unquote "vibe coding." Right. Like, they now see artificial intelligence as removing barriers to entry to what I think is going to be essential for so many people moving forward. Just knowing how to turn your desires into code so that you can get computers doing your job or doing something for you is a really powerful paradigm.

0:03:14.1 Alex Haugland: I think one way that we like to look at it is that programming is kind of a bug. The fact that it's so hard to tell the computer what you want it to do is not a feature. It's not good that, I don't know, if I want to sort all my emails that I got today by the last letter of the thing, I've got to write a script and figure out a bunch of APIs and authentication. None of that is good. None of that is what people wanted when they set out to make computers. And that's just how it's been this whole time. And a world where you can just say, computer, make a script, and it just does that, is like the dream we've always had for computing. If you go back to the original Star Trek, it's like you just talk to the computer. You're not sitting there typing Python into the Enterprise.

0:03:54.9 Joseph Morais: That's exactly where I was going with it. I'm a huge Trekker and they did everything through natural language. They just told the computer what to do and it did it. And that is what we should be striving for.

0:04:05.1 Alex Haugland: Yeah. 

0:04:06.0 Joseph Morais: That's interesting. I didn't think Star Trek would come up on this. That's great. 

[music]

0:04:19.0 Joseph Morais: So we set the stage. So let's dive deeper into the heart of your data streaming journey. So Alex, this is a very interesting episode for me. So, you guys are customers of WarpStream. 

0:04:27.8 Alex Haugland: Yes. 

0:04:30.1 Joseph Morais: And this is the first time having a WarpStream customer on the show. So first for the audience, WarpStream is this incredible technology. It's built on top of object storage. So think something like S3 on Amazon, on AWS. And they built these incredible agents that basically talk the Kafka protocol and then write to object storage. And they were a competitor of ours until we acquired them, and they filled this amazing gap that we had. So we have software that's self-managed called Confluent Platform. You can run that in the cloud or in your data center or the edge, wherever you want. We have a fully managed product called Confluent Cloud. And then we were missing something that would fit BYOC or bring your own cloud. And that was what WarpStream was for us and what it has proved to be for our customers like Cursor: it's this amazing kind of lower-cost BYOC alternative where you trade latency for cost, but then you could do things at massive scale. So, I think I categorized that right, Alex? Right?

0:05:31.6 Alex Haugland: I think there's a couple other things that I think we think are important about WarpStream.

0:05:35.3 Joseph Morais: Cool, I want to hear from you.

0:05:37.2 Alex Haugland: Yeah. So one of them, like you mentioned latency, the latency on WarpStream is actually pretty good. I think a thing that Richie, one of the founders of WarpStream, explained a while ago is that most people don't really need 100 millisecond, 10 millisecond Kafka latency. It's just not important. At least for us, we could use 10, 20, 50-second Kafka latency. We would never care. We're mostly using it for slower stuff. And if you're willing to accept just a little bit of latency, it makes so many things so much easier. You no longer have to worry about provisioning compute and storage together. You don't have to worry about scaling your stuff. You just get this magical, like, WarpStream put stuff in S3, and I have some number of WarpStream nodes, and I just don't ever think about it. In exchange, latency is 500 milliseconds. Oh, my God. That doesn't matter. But the thing that we actually, I think, value a lot more about WarpStream isn't so much the way that it scales or any of the performance stuff or even the cost particularly. It's the fact that the way that it handles the data means that at the end of the day, it's in our cloud. So all of the data is stored in S3 in a place that we control, and our business is built on top of data. And a lot of it is data that people have been really generous and allowed us to use. If you're a pro-subscription person and you sign up for Cursor, one of the things you'll be presented with immediately is, do you want us to be able to train on your data? So when you press tab in Cursor, we see if you accept the suggestion or if you don't, as the most basic example.

0:06:58.3 Alex Haugland: And people allow us to do that. And that is something that we do not take lightly. You don't have to do that. We are not paying you lots of money to do it, but we hope that people think that if they let us have their data, then we will train on that data and we'll make a better product. And that's the promise that we try to fulfill is like, if you let us look at your data, we will train tab on it. We'll make an even better tab model that you guys can use. But we think that that also bestows upon us a really serious responsibility to not leak the data, not do anything weird with the data, not let anybody else see the data, not let us see the data where possible. We want to honor that trust that people are putting in us and really only use it to further the thing that they've given it for, which is building the best model we can serve to our users. And so WarpStream lets us do that in a way where the data sits entirely inside stuff that we control and have access to. And so we don't have to worry about like, oh man, if our data provider gets hacked, what are we going to do? All of people's stuff is out there. It's a huge, huge problem. It's a betrayal of our users' trust. It's a serious thing.

0:07:53.9 Joseph Morais: Absolutely.

0:07:54.6 Alex Haugland: And being able to just say, all this stuff is locked down in S3, we have control of it, there's just nothing going on there, is really, really, really important to us.

0:08:02.4 Joseph Morais: So that's interesting. There's a couple of things I want to unpack there. So first, I think it's very important that if we are going to use our customer's data, that we're upfront about it. And you guys are extremely transparent about it. It's like one of the first things people set is like, do you want to train? Do you allow us to train on your data or not? And then once you're transparent about it and you get that buy-in, the next step is what you just said is like, you have to take that data seriously. They opted in. They're helping you make your product better. We have to make sure that we can secure that. So having that sovereignty over your data is another great differentiator of WarpStream. And I guess that really begs the question, what specific use case are you utilizing data streaming through WarpStream for, as it pertains to Cursor?

0:08:45.1 Alex Haugland: So we've got a number of things. Like I alluded to earlier, one of the primary ones is for stuff like tab, where it's like, hey, people press tab a lot, as you might be able to imagine. And when they have privacy mode turned off, we will collect the aggregate data there and try to say, okay, are we serving good suggestions to people? Is the latest iteration of the tab model good? Or is it really hurting our metrics and is it bad? And then, okay, how do we figure out why? We don't want to be sitting here serving people a really bad tab model and then sending an email survey being like, hey, do you like the new tab model? That's ridiculous, right? We want to be able to figure out as quickly as possible if we've made a mistake so we can get you a better model as quickly as possible.

0:09:21.8 Joseph Morais: Of course.

0:09:22.4 Alex Haugland: And so WarpStream is core to that story of pulling in the analytics of how people use tab, seeing what it does, and then feeding into our ML pipelines to make sure that we were able to build the best tab model that we possibly can. And the other broad categories are more general telemetry. When you're using the agent and you have privacy mode disabled and you have a bad request, like, why? Like, how do we store enough data to be able to look at that and go, okay, the user didn't have a good experience here. What's going on? And that's not a particularly latency-sensitive thing, but it's just important that you capture it. And then the third broad category is for accounting purposes. If you send us a request, we want to be able to charge you for the request. It's a fairly important thing for the business. But it's not usually super, super, super latency-sensitive. It's not usually mission-critical that if you say, hey, you can have 500 requests, that you can only have exactly 500 requests or if you can spend $10, you can only spend exactly $10. You want to be pretty tight on it, but the important thing is that you don't let people go wildly over. It's not so much that, oh, we want to crash because we're not 100% certain if you're at $9 or $9.99.  Okay, you want to kind of err on the side... 

0:10:22.7 Joseph Morais: You've got to have some tolerance. Right?

0:10:24.4 Alex Haugland: Yeah, you want to have a little bit of tolerance. And so we find that that lines up pretty well with the guarantees that we get out of WarpStream, where 99.99% of the time, it's super fast and it's dead on. And if it's a little bit delayed, it's fine. We'll give the user a credit or whatever.
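
The tolerance described above can be sketched as a soft spending limit. This is only an illustration under stated assumptions, not Cursor's billing code: the `Budget` class, its field names, and the cent amounts are all hypothetical. The idea is that usage events arrive from the stream with some delay, so the check allows a small, bounded overshoot instead of failing hard at the exact limit.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical soft spending limit for one user."""
    limit_cents: int          # what the user agreed to spend
    tolerance_cents: int      # assumed slack for usage events still in flight
    recorded_cents: int = 0   # spend confirmed so far by the delayed pipeline

    def record(self, cost_cents: int) -> None:
        """Apply a usage event once it arrives from the stream."""
        self.recorded_cents += cost_cents

    def allow_request(self) -> bool:
        """Allow requests until confirmed spend reaches limit + tolerance."""
        return self.recorded_cents < self.limit_cents + self.tolerance_cents


budget = Budget(limit_cents=1000, tolerance_cents=50)
budget.record(999)
print(budget.allow_request())  # still allowed: 999 is under the 1050 ceiling
```

The ceiling keeps anyone from going wildly over, while a late-arriving event near the limit does not block or crash the request path.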

0:10:34.7 Joseph Morais: Now, the latencies for telemetry and for accounting, I assume, are these things happening asynchronously outside of the experience? Because I do imagine any added latency to something like predicting a tab, that would be bad for the user experience. So is that hidden in the things that you're doing...

0:10:51.8 Alex Haugland: Yeah, yeah, yeah. All that stuff is completely off the critical path. We're not waiting to show you tab suggestions entangled with... 

[overlapping conversation]

0:10:58.0 Joseph Morais: Right. You're not waiting for that to come back from WarpStream to deliver that?

0:11:00.4 Alex Haugland: Yeah. 

0:11:01.8 Joseph Morais: Right, okay. 

0:11:02.4 Alex Haugland: It's like totally parallel, separate pipeline.

0:11:03.7 Joseph Morais: Cool. Okay. And that makes a lot of sense. So now you can give that razor-sharp, ultra-fast user experience, but still tolerate this slightly higher latency in the backend to serve that telemetry and accounting use case.

0:11:18.2 Alex Haugland: Yeah. Because we want it to be correct within the order of seconds, and that's fine.

0:11:22.4 Joseph Morais: Now, tell me about what Cursor's objectives are, right, or Anysphere's objectives. What were the objectives building the initial set of Cursor functionalities, and what are the challenges and objectives today?

0:11:38.0 Alex Haugland: So I think if you want to go to the super high-level goals, it's what I said upfront about we want to build this fusion of computer and person, like, trying to be the best engineer that you possibly can. If you want to be a little bit megalomaniacal about it, we want to give humans better things to do. We don't want to spend our time typing Python. We don't want to spend our time debugging bash scripts. We want to work on more interesting problems. We want to do more interesting things. And to be able to do that with computers requires leveling up the abstraction with which we interface with them. And that's the high-level mission thing. But I think there's a bit more in the tactical sense where it's we're trying not to be... Like, we're not trying to do this in a super philosophically like it must be done this way sort of sense. We're trying to say like what is the best we can do with the technology available to us today? So given that the way that the models work, given the way that people interact with them, what is the best way we can put this together to make it useful to people? Let's not try to build a product that requires a model that doesn't exist. Let's not try to put a model in a place where it doesn't belong. Let's try to take everything that we have to us today and try to build the best thing we can, like, right now today with those things knowing that like in six months the best thing might be different. We're just trying to say, what is the best thing we can do right now today for everything?

0:12:49.5 Joseph Morais: Yeah. So I've always lived by the idea of done is better than perfect. I think perfect is hard and it has diminishing returns, right? I think it's important that you're taking that view because like you said, we're in this unprecedented time of software...

0:13:03.7 Alex Haugland: I've been saying that for a while now.

0:13:05.5 Joseph Morais: Yeah. But you know what? And it's true, but it's really next level now. I mean, Cursor is an example of it. The fact is you can take someone with almost no experience and they can... And build something. It may not be great, but they could do it. We've never really had that. We've had that for websites and like Wix.com and Squarespace. The idea of people writing HTML right now is unheard of, but that was a reality, right? And I think there is this reality and it's probably not that far away where people don't type into their IDE. It's all human language. It's all natural language. And I think that makes the possibilities insane because now anyone could be a developer and any idea could actually catch fire. It's really insane. 

0:13:46.9 Alex Haugland: The way that I kind of like to think about it is that there's a lot of problems that people spend their time on in software that are not actually hard problems. There's nothing philosophically difficult about them. 

0:13:56.7 Joseph Morais: Yes. 

0:13:58.4 Alex Haugland: The thing I said about like I want to sort my emails differently, like it's not a difficult thing to do. There's not really like an inherent complexity to like I want to sort my email titles. But the thing wasn't made to do that. And so making it do that is actually hard. And I think there's a ton of tasks throughout our lives where tools like Cursor are able to make those things so much easier. Like you can build little utilities. You can build little one-off scripts where it's like I wouldn't normally bother with this. It's going to take me like a couple hours. But now it's just like Cursor make me a script to like tidy up my to-do list. And you can just do that and you don't need to be an expert. And it doesn't matter if it's like 100% bulletproof mission critical. It's a silly little Python script. The barrier to entry on that is so low now. And I think that's wonderful.

0:14:32.5 Joseph Morais: And you know, we actually have a quote that's from one of the co-founders of WarpStream, Ryan, about the usage of Cursor at WarpStream at Confluent. And it's kind of this amazing kind of like self-fulfilling Ouroboros, right, where their engineers are improving WarpStream, which you guys are using, which feeds the telemetry and accounting, which feeds the betterment of Cursor, which then makes their job easier. Kind of impressive, right? If you think about that.

0:15:03.3 Alex Haugland: Yes. I think about that a lot because like you got to remember, we're using Cursor to build Cursor.

0:15:08.3 Joseph Morais: Yeah. I didn't even think about that. Yes. And you guys are doing that. That is so weird. So what was the idea there before Cursor was Cursor?

0:15:17.5 Alex Haugland: It's a fork of Visual Studio Code, right?

0:15:19.5 Joseph Morais: Oh, you know what? I guess I never realized that that was built on top of that. So that makes sense. Okay. Can you take me through how you approach scaling telemetry and accounting to increase the load? I know that you guys have grown. And for those non-privacy users or those privacy users, I know that they're hitting those systems differently. But what did scaling look like for that?

0:15:42.5 Alex Haugland: So in the case of like the WarpStream side of things, right, you just don't think about it. It scales up and you don't think about it. It's like, I think we've spent zero hours thinking about scaling WarpStream in the last few months. It's just like not... It's a solved problem. The servers in front of WarpStream that ingest the data and then enqueue it, they just hang out. We get more of them. It's not a big problem. It's not interesting. 

0:16:00.3 Joseph Morais: Okay. 

0:16:01.3 Alex Haugland: I think the part that's actually been more of a bit of a struggle is what happens like after the data leaves WarpStream and we run it to a database. And then it's like, okay, but like do you want to write this to a different database? Do you want to like reshuffle it in some way? We've been using some managed non-Confluent services to do some of those things. And they've been giving us some teething problems. But those are probably due for a rewrite in the next couple weeks.

0:16:22.3 Joseph Morais: Now, with these systems, I know you guys started with WarpStream, was there like a tipping point to implement data streaming for telemetry and accounting, or was that like a no-brainer? We know we're going to use data streaming and we just got to figure out what flavor of it.

0:16:35.6 Alex Haugland: I think it was just kind of like a how else would you do it sort of thing. Obviously, you're not just going to take the data and send it to your server and your server's gonna write it down on disk. That's an insane thing to do in the cloud era. And you don't really want to write it straight to a database necessarily, because like, at least a lot of the managed databases and cloud databases that we've used, like they've got some occasional sadnesses, especially if you're doing a whole bunch of unbatched random writes. Like, you don't want to just like have your database going like, here's a bajillion writes all the time. They're not batched, they're not sorted, they're not grouped. It's just like, here's a giant cacophony of garbage, basically. And so at minimum, having something like WarpStream or Kafka to like take all of the telemetry you're getting and do a little reshuffling, batching, sharding is like a really nice thing to have. And then you get all the availability benefits of like having it buffer the data, keep it in object storage, like, oh, if your database is down, it's fine. WarpStream's got it. Like, as long as we can get it to WarpStream, we got it covered.
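
The reshuffling, batching, and sharding step mentioned above can be illustrated with a small in-memory buffer. This is a sketch under assumptions, not how WarpStream or Bento work internally: the `BatchingBuffer` class, the `(shard_key, timestamp, payload)` event shape, and the `flush` callback are hypothetical stand-ins for whatever sits between the stream and the database.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical event shape: (shard_key, timestamp, payload)
Event = Tuple[str, int, dict]


class BatchingBuffer:
    """Groups events by shard key and hands the sink sorted batches."""

    def __init__(self, flush: Callable[[str, List[Event]], None], max_batch: int = 500):
        self._flush = flush          # called once per batch, e.g. one bulk insert
        self._max_batch = max_batch  # assumed batch size; tune for the real sink
        self._pending: Dict[str, List[Event]] = defaultdict(list)

    def add(self, event: Event) -> None:
        key = event[0]
        self._pending[key].append(event)
        if len(self._pending[key]) >= self._max_batch:
            self.flush_key(key)

    def flush_key(self, key: str) -> None:
        batch = self._pending.pop(key, [])
        if batch:
            batch.sort(key=lambda e: e[1])  # order by timestamp within the shard
            self._flush(key, batch)

    def flush_all(self) -> None:
        for key in list(self._pending):
            self.flush_key(key)
```

Instead of a stream of unbatched random writes, the database sees one grouped, sorted write per shard key.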

0:17:22.3 Joseph Morais: Right. So it was clearly a real time use case. It needed that decoupling. WarpStream just made sense. Now you called out taking the data to other systems, right? Whether that's a database. I'm curious, are there any other ISVs like Confluent that you're using or maybe, CSP native services that you have integrated with WarpStream? I know they have their connector package, it's Bento. And I'm just curious if you're using any of those pieces of the technology.

0:17:50.7 Alex Haugland: We use Bento fairly heavily to pull data out of WarpStream and write it into like a variety of managed databases and self-hosted databases.

0:17:57.8 Joseph Morais: Cool. So like an example, I don't know... I think you guys, are you guys on AWS? I don't know if you'd say.

0:18:03.3 Alex Haugland: We have a fairly large AWS presence.

0:18:05.0 Joseph Morais: Okay. Gotcha. So something like Aurora or an RDS, you're connecting those services together. 

0:18:10.5 Alex Haugland: Yeah, yeah. 

0:18:10.9 Joseph Morais: That's great.

0:18:11.7 Alex Haugland: But depending on the use case, right? Like if it's like accounting stuff, maybe like a smaller database. If it's like analytics, maybe a bigger database. If it's stuff for like ML models, maybe we're just writing it straight to S3. It depends.

0:18:21.0 Joseph Morais: Cool. And I think that's one of the wonderful flexibilities, whether it's Kafka Connect and our Confluent connectors or Bento and all those connectors. Having that flexibility to say, I can direct these streams to these downstream sources. I didn't mean to be so redundant, but I can send this data here and I can send this data here or I can send this data to five places. The flexibility is really impressive.

0:18:44.6 Alex Haugland: And if you mess it up, you rewind and go back, right?

0:18:47.1 Joseph Morais: Yes. And then of course I can replay. So you have all that flexibility. Or like you said, you can suddenly, I have a new data system. It's brand new. I'll just start the connector from the beginning of the topic. And suddenly all that data that has been there for months is now in this downstream system.

0:19:01.2 Alex Haugland: And since it's in S3, it's very cheap. You don't actually have to pay that much to keep it there.
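
Rewinding and replaying come down to the topic being a retained, append-only log in which each consumer tracks its own offset. The toy model below shows the idea in memory only; the `Topic` and `Consumer` classes are hypothetical and are not the Kafka client API.

```python
from typing import Any, List


class Topic:
    """Toy append-only log; records are retained after being read."""

    def __init__(self) -> None:
        self._log: List[Any] = []

    def append(self, record: Any) -> int:
        self._log.append(record)
        return len(self._log) - 1  # offset of the new record

    def read(self, offset: int, max_records: int = 100) -> List[Any]:
        return self._log[offset:offset + max_records]


class Consumer:
    """Each consumer keeps its own position, so readers are independent."""

    def __init__(self, topic: Topic, start_offset: int = 0) -> None:
        self._topic = topic
        self.offset = start_offset

    def poll(self, max_records: int = 100) -> List[Any]:
        records = self._topic.read(self.offset, max_records)
        self.offset += len(records)
        return records

    def seek(self, offset: int) -> None:
        """Rewind (or skip ahead) to reprocess from a chosen offset."""
        self.offset = offset
```

A brand-new downstream system is a new `Consumer` starting at offset 0, and fixing a mistake is a `seek` back to the last known-good offset followed by polling again.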

0:19:05.1 Joseph Morais: Exactly. Right. So these are all things that are music, I think, to many people's ears when they're thinking about the explosion of data. 

0:19:11.0 Alex Haugland: Yeah. 

0:19:13.6 Joseph Morais: So let me summarize this. So you had to address processing high volumes of telemetry and accounting data. And the only way to do that, or the best way to do that was data streaming, because it was inherently event driven. You needed to have this decoupled substrate. You landed on WarpStream because your founders are fans of the tech. That's awesome. And also because it had that BYOC flavor, which allowed you to protect the data for your customers that opt in. To get into business terms, is there a specific KPI that's tied to telemetry and accounting? I'm curious, is there any specific metrics that you're looking for? Throughput, latency that is key to these particular two systems?

0:19:55.1 Alex Haugland: I think the biggest one is like, if you're looking from an observability standpoint, we keep an eye on the latency and how delayed the topic is. But we don't really necessarily get upset until it's in the hours, right? 

0:20:06.9 Joseph Morais: Right.

0:20:08.3 Alex Haugland: For a lot of these topics, it's not critical that it show up for quite a while. If it's the training data for a model, like for the new tab model, for example, it's not that big of a deal if it's delayed by six hours. Like, you know? We should fix it. It probably means something is broken. But it's not like, oh my God, the data is a bit delayed. We're screwed. Usually it's like our consumer is broken in some way. And so we got to go fix it. That's fine. But at a more high-level perspective, it's not so much the data streaming as what it enables. Is like, is our model good? Is the new tab model good? That's more of the KPI. Do people like our product? Do we have accurate billing? That stuff is a lot more important than any particular KPI.
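
As a rough sketch of watching how delayed a topic is, the check below measures the gap between the newest produced record and the last consumed one, and only escalates once that gap reaches hours. The function names and the one-hour and twelve-hour thresholds are hypothetical, chosen only to mirror the "we don't get upset until it's in the hours" stance.

```python
from datetime import datetime, timedelta, timezone


def topic_delay(newest_produced: datetime, last_consumed: datetime) -> timedelta:
    """How far the consumer trails the newest record in the topic."""
    return max(newest_produced - last_consumed, timedelta(0))


def severity(
    delay: timedelta,
    warn_after: timedelta = timedelta(hours=1),   # assumed threshold
    page_after: timedelta = timedelta(hours=12),  # assumed threshold
) -> str:
    """Classify a delay; short delays are treated as normal."""
    if delay >= page_after:
        return "page"
    if delay >= warn_after:
        return "warn"
    return "ok"


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(severity(topic_delay(now, now - timedelta(minutes=5))))
```

A delay in this sketch usually points to a broken consumer, not to a problem with the stream itself, which matches how the situation is described above.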

0:20:44.0 Joseph Morais: Yeah. I think that's ultimately... I think that's a very pragmatic way of building something. It's like things could be... I mean, obviously you should fix it. You don't want to [0:20:54.6] ____ stop working. But if your KPI is ultimately experience, you can build the technology around the experience and ensure that even if there are rough edges, that they're not affecting the experience. I think you guys have figured that out. So next we're gonna dive into how your partnership with Confluent solves some of your data challenges. But first, a quick word from our sponsor.

0:21:17.3 Speaker 3: Your data shouldn't be a problem to manage. It should be your superpower. The Confluent data streaming platform transforms organizations with trustworthy real-time data that seamlessly spans your entire environment and powers innovation across every use case. Create smarter, deploy faster and maximize efficiency with a true data streaming platform, from the pioneers in data streaming. 

0:21:44.4 Joseph Morais: Now we'll go beyond the stream on why Confluent was the right fit for Cursor and Anysphere. So you guys knew that data streaming was the answer, right? There's a reason why Kafka is the de facto standard and was able to address that telemetry and accounting use case. But there's also a lot to wrangle with Kafka. Now, I know that the founders were fans of WarpStream, but was there any consideration of using like open source options versus implementing WarpStream?

0:22:12.6 Alex Haugland: So I think that the fan thing is how they heard about it. Right? It's not, like... They were not like crazily widely advertised at the time. 

0:22:22.3 Joseph Morais: Okay, yeah. 

[overlapping conversation]

0:22:22.9 Alex Haugland: But I think like it's really just the characteristics of the system that were really appealing. Like if you look at WarpStream, you get to own your data, it sits in your S3 buckets. But you don't have to worry about managing a control plane, which is like by far the biggest pain in the ass of managing a system like that.

0:22:34.5 Joseph Morais: That's true.

0:22:35.0 Alex Haugland: And you don't have to worry about scaling or storage or compute because it's just like you get some amount of WarpStream nodes, you don't have to worry about routing between them, you just kind of have them. You have some amount of storage which is in S3, which you never need to worry about scaling because it's S3. Who cares?

0:22:48.0 Joseph Morais: That's true.

0:22:48.5 Alex Haugland: You don't have to worry about the relationship between the two of them. That's just like data go in, data go out. It's exactly the abstraction that you want out of Kafka. You pay somebody a small amount of money, you get to keep your data, you don't have to think about it.

0:23:00.2 Joseph Morais: I can't think of a better advertisement than data goes in and data goes out and like you don't really think about it. So again, my background is in operations. I ran open source Kafka clusters. I hated it. And it was like closer to 10 years ago when, you know, even open source wasn't quite as robust as it is now. And like you said, it's a distributed system. You have to worry about everything. You have to worry about connectivity between AZs. You have to worry about disks running out. You have to worry about CPUs burning their cores out and...

0:23:32.7 Alex Haugland: Getting exactly the right mix of CPU versus disk for any given workload... I don't want to think about that.

0:23:37.5 Joseph Morais: And memory and all that. And to have something that just works and have it backed by something that is so robust, like S3, that has what? Like it's like 10 nines worth of durability. Just incredible.

0:23:49.1 Alex Haugland: Also, by the way, it's cheaper. That was also nice.

0:23:53.0 Joseph Morais: It just works and we could do it for less money. Just incredible. So I'm curious, with the partnership and the fact that you guys don't have to worry about your data streams, it just works. How are your teams interacting with WarpStream for telemetry and accounting? What does a typical day for somebody who works on that code look like?

0:24:11.4 Alex Haugland: People don't usually work on it. Like, it just kind of works.

0:24:15.1 Joseph Morais: They're really just ignorant to that substrate.

0:24:17.4 Alex Haugland: Yeah. I'm trying not to be hyperbolic, but people do not, I think, interact with the WarpStream related stuff on a regular basis. Sometimes people are like, hey, I need to add a new field to this object and it gets piped through the telemetry. But like that's adding a field to a protobuf somewhere. And then like maybe adding a thing to Bento, depending on what you're doing. And that's it. We don't spend a lot of time thinking about it.
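As an illustration of the kind of change Alex describes, here is a minimal sketch of adding a field to a telemetry event without breaking older payloads. It is not Cursor's code: the `TelemetryEvent` shape, the `editor_os` field, and the JSON encoding are all assumptions, with a plain dataclass standing in for a protobuf message so the example stays self-contained.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TelemetryEvent:
    """Hypothetical event shape; a stand-in for a protobuf message."""

    user_id: str
    feature: str
    # Newly added field (assumed name). It is optional with a default so
    # that events written before the field existed still decode cleanly.
    editor_os: Optional[str] = None


def encode(event: TelemetryEvent) -> bytes:
    """Serialize an event to bytes, as a producer would before publishing."""
    return json.dumps(asdict(event)).encode("utf-8")


def decode(payload: bytes) -> TelemetryEvent:
    """Deserialize bytes into an event, ignoring keys this version does not know."""
    raw = json.loads(payload)
    known = {k: raw[k] for k in ("user_id", "feature", "editor_os") if k in raw}
    return TelemetryEvent(**known)


# An old payload (no editor_os) and a new payload (with editor_os).
old_payload = b'{"user_id": "u1", "feature": "tab"}'
new_payload = encode(TelemetryEvent("u2", "agent", editor_os="linux"))

print(decode(old_payload))
print(decode(new_payload))
```

The point of the optional default is that producers and consumers can be upgraded independently, which is roughly the property that makes "add a field and move on" a small change.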

0:24:37.9 Joseph Morais: Yeah. Honestly, again, that's one of those things that from like an operations standpoint would probably be like the biggest compliment I could get. So in operations, I've done various things. I was a DevOps engineer, I was a network administrator. And having someone not worry about something that you run is really like a feather in your cap. Because what's the opposite of that? Well, we worry about our data stream platform all the time. It breaks constantly. We hate it. And I have nightmares of PagerDuty because one of our Kafka nodes failed. And the fact that you guys have had that, and you're having it at your level of success with the amount that you guys are scaling, these are all really good things.

0:25:16.8 Alex Haugland: We occasionally get pages that have the word WarpStream in them, but like I said, almost always because the consumer is broken. And that's not actually WarpStream's fault in any way.

0:25:27.3 Joseph Morais: Yeah. I was going to say WarpStream can't fix that, unfortunately. So with a project this ambitious, trying to fuse humans and AI into these superpowered software engineers. There's gotta be more than tech to consider. How do you approach getting businesses on board? So I'm curious, like you know, outside of engineering, there's a business side to Anysphere and Cursor and was there any friction in making sure that data streaming was put to the front for these particular use cases?

0:26:01.4 Alex Haugland: So we've got like actually a fairly healthy split between what we call like our GTM team and our engineering team. It's about half and half because people recognized quite early that, hey, we have this tremendous amount of pro users that really like our product that just kind of come to us and that's a really fortunate position to be in. But there's also all these companies out there and nobody's going to go get a thousand-person contract through PayPal. They don't want to just type their email in and do this thing. They want to talk to a guy, they want to have a contract, they want to talk to a lawyer. There's all this stuff you have to do to actually sell stuff to businesses and we've got a lot of people working on that. And there's also an analytics side to that too where if a business is purchasing your product, they want to know what are people using it for? Do they like it? Is it working? And so we also have user-facing analytics which, I don't think I mentioned up front, are also powered by this same telemetry stream. So if you send it to the dashboard on an enterprise account, it's like, hey, here's how much Tab people are using, here's how much Agent people are using, here's how many lines of code people are accepting. And a lot of that work has been driven through our enterprise engineering/GTM teams building on top of these same tools.
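The dashboard numbers Alex lists (Tab usage, Agent usage, accepted lines) are the sort of thing that can be derived by rolling up a telemetry stream per account. A minimal sketch of that roll-up follows; the event fields (`account`, `kind`, `lines_accepted`) and the sample data are hypothetical and only meant to show the shape of the aggregation, not how Cursor actually computes it.

```python
from collections import defaultdict

# Hypothetical telemetry events, as a consumer might read them off a topic.
events = [
    {"account": "acme", "kind": "tab", "lines_accepted": 3},
    {"account": "acme", "kind": "agent", "lines_accepted": 40},
    {"account": "acme", "kind": "tab", "lines_accepted": 0},
    {"account": "globex", "kind": "tab", "lines_accepted": 5},
]


def summarize(events):
    """Count Tab and Agent events and total accepted lines per account."""
    summary = defaultdict(lambda: {"tab": 0, "agent": 0, "lines_accepted": 0})
    for event in events:
        row = summary[event["account"]]
        row[event["kind"]] += 1  # one more use of this feature
        row["lines_accepted"] += event["lines_accepted"]
    return dict(summary)


report = summarize(events)
print(report["acme"])
print(report["globex"])
```

In a real pipeline this aggregation would more likely live in a warehouse query or a stream processor than in application code; the sketch only shows that one stream can feed both internal analytics and customer-facing dashboards.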

0:27:04.6 Joseph Morais: See, I didn't think about that. So you have customers at the enterprise level validating, hey, look at how much code's coming out. They may even have metrics that they knew, you know, lines of committed code per engineer on a given day prior to using Cursor, and then...

[overlapping conversation]

0:27:20.2 Alex Haugland: I hope that they're not like incentivizing people to commit as much code as possible. But like... 

0:27:23.9 Joseph Morais: I realize that's not a great metric, but it's something right? And if you...

0:27:27.7 Alex Haugland: I'm sure someone is. I'm sure someone is. 

0:27:28.1 Joseph Morais: I know like people like, oh, I committed 10,000 lines of code yesterday. But it's like a docs page, like, okay, [laughter] but not impressive. But if you had some metric, and maybe there are better ones, you can... It sounds like using Cursor's UI, you have these user facing metrics that could say, hey, we think Cursor is doing better. Like we can justify our cost and our investment in Cursor because we have these metrics. And I didn't realize that was being powered by data streaming. So I imagine on the business side that makes people very happy. 

0:27:54.8 Alex Haugland: Yeah. 

0:27:57.1 Joseph Morais: Now I'm curious, were there any unexpected final impacts? Because like you said, telemetry and accounting came after the MVP of Cursor. So once it was implemented, obviously you mentioned it helps train the model, better tab suggestion, things like that. But was there anything that you learned from the telemetry and accounting data that you weren't expecting?

0:28:20.4 Alex Haugland: I think there's also just little product things where it's like, oh, I didn't know people use this that often. I didn't know people never use this button or whatever. Where it's like, if you've ever worked in an organization that transitioned from being vibes-based product development to analytics-based product development, like, we don't want to live wholly in either world. We don't want to just say, oh, this makes the percent go up by 2%, we're canceling the feature. We want to mix the two. But...

0:28:45.0 Joseph Morais: Yeah. You want informed vibes.

0:28:46.6 Alex Haugland: We want informed vibes. We don't want to be sitting here going, this is the most important feature ever. And go like, actually this feature is used by 1% of users. Maybe it's really important to those 1%, but you want that vibe check of your analytics where it's like, okay, how many of our users are on Linux, how many of our users are on Windows, how many of our users use SSH, how many users use this extension, how many people use Rust, all that sort of stuff, where it's like you really need to know vaguely where your user base is at in order to make the right decisions for them. And so all of the analytics stuff powers those things.

0:29:14.2 Joseph Morais: And I like it. It's very impressive, honestly, the whole story. Implementing accounting and metrics and telemetry on top of an already running system, doing that cleanly, doing it without affecting the user experience, and then having that ultimately build a better product, it's just all in all a really exciting story. Now, could you for the audience, share some tips, tricks or gotchas for anyone that's trying to tackle data streaming. Is there anything, other than using WarpStream, just working, are there any rough edges that you could call out that you think would be helpful for the folks out there?

0:29:49.2 Alex Haugland: I think my first answer, which is maybe not the one you want to hear, is like consider how much data you actually have. Because I think there's a lot of cases where people want to jump straight into like oh, I've got big data, like I need to go buy like a ClickHouse instance, and I need to set up like 100 Kafka queues, and like I need to set up dbt. And it's like, how many users do you have? Oh, we have a hundred users and it's like...

0:30:07.4 Joseph Morais: Overkill. 

0:30:09.1 Alex Haugland: You can just put it all in a SQLite database and you will have a fantastic time. I think it's really important to pick the right scale of tool for the problem. And using a tool like WarpStream for a really, really, really small use case is just more annoying than it's worth. But at the same time, if you've got a lot of data, you've got a lot of use cases, pick an appropriate tool. So I think that's the biggest gotcha I see people fall into: just completely mismatching the scope of the problem, either picking something really stupid, "I'm just going to put all my stuff in a SQLite database" when they have 100 million users, or the opposite. Right?
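To make the small-scale end of that advice concrete, here is a minimal sketch of product analytics on SQLite using only Python's standard library. The `events` table, its columns, and the sample rows are assumptions for illustration; an in-memory database is used so the example leaves nothing on disk.

```python
import sqlite3

# In-memory database for the sketch; pass a file path instead to persist data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT NOT NULL, feature TEXT NOT NULL)")

# Hypothetical usage events from a product with a handful of users.
conn.executemany(
    "INSERT INTO events (user_id, feature) VALUES (?, ?)",
    [("u1", "tab"), ("u2", "tab"), ("u1", "agent")],
)
conn.commit()

# Which features get used, and how often?
rows = conn.execute(
    """
    SELECT feature, COUNT(*) AS uses
    FROM events
    GROUP BY feature
    ORDER BY uses DESC, feature
    """
).fetchall()

print(rows)
conn.close()
```

For a hundred users this answers the same "which features do people use" question as a streaming pipeline would, with no infrastructure to run. The trade-off runs the other way once write volume and the number of consumers grow.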

0:30:37.5 Joseph Morais: Yeah. Honestly, that's really good. And again, like, it's not something, I think people should right-size their data, right? And that's why here, whether it's WarpStream or Confluent, we have things that autoscale. Because we realize, sometimes you make a decision and you want to have a system that can grow, but you don't want to over-provision or kill yourself in over-complexity when... Like you said, you have an MVP, you have like 50 people working at your company, you don't need all that.

0:31:02.9 Alex Haugland: And WarpStream, I think is nice because as long as you have at least a little bit of usage, you're not paying a ton. If you can justify having like one WarpStream node, then you're fine. As long as you meet that threshold, then like you're not wasting a bunch of money on stuff you don't need.

0:31:14.9 Joseph Morais: Yeah. And then you scale it when you need to. 

0:31:16.6 Alex Haugland: Sure. 

0:31:20.0 Joseph Morais: So I'm curious kind of to... We still have one more segment, but to close out our conversation about the partnership and about data streaming, what is your vision or the company's vision for data streaming in the future?

0:31:30.3 Alex Haugland: So I think that it's going to continue to be the lifeblood of the company. We want to be able to take the stuff that our product is doing and make sure that it is as good as possible. That's the whole thing about like AI and machine learning. It's like you try a thing, you see if it works out. You try a thing, you see if it works out. And we need to continue to build upon that in our product and see what people are doing with it, see what they like, see what they don't like, see what works, see what doesn't work and use that to build the best product that we possibly can.

0:31:56.6 Joseph Morais: I love it. And I love that WarpStream was there for you guys to help you transition from purely vibes to analytic-informed vibes. That's gonna be one of my favorite takeaways from this episode, no doubt about it. So before we let you go, we're gonna do a lightning round. Byte-sized questions, byte-sized answers. That is B-Y-T-E. They're like hot takes, but schema-backed and serialized. That's right. We're using schema registry here, Alex. 

0:32:29.9 Alex Haugland: Okay. 

0:32:30.9 Joseph Morais: What's something you hate about IT?

0:32:32.8 Alex Haugland: Glorifying complexity.

0:32:34.6 Joseph Morais: Glorifying complexity? Yeah, too much data, glorifying complexity. You have a theme there, I like that. What is your hot take on the future of artificial intelligence?

0:32:43.2 Alex Haugland: It is going to be less impressive than people think it will, but more impressive than people think it will.

0:32:46.9 Joseph Morais: What's a non-tech activity that's impacted how you think about data? 

0:32:51.7 Alex Haugland: Gardening with my wife.

0:32:52.9 Joseph Morais: I love that, how? 

0:32:55.1 Alex Haugland: Because the decisions that you make today have an impact years from now, and you don't really get to skip ahead and see where the consequences are.

0:33:01.5 Joseph Morais: That's really good. My proudest moment, it's not gardening, but when we bought a house, I planted 25 emerald green arborvitaes, and someone said, hey, normally you're gonna lose 10% of them, so if you lose two or three, not a big deal. They all lived, and not only did they live, one got hit by a car. One got run over by a car, and it's still alive. I just had to stand it back up, it's amazing. Where are you getting outside inspiration, whether it's from a book or thought leaders, or a thought leader, as it pertains to your day-to-day job at Anysphere? 

0:33:29.4 Alex Haugland: On a day-to-day basis, I'm not sure. I think on a more long-term basis, my answer might be the aviation industry.

0:33:36.1 Joseph Morais: Okay, cool.

0:33:37.0 Alex Haugland: Because that's one of the places where people have been able to take things that are just unfathomably complicated and make them boring and routine.

0:33:43.2 Joseph Morais: Yeah. I guess, and that's true. When you think of success rates in airlines and how statistically safe they are and how complex they are and the fact that they have to deal with multiple types of weather and takeoffs and liftoffs, and literally tens of thousands of flights across a lifetime, that's a really good place for inspiration.

0:34:02.7 Alex Haugland: You take this machine that's amongst the most complicated mankind has ever built, and people are like, oh, it's so boring, I have to go on a flight for six hours. And it's like, how did we bridge that gap?

0:34:10.8 Joseph Morais: Oh, how do we make the ultra-complex mundane? I love that. That's really good. And I think you guys are trying to do that with coding, so there you go. We've gone full circle. And with that said, Alex, any final thoughts or anything to plug?

0:34:21.5 Alex Haugland: I think I gotta plug Cursor, right? Like, you... 

[overlapping conversation] 

0:34:22.9 Alex Haugland: It's a great app. 

0:34:24.5 Joseph Morais: You have to plug Cursor. You have to. They'll fire you if you don't plug Cursor. 

0:34:28.3 Alex Haugland: And I think I also have to plug WarpStream, right? WarpStream is great. You should use WarpStream. 

0:34:31.8 Joseph Morais: Please do. I didn't tell him to say that, but I do appreciate it.

0:34:34.7 Alex Haugland: Nah, I think maybe Richie told me, though. You'll never know.

0:34:37.8 Joseph Morais: Richie's like DMing you on Slack. Honestly, Alex, one of my favorite conversations that I've had here, so thank you so much for joining me today. For the audience, stick around after this, because I'm giving you my top three takeaways in two minutes. So let's talk about the takeaways from this episode. Well, the first one is, when I asked Alex about WarpStream and the success that they've had at Anysphere building out Cursor with it, he said data goes in and data goes out. Their engineers do not interact with it. Well, that to me as an operations person is the best compliment you can get when you're building any type of distributed complex software. The fact that the folks at Anysphere don't have to worry about this, that they know that their data stream is basically bulletproof. They don't have to worry about the scaling, the networking, the various AZs. They can just focus on what they're doing, getting that telemetry and that accounting data to give them those informed vibes to build a better product. Just incredible, right? I haven't had a chance to talk to anyone else who is a WarpStream customer, so I'm so happy to hear that Anysphere is having that level of success.

0:35:46.5 Joseph Morais: And the fact that they're using Cursor to build Cursor, that's my next takeaway. That's just wild, right? The idea that you build something like an AI-powered IDE and now you can use it to improve itself. It's just... If you don't think that the future is here, I don't know what other point I can make to you. I guess it's like before there were cars and someone eventually built trucks, and then you would use trucks to move around other truck parts, which you couldn't have done before that. But making Cursor to make Cursor better is just an amazing thing. And that leads me to the most interesting takeaway. And that is that Alex said that coding is really just a bug. And that blew my mind, because it's true. We ultimately just want computers to do what we want them to do. And they all start as just ideas that could be written in natural language. I want an application that does this, and I want it to input this, and I want it to output that.

0:36:44.8 Joseph Morais: And the fact that we have to code in Python or Java because we don't have a better way to interact with our computers is really just a bug. And with that future, with things like Cursor, giving that infusion of AI power to the creativity of humanity, just an incredible thing. And I'm going to hold on to that one for a while. That's it for this episode of Life Is But A Stream. As always, we're brought to you by Confluent. The Confluent Data Streaming Platform is the data advantage every organization needs to innovate today and win tomorrow. Your unified platform to stream, connect, process, and govern your data starts at confluent.io. If you'd like to connect, find me on LinkedIn, tell a friend or coworker about us, and subscribe to the show so you never miss an episode. We'll see you next time.