Michael Truell on Cursor and the World After Code

Guest: Michael Truell — Co-founder and CEO of Anysphere (Cursor). Studied CS and math at MIT; did AI research at MIT and Google.
Host: Lenny Rachitsky
Source: Lenny’s Podcast.

Overview

Michael Truell articulates the most specific vision in the wiki for what programming looks like after formal languages — a world where engineers become logic designers specifying intent in pseudocode-like representations rather than writing imperative code. He explains Cursor’s surprising decision to develop custom models, the product philosophy of keeping humans in the driver’s seat, and his framework for competing in a market with a very high ceiling.

Key ideas

World after code. The future of programming is not chatbot-style typing into a Slackbot, nor unchanged text editing in TypeScript and Go. It is a higher-level representation of software logic — closer to pseudocode or English — that humans can edit directly, with AI handling the translation to executable code underneath.
Engineer = logic designer. As the translation layer from intent to code gets automated, software engineers will increasingly resemble logic designers: specifying what the software should do and how it should behave, not the imperative steps needed to execute it.
Every magic moment involves a custom model. Cursor unexpectedly became a model development company. Custom models power autocomplete (fast/cheap diff prediction), codebase retrieval (mini-Google search for relevant code), and sketch-to-diff (large model sketches changes; small model fills in the full diff).
Chop things up. The most successful Cursor users break tasks into small increments — specify a little, review, specify a little more — rather than writing a giant specification and waiting for the AI to do everything.
High-ceiling market dynamics. The AI coding market resembles search in 1999: possible to be leapfrogged, but big R&D investments keep yielding value; moat is being the best product, not lock-in.

The world after code

“More and more, being an engineer will start to feel like being a logic designer, and really, it will be about specifying your intent for how exactly you want everything to work.”

Two wrong predictions for the future of programming:

Nothing changes — engineers keep writing TypeScript and Go in text editors. Wrong because models are going to get much, much better.
Chatbot Slackbot — you type requests to an AI engineering department and it builds things for you. Wrong because this lacks precision: humans need to gesture at changes in a form factor more precise than a text box removed from the code.

Michael’s prediction: software logic is represented in something terse and human-readable — an evolution from code toward pseudocode — that humans can directly edit and point at. The formal programming language layer gives way to a higher-level specification. This process is gradual and goes through existing professional engineers, not around them.

Related to Vibe Coding but more specific: vibe coding describes the current state (generating code without understanding details). The world after code describes the end state (a representation where understanding details becomes unnecessary because the representation is designed to be human-comprehensible at a high level).

Taste as the post-code skill

“Taste will be increasingly more valuable.”

As carefulness (getting the implementation details right) gets automated, taste becomes the irreplaceable human contribution. Taste in this context is not aesthetic taste (visual design, animations) but having the right idea for what should be built: how you want the logic to work, what the interaction should be, what the software should do.

Right now, the only good representation of software logic is code. Figma and notes can gesture at it but cannot fully specify it. The world after code creates a representation that makes taste actionable by non-coders: if you have the idea, you can now fully specify it.

Related: Product Taste; Karina Nguyen on Model Training and AI Product (form follows function); Guillermo Rauch on v0 and the Future of Building (exposure hours to develop taste).

Cursor’s model stack

Counter-intuitive discovery: “Every magic moment in Cursor involves a custom model in some way.” Cursor unexpectedly became a model development company. The custom models operate at three layers:

Autocomplete layer — fully custom.

Task: predict the next set of code diffs the user is about to make (across multiple files).
Why custom: it must respond in roughly 300 ms and runs on every keystroke, and the task (diff prediction) is unlike generic next-token completion.
Why code is special: sometimes the next 10–30 minutes of work is entirely predictable from looking over the developer’s shoulder. Unlike prose writing, code has high local predictability.

Retrieval layer — custom models augmenting foundation models.

Task: search the codebase to find the relevant parts to feed to a large model (Sonnet, GPT, Gemini).
Described as “a mini Google search specifically built for finding the relevant parts of the code base.”
Sits on the input side of the large model.

Sketch-to-diff layer — custom models augmenting foundation models.

Task: large model produces a high-level sketch of changes; small fast custom models fill in the full diffs.
Sits on the output side of the large model.
Enables quality (large model for reasoning) and speed (small model for execution) simultaneously.

This is a counterpoint to Model Maximalism: Cursor's approach is not model maximalism (trust the foundation model to get better) but model pragmatism: use the best available model for each sub-task, and train custom models where foundation models are too slow, too expensive, or misaligned with the task. This is the ensemble architecture Kevin Weil on OpenAI and the Future of AI Products describes from the OpenAI side.
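The three layers can be pictured as a pipeline: retrieval selects context, a large model sketches the change, and a small model expands the sketch into concrete edits. The following is a minimal sketch of that shape only, not Cursor's implementation. Every name in it (Snippet, retrieve, run_pipeline, fake_sketch_model, fake_apply_model) is hypothetical, the word-overlap ranking is a toy stand-in for a trained retrieval model, and both model calls are stubbed with plain functions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Snippet:
    path: str
    text: str


def retrieve(query: str, codebase: list[Snippet], top_k: int = 2) -> list[Snippet]:
    """Toy retrieval layer: rank snippets by words shared with the query."""
    query_words = set(query.lower().split())

    def score(snippet: Snippet) -> int:
        return len(query_words & set(snippet.text.lower().split()))

    ranked = sorted(codebase, key=score, reverse=True)
    # Drop snippets that share no words with the query at all.
    return [s for s in ranked[:top_k] if score(s) > 0]


def run_pipeline(
    request: str,
    codebase: list[Snippet],
    sketch_model: Callable[[str, list[Snippet]], str],
    apply_model: Callable[[str, str], str],
) -> dict[str, str]:
    """Input side: retrieval. Middle: large-model sketch. Output side: small-model apply."""
    context = retrieve(request, codebase)
    sketch = sketch_model(request, context)
    return {s.path: apply_model(s.text, sketch) for s in context}


def fake_sketch_model(request: str, context: list[Snippet]) -> str:
    # Hypothetical stub: a real system would call a foundation model here.
    return "rename total to subtotal"


def fake_apply_model(original: str, sketch: str) -> str:
    # Hypothetical stub for the small, fast model: applies "rename A to B".
    words = sketch.split()
    old, new = words[1], words[3]
    return original.replace(old, new)


codebase = [
    Snippet("cart.py", "total = price * qty"),
    Snippet("readme.md", "shopping cart docs"),
]
edits = run_pipeline("rename total price", codebase, fake_sketch_model, fake_apply_model)
print(edits)
```

Passing the two models in as plain callables mirrors the pragmatism described above: each stage can be swapped for whichever model is fast, cheap, or accurate enough for that sub-task.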

Chop things up

The most actionable advice for Cursor users. Two patterns exist:

Pattern	Description	When it works
Waterfall AI	Write giant spec → AI works → review entire output	Rarely — hard to specify everything correctly upfront; failures accumulate
Incremental	Specify a little → AI produces a little → review → repeat	Most successful users today

The incremental pattern matches how successful engineers also use autocomplete (next-edit prediction) — the human stays in the driver’s seat and reviews frequently. Michael sees this as the right model until AI reliability on large compositional tasks improves significantly.
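One way to see why failures accumulate in the waterfall pattern is a back-of-the-envelope calculation. The sketch below rests on assumptions that are not from the episode: each small step is independently correct with probability p_step, and a reviewed step is simply retried until it is right. The function names and the numbers 0.9 and 20 are illustrative only.

```python
def one_shot_success(p_step: float, n_steps: int) -> float:
    """Chance that all n steps come out right when nothing is reviewed in between."""
    return p_step ** n_steps


def expected_incremental_attempts(p_step: float, n_steps: int) -> float:
    """Expected total attempts when every step is reviewed and retried until correct."""
    # Each step needs 1 / p_step attempts on average (geometric distribution).
    return n_steps / p_step


p_step, n_steps = 0.9, 20
print(f"one-shot success: {one_shot_success(p_step, n_steps):.3f}")
print(f"expected incremental attempts: {expected_incremental_attempts(p_step, n_steps):.1f}")
```

Under these assumptions a 20-step task succeeds in one shot only about 12% of the time, while the incremental loop costs about 22 reviewed attempts and always ends with a correct result. Real tasks are not independent coin flips, so this is a rough intuition rather than a measurement.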

Competitive dynamics: high-ceiling markets

Michael’s framework for thinking about competitive moats when the ceiling is very high:

“I truly just think that the ceiling is so high that no matter what entrenchment you build, you can be leapfrogged.”

Historical analogies: search at the end of the 1990s (fragmented, high-ceiling, and distribution helps because data feedback improves the product); the personal computer market from the 1970s to the 1990s.

In a high-ceiling market:

Lock-in moats are fragile — users can try alternatives and switch if they’re better.
Consumer-like product moat — be the best thing, consistently.
R&D scale is defensible — if you can keep investing in product quality and it keeps yielding improvements, your lead compounds.

Microsoft/Copilot counter-example: markets are unfriendly to incumbents when the gap in ROI between products is large and users can freely compare them. Structural challenges (the original team dispersed) plus the incumbency disadvantage create an opportunity for a challenger.

Origin story

Cursor was started in late 2021/early 2022, motivated by two signals:

GitHub Copilot beta — the first AI product that was genuinely useful, not just a demo.
OpenAI scaling papers — showed AI would get better just by scaling compute and data, without requiring new ideas.

The team initially spent four months building for mechanical engineering, looking for an uncompetitive niche. They pivoted to coding after realising that (a) they were not excited about mechanical engineering, (b) the coding space lacked sufficient ambition about where things were heading, and (c) there was room for a leapfrog.

The first version was hand-rolled (not VS Code-based) and built in ~5 weeks of intensive dogfooding. The team switched to a VS Code base after initial user feedback. Growth was consistently exponential from launch: it felt slow at first, then obvious in retrospect.

Hiring: two-day work test

Cursor uses a two-day onsite work test as the core of its interview process. The candidate is given a canned project in the codebase and works largely independently with some collaboration, and the team gets two days of seeing real work output. Benefits:

Shows real work product, not performance under interview pressure.
Extended time investment signals mutual seriousness.
Lets candidates meet the team and decide they want to join — important in early days before the product was well-known.

This process has scaled to a team of 60+ people. It is combined with rejecting credential-centric filtering (great hires looked different from “straight out of central casting”), recruiting world-class people over years rather than hiring fast, and screening for intellectual curiosity, intellectual honesty, and level-headedness.