Karina Nguyen on Model Training and AI Product

transcript model-training synthetic-data evals ai-product post-training lenny-podcast

Guest: Karina Nguyen — AI researcher; former Anthropic (post-training and evals for Claude 3; 100K context; file uploads); OpenAI (Canvas, Tasks; Frontier Product Research team).
Host: Lenny Rachitsky
Source: Lenny’s Podcast.


Overview

Karina Nguyen has worked at the frontier of AI at both Anthropic and OpenAI, moving from front-end engineering into AI research after realising that Claude would surpass her coding skills. The conversation covers model training as an art, synthetic data for rapid product iteration, how Canvas and Tasks were built, the argument that there is no data wall, skills for the future, Anthropic vs OpenAI culture, and the trajectory from personal computer to personal model.


Key ideas

  1. Model training is more art than science. Data quality decisions require extensive debugging of model behaviour. The process resembles debugging software, but the failure modes are often unexpected, such as contradictions in the model’s self-knowledge.
  2. Synthetic data eliminates the data wall in post-training. Pre-training may face diminishing returns; post-training via RL on infinite tasks does not. Any task becomes training data.
  3. Product features are built via synthetic behavioural training. Canvas was built by identifying three core behaviours (trigger, edit, comment) and teaching them synthetically using o1 to generate training examples.
  4. Form follows function in AI. Product form factors (file upload, calendar notification) make model capabilities accessible to users who would not interact with raw context windows or RL-trained reasoning.
  5. From personal computer to personal model. The trajectory: asynchronous agents that learn user preferences over time will become a personalised AI that predicts and completes actions.

Model training: art not science

“Model training is more an art than a science.”

Karina’s example: during Claude 3 training, the team taught the model self-knowledge (“you don’t have a physical body”) alongside function-calling data (“this is how you set an alarm”). The two datasets conflicted: the model could not decide whether it was capable of physical-world actions, and it began over-refusing (“Sorry, I cannot help you”) on legitimate tasks.

The insight: training data creates implicit knowledge graphs. Incompatible datasets produce surprising failure modes. Debugging requires reading model responses with the same attention a software engineer brings to debugging code.


The data wall does not exist (in post-training)

The “data wall” framing applies to pre-training on internet data. But post-training operates differently:

  • Pre-training: compress the internet; model learns to predict the next token; learns about the world.
  • Post-training (o1 paradigm): RL on infinite tasks — search, scheduling, writing, coding, tool use. Each task is trainable; each new capability is a new data source.

The bottleneck shifts to evaluations. Benchmarks like GPQA (PhD-level Q&A, Google-proof) are approaching saturation — models are hitting or exceeding PhD-level performance. The new challenge: frontier evals that capture what models genuinely cannot yet do.


Synthetic data for product behaviour training

Canvas was trained on three core behaviours, each taught synthetically:

Behaviour | What it does | Training approach
Trigger | Decide when to open Canvas vs. answer inline | Label which prompts should trigger (long essay, code) vs. not trigger (factual question)
Edit | Update specific document sections on request | Teach: find the section, edit it in-place; rewrite vs. targeted edit decision
Comment | Make targeted inline comments on a document | Use o1 to generate document → inject “critique this” prompt → teach model to place comments on specific spans

The key advantage of synthetic data: it is scalable and cheap, and once the core behaviours are defined it generalises across a wide range of cases. After the beta launch, user feedback shifts the training distribution, and the model learns from real usage patterns.
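
As a rough illustration of the trigger behaviour in the table above, labelled examples might be represented as follows. This is a minimal sketch: the prompts, the `TriggerExample` type, the `open_canvas` / `answer_inline` target names, and the record format are all hypothetical and are not taken from the actual Canvas training data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerExample:
    prompt: str
    should_trigger: bool  # True: open Canvas; False: answer inline


# Hypothetical labelled prompts following the trigger / no-trigger split.
EXAMPLES = [
    TriggerExample("Write a long essay on the history of jazz", True),
    TriggerExample("Write a Python script that parses a CSV file", True),
    TriggerExample("What is the capital of France?", False),
]


def to_training_record(example: TriggerExample) -> dict:
    """Convert a labelled prompt into a simple prompt/target record."""
    target = "open_canvas" if example.should_trigger else "answer_inline"
    return {"prompt": example.prompt, "target": target}


records = [to_training_record(e) for e in EXAMPLES]
```

In practice the prompts themselves would be generated at scale rather than written by hand; the sketch only shows the shape of one labelled record.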

Evals for product features: two types.

  • Deterministic evals — binary pass/fail. If user says “7:00 PM”, model should output “7:00 PM”. No subjectivity.
  • Human win-rate evals — human raters compare completions; which model produces higher quality edits or comments? Continuous win rates against previous model versions.
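
The two eval types can be sketched in a few lines. This is an illustrative sketch, not the tooling either lab uses: the function names are hypothetical, the deterministic check is assumed to be a simple substring match, and counting a tie as half a win is one common convention assumed here.

```python
def deterministic_eval(expected: str, output: str) -> bool:
    """Binary pass/fail: the expected string must appear in the model output."""
    return expected in output


def win_rate(judgements: list[str]) -> float:
    """Fraction of pairwise comparisons won by the new model.

    Each judgement is "new", "old", or "tie". A tie counts as half a win
    (an assumed convention).
    """
    if not judgements:
        raise ValueError("no judgements to score")
    score = sum(
        1.0 if j == "new" else 0.5 if j == "tie" else 0.0 for j in judgements
    )
    return score / len(judgements)


passed = deterministic_eval("7:00 PM", "Reminder set for 7:00 PM")  # True
rate = win_rate(["new", "new", "old", "tie"])  # 0.625
```

The deterministic check needs no rater, so it can run on every training checkpoint; the win rate depends on human judgements collected separately.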

Spreadsheet method: PMs maintain a spreadsheet whose tabs record, for each case, the current behaviour, the ideal behaviour, and why. The spreadsheet can itself be fed to o1, which can generate training data from the ground-truth labels.
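
A minimal sketch of how such a spreadsheet could be turned into generation prompts. The column names, the example rows, and the prompt wording are assumptions made for illustration, and the call to the model is deliberately left out.

```python
import csv
import io

# Hypothetical spreadsheet export with the three columns described above.
SHEET = """current_behaviour,ideal_behaviour,why
Rewrites the whole document,Edits only the requested paragraph,User asked for a targeted change
Opens Canvas for a factual question,Answers inline,Short answers do not need a document
"""


def build_generation_prompt(row: dict) -> str:
    """Turn one spreadsheet row into a prompt for a data-generating model."""
    return (
        "Generate a training example that demonstrates the ideal behaviour.\n"
        f"Current behaviour: {row['current_behaviour']}\n"
        f"Ideal behaviour: {row['ideal_behaviour']}\n"
        f"Why: {row['why']}"
    )


rows = list(csv.DictReader(io.StringIO(SHEET)))
prompts = [build_generation_prompt(r) for r in rows]
```

Each prompt would then be sent to the generating model, and its completions reviewed before being added to the training set.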


Form follows function

“Form follows function. File uploads is a form factor that can enable people to just literally upload anything — books, reports, financial documents — and ask any task.”

100K context windows were a raw model capability. The question Karina asked: what form factor makes this accessible? File uploads was the answer — familiar UI pattern, immediately usable by enterprise customers (financial analysts, researchers). The capability was always there; the form factor unlocked it.

Similarly, Tasks is not a groundbreaking capability, but wrapping scheduling/reminder behaviour in a familiar notification form factor makes it accessible to users who would not build their own agentic pipeline.

The lesson: product value often comes from form factor translation, not from the model capability itself.


Skills for the future

Karina’s view: soft skills become more durable than hard technical skills.

What AI will struggle with for longer:

  • Aesthetic judgment (what is good visual design?).
  • True creative writing (ChatGPT’s writing remains bottlenecked by creative reasoning).
  • Management and research prioritisation — allocating constrained compute to the highest-conviction research paths requires human judgment at scale.
  • Emotional intelligence and people skills — building trust, deriving intent, collaboration.

What AI is already very good at:

  • Connecting dots across disparate data sources.
  • Strategy and planning given sufficient context (aggregate user feedback + metrics → action plan).
  • Most hard technical skills (coding, front-end, data analysis).

Karina’s own career signal: she switched from front-end engineering to AI research when she saw Claude getting good at coding. She wanted to stay ahead of what models could automate.


Anthropic vs. OpenAI culture

Karina is one of very few people to have worked at both labs at a senior research level.

Dimension | Anthropic | OpenAI
Scale at time | ~70 people when joined; startup feel | Much larger; scale brings creative freedom
Product approach | Hardcore prioritisation; craft; meticulous model character | More bottoms-up; more risk-taking; more product launches
Model training culture | Deep care about model personality, character, ethical behaviour | More experimental; larger research freedom
Differences | Focused; one model at a time; considered | Distributed; ideas bubble up; more surface area

Karina’s conclusion: “It’s more similar than different.” Both labs care about model quality; both do extensive evals; both build culture around the output of the model. The framing of “enemies” in AI is wrong — it is one community doing the same thing.

Claude’s personality (thoughtful, librarian-like, considered) reflects the careful curation of datasets and character data at Anthropic. The output model is a reflection of the people who made it.


The personal model trajectory

“We went from personal computers to personal models.”

The evolution:

  1. Synchronous chatbot — ask question, get answer.
  2. Collaborative agent (Canvas) — model and user co-author documents; model has agency to edit, comment, restructure.
  3. Asynchronous agent (Tasks, Operator) — model completes tasks in the background; fires on schedule; builds trust through consistent performance.
  4. Personal model — agent that has learned your preferences, workflows, and patterns well enough to predict your next action; a digital collaborator that knows you.

The trust-building problem: agentic systems that go off for 10 minutes and return with a wrong answer create worse UX than not having the agent at all. Teaching the model to derive intent correctly — asking follow-ups only when needed — is an active research problem.


Early Anthropic prototypes

Karina’s memories from early days:

  • Claude in Slack — one of the first tool-using products; Claude summarised Slack threads, generated weekly org-wide summaries on Fridays, provided a social element for learning prompting.
  • Shared workspace prototype — Claude and user with a shared document they could co-iterate on (predated Canvas by ~2 years).
  • 100K context launch — the moment when the product form factor (file uploads) made model capability tangible to users. The first real “wow” signal from enterprise customers.

See also