Speaker

Andrej Karpathy

AI researcher, engineer, and educator. Known for making deep technical concepts in machine learning accessible to general audiences.


Background

  • Co-founder of OpenAI (2015).
  • Director of AI at Tesla; led the Autopilot computer vision programme.
  • PhD from Stanford (2015) under Fei-Fei Li; dissertation on connecting images and natural language with deep neural networks.
  • Created the widely followed Neural Networks: Zero to Hero YouTube series.
  • Coined the term vibe coding (early 2025).

Talks in this wiki

Title                                    Date      Topic
Intro to Large Language Models           Nov 2023  Busy person’s one-hour intro to LLMs; LLM OS framing
Deep Dive into LLMs like ChatGPT         Feb 2025  Full pipeline: pretraining through security
How I Use LLMs                           Feb 2025  Practical ecosystem tour; tools, workflows, habits
From Vibe Coding to Agentic Engineering  2025      Software 3.0; vibe coding vs agentic engineering

Recurring themes across talks

LLMs as compression. Karpathy consistently frames pretraining as lossy compression of the internet. The parameters store statistical patterns rather than verbatim facts, so recall is vague and probabilistic.

Jagged intelligence. A recurring preoccupation: LLMs are brilliant in most domains but show seemingly arbitrary gaps, such as the reversal curse, the strawberry R-count, and the car-wash question. The gaps reflect both structural limits (verifiability) and what labs chose to invest in.

Verifiability as the axis. From the RL discussions in “Deep Dive” to the founder advice in “Vibe Coding”: progress clusters where verification is possible. This explains why maths and code improved fastest, and it points practitioners toward tractable opportunities.
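
To make "verification is possible" concrete, here is a minimal sketch of two automatic checkers of the kind a maths or coding task admits. It is an illustration only, not code from any of the talks: the function names, the normalisation rule, and the pass-fraction scoring are all assumptions, and a real pipeline would run candidate code in a sandbox rather than calling exec directly.

```python
def exact_match_reward(model_answer: str, reference: str) -> float:
    """Hypothetical maths checker: 1.0 if the normalised strings match, else 0.0."""
    def normalise(text: str) -> str:
        # Assumed normalisation: trim whitespace, lowercase, drop a trailing full stop.
        return text.strip().lower().rstrip(".")
    return 1.0 if normalise(model_answer) == normalise(reference) else 0.0


def unit_test_reward(candidate_source: str, func_name: str, tests: list) -> float:
    """Hypothetical code checker: fraction of (args, expected) test cases that pass.

    WARNING: exec on untrusted model output is unsafe; this sketch assumes
    the caller provides sandboxing.
    """
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)
        func = namespace[func_name]
        passed = sum(1 for args, expected in tests if func(*args) == expected)
    except Exception:
        # Code that fails to define the function, or raises on any test, scores zero.
        return 0.0
    return passed / len(tests)


if __name__ == "__main__":
    print(exact_match_reward(" 42. ", "42"))
    tests = [((1, 2), 3), ((0, 0), 0)]
    print(unit_test_reward("def add(a, b):\n    return a + b", "add", tests))
```

Both checkers return a number without a human in the loop, which is what lets such tasks be optimised at scale. An essay or a product strategy has no comparably cheap checker.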

Tool use as amplification. In every practical talk, the model alone is weaker than the model plus tools. Web search, a Python interpreter, and general code execution convert LLMs from recall engines into problem-solving systems.
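
The amplification can be sketched with a toy dispatcher: instead of recalling the answer to an arithmetic question, the model emits a structured tool call and a small runtime computes the result exactly. Everything here is hypothetical (the TOOLS registry, the call format, and the calculator itself) and does not reflect any particular vendor's tool-calling API.

```python
import ast
import operator

# Arithmetic operators the toy calculator accepts; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str):
    """Evaluate a plain arithmetic expression by walking its syntax tree (no eval)."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        raise ValueError("unsupported expression")
    return evaluate(ast.parse(expression, mode="eval").body)


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"calculator": calculator}


def run_tool_call(call: dict):
    """Dispatch a call shaped like {"name": ..., "argument": ...} (assumed format)."""
    if call["name"] not in TOOLS:
        raise KeyError(f"unknown tool: {call['name']}")
    return TOOLS[call["name"]](call["argument"])


if __name__ == "__main__":
    # A model asked for a large product would emit this call instead of guessing digits.
    print(run_tool_call({"name": "calculator", "argument": "123456789 * 987654321"}))
```

The result would then be appended to the model's context so the final answer quotes a computed value rather than a recalled one.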

The LLM as a new computing substrate. From “LLM OS” (2023) to “Software 3.0” and the “neural computer” (2025): Karpathy frames LLMs not as a tool within existing computing but as the foundation of a new computing paradigm.


Key concepts originating from or emphasised by Karpathy