Tool Use
Tool use is the capability of a large language model to invoke external systems — web search, code interpreters, file readers, image generators — by emitting special tokens that the surrounding application intercepts and acts upon.
The mechanism
The model is trained to emit special token sequences when it wants to use a tool. The token names below are illustrative; each model defines its own format. For example:
<SEARCH_START>when is White Lotus Season 3 Episode 2<SEARCH_END>
The inference application detects this sequence, pauses token generation, executes the search, and injects the results into the Context Window. Generation resumes with the retrieved information directly available.
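The detect, pause, execute, inject cycle can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular inference stack is built: `next_token`, `scripted_model`, the stub search function, and the `[results]` wrapper are all hypothetical, and the delimiters are simply the ones from the example above.

```python
from typing import Callable, Optional

# Delimiters taken from the example above; real models define their own.
SEARCH_START = "<SEARCH_START>"
SEARCH_END = "<SEARCH_END>"


def pending_query(context: str) -> Optional[str]:
    """Return the query if the context ends with a complete search call."""
    if not context.endswith(SEARCH_END):
        return None
    start = context.rfind(SEARCH_START)
    if start == -1:
        return None
    return context[start + len(SEARCH_START):-len(SEARCH_END)]


def generate(next_token: Callable[[str], Optional[str]],
             search: Callable[[str], str],
             prompt: str,
             max_tokens: int = 100) -> str:
    """Run generation, pausing to execute a search whenever one is requested."""
    context = prompt
    for _ in range(max_tokens):
        token = next_token(context)  # hypothetical stand-in for the model
        if token is None:            # the model has finished
            break
        context += token
        query = pending_query(context)
        if query is not None:
            # Generation is paused here: run the tool, then inject its output
            # into the context so later tokens can condition on it.
            context += f"\n[results]\n{search(query)}\n[/results]\n"
    return context


def scripted_model(tokens):
    """Toy model that replays a fixed token list, ignoring the context."""
    remaining = iter(tokens)
    return lambda context: next(remaining, None)


model = scripted_model(
    ["Checking. ", SEARCH_START, "episode air date", SEARCH_END, "Done."]
)
transcript = generate(
    model, lambda q: f"stub result for: {q}", "User: when does it air?\n"
)
print(transcript)
```

The key design point is that the tool output becomes part of the context, so every token generated afterwards is conditioned on it.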
The same pattern applies to:
- Python interpreter — the model writes code; the application executes it and returns the output.
- Calculator — arithmetic delegated to a reliable tool.
- Image generation — the model writes a description; a separate image model generates the image.
- File access / RAG — relevant passages retrieved from a document store and inserted into context.
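Because every tool follows the same request-and-response shape, the application side can be a simple lookup from tool name to handler. The sketch below assumes a hypothetical registry with two toy tools: a calculator restricted to basic arithmetic, and a keyword retriever over an in-memory document list. The names `TOOLS`, `dispatch`, and `DOCS` are illustrative only.

```python
import ast
import operator
from typing import Callable, Dict

# Only these operators are allowed, so arbitrary code cannot be evaluated.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def calculator(expression: str) -> str:
    """Evaluate +, -, *, / and parentheses on numeric literals."""
    def evaluate(node: ast.AST):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression")
    return str(evaluate(ast.parse(expression, mode="eval").body))


# Hypothetical in-memory document store for the retrieval tool.
DOCS = [
    "Tool use lets a model call external systems.",
    "The context window holds the tokens the model can see.",
]


def retrieve(query: str) -> str:
    """Return every document sharing at least one word with the query."""
    words = set(query.lower().split())
    hits = [d for d in DOCS if words & set(d.lower().rstrip(".").split())]
    return "\n".join(hits) if hits else "(no matches)"


TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "retrieve": retrieve,
}


def dispatch(tool_name: str, argument: str) -> str:
    """Route a tool request to its handler and return text for the context."""
    handler = TOOLS.get(tool_name)
    if handler is None:
        return f"error: unknown tool {tool_name!r}"
    return handler(argument)


print(dispatch("calculator", "12 * (3 + 4)"))
print(dispatch("retrieve", "context window"))
```

Returning an error string for an unknown tool, rather than raising, keeps the failure visible to the model so it can try something else.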
Why tool use matters
Without tools, the model is a closed system: it can only draw on what it learnt during training, which is frozen at a training cutoff and subject to Hallucination. With tools:
- Time-sensitive questions become answerable: web search retrieves current information.
- Arithmetic and counting are reliable: the Python interpreter computes the result rather than the model predicting it.
- Character-level tasks (spelling, string formatting) are reliable: code operates on actual characters, not tokens.
- Long documents are accessible: file uploads put content directly in context.
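The arithmetic and character-level points are easy to see in the kind of code a model might write for the interpreter. The snippet below is only an illustration; the helper names are made up for this example.

```python
def count_char(text: str, char: str) -> int:
    """Count occurrences of a character by looking at the actual characters."""
    return text.count(char)


def exact_product(a: int, b: int) -> int:
    """Python integers have arbitrary precision, so the product is exact."""
    return a * b


word = "strawberry"
print(count_char(word, "r"))   # 3
print(word[::-1])              # yrrebwarts
print(exact_product(123456789, 987654321))
```

The code never sees tokens: it works on the string one character at a time, which is why tasks like counting letters stop being error-prone once they are delegated.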
Karpathy’s framing: just as humans reach for a calculator rather than doing complex arithmetic in their heads, well-designed LLM systems reach for tools rather than reasoning beyond their reliable capability.
Practical guidance
| Task type | Recommended tool |
|---|---|
| Recent events / current data | Web search |
| Arithmetic, statistics, plotting | Python interpreter |
| Reading papers or books | File upload |
| Counting, character operations | Code |
| Long-horizon research | Deep research mode |
Agentic tool use
In agentic settings, the model issues multiple tool calls in sequence, using the result of each call to decide the next step. See Agentic Engineering and Vibe Coding for how this extends to full software development workflows.
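A sequential tool-calling loop can be sketched as follows. Everything here is a hypothetical stand-in: `scripted_policy` plays the role of the model, and the `lookup` and `multiply` tools are toys. The point is only the control flow, in which each result is appended to a history that the next decision can read.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# One history entry: (tool name, argument, result).
Step = Tuple[str, str, str]


@dataclass
class Action:
    tool: Optional[str]  # None means "finish with this answer"
    argument: str


def run_agent(policy: Callable[[str, List[Step]], Action],
              tools: Dict[str, Callable[[str], str]],
              goal: str,
              max_steps: int = 5) -> Tuple[str, List[Step]]:
    """Call tools one after another until the policy decides to finish."""
    history: List[Step] = []
    for _ in range(max_steps):
        action = policy(goal, history)
        if action.tool is None:
            return action.argument, history
        result = tools[action.tool](action.argument)
        history.append((action.tool, action.argument, result))
    return "(step limit reached)", history


# Toy tools and data, assumed purely for illustration.
PRICES = {"widget": "4"}


def multiply(expression: str) -> str:
    left, right = expression.split("*")
    return str(int(left) * int(right))


TOOLS = {
    "lookup": lambda item: PRICES.get(item, "unknown"),
    "multiply": multiply,
}


def scripted_policy(goal: str, history: List[Step]) -> Action:
    """Stand-in for the model: look up a price, multiply it, then answer."""
    if len(history) == 0:
        return Action("lookup", "widget")
    if len(history) == 1:
        unit_price = history[0][2]  # uses the first tool's result
        return Action("multiply", f"{unit_price}*25")
    return Action(None, f"Total: {history[1][2]}")


answer, steps = run_agent(scripted_policy, TOOLS, "Cost of 25 widgets?")
print(answer)  # Total: 100
```

The `max_steps` cap is a deliberate choice in this sketch: without it, a policy that never finishes would loop forever.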