How to Use LLMs Effectively
This is a narrative theme synthesising practical guidance for working with large language models, drawn primarily from "How I Use LLMs" and "Deep Dive into LLMs like ChatGPT" by Andrej Karpathy.
The central principle
Know what you’re talking to: a compressed, probabilistic snapshot of the internet, with a learned assistant persona, frozen at a training cutoff, running in a stateless session. Every practical habit below flows from this mental model.
Habits by category
Model selection
- Know which model you are using. Free tiers use smaller, weaker models. The difference in quality is material for professional use.
- Match the model to the task. Try the fast, non-thinking model first. Escalate to a reasoning model (o3, Claude extended thinking, DeepSeek-R1) for hard maths, code, or multi-step reasoning. Non-thinking models are faster and usually sufficient for factual and creative tasks.
- Use multiple models (the LLM Council). For important decisions — technical or personal — query multiple frontier models and compare. Disagreements are informative.
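The council habit can be sketched in a few lines of Python. This is a minimal illustration, not a real client: `ask_council`, `council_disagrees`, and the stub models are hypothetical names, and each "model" is assumed to be a plain callable that takes a question and returns a string. Exact string comparison is a crude proxy for disagreement; for free-form answers you would read the responses side by side.

```python
from typing import Callable, Dict


def ask_council(question: str, models: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Send the same question to every model and collect the answers by model name."""
    return {name: ask(question) for name, ask in models.items()}


def council_disagrees(answers: Dict[str, str]) -> bool:
    """Return True when the answers differ after trimming whitespace and lowercasing."""
    normalised = {answer.strip().lower() for answer in answers.values()}
    return len(normalised) > 1


# Hypothetical stand-ins for real model clients (assumption: one callable per model).
stub_models = {
    "model_a": lambda q: "Paris",
    "model_b": lambda q: "paris ",
    "model_c": lambda q: "Lyon",
}

answers = ask_council("What is the capital of France?", stub_models)
for name, answer in answers.items():
    print(f"{name}: {answer.strip()}")
print("Disagreement:", council_disagrees(answers))
```

A disagreement flag is a prompt to look closer, not a verdict on which model is right.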
Context hygiene
- Start a new chat when switching topics. Irrelevant tokens in the context window can degrade output and slow the model. Keep working memory clean.
- Paste information in; don’t rely on recall. Parametric memory is vague and prone to hallucination. If you need accurate information, put it in the context window via file upload or copy-paste.
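One way to apply the paste-it-in habit programmatically is to wrap the source material in clear delimiters and ask the model to answer only from it. The sketch below assumes a hypothetical helper, `build_grounded_prompt`, and an arbitrary delimiter style; neither comes from the source talks, and the instruction wording is illustrative.

```python
def build_grounded_prompt(question: str, source_text: str) -> str:
    """Place the pasted source text in the prompt and ask for an answer drawn from it."""
    return (
        "Answer using only the document between the markers below. "
        "If the answer is not in the document, say so.\n\n"
        "=== DOCUMENT START ===\n"
        f"{source_text.strip()}\n"
        "=== DOCUMENT END ===\n\n"
        f"Question: {question.strip()}"
    )


# Illustrative data only.
prompt = build_grounded_prompt(
    "When was the policy updated?",
    "The policy was updated on 3 March 2024.",
)
print(prompt)
```

The resulting string goes into a fresh chat, so the context window holds the document and the question and nothing else.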
Tool use
- Use web search for time-sensitive queries. If a question could be answered by a Google search and a skim of the top links, use the search tool.
- Use a Python interpreter for arithmetic and analysis. Don’t trust the model’s mental arithmetic. A one-liner in Python is more reliable.
- Use file upload for reading documents. Upload PDFs, papers, or web pages. The model reads alongside you as a knowledgeable collaborator.
- Use deep research for multi-source research tasks. Expect a 10-minute run, treat the output as a first draft, and follow citations.
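As an illustration of the arithmetic habit, the kind of calculation worth handing to an interpreter looks like this. The numbers are arbitrary examples; the point is that Python integers are exact at any size, and `fractions.Fraction` avoids the rounding that affects binary floating point.

```python
from fractions import Fraction

# Exact integer arithmetic: Python ints have arbitrary precision.
product = 123456789 * 987654321
print(product)

# Exact rational arithmetic avoids binary floating-point rounding.
total = Fraction(1, 10) + Fraction(2, 10)
print(total == Fraction(3, 10))  # True

# The same sum in floating point is not exactly 0.3.
print(0.1 + 0.2 == 0.3)  # False
```

When a chat interface offers a code tool, asking the model to "use code" for a calculation like this has the same effect.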
Prompting
- Few-shot beats zero-shot. Don’t just describe the task — give examples of input and output. Always.
- Let the model think. Don’t force an answer into the first token. Chain-of-thought prompting — “think step by step” — works because it distributes reasoning across tokens.
- For recurring tasks, save few-shot prompts as Custom GPTs, Claude Projects, or equivalent presets.
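A few-shot prompt of the kind described above can be assembled with a small helper. This is a sketch under assumptions: `build_few_shot_prompt` and the "Input:" / "Output:" labels are illustrative choices, not a format prescribed by the source.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    new_input: str,
) -> str:
    """Combine an instruction, worked input/output pairs, and the new input into one prompt."""
    parts = [instruction.strip(), ""]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}")
        parts.append(f"Output: {example_output}")
        parts.append("")
    # End on an open "Output:" so the model continues the established pattern.
    parts.append(f"Input: {new_input}")
    parts.append("Output:")
    return "\n".join(parts)


# Illustrative examples only.
few_shot_prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [
        ("The battery lasts all day.", "positive"),
        ("It broke after a week.", "negative"),
    ],
    "Setup was quick and painless.",
)
print(few_shot_prompt)
```

The same text, minus the final input, is what you would save in a preset for a recurring task.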
Verification
- Read generated code. The model is a capable but absent-minded analyst. It will silently substitute placeholder values, make arithmetic errors in narration, or miss a context-dependent constraint.
- Follow citations from research outputs. Deep Research reports can hallucinate or misattribute. Verify what matters.
- Ask the model to transcribe before trusting. When the model processes an image or document, ask it to transcribe what it extracted so you can verify accuracy.
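Part of this checking can be mechanised. The sketch below assumes you have both the passage the model quoted and the text of the source it cites, and tests whether the quote appears in that source once whitespace and case are normalised. `quote_appears_in_source` is a hypothetical helper; it catches fabricated or altered quotations, not paraphrases or misreadings, so it supplements reading the source rather than replacing it.

```python
import re


def _normalise(text: str) -> str:
    """Collapse runs of whitespace to single spaces and lowercase the text."""
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_appears_in_source(quote: str, source_text: str) -> bool:
    """Return True if the quote occurs in the source, ignoring case and whitespace layout."""
    return _normalise(quote) in _normalise(source_text)


# Illustrative data only.
source = "Revenue grew by 12%\nin the third quarter, driven by subscriptions."
print(quote_appears_in_source("revenue grew by 12% in the third quarter", source))
print(quote_appears_in_source("revenue grew by 21% in the third quarter", source))
```

The first check passes and the second fails, which is the signal to open the cited source and read the passage yourself.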
Interface
- Speak rather than type. Dictation is faster, and transcription (via Super Whisper or native mobile input) is now reliable for most purposes.
- Use Custom Instructions for persistent preferences — tone, format, domain context — so you don’t re-explain them each session.
The verification discipline
One discipline runs through all of these habits: always verify what matters. The model is a powerful accelerator, not a source of truth. It will produce confidently wrong answers, and the cost of that confidence is borne by the user, not the model.