How to Use LLMs Effectively

This narrative theme synthesises practical guidance for working with large language models, drawn primarily from “How I Use LLMs” and “Deep Dive into LLMs like ChatGPT” by Andrej Karpathy.


The central principle

Know what you’re talking to: a compressed, probabilistic snapshot of the internet, with a learned assistant persona, frozen at a training cutoff, running in a stateless session. Every practical habit below flows from this mental model.


Habits by category

Model selection

  • Know which model you are using. Free tiers often serve smaller, weaker models. The difference in quality is material for professional use.
  • Match the model to the task. Try the fast, non-thinking model first. Escalate to a reasoning model (o3, Claude extended thinking, DeepSeek-R1) for hard maths, code, or multi-step reasoning. Non-thinking models are faster and usually sufficient for factual and creative tasks.
  • Use multiple models (the LLM Council). For important decisions — technical or personal — query multiple frontier models and compare. Disagreements are informative.
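
A minimal sketch of the comparison step in the LLM Council habit, assuming you have already collected one short answer per model by hand. The model names and answers are hypothetical placeholders, and the helper names `group_answers` and `has_disagreement` are invented for illustration. Matching on normalized text only suits short, factual answers; longer answers still need to be read side by side.

```python
from collections import defaultdict


def group_answers(answers: dict[str, str]) -> dict[str, list[str]]:
    """Group model names by their lightly normalized answer text."""
    groups: dict[str, list[str]] = defaultdict(list)
    for model, answer in answers.items():
        # Lowercase and collapse whitespace so trivial differences don't count.
        key = " ".join(answer.lower().split())
        groups[key].append(model)
    return dict(groups)


def has_disagreement(answers: dict[str, str]) -> bool:
    """True when the models did not all give the same normalized answer."""
    return len(group_answers(answers)) > 1


# Hypothetical answers, pasted in by hand for illustration.
council = {
    "model_a": "Use a B-tree index",
    "model_b": "use a  b-tree index",
    "model_c": "Use a hash index",
}
groups = group_answers(council)
for answer, models in groups.items():
    print(f"{models}: {answer}")
if has_disagreement(council):
    print("Models disagree: check the differing claims before deciding.")
```

A split like this one is the signal to dig into the differing claim rather than to take a majority vote.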

Context hygiene

  • Start a new chat when switching topics. Irrelevant tokens in the context window can degrade output and slow the model. Keep working memory clean.
  • Paste information in; don’t rely on recall. Parametric memory is vague and prone to hallucination. If you need accurate information, put it in the context window via file upload or copy-paste.
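
One way to picture "paste information in" is as a prompt that carries the source text with it. The sketch below only assembles a string; the function name `build_grounded_prompt`, the `<document>` markers, and the character limit are assumptions for illustration, not a format any particular model requires.

```python
def build_grounded_prompt(question: str, source_text: str, max_chars: int = 20_000) -> str:
    """Place the source material in the prompt instead of relying on model recall."""
    # Truncate crudely by characters; real limits are counted in tokens.
    excerpt = source_text[:max_chars]
    return (
        "Answer using only the document between the markers. "
        "If the document does not contain the answer, say so.\n\n"
        "<document>\n" + excerpt + "\n</document>\n\n"
        "Question: " + question
    )


# Hypothetical source text and question.
prompt = build_grounded_prompt(
    "What is the refund window?",
    "Refunds are accepted within 30 days of purchase.",
)
print(prompt)
```

The resulting string is what you would paste into a fresh chat, which also keeps the context window free of unrelated earlier turns.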

Tool use

  • Use web search for time-sensitive queries. If a question could be answered by a Google search and a skim of the top links, use the search tool.
  • Use a Python interpreter for arithmetic and analysis. Don’t trust the model’s mental arithmetic. A one-liner in Python is more reliable.
  • Use file upload for reading documents. Upload PDFs, papers, or web pages. The model reads alongside you as a knowledgeable collaborator.
  • Use deep research for multi-source research tasks. Expect a 10-minute run, treat the output as a first draft, and follow citations.
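
To illustrate the arithmetic habit above, here is the kind of short Python a code-interpreter tool might run instead of the model estimating in its head. The specific numbers are arbitrary examples.

```python
from decimal import Decimal
from fractions import Fraction

# Large integer products are exact in Python; no rounding, no overflow.
big_product = 123_456_789 * 987_654_321

# Fractions stay exact instead of accumulating floating-point error.
exact_sum = Fraction(1, 3) + Fraction(1, 6)

# Decimal avoids binary floating-point surprises in money-like values.
total = Decimal("0.10") + Decimal("0.20")
float_matches = (0.1 + 0.2 == 0.3)  # False with binary floats

print(big_product, exact_sum, total, float_matches)
```

The point is less the specific modules than the habit: let an interpreter compute the number, then have the model explain it.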

Prompting

  • Few-shot beats zero-shot. Don’t just describe the task — give examples of input and output. Always.
  • Let the model think. Don’t force an answer into the first token. Chain-of-thought prompting — “think step by step” — works because it distributes reasoning across tokens.
  • For recurring tasks, save few-shot prompts as Custom GPTs, Claude Projects, or equivalent presets.
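
A few-shot prompt is the task description followed by worked input/output pairs, ending with the new input and an empty output slot. The sketch below assumes a plain "Input:/Output:" layout and a hypothetical helper name `build_few_shot_prompt`; any consistent layout should serve the same purpose.

```python
def build_few_shot_prompt(
    instruction: str,
    examples: list[tuple[str, str]],
    new_input: str,
) -> str:
    """Assemble an instruction, worked examples, and the new input into one prompt."""
    parts = [instruction.strip(), ""]
    for source, target in examples:
        parts += [f"Input: {source}", f"Output: {target}", ""]
    # End on an empty output slot so the model continues the established pattern.
    parts += [f"Input: {new_input}", "Output:"]
    return "\n".join(parts)


# Hypothetical task and examples.
few_shot = build_few_shot_prompt(
    "Rewrite each sentence in plain English.",
    [
        ("Utilize the aforementioned tool.", "Use that tool."),
        ("Commence the procedure forthwith.", "Start now."),
    ],
    "Endeavour to ascertain the cause.",
)
print(few_shot)
```

Saving a string like this as a Custom GPT or Project preset is what turns a one-off prompt into a reusable one.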

Verification

  • Read generated code. The model is a capable but absent-minded analyst. It will silently substitute placeholder values, make arithmetic errors in narration, or miss a context-dependent constraint.
  • Follow citations from research outputs. Deep Research reports can hallucinate or misattribute. Verify what matters.
  • Ask the model to transcribe before trusting. When the model processes an image or document, ask it to transcribe what it extracted so you can verify accuracy.
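
When you hold the original text yourself, the transcription check can be partly mechanical. This is a rough sketch using Python's standard `difflib`; the helper name `changed_lines` and the sample invoice lines are hypothetical, and for images you still have to compare the transcription against the picture by eye.

```python
import difflib


def changed_lines(reference: str, transcription: str) -> list[str]:
    """Return only the lines that differ between the reference and the model's transcription."""
    diff = difflib.ndiff(reference.splitlines(), transcription.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ".
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical reference text and a transcription with one wrong digit.
reference = "Total: 1,250\nDue: 2024-03-01"
transcription = "Total: 1,260\nDue: 2024-03-01"
changed = changed_lines(reference, transcription)
for line in changed:
    print(line)
```

An empty result means the transcription matches line for line; anything else points at the exact values to recheck.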

Interface

  • Speak rather than type. It is faster, and transcription (via Superwhisper or native mobile input) is now reliable for most purposes.
  • Use Custom Instructions for persistent preferences — tone, format, domain context — so you don’t re-explain them each session.

The verification discipline

One discipline runs through all these habits: always verify what matters. The model is a powerful accelerator, not a source of truth. It will produce confidently wrong answers, and the cost of that confidence is borne by the user, not the model.

