Module 3 · Generative AI & LLMs — Foundations
Capabilities & Hard Limits: Hallucination, Context Windows & Knowledge Cutoffs
65 min
Learning objectives
- Explain why hallucination is a structural property of LLMs, not a rare bug
- Describe what a context window is and the practical consequences of its limit
- Explain knowledge cutoffs and how retrieval or tools extend a model's reach
- Apply practical habits to use LLMs responsibly given these limits
Genuine capabilities
LLMs are remarkably good at a real set of tasks: drafting and rewriting text, summarizing long documents, translating, extracting structured information from messy text, explaining concepts, brainstorming, and generating or debugging code. These strengths are real and valuable. The skill of a practitioner is pairing these strengths with an honest grasp of the limits below.
LLMs excel where fluent language transformation is the job and a human can verify the result. They are riskiest where unverifiable facts are taken at face value.
Hallucination: confident and wrong
Because an LLM generates what is statistically plausible rather than what is verified, it will sometimes produce fluent, confident output that is simply false — a fake citation, an invented statistic, a non-existent function name. This is called hallucination. It is not a malfunction you can fully patch away; it is a direct consequence of next-token prediction with no built-in fact-checker.
Hallucination — Fluent, confident model output that is factually wrong or fabricated, arising because the model predicts plausible text rather than verified truth.
Watch out
Hallucinations are most dangerous precisely because they sound authoritative. Never treat an LLM's factual claims, citations, names, numbers, or quotes as reliable without independent verification — especially in legal, medical, financial, or safety-critical contexts.
Example — The fabricated citation
Ask an LLM for sources on a niche topic and it may return real-looking references — plausible authors, a believable journal, a tidy year — that do not exist. The format is learned from millions of real citations; the specific facts are invented to fit. This has led to real-world sanctions for professionals who filed AI-generated fake case law.
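One cheap safeguard is to check that any passage the model claims to quote really appears in source material you trust. The sketch below is a minimal illustration, not a complete fact-checker: the function names and the sample texts are invented for this example, and exact substring matching will miss paraphrases.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unsupported_quotes(quotes: list[str], source_text: str) -> list[str]:
    """Return the quotes that do NOT appear verbatim in the trusted source text."""
    haystack = normalize(source_text)
    return [q for q in quotes if normalize(q) not in haystack]


# Hypothetical trusted source and hypothetical model-supplied "quotes".
source = "The court held that the carrier was liable for the delay."
claimed = [
    "the carrier was liable for the delay",
    "the carrier was awarded punitive damages",
]

flagged = unsupported_quotes(claimed, source)
print(flagged)  # quotes a human should check before relying on them
```

Anything in `flagged` is not necessarily false, only unsupported by the text you supplied, so it still needs a human to check it.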
The context window: a limited working memory
A model can only consider so much text at once. This limit is the context window, measured in tokens, and it covers the prompt and the generated response together. Modern windows are large (often hundreds of thousands of tokens), but they are finite. When a conversation or document exceeds the window, something has to give: chat applications typically drop or truncate the earliest content, so the model effectively 'forgets' the start of a very long conversation or document, while some APIs simply reject the oversized request.
Context window — The maximum number of tokens a model can attend to at once, including both the input prompt and the output it generates.
Analogy
The context window is like a desk, not a filing cabinet. Only what fits on the desk is in view; push new papers on and old ones slide off the edge. The model has no memory of anything that has dropped off — unless you put it back on the desk.
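The desk analogy can be sketched in a few lines of code. This is an illustrative approximation only: `estimate_tokens` counts whitespace-separated words as a stand-in for a real tokenizer (real token counts differ), and `fit_to_window` is a hypothetical helper showing one common policy, keeping the newest messages and dropping the oldest.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined estimated size fits the budget."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break  # this message and everything older slides off the desk
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "My name is Priya and I work in logistics.",
    "Please draft a short status update.",
    "Make it more formal.",
]

visible = fit_to_window(history, max_tokens=10)
print(visible)  # the first message no longer fits, so the name is "forgotten"
```

With this budget the opening message is dropped, which is why re-supplying key facts later in a long conversation helps.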
Knowledge cutoff: frozen at training time
An LLM's built-in knowledge stops at its training cutoff — the date after which it saw no new data. Ask about events past that date and it will not know, and may confidently guess. To work with current or private information, the model must be given that text directly: pasted into the prompt, supplied via retrieval (RAG), or fetched through a connected tool such as web search.
Knowledge cutoff — The point in time after which a model has no training data; it has no inherent knowledge of later events unless that information is provided to it.
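The idea of supplying text via retrieval can be illustrated with a toy example. The sketch below assumes a simple word-overlap score in place of the embedding search that production RAG systems usually rely on; the function names, the notes, and the prompt wording are all invented for illustration, and no model is actually called.

```python
import re


def words(text: str) -> set[str]:
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"\w+", text.lower()))


def score(question: str, passage: str) -> int:
    """Count distinct question words that also appear in the passage."""
    return len(words(question) & words(passage))


def build_grounded_prompt(question: str, passages: list[str], top_k: int = 1) -> str:
    """Select the best-matching passages and place them in the prompt as context."""
    ranked = sorted(passages, key=lambda p: score(question, p), reverse=True)
    context = "\n".join(ranked[:top_k])
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Hypothetical private notes the model could not have seen in training.
notes = [
    "The cafeteria menu changes every Monday.",
    "The office relocation to Building C is scheduled for next quarter.",
]

prompt = build_grounded_prompt("When is the office relocation scheduled?", notes)
print(prompt)
```

The resulting `prompt` would then be sent to the model, which can answer from the supplied context rather than from its frozen training data.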
| Limit | Root cause | Practical mitigation |
|---|---|---|
| Hallucination | Predicts plausible text, no fact-checker | Verify claims independently; ground answers in sources you supply (retrieval) |
| Context window | Finite tokens in view at once | Summarize/chunk long inputs; re-supply key facts |
| Knowledge cutoff | No training data past a date | Provide current info via prompt, RAG, or tools |
- Treat factual output as a draft to verify, not an answer to trust.
- Give the model the source material instead of relying on its memory.
- For long tasks, summarize earlier context so key facts stay in the window.
- Match the stakes to the safeguards: higher risk demands tighter human review.
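The "summarize/chunk long inputs" mitigation from the table can be sketched as follows. This is one possible approach under stated assumptions: chunks are measured in whitespace-separated words rather than real tokens, and `chunk_words` is a hypothetical helper. A small overlap between neighboring chunks reduces the chance that a fact is split across a boundary.

```python
def chunk_words(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into word-based chunks, each sharing `overlap` words with the previous one."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    all_words = text.split()
    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(all_words), step):
        chunks.append(" ".join(all_words[start:start + chunk_size]))
        if start + chunk_size >= len(all_words):
            break  # the last chunk already reaches the end of the text
    return chunks


# A stand-in "document" of ten placeholder words: w0 w1 ... w9.
document = " ".join(f"w{i}" for i in range(10))

chunks = chunk_words(document, chunk_size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

Each chunk can then be summarized separately, and the summaries combined, so that the key facts of a long input fit in the window together.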
Knowledge check
Quick practice — not part of your exam score.
Why is hallucination considered a structural property of LLMs rather than a simple bug?
A user pastes a 500-page document that exceeds the model's context window. What is the most likely consequence?
To get reliable answers about an event that happened after a model's knowledge cutoff, the best approach is to: