What You Already Know
You've used ChatGPT, Claude, or Copilot. You type a message, and text streams back — word by word, as if the model is thinking out loud. That streaming response is a Large Language Model at work.
An LLM takes text in and produces text out. It doesn't browse the internet in real time. It doesn't have a database it queries. It generates each word based on patterns it absorbed during training — patterns extracted from a massive slice of the internet.
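That text-in, text-out loop can be sketched in a few lines. The "model" below is a toy bigram lookup table standing in for a real neural network (the names `predict_next`, `generate`, and the table itself are illustrative inventions, not any real library's API); the point is the loop, which is the same one every LLM runs:

```python
# Toy sketch of autoregressive generation: the loop at the heart of an LLM.
# A hand-built bigram table stands in for the trained neural network.

BIGRAMS = {
    "the": "cat",
    "cat": "sat",
    "sat": "on",
    "on": "the",
}

def predict_next(context):
    """Pick the most likely next word given the text so far.
    A real LLM scores every token in its vocabulary; this toy
    only looks at the last word."""
    return BIGRAMS.get(context[-1], "<end>")

def generate(prompt, max_words=5):
    """Generate one word at a time. Each new word is appended to the
    context and fed back in -- which is why responses stream."""
    context = prompt.split()
    for _ in range(max_words):
        word = predict_next(context)
        if word == "<end>":
            break
        context.append(word)
        yield word  # emitted as soon as it's produced

print(" ".join(generate("the")))  # -> cat sat on the cat
```

The feedback loop (output becomes input) is the key design choice: the model never plans a whole reply, it only ever predicts the next piece.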
That's the surface. Underneath, every behavior you've noticed — the confident wrong answers, the uncanny understanding of your question, the word-by-word streaming — traces back to a specific set of design decisions made over the last 25 years.
This module walks that path. By the end, you'll understand why LLMs work the way they do — not as magic, but as engineering.