The Wall Between You and the Model: Tokens, Encoders, and Embeddings
GPT-4 can't count letters. Here's why — and how embeddings give token IDs the meaning the tokenizer strips away.
Johannes Hayer
Building ai-in-a-shell