Track 1 · Foundations

The Transformer & LLMs

The mental model behind modern language models: how text becomes tokens, how attention moves information between them, how a transformer block is built, and why generation is both powerful and fallible.

6 lessons · Beginner to intermediate · After AI/ML Foundations

Tokenization: how text becomes tokens

Why an LLM does not read words directly, how subword tokens work, and why token boundaries affect cost, speed, and behavior.

Attention: how tokens look at each other

Queries, keys, values, and attention weights without drowning in matrix notation. The core trick that made transformers work.

Transformer architecture: the LLM block

The repeating block inside an LLM: embeddings, positional information, attention, MLPs, residual paths, and layer norm.

How LLMs generate text

Next-token prediction, logits, probabilities, sampling, temperature, and why generation is a loop instead of a one-shot answer.

Context windows: prompt, memory, and limits

What actually fits in context, why the model does not remember anything outside it, and how longer contexts affect cost and latency.

Why LLMs hallucinate

Why fluent text is not the same as truth, where hallucinations come from, and what engineering patterns reduce them.