The Math-Based Guide to RAG, MCP, and Agents
AI, But Simple Issue #108

After pre-training, a large language model is considered frozen, with its knowledge sealed inside its weights.
At this point, the base transformer can only take input tokens and sample output tokens. On its own, it can't search the web, carry out multi-step goals, or access a specific company's documents or data.
These are notable limitations if you want a model to do real work.
Suppose an employee asks an LLM: "Is this customer eligible for a refund?" A model can't answer this on its own:
It doesn’t have the company refund policy,
It can't look up the order or interact with the company systems,
And it can't reliably carry out a multi-step workflow from start to finish.

We present three techniques, each of which removes one of these constraints from a foundation LLM:
Retrieval-Augmented Generation (RAG) gives it knowledge it doesn't have, by efficiently retrieving relevant information.
Model Context Protocol (MCP) gives it access to tools it can't otherwise reach, so it can act on external systems.
Agents give it a loop to pursue a goal across many steps.
They build on each other and combine into a system that can actually tackle real, complex problems.
What You’ll Learn
Why a trained LLM is fundamentally limited
RAG: retrieving knowledge by meaning and the math of similarity search
MCP: the protocol that standardizes how a model uses tools
Agents: wrapping the model in a goal-seeking loop
When to use each technique
How the three compose into a single system
What’s Helpful to Know
Embedding
A vector that represents a piece of text's meaning.
Context window
The fixed budget of tokens a model can see at once.
Tool or function call
A model outputting a structured request (a name plus arguments) that an external system runs.
JSON-RPC
A standard format for sending a function call and obtaining a result.
Policy
A function that maps states to actions or, more generally, to a probability distribution over actions.

Reinforcement learning
A field of machine learning that focuses on training models to act in environments by maximizing a reward signal.
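To make two of these terms concrete, here is a minimal Python sketch. It compares toy embeddings with cosine similarity and builds a tool call as a JSON-RPC 2.0 request. The vectors are made-up numbers, and the tool name `lookup_order` and its `order_id` argument are hypothetical, chosen only to match the refund example. The `tools/call` method name follows the convention MCP uses; treat the rest as an illustration rather than a real server's API.

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (made-up values; real embeddings
# typically have hundreds or thousands of dimensions).
refund_policy = [0.9, 0.1, 0.2]
refund_question = [0.8, 0.2, 0.1]
weather_report = [0.1, 0.9, 0.3]


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking a server to run a named tool."""
    return {
        "jsonrpc": "2.0",      # protocol version, required by JSON-RPC 2.0
        "id": request_id,      # lets the caller match the response to the request
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool and argument, for illustration only.
request = make_tool_call(1, "lookup_order", {"order_id": "A-1001"})

if __name__ == "__main__":
    print("question vs. policy: ", cosine_similarity(refund_question, refund_policy))
    print("question vs. weather:", cosine_similarity(refund_question, weather_report))
    print(json.dumps(request, indent=2))
```

The question vector scores higher against the refund policy than against the weather report, which is the sense in which similar meanings end up as nearby vectors. The request is plain data: the model only emits the name and arguments, and an external system decides whether and how to run the tool.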