Last verified: April 2026

Tool-Using AI Agents: Function Calling, MCP, and Retrieval (2026)

Without tool use, an AI agent is a chat session with extra steps. Tool use is what lets an agent take actions in the world. Three mechanisms dominate production deployments: function calling, the Model Context Protocol, and retrieval-augmented generation. This page explains each at reference level.

All three are about the same thing under the surface: giving the language model a way to read or write something outside its own context window. The differences are in standardisation, dynamism, and what the model has to know in advance about the tools it has available.


Section 1

Function calling

Introduced across the major LLM providers in mid-2023. The first standard mechanism for giving a model access to external functions.

Function calling is the model emitting a structured request to invoke an external function instead of generating freeform text. The application code receives the structured call, executes the function, and feeds the result back to the model. The model then produces either another function call or a final answer.

The loop has three steps from the model's perspective: read tool schemas in the prompt; emit a tool call as part of the response; receive the tool result and continue. Tool schemas are JSON Schema fragments describing the function name and its parameters and their types. The runtime that hosts the agent (the application code, not the model) is responsible for actually executing the function and returning a result.

Function calling was introduced by OpenAI in mid-2023, followed by Anthropic and Google later that year. By 2024 it was the default pattern for any agent that needed to take actions outside the model.

Schema, illustrative
{
  "name": "search_knowledge_base",
  "description": "Search the company KB for an article",
  "parameters": {
    "type": "object",
    "properties": {
      "query": { "type": "string" },
      "max_results": { "type": "integer", "default": 5 }
    },
    "required": ["query"]
  }
}
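The three-step loop can be sketched in a few lines. Everything here is illustrative: `call_model` stands in for a real provider API, and the `search_knowledge_base` implementation is a stub matching the schema above; neither is part of any SDK.

```python
import json

# Stub implementation matching the illustrative schema above.
def search_knowledge_base(query, max_results=5):
    # A real implementation would query the knowledge base.
    return [f"article about {query}"][:max_results]

TOOLS = {"search_knowledge_base": search_knowledge_base}

def run_agent_loop(call_model, user_message, max_turns=8):
    """Minimal function-calling loop: the runtime, not the model,
    executes each tool call and feeds the result back."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        response = call_model(messages)           # hypothetical model API
        if response.get("tool_call"):
            call = response["tool_call"]
            fn = TOOLS[call["name"]]              # runtime looks up the function
            result = fn(**call["arguments"])      # runtime executes it
            messages.append({"role": "tool",
                             "name": call["name"],
                             "content": json.dumps(result)})
        else:
            return response["content"]            # final answer: loop ends
    raise RuntimeError("agent exceeded max_turns")
```

Note that the iteration cap is part of the runtime, not the model: the loop terminates either on a final answer or on `max_turns`, never on trust that the model will stop.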

Section 2

Model Context Protocol (MCP)

Open standard introduced by Anthropic in late 2024 for connecting LLMs to external tools and data sources.

The Model Context Protocol is an open standard that specifies how an LLM application talks to a tool server. The headline feature is dynamic discovery: an agent can connect to an MCP server at runtime and find out what tools and resources are available. The agent does not have to know in advance which tools it will need.

MCP introduces a separation of concerns that raw function calling lacks. The LLM application is responsible for the agent loop and the model. The MCP server is responsible for the tool implementations and their schemas. A team can ship an MCP server for a system, and any MCP-aware agent platform can use it without bespoke integration work.

The other capability MCP standardises is the "resource" primitive: data the agent can read but not call (a document, a file tree, a database row). A resource is distinct from a tool, which is a function the agent can invoke. The protocol covers prompts, tools, resources, and sampling. Specification at modelcontextprotocol.io.

MCP is the answer to the question raw function calling did not solve: how do you avoid bespoke per-vendor integration work for every tool an agent might want? An agent platform that supports MCP can talk to any MCP server without a custom adapter. The standardisation matters for organisations running more than one or two agent platforms.
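On the wire, MCP messages are JSON-RPC 2.0. The sketch below builds the two core requests, discovery (`tools/list`) and invocation (`tools/call`); the method names follow the published specification, but the transport layer (stdio or HTTP) is omitted and the tool name and arguments are illustrative.

```python
import itertools
import json

_ids = itertools.count(1)

def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request of the shape MCP uses.
    Sending it over a transport is out of scope for this sketch."""
    req = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        req["params"] = params
    return json.dumps(req)

# Discovery: ask the server what tools it exposes. The agent does not
# need to know the answer in advance; that is the point of MCP.
list_req = jsonrpc_request("tools/list")

# Invocation: call a discovered tool by name with arguments.
call_req = jsonrpc_request("tools/call", {
    "name": "search_knowledge_base",          # illustrative tool name
    "arguments": {"query": "vpn setup"},
})
```

The discovery step is what distinguishes MCP from raw function calling: the tool schemas arrive in the `tools/list` response at runtime rather than being baked into the prompt at build time.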


Section 3

Retrieval as a tool

Retrieval-augmented generation (RAG) is a special case of tool use where the tool is "search the knowledge base". The agent emits a query, a retriever returns matching documents, and the model generates an answer grounded in the retrieved content. Most enterprise agent deployments include retrieval as their first and most-used tool.

The reason retrieval is called out separately is volume. In most production agents, the retrieval tool is the most-called tool. A customer support agent retrieves before nearly every reply. A research-assistant agent retrieves before nearly every reasoning step. Optimising the retrieval layer (chunking, embedding model, re-ranking, hybrid keyword-plus-semantic search) often improves the agent more than tuning the model itself.
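A toy sketch of the retrieval-tool contract, using plain word overlap in place of a real embedding-plus-re-ranking pipeline. The scoring is deliberately simplistic; what matters is the agent-facing shape: query in, ranked passages out.

```python
def retrieve(query, documents, top_k=3):
    """Toy retrieval tool: rank documents by word overlap with the
    query. Production systems use embeddings, re-ranking, and hybrid
    search, but expose the same contract to the agent."""
    q_terms = set(query.lower().split())
    scored = []
    for doc_id, text in documents.items():
        overlap = len(q_terms & set(text.lower().split()))
        if overlap:
            scored.append((overlap, doc_id, text))
    scored.sort(reverse=True)                      # best match first
    return [{"id": d, "text": t} for _, d, t in scored[:top_k]]
```

Because the retriever sits behind the same tool interface as anything else, the chunking, embedding, and re-ranking choices mentioned above can all be changed without touching the agent loop.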


Section 4

Agent-to-agent (A2A) protocols

The emerging counterpart to MCP for agent-to-agent communication. Where MCP standardises how an agent talks to a tool server, A2A standardises how one agent talks to another. As of early 2026, the field is unsettled: Anthropic, OpenAI, Google, and several startups have proposed competing specifications. The full multi-agent treatment is on the multi-agent systems page.

Tool-use failure modes

Six failure modes are responsible for the majority of production tool-use incidents. The full evaluation framework lives on the how to evaluate an AI agent page.

  1. Hallucinated tool calls. The model invents a tool that does not exist or invents arguments to a real tool. The runtime usually catches this, but a poorly written runtime may pass it through.
  2. Incorrect argument formatting. The model emits arguments in the wrong shape: a string where an integer was expected, a malformed date, a missing required field. JSON Schema validation at the runtime layer catches most of this.
  3. Ignored tool errors. The tool returned an error; the agent treated it as success. Common when the tool result is not in a format the model recognises as an error.
  4. Infinite tool-use loops. The agent calls the same tool with similar arguments indefinitely. Always set a hard iteration cap and a per-tool budget.
  5. Tool selection errors. Multiple tools could plausibly handle a request; the agent picks the wrong one. Mitigation is sharper tool descriptions and, where possible, narrower tool catalogues.
  6. Argument injection. Untrusted input flows through the model and into a tool argument, allowing the user to influence what the tool does. Treat all model output as untrusted before passing to side-effectful tools.
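Runtime-side guards against modes 1, 2, and 4 can be sketched as a single validation pass run before every tool execution. The function name, schema shape, and thresholds below are illustrative, not any library's API; a production runtime would use a full JSON Schema validator rather than the required-field check shown here.

```python
def validate_call(call, schemas, seen_calls, max_repeats=3):
    """Return an error string if the call should be rejected,
    or None if it may proceed. Guards failure modes 1, 2, and 4."""
    schema = schemas.get(call["name"])
    if schema is None:
        return "unknown tool: " + call["name"]            # mode 1
    for field in schema["parameters"].get("required", []):
        if field not in call["arguments"]:
            return "missing required argument: " + field  # mode 2
    # Count identical (tool, arguments) pairs to catch loops.
    key = (call["name"], tuple(sorted(call["arguments"].items())))
    seen_calls[key] = seen_calls.get(key, 0) + 1
    if seen_calls[key] > max_repeats:
        return "loop detected: identical call repeated"   # mode 4
    return None
```

Rejections are best fed back to the model as tool results rather than raised as exceptions, so the agent has a chance to correct course; modes 3, 5, and 6 need mitigations elsewhere (error-shaped tool results, better tool descriptions, and input sanitisation respectively).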