How an AI Agent Books a Complicated Flight
A walkthrough of one agent run on a multi-constraint travel goal. Eight tool calls, four phases of the agent loop, one error to recover from. The point is to show what an agent actually does between the user's request and the result.
Most explainers stop at the diagram. This page does not. The user goal is below the demo. Hit auto-play and watch the agent run, or step through manually if you want to read each tool call in full. Reasoning is on the left in monospace. Tool input and tool output sit on the right. The brick-clay panel is the agent's decision after each observation.
I need to fly from London to Tokyo for a 4-day conference, prefer overnight, need vegan meal, layover under 4 hours, budget GBP 900.
User goal: London to Tokyo, 4-day conference, prefer overnight, vegan meal, layover under 4h, budget GBP 900. I need to break this into search constraints first, then search, then validate the remaining constraints (meal, layover) before booking.
14 candidates within constraints. Cheapest is GBP 681. Now I need to narrow by overnight outbound and check vegan meal availability per carrier.
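The narrowing step described here can be sketched in Python. This is a minimal sketch under stated assumptions, not the demo's implementation: the `Candidate` fields, the `TRIP-*` sample rows, and the rule that "overnight" means departing at or after 18:00 are all hypothetical. Only the GBP 900 budget, the 4-hour layover cap, and the vegan-meal requirement come from the user goal.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Candidate:
    # Hypothetical shape for one search result; not the real tool schema.
    trip_id: str
    price_gbp: int
    max_layover_hours: float
    vegan_meal: bool
    depart: datetime


def meets_hard_constraints(c: Candidate, budget_gbp: int = 900) -> bool:
    # Hard constraints from the user goal: budget, layover under 4h, vegan meal.
    return c.price_gbp <= budget_gbp and c.max_layover_hours < 4 and c.vegan_meal


def is_overnight(c: Candidate) -> bool:
    # Assumption: "overnight" means departing at or after 18:00 local time.
    return c.depart.hour >= 18


def choose(candidates):
    # Filter on hard constraints first, then rank the survivors:
    # overnight departures sort ahead of others, cheaper fares break ties.
    viable = [c for c in candidates if meets_hard_constraints(c)]
    if not viable:
        return None
    return min(viable, key=lambda c: (not is_overnight(c), c.price_gbp))


# Hypothetical sample data, invented for this sketch.
SAMPLE = [
    Candidate("TRIP-A", 681, 3.5, False, datetime(2026, 5, 12, 21, 0)),
    Candidate("TRIP-B", 712, 2.0, True, datetime(2026, 5, 12, 19, 30)),
    Candidate("TRIP-C", 695, 3.0, True, datetime(2026, 5, 13, 5, 30)),
    Candidate("TRIP-D", 950, 1.5, True, datetime(2026, 5, 12, 20, 0)),
]

best = choose(SAMPLE)
print(best.trip_id if best else "no viable trip")
```

In this sample the cheapest row fails the vegan check and the most expensive row breaks the budget, so the choice falls between two viable trips and the overnight preference decides it. Ranking the soft preference ahead of price is one possible reading of "tiebreaker"; a real agent might weigh the two differently.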
Tool input:
flights.search({
  from: "LHR",
  to: "NRT",
  depart_after: "2026-05-12T18:00",
  depart_before: "2026-05-13T06:00",
  return_after: "2026-05-16T08:00",
  return_before: "2026-05-17T12:00",
  max_layover_hours: 4,
  cabin: "economy",
  currency: "GBP"
})

Tool output:
{
  "results": 14,
  "cheapest": 681,
  "median": 814,
  "trip_ids": ["JL44-JL45", "BA005-BA006", "NH212-NH211", ...]
}

The agent capabilities on display
Eight tool calls, four phases, one error recovery, three side effects. Each item below is a capability that distinguishes an agent from a chatbot or a workflow.
- Multi-tool use (see Tool-using agents)
Eight different tools across the eight steps: flights.search, flights.filter, airline_meals.check, flights.hold, flights.quote, flights.book, calendar.add_event, email.send. The agent decides which tool to call based on what it needs next.
- Constraint satisfaction (see How agents work)
Hard constraints (LHR -> NRT, vegan meal, layover under 4h, budget GBP 900) versus soft preferences (overnight). The agent treated the soft constraint as a tiebreaker once the hard set narrowed the candidates.
- Error recovery (see Failure modes & recovery)
Step 4 returned HOLD_UNAVAILABLE. A scripted bot would have stopped. The agent reasoned about the error, dropped the hold step, and re-quoted the trip before booking instead.
- Observation (see Sense-think-act-observe)
After every tool call, the agent reads the response and updates its plan. Step 5 caught a fare increase from GBP 712 to GBP 728 between search and quote.
- Decision authority (see Agent vs chatbot)
The agent decided to proceed at the higher fare because GBP 728 was still well within the user-stated budget of GBP 900. A workflow with a rigid rule (for example, abort on any price change) would have stopped; the agent reasoned about the threshold.
- Multi-step side effects (see Examples)
Booking the flight is not the goal. The goal is the trip. The agent chained the booking with calendar updates in the right timezones and an email confirmation in a single sequence.
- Honest summary (see Evaluation)
The final step explicitly reported the GBP 16 increase over the searched fare (GBP 712 to GBP 728), which stayed within the GBP 900 budget. The user retains the ability to override on the next run because they know what was decided and why.
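
The error-recovery, observation, and decision-authority items above can be sketched together in Python. Everything in this block is hypothetical: `hold`, `quote`, and `book` are stand-in stubs rather than the real `flights.*` tools, `ToolError` is an invented exception type, and the stubbed values simply mirror the figures on this page (HOLD_UNAVAILABLE, a GBP 712 searched fare, a GBP 728 quote, a GBP 900 budget).

```python
class ToolError(Exception):
    """Hypothetical error type carrying a tool's error code."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def hold(trip_id):
    # Stub: always fails, mirroring the HOLD_UNAVAILABLE response in step 4.
    raise ToolError("HOLD_UNAVAILABLE")


def quote(trip_id):
    # Stub: returns the re-quoted fare from step 5.
    return 728


def book(trip_id, price_gbp):
    # Stub: pretends the booking succeeded.
    return {"trip_id": trip_id, "price_gbp": price_gbp, "status": "BOOKED"}


def run_booking(trip_id, searched_price_gbp, budget_gbp):
    log = []
    try:
        hold(trip_id)
        log.append("held")
    except ToolError as err:
        # Recover only from the one error we understand; re-raise anything else.
        if err.code != "HOLD_UNAVAILABLE":
            raise
        log.append("hold unavailable, skipping hold")

    # Observe: re-quote so the decision uses the current fare, not the searched one.
    quoted = quote(trip_id)
    delta = quoted - searched_price_gbp

    # Decide: compare against the user's budget rather than aborting on any change.
    if quoted > budget_gbp:
        log.append(f"aborted: quote GBP {quoted} exceeds budget GBP {budget_gbp}")
        return None, log
    if delta > 0:
        log.append(f"fare rose by GBP {delta}, still within budget")

    return book(trip_id, quoted), log


booking, log = run_booking("TRIP-B", searched_price_gbp=712, budget_gbp=900)
print(booking)
print(log)
```

The log doubles as raw material for the honest summary: it records both the skipped hold and the fare change, so the final message can name them. Passing a lower `budget_gbp`, such as 720, exercises the abort path instead.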
This is a scripted illustration
The tool calls and responses on this page are illustrative. They show the shape of a real agent run, not a recording of one. Production agents typically run 5 to 15 iterations on tasks of this complexity, sometimes more when an error path opens up. The reasoning text is a reasonable approximation of what a 2026 frontier model emits; the exact wording varies by model and system prompt.
For a citation-shaped reference on what is constant across runs, see how AI agents work. For the architectural breakdown of the loop the demo above traces, see the expanded diagram on that page.