Book — 20 chapters
Building Agentic AI Systems
15 chapters and 4 capstone projects covering agentic AI patterns, multi-agent orchestration, tool use, memory systems, and production deployment. Build portfolio-ready projects with working code.
Part 01 Foundations
What Is Agentic AI?
A Fortune 500 retailer deployed a chatbot that could answer questions about return policies. Within six months they tried to make it process actual returns, modify shipping addresses, and issue refunds. It could do none of those things — and the failure was not a bug. It was an architecture never designed for agency.
LLM Primitives
A production order-fulfillment agent silently doubled its cloud bill in seventy-two hours. Nobody changed the prompt. Nobody deployed new code. A single upstream schema change added four extra fields to every tool-call response, ballooning each completion from 800 tokens to 3,200. This chapter closes the gap between calling a model and understanding what each call costs.
Agent Anatomy
A customer-service agent went live on Monday and by Tuesday had refunded $14,000 to users who never asked for refunds. The LLM was fine. The architecture had no memory layer, no input validation, no confirmation gate. This chapter is the teardown.
Your First Agent
An LLM that calls tools without thinking is just a random function caller with good grammar. In this chapter, you build the loop that turns a language model into something that actually reasons.
Part 02 Core Patterns
Reasoning Patterns
Your agent from Chapter 4 can reason one step at a time. But ask it to plan a weekend trip involving flights, hotels, dietary restrictions, and a budget constraint, and it confidently books a hotel in the wrong city. The problem is not intelligence — it is the absence of structured thinking.
Tool Use
An agent without tools is a confident liar. It will invent API responses, fabricate database rows, and cite papers that do not exist — all with the fluency of someone who has done it a thousand times. This chapter gives your agents hands.
Memory
An agent that forgets what you said three messages ago is not an assistant — it is a stranger you keep re-introducing yourself to. This chapter gives your agents the ability to remember.
RAG Pipelines
A customer asks your agent whether the company's refund policy covers digital subscriptions. The agent responds with absolute confidence: yes, all purchases are eligible. The real policy, updated six weeks ago, explicitly excludes digital subscriptions. Three hundred support tickets later, someone finds the problem. RAG exists because models do not know what they do not know.
Part 03 Multi-Agent
Orchestration
A fraud checker timed out. Nobody had specified what to do when that happened, so the workflow approved the claim anyway. Forty-seven claims later, an auditor found they all came from the same body shop, with identical damage photos rotated by a few degrees. Orchestration is what keeps a missing timeout handler from becoming a $200,000 problem.
Supervisor-Worker Pattern
A single agent researched competitors, drafted analysis, pulled financial data, and produced a report. It was flawless in the demo and ruined in production, where it would start researching, chase a tangent, forget the report, and time out after burning $40 in API calls. Asking one agent to hold an entire workflow in its head is the wrong model. The supervisor-worker pattern fixes that.
Human-in-the-Loop
At 2:47 AM an automated procurement agent placed a $2.3 million order for industrial solvents. The purchase passed every internal check. What the agent couldn't know: the requesting department had submitted a cancellation four hours earlier through a channel the agent didn't monitor. Reversal cost $180,000. The agent wasn't broken. It was unsupervised.
Agent Communication
The marketing agent wrote a press release announcing a feature engineering had already descoped. The engineering agent drafted a deployment timeline that contradicted the date marketing had promised journalists. Neither agent was wrong in isolation. Each followed its instructions perfectly. The failure was that they never talked to each other.
Part 04 Production
Observability
A loan-approval agent spent six hours silently rejecting every application. No errors. No alerts. All 200s. Each applicant got a polite decline email citing the bank's lending criteria. The agent had been following an outdated policy document since a vector store reindex went wrong. You found it at 8 AM when the lending team noticed the approval rate had dropped to zero. This is what it looks like when an agent fails without failing.
Security
A customer-service agent deployed three weeks earlier starts sending internal pricing spreadsheets to anyone who asks. The prompt injection is elegant: a fake customer complaint telling the agent to ignore its system prompt and retrieve any document it can access. By morning, confidential data has gone to fourteen users. No SQL injection, no buffer overflow, no misconfigured firewall. The attack exploited the agent's fundamental capability: following instructions.
Deployment
The agent passed every test on your laptop. You merged the PR on Friday, pushed to production, and went home. By Saturday morning the on-call engineer had paged you three times: OOM-killed every forty minutes, API gateway routing to a stale replica with the old prompt template, and $1,200 in token spend overnight — ten times the daily budget. Nothing was technically broken. Everything worked in isolation. The system failed because nobody had designed the space between the components.
Part 05 Capstones
Capstone 1: Research Assistant
A senior analyst opens twelve browser tabs every morning, copies figures into a spreadsheet, cross-references claims, and drafts a two-page brief the partner reads in six minutes. The bottleneck is the process, not the analyst. This capstone replaces that ritual with a multi-agent research assistant that plans queries, searches in parallel, analyzes through RAG, synthesizes across sources, and produces a cited report ready for human review.
Capstone 2: Code Review Agent
A pull request sits in the queue for two days because the one person who knows that subsystem is on vacation. When it finally gets reviewed, the reviewer catches a style violation and a missing null check but misses the SQL injection on line 247. This capstone builds an automated PR reviewer combining static analysis, security scanning, style enforcement, and LLM reasoning into a single pipeline — the kind of system that ships in your portfolio and covers every pattern from Parts 1 through 4.
Capstone 3: Customer Support System
Ticket volume grows, response times stretch, quality drops, customers churn, and the support staff who remain burn out. You hire more people, but onboarding takes months and institutional knowledge stays locked in senior staff members' heads. Meanwhile your knowledge base (hundreds of articles, runbooks, and policy documents) sits in a wiki that nobody searches correctly. This capstone builds the system that breaks that spiral: RAG-powered retrieval, per-session memory, sentiment-driven escalation, and human handoff with full context transfer.
Capstone 4: Data Pipeline Orchestrator
Data pipelines break silently. A column renamed upstream, a vendor switching date formats, a nullable field that was never null until today — each failure looks trivial in hindsight, yet it propagates through warehouses for hours before anyone notices. This capstone builds an agentic ETL system that detects schema drift, validates quality, transforms data, and heals its own failures, replacing manual triage with a supervisor-worker architecture that keeps pipelines healthy around the clock.