Quick Reference 04

AI Agents

Quick reference for agent architecture, tool definitions, memory types, and orchestration patterns.


Agent Anatomy: Observe-Think-Act Loop

Every AI agent, regardless of framework, follows this core loop. Understanding it is essential before you add complexity with multi-agent patterns or custom orchestration.

                    ┌─────────────┐
                    │   OBSERVE   │
                    │ Read input, │
                    │ tool output,│
                    │ environment │
                    └──────┬──────┘
                           │
                    ┌──────▼──────┐
                    │    THINK    │
                    │ Reason about│
                    │ next action │
                    │ (LLM call)  │
                    └──────┬──────┘
                           │
                    ┌──────▼──────┐
                    │     ACT     │
                    │ Call tool,  │
                    │ respond, or │
                    │ delegate    │
                    └──────┬──────┘
                           │
                    loops until done
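
The loop in the diagram can be sketched in a few lines. This is a minimal illustration, not a framework API: `run_agent`, the `think` callable (a stand-in for the LLM call), and the toy `lookup` tool are all hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical decision shape: ("tool", tool_name, argument) or ("respond", text, None).
Decision = Tuple[str, str, object]

def run_agent(user_input: str,
              think: Callable[[List[str]], Decision],
              tools: Dict[str, Callable[[object], str]],
              max_steps: int = 5) -> str:
    observations = [f"user: {user_input}"]           # OBSERVE: initial input
    for _ in range(max_steps):
        kind, name, arg = think(observations)        # THINK: stand-in for an LLM call
        if kind == "respond":                        # ACT: final response ends the loop
            return name
        result = tools[name](arg)                    # ACT: call the chosen tool
        observations.append(f"{name} -> {result}")   # OBSERVE: tool output
    return "stopped: step limit reached"

# Toy policy standing in for the model: look something up once, then answer.
def toy_think(observations: List[str]) -> Decision:
    if len(observations) == 1:
        return ("tool", "lookup", "order 1234")
    return ("respond", f"Done after seeing: {observations[-1]}", None)

answer = run_agent("Where is my order?", toy_think, {"lookup": lambda q: "shipped"})
print(answer)
```

The step limit is part of the loop itself, so a policy that never responds still terminates.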

Core Agent Components

An agent is more than an LLM with tools -- it needs memory, planning, and reflection to handle real-world tasks reliably. Missing any of these components leads to agents that work in demos but fail in production.

| Component | Purpose | Implementation |
| --- | --- | --- |
| System prompt | Identity, rules, capabilities | Static text + dynamic context |
| Tool definitions | Available actions | Function schemas (JSON) |
| Memory | Conversation context | Buffer, summary, or vector store |
| Planning | Task decomposition | CoT, ReAct, or explicit planner |
| Execution | Tool calling + result processing | Function dispatch + error handling |
| Reflection | Self-check, retry logic | Output validation, critic LLM |

Tool Definition Patterns

Tool definitions are the contract between your agent and the outside world. Vague descriptions and sloppy schemas are among the most common reasons agents call the wrong tool or pass bad arguments.

OpenAI-Compatible Tool Schema

{
  "type": "function",
  "function": {
    "name": "search_database",
    "description": "Search the product database by query. Returns up to max_results results (default 5).",
    "parameters": {
      "type": "object",
      "properties": {
        "query": {
          "type": "string",
          "description": "Natural language search query"
        },
        "category": {
          "type": "string",
          "enum": ["electronics", "clothing", "books"],
          "description": "Optional category filter"
        },
        "max_results": {
          "type": "integer",
          "default": 5,
          "description": "Maximum number of results to return"
        }
      },
      "required": ["query"]
    }
  }
}
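
A schema only helps if the arguments the model produces are checked against it before the tool runs. The sketch below is a deliberately small, stdlib-only validator that covers just `required`, `type`, `enum`, and `default` for the `search_database` parameters. `validate_args` and `PY_TYPES` are illustrative names, and a production system would more likely rely on a full JSON Schema validator.

```python
import json

SEARCH_PARAMS = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "category": {"type": "string", "enum": ["electronics", "clothing", "books"]},
        "max_results": {"type": "integer", "default": 5},
    },
    "required": ["query"],
}

# Only the two JSON types used by this schema are mapped.
PY_TYPES = {"string": str, "integer": int}

def validate_args(raw_json: str, schema: dict):
    """Return (args, errors). Checks required, type, enum; then fills defaults."""
    try:
        args = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return None, [f"arguments are not valid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return None, ["arguments must be a JSON object"]

    errors = []
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required parameter: {name}")
    for name, value in args.items():
        spec = schema["properties"].get(name)
        if spec is None:
            errors.append(f"unknown parameter: {name}")
            continue
        expected = PY_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly for integers.
        if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            errors.append(f"{name} must be of type {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name} must be one of {spec['enum']}")
    if errors:
        return None, errors

    for name, spec in schema["properties"].items():
        if "default" in spec:
            args.setdefault(name, spec["default"])
    return args, []

print(validate_args('{"query": "usb cable"}', SEARCH_PARAMS))
print(validate_args('{"category": "toys", "max_results": "3"}', SEARCH_PARAMS))
```

Returning a list of error strings, rather than raising, makes it easy to hand the whole list back to the model so it can correct every argument in one retry.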

Tool Design Best Practices

| Principle | Good | Bad |
| --- | --- | --- |
| Specific name | search_orders_by_email | search |
| Clear description | "Finds orders by customer email. Returns order ID, date, total." | "Search stuff" |
| Typed parameters | {"type": "integer", "minimum": 1} | {"type": "string"} for a number |
| Required vs optional | Mark only truly required as required | Everything required |
| Return format | Documented, consistent structure | Unpredictable output |
| Error surface | Return error in result, don't throw | Silent failure |
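
The "return format" and "error surface" rows can be enforced in one place with a wrapper. The following is a sketch under assumptions: the `as_tool` decorator and the `{"ok": ..., "data": ...}` envelope are illustrative conventions, not part of any framework, and the order data is an in-memory stand-in.

```python
import functools

def as_tool(fn):
    """Wrap a function so every call returns the same envelope, even on failure."""
    @functools.wraps(fn)
    def wrapper(**kwargs):
        try:
            return {"ok": True, "data": fn(**kwargs)}
        except Exception as exc:  # broad on purpose: the model should see every failure
            return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    return wrapper

@as_tool
def search_orders_by_email(email: str):
    # Hypothetical in-memory data standing in for a real order store.
    orders = {"user@example.com": [{"order_id": 1234, "total": 59.90}]}
    if "@" not in email:
        raise ValueError("email must contain '@'")
    return orders.get(email, [])

print(search_orders_by_email(email="user@example.com"))
print(search_orders_by_email(email="not-an-email"))
```

Because the wrapper also catches the `TypeError` raised by a missing or misspelled keyword argument, a malformed call produces an error result instead of crashing the agent loop.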

Memory Types

Without memory, every agent turn starts from scratch. The right memory architecture determines whether your agent can handle a 5-message chat or a 500-step workflow spanning multiple sessions.

| Memory Type | How It Works | Capacity | Use Case |
| --- | --- | --- | --- |
| Buffer (sliding window) | Keep last N messages | Low-medium | Short conversations |
| Token buffer | Keep last N tokens of history | Medium | Token-budget aware |
| Summary | LLM summarizes older messages | High | Long conversations |
| Vector/semantic | Embed messages, retrieve relevant | Very high | Knowledge-heavy agents |
| Episodic | Store full episodes, retrieve by similarity | Very high | Learning from past tasks |
| Entity | Extract and track entity states | Medium | Customer service, CRM |
| Structured (KG) | Knowledge graph of facts | High | Complex domain reasoning |
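
The first two rows are simple enough to sketch directly. The class names below are illustrative, and the default token counter is a whitespace word count, a crude stand-in for a real tokenizer that you would swap in for production use.

```python
from collections import deque

class BufferMemory:
    """Sliding window: keep only the last `max_messages` messages."""
    def __init__(self, max_messages: int):
        self.messages = deque(maxlen=max_messages)  # old items fall off automatically

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def context(self) -> list:
        return list(self.messages)

class TokenBufferMemory:
    """Keep the most recent messages whose combined size fits within `max_tokens`."""
    def __init__(self, max_tokens: int, count_tokens=lambda text: len(text.split())):
        self.max_tokens = max_tokens
        self.count_tokens = count_tokens  # assumption: word count approximates tokens
        self.messages = []

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def context(self) -> list:
        kept, used = [], 0
        for msg in reversed(self.messages):        # walk from newest to oldest
            cost = self.count_tokens(msg["content"])
            if used + cost > self.max_tokens:
                break
            kept.append(msg)
            used += cost
        return list(reversed(kept))                # restore chronological order

buf = BufferMemory(max_messages=2)
for text in ("a", "b", "c"):
    buf.add("user", text)
print([m["content"] for m in buf.context()])

tok = TokenBufferMemory(max_tokens=5)
for text in ("one two three", "four five", "six seven eight"):
    tok.add("user", text)
print([m["content"] for m in tok.context()])
```

Both classes drop old messages silently. Summary memory differs in that it would replace the dropped messages with an LLM-written digest instead of discarding them.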

Memory Selection Guide

How long are typical conversations?
  < 10 turns    -> Buffer memory (simple, cheap)
  10-50 turns   -> Summary memory (compress old context)
  50+ turns     -> Vector memory (retrieve relevant only)

Does the agent need to learn across sessions?
  YES -> Episodic + Vector memory
  NO  -> Buffer or Summary is fine

Does the agent track many entities?
  YES -> Entity memory + structured storage
  NO  -> Standard memory is fine
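
The guide above can be encoded as a small function, which is handy when the choice is made per deployment from configuration. This is only a sketch: `select_memory` and the returned labels are illustrative, and treating exactly 50 turns as "summary" is an assumption, since the guide leaves that boundary open.

```python
def select_memory(typical_turns: int,
                  cross_session: bool = False,
                  many_entities: bool = False) -> list:
    """Map the three questions in the guide to a list of memory types."""
    if typical_turns < 10:
        base = "buffer"
    elif typical_turns <= 50:   # assumption: 50 itself falls in the summary band
        base = "summary"
    else:
        base = "vector"

    kinds = [base]
    if cross_session:
        # Learning across sessions adds episodic and vector memory, without duplicates.
        for kind in ("episodic", "vector"):
            if kind not in kinds:
                kinds.append(kind)
    if many_entities:
        kinds.append("entity")
    return kinds

print(select_memory(5))
print(select_memory(30, cross_session=True))
print(select_memory(200, cross_session=True, many_entities=True))
```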

Agent Frameworks Comparison

Choosing a framework is a build-vs-buy decision that affects your iteration speed and lock-in. Pick based on your language, complexity needs, and whether you need multi-agent support.

| Framework | Language | Key Feature | Best For |
| --- | --- | --- | --- |
| LangGraph | Python/JS | Graph-based workflows | Complex stateful agents |
| CrewAI | Python | Role-based multi-agent | Team simulations |
| AutoGen | Python | Conversational agents | Research, debate patterns |
| Semantic Kernel | C#/Python | Enterprise integration | .NET ecosystems |
| Haystack | Python | Pipeline-based | RAG-heavy agents |
| Agents SDK (OpenAI) | Python/TypeScript | Handoffs, guardrails | OpenAI-centric apps |
| Claude Agent SDK | Python/TypeScript | MCP tool integration | Anthropic-centric apps |
| Mastra | TypeScript | Workflows, evals | TS/JS applications |

ReAct Pattern

ReAct (Reason + Act) is one of the most widely used agent patterns because it has the model state its reasoning before each action. This makes agent behavior easier to inspect and debug.

Thought: I need to find the user's order status. I'll search by their email.
Action: search_orders(email="user@example.com")
Observation: Found order #1234, status: shipped, tracking: XYZ789
Thought: I have the info. I'll respond with the order status and tracking number.
Answer: Your order #1234 has been shipped. Tracking number: XYZ789.

ReAct Implementation

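# Sketch: `llm`, `execute_tool`, and REACT_SYSTEM_PROMPT are placeholders
# assumed to be defined elsewhere; messages use the OpenAI-compatible format.
def react_loop(query, tools, max_steps=10):
    messages = [
        {"role": "system", "content": REACT_SYSTEM_PROMPT},
        {"role": "user", "content": query}
    ]

    for step in range(max_steps):
        response = llm.chat(messages, tools=tools)

        if not response.tool_calls:
            return response.content  # Final answer

        # Record the assistant turn that requested the tools. Without it, the
        # tool results below have no matching tool call in the history.
        messages.append({"role": "assistant", "content": response.content,
                         "tool_calls": response.tool_calls})

        for call in response.tool_calls:
            try:
                result = execute_tool(call.name, call.args)
            except Exception as exc:
                # Surface the failure to the model so it can retry or change course
                result = f"Error: {exc}"
            messages.append({"role": "tool", "content": str(result),
                             "tool_call_id": call.id})

    return "Max steps reached without resolution."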

Orchestration Patterns

How you wire agents together determines your system's capability ceiling and failure modes. Start with the simplest pattern that works and only add complexity when you have evidence a single agent cannot handle the task.

| Pattern | Description | Use Case | Complexity |
| --- | --- | --- | --- |
| Single agent | One LLM + tools | Simple tasks | Low |
| Sequential pipeline | Agent A output feeds Agent B | Multi-step processing | Medium |
| Parallel fan-out | Same input to N agents | Multiple perspectives | Medium |
| Router | Classifier routes to specialist | Domain-specific handling | Medium |
| Supervisor-worker | Supervisor delegates, reviews | Complex task decomposition | High |
| Hierarchical | Multi-level supervisors | Enterprise workflows | High |
| Debate/consensus | Agents argue, reach agreement | High-stakes decisions | High |
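
As an example of one of the medium-complexity rows, the router pattern reduces to a classifier plus a dispatch table. The sketch below uses a keyword classifier purely as a stand-in. In practice the classifier would probably be an LLM call or a trained model, and the specialists would be full agents rather than lambdas. All names are illustrative.

```python
from typing import Callable, Dict

def keyword_classifier(message: str) -> str:
    """Toy classifier standing in for an LLM or trained intent model."""
    text = message.lower()
    if any(word in text for word in ("refund", "invoice", "charge")):
        return "billing"
    if any(word in text for word in ("error", "crash", "bug")):
        return "technical"
    return "general"

# Each specialist is a plain callable here; real ones would be agents with their own tools.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "billing": lambda m: f"[billing agent] handling: {m}",
    "technical": lambda m: f"[technical agent] handling: {m}",
    "general": lambda m: f"[general agent] handling: {m}",
}

def route(message: str,
          classify: Callable[[str], str] = keyword_classifier,
          specialists: Dict[str, Callable[[str], str]] = SPECIALISTS) -> str:
    label = classify(message)
    # Fall back to the general specialist if the classifier returns an unknown label.
    handler = specialists.get(label, specialists["general"])
    return handler(message)

print(route("I want a refund for my last invoice"))
print(route("The app shows an error on login"))
```

The fallback matters more than it looks: an LLM-based classifier can return a label outside the expected set, and the router should degrade to a default rather than fail.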

Supervisor-Worker Pattern

# Supervisor decides which worker to call
supervisor_prompt = """
You are a supervisor managing these workers:
- researcher: Finds information from documents
- calculator: Performs mathematical computations
- writer: Drafts text content

Given the user request, decide which worker(s) to call and in what order.
Respond with a plan as JSON: {"steps": [{"worker": "...", "task": "..."}]}
"""

# Workers are specialized agents with focused tool sets.
# `Agent` and the tool functions are placeholders for your framework's equivalents.
workers = {
    "researcher": Agent(tools=[search, retrieve]),
    "calculator": Agent(tools=[calculate, chart]),
    "writer": Agent(tools=[draft, edit]),
}
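
Once the supervisor returns its JSON plan, something has to execute it. The sketch below shows one possible executor, with the assumptions stated plainly: `run_plan` is a hypothetical helper, workers are plain callables taking `(task, prior_results)` rather than real agents, and steps run strictly in order.

```python
import json

def run_plan(plan_json: str, workers: dict) -> list:
    """Execute a supervisor plan step by step, passing earlier results forward."""
    plan = json.loads(plan_json)
    results = []
    for step in plan["steps"]:
        worker = workers.get(step["worker"])
        if worker is None:
            # Record unknown worker names instead of raising, so the
            # supervisor can see the problem and re-plan.
            results.append({"worker": step["worker"], "error": "unknown worker"})
            continue
        output = worker(step["task"], results)
        results.append({"worker": step["worker"], "output": output})
    return results

# Stub workers: plain functions standing in for the specialized agents.
stub_workers = {
    "researcher": lambda task, prior: f"notes on {task}",
    "writer": lambda task, prior: f"draft using {prior[-1]['output']}",
}

plan = ('{"steps": [{"worker": "researcher", "task": "Q3 sales"},'
        ' {"worker": "writer", "task": "summary"}]}')
results = run_plan(plan, stub_workers)
for entry in results:
    print(entry)
```

A fuller version would send `results` back to the supervisor for the review step that the pattern calls for.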

Planning Strategies

Planning determines whether your agent tackles a complex task methodically or stumbles through it. The right strategy balances structure against adaptability -- too rigid and the agent can't recover from surprises, too loose and it loses track of its goal.

| Strategy | How It Works | Pros | Cons |
| --- | --- | --- | --- |
| No explicit plan | LLM decides step by step | Simple, flexible | May lose track |
| Upfront plan | Generate full plan first, then execute | Organized | Inflexible to new info |
| Adaptive plan | Plan, execute, re-plan after each step | Flexible, informed | Higher LLM cost |
| Plan-and-solve | Decompose into sub-tasks, solve each | Good for complex tasks | Overhead |
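
The adaptive strategy is the least obvious of the four, so here is a minimal sketch of its control flow. The planner and executor are toy functions standing in for LLM calls, and `adaptive_execute`, `FALLBACKS`, and the step names are hypothetical.

```python
def adaptive_execute(goal, make_plan, execute_step, max_steps=10):
    """Plan, run only the first step, then re-plan with everything learned so far."""
    history = []
    for _ in range(max_steps):
        plan = make_plan(goal, history)   # re-planned on every iteration
        if not plan:                      # an empty plan means nothing is left to do
            return history
        step = plan[0]
        history.append((step, execute_step(step)))
    return history                        # step limit reached

# Toy planner: skip finished steps and swap a failed step for its fallback.
FALLBACKS = {"fetch_api": "fetch_cache"}

def toy_planner(goal, history):
    outcomes = dict(history)              # step -> latest outcome
    plan = []
    for step in goal:
        if outcomes.get(step) == "failed" and step in FALLBACKS:
            step = FALLBACKS[step]
        if outcomes.get(step) != "ok":
            plan.append(step)
    return plan

def toy_executor(step):
    return "failed" if step == "fetch_api" else "ok"

trace = adaptive_execute(["fetch_api", "parse", "report"], toy_planner, toy_executor)
print(trace)
```

An upfront plan would have stopped or blindly continued after `fetch_api` failed. Here the next planning call sees the failure and substitutes the fallback, at the cost of one planner call per step.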

Error Handling

Agents fail in ways that traditional software does not -- they hallucinate tool names, pass invalid arguments, and get stuck in infinite loops. Robust error handling is what separates a demo agent from a production one.

| Error Type | Detection | Recovery |
| --- | --- | --- |
| Tool not found | Invalid tool name in response | Re-prompt with available tools |
| Tool execution failure | Exception from tool | Return error to LLM, let it retry |
| Infinite loop | Step counter exceeds max | Force response or escalate |
| Hallucinated tool call | Tool name not in schema | Filter, re-prompt |
| Wrong arguments | Schema validation failure | Return validation error to LLM |
| Context overflow | Token count exceeded | Summarize history, trim old messages |
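
Several of these rows can be handled at the dispatch layer, before the result ever reaches the model. The sketch below is one way to do that with plain Python callables. `safe_dispatch` and the registry layout are illustrative, and treating `TypeError` as an argument problem is a heuristic, since a tool can also raise `TypeError` internally.

```python
def safe_dispatch(tool_name: str, args: dict, registry: dict) -> str:
    """Turn common failure modes into messages the model can act on."""
    if tool_name not in registry:
        # Covers both "tool not found" and hallucinated tool names.
        available = ", ".join(sorted(registry))
        return f"ERROR: unknown tool '{tool_name}'. Available tools: {available}"
    try:
        return str(registry[tool_name](**args))
    except TypeError as exc:      # typically wrong or missing arguments
        return f"ERROR: bad arguments for '{tool_name}': {exc}"
    except Exception as exc:      # failure inside the tool itself
        return f"ERROR: '{tool_name}' failed: {exc}"

registry = {
    "add": lambda a, b: a + b,
    "div": lambda a, b: a / b,
}

print(safe_dispatch("add", {"a": 1, "b": 2}, registry))
print(safe_dispatch("subtract", {"a": 1, "b": 2}, registry))
print(safe_dispatch("add", {"a": 1}, registry))
print(safe_dispatch("div", {"a": 1, "b": 0}, registry))
```

Listing the available tools in the "unknown tool" message implements the "re-prompt with available tools" recovery without a separate round trip.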

Agent Evaluation

Agent evaluation goes beyond LLM output quality -- you also need to measure tool accuracy, step efficiency, and cost. Without these metrics, you are flying blind on whether your agent is actually improving.

| Metric | What It Measures | How to Measure |
| --- | --- | --- |
| Task completion | Did the agent finish the task? | Binary success/failure |
| Tool accuracy | Correct tool called with correct args? | Compare to gold standard |
| Step efficiency | Number of steps to complete | Count tool calls |
| Cost | Total tokens consumed | Sum input + output tokens |
| Latency | Time to complete task | Wall clock time |
| Safety | No harmful actions taken | Red-team testing |
| User satisfaction | Did the user get what they needed? | Thumbs up/down, CSAT |
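
The quantitative rows can be computed from logged runs with very little code. The sketch below assumes a hypothetical `Run` record; the field names are illustrative, not a standard log format. It scores tool accuracy by comparing calls position by position against the gold standard, which is a simplification that penalizes valid reorderings. It also assumes at least one run.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Run:
    """One recorded agent run (illustrative fields)."""
    succeeded: bool
    tool_calls: List[Tuple[str, dict]]       # (tool name, arguments) actually made
    expected_calls: List[Tuple[str, dict]]   # gold-standard calls for this task
    input_tokens: int
    output_tokens: int
    seconds: float

def summarize(runs: List[Run]) -> dict:
    n = len(runs)
    # Count calls that match the gold standard at the same position.
    matched = sum(
        sum(1 for made, gold in zip(r.tool_calls, r.expected_calls) if made == gold)
        for r in runs
    )
    expected = sum(len(r.expected_calls) for r in runs)
    return {
        "task_completion": sum(r.succeeded for r in runs) / n,
        "tool_accuracy": matched / expected if expected else 1.0,
        "avg_steps": sum(len(r.tool_calls) for r in runs) / n,
        "total_tokens": sum(r.input_tokens + r.output_tokens for r in runs),
        "avg_latency_s": sum(r.seconds for r in runs) / n,
    }

runs = [
    Run(True, [("search", {"q": "a"})], [("search", {"q": "a"})], 100, 20, 1.5),
    Run(False, [("search", {"q": "b"}), ("search", {"q": "b"})],
        [("lookup", {"id": 1})], 300, 80, 2.5),
]
report = summarize(runs)
print(report)
```

Safety and user satisfaction are left out on purpose: they come from red-team exercises and user feedback rather than from run logs.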

Common Pitfalls

Most agent failures come from architectural over-engineering or missing safety boundaries, not from the LLM itself. Check this list before adding another agent to your system.

| Pitfall | Problem | Fix |
| --- | --- | --- |
| Too many tools | LLM confused, wrong tool selection | Limit to 10-15 tools, use routing |
| Vague tool descriptions | Wrong tool calls | Write precise descriptions with examples |
| No max iteration limit | Infinite loops, cost explosion | Set hard limit (5-20 steps) |
| Full history in context | Token overflow, high cost | Use summary or vector memory |
| No tool result validation | Garbage in, garbage out | Validate tool outputs before passing to LLM |
| Single monolithic agent | Poor at specialized tasks | Split into specialist agents |
| No human escalation path | Agent stuck on hard cases | Add "escalate_to_human" tool |
| Ignoring tool errors | Agent continues with bad data | Surface errors clearly to the LLM |