Book — 20 chapters
Knowledge Graphs for Enterprise AI
A practitioner's guide from SQL to knowledge graphs. 15 chapters and 4 capstone projects bridging relational database experience with graph-powered AI systems — GraphRAG, graph-aware agents, knowledge graph construction, and production deployment.
Part 01 Why Graphs Matter Now
The JOIN Wall
Your compliance query spans 12 tables and takes 4 hours. This chapter explains why deep JOINs hit a performance wall, what graph databases do differently with index-free adjacency, and how to decide whether your workload belongs in a relational database or a graph database.
How Graphs Actually Work
A ground-up explanation of graph databases for people who already understand tables, rows, columns, and JOINs. Covers the property graph model, maps every concept to its relational equivalent, and compares Neo4j, Amazon Neptune, and FalkorDB.
The AI Connection
Why vector search alone cannot answer relationship questions, how knowledge graphs give AI structured memory, and the three patterns for combining graphs with LLMs: graph-enhanced RAG, graph-aware agents, and graph-based memory.
Part 02 Graph Thinking
Translating Your Data Model
A pattern-by-pattern guide to converting ERD structures into graph models. Covers five common relational patterns — self-referencing tables, many-to-many junctions, polymorphic associations, temporal data, and hierarchical categories — with SQL schemas, graph models, and clear guidance on when the graph version is better.
Cypher for SQL People
A side-by-side translation of 20 common SQL query patterns into Cypher. Covers SELECT, JOIN, WHERE, GROUP BY, subqueries, INSERT, UPDATE, DELETE, and five queries that are painful in SQL but elegant in Cypher. Includes a complete cheatsheet table.
When to Use Graphs (and When Not To)
A decision framework for evaluating whether a graph database belongs in your architecture. Covers six diagnostic questions, five use cases where graphs shine, five where relational is better, hybrid architecture patterns, and a cost comparison of Neo4j Aura, Amazon Neptune, and self-hosted options.
Part 03 Building Knowledge Graphs
Knowledge Graphs from Documents
Your documents have the answers. The problem is they are prose, not data. This chapter walks the full pipeline from PDF ingestion to a queryable graph: chunking, LLM extraction with Pydantic schemas, four strategies for entity resolution, and production scaling with checkpointing and parallel processing.
Ontology Design Without a PhD
The consultants quoted $500K for an ontology. You can build one in a week. This chapter covers the five-by-five starter method; a four-iteration design process; domain patterns for financial services, healthcare, manufacturing, and IT; LLM-assisted ontology suggestion; schema validation; and three copy-paste templates ready to load.
Data Quality for Knowledge Graphs
You have 50,000 nodes. Are they right? This chapter covers the six quality failure categories, automated checks for orphans, duplicates, and type violations, semantic consistency rules, document coverage metrics, human review sampling strategies, drift monitoring, and a complete validation pipeline you can run after every ingestion batch.
Part 04 Graph-Powered AI
GraphRAG — Beyond Vector Search
Your RAG chatbot nails lookup questions. Ask it who approved the vendor whose part failed the safety test and you get silence. This chapter explains why vector-only retrieval breaks on relationship questions, and builds a complete GraphRAG pipeline with entity detection, graph traversal, hybrid context assembly, and question routing — including a 10-question comparison of vector-only vs GraphRAG accuracy.
Graph-Aware Agents
A support agent answering "which customers are affected by this recall?" needs a graph traversal, not a vector search. This chapter builds the four essential graph tools (Cypher query, path finder, impact analysis, subgraph summarizer), wires them into a LangGraph agent with schema context and few-shot Cypher examples, and implements three production use cases with read-only guardrails, query timeouts, and audit logging.
Multi-Hop Reasoning
Who approved the vendor that supplies the component that failed the safety test? That is 4 hops. This chapter builds the full multi-hop reasoning pipeline: question decomposition, chain-of-traversal with intermediate results, single-query vs step-by-step tradeoffs, a complete reasoning agent with traceable output, and depth limit guidance by use case.
Part 05 Production
Migration Strategy
Nobody is asking you to rip out Oracle. This chapter covers the sidecar pattern for adding graphs incrementally, a scoring framework for picking which data migrates first, dual-write vs CDC vs batch tradeoffs, a complete Debezium-to-Neo4j CDC pipeline, a four-phase migration timeline with exit criteria, team structure, and a feature-flag rollback strategy that does not require a deployment.
Testing Graph Systems
Your QA team has never tested a graph database. This chapter builds the full test pyramid: unit tests for Cypher queries using testcontainers, integration tests for extraction pipelines, query regression with golden files, data quality automation, performance benchmarks, and a CI/CD workflow that runs all five layers without database mocks.
Monitoring and Operations
It is in production. Three things will wake you up at 3am if you do not monitor them: query latency spikes, graph size growth, and extraction pipeline failures. This chapter builds the full monitoring stack: key metrics dashboards, alert thresholds with severity levels, backup and disaster recovery, read replica routing, connection pool configuration, cost management, and an 8-issue runbook for your on-call team.
Part 06 Capstones
Capstone 1: Compliance Knowledge Graph
Five hundred regulatory documents, no way to query them. This capstone builds the full pipeline: PDF ingestion, LLM-based entity and relationship extraction, Neo4j graph construction, and a GraphRAG layer that answers compliance questions in seconds with source citations.
Capstone 2: Fraud Investigation Agent
An analyst says "follow the money" and waits 3 hours. This capstone builds a graph-powered agent that does it in seconds: transaction graph modeling, five Cypher-based fraud pattern detectors (structuring, rapid movement, circular flows, fan-out, shared entity networks), and a tool-using agent that traces funds and writes the investigation report.
Capstone 3: IT Dependency Mapper
Nobody can answer "what breaks if the payment gateway goes down" in under an hour. This capstone builds a system that answers it in seconds: CMDB ingestion, a dependency graph with tier-aware schema, blast radius Cypher queries, and an impact analysis agent that names the affected apps, the critical paths, and the teams to notify.
Capstone 4: Customer 360 with GraphRAG
The support agent has three tabs open: CRM, ticketing, orders. This capstone collapses all three into a single traversable graph, resolves customer identities across systems, and wraps it with a GraphRAG agent that answers "Tell me everything about this customer" with a complete, contextualized journey in one call.