Chapter 04 of 9

Grounded Delivery: Anti-Patterns

The five delivery anti-patterns that cause AI systems to fail in production — and the key questions to ask at every phase gate before it is too late.

Overview

The five phases of Grounded Delivery only function if practitioners are honest at each gate. Anti-patterns are the ways that honesty breaks down — through skipped phases, misread metrics, and sunk-cost reasoning that turns a gate review into a formality. Recognizing these patterns before they take hold is the difference between a portfolio that learns and one that accumulates expensive failures.

Key Questions

Before approving passage through a phase gate, practitioners should be able to answer these questions:

  • What are the success criteria, expressed in probabilistic terms with specific thresholds?
  • What is the evaluation dataset, and does it represent real production behavior including edge cases?
  • Where is the deterministic/non-deterministic boundary in the architecture, and is it explicit?
  • What are the fallback paths for AI component failure or quality degradation?
  • What legacy integration constraints affect the delivery timeline and technical approach?
  • What are the kill criteria, and who has authority to make the kill decision?
  • How will quality be monitored in production, and by whom, on what cadence?
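
The first question, success criteria expressed in probabilistic terms with specific thresholds, can be made concrete with a small check. The following is a minimal sketch, not a prescribed method: the `SuccessCriterion` class, the `answer_correctness` name, the 0.85 threshold, and the choice of a Wilson score lower bound at roughly 95% confidence are all illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriterion:
    """Hypothetical success criterion agreed during Frame."""
    name: str
    min_pass_rate: float  # threshold the pass rate must clear
    z: float = 1.96       # z-score for roughly 95% two-sided confidence


def wilson_lower_bound(passes: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a pass rate."""
    if total == 0:
        return 0.0
    p = passes / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom


def gate_passes(criterion: SuccessCriterion, passes: int, total: int) -> bool:
    """Pass only if the lower confidence bound clears the threshold."""
    return wilson_lower_bound(passes, total, criterion.z) >= criterion.min_pass_rate


# Illustrative numbers only: same observed rate, different sample sizes.
criterion = SuccessCriterion(name="answer_correctness", min_pass_rate=0.85)
print(gate_passes(criterion, passes=90, total=100))
print(gate_passes(criterion, passes=900, total=1000))
```

Under these assumptions the first call prints False and the second prints True: a 90% observed pass rate on 100 evaluation cases does not yet support a claim of at least 85%, while the same rate on 1,000 cases does. Stating the criterion this way forces the gate conversation to include the size of the evaluation dataset, not only the headline number.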

Anti-Patterns: Grounded Delivery

Frame Skipped. The team jumps directly to building because the problem seems clear and the technology seems obvious. The Frame phase is skipped to "save time." The result is a system built against undefined success criteria, evaluated against undefined quality thresholds, deployed with no monitoring design, and eventually abandoned because nobody can agree on whether it is working. Frame is not overhead. It is the foundation without which everything else is guesswork.

Velocity as Progress. The team tracks sprint velocity and burn-down as evidence of progress. Features ship, story points accumulate, and the demo looks polished — but the evaluation dataset shows quality distributions that do not meet the success criteria defined in Frame. Sprint completion is not value delivery. In AI systems, activity and progress are easily decoupled.

Test Generation Illusion. AI-assisted test generation produces thousands of test cases quickly. The team reports high test coverage and the tests pass — but the tests were generated by the same AI system being tested, which means they validate what the AI thinks is correct rather than what is actually correct. Test coverage generated by the system under test is not validation. Evaluation datasets must be built from ground-truth human judgment, not AI-generated assertions.
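
A toy illustration of why self-generated tests prove little. Everything here is hypothetical: `model_under_test` stands in for any AI component, and the inputs and human labels are invented for the example.

```python
def model_under_test(text: str) -> str:
    """Hypothetical classifier with a blind spot: it ignores negation."""
    return "positive" if "good" in text.lower() else "negative"


inputs = ["good service", "not good at all", "terrible", "really good", "not good"]

# "Tests" whose expected values come from the system itself.
self_generated = [(text, model_under_test(text)) for text in inputs]

# Expected values assigned by human reviewers (invented for this sketch).
human_labeled = [
    ("good service", "positive"),
    ("not good at all", "negative"),
    ("terrible", "negative"),
    ("really good", "positive"),
    ("not good", "negative"),
]


def pass_rate(cases: list[tuple[str, str]]) -> float:
    """Fraction of cases where the model output matches the expected label."""
    hits = sum(model_under_test(text) == expected for text, expected in cases)
    return hits / len(cases)


print(pass_rate(self_generated))  # 1.0: the model always agrees with itself
print(pass_rate(human_labeled))   # 0.6: the negation cases fail
```

The self-generated suite passes in full by construction, so its pass rate carries no information about correctness. Only the human-labeled set exposes the blind spot.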

Sunk Cost at Gates. Explore or Harden produces evidence that the value hypothesis is not achievable or that the quality thresholds cannot be met. Instead of acting on that evidence, the team reframes the results to justify continuing: the demo is polished, leadership has seen it, and the team has worked hard. The gate becomes a formality. Sunk-cost reasoning at gates is the most expensive anti-pattern in AI delivery. Gates exist so that the kill decision is made before more time and budget are wasted, not after.
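
One way to make a gate harder to reframe is to write the kill criteria down before the phase starts and evaluate them mechanically. The sketch below assumes hypothetical names and fields (`KillCriteria`, `GateEvidence`, a single quality threshold, a minimum evaluation sample size); a real gate would carry whatever criteria were agreed in Frame.

```python
from dataclasses import dataclass
from enum import Enum


class GateDecision(Enum):
    PROCEED = "proceed"
    KILL = "kill"


@dataclass(frozen=True)
class KillCriteria:
    """Agreed before the phase starts; frozen so it cannot be edited at the gate."""
    min_quality: float      # hypothetical quality threshold from Frame
    min_eval_cases: int     # evidence below this size does not count
    decision_owner: str     # who holds authority for the kill decision


@dataclass(frozen=True)
class GateEvidence:
    measured_quality: float
    eval_cases: int


def decide(criteria: KillCriteria, evidence: GateEvidence) -> tuple[GateDecision, list[str]]:
    """Return a decision from measured evidence only, with the reasons for it."""
    reasons = []
    if evidence.eval_cases < criteria.min_eval_cases:
        reasons.append("evaluation set smaller than agreed minimum")
    if evidence.measured_quality < criteria.min_quality:
        reasons.append("measured quality below agreed threshold")
    decision = GateDecision.KILL if reasons else GateDecision.PROCEED
    return decision, reasons


criteria = KillCriteria(min_quality=0.85, min_eval_cases=500, decision_owner="product sponsor")
decision, reasons = decide(criteria, GateEvidence(measured_quality=0.78, eval_cases=600))
print(decision.value, reasons)
```

Note what the function does not accept as input: demo polish, effort already spent, or who has seen the prototype. The decision owner can still override the result, but the override then has to be explicit.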

Ship and Forget. The system is deployed, the team moves on, and nobody monitors quality in production. Six months later, users notice degraded output quality. The evaluation dataset has not been updated. Drift has not been detected. The system has been producing subtly wrong outputs that nobody was assigned to catch. AI systems are not static software. Shipping without operating is not shipping; it is abandonment on a delayed schedule.
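
A minimal sketch of what operating could look like in code, assuming that a sample of production outputs is reviewed on a regular cadence and each one is marked acceptable or not. The `QualityMonitor` class, the baseline rate, the tolerance, and the window size are illustrative assumptions, not recommended values.

```python
from collections import deque
from typing import Optional


class QualityMonitor:
    """Rolling check of reviewed production samples against a baseline rate."""

    def __init__(self, baseline_rate: float, tolerance: float, window: int) -> None:
        self.baseline_rate = baseline_rate   # acceptable rate measured at launch
        self.tolerance = tolerance           # allowed drop before raising an alert
        self.samples = deque(maxlen=window)  # keeps only the most recent reviews

    def record(self, acceptable: bool) -> None:
        self.samples.append(acceptable)

    def current_rate(self) -> Optional[float]:
        if not self.samples:
            return None
        return sum(self.samples) / len(self.samples)

    def drifted(self) -> bool:
        """True once a full window falls below baseline minus tolerance."""
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough reviewed samples yet
        return self.current_rate() < self.baseline_rate - self.tolerance


monitor = QualityMonitor(baseline_rate=0.90, tolerance=0.05, window=20)
for verdict in [True] * 18 + [False] * 2:
    monitor.record(verdict)
print(monitor.drifted())   # False: the window matches the baseline

for verdict in [False] * 4:
    monitor.record(verdict)
print(monitor.drifted())   # True: the last 20 reviews contain 6 failures
```

The code is the easy part. The check only has value if someone owns the review cadence and is expected to act when `drifted()` turns true.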