BlogWikiAbout

Kyle Pericak

"It works in my environment"

Bot-Wiki/Research/Design Docs Optimized for AI Coding Agents/Gemini Deep Research: Design Docs for AI Coding Agents

Gemini Deep Research: Design Docs for AI Coding Agents

Last verified: 2026-03-16

The Architecture of Intent: Optimizing Technical Design Documentation for Autonomous AI Engineering Agents

The software engineering landscape is currently undergoing a fundamental transition from human-centric development to a paradigm of agent-augmented synthesis. In this new era, the technical design document (TDD) has evolved from a static artifact intended for human alignment into a dynamic, machine-executable interface. As autonomous coding agents—such as Claude Code, GitHub Copilot Workspace, and Cursor—assume the role of the primary implementer, the requirements for documentation have shifted from ambiguity-tolerant narratives to high-precision logical constraints. This report investigates the current state of technical design documentation, exploring the templates, handoff mechanisms, and structural optimizations required to transform a design document into an effective plan for an AI agent.

Comparative Taxonomies of Design Documentation

The structural evolution of design documents across leading technology firms reveals a convergence toward "Spec-Driven Development" (SDD). In this framework, the specification is not merely a precursor to code but the primary source of truth that determines what is built.1 This shift is necessitated by the observation that while AI agents are remarkably capable of code generation, they frequently suffer from a "specification problem" rather than a "capability problem".2 When 90% of Fortune 100 companies adopt agentic coding and AI writes approximately half of the average developer's code, the precision of the input document becomes the primary determinant of system integrity.2

Industry Standards in Template Architecture

Leading technology organizations utilize distinct templates that reflect their unique operational philosophies. While Google prioritizes clarity and trade-off analysis, Uber emphasizes design system synchronization, and Stripe focuses on developer experience and API patterns.

Company Core Document Type Primary Structural Focus Key Sections
Google Design Doc / Mini-Doc Trade-offs and Cross-cutting concerns.3 Context, Scope, Goals/Non-goals, Design, Alternatives, Cross-cutting concerns.3
Uber RFC / Component Spec Token-driven accuracy and multi-stack parity.5 Abstract, Motivation, Approaches, Proposal, Accessibility, API Props.5
Stripe RFC / RFP / Pattern Guide Ecosystem alignment and security.7 Core requirements, API performance, Security/Compliance, Onboarding patterns.7
GitLab Blueprint / Handbook-First Asynchronous alignment and SSOT.9 Mission, Communication, Departmental Guides, Architecture Design Workflow.9
Shopify Program Playbook / UCP Checklist Merchant-centric UX and Definition of Done.10 Problem Statement, Objectives, Guiding Principles, Risks, Path to Done.11

Google's approach is characterized by its relative informality combined with a strict emphasis on "Alternatives Considered".3 This section is critical for AI agents as it provides a record of rejected paths, preventing the agent from re-proposing solutions that have already been vetted and dismissed by human architects. Uber, by contrast, has moved toward highly automated specifications. Their design systems team uses AI agents and Figma Console Model Context Protocol (MCP) to generate component specs directly from design tokens, ensuring that the documentation remains a living reflection of the source material.5

Minimum Viable Design vs. Comprehensive RFC

The distinction between a Minimum Viable Design (MVD) and a comprehensive Request for Comments (RFC) is defined by the scope of the proposed change and the degree of architectural risk. An MVD, or "mini design doc," is typically 1-3 pages and is used for incremental improvements or sub-tasks in an agile project.3 It maintains the same structure as a full design doc—context, design, and trade-offs—but remains terse and focused on a limited problem set. Comprehensive RFCs are required for "substantial" changes that impact the entire engineering organization, such as replacing widely used libraries or introducing new code conventions.12 In an all-remote environment like GitLab, these RFCs serve as the "blueprints" for major features, ensuring that all stakeholders can provide asynchronous feedback.9 For AI agents, the MVD provides a localized plan for a single task, while the RFC establishes the "Constitution" or governing principles that the agent must reference to maintain long-term system consistency.13

The Logic of the PRD-to-Design Handoff

The transition from a Product Requirements Document (PRD) to a technical design document is the most volatile stage of the development lifecycle. This handoff requires a filtration process that carries forward functional intent while discarding marketing narrative and business justifications that do not impact technical logic.

Information Filtration and Persistence

Teams must translate product requirements into technical architecture decisions by mapping user needs to system components. The filtration of information during this handoff is illustrated in the table below.

Requirement Type PRD Origin Design Doc Carry-Forward Rationale for Handoff
Functional User Stories (As a...) Acceptance Criteria (GIVEN-WHEN-THEN).15 Transforms vague intent into testable logic for agents.
Non-Functional "Fast response" SLA/SLO (e.g., $< 200ms$ latency).16 Provides measurable constraints for optimization.
Constraints "Secure data" RBAC, Encryption standards, SOC2 compliance.7 Establishes "Never" rules for agent behavior.
Context Market Analysis Problem Statement & Motivation.18 Helps agent prioritize trade-offs during implementation.
Out of Scope "Not for V1" Explicit "Out of Scope" list.18 Prevents scope drift and gold-plating by agents.

A critical tool in this translation is the Easy Approach to Requirements Syntax (EARS). EARS forces requirements into a structured notation—such as WHEN [event], THE SYSTEM SHALL [behavior]—which acts as a programming interface for the AI agent.19 By formalizing "vibe-based" requirements into EARS notation, engineers ensure that the agent builds exactly what is needed without making undocumented assumptions.13

AI-Native Handoff Workflows

Modern workflows have formalized this handoff into gated, sequential phases. Tools like GitHub Spec Kit, AWS Kiro, and Copilot Workspace have moved away from "vibe coding" toward a process where specs are living, executable artifacts.13

  • GitHub Spec Kit: This toolkit uses the specify CLI to bootstrap projects for Spec-Driven Development (SDD).13 The workflow follows a sequence: Constitution → Specify → Plan → Tasks.14 The Constitution establishes non-negotiable principles (e.g., "no default exports," "use pnpm") that guide all subsequent agent actions.13
  • AWS Kiro: Kiro optimizes for "state management of the development process".22 It generates three linked Markdown files—requirements.md, design.md, and tasks.md—where changes in requirements automatically propagate to the design and task list via a "Refine" and "Update tasks" flow.22 This structural linking ensures that the agent's work remains synchronized with the product intent.
  • Copilot Workspace: This environment leverages issue-driven autonomous agents. It treats the GitHub issue as the entry point, which the agent then decomposes into a technical plan before implementing changes across the codebase.22

Design Documents as Machine-Readable Instruction Sets

For a design document to serve as an effective input for a coding agent, it must be optimized for machine parsing rather than just human reading. This requires a shift toward structured data, explicit boundaries, and executable commands.

Memory Files and Context Anchoring

AI agents like Claude Code and Cursor utilize markdown "memory files" to maintain project context across sessions. Without these files, the agent is effectively stateless, frequently repeating mistakes such as using the wrong package manager or violating naming conventions.21

File Name Purpose Content Type
CLAUDE.md Core project context for Claude Code.21 Build commands, test setup, architectural patterns, and "Never" rules.21
AGENTS.md Persona-based instructions.23 Focused behaviors for specialized agents (e.g., @security-agent).23
.cursorrules Global rules for Cursor.21 File naming conventions, directory structures, and preferred library versions.21
constitution.md Governing principles.13 Immutable quality, testing, and UX standards.14
SPEC.md Persistent feature reference.23 High-level spec expanded into a detailed plan for a specific project.23

According to Addy Osmani, an effective spec for AI agents must cover six core areas: specific executable commands, testing frameworks and coverage expectations, project structure (explicitly defining where application code vs. unit tests live), code style (using real snippets rather than just descriptions), Git workflow requirements, and clear boundaries of what the agent is never allowed to touch, such as secrets or production configurations.23

Machine-Readability and Architectural Logic

To make architectural decisions machine-readable, design docs should utilize text-based diagramming formats like Mermaid or PlantUML. These formats allow agents to parse the relationship between components (e.g., Frontend -> API -> Service -> Database) without needing to interpret complex binary image files.18 Furthermore, the document should distinguish between high-level architectural decisions and low-level implementation code. Some practitioners argue that TDDs should explicitly avoid file-level change lists and implementation code to ensure the document remains resilient to framework or tooling changes.18 However, file-level change lists can improve implementation quality if the agent is operating with "Low Confidence" ($<66\%$). In such cases, the agent should dedicate the first phase to research and knowledge-building before proposing a step-by-step implementation plan.19 For high-confidence tasks ($>85\%$), the agent can proceed directly to a full, automated implementation based on a granular tasks.md file.19

Trade-off Documentation and Decision Rationale

One of the primary failure modes for AI agents is the "re-discovery" of eliminated approaches. If a design document does not explicitly capture rejected alternatives, an agent—trained on vast datasets of common but potentially inappropriate solutions—may re-propose a sub-optimal approach.

Structuring Negative Design

Effective design docs at Google and Stripe use an "Alternatives Considered" section to prevent reviewers (and agents) from re-litigating old decisions.3 For an agent to understand these trade-offs, they must be structured as specific "Decision Records" that link a choice to its justification and constraints.

  1. The Choice: The specific technology or pattern selected.
  2. The Context: The facts and goals that influenced the decision.
  3. The Trade-off: What was gained and what was sacrificed (e.g., "gained speed at the cost of eventual consistency").
  4. The Rejection: A specific list of alternatives and the reasons they were discarded (e.g., "rejected approach X due to $O(n^2)$ complexity").3

By documenting these rationale, engineers create a "Compressed Decision Record" that acts as a guardrail.19 This ensures that when an agent suggests a change, it does so within the context of the established architectural boundaries, preventing the common frustration of "AI solutions that are almost right, but not quite".2

Architecture Decision Records (ADR) as Permanent Context

While an RFC is a proposal for change, an Architecture Decision Record (ADR) is the final documentation of what was decided and why.25 ADRs are lightweight, text-based documents stored in the repository that fill the gap between high-level architecture docs and low-level designs.26 AI agents are now capable of automatically generating these ADRs by scanning a codebase for architectural shifts, ensuring that the "why" behind a change is captured while the context is still fresh.27 An ADR template for agents typically includes:

  • Status: Draft, Accepted, Superseded, or Rejected.26
  • Kernel of Truth: A short, one-sentence summary of the decision.26
  • Consequences: The documented impact of the decision on future maintenance.27
  • Guardrails: Explicit instructions for the agent (e.g., "References MUST exist—check each link").26

Implementation Phasing and Task Decomposition

The most critical factor in successful agentic implementation is the granularity and sequencing of task decomposition. McKinsey's research identifies that AI's time savings are most significant in documentation and code writing, but these gains are only realized when work is broken into manageable, bounded components.29

McKinsey's Two-Layer Architecture and Strategic Planning

McKinsey's concept of a "two-layer architecture" provides a framework for hierarchical task management.31 In control theory—from which the term is borrowed—this involves a long-horizon strategic layer (the "Speed Planner") that generates reference trajectories and a reactive lower layer (the "Tracking Controller") that executes immediate actions.32 In software engineering, this maps to a dual-agent workflow:

  1. Lead Agent (Strategic): Decomposes a project into a "phased implementation plan".22 It focuses on the critical path and long-term objectives.18
  2. Worker Agents (Tactical): Execute individual tasks in parallel within independent contexts, significantly reducing completion time through parallelization.22

Research suggests that structured task decomposition creates "navigable pathways" for newcomers and agents alike, leading to a 24% performance improvement in problem-solving value.29

Task Sequencing in Spec Kit and Kiro

AI-native tools handle task generation and sequencing by ensuring that implementations proceed from "dependencies upward".19

  • GitHub Spec Kit: Uses the /tasks command to generate a specific list of implementation steps derived directly from the plan.md.14 It utilizes checklists within the markdown files to track user clarifications and research tasks, acting as a "definition of done" for each step.15
  • AWS Kiro: Explicitly traces each task in tasks.md back to a specific requirement in requirements.md.22 It features a "Grind Mode" where a long-running agent autonomously iterates through tasks, checking its own work against "correctness properties" and compile-time diagnostics until the CI pipeline passes.33

This granularity ensures that tasks are "atomic units of work" that an agent can independently execute and verify.33 For larger projects, the "sweet spot" for a design doc is around 10-20 pages, though "mini design docs" of 1-3 pages are highly effective for sub-tasks.3

Design Doc Failure Modes in the AI Era

The introduction of AI agents has shifted the nature of documentation failure. While traditional issues like "bikeshedding" and "over-specification" persist, new failure modes related to "agent drift" and "hallucinated constraints" have emerged.

Common Pathologies and AI Impact

Failure Mode Human Consequence AI Agent Consequence
Over-specification Burying reviewers in minutiae.34 Preventing the agent from exploring the best implementation approaches.2
Under-specification Confusion and misalignment. Agent "vibe coding" based on training data rather than project needs.13
Stale Documentation Confusion and tech debt. Agent following outdated decisions, resulting in broken implementations or security risks.5
Bikeshedding Wasted time in meetings. Agent getting stuck in "loops of death" or redundant refactoring cycles.24
Ignored Constraints Human error. Agent ignoring "Never" rules if they are not explicitly placed in the context window.23

The "stale doc" problem is particularly acute for agents. If a CLAUDE.md mentions a Node.js requirement of $>=18.0.0$ while the package.json has been updated to $22.0.0$, the agent will likely prioritize the explicit markdown instruction, leading to an environment mismatch.35 To mitigate this, teams must adopt a "handbook-first" approach where documentation is treated as a single source of truth (SSOT) and updated with every pull request.9

Mitigating Failure via Automation

Automation can solve the eternal problem of outdated documentation. Uber's use of agents to generate specs directly from Figma tokens eliminates transcription errors and ensures that accessibility requirements (WCAG compliance) are baked into the documentation from day one.5 Furthermore, tools like the context-evaluator can be used to audit memory files, highlighting outdated technology references and suggesting structured improvements.35

The Convergence of Specification and Implementation

The ultimate goal of optimizing technical design documents for AI agents is to close the "reality gap" between concept and production.20 As the "lingua franca" of development moves to a higher level of abstraction, the role of the engineer shifts from writing code to evolving specifications.2

Agentic Commerce and Specialized Documentation

Stripe's "Agentic Commerce Suite" highlights the next frontier of technical documentation: preparing for non-human traffic.36 Documentation must now specify how to optimize product information for agents via llms.txt files and how to handle "agentic bursts" with edge computing logic and rate limiting.36 In this context, the TDD is not just a plan for a coding agent but a blueprint for how a system interacts with an entire ecosystem of autonomous agents.

Conclusion: The Executable Blueprint

Technical design documentation in 2026 has become the "neurological connection" that directs and controls operations in a tech-enabled product.31 To effectively leverage AI coding agents, design documents must be structured as "executable blueprints" that prioritize:

  1. Strict Formalism: Using EARS notation for requirements and Mermaid for architecture.
  2. Contextual Anchoring: Utilizing memory files (CLAUDE.md, constitution.md) to establish immutable project boundaries.
  3. Hierarchical Decomposition: Adopting a two-layer architecture that separates strategic planning from tactical task execution.
  4. Living Synchronization: Automating the generation of ADRs and specs to ensure documentation never drifts from reality.

While evidence is currently thin on the long-term maintenance costs of "agent-generated specs," the performance gains in the initial build phase are undeniable.5 The challenge for future architects will be to govern these specs at scale, converting vague rules into actionable, versioned standards that both humans and silicon minds can execute with high confidence.

Works cited

  1. Spec-driven development with AI: Get started with a new open source toolkit - The GitHub Blog, accessed March 16, 2026, https://github.blog/ai-and-ml/generative-ai/spec-driven-development-with-ai-get-started-with-a-new-open-source-toolkit/
  2. How to write PRDs for AI Coding Agents, accessed March 16, 2026, https://medium.com/@haberlah/how-to-write-prds-for-ai-coding-agents-d60d72efb797
  3. Design Docs at Google - Industrial Empathy, accessed March 16, 2026, https://www.industrialempathy.com/posts/design-docs-at-google/
  4. Design Docs for software architecture review - Brian Sigafoos, accessed March 16, 2026, https://briansigafoos.com/design-docs/
  5. How Uber Built an Agentic System to Automate Design Specs in Minutes, accessed March 16, 2026, https://www.uber.com/blog/automate-design-specs/
  6. h3/dev-docs/RFCs/rfc-template.md at master · uber/h3 - GitHub, accessed March 16, 2026, https://github.com/uber/h3/blob/master/dev-docs/RFCs/rfc-template.md
  7. Master RFP Template for Billing Vendors | Stripe, accessed March 16, 2026, https://stripe.com/guides/master-rfp-template-for-billing-vendors
  8. Design patterns for Stripe Apps, accessed March 16, 2026, https://docs.stripe.com/stripe-apps/patterns
  9. Developer Experience Design Documents | The GitLab Handbook, accessed March 16, 2026, https://handbook.gitlab.com/handbook/engineering/infrastructure-platforms/developer-experience/design-documents/
  10. Tips for Creating Effective Technical Documentation for Your Shopify App, accessed March 16, 2026, https://www.shopify.com/partners/blog/technical-documentation
  11. A Guide to Running an Engineering Program - Shopify, accessed March 16, 2026, https://shopify.engineering/running-engineering-program-guide
  12. How to use RFCs to take decisions at scale | by Marco Ziccardi | Doctolib - Medium, accessed March 16, 2026, https://medium.com/doctolib/how-to-use-rfcs-to-take-decisions-at-scale-a4f97c96b6c
  13. Diving Into Spec-Driven Development With GitHub Spec Kit - Microsoft for Developers, accessed March 16, 2026, https://developer.microsoft.com/blog/spec-driven-development-spec-kit
  14. github/spec-kit: Toolkit to help you get started with Spec ... - GitHub, accessed March 16, 2026, https://github.com/github/spec-kit
  15. Understanding Spec-Driven-Development: Kiro, spec-kit, and Tessl, accessed March 16, 2026, https://martinfowler.com/articles/exploring-gen-ai/sdd-3-tools.html
  16. writing-rfc-resources.md - gists · GitHub, accessed March 16, 2026, https://gist.github.com/henkmeulekamp/117f2ec9c2767500cc9a4460615a998e
  17. Risk Register + Engineering ControlsRisk Register | Elogic Commerce, accessed March 16, 2026, https://elogic.co/risk-register/
  18. technical-design-doc-creator | Skill... · LobeHub, accessed March 16, 2026, https://lobehub.com/skills/plocemourasouza-osforge-technical-design-doc-creator
  19. spec-driven-workflow-v1.instructions.md - awesome-copilot - GitHub, accessed March 16, 2026, https://github.com/github/awesome-copilot/blob/main/instructions/spec-driven-workflow-v1.instructions.md
  20. Kiro: Bridging the Gap Between AI Prototyping and Production-Ready Code, accessed March 16, 2026, https://tgaleev.com/kiro-bridging-the-gap-between-ai-prototyping-and-production-ready-code
  21. The Complete Guide to AI Agent Memory Files (CLAUDE.md, AGENTS.md, and Beyond), accessed March 16, 2026, https://medium.com/data-science-collective/the-complete-guide-to-ai-agent-memory-files-claude-md-agents-md-and-beyond-49ea0df5c5a9
  22. Why Kiro Looks Unassuming: Organizing Design Philosophy Differences in the Age of Claude Code and Cursor - DEV Community, accessed March 16, 2026, https://dev.to/aws-builders/why-kiro-looks-unassuming-organizing-design-philosophy-differences-in-the-age-of-claude-code-and-1dp9
  23. How to write a good spec for AI agents - AddyOsmani.com, accessed March 16, 2026, https://addyosmani.com/blog/good-spec/
  24. Technical writing with AI agents: Devin, Cursor, and Claude Code in January 2026 - Fern, accessed March 16, 2026, https://buildwithfern.com/post/technical-writing-ai-agents-devin-cursor-claude-code
  25. How to Make Architecture Decisions: RFCs, ADRs, and Getting Everyone Aligned, accessed March 16, 2026, https://lukasniessen.medium.com/how-to-make-architecture-decisions-rfcs-adrs-and-getting-everyone-aligned-ab82e5384d2f
  26. Accelerating Architectural Decision Records (ADRs) with Generative AI | Equal Experts, accessed March 16, 2026, https://www.equalexperts.com/blog/our-thinking/accelerating-architectural-decision-records-adrs-with-generative-ai/
  27. AI generated Architecture Decision Records (ADR) - Dennis Adolfi, accessed March 16, 2026, https://adolfi.dev/blog/ai-generated-adr/
  28. Building an Architecture Decision Record Writer Agent | by Piethein Strengholt | Medium, accessed March 16, 2026, https://piethein.medium.com/building-an-architecture-decision-record-writer-agent-a74f8f739271
  29. Decomposing Complexity: An LLM-Based Approach to Supporting Software Engineering Tasks, accessed March 16, 2026, http://ra.adm.cs.cmu.edu/anon/2025/CMU-CS-25-132.pdf
  30. Redefining Software Engineering | MI - 超智諮詢, accessed March 16, 2026, https://www.meta-intelligence.tech/en/insight-ai-dev-paradigm
  31. Tackling IT complexity in product design - McKinsey, accessed March 16, 2026, https://www.mckinsey.com/~/media/McKinsey/Business%20Functions/McKinsey%20Digital/Our%20Insights/Tackling%20IT%20complexity%20in%20product%20design/Tackling%20IT%20complexity%20in%20product%20design.pdf
  32. Energy-efficient Motion Planning and Control for Connected, Automated Vehicles By Eunhyek Joa - eScholarship, accessed March 16, 2026, https://escholarship.org/content/qt7j29h06c/qt7j29h06c_noSplash_55d5d6fd88eea210db028be6381f7f5f.pdf
  33. On Kiro and the AI-Driven Development Lifecycle | by Dirk Michel ..., accessed March 16, 2026, https://medium.com/@micheldirk/on-kiro-and-the-ai-driven-development-lifecycle-3459c2c19751
  34. Design Docs, accessed March 16, 2026, https://www.designdocs.dev/
  35. Writing AI coding agent context files is easy. Keeping them accurate isn't. - Packmind, accessed March 16, 2026, https://packmind.com/evaluate-context-ai-coding-agent/
  36. How to prepare for agentic commerce: A technical field guide, accessed March 16, 2026, https://stripe.com/guides/how-to-prepare-for-agentic-commerce-technical-field-guide
Related:wiki/research/design-docs-for-agentswiki/research/ai-augmented-prds/geminiwiki/design-docs
Blog code last updated on 2026-03-17: 7ae9b0dec32c090b187b7ffa81b1559756cdad46