Problem: The Missing Layer
When working with AI coding agents, each document solves one problem:
| Document | What It Does | What It Doesn’t Do |
|---|---|---|
| README.md | Introduces the project to users | Doesn’t describe technical structure |
| CLAUDE.md / AGENTS.md | Instructs the agent (rules, commands, boundaries) | Doesn’t describe project structure |
| ADRs | Records decisions and rationale | Doesn’t describe current state |
| Code | The truth itself | Scattered across hundreds of files, expensive to scan |
The agent is forced to re-scan the project from scratch every session. Glob, grep, file reads. The context window fills up with tool output. Shallow scanning leads to incorrect inferences: confusing similarly named items across different modules, treating unused legacy code as active, guessing infrastructure details.
I ran into this problem in my own projects. In a 992-file monorepo, the agent spent ~20 minutes investigating the Inngest proxy worker setup: SSH attempts, API endpoint guesses, port scans. The answer was simple: container name, ports, tunnel routes, env vars. But this information didn’t exist anywhere in structured form. That is partly my fault: I either hadn’t thought of it or had thought of it and let it slip, and I’ll own that. Still, across every service and configuration decision in a project, remembering each detail or making a parallel update for each topic during the workflow is not realistic. I needed an approach, as far from static as possible, to a problem that wastes time, disrupts the decision flow, or bases decisions on incorrect assumptions.
architecture.md fills this gap: a living document that describes the project’s actual structure, in a single file, in a structured format.
Research Foundation
Before formalizing Living Architecture as a template, I conducted extensive research on documentation for AI agents: parallel investigations with GPT-4o, Kimi K2.5, and Gemini Deep Research, plus academic papers and industry practices.
Codified Context Approach
One of the key findings I came across was a three-layer system tested on a 108,000-line C# project [1]:
- Hot Memory: Conventions, retrieval hooks, orchestration protocols. In practice, the CLAUDE.md or AGENTS.md file
- Domain Expert Agents: Project-specific specialist agents. Each knows the rules and patterns of a particular domain
- Cold Memory: On-demand specification documents. The agent pulls only what’s relevant
Critical finding: documentation is infrastructure. Like code, it requires continuous maintenance, and it is essential for agents to produce correct output.
AGENTS.md Ecosystem and Six Essential Areas
AGENTS.md is standardized by the Agentic AI Foundation. Different AI tools use different files (CLAUDE.md, .cursorrules, SPEC.md), but they all serve the same purpose: giving the agent structured context.
Six essential areas were derived from an analysis of over 2,500 successful agent config files [2]:
- Commands: Exact executable strings (install, build, test, lint)
- Testing: Frameworks, coverage expectations, mocking patterns
- Project Structure: High-level directory tree map
- Code Style: “Show, Don’t Tell” code snippets
- Git Workflow: Branch naming, commit format, PR checklist
- Boundaries: Do/Don’t lists, operational limits
These six areas are the scope of CLAUDE.md or AGENTS.md. However, the project’s architectural structure, inter-module relationships, data flow, infrastructure topology, and tech debt don’t fit into these areas. That’s the reason architecture.md exists.
Hybrid Approach Consensus
All research sources converge on the same point:
| Layer | Content | Method |
|---|---|---|
| Auto-generate | Schema, types, dependency graph, API signatures | Update via Repomix/CI hook when code changes |
| Human-write | Design decisions, constraints, trade-offs | ADR format, “Kernel of Truth” workflow |
| Staleness detect | Document freshness tracking | Git hook or CI check |
“What” is auto-generated, “why” is human-written. Together they give the agent the full picture.
C4 Model and Mermaid.js
The C4 model (Context, Containers, Components, Code) offers an ideal hierarchy for AI agents. Implemented as text-based diagrams with Mermaid.js, it is version-controllable, token-efficient, and readable and editable by agents.
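To make the "text-based diagram" point concrete, here is a minimal sketch that renders a container-level dependency map as Mermaid flowchart source. The `to_mermaid` helper and the container names are hypothetical illustrations, not part of the template.

```python
def to_mermaid(edges: dict[str, list[str]]) -> str:
    """Render a {source: [targets]} mapping as Mermaid flowchart text."""
    lines = ["flowchart TD"]
    for source, targets in edges.items():
        for target in targets:
            lines.append(f"    {source} --> {target}")
    return "\n".join(lines)


# Hypothetical C4 container view: a web app calls an API,
# which talks to a database and a queue.
containers = {"web": ["api"], "api": ["db", "queue"]}
diagram = to_mermaid(containers)
print(diagram)
```

The output is a few lines of plain text, so it diffs cleanly in version control and costs the agent far fewer tokens than an image description would.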
Progressive Disclosure
Even with a 200K-token context window, model performance degrades when the window is overloaded with information [3]. The solution: revealing information only when needed, rather than loading everything upfront.
| Technique | Mechanism | Benefit |
|---|---|---|
| Jump-to pointers | Reference files by path, not content | Token savings |
| Executable search | rg/grep commands in AGENTS.md | Just-in-time discovery |
| Nested overrides | Per-module local AGENTS.md | Only relevant rules loaded |
These research findings directly shaped the design decisions of the Living Architecture template.
Living Architecture: Design Decisions
When converting research findings into a template, I made several key decisions.
Single File, Structured Sections
I chose to consolidate architectural information in a single architecture.md file rather than distributing it across multiple files. The reason: the agent can load a single file as a reference at session start. Navigating between multiple files creates additional cost in the context window. Thanks to heading hierarchy, the agent can jump to the section it needs.
Per-Section Depth
Instead of assigning a single depth level to the entire project, I adopted a per-section depth approach. In a 50-file project, L1 may be sufficient for the Stack section, but if there are 15 database tables, the Data Model section should be raised to L2.
Optional Modules
Not every project has a monorepo structure, background jobs, or i18n support. I defined 11 optional modules with “trigger conditions”: if pnpm-workspace.yaml exists, include the Monorepo Structure module; otherwise, delete it. This prevents the template from bloating.
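A trigger condition can be read as a simple file-existence check. The sketch below assumes a hypothetical `TRIGGERS` table; only the pnpm-workspace.yaml rule comes from the text above, and the second entry is an illustrative guess rather than the template's actual rule.

```python
import tempfile
from pathlib import Path

# Hypothetical trigger table: module name -> marker paths whose presence
# suggests the module applies. Only the first rule is taken from the post.
TRIGGERS = {
    "Monorepo Structure": ["pnpm-workspace.yaml"],
    "Content Collections": ["src/content"],  # assumption, for illustration
}


def active_modules(root: Path) -> list[str]:
    """Return the optional modules whose trigger path exists under root."""
    return [
        name
        for name, markers in TRIGGERS.items()
        if any((root / marker).exists() for marker in markers)
    ]


# Demo against a throwaway directory that only contains a workspace file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pnpm-workspace.yaml").write_text("packages:\n  - 'apps/*'\n")
    modules = active_modules(root)
    print(modules)
```

Modules that are not returned would be deleted from the template rather than left empty.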
Derived from Real Projects
Rather than designing the template abstractly and then applying it, I documented three real production projects (50, 180, 1200+ files) and extracted patterns. Every section’s rationale came from a concrete need.
Template Structure
10 Core Sections
Every project includes these 10 sections, regardless of size:
| # | Section | What It Describes |
|---|---|---|
| 1 | Stack & Dependencies | Packages, versions, which layer uses what |
| 2 | Module Map | Directory structure, file responsibilities |
| 3 | Data Flow | How data moves through the system |
| 4 | Route / API Structure | Endpoints, pages, middleware |
| 5 | Data Model | Tables, columns, relationships |
| 6 | Configuration & Environment | Env vars, secrets, deploy config |
| 7 | Security | Validation rules, auth flow, headers |
| 8 | Constraints & Trade-offs | Why the architecture is the way it is |
| 9 | Known Tech Debt | What needs fixing, prioritized |
| 10 | Code Hotspots | Most-changed, highest-risk files |
11 Optional Modules
Only modules that apply to your project are included; the rest are deleted:
| Module | When to Include |
|---|---|
| Monorepo Structure | 2+ packages/workspaces |
| Background Jobs | Inngest, BullMQ, cron, queues |
| i18n | Multi-language content or UI |
| Membership / Access Control | Tier or role-based access |
| Webhook Processing | External service webhooks |
| Caching Strategy | KV, Redis, CDN, Cache API |
| Search | Pagefind, Algolia, Meilisearch |
| Content Collections | CMS, MDX, Astro collections |
| Design System | Defined visual language, tokens |
| Infrastructure Topology | Multi-service deployments |
| Dependency Graph | Complex internal dependencies |
3 Depth Levels
Each section is written at L1, L2, or L3 depth. Different sections in the same project can have different depths.
| Level | When | Example |
|---|---|---|
| L1 | < 30 source files | Flat tables, one-line descriptions, single diagrams |
| L2 | 30-200 files | Layer grouping, per-flow diagrams, column details |
| L3 | > 200 files | File-level detail, edge cases, metrics, query patterns |
Per-section overrides raise the default, never lower it:
| Condition | Section | Minimum |
|---|---|---|
| DB tables > 10 | Data Model | L2 |
| DB tables > 30 | Data Model | L3 |
| API endpoints > 20 | Route / API Structure | L2 |
| Env vars > 15 | Configuration & Environment | L2 |
| Background jobs > 5 | Background Jobs | L2 |
| Components > 30 | Module Map | L2 |
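The two tables above can be combined into a small rule: start from the file-count default, then apply any override that raises a section's level. This is a minimal sketch of that logic; the metric keys (`db_tables`, `api_endpoints`, and so on) and function names are hypothetical, while the thresholds are copied from the tables.

```python
LEVELS = ["L1", "L2", "L3"]

# (metric, threshold, section, minimum level), copied from the override table.
OVERRIDES = [
    ("db_tables", 10, "Data Model", "L2"),
    ("db_tables", 30, "Data Model", "L3"),
    ("api_endpoints", 20, "Route / API Structure", "L2"),
    ("env_vars", 15, "Configuration & Environment", "L2"),
    ("background_jobs", 5, "Background Jobs", "L2"),
    ("components", 30, "Module Map", "L2"),
]


def default_depth(source_files: int) -> str:
    """Default level from the source file count (< 30, 30-200, > 200)."""
    if source_files < 30:
        return "L1"
    if source_files <= 200:
        return "L2"
    return "L3"


def section_depths(
    source_files: int, metrics: dict[str, int], sections: list[str]
) -> dict[str, str]:
    """Apply overrides on top of the default; they raise, never lower."""
    base = default_depth(source_files)
    depths = {section: base for section in sections}
    for metric, threshold, section, minimum in OVERRIDES:
        if section in depths and metrics.get(metric, 0) > threshold:
            depths[section] = max(depths[section], minimum, key=LEVELS.index)
    return depths


depths = section_depths(
    25,
    {"db_tables": 15, "api_endpoints": 8},
    ["Stack & Dependencies", "Data Model", "Route / API Structure"],
)
print(depths)
```

Here a 25-file project defaults to L1, but 15 database tables push the Data Model section up to L2 while the other sections stay at L1.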
Applied Examples
I illustrated the template with three fictional but realistic projects of different sizes:
L1: Small Static Site (Bella’s Kitchen)
A corporate website built with Astro. ~50 source files, hosted on Cloudflare Pages. architecture.md ~180 lines.
Stack section as a flat table:
| Package | Version | Purpose |
| ------------------- | -------- | ------------------------- |
| astro | ^5.16.11 | Framework (static output) |
| @astrojs/cloudflare | ^12.6.12 | Cloudflare Pages adapter |
| tailwindcss | ^4.1.18 | CSS framework |
Module Map as a single-level directory tree. Data Flow as a single ASCII diagram. That’s sufficient.
L2: Medium Membership (LearnHub)
A 3-tier course platform with Next.js + Stripe + Neon. ~180 files. architecture.md ~450 lines.
Stack section with layer grouping (Runtime, Infrastructure, Dev Dependencies). Data Flow with separate diagrams for auth, payment, and content access. Data Model with column details and foreign keys.
L3: Large Monorepo (Bazaar)
pnpm monorepo, NestJS + Kafka + BullMQ + AWS. ~1200 files. architecture.md ~500 lines (condensed).
The Monorepo Structure, Background Jobs, and Infrastructure Topology optional modules are active. Sections include file-level responsibilities, edge cases, and metrics.
Keeping It Current
architecture.md is only valuable if it stays up to date. Three strategies:
1. Update on Change
When you add a new dependency, route, or table, update the relevant section. Takes 2 minutes. The simplest and most effective method.
2. PR Check (GitHub Action)
A GitHub Action that maps changed files to architecture.md sections via glob patterns. Posts an informational comment on PRs:
## architecture.md Staleness Check
The following sections may be outdated based on changed files:
- **Stack & Dependencies** - package.json modified
- **Route / API Structure** - 2 files added in src/pages/
Run `/architecture --section "Stack & Dependencies"` to update.
The check is advisory, not a blocker: the developer decides. The update can be handled in parallel when creating an ADR.
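The core of such a check is a mapping from glob patterns to sections. The sketch below is not the actual Action; the `SECTION_GLOBS` table and function names are hypothetical, and it only shows how changed paths could be matched and turned into a comment body.

```python
from fnmatch import fnmatch

# Hypothetical glob -> section mapping. Note that fnmatch's "*" also matches
# path separators, so "src/pages/*" covers nested files as well.
SECTION_GLOBS = {
    "Stack & Dependencies": ["package.json", "pnpm-lock.yaml"],
    "Route / API Structure": ["src/pages/*"],
    "Data Model": ["migrations/*", "*.sql"],
}


def stale_sections(changed_files: list[str]) -> dict[str, list[str]]:
    """Group changed files under the sections whose globs they match."""
    hits: dict[str, list[str]] = {}
    for section, patterns in SECTION_GLOBS.items():
        matched = [
            path
            for path in changed_files
            if any(fnmatch(path, pattern) for pattern in patterns)
        ]
        if matched:
            hits[section] = matched
    return hits


def render_comment(hits: dict[str, list[str]]) -> str:
    """Build an advisory comment body in the style shown above."""
    lines = [
        "## architecture.md Staleness Check",
        "The following sections may be outdated based on changed files:",
    ]
    for section, files in hits.items():
        lines.append(f"- **{section}** - {len(files)} file(s) changed")
    return "\n".join(lines)


changed = [
    "package.json",
    "src/pages/about.astro",
    "src/pages/blog/index.astro",
    "README.md",
]
hits = stale_sections(changed)
comment = render_comment(hits)
print(comment)
```

In a CI job, the list of changed files would typically come from something like `git diff --name-only` against the base branch; posting the comment is left out here.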
3. Daily Review Integration
Tools like daily-code-review can use architecture.md as review context. Review findings feed back into the Tech Debt and Hotspots sections, creating a feedback loop.
Its Place in the Context Engineering Ecosystem
Living Architecture is more than a standalone tool; it’s a cornerstone of a broader ecosystem. In the four-layer structure I described in the context engineering ecosystem post, architecture.md forms half of the first layer:
Layer 1: Static references → CLAUDE.md + architecture.md
Layer 2: JIT search → mcp-code-search + dnomia-knowledge
Layer 3: Decision governance → /court + ADRs
Layer 4: Learning loop → forge retro
CLAUDE.md tells the agent “how to work”. architecture.md tells it “what you’re working with”. Together they ensure the agent starts with the right context from the first second.
Related Tools
| Project | Relationship |
|---|---|
| decision-gate | /court decisions are recorded as ADRs, feeding architecture.md’s Constraints section |
| daily-code-review | Uses architecture.md as review context, findings flow back to Tech Debt |
| forge | Retro step updates architecture.md |
| mcp-code-search | Complementary with jump-to pointers in architecture.md |
Related Posts
- AI-Powered Codebase Audit: How architecture.md and the domain dictionary are used in a 6-track audit process
- Context Engineering Ecosystem: Full description of the four-layer ecosystem
Application in My Own Projects
Before formalizing the template, I was already writing architecture.md for my own projects. I have a ~3,600-line architecture.md for my main 992-file project. Before this file, finding answers to infrastructure questions was a trial-and-error process for the agent. After the file, the agent reads the relevant section directly.
Answering the question “Why did we switch to Neon?” used to require scanning 155 ADRs. Now architecture.md’s Architectural Decisions section has a summary with an ADR-127 reference. The agent starts from the summary and opens the ADR file only if needed.
The template is the generalized version of these experiences. I removed details specific to my projects and left a structure applicable to projects of different sizes and complexity levels.
Getting Started
For a depth-level overview, examples, and the FAQ, see the project page at lab.ceaksan.com/architecture. To pull the template directly:
curl -sL https://raw.githubusercontent.com/ceaksan/living-architecture/main/TEMPLATE.md -o architecture.md
- Use DEPTH_GUIDE.md to determine each section’s depth
- Delete optional modules that don’t apply to your project
- Fill in each section. Your AI tool can help: it can scan the codebase to produce a module map or generate a stack table from package.json
- Reference architecture.md in your AI tool config
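As an example of the auto-generated part, a stack table can be drafted straight from package.json. This is a rough sketch under the assumption that the L1 flat-table layout from the Bella’s Kitchen example is the target; the `stack_table` helper is hypothetical, and the Purpose column is left as a placeholder because that part is human-written.

```python
import json


def stack_table(package_json: str) -> str:
    """Draft a flat Stack table from the contents of a package.json file."""
    pkg = json.loads(package_json)
    rows = ["| Package | Version | Purpose |", "|---|---|---|"]
    for field in ("dependencies", "devDependencies"):
        for name, version in sorted(pkg.get(field, {}).items()):
            # Purpose is the "why"; it has to be filled in by a person.
            rows.append(f"| {name} | {version} | TODO |")
    return "\n".join(rows)


# Inline sample standing in for a real package.json on disk.
sample = json.dumps(
    {"dependencies": {"tailwindcss": "^4.1.18", "astro": "^5.16.11"}}
)
table = stack_table(sample)
print(table)
```

For a real project, the input would be the text of the project's package.json, for example via `Path("package.json").read_text()`.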
The project is published as open source on GitHub [4].
Footnotes
1. “Codified Context: Infrastructure for AI Agents in a Complex Codebase” (arxiv.org/html/2602.20478v1). A three-layer system tested on a 108,000-line C# project.
2. AGENTS.md Standard (aihero.dev). Content areas standardized by the Agentic AI Foundation, based on analysis of 2,500+ agent config files.
3. “Lost in the Middle: How Language Models Use Long Contexts” (Liu et al., 2023). A study showing that models use information less reliably when it sits in the middle of a long context.
4. Living Architecture GitHub repository
Key Takeaways
- architecture.md describes your project's actual structure: module map, data flow, constraints, tech debt. That's not the job of CLAUDE.md or AGENTS.md.
- Depth level is determined per section based on project size. L1 for a 50-file site, L3 for a 1200-file monorepo.
- 11 optional modules (monorepo, background jobs, i18n, membership, webhook, cache, search, content collections, design system, infra topology, dependency graph) are included only when relevant.
- The template was derived from real production projects. Applied and validated across three different sizes (50, 180, 1200 files).
FAQ
What is Living Architecture?
A project-agnostic architecture.md template for AI coding agents. It includes 10 core sections (stack, module map, data flow, API structure, data model, configuration, security, constraints, tech debt, code hotspots) and 11 optional modules. Written at L1, L2, or L3 depth depending on project size.
Why do I need architecture.md when I already have CLAUDE.md or AGENTS.md?
CLAUDE.md instructs the agent: 'don't use barrel imports', 'run tests like this'. architecture.md describes the project's reality: which module is where, how data flows, which decisions were made and why. One says 'how to work', the other says 'what you're working with'. They serve different purposes and can't replace each other.
How do I determine depth levels?
Defaults are based on source file count: under 30 is L1, 30-200 is L2, over 200 is L3. Per-section overrides exist: more than 10 database tables means Data Model is at least L2, more than 20 API endpoints means Route structure is at least L2. Overrides only raise the level, never lower it.
How do I keep architecture.md up to date?
Three strategies: (1) On change, update the relevant section when you add a new dependency, route, or table (takes 2 minutes). (2) PR check with a GitHub Action that maps changed files to architecture.md sections (advisory, not a blocker). (3) Daily review integration using tools like daily-code-review that use architecture.md as review context, creating a feedback loop.
How do I get started with the template?
Download TEMPLATE.md to your project via curl, use DEPTH_GUIDE.md to determine each section's depth, and fill it in. Delete optional modules that don't apply to your project, then reference architecture.md in your AI tool config (CLAUDE.md, .cursorrules).