The Spec Growth Engine: Spec-Anchored, Code-Coupled, Drift-Enforced Architecture for AI-Assisted Software Development
Pith reviewed 2026-06-26 03:37 UTC · model grok-4.3
The pith
A spec graph with contract-design separation, ownership-path context scoping, hardest-first vertical slices, and a blocking drift gate prevents context explosion and silent spec-code drift for AI coding agents.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Spec Growth Engine is a lightweight framework that addresses two failure modes, context explosion and silent spec-code drift, with four mechanisms: a machine-readable spec graph whose nodes carry explicit contract/design separation, a Spine context assembler that scopes agent context to an ownership path, a vertical-slice growth protocol that enforces hardest-first ordering, and a drift gate that makes spec-code divergence a blocking merge condition. The design synthesises established principles (Parnas information hiding, C4, ADRs, Walking Skeleton, Reflexion Models, and Fitness Functions) into a lean, code-coupled, machine-enforced whole without the overhead of heavyweight frameworks.
What carries the argument
The spec graph whose nodes separate contracts from designs, together with the Spine context assembler, the vertical-slice growth protocol, and the drift gate that blocks merges on divergence.
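To make the first two pieces concrete, here is a minimal sketch of how a spec graph node and an ownership-path context assembler might look. The paper gives no data model, so everything here is an assumption made for illustration: the `SpecNode` fields, the single-parent ownership link, the function names `ownership_path` and `assemble_context`, and the rule that ancestors contribute only their contract while the target node also contributes its design.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SpecNode:
    """One node of a hypothetical spec graph (field names are assumptions)."""
    name: str
    contract: str                 # what the node promises to its callers
    design: str                   # how it is built; hidden from other nodes
    parent: Optional[str] = None  # owning node, None for the root


def ownership_path(graph: Dict[str, SpecNode], target: str) -> List[SpecNode]:
    """Follow parent links from the target to the root; return root first."""
    path: List[SpecNode] = []
    seen = set()
    current: Optional[str] = target
    while current is not None:
        if current in seen:
            raise ValueError(f"ownership cycle at {current!r}")
        seen.add(current)
        node = graph[current]
        path.append(node)
        current = node.parent
    return list(reversed(path))


def assemble_context(graph: Dict[str, SpecNode], target: str) -> str:
    """Emit contracts along the path, plus the design of the target only."""
    lines: List[str] = []
    for node in ownership_path(graph, target):
        lines.append(f"[contract] {node.name}: {node.contract}")
        if node.name == target:
            lines.append(f"[design] {node.name}: {node.design}")
    return "\n".join(lines)


# Illustrative graph: "catalog" is a sibling branch and must stay out of scope.
graph = {
    "shop": SpecNode("shop", "Sells products online.", "Modular monolith."),
    "billing": SpecNode("billing", "Charges customers exactly once per order.",
                        "Wraps a payment provider.", parent="shop"),
    "invoices": SpecNode("invoices", "Issues one invoice per paid order.",
                         "PDF rendering from a template.", parent="billing"),
    "catalog": SpecNode("catalog", "Lists items and prices.",
                        "Search index over item rows.", parent="shop"),
}

context = assemble_context(graph, "invoices")
print(context)
```

In this sketch the context for `invoices` grows with the depth of the ownership path rather than with the size of the graph, which is the scoping effect the summary attributes to the Spine assembler. Whether the real assembler works this way is not stated in the source.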
If this is right
- AI agents can work across growing repositories without output quality dropping from full-context overload.
- Specifications remain visibly coupled to code because divergence is treated as a blocking condition at merge time.
- Development follows a hardest-first vertical-slice order that keeps structural decisions visible early.
- The same machinery can be applied without adopting heavy process frameworks such as RUP or MDA.
- Context supplied to agents is restricted to an ownership path, limiting the information each agent must process.
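The second consequence above, divergence as a blocking condition at merge time, can be sketched as a simple check. This is a deliberately crude stand-in, not the paper's mechanism: it assumes the spec records a SHA-256 fingerprint of each code file it describes, and it blocks the merge whenever a file changed without the record being refreshed. The names `fingerprint`, `drift_gate`, and `may_merge`, and the in-memory `files` mapping, are hypothetical.

```python
import hashlib
from typing import Dict, List


def fingerprint(source: str) -> str:
    """Content hash used as the spec's record of the code it describes."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def drift_gate(recorded: Dict[str, str], files: Dict[str, str]) -> List[str]:
    """Return every path whose current content no longer matches the record.

    `recorded` maps path -> fingerprint stored with the spec (assumption).
    `files` maps path -> current file content at merge time (assumption).
    A path that the spec records but that no longer exists counts as drift.
    """
    drifted: List[str] = []
    for path, expected in sorted(recorded.items()):
        current = files.get(path)
        if current is None or fingerprint(current) != expected:
            drifted.append(path)
    return drifted


def may_merge(recorded: Dict[str, str], files: Dict[str, str]) -> bool:
    """The gate is blocking: any drifted path rejects the merge."""
    return not drift_gate(recorded, files)


# Spec and code agree at the time the fingerprints are recorded.
files = {"billing/invoice.py": "def total(lines):\n    return sum(lines)\n"}
recorded = {path: fingerprint(src) for path, src in files.items()}

# The code changes, but the spec record is not refreshed.
changed = {"billing/invoice.py": "def total(lines):\n    return sum(lines) * 2\n"}

print(may_merge(recorded, files))
print(drift_gate(recorded, changed))
```

A hash comparison flags every edit, including ones that leave the contract intact, so a practical gate would more likely compare structure or behaviour in the spirit of Reflexion Models or Fitness Functions. The sketch only shows the blocking shape of the check.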
Where Pith is reading between the lines
- The drift-gate idea could be ported to conventional non-AI workflows to catch divergence earlier in review cycles.
- The vertical-slice ordering rule might improve project predictability even when no AI agent is present.
- Automated extraction of the initial spec graph from an existing codebase would be a natural next implementation step.
- Regulated domains that require traceable alignment between requirements and code could adopt the drift gate as an audit point.
Load-bearing premise
Combining the spec graph, Spine assembler, vertical-slice protocol, and drift gate will remove context explosion and silent drift in real projects without creating new failure modes or unacceptable overhead.
What would settle it
Run the framework on a multi-module application with an AI agent, then check whether context-window usage stays low as the repository grows and whether any spec-code divergence reaches a merge without being rejected by the drift gate.
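The two checks in that experiment could be recorded and evaluated along the following lines. This is only a sketch under stated assumptions: the `MergeObservation` fields, the token counts, and the fixed `budget` threshold are all hypothetical, since the source does not quantify what "stays low" means.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MergeObservation:
    """One attempted merge during the trial (all fields are assumptions)."""
    repo_tokens: int     # size of the whole repository at that point
    context_tokens: int  # size of the context actually given to the agent
    drifted: bool        # did spec and code disagree in this change?
    merged: bool         # did the change reach the main branch?


def escaped_drift(observations: List[MergeObservation]) -> List[MergeObservation]:
    """Changes that diverged from the spec and were merged anyway."""
    return [o for o in observations if o.drifted and o.merged]


def context_stays_bounded(observations: List[MergeObservation], budget: int) -> bool:
    """True if every agent context fit the budget, whatever the repo size."""
    return all(o.context_tokens <= budget for o in observations)


# Made-up observations for illustration only; not results from the paper.
observations = [
    MergeObservation(repo_tokens=40_000, context_tokens=3_000, drifted=False, merged=True),
    MergeObservation(repo_tokens=120_000, context_tokens=3_400, drifted=True, merged=False),
    MergeObservation(repo_tokens=300_000, context_tokens=3_900, drifted=False, merged=True),
]

print(escaped_drift(observations))
print(context_stays_bounded(observations, budget=8_000))
```

Under this reading, the framework's claim fails the test if `escaped_drift` is ever non-empty or if context size tracks repository size instead of staying within the budget.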
Original abstract
AI coding agents dramatically accelerate implementation speed but introduce two structural failure modes that existing spec-driven approaches do not fully solve: (1) context explosion -- the agent must reason over an entire repository at once, degrading output quality as the context window fills; and (2) silent spec-code drift -- code evolves, the specification does not, and the divergence becomes invisible until it is costly to repair. We present the Spec Growth Engine, a lightweight framework that addresses both failure modes through a machine-readable spec graph whose nodes carry explicit contract/design separation, a Spine context assembler that scopes agent context to an ownership path, a vertical-slice growth protocol that enforces hardest-first ordering, and a drift gate that makes spec-code divergence a blocking merge condition. The design synthesises well-established software engineering principles (Parnas information hiding, C4, ADRs, Walking Skeleton, Reflexion Models, Fitness Functions) into a lean, code-coupled, machine-enforced whole -- without the overhead of heavy-weight frameworks such as RUP or MDA.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that the Spec Growth Engine framework—consisting of a machine-readable spec graph with explicit contract/design separation, a Spine context assembler that scopes agent context to an ownership path, a vertical-slice growth protocol enforcing hardest-first ordering, and a drift gate that blocks merges on spec-code divergence—addresses context explosion and silent spec-code drift in AI-assisted development. It does so by synthesizing established principles (Parnas information hiding, C4, ADRs, Walking Skeleton, Reflexion Models, Fitness Functions) into a lightweight, code-coupled, machine-enforced system without the overhead of heavy frameworks like RUP or MDA.
Significance. If the proposed synthesis of these components can be shown to deliver the claimed scoping and enforcement effects, the work could provide a practical, enforceable architecture for AI-assisted software development that maintains spec-code alignment and manages context without introducing unacceptable overhead or new failure modes.
major comments (1)
- [Abstract] Abstract (second paragraph): the central claim that the spec graph, Spine assembler, vertical-slice protocol, and drift gate 'address both failure modes' rests entirely on an untested synthesis assumption; the manuscript supplies no reasoning, interaction analysis, worked example, or validation demonstrating why this specific combination produces the claimed reductions in context explosion and silent drift without new overhead or failure modes.
Simulated Author's Rebuttal
We thank the referee for their thoughtful review and for highlighting the need for stronger justification of the framework's claims. We address the major comment point by point below.
- Referee: [Abstract] Abstract (second paragraph): the central claim that the spec graph, Spine assembler, vertical-slice protocol, and drift gate 'address both failure modes' rests entirely on an untested synthesis assumption; the manuscript supplies no reasoning, interaction analysis, worked example, or validation demonstrating why this specific combination produces the claimed reductions in context explosion and silent drift without new overhead or failure modes.
  Authors: We agree that the current manuscript, as a conceptual design paper, includes neither empirical validation nor a detailed worked example of how the components interact. The claims rest on a synthesis of established principles (Parnas information hiding for scoping, C4 and ADRs for structure, Walking Skeleton and vertical slices for growth, Reflexion Models and Fitness Functions for drift enforcement). We acknowledge, however, that the manuscript gives no explicit reasoning about their combined effects. In the revised manuscript we will add a section with a step-by-step worked example that applies the framework to a sample project, analyses how the components interact to mitigate context explosion (via Spine scoping) and silent drift (via the drift gate), and discusses why the synthesis does not introduce unacceptable overhead, given the lightweight nature of the components. This will strengthen the justification for the central claim.
  Revision: yes
Circularity Check
No circularity; purely conceptual synthesis of external principles
The manuscript proposes a framework by combining named external principles (Parnas information hiding, C4, ADRs, Walking Skeleton, Reflexion Models, Fitness Functions) into a spec graph, Spine assembler, vertical-slice protocol, and drift gate. No equations, fitted parameters, derivations, or predictions exist. No self-citations appear as load-bearing justification for the central claim; the synthesis is presented as an untested design assumption rather than a result forced by prior author work or definitional loops. The paper is self-contained as a proposal and does not reduce any claimed outcome to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: AI coding agents suffer from context explosion and silent spec-code drift as primary structural failure modes that existing spec-driven approaches do not fully solve.
invented entities (4)
- Spec graph with explicit contract/design separation: no independent evidence
- Spine context assembler: no independent evidence
- Vertical-slice growth protocol: no independent evidence
- Drift gate: no independent evidence