arxiv: 2604.13097 · v1 · submitted 2026-04-10 · 💻 cs.SE · cs.AI


ECM Contracts: Contract-Aware, Versioned, and Governable Capability Interfaces for Embodied Agents

Xue Qin, Simin Luan, John See, Cong Yang, Zhijun Li


Pith reviewed 2026-05-10 17:59 UTC · model grok-4.3


Keywords: embodied agents, capability modules, software contracts, module composition, version compatibility, runtime governance, robotics interfaces, software ecosystems


The pith

Embodied agents can compose and upgrade capability modules safely by encoding six execution dimensions into contracts rather than relying on input-output types alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes ECM Contracts to turn modular embodied capabilities into a stable, governable ecosystem instead of ad hoc bundles. The contracts add six dimensions—functional signature, behavioral assumptions, resource requirements, permission boundaries, recovery semantics, and version compatibility—to standard interfaces. These details support static checks that catch type mismatches, dependency conflicts, policy violations, resource issues, and recovery problems before installation or composition. A release process with version classes, deprecation rules, and upgrade checks is also defined. Evaluation in a robotics setting indicates fewer unsafe combinations and better rollback readiness than schema-only baselines.

Core claim

ECM Contracts extend conventional software interfaces by encoding six dimensions essential for embodied execution (functional signature, behavioral assumptions, resource requirements, permission boundaries, recovery semantics, and version compatibility) into reusable capability modules. This model supplies a compatibility framework for static and pre-deployment checks during module installation, composition, and upgrade, plus a release discipline of version-aware compatibility classes, deprecation rules, migration constraints, and policy-sensitive checks. A prototype registry, resolver, and checker implemented for a robotics runtime shows that contract-aware composition reduces unsafe or invalid module combinations.
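To make the six dimensions concrete, here is a minimal sketch of how one contract record might be laid out. The class name `ECMContract`, every field name, and the `grasp` example are assumptions made for illustration; the paper's actual schema is not shown in this summary.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ECMContract:
    """Hypothetical contract record: one field group per dimension."""

    name: str
    # 1. Functional signature: typed inputs and outputs.
    inputs: dict[str, str]
    outputs: dict[str, str]
    # 2. Behavioral assumptions the module expects to hold while running.
    assumptions: frozenset[str] = frozenset()
    # 3. Resource requirements, e.g. actuators or sensors the module claims.
    resources: frozenset[str] = frozenset()
    # 4. Permission boundaries: actions the module may request.
    permissions: frozenset[str] = frozenset()
    # 5. Recovery semantics: what the module does when it fails.
    recovery: str = "abort"
    # 6. Version compatibility: own version plus minimum dependency versions.
    version: tuple[int, int, int] = (0, 1, 0)
    requires: dict[str, tuple[int, int, int]] = field(default_factory=dict)


# Illustrative module; all values are made up.
grasp = ECMContract(
    name="grasp",
    inputs={"target_pose": "Pose"},
    outputs={"holding": "bool"},
    assumptions=frozenset({"object_static"}),
    resources=frozenset({"arm", "gripper"}),
    permissions=frozenset({"actuate_arm"}),
    recovery="release_and_retreat",
    version=(1, 2, 0),
)
print(grasp)
```

A schema-only interface would stop after `inputs` and `outputs`; the remaining fields are what the contract model adds.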

What carries the argument

ECM Contracts, which encode six embodied execution dimensions into module interfaces to enable compatibility checks and versioned governance for composition and release.

If this is right

Contract-aware composition substantially reduces unsafe or invalid module combinations.
Contract-guided release checks improve upgrade safety and rollback readiness compared with schema-only baselines.
Static checks can catch type mismatches, dependency conflicts, policy violations, resource contention, and recovery incompatibilities before deployment.
Version-aware compatibility classes and deprecation rules allow controlled evolution of capabilities without breaking existing systems.
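A pre-deployment check of this kind can be sketched as a pure function over two contracts. This is an illustration under stated assumptions, not the paper's checker: the `Contract` fields, the `check_composition` helper, and the rule that two modules sharing an exclusive resource always conflict are all hypothetical, and dependency conflicts are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Hypothetical, trimmed-down contract used only for this sketch."""

    name: str
    inputs: dict
    outputs: dict
    resources: frozenset = frozenset()
    permissions: frozenset = frozenset()
    recovery: str = "abort"


def check_composition(a, b, allowed_permissions, exclusive_resources,
                      recovery_conflicts):
    """Return a list of violations for feeding module a into module b."""
    violations = []
    # Type mismatch: a port that a offers and b consumes must agree on type.
    for port, wanted in b.inputs.items():
        offered = a.outputs.get(port)
        if offered is not None and offered != wanted:
            violations.append(f"type mismatch on '{port}': {offered} -> {wanted}")
    # Policy violation: a requested permission is not granted by policy.
    for module in (a, b):
        for perm in sorted(module.permissions - allowed_permissions):
            violations.append(f"policy violation: {module.name} requests '{perm}'")
    # Resource contention: both modules claim the same exclusive resource.
    for res in sorted(a.resources & b.resources & exclusive_resources):
        violations.append(f"resource contention on '{res}'")
    # Recovery incompatibility: the two failure strategies are declared to clash.
    if frozenset({a.recovery, b.recovery}) in recovery_conflicts:
        violations.append(f"recovery incompatibility: {a.recovery} vs {b.recovery}")
    return violations


locate = Contract("locate", {"image": "Image"}, {"target_pose": "Pose2D"},
                  frozenset({"camera"}), frozenset({"read_camera"}), "retry")
grasp = Contract("grasp", {"target_pose": "Pose3D"}, {"holding": "bool"},
                 frozenset({"arm"}), frozenset({"actuate_arm"}),
                 "release_and_retreat")

violations = check_composition(
    locate, grasp,
    allowed_permissions=frozenset({"read_camera"}),
    exclusive_resources=frozenset({"arm"}),
    recovery_conflicts=set(),
)
for v in violations:
    print(v)
```

In this made-up example only the type mismatch would be visible to a schema-only check; the permission problem surfaces because the contract carries a permission boundary.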

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same contract structure could apply to other modular agent systems such as simulation environments or distributed sensor networks if the six dimensions generalize beyond the tested robotics runtime.
Automated tooling built around these contracts might one day suggest safe module upgrades by comparing contract fields without human review.
In production fleets, contract mismatches could become the primary signal for triggering governance actions rather than runtime failures.
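The field-comparison idea in the second point can be sketched as follows. The change classes (`breaking`, `additive`, `equivalent`), the dictionary layout, and both helper names are assumptions for illustration and are not the paper's compatibility classes; the version rule follows the general Semantic Versioning convention that incompatible changes call for a major bump.

```python
SURFACE = ("inputs", "outputs", "permissions", "resources")


def classify_upgrade(old, new):
    """Classify a contract change by comparing fields (hypothetical classes)."""
    # The new version demands an input the old one did not, or changes its type.
    demands_more = any(old["inputs"].get(k) != t for k, t in new["inputs"].items())
    # The new version drops or retypes an output that callers may rely on.
    provides_less = any(new["outputs"].get(k) != t for k, t in old["outputs"].items())
    # The new version asks for permissions or resources it did not hold before.
    widens_access = bool(new["permissions"] - old["permissions"]) or bool(
        new["resources"] - old["resources"]
    )
    if demands_more or provides_less or widens_access:
        return "breaking"
    if any(old[f] != new[f] for f in SURFACE):
        return "additive"
    return "equivalent"


def bump_is_sufficient(old_version, new_version, change_class):
    """Check that the version increased at or above the level the change needs."""
    level = {"breaking": 1, "additive": 2, "equivalent": 3}[change_class]
    return new_version[:level] > old_version[:level]


# Illustrative contracts; all values are made up.
old = {"inputs": {"target_pose": "Pose"}, "outputs": {"holding": "bool"},
       "permissions": {"actuate_arm"}, "resources": {"arm"}}
new = {"inputs": {"target_pose": "Pose"}, "outputs": {"holding": "bool"},
       "permissions": {"actuate_arm", "read_camera"}, "resources": {"arm"}}

change = classify_upgrade(old, new)
ok = bump_is_sufficient((1, 2, 0), (1, 3, 0), change)
print(change, ok)
```

Here the upgrade widens permissions but is published as a minor release, so the sketch flags it; a tool like this could route such cases to human review rather than approve them automatically.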

Load-bearing premise

The six contract dimensions plus the prototype checks are enough to capture the main execution risks that arise when embodied modules run together.

What would settle it

Running the same composition and upgrade tasks on a wider set of embodied platforms or non-robotics agents and counting how often unsafe combinations still slip through would show whether the six dimensions are sufficient.
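One way to count what "slips through" is to compare each checker's accept decisions against a ground-truth oracle. The sketch below assumes a hypothetical data layout and two toy checkers; the four compositions are invented for illustration and are not results from the paper.

```python
def unsafe_acceptance(compositions, checker, is_safe):
    """Return (count, rate) of unsafe compositions that the checker accepts."""
    unsafe = [c for c in compositions if not is_safe(c)]
    slipped = [c for c in unsafe if checker(c)]
    rate = len(slipped) / len(unsafe) if unsafe else 0.0
    return len(slipped), rate


# Toy data: (id, signature_ok, resources_ok). Values are made up.
compositions = [
    ("a", True, True),
    ("b", True, False),
    ("c", False, True),
    ("d", True, False),
]


def is_safe(c):
    # Ground-truth oracle for the toy data: every dimension must hold.
    return c[1] and c[2]


def schema_only(c):
    # Baseline: looks at the functional signature alone.
    return c[1]


def contract_aware(c):
    # Contract checker: also looks at the resource dimension.
    return c[1] and c[2]


schema_count, schema_rate = unsafe_acceptance(compositions, schema_only, is_safe)
contract_count, contract_rate = unsafe_acceptance(compositions, contract_aware, is_safe)
print(schema_count, schema_rate)
print(contract_count, contract_rate)
```

Repeating this measurement on other platforms would show whether failures cluster in risks that none of the six dimensions describe, which is the case the toy oracle cannot represent.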

Figures

Figures reproduced from arXiv: 2604.13097 by Cong Yang, John See, Simin Luan, Xue Qin, Zhijun Li.

Figure 2: Ground truth oracle distribution by contract dimension. Only 5 of 42 incompatibilities are …

Figure 3: Experiment 1 summary: (a) chains accepted by each method, (b) runtime failures among ac…

Figure 4: Ablation study: unsafe acceptances when each contract dimension is removed. Signature …

Original abstract

Embodied agents increasingly rely on modular capabilities that can be installed, upgraded, composed, and governed at runtime. Prior work has introduced embodied capability modules (ECMs) as reusable units of embodied functionality, and recent research has explored their runtime governance and controlled evolution. However, a key systems question remains unresolved: how can ECMs be composed and released as a stable software ecosystem rather than as ad hoc skill bundles? We present ECM Contracts, a contract-based interface model for embodied capability modules. Unlike conventional software interfaces that specify only input and output types, ECM Contracts encode six dimensions essential for embodied execution: functional signature, behavioral assumptions, resource requirements, permission boundaries, recovery semantics, and version compatibility. Based on this model, we introduce a compatibility framework for ECM installation, composition, and upgrade, enabling static and pre-deployment checks for type mismatches, dependency conflicts, policy violations, resource contention, and recovery incompatibilities. We further propose a release discipline for embodied capabilities, including version-aware compatibility classes, deprecation rules, migration constraints, and policy-sensitive upgrade checks. We implement a prototype ECM registry, resolver, and contract checker, and evaluate the approach on modular embodied tasks in a robotics runtime setting. Results show that contract-aware composition substantially reduces unsafe or invalid module combinations, and that contract-guided release checks improve upgrade safety and rollback readiness compared with schema-only or ad hoc baselines. Our findings suggest that stable embodied software ecosystems require more than modular packaging: they require explicit contracts that connect capability composition, governance, and evolution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper defines a six-dimension contract model for composing and upgrading embodied modules, but the evaluation is too light on details and coverage to confirm the safety gains.


The punchline is that ECM Contracts gives a concrete six-part interface spec (functional signature, behavioral assumptions, resources, permissions, recovery, and version compatibility) plus a checker and release rules for modular embodied agents. That fills a real gap in turning ad-hoc skill bundles into something more governable at runtime. The prototype registry and resolver show they thought through installation, composition, and upgrade paths in a robotics setting, which is useful scaffolding even if the numbers are missing here. Prior ECM work gets extended without obvious circularity, and the version-aware deprecation rules look like a practical addition for long-lived systems. The model itself is new enough on the specifics to be worth noting.

The evaluation claims fewer unsafe combinations and safer upgrades than schema-only baselines, but the abstract supplies no counts, task descriptions, or error breakdowns, so those results stay unverified. The stress-test point lands: nothing in the six dimensions directly addresses kinematic limits, timing under load, or sensor noise, so any reported reductions might be artifacts of the chosen tasks rather than general coverage. If the full paper has only the same high-level robotics runtime test, that section needs expansion before the safety argument holds.

Readers working on robotics middleware or agent platforms would find the dimensions and framework worth skimming for ideas on contract design. The work is coherent on its own terms and engages the right literature, so it clears the bar for peer review. Send it to referees but ask for quantitative results, a clearer description of the test suite, and explicit discussion of what physical or stochastic issues the contracts do not catch.

Referee Report

2 major / 2 minor

Summary. The paper proposes ECM Contracts, a contract-based interface model for embodied capability modules (ECMs) that encodes six dimensions essential for embodied execution: functional signature, behavioral assumptions, resource requirements, permission boundaries, recovery semantics, and version compatibility. It introduces a compatibility framework for static and pre-deployment checks during ECM installation, composition, and upgrade, along with a release discipline including version-aware compatibility classes, deprecation rules, and policy-sensitive checks. A prototype registry, resolver, and contract checker is implemented and evaluated on modular embodied tasks in a robotics runtime setting, with claims that contract-aware composition substantially reduces unsafe or invalid module combinations and that contract-guided release checks improve upgrade safety and rollback readiness relative to schema-only or ad hoc baselines.

Significance. If the empirical claims hold under rigorous evaluation, the work could provide a practical foundation for building stable, governable ecosystems of reusable embodied capabilities, addressing open questions in runtime composition, controlled evolution, and safety for modular robotics software. The explicit linkage between contracts, governance, and versioned release discipline is a notable strength, as is the prototype implementation that operationalizes the model.

major comments (2)

[Evaluation] Evaluation section: the manuscript asserts that contract-aware composition substantially reduces unsafe or invalid module combinations and that contract-guided checks improve upgrade safety, but supplies no quantitative data, baselines, methods, statistical analysis, or error bars to support these claims. Without such evidence, it is impossible to determine the magnitude of improvement or whether the six dimensions are sufficient to capture embodied-specific failure modes.
[Contract Model] Contract Model and Compatibility Framework sections: the central safety-reduction claim rests on the assumption that the six dimensions plus the prototype checker are adequate to detect relevant embodied execution issues. No targeted coverage analysis or counterexample tests are provided for embodied-specific concerns such as kinematic constraints, real-time timing under physical load, or stochastic sensor effects; if these fall outside the dimensions, the reported reductions may be artifacts of the chosen task suite rather than a general property of the model.

minor comments (2)

[Abstract] Abstract: the phrase 'results show' is used without any indication of the number of tasks, modules tested, or comparison details, reducing clarity for readers.
[Compatibility Framework] Ensure that all six contract dimensions are explicitly cross-referenced with the compatibility checks and release rules in the framework description to improve traceability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important gaps in the empirical support and coverage analysis, which we will address through targeted revisions to strengthen the manuscript.


Referee: Evaluation section: the manuscript asserts that contract-aware composition substantially reduces unsafe or invalid module combinations and that contract-guided checks improve upgrade safety, but supplies no quantitative data, baselines, methods, statistical analysis, or error bars to support these claims. Without such evidence, it is impossible to determine the magnitude of improvement or whether the six dimensions are sufficient to capture embodied-specific failure modes.

Authors: We agree that the evaluation section relies on illustrative examples from the task suite without providing quantitative metrics, explicit baselines, or statistical analysis. In the revised manuscript we will expand this section to report concrete measurements (e.g., counts of invalid combinations prevented under contract-aware versus schema-only and ad-hoc composition), describe the evaluation methodology in detail, and include statistical summaries or error bars where the data permit. These additions will allow readers to assess the magnitude of the reported improvements. revision: yes
Referee: Contract Model and Compatibility Framework sections: the central safety-reduction claim rests on the assumption that the six dimensions plus the prototype checker are adequate to detect relevant embodied execution issues. No targeted coverage analysis or counterexample tests are provided for embodied-specific concerns such as kinematic constraints, real-time timing under physical load, or stochastic sensor effects; if these fall outside the dimensions, the reported reductions may be artifacts of the chosen task suite rather than a general property of the model.

Authors: We acknowledge that the manuscript does not yet include an explicit coverage analysis or counterexample tests for embodied-specific issues such as kinematic constraints, timing under load, or sensor stochasticity. In the revision we will add a dedicated discussion of the six dimensions' coverage, identify which classes of embodied concerns are addressed by static contract checks and which are deferred to runtime mechanisms, and supply targeted counterexamples that illustrate both successful detection and cases outside the model's static scope. This will clarify the boundaries of the safety claims. revision: yes

Circularity Check

0 steps flagged

No circularity: ECM Contracts model and evaluation are self-contained


The paper introduces ECM Contracts as an extension of prior ECM work by defining six explicit contract dimensions (functional signature, behavioral assumptions, resource requirements, permission boundaries, recovery semantics, version compatibility) and a compatibility/release framework. It describes a prototype implementation and reports empirical comparisons against schema-only and ad hoc baselines on robotics tasks. No equations, fitted parameters, predictions that reduce to inputs, self-citations that bear the central claims, or ansatzes imported from prior author work appear in the text. The derivation chain consists of definitional modeling followed by independent prototype evaluation, satisfying the requirement for self-contained content without reduction to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The central claim rests on the domain assumption that embodied agents require stable ecosystems beyond ad hoc modules, plus the introduction of new conceptual entities (the contracts and frameworks) whose value is asserted via an unevaluated prototype.

axioms (1)

Domain assumption: Embodied agents increasingly rely on modular capabilities that can be installed, upgraded, composed, and governed at runtime.
Opening premise of the abstract that frames the entire problem and solution.

invented entities (2)

ECM Contracts (no independent evidence)
purpose: Contract-based interface model encoding six dimensions for embodied execution.
Newly proposed construct to solve composition and release problems.
Compatibility framework (no independent evidence)
purpose: Enables static and pre-deployment checks for installation, composition, and upgrades.
Introduced as part of the contract model.


