
arXiv: 2606.09751 · v2 · pith:AFYXOCBXnew · submitted 2026-06-08 · 💻 cs.AI · cs.CL · cs.HC

Collaborative Human-Agent Protocol (CHAP)

Pith reviewed 2026-06-27 16:15 UTC · model grok-4.3

classification 💻 cs.AI · cs.CL · cs.HC
keywords Collaborative Human-Agent Protocol · CHAP · human-agent collaboration · accountable AI · multi-agent systems · evidence log · signed decisions · override events

The pith

CHAP turns human overrides and approvals of agent drafts into structured events with diffs, rationales, hashes, and signatures.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Foundation models now handle multi-step operational work in teams that span humans and other agents, but the critical human edits and approvals still disappear into chats or tickets. CHAP defines a protocol whose minimal core records these moments as events in an append-only evidence log, making them portable, signed, and replayable. The design adds review, handoff, identity, and audit features only through composable profiles rather than a monolithic specification. If the protocol works as described, past decisions can be reconstructed and audited across time zones and trust boundaries without relying on application-specific code or tribal memory. The paper supplies the specification, reference implementation, and conformance tests to support adoption.

Core claim

CHAP standardizes the shared workspace for humans and agents through a Core of workspaces, participants, tasks, artefacts, and an append-only evidence log; overrides become events carrying a diff, rationale, and content hash while human approvals become non-repudiable signed decisions that can be replayed years later.
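To make the "diff, rationale, and content hash" triple concrete, here is a minimal sketch of what such an override event might look like. The field names, the `override` type label, the `human:` actor prefix, and the choice of SHA-256 and unified diffs are all assumptions for illustration; the actual wire format lives in the CHAP repository and is not reproduced in this summary.

```python
import difflib
import hashlib
import json
from datetime import datetime, timezone


def content_hash(text: str) -> str:
    """SHA-256 of the UTF-8 text, hex-encoded (hash choice is an assumption)."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_override_event(actor: str, task_id: str, draft: str, final: str, rationale: str) -> dict:
    """Build a hypothetical override event: diff + rationale + content hash."""
    diff = "".join(
        difflib.unified_diff(
            draft.splitlines(keepends=True),
            final.splitlines(keepends=True),
            fromfile="agent_draft",
            tofile="human_final",
        )
    )
    return {
        "type": "override",  # hypothetical event type name
        "task_id": task_id,
        "actor": actor,
        "rationale": rationale,
        "diff": diff,
        "content_hash": content_hash(final),  # hash of the text that shipped
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


event = make_override_event(
    actor="human:alice",
    task_id="task-1",
    draft="Refund approved for 30 days.\n",
    final="Refund approved for 14 days.\n",
    rationale="Policy caps refunds at 14 days.",
)
print(json.dumps(event, indent=2))
```

The diff records what the human changed, the rationale records why, and the hash pins the exact text that resulted, so a later reader can check that the stored artefact is the one the human signed off on.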

What carries the argument

The Core of workspaces, participants, tasks, artefacts, and an append-only evidence log, extended by composable profiles that add review, modes, routing, deliberation, handoff, identity, signatures, and transparency-backed audit.
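One way to picture "a small Core extended by composable profiles" is a workspace that always appends to its log, plus optional checks that a deployment switches on. The sketch below is an assumption about how that layering could be modelled, not the reference implementation: the `Workspace` class, the two profile functions, and every field name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

# A profile is modelled here as a function that inspects an envelope and
# raises ValueError if the envelope violates that profile's rules.
Profile = Callable[[dict], None]


def review_profile(envelope: dict) -> None:
    """Hypothetical review profile: approvals must name a reviewer."""
    if envelope.get("type") == "approval" and not envelope.get("reviewer"):
        raise ValueError("approval without reviewer")


def signature_profile(envelope: dict) -> None:
    """Hypothetical signatures profile: every envelope must carry a signature field."""
    if "signature" not in envelope:
        raise ValueError("missing signature")


@dataclass
class Workspace:
    """Core only: participants plus an append-only list of accepted envelopes."""
    participants: set[str]
    profiles: list[Profile] = field(default_factory=list)
    _log: list[dict] = field(default_factory=list)

    def append(self, envelope: dict) -> int:
        if envelope.get("actor") not in self.participants:
            raise ValueError("unknown participant")
        for check in self.profiles:  # active profiles run in order
            check(envelope)
        self._log.append(dict(envelope))  # store a copy; no update or delete API
        return len(self._log) - 1

    @property
    def evidence(self) -> tuple[dict, ...]:
        # Return copies so callers cannot mutate stored entries.
        return tuple(dict(entry) for entry in self._log)


core_only = Workspace(participants={"human:alice", "agent:drafter"})
core_only.append({"type": "approval", "actor": "human:alice"})  # accepted by Core alone

with_review = Workspace(participants={"human:alice"}, profiles=[review_profile])
try:
    with_review.append({"type": "approval", "actor": "human:alice"})
except ValueError as err:
    print("rejected:", err)  # rejected: approval without reviewer
```

The same envelope is accepted by a Core-only workspace and rejected once the review profile is active, which is the sense in which profiles add obligations without changing the role of the log.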

If this is right

  • Overrides that previously vanished into chat threads become structured events carrying a diff, rationale, and content hash.
  • Shift handoffs become portable envelopes rather than pinned messages.
  • Human approval of an agent's draft becomes a non-repudiable signed decision that can be replayed years later.
  • The protocol complements MCP for tool access and A2A for agent interoperability by defining the human-agent workspace.
  • A conformance suite allows independent verification that implementations preserve the required event structure and signatures.
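The "non-repudiable signed decision" in the list above depends on an asymmetric signature: only the approver's private key can produce it, and anyone with the public key can check it. The summary does not name a signature scheme, so the sketch below swaps in a Lamport one-time signature purely because it needs nothing beyond the Python standard library. A real deployment would use a standard scheme and a proper canonicalization step. The decision fields are hypothetical.

```python
import hashlib
import json
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Lamport one-time key pair: 256 pairs of secrets and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def _bits(message: bytes) -> list[int]:
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private, message: bytes) -> list[bytes]:
    """Reveal one secret per digest bit. A key must sign only ONE message."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature: list[bytes]) -> bool:
    return len(signature) == 256 and all(
        _h(sig) == public[i][bit]
        for i, (sig, bit) in enumerate(zip(signature, _bits(message)))
    )


def canonical(decision: dict) -> bytes:
    """Deterministic bytes for signing (sorted keys; a simplification)."""
    return json.dumps(decision, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Hypothetical approval decision; field names are assumptions.
decision = {"type": "approval", "actor": "human:alice", "artefact_hash": "sha256:abc123"}
private_key, public_key = keygen()
signature = sign(private_key, canonical(decision))

print(verify(public_key, canonical(decision), signature))  # True

tampered = {**decision, "actor": "human:mallory"}
print(verify(public_key, canonical(tampered), signature))  # False
```

Replaying the decision years later then amounts to re-running `verify` against the stored bytes and the approver's published public key, which is why key management and canonical serialization matter as much as the signature itself.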

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Regulated domains could adopt the evidence log to produce auditable records of oversight without custom logging code.
  • The signed events could support automated compliance checks that verify human review occurred before certain outputs are released.
  • Patterns of human intervention extracted from the log might inform future agent training on when to request approval.
  • The protocol could be extended to other collaboration standards by mapping its event types to existing audit formats.
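The second extension above, an automated check that human review happened before release, can be sketched as a single pass over the log. The event types (`approval`, `release`), the `human:` actor prefix, and the `artefact_hash` field are hypothetical names chosen for this illustration. A production check would also verify the signature on each approval.

```python
def reviewed_before_release(log: list[dict], artefact_hash: str) -> bool:
    """True if a human approval of this artefact hash precedes its release.

    Returns False when the artefact was never released or was released
    without a prior human approval. Event and field names are hypothetical.
    """
    approved = False
    for entry in log:  # log order is taken to be acceptance order
        if entry.get("artefact_hash") != artefact_hash:
            continue
        if entry.get("type") == "approval" and entry.get("actor", "").startswith("human:"):
            approved = True
        elif entry.get("type") == "release":
            return approved  # decided at the first release of this artefact
    return False


log = [
    {"type": "draft", "actor": "agent:drafter", "artefact_hash": "sha256:aaa"},
    {"type": "approval", "actor": "human:alice", "artefact_hash": "sha256:aaa"},
    {"type": "release", "actor": "agent:drafter", "artefact_hash": "sha256:aaa"},
    {"type": "release", "actor": "agent:drafter", "artefact_hash": "sha256:bbb"},
]

print(reviewed_before_release(log, "sha256:aaa"))  # True
print(reviewed_before_release(log, "sha256:bbb"))  # False
```

Keying the check on the content hash rather than a task identifier means an approval of one draft cannot be reused to release a different one.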

Load-bearing premise

A small core of workspaces, participants, tasks, artefacts, and an append-only evidence log plus composable profiles will capture the full range of accountable human-agent interactions across teams, time zones, and trust boundaries.

What would settle it

A multi-human, multi-agent collaboration recorded under CHAP in which the evidence log cannot reconstruct the sequence of edits, rationales, and approvals after the fact.
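The test proposed above can be phrased operationally: starting from the initial draft, replay every logged edit and confirm that each step lands on the recorded content hash. The following sketch assumes, hypothetically, that edits are stored as `difflib.ndiff` deltas and that events carry `type`, `actor`, `rationale`, and `content_hash` fields; CHAP's actual diff format is not specified in this summary.

```python
import difflib
import hashlib


def sha(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def record_edit(log: list, actor: str, old: str, new: str, rationale: str) -> None:
    """Append an edit event holding an ndiff delta and the hash of the new text."""
    delta = list(difflib.ndiff(old.splitlines(keepends=True), new.splitlines(keepends=True)))
    log.append({"type": "edit", "actor": actor, "rationale": rationale,
                "delta": delta, "content_hash": sha(new)})


def replay(log: list, initial: str):
    """Rebuild each version from the deltas; fail if any step does not check out."""
    text, history = initial, []
    for index, entry in enumerate(log):
        if entry["type"] == "edit":
            before = "".join(difflib.restore(entry["delta"], 1))
            if before != text:
                raise ValueError(f"entry {index}: delta does not apply to current text")
            text = "".join(difflib.restore(entry["delta"], 2))
            if sha(text) != entry["content_hash"]:
                raise ValueError(f"entry {index}: content hash mismatch")
        history.append((entry["type"], entry["actor"], entry.get("rationale", "")))
    return text, history


log = []
v0 = "Refund approved for 30 days.\n"
v1 = "Refund approved for 14 days.\n"
record_edit(log, "human:alice", v0, v1, "Policy caps refunds at 14 days.")
log.append({"type": "approval", "actor": "human:bob"})

final, history = replay(log, v0)
print(final == v1)  # True
print(history)
```

A recorded collaboration for which `replay` raises, or for which the returned history omits an edit or approval that actually happened, would be the kind of counterexample the paragraph above describes.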

Figures

Figures reproduced from arXiv: 2606.09751 by Arsalan Shahid, Gordon Suttie, Philip Black.

Figure 1. Three waves in the evolution of agentic systems. Wave I centred on isolated conversa…
Figure 2. The agent protocol stack. MCP handles agent-to-tool and agent-to-resource access; …
Figure 3. CHAP adoption model. Core is useful on its own. Profiles are layered when a workspace needs additional collaboration, identity, control, security, or audit capability.
Figure 4. Conceptual relations among CHAP primitives. A workspace has participants, active profiles, policy references, an operational mode, and a stream of tasks. Tasks produce artefacts. Accepted envelopes become evidence entries. The diagram is generated from the same Mermaid source as the reference implementation documentation.
Figure 5. Simplified CHAP task lifecycle. Core transitions capture the basic movement of accountable work. Profiles add specialised transitions, such as review request, deliberation, and mode promotion, without changing the role of the evidence log. Every accepted transition becomes an evidence entry.
Figure 6. Evidence chain. Each accepted envelope can reference the previous entry’s hash. In…
Figure 7. Mode promotion ladder. An agent starts in…
Figure 8. Four deployment topologies. The Coordinator-mediated topology is the default.
Figure 9. Weighted-vote deliberation. The Coordinator opens deliberation when the proposed…
Original abstract

Foundation models are moving from response generation into operational roles. They plan across steps, call tools, request human input, coordinate with other agents, and increasingly carry responsibility for work that affects customers, claims, code, contracts, and clinical decisions. Production deployments are no longer one human supervising one model. They are multi-human, multi-agent collaborations that cross teams, time zones, and trust boundaries. The technical surface for this collaboration remains weakly specified. When an agent drafts a response and a human edits it before it ships, the moment of human judgement is the most valuable signal in the system. In current practice it is recorded, if at all, in application code, chat threads, ticket comments, and tribal memory. Two protocol standards address adjacent concerns: MCP standardises agent access to tools and data, and A2A standardises agent-to-agent interoperability. Neither defines the shared workspace in which humans and agents perform accountable work together. This paper presents CHAP, the Collaborative Human-Agent Protocol. Under CHAP, the override that used to vanish into a chat thread becomes a structured event carrying a diff, a rationale, and a content hash. The handoff between shifts becomes a portable envelope rather than a pinned message. The human approval of an agent's draft becomes a non-repudiable signed decision that can be replayed years later. The protocol achieves this through a small Core (workspaces, participants, tasks, artefacts, and an append-only evidence log) together with composable profiles that add review, modes, routing, deliberation, handoff, identity, signatures, and transparency-backed audit as deployments require them. Specification, reference implementation, conformance suite, and worked examples are available at: https://github.com/BrightbeamAI/chap

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper introduces the Collaborative Human-Agent Protocol (CHAP) to address the lack of standards for accountable multi-human, multi-agent collaborations in production AI systems. It defines a minimal core consisting of workspaces, participants, tasks, artefacts, and an append-only evidence log, together with composable profiles that can add review, modes, routing, deliberation, handoff, identity, signatures, and transparency-backed audit capabilities as needed. The protocol is positioned as turning informal overrides, handoffs, and approvals into structured, non-repudiable events carrying diffs, rationales, and content hashes that can be replayed.

Significance. If the protocol design holds and sees adoption, CHAP could provide a practical foundation for traceable human judgment in high-stakes AI deployments across domains such as code, contracts, and clinical decisions, complementing existing standards like MCP and A2A. The paper's explicit provision of a reference implementation, conformance suite, and worked examples on GitHub is a strength that supports reproducibility and further evaluation.

major comments (1)
  1. [Abstract] Abstract and overall manuscript: the central claims regarding non-repudiation, replayability of signed decisions, and sufficiency of the small core plus composable profiles rest on a high-level description only; no formal specification, message formats, data schemas, or worked protocol traces appear in the text itself (the manuscript instead refers readers to an external GitHub repository for these details). This absence is load-bearing for assessing whether the stated properties actually hold.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback highlighting the need for greater self-containment in the manuscript. We address the concern point by point below.

Point-by-point responses
  1. Referee: [Abstract] Abstract and overall manuscript: the central claims regarding non-repudiation, replayability of signed decisions, and sufficiency of the small core plus composable profiles rest on a high-level description only; no formal specification, message formats, data schemas, or worked protocol traces appear in the text itself (the manuscript instead refers readers to an external GitHub repository for these details). This absence is load-bearing for assessing whether the stated properties actually hold.

    Authors: We acknowledge that the current version of the manuscript presents CHAP at a conceptual level and directs readers to the GitHub repository for the complete formal specification, schemas, and traces. This design choice kept the paper concise and focused on motivation and architecture, consistent with many protocol introductions. However, the referee is correct that this makes direct evaluation of properties such as non-repudiation and replayability dependent on external material. In revision we will add a new section containing the core data schemas (JSON Schema excerpts for workspaces, tasks, artefacts, and the evidence log), representative message formats, and one end-to-end protocol trace that explicitly demonstrates signed events, content hashing, and replay. The full reference implementation and conformance suite will remain in the repository. We believe these additions will allow the claims to be assessed from the text while preserving the paper's scope.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper defines CHAP as a protocol via a small Core (workspaces, participants, tasks, artefacts, append-only evidence log) plus composable profiles; all described capabilities (structured override events with diff/rationale/hash, non-repudiable signed approvals) are direct definitional consequences of those entities rather than outputs of any derivation, equation, fitted parameter, or self-citation chain. No load-bearing step reduces to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

The paper introduces a new protocol specification without empirical data, derivations or external benchmarks; it relies on the domain assumption that structured logging will improve accountability and on the invented entities of the CHAP core and profiles.

axioms (1)
  • domain assumption: An append-only evidence log together with content hashes and signatures can produce non-repudiable records of human decisions.
    Invoked when the abstract states that human approval becomes a non-repudiable signed decision that can be replayed years later.
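The axiom bundles three mechanisms, and it helps to see what the hash part contributes on its own. The Figure 6 caption notes that each accepted envelope can reference the previous entry's hash. The sketch below shows how such a chain makes in-place edits detectable. The entry layout, the `GENESIS` placeholder, and the use of SHA-256 over sorted-key JSON are assumptions for illustration only.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def entry_hash(body: dict, prev_hash: str) -> str:
    """Hash the entry body together with the previous entry's hash."""
    payload = json.dumps({"body": body, "prev": prev_hash},
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(chain: list[dict], body: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"body": body, "prev": prev, "hash": entry_hash(body, prev)})


def first_broken_index(chain: list[dict]):
    """Return the index of the first inconsistent entry, or None if intact."""
    prev = GENESIS
    for index, entry in enumerate(chain):
        if entry["prev"] != prev or entry["hash"] != entry_hash(entry["body"], prev):
            return index
        prev = entry["hash"]
    return None


chain = []
append(chain, {"type": "draft", "actor": "agent:drafter"})
append(chain, {"type": "approval", "actor": "human:alice"})
append(chain, {"type": "release", "actor": "agent:drafter"})
print(first_broken_index(chain))  # None

chain[1]["body"]["actor"] = "human:mallory"  # rewrite history in place
print(first_broken_index(chain))  # 1
```

A chain like this only shows that entries are consistent with one another. An operator who controls the whole log could recompute every hash after an edit, so the chain gives tamper evidence only relative to a head hash that someone else holds. That gap is why the axiom also leans on signatures and, in the profiles, on transparency-backed audit.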
invented entities (2)
  • CHAP Core (workspaces, participants, tasks, artefacts, append-only evidence log): no independent evidence
    purpose: To provide the minimal shared structure for human-agent accountable work.
    Newly defined in the paper; no independent evidence supplied.
  • Composable profiles (review, modes, routing, deliberation, handoff, identity, signatures, transparency-backed audit): no independent evidence
    purpose: To allow deployments to add functionality only when required.
    Newly defined in the paper; no independent evidence supplied.

pith-pipeline@v0.9.1-grok · 5850 in / 1519 out tokens · 28630 ms · 2026-06-27T16:15:05.428224+00:00 · methodology

