The Khipu Problem: Institutional Legibility Under Distributed Cognition

Krti Tallam

arXiv: 2606.12414 · v1 · pith:NQOS6RBZnew · submitted 2026-05-06 · cs.CY

Pith reviewed 2026-06-30 23:18 UTC · model grok-4.3

classification cs.CY

keywords khipu problem · distributed cognition · AI governance · interpretive continuity · institutional legibility · trace retention · cognitive episodes

The pith

Distributed AI creates records that later institutions cannot read even when the data survives.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

AI governance has assumed bounded models or agents, but real systems now spread cognition across models, tools, humans, retrieval layers, and institutional roles. The paper names the khipu problem: the record persists while the reading practice required to treat those traces as one coherent cognitive episode disappears. This produces a governance failure distinct from ordinary missing data because institutions lose the capacity to classify, trust, audit, or constrain the system. The argument distinguishes missing evidence, ambiguous evidence, and structurally unreadable evidence, then concludes that governance must preserve interpretive continuity rather than trace retention alone. It proposes governance workspaces and receipt-bearing surfaces as the required interpretive infrastructure.

Core claim

The khipu problem for distributed AI is that the record can survive while the reading practice needed to interpret it as part of one coherent cognitive episode decays, creating a structural mismatch between what can be represented and what institutions must still decide under consequential conditions.

What carries the argument

The khipu problem: survival of logs, traces, model versions, and approval artifacts without the institutional capacity to read them as a single distributed cognitive episode.

If this is right

Institutions must treat structurally unreadable evidence as a distinct category alongside missing and ambiguous evidence.
Consequential outcomes are better understood as distributed cognitive episodes than as outputs of bounded models.
Governance workspaces together with receipt-bearing governance surfaces can preserve action identity, authority, boundary truth, evidential scope, and consequential outcomes.
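
The paper gives no schema for a receipt and no decision rule for the three evidence categories, so the following Python sketch is only one illustrative reading. The `Receipt` fields mirror the five properties named above; `EvidenceStatus`, `classify`, the `competing_readings` count, and the rule that a surviving trace without a complete receipt counts as structurally unreadable are all assumptions introduced here, not the author's definitions.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import Optional


class EvidenceStatus(Enum):
    MISSING = "missing"
    AMBIGUOUS = "ambiguous"
    STRUCTURALLY_UNREADABLE = "structurally unreadable"
    READABLE = "readable"


@dataclass(frozen=True)
class Receipt:
    """Hypothetical receipt: one field per property the paper says must be preserved."""
    action_identity: Optional[str] = None
    authority: Optional[str] = None
    boundary_truth: Optional[str] = None
    evidential_scope: Optional[str] = None
    consequential_outcome: Optional[str] = None


def is_complete(receipt: Receipt) -> bool:
    # A receipt is treated as complete only if every field is filled in.
    return all(getattr(receipt, f.name) is not None for f in fields(receipt))


def classify(trace: Optional[str],
             receipt: Optional[Receipt],
             competing_readings: int = 1) -> EvidenceStatus:
    """Assumed decision rule, not taken from the paper."""
    if trace is None:
        # Nothing survived: the ordinary missing-data case.
        return EvidenceStatus.MISSING
    if receipt is None or not is_complete(receipt):
        # The trace survived, but the context needed to read it did not.
        return EvidenceStatus.STRUCTURALLY_UNREADABLE
    if competing_readings > 1:
        # The trace is readable but supports more than one interpretation.
        return EvidenceStatus.AMBIGUOUS
    return EvidenceStatus.READABLE
```

Under these assumptions, a surviving trace with no receipt is reported as structurally unreadable rather than missing, which is the case the khipu problem singles out.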

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same decay of interpretive capacity could affect accountability in any long-lived socio-technical system that distributes decision-making across changing components.
Current emphasis on data retention policies may need explicit requirements for maintaining the surrounding scaffolding that allows later readers to treat traces as one episode.

Load-bearing premise

The premise that the relevant object of governance is a distributed cognitive episode whose legibility depends on surrounding interpretive scaffolding rather than a bounded model or agent.

What would settle it

A documented case in which every model version, tool call, log, and approval artifact remains available yet no subsequent institution can reconstruct the authority, boundaries, or evidential basis of the outcome because the required interpretive practices have decayed.

Figures

Figures reproduced from arXiv: 2606.12414 by Krti Tallam.

**Figure 2.** Question-relative units of governance for distributed AI. Later review may need to traverse …

Original abstract

AI governance still tends to assume that the relevant object is a bounded model or a bounded agent. That assumption is getting weaker. Real systems increasingly distribute cognition across models, tools, humans, context stores, retrieval layers, runtime policies, authorization boundaries, and delegated institutional roles. In such systems, the central governance problem is no longer only what the system did, but whether later institutions can still read what the system was. This paper introduces the khipu problem for distributed AI: the record can survive while the reading practice needed to interpret it decays. Logs, traces, model versions, tool calls, outputs, and approval artifacts may remain available while the institutional capacity to read them as parts of one coherent cognitive episode disappears. We argue that this failure is better understood as loss of interpretive continuity than as ordinary lack of observability. The result is a distinct governance failure. Institutions must classify, trust, audit, and constrain systems whose relevant identity is distributed across components and whose legibility depends on surrounding interpretive scaffolding. The problem is not merely missing data. It is a structural mismatch between what can be represented and what must still be decided under consequential conditions. We therefore argue that governance for distributed AI requires preservation of interpretive continuity, not only trace retention. The paper distinguishes missing evidence, ambiguous evidence, and structurally unreadable evidence; argues that many consequential outcomes are better understood as distributed cognitive episodes than as bounded model outputs; and proposes governance workspaces together with receipt-bearing governance surfaces as interpretive infrastructure for preserving action identity, authority, boundary truth, evidential scope, and consequential outcomes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper names the khipu problem as loss of interpretive continuity in distributed AI records, a useful distinction but one that stays at the level of reframing without cases or mechanisms.

The main takeaway is that this paper introduces the khipu problem to describe how distributed AI systems can leave behind logs and traces that later institutions cannot read as a single coherent episode. It separates this failure from ordinary missing data by distinguishing missing, ambiguous, and structurally unreadable evidence, and it argues that governance needs to preserve interpretive continuity rather than just retain traces.

The distinction among those three evidence types is the clearest new element. It gives a vocabulary for talking about why audit and liability might break even when data is present. The paper also does a straightforward job of laying out why the bounded-agent assumption is weakening as systems spread across models, tools, retrieval, and human roles.

The argument is entirely conceptual. It offers no concrete cases, no worked examples of how the distinctions would apply in practice, and no formal definitions or derivation steps. The central claims rest on the initial premise about distributed cognitive episodes without additional support. That keeps the piece short, but it also leaves the distinctions as definitional moves rather than tested ones.

This is the sort of position paper that could interest readers working on AI governance frameworks and institutional design. It is coherent internally and engages the literature on observability without obvious contradictions. A serious referee could push on whether the framing adds operational value or needs grounding in specific domains.

I would send it to peer review.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces the 'khipu problem' for distributed AI governance: records, logs, and traces may persist while the institutional reading practices required to interpret them as coherent cognitive episodes decay. It reframes the core issue as loss of interpretive continuity rather than ordinary observability failures, distinguishes missing/ambiguous/structurally unreadable evidence, treats consequential outcomes as distributed cognitive episodes rather than bounded model outputs, and proposes governance workspaces plus receipt-bearing governance surfaces to preserve action identity, authority, boundary truth, evidential scope, and outcomes.

Significance. If the proposed distinctions and infrastructure prove operationalizable, the reframing could usefully shift AI governance discussions from trace retention toward institutional interpretive scaffolding. The paper receives credit for cleanly separating three evidence categories and for identifying a structural mismatch between representable data and decision requirements under consequential conditions.

major comments (2)

[Abstract] The claim that loss of interpretive continuity constitutes a distinct governance failure (distinct from ordinary lack of observability) rests on the unelaborated premise that distributed systems possess a coherent 'cognitive episode' whose identity can be lost; no criteria for identifying episode boundaries or for measuring continuity decay are supplied, rendering the distinction definitional rather than diagnostic.
[Abstract, opening paragraphs] The assertion that 'the relevant object of governance is no longer a bounded model or agent but a distributed cognitive episode' is load-bearing for all subsequent recommendations, yet the manuscript supplies neither a formal characterization of such episodes nor a concrete test (e.g., application to a retrieval-augmented multi-agent workflow) that would allow the claim to be evaluated or falsified.

minor comments (1)

[Abstract] The historical analogy motivating the term 'khipu problem' is invoked but not explained or referenced, leaving readers without the intended interpretive bridge.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which correctly identify places where the abstract would benefit from greater precision on the diagnostic criteria and testability of the proposed reframing. We respond to each major comment below.

Referee: [Abstract] The claim that loss of interpretive continuity constitutes a distinct governance failure (distinct from ordinary lack of observability) rests on the unelaborated premise that distributed systems possess a coherent 'cognitive episode' whose identity can be lost; no criteria for identifying episode boundaries or for measuring continuity decay are supplied, rendering the distinction definitional rather than diagnostic.

Authors: We agree that the abstract presents the distinction without explicit criteria for episode boundaries or continuity decay. The manuscript develops the distinction via the three evidence categories and the argument that consequential outcomes require interpretive scaffolding to maintain action identity, authority, and evidential scope. To strengthen the diagnostic character, we will revise the abstract and add a short subsection outlining preliminary criteria based on preservation of those properties, including indicators for when decay constitutes a governance failure.
revision: yes

Referee: [Abstract, opening paragraphs] The assertion that 'the relevant object of governance is no longer a bounded model or agent but a distributed cognitive episode' is load-bearing for all subsequent recommendations, yet the manuscript supplies neither a formal characterization of such episodes nor a concrete test (e.g., application to a retrieval-augmented multi-agent workflow) that would allow the claim to be evaluated or falsified.

Authors: The manuscript treats the shift to distributed cognitive episodes as a reframing that motivates the subsequent distinctions and infrastructure proposals rather than a fully axiomatized theory. We acknowledge that a formal characterization and a concrete test case would improve evaluability. In revision we will add a brief illustrative application to a retrieval-augmented multi-agent workflow, specifying how episode boundaries are identified through tool calls, authorization logs, and oversight points.
revision: yes
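
The promised illustration is not part of the reviewed text. As a rough sketch of what identifying episode boundaries through tool calls, authorization logs, and oversight points might involve, the Python below groups trace events by a shared episode identifier and reports which kinds of evidence each episode lacks. The event tuple layout, the `REQUIRED_KINDS` set, and both function names are hypothetical. The sketch also assumes that a shared identifier survives in the record, which is precisely what the khipu problem says may not hold.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical event layout: (episode_id, kind, detail).
Event = Tuple[str, str, str]

# Assumed evidence kinds, taken from the three sources named in the rebuttal.
REQUIRED_KINDS = frozenset({"tool_call", "authorization", "oversight"})


def group_episodes(events: Iterable[Event]) -> Dict[str, List[Event]]:
    """Group events that carry the same episode identifier."""
    episodes: Dict[str, List[Event]] = defaultdict(list)
    for event in events:
        episodes[event[0]].append(event)
    return dict(episodes)


def missing_kinds(episode: List[Event]) -> Set[str]:
    """Return the required evidence kinds that never appear in the episode."""
    seen = {kind for _, kind, _ in episode}
    return set(REQUIRED_KINDS) - seen


events = [
    ("ep-1", "tool_call", "retrieve customer file"),
    ("ep-1", "authorization", "role=analyst"),
    ("ep-1", "oversight", "human reviewer approved"),
    ("ep-2", "tool_call", "send notification"),
]
episodes = group_episodes(events)
for episode_id, episode in episodes.items():
    print(episode_id, sorted(missing_kinds(episode)))
```

An episode with an empty result has all three kinds of evidence on record. That is a necessary condition under these assumptions, not a demonstration that a later institution could actually read the episode.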

Circularity Check

0 steps flagged

No significant circularity in conceptual reframing

full rationale

The paper is a position piece that introduces the khipu problem as a conceptual distinction between loss of interpretive continuity and ordinary lack of observability in distributed AI systems. It contains no equations, fitted parameters, predictions, or derivation chains. The central argument is presented as a direct interpretive shift from the premise of distributed cognitive episodes, without any self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations. The distinctions (missing/ambiguous/structurally unreadable evidence) and proposed governance workspaces are definitional proposals rather than results derived from prior inputs within the paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The paper introduces a new framing concept without numerical parameters or formally stated axioms. It rests on the domain assumption that real systems increasingly distribute cognition across many components, and on the claim that loss of interpretive continuity constitutes a distinct failure type.

axioms (1)

domain assumption: Real systems increasingly distribute cognition across models, tools, humans, context stores, retrieval layers, runtime policies, authorization boundaries, and delegated institutional roles.
Third sentence of the abstract; treated as given rather than derived.

invented entities (1)

khipu problem · no independent evidence
purpose: To name and organize the governance failure of lost interpretive continuity in distributed AI systems.
Newly coined term whose only support is the conceptual argument in the abstract; no independent empirical handle provided.

pith-pipeline@v0.9.1-grok · 5807 in / 1259 out tokens · 36284 ms · 2026-06-30T23:18:12.438284+00:00 · methodology

