pith. machine review for the scientific record.

arxiv: 2605.01710 · v1 · submitted 2026-05-03 · 💻 cs.AI · cs.CY

Recognition: unknown

Model Routing as a Trust Problem: Route Receipts for Adaptive AI Systems

Pith reviewed 2026-05-10 15:44 UTC · model grok-4.3

classification: 💻 cs.AI, cs.CY
keywords: AI routing, route receipts, transparency, adaptive AI systems, model cards, trust, runtime documentation, redaction

The pith

Adaptive AI systems should attach a route receipt to each response to document the runtime path taken without exposing proprietary logic.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that routing decisions in AI services affect cost, quality, and accountability yet remain invisible to users, eroding trust. It proposes that every response include a compact route receipt capturing enough material facts for users to reconstruct key decisions such as model version, tier, or safety handling. This receipt would function as a runtime counterpart to static model cards, which describe only the trained artifact. The author surveys current platforms and notes that fragments of routing information already exist but lack a standardized, per-answer portable format. If adopted, route receipts would let relying parties verify the conditions under which an answer was produced.

Core claim

The central claim is that model routing constitutes a trust problem best addressed by producing a route receipt for each request: a minimal, redacted record of the serving path that supplies enough facts for external reconstruction of routing choices while protecting internal proprietary details.

What carries the argument

The route receipt, a compact runtime record of the path that served a request, designed with a minimal schema and redaction rules to enable reconstruction without full disclosure.
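
As a minimal sketch of what such a record might look like, the Python below models an internal routing record and an allowlist-based redaction step. The field names (`model_version`, `tier`, `fallback_used`, `safety_handling`) loosely follow the material facts named in this review; `router_score` and `matched_rule_id` are hypothetical stand-ins for proprietary internals. None of this is taken from the paper's actual schema.

```python
from dataclasses import dataclass, asdict
import json


# Hypothetical internal record; every field name here is an illustrative assumption.
@dataclass(frozen=True)
class InternalRouteRecord:
    model_version: str
    tier: str
    fallback_used: bool
    safety_handling: str
    router_score: float    # assumed proprietary: internal routing signal
    matched_rule_id: str   # assumed proprietary: which internal rule fired


# Allowlist of fields treated as disclosable "material facts" in this sketch.
PUBLIC_FIELDS = ("model_version", "tier", "fallback_used", "safety_handling")


def to_route_receipt(record: InternalRouteRecord) -> dict:
    """Keep only allowlisted fields; anything not listed is dropped."""
    full = asdict(record)
    return {name: full[name] for name in PUBLIC_FIELDS}


record = InternalRouteRecord(
    model_version="example-model-2026-04",
    tier="standard",
    fallback_used=True,
    safety_handling="none",
    router_score=0.42,
    matched_rule_id="rule-17",
)
receipt = to_route_receipt(record)
print(json.dumps(receipt, sort_keys=True))
```

An allowlist is used here rather than a blocklist so that a newly added internal field stays private unless someone explicitly marks it as public. Whether that matches the paper's redaction model is not stated in the text above.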

If this is right

  • Route transparency becomes a required element of model documentation alongside existing model cards.
  • Users gain the ability to verify which version, tier, or fallback produced a given answer.
  • Platforms can share receipt fragments already generated internally in a unified, portable format.
  • Accountability improves because changes in cost or quality can be traced to specific routing steps.
  • Safety and compliance reviews can reference the exact runtime conditions of a response.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Standardized receipt formats could integrate with existing logging and audit systems to reduce duplication.
  • High-stakes domains such as medical or financial AI might adopt receipts first to meet regulatory expectations.
  • Over time, receipts could evolve to include optional fields for user-requested transparency levels.
  • The approach focuses on path documentation rather than model internals, complementing rather than replacing explainability techniques.

Load-bearing premise

A compact redacted receipt can be produced and shared at acceptable cost without either leaking proprietary routing logic or creating excessive overhead.

What would settle it

Demonstrating that any usable receipt either reveals enough routing details to compromise competitive advantage or adds latency and storage costs that production systems reject would falsify the proposal.

Original abstract

AI products often route requests through version aliases, service tiers, tool choices, regional endpoints, fallback rules, or safety handling before responding. These routing steps are documented product surfaces in several widely used AI platforms and serving stacks. Routing helps AI services stay affordable, fast, and available at scale, and it shapes trust. Trust can break when routing changes the cost, quality, or accountability of a response without the user being able to tell what happened. "Which model answered?" is only part of the audit question. The runtime path matters. Adaptive AI systems should produce a runtime transparency artifact called the route receipt. A route receipt is a compact record of the route that served a request. It should capture enough material facts for people relying on the output to reconstruct important routing decisions without exposing proprietary internals or hidden reasoning. Route transparency should be part of model documentation. Model cards describe trained model artifacts, while route receipts describe the runtime conditions under which a particular answer was produced. The paper introduces the route-receipt concept, a minimal schema and redaction model, and a documentation-based survey of selected platforms showing that receipt fragments already exist without a portable per-answer record.
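
One way to read "reconstruct important routing decisions" is that a relying party can compare the receipts attached to two answers and see which material facts changed. The sketch below assumes receipts are flat dictionaries with hypothetical field names and example values; the paper's actual schema may differ.

```python
from typing import Any


def diff_receipts(before: dict[str, Any], after: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return the fields whose values differ, mapped to (before, after)."""
    changed = {}
    for key in sorted(set(before) | set(after)):
        if before.get(key) != after.get(key):
            changed[key] = (before.get(key), after.get(key))
    return changed


# Hypothetical receipts for the same prompt sent on two different days.
monday = {
    "model_version": "example-model-a",
    "tier": "priority",
    "fallback_used": False,
    "safety_handling": "none",
}
tuesday = {
    "model_version": "example-model-b",
    "tier": "priority",
    "fallback_used": True,
    "safety_handling": "none",
}

changes = diff_receipts(monday, tuesday)
for field, (old, new) in changes.items():
    print(f"{field}: {old!r} -> {new!r}")
```

Here the comparison reports that the model version changed and a fallback was used, while tier and safety handling stayed the same. That is the kind of question a user cannot answer today from the response text alone.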

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that routing decisions in adaptive AI systems (version aliases, tiers, tool choices, fallbacks, safety handling) shape trust and accountability, and proposes 'route receipts' as compact runtime transparency artifacts. These receipts should capture enough material facts via a minimal schema and redaction model to let downstream parties reconstruct key routing decisions without exposing proprietary internals. The work positions receipts as complementary to model cards, introduces the schema and redaction approach, and surveys documentation from selected platforms to show that receipt-like fragments already exist in practice.

Significance. If the redaction model can be validated to balance reconstructibility with proprietary protection and acceptable overhead, the proposal could help standardize runtime accountability for adaptive AI services, filling a gap between static model documentation and dynamic serving behavior. The conceptual framing is internally consistent, and the survey of existing platform fragments provides a practical foundation that strengthens the case for a portable standard.

major comments (2)
  1. [Schema and Redaction Model] The section defining the minimal schema and redaction model provides no worked example on a real router nor any argument (formal or informal) demonstrating that the redaction rules preserve sufficient information to reconstruct material routing facts (model version, tier, fallback path, safety handling) while provably avoiding leakage of proprietary decision logic. This assumption is load-bearing for the central claim that receipts can be both useful and safe.
  2. [Survey of Platforms] The documentation-based survey of platforms shows that routing fragments appear in existing systems but does not address or evaluate whether these can be unified into a single portable per-answer receipt format without unacceptable overhead or requiring disclosure of proprietary internals.
minor comments (1)
  1. The distinction between the proposed route receipt and existing platform-specific logs or metadata could be clarified with a small comparison table to improve readability.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive and detailed comments, which identify key areas where the conceptual proposal can be made more concrete. We address each major comment below and indicate the revisions we will incorporate.

point-by-point responses
  1. Referee: [Schema and Redaction Model] The section defining the minimal schema and redaction model provides no worked example on a real router nor any argument (formal or informal) demonstrating that the redaction rules preserve sufficient information to reconstruct material routing facts (model version, tier, fallback path, safety handling) while provably avoiding leakage of proprietary decision logic. This assumption is load-bearing for the central claim that receipts can be both useful and safe.

    Authors: We agree that the manuscript would be strengthened by an explicit worked example and a clearer articulation of how the redaction model balances reconstructibility and protection. The schema is defined to record only observable material facts (model alias or version, tier, fallback indicator, safety handling flag) while the redaction rules exclude internal routing logic, decision trees, or proprietary heuristics. Although the paper relies on an informal design argument rather than a formal proof, we will add a worked example in the revision using a representative multi-tier router configuration. This example will show step-by-step how a receipt enables reconstruction of the key facts listed by the referee without exposing proprietary elements. We will also expand the surrounding text to make the informal preservation argument explicit. These changes directly address the load-bearing assumption.

    revision: yes

  2. Referee: [Survey of Platforms] The documentation-based survey of platforms shows that routing fragments appear in existing systems but does not address or evaluate whether these can be unified into a single portable per-answer receipt format without unacceptable overhead or requiring disclosure of proprietary internals.

    Authors: The survey is deliberately documentation-based to demonstrate that receipt-like fragments already appear in public platform documentation, thereby grounding the proposal in existing practice rather than pure invention. The unification into a portable per-answer format is the central proposal, with the redaction model intended to ensure no proprietary internals need be disclosed. The manuscript does not contain a quantitative overhead evaluation because it is a conceptual contribution focused on the schema and its rationale. In the revision we will add a short discussion of expected overhead, observing that the schema is intentionally minimal and emitted per request, which aligns with the low-cost logging already performed by serving systems. We maintain that the redaction approach precludes disclosure of proprietary logic by construction.

    revision: partial
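
The overhead question raised in the second exchange can at least be framed with a rough measurement. The sketch below assumes a hypothetical four-field receipt and measures only its compact JSON size and local serialization time. It says nothing about network, storage, or signing costs, and it is not an evaluation from the paper.

```python
import json
import time

# Hypothetical receipt; field names and values are illustrative assumptions.
sample_receipt = {
    "model_version": "example-model-2026-04",
    "tier": "standard",
    "fallback_used": True,
    "safety_handling": "none",
}


def serialized_size_bytes(obj: dict) -> int:
    """Size of the compact UTF-8 JSON encoding, in bytes."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def mean_serialize_seconds(obj: dict, iterations: int = 10_000) -> float:
    """Average wall-clock time of one json.dumps call over `iterations` runs."""
    start = time.perf_counter()
    for _ in range(iterations):
        json.dumps(obj, separators=(",", ":"))
    return (time.perf_counter() - start) / iterations


size = serialized_size_bytes(sample_receipt)
per_call = mean_serialize_seconds(sample_receipt)
print(f"{size} bytes, {per_call * 1e6:.2f} microseconds per serialization")
```

A measurement like this bounds only the cheapest part of the cost. The referee's concern about unacceptable overhead would also need numbers for retention, per-answer delivery, and any integrity protection added to the receipt.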

standing simulated objections not resolved
  • A formal (as opposed to informal) proof that the redaction rules provably avoid leakage of proprietary decision logic would require information-theoretic or cryptographic analysis beyond the scope of this conceptual paper.

Circularity Check

0 steps flagged

Conceptual proposal with no derivations, predictions, or self-referential steps

full rationale

The paper is a definitional proposal introducing the route-receipt concept, a minimal schema, a redaction model, and an observational survey of existing platform fragments. No equations, fitted parameters, predictions, or derivation chains appear in the provided text. All load-bearing content consists of new definitions and documentation-based observations rather than reductions to prior results, self-citations, or inputs by construction. The work is therefore self-contained as a conceptual contribution with no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The paper rests on the domain assumption that hidden routing decisions materially affect user trust and on the invention of the route-receipt artifact itself.

axioms (1)
  • [domain assumption] Routing decisions can change cost, quality, or accountability without the user being able to tell what happened
    Stated directly in the abstract as the core trust problem.
invented entities (1)
  • route receipt [no independent evidence]
    purpose: Compact record of the route that served a request for transparency
    Newly introduced artifact with a proposed minimal schema and redaction model.

pith-pipeline@v0.9.0 · 5499 in / 1217 out tokens · 53963 ms · 2026-05-10T15:44:03.475367+00:00

Reference graph

Works this paper leans on

47 extracted references · 4 canonical work pages · 2 internal anchors

  1. [1]

    Priority processing | OpenAI API

    OpenAI API Documentation. “Priority processing | OpenAI API”. Accessed April 29, 2026

  2. [2]

    Service tiers

    Anthropic API Documentation. “Service tiers”. Accessed April 29, 2026

  3. [3]

    Service tiers for optimizing performance and cost

    AWS Documentation. “Service tiers for optimizing performance and cost”. Accessed April 29, 2026

  4. [4]

    Models | Gemini API

    Google AI for Developers. “Models | Gemini API”. Accessed April 29, 2026

  5. [5]

    Foundry Models lifecycle and support policy

    Microsoft Learn. “Foundry Models lifecycle and support policy”. Accessed April 29, 2026

  6. [6]

    Understanding intelligent prompt routing in Amazon Bedrock

    AWS Documentation. “Understanding intelligent prompt routing in Amazon Bedrock”. Accessed April 29, 2026

  7. [7]

    Model router for Microsoft Foundry concepts

    Microsoft Learn. “Model router for Microsoft Foundry concepts”. Accessed April 29, 2026

  8. [8]

    Web search | OpenAI API

    OpenAI API Documentation. “Web search | OpenAI API”. Accessed April 29, 2026

  9. [9]

    Deployments and endpoints | Generative AI on Vertex AI

    Google Cloud Vertex AI Documentation. “Deployments and endpoints | Generative AI on Vertex AI”. Accessed April 29, 2026

  10. [10]

    Provider Routing

    OpenRouter Documentation. “Provider Routing”. Accessed April 29, 2026

  11. [11]

    Zero Data Retention

    OpenRouter Documentation. “Zero Data Retention”. Accessed April 29, 2026

  12. [12]

    Model Fallbacks

    OpenRouter Documentation. “Model Fallbacks”. Accessed April 29, 2026

  13. [13]

    Language Model Cascades

    David Dohan, Winnie Xu, Aitor Lewkowycz, Jacob Austin, David Bieber, Raphael Gontijo Lopes, Yuhuai Wu, Henryk Michalewski, Rif A. Saurous, Jascha Sohl-Dickstein, Kevin Murphy, and Charles Sutton. “Language Model Cascades”. arXiv:2207.10342, 2022

  14. [14]

    FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance

    Lingjiao Chen, Matei Zaharia, and James Zou. “FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance”. arXiv:2305.05176, 2023

  15. [15]

    RouteLLM: Learning to Route LLMs with Preference Data

    Isaac Ong, Amjad Almahairi, Vincent Wu, Wei-Lin Chiang, Tianhao Wu, Joseph E. Gonzalez, M. Waleed Kadous, and Ion Stoica. “RouteLLM: Learning to Route LLMs with Preference Data”. arXiv:2406.18665, 2024

  16. [16]

    LLMRouterBench: A Massive Benchmark and Unified Framework for LLM Routing

    Hao Li, Yiqun Zhang, Zhaoyan Guo, Chenxu Wang, Shengji Tang, Qiaosheng Zhang, Yang Chen, Biqing Qi, Peng Ye, Lei Bai, Zhen Wang, and Shuyue Hu. “LLMRouterBench: A Massive Benchmark and Unified Framework for LLM Routing”. arXiv:2601.07206, 2026

  17. [17]

    Reasoning models | OpenAI API

    OpenAI API Documentation. “Reasoning models | OpenAI API”. Accessed April 29, 2026

  18. [18]

    GenerationConfig | Generative AI on Vertex AI

    Google Cloud Vertex AI Documentation. “GenerationConfig | Generative AI on Vertex AI”. Accessed April 29, 2026

  19. [19]

    Model Cards for Model Reporting

    Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. “Model Cards for Model Reporting”. Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*), 2019

  20. [20]

    Artificial Intelligence Risk Management Framework (AI RMF 1.0)

    National Institute of Standards and Technology. “Artificial Intelligence Risk Management Framework (AI RMF 1.0)”. NIST AI 100-1, 2023

  21. [21]

    OECD AI Principles

    OECD. “OECD AI Principles”. OECD Recommendation on Artificial Intelligence, 2019, updated 2024

  22. [22]

    Appropriate Reliance on AI Advice: Conceptualization and the Effect of Explanations

    Max Schemmer, Niklas Kühl, Carina Benz, Andrea Bartos, and Gerhard Satzger. “Appropriate Reliance on AI Advice: Conceptualization and the Effect of Explanations”. Proceedings of the 28th International Conference on Intelligent User Interfaces (IUI), 2023

  23. [23]

    PROV-DM: The PROV Data Model

    W3C. “PROV-DM: The PROV Data Model”. W3C Recommendation, 2013

  24. [24]

    PROV-O: The PROV Ontology

    W3C. “PROV-O: The PROV Ontology”. W3C Recommendation, 2013

  25. [25]

    MLflow Tracking

    MLflow Documentation. “MLflow Tracking”. Accessed April 29, 2026

  26. [26]

    SLSA Provenance

    SLSA. “SLSA Provenance”. Version 1.2. Accessed April 29, 2026

  27. [27]

    Datasheets for Datasets

    Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daume III, and Kate Crawford. “Datasheets for Datasets”. Communications of the ACM, 64(12), 2021

  28. [28]

    Data Statements for Natural Language Processing

    Emily M. Bender and Batya Friedman. “Data Statements for Natural Language Processing”. Transactions of the Association for Computational Linguistics, 6, 2018

  29. [29]

    FactSheets: Increasing trust in AI services through supplier’s declarations of conformity

    Matthew Arnold, Rachel K. E. Bellamy, Michael Hind, Stephanie Houde, Sameep Mehta, Aleksandra Mojsilović, Ravi Nair, Karthikeyan Natesan Ramamurthy, Darrell Reimer, Alexandra Olteanu, David Piorkowski, Jason Tsay, and Kush R. Varshney. “FactSheets: Increasing trust in AI services through supplier’s declarations of conformity”. IBM Journal of Research a...

  30. [30]

    Hidden Technical Debt in Machine Learning Systems

    D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. “Hidden Technical Debt in Machine Learning Systems”. Advances in Neural Information Processing Systems 28 (NIPS), 2015

  31. [31]

    The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction

    Eric Breck, Shanqing Cai, Eric Nielsen, Michael Salib, and D. Sculley. “The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction”. IEEE International Conference on Big Data, 2017

  32. [32]

    TFX: A TensorFlow-Based Production-Scale Machine Learning Platform

    Denis Baylor, Eric Breck, Heng-Tze Cheng, Noah Fiedel, Chuan Yu Foo, Zakaria Haque, Salem Haykal, Mustafa Ispir, Vihan Jain, Levent Koc, Chiu Yuen Koo, Lukasz Lew, Clemens Mewald, Akshay Naresh Modi, Neoklis Polyzotis, Sukriti Ramesh, Sudip Roy, Steven Euijong Whang, Martin Wicke, Jarek Wilkiewicz, Xin Zhang, and Martin Zinkevich. “TFX: A TensorFlow-Bas...

  33. [33]

    Data Validation for Machine Learning

    Eric Breck, Martin Zinkevich, Neoklis Polyzotis, Steven Whang, and Sudip Roy. “Data Validation for Machine Learning”. Proceedings of MLSys, 2019

  34. [34]

    Fiddler Observability

    Fiddler Documentation. “Fiddler Observability”

  35. [35]

    WhyLabs Observe

    WhyLabs Documentation. “WhyLabs Observe”

  36. [36]

    LLM Tracing and Observability with Arize Phoenix

    Arize. “LLM Tracing and Observability with Arize Phoenix”

  37. [37]

    OpenAI-Compatible Server

    vLLM Documentation. “OpenAI-Compatible Server”. Accessed April 29, 2026

  38. [38]

    Production Metrics

    SGLang Documentation. “Production Metrics”. Accessed April 29, 2026

  39. [39]

    SGLang: Efficient Execution of Structured Language Model Programs

    Zheng et al. “SGLang: Efficient Execution of Structured Language Model Programs”

  40. [40]

    Regulation (EU) 2024/1689

    European Union. “Regulation (EU) 2024/1689”

  41. [41]

    ISO/IEC 42001:2023

    ISO. “ISO/IEC 42001:2023”

  42. [42]

    Sycophancy in GPT-4o: what happened and what we’re doing about it

    OpenAI. “Sycophancy in GPT-4o: what happened and what we’re doing about it”

  43. [43]

    Data controls in the OpenAI platform

    OpenAI API Documentation. “Data controls in the OpenAI platform”. Accessed April 29, 2026

  44. [44]

    Web search tool

    Anthropic API Documentation. “Web search tool”. Accessed April 29, 2026

  45. [45]

    GPT-5.3-Codex Model | OpenAI API

    OpenAI API Documentation. “GPT-5.3-Codex Model | OpenAI API”. Accessed April 29, 2026

  46. [46]

    Route Receipt Specification

    Route Receipt Specification. “Route Receipt Specification”. Maintained by Vincent Schmalbach. Accessed April 30, 2026

  47. [47]

    Semantic conventions for generative AI systems

    OpenTelemetry. “Semantic conventions for generative AI systems”. Accessed May 1, 2026.

Appendix A: Minimal route receipt JSON Schema

This schema is intentionally small. It defines a portable receipt object that can be embedded in API responses, logs, audit exports, or benchmark records. Providers can add extension fields under provider_extensions, but ...