Model Routing as a Trust Problem: Route Receipts for Adaptive AI Systems
Pith reviewed 2026-05-10 15:44 UTC · model grok-4.3
The pith
Adaptive AI systems should attach a route receipt to each response to document the runtime path taken without exposing proprietary logic.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that model routing constitutes a trust problem best addressed by producing a route receipt for each request: a minimal, redacted record of the serving path that supplies enough facts for external reconstruction of routing choices while protecting internal proprietary details.
What carries the argument
The route receipt, a compact runtime record of the path that served a request, designed with a minimal schema and redaction rules to enable reconstruction without full disclosure.
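As an illustration of what such a record might look like, the sketch below models a receipt as a small immutable object. The field names (`request_id`, `model_version`, `service_tier`, `fallback_used`, `safety_handling`, `region`) and the example values are assumptions made for this page, not the paper's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class RouteReceipt:
    """Hypothetical minimal receipt; all field names are illustrative assumptions."""

    request_id: str
    model_version: str      # resolved version that served the request, not only the alias
    service_tier: str       # e.g. "standard" or "priority" (example labels)
    fallback_used: bool     # True if a fallback path produced the answer
    safety_handling: str    # coarse label only, no internal policy details
    region: Optional[str] = None

    def to_json(self) -> str:
        # Sorted keys give a stable serialization that is easy to diff or hash.
        return json.dumps(asdict(self), sort_keys=True)


receipt = RouteReceipt(
    request_id="req-001",
    model_version="example-model-2026-01-15",
    service_tier="standard",
    fallback_used=True,
    safety_handling="none",
)
print(receipt.to_json())
```

The record holds only outcomes of routing (what served the request), never the rules that produced those outcomes, which is the separation the receipt concept depends on.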
If this is right
- Route transparency becomes a required element of model documentation alongside existing model cards.
- Users gain the ability to verify which version, tier, or fallback produced a given answer.
- Platforms can share receipt fragments already generated internally in a unified, portable format.
- Accountability improves because changes in cost or quality can be traced to specific routing steps.
- Safety and compliance reviews can reference the exact runtime conditions of a response.
Where Pith is reading between the lines
- Standardized receipt formats could integrate with existing logging and audit systems to reduce duplication.
- High-stakes domains such as medical or financial AI might adopt receipts first to meet regulatory expectations.
- Over time, receipts could evolve to include optional fields for user-requested transparency levels.
- The approach focuses on path documentation rather than model internals, complementing rather than replacing explainability techniques.
Load-bearing premise
A compact redacted receipt can be produced and shared at acceptable cost without either leaking proprietary routing logic or creating excessive overhead.
What would settle it
The proposal would be falsified by a demonstration that every usable receipt either reveals enough routing detail to compromise competitive advantage or adds latency and storage costs that production systems reject.
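The storage half of this test is easy to approximate. The sketch below compares the serialized size of a receipt with the size of the response it accompanies. The function name `receipt_overhead` and the sample receipt are hypothetical, and the sketch says nothing about latency, which would have to be measured on a real serving stack.

```python
import json


def receipt_overhead(response_text: str, receipt: dict) -> float:
    """Return receipt size divided by response size, both in UTF-8 bytes."""
    # Compact separators approximate the smallest plain-JSON encoding.
    receipt_bytes = len(json.dumps(receipt, separators=(",", ":")).encode("utf-8"))
    response_bytes = len(response_text.encode("utf-8"))
    if response_bytes == 0:
        raise ValueError("response_text must not be empty")
    return receipt_bytes / response_bytes


# Hypothetical receipt with illustrative field names and values.
sample_receipt = {
    "model_version": "example-model-v2",
    "service_tier": "standard",
    "fallback_used": False,
    "safety_handling": "none",
}
ratio = receipt_overhead("word " * 400, sample_receipt)
print(f"receipt adds {ratio:.1%} to the payload")
```

A ratio like this only bounds per-response storage. Whether it is acceptable depends on retention periods and request volume, which the paper does not quantify.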
Original abstract
AI products often route requests through version aliases, service tiers, tool choices, regional endpoints, fallback rules, or safety handling before responding. These routing steps are documented product surfaces in several widely used AI platforms and serving stacks. Routing helps AI services stay affordable, fast, and available at scale, and it shapes trust. Trust can break when routing changes the cost, quality, or accountability of a response without the user being able to tell what happened. "Which model answered?" is only part of the audit question. The runtime path matters. Adaptive AI systems should produce a runtime transparency artifact called the route receipt. A route receipt is a compact record of the route that served a request. It should capture enough material facts for people relying on the output to reconstruct important routing decisions without exposing proprietary internals or hidden reasoning. Route transparency should be part of model documentation. Model cards describe trained model artifacts, while route receipts describe the runtime conditions under which a particular answer was produced. The paper introduces the route-receipt concept, a minimal schema and redaction model, and a documentation-based survey of selected platforms showing that receipt fragments already exist without a portable per-answer record.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that routing decisions in adaptive AI systems (version aliases, tiers, tool choices, fallbacks, safety handling) shape trust and accountability, and proposes 'route receipts' as compact runtime transparency artifacts. These receipts should capture enough material facts via a minimal schema and redaction model to let downstream parties reconstruct key routing decisions without exposing proprietary internals. The work positions receipts as complementary to model cards, introduces the schema and redaction approach, and surveys documentation from selected platforms to show that receipt-like fragments already exist in practice.
Significance. If the redaction model can be validated to balance reconstructibility with proprietary protection and acceptable overhead, the proposal could help standardize runtime accountability for adaptive AI services, filling a gap between static model documentation and dynamic serving behavior. The conceptual framing is internally consistent, and the survey of existing platform fragments provides a practical foundation that strengthens the case for a portable standard.
major comments (2)
- [Schema and Redaction Model] The section defining the minimal schema and redaction model provides no worked example on a real router nor any argument (formal or informal) demonstrating that the redaction rules preserve sufficient information to reconstruct material routing facts (model version, tier, fallback path, safety handling) while provably avoiding leakage of proprietary decision logic. This assumption is load-bearing for the central claim that receipts can be both useful and safe.
- [Survey of Platforms] The documentation-based survey of platforms shows that routing fragments appear in existing systems but does not address or evaluate whether these can be unified into a single portable per-answer receipt format without unacceptable overhead or requiring disclosure of proprietary internals.
minor comments (1)
- The distinction between the proposed route receipt and existing platform-specific logs or metadata could be clarified with a small comparison table to improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which identify key areas where the conceptual proposal can be made more concrete. We address each major comment below and indicate the revisions we will incorporate.
Point-by-point responses
- Referee: [Schema and Redaction Model] The section defining the minimal schema and redaction model provides no worked example on a real router nor any argument (formal or informal) demonstrating that the redaction rules preserve sufficient information to reconstruct material routing facts (model version, tier, fallback path, safety handling) while provably avoiding leakage of proprietary decision logic. This assumption is load-bearing for the central claim that receipts can be both useful and safe.
  Authors: We agree that the manuscript would be strengthened by an explicit worked example and a clearer articulation of how the redaction model balances reconstructibility and protection. The schema is defined to record only observable material facts (model alias or version, tier, fallback indicator, safety handling flag) while the redaction rules exclude internal routing logic, decision trees, or proprietary heuristics. Although the paper relies on an informal design argument rather than a formal proof, we will add a worked example in the revision using a representative multi-tier router configuration. This example will show step-by-step how a receipt enables reconstruction of the key facts listed by the referee without exposing proprietary elements. We will also expand the surrounding text to make the informal preservation argument explicit. These changes directly address the load-bearing assumption.
  Revision: yes
- Referee: [Survey of Platforms] The documentation-based survey of platforms shows that routing fragments appear in existing systems but does not address or evaluate whether these can be unified into a single portable per-answer receipt format without unacceptable overhead or requiring disclosure of proprietary internals.
  Authors: The survey is deliberately documentation-based to demonstrate that receipt-like fragments already appear in public platform documentation, thereby grounding the proposal in existing practice rather than pure invention. The unification into a portable per-answer format is the central proposal, with the redaction model intended to ensure no proprietary internals need be disclosed. The manuscript does not contain a quantitative overhead evaluation because it is a conceptual contribution focused on the schema and its rationale. In the revision we will add a short discussion of expected overhead, observing that the schema is intentionally minimal and emitted per request, which aligns with the low-cost logging already performed by serving systems. We maintain that the redaction approach precludes disclosure of proprietary logic by construction.
  Revision: partial
- A formal (as opposed to informal) proof that the redaction rules provably avoid leakage of proprietary decision logic would require information-theoretic or cryptographic analysis beyond the scope of this conceptual paper.
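The allowlist-style redaction the authors describe (record observable facts, exclude internal logic) can be sketched as follows. The internal trace, its keys such as `router_score` and `decision_rule`, and the `redact` helper are all hypothetical and stand in for whatever a real router logs. The sketch shows redaction by construction only. It does not show that the retained fields cannot be used to infer routing logic, which is the referee's open concern.

```python
# Hypothetical allowlist of material facts; names are illustrative assumptions.
ALLOWED_FIELDS = ("model_version", "service_tier", "fallback_used", "safety_handling")


def redact(trace: dict) -> dict:
    """Copy only allowlisted fields from an internal trace into a receipt."""
    missing = [field for field in ALLOWED_FIELDS if field not in trace]
    if missing:
        raise KeyError(f"trace lacks required fields: {missing}")
    # Anything not on the allowlist is dropped, including fields added later.
    return {field: trace[field] for field in ALLOWED_FIELDS}


# Made-up internal trace from an imagined multi-tier router.
internal_trace = {
    "model_version": "example-model-v2",
    "service_tier": "priority",
    "fallback_used": True,
    "safety_handling": "flagged",
    "router_score": 0.83,                            # proprietary signal
    "decision_rule": "if load > 0.9 then fallback",  # proprietary logic
}

receipt = redact(internal_trace)
print(receipt)
```

An allowlist fails closed: a new internal field stays private unless someone deliberately adds it to `ALLOWED_FIELDS`, whereas a denylist would leak it by default.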
Circularity Check
Conceptual proposal with no derivations, predictions, or self-referential steps
Full rationale
The paper is a definitional proposal introducing the route-receipt concept, a minimal schema, a redaction model, and an observational survey of existing platform fragments. No equations, fitted parameters, predictions, or derivation chains appear in the provided text. All load-bearing content consists of new definitions and documentation-based observations rather than reductions to prior results, self-citations, or inputs by construction. The work is therefore self-contained as a conceptual contribution with no circularity.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: routing decisions can change cost, quality, or accountability without the user being able to tell what happened.
invented entities (1)
- route receipt: no independent evidence
Appendix A: Minimal route receipt JSON Schema
This schema is intentionally small. It defines a portable receipt object that can be embedded in API responses, logs, audit exports, or benchmark records. Providers can add extension fields under provider_extensions, but ...