HarnessAPI: A Skill-First Framework for Unified Streaming APIs and MCP Tools
Pith reviewed 2026-05-22 05:17 UTC · model grok-4.3
The pith
A single typed skill definition automatically produces a streaming HTTP endpoint, OpenAPI UI, and MCP tool registration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HarnessAPI provides a skill-first framework where one handler.py plus Pydantic schemas suffice to generate a streaming HTTP endpoint with Server-Sent Events, an interactive OpenAPI/Swagger UI, and a zero-configuration MCP tool, all served from a single process, while reducing framework-facing boilerplate by 74 percent compared to manual dual-stack implementations.
What carries the argument
The skill folder containing handler.py and Pydantic schemas, supported by a dynamic code-generation mechanism that propagates type annotations correctly to the tool registration layer.
Load-bearing premise
The dynamic code-generation mechanism successfully propagates Pydantic type annotations to the tool registration layer without errors or loss of information.
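To make this premise concrete, the following is a minimal, stdlib-only sketch of how a generic closure can hide annotations from an inspection layer, and how generating a function with an explicit signature avoids that. It is not HarnessAPI's actual implementation. The names `register_naive`, `register_generated`, and `SearchInput` are hypothetical, a dataclass stands in for a Pydantic model, and `inspect.signature` stands in for FastMCP's inspection layer, whose specific limitation is not described in this review.

```python
import inspect
from dataclasses import dataclass
from typing import Optional, get_type_hints


@dataclass
class SearchInput:  # hypothetical stand-in for a Pydantic input schema
    query: str
    limit: Optional[int] = None


def handler(payload: SearchInput) -> str:
    """Hypothetical skill handler."""
    return f"{payload.query}:{payload.limit}"


def register_naive(fn):
    # Generic closure: the wrapper's own signature is (*args, **kwargs),
    # so an inspector looking at the wrapper sees no typed parameters.
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper


def register_generated(fn, name):
    # Generate source with an explicit, annotated signature and exec it.
    # Simplification: positional parameters only, no default values.
    hints = get_type_hints(fn)
    return_type = hints.pop("return", None)
    namespace = {"_fn": fn, "_R": return_type}
    params = []
    for i, (pname, ptype) in enumerate(hints.items()):
        namespace[f"_T{i}"] = ptype  # expose each type object by name
        params.append(f"{pname}: _T{i}")
    src = (
        f"def {name}({', '.join(params)}) -> _R:\n"
        f"    return _fn({', '.join(hints)})\n"
    )
    exec(src, namespace)
    return namespace[name]


naive_tool = register_naive(handler)
generated_tool = register_generated(handler, "search_tool")
print(inspect.signature(naive_tool))
print(inspect.signature(generated_tool).parameters["payload"].annotation)
```

The generated function carries the real type objects in its signature, which is the property the premise depends on. Whether the same holds for nested models, validators, and streaming return types is exactly what the referee asks the authors to demonstrate.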
What would settle it
Create a skill with a complex nested Pydantic model and check whether the generated MCP tool schema matches the expected structure from the HTTP endpoint.
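One way to run this check is a structural diff between the two JSON Schema documents. The sketch below assumes both schemas have already been obtained as plain dictionaries (how to extract them from the HTTP and MCP sides is not shown and depends on APIs not described here). The function `schema_diff` and both example schemas are hypothetical placeholders, not output from HarnessAPI.

```python
import copy
from typing import Any


def schema_diff(a: Any, b: Any, path: str = "$") -> list[str]:
    """Return readable paths where two JSON Schema fragments differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        diffs = []
        for key in sorted(set(a) | set(b)):
            if key not in a:
                diffs.append(f"{path}.{key}: missing on left")
            elif key not in b:
                diffs.append(f"{path}.{key}: missing on right")
            else:
                diffs.extend(schema_diff(a[key], b[key], f"{path}.{key}"))
        return diffs
    if isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        diffs = []
        for i, (x, y) in enumerate(zip(a, b)):
            diffs.extend(schema_diff(x, y, f"{path}[{i}]"))
        return diffs
    return [] if a == b else [f"{path}: {a!r} != {b!r}"]


# Placeholder schema for a nested model with an optional integer field.
http_schema = {
    "type": "object",
    "properties": {
        "user": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"anyOf": [{"type": "integer"}, {"type": "null"}]},
            },
            "required": ["name"],
        }
    },
    "required": ["user"],
}

# Placeholder for a tool schema in which the optional field lost its null branch.
mcp_schema = copy.deepcopy(http_schema)
mcp_schema["properties"]["user"]["properties"]["age"] = {"type": "integer"}

for line in schema_diff(http_schema, mcp_schema):
    print(line)
```

An empty diff on a suite of nested, optional, and validated models would support the premise. Any non-empty diff points at the exact path where type information was lost.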
Original abstract
Every Python function deployed as an LLM tool must today exist in two forms: an HTTP endpoint for human-facing clients and CI pipelines, and an MCP tool registration for agent runtimes such as Claude and Cursor. These representations share business logic yet diverge in all the surrounding machinery (routing, validation, serialisation, streaming, and schema maintenance), and they drift apart as the underlying code evolves. We present HarnessAPI, a Python framework that eliminates this duplication by treating a typed skill folder as the single source of truth. From one handler.py plus Pydantic schemas, the framework automatically derives a streaming HTTP endpoint with Server-Sent Events, an interactive OpenAPI/Swagger UI, and a zero-configuration MCP tool, all served from a single process. Dual-mode content negotiation lets the same handler serve SSE-streaming and JSON-returning clients with no handler changes. A dynamic code-generation mechanism ensures Pydantic type annotations propagate correctly to FastMCP's inspection layer, resolving a technical limitation that prevents naive closure-based registration. Measured across six representative skills using cloc, HarnessAPI reduces framework-facing boilerplate by 74% compared with a manually maintained dual-stack implementation (FastAPI server + FastMCP server). HarnessAPI subclasses FastAPI, inheriting its full middleware, dependency-injection, and deployment ecosystem. It is available at https://github.com/edwinjosechittilappilly/harnessapi and on PyPI (pip install harnessapi).
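The dual-mode content negotiation described in the abstract can be illustrated with a small stdlib-only sketch. This is an assumption about the general technique, not HarnessAPI's code: `handler`, `wants_sse`, and `respond` are hypothetical names, the Accept-header check is deliberately simplified, and the SSE body is assembled in memory here where a real server would stream it chunk by chunk.

```python
import asyncio
import json
from typing import AsyncIterator


async def handler(text: str) -> AsyncIterator[str]:
    """Hypothetical streaming handler: yields one chunk per word."""
    for word in text.split():
        yield word


def wants_sse(accept_header: str) -> bool:
    # Simplified check: ignores q-values and wildcards in the Accept header.
    return "text/event-stream" in accept_header.lower()


async def respond(accept_header: str, text: str) -> tuple[str, str]:
    """Return (content_type, body) from the same handler in either mode."""
    if wants_sse(accept_header):
        # SSE framing: each event is a "data:" line followed by a blank line.
        events = [f"data: {json.dumps(chunk)}\n\n" async for chunk in handler(text)]
        return "text/event-stream", "".join(events)
    # JSON mode: drain the generator and return everything at once.
    chunks = [chunk async for chunk in handler(text)]
    return "application/json", json.dumps({"chunks": chunks})


sse_type, sse_body = asyncio.run(respond("text/event-stream", "hello world"))
json_type, json_body = asyncio.run(respond("application/json", "hello world"))
print(sse_type, repr(sse_body))
print(json_type, json_body)
```

The handler is written once as an async generator. Only the response adapter differs between the two modes, which is the property the abstract claims.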
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents HarnessAPI, a Python framework that treats a typed skill folder (one handler.py plus Pydantic schemas) as the single source of truth. From this definition the framework automatically derives a streaming HTTP endpoint using Server-Sent Events, an interactive OpenAPI/Swagger UI, and a zero-configuration MCP tool registration, all served from a single process. Dual-mode content negotiation allows the same handler to serve both SSE-streaming and JSON clients without modification. A dynamic code-generation step is introduced to propagate Pydantic type annotations into FastMCP’s inspection layer. Across six representative skills, cloc measurements show a 74 % reduction in framework-facing boilerplate relative to a manually maintained dual-stack (FastAPI + FastMCP) implementation. HarnessAPI subclasses FastAPI and is released on GitHub and PyPI.
Significance. If the dynamic code-generation step correctly preserves complex Pydantic constructs and if the 74 % reduction generalizes beyond the six evaluated skills, the framework would meaningfully reduce duplication and drift for developers who must expose the same business logic to both human-facing clients and agent runtimes such as Claude or Cursor. The inheritance from FastAPI and the provision of reproducible code on GitHub constitute concrete engineering strengths that facilitate adoption and further experimentation.
major comments (2)
- [Abstract] The 74 % boilerplate reduction is quantified via cloc on six skills, yet no description is given of how the manual dual-stack baseline was constructed, which specific skills were chosen, or whether the comparison controlled for equivalent functionality (routing, validation, streaming, and schema maintenance). This detail is load-bearing for the central empirical claim.
- [Dynamic code-generation mechanism] (Abstract and implementation description.) The paper states that this step resolves the technical limitation preventing naive closure-based registration and ensures Pydantic annotations propagate to FastMCP. Without concrete verification or examples covering nested models, Optional fields, custom validators, or streaming return types, the single-source-of-truth guarantee remains unproven and could silently fail for realistic skills.
minor comments (1)
- A table listing per-skill line counts for both the HarnessAPI and dual-stack versions would make the 74 % aggregate figure more transparent and allow readers to assess variability across skills.
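The arithmetic behind such a table can be sketched as follows. The line counts are placeholders chosen for illustration and are NOT figures from the paper, and `reduction` is a hypothetical helper. The sketch also shows why a per-skill table matters: the reduction computed from pooled totals generally differs from the mean of per-skill reductions, and the review does not say which of the two the 74 % figure is.

```python
def reduction(baseline_loc: int, framework_loc: int) -> float:
    """Fractional reduction in lines of code relative to the baseline."""
    if baseline_loc <= 0:
        raise ValueError("baseline_loc must be positive")
    return 1.0 - framework_loc / baseline_loc


# Placeholder counts, not paper data: skill -> (dual-stack LOC, HarnessAPI LOC).
counts = {
    "skill_a": (200, 50),
    "skill_b": (50, 25),
}

per_skill = {name: reduction(b, f) for name, (b, f) in counts.items()}

# Aggregate: pool all lines first, then compute a single ratio.
total_baseline = sum(b for b, _ in counts.values())
total_framework = sum(f for _, f in counts.values())
aggregate = reduction(total_baseline, total_framework)

# Alternative: average the per-skill ratios, weighting each skill equally.
mean_of_ratios = sum(per_skill.values()) / len(per_skill)

for name, value in per_skill.items():
    print(f"{name}: {value:.1%}")
print(f"aggregate: {aggregate:.1%}, mean of per-skill: {mean_of_ratios:.1%}")
```

With these placeholder counts the pooled figure and the per-skill mean diverge noticeably, so reporting both, along with the raw counts, would let readers judge variability across skills.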
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below and have revised the paper to incorporate additional details and examples as suggested.
Point-by-point responses
- Referee: [Abstract] The 74 % boilerplate reduction is quantified via cloc on six skills, yet no description is given of how the manual dual-stack baseline was constructed, which specific skills were chosen, or whether the comparison controlled for equivalent functionality (routing, validation, streaming, and schema maintenance). This detail is load-bearing for the central empirical claim.
  Authors: We agree that more explicit details on the evaluation methodology strengthen the central empirical claim. The revised manuscript expands the Evaluation section to describe the six representative skills (covering data transformation, external API integration, real-time analytics, file processing, authentication flows, and streaming summarization), the construction of the manual dual-stack baseline (separate, fully functional FastAPI and FastMCP codebases written to match HarnessAPI capabilities exactly), and confirmation that both implementations were controlled for equivalent functionality in routing, Pydantic validation, SSE streaming, OpenAPI schema exposure, and schema maintenance. The cloc measurements were performed on the framework-facing code only, excluding business logic.
  Revision: yes
- Referee: [Dynamic code-generation mechanism] (Abstract and implementation description.) The paper states that this step resolves the technical limitation preventing naive closure-based registration and ensures Pydantic annotations propagate to FastMCP. Without concrete verification or examples covering nested models, Optional fields, custom validators, or streaming return types, the single-source-of-truth guarantee remains unproven and could silently fail for realistic skills.
  Authors: We acknowledge that the original description would benefit from concrete verification. The revised Implementation section now includes a new subsection with explicit code examples and test cases for nested Pydantic models, Optional fields, custom validators (including root validators), and streaming return types. These cases demonstrate that the dynamic code-generation step correctly extracts and forwards all annotations to FastMCP’s inspection layer, with no loss of type information or silent failures, thereby supporting the single-source-of-truth guarantee for realistic skills.
  Revision: yes
Circularity Check
No circularity: implementation description with direct measurement
The paper describes a software framework (HarnessAPI) that unifies HTTP endpoints, OpenAPI UI, and MCP tool registration from a single handler.py plus Pydantic schemas. It reports a 74% boilerplate reduction measured via cloc on six skills and notes inheritance from FastAPI. No equations, fitted parameters, predictions, or self-referential derivations appear. The dynamic code-generation step is presented as an engineering solution to a FastMCP limitation rather than a result that reduces to its own inputs by construction. The work is self-contained against external benchmarks (GitHub, PyPI, cloc counts) with no load-bearing self-citations or ansatzes.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Pydantic type annotations can be reliably extracted and forwarded to FastMCP via code generation.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "From one handler.py plus Pydantic schemas, the framework automatically derives a streaming HTTP endpoint with Server-Sent Events, an interactive OpenAPI/Swagger UI, and a zero-configuration MCP tool"
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, LogicNat.induction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "A dynamic code-generation mechanism ensures Pydantic type annotations propagate correctly to FastMCP's inspection layer"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.