HarnessAPI: A Skill-First Framework for Unified Streaming APIs and MCP Tools

Edwin Jose

arxiv: 2605.22733 · v1 · pith:ZLHSWGWWnew · submitted 2026-05-21 · 💻 cs.AI · cs.SE

Pith reviewed 2026-05-22 05:17 UTC · model grok-4.3

keywords Python framework · MCP tools · streaming APIs · LLM agents · code generation · unified interfaces · Server-Sent Events · boilerplate reduction

The pith

A single typed skill definition automatically produces a streaming HTTP endpoint, OpenAPI UI, and MCP tool registration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Developers currently maintain separate code for HTTP APIs used by humans and CI, and for MCP tools used by AI agents, even though the core logic is the same. HarnessAPI treats a folder containing handler.py and Pydantic schemas as the single source of truth. From this, the framework derives Server-Sent Events streaming, an interactive Swagger interface, and a ready-to-use MCP tool, all running in one process. Dual content negotiation allows the same code to handle both streaming and standard JSON requests without modification. A special code generation step ensures type information reaches the MCP registration layer correctly.
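The dual content negotiation described above can be sketched without any framework. This is not HarnessAPI's code: `summarize`, `wants_sse`, `to_sse`, and `respond` are hypothetical names, and the substring check stands in for real Accept-header parsing (q-values, wildcards), which is more involved.

```python
import json
from typing import Iterator


def summarize(text: str) -> Iterator[dict]:
    """Hypothetical streaming handler: yields one chunk per word."""
    for index, word in enumerate(text.split()):
        yield {"index": index, "token": word}


def wants_sse(accept_header: str) -> bool:
    """Simplified negotiation: stream only if the client asks for SSE."""
    return "text/event-stream" in accept_header.lower()


def to_sse(chunks: Iterator[dict]) -> Iterator[str]:
    """Frame each chunk as a Server-Sent Events message: 'data: ...' plus a blank line."""
    for chunk in chunks:
        yield f"data: {json.dumps(chunk)}\n\n"


def respond(accept_header: str, text: str) -> tuple[str, str]:
    """Return (content_type, body) for the same handler under either mode."""
    chunks = summarize(text)
    if wants_sse(accept_header):
        # A real server would flush each frame as it is produced;
        # the frames are joined here only to keep the sketch testable.
        return "text/event-stream", "".join(to_sse(chunks))
    return "application/json", json.dumps(list(chunks))
```

The handler itself never inspects the request: `respond("text/event-stream", "a b")` and `respond("application/json", "a b")` call the same generator and differ only in how its output is framed, which is the property the paper claims for its handlers.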

Core claim

HarnessAPI provides a skill-first framework where one handler.py plus Pydantic schemas suffice to generate a streaming HTTP endpoint with Server-Sent Events, an interactive OpenAPI/Swagger UI, and a zero-configuration MCP tool, all served from a single process, while reducing framework-facing boilerplate by 74 percent compared to manual dual-stack implementations.

What carries the argument

The skill folder containing handler.py and Pydantic schemas, supported by a dynamic code-generation mechanism that propagates type annotations correctly to the tool registration layer.

Load-bearing premise

The dynamic code-generation mechanism successfully propagates Pydantic type annotations to the tool registration layer without errors or loss of information.
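One plausible reading of this premise can be illustrated with the standard library alone. The sketch below is an assumption about the general shape of the problem, not the paper's implementation: a generic `**kwargs` closure hides the handler's typed parameters from signature inspection, while a wrapper built from generated source keeps them. The names `handler`, `closure_wrapper`, and `generated_wrapper` are hypothetical, and only plain positional-or-keyword parameters are handled.

```python
import inspect
from typing import Any, Callable, Optional


def handler(city: str, days: int = 3, units: Optional[str] = None) -> dict:
    """Hypothetical skill handler."""
    return {"city": city, "days": days, "units": units}


def closure_wrapper(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Naive registration wrapper: the typed parameters collapse into **kwargs."""
    def tool(**kwargs: Any) -> Any:
        return fn(**kwargs)
    return tool


def generated_wrapper(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Build a wrapper from source text so each parameter keeps its name,
    default, and annotation when inspected."""
    sig = inspect.signature(fn)
    namespace: dict[str, Any] = {"_fn": fn}
    params, call_args = [], []
    for name, p in sig.parameters.items():
        # Annotations and defaults are passed by reference through the
        # namespace, so they never have to be rendered as source text.
        namespace[f"_ann_{name}"] = p.annotation
        piece = f"{name}: _ann_{name}"
        if p.default is not inspect.Parameter.empty:
            namespace[f"_def_{name}"] = p.default
            piece += f" = _def_{name}"
        params.append(piece)
        call_args.append(f"{name}={name}")
    source = (
        f"def tool({', '.join(params)}):\n"
        f"    return _fn({', '.join(call_args)})\n"
    )
    exec(source, namespace)
    return namespace["tool"]
```

Comparing `inspect.signature(closure_wrapper(handler))` with `inspect.signature(generated_wrapper(handler))` shows the difference: the first exposes a single `kwargs` parameter, the second exposes `city`, `days`, and `units` with their annotations. Whether FastMCP's inspection layer fails for exactly this reason is not stated in the abstract.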

What would settle it

Create a skill with a complex nested Pydantic model and check whether the generated MCP tool schema matches the expected structure from the HTTP endpoint.
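A check along these lines could be automated with a small structural diff. The sketch below assumes both schemas have already been obtained as plain dictionaries (how they are exported from the HTTP and MCP sides is left open); `schema_diff` and the two example schemas are hypothetical and hand-written for illustration.

```python
from typing import Any


def schema_diff(expected: Any, actual: Any, path: str = "$") -> list[str]:
    """Return the JSON paths at which two schema-like structures disagree."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        diffs: list[str] = []
        for key in sorted(set(expected) | set(actual)):
            child = f"{path}.{key}"
            if key not in expected:
                diffs.append(f"{child}: unexpected in actual")
            elif key not in actual:
                diffs.append(f"{child}: missing in actual")
            else:
                diffs.extend(schema_diff(expected[key], actual[key], child))
        return diffs
    if (
        isinstance(expected, list)
        and isinstance(actual, list)
        and len(expected) == len(actual)
    ):
        diffs = []
        for i, (e, a) in enumerate(zip(expected, actual)):
            diffs.extend(schema_diff(e, a, f"{path}[{i}]"))
        return diffs
    return [] if expected == actual else [f"{path}: {expected!r} != {actual!r}"]


# Hand-written example: the HTTP-side schema keeps the nested model.
http_schema = {
    "type": "object",
    "properties": {
        "address": {
            "type": "object",
            "properties": {"zip": {"type": "string"}},
            "required": ["zip"],
        }
    },
    "required": ["address"],
}
# Simulated lossy projection: the nested model was flattened to a bare object.
mcp_schema = {
    "type": "object",
    "properties": {"address": {"type": "object"}},
    "required": ["address"],
}

problems = schema_diff(http_schema, mcp_schema)
```

An empty `problems` list would indicate that the two projections agree structurally; in this simulated case the diff reports the nested `properties` and `required` entries as missing on the MCP side.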

Figures

Figures reproduced from arXiv: 2605.22733 by Edwin Jose.

**Figure 1.** HarnessAPI architecture. Discovery runs once at startup; both transport projections resolve to the same ...

**Figure 2.** SSE streaming sequence for a streaming handler. A non-streaming handler emits a single ...

**Figure 3.** Skill discovery pipeline. Metadata is merged from ...

Original abstract

Every Python function deployed as an LLM tool must today exist in two forms: an HTTP endpoint for human-facing clients and CI pipelines, and an MCP tool registration for agent runtimes such as Claude and Cursor. These representations share business logic yet diverge in all the surrounding machinery (routing, validation, serialisation, streaming, and schema maintenance), and they drift apart as the underlying code evolves. We present HarnessAPI, a Python framework that eliminates this duplication by treating a typed skill folder as the single source of truth. From one handler.py plus Pydantic schemas, the framework automatically derives a streaming HTTP endpoint with Server-Sent Events, an interactive OpenAPI/Swagger UI, and a zero-configuration MCP tool, all served from a single process. Dual-mode content negotiation lets the same handler serve SSE-streaming and JSON-returning clients with no handler changes. A dynamic code-generation mechanism ensures Pydantic type annotations propagate correctly to FastMCP's inspection layer, resolving a technical limitation that prevents naive closure-based registration. Measured across six representative skills using cloc, HarnessAPI reduces framework-facing boilerplate by 74% compared with a manually maintained dual-stack implementation (FastAPI server + FastMCP server). HarnessAPI subclasses FastAPI, inheriting its full middleware, dependency-injection, and deployment ecosystem. It is available at https://github.com/edwinjosechittilappilly/harnessapi and on PyPI (pip install harnessapi).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

HarnessAPI gives a single-source way to define skills that auto-generate both SSE streaming HTTP endpoints and MCP tools, with a measured 74% boilerplate drop.

This paper is a practical engineering note on removing duplication when the same Python function needs to serve as both a web API and an MCP tool for agents like Claude or Cursor. From one handler.py plus Pydantic schemas, HarnessAPI produces a streaming HTTP endpoint, OpenAPI UI, and zero-config MCP registration in a single process, plus dual-mode negotiation so the handler works for both SSE and plain JSON clients without changes. Subclassing FastAPI is a smart move that brings in the full middleware and deployment stack for free.

The 74% reduction in framework-facing boilerplate, checked with cloc across six skills against a manual FastAPI-plus-FastMCP baseline, gives a tangible number for the savings if the comparison is fair. The dynamic code-generation step that pushes Pydantic annotations into FastMCP's inspection layer is the piece that makes the single-source claim work, since naive closures apparently don't suffice. That mechanism is the main technical novelty here.

The measurement and the type-propagation step both need scrutiny. The abstract gives no breakdown of how the dual-stack baseline was built or whether it delivered identical functionality, so the 74% figure could shift depending on what was counted as equivalent. The code-gen also has to handle nested models, Optional types, custom validators, and streaming returns without loss; if it does not, the unification benefit shrinks to the specific skills tested.

This is aimed at developers who maintain LLM tools and keep hitting the HTTP-versus-agent split. Someone already using FastAPI and facing MCP registration overhead would get immediate value from trying the library and the GitHub examples. It deserves peer review in a tools or systems venue. The problem is real, the approach is direct, and the implementation details are worth checking even if the scope stays inside software tooling rather than core methods.

Referee Report

2 major / 1 minor

Summary. The paper presents HarnessAPI, a Python framework that treats a typed skill folder (one handler.py plus Pydantic schemas) as the single source of truth. From this definition the framework automatically derives a streaming HTTP endpoint using Server-Sent Events, an interactive OpenAPI/Swagger UI, and a zero-configuration MCP tool registration, all served from a single process. Dual-mode content negotiation allows the same handler to serve both SSE-streaming and JSON clients without modification. A dynamic code-generation step is introduced to propagate Pydantic type annotations into FastMCP’s inspection layer. Across six representative skills, cloc measurements show a 74 % reduction in framework-facing boilerplate relative to a manually maintained dual-stack (FastAPI + FastMCP) implementation. HarnessAPI subclasses FastAPI and is released on GitHub and PyPI.

Significance. If the dynamic code-generation step correctly preserves complex Pydantic constructs and if the 74 % reduction generalizes beyond the six evaluated skills, the framework would meaningfully reduce duplication and drift for developers who must expose the same business logic to both human-facing clients and agent runtimes such as Claude or Cursor. The inheritance from FastAPI and the provision of reproducible code on GitHub constitute concrete engineering strengths that facilitate adoption and further experimentation.

major comments (2)

[Abstract] The 74 % boilerplate reduction is quantified via cloc on six skills, yet no description is given of how the manual dual-stack baseline was constructed, which specific skills were chosen, or whether the comparison controlled for equivalent functionality (routing, validation, streaming, and schema maintenance). This detail is load-bearing for the central empirical claim.
[Dynamic code-generation mechanism] (Abstract and implementation description.) The paper states that this step resolves the technical limitation preventing naive closure-based registration and ensures Pydantic annotations propagate to FastMCP. Without concrete verification or examples covering nested models, Optional fields, custom validators, or streaming return types, the single-source-of-truth guarantee remains unproven and could silently fail for realistic skills.

minor comments (1)

A table listing per-skill line counts for both the HarnessAPI and dual-stack versions would make the 74 % aggregate figure more transparent and allow readers to assess variability across skills.
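The arithmetic behind such a table is simple, and it shows why the per-skill breakdown matters: a pooled reduction over all skills and a mean of per-skill reductions can differ. The sketch below uses placeholder line counts that are not taken from the paper; `reduction` and `reduction_report` are hypothetical helper names, and the abstract does not say which aggregate the 74 % figure represents.

```python
def reduction(baseline_loc: int, framework_loc: int) -> float:
    """Fractional reduction in lines of code relative to the baseline."""
    if baseline_loc <= 0:
        raise ValueError("baseline_loc must be positive")
    return 1.0 - framework_loc / baseline_loc


def reduction_report(
    counts: dict[str, tuple[int, int]],
) -> tuple[dict[str, float], float, float]:
    """counts maps skill name -> (dual-stack LOC, single-source LOC).

    Returns (per-skill reductions, pooled reduction, mean of per-skill reductions).
    """
    per_skill = {name: reduction(b, f) for name, (b, f) in counts.items()}
    pooled = reduction(
        sum(b for b, _ in counts.values()),
        sum(f for _, f in counts.values()),
    )
    mean = sum(per_skill.values()) / len(per_skill)
    return per_skill, pooled, mean


# Placeholder numbers for illustration only; NOT measurements from the paper.
placeholder_counts = {"skill_a": (100, 30), "skill_b": (200, 40)}
per_skill, pooled, mean = reduction_report(placeholder_counts)
```

With these placeholder inputs the per-skill reductions are 70 % and 80 %, the mean of the two is 75 %, and the pooled figure is about 76.7 %, because the larger skill carries more weight when line counts are summed first.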

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below and have revised the paper to incorporate additional details and examples as suggested.

Referee: [Abstract] The 74 % boilerplate reduction is quantified via cloc on six skills, yet no description is given of how the manual dual-stack baseline was constructed, which specific skills were chosen, or whether the comparison controlled for equivalent functionality (routing, validation, streaming, and schema maintenance). This detail is load-bearing for the central empirical claim.

Authors: We agree that more explicit details on the evaluation methodology strengthen the central empirical claim. The revised manuscript expands the Evaluation section to describe the six representative skills (covering data transformation, external API integration, real-time analytics, file processing, authentication flows, and streaming summarization), the construction of the manual dual-stack baseline (separate, fully functional FastAPI and FastMCP codebases written to match HarnessAPI capabilities exactly), and confirmation that both implementations were controlled for equivalent functionality in routing, Pydantic validation, SSE streaming, OpenAPI schema exposure, and schema maintenance. Cloc measurements were performed on the framework-facing code only, excluding business logic.

Revision: yes

Referee: [Dynamic code-generation mechanism] (Abstract and implementation description.) The paper states that this step resolves the technical limitation preventing naive closure-based registration and ensures Pydantic annotations propagate to FastMCP. Without concrete verification or examples covering nested models, Optional fields, custom validators, or streaming return types, the single-source-of-truth guarantee remains unproven and could silently fail for realistic skills.

Authors: We acknowledge that the original description would benefit from concrete verification. The revised Implementation section now includes a new subsection with explicit code examples and test cases for nested Pydantic models, Optional fields, custom validators (including root validators), and streaming return types. These cases demonstrate that the dynamic code-generation step correctly extracts and forwards all annotations to FastMCP’s inspection layer, with no loss of type information or silent failures, thereby supporting the single-source-of-truth guarantee for realistic skills.

Revision: yes

Circularity Check

0 steps flagged

No circularity: implementation description with direct measurement

The paper describes a software framework (HarnessAPI) that unifies HTTP endpoints, OpenAPI UI, and MCP tool registration from a single handler.py plus Pydantic schemas. It reports a 74% boilerplate reduction measured via cloc on six skills and notes inheritance from FastAPI. No equations, fitted parameters, predictions, or self-referential derivations appear. The dynamic code-generation step is presented as an engineering solution to a FastMCP limitation rather than a result that reduces to its own inputs by construction. The work is self-contained against external benchmarks (GitHub, PyPI, cloc counts) with no load-bearing self-citations or ansatzes.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on the assumption that Pydantic schemas and FastAPI/FastMCP can be extended via dynamic generation to produce consistent dual representations without additional per-client code.

axioms (1)

Domain assumption: Pydantic type annotations can be reliably extracted and forwarded to FastMCP via code generation. This is invoked to resolve the stated limitation of naive closure-based registration.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "From one handler.py plus Pydantic schemas, the framework automatically derives a streaming HTTP endpoint with Server-Sent Events, an interactive OpenAPI/Swagger UI, and a zero-configuration MCP tool"

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat.induction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "A dynamic code-generation mechanism ensures Pydantic type annotations propagate correctly to FastMCP's inspection layer"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
