Agentic Explainability at Scale: Between Corporate Fears and XAI Needs

Cecily Jones; Yomna Elsayed

arxiv: 2604.14984 · v1 · submitted 2026-04-16 · 💻 cs.HC · cs.AI

Pith reviewed 2026-05-10 10:40 UTC · model grok-4.3

keywords: agentic AI, explainability, AI governance, agent sprawl, XAI, AI agents, corporate AI adoption, AI cards

The pith

Governance experts recommend design-time and runtime explainability techniques plus an Agentic AI Card prototype to address corporate fears of agent autonomy and sprawl.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates the specific concerns that AI governance professionals hold about scaling agentic AI in enterprise environments. These include risks from agent autonomy and the mismatch between rapid low-code adoption and lagging governance processes, which leads to agent sprawl. It presents explainability methods at both design and runtime stages that experts themselves proposed as ways to reduce those risks. The authors close with a preliminary Agentic AI Card prototype intended to give companies greater confidence when deploying agents at scale. A sympathetic reader would care because many organizations are already moving ahead with agentic systems while their oversight capabilities remain underdeveloped.

Core claim

The paper explores AI governance professionals' concerns in enterprise settings, while offering design-time and runtime explainability techniques as suggested by AI governance experts for addressing those fears. Finally, we provide a preliminary prototype of an Agentic AI Card that can help companies feel at ease deploying agents at scale.

What carries the argument

The Agentic AI Card prototype, which supplies insights into agent configurations, settings, and decision-making during agent-to-agent communication and orchestration.
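To make the idea concrete, a card of this kind could be modeled as a small record that pairs design-time configuration with a runtime decision log. The sketch below is illustrative only: the class names, every field (`owner`, `tools`, `autonomy_level`, and so on), and the example values are assumptions made for this example, not the schema of the paper's prototype.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class DecisionRecord:
    """One runtime entry: what an agent did and why (hypothetical schema)."""
    sender: str
    receiver: str
    action: str
    rationale: str


@dataclass
class AgentCard:
    """Illustrative card; all field names are assumptions, not the paper's."""
    name: str
    owner: str
    purpose: str
    tools: list[str] = field(default_factory=list)         # design-time configuration
    autonomy_level: str = "human-in-the-loop"               # design-time setting
    decisions: list[DecisionRecord] = field(default_factory=list)  # runtime log

    def log_decision(self, receiver: str, action: str, rationale: str) -> None:
        """Record one agent-to-agent decision together with its stated reason."""
        self.decisions.append(DecisionRecord(self.name, receiver, action, rationale))

    def to_json(self) -> str:
        """Serialize the whole card, including nested decision records."""
        return json.dumps(asdict(self), indent=2)


# Placeholder values for illustration only.
card = AgentCard(
    name="security-monitor",
    owner="secops-team",
    purpose="Triage security alerts",
    tools=["log_search"],
)
card.log_decision("ticketing-agent", "open_ticket", "Repeated failed logins from one host")
print(card.to_json())
```

Keeping configuration and the decision log in one serializable record is only one possible design; it is chosen here because it lets a reviewer see both what the agent is allowed to do and what it actually did in a single document.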

If this is right

Companies gain a practical way to align governance processes with low-code agent adoption.
Observability into agent orchestration and communication becomes available beyond basic discovery tools.
Standardized documentation of agent behavior can reduce perceived risks of autonomy.
Deployment decisions can incorporate expert-identified explainability practices from the outset.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The prototype could be tested for integration with existing shadow AI discovery tools to create end-to-end visibility.
Wider use of similar cards might accelerate the creation of industry standards for agent documentation.
The approach could extend to non-agent AI systems that also face scaling and governance gaps.
Adoption rates of the card in real enterprises would provide a direct test of whether fears actually decrease.

Load-bearing premise

The explainability techniques suggested by governance experts will effectively mitigate corporate fears around agentic autonomy and sprawl, and the preliminary Agentic AI Card prototype will meaningfully support safe scaling without further validation.

What would settle it

A controlled trial or survey in which companies given the Agentic AI Card and explainability techniques report no measurable drop in concerns about agent deployment compared with a control group would falsify the central claim.
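One way such a comparison might be analyzed is a permutation test on self-reported concern scores. The following is a minimal sketch under stated assumptions: the function name `permutation_p_value`, the idea of a numeric concern score, and the sample numbers are all hypothetical placeholders, not data or methods from the paper.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_perm=10_000, seed=0):
    """One-sided permutation test.

    Estimates how often a random relabeling of participants produces a
    control-minus-treated gap in mean concern at least as large as the
    observed gap. A small value suggests the card group really reports
    lower concern; a large value is consistent with no measurable drop.
    """
    rng = random.Random(seed)
    observed = mean(control) - mean(treated)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        gap = mean(pooled[n_treated:]) - mean(pooled[:n_treated])
        if gap >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Synthetic placeholder scores (1 = low concern, 5 = high concern).
with_card = [2, 3, 2, 4, 3, 2]
without_card = [4, 3, 5, 4, 4, 3]
print(permutation_p_value(with_card, without_card))
```

A permutation test is used here because it makes no distributional assumptions about small survey samples; a real study would also need to justify its concern measure and sample size.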

Figures

Figures reproduced from arXiv: 2604.14984 by Cecily Jones, Yomna Elsayed.

**Figure 1.** Agent Card for Security Monitoring Agent listing participants’ explainability requirements for agentic trust.

Original abstract

As companies enter the race for agentic AI adoption, fears surface around agentic autonomy and its subsequent risks. These fears compound as companies scale their agentic AI adoption with low-code applications, without a comparable scaling in their governance processes and expertise resulting in a phenomenon known as "Agent Sprawl". While shadow AI tools can help with agentic discovery and identification, few observability tools offer insights into the agents' configuration and settings or the decision-making process during agent-to-agent communication and orchestration. This paper explores AI governance professionals' concerns in enterprise settings, while offering design-time and runtime explainability techniques as suggested by AI governance experts for addressing those fears. Finally, we provide a preliminary prototype of an Agentic AI Card that can help companies feel at ease deploying agents at scale.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper flags practical scaling issues with agentic AI, but its solutions lack any supporting evidence or testing.

The paper's main point is that as firms push agentic AI, they worry about autonomous agents and about the agent sprawl that follows when adoption scales without matching governance. It offers design-time and runtime explainability ideas suggested by experts, plus a prototype Agentic AI Card, to address those worries.

What it does reasonably well is highlight the mismatch between rapid low-code adoption and slow governance growth. The discussion of observability gaps in agent-to-agent interactions raises a concrete issue that many in the field will recognize. Tying XAI techniques to these corporate fears gives the work a targeted feel.

On the downside, the work stays descriptive. It gives no account of how the expert suggestions were gathered, no prototype evaluation, and no data on whether any of this reduces actual fears or risks. The central promise that these steps will let companies deploy at scale comfortably lacks support.

Readers in AI governance, enterprise risk management, or applied XAI would find this relevant as an early look at the problem space. It is not for those seeking validated methods or quantitative results.

The paper deserves peer review. The concerns are timely, the framing is straightforward, and referees could push for the missing validation or clearer links to existing literature. It is not ready as is, but the direction is worth developing.

Referee Report

2 major / 3 minor

Summary. This paper examines concerns among AI governance professionals regarding agentic AI autonomy and 'agent sprawl' in enterprise settings, where low-code scaling outpaces governance. It presents design-time and runtime explainability techniques suggested by these experts to address the fears and describes a preliminary prototype of an Agentic AI Card intended to support safer deployment of agents at scale.

Significance. The work addresses a timely gap between rapid agentic AI adoption and corporate governance practices, drawing on expert input to propose XAI-informed solutions. If the suggested techniques and prototype were shown to reduce perceived risks, the paper could usefully inform HCI and enterprise AI ethics research. Its current value is primarily in surfacing real-world concerns rather than demonstrating effective mitigations.

major comments (2)

The central claim in the abstract that the Agentic AI Card prototype 'can help companies feel at ease deploying agents at scale' is unsupported. No evaluation, user study, deployment metrics, or comparative analysis is described to show that the prototype or the suggested explainability techniques actually mitigate fears of autonomy and agent sprawl versus existing practices.
The manuscript states that the techniques are 'as suggested by AI governance experts' but provides no details on how these suggestions were obtained (e.g., number of participants, elicitation method, or synthesis process). This absence weakens the grounding of the proposed solutions in the reported expert input.

minor comments (3)

The term 'Agent Sprawl' is introduced without a formal definition or clear differentiation from related concepts such as shadow AI or general AI sprawl.
Additional references to prior work on explainability for multi-agent systems and enterprise governance frameworks would help situate the contribution.
Concrete examples illustrating how the design-time versus runtime techniques operate within the Agentic AI Card prototype would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight opportunities to strengthen the transparency and precision of our manuscript. We address each major point below and will revise the paper to better reflect the preliminary and exploratory nature of the work.

Referee: The central claim in the abstract that the Agentic AI Card prototype 'can help companies feel at ease deploying agents at scale' is unsupported. No evaluation, user study, deployment metrics, or comparative analysis is described to show that the prototype or the suggested explainability techniques actually mitigate fears of autonomy and agent sprawl versus existing practices.

Authors: We agree that the abstract phrasing overstates the current contribution. The prototype is preliminary and has not been evaluated for effectiveness in reducing perceived risks. In the revised version, we will change the abstract to state that the Agentic AI Card is 'a preliminary prototype intended to support safer deployment of agents at scale' and will add explicit language in the discussion section noting the lack of empirical validation, along with a clear statement of limitations and directions for future evaluation studies.

Revision: yes

Referee: The manuscript states that the techniques are 'as suggested by AI governance experts' but provides no details on how these suggestions were obtained (e.g., number of participants, elicitation method, or synthesis process). This absence weakens the grounding of the proposed solutions in the reported expert input.

Authors: We acknowledge that the current manuscript lacks sufficient methodological detail on the expert input. We will add a dedicated subsection describing the consultation process, including how experts were engaged, the elicitation approach used, and the method for synthesizing their suggestions into the design-time and runtime explainability techniques.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; paper is purely descriptive and proposal-based

The manuscript contains no mathematical derivations, equations, fitted parameters, or predictive claims that reduce to inputs by construction. It surveys governance concerns, relays expert-suggested explainability techniques, and presents a preliminary prototype without asserting that any result follows tautologically from prior definitions or self-citations. All content is qualitative and forward-looking; the central statements are framed as explorations and suggestions rather than derivations. No load-bearing self-citation chains, uniqueness theorems, or ansatz smuggling appear. This is the expected outcome for an exploratory HCI paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The work rests on domain assumptions about enterprise AI scaling outpacing governance and the value of explainability for risk mitigation, with no formal parameters or invented entities beyond descriptive terms.

axioms (1)

Domain assumption: Enterprise adoption of agentic AI via low-code tools is occurring without comparable growth in governance processes and expertise.
Directly stated in the abstract as the source of Agent Sprawl and compounded fears.

invented entities (1)

Agent Sprawl (no independent evidence)
purpose: To describe the uncontrolled proliferation of agentic AI tools in enterprises.
Introduced in the abstract as a resulting phenomenon from mismatched scaling of adoption and governance.
