Whistleblowing and the machine -- towards a considered position

Leon van der Torre; Liuwen Yu; Marija Slavkovik; Reka Markovich

arxiv: 2606.21201 · v1 · pith:337FOOLAnew · submitted 2026-06-19 · 💻 cs.AI

Marija Slavkovik, Liuwen Yu, Leon van der Torre, Reka Markovich

Pith reviewed 2026-06-26 14:34 UTC · model grok-4.3

classification 💻 cs.AI

keywords: whistleblowing, artificial intelligence, machine ethics, autonomous systems, regulation, multi-agent environments

The pith

Machine whistleblowing must follow the same normative principles as human whistleblowing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that artificial agents embedded in environments generate and retain secrets but should not protect every secret they hold. Instead, machine whistleblowing must be normative and principled, drawing directly from the established societal view of whistleblowing as a rule-breaking mechanism that serves the public interest. The authors claim this grounding is necessary to give machine actions legitimacy. They further state that government regulators must determine the permitted scope of machine reports and establish legal protections for those who build whistleblowing systems. This position matters because autonomous systems already influence real-world decisions at scale.

Core claim

Machine whistleblowing must be normative and principled and rooted in the existing understanding of whistleblowing as an important rule-breaking mechanism in society, and government regulators must formulate an informed stance on both what machines should be allowed to whistleblow on and how to legally protect those who develop whistleblowing machines.

What carries the argument

The mapping of whistleblowing as a rule-breaking mechanism from human society onto artificial agents.

Load-bearing premise

Established societal understandings of whistleblowing as a rule-breaking mechanism can be directly transferred to non-human agents without substantial modification for differences in agency, scale of impact, or accountability structures.

What would settle it

A documented case in which machine whistleblowing produces consistent harms or accountability failures that cannot be resolved by applying human whistleblowing principles.

Original abstract

Artificial intelligent agents and autonomous systems are embedded in our environments. They are both a commercial product and a personal tool that generates a lot of data and can draw conclusions from it: machines generate and keep secrets. But should machines protect all secrets? It has been shown that artificial agents are able to whistleblow and it has been argued that digital multi-agent environments should allow for agents in them to whistleblow. We argue that machine whistleblowing must be normative and principled and rooted in the existing understanding of whistleblowing as an important rule-breaking mechanism in society. We also argue that there is a need for government regulators to formulate an informed stance on both what machines should be allowed to whistleblow on and how to legally protect those who develop whistleblowing machines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Short position paper arguing machine whistleblowing should follow established societal norms and that regulators need to set scope and protections.

The core takeaway is that this is a normative position paper urging that AI whistleblowing be treated as a principled extension of existing human whistleblowing practices rather than something ad hoc, and that regulators should decide what counts as allowable and how to shield the developers.

It does a clean job of framing the issue from the abstract: machines create and hold secrets, agents have already been shown capable of whistleblowing, and therefore the practice needs grounding in the rule-breaking role whistleblowing already plays in society. The call for government regulators to take an informed stance on scope and legal protections is direct and timely.

The main limitation is that the argument rests on transferring human concepts without much exploration of how machine agency, scale, or accountability structures might require adjustments. The paper presents this as a prompt for regulators rather than a finished account, so the gap is not fatal but it does leave the normative transfer looking thinner than it could be. No data, derivations, or new mechanisms are offered, which matches the genre.

This is for people working on AI ethics, governance, or policy who want a concise starting point for discussion. It is not aimed at readers seeking technical methods or empirical results. The thinking is coherent on its own terms and engages the literature enough to merit referee time.

I would send it to peer review. Feedback could usefully push on how the normative grounding should handle differences between human and machine cases.

Referee Report

0 major / 3 minor

Summary. The paper argues that artificial intelligent agents capable of generating and keeping secrets should engage in whistleblowing only when it is normative and principled, drawing directly from the established societal role of whistleblowing as a rule-breaking mechanism. It further contends that government regulators must develop informed positions on the permissible scope of machine whistleblowing and on legal protections for the developers of such systems.

Significance. If the normative argument holds, the paper contributes to AI ethics by framing machine whistleblowing as an extension of human societal practices rather than an entirely novel phenomenon, thereby providing a conceptual anchor for policy discussions on AI transparency and accountability. Its call for regulatory engagement is a constructive step, though the absence of empirical analysis or formal modeling limits its immediate applicability to technical AI development.

minor comments (3)

[Abstract] The abstract and argument would benefit from explicit discussion of how differences in machine agency (e.g., lack of moral culpability or different scales of impact) might require modifications to traditional whistleblowing frameworks, even if the paper positions this as a starting point for consideration.
The manuscript lacks section headings or a clear structure, making it difficult to follow the logical progression from the descriptive premise about machines generating secrets to the normative recommendations.
Additional references to existing literature on whistleblowing ethics (e.g., works on organizational rule-breaking) and AI multi-agent systems would strengthen the grounding of the central claim.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their summary of the manuscript and for the recommendation of minor revision. No specific major comments were listed in the report, so we have nothing to address point by point.

Circularity Check

0 steps flagged

No significant circularity

The paper is a normative position piece advocating that machine whistleblowing be grounded in existing societal concepts of rule-breaking and calling for regulatory stances on scope and protections. It contains no equations, derivations, empirical models, predictions, or fitted parameters. No load-bearing steps reduce by construction to self-definitions, self-citations, or renamed inputs. The central argument is presented as a recommendation for consideration rather than a claim whose validity depends on internal equivalence or unverified self-referential premises. This is self-contained against external benchmarks as a philosophical argument.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is a normative argument relying on domain assumptions about ethics and regulation rather than technical parameters or derivations.

axioms (1)

domain assumption: Whistleblowing functions as an important rule-breaking mechanism in society.
Central premise used to ground the argument for machine whistleblowing.

