Reproducibility is the New Copyleft: Defining AGI-oriented Reproducible Builds
Pith reviewed 2026-06-28 08:29 UTC · model grok-4.3
The pith
Reproducible builds must replace copyleft to secure freedoms in AGI systems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a functional analogue of copyleft for AGI must be grounded in reproducible builds, which guarantee bit-exact reconstructability from declared inputs, rather than in share-alike clauses over code. Two reasons are given: the artifacts required to reconstruct a model each face independent legal, technical, and economic constraints, and sufficiently capable systems can rewrite licensed source into functionally equivalent derivatives stripped of their obligations.
What carries the argument
Reproducible builds, the practice of guaranteeing bit-exact reconstructability from declared inputs, carry the argument by supplying the enforcement mechanism where copyleft's source-to-object premise collapses.
If this is right
- Current open-source frameworks leave independent constraints on reconstruction artifacts unresolved.
- Sufficiently capable AI systems can rewrite licensed source into derivatives that evade original obligations.
- Seven requirements must be met to achieve AGI-oriented reproducible builds.
- AI-to-AI coupling mechanisms form a dynamic linking layer for which copyleft-style licensing is ill-suited.
- Protocol-based governance offers a more suitable template than platform-based approaches.
Where Pith is reading between the lines
- Adoption of reproducible builds could allow independent verification of AI behavior without requiring full public release of every training component.
- The same reproducibility focus might apply to other complex, non-deterministic computational artifacts beyond current AI models.
- Standardized input declarations could become a practical requirement for any public AI release aiming to preserve long-term reconstructability.
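As a rough illustration of what a standardized input declaration might look like, the sketch below records one entry per artifact class named in the abstract (code, data, weights, hyperparameters, toolchain, hardware configuration) and derives a single digest from a canonical serialization. The `InputDeclaration` class, its field names, and all placeholder values are assumptions made for this example; the paper does not prescribe a manifest format.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of a byte string as lowercase hex."""
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class InputDeclaration:
    """Hypothetical manifest: one digest or pinned value per artifact class."""
    code_sha256: str
    data_sha256: str
    weights_sha256: str
    hyperparameters: dict
    toolchain: dict
    hardware: dict

    def canonical_bytes(self) -> bytes:
        # Sorted keys and fixed separators make the serialization independent
        # of dict insertion order, so equal declarations yield equal bytes.
        return json.dumps(
            asdict(self), sort_keys=True, separators=(",", ":")
        ).encode("utf-8")

    def digest(self) -> str:
        return sha256_hex(self.canonical_bytes())


# Placeholder contents; a real declaration would hash the actual artifacts.
common = dict(
    code_sha256=sha256_hex(b"placeholder training code"),
    data_sha256=sha256_hex(b"placeholder dataset"),
    weights_sha256=sha256_hex(b"placeholder weights"),
    toolchain={"python": "3.12", "compiler": "example-1.0"},
    hardware={"accelerator": "example-gpu", "count": 8},
)

decl_a = InputDeclaration(hyperparameters={"seed": 0, "learning_rate": 0.001}, **common)
# Same content, different key order: the digest should not change.
decl_b = InputDeclaration(hyperparameters={"learning_rate": 0.001, "seed": 0}, **common)
# One declared input differs: the digest should change.
decl_c = InputDeclaration(hyperparameters={"seed": 1, "learning_rate": 0.001}, **common)

print(decl_a.digest() == decl_b.digest())
print(decl_a.digest() == decl_c.digest())
```

Hashing a canonical form rather than the raw file means two parties can compare declarations by exchanging one short string, which is one plausible way to make "declared inputs" checkable without releasing every component.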
Load-bearing premise
The premise that source code and the resulting system stand in a well-defined, humanly auditable, and reproducible relationship that open frameworks can satisfy no longer holds for advanced AI systems.
What would settle it
A controlled test that either succeeds or fails at reconstructing a published AI model to identical bit-level outputs and behavior using only its publicly declared code, data, weights, hyperparameters, toolchain, and hardware configuration.
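A toy version of such a test can be sketched as follows. The `reconstruct` function is a hypothetical stand-in for a full rebuild: it only sums a few floating-point values in the declared order and serializes the result. The example relies on the fact that floating-point addition is not associative, so a change in reduction order alone can alter the output bits and therefore the digest.

```python
import hashlib
import struct


def reconstruct(values: list[float]) -> bytes:
    """Hypothetical stand-in for a rebuild: reduce the values in the given
    order and serialize the result as a little-endian IEEE 754 double."""
    total = 0.0
    for v in values:
        total += v
    return struct.pack("<d", total)


def digest(artifact: bytes) -> str:
    """SHA-256 of the artifact bytes, as lowercase hex."""
    return hashlib.sha256(artifact).hexdigest()


def matches_published(published_digest: str, rebuilt: bytes) -> bool:
    """Bit-exact check: the rebuilt artifact must hash to the published digest."""
    return digest(rebuilt) == published_digest


declared = [0.1, 0.2, 0.3]
published = digest(reconstruct(declared))

# Rebuilding from the same declared inputs in the same order reproduces the bits.
same_order = matches_published(published, reconstruct(declared))
# Reversing the reduction order gives 0.6 instead of 0.6000000000000001.
reordered = matches_published(published, reconstruct(list(reversed(declared))))

print(same_order, reordered)  # prints "True False"
```

The check passes or fails on bytes alone, with no tolerance threshold, which is what distinguishes a bit-exact criterion from the approximate agreement usually accepted in machine-learning replication.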
Original abstract
Copyleft, as implemented in licenses such as the GNU General Public License, was a legal hack that used copyright to guarantee user freedom by tying the availability of source code to every act of distribution. Its normative force rested on an implicit technical premise: that source code and object code stand in a well-defined, humanly auditable, and reproducible relationship. Large language models and, prospectively, Artificial General Intelligence (AGI) systems systematically violate this premise. The artifacts jointly required to reconstruct a model -- code, data, weights, hyperparameters, toolchain, and hardware configuration -- are each subject to independent legal, technical, and economic constraints that no current open-source framework fully resolves. Sufficiently capable AI systems can also rewrite licensed source into functionally equivalent derivatives stripped of their original obligations, a form of laundering against which copyleft has no effective defense. This paper argues that a functional analogue of copyleft for AGI must be grounded not in share-alike clauses over code, but in reproducible builds: a practice guaranteeing bit-exact reconstructability from declared inputs. We review the logic of copyleft, critically examine Maffulli's Second Liberation thesis according to which AI fulfills Stallman's dream, and show that the argument collapses unless AGI systems are themselves reproducible. Drawing on the Open Source AI Definition (OSAID), the Model Openness Framework (MOF), OpenMDW, and deterministic-inference research, we define seven requirements for AGI-oriented reproducible builds. We further argue that the Model Context Protocol (MCP) and analogous AI-to-AI coupling mechanisms constitute a new dynamic linking layer for which copyleft-style licensing is ill-suited, and that Masnick's "protocols, not platforms" framework offers a more promising governance template.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that copyleft's normative force depends on a source/object code relationship that AGI systems violate through complex, independently constrained artifacts (code, data, weights, hyperparameters, toolchain, hardware) and through AI-driven rewriting that evades share-alike obligations. It argues that a functional analogue must instead rest on reproducible builds guaranteeing bit-exact reconstructability, defines seven requirements for AGI-oriented reproducible builds after reviewing OSAID, MOF, OpenMDW and deterministic-inference work, shows that Maffulli's Second Liberation thesis collapses without reproducibility, and proposes that dynamic mechanisms such as the Model Context Protocol are better addressed via protocols-not-platforms governance than via copyleft licensing.
Significance. If the premises hold, the work supplies a coherent conceptual reframing that relocates open-source protection for advanced AI from legal clauses to technical reproducibility practices. It explicitly credits and extends prior frameworks (OSAID, MOF, OpenMDW) by distilling seven requirements and identifies a new dynamic-linking layer (MCP-style coupling) for which existing licensing templates are ill-suited. The argument is internally consistent and follows directly from the stated premises without circularity or hidden empirical claims.
minor comments (2)
- [Abstract] The statement that 'no current open-source framework fully resolves' the listed constraints would be strengthened by a concise enumeration of the specific gaps each framework leaves unaddressed, even if only in summary form.
- The seven requirements are introduced as the core technical contribution; a short table or numbered list with one-sentence justification for each would improve readability without altering the argument.
Simulated Author's Rebuttal
We thank the referee for the detailed summary of the manuscript, the positive assessment of its significance, and the recommendation for minor revision. No specific major comments or requested changes were enumerated in the report.
Circularity Check
No significant circularity identified
full rationale
The paper advances a normative argument redefining copyleft-style obligations for AGI around reproducible builds rather than share-alike code. Its premises rest on external citations (OSAID, MOF, OpenMDW, Maffulli thesis, Masnick framework) and stated technical observations about AI artifacts; none of the seven requirements or the central claim reduces by definition, fitted parameter, or self-citation chain to the paper's own inputs. No equations, predictions, or uniqueness theorems appear that would trigger the enumerated circularity patterns. The derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Sufficiently capable AI systems can rewrite licensed source into functionally equivalent derivatives stripped of original obligations.
- domain assumption: No current open-source framework fully resolves the independent legal, technical, and economic constraints on artifacts required to reconstruct a model.
Reference graph
Works this paper leans on
- [1] Anthropic: Introducing the Model Context Protocol (2024). https://www.anthropic.com/news/model-context-protocol
- [2] Benhamou, Y., Reymond, M.: Open Source Artificial Intelligence Definition 1.0 – A “Take It or Leave It” Approach for Open Source AI Systems? Kluwer Copyright Blog, March 4 (2025). https://legalblogs.wolterskluwer.com/copyright-blog/open-source-artificial-intelligence-definition-10-a-take-it-or-leave-it-approach-for-open-source-ai-systems/
- [3] Carlini, N., Tramèr, F., Wallace, E., Jagielski, M., Herbert-Voss, A., Lee, K., Roberts, A., Brown, T., Song, D., Erlingsson, U., Oprea, A., Raffel, C.: Extracting Training Data from Large Language Models. In: 30th USENIX Security Symposium, pp. 2633–2650 (2021). https://www.usenix.org/conference/usenixsecurity21/presentation/carlini-extracting
- [4] Carlini, N., Jagielski, M., Choquette-Choo, C.A., Paleka, D., Pearce, W., Anderson, H., Terzis, A., Thomas, K., Tramèr, F.: Poisoning Web-Scale Training Datasets is Practical. arXiv:2302.10149 (2023). https://doi.org/10.48550/arXiv.2302.10149
- [5] Chen, B., Wen, M., Shi, Y., Lin, D., Rajbahadur, G.K., Jiang, Z.M.: Towards Training Reproducible Deep Learning Models. In: Proceedings of the 44th International Conference on Software Engineering (ICSE ’22), pp. 2202–2214 (2022). https://doi.org/10.1145/3510003.3510163
- [6] Claburn, T.: Chardet Dispute Shows How AI Will Kill Software Licensing, Argues Bruce Perens. The Register, March 6 (2026). https://www.theregister.com/2026/03/06/ai_kills_software_licensing/
- [7] European Parliament and Council: Regulation (EU) 2024/1689 of 13 June 2024 Laying Down Harmonized Rules on Artificial Intelligence (AI Act). Official Journal of the European Union (2024)
- [8] Free Software Foundation: GNU General Public License, Version 3 (2007). https://www.gnu.org/licenses/gpl-3.0.html
- [9] Free Software Foundation: What Is Free Software? (2024). https://www.gnu.org/philosophy/free-sw.html
- [10] Future of Life Institute: Asilomar AI Principles (2017). https://futureoflife.org/2017/08/11/ai-principles/
- [11] Generative AI Commons: Model Openness Tool (2025). https://isitopen.ai/
- [12] Gond, R., Kamath, A.K., Ramjee, R., Panwar, A.: LLM-42: Enabling Determinism in LLM Inference with Verified Speculation. arXiv:2601.17768 (2026). https://doi.org/10.48550/arXiv.2601.17768
- [13] Google: Announcing the Agent2Agent Protocol (A2A). Google Developers Blog (2025). https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/
- [14] Hatta, M.: “Copyleft” in the Context of GenAI. Hack or Be Hacked (Substack), October 21 (2024). https://mhatta.substack.com/p/copyleft-in-the-context-of-genai
- [15] Hatta, M.: Several Issues Regarding Data Governance in AGI. In: Iklé, M., Kolonin, A., Bennett, M. (eds.) Artificial General Intelligence. AGI 2025. Lecture Notes in Computer Science, vol. 16057, pp. 239–249. Springer, Cham (2026). https://doi.org/10.1007/978-3-032-00686-8_22
- [16] He, H., Thinking Machines Lab: Defeating Nondeterminism in LLM Inference. Thinking Machines Lab: Connectionism, September (2025). https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/
- [17] Kuhn, B.M.: Open Source AI Definition Erodes the Meaning of “Open Source”. Software Freedom Conservancy Blog, October 31 (2024). https://sfconservancy.org/blog/2024/oct/31/open-source-ai-definition-osaid-erodes-foss/
- [18] Lamb, C., Zacchiroli, S.: Reproducible Builds: Increasing the Integrity of Software Supply Chains. IEEE Software 39(2), 62–70 (2022). https://doi.org/10.1109/MS.2021.3073045. IEEE Software Best Paper Award 2022
- [19] LMSYS: Towards Deterministic Inference in SGLang and Reproducible RL Training. LMSYS Blog, September 22 (2025). https://www.lmsys.org/blog/2025-09-22-sglang-deterministic/
- [20] Maffulli, S.: The Second Liberation: AI Is the Final Frontier of Copyleft. Personal blog, March 16 (2026). https://www.maffulli.net/2026/03/16/ai-final-frontier-of-copyleft/
- [21] Masnick, M.: Protocols, Not Platforms: A Technological Approach to Free Speech. Knight First Amendment Institute, Columbia University, 19–05 (2019). https://knightcolumbia.org/content/protocols-not-platforms-a-technological-approach-to-free-speech
- [22] Mellinger, A., Justice, D., Connor, M., Gallagher, S., Brooks, T.: The Myth of Machine Learning Non-Reproducibility and Randomness for Acquisitions and Testing, Evaluation, Verification, and Validation. Software Engineering Institute, Carnegie Mellon University, Insights Blog, January 13 (2025). https://doi.org/10.58012/g17y-gp09
- [23] OpenAI: Function Calling and Other API Updates. OpenAI Blog, June 13 (2023). https://openai.com/blog/function-calling-and-other-api-updates
- [24] Open Future: The AI Act and Open Source AI. Open Future Observatory (2024). https://openfuture.eu/observatory/aia-open-source/
- [25] Open Source Initiative: The Open Source AI Definition v1.0 (2024). https://opensource.org/ai/open-source-ai-definition
- [26] Open Source Initiative: Deep Dive: Data Governance (Online Event, October 1–3, 2025). https://opensource.org/events/deep-dive-data-governance
- [27] Open Source Initiative: OSAID FAQs (2025). https://opensource.org/ai/faq
- [28] Open Source Initiative: Report from OSS EU 2025 and AI_dev: What’s Next for OSAID (2025). https://opensource.org/blog/report-from-oss-eu-2025-and-ai_dev-whats-next-for-osaid
- [29] The PyTorch Project: Reproducibility — PyTorch Documentation (2024). https://docs.pytorch.org/docs/stable/notes/randomness.html
- [30] The Reproducible Builds Project: Reproducible Builds—A Set of Software Development Practices That Create an Independently-Verifiable Path from Source to Binary Code (2024). https://reproducible-builds.org/
- [31] Samuelson, P.: Generative AI Meets Copyright. Science 381(6654), 158–161 (2023). https://doi.org/10.1126/science.adi0656
- [32] Semmelrock, H., Ross-Hellauer, T., Kopeinik, S., Theiler, D., Haberl, A., Thalmann, S., Kowald, D.: Reproducibility in Machine-Learning-Based Research: Overview, Barriers, and Drivers. AI Magazine 46(2), e70002 (2025). https://doi.org/10.1002/aaai.70002
- [33] Stallman, R.M.: Free Software, Free Society: Selected Essays of Richard M. Stallman. GNU Press, Boston (2002). https://www.gnu.org/philosophy/fsfs/rms-essays.pdf
- [34] White, M., Haddad, I., Osborne, C., Liu, X.-Y. (Yanglet), Abdelmonsef, A., Varghese, S., Le Hors, A.: The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency, and Usability in Artificial Intelligence. arXiv:2403.13784 (2024). https://doi.org/10.48550/arXiv.2403.13784
- [35] White, M.: The Open Source Legacy and AI’s Licensing Challenge. Linux Foundation Blog, May 22 (2025). https://www.linuxfoundation.org/blog/the-open-source-legacy-and-ais-licensing-challenge
- [36] Widder, D.G., West, S., Whittaker, M.: Open (for Business): Big Tech, Concentrated Power, and the Political Economy of Open AI. SSRN preprint (2023). https://doi.org/10.2139/ssrn.4543807
- [37] Yampolskiy, R.V.: Artificial Superintelligence: A Futuristic Approach. Chapman and Hall/CRC (2015). https://doi.org/10.1201/b18612
- [38] Yudkowsky, E., Herreshoff, M.: Tiling Agents for Self-Modifying AI, and the Löbian Obstacle. Machine Intelligence Research Institute Technical Report (2013). https://intelligence.org/files/TilingAgents.pdf