Market Design for AI: Beyond the Copyright Binary

Maryam Farboodi; Negin Golrezaei; Sepehr Shahshahani; Yan Dai

arXiv: 2606.12260 · v1 · pith:JRUOQWF4new · submitted 2026-06-10 · econ.TH · cs.AI · cs.GT · cs.LG · stat.ML

Pith reviewed 2026-06-27 07:25 UTC · model grok-4.3

classification econ.TH · cs.AI · cs.GT · cs.LG · stat.ML

keywords AI training data · copyright regimes · originality penalty · curse of precision · data intermediary · creative incentives · market design · Stackelberg game

The pith

Neither free-for-all fair use nor strong intellectual property rights sustain incentives for high-quality content creation in AI training markets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a free-for-all approach leaves creators uncompensated while strong property rights reduce incentives for innovative work through an originality penalty. Modeling the setting as a static Stackelberg game reveals how rights holders and platforms interact to underpower novel contributions. Extending to dynamics, the authors identify a feedback loop in which high-performing AI induces human reliance, producing homogenized content that degrades the model itself over time. They propose a data intermediary to internalize cross-creator externalities and subsidize innovative output, aligning individual decisions with efficient aggregate results.

Core claim

Both polar copyright regimes fail to compensate creators and maintain creative incentives for AI training data. Free-for-all provides no payments. Strong IP rights create an originality penalty that hits innovative creators hardest in a platform-led interaction. In dynamic settings, precise models increase reliance on AI-assisted creation, which homogenizes subsequent training inputs and lowers model performance: the curse of precision. A data intermediary restores efficiency by internalizing externalities across creators and directing subsidies toward innovative contributions.

What carries the argument

The data intermediary that internalizes cross-creator externalities and subsidizes innovative contributions.

If this is right

Strong intellectual property rights impose an originality penalty that disproportionately reduces incentives for innovative creators.
High-performing AI models induce greater human reliance, which homogenizes content and feeds back to degrade future model performance.
A data intermediary can restore efficiency by subsidizing innovative contributions and accounting for externalities between creators.
Free-for-all use leaves creators without compensation and therefore weakens the supply of high-quality training data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Platforms might adopt intermediary-style contracts voluntarily to slow the homogenization feedback loop.
Regulators could test intermediary designs in pilot markets for specific content domains such as images or text.
Long-run monitoring of output diversity metrics could serve as an early indicator of the curse of precision.

Load-bearing premise

The argument rests on two assumptions: that creator-AI interactions can be captured by a static Stackelberg game in which the platform moves first, and that human reliance on AI necessarily produces homogenized content without offsetting mechanisms.

What would settle it

Measure whether the share of highly original contributions declines under strong IP enforcement relative to weaker regimes, or whether content diversity falls across successive generations of AI-assisted output as model accuracy rises.
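
One way to operationalize the diversity half of this test is an entropy measure over content categories. The following is a minimal sketch, assuming each piece of content has already been assigned a style label; the labels, the two example generations, and the function name `shannon_entropy` are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical style labels for two successive generations of content.
generation_0 = ["a", "b", "c", "d"]   # four styles, evenly spread
generation_1 = ["a", "a", "a", "b"]   # output concentrates on one style

h0 = shannon_entropy(generation_0)
h1 = shannon_entropy(generation_1)

# A falling entropy across generations is the pattern the test looks for.
diversity_fell = h1 < h0
```

Entropy is only one candidate metric; any measure that falls as content concentrates on fewer styles would serve the same purpose.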

Original abstract

How can we design a market of human-generated content for use in training AI models that both enables technological progress and preserves individual incentives for high-quality content creation? Existing approaches take polar positions: a "free-for-all" model based on fair use and a "strong intellectual property rights" model. We show that both fail: Free-for-all does not compensate creators, and -- by modeling as a static Stackelberg game -- strong intellectual property rights also underpower creative incentives. We find this especially true for more innovative creators, a phenomenon we term the "originality penalty." Extending this insight to a dynamic model, we find another market failure undermining AI model performance, even for an initially good model: Such a model induces greater reliance by humans on AI-assisted creation, resulting in homogenized content feeding back into training, which degrades the model performance -- a "curse of precision." We further propose a market design with a data intermediary internalizing cross-creator externalities and subsidizing innovative contributions, thereby restoring efficiency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper introduces the originality penalty and the curse of precision via Stackelberg and dynamic models, but both failures trace directly to the chosen timing and homogenization assumptions.

The main thing to know is that both the free-for-all and strong-IP regimes are shown to under-incentivize creators, with strong IP hitting innovative creators harder through an originality penalty. A dynamic extension produces a curse of precision that degrades even good models over time. The proposed data intermediary is meant to fix both by subsidizing originality and internalizing externalities.

The new pieces are the two named phenomena and the intermediary subsidy rule. The static part uses a Stackelberg setup to derive the penalty, and the dynamic part adds a feedback loop from human reliance on AI output. That framing moves past the usual binary and gives a concrete mechanism for the intermediary.

The modeling is standard game theory and the logic follows from the stated assumptions. The paper earns credit for spelling out how precision can feed back into lower diversity without needing extra parameters.

The soft spots sit in those assumptions. The originality penalty requires the platform to move first; a different order or simultaneous moves would likely change the result. The curse of precision requires that greater AI reliance produces net homogenization with no offsetting channels for variety. The stress-test note is right on this: alter either piece and the market failures need not hold. No robustness checks or alternative timings are flagged in the abstract, so the generality is unclear.

This is for readers working on AI data policy or market design in industrial organization. Someone already thinking about creator incentives and platform power will get the most out of it. It deserves peer review because the topic is live and the structure is clear enough for referees to test the assumptions directly.

Referee Report

2 major / 0 minor

Summary. The paper claims that both a fair-use 'free-for-all' regime and a strong intellectual property rights regime fail to compensate creators and sustain incentives for high-quality content production for AI training. Modeling creator-platform interactions as a static Stackelberg game with the platform moving first, it identifies an 'originality penalty' that disproportionately harms innovative creators under strong IP. Extending to a dynamic model, it identifies a 'curse of precision' in which initial model accuracy induces human reliance that homogenizes subsequent training data and degrades performance. It proposes a data intermediary that internalizes cross-creator externalities and subsidizes innovative contributions to restore efficiency.

Significance. If the modeling results hold, the paper offers a constructive market-design alternative to the copyright binary and isolates two concrete mechanisms (originality penalty and curse of precision) that could shape future theoretical and policy work on AI data markets. The explicit proposal of an intermediary that subsidizes innovation is a positive contribution to mechanism-design approaches in this domain.

major comments (2)

Abstract and modeling sections: The claims that strong IP creates an originality penalty and that both polar regimes are inefficient rest on the specific static Stackelberg timing (platform moves first) and the dynamic assumption that greater AI reliance necessarily produces homogenized content without offsetting diversity channels. These modeling choices are load-bearing for the central inefficiency results and the intermediary fix; the manuscript should supply robustness checks or alternative timings to establish that the failures are not artifacts of the chosen game structure.
Dynamic model (referenced in abstract): The curse of precision is derived from a feedback loop in which an initially precise model increases human reliance and thereby reduces content variety in training data. The paper should clarify the micro-foundations of the reliance-homogenization link and discuss whether market or behavioral offsets (e.g., differential pricing or human experimentation) could break the loop, as the absence of such offsets is essential to the claimed market failure.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for these constructive comments, which highlight important modeling assumptions. We address each point below.

Referee (Abstract and modeling sections): The claims that strong IP creates an originality penalty and that both polar regimes are inefficient rest on the specific static Stackelberg timing (platform moves first) and the dynamic assumption that greater AI reliance necessarily produces homogenized content without offsetting diversity channels. These modeling choices are load-bearing for the central inefficiency results and the intermediary fix; the manuscript should supply robustness checks or alternative timings to establish that the failures are not artifacts of the chosen game structure.

Authors: The platform-leader Stackelberg timing is motivated by the institutional fact that platforms set data-acquisition policies before individual creators choose effort and originality levels. We will add a new subsection that examines the simultaneous-move and creator-leader variants analytically, demonstrating that the originality penalty survives whenever the platform can commit to its IP policy. For the dynamic homogenization assumption we will include a short robustness paragraph noting that the curse of precision is robust to moderate offsetting diversity channels provided those channels do not fully internalize cross-creator externalities; full numerical robustness checks across all parameterizations are left for future work given space constraints.
Revision: partial

Referee (Dynamic model, referenced in abstract): The curse of precision is derived from a feedback loop in which an initially precise model increases human reliance and thereby reduces content variety in training data. The paper should clarify the micro-foundations of the reliance-homogenization link and discuss whether market or behavioral offsets (e.g., differential pricing or human experimentation) could break the loop, as the absence of such offsets is essential to the claimed market failure.

Authors: We will expand Section 4 to derive the reliance-homogenization link from an explicit individual optimization problem in which each creator trades off the cost of original content against the lower cost of AI-assisted output, with the latter producing correlated signals. We will also add a paragraph discussing potential offsets (differential pricing, experimentation) and show that, while they can attenuate the loop, they do not eliminate the inefficiency in equilibrium because of the public-good character of data diversity; this clarifies the scope of the market failure without altering the core result.
Revision: yes
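
The loop described in this exchange can be illustrated with a deliberately crude simulation. It is not the paper's model: the reliance rule (reliance equals current precision), the diversity rule (diversity is one minus reliance), and the update weight `delta` are all assumptions chosen only to show how an initially precise model can drift downward once its own popularity shrinks the variety of its training data.

```python
def simulate_precision(q0, delta, steps):
    """Iterate a toy precision path q_t under an assumed reliance feedback.

    Assumptions (illustrative only, not from the paper): creators' reliance
    on AI equals the current precision q_t, the diversity of new training
    data is 1 - reliance, and precision moves toward that diversity level
    with weight delta each period.
    """
    path = [q0]
    for _ in range(steps):
        q = path[-1]
        reliance = q                 # assumed: a better model invites more reliance
        diversity = 1.0 - reliance   # assumed: reliance homogenizes new content
        path.append((1.0 - delta) * q + delta * diversity)
    return path


# Start from a precise model and follow it for ten periods.
path = simulate_precision(q0=0.9, delta=0.2, steps=10)
```

Under these assumptions the path falls monotonically from 0.9 toward 0.5. A different reliance or diversity rule would give a different path, which is the sensitivity the referee's second comment raises.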

Circularity Check

0 steps flagged

No significant circularity; results follow from explicit modeling assumptions

The paper's central claims about failures of free-for-all and strong-IP regimes, the originality penalty, and the curse of precision are derived from an explicitly stated static Stackelberg game (platform moves first) and its dynamic extension with human reliance leading to homogenization. These are modeling choices presented as such, not reductions of outputs to inputs by construction, not fitted parameters renamed as predictions, and not justified via self-citation chains or imported uniqueness theorems. No equations or steps in the provided text reduce the conclusions to the inputs by definition; the derivation remains self-contained within the game-theoretic setup.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only review. The models rely on standard game-theoretic assumptions of rational agents and perfect information in the Stackelberg setup, plus an implicit assumption that content originality is observable and contractible by the intermediary.

axioms (2)

Domain assumption: Players are rational payoff maximizers in a Stackelberg leader-follower structure.
Invoked to derive underpowered incentives under strong IP (abstract modeling description).
Domain assumption: Human creators will increase reliance on AI assistance as model quality rises.
Central to the dynamic curse of precision feedback loop.
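
The leader-follower assumption can be made concrete with a one-creator toy solved by backward induction. This is a sketch under stated assumptions, not the paper's game: the platform posts a per-unit price `p`, the creator chooses effort `e` at quadratic cost `(c/2) * e**2`, and the platform's value is linear in effort with a hypothetical slope `a`. All function names are illustrative.

```python
def follower_best_response(p, c):
    """Creator's effort maximizing p*e - (c/2)*e**2, which gives e = p/c."""
    return p / c


def leader_profit(p, a, c):
    """Platform profit when it anticipates the creator's best response."""
    e = follower_best_response(p, c)
    return a * e - p * e


def solve_leader(a, c, grid_size=10001):
    """Grid search over prices in [0, a] for the leader's optimum."""
    prices = [a * k / (grid_size - 1) for k in range(grid_size)]
    return max(prices, key=lambda p: leader_profit(p, a, c))


a, c = 1.0, 2.0                       # hypothetical parameter values
p_star = solve_leader(a, c)           # leader moves first
e_star = follower_best_response(p_star, c)

# Benchmark: a planner maximizing a*e - (c/2)*e**2 would choose e = a/c.
e_planner = a / c
```

In this toy the leader's price is half the marginal value of effort, so equilibrium effort is half the planner's level. That is the flavor of "underpowered incentives", though it says nothing about how the gap varies with originality.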

pith-pipeline@v0.9.1-grok · 5722 in / 1311 out tokens · 19239 ms · 2026-06-27T07:25:18.544739+00:00 · methodology

Reference graph

Works this paper leans on

12 extracted references · 2 linked inside Pith

[1] Anish Agarwal, Munther Dahleh, and Tuhin Sarkar. A marketplace for data: An algorithmic solution. In Proceedings of the 2019 ACM Conference on Economics and Computation, pages 701–726, 2019.
[2] Leonardo Berti, Flavio Giorgi, and Gjergji Kasneci. Emergent abilities in large language models: A survey. arXiv preprint arXiv:2503.05788.
[3] Nicole Immorlica, Meena Jagadeesan, and Brendan Lucier. Clickbait vs. quality: How engagement-based optimization shapes the content landscape in online platforms. In Proceedings of the ACM Web Conference 2024, pages 36–45, 2024.
[4] Meena Jagadeesan, Nikhil Garg, and Jacob Steinhardt. Supply-side equilibria in recommender systems. Advances in Neural Information Processing Systems, 36:14597–14608, 2023a. Meena Jagadeesan, Michael I Jordan, and Nika Haghtalab. Competition, alignment, and equilibria in digital marketplaces. In Proceed...
[5] Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models. arXiv preprint arXiv:2001.08361.
[6] Mark A. Lemley and Lisa Larrimore Ouellette. Plagiarism, copyright, and AI. University of Chicago Law Review Online, 2025.
[7] Alexander Muschalle, Florian Stahl, Alexander Löser, and Gottfried Vossen. Pricing approaches for data markets. In Enabling Real-Time Business Intelligence: 6th International Workshop, BIRTE 2012, Held at the 38th International Conference on Very Large Databases, VLDB 2012, Istanbul, Turkey, August 27, 2012, Revised Selected Papers, volume 154, 2012.
[8] Fan Yao, Chuanhao Li, Denis Nekipelov, Hongning Wang, and Haifeng Xu. How bad is top-k recommendation under competing content creators? In International Conference on Machine Learning, pages 39674–39701. PMLR, 2023a. Fan Yao, Chuanhao Li, Karthik Abinav Sankararaman, Yiming Liao, Yan Zhu, Qifan Wang, Hongning Wang, and Haifeng Xu. Rethinking incentives i...
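[9] gives the Best Linear Unbiased Estimator (BLUE) optimizing Equation (4) as $\hat{X} = (\mathbf{1}'\Sigma^{-1}\mathbf{1})^{-1}\mathbf{1}'\Sigma^{-1}s$, where $\Sigma = D^{-1} + (\sigma_\eta^2 + \gamma)\beta\beta'$. (27) Using the Sherman-Morrison-Woodbury identity (Sherman and Morrison, 1950; Woodbury, 1950), $\Sigma^{-1} = \left(D^{-1} + (\sigma_\eta^2 + \gamma)\beta\beta'\right)^{-1} = D - \frac{(\sigma_\eta^2 + \gamma)D\beta\beta'D}{1 + (\sigma_\eta^2 + \gamma)\beta'D\beta}$. Substituting this into the precision formula $K(h) = \mathbf{1}'\Sigma^{-1}\mathbf{1}$, we deriv...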
[10] Proof of Proposition 3. To prove the uniqueness of $h^{*}$ and $h^{\mathrm{sp}}$, we claim $K(h)$ is concave. This is because the first term $\sum_{i=1}^{N} h_i$ is linear, and the second term $\frac{\left(\sum_{i=1}^{N} h_i\beta_i\right)^2}{(\sigma_\eta^2 + \gamma)^{-1} + \sum_{i=1}^{N} h_i\beta_i^2}$, a quadratic-over-linear function in $h$, is also convex (Boyd and Vandenberghe, 2004, p. 73). Given the production cost $C(h_i) = \frac{c}{2}h_i^2$ is strictly convex, both...
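[11] Proof of Proposition 5. According to the buyer's optimization Equation (12), at any $t$, the market price $p(t)$ maximizes the instantaneous profit $\Pi_{\mathrm{inst}}(t)$ (we omit all dependencies on $t$ for readability): $\Pi_{\mathrm{inst}} = \frac{dK}{dt} - \left(N_O(p) + N_C(p)\right)p\,e(p) \overset{(a)}{=} \frac{\partial K}{\partial S_O}\frac{dS_O}{dt} + \frac{\partial K}{\partial S_C}\frac{dS_C}{dt} - \frac{p^2}{c}\left(N_O(p) + N_C(p)\right) \overset{(b)}{\le} 1\cdot\left(N_O\frac{p}{c} - \delta S_O\right) + i_C\left(N_C\frac{p}{c} - \delta S_C\right) - \frac{p^2}{c}\left(N_O + N_C\right) = \frac{p}{c}N_O(1-p...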
[12] By the Implicit Function Theorem (Krantz and Parks, 2002), we have $\frac{d\bar{p}}{di_C} = -\frac{\partial G/\partial i_C}{\partial G/\partial \bar{p}} = \frac{1 - \frac{1}{1+\rho(\bar{p})}}{1 + \frac{1 - i_C}{(1+\rho(\bar{p}))^2}\rho'(\bar{p})}$. The numerator is positive because $\rho(p) > 0$ for all $p$, and the denominator is also positive because $i_C \in (0,1]$ and also $\rho'(p) > 0$ for all $p$ (from Equation (7) and Lemma 13). Therefore, $\bar{p}'(i_C) > 0$ for all $i_C \in (0,1]$, as claimed. B.3...
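
The Sherman-Morrison-Woodbury step quoted in entry [9] and the two-term form of K(h) quoted in entry [10] can be checked numerically. The sketch below assumes D is the diagonal matrix diag(h), which is an inference from the linear first term in entry [10] rather than something the extracted fragments state, and it uses a single scalar `a` to stand in for σ_η² + γ. The numeric values and function names are hypothetical.

```python
def sherman_morrison_inverse(h, beta, a):
    """Inverse of diag(1/h) + a*beta*beta' via the Sherman-Morrison formula.

    Returns D - a*(D beta)(D beta)' / (1 + a*beta' D beta) with D = diag(h).
    """
    n = len(h)
    d_beta = [h[i] * beta[i] for i in range(n)]                  # D beta
    denom = 1.0 + a * sum(beta[i] * d_beta[i] for i in range(n))
    return [[(h[i] if i == j else 0.0) - a * d_beta[i] * d_beta[j] / denom
             for j in range(n)] for i in range(n)]


def precision_closed_form(h, beta, a):
    """sum(h) - (sum h_i beta_i)^2 / (1/a + sum h_i beta_i^2)."""
    s1 = sum(hi * bi for hi, bi in zip(h, beta))
    s2 = sum(hi * bi * bi for hi, bi in zip(h, beta))
    return sum(h) - s1 ** 2 / (1.0 / a + s2)


h = [1.0, 2.0, 0.5]        # hypothetical effort levels
beta = [0.3, 0.7, 1.1]     # hypothetical loadings
a = 0.8                    # stands in for sigma_eta^2 + gamma

n = len(h)
sigma = [[(1.0 / h[i] if i == j else 0.0) + a * beta[i] * beta[j]
          for j in range(n)] for i in range(n)]
inv = sherman_morrison_inverse(h, beta, a)

# Sigma times the candidate inverse should be the identity matrix.
product = [[sum(sigma[i][k] * inv[k][j] for k in range(n))
            for j in range(n)] for i in range(n)]

k_direct = sum(sum(row) for row in inv)      # 1' Sigma^{-1} 1
k_closed = precision_closed_form(h, beta, a)
```

If the diagonal assumption holds, `k_direct` and `k_closed` agree, which ties the linear term and the quadratic-over-linear term in entry [10] back to the inverse in entry [9].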
