A new extraction technique applied to 200 books and 14 LLMs finds that memorization of full books is rare except in specific high-capacity models where entire texts can be recovered verbatim.
Feder Cooper, James Grimmelmann, and Daphne Ippolito
2 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (1)
Citation polarities: background (1)
Representative citing papers:
The paper identifies seven asymmetries in access to AI evidence and proposes a three-part test for courts to resolve disclosure disputes using proportionality and reasonable alternatives.
Citing papers explorer
- Extracting memorized pieces of (copyrighted) books from open-weight language models
  A new extraction technique applied to 200 books and 14 LLMs finds that memorization of full books is rare except in specific high-capacity models where entire texts can be recovered verbatim.
- Barriers to Evidence in AI-Related Cases and the Privatization of Proof
  The paper identifies seven asymmetries in access to AI evidence and proposes a three-part test for courts to resolve disclosure disputes using proportionality and reasonable alternatives.