
arXiv: 2505.12546 · v5 · submitted 2025-05-18 · 💻 cs.CL · cs.CY · cs.LG

Extracting memorized pieces of (copyrighted) books from open-weight language models

Pith reviewed 2026-05-22 13:55 UTC · model grok-4.3

classification 💻 cs.CL · cs.CY · cs.LG
keywords memorization · large language models · copyright · books · generative AI · extraction · open-weight models · Llama

The pith

Some LLMs memorize entire copyrighted books and can reproduce them nearly verbatim from just the first few words as a prompt.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a technique to measure how much of a given book an LLM has memorized: it starts generation from short prefixes of the book and compares the output to the original text. It runs this test across 200 books and 14 open-weight models in more than 3000 experiments. The results show that memorization is not uniform: most models do not memorize most books in whole or in part, yet clear exceptions appear where a model such as Llama 3.1 70B has stored books like Harry Potter and the Sorcerer's Stone so completely that the full text can be extracted almost word for word. This evidence matters because copyright lawsuits over generative AI often rest on absolute claims that models either do or do not copy protected works. A sympathetic reader would care because the findings replace polarized assertions with concrete, book-by-book and model-by-model data on when memorization actually occurs.

Core claim

The authors establish that memorization of books in LLMs varies by both model and book. Their extraction methodology shows that while most of the tested models do not memorize most of the tested books either wholly or partially, notable exceptions exist. In particular, Llama 3.1 70B entirely memorizes some books, such as Harry Potter and the Sorcerer's Stone, to the point that providing the book's first few words as an initial prompt allows deterministic, near-verbatim extraction of the entire work. The paper discusses how these results carry significant implications for copyright cases without clearly favoring either plaintiffs or defendants.

What carries the argument

A prompt-based extraction procedure that uses the first few words of a target book as a starting prompt and then measures how closely the model's generated continuation matches the book's actual text.
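
As a rough illustration of this procedure, the sketch below prompts a model with a prefix and checks how much of the ground-truth suffix the continuation reproduces. The `generate` callable, the toy lookup "model", and all function names are hypothetical stand-ins, not the authors' code; a real run would call an actual LLM with its own tokenizer and decoding settings.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: (prompt tokens, number of new tokens) -> continuation tokens.
Generate = Callable[[Sequence[str], int], List[str]]


def matched_prefix_length(generated: Sequence[str], target: Sequence[str]) -> int:
    """Count how many leading tokens of `generated` agree with `target`."""
    count = 0
    for g, t in zip(generated, target):
        if g != t:
            break
        count += 1
    return count


def is_extracted(generate: Generate, prefix: Sequence[str], suffix: Sequence[str]) -> bool:
    """Exact-match check: does the continuation reproduce the whole suffix?"""
    continuation = generate(prefix, len(suffix))
    return matched_prefix_length(continuation, suffix) == len(suffix)


def make_toy_model(memorized: Sequence[str]) -> Generate:
    """Stand-in for an LLM that can only continue text it has 'memorized'."""
    def generate(prompt: Sequence[str], max_new_tokens: int) -> List[str]:
        n = len(prompt)
        for start in range(len(memorized) - n + 1):
            if list(memorized[start:start + n]) == list(prompt):
                return list(memorized[start + n:start + n + max_new_tokens])
        # Unknown prompt: return filler tokens that will not match any suffix.
        return ["<unk>"] * max_new_tokens
    return generate


# Whitespace "tokens" from a public-domain opening line, for illustration only.
book = ("it is a truth universally acknowledged that a single man in possession "
        "of a good fortune must be in want of a wife").split()
model = make_toy_model(book)
prefix, suffix = book[:5], book[5:10]

print(is_extracted(model, prefix, suffix))                       # True
print(is_extracted(model, ["call", "me", "ishmael"], suffix))    # False
```

The exact-match criterion is deliberately strict: a single differing token counts as a failed extraction, which is why `matched_prefix_length` is kept separate for cases where a softer measure is wanted.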

If this is right

  • Memorization can be measured empirically on a per-book and per-model basis instead of assumed to be uniform across all LLMs.
  • Copyright arguments can shift from blanket claims about all generative AI systems to evidence about specific memorized works.
  • Certain models store and can reproduce substantial verbatim portions of their training data under targeted prompting.
  • Systematic testing across many books can reveal patterns in which models and which texts show high memorization.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Model developers could apply similar prefix-based tests as an internal audit step before releasing open-weight checkpoints.
  • Training pipelines might need to track and mitigate exact long-sequence memorization through data filtering or architectural changes.
  • Legal standards could eventually distinguish between models that rarely allow extraction of protected text and those that readily do so.
  • The same method might be used to compare memorization rates between copyrighted works and public-domain material.

Load-bearing premise

Using short prefixes from a book as prompts reliably pulls out memorized content rather than simply eliciting a plausible continuation that happens to match the book by chance.
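
One way to make the "by chance" concern concrete: when a model samples token by token, the probability of producing one exact suffix is the product of the per-token conditional probabilities. The sketch below assumes the 50-token suffix and 0.1% threshold quoted in the figure captions on this page, and asks how high the typical per-token probability must be for a sequence to clear that threshold. The function names are illustrative, and this is not the authors' implementation.

```python
import math
from typing import Sequence


def sequence_probability(token_probs: Sequence[float]) -> float:
    """Probability of an exact suffix: product of per-token conditional probabilities.

    Each probability must be > 0. Logs are summed instead of multiplying directly
    to avoid floating-point underflow on long suffixes.
    """
    return math.exp(sum(math.log(p) for p in token_probs))


def required_mean_token_prob(threshold: float, suffix_len: int) -> float:
    """Geometric-mean per-token probability needed for the suffix to reach `threshold`."""
    return threshold ** (1.0 / suffix_len)


# Values taken from the captions: 50-token suffix, threshold of 0.1%.
needed = required_mean_token_prob(0.001, 50)
print(round(needed, 3))  # 0.871

# A model that is merely "plausible" (50% per correct token) is far below threshold.
print(sequence_probability([0.5] * 50))
```

Under these assumptions, a 50-token suffix clears a 0.1% threshold only if the model assigns roughly 87% probability, on average, to each correct next token. At 50% per token the exact-match probability is on the order of 1e-15, which is the intuition behind treating high-probability matches as evidence of memorization rather than coincidence.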

What would settle it

Prompting Llama 3.1 70B with the opening paragraph of Harry Potter and the Sorcerer's Stone and obtaining output that quickly deviates from the book's actual text in a manner indistinguishable from ordinary creative generation.

Figures

Figures reproduced from arXiv: 2505.12546 by Aaron Gokaslan, A. Feder Cooper, Ahmed Ahmed, Allison Casasola, Amy B. Cyphert, Christopher De Sa, Daniel E. Ho, Mark A. Lemley, Percy Liang.

Figure 1
Quantifying memorization for a sequence drawn from The Great Gatsby. (left) Discoverable extraction [41, 156] prompts an LLM with a training-data prefix, and checks if the resulting generation exactly matches the ground-truth suffix. (right) Probabilistic extraction [122] measures the probability pz (%) of generating the exact suffix given the prefix, shown here for Llama 1 models on the same quote from T…
Figure 2
Visualizing our sliding-window probabilistic extraction procedure. For each 100-token window (50-token prefix + 50-token suffix) across 1984, we compute pz for Llama 3.1 70B with respect to top-k decoding (T = 1, k = 40). (a) Scatterplot of all extracted sequences z shown by their unique start position in the book. The 100-token sequences overlap significantly. (b) Condensed heatmap view. At each character…
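
The windowing described in this caption can be sketched as follows. This is a minimal illustration, assuming a fixed stride between window starts; the caption says only that the 100-token sequences overlap significantly, so the `stride` value and the function name are assumptions rather than the paper's settings.

```python
from typing import Iterator, List, Sequence, Tuple


def sliding_windows(
    tokens: Sequence[int],
    prefix_len: int = 50,
    suffix_len: int = 50,
    stride: int = 10,  # assumed value; the caption does not state the stride
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (start, prefix, suffix) for every full window over `tokens`."""
    window = prefix_len + suffix_len
    for start in range(0, len(tokens) - window + 1, stride):
        prefix = list(tokens[start:start + prefix_len])
        suffix = list(tokens[start + prefix_len:start + window])
        yield start, prefix, suffix


toy_book = list(range(250))  # stand-in token ids for a very short "book"
windows = list(sliding_windows(toy_book))
print(len(windows), windows[0][0], windows[-1][0])  # 16 0 150
```

Each yielded prefix would be used as a prompt and each suffix as the target whose probability is measured; trailing tokens that do not fill a complete window are skipped in this sketch.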
Figure 3
Average extraction rates are low, but book-specific extraction varies widely. (left) Comparing extraction rates (Equation 3) of random 100-token sequences (50-token prefix + 50-token suffix) from Books3 for different LLMs trained on Books3. Average extraction is low (Appendix C), regardless of the discoverable extraction metric (greedy or probabilistic with pz ≥ τmin = 0.1%, top-k decoding with T = 1 and k=…
Figure 4
LLM-specific distributions over extraction coverage. Extraction coverage (Equation 4) differs across LLMs for the 200 books we evaluate, illustrated for Llama 1 65B and Llama 3.1 70B. Results use 100-token sequences (50-token prefix + 50-token suffix), top-k decoding (T = 1, k = 40), and τmin = 0.1% as the coverage threshold τ. Sandman Slim is hardly memorized at all. Extraction coverage for this book is l…
Figure 5
Illustrating the validation of our extraction measurements. The sliding-window procedure for Llama 3.1 70B, Llama 3.1 8B, and Phi 4 on We Were Eight Years in Power [49] (in Books3), using prefix lengths in {25, 50, 100, 200, 400, 800} and 50-token suffixes with top-k decoding (T = 1, k = 40). The book is in Llama’s training data, but not Phi 4’s. Longer prefixes reveal more memorized sequences, though not a…
Figure 6
Extraction coverage by prefix length. Plotting extraction coverage for τ = τmin = 0.1% (Equation 4) for 5 books we test with varying prefix lengths and Llama 3.1 70B. 4 of these books are in Books3 and thus in Llama’s training data. The remaining book, Great Big Beautiful Life [127], is used in our negative controls on non-training data; extraction coverage is 0% (aside from the copyright notice) regardless…
Figure 7
Identifying spontaneous generation of non-training data. We show results for Llama 3.1 70B and top-k (T = 1, k = 40) decoding for Great Big Beautiful Life [127] (published spring 2025), with respect to 50-token suffixes and prefix lengths in {200, 400, 800} tokens. Except for the copyright notice at the beginning, we do not register any pz ≥ τmin = 0.1% (i.e., these are the only sequences our procedure ide…
Figure 8
Memorization varies widely across books and models (non-random books). We apply the sliding-window procedure with top-k decoding (T = 1, k = 40) and 100-token sequences (50-token prefixes + 50-token suffixes) for LLMs (a) trained on Books3 (blue) and (b) where it is unknown if Books3 was included in the training data (red). With respect to these settings, Llama 3.1 70B memorizes effectively all of Harry Po…
Figure 9
Memorization varies widely across books and models (random books). We apply the sliding-window procedure with top-k decoding (T = 1, k = 40) and 100-token sequences (50-token prefixes + 50-token suffixes) for LLMs (a) trained on Books3 (blue) and (b) where it is unknown if Books3 was included in the training data (red). With respect to these settings, Llama 3.1 70B memorizes a substantial portion of The Alc…
Figure 10
Visual diffs of long-form extracted text. Four samples of the diff between the ground-truth text of Harry Potter from Books3 and the text we generated using Llama 3.1 70B. Text crossed out in red indicates ground-truth text from the book absent in the generation. Text highlighted in yellow is present in the generation, but absent from the ground-truth book. Note that this visual diffing procedure is sensi…
Figure 11
Illustrating different possible completions of the “careless people” quote. For a prefix of a sequence from The Great Gatsby [91], we show three 32-token completions when using the prefix as a prompt to Llama 1 30B (top-k, T = 1 and k = 40). Each row reflects a different interaction with the LLM: (left) the ground-truth prefix, used as the prompt (which is always the same); (middle) the target suffix, whic…
Figure 12
Plotting pz for the “careless people” quote from The Great Gatsby. (left) pz (Equation 1) for different models. (right) Translating pz into how many prompts n it would take to extract the sequence z with at least probability p (Equation 5). See Section 2. Definition 1 ((n, p)-discoverable extraction, from Hayes et al. [122]). Given a training sequence z that is split into an a-length prefix z1:a and a j-length suffix za+1:…
Figure 13
Probability of an event occurring at least once changes as a function of n independent trials. Following the intuition of flipping a fair coin (where heads has pz = 0.5 = 50%), we show how the probability of flipping heads at least once changes with more flips (left). We show how the probability p of generating the verbatim suffix of the “careless people” quote at least once for Llama 1 30B (pz ≈ 35.2%) …
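
The relationship these two captions describe, between a per-prompt probability pz and the number of prompts n, follows from the standard identity for independent trials: the chance of at least one success is 1 − (1 − pz)^n. The sketch below is an illustration of that identity, not the paper's code. The 35.2% value comes from the caption above, while the 99% target and the function names are assumptions chosen for the example.

```python
import math


def prob_at_least_once(p_z: float, n: int) -> float:
    """Probability that an event with per-trial probability p_z occurs at least once in n independent trials."""
    return 1.0 - (1.0 - p_z) ** n


def prompts_needed(p_z: float, p: float) -> int:
    """Smallest n with 1 - (1 - p_z)**n >= p, for 0 < p_z < 1 and 0 < p < 1."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - p_z))


# Fair coin: three flips give at least one head with probability 0.875.
print(prob_at_least_once(0.5, 3))  # 0.875

# Caption value p_z = 35.2%; the 99% target is an assumed example threshold.
print(prompts_needed(0.352, 0.99))  # 11
```

Because the required n grows roughly like 1/pz for small pz, sequences with very low per-prompt probability would need a very large number of prompts before a verbatim generation becomes likely.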
Figure 14
Extended extraction rate results. Comparing greedy discoverable and probabilistic extraction rates (for our conservative setting, pz ≥ τmin = 0.1%). As a reference point, we also include the maximum possible rate of generating 50-token sequences that match our suffixes drawn from Books3 for small (7–9B), medium (12–14B), and large models (65–70B). (This is akin to looking at all pz > 0.) Probabilistic r…
Figure 15
Things Fall Apart, Achebe. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 16
The Hitchhiker’s Guide to the Galaxy - Omnibus, Adams. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 17
Americanah, Adichie. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 18
The Baghdad Clock, Al Rawi. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 19
Industrial Magic, Armstrong. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 20
Fantastic Voyage, Asimov. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 21
The Complete Robot, Asimov. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 22
The Handmaid’s Tale, Atwood. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 23
Pride and Prejudice, Austen. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 24
The Christmas Train, Baldacci. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 25
Notes of a Native Son, Baldwin. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 26
Another Country, Baldwin. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 27
The Lemon Table, Barnes. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 28
Dante and the Origins of Italian Literary Culture, Barolini. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 29
The Parthenon, Beard. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 30
Guam: Past and Present, Beardsley. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 31
Waiting for Godot, Beckett. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 32
The Lonely Soldier, Benedict. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 33
Simple Cakes, Berry. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 34
Paradise Valley, Box. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 35
The Cat’s Pajamas, Bradbury. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 36
London in Chains, Bradshaw. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 37
My Einstein, Brockman. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 38
The Da Vinci Code, Brown. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 39
Live and Learn, Bryant. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 40
Knowing Your Value, Brzezinski. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 41
The Myth of Sisyphus, Camus. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 42
Alice’s Adventures in Wonderland, Carroll. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 43
The Infinity Link, Carver. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 44
Murder on the Orient Express, Christie. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 45
And Then There Were None, Christie. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 46
The Beautiful Struggle, Coates. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 47
We Were Eight Years in Power, Coates. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 48
The Water Dancer, Coates. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 49
The Infernal Machine, Cocteau. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 50
The Alchemist, Coelho. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 51
Dungeons and Dragons and Philosophy, Cogburn and Silcox. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 52
Mark Rothko, Cohen-Solal. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 53
The Hunger Games, Collins. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 54
The Dragon Never Sleeps, Cook. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 55
The 7 Habits of Highly Effective People, Covey. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 56
Bad Kid, Crabb. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 57
Lullaby Town, Crais. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 58
Jurassic Park, Crichton. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 59
The Hours, Cunningham. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 60
Inhuman Land, Czapski. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 61
Charlie and the Chocolate Factory, Dahl. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 62
James and the Giant Peach, Dahl. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 63
Automating the News, Diakopoulos. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 64
Drown, Díaz. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 65
The Brief Wondrous Life of Oscar Wao, Díaz. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 66
This Is How You Lose Her, Díaz. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 67
The White Album, Didion. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 68
Down and Out in the Magic Kingdom, Doctorow. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 69
The World’s Wife, Duffy. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 70
A Visit from the Goon Squad, Egan. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 71
Invisible Man, Ellison. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 72
We Should All Be Mirandas, Fairless and Garroni. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 73
The Dude Abides, Falsani. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 74
The President’s Vampire, Farnsworth. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 75
The Great Gatsby, Fitzgerald. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 76
Gone Girl, Flynn. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 77
British Destroyers, Friedman. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 78
Florals & Botanicals, Fudurich et al. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 79
Good Omens, Gaiman and Pratchett. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 80
The Slippery Year, Gideon. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 81
Sweater Surgery, Girard. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 82
Blink, Gladwell. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 83
The Land Before Avocado, Glover. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 84
Dead Ringers, Golden. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 85
Ararat, Golden. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 86
Lord of the Flies, Golding. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 87
Figure 87. Figure 87: Wizard’s First Rule, Goodkind. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%). D.3.74 Rome and Jerusalem, Goodman [PITH_FULL_IMAGE:figures/full_fig_p092_87.png] view at source ↗
Figure 88
Figure 88. Figure 88: Rome and Jerusalem, Goodman. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%). 92 [PITH_FULL_IMAGE:figures/full_fig_p092_88.png] view at source ↗
Figure 89
Figure 89. Figure 89: The Fault in Our Stars, Green. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%). D.3.76 The Third Man, Greene [PITH_FULL_IMAGE:figures/full_fig_p093_89.png] view at source ↗
Figure 90: The Third Man, Greene. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 91: The Confessions of Max Tivoli, Greer. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.78 The Fugitive, Grisham
Figure 92: The Fugitive, Grisham. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 93: The Curious Incident of the Dog in the Night-Time, Haddon. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.80 Migrations to Solitude, Halpern
Figure 94: Migrations to Solitude, Halpern. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 95: Uncommon Type, Hanks. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.82 Buzz, Hanson
Figure 96: Buzz, Hanson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 97: Requiem for the Sun, Haydon. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.84 Catch-22, Heller
Figure 98: Catch-22, Heller. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 99: The Old Man and the Sea, Hemingway. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.86 Life on Air, Hendy
Figure 100: Life on Air, Hendy. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 101: Great Hair Days, Hersheson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.88 Graft, Hill
Figure 102: Graft, Hill. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 103: The Outsiders, Hinton. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.90 The Second Summoning, Huff
Figure 104: The Second Summoning, Huff. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 105: Selected Poems of Langston Hughes, Hughes. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.92 M. Butterfly, Hwang
Figure 106: M. Butterfly, Hwang. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 107: Building and Operating a Realistic Model Railway, Jackson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.94 All the Onions, Jacobs
Figure 108: All the Onions, Jacobs. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 109: Fifty Shades of Grey, James. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.96 The Stone Sky, Jemisin
Figure 110: The Stone Sky, Jemisin. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 111: Ulysses, Joyce. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.98 Sandman Slim, Kadrey
Figure 112: Sandman Slim, Kadrey. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 113: Ethnography after Antiquity, Kaldellis. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.100 Who Is Rich?, Klam
Figure 114: Who Is Rich?, Klam. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 115: The Servants of Twilight, Koontz. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.102 Tai Chi for Depression, Kuhn
Figure 116: Tai Chi for Depression, Kuhn. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 117: The Tide Was Always High, Kun. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.104 Girl in Translation, Kwok
Figure 118: Girl in Translation, Kwok. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 119: A Wrinkle in Time, L’Engle. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.106 Call Me Brooklyn, Lago
Figure 120: Call Me Brooklyn, Lago. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 121: Dead Wake, Larson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.108 The Girl with the Dragon Tattoo, Larsson
Figure 122: The Girl with the Dragon Tattoo, Larsson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 123: The Daughter of Odren, Le Guin. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.110 The Chronicles of Narnia, Lewis
Figure 124: The Chronicles of Narnia, Lewis. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 125: After I’m Gone, Lippman. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.112 Sunburn, Lippman
Figure 126: Sunburn, Lippman. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 127: Marvel’s Spider-Man: Hostile Takeover, Liss. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.114 Anastasia on Her Own, Lowry
Figure 128: Anastasia on Her Own, Lowry. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 129: Nixon in China, MacMillan. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.116 The Doomsday Prophecy, Mariani
Figure 130: The Doomsday Prophecy, Mariani. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 131: A Game of Thrones, Martin. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.118 Rough-Hewn Land, Meldahl
Figure 132: Rough-Hewn Land, Meldahl. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 133: Twilight, Meyer. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.120 The Duchess War, Milan
Figure 134: The Duchess War, Milan. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 135: On Liberty, Mill. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.122 Coal Creek, Miller
Figure 136: Coal Creek, Miller. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 137: Winnie the Pooh, Milne. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.124 Catching the Sky, Moore and O’Brien
Figure 138: Catching the Sky, Moore and O’Brien. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 139: The Heretic Queen, Moran. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.126 The Gondola Maker, Morelli
Figure 140: The Gondola Maker, Morelli. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 141: Songs in Ordinary Time, Morris. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.128 Beloved, Morrison
Figure 142: Beloved, Morrison. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 143: Norwegian Wood, Murakami. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.130 Eat More Plants, Nielsen
Figure 144: Eat More Plants, Nielsen. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 145: Polaris, Northrop. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.132 Pagans, O’Donnell
Figure 146: Pagans, O’Donnell. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 147: Windfall, O’Sullivan. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.134 The Memory Police, Ogawa
Figure 148: The Memory Police, Ogawa. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 149: Winter Sisters, Oliveira. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.136 Nineteen Eighty-Four, Orwell
Figure 150: Nineteen Eighty-Four, Orwell. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 151: Fight Club, Palahniuk. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.138 The Complete Joy of Homebrewing, Papazian
Figure 152: The Complete Joy of Homebrewing, Papazian. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 153: The Cult of Loving Kindness, Park. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.140 Along Came a Spider, Patterson
Figure 154: Along Came a Spider, Patterson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 155: Payard Cookies, Payard and McBride. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.142 Essential Pepin Desserts, Pépin
Figure 156: Essential Pepin Desserts, Pépin. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 157: Why New Orleans Matters, Piazza. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.144 Enlightenment Now, Pinker
Figure 158: Enlightenment Now, Pinker. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 159: Competitive Strategy, Porter. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.146 Night Watch, Pratchett
Figure 160: Night Watch, Pratchett. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 161: The Subtle Knife, Pullman. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.148 The Seductions of Darwin, Rampley
Figure 162: The Seductions of Darwin, Rampley. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 163: Kitchen Table Wisdom, Remen. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.150 Backroads Boss Lady, Roberts and Witter
Figure 164: Backroads Boss Lady, Roberts and Witter. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 165: Soft in the Head, Roger. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.152 The Making of a Mediterranean Emirate, Rouighi
Figure 166: The Making of a Mediterranean Emirate, Rouighi. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 167: Harry Potter and the Sorcerer’s Stone, Rowling. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.154 Harry Potter and the Chamber of Secrets, Rowling
Figure 168: Harry Potter and the Chamber of Secrets, Rowling. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 169: Harry Potter and the Goblet of Fire, Rowling. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.156 Harry Potter and the Deathly Hallows, Rowling
Figure 170: Harry Potter and the Deathly Hallows, Rowling. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 171: Born to Walk, Rubinstein. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.158 The Pretender, Ruskin
Figure 172: The Pretender, Ruskin. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 173: Toscanini, Sachs. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.160 Cosmos, Sagan
Figure 174: Cosmos, Sagan. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 175: Middle India, Sahni. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.162 The Catcher in the Rye, Salinger
Figure 176: The Catcher in the Rye, Salinger. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 177: Lean In, Sandberg. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.164 The DevOps Adoption Playbook, Sharma
Figure 178: The DevOps Adoption Playbook, Sharma. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 179: Frankenstein, Shelley. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.166 Sally Ride: America’s First Woman in Space, Sherr
Figure 180: Sally Ride: America’s First Woman in Space, Sherr. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 181: A Perfectly Good Family, Shriver. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.168 The Bedwetter, Silverman
Figure 182: The Bedwetter, Silverman. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 183: On the Road with Bob Dylan, Sloman. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.170 The Night Children, Smith
Figure 184: The Night Children, Smith. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 185: White Teeth, Smith. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.172 No Visible Bruises, Snyder
Figure 186: No Visible Bruises, Snyder. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 187: The Grapes of Wrath, Steinbeck. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.174 Jesse James, Stiles
Figure 188: Jesse James, Stiles. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 189: Zombie Halloween, Stine. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.176 Fear of Music, Stubbs
Figure 190: Fear of Music, Stubbs. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 191: Pearl Harbor, Swanson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.178 Chesapeake Requiem, Swift
Figure 192: Chesapeake Requiem, Swift. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 193: The Goldfinch, Tartt. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.180 Unglued, TerKeurst
Figure 194: Unglued, TerKeurst. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 195: Embraced, TerKeurst. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.182 Birding with Yeats, Thomson
Figure 196: Birding with Yeats, Thomson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 197: The Hobbit, Tolkien. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.184 The Fellowship of the Ring, Tolkien
Figure 198: The Fellowship of the Ring, Tolkien. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 199: Tree and Leaf, Tolkien. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
D.3.186 Noodles Every Day, Trang
Figure 200
Figure 200. Figure 200: Noodles Every Day, Trang. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%). 148 [PITH_FULL_IMAGE:figures/full_fig_p148_200.png] view at source ↗
Figure 201
Figure 201: Billionaire Democracy, Tyler. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 202
Figure 202: Portugal’s Guerrilla Wars in Africa, Venter. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 203
Figure 203: Slaughterhouse-Five, Vonnegut. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 204
Figure 204: Animal Rights, Waldau. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 205
Figure 205: Men We Reaped, Ward. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 206
Figure 206: Charlotte’s Web, White. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 207
Figure 207: A Return to Love, Williamson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 208
Figure 208: Another Brooklyn, Woodson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 209
Figure 209: Brown Girl Dreaming, Woodson. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 210
Figure 210: A Little Life, Yanagihara. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 211
Figure 211: The Art of Bonsai, Yoshimura and Halford. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 212
Figure 212: A People’s History of the United States, Zinn. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 213
Figure 213: The Future of the Internet and How to Stop It, Zittrain. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 214
Figure 214: The Book Thief, Zusak. For 14 LLMs, (left) heatmaps for the sliding-window procedure and (right) corresponding distributions over suffix extraction probabilities (τmin = 0.1%).
Figure 215
Figure 215: Extraction coverage for 200 books and 14 LLMs. Computing Definition 2 for the results in Appendix D.3 (50-token prefix + 50-token suffix; top-k decoding with T = 1, k = 40; τ = τmin = 0.1%). As elsewhere, we divide results into our three categories by color: LLMs that we know with certainty were trained on Books3 (blue), LLMs that we can confidently conclude were trained on (at least parts of) some books…
Figure 216
Figure 216: Extraction coverage by prefix length. Plotting extraction coverage for τ = τmin = 0.1% (Equation 10) for the 5 books we test with varying prefix lengths and LLAMA 3.1 70B. Four of these books are in Books3 and thus in LLAMA’s training data. The remaining book, Great Big Beautiful Life, is used in our negative controls on non-training data; extraction coverage is 0% regardless of prefix length. As we incre…
Figure 217
Figure 217: Varying prefix for the LLAMA 3.1 70B baseline. We run the sliding-window procedure for various prefix lengths for 4 books from Books3: Beloved [186], Harry Potter and the Sorcerer’s Stone [223], This Is How You Lose Her [82], and We Were Eight Years in Power [49]. LLAMA models are known to have been trained on Books3. We run probabilistic extraction on LLAMA 3.1 70B with top-k (T = 1, k = 40) decoding fo…
Figure 218
Figure 218: Varying prefix for the LLAMA 3.1 8B baseline. We run the sliding-window procedure for various prefix lengths for 4 books from Books3: Beloved [186], Harry Potter and the Sorcerer’s Stone [223], This Is How You Lose Her [82], and We Were Eight Years in Power [49]. LLAMA models are known to have been trained on Books3. We run probabilistic extraction on LLAMA 3.1 70B with top-k (T = 1, k = 40) decoding for…
Figure 219
Figure 219: Varying prefix for the LLAMA 2 13B baseline. We run the sliding-window procedure for various prefix lengths for 4 books from Books3: Beloved [186], Harry Potter and the Sorcerer’s Stone [223], This Is How You Lose Her [82], and We Were Eight Years in Power [49]. LLAMA models are known to have been trained on Books3. We run probabilistic extraction on LLAMA 3.1 70B with top-k (T = 1, k = 40) decoding for …
Figure 220
Figure 220: PHI 4 negative controls. Sliding-window procedure for various prefix lengths. We show results for 4 books from Books3: Beloved [186], Harry Potter and the Sorcerer’s Stone [223], This Is How You Lose Her [82], and We Were Eight Years in Power [49]. PHI 4 was deliberately not trained on whole copyrighted books. We run probabilistic extraction on LLAMA 3.1 70B with top-k (T = 1, k = 40) decoding for 50-toke…
Figure 221
Figure 221: Post-training cutoff negative controls. We measure the probability of generating target 50-token suffixes for 50-token prefixes (using top-k, as elsewhere) for 4 books published in 2025: Careless People [285], Great Big Beautiful Life [127], The Emperor of Gladness [275], and The Society of Unknowable Objects [32]. All of these books were published after the release dates for all models we evaluate. Thes…
Figure 222
Figure 222: Varied-prefix, post-training-cutoff negative controls. We show results for Great Big Beautiful Life [127], which was published in spring 2025, after the release of all of the models we test. (The newest model is from the QWEN 2.5 series, which was released in late September 2024.) This book is not in any of the models’ training data. We run probabilistic extraction on (a) LLAMA 3.1 70B, (b) LLAMA 3.1 8B, …
Figure 223
Figure 223: Portion of the diff from the first chapter of Harry Potter and the Sorcerer’s Stone. Crossed-out red text is in the ground-truth book in Books3, but was not generated by our recovery procedure; text highlighted in yellow/green is in the recovered text but not the ground-truth text. Selecting the top sequence candidate with beam search always led to this particular outcome in …
Figure 224
Figure 224: Screenshot from HuggingFace. Downloads of LLAMA 3.1 70B, May 2025.

… ground-truth text (Section 7 & Appendix F). This type of experiment differs from the main type of results that we showcase in this paper, which use probabilistic discoverable extraction to quantify memorization (Appendix A). We have made a significant effort throughout to make this point clear. If something remains unclear, please reach o…
Original abstract

Plaintiffs and defendants in copyright lawsuits over generative AI often make sweeping, opposing claims about the extent to which large language models (LLMs) memorize protected expression from books in their training data. We show that these polarized positions dramatically oversimplify the relationship between memorization and copyright. To do so, we develop a technique to measure memorization of books, which we apply to 200 books and 14 open-weight LLMs. Through over 3000 experiments, we show that memorization varies both by model and book. With respect to our specific extraction methodology, we find that most LLMs do not memorize most books -- either in whole or in part; however, there are notable exceptions. For instance, Llama 3.1 70B entirely memorizes some books, like Harry Potter and the Sorcerer's Stone; memorization is so extensive that one can deterministically extract the whole book almost verbatim using the book's first few words as an initial prompt. We discuss why our results have significant implications for copyright cases, though not ones that unambiguously favor either side.
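
The figure captions on this page quote the settings behind this measurement: a 50-token prefix, a 50-token target suffix, top-k decoding with T = 1 and k = 40, and a minimum extraction probability τmin = 0.1%. As a rough illustration of what such a suffix probability involves, the Python sketch below scores one target suffix under top-k sampling, given per-step logits. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `topk_suffix_log_prob` and `is_extractable` and the toy logits are hypothetical, and a real run would take each step's logits from a model forward pass on the prefix plus the suffix tokens generated so far.

```python
import math
from typing import Sequence


def topk_suffix_log_prob(
    step_logits: Sequence[Sequence[float]],
    suffix_ids: Sequence[int],
    k: int = 40,
    temperature: float = 1.0,
) -> float:
    """Log-probability that top-k sampling emits exactly `suffix_ids`.

    step_logits[t] holds the (hypothetical) model logits over the vocabulary
    at suffix position t, conditioned on the prefix and suffix_ids[:t].
    """
    if len(step_logits) != len(suffix_ids):
        raise ValueError("need one logits vector per suffix token")
    total = 0.0
    for logits, target in zip(step_logits, suffix_ids):
        # Top-k sampling keeps only the k highest-scoring tokens.
        top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
        if target not in top:
            # A token outside the top k can never be sampled.
            return float("-inf")
        scaled = [logits[i] / temperature for i in top]
        # Numerically stable log-sum-exp over the kept tokens.
        m = max(scaled)
        log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
        total += logits[target] / temperature - log_z
    return total


def is_extractable(log_prob: float, tau_min: float = 0.001) -> bool:
    """Compare a suffix probability against the threshold (0.1% by default)."""
    return log_prob >= math.log(tau_min)


if __name__ == "__main__":
    # Toy 3-token vocabulary and a 2-token suffix, with k = 2.
    toy_logits = [[0.0, 0.0, -5.0], [2.0, 0.0, 0.0]]
    lp = topk_suffix_log_prob(toy_logits, [0, 0], k=2)
    print(math.exp(lp), is_extractable(lp))
```

Summing log-probabilities rather than multiplying probabilities avoids underflow over a 50-token suffix. Sliding such a check over successive positions in a book is one way to read the heatmaps described in the captions, though the exact window stride and the coverage definition are the paper's and are not reproduced here.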

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper develops an empirical technique to extract and quantify memorization of entire books from open-weight LLMs by using short initial prefixes as prompts. Applied to 200 books across 14 models in over 3000 experiments, it reports that memorization is generally limited, but with clear exceptions such as Llama 3.1 70B, where some books (e.g., Harry Potter and the Sorcerer's Stone) appear to be memorized in full, enabling near-verbatim deterministic extraction of the complete text from the opening words.

Significance. If the extraction results and controls hold, the work supplies concrete, reproducible evidence that memorization of copyrighted books in LLMs is neither ubiquitous nor absent but varies sharply by model and title. This directly informs ongoing copyright litigation by showing that sweeping claims on either side oversimplify the phenomenon, while the scale of the experiments and provision of specific extraction examples add practical value for assessing infringement risk.

major comments (2)
  1. [Methods / Extraction Procedure] The central claim that Llama 3.1 70B 'entirely memorizes' books such as Harry Potter rests on short-prefix prompting producing near-verbatim output. The methods section does not appear to include controls that test whether semantically similar but non-exact prefixes (or paraphrased openings) elicit comparable long-range matches; without such tests it remains possible that the outputs reflect high-probability generation from general training on summaries and references rather than parameter-stored verbatim sequences. This directly affects the load-bearing distinction between memorization and plausible continuation.
  2. [Results / Llama 3.1 70B Experiments] In the results for the Harry Potter case, quantitative metrics beyond qualitative 'near-verbatim' description (e.g., exact token-match rates per chapter, edit distance, or log-probability comparisons against random prefixes) are needed to substantiate the 'deterministically extract the whole book' assertion. The current presentation leaves open the possibility that partial or approximate matches are being interpreted as full-book memorization.
minor comments (2)
  1. [Methods] Clarify the exact length and content of the 'first few words' prefix used across experiments; a table listing prefix lengths per book would improve reproducibility.
  2. [Experimental Setup] The abstract states 'over 3000 experiments' but the main text should explicitly break down how many runs per book-model pair and whether temperature or sampling parameters were fixed.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

Thank you for the opportunity to respond to the referee's report. We have addressed each of the major comments in detail below and plan to incorporate revisions to strengthen the empirical support for our claims.

  1. Referee: [Methods / Extraction Procedure] The central claim that Llama 3.1 70B 'entirely memorizes' books such as Harry Potter rests on short-prefix prompting producing near-verbatim output. The methods section does not appear to include controls that test whether semantically similar but non-exact prefixes (or paraphrased openings) elicit comparable long-range matches; without such tests it remains possible that the outputs reflect high-probability generation from general training on summaries and references rather than parameter-stored verbatim sequences. This directly affects the load-bearing distinction between memorization and plausible continuation.

    Authors: We agree that distinguishing verbatim memorization from high-probability generation based on general knowledge is important. Our extraction method deliberately uses the exact opening token sequence from the source book to test for the presence of that specific sequence in the model parameters. To directly address the concern, we will add experiments in the revised manuscript that compare outputs from the exact prefix against paraphrased openings and semantically similar but non-identical prefixes. Preliminary checks indicate that only the exact prefix produces the long-range verbatim continuation, while paraphrases yield shorter or divergent text; we will report these controls to reinforce the memorization interpretation. revision: yes

  2. Referee: [Results / Llama 3.1 70B Experiments] In the results for the Harry Potter case, quantitative metrics beyond qualitative 'near-verbatim' description (e.g., exact token-match rates per chapter, edit distance, or log-probability comparisons against random prefixes) are needed to substantiate the 'deterministically extract the whole book' assertion. The current presentation leaves open the possibility that partial or approximate matches are being interpreted as full-book memorization.

    Authors: We accept that additional quantitative evidence would make the results more robust. In the revised manuscript we will include per-chapter exact token-match rates, Levenshtein edit distances to the original text, and similarity comparisons against generations from random and unrelated prefixes. These metrics show token overlap exceeding 95% across chapters for the reported Llama 3.1 70B cases when using the book prefix, with substantially lower overlap and higher edit distances under control conditions, thereby supporting the claim of near-deterministic full-book extraction. revision: yes
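
The rebuttal names exact token-match rates and Levenshtein edit distances without saying how they would be computed. One plausible reading is sketched below in Python. The function names and the whitespace tokenization are assumptions made for illustration; the paper's own tokenization and any per-chapter splitting are not specified on this page.

```python
from typing import Sequence


def token_match_rate(reference: Sequence[str], generated: Sequence[str]) -> float:
    """Fraction of reference positions where the generated token is identical.

    Positions are compared index by index, so a single inserted or dropped
    token shifts everything after it; edit distance is the more forgiving metric.
    """
    if not reference:
        return 1.0 if not generated else 0.0
    matches = sum(r == g for r, g in zip(reference, generated))
    return matches / len(reference)


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(
                min(
                    prev[j] + 1,             # delete x
                    cur[j - 1] + 1,          # insert y
                    prev[j - 1] + (x != y),  # substitute, free if equal
                )
            )
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    # Whitespace tokenization is an assumption for this toy comparison.
    ref = "the boy who lived".split()
    gen = "the boy that lived".split()
    print(token_match_rate(ref, gen), levenshtein(ref, gen))
```

Reporting both numbers is useful because they fail differently: position-wise match rate collapses after one dropped token even when the rest of the text is verbatim, while edit distance charges that slip only once.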

Circularity Check

0 steps flagged

Empirical measurement study with no derivation chain or self-referential reduction

This paper reports an empirical measurement of memorization in open-weight LLMs by applying a prefix-based extraction technique across 200 books and 14 models in over 3000 experiments. The central findings, such as near-verbatim extraction from Llama 3.1 70B for certain books like Harry Potter, are grounded directly in observed model outputs rather than any mathematical derivation, fitted parameters renamed as predictions, or load-bearing self-citations. No equations, uniqueness theorems, or ansatzes are invoked that reduce the results to the inputs by construction; the methodology is presented as a direct procedure whose validity rests on external model behavior, making the work self-contained with no circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central results rest on empirical observation rather than theoretical axioms or new postulated entities. No free parameters are introduced in the abstract; the work assumes standard definitions of memorization via extractability and that the tested models were trained on the books in question.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Improving LLM Unlearning Robustness via Random Perturbations

    cs.CL 2025-01 unverdicted novelty 7.0

    LLM unlearning is reframed as inadvertently installing backdoor triggers on forget-tokens; Random Noise Augmentation is introduced as a defense that improves robustness with theoretical guarantees.

  2. A Human-Centric Framework for Data Attribution in Large Language Models

    cs.CY 2026-02 unverdicted novelty 6.0

    Introduces a parameter-driven framework for data attribution in LLMs that enables negotiation among creators, users, and intermediaries to meet stakeholder goals within the data economy.

  3. Cheap Expertise: Mapping and Challenging Industry Perspectives in the Expert Data Gig Economy

    cs.CY 2026-05 unverdicted novelty 5.0

    AI data firms view human expertise as an extractable, low-cost resource to feed AI systems while treating institutional expertise as something needing liberation or reform to fit this model.

Reference graph

Works this paper leans on

300 extracted references · 300 canonical work pages · cited by 3 Pith papers · 11 internal anchors

  1. [1]

    Phi-4 Technical Report

    Marah Abdin, Jyoti Aneja, Harkirat Behl, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Michael Harrison, Russell J. Hewett, Mojan Javaheripi, Piero Kauffmann, James R. Lee, Yin Tat Lee, Yuanzhi Li, Weishung Liu, Caio C. T. Mendes, Anh Nguyen, Eric Price, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Xin Wang, Rachel Ward, Yue Wu, Dingli Yu,...

  2. [2]

    Things Fall Apart

    Chinua Achebe. Things Fall Apart. Anchor Books; Random House, Inc., 1958

  3. [3]

    The Hitchhiker’s Guide to the Galaxy - Omnibus

    Douglas Adams. The Hitchhiker’s Guide to the Galaxy - Omnibus. Pan Books; Pan Macmillan; William Heinemann Ltd, 1979

  4. [4]

    Americanah

    Chimamanda Ngozi Adichie. Americanah. Alfred A. Knopf; Alfred A. Knopf Canada, 2013

  5. [5]

    The Baghdad Clock

    Shahad Al Rawi. The Baghdad Clock. Oneworld Publications; Dar al-Hikma, 2016

  6. [6]

    amongglue/books3-subset-raw

    amongglue/books3-subset-raw, 2025. URL https://huggingface.co/datasets/amongglue/books3-subset-raw

  7. [7]

    Introducing 100K Context Windows

    Anthropic. Introducing 100K Context Windows, May 2023. URL https://www.anthropic.com/index/100k-context-windows/

  8. [8]

    Industrial Magic

    Kelley Armstrong. Industrial Magic. Bantam Dell, 2004

  9. [9]

    Fantastic Voyage

    Isaac Asimov. Fantastic Voyage. Bantam Dell; Random House, Inc.; Bantam Books, 1966

  10. [10]

    The Complete Robot

    Isaac Asimov. The Complete Robot. Dream Letters Corporation; Dream Letters Corp., 1982

  11. [11]

    The Handmaid’s Tale

    Margaret Atwood. The Handmaid’s Tale. Houghton Mifflin Harcourt Publishing Company, 1985

  12. [12]

    Pride and Prejudice

    Jane Austen. Pride and Prejudice. Penguin Books Ltd, 1813

  13. [13]

    Authors Guild v. Google, Inc.

    Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015)

  14. [14]

    The Christmas Train

    David Baldacci. The Christmas Train. Warner Books, Inc.; Hachette Book Group, 2001

  15. [15]

    Notes of a Native Son

    James Baldwin. Notes of a Native Son. Beacon Press, 1955

  16. [16]

    Another Country

    James Baldwin. Another Country. Michael Joseph; Penguin Books, 1962

  17. [17]

    The Lemon Table

    Julian Barnes. The Lemon Table. Vintage International, 2004

  18. [18]

    Dante and the Origins of Italian Literary Culture

    Teodolinda Barolini. Dante and the Origins of Italian Literary Culture. Fordham University Press, 2006

  19. [19]

    The Parthenon

    Mary Beard. The Parthenon. Profile Books Ltd, 2002

  20. [20]

    Guam: Past and Present

    Charles Beardsley. Guam: Past and Present. Charles E. Tuttle Company, Inc., 1964

  21. [21]

    Waiting for Godot

    Samuel Beckett. Waiting for Godot. Grove Press; Grove/Atlantic, Inc., 1952

  22. [22]

    The Lonely Soldier

    Helen Benedict. The Lonely Soldier. Beacon Press, 2009

  23. [23]

    Simple Cakes

    Mary Berry. Simple Cakes. BBC Books; BBC Worldwide Ltd, 2006

  24. [24]

    Pythia: A suite for analyzing large language models across training and scaling

    Stella Biderman, Hailey Schoelkopf, Quentin Gregory Anthony, Herbie Bradley, Kyle O’Brien, Eric Hallahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Prashanth, Edward Raff, et al. Pythia: A suite for analyzing large language models across training and scaling. In International Conference on Machine Learning, pages 2397–2430. PMLR, 2023

  25. [25]

    La mecanique statique et l’irreversibilite

    Emile Borel. La mecanique statique et l’irreversibilite. J. Phys. Theor. Appl., 3(1):189–196, 1913. doi: 10.1051/jphystap:019130030018900. URL https://doi.org/10.1051/jphystap:019130030018900

  26. [26]

    Paradise Valley

    C.J. Box. Paradise Valley. Head of Zeus Ltd, 2017

  27. [27]

    The Cat’s Pajamas

    Ray Bradbury. The Cat’s Pajamas. William Morrow; HarperCollins e-books, 2004

  28. [28]

    London in Chains

    Gillian Bradshaw. London in Chains. Severn House Publishers Ltd, 2009

  29. [29]

    My Einstein

    John Brockman. My Einstein. Pantheon Books; Random House of Canada Limited, 2006

  30. [30]

    The Da Vinci Code

    Dan Brown. The Da Vinci Code. Transworld Publishers; Bantam Press; Corgi, 2003

  31. [31]

    The Society of Unknowable Objects

    Gareth Brown. The Society of Unknowable Objects. HarperCollins, 2025

  32. [32]

    Live and Learn

    Niobia Bryant. Live and Learn. Dafina Books; Kensington Publishing Corp., 2007

  33. [33]

    Knowing Your Value

    Mika Brzezinski. Knowing Your Value. Weinstein Books, 2011

  34. [34]

    The Myth of Sisyphus

    Albert Camus. The Myth of Sisyphus. Penguin Classics; Penguin Books Ltd, 1942

  35. [35]

    CANBERT/pile_books3_text

    CANBERT/pile_books3_text, 2025. URL https://huggingface.co/datasets/CANBERT/pile_books3_text

  36. [36]

    What my privacy papers (don’t) have to say about copyright and generative AI

    Nicholas Carlini. What my privacy papers (don’t) have to say about copyright and generative AI, 2025. URL https://nicholas.carlini.com/writing/2025/privacy-copyright-and-generative-models.html

  37. [37]

    Extracting training data from large language models

    Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, et al. Extracting training data from large language models. In 30th USENIX Security Symposium (USENIX Security 21), pages 2633–2650, 2021

  38. [38]

    Membership Inference Attacks From First Principles

    Nicholas Carlini, Steve Chien, Milad Nasr, Shuang Song, Andreas Terzis, and Florian Tramer. Membership Inference Attacks From First Principles, 2022. URL https://arxiv.org/abs/2112.03570

  39. [39]

    Extracting Training Data from Diffusion Models

    Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, and Eric Wallace. Extracting Training Data from Diffusion Models, 2023

  40. [40]

    Quantifying Memorization Across Neural Language Models

    Nicholas Carlini, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramèr, and Chiyuan Zhang. Quantifying Memorization Across Neural Language Models. In International Conference on Learning Representations, 2023

  41. [41]

    Alice’s Adventures in Wonderland and Through the Looking-Glass and What Alice Found There

    Lewis Carroll. Alice’s Adventures in Wonderland and Through the Looking-Glass and What Alice Found There. Penguin Classics; Penguin Books Ltd, 1871

  42. [42]

    The Infinity Link

    Jeffrey A. Carver. The Infinity Link. Open Road Integrated Media, Inc., 1984

  43. [43]

    Generative AI’s Illusory Case for Fair Use

    Jacqueline Charlesworth. Generative AI’s Illusory Case for Fair Use. Vanderbilt Journal of Entertainment and Technology Law, 27, 2025

  44. [44]

    Chat GPT Is Eating the World

    Chat GPT Is Eating the World, 2024. URL https://chatgptiseatingtheworld.com

  45. [45]

    Murder on the Orient Express

    Agatha Christie. Murder on the Orient Express. HarperCollins e-books, 1933

  46. [46]

    And Then There Were None

    Agatha Christie. And Then There Were None. HarperCollins e-books, 1939

  47. [47]

    The Beautiful Struggle

    Ta-Nehisi Coates. The Beautiful Struggle. Spiegel & Grau; Random House, Inc., 2008

  48. [48]

    We Were Eight Years in Power

    Ta-Nehisi Coates. We Were Eight Years in Power. One World; Random House; Penguin Random House LLC, 2017

  49. [49]

    The Water Dancer

    Ta-Nehisi Coates. The Water Dancer. One World; Random House; Penguin Random House LLC, 2019

  50. [50]

    The Infernal Machine and Other Plays

    Jean Cocteau. The Infernal Machine and Other Plays. New Directions; New Directions Publishing Corporation, 1963

  51. [51]

    HarperCollins Publishers, 1988

    Paulo Coelho.The Alchemist. HarperCollins Publishers, 1988

  52. [52]

    Open Court Publishing Company, 2012

    Jon Cogburn and Mark Silcox.Dungeons and Dragons and Philosophy. Open Court Publishing Company, 2012

  53. [53]

    Yale University Press, 2013

    Annie Cohen-Solal.Mark Rothko. Yale University Press, 2013

  54. [54]

    Capacity and Trainability in Recurrent Neural Networks

    Jasmine Collins, Jascha Sohl-Dickstein, and David Sussillo. Capacity and Trainability in Recurrent Neural Networks, 2017. URLhttps://arxiv.org/abs/1611.09913

  55. [55]

    Scholastic Children’s Books; Scholastic Ltd, 2008

    Suzanne Collins.The Hunger Games. Scholastic Children’s Books; Scholastic Ltd, 2008

  56. [56]

    Concord Music Group, Inc. v. Anthropic PBC. 3:23-cv-01092 (M.D. Tenn.)

  57. [57]

    Night Shade Books, 1988

    Glen Cook.The Dragon Never Sleeps. Night Shade Books, 1988. 20

  58. [58]

    Feder Cooper and James Grimmelmann

    A. Feder Cooper and James Grimmelmann. The Files are in the Computer: Copyright, Memorization, and Generative AI.arXiv preprint arXiv:2404.12590, 2024

  59. [59]

    Feder Cooper, Katherine Lee, James Grimmelmann, Daphne Ippolito, Christo- pher Callison-Burch, Christopher A

    A. Feder Cooper, Katherine Lee, James Grimmelmann, Daphne Ippolito, Christopher Callison- Burch, Christopher A. Choquette-Choo, Niloofar Mireshghallah, Miles Brundage, David Mimno, Madiha Zahrah Choksi, Jack M. Balkin, Nicholas Carlini, Christopher De Sa, Jonathan Frankle, Deep Ganguli, Bryant Gipson, Andres Guadamuz, Swee Leng Harris, Abigail Z. Jacobs, ...

  60. [60]

    Feder Cooper, Christopher A

    A. Feder Cooper, Christopher A. Choquette-Choo, Miranda Bogen, Matthew Jagielski, Katja Filippova, Ken Ziyu Liu, Alexandra Chouldechova, Jamie Hayes, Yangsibo Huang, Niloofar Mireshghallah, Ilia Shumailov, Eleni Triantafillou, Peter Kairouz, Nicole Mitchell, Percy Liang, Daniel E. Ho, Yejin Choi, Sanmi Koyejo, Fernando Delgado, James Grimmelmann, Vitaly S...

  61. [61]

    A. Feder Cooper, Mark A. Lemley, Christopher De Sa, Lea Duesterwald, Allison Casasola, Jamie Hayes, Katherine Lee, Daniel E. Ho, and Percy Liang. Estimating near-verbatim extraction risk in language models with decoding-constrained beam search. arXiv preprint arXiv:2603.24917, 2026

  62. [62]

    Copyright Law of the United States. 17 U.S. Code § 503 - Remedies for infringement: Impounding and disposition of infringing articles, December 2010. URL https://www.law.cornell.edu/uscode/text/17/503

  63. [63]

    CoStar Grp., Inc. v. LoopNet, Inc., 373 F.3d 544 (4th Cir. 2004)

  64. [64]

    Stephen R. Covey. The 7 Habits of Highly Effective People. RosettaBooks LLC, 1989

  65. [65]

    Tyler Cowen. Toothpick Producers Violate NYT Copyright. Marginal Revolution, December 2023. URL https://marginalrevolution.com/marginalrevolution/2023/12/toothpick-producers-violate-nyt-copyright.html

  66. [66]

    David Crabb. Bad Kid. HarperCollins e-books; HarperCollins Publishers Inc., 2015

  67. [67]

    Robert Crais. Lullaby Town. Bantam Books; Random House, Inc., 1992

  68. [68]

    Michael Crichton. Jurassic Park. Ballantine Books; The Random House Publishing Group, 1990

  69. [69]

    Michael Cunningham. The Hours. Picador; Farrar, Straus and Giroux, 1998

  70. [70]

    Amy B. Cyphert. Generative AI, Plagiarism, and Copyright Infringement in Legal Documents. Minnesota Journal of Law, Science & Technology, 25, 2024

  71. [71]

    Józef Czapski. Inhuman Land. New York Review Books; The New York Review of Books, 1949

  72. [72]

    Roald Dahl. James and the Giant Peach. Puffin Books, 1961

  73. [73]

    Roald Dahl. Charlie and the Chocolate Factory. Puffin Books; Penguin Books Ltd, 1964

  74. [74]

    DeepSeek-AI et al. DeepSeek LLM: Scaling Open-Source Language Models with Longtermism, 2024. URL https://arxiv.org/abs/2401.02954

  75. [75]

    Nicholas Diakopoulos. Automating the News. Harvard University Press, 2019

  76. [76]

    Joan Didion. The White Album. Zola Books; Simon & Schuster, 1979

  77. [77]

    Cory Doctorow. Down and Out in the Magic Kingdom. Tor; Tom Doherty Associates, LLC, 2003

  78. [78]

    Carol Ann Duffy. The World’s Wife. Picador; Pan Macmillan, 1999

  79. [79]

    Junot Díaz. Drown. Riverhead Books; The Berkley Publishing Group, 1995

  80. [80]

    Junot Díaz. The Brief Wondrous Life of Oscar Wao. Riverhead Books; Penguin Group (USA) Inc., 2007

Showing first 80 references.