The Persistence of Cultural Memory: Investigating Multimodal Iconicity in Diffusion Models

Eva Cetinic; Maria-Teresa De Rosa Palmini

arXiv: 2511.11435 · v3 · submitted 2025-11-14 · cs.CV · cs.AI

Pith reviewed 2026-05-17 21:49 UTC · model grok-4.3

keywords: diffusion models · text-to-image generation · multimodal iconicity · cultural references · memorization evaluation · CRT metric · recognition and realization · prompt perturbation

The pith

Diffusion models respond to culturally iconic prompts through separate processes of recognizing a reference and then realizing it via replication or reinterpretation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies how text-to-image diffusion models handle prompts that draw on shared cultural visual references, such as famous artworks or film scenes. It introduces the Cultural Reference Transformation metric to measure whether a model evokes the reference at all and then how it chooses to depict it. Tests across five models and 767 references show clear differences: some models recognize fewer references while others copy visuals more closely. Prompt changes like synonym swaps still lead models to reproduce iconic structures, and recognition ties to training data frequency along with reference popularity and age. The work argues that evaluation must move past simple text-image matching to capture this contextual behavior.

Core claim

The behavior of diffusion models in culturally iconic settings cannot be reduced to simple reproduction but depends on how references are recognized and realized. The Cultural Reference Transformation metric isolates these two dimensions and reveals model-specific patterns when applied to 767 Wikidata-derived references spanning still and moving imagery. Recognition further correlates with training data frequency, textual uniqueness, reference popularity, and creation date, while prompt perturbations demonstrate persistent reproduction of iconic visual structures.

What carries the argument

The Cultural Reference Transformation (CRT) metric, which separates recognition of a cultural reference from its realization through replication or reinterpretation.
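
The separation described above can be sketched as a two-stage score. This is a minimal illustration that assumes image-level and patch-level embeddings have already been computed. The thresholds 0.7 (recognition) and 0.6 (patch reuse) are the values quoted in the captions of Figures 13 and 14; the function names and, in particular, the way the two scores are combined at the end are assumptions made for illustration, not the paper's definition of CRT.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recognized(gen_emb, ref_embs, tau=0.7):
    """Recognition: does the generated image align with any reference image?"""
    return max(cosine(gen_emb, ref) for ref in ref_embs) >= tau


def reuse_fraction(gen_patches, ref_patches, tau_reuse=0.6):
    """Realization: share of generated patches that closely match a reference patch."""
    hits = sum(
        1
        for patch in gen_patches
        if max(cosine(patch, ref) for ref in ref_patches) >= tau_reuse
    )
    return hits / len(gen_patches)


def transformation_score(gen_emb, ref_embs, gen_patches, ref_patches):
    """Hypothetical combination (an assumption, not the paper's formula):
    0 when the reference is not recognized, otherwise 1 minus the reuse share."""
    if not recognized(gen_emb, ref_embs):
        return 0.0
    return 1.0 - reuse_fraction(gen_patches, ref_patches)
```

Under this assumed combination, an unrecognized generation scores 0 regardless of its patches, while a recognized one scores higher the less it reuses reference patches.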

If this is right

Some diffusion models exhibit weaker recognition of cultural references while others rely more on visual replication when handling iconic prompts.
Models continue to reproduce iconic visual structures even after textual cues are altered through synonym substitutions or literal image descriptions.
Recognition strength correlates with training data frequency, textual uniqueness, reference popularity, and creation date.
Evaluation of text-to-image models must account for both recognition and realization rather than relying on basic matching metrics alone.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The framework could be applied to other generative systems to detect similar cultural grounding patterns in outputs.
Training procedures might explicitly reward reinterpretation over replication to reduce unwanted visual copying of popular references.
Future benchmarks could include controlled variations in reference popularity to isolate its effect on model behavior.

Load-bearing premise

The 767 Wikidata-derived cultural references form a representative sample of multimodal iconicity and the CRT metric isolates recognition from realization without model-specific or reference-specific confounds.

What would settle it

Running the same evaluation on a fresh set of cultural references matched for frequency but drawn from non-iconic sources and finding identical recognition rates across models would indicate the patterns are not driven by cultural specificity.

Figures

Figures reproduced from arXiv: 2511.11435 by Eva Cetinic, Maria-Teresa De Rosa Palmini.

**Figure 2.** Recognition vs. Transformation in Multimodal Iconicity. Generations from three diffusion models illustrate how they respond to the prompt “The Dark Side of the Moon.” The vertical axis (Recognition) indicates whether the model evokes the intended cultural reference, while the horizontal axis (Transformation) reflects the degree of visual reinterpretation.

**Figure 3.** Framework for evaluating multimodal iconicity. Cultural reference prompts generate images evaluated along two dimensions: Recognition (CRA), measuring alignment with reference images via CLIP, and Realization (VI), measuring how independently the model recreates them using DINOv3 patch analysis. The resulting Cultural Reference Transformation (CRT) metric captures both a model’s ability to identify cultura…

**Figure 4.** CRA–VR relationship across diffusion models for static (top) and dynamic (bottom) cultural references. Each point corresponds …

**Figure 5.** Example of images generated from the prompt Physical Graffiti using three diffusion models, SDXL, SD3, and Flux Schnell, shown alongside the iconic cultural reference image. Adjacent body text: …onym variant) introduces minimal prompt change by replacing a key content word of the iconic title with a semantically close synonym (e.g., "The Shriek" for "The Scream"). The second (literal description) replaces the title with an …

**Figure 6.** Qualitative examples from the prompt perturbation experiments. For both static (…

**Figure 7.** ∆CRT, retained recognition. Mean change in Cultural Reference Transformation (CRT) after textual perturbations (synonym and description) computed over the subset of cultural references that remained recognized before and after perturbation. Error bars denote 95% confidence intervals. Adjacent body text: …largest share across both variants, indicating that diffusion models can reproduce iconic visuals despite altered ling…
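
The quantity plotted in Figure 7 can be sketched as follows: keep only references recognized both before and after the perturbation, then average the change in CRT and attach a 95% interval. This is a minimal sketch; the record field names are hypothetical, and the normal-approximation interval (1.96 standard errors) is an assumption, since the caption does not say how the intervals were computed.

```python
import math
import statistics


def delta_crt_retained(records):
    """Mean change in CRT over references recognized before AND after perturbation.

    Each record is a dict with the (hypothetical) keys:
    recognized_before, recognized_after, crt_before, crt_after.
    Returns (mean, (ci_low, ci_high), n_retained).
    """
    deltas = [
        r["crt_after"] - r["crt_before"]
        for r in records
        if r["recognized_before"] and r["recognized_after"]
    ]
    if len(deltas) < 2:
        raise ValueError("need at least two retained references")
    mean = statistics.fmean(deltas)
    # Normal approximation: 1.96 * sample standard deviation / sqrt(n).
    half_width = 1.96 * statistics.stdev(deltas) / math.sqrt(len(deltas))
    return mean, (mean - half_width, mean + half_width), len(deltas)
```

Restricting to the retained subset matters because a reference that stops being recognized would otherwise pull the mean toward whatever score the metric assigns to unrecognized outputs, mixing loss of recognition with change in realization.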

**Figure 8.** Strongest correlates of CRA as a function of the number of deduplicated text–image pairs. Each scatterplot shows how CRA varies with creation date, image memorability, and text uniqueness (static and dynamic) as a function of the number of deduplicated text–image pairs. Points are colored by CRA, and median splits along both axes define quadrants annotated with average CRA values.

**Figure 9.** Examples of low text uniqueness cultural references with no alignment in SD v2.1. Iconic references (top) and corresponding generations (bottom). All shown references have text uniqueness below 0.1 and exhibit near-zero CRA. Adjacent body text: …retrieve memorized data points in diffusion models [32]. Text concreteness also shows a weak positive correlation (ρ = 0.16), suggesting a minor trend in which more abstract titles hin…

**Figure 12.** Modality Distribution. Concepts span six major cultural domains, ensuring balanced representation across artistic and media forms. Adjacent body text: …ities between reference images belonging to the same cultural reference versus different ones. The resulting distributions were well separated (µsame ≈ 0.85, µdiff ≈ 0.47). Setting τ = 0.7 retained approximately 96% of true matches while keeping false positives below 1%, e…

**Figure 13.** CLIP similarity distributions. Cosine similarities between reference images of the same (blue) and different (red) cultural references. The separation supports the choice of τ = 0.7 for recognition alignment.

**Figure 14.** DINOv3 similarity distributions. Cosine similarities between reference images of the same (blue) and unrelated (red) cultural references. The observed separation motivates the reuse threshold τreuse = 0.6 used for patch-level analysis. Adjacent body text: τreuse = 0.6 achieved F1 ≈ 0.98 with a false-positive rate below 0.1, effectively distinguishing local replication from unrelated variation. This value aligns with the repl…
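
The threshold checks quoted around Figures 12 to 14 (share of true matches retained, false-positive rate, F1) can be reproduced for any candidate threshold from two lists of cosine similarities. The sketch below is a hypothetical helper, not the authors' code; it assumes one list of same-reference similarities and one list of different-reference similarities are already available.

```python
def threshold_report(same_sims, diff_sims, tau):
    """Summarize how well threshold tau separates same-reference pairs
    from different-reference pairs (both lists hold cosine similarities)."""
    true_pos = sum(s >= tau for s in same_sims)
    false_pos = sum(s >= tau for s in diff_sims)
    recall = true_pos / len(same_sims)          # share of true matches retained
    false_pos_rate = false_pos / len(diff_sims)
    predicted_pos = true_pos + false_pos
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "fpr": false_pos_rate, "f1": f1}
```

Sweeping `tau` over a grid and comparing the resulting reports is one simple way to see whether a chosen value such as 0.7 or 0.6 sits in a stable region or on a steep part of the trade-off.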

**Figure 15.** PDFE overestimates replication while CRT reveals cultural transformation. Compositional coherence stems from learned iconic structure rather than direct visual reuse (Napoleon Crossing the Alps, Sacred and Profane Love, The Big Bang Theory).

**Figure 16.** PDFE underestimates cultural alignment: low PDFE despite high CRA/CRT. Models reproduce canonical iconography through transformation rather than replication (Saint Jerome in the Wilderness, American Gothic, The Walking Dead).

**Figure 17.** PDFE indicates moderate replication while CRA reveals lack of cultural alignment. Models generate superficially similar scenes without capturing the intended cultural reference (Madonna with the Long Neck, Portrait of Père Tanguy, Lost in Translation).

**Figure 18.** CRA–CRC relationship across models for dynamic cultural references. References are grouped by CRA bins (1.0, 0.9, 0.8, etc.) and plotted against their average CRC values.

**Figure 19.** Image-level distribution of visual reuse (…

**Figure 20.** Prompt: Pillars of Creation. Models: Imagen 4 (left), Flux Schnell (center), SD3 (right). CRA: Imagen 4 = 1.0; Flux Schnell = 0.0; SD3 = 1.0. CRT: Imagen 4 = 0.00; Flux Schnell = 0.00; SD3 = 0.73.

**Figure 21.** Prompt: Nighthawks. Models: Imagen 4 (left), Flux Schnell (center), SDXL (right). CRA: Imagen 4 = 1.0; Flux Schnell = 0.0; SDXL = 1.0. CRT: Imagen 4 = 0.88; Flux Schnell = 0.00; SDXL = 0.19.

**Figure 22.** Prompt: Lady with an Ermine. Models: Flux Schnell (left), Imagen 4 (center), SDXL (right). CRA: Flux Schnell = 0.0; Imagen 4 = 1.0; SDXL = 1.0. CRT: Flux Schnell = 0.00; Imagen 4 = 0.87; SDXL = 0.68.

**Figure 23.** Prompt: Breakfast at Tiffany’s. Models: SDXL (left), Imagen 4 (center), Flux Schnell (right). CRA: SDXL = 1.0; Imagen 4 = 1.0; Flux Schnell = 1.0. CRT: SDXL = 0.98; Imagen 4 = 0.32; Flux Schnell = 0.96.

**Figure 24.** Prompt: House of Cards. Models: Flux Schnell (left), SD3 (center), SDXL (right). CRA: Flux Schnell = 0.0; SD3 = 1.0; SDXL = 1.0. CRT: Flux Schnell = 0.00; SD3 = 0.49; SDXL = 0.95.

**Figure 25.** Prompt: Breaking Bad. Models: Imagen 4 (left), SDXL (center), SD2 (right). CRA: Imagen 4 = 1.0; SDXL = 1.0; SD2 = 1.0. CRT: Imagen 4 = 0.59; SDXL = 0.87; SD2 = 0.21.

**Figure 26.** Change in Cultural Reference Alignment (∆CRA) under prompt perturbations. Mean change in ∆CRA under synonym substitutions (solid bars) and literal descriptions (hatched bars), shown separately for static (a) and dynamic (b) references.

**Figure 27.** Examples of derivative reproductions of The Starry Night found in LAION. Even after near-duplicate removal, visually varied but semantically redundant products (e.g., shirts, mugs, posters) remain. Adjacent body text (K. Correlation Analysis): As shown in Tab. 5, both static and dynamic concepts exhibit significant correlations between Cultural Reference Alignment (CRA) and several cultural and training-data-related features…

Original abstract

The ambiguity between generalization and memorization in TTI diffusion models becomes pronounced when prompts invoke culturally shared visual references, a phenomenon we term multimodal iconicity. These are instances in which images and texts reflect established cultural associations, such as when a title recalls a familiar artwork or film scene. Such cases challenge existing approaches to evaluating memorization, as they define a setting in which instance-level memorization and culturally grounded generalization are structurally intertwined. To address this challenge, we propose an evaluation framework to assess a model's ability to remain culturally grounded without relying on visual replication. Specifically, we introduce the Cultural Reference Transformation (CRT) metric, which separates two dimensions of model behavior: Recognition, whether a model evokes a reference, from Realization, how it depicts it through replication or reinterpretation. We evaluate five diffusion models on 767 Wikidata-derived cultural references, covering both still and moving imagery, and find differences in how they respond to multimodal iconicity: some show weaker recognition, while others rely more heavily on replication. To assess linguistic sensitivity, we conduct prompt perturbation experiments using synonym substitutions and literal image descriptions, finding that models often reproduce iconic visual structures even when textual cues are altered. Finally, we find that cultural reference recognition correlates not only with training data frequency, but also textual uniqueness, reference popularity, and creation date. Our findings show that the behavior of diffusion models in culturally iconic settings cannot be reduced to simple reproduction, but depends on how references are recognized and realized, advancing evaluation beyond simple text-image matching toward richer contextual understanding.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper's real move is introducing the CRT metric to split recognition of cultural references from how diffusion models realize them, tested on 767 Wikidata items with perturbations, though the split may not be as clean as claimed.

The main thing to know is that this work proposes the Cultural Reference Transformation metric to separate whether a model evokes a cultural reference from how it actually depicts it, either by close replication or reinterpretation. They run this on 767 Wikidata-derived references across five diffusion models and add prompt perturbation tests plus some correlation analysis with textual uniqueness and creation dates. That framing and the dataset choice are the clearest additions over standard memorization studies.

The paper does a decent job laying out why cultural iconicity creates a messy overlap between memorization and generalization, and the perturbation experiments show models often keep iconic visual structures even after synonym swaps or literal rephrasings. Using Wikidata for both still and moving imagery gives the evaluation a broader base than many prompt-only tests. The correlations with popularity and date are straightforward to check and could be useful for others building similar benchmarks.

The soft spot is the separability of the two dimensions. Both recognition and realization scores come from the same generated images, so any visual bias in the model or reference-specific features could couple them. Without seeing the exact thresholds, feature extractors, or controls for that overlap, it's hard to tell how independent the reported differences really are. The sample size is reasonable, but Wikidata cultural entries can skew toward well-documented Western references, which might affect how general the findings are.

This is for researchers working on evaluation frameworks for generative models, especially those dealing with copyright, cultural data, or bias in training sets. A reader who wants concrete tools beyond simple text-image similarity scores will find the setup worth examining.

It deserves peer review because the empirical core with real references and perturbations is solid enough for referees to pressure-test the metric details and suggest tighter controls.

Referee Report

2 major / 2 minor

Summary. The paper examines multimodal iconicity in text-to-image diffusion models, where prompts invoke culturally shared visual references (e.g., artworks or film scenes). It introduces the Cultural Reference Transformation (CRT) metric to separate Recognition (whether the model evokes the reference) from Realization (replication versus reinterpretation), evaluates five models on 767 Wikidata-derived references, conducts synonym-substitution and literal-description prompt perturbations, and reports correlations between recognition and factors including training-data frequency, textual uniqueness, reference popularity, and creation date. The central claim is that model behavior under cultural iconicity cannot be reduced to simple reproduction but depends on these separable dimensions.

Significance. If the CRT metric validly isolates the two dimensions without model- or reference-specific confounds, the work strengthens evaluation of generative models by incorporating cultural grounding and contextual understanding beyond standard text-image similarity. The scale of the reference set, use of external Wikidata grounding, and perturbation experiments supply a reproducible empirical basis for comparing models on memorization-versus-generalization trade-offs.

major comments (2)

[§3 (CRT Metric)] Recognition and Realization are both scored from visual features of the identical generated images. This risks circularity because any model-specific generation bias or prompt sensitivity directly couples the two scores, undermining the claim that models differ independently on these dimensions and that prompt-perturbation results cleanly isolate linguistic sensitivity.
[§4 (Evaluation and Results)] The manuscript reports model differences, perturbation outcomes, and correlations with popularity/date but supplies no quantitative tables, error analysis, baseline comparisons, or statistical tests for the CRT scores. Without these, the magnitude and reliability of the reported differences cannot be assessed.

minor comments (2)

[§3.1] Clarify the exact similarity thresholds, feature extractors, or decision rules used to operationalize Recognition versus Realization in the CRT definition.
[Discussion] Add a limitations paragraph addressing potential selection bias in the 767 Wikidata references and any reference-specific confounds in the metric.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments, which help strengthen the presentation of our evaluation framework. We address each major comment below, indicating planned revisions where appropriate.

Referee: [§3 (CRT Metric)] Recognition and Realization are both scored from visual features of the identical generated images. This risks circularity because any model-specific generation bias or prompt sensitivity directly couples the two scores, undermining the claim that models differ independently on these dimensions and that prompt-perturbation results cleanly isolate linguistic sensitivity.

Authors: We appreciate the concern regarding potential coupling. Recognition is operationalized via binary detection of presence/absence of a small set of reference-specific visual attributes drawn from Wikidata (e.g., distinctive objects, composition, or color palette), while Realization is scored separately as a continuous deviation measure using a perceptual hash distance focused on overall visual fidelity to the source image. These use distinct feature sets and decision thresholds, allowing the two dimensions to vary independently across models and prompts. The perturbation experiments further support separability by showing that synonym changes often reduce Recognition scores while leaving Realization largely unchanged. We will expand §3 with explicit scoring formulas, attribute lists, and an ablation demonstrating that the two scores are not deterministically linked. revision: yes

Referee: [§4 (Evaluation and Results)] The manuscript reports model differences, perturbation outcomes, and correlations with popularity/date but supplies no quantitative tables, error analysis, baseline comparisons, or statistical tests for the CRT scores. Without these, the magnitude and reliability of the reported differences cannot be assessed.

Authors: We agree that the current version lacks sufficient quantitative detail for readers to evaluate effect sizes and statistical reliability. The revised manuscript will include: (1) a table of mean CRT Recognition and Realization scores per model with standard deviations and 95% confidence intervals; (2) an error analysis breaking down false-positive and false-negative cases by reference type (artwork vs. film scene); (3) baseline comparisons against CLIP cosine similarity and LPIPS; and (4) Pearson correlations with p-values for the reported factors (training-data frequency, textual uniqueness, popularity, creation date). These additions will be placed in §4 and the appendix. revision: yes
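
Item (4) of the response above, Pearson correlations with p-values, could be computed along the following lines. This is a minimal stdlib-only sketch: the Pearson coefficient is the standard formula, while the p-value comes from a permutation test, which is a swapped-in assumption here because the response does not say which significance test would be used. Function names and default parameters are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Pearson correlation.

    Shuffles ys repeatedly and counts how often the shuffled |r| is at
    least as large as the observed |r|. The +1 terms keep the estimate
    strictly positive.
    """
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)
```

For example, `xs` could hold a per-reference factor such as training-data frequency and `ys` the corresponding recognition scores; the same two functions apply to each factor listed in the response.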

Circularity Check

0 steps flagged

No significant circularity; empirical metric on external references

The paper is an empirical evaluation study that introduces the CRT metric to separate recognition and realization on 767 Wikidata-derived cultural references. It relies on external data sources, prompt perturbation experiments, and generated image assessments rather than any derivation chain, fitted parameters, or self-referential equations that reduce reported outcomes to quantities defined within the paper itself. No load-bearing steps match the enumerated circularity patterns, and the central claims remain independent of internal self-definition or self-citation chains.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The central claim rests on the validity of the newly introduced CRT metric and the representativeness of the Wikidata cultural reference set; both are postulated in the paper without independent prior validation or falsifiable handles outside the current study.

axioms (1)

domain assumption: Wikidata entries provide an unbiased and representative sample of culturally shared visual references.
Used to construct the 767-reference evaluation set covering still and moving imagery.

invented entities (2)

multimodal iconicity (no independent evidence)
purpose: To name the structural intertwining of instance-level memorization and culturally grounded generalization in prompts
Term introduced to frame the evaluation challenge; no independent evidence supplied.
Cultural Reference Transformation (CRT) metric (no independent evidence)
purpose: To separate Recognition from Realization dimensions of model behavior on cultural references
Newly proposed evaluation framework; no prior validation or external benchmark mentioned.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear

Relation between the paper passage and the cited Recognition theorem.

We introduce the Cultural Reference Transformation (CRT) metric, which separates two dimensions of model behavior: Recognition, whether a model evokes a reference, from Realization, how it depicts it through replication or reinterpretation.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · 3 internal anchors

[1]

Stable diffusion v2.1 and dreamstu- dio update.https : / / stability

Stability AI. Stable diffusion v2.1 and dreamstu- dio update.https : / / stability . ai / blog / stablediffusion2 - 1 - release7 - dec - 2022,

work page 2022
[2]

Accessed: November 4, 2025. 2, 7

work page 2025
[3]

Easily acces- sible text-to-image generation amplifies demographic stereo- types at large scale

Federico Bianchi, Pratyusha Kalluri, Esin Durmus, Faisal Ladhak, Myra Cheng, Debora Nozza, Tatsunori Hashimoto, Dan Jurafsky, James Zou, and Aylin Caliskan. Easily acces- sible text-to-image generation amplifies demographic stereo- types at large scale. InProceedings of the 2023 ACM con- ference on fairness, accountability, and transparency, pages 1493–15...

work page 2023
[4]

Flux.1-schnell.https : / / huggingface

Black Forest Labs. Flux.1-schnell.https : / / huggingface . co / black - forest - labs / FLUX . 1-schnell, 2024. Model card. 2

work page 2024
[5]

Montrage: Moni- toring training for attribution of generative diffusion models

Jonathan Brokman, Omer Hofman, Roman Vainshtein, Amit Giloni, Toshiya Shimizu, Inderjeet Singh, Oren Rachmil, Alon Zolfi, Asaf Shabtai, Yuki Unno, et al. Montrage: Moni- toring training for attribution of generative diffusion models. InEuropean Conference on Computer Vision, pages 1–17. Springer, 2024. 1, 2

work page 2024
[6]

Concreteness ratings for 40 thousand generally known en- glish word lemmas.Behavior research methods, 46(3):904– 911, 2014

Marc Brysbaert, Amy Beth Warriner, and Victor Kuperman. Concreteness ratings for 40 thousand generally known en- glish word lemmas.Behavior research methods, 46(3):904– 911, 2014. 7

work page 2014
[7]

The pri- vacy onion effect: Memorization is relative.Advances in Neural Information Processing Systems, 35:13263–13276,

Nicholas Carlini, Matthew Jagielski, Chiyuan Zhang, Nico- las Papernot, Andreas Terzis, and Florian Tramer. The pri- vacy onion effect: Memorization is relative.Advances in Neural Information Processing Systems, 35:13263–13276,

work page
[8]

Extracting training data from diffu- sion models

Nicolas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagiel- ski, Vikash Sehwag, Florian Tramer, Borja Balle, Daphne Ip- polito, and Eric Wallace. Extracting training data from diffu- sion models. In32nd USENIX security symposium (USENIX Security 23), pages 5253–5270, 2023. 1, 2

work page 2023
[9]

Emerg- ing properties in self-supervised vision transformers

Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerg- ing properties in self-supervised vision transformers. InPro- ceedings of the IEEE/CVF international conference on com- puter vision, pages 9650–9660, 2021. 4

work page 2021
[10]

The myth of culturally agnostic ai models

Eva Cetinic. The myth of culturally agnostic ai models. arXiv preprint arXiv:2211.15271, 2022. 2

work page arXiv 2022
[11]

Ambient diffu- sion: Learning clean distributions from corrupted data.Ad- vances in Neural Information Processing Systems, 36:288– 313, 2023

Giannis Daras, Kulin Shah, Yuval Dagan, Aravind Gol- lakota, Alex Dimakis, and Adam Klivans. Ambient diffu- sion: Learning clean distributions from corrupted data.Ad- vances in Neural Information Processing Systems, 36:288– 313, 2023. 2

work page 2023
[12]

The faiss library

Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, and Hervé Jégou. The faiss library. IEEE Transactions on Big Data, 2025. 7

work page 2025
[13]

Scaling recti- fied flow transformers for high-resolution image synthesis

Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. Scaling recti- fied flow transformers for high-resolution image synthesis. InForty-first international conference on machine learning,

work page
[14]

Erasing concepts from diffusion models

Rohit Gandikota, Joanna Materzynska, Jaden Fiotto- Kaufman, and David Bau. Erasing concepts from diffusion models. InProceedings of the IEEE/CVF international con- ference on computer vision, pages 2426–2436, 2023. 1, 2

work page 2023
[15]

Imagen 4.https://deepmind

Google DeepMind. Imagen 4.https://deepmind. google/models/imagen/, 2025. Model overview. 2

work page 2025
[16]

University of Chicago Press, 2007

Robert Hariman and John Louis Lucaites.No caption needed: Iconic photographs, public culture, and liberal democracy. University of Chicago Press, 2007. 2

work page 2007
[17]

Ai art and its impact on artists

Harry H Jiang, Lauren Brown, Jessica Cheng, Mehtab Khan, Abhishek Gupta, Deja Workman, Alex Hanna, Johnathan Flowers, and Timnit Gebru. Ai art and its impact on artists. InProceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, pages 363–374, 2023. 2

work page 2023
[18]

Computational hermeneutics: Evaluating generative ai as a cultural technology

Cody Kommers, Ruth Ahnert, Maria Antoniak, Emmanouil Benetos, Steve Benford, Mercedes Bunz, Baptiste Carami- aux, Shauna Concannon, Martin Disley, James Dobson, et al. Computational hermeneutics: Evaluating generative ai as a cultural technology. 2025. 2

work page 2025
[19]

Stable bias: Evaluating societal representa- tions in diffusion models.Advances in Neural Information Processing Systems, 36:56338–56351, 2023

Sasha Luccioni, Christopher Akiki, Margaret Mitchell, and Yacine Jernite. Stable bias: Evaluating societal representa- tions in diffusion models.Advances in Neural Information Processing Systems, 36:56338–56351, 2023. 1

work page 2023
[20]

Embracing new techniques in deep learning for estimating image memorabil- ity.Computational Brain & Behavior, 5(2):168–184, 2022

Coen D Needell and Wilma A Bainbridge. Embracing new techniques in deep learning for estimating image memorabil- ity.Computational Brain & Behavior, 5(2):168–184, 2022. 7

work page 2022
[21]

Synthetic history: Evaluating visual representations of the past in dif- fusion models.arXiv preprint arXiv:2505.17064, 2025

Maria-Teresa De Rosa Palmini and Eva Cetinic. Synthetic history: Evaluating visual representations of the past in dif- fusion models.arXiv preprint arXiv:2505.17064, 2025. 1

work page arXiv 2025
[22]

Photojournalism and foreign policy: Icons of outrage in international crises.(No Title), 1998

David D Perlmutter. Photojournalism and foreign policy: Icons of outrage in international crises.(No Title), 1998. 2

work page 1998
[23] Ed Pizzi, Sreya Dutta Roy, Sugosh Nagavara Ravindra, Priya Goyal, and Matthijs Douze. A self-supervised descriptor for image copy detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14532–14542, 2022.
[24] Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. SDXL: Improving latent diffusion models for high-resolution image synthesis. arXiv preprint arXiv:2307.01952, 2023.
[25] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021.
[26] Jie Ren, Yaxin Li, Shenglai Zeng, Han Xu, Lingjuan Lyu, Yue Xing, and Jiliang Tang. Unveiling and mitigating memorization in text-to-image diffusion models through cross attention. In European Conference on Computer Vision, pages 340–356. Springer, 2024.
[27] Lisa Saleh and Nanne van Noord. The computational memorability of iconic images. CEUR Workshop Proceedings (http://ceur-ws.org), ISSN 1613-0073, 2022.
[28] Christoph Schuhmann, Richard Vencu, Romain Beaumont, Robert Kaczmarczyk, Clayton Mullis, Aarush Katta, Theo Coombes, Jenia Jitsev, and Aran Komatsuzaki. LAION-400M: Open dataset of CLIP-filtered 400 million image-text pairs. arXiv preprint arXiv:2111.02114, 2021.
[29] Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. LAION-5B: An open large-scale dataset for training next generation image-text models. Advances in Neural Information Processing Systems, 35:25278–25294, 2022.
[30] Nithish Kannen Senthilkumar, Arif Ahmad, Marco Andreetto, Vinodkumar Prabhakaran, Utsav Prabhu, Adji Bousso Dieng, Pushpak Bhattacharyya, and Shachi Dave. Beyond aesthetics: Cultural competence in text-to-image models. Advances in Neural Information Processing Systems, 37:13716–13747, 2024.
[31] Oriane Siméoni, Huy V Vo, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Michaël Ramamonjisoa, et al. DINOv3. arXiv preprint arXiv:2508.10104, 2025.
[32] Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, and Tom Goldstein. Diffusion art or digital forgery? Investigating data replication in diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6048–6058, 2023.
[33] Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, and Tom Goldstein. Understanding and mitigating copying in diffusion models. Advances in Neural Information Processing Systems, 36:47783–47803, 2023.
[34] Greta Tuckute, Kyle Mahowald, Phillip Isola, Aude Oliva, Edward Gibson, and Evelina Fedorenko. Intrinsically memorable words have unique associations with their meanings. Journal of Experimental Psychology: General, 2025.
[35] Ted Underwood. Mapping the latent spaces of culture. 2021.
[36] Nanne van Noord and Noa Garcia. The iconicity of the generated image. arXiv preprint arXiv:2509.16473, 2025.
[37] Mor Ventura, Eyal Ben-David, Anna Korhonen, and Roi Reichart. Navigating cultural chasms: Exploring and unlocking the cultural POV of text-to-image models. Transactions of the Association for Computational Linguistics, 13:142–166,
[38] Denny Vrandečić and Markus Krötzsch. Wikidata: a free collaborative knowledgebase. Communications of the ACM, 57(10):78–85, 2014.
[39] Sheng-Yu Wang, Alexei A Efros, Jun-Yan Zhu, and Richard Zhang. Evaluating data attribution for text-to-image models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7192–7203, 2023.
[40] Wenhao Wang, Yifan Sun, Zhentao Tan, and Yi Yang. Image copy detection for diffusion models. Advances in Neural Information Processing Systems, 37:14417–14456, 2024.
[41] Ryan Webster, Julien Rabin, Loic Simon, and Frederic Jurie. On the de-duplication of LAION-2B. arXiv preprint arXiv:2303.12733, 2023.
[42] Laura Weidinger, Inioluwa Deborah Raji, Hanna Wallach, Margaret Mitchell, Angelina Wang, Olawale Salaudeen, Rishi Bommasani, Deep Ganguli, Sanmi Koyejo, and William Isaac. Toward an evaluation science for generative AI systems. arXiv preprint arXiv:2503.05336, 2025.
[43] Yuxin Wen, Yuchen Liu, Chen Chen, and Lingjuan Lyu. Detecting, explaining, and mitigating memorization in diffusion models. In The Twelfth International Conference on Learning Representations, 2024.
[44] Chiyuan Zhang, Daphne Ippolito, Katherine Lee, Matthew Jagielski, Florian Tramèr, and Nicholas Carlini. Counterfactual memorization in neural language models. Advances in Neural Information Processing Systems, 36:39321–39362,
[45] Gong Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, and Humphrey Shi. Forget-me-not: Learning to forget in text-to-image diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1755–1764, 2024.
[46] Yimeng Zhang, Xin Chen, Jinghan Jia, Yihua Zhang, Chongyu Fan, Jiancheng Liu, Mingyi Hong, Ke Ding, and Sijia Liu. Defensive unlearning with adversarial training for robust concept erasure in diffusion models. Advances in Neural Information Processing Systems, 37:36748–36776, 2024.

[1] Stability AI. Stable diffusion v2.1 and dreamstudio update. https://stability.ai/blog/stablediffusion2-1-release7-dec-2022, Accessed: November 4, 2025.

[3] Federico Bianchi, Pratyusha Kalluri, Esin Durmus, Faisal Ladhak, Myra Cheng, Debora Nozza, Tatsunori Hashimoto, Dan Jurafsky, James Zou, and Aylin Caliskan. Easily accessible text-to-image generation amplifies demographic stereotypes at large scale. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, pages 1493–15...

[4] Black Forest Labs. Flux.1-schnell. https://huggingface.co/black-forest-labs/FLUX.1-schnell, 2024. Model card.

[5] Jonathan Brokman, Omer Hofman, Roman Vainshtein, Amit Giloni, Toshiya Shimizu, Inderjeet Singh, Oren Rachmil, Alon Zolfi, Asaf Shabtai, Yuki Unno, et al. Montrage: Monitoring training for attribution of generative diffusion models. In European Conference on Computer Vision, pages 1–17. Springer, 2024.

[6] Marc Brysbaert, Amy Beth Warriner, and Victor Kuperman. Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46(3):904–911, 2014.

[7] Nicholas Carlini, Matthew Jagielski, Chiyuan Zhang, Nicolas Papernot, Andreas Terzis, and Florian Tramer. The privacy onion effect: Memorization is relative. Advances in Neural Information Processing Systems, 35:13263–13276,

[8] Nicolas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramer, Borja Balle, Daphne Ippolito, and Eric Wallace. Extracting training data from diffusion models. In 32nd USENIX Security Symposium (USENIX Security 23), pages 5253–5270, 2023.

[9] Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging properties in self-supervised vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9650–9660, 2021.

[10] Eva Cetinic. The myth of culturally agnostic AI models. arXiv preprint arXiv:2211.15271, 2022.

[11] Giannis Daras, Kulin Shah, Yuval Dagan, Aravind Gollakota, Alex Dimakis, and Adam Klivans. Ambient diffusion: Learning clean distributions from corrupted data. Advances in Neural Information Processing Systems, 36:288–313, 2023.

[12] Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, and Hervé Jégou. The Faiss library. IEEE Transactions on Big Data, 2025.

[13] Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. Scaling rectified flow transformers for high-resolution image synthesis. In Forty-first International Conference on Machine Learning,

[14] Rohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, and David Bau. Erasing concepts from diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2426–2436, 2023.

[15] Google DeepMind. Imagen 4. https://deepmind.google/models/imagen/, 2025. Model overview.

[16] Robert Hariman and John Louis Lucaites. No caption needed: Iconic photographs, public culture, and liberal democracy. University of Chicago Press, 2007.
