Multimodal datasets: misogyny, pornography, and malignant stereotypes
Mixed citation behavior. Most common role is background (40%).
citing papers explorer
- How to Stop Playing Whack-a-Mole: Mapping the Ecosystem of Technologies Facilitating AI-Generated Non-Consensual Intimate Images
  The paper introduces the first comprehensive taxonomy and visualization of 11 categories of technologies facilitating AI-generated non-consensual intimate images, derived from synthesis of primary sources and demonstrated through case studies.
- Collective Recourse for Generative Urban Visualizations
  Collective recourse formalizes community reports to fix group harms in diffusion models for urban visualizations via a report-triage-fix-verify pipeline, four primitives, a mandate score, and synthetic evaluation of 240 reports.
- Imagen Video: High Definition Video Generation with Diffusion Models
  Imagen Video generates high-definition text-conditional videos via a cascade of base and super-resolution diffusion models, achieving high fidelity and controllability.
- DreamFusion: Text-to-3D using 2D Diffusion
  Optimizes a Neural Radiance Field via probability density distillation from a 2D diffusion model to produce text-conditioned 3D scenes viewable from any angle.
- Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
  Imagen achieves state-of-the-art photorealistic text-to-image generation by scaling a text-only pretrained T5 language model within a diffusion framework, reaching FID 7.27 on COCO without training on it.
- TextTeacher: What Can Language Teach About Images?
  TextTeacher uses frozen text embeddings from captions as semantic anchors to guide vision model training, improving ImageNet accuracy by up to 2.7 p.p. and transfer performance by 1.0 p.p. on average.
- A Common Pool of Privacy Problems: Legal and Technical Lessons from a Large-Scale Web-Scraped Machine Learning Dataset
  An empirical audit of one web-scraped ML training dataset reveals persistent PII after sanitization, which the authors combine with legal analysis to highlight privacy risks and advocate redefining 'publicly available' data for AI training.
- SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
  SalUn uses gradient-based weight saliency to achieve effective machine unlearning of data, classes, or concepts in image classification and generation, narrowing the gap to exact retraining.
- MagicVideo: Efficient Video Generation With Latent Diffusion Models
  MagicVideo generates 256x256 text-conditioned video clips via latent diffusion with a custom 3D U-Net, achieving roughly 64 times lower compute than prior video diffusion models.
- BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
  BLOOM is a 176B-parameter open-access multilingual language model trained on the ROOTS corpus that achieves competitive performance on benchmarks, with improved results after multitask prompted finetuning.
- Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
  Scaling an autoregressive Transformer to 20B parameters for text-to-image generation using image token sequences achieves new SOTA zero-shot FID of 7.23 and fine-tuned FID of 3.22 on MS-COCO.
- GPT-NeoX-20B: An Open-Source Autoregressive Language Model
  GPT-NeoX-20B is a publicly released 20B parameter autoregressive language model trained on the Pile that shows strong gains in five-shot reasoning over similarly sized prior models.
- Dynamic Eraser for Guided Concept Erasure in Diffusion Models
  DSS is a lightweight inference-time framework that erases concepts in diffusion models at 91% average rate while preserving image fidelity, outperforming prior methods.
- Quantifying Geospatial in the Common Crawl Corpus
  Analysis estimates 18.7% of Common Crawl documents contain geospatial information like coordinates and addresses, with little difference by language.
- Mapping the Stochastic Penal Colony
  Content moderation operates as a stochastic penal colony that banishes users through the constant threat of account suspension, shown via auto-ethnographic case studies of Twitter, OpenAI DALL-E 2, and Pinterest.