Laion-5b: An open large-scale dataset for training next generation image-text models. Advances in neural information processing systems, 35:25278–25294

Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. · 2022

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

When LoRA Betrays: Backdooring Text-to-Image Models by Masquerading as Benign Adapters

cs.CV · 2026-02-25 · conditional · novelty 8.0

MasqLoRA shows that an independent LoRA adapter can be trained on a few trigger-target pairs to backdoor diffusion models with 99.8% success rate while remaining stealthy when the trigger is absent.

SteelDefectX: A Multi-Form Vision-Language Dataset and Benchmark for Steel Surface Defect Analysis

cs.CV · 2026-03-23 · unverdicted · novelty 7.0

SteelDefectX is a new multi-form vision-language dataset and benchmark for analyzing steel surface defects using 7,778 images across 25 categories.

One-to-More: High-Fidelity Training-Free Anomaly Generation with Attention Control

cs.CV · 2026-03-18 · unverdicted · novelty 6.0

O2MAG generates high-fidelity text-guided anomalies from a single image without training by manipulating self-attention in diffusion models with anomaly masks and dual enhancements.

SigLino: Efficient Multi-Teacher Distillation for Agglomerative Vision Foundation Models

cs.CV · 2025-12-23 · conditional · novelty 6.0

SigLino distills SigLIP2 and DINOv3 into efficient vision models via asymmetric relation-knowledge distillation, token-balanced batching, and hierarchical data sampling on a new 200M-image corpus, yielding better transfer to grounding VLMs than training from scratch.

RADSeg: Unleashing Parameter and Compute Efficient Zero-Shot Open-Vocabulary Segmentation Using Agglomerative Models

cs.CV · 2025-11-24 · unverdicted · novelty 6.0

RADSeg adapts the RADIO model with targeted enhancements to deliver 6-30% higher mIoU in zero-shot OVSS while using 2.5x fewer parameters and running 3.95x faster than prior large-model combinations.

Adapting Foundation Vision-Language Models to Medical Diagnosis via Query-Driven Expert Bridging

cs.CV · 2025-05-27 · unverdicted · novelty 6.0

MedBridge adapts pretrained VLMs to multi-label medical diagnosis via query tokens for non-destructive alignment and expert routing, reporting 6-15% AUC gains on chest radiograph benchmarks across eight models.

citing papers explorer

Showing 6 of 6 citing papers.

When LoRA Betrays: Backdooring Text-to-Image Models by Masquerading as Benign Adapters · cs.CV · 2026-02-25 · conditional · none · ref 34
SteelDefectX: A Multi-Form Vision-Language Dataset and Benchmark for Steel Surface Defect Analysis · cs.CV · 2026-03-23 · unverdicted · none · ref 31
One-to-More: High-Fidelity Training-Free Anomaly Generation with Attention Control · cs.CV · 2026-03-18 · unverdicted · none · ref 33
SigLino: Efficient Multi-Teacher Distillation for Agglomerative Vision Foundation Models · cs.CV · 2025-12-23 · conditional · none · ref 28
RADSeg: Unleashing Parameter and Compute Efficient Zero-Shot Open-Vocabulary Segmentation Using Agglomerative Models · cs.CV · 2025-11-24 · unverdicted · none · ref 32
Adapting Foundation Vision-Language Models to Medical Diagnosis via Query-Driven Expert Bridging · cs.CV · 2025-05-27 · unverdicted · none · ref 26
