Reproducible scaling laws for contrastive language-image learning
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
Memorization in Stable Diffusion is driven by the structural duplication of the CLIP <eot> embedding inside <pad> tokens, which causes over-reliance on that vector; simple inference-time masking or token replacement suppresses it without quality loss.
citing papers explorer
- Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models
CoM-PT trains vision foundation models in ascending size order using inverse knowledge transfer, allowing larger models to achieve superior performance with significantly reduced overall computational cost compared to individual training.
- Memorization In Stable Diffusion Is Unexpectedly Driven by CLIP Embeddings
Memorization in Stable Diffusion is driven by the structural duplication of the CLIP <eot> embedding inside <pad> tokens, which causes over-reliance on that vector; simple inference-time masking or token replacement suppresses it without quality loss.