2 Pith papers cite this work.
citing papers explorer
- DeltaPrompts: Escaping the Zero-Delta Trap in Multimodal Distillation
DeltaPrompts generates 200k synthetic high-divergence reasoning prompts to escape zero-delta saturation in multimodal distillation, yielding up to 15% relative gains on chart, document, and perception benchmarks across multiple settings.
- CogVLM2: Visual Language Models for Image and Video Understanding
CogVLM2 family achieves state-of-the-art results on image and video understanding benchmarks through improved visual expert architecture, higher resolution inputs, and automated temporal grounding for videos.