MMBench: Is your multi-modal model an all-around player? In European Conference on Computer Vision, pages 216–233

Yuan Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, et al. · 2024

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

GeoMMBench and GeoMMAgent: Toward Expert-Level Multimodal Intelligence in Geoscience and Remote Sensing

cs.CV · 2026-04-10 · unverdicted · novelty 7.0

GeoMMBench reveals deficiencies in current multimodal LLMs for geoscience tasks while GeoMMAgent demonstrates that tool-integrated agents achieve significantly higher performance.

Omni-NegCLIP: Enhancing CLIP with Front-Layer Contrastive Fine-Tuning for Comprehensive Negation Understanding

cs.CV · 2026-03-31 · unverdicted · novelty 7.0

Omni-NegCLIP improves CLIP's negation understanding by up to 52.65% on presence-based and 12.50% on absence-based tasks through front-layer fine-tuning with specialized contrastive losses.

Learning to Focus and Precise Cropping: A Reinforcement Learning Framework with Information Gaps and Grounding Loss for MLLMs

cs.CV · 2026-03-29 · unverdicted · novelty 5.0

A two-stage RL method with information gaps and grounding loss trains MLLMs to focus on and precisely crop relevant image regions, yielding SOTA results on high-resolution VQA benchmarks.

citing papers explorer

Showing 3 of 3 citing papers.

GeoMMBench and GeoMMAgent: Toward Expert-Level Multimodal Intelligence in Geoscience and Remote Sensing
cs.CV · 2026-04-10 · unverdicted · none · ref 28

Omni-NegCLIP: Enhancing CLIP with Front-Layer Contrastive Fine-Tuning for Comprehensive Negation Understanding
cs.CV · 2026-03-31 · unverdicted · none · ref 21

Learning to Focus and Precise Cropping: A Reinforcement Learning Framework with Information Gaps and Grounding Loss for MLLMs
cs.CV · 2026-03-29 · unverdicted · none · ref 23
