Title resolution pending

Chen, L · 2024

4 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

dataset: 2

citation-polarity summary

use dataset: 2

representative citing papers

Can Multimodal Large Language Models Truly Understand Small Objects?

cs.CV · 2026-04-24 · unverdicted · novelty 7.0

Current MLLMs show weak performance on small object understanding tasks, but fine-tuning with the new SOU-Train dataset measurably improves their capabilities.

Hybrid Latent Reasoning with Decoupled Policy Optimization

cs.CV · 2026-04-22 · unverdicted · novelty 7.0

HyLaR with DePO enables effective RL in hybrid discrete-continuous spaces for multimodal models, outperforming prior MLLMs on perception and understanding benchmarks.

When Sinks Help or Hurt: Unified Framework for Attention Sink in Large Vision-Language Models

cs.CV · 2026-04-01 · unverdicted · novelty 7.0

Attention sinks in LVLMs create a global-vs-local trade-off that a layer-wise gating module can balance to improve multimodal benchmark performance.

OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence

cs.CL · 2026-04-08 · unverdicted · novelty 5.0

OpenSpatial supplies a principled open-source data engine and 3-million-sample dataset that raises spatial-reasoning model performance by an average of 19 percent on benchmarks.

citing papers explorer

Can Multimodal Large Language Models Truly Understand Small Objects?

cs.CV · 2026-04-24 · unverdicted · none · ref 3

Hybrid Latent Reasoning with Decoupled Policy Optimization

cs.CV · 2026-04-22 · unverdicted · none · ref 4

When Sinks Help or Hurt: Unified Framework for Attention Sink in Large Vision-Language Models

cs.CV · 2026-04-01 · unverdicted · none · ref 4

OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence

cs.CL · 2026-04-08 · unverdicted · none · ref 10
