On Path to Multimodal Generalist: General-Level and General-Bench

On Path to Multimodal Generalist: General-Level and General-Bench · 2025 · arXiv 2505.04620

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

AVI-Bench: Toward Human-like Audio-Visual Intelligence of Omni-MLLMs

cs.CV · 2026-06-01 · unverdicted · novelty 7.0

AVI-Bench is a cognitively inspired benchmark that evaluates Omni-MLLMs on joint audio-visual tasks and reveals substantial limitations in current models.

Circle-RoPE: Cone-like Decoupled Rotary Positional Embedding for Large Vision-Language Models

cs.CV · 2025-05-22 · unverdicted · novelty 6.0

Circle-RoPE achieves cross-modal positional disentanglement in VLMs by mapping 2D image tokens to a cone-like annulus orthogonal to the text axis, with PTD=0 eliminating RoPE geometric bias while preserving intra-image structure via alternating geometry encoding.

citing papers explorer

AVI-Bench: Toward Human-like Audio-Visual Intelligence of Omni-MLLMs

cs.CV · 2026-06-01 · unverdicted · none · ref 63

AVI-Bench is a cognitively inspired benchmark that evaluates Omni-MLLMs on joint audio-visual tasks and reveals substantial limitations in current models.
