Perception test: A diagnostic benchmark for multimodal models. arXiv preprint arXiv:2405.17348, 2024
2 Pith papers cite this work. Polarity classification is still indexing.
Citation-role and citation-polarity summary
Years: 2026 (2)
Roles: background (1)
Polarities: background (1)
Representative citing papers
The arrow of time exhibits nonanalytic behavior at the critical point of measurement-induced phase transitions, with an identified critical exponent, in an exactly solved model of random quantum circuits with non-projective measurements.
citing papers explorer
- Omni-DuplexEval: Evaluating Real-time Duplex Omni-modal Interaction
  Omni-DuplexEval creates a new benchmark and LLM-as-a-Judge framework for real-time duplex omni-modal interaction, revealing that current models score below 40% overall and struggle especially with proactive responses.
- Arrow of Time as an indicator of Measurement-Induced Phase Transitions
  The arrow of time exhibits nonanalytic behavior at the critical point of measurement-induced phase transitions, with an identified critical exponent, in an exactly solved model of random quantum circuits with non-projective measurements.