Chronusomni: Improving time awareness of omni large language models. arXiv preprint arXiv:2512.09841, 2025.
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
CogniRoute adds a cognitive schema and route-aware RL to an omni-modal MoE, reaching 59.38% accuracy on a new 118K-example social video QA benchmark and beating prior baselines by 15-27 points.
citing papers explorer
- AVOC: Enhancing Hour-Level Audio-Video Understanding in Omni-Modal LLMs via Retrieval-Inspired Token Compression
  AVOC is a retrieval-inspired token compression framework that improves long-form audio-video understanding in multimodal LLMs by selecting informative tokens based on classical IR principles.
- CogniRoute: Learning to Route Social Evidence in Omni-Modal Models
  CogniRoute adds a cognitive schema and route-aware RL to an omni-modal MoE, reaching 59.38% accuracy on a new 118K-example social video QA benchmark and beating prior baselines by 15-27 points.