F³Set: Towards analyzing fast, frequent, and fine-grained events from videos

Zhaoyu Liu, Kan Jiang, Murong Ma, Zhe Hou, Yun Lin, Jin Song Dong, “F³Set: Towards analyzing fast, frequent, fine-grained events from videos,” arXiv preprint arXiv:2504 · 2025 · arXiv 2504.08222

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

RefereeBench: Are Video MLLMs Ready to be Multi-Sport Referees

cs.CV · 2026-04-17 · unverdicted · novelty 8.0

RefereeBench shows that even the strongest video MLLMs reach only around 60% accuracy on multi-sport refereeing tasks and struggle with rule application and temporal grounding.

TennisTV: Do Multimodal Large Language Models Understand Tennis Rallies?

cs.CV · 2025-09-19 · unverdicted · novelty 7.0

Introduces TennisTV benchmark for evaluating 17 MLLMs on tennis video understanding from stroke-level to rally-level tasks with automated pipelines and human verification.

citing papers explorer

Showing 2 of 2 citing papers.

RefereeBench: Are Video MLLMs Ready to be Multi-Sport Referees

cs.CV · 2026-04-17 · unverdicted · none · ref 31

RefereeBench shows that even the strongest video MLLMs reach only around 60% accuracy on multi-sport refereeing tasks and struggle with rule application and temporal grounding.

TennisTV: Do Multimodal Large Language Models Understand Tennis Rallies?

cs.CV · 2025-09-19 · unverdicted · none · ref 11

Introduces TennisTV benchmark for evaluating 17 MLLMs on tennis video understanding from stroke-level to rally-level tasks with automated pipelines and human verification.
