SurgViVQA adds temporal video encoding to surgical VideoQA and reports 9-11% gains in keyword accuracy over image-only baselines on two datasets plus improved robustness to question rephrasing.
In: Proceedings of the 30th ACM International Conference on Multimedia, pp
1 Pith paper cites this work.
Fields: cs.CV (1)
Years: 2025 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers:
SurgViVQA: Temporally-Grounded Video Question Answering for Surgical Scene Understanding