SurgViVQA adds temporal video encoding to surgical VideoQA and reports 9-11% gains in keyword accuracy over image-only baselines on two datasets plus improved robustness to question rephrasing.
Endocrine Reviews44(5), 947–959 (2023)
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2025 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
SurgViVQA: Temporally-Grounded Video Question Answering for Surgical Scene Understanding
SurgViVQA adds temporal video encoding to surgical VideoQA and reports 9-11% gains in keyword accuracy over image-only baselines on two datasets plus improved robustness to question rephrasing.