Audio Visual Scene-Aware Dialog (AVSD) Challenge at DSTC7

Abhishek Das; Anoop Cherian; Chiori Hori; Devi Parikh; Dhruv Batra; Huda Alamri; Irfan Essa; Jue Wang; Raphael Gontijo Lopes; Tim K. Marks

arxiv: 1806.00525 · v1 · pith:UUNH245Wnew · submitted 2018-06-01 · 💻 cs.CL · cs.CV


Huda Alamri, Vincent Cartillier, Raphael Gontijo Lopes, Abhishek Das, Jue Wang, Irfan Essa, Dhruv Batra, Devi Parikh, Anoop Cherian, Tim K. Marks, Chiori Hori


classification 💻 cs.CL cs.CV

keywords dialog challenge systems visual audio avsd dstc7 scene-aware


Scene-aware dialog systems will be able to have conversations with users about the objects and events around them. Progress on such systems can be made by integrating state-of-the-art technologies from multiple research areas, including end-to-end dialog systems, visual dialog, and video description. We introduce the Audio Visual Scene-Aware Dialog (AVSD) challenge and dataset. In this challenge, which is one track of the 7th Dialog System Technology Challenges (DSTC7) workshop, the task is to build a system that generates responses in a dialog about an input video.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Music Audio-Visual Question Answering Requires Specialized Multimodal Designs
cs.SD 2025-05 unverdicted novelty 3.0

Survey of Music AVQA finds specialized input processing, dedicated spatial-temporal designs, and music-specific modeling are critical for strong performance.