CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016

Bowen Zhang; Dahua Lin; Hang Song; Limin Wang; Luc Van Gool; Wei Li; Xiaoou Tang; Yuanjun Xiong; Yu Qiao; Zhe Wang

arxiv: 1608.00797 · v1 · pith:GC42R6IPnew · submitted 2016-08-02 · 💻 cs.CV


Yuanjun Xiong, Limin Wang, Zhe Wang, Bowen Zhang, Hang Song, Wei Li, Dahua Lin, Yu Qiao, Luc Van Gool, Xiaoou Tang



keywords: challenge, activitynet, classification, deep, submission, techniques, accuracy, additionally



This paper presents the method that underlies our submission to the untrimmed video classification task of the ActivityNet Challenge 2016. We follow the basic pipeline of temporal segment networks and further raise the performance via a number of other techniques. Specifically, we use the latest deep model architectures, e.g., ResNet and Inception V3, and introduce new aggregation schemes (top-k and attention-weighted pooling). Additionally, we incorporate audio as a complementary channel, extracting relevant information via a CNN applied to the spectrograms. With these techniques, we derive an ensemble of deep models which, together, attains a high classification accuracy (mAP $93.23\%$) on the testing set and secures first place in the challenge.
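The two aggregation schemes named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption: top-k pooling is taken to mean averaging, per class, the k highest snippet-level scores, and attention-weighted pooling is taken to mean a softmax-weighted sum of snippet scores. The function names `top_k_pool` and `attention_pool`, the toy scores, and the source of the attention logits are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def top_k_pool(snippet_scores: Sequence[Sequence[float]], k: int) -> List[float]:
    """Assumed form of top-k pooling: per class, average the k highest snippet scores."""
    num_classes = len(snippet_scores[0])
    k = max(1, min(k, len(snippet_scores)))  # clamp k to the number of snippets
    pooled = []
    for c in range(num_classes):
        column = sorted((s[c] for s in snippet_scores), reverse=True)
        pooled.append(sum(column[:k]) / k)
    return pooled


def attention_pool(
    snippet_scores: Sequence[Sequence[float]],
    attention_logits: Sequence[float],
) -> List[float]:
    """Assumed form of attention pooling: softmax the per-snippet logits,
    then take the weighted sum of snippet scores for each class."""
    m = max(attention_logits)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in attention_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    num_classes = len(snippet_scores[0])
    return [
        sum(w * s[c] for w, s in zip(weights, snippet_scores))
        for c in range(num_classes)
    ]


# Toy example: 4 snippets from one untrimmed video, 2 classes (made-up numbers).
scores = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
video_topk = top_k_pool(scores, k=2)
# Uniform logits reduce attention pooling to plain average pooling.
video_attn = attention_pool(scores, [0.0, 0.0, 0.0, 0.0])
```

In an untrimmed video only some snippets show the activity, which is one plausible motivation for both schemes: top-k pooling ignores the lowest-scoring snippets for each class, while attention weights, which in practice would presumably be predicted by a learned layer, let the model down-weight irrelevant snippets.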



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Survey on Deep Learning Techniques for Action Anticipation
cs.CV 2023-09 unverdicted novelty 2.0

A literature survey reviewing deep learning approaches to action anticipation in everyday scenarios, with method classifications, dataset and metric summaries, and future directions.