
arXiv: 1605.01600 · v4 · pith:ZQWQNCM2new · submitted 2016-05-05 · 💻 cs.CV · cs.HC · cs.MM

AVEC 2016 - Depression, Mood, and Emotion Recognition Workshop and Challenge

classification 💻 cs.CV · cs.HC · cs.MM
keywords emotion, depression, challenge, audio, processing, recognition, approaches, avec

The Audio/Visual Emotion Challenge and Workshop (AVEC 2016) "Depression, Mood and Emotion" will be the sixth competition event aimed at comparison of multimedia processing and machine learning methods for automatic audio, visual and physiological depression and emotion analysis, with all participants competing under strictly the same conditions. The goal of the Challenge is to provide a common benchmark test set for multi-modal information processing and to bring together the depression and emotion recognition communities, as well as the audio, video and physiological processing communities, to compare the relative merits of the various approaches to depression and emotion recognition under well-defined and strictly comparable conditions and establish to what extent fusion of the approaches is possible and beneficial. This paper presents the challenge guidelines, the common data used, and the performance of the baseline system on the two tasks.



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Bag-of-Audio-Words based on Autoencoder Codebook for Continuous Emotion Prediction

    eess.AS 2019-07 unverdicted novelty 6.0

    Autoencoder-based codebook for Bag-of-Audio-Words raises CCC for arousal from 0.225 to 0.322 and valence from 0.244 to 0.368 on AVEC 2017 audio data versus standard BoW.

  2. Emotion Recognition Using Fusion of Audio and Video Features

    cs.LG 2019-06 unverdicted novelty 4.0

    Feature-level or decision-level fusion of CNN video features and audio descriptors via SVR achieves CCC 0.749 (arousal) and 0.565 (valence) on RECOLA after preprocessing and post-processing.
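Both summaries above report results as CCC, which is taken here to mean the concordance correlation coefficient, the usual score for continuous arousal and valence prediction. As a minimal sketch, assuming population (1/N) statistics and a hypothetical helper name `ccc` that does not come from either paper, the metric can be computed as follows:

```python
def ccc(pred, gold):
    """Concordance correlation coefficient between two equal-length sequences.

    Illustrative sketch only: the function name and the choice of
    population (1/N) variance and covariance are assumptions.
    """
    if len(pred) != len(gold) or len(pred) == 0:
        raise ValueError("pred and gold must be non-empty and of equal length")

    n = len(pred)
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n

    # Population variances and covariance.
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n

    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both sequences are the same constant")

    # Unlike Pearson correlation, the squared mean difference in the
    # denominator penalises a constant offset between prediction and label.
    return 2 * cov / denom


if __name__ == "__main__":
    print(ccc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # perfect agreement
    print(ccc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))  # same trend, shifted by 1
```

A prediction that follows the gold trace exactly but sits at a constant offset has a Pearson correlation of 1 yet a CCC below 1, which is why the two scores are not interchangeable when comparing systems.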