
arxiv: 1907.11510 · v1 · pith:H5T7KLFTnew · submitted 2019-07-10 · 💻 cs.HC · cs.CV · cs.IR · cs.LG · stat.ML

AVEC 2019 Workshop and Challenge: State-of-Mind, Detecting Depression with AI, and Cross-Cultural Affect Recognition

Pith reviewed 2026-05-24 23:37 UTC · model grok-4.3

classification 💻 cs.HC · cs.CV · cs.IR · cs.LG · stat.ML
keywords audiovisual emotion recognition · depression detection · cross-cultural affect · multimodal processing · challenge benchmark · state-of-mind recognition

The pith

The AVEC 2019 challenge supplies benchmark datasets for state-of-mind recognition, depression assessment, and cross-cultural affect sensing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper describes the ninth Audio/Visual Emotion Challenge, which supplies shared test sets and rules so that different teams can compare multimodal methods for health and emotion analysis on identical real-life data. It defines three tasks and reports baseline results to let health researchers, emotion specialists, and audiovisual engineers measure the relative strengths of their approaches. The setup is intended to accelerate progress by removing differences in data access and evaluation protocols.

Core claim

The challenge provides common benchmark test sets for multimodal information processing on real-life data, together with guidelines, the data itself, and baseline system performances, so that participants can compare the relative merits of approaches to state-of-mind recognition, depression assessment with AI, and cross-cultural affect sensing under the same conditions.

What carries the argument

Three challenge tasks built on real-life audiovisual datasets, each supplied with task definitions and baseline systems for automatic health and emotion analysis.

If this is right

  • Teams can measure whether new multimodal pipelines improve on the supplied baselines for depression detection accuracy.
  • Cross-cultural results can be used to test whether affect models transfer across populations.
  • The shared evaluation protocol removes variance from data splits and metrics when comparing health-focused and emotion-focused methods.
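
A minimal sketch of what comparing against a baseline "under the same conditions" can look like: one scoring function applied to one fixed set of gold labels, for both the baseline and a new system. Lin's concordance correlation coefficient is used as the metric here. That choice, the function names `concordance_cc` and `compare_to_baseline`, and the toy numbers are assumptions for illustration; the summary above does not name the metric or give any scores.

```python
from statistics import fmean


def concordance_cc(gold, pred):
    """Lin's concordance correlation coefficient with population (1/n) moments."""
    if len(gold) != len(pred) or len(gold) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_g, mean_p = fmean(gold), fmean(pred)
    var_g = fmean([(g - mean_g) ** 2 for g in gold])
    var_p = fmean([(p - mean_p) ** 2 for p in pred])
    cov = fmean([(g - mean_g) * (p - mean_p) for g, p in zip(gold, pred)])
    denom = var_g + var_p + (mean_g - mean_p) ** 2
    # Both sequences constant and equal: the ratio is undefined, report 0.0.
    return 0.0 if denom == 0 else 2 * cov / denom


def compare_to_baseline(gold, baseline_pred, candidate_pred):
    """Score both systems on the same gold labels with the same metric."""
    base = concordance_cc(gold, baseline_pred)
    cand = concordance_cc(gold, candidate_pred)
    return {"baseline": base, "candidate": cand, "improves": cand > base}


if __name__ == "__main__":
    # Hypothetical labels and predictions, invented for this example only.
    gold = [4.0, 9.0, 12.0, 3.0, 15.0]
    baseline = [6.0, 8.0, 9.0, 6.0, 10.0]
    candidate = [5.0, 9.0, 11.0, 4.0, 13.0]
    print(compare_to_baseline(gold, baseline, candidate))
```

Unlike plain correlation, this metric also penalizes a constant offset or a scale mismatch between predictions and labels, so a system cannot score well by only tracking the trend.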

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • High-performing entries on these tasks could later be tested in clinical settings for screening utility.
  • Systematic differences in cross-cultural performance might point to the need for culture-specific training data.
  • The challenge format could be repeated for additional mental-health indicators beyond depression.

Load-bearing premise

The provided real-life audiovisual datasets and task definitions are sufficiently representative and unbiased to serve as a meaningful common benchmark for developing generalizable methods for depression assessment and cross-cultural affect recognition.

What would settle it

Independent replication on newly collected real-life audiovisual recordings showing that the reported baseline performances cannot be reproduced or that top challenge entries fail to generalize would falsify the claim that the supplied test sets constitute a useful common benchmark.

Original abstract

The Audio/Visual Emotion Challenge and Workshop (AVEC 2019) "State-of-Mind, Detecting Depression with AI, and Cross-cultural Affect Recognition" is the ninth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the health and emotion recognition communities, as well as the audiovisual processing communities, to compare the relative merits of various approaches to health and emotion recognition from real-life data. This paper presents the major novelties introduced this year, the challenge guidelines, the data used, and the performance of the baseline systems on the three proposed tasks: state-of-mind recognition, depression assessment with AI, and cross-cultural affect sensing, respectively.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The manuscript announces the AVEC 2019 Workshop and Challenge, which introduces three tasks—state-of-mind recognition, depression assessment with AI, and cross-cultural affect recognition—on real-life audiovisual data. It describes the challenge guidelines, the datasets, major novelties for this edition, and reports the performance of provided baseline systems under identical conditions to enable community comparison of multimodal methods.

Significance. If the described baselines and task definitions hold, the paper supplies a shared benchmark that can standardize evaluation across the affective computing, health AI, and audiovisual processing communities. The explicit focus on real-life data and cross-task participation is a constructive contribution to reproducible comparison in multimodal affect and depression analysis.

minor comments (1)
  1. The abstract and introduction refer to 'the performance of the baseline systems' without an explicit cross-reference to the section or table that tabulates the numerical results for each of the three tasks; adding such a pointer would improve navigability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive review and recommendation to accept the manuscript.

Circularity Check

0 steps flagged

No significant circularity identified


The paper is a purely descriptive workshop and challenge announcement. It outlines tasks, data partitions, baseline systems, and evaluation protocols, and advances no derivation, theorem, prediction, or empirical claim whose validity depends on an internal reduction to fitted parameters or self-citation. Its central claim is a statement of intent to supply a shared benchmark, and it contains no equations, ansatzes, uniqueness theorems, or renamed empirical patterns that could be shown equivalent to the paper's own inputs by construction. Consequently, no load-bearing step matches any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical derivations, free parameters, axioms, or invented entities are present; the document is an organizational description of a benchmark challenge.


