
arxiv: 2604.23753 · v1 · submitted 2026-04-26 · 💻 cs.AI · cs.HC · cs.LG

Modeling Induced Pleasure through Cognitive Appraisal Prediction via Multimodal Fusion

Pith reviewed 2026-05-08 06:18 UTC · model grok-4.3

classification 💻 cs.AI · cs.HC · cs.LG
keywords affective computing, multimodal fusion, cognitive appraisal, pleasure prediction, transformer models, video analysis, fuzzy logic, emotion recognition

The pith

A model predicts video-induced pleasure by inferring cognitive appraisal variables from multimodal features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes a framework that fuses multimodal video data using transformers and attention to predict cognitive appraisal variables, which in turn estimate levels of induced pleasure. It tackles noisy labels, the semantic gap between positive emotions and pleasure, and the scarcity of pleasure-specific datasets by combining data-driven techniques with cognitive appraisal theory and fuzzy modeling for better interpretability. The reported peak accuracy is 0.6624, and the authors point to applications in content recommendation and media design, where understanding emotional elicitation matters. For readers in affective computing, the appeal is the move from black-box predictions to theory-grounded explanations of how visual content shapes feelings.

Core claim

The model integrates transformer architectures for multimodal feature extraction with a cognitive appraisal theory-based fuzzy model to predict appraisal variables from videos, enabling the inference of pleasure levels with improved explainability and a reported peak accuracy of 0.6624.
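To make the appraisal-to-pleasure step concrete, here is a minimal sketch of a fuzzy inference layer. It is an illustration under stated assumptions, not the paper's model: the appraisal names (`goal_congruence`, `novelty`), the membership functions, and the three rules are all hypothetical, since the abstract does not specify any of them.

```python
def high(x):
    """Membership in the fuzzy set 'high' for a score in [0, 1] (hypothetical shape)."""
    return min(1.0, max(0.0, x))


def low(x):
    """Membership in the fuzzy set 'low', defined as the complement of 'high'."""
    return 1.0 - high(x)


def infer_pleasure(goal_congruence, novelty):
    """Zero-order Sugeno-style inference: each rule has a firing strength
    (AND is modeled as min) and a crisp output level; the result is the
    firing-strength-weighted average of the rule outputs.

    The rules below are illustrative assumptions, not taken from the paper.
    """
    rules = [
        # IF congruence is high AND novelty is high THEN pleasure is high (1.0)
        (min(high(goal_congruence), high(novelty)), 1.0),
        # IF congruence is high AND novelty is low THEN pleasure is medium (0.5)
        (min(high(goal_congruence), low(novelty)), 0.5),
        # IF congruence is low THEN pleasure is low (0.0)
        (low(goal_congruence), 0.0),
    ]
    total = sum(strength for strength, _ in rules)
    return sum(strength * level for strength, level in rules) / total


if __name__ == "__main__":
    for gc, nv in [(1.0, 1.0), (1.0, 0.0), (0.0, 0.7), (0.5, 0.5)]:
        print(gc, nv, infer_pleasure(gc, nv))
```

Because every rule is readable as an if-then statement over named appraisals, a prediction can be traced back to the rules that fired most strongly. That is the kind of explainability the core claim refers to, though whether the paper's rules look anything like these is unknown.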

What carries the argument

Multimodal fusion via transformers and attention mechanisms that feed into appraisal variable prediction as the bridge to pleasure outcomes.
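As a rough illustration of attention-based fusion, the sketch below scores each modality embedding against a query vector with scaled dot-product attention, normalizes the scores with a softmax, and returns the weighted sum. The modality names, the toy vectors, and the single-query setup are assumptions made for the example; the paper's actual transformer architecture is not described in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_fuse(embeddings, query):
    """Fuse per-modality embeddings with scaled dot-product attention.

    embeddings: dict mapping a modality name to a vector (list of floats).
    query: vector of the same length as each embedding.
    Returns the fused vector and the per-modality attention weights.
    """
    names = list(embeddings)
    scale = math.sqrt(len(query))
    scores = [dot(embeddings[name], query) / scale for name in names]
    weights = softmax(scores)
    fused = [
        sum(w * embeddings[name][i] for w, name in zip(weights, names))
        for i in range(len(query))
    ]
    return fused, dict(zip(names, weights))


# Hypothetical toy embeddings; real ones would come from modality encoders.
embeddings = {"visual": [1.0, 0.0], "audio": [0.0, 1.0], "text": [0.5, 0.5]}
fused, weights = attention_fuse(embeddings, query=[2.0, 0.0])

if __name__ == "__main__":
    print(weights)
    print(fused)
```

The returned weights can be inspected per modality, which is one common sense in which attention fusion is called interpretable. Inspectable weights alone do not show that the fused representation encodes appraisal information.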

Load-bearing premise

The assumption that multimodal features can be mapped reliably to cognitive appraisal variables that causally relate to pleasure, despite inconsistencies in human labeling.

What would settle it

If experiments demonstrate that pleasure ratings do not align with the predicted appraisal variables across a new set of videos, or if simpler statistical models achieve comparable accuracy without the appraisal component, the value of the proposed mediation would be questioned.
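The first of these checks could be run as a correlation between predicted appraisal scores and human pleasure ratings on held-out videos. The sketch below uses Pearson correlation, which is a choice made for this example rather than a statistic named in the paper, and the two short lists are hypothetical placeholder data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical held-out data: one predicted appraisal score and one
# human pleasure rating per video.
predicted_appraisal = [0.1, 0.4, 0.5, 0.8, 0.9]
pleasure_rating = [1, 2, 2, 4, 5]

if __name__ == "__main__":
    print(pearson(predicted_appraisal, pleasure_rating))
```

A correlation near zero on new videos would count against the claimed mediation. A high correlation would be consistent with it but would not by itself establish a causal link.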

read the original abstract

Multimodal affective computing analyzes user-generated social media content to predict emotional states. However, a critical gap remains in understanding how visual content shapes cognitive interpretations and elicits specific affective experiences such as pleasure. This study introduces a novel computational model to infer video-induced pleasure via cognitive appraisal variables. The proposed model addresses four challenges: (1) noisy and inconsistent human labels, (2) the semantic gap between "positive emotions" and "pleasure," (3) the scarcity of pleasure-specific datasets, and (4) the limited interpretability of existing black-box fusion methods. Our approach integrates data-driven and cognitive theory-driven methods, using cognitive appraisal theory and a fuzzy model within an innovative framework. The model employs transformer-based architectures and attention mechanisms for fine-grained multimodal feature extraction and interpretable fusion to capture both inter- and intra-modal dynamics associated with pleasure. This enables the prediction of underlying appraisal variables, thereby bridging the semantic gap and enhancing model explainability beyond conventional statistical associations. Experimental results validate the efficacy of the proposed method in detecting video-induced pleasure, achieving a peak accuracy of 0.6624 in predicting pleasure levels. These findings highlight promising implications for affective content recommendation, intelligent media creation, and advancing our understanding of how digital media influences human emotions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper introduces a multimodal fusion framework that combines transformer-based feature extraction with a fuzzy cognitive appraisal model to predict video-induced pleasure by inferring underlying appraisal variables. It addresses challenges of noisy labels, the semantic gap between general positive emotions and pleasure, data scarcity, and limited interpretability of black-box methods, reporting a peak accuracy of 0.6624 on pleasure level prediction with claimed improvements in explainability.

Significance. If the central claims were supported by proper validation, the integration of cognitive appraisal theory with multimodal transformers could offer a more interpretable approach to affective computing, with potential applications in content recommendation and media analysis. However, the current presentation provides no evidence that the appraisal variables are recoverable from video features or that the fuzzy layer adds value, so the significance cannot be assessed positively at this stage.

major comments (3)
  1. Abstract: The peak accuracy of 0.6624 is reported without any baseline comparisons (e.g., standard multimodal classifiers or prior affective models), ablation studies removing the fuzzy appraisal component, or dataset statistics such as sample size, class distribution, or annotation reliability, leaving the efficacy claim unsupported.
  2. Abstract: No ground-truth annotations or validation procedure for the cognitive appraisal variables are described, despite the central claim that predicting these variables bridges the semantic gap; this makes it impossible to verify that the model extracts appraisal information rather than fitting noise in the pleasure labels.
  3. Abstract: The manuscript provides no error analysis, confusion matrices, or discussion of how the model handles the acknowledged noise in human labels and scarcity of pleasure-specific data, which are load-bearing for assessing both performance and interpretability advantages over black-box fusion.
minor comments (1)
  1. Abstract: The number of pleasure levels (binary vs. multi-class) and the exact output of the fuzzy model are not specified, which would clarify the classification setup.
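Major comment 1 turns on the fact that an accuracy figure means little without the class distribution. The sketch below computes a majority-class baseline, the simplest reference point a reader would want next to 0.6624. The label names and the 200/100 split are hypothetical; the abstract reports neither the number of classes nor their proportions.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy obtained by always predicting the most frequent label."""
    _, count = Counter(y_true).most_common(1)[0]
    return count / len(y_true)


# Hypothetical distribution: two thirds of the clips are labeled "high".
labels = ["high"] * 200 + ["low"] * 100
baseline = majority_baseline_accuracy(labels)

if __name__ == "__main__":
    print(round(baseline, 4))
```

On this invented split, always predicting "high" scores about 0.667, slightly above the reported 0.6624. That says nothing about the paper's actual data, but it shows why the referee asks for dataset statistics and baselines before accepting the efficacy claim.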

Simulated Author's Rebuttal

3 responses · 0 unresolved

We appreciate the referee's thorough review and valuable feedback on our manuscript. The comments highlight important areas for strengthening the presentation of our results and validation procedures. We have revised the manuscript to incorporate additional analyses and clarifications as detailed in our point-by-point responses below.

Point-by-point responses
  1. Referee: [—] Abstract: The peak accuracy of 0.6624 is reported without any baseline comparisons (e.g., standard multimodal classifiers or prior affective models), ablation studies removing the fuzzy appraisal component, or dataset statistics such as sample size, class distribution, or annotation reliability, leaving the efficacy claim unsupported.

    Authors: We agree that the abstract requires supporting context for the reported accuracy. The full manuscript details the construction of our pleasure-specific dataset, including sample size, class distribution, and annotation procedures. In the revised version, we have added explicit baseline comparisons against standard multimodal classifiers and prior affective computing models, along with ablation studies that isolate the contribution of the fuzzy appraisal component. These results are now summarized in the abstract and expanded in the experimental section to substantiate the efficacy claims.
    revision: yes

  2. Referee: [—] Abstract: No ground-truth annotations or validation procedure for the cognitive appraisal variables are described, despite the central claim that predicting these variables bridges the semantic gap; this makes it impossible to verify that the model extracts appraisal information rather than fitting noise in the pleasure labels.

    Authors: Cognitive appraisal variables are latent constructs inferred via a fuzzy logic layer grounded in cognitive appraisal theory, rather than directly annotated, due to the practical challenges of obtaining reliable human labels for internal cognitive states. The validation procedure relies on demonstrating that the inferred appraisals improve pleasure prediction accuracy, exhibit interpretable mappings from multimodal features, and align with theoretical expectations through qualitative analysis. We have expanded the manuscript with a detailed description of the fuzzy inference rules, feature-to-appraisal mappings, and correlation analyses between predicted appraisals and pleasure outcomes to address concerns about noise fitting. While direct ground-truth would provide stronger verification, its absence is inherent to the theoretical modeling approach.
    revision: partial

  3. Referee: [—] Abstract: The manuscript provides no error analysis, confusion matrices, or discussion of how the model handles the acknowledged noise in human labels and scarcity of pleasure-specific data, which are load-bearing for assessing both performance and interpretability advantages over black-box fusion.

    Authors: We acknowledge the value of explicit error analysis for evaluating robustness. The revised manuscript now includes a dedicated error analysis subsection with confusion matrices for pleasure level classification, per-class performance breakdowns, and discussion of how the fuzzy appraisal layer and multimodal attention mechanisms mitigate label noise and data scarcity. These additions illustrate the interpretability advantages by showing how appraisal predictions provide insight into misclassifications that black-box methods cannot.
    revision: yes
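The error analysis discussed in point 3 rests on a confusion matrix and per-class breakdowns. The sketch below shows one way to compute both. It assumes three pleasure levels and uses invented labels, since the number of levels is not given in the abstract, as the referee's minor comment notes.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def per_class_recall(matrix):
    """Recall per class: diagonal count divided by the row total."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(matrix)
    ]


# Hypothetical three-level setup with toy labels.
labels = ["low", "medium", "high"]
y_true = ["low", "low", "medium", "medium", "high", "high"]
y_pred = ["low", "medium", "medium", "medium", "high", "low"]

matrix = confusion_matrix(y_true, y_pred, labels)
recall = per_class_recall(matrix)

if __name__ == "__main__":
    for label, row in zip(labels, matrix):
        print(label, row)
    print(recall)
```

Per-class recall exposes failures that a single accuracy number hides, for example a model that rarely predicts the least frequent pleasure level.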

Circularity Check

0 steps flagged

No significant circularity; model combines external cognitive theory with standard multimodal ML training without self-referential reduction.

Full rationale

The abstract and description present a standard empirical ML pipeline: transformer-based multimodal feature extraction fused via attention, augmented by a fuzzy layer drawing on established cognitive appraisal theory to predict pleasure levels from video. No equations, parameter-fitting steps, or self-citations are shown that would make the reported accuracy (0.6624) equivalent to the inputs by construction. The approach addresses acknowledged challenges (noisy labels, semantic gap, data scarcity) through integration of independent theory and data-driven components rather than redefining or refitting the target quantity from itself. This is the typical non-circular structure for applied affective computing papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no information on specific parameters, axioms, or new entities introduced in the model.

pith-pipeline@v0.9.0 · 5528 in / 1328 out tokens · 89253 ms · 2026-05-08T06:18:31.347201+00:00 · methodology

