arxiv: 2511.20657 · v2 · submitted 2025-10-11 · 💻 cs.HC · cs.AI

Intelligent Agents with Emotional Intelligence: Current Trends, Challenges, and Future Prospects

Pith reviewed 2026-05-18 08:08 UTC · model grok-4.3

classification 💻 cs.HC cs.AI
keywords affective computing · emotional intelligence · multimodal emotion processing · emotion synthesis · human-agent interaction · generative technologies · challenges in affective systems · intelligent agents

The pith

This survey supplies a unified map of artificial emotional intelligence spanning recognition from multiple inputs, cognitive processing for decisions, and generation of emotional outputs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to close a gap left by narrower prior reviews by assembling one overview of artificial emotional intelligence. It walks through multimodal data processing for detecting emotions, affective cognition that includes appraisal and modulation during reasoning and learning, and the creation of emotional displays in text, speech, and faces. A reader would care because such integrated capabilities could make agents more useful in everyday interactions across work, learning, and care settings. The survey also names current obstacles and flags generative technologies as a route to stronger future systems.

Core claim

The central claim is that artificial emotional intelligence rests on three linked parts: (1) emotion understanding via multimodal data processing; (2) affective cognition, which applies cognitive appraisal, emotion mapping, and adaptive modulation inside decision-making, learning, and reasoning; and (3) the synthesis of emotional expressions in text, speech, and facial channels. Reviewing these parts together, along with their challenges and generative-technology prospects, is what the paper says supplies the missing comprehensive picture.

What carries the argument

The holistic overview that joins multimodal emotion understanding, affective cognition with appraisal and modulation, and cross-modal emotional expression synthesis.

If this is right

  • Developers gain a single reference point for combining multimodal input processing with cognitive modulation when building new agents.
  • Catalogued challenges point to concrete next steps for improving current state-of-the-art methods in expression synthesis.
  • Attention to generative technologies supplies a direct path for creating richer and more varied emotional outputs in future agents.
  • The overall structure supports more effective integration of emotional capabilities into systems used across multiple sectors of society.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Testing the three-part structure in controlled user studies with real agents would reveal whether the described integration actually improves interaction quality.
  • Extending the challenge analysis to include privacy and manipulation risks would connect the survey to emerging ethical questions in affective computing.
  • Applying the same review lens to existing commercial chat systems could produce a practical scorecard of their current emotional intelligence levels.
  • Linking the affective cognition section to established psychological models of human emotion could generate testable predictions for agent behavior.

Load-bearing premise

That earlier reviews left a large enough gap in covering emotion understanding, elicitation, and expression for one new survey to close it completely.

What would settle it

A check that finds several major recent papers on multimodal affective systems or generative emotion generation that receive no analysis or citation in the survey would show the overview falls short of being holistic.

Figures

Figures reproduced from arXiv: 2511.20657 by Alireza Kheyrkhah, Erik Cambria, M.Reza Kangavari, Raziyeh Zall, Zahra Naseri.

Figure 1: Overview of Intelligent Agent with Emotional Intelligence
Figure 2: Distribution of the 298 included studies across the three core capabilities of emotional intelligence
Figure 3: Breakdown of Emotion Understanding research by modality
Figure 4: Breakdown of Emotional Expression Synthesis research by output modality
Figure 5: Categorization of Affective Cognition research by theoretical and computational focus
Figure 6: The Overall Framework of Emotion Recognition
Figure 7: Challenges in Emotion Understanding
Figure 8: Block Diagram of Intelligent Agent with Affective Cognition
Figure 9: Challenges in Affective Cognition
Figure 10: Challenges in Emotional Text Synthesis
Figure 11: Challenges in Emotional Speech Synthesis
Figure 12: Challenges in Emotional Face Synthesis
original abstract

The development of agents with emotional intelligence is becoming increasingly vital due to their significant role in human-computer interaction and the growing integration of computer systems across various sectors of society. Affective computing aims to design intelligent systems that can recognize, evoke, and express human emotions, thereby emulating human emotional intelligence. While previous reviews have focused on specific aspects of this field, there has been limited comprehensive research that encompasses emotion understanding, elicitation, and expression, along with the related challenges. This survey addresses this gap by providing a holistic overview of core components of artificial emotion intelligence. It covers emotion understanding through multimodal data processing, as well as affective cognition, which includes cognitive appraisal, emotion mapping, and adaptive modulation in decision-making, learning, and reasoning. Additionally, it addresses the synthesis of emotional expression across text, speech, and facial modalities to enhance human-agent interaction. This paper identifies and analyzes the key challenges and issues encountered in the development of affective systems, covering state-of-the-art methodologies designed to address them. Finally, we highlight promising future directions, with particular emphasis on the potential of generative technologies to advance affective computing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. This survey paper claims to provide a holistic overview of artificial emotional intelligence in intelligent agents to address gaps in previous reviews. It covers core components including emotion understanding through multimodal data processing, affective cognition with cognitive appraisal, emotion mapping, and adaptive modulation for decision-making, learning, and reasoning, as well as the synthesis of emotional expressions in text, speech, and facial modalities. The paper also identifies challenges in developing affective systems and discusses future directions, emphasizing generative technologies.

Significance. If the literature synthesis is thorough and unbiased, the paper could be significant as a consolidating reference in affective computing and human-computer interaction. It integrates fragmented topics into a unified framework, potentially informing the design of emotionally intelligent agents and highlighting pathways for advancement through generative AI, which is timely given the rapid development in the field.

major comments (1)
  1. The claim that this survey addresses the gap by providing a 'holistic overview' of emotion understanding, elicitation, expression, challenges, and future directions is load-bearing but unsupported by any disclosed literature review methodology. There is no mention of search strategy, databases consulted, inclusion/exclusion criteria, or the total number of papers reviewed, which prevents verification that the coverage is comprehensive rather than selective.
minor comments (2)
  1. While the abstract outlines the structure clearly, it would benefit from indicating the approximate number of references or the time span of the literature covered to better convey the scope of the survey.
  2. Ensure consistent use of terminology for 'affective cognition' and 'emotional intelligence' to avoid potential confusion for readers new to the field.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. We agree that transparency in the literature review process is important for a survey paper and will address this by adding explicit methodological details in the revised manuscript.

point-by-point responses
  1. Referee: The claim that this survey addresses the gap by providing a 'holistic overview' of emotion understanding, elicitation, expression, challenges, and future directions is load-bearing but unsupported by any disclosed literature review methodology. There is no mention of search strategy, databases consulted, inclusion/exclusion criteria, or the total number of papers reviewed, which prevents verification that the coverage is comprehensive rather than selective.

    Authors: We acknowledge this observation and agree that disclosing the review methodology strengthens the paper's credibility as a survey. In the revised version, we will insert a new subsection (likely in Section 1 or as a dedicated 'Review Methodology' section) that details the search strategy, databases consulted (including IEEE Xplore, ACM Digital Library, Google Scholar, and arXiv), key search terms and combinations, inclusion/exclusion criteria (e.g., peer-reviewed publications from 2015–2024 focusing on affective computing in agents), and the approximate number of papers initially retrieved and finally included. This addition will enable readers to evaluate the scope and potential selectivity of our synthesis while preserving the existing structure and contributions of the survey. revision: yes

Circularity Check

0 steps flagged

No circularity: literature survey with no derivations or fitted predictions

full rationale

This paper is a survey providing a holistic overview of affective computing components, challenges, and future directions. It contains no mathematical derivations, equations, empirical predictions, or parameter-fitting steps that could reduce to inputs by construction. The claim of addressing a gap in prior reviews is a direct statement of scope and contribution rather than a self-referential or fitted result. No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked in a manner that creates circularity. The absence of explicit search methodology is a transparency issue but does not equate to circular reasoning under the defined criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a review paper with no new free parameters, axioms, or invented entities introduced; the contribution rests on synthesis of existing literature.

pith-pipeline@v0.9.0 · 5744 in / 969 out tokens · 24689 ms · 2026-05-18T08:08:26.234880+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Modeling Induced Pleasure through Cognitive Appraisal Prediction via Multimodal Fusion

    cs.AI 2026-04 unverdicted novelty 4.0

    A multimodal fusion model using cognitive appraisal theory, transformers, and fuzzy logic predicts video-induced pleasure levels with 0.6624 accuracy by inferring appraisal variables.

Reference graph

Works this paper leans on

298 extracted references · 298 canonical work pages · cited by 1 Pith paper · 12 internal anchors
