pith. machine review for the scientific record. sign in

arxiv: 2604.25614 · v1 · submitted 2026-04-28 · 💻 cs.AI

Recognition: unknown

HotComment: A Benchmark for Evaluating Popularity of Online Comments

Authors on Pith no claims yet

Pith reviewed 2026-05-07 16:03 UTC · model grok-4.3

classification 💻 cs.AI
keywords online commentspopularity evaluationmultimodal benchmarkcontent qualityuser behavior simulationengagement predictionsocial media analysisStyleCmt
0
0 comments X

The pith

HotComment benchmark evaluates online comment popularity using content quality, prediction models, and user behavior simulation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces HotComment as a multimodal benchmark to assess how popular online comments are likely to become on social platforms. It combines video and text analysis to measure content quality against human examples, predict popularity from real interaction trends, and simulate user engagements with agents. This matters because comments influence opinions on social media, yet their success varies by style and community. The work also offers StyleCmt to capture how stylistic elements create social resonance through ripple effects. A sympathetic reader would care because better tools for this could improve content recommendation and understanding of online discourse.

Core claim

The paper claims to present HotComment, a multimodal benchmark that quantifies comment popularity through three aspects: evaluating semantic similarity with ground-truth comments and four quality dimensions, predicting popularity from models on real interaction data, and simulating user behavior with an agent-based framework to approximate engagement scores. It further proposes StyleCmt, which aligns multiple stylistic dimensions to amplify resonant expressions based on social ripple effects.

What carries the argument

HotComment, the multimodal benchmark integrating video and text modalities with three evaluation aspects for popularity, plus the StyleCmt method that models stylistic alignment via social ripple effects.

Load-bearing premise

The agent-based simulation accurately models real user distributions and engagement scores, and the three aspects together provide a comprehensive measure of popularity that generalizes across platforms with different stylistic preferences.

What would settle it

Running the benchmark on a set of comments from a live social media platform and finding that its predicted engagement scores show no correlation with the actual number of likes, replies, or shares those comments receive.

Figures

Figures reproduced from arXiv: 2604.25614 by Chen Xu, Guiyi Zeng, Junqing Yu, Liliang Ye, Yafeng Wu, Yunyao Zhang, Zikai Song.

Figure 1
Figure 1. Figure 1: Example from the HotComment benchmark. Il view at source ↗
Figure 2
Figure 2. Figure 2: HotComment evaluates models from three key aspects:(1) Content Quality, assessed through multi-dimensional view at source ↗
Figure 3
Figure 3. Figure 3: Overview of the StyleCmt Framework. The com view at source ↗
Figure 4
Figure 4. Figure 4: Improvements across stylistic dimensions. Rela view at source ↗
read the original abstract

Online comments play a crucial role in shaping public sentiment and opinion dynamics on social media. However, evaluating their popularity remains challenging, not only because it depends on linguistic quality, originality, and emotional resonance, but also because stylistic preferences vary widely across platforms and user groups, causing the same comment to resonate differently in different communities. In this work, we present HotComment, a multimodal benchmark integrating video and text modalities that comprehensively quantifies popularity from three enhanced aspects: (1) Content Quality, which evaluates semantic similarity with ground-truth human comments and extends quality assessment through four interpretable dimensions; (2) Popularity Prediction, based on trends from models trained on real-world interaction data; and (3) User Behavior Simulation, which models the distribution of platform users and approximates \textbf{engagement scores} through an agent-based framework. Furthermore, we propose StyleCmt, inspired by social ripple effects, where multiple stylistic dimensions align to amplify socially resonant expressions and suppress incongruent ones.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents HotComment, a multimodal benchmark for evaluating the popularity of online comments that integrates video and text modalities. Popularity is quantified through three aspects: Content Quality, which measures semantic similarity to ground-truth human comments and assesses quality along four interpretable dimensions; Popularity Prediction, derived from trends in models trained on real-world interaction data; and User Behavior Simulation, which uses an agent-based framework to model platform user distributions and approximate engagement scores. The work also introduces StyleCmt, a method inspired by social ripple effects that aligns multiple stylistic dimensions to enhance resonant expressions and suppress incongruent ones.

Significance. If the benchmark is fully implemented and empirically validated, it could offer a more holistic and platform-agnostic approach to comment popularity assessment in social media research. By combining semantic analysis, predictive modeling, and behavioral simulation, it addresses limitations of single-metric evaluations like engagement counts. The StyleCmt proposal adds a novel angle on how stylistic factors influence social resonance, potentially aiding in understanding opinion dynamics. However, the current manuscript lacks the necessary experimental results to realize this potential.

major comments (2)
  1. [Abstract, User Behavior Simulation] The claim that the agent-based framework 'models the distribution of platform users and approximates engagement scores' lacks any supporting details on agent initialization, parameter values, calibration procedures, or validation against real-world metrics such as likes, replies, or shares. Without these, it is unclear whether this component provides an independent measure of popularity or risks circularity with the Popularity Prediction aspect, undermining the assertion that the three aspects together comprehensively quantify popularity.
  2. [Abstract, Overall Contribution] The manuscript asserts that HotComment 'comprehensively quantifies popularity from three enhanced aspects' but includes no experimental results, datasets, error analyses, or comparisons to baseline methods. This absence makes it impossible to evaluate the benchmark's effectiveness or its claimed generalization across platforms with varying stylistic preferences.
minor comments (2)
  1. The four interpretable dimensions for Content Quality are mentioned but not defined or exemplified, which would aid reader understanding of the quality assessment component.
  2. Consider adding a diagram to illustrate the workflow of the HotComment benchmark and how StyleCmt integrates with the three aspects.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. The comments highlight important areas for clarification and strengthening, particularly regarding implementation details and empirical validation. We address each major comment point by point below and commit to substantial revisions in the next version of the manuscript.

read point-by-point responses
  1. Referee: [Abstract, User Behavior Simulation] The claim that the agent-based framework 'models the distribution of platform users and approximates engagement scores' lacks any supporting details on agent initialization, parameter values, calibration procedures, or validation against real-world metrics such as likes, replies, or shares. Without these, it is unclear whether this component provides an independent measure of popularity or risks circularity with the Popularity Prediction aspect, undermining the assertion that the three aspects together comprehensively quantify popularity.

    Authors: We agree that the current manuscript provides only a high-level overview of the User Behavior Simulation and lacks the requested implementation details. In the revised version, we will add a dedicated subsection with specifics on agent initialization (sampling from platform-derived user distributions), parameter values (e.g., engagement thresholds and social ripple multipliers), calibration procedures (fitting to historical interaction statistics), and validation results (Pearson correlations with real likes, replies, and shares). We will also explicitly distinguish this component from Popularity Prediction: the latter trains supervised models on observed data to forecast trends, while the agent simulation independently generates synthetic engagement via behavioral modeling. Comparative experiments will be included to demonstrate their complementarity rather than circularity. revision: yes

  2. Referee: [Abstract, Overall Contribution] The manuscript asserts that HotComment 'comprehensively quantifies popularity from three enhanced aspects' but includes no experimental results, datasets, error analyses, or comparisons to baseline methods. This absence makes it impossible to evaluate the benchmark's effectiveness or its claimed generalization across platforms with varying stylistic preferences.

    Authors: The referee is correct that the submitted manuscript focuses on defining the HotComment benchmark and StyleCmt method without presenting full experimental results. To address this, the revised manuscript will incorporate a new Experiments section that includes: (1) descriptions of the multimodal datasets used, (2) quantitative results and error analyses for each of the three aspects, (3) comparisons against relevant baseline methods for popularity assessment, and (4) cross-platform evaluations demonstrating generalization across communities with differing stylistic preferences. These additions will provide the empirical grounding needed to support the abstract claims. revision: yes

Circularity Check

0 steps flagged

No circularity: benchmark description is self-contained with no equations or self-referential reductions.

full rationale

The paper presents HotComment as a multimodal benchmark quantifying comment popularity via three aspects (Content Quality via semantic similarity to ground-truth comments, Popularity Prediction from models on real-world interaction data, and User Behavior Simulation via agent-based modeling of user distributions to approximate engagement scores) plus the StyleCmt proposal inspired by social ripple effects. The provided abstract and text contain no equations, derivations, fitted parameters renamed as predictions, self-citations, or uniqueness claims that reduce any component to its own inputs by construction. All elements are described as new contributions or external-data-based without internal loops, making the work a standard benchmark proposal rather than a circular derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on abstract; no free parameters, axioms, or invented entities are identifiable or detailed.

pith-pipeline@v0.9.0 · 5480 in / 1275 out tokens · 48668 ms · 2026-05-07T16:03:57.198401+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

102 extracted references · 58 canonical work pages · 16 internal anchors

  1. [1]

    Eytan Bakshy, Itamar Rosenn, Cameron Marlow, and Lada A. Adamic. 2012. The Role of Social Networks in Information Diffusion. InProceedings of the 21st International Conference on World Wide Web. 519–528. doi:10.1145/2187836. 2187907

  2. [2]

    Jonah Berger and Katherine L. Milkman. 2012. What Makes Online Content Viral? Journal of Marketing Research49, 2 (2012), 192–205. doi:10.1509/jmr.10.0353

  3. [3]

    Qi Cao, Huawei Shen, Jinhua Gao, Bingzheng Wei, and Xueqi Cheng. 2020. Popularity Prediction on Social Platforms with Coupled Graph Neural Networks. InProceedings of the 13th ACM International Conference on Web Search and Data Mining. 70–78

  4. [4]

    Yupeng Chang, Yi Chang, and Yuan Wu. 2026. BA-LoRA: Bias-Alleviating Low-Rank Adaptation to Mitigate Catastrophic Inheritance in Large Language Models. InThe Fourteenth International Conference on Learning Representations. https://openreview.net/forum?id=q0X9SiXiRO

  5. [5]

    Yupeng Chang, Yi Chang, and Yuan Wu. 2026. Decomposing Prompts: Discov- ering Reusable Scaffolds and Task-Specific Residuals. https://openreview.net/ forum?id=WrTjCHs2tS

  6. [6]

    Yupeng Chang, Chenlu Guo, Yi Chang, and Yuan Wu. 2025. LoRA-MGPO: Mitigating Double Descent in Low-Rank Adaptation via Momentum-Guided Perturbation Optimization. InFindings of the Association for Computational Linguistics: EMNLP 2025. 648–659

  7. [7]

    Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Linyi Yang, Kaijie Zhu, Hao Chen, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, et al . 2024. A survey on evaluation of large language models.ACM transactions on intelligent systems and technology15, 3 (2024), 1–45

  8. [8]

    Jieting Chen, Junkai Ding, Wenping Chen, and Qin Jin. 2023. Knowledge En- hanced Model for Live Video Comment Generation. arXiv:2304.14657 [cs.CL]

  9. [9]

    Yuyan Chen, Zhixu Li, Jiaqing Liang, Yanghua Xiao, Bang Liu, and Yunwen Chen. 2023. Can Pre-trained Language Models Understand Chinese Humor?. InProceedings of the Sixteenth ACM International Conference on Web Search and Data Mining(Singapore, Singapore)(WSDM ’23). Association for Computing Machinery, New York, NY, USA, 465–480. doi:10.1145/3539597.3570431

  10. [10]

    Yuyan Chen, Songzhou Yan, Qingpei Guo, Jiyuan Jia, Zhixu Li, and Yanghua Xiao

  11. [11]

    InFindings of the Association for Computational Linguistics: ACL 2024, Lun-Wei Ku, Andre Martins, and Vivek Srikumar (Eds.)

    HOTVCOM: Generating Buzzworthy Comments for Videos. InFindings of the Association for Computational Linguistics: ACL 2024, Lun-Wei Ku, Andre Martins, and Vivek Srikumar (Eds.). Association for Computational Linguistics, Bangkok, Thailand, 2198–2224. doi:10.18653/v1/2024.findings-acl.130

  12. [12]

    Yuyan Chen, Yichen Yuan, Panjun Liu, Dayiheng Liu, Qinghao Guan, Mengfei Guo, Haiming Peng, Bang Liu, Zhixu Li, and Yanghua Xiao. 2024. Talk Funny! A Large-Scale Humor Response Dataset with Chain-of-Humor Interpretation. Proceedings of the AAAI Conference on Artificial Intelligence38, 16 (Mar. 2024), 17826–17834. doi:10.1609/aaai.v38i16.29736

  13. [13]

    Zhiwei Chen, Yupeng Hu, Zhiheng Fu, Zixu Li, Jiale Huang, Qinlei Huang, and Yinwei Wei. 2026. Intent: Invariance and discrimination-aware noise mitigation for robust composed image retrieval. InProceedings of the AAAI Conference on Artificial Intelligence, Vol. 40. 20463–20471

  14. [14]

    Zhiwei Chen, Yupeng Hu, Zixu Li, Zhiheng Fu, Xuemeng Song, and Liqiang Nie. 2025. Offset: Segmentation-based focus shift revision for composed image retrieval. InProceedings of the 33rd ACM International Conference on Multimedia. 6113–6122

  15. [15]

    Adamic, P

    Justin Cheng, Lada A. Adamic, P. Alex Dow, Jon Kleinberg, and Jure Leskovec

  16. [16]

    InProceedings of the 23rd International Con- ference on World Wide Web

    Can Cascades Be Predicted?. InProceedings of the 23rd International Con- ference on World Wide Web. 925–936. doi:10.1145/2566486.2567997

  17. [17]

    Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, et al. 2025. Gemini 2.5: Pushing the frontier with advanced reasoning, multi- modality, long context, and next generation agentic capabilities.arXiv preprint arXiv:2507.06261(2025)

  18. [18]

    Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, et al. 2024. The Llama 3 Herd of Models. arXiv:2407.21783 [cs.AI] https: //arxiv.org/abs/2407.21783

  19. [19]

    Emilio Ferrara and Zeyao Yang. 2015. Measuring Emotional Contagion in Social Media.PLOS ONE10, 11 (11 2015), 1–14. doi:10.1371/journal.pone.0142390

  20. [20]

    Zhiheng Fu, Yupeng Hu, Qianyun Yang, Shiqi Zhang, Zhiwei Chen, and Zixu Li

  21. [21]

    Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval

    Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval. arXiv:2604.19386 [cs.CV] https://arxiv.org/abs/ 2604.19386

  22. [22]

    Soichiro Fujita, Hayato Kobayashi, and Manabu Okumura. 2019. Dataset Creation for Ranking Constructive News Comments. InProceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2619–2626. doi:10.18653/ v1/P19-1250

  23. [23]

    Kelly Garrett

    R. Kelly Garrett. 2009. Echo Chambers Online?: Politically Motivated Selective Exposure among Internet News Users.Journal of Computer-Mediated Communi- cation14, 2 (2009), 265–285. doi:10.1111/j.1083-6101.2009.01440.x

  24. [24]

    Kelly Garrett, Dustin Carnahan, and Emily K

    R. Kelly Garrett, Dustin Carnahan, and Emily K. Lynch. 2013. A Turn Toward Avoidance? Selective Exposure to Online Political Information, 2004–2008.Polit- ical Behavior35, 1 (2013), 113–134. doi:10.1007/s11109-011-9185-6

  25. [25]

    Sharad Goel, Ashton Anderson, Jake Hofman, and Duncan J. Watts. 2016. The Structural Virality of Online Diffusion.Management Science62, 1 (2016), 180–196. doi:10.1287/mnsc.2015.2158

  26. [26]

    Swapna Gottipati and Jing Jiang. 2012. Finding Thoughtful Comments from Social Media. InProceedings of COLING 2012: Technical Papers. 995–1010. doi:10. 3115/2380863.2380885

  27. [27]

    Ruiqi He, Yushu He, Longju Bai, Jiarui Liu, Zhenjie Sun, Zenghao Tang, He Wang, Hanchen Xia, Rada Mihalcea, and Naihao Deng. 2025. Chumor 2.0: Towards Better Benchmarking Chinese Humor Understanding from (Ruo Zhi Ba). In Findings of the Association for Computational Linguistics: ACL 2025, Wanxiang Che, Joyce Nabende, Ekaterina Shutova, and Mohammad Taher ...

  28. [28]

    2016.Optics(5th ed.)

    Eugene Hecht. 2016.Optics(5th ed.). Pearson Education

  29. [29]

    and Lee, Lillian and Da, Jeff and Zellers, Rowan and Mankoff, Robert and Choi, Yejin

    Jack Hessel, Ana Marasovic, Jena D. Hwang, Lillian Lee, Jeff Da, Rowan Zellers, Robert Mankoff, and Yejin Choi. 2023. Do Androids Laugh at Electric Sheep? Humor “Understanding” Benchmarks from The New Yorker Caption Contest. In Proceedings of the 61st Annual Meeting of the Association for Computational Lin- guistics (Volume 1: Long Papers), Anna Rogers, J...

  30. [30]

    Yupeng Hu, Zixu Li, Zhiwei Chen, Qinlei Huang, Zhiheng Fu, Mingzhu Xu, and Liqiang Nie. 2026. Refine: Composed video retrieval via shared and differ- ential semantics enhancement.ACM Transactions on Multimedia Computing, Communications and Applications(2026)

  31. [31]

    Yangliu Hu, Zikai Song, Na Feng, Yawei Luo, Junqing Yu, Yi-Ping Phoebe Chen, and Wei Yang. 2025. Sf2t: Self-supervised fragment finetuning of video-llms for fine-grained understanding. InProceedings of the Computer Vision and Pattern Recognition Conference. 29108–29117

  32. [32]

    Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, De- vendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. 2023. Mistral 7B. arXiv:2310.068...

  33. [33]

    Elihu Katz, Jay G Blumler, and Michael Gurevitch. 1973. Uses and gratifications research.The public opinion quarterly37, 4 (1973), 509–523

  34. [34]

    Hayato Kobayashi, Hiroaki Taguchi, Yoshimune Tabuchi, Chahine Koleejan, Ken Kobayashi, Soichiro Fujita, Kazuma Murao, Takeshi Masuyama, Taichi Yatsuka, Manabu Okumura, and Satoshi Sekine. 2021. A Case Study of In-House Com- petition for Ranking Constructive Comments in a News Service. InProceedings of the Ninth International Workshop on Natural Language P...

  35. [35]

    Varada Kolhatkar and Maite Taboada. 2017. Constructive Language in News Comments. InProceedings of the First Workshop on Abusive Language Online. 11–17. doi:10.18653/v1/W17-3002

  36. [36]

    Jewon Lee, Ki-Ung Song, Seungmin Yang, Donguk Lim, Jaeyeon Kim, Wooksu Shin, Bo-Kyeong Kim, Yong Jae Lee, and Tae-Ho Kim. 2025. Efficient LLaMA-3.2- Vision by Trimming Cross-attended Visual Features. arXiv:2504.00557 [cs.CV] https://arxiv.org/abs/2504.00557

  37. [37]

    Jae Kook Lee and Eunyi Kim. 2017. Incidental Exposure to News: Predictors in the Social Media Setting and Effects on Information Gain Online.Computers in Human Behavior75 (2017), 1008–1015. doi:10.1016/j.chb.2017.02.018 Yafeng Wu, Yunyao Zhang, Liliang Ye, Guiyi Zeng, Junqing Yu, Chen Xu, and Zikai Song

  38. [38]

    Yiming Lei, Chenkai Zhang, Zeming Liu, Haitao Leng, ShaoGuo Liu, Tingting Gao, Qingjie Liu, and Yunhong Wang. 2025. GODBench: A Benchmark for Mul- timodal Large Language Models in Video Comment Art. InProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Wanxiang Che, Joyce Nabende, Ekaterina Shut...

  39. [39]

    Kun Li, Chenwei Dai, Wei Zhou, and Songlin Hu. 2024. Fine-grained User Behavior Simulation on Social Media Based on Role-playing Large Language Models. arXiv:2412.03148 [cs.CL]

  40. [40]

    Wenjing Li, Zhongyuan Huang, Rui Xu, Ming Zhou, and Xiaolong Yang. 2019. Graph-to-Sequence Learning for News Comment Generation. InProceedings of the Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 455–465

  41. [41]

    Wenbing Li, Zikai Song, Jielei Zhang, Tianhao Zhao, Junkai Lin, Yiran Wang, and Wei Yang. 2026. Large Language Model as Token Compressor and Decompressor. arXiv:2603.25340 [cs.CL]

  42. [42]

    Wenbing Li, Zikai Song, Hang Zhou, Yunyao Zhang, Junqing Yu, and Wei Yang

  43. [43]

    LoRA-Mixer: Coordinate Modular LoRA Experts Through Serial Attention Routing.arXiv preprint arXiv:2507.00029(2025)

  44. [44]

    Wenbing Li, Hang Zhou, Junqing Yu, Zikai Song, and Wei Yang. 2024. Coupled mamba: Enhanced multimodal fusion with coupled state space model.Advances in Neural Information Processing Systems37 (2024), 59808–59832

  45. [45]

    Yanshu Li, Yi Cao, Hongyang He, Qisen Cheng, Xiang Fu, Xi Xiao, Tianyang Wang, and Ruixiang Tang. 2025. M2IV: Towards Efficient and Fine-grained Multimodal In-Context Learning via Representation Engineering. InSecond Conference on Language Modeling. https://openreview.net/forum?id=9ffYcEiNw9

  46. [46]

    Yanshu Li, Jianjiang Yang, Tian Yun, Pinyuan Feng, Jinfa Huang, and Ruixiang Tang. 2025. Taco: Enhancing multimodal in-context learning via task mapping- guided sequence configuration. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 736–763

  47. [47]

    Zixu Li, Zhiwei Chen, Haokun Wen, Zhiheng Fu, Yupeng Hu, and Weili Guan

  48. [48]

    InProceedings of the AAAI Conference on Artificial Intelligence, Vol

    Encoder: Entity mining and modification relation binding for composed image retrieval. InProceedings of the AAAI Conference on Artificial Intelligence, Vol. 39. 5101–5109

  49. [49]

    Zixu Li, Yupeng Hu, Zhiwei Chen, Qinlei Huang, Guozhi Qiu, Zhiheng Fu, and Meng Liu. 2026. Retrack: Evidence-driven dual-stream directional anchor calibration network for composed video retrieval. InProceedings of the AAAI Conference on Artificial Intelligence, Vol. 40. 23373–23381

  50. [50]

    Zixu Li, Yupeng Hu, Zhiwei Chen, Mingyu Zhang, Zhiheng Fu, and Liqiang Nie

  51. [51]

    ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval

    ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval. arXiv:2604.20358 [cs.CV] https://arxiv.org/abs/ 2604.20358

  52. [52]

    Zixu Li, Yupeng Hu, Zhiwei Chen, Shiqi Zhang, Qinlei Huang, Zhiheng Fu, and Yinwei Wei. 2026. Habit: Chrono-synergia robust progressive learning framework for composed image retrieval. InProceedings of the AAAI Conference on Artificial Intelligence, Vol. 40. 6762–6770

  53. [53]

    Zixu Li, Yupeng Hu, Zhiheng Fu, Zhiwei Chen, Yongqi Li, and Liqiang Nie. 2026. TEMA: Anchor the Image, Follow the Text for Multi-Modification Composed Image Retrieval. arXiv:2604.21806 [cs.CV] https://arxiv.org/abs/2604.21806

  54. [54]

    Chin-Yew Lin. 2004. ROUGE: A Package for Automatic Evaluation of Summaries. InText Summarization Branches Out. Association for Computational Linguistics, Barcelona, Spain, 74–81. https://aclanthology.org/W04-1013/

  55. [55]

    Xudong Lin, Ali Zare, Shiyuan Huang, Ming-Hsuan Yang, Shih-Fu Chang, and Li Zhang. 2024. Personalized Video Comment Generation. InFindings of the Association for Computational Linguistics: EMNLP 2024. 16806–16820. doi:10. 18653/v1/2024.findings-emnlp.979

  56. [56]

    Yucheng Lin, Yuhan Xia, and Yunfei Long. 2024. Augmenting emotion features in irony detection with Large language modeling. arXiv:2404.12291 [cs.CL] https://arxiv.org/abs/2404.12291

  57. [57]

    Ge Luo, Yuchen Ma, Manman Zhang, Junqiang Huang, Sheng Li, Zhenxing Qian, and Xinpeng Zhang. 2024. Engaging Live Video Comments Generation. In Proceedings of the 32nd ACM International Conference on Multimedia(Melbourne VIC, Australia)(MM ’24). Association for Computing Machinery, New York, NY, USA, 8034–8042. doi:10.1145/3664647.3681195

  58. [58]

    Shuming Ma, Lei Cui, Damai Dai, Furu Wei, and Xu Sun. 2019. LiveBot: Gen- erating Live Video Comments Based on Visual and Textual Contexts. InPro- ceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 6810–6817. doi:10.1609/aaai.v33i01.33016810

  59. [59]

    Xuanjing Ma, Xiaochun Li, Lei Wang, and Wei Zhang. 2018. Unsupervised Article Comment Generation with Semantic Matching. InProceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1234–1243

  60. [60]

    Zixiang Meng, Qiang Gao, Di Guo, Yunlong Li, Bobo Li, Hao Fei, Shengqiong Wu, Fei Li, Chong Teng, and Donghong Ji. 2024. MMLSCU: A Dataset for Multi-modal Multi-domain Live Streaming Comment Understanding. InProceedings of the ACM on Web Conference 2024. 4395–4406. doi:10.1145/3589334.3645677

  61. [61]

    Matteo Migliarini, Berat Ercevik, Oluwagbemike Olowe, Saira Fatima, Sarah Zhao, Minh Anh Le, Vasu Sharma, and Ashwinee Panda. 2026. @grokSet: Multi- party Human-LLM Interactions in Social Media. arXiv:2602.21236 [cs.CL]

  62. [62]

    OpenAI. 2023. GPT-4 Technical Report. InarXiv preprint arXiv:2303.08774

  63. [63]

    Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a Method for Automatic Evaluation of Machine Translation. InProceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Pierre Isabelle, Eugene Charniak, and Dekang Lin (Eds.). Association for Computational Linguistics, Philadelphia, Pennsylvania, USA, 31...

  64. [64]

    O’Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang, and Michael S

    Joon Sung Park, Joseph C. O’Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein. 2023. Generative Agents: Interactive Simulacra of Human Behavior. InProceedings of the 36th Annual ACM Symposium on User Interface Software and Technology. 1–22. doi:10.1145/3586183.3606763

  65. [65]

    Lianhui Qin, Lemao Liu, Victoria Bi, Yan Wang, Xiaojiang Liu, Zhiting Hu, Hai Zhao, and Shuming Shi. 2018. Automatic Article Commenting: The Task and Dataset. InProceedings of the Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 899–905

  66. [66]

    Zhongyi Qiu, Hanjia Lyu, Wei Xiong, and Jiebo Luo. 2025. Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation. arXiv:2502.12073 [cs.CL] doi:10.48550/arXiv.2502.12073

  67. [67]

    Julian Risch and Ralf Krestel. 2020. Top Comment or Flop Comment? Predicting and Explaining User Engagement in Online News Discussions. InProceedings of the International AAAI Conference on Web and Social Media, Vol. 14. 579–589. doi:10.1609/icwsm.v14i1.7325

  68. [68]

    Svenja Schäfer. 2023. Incidental News Exposure in a Digital Media Environment: A Scoping Review of Recent Research.Annals of the International Communication Association47, 2 (2023), 242–260. doi:10.1080/23808985.2023.2169953

  69. [69]

    Zikai Song, Run Luo, Lintao Ma, Ying Tang, Yi-Ping Phoebe Chen, Junqing Yu, and Wei Yang. 2025. Temporal coherent object flow for multi-object tracking. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 39. 6978–6986

  70. [70]

    Zikai Song, Run Luo, Junqing Yu, Yi-Ping Phoebe Chen, and Wei Yang. 2023. Compact transformer tracker with correlative masked modeling. InProceedings of the AAAI conference on artificial intelligence, Vol. 37. 2321–2329

  71. [71]

    Zikai Song, Ying Tang, Run Luo, Lintao Ma, Junqing Yu, Yi-Ping Phoebe Chen, and Wei Yang. 2024. Autogenic language embedding for coherent point tracking. InProceedings of the 32nd ACM International Conference on Multimedia. 2021– 2030

  72. [72]

    Zikai Song, Junqing Yu, Yi-Ping Phoebe Chen, and Wei Yang. 2022. Transformer tracking with cyclic shifting window attention. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition. 8791–8800

  73. [73]

    Zikai Song, Junqing Yu, Yi-Ping Phoebe Chen, Wei Yang, and Xinchao Wang

  74. [74]

    Hypergraph-State Collaborative Reasoning for Multi-Object Tracking

    Hypergraph-State Collaborative Reasoning for Multi-Object Tracking. arXiv:2604.12665 [cs.CV]

  75. [75]

    Jiao Sun, Anjali Narayan-Chen, Shereen Oraby, Alessandra Cervone, Tagy- oung Chung, Jing Huang, Yang Liu, and Nanyun Peng. 2022. ExPUNations: Augmenting Puns with Keywords and Explanations. arXiv:2210.13513 [cs.CL] https://arxiv.org/abs/2210.13513

  76. [76]

    Yuchong Sun, Bei Liu, Xu Chen, Ruihua Song, and Jianlong Fu. 2024. ViCo: Engaging Video Comment Generation with Human Preference Rewards. InPro- ceedings of the 6th ACM International Conference on Multimedia in Asia (MMAsia ’24). Association for Computing Machinery, New York, NY, USA, Article 98, 1 pages. doi:10.1145/3696409.3700260

  77. [77]

    Petter Törnberg, Diliara Valeeva, Justus Uitermark, and Christopher Bail. 2023. Simulating Social Media Using Large Language Models to Evaluate Alternative News Feed Algorithms. arXiv:2310.05984 [cs.CY]

  78. [78]

    Lilian Weng, Alessandro Flammini, Alessandro Vespignani, and Filippo Menczer

  79. [79]

    doi:10.1038/srep00335

    Competition among Memes in a World with Limited Attention.Scientific Reports2 (2012), 335. doi:10.1038/srep00335

  80. [80]

    Yijie Xu, Bolun Zheng, Wei Zhu, Hangjia Pan, Yuchen Yao, Ning Xu, Anan Liu, Quan Zhang, and Chenggang Yan. 2025. SMTPD: A New Benchmark for Temporal Prediction of Social Media Popularity. arXiv:2503.04446 [cs.SI]

Showing first 80 references.