A Computational Model of Message Sensation Value in Short Video Multimodal Features that Predicts Sensory and Behavioral Engagement
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
The contemporary media landscape is characterized by sensational short videos. While prior research examines the effects of individual multimodal features, their collective impact on viewer engagement with short videos remains unknown. Grounded in the theoretical framework of Message Sensation Value (MSV), this study develops and tests a computational model of MSV using multimodal feature analysis and human evaluation of 1,200 short videos. The model, which predicts sensory and behavioral engagement, was further validated on two unseen datasets from three short video platforms (combined N = 14,492). While MSV is positively associated with sensory engagement, it shows an inverted U-shaped relationship with behavioral engagement: higher MSV elicits stronger sensory stimulation, but moderate MSV optimizes behavioral engagement. This research advances the theoretical understanding of short video engagement and introduces a robust computational tool for short video research.
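The abstract does not say how the inverted U-shaped relationship was tested. One common approach, sketched here purely as an illustration, is to regress engagement on MSV plus a squared MSV term and check that the squared coefficient is negative and that the turning point falls inside the observed MSV range. The function names `fit_quadratic` and `turning_point` and the data values are hypothetical; the data are synthetic and are not taken from the study.

```python
def fit_quadratic(xs, ys):
    """Ordinary least squares for y = b0 + b1*x + b2*x**2 (normal equations)."""
    # Power sums s[k] = sum(x**k) and cross moments t[k] = sum(y * x**k).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x4 system: row i is sum_j s[i+j] * b_j = t[i].
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    b0, b1, b2 = (a[i][3] / a[i][i] for i in range(3))
    return b0, b1, b2


def turning_point(b1, b2):
    """x-coordinate where the fitted parabola peaks (or bottoms out)."""
    return -b1 / (2.0 * b2)


# Synthetic illustration only: an exact inverted U peaking at MSV = 4.
msv = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
behavioral = [-(m - 4.0) ** 2 + 10.0 for m in msv]

b0, b1, b2 = fit_quadratic(msv, behavioral)
peak = turning_point(b1, b2)

# Inverted U: negative squared term and a peak inside the observed range.
is_inverted_u = b2 < 0 and min(msv) < peak < max(msv)
```

On real data, a negative squared coefficient alone is usually treated as insufficient evidence; the turning point should lie well within the data range, which is why the sketch checks both conditions.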
fields
cs.HC 1
years
2026 1
verdicts
UNVERDICTED 1
representative citing papers
Multimodal Large Language Models as Synthetic Participants in Video-Based Studies: An Evaluation
MLLMs show limited agreement with human PMSV ratings on video engagement, with downward mean-shift, central-tendency biases, and inconsistent profile sensitivity.