Will It Go Viral? Grounding Micro-Video Popularity Prediction on the Open Web

Dongha Lee; Ryang Heo

arxiv: 2605.18653 · v1 · pith:CMFNOUVNnew · submitted 2026-05-18 · 💻 cs.MM

Pith reviewed 2026-05-20 00:50 UTC · model grok-4.3

classification 💻 cs.MM

keywords micro-video popularity prediction · open-web grounding · virality forecasting · evidence-card · online adaptation · short-form video dataset · trend shifts · popularity regression

The pith

Structured open-web context and trend-aware adaptation are required for accurate micro-video popularity prediction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that micro-video popularity cannot be reliably predicted from content or internal platform data alone because virality hinges on external trends visible on the open web. To address this, it presents the WEBSHORTS dataset of 14K videos with real-time web evidence-cards and tracked views, and the SHORTS-CAST model that reasons over web dimensions and adapts to delayed labels indicating trend shifts. Experiments confirm superior performance in realistic offline and online settings. This matters for recommendation and advertising in short-video platforms where timing and context determine success.

Core claim

Micro-video popularity prediction is reformulated as open-web grounded prediction. The WEBSHORTS dataset couples 14K videos with real-time open-web context organized as three-dimensional evidence-cards and daily view counts over 7 days. SHORTS-CAST generates dimension-wise rationales from the evidence-card to guide popularity regression and adapts selectively when delayed labels reveal genuine trend shifts. It outperforms content-only, retrieval-augmented, and other online adaptation baselines under offline and delayed-label online protocols.

What carries the argument

The three-dimensional evidence-card capturing external attention along complementary web-context dimensions, which serves as the basis for rationale generation and popularity prediction in the SHORTS-CAST framework.
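
As an illustration only, the evidence-card and its rationale could be represented as in the sketch below. This page does not name the three dimensions, so `DIMENSIONS` holds placeholder names, and `EvidenceCard`, `DimensionRationale`, and `saliency_weights` are hypothetical. The 1–10 saliency range comes from the Figure 4 caption; turning saliency into normalized weights is an assumption about how a predictor might consume it, not the paper's stated method.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Placeholder names: the three web-context dimensions are not named on this page.
DIMENSIONS = ("dimension_1", "dimension_2", "dimension_3")


@dataclass
class EvidenceCard:
    """Open-web context for one video, grouped by dimension (hypothetical layout)."""
    video_id: str
    snapshot_day: int  # observation day t at which the context was collected
    evidence: Dict[str, List[str]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        missing = [d for d in DIMENSIONS if d not in self.evidence]
        if missing:
            raise ValueError(f"missing dimensions: {missing}")


@dataclass
class DimensionRationale:
    """Generated rationale for one dimension, with a saliency score in 1..10."""
    text: str
    saliency: int

    def __post_init__(self) -> None:
        if not 1 <= self.saliency <= 10:
            raise ValueError("saliency must be between 1 and 10")


def saliency_weights(rationales: Dict[str, DimensionRationale]) -> Dict[str, float]:
    """Normalize saliency scores so they sum to 1 (an assumed weighting scheme)."""
    total = sum(r.saliency for r in rationales.values())
    return {name: r.saliency / total for name, r in rationales.items()}
```

With saliency scores of 8, 1, and 1, this weighting gives the first dimension 0.8 of the total, which is one simple way to let a downstream regressor emphasize the dimension the rationale marks as most relevant.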

If this is right

Improved accuracy in popularity forecasting supports better recommendation and advertising decisions.
Trend-aware adaptation enables handling of fast-evolving short-form video ecosystems.
Use of delayed labels allows detection of genuine trend shifts for model updates.
Structured web context reduces reliance on historical internal video corpora.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The approach may generalize to predicting engagement for other time-sensitive content like live streams or social posts.
Real-time web data collection could be combined with privacy-preserving techniques for broader adoption.
Comparing performance across different web search providers might show robustness or sensitivity to data sources.

Load-bearing premise

Open-web context collected at upload time supplies predictive signal for popularity that is not already present in the video content or in retrieval from platform-internal video corpora.

What would settle it

A controlled experiment in which a model without open-web context matches SHORTS-CAST on the delayed-label online protocol would falsify the claim that web context is necessary.
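
The comparison described above could be scored roughly as follows. The figure captions on this page report nMSE but do not define its normalization, so the sketch assumes the common form (mean squared error divided by the variance of the targets); `nmse` and `ablation_gap` are hypothetical helper names.

```python
from typing import Sequence


def nmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """MSE divided by target variance (assumed definition of nMSE)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean = sum(y_true) / n
    variance = sum((y - mean) ** 2 for y in y_true) / n
    if variance == 0:
        raise ValueError("targets are constant; nMSE is undefined")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return mse / variance


def ablation_gap(
    y_true: Sequence[float],
    pred_with_web: Sequence[float],
    pred_without_web: Sequence[float],
) -> float:
    """Positive values mean the model with web context has the lower error."""
    return nmse(y_true, pred_without_web) - nmse(y_true, pred_with_web)
```

Under this definition a predictor that always outputs the target mean scores 1.0, so a gap near zero between the two variants would point toward the falsifying outcome.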

Figures

Figures reproduced from arXiv: 2605.18653 by Dongha Lee, Ryang Heo.

**Figure 1.** Conventional scenario (Left) retrieves similar videos from a static in-platform corpus, missing the external trends behind virality. In contrast, the open-web grounded scenario (Right) captures real-time web signals, closely anticipating the viral outcome. However, extending MVPP from internal-corpus retrieval to open-web grounded prediction is not straightforward, as this shift surfaces two challenges tha…

**Figure 2.** The overview of our WEBSHORTS construction pipeline. Candidate retrieval: To support diversity in video topics and categories, we adopt the hierarchical topic categorization from prior video datasets [44, 45], comprising 17 main topics and 10 sub-topics per main topic [46, 47]. We further introduce a trend feature axis (e.g., Hot, Latest, Viral) with 10 variants, and construct seed queries by combining all …

**Figure 3.** WEBSHORTS statistics: (a) day-7 view count distribution by popularity tier, (b) per-tier view growth curves, and (c) web source distribution across evidence-card dimensions. … $E_i^{(t)} = \mathrm{LLMsearch}(X_i, t)$ aligned with observation day $t$. Motivated by the importance of temporal alignment and early popularity in social media popularity prediction [14], we instantiate $t$ as the first three relative observation day…

**Figure 4.** Overview of SHORTS-CAST Step 1, open-web grounded training. … evidence dimension to the target video’s popularity tier. The rationale additionally assigns each dimension a saliency score (1–10) quantifying how strongly that dimension’s web signals contribute to the video’s predicted popularity, giving the predictor an explicit signal for weighing which evidence dimensions matter more for a given video. Learn…

**Figure 5.** Overview of SHORTS-CAST Step 2, online trend adaptation. … the growth-curve taxonomy of [64], we calibrate percentile thresholds $\gamma_{\text{low}}$ and $\gamma_{\text{high}}$ on $D_{\text{val}}$ so that $\gamma_i > \gamma_{\text{high}}$ captures initial viral growth (rapid early surge then plateau) and $\gamma_i < \gamma_{\text{low}}$ captures delayed viral growth (low initial views followed by a late burst) [8]. Videos within the normal range ($\gamma_i \in (\gamma_{\text{low}}, \gamma_{\text{high}})$) are excluded regardless of erro…

**Figure 6.** Case study of SHORTS-CAST on initial viral (Left) and delayed viral (Right) videos, showing predicted view counts against the ground truth across online baselines. … test nMSE from 0.701 to 0.885). Parametric methods update the same LoRA architecture as SHORTS-CAST and improve test nMSE (0.673, 0.624), yet treating every delayed label equally destabilizes the mapping mid-stream, producing the weakest prequen…

**Figure 7.** nMSE across evidence-card snapshots by growth type. Effect of evidence-card refresh: We fix SHORTS-CAST and vary only the observation snapshot t ∈ {0, 1, 2} at which the evidence-card is collected

**Figure 8.** Distribution of cited-source publication dates relative to …

**Figure 9.** Search freshness across observation snapshots.

**Figure 10.** Subscriber-count distribution of the source channels in …

**Figure 11.** Case study of SHORTS-CAST on a Large-tier video (a) and a Micro-tier video (b), pairing the input evidence-card (left) with the generated rationale and per-dimension saliency (right).
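
The Figure 5 caption describes calibrating percentile thresholds on a validation set and excluding videos that fall in the normal range. A minimal sketch of that filter follows. The definition of γ_i is not given on this page, so `growth_ratio` (the share of final cumulative views reached by an early day) is a stand-in, and the 10th and 90th percentiles are illustrative defaults rather than the paper's values. All function names are hypothetical.

```python
from statistics import quantiles
from typing import Sequence, Tuple


def growth_ratio(cumulative_views: Sequence[float], early_day: int = 1) -> float:
    """Stand-in for gamma_i: fraction of final views already reached by `early_day`."""
    final = cumulative_views[-1]
    return cumulative_views[early_day] / final if final > 0 else 0.0


def calibrate_thresholds(
    val_ratios: Sequence[float], low_pct: int = 10, high_pct: int = 90
) -> Tuple[float, float]:
    """Return (gamma_low, gamma_high) as percentiles of validation-set ratios."""
    cuts = quantiles(val_ratios, n=100, method="inclusive")  # 99 cut points
    return cuts[low_pct - 1], cuts[high_pct - 1]


def growth_type(gamma: float, gamma_low: float, gamma_high: float) -> str:
    """Label a video by where its ratio falls relative to the thresholds."""
    if gamma > gamma_high:
        return "initial_viral"   # rapid early surge, then plateau
    if gamma < gamma_low:
        return "delayed_viral"   # low initial views, late burst
    return "normal"              # excluded from adaptation updates


def is_update_candidate(gamma: float, gamma_low: float, gamma_high: float) -> bool:
    """Only videos outside the normal range are considered for an update."""
    return growth_type(gamma, gamma_low, gamma_high) != "normal"
```

Any further condition on prediction error is left out here because the caption is cut off before it states one.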

Abstract

Micro-video popularity prediction (MVPP) forecasts the popularity a newly uploaded short-form video will attract within a fixed number of days after upload. This task supports downstream applications in recommendation, advertising, and creator analytics, yet the problem is hard since virality depends on external trends rather than video content alone. Prior MVPP methods incorporate context by retrieving similar videos from platform-internal corpora; however, historical neighbors cannot reveal whether a topic is currently trending, controversial, or already saturated across the open web. To this end, we reformulate MVPP as open-web grounded prediction and introduce WEBSHORTS, the first micro-video dataset that couples 14K videos with real-time open-web context collected at upload time, alongside daily view counts tracked over 7 days. The context for each video is organized as a structured evidence-card that captures the external attention landscape along three complementary web-context dimensions. We further propose SHORTS-CAST, a framework that generates dimension-wise rationales from the evidence-card to guide popularity regression, then adapts at deployment by selectively updating the context-to-popularity mapping when delayed labels reveal genuine trend shifts. In our experiments, SHORTS-CAST consistently outperforms content-only, video corpus retrieval-augmented, and online adaptation baselines under both offline and delayed-label online protocols, confirming that structured web context and trend-aware adaptation are jointly necessary for popularity forecasting under realistic deployment constraints in fast-evolving short-form video ecosystems.
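
This page does not spell out the delayed-label online protocol. One common reading is a prequential (predict first, learn later) loop in which each label becomes usable only after a fixed delay. The sketch below assumes that reading and a 7-day delay matching the tracked view horizon; `prequential_delayed_mse` and the `RunningMean` toy model are hypothetical and stand in for the actual predictor.

```python
from collections import deque
from typing import Any, Callable, Deque, Iterable, List, Tuple

Event = Tuple[int, Any, float]  # (upload_day, features, true_popularity)


def prequential_delayed_mse(
    stream: Iterable[Event],
    predict: Callable[[Any], float],
    update: Callable[[Any, float], None],
    delay: int = 7,
) -> float:
    """Replay a day-ordered stream; a label is released `delay` days after upload."""
    pending: Deque[Event] = deque()
    squared_errors: List[float] = []
    for day, x, y in stream:
        # Learn from every label whose delay has elapsed before predicting today.
        while pending and pending[0][0] + delay <= day:
            _, past_x, past_y = pending.popleft()
            update(past_x, past_y)
        # In this offline replay the error is scored right away; the model itself
        # only sees y once it is released from the pending queue.
        squared_errors.append((predict(x) - y) ** 2)
        pending.append((day, x, y))
    if not squared_errors:
        raise ValueError("empty stream")
    return sum(squared_errors) / len(squared_errors)


class RunningMean:
    """Toy model: predicts the mean of the labels it has been shown so far."""

    def __init__(self) -> None:
        self.n = 0
        self.total = 0.0

    def predict(self, x: Any) -> float:
        return self.total / self.n if self.n else 0.0

    def update(self, x: Any, y: float) -> None:
        self.n += 1
        self.total += y
```

A selective scheme such as the one the abstract describes would add a gate inside the release loop, so that `update` is called only for labels judged to signal a trend shift.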

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces the WEBSHORTS dataset of 14K micro-videos paired with real-time open-web context collected at upload time and daily view counts over 7 days. Context is organized into structured three-dimensional evidence-cards. It proposes SHORTS-CAST, which generates dimension-wise rationales from the evidence-card for popularity regression and selectively adapts the context-to-popularity mapping at deployment when delayed labels indicate genuine trend shifts. Experiments report that SHORTS-CAST consistently outperforms content-only, video corpus retrieval-augmented, and online adaptation baselines under both offline and delayed-label online protocols, concluding that structured web context and trend-aware adaptation are jointly necessary for micro-video popularity prediction.

Significance. If the results hold under rigorous controls, the work advances micro-video popularity prediction by demonstrating the value of grounding forecasts in contemporaneous open-web signals rather than historical internal corpora alone. The new WEBSHORTS dataset and the evidence-card representation provide a concrete resource for future research on external context in dynamic media ecosystems. The selective adaptation mechanism directly targets the challenge of evolving trends.

major comments (2)

[Experiments / online protocol description] The claim that trend-aware adaptation is jointly necessary for the online protocol rests on delayed labels reliably indicating genuine external trend shifts rather than platform noise or random fluctuations. The manuscript provides no quantitative check (e.g., correlation of label deltas with independent signals such as search-volume spikes or external mention counts) at the moments adaptation is triggered. This validation is load-bearing for the 'genuine trend shifts' premise and the resulting conclusion.
[Method / SHORTS-CAST framework] Details on evidence-card construction (exact sources, aggregation rules, and temporal alignment for the three dimensions) and on the rationale-generation process (models, prompts, or training) are insufficient for replication. These choices directly affect whether the reported gains can be attributed to the structured web context rather than implementation specifics.
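
The quantitative check requested in the first major comment could look like the sketch below. It is illustrative only: the event fields (`triggered`, `label_delta`, `signal_delta`) are hypothetical, and Pearson correlation is just one possible statistic for relating label deltas to an independent signal such as a search-volume change.

```python
from math import sqrt
from typing import Dict, List, Sequence, Union

Event = Dict[str, Union[bool, float]]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def trigger_signal_correlation(events: List[Event]) -> float:
    """Correlate label deltas with external-signal deltas at triggered updates only."""
    fired = [e for e in events if e["triggered"]]
    return pearson(
        [float(e["label_delta"]) for e in fired],
        [float(e["signal_delta"]) for e in fired],
    )
```

A coefficient near zero at the triggered moments would suggest the triggers track platform noise rather than external trend shifts, which is the concern the comment raises.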

minor comments (1)

[Abstract] The abstract would benefit from a brief statement of the magnitude of improvements (e.g., relative gains or absolute metrics) to allow readers to gauge practical significance without reading the full experimental tables.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our work. The comments highlight important aspects of validation and reproducibility that we address point by point below. We will revise the manuscript to incorporate clarifications and additional details where feasible.

Referee: [Experiments / online protocol description] The claim that trend-aware adaptation is jointly necessary for the online protocol rests on delayed labels reliably indicating genuine external trend shifts rather than platform noise or random fluctuations. The manuscript provides no quantitative check (e.g., correlation of label deltas with independent signals such as search-volume spikes or external mention counts) at the moments adaptation is triggered. This validation is load-bearing for the 'genuine trend shifts' premise and the resulting conclusion.

Authors: We acknowledge that an explicit quantitative validation linking label deltas to independent external signals would further support the interpretation of genuine trend shifts. Our online protocol is intentionally designed around the realistic constraint of delayed labels only, with the selective adaptation mechanism intended to respond to significant deviations that may reflect external changes. To address this, we will add a supplementary analysis in the revised manuscript that examines correlations between adaptation triggers and spikes in web mentions or related signals already present in the evidence-cards. We will also clarify the assumptions underlying the protocol and discuss potential noise sources as a limitation if the correlations prove modest. revision: partial
Referee: [Method / SHORTS-CAST framework] Details on evidence-card construction (exact sources, aggregation rules, and temporal alignment for the three dimensions) and on the rationale-generation process (models, prompts, or training) are insufficient for replication. These choices directly affect whether the reported gains can be attributed to the structured web context rather than implementation specifics.

Authors: We agree that the current level of detail is insufficient for replication and that this affects attribution of gains to the structured context. In the revised manuscript we will expand the Methods section (and add an appendix if needed) to specify: the exact sources and collection methods for each of the three evidence-card dimensions; the aggregation rules, counting procedures, and normalization steps; the temporal alignment logic across sources; and the precise models, prompting templates, and any training or fine-tuning procedures used for rationale generation. These additions will enable readers to reproduce the evidence-card construction and SHORTS-CAST pipeline. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external data and empirical baselines

The paper introduces a new dataset (WEBSHORTS) coupling videos with real-time open-web context collected at upload time and proposes SHORTS-CAST for generating rationales and selective adaptation using delayed labels. Performance is evaluated against content-only, retrieval-augmented, and online adaptation baselines under offline and delayed-label protocols. No self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations appear in the provided abstract or description. The central claim of joint necessity for web context and trend-aware adaptation follows from comparative outperformance rather than by construction from the inputs themselves. The approach is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The evidence-card structure and three web-context dimensions appear to be introduced constructs whose construction rules are not detailed.

invented entities (1)

structured evidence-card · no independent evidence
purpose: Organize the external attention landscape along three complementary web-context dimensions for popularity regression.
New data structure introduced to capture open-web context at upload time.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear

Paper passage: We reformulate MVPP as open-web grounded prediction and introduce WEBSHORTS... evidence-card that captures the external attention landscape along three complementary web-context dimensions.

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat induction and embed_strictMono · unclear

Paper passage: Growth-Conditioned Drift filtering... triggers lightweight updates... when delayed labels reveal genuine trend shifts.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

79 extracted references · 79 canonical work pages · 3 internal anchors

[1] Jingyuan Chen, Xuemeng Song, Liqiang Nie, Xiang Wang, Hanwang Zhang, and Tat-Seng Chua. Micro tells macro: Predicting the popularity of micro-videos via a transductive model. In Proceedings of the 24th ACM International Conference on Multimedia, pages 898–907, 2016. doi: 10.1145/2964284.2964314
[2] Bo Wu, Wen-Huang Cheng, Peiye Liu, Bei Liu, Zhaoyang Zeng, and Jiebo Luo. Smp challenge: An overview of social media prediction challenge 2019. In Proceedings of the 27th ACM International Conference on Multimedia, pages 2667–2671, 2019
[3] Liliang Ye, Yunyao Zhang, Yafeng Wu, Yi-Ping Phoebe Chen, Junqing Yu, Wei Yang, and Zikai Song. Mvp: Winning solution to smp challenge 2025 video track. In Proceedings of the ACM International Conference on Multimedia, pages 14079–14085, 2025. doi: 10.1145/3746027.3763761
[4] Jiayi Xie, Yaochen Zhu, Zhibin Zhang, Jian Peng, Jing Yi, Yaosi Hu, Hongyi Liu, and Zhenzhong Chen. A multimodal variational encoder-decoder framework for micro-video popularity prediction. In Proceedings of The Web Conference 2020, WWW ’20, page 2542–2548, New York, NY, USA, 2020. Association for Computing Machinery. ISBN 9781450370233. doi: 10.1145/3366423.3380004
[5] Ting Zhong, Jian Lang, Yifan Zhang, Zhangtao Cheng, Kunpeng Zhang, and Fan Zhou. Predicting micro-video popularity via multi-modal retrieval augmentation. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2579–2583, 2024. doi: 10.1145/3626772.3657929
[6] Zhangtao Cheng, Jian Lang, Ting Zhong, and Fan Zhou. Seeing the unseen in micro-video popularity prediction: Self-correlation retrieval for missing modality generation. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 142–152, 2025. doi: 10.1145/3690624.3709308
[7] Jack Hessel, Lillian Lee, and David Mimno. Cats and captions vs. creators and the clock: Comparing multimodal content to context in predicting relative popularity. In Proceedings of the 26th international conference on world wide web, pages 927–936, 2017
[8] Marian-Andrei Rizoiu, Lexing Xie, Scott Sanner, Manuel Cebrian, Honglin Yu, and Pascal Van Hentenryck. Expecting to be hip: Hawkes intensity processes for social media popularity. In Proceedings of the 26th International Conference on World Wide Web, WWW ’17, page 735–744, Republic and Canton of Geneva, CHE, 2017. International World Wide Web Conferences S... doi: 10.1145/3038912.3052650
[9] Zhangtao Cheng, Jienan Zhang, Xovee Xu, Goce Trajcevski, Ting Zhong, and Fan Zhou. Retrieval-augmented hypergraph for multimodal social media popularity prediction. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 445–455, 2024. doi: 10.1145/3637528.3672041
[10] Wei Chen, Jiao Li, Jian Lang, Zhangtao Cheng, Yong Wang, and Fan Zhou. Echoes in the feed: Evolution-aware prompt-augmented micro-video popularity prediction. In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2744–2748, 2025. doi: 10.1145/3726302.3730184
[11] Zhangtao Cheng, Jiao Li, Jian Lang, Ting Zhong, and Fan Zhou. In-context prompt-augmented micro-video popularity prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 11527–11535, 2025. doi: 10.1609/aaai.v39i11.33254
[12] Xovee Xu, Yifan Zhang, Fan Zhou, and Jingkuan Song. Improving multimodal social media popularity prediction via selective retrieval knowledge augmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 932–940, 2025. doi: 10.1609/aaai.v39i1.32078
[13] Yongxin Ni, Yu Cheng, Xiangyan Liu, Junchen Fu, Youhua Li, Xiangnan He, Yongfeng Zhang, and Fajie Yuan. A content-driven micro-video recommendation dataset at scale. arXiv preprint arXiv:2309.15379, 2023
[14] Yijie Xu, Bolun Zheng, Wei Zhu, Hangjia Pan, Yuchen Yao, Ning Xu, Anan Liu, Quan Zhang, and Chenggang Yan. Smtpd: A new benchmark for temporal prediction of social media popularity. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 18847–18857, 2025. doi: 10.1109/CVPR52734.2025.01756
[15] Xudong Gong, Qinlin Feng, Yuan Zhang, Jiangling Qin, Weijie Ding, Biao Li, Peng Jiang, and Kun Gai. Real-time short video recommendation on mobile devices. In Proceedings of the 31st ACM international conference on information & knowledge management, pages 3103–3112, 2022
[16] Bo Wu, Peiye Liu, Wen-Huang Cheng, Bei Liu, Zhaoyang Zeng, Jia Wang, Qiushi Huang, and Jiebo Luo. Smp challenge: An overview and analysis of social media prediction challenge. In Proceedings of the 31st ACM International Conference on Multimedia, pages 9651–9655, 2023
[17] Aditya Khosla, Atish Das Sarma, and Raffay Hamid. What makes an image popular? In Proceedings of the 23rd International Conference on World Wide Web (WWW), pages 867–876, 2014. doi: 10.1145/2566486.2567996
[18] Peiguang Jing, Yuting Su, Liqiang Nie, Xu Bai, Jing Liu, and Meng Wang. Low-rank multi-view embedding learning for micro-video popularity prediction. IEEE Transactions on Knowledge and Data Engineering (TKDE), 30(8):1519–1532, 2018. doi: 10.1109/TKDE.2017.2785784
[19] Junhong Chen, Dayong Liang, Zhanmo Zhu, Xiaojing Zhou, Zihan Ye, and Xiuyun Mo. Social media popularity prediction based on visual-textual features with xgboost. In Proceedings of the 27th ACM International Conference on Multimedia, pages 2692–2696, 2019
[20] Xin Lai, Yihong Zhang, and Wei Zhang. HyFea: Winning solution to social media popularity prediction for multimedia grand challenge 2020. In Proceedings of the 28th ACM International Conference on Multimedia (MM), pages 4565–4569, 2020. doi: 10.1145/3394171.3416275
[21] Jiayi Xie, Yaochen Zhu, and Zhenzhong Chen. Micro-video popularity prediction via multimodal variational information bottleneck. IEEE Transactions on Multimedia, 25:24–37, 2021
[22] Tsun-hin Cheung and Kin-man Lam. Crossmodal bipolar attention for multimodal classification on social media. Neurocomputing, 514:1–12, 2022
[23] Zhuoran Zhang, Shibiao Xu, Li Guo, and Wenke Lian. Multi-modal variational auto-encoder model for micro-video popularity prediction. In Proceedings of the 8th International Conference on Communication and Information Processing (ICCIP), pages 9–16, 2022. doi: 10.1145/3571662.3571664
[24] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021
[25] Yali Du, Yinwei Wei, Wei Ji, Fan Liu, Xin Luo, and Liqiang Nie. Multi-queue momentum contrast for microvideo-product retrieval. In Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, pages 1003–1011, 2023
[26] Wenhao Hu, Weilong Chen, Weimin Yuan, Yan Wang, Shimin Cai, and Yanru Zhang. Dual-stream pre-training transformer to enhance multimodal learning for social media prediction. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 11450–11456, 2024
[27] Mingsheng Tu, Tianjiao Wan, Qisheng Xu, Xinhao Jiang, Kele Xu, and Cheng Yang. Higher-order vision-language alignment for social media prediction. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 11457–11463, 2024
[28] Adilbek Karmanov, Dayan Guan, Shijian Lu, Abdulmotaleb El Saddik, and Eric Xing. Efficient test-time adaptation of vision-language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14162–14171, 2024
[29] Maxime Zanella, Clément Fuchs, Christophe De Vleeschouwer, and Ismail Ben Ayed. Realistic test-time adaptation of vision-language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 25103–25112, 2025
[30] Chentao Cao, Zhun Zhong, Zhanke Zhou, Tongliang Liu, Yang Liu, Kun Zhang, and Bo Han. Noisy test-time adaptation in vision-language models. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=iylpeTI0Ql
[31] Zongbo Han, Jialong Yang, Guangyu Wang, Junfan Li, Qianli Xu, Mike Zheng Shou, and Changqing Zhang. Dota: Distributional test-time adaptation of vision-language models. In The Thirty-ninth Annual Conference on Neural Information Processing Systems
[32] Thomas L Lee, William Toner, Rajkarn Singh, Artjom Joosen, and Martin Asenov. Lightweight online adaption for time series foundation model forecasts. In Forty-second International Conference on Machine Learning, 2025. URL https://openreview.net/forum?id=gAxYbvoOQz
[33] Lifan Zhao and Yanyan Shen. Proactive model adaptation against concept drift for online time series forecasting. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 2020–2031, 2025. doi: 10.1145/3690624.3709210
[34] Ying yee Ava Lau, Zhiwen Shao, and Dit-Yan Yeung. Fast and slow streams for online time series forecasting without information leakage. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=I0n3EyogMi
[36]

Continual collaborative distillation for recommender system

Gyuseok Lee, SeongKu Kang, Wonbin Kweon, and Hwanjo Yu. Continual collaborative distillation for recommender system. InProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD ’24, page 1495–1505, New York, NY , USA, 2024. Association for Computing Machinery. ISBN 9798400704901. doi: 10.1145/3637528.3671924. URL https://do...

work page doi:10.1145/3637528.3671924 2024
[37]

Mitigating distribution shifts in sequential recommendation: An invariance perspective

Yuxin Liao, Yonghui Yang, Min Hou, Le Wu, Hefei Xu, and Hao Liu. Mitigating distribution shifts in sequential recommendation: An invariance perspective. InProceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1603–1613, 2025

work page 2025
[38]

Online drift detection with maximum concept discrepancy

Ke Wan, Yi Liang, and Susik Yoon. Online drift detection with maximum concept discrepancy. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 2924–2935, 2024. doi: 10.1145/3637528.3672016

work page doi:10.1145/3637528.3672016 2024
[39] Yan-Shuo Liang and Wu-Jun Li. Inflora: Interference-free low-rank adaptation for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23638–23647, 2024
[40] Xiwen Wei, Guihong Li, and Radu Marculescu. Online-lora: Task-free online continual learning via low rank adaptation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025
[41] Yan-Shuo Liang, Jiarui Chen, and Wu-Jun Li. Gated integration of low-rank adaptation for continual learning of large language models. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025
[42] Qiang Zhang, Mengsheng Zhao, Jiawei Liu, Fanrui Zhang, Yongchao Xu, and Zheng-Jun Zha. Hierarchical knowledge prompt tuning for multi-task test-time adaptation. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 30524–30533, 2025
[43] Yunbei Zhang, Akshay Mehra, Shuaicheng Niu, and Jihun Hamm. Dpcore: Dynamic prompt coreset for continual test-time adaptation. In Forty-second International Conference on Machine Learning, 2025
[44] Yifei Xu, Jiaying Wu, Herun Wan, Yang Li, Zhen Hou, and Min-Yen Kan. Forecasting the buzz: Enriching hashtag popularity prediction with llm reasoning. In Proceedings of the 34th ACM International Conference on Information and Knowledge Management, pages 5396–5400, 2025. doi: 10.1145/3746252.3760970
[45] Jielin Qiu, Jiacheng Zhu, William Han, Aditesh Kumar, Karthik Mittal, Claire Jin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Ding Zhao, et al. Mmsum: A dataset for multimodal summarization and thumbnail generation of videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21909–21921, 2024
[46] Jeongeun Lee, Youngjae Yu, and Dongha Lee. Hippo-video: Simulating watch histories with large language models for personalized video highlighting. In Conference on Language Modeling, 2025. URL https://arxiv.org/abs/2507.16873. Published as a conference paper at COLM 2025
[47] Luowei Zhou, Chenliang Xu, and Jason Corso. Towards automatic learning of procedures from web instructional videos. In Proceedings of the AAAI conference on artificial intelligence, volume 32, 2018
[48] Antoine Miech, Dimitri Zhukov, Jean-Baptiste Alayrac, Makarand Tapaswi, Ivan Laptev, and Josef Sivic. Howto100m: Learning a text-video embedding by watching hundred million narrated video clips. In Proceedings of the IEEE/CVF international conference on computer vision, pages 2630–2640, 2019
[49] Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, and Ilya Sutskever. Robust speech recognition via large-scale weak supervision. In International conference on machine learning, pages 28492–28518. PMLR, 2023
[50] Perplexity AI. Perplexity ai. https://www.perplexity.ai/, 2024. Accessed: 2025-05-08
[51] OpenAI. Introducing chatgpt search, 2024. URL https://openai.com/index/introducing-chatgpt-search/
[52] Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, et al. Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. arXiv preprint arXiv:2507.06261, 2025
[53] Xiaoxi Li, Guanting Dong, Jiajie Jin, Yuyao Zhang, Yujia Zhou, Yutao Zhu, Peitian Zhang, and Zhicheng Dou. Search-o1: Agentic search-enhanced large reasoning models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 5420–5438, 2025
[54] Mihran Miroyan, Tsung-Han Wu, Logan King, Tianle Li, Jiayi Pan, Xinyan Hu, Wei-Lin Chiang, Anastasios Nikolas Angelopoulos, Trevor Darrell, Narges Norouzi, and Joseph E. Gonzalez. Search arena: Analyzing search-augmented LLMs. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=MMGRlDnhtI
[55] Zhengyang Liang, Yan Shu, Xiangrui Liu, Minghao Qin, Kaixin Liang, Paolo Rota, Nicu Sebe, Zheng Liu, and Lizi Liao. Video-browsecomp: Benchmarking agentic video research on open web. arXiv preprint arXiv:2512.23044, 2025
[56] Sunghwan Kim, Ryang Heo, Yongsik Seo, Jinyoung Yeo, and Dongha Lee. Agenticshop: Benchmarking agentic product curation for personalized web shopping. In Proceedings of the ACM Web Conference 2026, pages 2489–2500, 2026
[57] xAI. grok-4.1-fast-reasoning, 2025. URL https://docs.x.ai/developers/models/grok-4-1-fast-reasoning
[58] OpenAI. Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023. URL https://api.semanticscholar.org/CorpusID:257532815
[59] Zachary Ankner, Mansheej Paul, Brandon Cui, Jonathan D Chang, and Prithviraj Ammanabrolu. Critique-out-loud reward models. arXiv preprint arXiv:2408.11791, 2024
[60] YiFan Zhang, Tao Yu, Haochen Tian, Chaoyou Fu, Peiyan Li, Jianshu Zeng, Wulin Xie, Yang Shi, Huanyu Zhang, Junkang Wu, Xue Wang, Yibo Hu, Bin Wen, Tingting Gao, Zhang Zhang, Fan Yang, Di ZHANG, Liang Wang, and Rong Jin. MM-RLHF: The next step forward in multimodal LLM alignment. In Forty-second International Conference on Machine Learning, 2025. URL https...
[61] Jeongeun Lee, Ryang Heo, and Dongha Lee. Personalized reward modeling for text-to-image generation. arXiv preprint arXiv:2511.19458, 2025
[62] Yankai Yang, Yancheng Long, Hongyang Wei, Wei Chen, Tianke Zhang, Kaiyu Jiang, Haonan Fan, Changyi Liu, Jiankang Chen, Kaiyu Tang, et al. Joint reward modeling: Internalizing chain-of-thought for efficient visual reward models. arXiv preprint arXiv:2602.07533, 2026
[63] Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu, Branislav Kveton, Yufan Zhou, Jiuxiang Gu, Jian Chen, and Changyou Chen. Multimodal llms as customized reward models for text-to-image generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 19638–19648, 2025
[64] Ren-Meng Cao, Xiao Fan Liu, and Xiao-Ke Xu. Why cannot long-term cascade be predicted? exploring temporal dynamics in information diffusion processes. Royal Society Open Science, 8(9), 2021
[65] Stephen L France, Mahyar Sharif Vaghefi, and Huimin Zhao. Characterizing viral videos: Methodology and applications. Electronic Commerce Research and Applications, 19:19–32, 2016
[66] Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Liang Wang, Weizhu Chen, et al. Lora: Low-rank adaptation of large language models. ICLR, 1(2):3, 2022
[67] An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. Qwen3 technical report. arXiv preprint arXiv:2505.09388, 2025
[68] Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, and Mike Lewis. Generalization through memorization: Nearest neighbor language models. In International Conference on Learning Representations, 2020
[69] Rishabh Bhardwaj, George Polovets, and Monica Sunkara. Adaptation approaches for nearest neighbor language models. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Findings of the Association for Computational Linguistics: ACL 2023, pages 1135–1146, Toronto, Canada, July 2023. Association for Computational Linguistics. doi: 10.18653/v1/2023.findings-acl.73. URL https://aclanthology.org/2023.findings-acl.73/
[71] Yifan Zhang, Xue Wang, Kexin Jin, Kun Yuan, Zhang Zhang, Liang Wang, Rong Jin, and Tieniu Tan. Adanpc: Exploring non-parametric classifier for test-time adaptation. In International conference on machine learning, pages 41647–41676. PMLR, 2023
[72] Sisuo Lyu, Siru Zhong, Tiegang Chen, Weilin Ruan, Qingxiang Liu, Taiqiang Lv, Qingsong Wen, Raymond Chi-Wing Wong, and Yuxuan Liang. Ts-memory: Plug-and-play memory for time series foundation models. arXiv preprint arXiv:2602.11550, 2026
[73] Xiao Wang, Tianze Chen, Qiming Ge, Han Xia, Rong Bao, Rui Zheng, Qi Zhang, Tao Gui, and Xuan-Jing Huang. Orthogonal subspace learning for language model continual learning. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 10658–10671, 2023
[74] Yaqian Zhang, Bernhard Pfahringer, Eibe Frank, Albert Bifet, Nick Jin Sean Lim, and Alvin Jia. A simple but strong baseline for online continual learning: Repeated augmented rehearsal. In Alice H. Oh, Alekh Agarwal, Danielle Belgrave, and Kyunghyun Cho, editors, Advances in Neural Information Processing Systems, 2022. URL https://openreview.net/forum?id=bhvUOhnsgZ
[75] Jieyong Kim, Ryang Heo, Yongsik Seo, SeongKu Kang, Jinyoung Yeo, and Dongha Lee. Self-consistent reasoning-based aspect-sentiment quad prediction with extract-then-assign strategy. In Findings of the Association for Computational Linguistics: ACL 2024, pages 7295–7303, 2024
[76] Yongsik Seo, Sungwon Song, Ryang Heo, Jieyong Kim, and Dongha Lee. Make compound sentences simple to analyze: Learning to split sentences for aspect-based sentiment analysis. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 11171–11184, 2024
[77] Sangam Lee, Ryang Heo, SeongKu Kang, and Dongha Lee. Imagine all the relevance: Scenario-profiled indexing with knowledge expansion for dense retrieval. In Second Conference on Language Modeling, 2025
[78] Xianming Li and Jing Li. Angle-optimized text embeddings. arXiv preprint arXiv:2309.12871, 2023
[79] Ryang Heo, Yongsik Seo, Junseong Lee, and Dongha Lee. Can large language models be effective online opinion miners? In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 23108–23147, 2025.

A Limitations and Broader Impacts

Limitations. While our results confirm the value of open-web grounding and trend-aware adap...

[1] Jingyuan Chen, Xuemeng Song, Liqiang Nie, Xiang Wang, Hanwang Zhang, and Tat-Seng Chua. Micro tells macro: Predicting the popularity of micro-videos via a transductive model. In Proceedings of the 24th ACM International Conference on Multimedia, pages 898–907, 2016. doi: 10.1145/2964284.2964314

[2] Bo Wu, Wen-Huang Cheng, Peiye Liu, Bei Liu, Zhaoyang Zeng, and Jiebo Luo. Smp challenge: An overview of social media prediction challenge 2019. In Proceedings of the 27th ACM International Conference on Multimedia, pages 2667–2671, 2019

[3] Liliang Ye, Yunyao Zhang, Yafeng Wu, Yi-Ping Phoebe Chen, Junqing Yu, Wei Yang, and Zikai Song. Mvp: Winning solution to smp challenge 2025 video track. In Proceedings of the ACM International Conference on Multimedia, pages 14079–14085, 2025. doi: 10.1145/3746027.3763761

[4] Jiayi Xie, Yaochen Zhu, Zhibin Zhang, Jian Peng, Jing Yi, Yaosi Hu, Hongyi Liu, and Zhenzhong Chen. A multimodal variational encoder-decoder framework for micro-video popularity prediction. In Proceedings of The Web Conference 2020, WWW ’20, page 2542–2548, New York, NY, USA, 2020. Association for Computing Machinery. ISBN 9781450370233. doi: 10.1145/3366423.3380004

[5] Ting Zhong, Jian Lang, Yifan Zhang, Zhangtao Cheng, Kunpeng Zhang, and Fan Zhou. Predicting micro-video popularity via multi-modal retrieval augmentation. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2579–2583, 2024. doi: 10.1145/3626772.3657929

[6] Zhangtao Cheng, Jian Lang, Ting Zhong, and Fan Zhou. Seeing the unseen in micro-video popularity prediction: Self-correlation retrieval for missing modality generation. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 142–152, 2025. doi: 10.1145/3690624.3709308

[7] Jack Hessel, Lillian Lee, and David Mimno. Cats and captions vs. creators and the clock: Comparing multimodal content to context in predicting relative popularity. In Proceedings of the 26th international conference on world wide web, pages 927–936, 2017

[8] Marian-Andrei Rizoiu, Lexing Xie, Scott Sanner, Manuel Cebrian, Honglin Yu, and Pascal Van Hentenryck. Expecting to be hip: Hawkes intensity processes for social media popularity. In Proceedings of the 26th International Conference on World Wide Web, WWW ’17, page 735–744, Republic and Canton of Geneva, CHE, 2017. International World Wide Web Conferences S... doi: 10.1145/3038912.3052650

[9] Zhangtao Cheng, Jienan Zhang, Xovee Xu, Goce Trajcevski, Ting Zhong, and Fan Zhou. Retrieval-augmented hypergraph for multimodal social media popularity prediction. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 445–455, 2024. doi: 10.1145/3637528.3672041

[10] Wei Chen, Jiao Li, Jian Lang, Zhangtao Cheng, Yong Wang, and Fan Zhou. Echoes in the feed: Evolution-aware prompt-augmented micro-video popularity prediction. In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2744–2748, 2025. doi: 10.1145/3726302.3730184

[11] Zhangtao Cheng, Jiao Li, Jian Lang, Ting Zhong, and Fan Zhou. In-context prompt-augmented micro-video popularity prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 11527–11535, 2025. doi: 10.1609/aaai.v39i11.33254

[12] Xovee Xu, Yifan Zhang, Fan Zhou, and Jingkuan Song. Improving multimodal social media popularity prediction via selective retrieval knowledge augmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 932–940, 2025. doi: 10.1609/aaai.v39i1.32078

[13] Yongxin Ni, Yu Cheng, Xiangyan Liu, Junchen Fu, Youhua Li, Xiangnan He, Yongfeng Zhang, and Fajie Yuan. A content-driven micro-video recommendation dataset at scale. arXiv preprint arXiv:2309.15379, 2023

[14] Yijie Xu, Bolun Zheng, Wei Zhu, Hangjia Pan, Yuchen Yao, Ning Xu, Anan Liu, Quan Zhang, and Chenggang Yan. Smtpd: A new benchmark for temporal prediction of social media popularity. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 18847–18857, 2025. doi: 10.1109/CVPR52734.2025.01756

[15] Xudong Gong, Qinlin Feng, Yuan Zhang, Jiangling Qin, Weijie Ding, Biao Li, Peng Jiang, and Kun Gai. Real-time short video recommendation on mobile devices. In Proceedings of the 31st ACM international conference on information & knowledge management, pages 3103–3112, 2022

[16] Bo Wu, Peiye Liu, Wen-Huang Cheng, Bei Liu, Zhaoyang Zeng, Jia Wang, Qiushi Huang, and Jiebo Luo. Smp challenge: An overview and analysis of social media prediction challenge. In Proceedings of the 31st ACM International Conference on Multimedia, pages 9651–9655, 2023

[17] Aditya Khosla, Atish Das Sarma, and Raffay Hamid. What makes an image popular? In Proceedings of the 23rd International Conference on World Wide Web (WWW), pages 867–876, 2014. doi: 10.1145/2566486.2567996

[18] Peiguang Jing, Yuting Su, Liqiang Nie, Xu Bai, Jing Liu, and Meng Wang. Low-rank multi-view embedding learning for micro-video popularity prediction. IEEE Transactions on Knowledge and Data Engineering (TKDE), 30(8):1519–1532, 2018. doi: 10.1109/TKDE.2017.2785784

[19] Junhong Chen, Dayong Liang, Zhanmo Zhu, Xiaojing Zhou, Zihan Ye, and Xiuyun Mo. Social media popularity prediction based on visual-textual features with xgboost. In Proceedings of the 27th ACM International Conference on Multimedia, pages 2692–2696, 2019

[20] Xin Lai, Yihong Zhang, and Wei Zhang. HyFea: Winning solution to social media popularity prediction for multimedia grand challenge 2020. In Proceedings of the 28th ACM International Conference on Multimedia (MM), pages 4565–4569, 2020. doi: 10.1145/3394171.3416275

[21] Jiayi Xie, Yaochen Zhu, and Zhenzhong Chen. Micro-video popularity prediction via multimodal variational information bottleneck. IEEE Transactions on Multimedia, 25:24–37, 2021

[22] Tsun-hin Cheung and Kin-man Lam. Crossmodal bipolar attention for multimodal classification on social media. Neurocomputing, 514:1–12, 2022

[23] Zhuoran Zhang, Shibiao Xu, Li Guo, and Wenke Lian. Multi-modal variational auto-encoder model for micro-video popularity prediction. In Proceedings of the 8th International Conference on Communication and Information Processing (ICCIP), pages 9–16, 2022. doi: 10.1145/3571662.3571664

[24] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021

[25] Yali Du, Yinwei Wei, Wei Ji, Fan Liu, Xin Luo, and Liqiang Nie. Multi-queue momentum contrast for microvideo-product retrieval. In Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, pages 1003–1011, 2023

[26] Wenhao Hu, Weilong Chen, Weimin Yuan, Yan Wang, Shimin Cai, and Yanru Zhang. Dual-stream pre-training transformer to enhance multimodal learning for social media prediction. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 11450–11456, 2024

[27] Mingsheng Tu, Tianjiao Wan, Qisheng Xu, Xinhao Jiang, Kele Xu, and Cheng Yang. Higher-order vision-language alignment for social media prediction. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 11457–11463, 2024

[28] Adilbek Karmanov, Dayan Guan, Shijian Lu, Abdulmotaleb El Saddik, and Eric Xing. Efficient test-time adaptation of vision-language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14162–14171, 2024

[29] Maxime Zanella, Clément Fuchs, Christophe De Vleeschouwer, and Ismail Ben Ayed. Realistic test-time adaptation of vision-language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 25103–25112, 2025

[30] Chentao Cao, Zhun Zhong, Zhanke Zhou, Tongliang Liu, Yang Liu, Kun Zhang, and Bo Han. Noisy test-time adaptation in vision-language models. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=iylpeTI0Ql

[31] Zongbo Han, Jialong Yang, Guangyu Wang, Junfan Li, Qianli Xu, Mike Zheng Shou, and Changqing Zhang. Dota: Distributional test-time adaptation of vision-language models. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

[32] [32]

Lightweight online adaption for time series foundation model forecasts

Thomas L Lee, William Toner, Rajkarn Singh, Artjom Joosen, and Martin Asenov. Lightweight online adaption for time series foundation model forecasts. InForty-second International Conference on Machine Learning, 2025. URLhttps://openreview.net/forum?id=gAxYbvoOQz

work page 2025

[33] [33]

2020–2031

Lifan Zhao and Yanyan Shen. Proactive model adaptation against concept drift for online time series forecasting. InProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 2020–2031, 2025. doi: 10.1145/3690624.3709210

work page doi:10.1145/3690624.3709210 2020

[34] [34]

Fast and slow streams for online time series forecast- ing without information leakage

Ying yee Ava Lau, Zhiwen Shao, and Dit-Yan Yeung. Fast and slow streams for online time series forecast- ing without information leakage. InThe Thirteenth International Conference on Learning Representations,

work page

[35] [35]

URLhttps://openreview.net/forum?id=I0n3EyogMi

work page

[36] [36]

Continual collaborative distillation for recommender system

Gyuseok Lee, SeongKu Kang, Wonbin Kweon, and Hwanjo Yu. Continual collaborative distillation for recommender system. InProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD ’24, page 1495–1505, New York, NY , USA, 2024. Association for Computing Machinery. ISBN 9798400704901. doi: 10.1145/3637528.3671924. URL https://do...

work page doi:10.1145/3637528.3671924 2024

[37] [37]

Mitigating distribution shifts in sequential recommendation: An invariance perspective

Yuxin Liao, Yonghui Yang, Min Hou, Le Wu, Hefei Xu, and Hao Liu. Mitigating distribution shifts in sequential recommendation: An invariance perspective. InProceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1603–1613, 2025

work page 2025

[38] [38]

Online drift detection with maximum concept discrepancy

Ke Wan, Yi Liang, and Susik Yoon. Online drift detection with maximum concept discrepancy. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 2924–2935, 2024. doi: 10.1145/3637528.3672016

work page doi:10.1145/3637528.3672016 2024

[39] [39]

Inflora: Interference-free low-rank adaptation for continual learning

Yan-Shuo Liang and Wu-Jun Li. Inflora: Interference-free low-rank adaptation for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23638– 23647, 2024

work page 2024

[40] [40]

Online-lora: Task-free online continual learning via low rank adaptation

Xiwen Wei, Guihong Li, and Radu Marculescu. Online-lora: Task-free online continual learning via low rank adaptation. InProceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

work page 2025

[41] [41]

Gated integration of low-rank adaptation for continual learning of large language models

Yan-Shuo Liang, Jiarui Chen, and Wu-Jun Li. Gated integration of low-rank adaptation for continual learning of large language models. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems

work page

[42] [42]

Hierarchical knowledge prompt tuning for multi-task test-time adaptation

Qiang Zhang, Mengsheng Zhao, Jiawei Liu, Fanrui Zhang, Yongchao Xu, and Zheng-Jun Zha. Hierarchical knowledge prompt tuning for multi-task test-time adaptation. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 30524–30533, 2025

work page 2025

[43] [43]

Dpcore: Dynamic prompt coreset for continual test-time adaptation

Yunbei Zhang, Akshay Mehra, Shuaicheng Niu, and Jihun Hamm. Dpcore: Dynamic prompt coreset for continual test-time adaptation. InForty-second International Conference on Machine Learning

work page

[44] [44]

Forecasting the buzz: Enriching hashtag popularity prediction with llm reasoning

Yifei Xu, Jiaying Wu, Herun Wan, Yang Li, Zhen Hou, and Min-Yen Kan. Forecasting the buzz: Enriching hashtag popularity prediction with llm reasoning. InProceedings of the 34th ACM International Conference on Information and Knowledge Management, pages 5396–5400, 2025. doi: 10.1145/3746252.3760970

work page doi:10.1145/3746252.3760970 2025

[45] [45]

Mmsum: A dataset for multimodal summarization and thumbnail generation of videos

Jielin Qiu, Jiacheng Zhu, William Han, Aditesh Kumar, Karthik Mittal, Claire Jin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Ding Zhao, et al. Mmsum: A dataset for multimodal summarization and thumbnail generation of videos. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21909–21921, 2024

work page 2024

[46] [46]

Hippo-video: Simulating watch histories with large language models for personalized video highlighting

Jeongeun Lee, Youngjae Yu, and Dongha Lee. Hippo-video: Simulating watch histories with large language models for personalized video highlighting. InConference on Language Modeling, 2025. URL https://arxiv.org/abs/2507.16873. Published as a conference paper at COLM 2025

work page arXiv 2025

[47] [47]

Towards automatic learning of procedures from web instructional videos

Luowei Zhou, Chenliang Xu, and Jason Corso. Towards automatic learning of procedures from web instructional videos. InProceedings of the AAAI conference on artificial intelligence, volume 32, 2018

work page 2018

[48] [48]

Howto100m: Learning a text-video embedding by watching hundred million narrated video clips

Antoine Miech, Dimitri Zhukov, Jean-Baptiste Alayrac, Makarand Tapaswi, Ivan Laptev, and Josef Sivic. Howto100m: Learning a text-video embedding by watching hundred million narrated video clips. In Proceedings of the IEEE/CVF international conference on computer vision, pages 2630–2640, 2019

work page 2019

[49] [49]

Robust speech recognition via large-scale weak supervision

Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, and Ilya Sutskever. Robust speech recognition via large-scale weak supervision. InInternational conference on machine learning, pages 28492–28518. PMLR, 2023. 12

work page 2023

[50] [50]

Perplexity ai.https://www.perplexity.ai/, 2024

Perplexity AI. Perplexity ai.https://www.perplexity.ai/, 2024. Accessed: 2025-05-08

work page 2024

[51] OpenAI. Introducing chatgpt search, 2024. URL https://openai.com/index/introducing-chatgpt-search/

[52] Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, et al. Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. arXiv preprint arXiv:2507.06261, 2025.

[53] Xiaoxi Li, Guanting Dong, Jiajie Jin, Yuyao Zhang, Yujia Zhou, Yutao Zhu, Peitian Zhang, and Zhicheng Dou. Search-o1: Agentic search-enhanced large reasoning models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 5420–5438, 2025.

[54] Mihran Miroyan, Tsung-Han Wu, Logan King, Tianle Li, Jiayi Pan, Xinyan Hu, Wei-Lin Chiang, Anastasios Nikolas Angelopoulos, Trevor Darrell, Narges Norouzi, and Joseph E. Gonzalez. Search arena: Analyzing search-augmented LLMs. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=MMGRlDnhtI

[55] Zhengyang Liang, Yan Shu, Xiangrui Liu, Minghao Qin, Kaixin Liang, Paolo Rota, Nicu Sebe, Zheng Liu, and Lizi Liao. Video-browsecomp: Benchmarking agentic video research on open web. arXiv preprint arXiv:2512.23044, 2025.

[56] Sunghwan Kim, Ryang Heo, Yongsik Seo, Jinyoung Yeo, and Dongha Lee. Agenticshop: Benchmarking agentic product curation for personalized web shopping. In Proceedings of the ACM Web Conference 2026, pages 2489–2500, 2026.

[57] xAI. grok-4.1-fast-reasoning, 2025. URL https://docs.x.ai/developers/models/grok-4-1-fast-reasoning

[58] OpenAI. Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023. URL https://api.semanticscholar.org/CorpusID:257532815

[59] Zachary Ankner, Mansheej Paul, Brandon Cui, Jonathan D Chang, and Prithviraj Ammanabrolu. Critique-out-loud reward models. arXiv preprint arXiv:2408.11791, 2024.

[60] YiFan Zhang, Tao Yu, Haochen Tian, Chaoyou Fu, Peiyan Li, Jianshu Zeng, Wulin Xie, Yang Shi, Huanyu Zhang, Junkang Wu, Xue Wang, Yibo Hu, Bin Wen, Tingting Gao, Zhang Zhang, Fan Yang, Di ZHANG, Liang Wang, and Rong Jin. MM-RLHF: The next step forward in multimodal LLM alignment. In Forty-second International Conference on Machine Learning, 2025. URL https...

[61] Jeongeun Lee, Ryang Heo, and Dongha Lee. Personalized reward modeling for text-to-image generation. arXiv preprint arXiv:2511.19458, 2025.

[62] Yankai Yang, Yancheng Long, Hongyang Wei, Wei Chen, Tianke Zhang, Kaiyu Jiang, Haonan Fan, Changyi Liu, Jiankang Chen, Kaiyu Tang, et al. Joint reward modeling: Internalizing chain-of-thought for efficient visual reward models. arXiv preprint arXiv:2602.07533, 2026.

[63] Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu, Branislav Kveton, Yufan Zhou, Jiuxiang Gu, Jian Chen, and Changyou Chen. Multimodal llms as customized reward models for text-to-image generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 19638–19648, 2025.

[64] Ren-Meng Cao, Xiao Fan Liu, and Xiao-Ke Xu. Why cannot long-term cascade be predicted? exploring temporal dynamics in information diffusion processes. Royal Society Open Science, 8(9), 2021.

[65] Stephen L France, Mahyar Sharif Vaghefi, and Huimin Zhao. Characterizing viral videos: Methodology and applications. Electronic Commerce Research and Applications, 19:19–32, 2016.

[66] Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Liang Wang, Weizhu Chen, et al. Lora: Low-rank adaptation of large language models. ICLR, 1(2):3, 2022.

[67] An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. Qwen3 technical report. arXiv preprint arXiv:2505.09388, 2025.

[68] Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, and Mike Lewis. Generalization through memorization: Nearest neighbor language models. In International Conference on Learning Representations.

[69] Rishabh Bhardwaj, George Polovets, and Monica Sunkara. Adaptation approaches for nearest neighbor language models. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Findings of the Association for Computational Linguistics: ACL 2023, pages 1135–1146, Toronto, Canada, July 2023. Association for Computational Linguistics. doi: 10.18653/v1/2023.findings-acl.73. URL https://aclanthology.org/2023.findings-acl.73/

[71] Yifan Zhang, Xue Wang, Kexin Jin, Kun Yuan, Zhang Zhang, Liang Wang, Rong Jin, and Tieniu Tan. Adanpc: Exploring non-parametric classifier for test-time adaptation. In International Conference on Machine Learning, pages 41647–41676. PMLR, 2023.

[72] Sisuo Lyu, Siru Zhong, Tiegang Chen, Weilin Ruan, Qingxiang Liu, Taiqiang Lv, Qingsong Wen, Raymond Chi-Wing Wong, and Yuxuan Liang. Ts-memory: Plug-and-play memory for time series foundation models. arXiv preprint arXiv:2602.11550, 2026.

[73] Xiao Wang, Tianze Chen, Qiming Ge, Han Xia, Rong Bao, Rui Zheng, Qi Zhang, Tao Gui, and Xuan-Jing Huang. Orthogonal subspace learning for language model continual learning. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 10658–10671, 2023.

[74] Yaqian Zhang, Bernhard Pfahringer, Eibe Frank, Albert Bifet, Nick Jin Sean Lim, and Alvin Jia. A simple but strong baseline for online continual learning: Repeated augmented rehearsal. In Alice H. Oh, Alekh Agarwal, Danielle Belgrave, and Kyunghyun Cho, editors, Advances in Neural Information Processing Systems, 2022. URL https://openreview.net/forum?id=bhvUOhnsgZ

[75] Jieyong Kim, Ryang Heo, Yongsik Seo, SeongKu Kang, Jinyoung Yeo, and Dongha Lee. Self-consistent reasoning-based aspect-sentiment quad prediction with extract-then-assign strategy. In Findings of the Association for Computational Linguistics: ACL 2024, pages 7295–7303, 2024.

[76] Yongsik Seo, Sungwon Song, Ryang Heo, Jieyong Kim, and Dongha Lee. Make compound sentences simple to analyze: Learning to split sentences for aspect-based sentiment analysis. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 11171–11184, 2024.

[77] Sangam Lee, Ryang Heo, SeongKu Kang, and Dongha Lee. Imagine all the relevance: Scenario-profiled indexing with knowledge expansion for dense retrieval. In Second Conference on Language Modeling.

[78] Xianming Li and Jing Li. Angle-optimized text embeddings. arXiv preprint arXiv:2309.12871, 2023.

[79] Ryang Heo, Yongsik Seo, Junseong Lee, and Dongha Lee. Can large language models be effective online opinion miners? In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 23108–23147, 2025.

A Limitations and Broader Impacts

Limitations. While our results confirm the value of open-web grounding and trend-aware adap...