Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate
1 Pith paper cites this work.
abstract
Estimating post-click conversion rate (CVR) accurately is crucial for ranking systems in industrial applications such as recommendation and advertising. Conventional CVR modeling applies popular deep learning methods and achieves state-of-the-art performance. However, it encounters several task-specific problems in practice, making CVR modeling challenging. For example, conventional CVR models are trained on samples of clicked impressions but are used for inference on the entire space, which contains samples of all impressions. This causes a sample selection bias problem. In addition, there is an extreme data sparsity problem, which makes the model difficult to fit. In this paper, we model CVR from a brand-new perspective by making good use of the sequential pattern of user actions, i.e., impression -> click -> conversion. The proposed Entire Space Multi-task Model (ESMM) can eliminate the two problems simultaneously by i) modeling CVR directly over the entire space and ii) employing a feature representation transfer learning strategy. Experiments on a dataset gathered from Taobao's recommender system demonstrate that ESMM significantly outperforms competitive methods. We also release a sampled version of this dataset to enable future research. To the best of our knowledge, this is the first public dataset that contains samples with sequential dependence of click and conversion labels for CVR modeling.
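As a rough illustration of what "modeling CVR directly over the entire space" can look like, the sketch below assumes the factorization suggested by the impression -> click -> conversion sequence, pCTCVR = pCTR × pCVR, and supervises only quantities defined on all impressions (the click label and the click-and-conversion label). The function names, the use of binary cross-entropy, and the plain-Python probabilities standing in for network outputs are assumptions made for illustration; the abstract does not specify these details.

```python
import math

def bce(label, prob, eps=1e-12):
    """Binary cross-entropy for one sample, with clamping for numerical safety."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def esmm_style_loss(p_ctr, p_cvr, clicks, conversions):
    """Average loss over ALL impressions (hypothetical helper, not the authors' code).

    p_ctr[i]       : predicted click probability for impression i
    p_cvr[i]       : predicted conversion probability given a click
    clicks[i]      : 1 if impression i was clicked, else 0
    conversions[i] : 1 if impression i converted, else 0
    """
    total = 0.0
    for ctr, cvr, y, z in zip(p_ctr, p_cvr, clicks, conversions):
        p_ctcvr = ctr * cvr           # assumed factorization: pCTCVR = pCTR * pCVR
        total += bce(y, ctr)          # CTR task, label defined on every impression
        total += bce(y * z, p_ctcvr)  # CTCVR task, label is click AND conversion
    return total / len(p_ctr)

# Three impressions: not clicked, clicked without conversion, clicked and converted.
loss = esmm_style_loss(
    p_ctr=[0.1, 0.6, 0.8],
    p_cvr=[0.2, 0.1, 0.7],
    clicks=[0, 1, 1],
    conversions=[0, 0, 1],
)
```

In this sketch the CVR estimate is never trained against a clicked-only subset: it receives its signal through the product term, which is evaluated on every impression. That is one way to read how the sample selection bias described above could be avoided.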
fields: cs.IR (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Mind the Gap: Bridging Behavioral Silos with LLMs in Multi-Vertical Recommendations
LLM-driven feature synthesis from data-rich verticals improves MTL ranking models in data-sparse verticals via taxonomic features from user histories.