
arxiv: 1905.06874 · v1 · pith:I6P4FY2Gnew · submitted 2019-05-15 · 💻 cs.IR · cs.LG

Behavior Sequence Transformer for E-commerce Recommendation in Alibaba

classification 💻 cs.IR cs.LG
keywords recommendation alibaba behavior features model online sequential then

Deep learning based methods have been widely used in industrial recommendation systems (RSs). Previous works adopt an Embedding&MLP paradigm: raw features are embedded into low-dimensional vectors, which are then fed into an MLP for final recommendations. However, most of these works simply concatenate different features, ignoring the sequential nature of users' behaviors. In this paper, we propose to use the powerful Transformer model to capture the sequential signals underlying users' behavior sequences for recommendation in Alibaba. Experimental results demonstrate the superiority of the proposed model, which is then deployed online at Taobao and obtains significant improvements in online Click-Through-Rate (CTR) compared to two baselines.
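As a rough illustration of the idea in the abstract, the sketch below runs single-head scaled dot-product self-attention over a toy behavior sequence before flattening it for an MLP. It is a minimal sketch under stated assumptions, not the paper's implementation: the embeddings are made up, the query/key/value projections are taken to be the identity, and `add_positions` is a hypothetical position feature introduced only so that the order of clicks affects the output.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def add_positions(seq):
    # Hypothetical position feature (an assumption for illustration):
    # append the normalized index of each item to its embedding.
    n = len(seq)
    return [vec + [i / n] for i, vec in enumerate(seq)]


def self_attention(seq):
    # Single-head scaled dot-product self-attention.
    # Assumption: Q, K and V projections are the identity, so each output
    # vector is a softmax-weighted average of the input vectors.
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, seq)) for j in range(d)])
    return out


def encode(clicked, target):
    # Treat the target item as the last element of the sequence
    # (an illustrative choice), attend, then flatten for a downstream MLP.
    seq = add_positions(clicked + [target])
    encoded = self_attention(seq)
    return [x for vec in encoded for x in vec]


# Toy 2-dimensional item embeddings (made-up values).
clicked = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [0.5, 0.5]

mlp_input = encode(clicked, target)
```

Because the position feature is part of each vector, reversing the click history changes `mlp_input`, which is the kind of order sensitivity that position-free pooling of the same embeddings would not provide.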



Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. FEDIN: Frequency-Enhanced Deep Interest Network for Click-Through Rate Prediction

    cs.IR 2026-05 unverdicted novelty 6.0

    FEDIN improves CTR prediction by using target-aware frequency filtering to isolate low-entropy periodic interest signals from high-entropy noise in user attention patterns.

  2. Make It Long, Keep It Fast: End-to-End 10K Long User Behavior Sequence Modeling for Billion-Scale Douyin Recommendation

    cs.LG 2025-11 conditional novelty 6.0

    Douyin deploys stacked target-to-history cross attention and request-level batching to scale end-to-end recommendation modeling to 10k-length histories, observing scaling-law gains and live engagement improvements.

  3. Make It Long, Keep It Fast: End-to-End 10K Long User Behavior Sequence Modeling for Billion-Scale Douyin Recommendation

    cs.LG 2025-11 unverdicted novelty 5.0

    Introduces STCA for linear-complexity target-to-history attention, RLB for shared user encoding across targets, and length-extrapolative training to enable end-to-end 10K sequence modeling with observed scaling-law ga...