LLMs exhibit a mid-layer representation advantage for recommendation; MARC compresses representations modularly to reduce cost while improving performance, as shown in a large-scale online advertising deployment.
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.IR (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
STAMP mitigates semantic dilution in SID-based generative recommendation via adaptive input pruning and densified output supervision, delivering 1.23-1.38x speedup and 17-55% VRAM savings with maintained or improved accuracy.
citing papers explorer
- Modular Representation Compression: Adapting LLMs for Efficient and Effective Recommendations
  LLMs exhibit a mid-layer representation advantage for recommendation; MARC compresses representations modularly to reduce cost while improving performance, as shown in a large-scale online advertising deployment.
- Semantic Trimming and Auxiliary Multi-step Prediction for Generative Recommendation
  STAMP mitigates semantic dilution in SID-based generative recommendation via adaptive input pruning and densified output supervision, delivering 1.23-1.38x speedup and 17-55% VRAM savings with maintained or improved accuracy.