The paper proposes an item-aware attention mechanism with intra-item and inter-item layers to let LLMs capture item-level collaborative relations instead of only token-level ones.
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
representative citing papers
An Efficient Generative Targeting framework accelerates LLM inference in advertising via adaptive group quantization, layer-adaptive hierarchical sparsification, and prefix-tree parallel verification while accepting limited quality degradation.
citing papers explorer
- Beyond Tokens: Item-aware Attention for LLM-based Recommendation
  The paper proposes an item-aware attention mechanism with intra-item and inter-item layers to let LLMs capture item-level collaborative relations instead of only token-level ones.
- Efficient LLM-based Advertising via Model Compression and Parallel Verification
  An Efficient Generative Targeting framework accelerates LLM inference in advertising via adaptive group quantization, layer-adaptive hierarchical sparsification, and prefix-tree parallel verification while accepting limited quality degradation.