DyKnow-RAG uses Group Relative Policy Optimization with dual-group rollouts and posterior-driven advantage scaling to optimize context utilization in RAG for e-commerce relevance, showing offline gains and production lifts when deployed at Taobao.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.IR (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Learning to Trust: Dynamic Utilization of Retrieval-Augmented Generation for E-commerce Search Relevance