F-GRPO factorizes group-relative policy optimization into generation and ranking phases within one autoregressive sequence, using order-invariant coverage and position-aware utility rewards to improve top-ranked performance on recommendation and multi-hop QA tasks.
11 Pith papers cite this work.
citing papers explorer
- F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking
  F-GRPO factorizes group-relative policy optimization into generation and ranking phases within one autoregressive sequence, using order-invariant coverage and position-aware utility rewards to improve top-ranked performance on recommendation and multi-hop QA tasks.
- LLMAR: A Tuning-Free Recommendation Framework for Sparse and Text-Rich Industrial Domains
  LLMAR applies LLM reasoning with a self-correction reflection loop to generate semantic user motives for tuning-free recommendations, showing up to 54.6% nDCG@10 gains on a sparse industrial dataset over trained baselines.
- SRBench: A Comprehensive Benchmark for Sequential Recommendation with Large Language Models
  SRBench is a multi-dimensional benchmark for sequential recommendation that uses prompt engineering and a coupled extraction mechanism to support fair evaluation of both neural-network and LLM-based models across accuracy, fairness, stability, and efficiency.
- SpecTran: Spectral-Aware Transformer-based Adapter for LLM-Enhanced Sequential Recommendation
  SpecTran applies a spectral-aware transformer adapter with learnable position encoding to aggregate informative components across the full spectrum of LLM embeddings, yielding 9.17% average gains on sequential recommendation tasks.
- Training LLMs with Reinforcement Learning for Intent-Aware Personalized Question Answering
  IAP uses RL to train LLMs to explicitly infer and apply implicit user intent in single-turn personalized QA, achieving ~7.5% average macro-score gains over baselines on LaMP-QA.
- RcLLM: Accelerating Generative Recommendation via Beyond-Prefix KV Caching
  RcLLM accelerates generative recommendation inference by 1.31x-9.51x in TTFT through beyond-prefix KV caching, replicated user caches, sharded item caches, affinity scheduling, and selective attention with negligible accuracy loss.
- A Comparative Study of Demonstration Selection for Practical Large Language Models-based Next POI Prediction
  Heuristic demonstration selection methods outperform embedding-based methods for practical LLM-based next POI prediction on three real-world datasets.
- LLM-based User Profile Management for Recommender System
  PURE is a three-component LLM system that extracts and maintains user profiles from reviews to outperform prior LLM recommenders on sequential Amazon tasks.
- A Reproducibility Analysis of PO4ISR: Diagnosing and Mitigating Semantic Drift in LLM-Based Session Recommendation
  Reproducibility study diagnoses semantic drift in PO4ISR and introduces PO4ISR++ with reflexive prompting that restores performance with gains up to 54% on Games and 96% on Bundle.
- Efficient LLM-based Advertising via Model Compression and Parallel Verification
  An Efficient Generative Targeting framework accelerates LLM inference in advertising via adaptive group quantization, layer-adaptive hierarchical sparsification, and prefix-tree parallel verification while accepting limited quality degradation.
- BEAR: Towards Beam-Search-Aware Optimization for Recommendation with Large Language Models