F-GRPO factorizes group-relative policy optimization into generation and ranking phases within one autoregressive sequence, using order-invariant coverage and position-aware utility rewards to improve top-ranked performance on recommendation and multi-hop QA tasks.
11 Pith papers cite this work.
representative citing papers
LLMAR applies LLM reasoning with a self-correction reflection loop to generate semantic user motives for tuning-free recommendations, showing up to 54.6% nDCG@10 gains on a sparse industrial dataset over trained baselines.
SRBench is a multi-dimensional benchmark for sequential recommendation that uses prompt engineering and a coupled extraction mechanism to support fair evaluation of both neural-network and LLM-based models across accuracy, fairness, stability, and efficiency.
SpecTran applies a spectral-aware transformer adapter with learnable position encoding to aggregate informative components across the full spectrum of LLM embeddings, yielding 9.17% average gains on sequential recommendation tasks.
IAP uses RL to train LLMs to explicitly infer and apply implicit user intent in single-turn personalized QA, achieving ~7.5% average macro-score gains over baselines on LaMP-QA.
RcLLM accelerates generative recommendation inference by 1.31x-9.51x in TTFT through beyond-prefix KV caching, replicated user caches, sharded item caches, affinity scheduling, and selective attention with negligible accuracy loss.
Heuristic demonstration selection methods outperform embedding-based methods for practical LLM-based next POI prediction on three real-world datasets.
PURE is a three-component LLM system that extracts and maintains user profiles from reviews to outperform prior LLM recommenders on sequential Amazon tasks.
A reproducibility study diagnoses semantic drift in PO4ISR and introduces PO4ISR++, whose reflexive prompting restores performance with gains of up to 54% on Games and 96% on Bundle.
An Efficient Generative Targeting framework accelerates LLM inference in advertising via adaptive group quantization, layer-adaptive hierarchical sparsification, and prefix-tree parallel verification while accepting limited quality degradation.
citing papers explorer
A Comparative Study of Demonstration Selection for Practical Large Language Models-based Next POI Prediction