Chat-REC: Towards Interactive and Explainable LLMs-Augmented Recommender System
Canonical reference. 100% of citing Pith papers cite this work as background.
Citation roles: background (5)
Citation polarities: background (5)
Representative citing papers
- EgoIntrospect: An Egocentric Dataset and Benchmark for User-Centric Internal State Reasoning
  EgoIntrospect provides the first egocentric dataset with self-annotations for internal state tasks and shows multimodal LLMs struggle to infer subjective states from combined signals.
- SAGER: Self-Evolving User Policy Skills for Recommendation Agent
  SAGER equips LLM recommendation agents with per-user evolving policy skills via two-representation architecture, contrastive CoT diagnosis, and skill-augmented listwise reasoning, yielding SOTA gains orthogonal to memory accumulation.
- Retrieval Augmented Conversational Recommendation with Reinforcement Learning
  RAR retrieves candidate items from a 300k-movie corpus then uses LLM generation with RL feedback to produce context-aware recommendations that outperform baselines on benchmarks.
- Fusion and Alignment Enhancement with Large Language Models for Tail-item Sequential Recommendation
  FAERec fuses collaborative ID embeddings with LLM semantic embeddings using adaptive gating and dual-level alignment to enhance tail-item sequential recommendations.
- VidHal: Benchmarking Temporal Hallucinations in Vision LLMs
  VidHal is a new benchmark that evaluates VLLM temporal hallucinations through a caption ordering task on videos with varying hallucination levels.
- Do Recommendation Algorithms Work When Users Are LLM Agents? A Case Study on Moltbook
  On the Moltbook platform populated by LLM agents, popularity-based and item-side collaborative filtering methods outperform user-representation techniques for predicting next forum engagement.
- Intuition-Guided Latent Reasoning for LLM-Based Recommendation
  IntuRec anchors LLM latent reasoning for recommendation by deriving an intuition embedding from top-K candidates via self- and cross-attention to initialize more accurate trajectories.
- Trustworthy Recommendation in the Era of Large Language Models: Opportunities and Challenges
  A systematic review of over 200 studies concludes that LLMs in recommender systems act as a double-edged sword, creating both opportunities and new risks for trustworthiness.
- Decision-aware User Simulation Agent for Evaluating Conversational Recommender Systems
  Hesitator is a theory-grounded simulator that separates utility-based item selection from overload-aware commitment decisions to reduce unrealistic high acceptance rates in conversational recommender evaluations.
- DynamicPO: Dynamic Preference Optimization for Recommendation
  DynamicPO adds dynamic boundary-negative selection and dual-margin beta adjustment to multi-negative DPO to avoid gradient suppression and improve recommendation accuracy.
- HingeMem: Boundary Guided Long-Term Memory with Query Adaptive Retrieval for Scalable Dialogues
  HingeMem segments dialogue memory via boundary-triggered hyperedges over four elements and applies query-adaptive retrieval, yielding ~20% relative gains and 68% lower QA token cost versus baselines on LOCOMO.
- MATRAG: Multi-Agent Transparent Retrieval-Augmented Generation for Explainable Recommendations
  MATRAG deploys four agents (user modeling, item analysis, reasoning, explanation) plus knowledge-graph retrieval and a transparency score to raise hit rate 12.7% and NDCG 15.3% while producing explanations rated helpful by 87.4% of experts.
- SpecTran: Spectral-Aware Transformer-based Adapter for LLM-Enhanced Sequential Recommendation
  SpecTran applies a spectral-aware transformer adapter with learnable position encoding to aggregate informative components across the full spectrum of LLM embeddings, yielding 9.17% average gains on sequential recommendation tasks.
- From Past To Path: Masked History Learning for Next-Item Prediction in Generative Recommendation
  Masked History Learning augments autoregressive training in generative recommenders with an auxiliary masked historical item reconstruction task using entropy-guided masking and curriculum learning.
- Hallucinations are inevitable but can be made statistically negligible
  Hallucinations are inevitable on an infinite set of inputs but can be made statistically negligible with sufficient training data quality and quantity.
- ShopX: A Foundation Model for Intent-to-Item Fulfillment in Agentic Shopping
  ShopX is a single foundation model combining intent understanding, planning, and SID-native item fulfillment for agentic shopping, with claimed improvements over tool-mediated systems on Taobao logs.
- RcLLM: Accelerating Generative Recommendation via Beyond-Prefix KV Caching
  RcLLM accelerates generative recommendation inference by 1.31x-9.51x in TTFT through beyond-prefix KV caching, replicated user caches, sharded item caches, affinity scheduling, and selective attention with negligible accuracy loss.
- Multimodal Large Language Models with Adaptive Preference Optimization for Sequential Recommendation
  HaNoRec dynamically weights harder preference samples and applies Gaussian perturbations to output distributions to improve multimodal LLM performance on sequential recommendation tasks.
- A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
  The paper surveys hallucination in LLMs with an innovative taxonomy, factors, detection methods, benchmarks, mitigation strategies, and open research directions.
- An LLM-Powered Semantic Alignment Framework for Journal Recommendation
  An LLM semantic-matching framework for journal recommendation reports 40.23% Top-3 accuracy on 23,609 statistics articles from 49 journals without task-specific training.
- MedicalRec: Medical recommender system for image classification without retraining
  A transformer recommender system trained on a new benchmark of over 5,000 model performances from medical imaging papers achieves up to 75.5% HitRate@100.
- Bridging Perception and Action: A Lightweight Multimodal Meta-Planner Framework for Robust Earth Observation Agents
  The LMMP framework improves tool-calling accuracy and task success rates for Earth observation agents by grounding plans in multimodal features and remote sensing expert knowledge via a two-stage training process.
- Retrieval-Augmented Generation for Large Language Models: A Survey
  A survey of RAG paradigms, components, benchmarks, and challenges for improving LLMs on knowledge-intensive tasks.