EgoIntrospect provides the first egocentric dataset with self-annotations for internal state tasks and shows multimodal LLMs struggle to infer subjective states from combined signals.
Chat-REC: Towards Interactive and Explainable LLMs-Augmented Recommender System
Canonical reference. 100% of citing Pith papers cite this work as background.
Roles: background 5
Polarities: background 5
Representative citing papers
SAGER equips LLM recommendation agents with per-user evolving policy skills via a two-representation architecture, contrastive CoT diagnosis, and skill-augmented listwise reasoning, yielding SOTA gains orthogonal to memory accumulation.
RAR retrieves candidate items from a 300k-movie corpus then uses LLM generation with RL feedback to produce context-aware recommendations that outperform baselines on benchmarks.
FAERec fuses collaborative ID embeddings with LLM semantic embeddings using adaptive gating and dual-level alignment to enhance tail-item sequential recommendations.
VidHal is a new benchmark that evaluates VLLM temporal hallucinations through a caption ordering task on videos with varying hallucination levels.
On the Moltbook platform populated by LLM agents, popularity-based and item-side collaborative filtering methods outperform user-representation techniques for predicting next forum engagement.
IntuRec anchors LLM latent reasoning for recommendation by deriving an intuition embedding from top-K candidates via self- and cross-attention to initialize more accurate trajectories.
A systematic review of over 200 studies concludes that LLMs in recommender systems act as a double-edged sword, creating both opportunities and new risks for trustworthiness.
Hesitator is a theory-grounded simulator that separates utility-based item selection from overload-aware commitment decisions to reduce unrealistic high acceptance rates in conversational recommender evaluations.
DynamicPO adds dynamic boundary-negative selection and dual-margin beta adjustment to multi-negative DPO to avoid gradient suppression and improve recommendation accuracy.
HingeMem segments dialogue memory via boundary-triggered hyperedges over four elements and applies query-adaptive retrieval, yielding ~20% relative gains and 68% lower QA token cost versus baselines on LOCOMO.
MATRAG deploys four agents (user modeling, item analysis, reasoning, explanation) plus knowledge-graph retrieval and a transparency score to raise hit rate 12.7% and NDCG 15.3% while producing explanations rated helpful by 87.4% of experts.
SpecTran applies a spectral-aware transformer adapter with learnable position encoding to aggregate informative components across the full spectrum of LLM embeddings, yielding 9.17% average gains on sequential recommendation tasks.
This survey organizes generative recommendation into data, model, and task dimensions, identifying five advantages including world knowledge integration and creative generation while noting challenges in benchmarks and efficiency.
Masked History Learning augments autoregressive training in generative recommenders with an auxiliary masked historical item reconstruction task using entropy-guided masking and curriculum learning.
Hallucinations are inevitable on an infinite set of inputs but can be made statistically negligible with sufficient training data quality and quantity.
ShopX is a single foundation model combining intent understanding, planning, and SID-native item fulfillment for agentic shopping, with claimed improvements over tool-mediated systems on Taobao logs.
RcLLM accelerates generative recommendation inference by 1.31x-9.51x in TTFT through beyond-prefix KV caching, replicated user caches, sharded item caches, affinity scheduling, and selective attention with negligible accuracy loss.
HaNoRec dynamically weights harder preference samples and applies Gaussian perturbations to output distributions to improve multimodal LLM performance on sequential recommendation tasks.
The paper surveys hallucination in LLMs, covering an innovative taxonomy, contributing factors, detection methods, benchmarks, mitigation strategies, and open research directions.
An LLM semantic-matching framework for journal recommendation reports 40.23% Top-3 accuracy on 23,609 statistics articles from 49 journals without task-specific training.
A transformer recommender system trained on a new benchmark of over 5,000 model performances from medical imaging papers achieves up to 75.5% HitRate@100.
The LMMP framework improves tool-calling accuracy and task success rates for Earth observation agents by grounding plans in multimodal features and remote sensing expert knowledge via a two-stage training process.
A survey of RAG paradigms, components, benchmarks, and challenges for improving LLMs on knowledge-intensive tasks.