Retrieval coverage limits LLM rerankers in cold-start recommendation; a learned hybrid fusion improves pool quality but LLM reranking often degrades end-to-end performance while simpler rankers exploit the pool.
(2019, July)
11 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
roles: background 1
citation-polarity summary
polarities: background 1
representative citing papers
NuGNN applies a heterogeneous graph neural network to surrogate-solve a 690-isotope nuclear reaction network, achieving few-percent errors and reproducing final abundances where fully connected and Res-U-Net models fail.
SCICONVBENCH is a new benchmark evaluating LLMs on multi-turn disambiguation and inconsistency resolution for task formulation in computational science, with frontier models reaching only 52.7% success on fluid mechanics disambiguation cases.
A systematic method leveraging Weisfeiler-Leman coloring to mine class-discriminating motifs as proxy explanations, enabling the creation of the OpenGraphXAI benchmark suite from real-world datasets.
CA-BED uses Bayesian experimental design and simulated conversation trees with LLM likelihoods to optimize multi-turn question selection, reporting 21.8% higher success rates than direct prompting on entity-deduction benchmarks.
A dual hierarchical RL framework with two agents coordinates high-level dialogue strategy and low-level question generation to emulate judicial questioning and extract key information from Supreme Court arguments, outperforming baselines.
A jointly learned hierarchical index with cross-attention and residual quantization scales exact retrieval in foundational recommendation models, deployed at Meta with additional performance from test-time training on index nodes.
ClusterRAG applies density-based clustering to user profiles for collaborative retrieval in personalized RAG and reports best performance on LaMP tasks by combining target and similar-user profiles.
MirrorBench defines a reproducible benchmark combining lexical metrics (MATTR, Yule's K, HD-D) and LLM-judge metrics with calibration controls to measure human-likeness of user-proxy agents across four datasets.
TextBridgeGNN pre-trains GNNs using text-guided hierarchical propagation to enable effective cross-domain knowledge transfer in recommendations.
DMICF models interactions from user- and item-centric perspectives with a macro-micro prototype-aware variational encoder and dimension-wise intent alignment to improve collaborative filtering.
citing papers explorer
- Diagnosing and Mitigating Retrieval Bottlenecks in LLM-Based Cold-Start Recommendation
  Retrieval coverage limits LLM rerankers in cold-start recommendation; a learned hybrid fusion improves pool quality but LLM reranking often degrades end-to-end performance while simpler rankers exploit the pool.
- NuGNN: a Graph Neural Network for Nuclear Reaction Network Equations
  NuGNN applies a heterogeneous graph neural network to surrogate-solve a 690-isotope nuclear reaction network, achieving few-percent errors and reproducing final abundances where fully connected and Res-U-Net models fail.
- SCICONVBENCH: Benchmarking LLMs on Multi-Turn Clarification for Task Formulation in Computational Science
  SCICONVBENCH is a new benchmark evaluating LLMs on multi-turn disambiguation and inconsistency resolution for task formulation in computational science, with frontier models reaching only 52.7% success on fluid mechanics disambiguation cases.
- CA-BED: Conversation-Aware Bayesian Experimental Design
  CA-BED uses Bayesian experimental design and simulated conversation trees with LLM likelihoods to optimize multi-turn question selection, reporting 21.8% higher success rates than direct prompting on entity-deduction benchmarks.
- Dual Hierarchical Dialogue Policy Learning for Legal Inquisitive Conversational Agents
  A dual hierarchical RL framework with two agents coordinates high-level dialogue strategy and low-level question generation to emulate judicial questioning and extract key information from Supreme Court arguments, outperforming baselines.
- Efficient Retrieval Scaling with Hierarchical Indexing for Large Scale Recommendation
  A jointly learned hierarchical index with cross-attention and residual quantization scales exact retrieval in foundational recommendation models, deployed at Meta with additional performance from test-time training on index nodes.
- ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation
  ClusterRAG applies density-based clustering to user profiles for collaborative retrieval in personalized RAG and reports best performance on LaMP tasks by combining target and similar-user profiles.
- MirrorBench: A Benchmark to Evaluate Conversational User-Proxy Agents for Human-Likeness
  MirrorBench defines a reproducible benchmark combining lexical metrics (MATTR, Yule's K, HD-D) and LLM-judge metrics with calibration controls to measure human-likeness of user-proxy agents across four datasets.