Analysing mixed initiatives and search strategies during conversational search
7 Pith papers cite this work.
citing papers explorer
- SCICONVBENCH: Benchmarking LLMs on Multi-Turn Clarification for Task Formulation in Computational Science
  SCICONVBENCH is a new benchmark evaluating LLMs on multi-turn disambiguation and inconsistency resolution for task formulation in computational science, with frontier models reaching only 52.7% success on fluid mechanics disambiguation cases.
- Efficient Retrieval Scaling with Hierarchical Indexing for Large Scale Recommendation
  A jointly learned hierarchical index with cross-attention and residual quantization scales exact retrieval in foundational recommendation models, deployed at Meta with additional performance from test-time training on index nodes.
- TextBridgeGNN: Pre-training Graph Neural Network for Cross-Domain Recommendation via Text-Guided Transfer
  TextBridgeGNN pre-trains GNNs using text-guided hierarchical propagation to enable effective cross-domain knowledge transfer in recommendations.
- Can QPP Choose the Right Query Variant? Evaluating Query Variant Selection for RAG Pipelines
  QPP methods can select query variants that boost end-to-end RAG quality over the original query, though retrieval-optimized variants often fail to produce the best generated answers, revealing a utility gap.
- Let Me Introduce You: Stimulating Taste-Broadening Serendipity Through Song Introductions
  Song introductions stimulate interest in unfamiliar music primarily through narrative transportation, with cognitive elaboration as a weaker but easier-to-implement secondary factor.
- NetworkGames: Simulating Cooperation in Network Games with Personality-driven LLM Agents
  Simulations show that cooperative outcomes in network games with personality-driven LLM agents depend on both network connectivity and the placement of pro-social personalities, not just pairwise interaction preferences.
- Offline Evaluation Measures of Fairness in Recommender Systems
  The thesis identifies theoretical, empirical, and conceptual flaws in offline fairness measures for recommender systems and contributes new evaluation methods and practical guidelines.