archive
Every paper Pith has read.
1286 papers in cs.IR · page 11
-
Peerispect checks peer review claims against the paper
Peerispect: Claim Verification in Scientific Peer Reviews
-
Code-switching degrades retrieval by up to 27 percent
Code-Switching Information Retrieval: Benchmarks, Analysis, and the Limits of Current Retrievers
-
Joint RL training optimizes reasoning agent and document ranker together
CoSearch: Joint Training of Reasoning and Document Ranking via Reinforcement Learning for Agentic Search
-
Natural language search over 8 million math statements
Matlas: A Semantic Search Engine for Mathematics
-
Multi-agent system cuts false positives in filters by 74%
Transparent and Controllable Recommendation Filtering via Multimodal Multi-Agent Collaboration
-
RoTRAG raises dialogue harm detection F1 by 40% with retrieved rules
RoTRAG: Rule of Thumb Reasoning for Conversation Harm Detection with Retrieval-Augmented Generation
-
Structured memory paths cut dilution in LLM agent searches
MemSearch-o1: Empowering Large Language Models with Reasoning-Aligned Memory Growth in Agentic Search
-
Token-level memory paths cut dilution in agentic LLM search
MemSearch-o1: Empowering Large Language Models with Reasoning-Aligned Memory Growth in Agentic Search
-
HORIZON benchmark tests user models on cross-domain and long-term generalization
HORIZON: A Benchmark for In-the-wild User Behaviour Modeling
-
Attention heads rank passages after one forward pass
HeadRank: Decoding-Free Passage Reranking via Preference-Aligned Attention Heads
-
Attention heads rerank passages without generating tokens
HeadRank: Decoding-Free Passage Reranking via Preference-Aligned Attention Heads
-
LLMs beat graph heuristics only when facts are scattered
RLM-on-KG: Heuristics First, LLMs When Needed: Adaptive Retrieval Control over Mention Graphs for Scattered Evidence
-
Block-level retrieval lifts document QA accuracy 7.2% and cuts tokens 73%
LFRAG: Layout-oriented Fine-grained Retrieval-Augmented Generation on Multimodal Document Understanding
-
Hybrid model uses text and tone to flag alarming student replies
Detecting Alarming Student Verbal Responses using Text and Audio Classifier
-
Adaptive augmentation improves sequential recommendations by 26 percent
Beyond One-Size-Fits-All: Adaptive Test-Time Augmentation for Sequential Recommendation
-
LLM retrievers beat baselines on typos but fail on synonyms
On the Robustness of LLM-Based Dense Retrievers: A Systematic Analysis of Generalizability and Stability
-
JFinTEB creates first benchmark for Japanese financial embeddings
JFinTEB: Japanese Financial Text Embedding Benchmark
-
Classic search retrieves relevant texts more than useful ones
UsefulBench: Towards Decision-Useful Information as a Target for Information Retrieval
-
RL method generates diverse valid hypotheses for event forecasts
Scattered Hypothesis Generation for Open-Ended Event Forecasting
-
Next-token prediction matches full-item MLE in generative recs
On the Equivalence Between Auto-Regressive Next Token Prediction and Full-Item-Vocabulary Maximum Likelihood Estimation in Generative Recommendation--A Short Note
-
Dependency tree recovery improves document chunking
M3DocDep: Multi-modal, Multi-page, Multi-document Dependency Chunking with Large Vision-Language Models
-
Double helix propagation and contrastive alignment disentangle intents
Intent Propagation Contrastive Collaborative Filtering
-
SIF turns each raw sample into a sequence token
Sample Is Feature: Beyond Item-Level, Toward Sample-Level Tokens for Unified Large Recommender Models
-
Full sample tokens unify sequence and feature modeling
Sample Is Feature: Beyond Item-Level, Toward Sample-Level Tokens for Unified Large Recommender Models
-
The paper proposes SIMMER, a single MLLM-based model that embeds food images and recipe…
SIMMER: Cross-Modal Food Image–Recipe Retrieval via MLLM-Based Embedding
-
Adaptive retrieval filters noise for weak LLMs
Rethinking the Necessity of Adaptive Retrieval-Augmented Generation through the Lens of Adaptive Listwise Ranking
-
MeSH hierarchy guides contrastive learning for biomedical retrieval
BioHiCL: Hierarchical Multi-Label Contrastive Learning for Biomedical Retrieval with MeSH Labels
-
Personalized time contexts boost item embedding accuracy
Learning Behaviorally Grounded Item Embeddings via Personalized Temporal Contexts
-
Weighted similarities from shared embeddings match complex recommenders
Collaborative Filtering Through Weighted Similarities of User and Item Embeddings
-
RePAIR maps flawed RAG outputs to fixes without error categories
Improving Retrieval-Augmented Generation without Taxonomy-based Error Categorization
-
Hybrid search disagreements create free labels that boost small embeddings
vstash: Local-First Hybrid Retrieval with Adaptive Fusion for LLM Agents
-
Step-level info gain rewards improve search reasoning
IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning
-
Smooth rank approximation enables metric-agnostic learning to rank
Metric-agnostic Learning-to-Rank via Boosting and Rank Approximation
-
Personalizing reasoning rules gives agents gains beyond memory updates
SAGER: Self-Evolving User Policy Skills for Recommendation Agent
-
Generative rec model boosts clicks 9.5% and transactions 8.7%
GenRec: A Preference-Oriented Generative Framework for Large-Scale Recommendation
-
User init from item modalities and clusters closes semantic gap in multimodal recs
Well Begun is Half Done: Training-Free and Model-Agnostic Semantically Guaranteed User Representation Initialization for Multimodal Recommendation
-
Natural language bridges let LLMs recommend across private domains
Federated User Behavior Modeling for Privacy-Preserving LLM Recommendation
-
Lagrangian solver selectively adds LLM knowledge to generative recommenders
LWGR: Lagrangian-Constrained Personalized World Knowledge for Generative Recommendation
-
Generative diffusion creates stable learning paths by modeling uncertainty
Uncertainty-aware Generative Learning Path Recommendation with Cognition-Adaptive Diffusion
-
New framework balances accuracy and diversity in game recommendations
Category-based and Popularity-guided Video Game Recommendation: A Balance-oriented Framework
-
CPGRec+ extends a prior game recommender by adding edge reweighting to mark strong likes…
CPGRec+: A Balance-oriented Framework for Personalized Video Game Recommendations
-
Dual channels learn preferences from mixed user behaviors
Behavior-Aware Dual-Channel Preference Learning for Heterogeneous Sequential Recommendation
-
Navigating a skill directory beats chunk retrieval for enterprise QA
Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG
-
Navigating distilled skill trees beats flat retrieval on enterprise RAG
Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG
-
PyTorch toolkit adds GUI and modularity to news recommendation
NewsTorch: A PyTorch-based Toolkit for Learner-oriented News Recommendation
-
Retrieval covers active authorities only with frontier inclusion and no ignored superseder
Controlling Authority Retrieval: A Missing Retrieval Objective for Authority-Governed Knowledge
-
Unified model matches RAG quality at one-tenth context size
A Unified Model and Document Representation for On-Device Retrieval-Augmented Generation
-
Property graphs structure chats for accurate long-term AI memory
APEX-MEM: Agentic Semi-Structured Memory with Temporal Reasoning for Long-Term Conversational AI
-
ID-graph contrastive fusion raises sequential recommendation accuracy
ID and Graph View Contrastive Learning with Multi-View Attention Fusion for Sequential Recommendation
-
Unified LLM jointly predicts needs and recommends local services
Enhancing Local Life Service Recommendation with Agentic Reasoning in Large Language Model