archive

Every paper Pith has read. Search by title, abstract, or pith.

1286 papers in cs.IR · page 4

cs.IR 2026-05-11 reviewed

18% of web searches concern places
Much of Geospatial Web Search Is Beyond Traditional GIS

Ilya Ilyankou +2
cs.LG 2026-05-11 reviewed

One verification trace yields calibrated LLM-judge confidence
VERDI: Single-Call Confidence Estimation for Verification-Based LLM Judges via Decomposed Inference

Jasmine Qi +2
cs.IR 2026-05-11 reviewed

Structured belief store beats vector search for LLM memory
Structured Belief State and the First Precision-Aware Benchmark for LLM Memory Retrieval

Jeffrey Flynt
cs.IR 2026-05-11 reviewed

Representative Stochastic ranker reaches near-parity exposure in RAG
Towards FairRAG: Preventing Representational Harm in Retrieval-Augmented Generation by Enforcing Fair Exposure at Retrieval Time

Riddhi Tikoo
cs.LG 2026-05-11 reviewed

Locale boosting fixes US bias in global ranking models
Localization Boosting for Growth Markets: Mitigating Cross-Locale Behavioral Bias in Learning-to-Rank

Suryaa Veerabathiran Seran +3
cs.IR 2026-05-11 reviewed

LLM-assisted benchmark tests retrieval across four scholarly categories
MIRA: An LLM-Assisted Benchmark for Multi-Category Integrated Retrieval

Mehmet Deniz Türkmen +5
cs.IR 2026-05-11 reviewed

Benchmark shows semantic plausibility misses real utility in LLM recs
RecoAtlas: From Semantic Plausibility to Set-Level Utility in LLM Recommendation Agents

Imad Aouali +4
cs.IR 2026-05-11 reviewed

Adaptive weights inside GNN message passing cut popularity bias
Debiasing Message Passing to Mitigate Popularity Bias in GNN-based Collaborative Filtering

Md Aminul Islam +3
cs.CL 2026-05-11 reviewed

Assertion-aware retrieval lifts clinical QA accuracy by 22 points
ClinicalBench: Stress-Testing Assertion-Aware Retrieval for Cross-Admission Clinical QA on MIMIC-IV

Alex Stinard
cs.AI 2026-05-11 reviewed

Cascaded generative model lifts e-commerce cart adds 2.7%
A Cascaded Generative Approach for e-Commerce Recommendations

Moein Hasani +6
cs.AI 2026-05-11 reviewed

Generative cascade boosts e-commerce cart adds by 2.7%
A Cascaded Generative Approach for e-Commerce Recommendations

Moein Hasani +6
cs.CL 2026-05-11 reviewed

Prompt optimization ranks second for EHR clinical QA
Neural at ArchEHR-QA 2026: One Method Fits All: Unified Prompt Optimization for Clinical QA over EHRs

Abrar Majeedi +3
cs.IR 2026-05-11 reviewed

BM25 lexical search hits 83% accuracy in LLM research agents
Rethinking Agentic Search with Pi-Serini: Is Lexical Retrieval Sufficient?

Tz-Huan Hsu +2
cs.IR 2026-05-11 reviewed

User context inside the loop lifts LLM research report relevance
Personalized Deep Research: A User-Centric Framework, Dataset, and Hybrid Evaluation for Knowledge Discovery

Xiaopeng Li +8
cs.IR 2026-05-11 reviewed

Iterative denoising unifies list reranking
UniRank: Unified List-wise Reranking via Confidence-Ordered Denoising

Pengyue Jia +9
cs.AI 2026-05-11 reviewed

Synthetic probes can isolate how data traits affect LLM behavior
Position: Let's Develop Data Probes to Fundamentally Understand How Data Affects LLM Performance

Shiqiang Wang +3
cs.IR 2026-05-11 reviewed

LLM agents model group leadership to lift recommendation accuracy
AgentGR: Semantic-aware Agentic Group Decision-Making Simulator for Group Recommendation

Yangtao Zhou +6
cs.IR 2026-05-11 reviewed

LLM recommenders gain from anchoring ratings as numeric tokens
Every Preference Has Its Strength: Injecting Ordinal Semantics into LLM-Based Recommenders

Jiwon Jeong +4
cs.CL 2026-05-11 reviewed

Answer-aware reranking reaches 96% accuracy on Ukrainian document QA
Qwen Goes Brrr: Off-the-Shelf RAG for Ukrainian Multi-Domain Document Understanding

Anton Bazdyrev +4
cs.CL 2026-05-11 reviewed

Local 9B model nears commercial LLM on FOIA privilege classification
To Redact, or not to Redact? A Local LLM Approach to Deliberative Process Privilege Classification

Maik Larooij +1
cs.IR 2026-05-11 reviewed

Latent reasoning halves steps while lifting generative recommendation accuracy
LASAR: Latent Adaptive Semantic Aligned Reasoning for Generative Recommendation

Yiwen Chen +10
cs.CL 2026-05-11 reviewed

Benchmark scores abstract answers by topic coverage without comparisons
ASTRA-QA: A Benchmark for Abstract Question Answering over Documents

Shu Wang +5
cs.IR 2026-05-11 reviewed

NumColBERT injects numeracy into ColBERT without pipeline overhaul
NumColBERT: Non-Intrusive Numeracy Injection for Late-Interaction Retrieval Models

Haruki Fujimaki +1
cs.IR 2026-05-11 reviewed

Three-layer memory turns reading scrolls into tailored paper questions
H-MAPS: Hierarchical Memory-Augmented Proactive Search Assistant for Scientific Literature

Koji Nishikawa +1
cs.IR 2026-05-11 reviewed

CCD-aware scheduling lifts vector search throughput 3.7x
CCD-Level and Load-Aware Thread Orchestration for In-Memory Vector ANNS on Multi-Core CPUs

Yuchen Huang +8
cs.IR 2026-05-11 reviewed

Query clustering and novel loss improve health intent accuracy
Enhancing Healthcare Search Intent Recognition with Query Representation Learning and Session Context

Harshita Jagdish Sahijwani +6
cs.IR 2026-05-11 reviewed

SABER improves RAG accuracy by choosing to trust or abstain
Trust or Abstain? A Self-Aware RAG Approach

Xi Zhu +7
cs.IR 2026-05-11 reviewed

LLM-RAG system raises average HEI scores by 6.45 points
An LLM-RAG Approach for Healthy Eating Index-Informed Personalized Food Recommendations

Yibin Wang +4
cs.CV 2026-05-11 reviewed

2 million Weibo photos benchmark AI on city space understanding
Urban-ImageNet: A Large-Scale Multi-Modal Dataset and Evaluation Framework for Urban Space Perception

Yiwei Ou +7
cs.IR 2026-05-11 reviewed

Graph of codecs compresses data smaller and faster
OpenZL: Using Graphs to Compress Smaller and Faster

Yann Collet +12
cs.CR 2026-05-11 reviewed

Black-box method flags LLM agent drift at 0.83 AUC
Nautilus Compass: Black-box Persona Drift Detection for Production LLM Agents

Chunxiao Wang
cs.IR 2026-05-11 reviewed

ReCoVR reaches 74% recall after one interactive video round
ReCoVR: Closing the Loop in Interactive Composed Video Retrieval

Bingqing Zhang +6
cs.IR 2026-05-11 reviewed

Hybrid system recommends coherent outfits from fashion catalogs
Loom: Hybrid Retrieval-Scoring Outfit Recommendation with Semantic Material Compatibility and Occasion-Aware Embedding Priors

Anushree Berlia
cs.IR 2026-05-10 reviewed

LLM agents let users beat platform personalization with their own data
LLM Agents Enable User-Governed Personalization Beyond Platform Boundaries

Jiacheng Lin +17
cs.LG 2026-05-10 reviewed

Aggregate peaks occur at 3-5 times the individual exposure level
Simpson's Paradox in Behavioral Curves: How Aggregation Distorts Parametric Models of User Dynamics

Chao Zhou
cs.IR 2026-05-10 reviewed

MM-LLM captions lift recsys AUC by 0.35% at industrial scale
A General Framework for Multimodal LLM-Based Multimedia Understanding in Large-Scale Recommendation Systems

Yiming Zhu +11
cs.IR 2026-05-10 reviewed

OpenIIR runs LLM persona simulations for IR research
OpenIIR: An Open Simulation Platform for Information Retrieval Research

Saber Zerhoudi
cs.IR 2026-05-10 reviewed

Open platform runs LLM personas in four IR scenario types
OpenIIR: An Open Simulation Platform for Information Retrieval Research

Saber Zerhoudi
cs.CL 2026-05-10 reviewed

Semantic search finds more hidden Locke receptions than word matching
Matching Meaning at Scale: Evaluating Semantic Search for 18th-Century Intellectual History through the Case of Locke

Yu Wu +4
cs.CL 2026-05-10 reviewed

Semantic search finds more implicit Locke references than keywords
Matching Meaning at Scale: Evaluating Semantic Search for 18th-Century Intellectual History through the Case of Locke

Yu Wu +4
cs.IR 2026-05-09 reviewed

Reddit music chats become 190k Deezer-grounded dialogues
Reddit2Deezer: A Scalable Dataset for Real-World Grounded Conversational Music Recommendation

Haven Kim +1
cs.DB 2026-05-09 reviewed

Personalized privacy cuts infinite stream estimation error by 53.6%
Personalized w-Event Privacy for Infinite Stream Estimation

Leilei Du +6
cs.AI 2026-05-09 reviewed

Semantic IDs enable efficient ultra-long user sequence modeling
UxSID: Semantic-Aware User Interests Modeling for Ultra-Long Sequence

Hongwei Zhang +10
cs.AI 2026-05-09 reviewed

Semantic IDs enable efficient modeling of ultra-long user sequences
UxSID: Semantic-Aware User Interests Modeling for Ultra-Long Sequence

Hongwei Zhang +10
cs.IR 2026-05-09 reviewed

LLM framework summarizes user histories into personas
UserGPT Technical Report

Yunyi Xuan +11
cs.AI 2026-05-08 reviewed

Bio-inspired memory cuts LLM agent storage by 58% at 97% precision
Human-Inspired Memory Architecture for LLM Agents

Doga Kerestecioglu +4
cs.IR 2026-05-08 reviewed

Multi-level contrastive learning improves knowledge graph recommendations
Multi-Level Graph Attention Network Contrastive Learning for Knowledge-Aware Recommendation

Zhifei Hu +1
cs.IR 2026-05-08 reviewed

Exclusion distances raise filtered ANNS speed 1.3-5x
FAVOR: Efficient Filter-Agnostic Vector ANNS Based on Selectivity-Aware Exclusion Distances

Junjie Song +6
cs.IR 2026-05-08 reviewed

Benchmark reveals three competency gaps in tourism recommenders
TRACE: Tourism Recommendation with Accountable Citation Evidence

Zixu Zhao +8