archive

Every paper Pith has read. Search by title, abstract, or pith.

1286 papers in cs.IR · page 13

cs.CL 2026-04-12 reviewed

Agentic framework enables grounded multimodal long-form generation
Deep-Reporter: Deep Research for Grounded Multimodal Long-Form Generation

Fangda Ye +7
cs.IR 2026-04-12 reviewed

Adaptive selector picks right table count per query
Retrieve Only Relevant Tables Whether Few or Many: Adaptive Table Retrieval Method

Taehee Kim +3
cs.CL 2026-04-12 reviewed

Syllable tokenizer beats 200x larger model on Turkish retrieval
HeceTokenizer: A Syllable-Based Tokenization Approach for Turkish Retrieval

Senol Gulgonul
cs.IT 2026-04-12 reviewed

Verification threshold limits synthetic identities to spherical code size
On the Capacity of Distinguishable Synthetic Identity Generation under Face Verification

Behrooz Razeghi
cs.SD 2026-04-12 reviewed

Small curated LilyPond set beats 15B-token corpus on music classification
BMdataset: A Musicologically Curated LilyPond Dataset

Matteo Spanio +2
cs.IR 2026-04-12 reviewed

Fuzzy logic adds boolean operations to neural embeddings without retraining
NSFL: A Post-Training Neuro-Symbolic Fuzzy Logic Framework for Boolean Operators in Neural Embeddings

Vladi Vexler +3
cs.IR 2026-04-12 reviewed

LLMs give different answers to the same medical question 87-97% of the time
Evaluating Small Open LLMs for Medical Question Answering: A Practical Framework

Avi-ad Avraam Buskila
cs.IR 2026-04-12 reviewed

Coordinated semantic IDs lift video search ranking performance
SID-Coord: Coordinating Semantic IDs for ID-based Ranking in Short-Video Search

Guowen Li +6
cs.CR 2026-04-11 reviewed

Homoglyph swaps degrade stylometric analysis of text
Hijacking Text Heritage: Hiding the Human Signature through Homoglyphic Substitution

Robert Dilworth
cs.CV 2026-04-11 reviewed

Chunking method slashes visual document storage by 90 percent
Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval

Yibo Yan +7
cs.CL 2026-04-11 reviewed

Benchmark lets dialogue clarify ambiguous table questions
ODUTQA-MDC: A Task for Open-Domain Underspecified Tabular QA with Multi-turn Dialogue-based Clarification

Zhensheng Wang +5
cs.IR 2026-04-11 reviewed

Three orthogonal preference factors raise multi-domain recommendation accuracy
MOSAIC: Multi-Domain Orthogonal Session Adaptive Intent Capture for Prescient Recommendations

Abderaouf Bahi +4
cs.IR 2026-04-11 reviewed

STAR retriever cuts two biases for better graph QA
STAR: Semantic-Tuned and Tail-Adaptive Retriever for Graph-Augmented Generation

Shuai Li +4
cs.IR 2026-04-11 reviewed

Value-guided tree search lifts recommendation metrics on three datasets
HARPO: Hierarchical Agentic Reasoning for User-Aligned Conversational Recommendation

Subham Raj +3
cs.IR 2026-04-11 reviewed

Self-distilled RL makes recommender agents learn from mutual interactions
Self-Distilled Reinforcement Learning for Co-Evolving Agentic Recommender Systems

Zongwei Wang +7
cs.IR 2026-04-11 reviewed

MaxSim operator caps multi-vector models at 20-word queries
Reproduction Beyond Benchmarks: ConstBERT and ColBERT-v2 Across Backends and Query Distributions

Utshab Kumar Ghosh +2
cs.CY 2026-04-10 reviewed

User workshops expose blind spots in search result rankings
All Eyes on the Ranker: Participatory Auditing to Surface Blind Spots in Ranked Search Results

Anna Marie Rezk +5
cs.IR 2026-04-10 reviewed

Users control recommendations in federated system
Beyond Centralization: User-Controlled Federated Recommendations in Practice

Manel Slokom +1
cs.CR 2026-04-10 reviewed

Query transforms enable secure multi-org RAG without decryption
Trans-RAG: Query-Centric Vector Transformation for Secure Cross-Organizational Retrieval

Yu Liu +6
cs.CL 2026-04-10 reviewed

Auto-generated negatives train verifiers to check evidence support
Case-Grounded Evidence Verification: A Framework for Constructing Evidence-Sensitive Supervision

Soroosh Tayebi Arasteh +4
cs.CL 2026-04-10 reviewed

Interleaved retrieval fixes lost-in-thought degradation in LLMs
RecaLLM: Addressing the Lost-in-Thought Phenomenon with Explicit In-Context Retrieval

Kyle Whitecross +1
cs.IR 2026-04-10 reviewed

Non-experts build competitive AI pipelines through four LLM-guided stages
From Intent to AI Pipelines: A Controlled Agentic Framework for Non-AI Expert Scientists

Hyacinth Ali +2
cs.IR 2026-04-10 reviewed

LLM reference documents cut reranking time by up to 66%
Dynamic Ranked List Truncation for Reranking Pipelines via LLM-generated Reference-Documents

Nilanjan Sinhababu +3
cs.IR 2026-04-10 reviewed

Personalizing time, interests, and explanations improves sequential recommendation
TME-PSR: Time-aware, Multi-interest, and Explanation Personalization for Sequential Recommendation

Qingzhuo Wang +6
cs.IR 2026-04-10 reviewed

Quantum-inspired embeddings show weak, unstable ranking
On the Representational Limits of Quantum-Inspired 1024-D Document Embeddings: An Experimental Evaluation Framework

Dario Maio
cs.HC 2026-04-10 reviewed

Co-design with BLV experts produces multi-modal 3D data tool
Three Modalities, Two Design Probes, One Prototype, and No Vision: Experience-Based Co-Design of a Multi-modal 3D Data Visualization Tool

Sanchita S. Kamath +5
cs.IR 2026-04-10 reviewed

Graph retrieval improves LLM analysis of wearable health data
Query-Conditioned Graph Retrieval for Contextualized LLM Reasoning in Personalized Wearable Data

Zhenyu Lu +2
cs.CV 2026-04-10 reviewed

Expert dataset advances AI on three fashion outfit tasks
FashionStylist: An Expert Knowledge-enhanced Multimodal Dataset for Fashion Understanding

Kaidong Feng +9
cs.IR 2026-04-10 reviewed

Hybrid recommender cuts regret in CFD model selection
Hybrid Cold-Start Recommender System for Closure Model Selection in Multiphase Flow Simulations

S. Hänsch +8
cs.IR 2026-04-10 reviewed

Active learning and topic guesses raise RAG data extraction rates
ALDEN: Boosting Private Data Extraction from Retrieval-Augmented Generation Systems via Active Learning and Distribution Estimation

Xingyu Lyu +7
cs.IR 2026-04-10 reviewed

Dual intent spaces lift recommendation performance
DIAURec: Dual-Intent Space Representation Optimization for Recommendation

Yu Zhang +3
cs.CE 2026-04-10 reviewed

Momentum optimization matches NASDAQ growth with lower volatility
Taming the Black Swan: A Momentum-Gated Hierarchical Optimisation Framework for Asymmetric Alpha Generation

Arya Chakraborty +1
cs.IR 2026-04-10 reviewed

RAG pipeline improves accuracy of Hong Kong health advice
PriHA: A RAG-Enhanced LLM Framework for Primary Healthcare Assistant in Hong Kong

Richard Wai Cheung Chan +5
cs.IR 2026-04-10 reviewed

Two surface rules route two-hop questions to better retrieval
Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA

Andre Bacellar
cs.CL 2026-04-10 reviewed

Bandit method lifts document QA scores 5-18 percent
MAB-DQA: Addressing Query Aspect Importance in Document Question Answering with Multi-Armed Bandits

Yixin Xiang +5
cs.IR 2026-04-10 reviewed

Compressing user history to tokens boosts recommender performance
IAT: Instance-As-Token Compression for Historical User Sequence Modeling in Industrial Recommender Systems

Xinchun Li +13
cs.IR 2026-04-10 reviewed

Retrieval for LLMs must track answer quality
Beyond Relevance: Utility-Centric Retrieval in the LLM Era

Hengran Zhang +3
cs.IR 2026-04-10 reviewed

Bracket tournament lifts LLM document ranking accuracy
BracketRank: Large Language Model Document Ranking via Reasoning-based Competitive Elimination

Abdelrahman Abdallah +3
cs.CY 2026-04-09 reviewed

Algebra step differences yield general strategy embeddings
Towards Generalizable Representations of Mathematical Strategies

Siddhartha Pradhan +2
cs.LG 2026-04-09 reviewed

Raw banking events train versatile financial foundation model
PRAGMA: Revolut Foundation Model

Maxim Ostroukhov +12
cs.IR 2026-04-09 reviewed

Pairwise margins decompose into unique factor shares for rankings
A Mathematical Theory of Ranking

Yin Cheng
cs.CR 2026-04-09 reviewed

RAC holds 96 percent accuracy on unbalanced confidential documents
Retrieval Augmented Classification for Confidential Documents

Yeseul E. Chang +4
cs.IR 2026-04-09 reviewed

Recognizing ethical knowledge gaps shifts buying habits
Search Changes Consumers' Minds: How Recognizing Gaps Drives Sustainable Choices

Frans van der Sluis +1
cs.IR 2026-04-09 reviewed

Explicit sparsity scales recommenders where dense models stall
Beyond Dense Connectivity: Explicit Sparsity for Scalable Recommendation

Yantao Yu +4
cs.IR 2026-04-09 reviewed

Causal disentanglement boosts cross-domain recommendations
Context-Aware Disentanglement for Cross-Domain Sequential Recommendation: A Causal View

Xingzi Wang +2
cs.IR 2026-04-09 reviewed

Intent taxonomy enriches queries for better infographic matches
Show Me the Infographic I Imagine: Intent-Aware Infographic Retrieval for Authoring Support

Jing Xu +4
cs.CL 2026-04-09 reviewed

Semantic model predicts when RAG improves QA answers
RAG Performance Prediction for Question Answering

Or Dado +2
cs.IR 2026-04-09 reviewed

Semantic relevance training improves sponsored search retrieval
Unified Supervision for Walmart's Sponsored Search Retrieval via Joint Semantic Relevance and Behavioral Engagement Modeling

Shasvat Desai +6