archive

Every paper Pith has read.

1286 papers in cs.IR · page 2

cs.IR 2026-05-18 reviewed

Parameter-free module boosts text-video retrieval
Text-Video Retrieval With Global-Local Contrastive Consistency Learning

Xiaolun Jing +2
cs.AI 2026-05-18 reviewed

AI chunking builds maps predicting war in Thucydides model
Agentic Chunking and Bayesian De-chunking of AI Generated Fuzzy Cognitive Maps: A Model of the Thucydides Trap

Akash Kumar Panda +2
cs.IR 2026-05-18 reviewed

Second-stage correction cuts watch-time error by 12.57 percent
DADF: A Distribution-Aware Debiasing Framework for Watch-Time Regression in Recommender Systems

Yiqing Yang +6
cs.AI 2026-05-18 reviewed

PuppyChatter unifies SDK simplicity across AI vendors
Accelerating AI-Powered Research: The PuppyChatter Framework for Usable and Flexible Tooling

Chun-Hsiung Tseng +4
cs.IR 2026-05-18 reviewed

Uncertainty calibration lifts retention for low-active users
Uncertainty-Calibrated Recommendations for Low-Active Users

Bob Junyi Zou +4
cs.IR 2026-05-17 reviewed

Three-stage pipeline lifts video RAG retrieval from 0.195 to 0.759 nDCG
MARQUIS: A Three-Stage Pipeline for Video Retrieval-Augmented Generation

Debashish Chakraborty +9
cs.CL 2026-05-17 reviewed

Co-citation predictability drops over 20 years
Temporal Decay of Co-Citation Predictability: A 20-Year Statute Retrieval Benchmark from 396M Ukrainian Court Citations

Volodymyr Ovcharov
cs.CL 2026-05-17 reviewed

Catalogues miss 609 datasets across 53 languages
Beyond Catalogue Counts: the Dataset Visibility Asymmetry in Low-Resource Multilingual NLP

Zhiyin Tan +1
cs.LG 2026-05-17 reviewed

Codebook-free layer keeps ANN recall stable under streaming
IVF-TQ: Calibration-Free Streaming Vector Search via a Codebook-Free Residual Layer

Tarun Sharma
cs.LG 2026-05-17 reviewed

Fixed rotation and scalar quantizer keep IVF recall stable in streaming data
IVF-TQ: Calibration-Free Streaming Vector Search via a Codebook-Free Residual Layer

Tarun Sharma
cs.IR 2026-05-17 reviewed

Text guidance extracts cleaner visuals from noisy product photos
Text-Guided Visual Representation Learning for Robust Multimodal E-Commerce Recommendation

Yufei Guo +7
cs.CL 2026-05-17 reviewed

Five agents map news bias by exposing omissions and manipulations
NewsLens: A Multi-Agent Framework for Adversarial News Bias Navigation

Joy Bose
cs.IR 2026-05-17 reviewed

Dual model generates fashion images with text explanations
Dual-Diffusional Generative Fashion Recommendation

Mingzhe Yu +3
cs.IR 2026-05-17 reviewed

Interleaving reviews with items improves generative recs
RAGR: Review-Augmented Generative Recommendation

Yingyi Zhang +10
cs.IR 2026-05-17 reviewed

BLAST workflow plus dual filters boosts protein QA for new sequences
Unlocking Biological Workflows for Robust Protein-Text Question Answering: A Dual-Dimensional RAG Framework

Li Ding +6
cs.IR 2026-05-16 reviewed

Ghost cuts popularity bias in generative recommenders
Echoes in Filter Bubble: Diagnosing and Curing Popularity Bias in Generative Recommenders

Jun Yin +7
cs.IR 2026-05-16 reviewed

Path-level recs beat item-level ones in unified education tests
UniER: A Unified Benchmark for Item-level and Path-level Exercise Recommendation

Xinghe Cheng +7
cs.DC 2026-05-16 reviewed

Coding plus sketching cuts distributed ML runtime
Approximate Distributed Coded Computing: Polynomial Codes and Randomized Sketching

Neophytos Charalambides +1
cs.IR 2026-05-15 reviewed

Retrieval adapts thresholds to raise multi-label accuracy
RAPT: Retrieval-Augmented Post-hoc Thresholding for Multi-Label Classification

Lasal Jayawardena +3
cs.IR 2026-05-15 reviewed

Dynamic facets refine job search queries in real time
Policy-Grounded Dynamic Facet Suggestions for Job Search

Dan Xu +13
cs.CL 2026-05-15 reviewed

Navigator assembles research from complementary evidence pieces
Argus: Evidence Assembly for Scalable Deep Research Agents

Zhen Zhang +9
cs.CL 2026-05-15 reviewed

Evidence graph dispatches parallel searchers to reach 86.2 on BrowseComp
Argus: Evidence Assembly for Scalable Deep Research Agents

Zhen Zhang +9
cs.DL 2026-05-15 reviewed

Companion JSON makes papers actionable for LLM agents
paper.json: A Coordination Convention for LLM-Agent-Actionable Papers

Arquimedes Canedo
cs.IR 2026-05-15 reviewed

Multimodal fusion retrieves events in Vietnamese news videos
MERVIN: A Unified Framework for Multimodal Event Retrieval in Vietnamese News Videos

Anh-Tai Pham-Nguyen +3
cs.IR 2026-05-15 reviewed

LLM logits refine ad auctions for chatbots
LERA: LLM-Enhanced RAG for Ad Auction in Generative Chatbots

Haoran Sun +9
cs.IR 2026-05-15 reviewed

NPU-CPU split speeds billion-scale search up to 100x
Ascend-RaBitQ: Heterogeneous NPU-CPU Acceleration of Billion-Scale Similarity Search with 1-bit Quantization

Fujun He +14
cs.IR 2026-05-15 reviewed

Generative distributions replace target matching in CTR retrieval
Generative Long-term User Interest Modeling for Click-Through Rate Prediction

Jiangli Shao +7
cs.DB 2026-05-15 reviewed

Fairness optimization cuts bias in RAG retrieval
Fairness-Aware Retrieval Optimization for Retrieval-Augmented Generation

Yingqi Zhao +3
cs.AI 2026-05-15 reviewed

Attention traces raise AI lead detection from 9.5% to 61.9%
X-SYNTH: Beyond Retrieval -- Enterprise Context Synthesis from Observed Digital Human Attention

Guruprasad Raghavan +2
cs.AI 2026-05-15 reviewed

Digital attention lifts true lead rate 6.5 times
X-SYNTH: Beyond Retrieval -- Enterprise Context Synthesis from Observed Digital Human Attention

Guruprasad Raghavan +2
cs.IR 2026-05-14 reviewed

Evidence from documents beats LLM priors for AI job exposure
Jobs' AI Exposure Should Be Measured from Evidence, Not Model Priors

Luca Mouchel +2
cs.IR 2026-05-14 reviewed

Degree clipping keeps private hashing at 92.5% of original accuracy
Differentially Private Motif-Preserving Multi-modal Hashing

Zehua Cheng +2
cs.CL 2026-05-14 reviewed

Ukrainian court citations form unsupervised legal ontology
Automatic Construction of a Legal Citation Graph from 100 Million Ukrainian Court Decisions: Large-Scale Extraction, Topological Analysis, and Ontology-Driven Clustering

Volodymyr Ovcharov
cs.IR 2026-05-14 reviewed

Pruning time-unstable features makes recommendation models more consistent
Fortress: A Case Study in Stabilizing Search Recommendations via Temporal Data Augmentation and Feature Pruning

Milind Pandurang Jagre +8
cs.IR 2026-05-14 reviewed

AI Overviews lift Reddit engagement 12 percent in safe communities
The Impact of AI Search on the Online Content Ecosystem: Evidence from Google and Reddit

Peibo Zhang +2
cs.IR 2026-05-14 reviewed

AI Overviews lift Reddit comments by 12% in SFW communities
The Impact of AI Search on the Online Content Ecosystem: Evidence from Google and Reddit

Peibo Zhang +2
cs.AI 2026-05-14 reviewed

Citations miss key context in agent graph answers
Why Neighborhoods Matter: Traversal Context and Provenance in Agentic GraphRAG

Riccardo Terrenzi +2
stat.ML 2026-05-14 reviewed

Logging policies optimize off-policy estimates by balancing rewards and coverage
Logging Policy Design for Off-Policy Evaluation

Connor Douglas +2
stat.ML 2026-05-14 reviewed

Optimal logging policies minimize OPE error via reward-coverage balance
Logging Policy Design for Off-Policy Evaluation

Connor Douglas +2
cs.AI 2026-05-14 reviewed

The paper presents a fixed six-stage deterministic workflow that confines language model…
A Deterministic Agentic Workflow for HS Tariff Classification: Multi-Dimensional Rule Reasoning with Interpretable Decisions

Yu Zhang +6
cs.IR 2026-05-14 reviewed

One model trains to rank and retrieves by generation
Discrimination Is Generation: Unifying Ranking and Retrieval from a Tokenizer Perspective

Shuli Wang +8
cs.AI 2026-05-14 reviewed

Graph paths verify legal reasoning in Indian court AI
Falkor-IRAC: Graph-Constrained Generation for Verified Legal Reasoning in Indian Judicial AI

Joy Bose
cs.CV 2026-05-14 reviewed

Aggregated vectors make different financial docs look identical
A Picture is Worth a Thousand Words? An Empirical Study of Aggregation Strategies for Visual Financial Document Retrieval

Ho Hung Lim +1
cs.IR 2026-05-14 reviewed

AsymRec raises generative recommender accuracy 15.8%
Asymmetric Generative Recommendation via Multi-Expert Projection and Multi-Faceted Hierarchical Quantization

Bin Huang +8
cs.IR 2026-05-14 reviewed

Distilled rerankers match quality with 34% fewer tokens
Stop Overthinking: Unlocking Efficient Listwise Reranking with Minimal Reasoning

Danyang Liu +1
cs.CV 2026-05-14 reviewed

Adaptive gate skips reasoning for simple multimodal inputs
Think When Needed: Adaptive Reasoning-Driven Multimodal Embeddings with a Dual-LoRA Architecture

Longxiang Zhang +4
cs.IR 2026-05-14 reviewed

Semantic IDs halve beam search size for e-commerce retrieval
Efficient Generative Retrieval for E-commerce Search with Semantic Cluster IDs and Expert-Guided RL

Jianbo Zhu +7
cs.IR 2026-05-14 reviewed

PaSaMaster beats GPT-5.2 in paper retrieval at 1% cost
Towards Self-Evolving Agentic Literature Retrieval

Yuwen Du +10
cs.IR 2026-05-14 reviewed

99% retrieval success can equal random selection
The 99% Success Paradox: When Near-Perfect Retrieval Equals Random Selection

Vyzantinos Repantis +7
cs.IR 2026-05-13 reviewed

Imagined future steps triple recall of distant memories
Thinking Ahead: Prospection-Guided Retrieval of Memory with Language Models

Harshita Chopra +4