archive

Every paper Pith has read. Search by title, abstract, or pith.

1286 papers in cs.IR · page 5

cs.IR 2026-05-08 reviewed

Hyperlinks as metadata improve RAG quality and efficiency
LARAG: Link-Aware Retrieval Strategy for RAG Systems in Hyperlinked Technical Documentation

Giorgia Bolognesi +5
cs.CV 2026-05-08 reviewed

Multimodal agents score below 50% on interleaved search benchmark
InterLV-Search: Benchmarking Interleaved Multimodal Agentic Search

Bohan Hou +7
cs.CL 2026-05-08 reviewed

Browser LLM tool extracts structured data from papers at 94% compliance
TCMIIES: A Browser-Based LLM-Powered Intelligent Information Extraction System for Academic Literature

Hanqing Zhao
cs.IR 2026-05-08 reviewed

Skills enable reliable, reusable execution in LLM agents
A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications

Yingli Zhou +5
cs.IR 2026-05-08 reviewed

Skills as reusable procedures scale LLM agents
A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications

Yingli Zhou +5
cs.IR 2026-05-08 reviewed

Dual channels separate semantics from behavior to lift sparse recommendations
DCGL: Dual-Channel Graph Learning with Large Language Models for Knowledge-Aware Recommendation

Xinchi Zou +6
cs.IR 2026-05-08 reviewed

PRISM models interacting preference and relevance in e-commerce search
PRISM: Refracting the Entangled User Behavior Space for E-Commerce Search

Haoqian Zhang +2
cs.IR 2026-05-08 reviewed

Multilingual retrievers split between semantic strength and language match
MLAIRE: Multilingual Language-Aware Information Retrieval Evaluation Protocol

Youngjoon Jang +3
cs.IR 2026-05-08 reviewed

LARGER raises coding agent file retrieval accuracy by 11-13 points
LARGER: Lexically Anchored Repository Graph Exploration and Retrieval

Yuntong Hu +4
cs.IR 2026-05-08 reviewed

Parallel tokens in diffusion LMs top BEIR-7 retrieval scores
DiffRetriever: Parallel Representative Tokens for Retrieval with Diffusion Language Models

Shuai Wang +4
cs.IR 2026-05-08 reviewed

Embeddings match sub-fields but miss agendas 80 percent of the time
Topic Is Not Agenda: A Citation-Community Audit of Text Embeddings

Junseon Yoo
cs.IR 2026-05-08 reviewed

LLMs learn to fetch memories dynamically for better recommendations
RRCM: Ranking-Driven Retrieval over Collaborative and Meta Memories for LLM Recommendation

Shijun Li +4
cs.IR 2026-05-08 reviewed

Simple graph heuristic beats trained recommenders on benchmarks
An Embarrassingly Simple Graph Heuristic Reveals Shortcut-Solvable Benchmarks for Sequential Recommendation

Haoyu Han +11
cs.IR 2026-05-08 reviewed

RL picks per-request utility weights for Pinterest Homefeed
A Production-Ready RL Framework for Personalized Utility Tuning with Pareto Sweeping in Pinterest Recommender Systems

Yichu Zhou +11
cs.IR 2026-05-07 reviewed

RL aligns LLM text profiles with embeddings for recommendations
Bridging Textual Profiles and Latent User Embeddings for Personalization

Zhaoxuan Tan +4
cs.HC 2026-05-07 reviewed

Moodle AI tutor grounded in teacher content reaches 0.97 faithfulness
From Surface Learning to Deep Understanding: A Grounded AI Tutoring System for Moodle

Anna Ostrowska +6
cs.IR 2026-05-07 reviewed

Single lexical query outperforms multi-round retrieval agents
Superintelligent Retrieval Agent: The Next Frontier of Agentic Retrieval

Zeyu Yang +3
cs.IR 2026-05-07 reviewed

Pruning trims deep recommenders while raising accuracy
Light-FMP: Lightweight Feature and Model Pruning for Enhanced Deep Recommender Systems

Nghia Bui +2
cs.CL 2026-05-07 reviewed

Convergence nodes annotate cells from gene sets with one LLM call
GATHER: Convergence-Centric Hyper-Entity Retrieval for Zero-Shot Cell-Type Annotation

Zhonghui Zhang +4
cs.IR 2026-05-07 reviewed

Semantic ID trees block simple preferences in generative recommenders
Expressiveness Limits of Autoregressive Semantic ID Generation in Generative Recommendation

Yupeng Hou +5
cs.AI 2026-05-07 reviewed

LLM pipeline annotates PII in HTTP traffic for any taxonomy
Addressing Labelled Data Scarcity: Taxonomy-Agnostic Annotation of PII Values in HTTP Traffic using LLMs

Thomas Cory +1
cs.IR 2026-05-07 reviewed

Retrievers miss most documents that match latent patterns
OBLIQ-Bench: Exposing Overlooked Bottlenecks in Modern Retrievers with Latent and Implicit Queries

Diane Tchuindjo +2
cs.CV 2026-05-07 reviewed

The paper introduces Holmes, a hierarchical evidential learning method for retrieving…
Revisiting Uncertainty: On Evidential Learning for Partially Relevant Video Retrieval

Jun Li +7
cs.IR 2026-05-07 reviewed

Agents automate e-commerce search relevance fixes
A Case-Driven Multi-Agent Framework for E-Commerce Search Relevance

Global E-Commerce Search Relevance Team
cs.IR 2026-05-07 reviewed

Active queries lift conversation starter penetration by 0.54%
Bridging Passive and Active: Enhancing Conversation Starter Recommendation via Active Expression Modeling

Yiqing Wu +6
cs.IR 2026-05-07 reviewed

Value signals folded into generative ad tokens raise hit rate 37%
Unified Value Alignment for Generative Recommendation in Industrial Advertising

Xinxun Zhang +15
cs.IR 2026-05-07 reviewed

Rebuilding rare location transitions lifts next-POI accuracy
Beyond Long Tail POIs: Transition-Centered Generalization for Human Mobility Prediction

Dingyang Lyu +3
cs.IR 2026-05-07 reviewed

Router and transmitter modules transfer knowledge to lift CVR prediction
Effective Knowledge Transfer for Multi-Task Recommendation Models

Guohao Cai +2
cs.AI 2026-05-07 reviewed

Bidirectional channels close RAG's text-graph gaps
Text-Graph Synergy: A Bidirectional Verification and Completion Framework for RAG

Jiarui Zhong +1
cs.AI 2026-05-07 reviewed

Agentic tools lift enterprise retrieval recall by 22 points
AgenticRAG: Agentic Retrieval for Enterprise Knowledge Bases

Susheel Suresh +4
cs.CV 2026-05-06 reviewed

LLM adds context to lift satellite query retrieval by 16 percent
Open-SAT: LLM-Guided Query Embedding Refinement for Open-Vocabulary Object Retrieval in Satellite Imagery

Md Adnan Arefeen +4
cs.CR 2026-05-06 reviewed

Policy gating blocks cross-tenant leaks in shared AI retrieval
Securing the Agent: Vendor-Neutral, Multitenant Enterprise Retrieval and Tool Use

Francisco Javier Arceo +1
cs.IR 2026-05-06 reviewed

Burn-down diffusion models interest decay for better CF recommendations
Interests Burn-down Diffusion Process for Personalized Collaborative Filtering

Yifang Qin +5
cs.IR 2026-05-06 reviewed

Capsule routing yields better semantic IDs for recommendation
CapsID: Soft-Routed Variable-Length Semantic IDs for Generative Recommendation

Wenzhuo Cheng +6
cs.SD 2026-05-06 reviewed

2.5K pop samples restore accuracy in jazz-tuned chord model
Empirical Study of Pop and Jazz Mix Ratios for Genre-Adaptive Chord Generation

Jinju Lee
cs.CV 2026-05-06 reviewed

Modular pipeline with origin tracking turns table images into traceable KGs
From Historical Tabular Image to Knowledge Graphs: A Provenance-Aware Modular Pipeline

Sarah Binta Alam Shoilee +3
cs.CL 2026-05-06 reviewed

TabEmbed outperforms text models on tabular tasks
TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding

Minjie Qiang +7
cs.IR 2026-05-06 reviewed

Enriched SERP dataset labels every element with boxes and types
AllSERP: Exhaustive Per-Element Enrichment of the Versatile AdSERP Dataset

K. Andrew Edmonds
cs.IR 2026-05-06 reviewed

AllSERP adds per-element boxes and types to AdSERP corpus
AllSERP: Exhaustive Per-Element Enrichment of the Versatile AdSERP Dataset

K. Andrew Edmonds
cs.CL 2026-05-06 reviewed

Verbatim events and staged retrieval replace extraction for agent memory
Storage Is Not Memory: A Retrieval-Centered Architecture for Agent Recall

Joshua Adler +1
cs.LG 2026-05-06 reviewed

Longer contexts worsen time series forecasts
Retrieval Mechanisms Surpass Long-Context Scaling in Time Series Forecasting

Rishi Ahuja +3
cs.IR 2026-05-06 reviewed

Crowd aggregation stabilizes deepfake authenticity but not type ID
Beyond Seeing Is Believing: On Crowdsourced Detection of Audiovisual Deepfakes

Michael Soprano +2
cs.IR 2026-05-06 reviewed

On-device LLM lifts Taobao recommendation accuracy
RecGPT-Mobile: On-Device Large Language Models for User Intent Understanding in Taobao Feed Recommendation

Bin Zhang +11
cs.IR 2026-05-06 reviewed

Hierarchical convolutions outperform attention on user sequences
Rethinking Convolutional Networks for Attribute-Aware Sequential Recommendation

Shereen Elsayed +3
cs.IR 2026-05-06 reviewed

Bayesian updates break static bound for LLM recommendation alignment
Beyond Static Best-of-N: Bayesian List-wise Alignment for LLM-based Recommendation

Ruijun Chen +4
cs.IR 2026-05-06 reviewed

Career vault lifts ATS scores 7.8 points for matching roles
Career-Aware Resume Tailoring via Multi-Source Retrieval-Augmented Generation with Provenance Tracking: A Case Study

Kumar Abhinav
cs.CL 2026-05-06 reviewed

Three-stage pipeline automates QA nuggets for report evaluation
DoGMaTiQ: Automated Generation of Question-and-Answer Nuggets for Report Evaluation

Bryan Li +8
cs.DC 2026-05-06 reviewed

Adaptive HBM split cuts recommender P99 latency 24-38%
One Pool, Two Caches: Adaptive HBM Partitioning for Accelerating Generative Recommender Serving

Wenjun Yu +2
cs.IR 2026-05-05 reviewed

New benchmark tests RAG on 500,000 enterprise documents
EnterpriseRAG-Bench: A RAG Benchmark for Company Internal Knowledge

Yuhong Sun +6
cs.IR 2026-05-05 reviewed

New benchmark tests RAG on 500k company documents
EnterpriseRAG-Bench: A RAG Benchmark for Company Internal Knowledge

Yuhong Sun +6