archive

Every paper Pith has read.

1286 papers in cs.IR · page 9

cs.IR 2026-04-26 reviewed

Secret key creates green-red bias to watermark recommenders
Green-Red Watermarking for Recommender Systems

Lei Zhou +4
cs.CR 2026-04-26 reviewed

Hybrid detector cuts AI phishing misses by 78 points with privacy intact
CyberCane: Neuro-Symbolic RAG for Privacy-Preserving Phishing Detection with Formal Ontology Reasoning

Safayat Bin Hakim +5
cs.IR 2026-04-26 reviewed

Adaptive SID learning preserves compatible overlaps for better recs
Beyond Static Collision Handling: Adaptive Semantic ID Learning for Multimodal Recommendation at Industrial Scale

Yongsen Pan +10
cs.CL 2026-04-25 reviewed

Four new Reddit-sourced datasets target suicidal ideation detection
A Benchmark Suite of Reddit-Derived Datasets for Mental Health Detection

Khalid Hasan +1
cs.IR 2026-04-25 reviewed

Prompt chaining lifts LLM accuracy on scientific text classification
Automating Categorization of Scientific Texts with In-Context Learning and Prompt-Chaining in Large Language Models

Gautam Kishore Shahi +1
cs.IR 2026-04-25 reviewed

Web workbench turns IR user simulation into visual shareable workflows
IIRSim Studio: A Dashboard for User Simulation

Saber Zerhoudi +2
cs.IR 2026-04-25 reviewed

Typos cause plan collapse in generative retrieval
Lost in Decoding? Reproducing and Stress-Testing the Look-Ahead Prior in Generative Retrieval

Kidist Amde Mekonnen +4
cs.IR 2026-04-25 reviewed

Sparse memory updates stabilize generative retrieval on growing collections
A Parametric Memory Head for Continual Generative Retrieval

Kidist Amde Mekonnen +2
cs.IR 2026-04-25 reviewed

Distillation creates linear retriever that nearly matches reranker on rationale tasks
Efficient Rationale-based Retrieval: On-policy Distillation from Generative Rerankers based on JEPA

Teng Chen +6
cs.IR 2026-04-25 reviewed

Multimodal embeddings fail to follow modality instructions
MMEB-V3: Measuring the Performance Gaps of Omni-Modality Embedding Models

Haohang Huang +11
cs.IR 2026-04-25 reviewed

Pro-GEO cuts geographic clustering distance by 45.6%
Birds of a Feather Cluster Nearby: a Proximity-Aware Geo-Codebook for Local Service Recommendation

Tian He +5
cs.HC 2026-04-25 reviewed

Users and AI co-edit knowledge graphs for better organization
MindTrellis: Co-Creating Knowledge Structures with AI through Interactive Visual Exploration

Xiang Li +4
cs.IR 2026-04-25 reviewed

Pretrained music models lag in recommendations vs tagging
Adopting State-of-the-Art Pretrained Audio Representations for Music Recommender Systems

Yan-Martin Tamm +1
cs.IR 2026-04-24 reviewed

Support penalty selects reliable two-stage recommender policies
CASP: Support-Aware Offline Policy Selection for Two-Stage Recommender Systems

Nilson Chapagain
cs.IR 2026-04-24 reviewed

RAG model predicts osteomyelitis outcomes from mixed clinical records
RAG4Outcome: A Retrieval-Augmented Multimodal Framework for Prognostic Prediction in Chronic Osteomyelitis

Daqian Shi +6
cs.CL 2026-04-24 reviewed

Self-re-expression adapts LLMs to tasks using only unannotated data
Self Knowledge Re-expression: A Fully Local Method for Adapting LLMs to Tasks Using Intrinsic Knowledge

Mengyu Wang +6
cs.IR 2026-04-24 reviewed

Distilled embeddings deliver 30% better retrieval at 180x LLM speed
Aligning Dense Retrievers with LLM Utility via Distillation

Rajinder Sandhu +6
cs.IR 2026-04-24 reviewed

QPP picks query variants that raise RAG answer quality
Can QPP Choose the Right Query Variant? Evaluating Query Variant Selection for RAG Pipelines

Negar Arabzadeh +3
cs.IR 2026-04-24 reviewed

Bi-level optimization enables learnable graph filters for recommendations
ASPIRE: Make Spectral Graph Collaborative Filtering Great Again via Adaptive Filter Learning

Yunhang He +4
cs.IR 2026-04-24 reviewed

Beam-search negatives reshape AUC to partial AUC in LLM recommenders
Objective Shaping with Hard Negatives: Windowed Partial AUC Optimization for RL-based LLM Recommenders

Wentao Shi +9
cs.IR 2026-04-24 reviewed

Compact model beats 23x larger rival on patent retrieval
Citation-Driven Multi-View Training for Patent Embeddings: QaECTER and Sophia-Bench

Younes Djemmal +4
cs.AI 2026-04-24 reviewed

Benchmark of 10k agents reveals semantic search misses real performance
AgentSearchBench: A Benchmark for AI Agent Search in the Wild

Bin Wu +3
cs.IR 2026-04-24 reviewed

Alignment distorts signals in semantic-collaborative recommender fusion
Rethinking Semantic Collaborative Integration: Why Alignment Is Not Enough

Maolin Wang +9
cs.IR 2026-04-24 reviewed

ResRank reranks with one token per passage and no generation
ResRank: Unifying Retrieval and Listwise Reranking via End-to-End Joint Training with Residual Passage Compression

Xiaojie Ke +8
cs.LG 2026-04-24 reviewed

Sharpness-aware method improves poisoning attack transfer in recommenders
Sharpness-Aware Poisoning: Enhancing Transferability of Injective Attacks on Recommender Systems

Junsong Xie +3
cs.LG 2026-04-24 reviewed

ReCast matches RL performance with 4% rollout budget in generative recs
ReCast: Recasting Learning Signals for Reinforcement Learning in Generative Recommendation

Peiyan Zhang +5
cs.DB 2026-04-23 reviewed

ESPRESSO scales keyword search over Solid pods with privacy
Implementation and Privacy Guarantees for Scalable Keyword Search on SOLID-based Decentralized Data with Granular Visibility Constraints

Mohamed Ragab +7
cs.IR 2026-04-23 reviewed

Profile portability changes user utility by algorithm
Multistakeholder Impacts of Profile Portability in a Recommender Ecosystem

Anas Buhayh +3
cs.CL 2026-04-23 reviewed

Hierarchical memory links events for better LLM reasoning at lower cost
StructMem: Structured Memory for Long-Horizon Behavior in LLMs

Buqiang Xu +7
cs.CV 2026-04-23 reviewed

Logic networks match DNNs for video copy detection with tiny descriptors
Efficient Logic Gate Networks for Video Copy Detection

Katarzyna Fojcik
cs.IR 2026-04-23 reviewed

Counterfactual model boosts pre-promotion conversion forecasts
Counterfactual Multi-task Learning for Delayed Conversion Modeling in E-commerce Sales Pre-Promotion

Xin Song +2
cs.IR 2026-04-23 reviewed

LLM user profiles distilled into sequential recommenders keep serving fast
Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation

Nikita Severin +10
cs.IR 2026-04-23 reviewed

SAE concepts replace tokens in SPLADE with comparable retrieval and better efficiency
From Tokens to Concepts: Leveraging SAE for SPLADE

Yuxuan Zong +4
cs.IR 2026-04-23 reviewed

Corpus of 301k systematic reviews spans all scientific fields
A Large-Scale, Cross-Disciplinary Corpus of Systematic Reviews

Pierre Achkar +4
cs.IR 2026-04-23 reviewed

Wavelet packets align graph signals to temporal scales in recs
WPGRec: Wavelet Packet Guided Graph Enhanced Sequential Recommendation

Peilin Liu +2
cs.IR 2026-04-23 reviewed

Benchmark tests LLMs on integrated scientific paper reasoning
PaperMind: Benchmarking Agentic Reasoning and Critique over Scientific Papers in Multimodal LLMs

Yanjun Zhao +9
cs.CL 2026-04-23 reviewed

Separate encoders plus explanatory discriminator improve authorship attribution
Explainable Disentangled Representation Learning for Generalizable Authorship Attribution in the Era of Generative AI

Hieu Man +4
cs.AI 2026-04-23 reviewed

Verbatim storage, not spatial metaphor, explains MemPalace recall
Spatial Metaphors for LLM Memory: A Critical Analysis of the MemPalace Architecture

Robin Dey +1
cs.CL 2026-04-23 reviewed

LLM pipeline raises multi-table matching F1 by 5.1 percent
Unlocking the Power of Large Language Models for Multi-table Entity Matching

Yingkai Tang +4
cs.IR 2026-04-23 reviewed

IntrAgent lifts literature retrieval accuracy 13.2 percent
IntrAgent: An LLM Agent for Content-Grounded Information Retrieval through Literature Review

Fengbo Ma +7
cs.CL 2026-04-23 reviewed

Fine-tuned LLM matches supervised accuracy on next job prediction
On Reasoning Behind Next Occupation Recommendation

Shan Dong +5
cs.CY 2026-04-22 reviewed

Dialect cues bypass LLM safety filters unlike explicit profiles
Dialect vs Demographics: Quantifying LLM Bias from Implicit Linguistic Signals vs. Explicit User Profiles

Irti Haq +1
cs.IR 2026-04-22 reviewed

Non-English sources needed for realistic multilingual ToT queries
Multilingual and Domain-Agnostic Tip-of-the-Tongue Query Generation for Simulated Evaluation

Xuhong He +5
cs.IR 2026-04-22 reviewed

AI extracts pharmacokinetic data from XML tables by structure
Automated Extraction of Pharmacokinetic Parameters from Structured XML Scientific Articles: Enhancing Data Accessibility at Scale

Remya Ampadi Ramachandran +6
cs.IR 2026-04-22 reviewed

Eye tracking shows carousel users defy web-search patterns
Following the Eye-Tracking Evidence: Established Web-Search Assumptions Fail in Carousel Interfaces

Jingwei Kang +2
cs.IR 2026-04-22 reviewed

Entity clusters replace averages for reliable retrieval tests
Coverage, Not Averages: Semantic Stratification for Trustworthy Retrieval Evaluation

Andrew Klearman +5
cs.IR 2026-04-22 reviewed

Self-aware embeddings double RAG accuracy on versioned queries
Self-Aware Vector Embeddings for Retrieval-Augmented Generation: A Neuroscience-Inspired Framework for Temporal, Confidence-Weighted, and Relational Knowledge

Naizhong Xu
cs.CL 2026-04-22 reviewed

Multi-agent AI generates more diverse research ideas
Enhancing Research Idea Generation through Combinatorial Innovation and Multi-Agent Iterative Search Strategies

Shuai Chen +1