archive
Every paper Pith has read.
1286 papers in cs.IR · page 2
-
Parameter-free module boosts text-video retrieval
Text-Video Retrieval With Global-Local Contrastive Consistency Learning
-
AI chunking builds maps predicting war in Thucydides model
Agentic Chunking and Bayesian De-chunking of AI Generated Fuzzy Cognitive Maps: A Model of the Thucydides Trap
-
Second-stage correction cuts watch-time error by 12.57 percent
DADF: A Distribution-Aware Debiasing Framework for Watch-Time Regression in Recommender Systems
-
PuppyChatter unifies SDK simplicity across AI vendors
Accelerating AI-Powered Research: The PuppyChatter Framework for Usable and Flexible Tooling
-
Uncertainty calibration lifts retention for low-active users
Uncertainty-Calibrated Recommendations for Low-Active Users
-
Three-stage pipeline lifts video RAG retrieval from 0.195 to 0.759 nDCG
MARQUIS: A Three-Stage Pipeline for Video Retrieval-Augmented Generation
-
Co-citation predictability drops over 20 years
Temporal Decay of Co-Citation Predictability: A 20-Year Statute Retrieval Benchmark from 396M Ukrainian Court Citations
-
Catalogues miss 609 datasets across 53 languages
Beyond Catalogue Counts: the Dataset Visibility Asymmetry in Low-Resource Multilingual NLP
-
Codebook-free layer keeps ANN recall stable under streaming
IVF-TQ: Calibration-Free Streaming Vector Search via a Codebook-Free Residual Layer
-
Fixed rotation and scalar quantizer keep IVF recall stable in streaming data
IVF-TQ: Calibration-Free Streaming Vector Search via a Codebook-Free Residual Layer
-
Text guidance extracts cleaner visuals from noisy product photos
Text-Guided Visual Representation Learning for Robust Multimodal E-Commerce Recommendation
-
Five agents map news bias by exposing omissions and manipulations
NewsLens: A Multi-Agent Framework for Adversarial News Bias Navigation
-
Dual model generates fashion images with text explanations
Dual-Diffusional Generative Fashion Recommendation
-
Interleaving reviews with items improves generative recs
RAGR: Review-Augmented Generative Recommendation
-
BLAST workflow plus dual filters boosts protein QA for new sequences
Unlocking Biological Workflows for Robust Protein-Text Question Answering: A Dual-Dimensional RAG Framework
-
Ghost cuts popularity bias in generative recommenders
Echoes in Filter Bubble: Diagnosing and Curing Popularity Bias in Generative Recommenders
-
Path-level recs beat item-level ones in unified education tests
UniER: A Unified Benchmark for Item-level and Path-level Exercise Recommendation
-
Coding plus sketching cuts distributed ML runtime
Approximate Distributed Coded Computing: Polynomial Codes and Randomized Sketching
-
Retrieval adapts thresholds to raise multi-label accuracy
RAPT: Retrieval-Augmented Post-hoc Thresholding for Multi-Label Classification
-
Dynamic facets refine job search queries in real time
Policy-Grounded Dynamic Facet Suggestions for Job Search
-
Navigator assembles research from complementary evidence pieces
Argus: Evidence Assembly for Scalable Deep Research Agents
-
Evidence graph dispatches parallel searchers to reach 86.2 on BrowseComp
Argus: Evidence Assembly for Scalable Deep Research Agents
-
Companion JSON makes papers actionable for LLM agents
paper.json: A Coordination Convention for LLM-Agent-Actionable Papers
-
Multimodal fusion retrieves events in Vietnamese news videos
MERVIN: A Unified Framework for Multimodal Event Retrieval in Vietnamese News Videos
-
LLM logits refine ad auctions for chatbots
LERA: LLM-Enhanced RAG for Ad Auction in Generative Chatbots
-
NPU-CPU split speeds billion-scale search up to 100x
Ascend-RaBitQ: Heterogeneous NPU-CPU Acceleration of Billion-Scale Similarity Search with 1-bit Quantization
-
Generative distributions replace target matching in CTR retrieval
Generative Long-term User Interest Modeling for Click-Through Rate Prediction
-
Fairness optimization cuts bias in RAG retrieval
Fairness-Aware Retrieval Optimization for Retrieval-Augmented Generation
-
Attention traces raise AI lead detection from 9.5% to 61.9%
X-SYNTH: Beyond Retrieval -- Enterprise Context Synthesis from Observed Digital Human Attention
-
Digital attention lifts true lead rate 6.5 times
X-SYNTH: Beyond Retrieval -- Enterprise Context Synthesis from Observed Digital Human Attention
-
Evidence from documents beats LLM priors for AI job exposure
Jobs' AI Exposure Should Be Measured from Evidence, Not Model Priors
-
Degree clipping keeps private hashing at 92.5% of original accuracy
Differentially Private Motif-Preserving Multi-modal Hashing
-
Ukrainian court citations form unsupervised legal ontology
Automatic Construction of a Legal Citation Graph from 100 Million Ukrainian Court Decisions: Large-Scale Extraction, Topological Analysis, and Ontology-Driven Clustering
-
Pruning time-unstable features makes recommendation models more consistent
Fortress: A Case Study in Stabilizing Search Recommendations via Temporal Data Augmentation and Feature Pruning
-
AI Overviews lift Reddit engagement 12 percent in safe communities
The Impact of AI Search on the Online Content Ecosystem: Evidence from Google and Reddit
-
AI Overviews lift Reddit comments by 12% in SFW communities
The Impact of AI Search on the Online Content Ecosystem: Evidence from Google and Reddit
-
Citations miss key context in agent graph answers
Why Neighborhoods Matter: Traversal Context and Provenance in Agentic GraphRAG
-
Logging policies optimize off-policy estimates by balancing rewards and coverage
Logging Policy Design for Off-Policy Evaluation
-
Optimal logging policies minimize OPE error via reward-coverage balance
Logging Policy Design for Off-Policy Evaluation
-
The paper presents a fixed six-stage deterministic workflow that confines language model…
A Deterministic Agentic Workflow for HS Tariff Classification: Multi-Dimensional Rule Reasoning with Interpretable Decisions
-
One model trains to rank and retrieves by generation
Discrimination Is Generation: Unifying Ranking and Retrieval from a Tokenizer Perspective
-
Graph paths verify legal reasoning in Indian court AI
Falkor-IRAC: Graph-Constrained Generation for Verified Legal Reasoning in Indian Judicial AI
-
Aggregated vectors make different financial docs look identical
A Picture is Worth a Thousand Words? An Empirical Study of Aggregation Strategies for Visual Financial Document Retrieval
-
AsymRec raises generative recommender accuracy 15.8%
Asymmetric Generative Recommendation via Multi-Expert Projection and Multi-Faceted Hierarchical Quantization
-
Distilled rerankers match quality with 34% fewer tokens
Stop Overthinking: Unlocking Efficient Listwise Reranking with Minimal Reasoning
-
Adaptive gate skips reasoning for simple multimodal inputs
Think When Needed: Adaptive Reasoning-Driven Multimodal Embeddings with a Dual-LoRA Architecture
-
Semantic IDs halve beam search size for e-commerce retrieval
Efficient Generative Retrieval for E-commerce Search with Semantic Cluster IDs and Expert-Guided RL
-
PaSaMaster beats GPT-5.2 in paper retrieval at 1% cost
Towards Self-Evolving Agentic Literature Retrieval
-
99% retrieval success can equal random selection
The 99% Success Paradox: When Near-Perfect Retrieval Equals Random Selection
-
Imagined future steps triple recall of distant memories
Thinking Ahead: Prospection-Guided Retrieval of Memory with Language Models