archive
Every paper Pith has read.
1286 papers in cs.IR · page 8
-
Reshaping LLM profiles guides recommenders to unseen preferences
ProMax: Exploring the Potential of LLM-derived Profiles with Distribution Shaping for Recommender Systems
-
Models fail to retrieve relevant history from unrelated sessions
LUCid: Redefining Relevance For Lifelong Personalization
-
Memory tree raises agent correctness by over 10%
Hierarchical Long-Term Semantic Memory for LinkedIn's Hiring Agent
-
CNN identifies fashion houses at 78% accuracy by texture
FASH-iCNN: Making Editorial Fashion Identity Inspectable Through Multimodal CNN Probing
-
The paper proposes RKHS, a method that combines retrieval-augmented generation with…
RAG-Enhanced Kernel-Based Heuristic Synthesis (RKHS): A Structured Methodology Using Large Language Models for Hardware Design
-
Patch features drive TCGA whole-slide retrieval more than aggregation
Validation of Whole-Slide Foundation Models for Image Retrieval in TCGA Data
-
Revenue encoding segments customers for profitable product recommendations
Value-Aware Product Recommendation by Customer Segmentation using a suitable High-Dimensional Similarity Measure
-
TF-IDF matches LLMs at building navigable text hypergraphs
Make Any Collection Navigable: Methods for Constructing and Evaluating Hypergraph of Text
-
Distillation lets models use future content for retention
Break the Inaccessible Boundary: Distilling Post-Conversion Content for User Retention Modeling
-
Timed action sequences improve short video recommendations
Action-Aware Generative Sequence Modeling for Short Video Recommendation
-
Chain inside one model generates then ranks items
Harmonizing Generative Retrieval and Ranking in Chain-of-Recommendation
-
Code metrics match plagiarism tools in ranking performance
Can Code Evaluation Metrics Detect Code Plagiarism?
-
Normalizing flows turn neural processes multimodal for cold-start cross-domain recs
Personalized Multi-Interest Modeling for Cross-Domain Recommendation to Cold-Start Users
-
Budget-aware chunk selection raises RAG performance 52% over random
Budget-Constrained Online Retrieval-Augmented Generation: The Chunk-as-a-Service Model
-
Generative engines cite more sources or absorb deeper depending on the platform
From Citation Selection to Citation Absorption: A Measurement Framework for Generative Engine Optimization Across AI Search Platforms
-
Knowledge anchors lift LLM e-commerce relevance
K-CARE: Knowledge-driven Symmetrical Contextual Anchoring and Analogical Prototype Reasoning for E-commerce Relevance
-
Self-reflective LLM loop raises summary accuracy by 33 percent
LLM-ReSum: A Framework for LLM Reflective Summarization through Self-Evaluation
-
Semantic search runs on 166 million clinical notes at $4k per month
Health System Scale Semantic Search Across Unstructured Clinical Notes
-
Fair re-ranking reduces to gradient descent on a market manifold
The Attention Market: Interpreting Online Fair Re-ranking as Manifold Optimization under Walrasian Equilibrium
-
ACM and IEEE journals overlap in themes and favor open access
A contemporary science map through the lens of IEEE and ACM periodicals
-
Web-scale reverse search boosts image geolocalization
GeoSearch: Augmenting Worldwide Geolocalization with Web-Scale Reverse Image Search and Image Matching
-
Wilcoxon test inflates false positives in IR comparisons
Stop Using the Wilcoxon Test: Myth, Misconception and Misuse in IR Research
-
The paper introduces GloRank
From Local Indices to Global Identifiers: Generative Reranking for Recommender Systems via Global Action Space
-
CroSearch-R1 aligns cross-lingual knowledge to improve RAG
CroSearch-R1: Better Leveraging Cross-lingual Knowledge for Retrieval-Augmented Generation
-
Uncertainty sampling adapts retrievers with fewer documents
UnIte: Uncertainty-based Iterative Document Sampling for Domain Adaptation in Information Retrieval
-
One Scholar ID yields full citation lists
CiteRadar: A Citation Intelligence Platform for Researcher Profiling and Geographic Visualization
-
Recommender fairness measures often cannot be computed or interpreted
Offline Evaluation Measures of Fairness in Recommender Systems
-
Similar-case retrieval yields safer pathology image captions
Retrieval-Guided Generation for Safer Histopathology Image Captioning
-
Graph perturbations show which facts drive RAG answers
XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation
-
Router picks best LLM attention heads per query for re-ranking
Learning to Route Queries to Heads for Attention-based Re-ranking with Large Language Models
-
Semantic anchoring on key tokens improves multimodal evidence selection
MEG-RAG: Quantifying Multi-modal Evidence Grounding for Evidence Selection in RAG
-
BITRec raises multi-behavior recommendation accuracy by modeling intensity and transitions
Modeling Behavioral Intensity and Transitions for Generative Recommendation
-
Anisotropic SSL features impair ANN image retrieval
Geometric Analysis of Self-Supervised Vision Representations for Semantic Image Retrieval
-
Summary attention compresses contexts to O(n/k) size
Kwai Summary Attention Technical Report
-
Late materialization slashes storage for long user sequences in DLRMs
Versioned Late Materialization for Ultra-Long Sequence Training in Recommendation Systems at Scale
-
FreeScale cuts bubbles by 90 percent in recommendation model training
FreeScale: Distributed Training for Sequence Recommendation Models with Minimal Scaling Cost
-
Disagreement between LLM and model views denoises user histories
Disagreement as Signals: Dual-view Calibration for Sequential Recommendation Denoising
-
Averaging table format embeddings stabilizes retrieval
Improving Robustness of Tabular Retrieval via Representational Stability
-
Retrieval index alone flags new species or known ones
DeepTaxon: An Interpretable Retrieval-Augmented Multimodal Framework for Unified Species Identification and Discovery
-
LLM metasearch MVP proves effective for rural smart solutions
FUTURAL: A Metasearch Platform for Empowering Rural Areas with Smart Solutions
-
Similar users' sequences boost CTR prediction accuracy
Similar Users-Augmented Interest Network
-
Fine-tuning beats RAG by 6.8 points for medical MCQA at 4B scale
Domain Fine-Tuning vs. Retrieval-Augmented Generation for Medical Multiple-Choice Question Answering: A Controlled Comparison at the 4B-Parameter Scale
-
Judge module guides RAG by naming evidence gaps
S2G-RAG: Structured Sufficiency and Gap Judging for Iterative Retrieval-Augmented QA
-
Generative inference retrieves legal cases via charges and elements
GLIER: Generative Legal Inference and Evidence Ranking for Legal Case Retrieval
-
Rerankers output relevance, contributions, and evidence together
Prism-Reranker: Beyond Relevance Scoring -- Jointly Producing Contributions and Evidence for Agentic Retrieval
-
Black-box attacks boost unpopular items in LLM recommenders
Prompt-Unknown Promotion Attacks against LLM-based Sequential Recommender Systems
-
Smart-home AI teams balance company rights with cultural rites
From Rights to Rites: Expectations Management in Smart-Home AI
-
Atomic claims cut financial hallucinations by 68 percent
FinGround: Detecting and Grounding Financial Hallucinations via Atomic Claim Verification
-
Knowledge graph RAG improves regulatory gap detection over general models
ComplianceNLP: Knowledge-Graph-Augmented RAG for Multi-Framework Regulatory Gap Detection
-
Generative anonymizer swaps face identities in MRAG while keeping attributes
Identity-Decoupled Anonymization for Visual Evidence in Multi-modal Retrieval-Augmented Generation