archive
Every paper Pith has read. Search by title, abstract, or pith.
1286 papers in cs.IR · page 13
-
Agentic framework enables grounded multimodal long-form generation
Deep-Reporter: Deep Research for Grounded Multimodal Long-Form Generation
-
Adaptive selector picks right table count per query
Retrieve Only Relevant Tables Whether Few or Many: Adaptive Table Retrieval Method
-
Syllable tokenizer beats 200x larger model on Turkish retrieval
HeceTokenizer: A Syllable-Based Tokenization Approach for Turkish Retrieval
-
Verification threshold limits synthetic identities to spherical code size
On the Capacity of Distinguishable Synthetic Identity Generation under Face Verification
-
Small curated LilyPond set beats 15B-token corpus on music classification
BMdataset: A Musicologically Curated LilyPond Dataset
-
Fuzzy logic adds boolean operations to neural embeddings without retraining
NSFL: A Post-Training Neuro-Symbolic Fuzzy Logic Framework for Boolean Operators in Neural Embeddings
-
LLMs give different answers to the same medical question 87-97% of the time
Evaluating Small Open LLMs for Medical Question Answering: A Practical Framework
-
Coordinated semantic IDs lift video search ranking performance
SID-Coord: Coordinating Semantic IDs for ID-based Ranking in Short-Video Search
-
Homoglyph swaps degrade stylometric analysis of text
Hijacking Text Heritage: Hiding the Human Signature through Homoglyphic Substitution
-
Chunking method slashes visual document storage by 90 percent
Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval
-
Benchmark lets dialogue clarify ambiguous table questions
ODUTQA-MDC: A Task for Open-Domain Underspecified Tabular QA with Multi-turn Dialogue-based Clarification
-
Three orthogonal preference factors raise multi-domain recommendation accuracy
MOSAIC: Multi-Domain Orthogonal Session Adaptive Intent Capture for Prescient Recommendations
-
STAR retriever cuts two biases for better graph QA
STAR: Semantic-Tuned and Tail-Adaptive Retriever for Graph-Augmented Generation
-
Value-guided tree search lifts recommendation metrics on three datasets
HARPO: Hierarchical Agentic Reasoning for User-Aligned Conversational Recommendation
-
Self-distilled RL makes recommender agents learn from mutual interactions
Self-Distilled Reinforcement Learning for Co-Evolving Agentic Recommender Systems
-
MaxSim operator caps multi-vector models at 20-word queries
Reproduction Beyond Benchmarks: ConstBERT and ColBERT-v2 Across Backends and Query Distributions
-
User workshops expose blind spots in search result rankings
All Eyes on the Ranker: Participatory Auditing to Surface Blind Spots in Ranked Search Results
-
Users control recommendations in federated system
Beyond Centralization: User-Controlled Federated Recommendations in Practice
-
Query transforms enable secure multi-org RAG without decryption
Trans-RAG: Query-Centric Vector Transformation for Secure Cross-Organizational Retrieval
-
Auto-generated negatives train verifiers to check evidence support
Case-Grounded Evidence Verification: A Framework for Constructing Evidence-Sensitive Supervision
-
Interleaved retrieval fixes lost-in-thought degradation in LLMs
RecaLLM: Addressing the Lost-in-Thought Phenomenon with Explicit In-Context Retrieval
-
Non-experts build competitive AI pipelines through four LLM-guided stages
From Intent to AI Pipelines: A Controlled Agentic Framework for Non-AI Expert Scientists
-
LLM reference documents cut reranking time by up to 66%
Dynamic Ranked List Truncation for Reranking Pipelines via LLM-generated Reference-Documents
-
Personalized time modeling improves sequential recommendations
TME-PSR: Time-aware, Multi-interest, and Explanation Personalization for Sequential Recommendation
-
Quantum-inspired embeddings show weak unstable ranking
On the Representational Limits of Quantum-Inspired 1024-D Document Embeddings: An Experimental Evaluation Framework
-
Co-design with BLV experts produces multi-modal 3D data tool
Three Modalities, Two Design Probes, One Prototype, and No Vision: Experience-Based Co-Design of a Multi-modal 3D Data Visualization Tool
-
Graph retrieval improves LLM analysis of wearable health data
Query-Conditioned Graph Retrieval for Contextualized LLM Reasoning in Personalized Wearable Data
-
Expert dataset advances AI on three fashion outfit tasks
FashionStylist: An Expert Knowledge-enhanced Multimodal Dataset for Fashion Understanding
-
Hybrid recommender cuts regret in CFD model selection
Hybrid Cold-Start Recommender System for Closure Model Selection in Multiphase Flow Simulations
-
Active learning and topic guesses raise RAG data extraction rates
ALDEN: Boosting Private Data Extraction from Retrieval-Augmented Generation Systems via Active Learning and Distribution Estimation
-
Dual intent spaces lift recommendation performance
DIAURec: Dual-Intent Space Representation Optimization for Recommendation
-
Momentum optimization matches NASDAQ growth with lower volatility
Taming the Black Swan: A Momentum-Gated Hierarchical Optimisation Framework for Asymmetric Alpha Generation
-
RAG pipeline improves accuracy of Hong Kong health advice
PriHA: A RAG-Enhanced LLM Framework for Primary Healthcare Assistant in Hong Kong
-
Two surface rules route two-hop questions to better retrieval
Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA
-
Bandit method lifts document QA scores 5-18 percent
MAB-DQA: Addressing Query Aspect Importance in Document Question Answering with Multi-Armed Bandits
-
Compressing user history to tokens boosts recommender performance
IAT: Instance-As-Token Compression for Historical User Sequence Modeling in Industrial Recommender Systems
-
Retrieval for LLMs must track answer quality
Beyond Relevance: Utility-Centric Retrieval in the LLM Era
-
Bracket tournament lifts LLM document ranking accuracy
BracketRank: Large Language Model Document Ranking via Reasoning-based Competitive Elimination
-
Algebra step differences yield general strategy embeddings
Towards Generalizable Representations of Mathematical Strategies
-
Raw banking events train versatile financial foundation model
PRAGMA: Revolut Foundation Model
-
Pairwise margins decompose into unique factor shares for rankings
A Mathematical Theory of Ranking
-
RAC holds 96 percent accuracy on unbalanced confidential documents
Retrieval Augmented Classification for Confidential Documents
-
Recognizing ethical knowledge gaps shifts buying habits
Search Changes Consumers' Minds: How Recognizing Gaps Drives Sustainable Choices
-
Explicit sparsity scales recommenders where dense models stall
Beyond Dense Connectivity: Explicit Sparsity for Scalable Recommendation
-
Causal disentanglement boosts cross-domain recommendations
Context-Aware Disentanglement for Cross-Domain Sequential Recommendation: A Causal View
-
Intent taxonomy enriches queries for better infographic matches
Show Me the Infographic I Imagine: Intent-Aware Infographic Retrieval for Authoring Support
-
Semantic model predicts when RAG improves QA answers
Rag Performance Prediction for Question Answering
-
Semantic relevance training improves sponsored search retrieval
Unified Supervision for Walmart's Sponsored Search Retrieval via Joint Semantic Relevance and Behavioral Engagement Modeling