pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

1286 papers in cs.IR · page 1

  1. cs.IR 2026-05-22 reviewed
    One model ranks items, carousels and search via user stories

    TubiFM: Unified Item, Carousel, and Search Ranking for Streaming Discovery

    Alexandre Salle +9

  2. cs.IR 2026-05-22 reviewed
    Generative search engines cite AI sources in 16% of cases

    Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources

    Mowafak Allaham +1

  3. cs.DL 2026-05-22 reviewed
    University of Nigeria produced 6,353 papers in 2014-2023

    Tracking a Decade of Research at the University of Nigeria, Nsukka: A Scientometric Analysis (2014-2023)

    Muneer Ahmad +1

  4. cs.IR 2026-05-22 reviewed
    Three-phase recipe keeps 98% precision in 190M retrieval models

    HARNESS-LM: A Three-Phase Training Recipe for Harnessing SLMs in Sponsored Search Retrieval

    Vipul Gupta +6

  5. cs.LG 2026-05-22 reviewed
    Low dimension suffices for near-max retrieval margins

    Is Dimensionality a Barrier for Retrieval Models?

    Kiril Bangachev +3

    4 Piths
  6. cs.IR 2026-05-22 reviewed
    Trajectory merging cuts error buildup in iterative DPO

    TPMM-DPO: Trajectory-aware Preference-guided Model Merging for Iterative Direct Preference Optimization

    Lingling Fu +1

  7. cs.IR 2026-05-22 reviewed
    1B generative recommender backbone beats 2M baseline on MRR

    Towards Generalizable and Efficient Large-Scale Generative Recommenders

    Qiuling Xu +2

  8. cs.IR 2026-05-22 reviewed
    Asymmetric head-to-tail transfer lifts CTR in long-tail rec

    From Head to Tail: Asymmetric Knowledge Transfer in Long-tail Recommendation with Generative Semantic IDs

    Chenyi Yan +5

  9. cs.LG 2026-05-22 reviewed
    RankElastor stabilizes rank trajectories for scaled recommenders

    Expand More, Shrink Less: Shaping Effective-Rank Dynamics for Dense Scaling in Recommendation

    Guoming Li +9

  10. cs.LG 2026-05-21 reviewed
    Two-stage pipeline keeps sensitive mobile data on device for recommendations

    Building a privacy-preserving Federated Recommender system for mobile devices

    Aasheesh Singh

  11. cs.IR 2026-05-21 reviewed
    LaTeX source yields better RAG chunks than PDF text

    AI-Friendly LaTeX: Using LaTeX Code as a Knowledge Source for Retrieval-Augmented Generation

    Tom Verhoeff

  12. cs.IR 2026-05-21 reviewed
    Tables in model cards raise search coverage

    Diversed Model Discovery via Structured Table Discovery

    Zhengyuan Dong +1

  13. cs.CL 2026-05-21 reviewed
    Any embedding model can rank first with the right prompt

    One prompt is not enough: Instruction Sensitivity Undermines Embedding Model Evaluation

    Yevhen Kostiuk +1

  14. cs.AI 2026-05-21 reviewed
    Self-distillation drives search reasoners to 0.440 EM

    Search-E1: Self-Distillation Drives Self-Evolution in Search-Augmented Reasoning

    Zihan Liang +6

  15. cs.CL 2026-05-21 reviewed
    Generative re-ranker lifts biomedical linking accuracy 3-24%

    BeLink: Biomedical Entity Linking Meets Generative Re-Ranking

    Darya Shlyk +2

  16. cs.IR 2026-05-21 reviewed
    Chain-of-thought steps lift generative retrieval by 6.86% on multi-hop tasks

    Integrating Chain-of-Thought into Generative Retrieval: A Preliminary Study

    Wenhao Zhang +4

  17. cs.CV 2026-05-21 reviewed
    OMR tops matched music score search

    Direct content-based retrieval from music scores images

    Noelia Luna-Barahona +4

  18. cs.IR 2026-05-21 reviewed
    Calibration step lifts multimodal recs using only training overlaps

    Behavior-Guided Candidate Calibration for Multimodal Recommendation

    Zesheng Li +2

  19. cs.CL 2026-05-21 reviewed
    RoBERTa reaches 93 percent accuracy on IMDb sentiment task

    From TF-IDF to Transformers: A Comparative and Ensemble Approach to Sentiment Classification

    Dip Biswas Shanto +3

  20. cs.IR 2026-05-21 reviewed
    One autoregressive model handles both recommendations and chat

    Generative Conversational Recommender System

    Sixiao Zhang +2

  21. cs.IR 2026-05-21 reviewed
    LLM semantic retrieval raises ad recommendation stability

    LLM Retrieval for Stable and Predictable Ad Recommendations

    Vinodh Kumar Sunkara +15

  22. cs.IR 2026-05-21 reviewed
    Rec head supplies RL rewards to align LLM reasoning with item predictions

    Reinforced Preference Optimization for Reasoning-Augmented Recommendations

    Jingtong Gao +9

  23. cs.IR 2026-05-20 reviewed
    Seed-guided LLMs match real query lengths 7.5x better than baselines

    Bridging the Cold-Start Gap: LLM-Powered Synthetic Data Generation for Natural Language Search at Airbnb

    Wendy Ran Wei +11

  24. cs.AI 2026-05-20 reviewed
    43M-paper graph gives AI agents deterministic cross-field links

    SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research

    Shuofei Qiao +10

  25. cs.IR 2026-05-20 reviewed
    Explicit principles improve legal citation retrieval

    SG-LegalCite: A Principle-Augmented Benchmark for Legal Citation Retrieval in Singapore Law

    Shannon Lee Yueh Ern +4

  26. cs.IR 2026-05-20 reviewed
    Tests show memory systems mismatch retrieval and answers under conflicts

    MemConflict: Evaluating Long-Term Memory Systems Under Memory Conflicts

    Zhen Tao +6

  27. cs.CL 2026-05-20 reviewed
    7B open LLMs run GraphRAG locally for EHR schema queries

    GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval

    Peter Fernandes +1

  28. cs.IR 2026-05-20 reviewed
    Dual memory layers give LLMs unbounded conversation context

    CALMem: Application-Layer Dual Memory for Conversational AI

    Rajendra Narayan Jena +2

  29. cs.CL 2026-05-20 reviewed
    Self-limiting losses compress embeddings without overfitting

    DIVE: Embedding Compression via Self-Limiting Gradient Updates

    Dongfang Zhao

  30. cs.IR 2026-05-20 reviewed
    Middle-layer compression raises reranker speed up to 116%

    Layer-wise Token Compression for Efficient Document Reranking

    Shengyao Zhuang +2

  31. cs.IR 2026-05-20 reviewed
    Middle-layer compression speeds document reranking up to 116%

    Layer-wise Token Compression for Efficient Document Reranking

    Shengyao Zhuang +2

  32. cs.LG 2026-05-19 reviewed
    Gating ensemble harvests reliable negatives for fraud models

    SAGE: Scalable Automatic Gating Ensemble for Confident Negative Harvesting in Fraud Detection

    Sudheer Tubati +1

  33. cs.CR 2026-05-19 reviewed
    Bidirectional ranking cuts RAG poisoning attacks by 54%

    BiRD: A Bidirectional Ranking Defense Mechanism for Retrieval Augmented Generation

    Chengcai Gao +4

  34. cs.CR 2026-05-19 reviewed
    Colluding accounts scale RAG privacy loss by sqrt(k)

    Auditing Privacy in Multi-Tenant RAG under Account Collusion

    Florian A. D. Burnat +1

  35. cs.IR 2026-05-19 reviewed
    Multi-source sampling breaks negative feedback loops in recommendations

    Divergence Meets Consensus: A Multi-Source Negative Sampling Framework for Sequential Recommendation

    Yuanzi Li +6

  36. cs.IR 2026-05-19 reviewed
    Wacky weights help SPLADE mainly inside the training domain

    Understanding Wacky Weights: A Dissection of SPLADE's Learned Term Importance

    Gregory Polyakov +2

  37. cs.IR 2026-05-18 reviewed
    TF-IDF beats GPT-4o at finding astronomy expert reviewers

    Traditional statistical representations outperform generative AI in identifying expert peer reviewers

    Vicente Amado Olivo +7

  38. cs.IR 2026-05-18 reviewed
    q-log odds lift BM25 NDCG@10 by 89% on code search

    Improving BM25 Code Retrieval Under Fixed Generic Tokenization: Adaptive q-Log Odds as a Drop-In BM25 Fix

    Santosh Kumar Radha +1

  39. cs.CL 2026-05-18 reviewed
    Wiki beats RAG on cross-paper links but costs more tokens

    Vector RAG vs LLM-Compiled Wiki: A Preregistered Comparison on a Small Multi-Domain Research

    Theodore O. Cochran

  40. cs.IR 2026-05-18 reviewed
    Text guidance focuses full images for cropped-query e-commerce search

    TIGER-FG: Text-Guided Implicit Fine-Grained Grounding for E-commerce Retrieval

    Xinyu Sun +7

  41. cs.AI 2026-05-18 reviewed
    Self-distillation supplies step-level search signals from own rollouts

    SD-Search: On-Policy Hindsight Self-Distillation for Search-Augmented Reasoning

    Yufei Ma +8

  42. cs.CL 2026-05-18 reviewed
    Preference focus cuts on-device RAG memory 2400-fold

    From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG

    Changmin Lee +2

  43. cs.IR 2026-05-18 reviewed
    Prompting methods raise table QA accuracy without training

    Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting

    Amritansh Maurya +3

  44. cs.IR 2026-05-18 reviewed
    RCTEA aligns temporal entities via richness-guided fusion

    RCTEA: Richness-guided Co-training for Temporal Entity Alignment

    Jiayun Li +5

  45. cs.CL 2026-05-18 reviewed
    SomaliWeb v1 delivers 303M tokens of cleaned Somali text

    SomaliWeb v1: A Quality-Filtered Somali Web Corpus with a Matched Tokenizer and a Public Language-Identification Benchmark

    Khalid Yusuf Dahir

  46. cs.IR 2026-05-18 reviewed
    LLM pseudoqueries from table profiles improve dataset search

    PIPER: Content-Based Table Search via profiling and LLM-Generated Pseudoqueries

    Riccardo Terrenzi +3

  47. cs.CR 2026-05-18 reviewed
    Indirect injections hijack chatbots to leak user data

    An Empirical Study of Privacy Leakage Chains via Prompt Injection in Black-Box Chatbot Environments

    Hongjang Yang +2

  48. cs.IR 2026-05-18 reviewed
    SynGR boosts generative recs by limiting dominant modalities

    SynGR: Unleashing the Potential of Cross-Modal Synergy for Generative Recommendation

    Wei Chen +8

  49. cs.IR 2026-05-18 reviewed
    Dynamic modulation replaces static IDs in multimodal recommendations

    Modality-Aware Identity Construction and Counterfactual Structure Learning for ID-Free Multimodal Recommendation

    Hongjian Ma +4

  50. cs.IR 2026-05-18 reviewed
    E-commerce search lifts new-item GMV 5.3 percent via long-term value estimates

    Towards Sustainable Growth: A Multi-Value-Aware Retrieval Framework for E-Commerce Search

    Yifan Wang +4