pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

1286 papers in cs.IR · page 7

  1. cs.IR 2026-05-01 reviewed
    ReformIR prevents drift as reformulation count rises

    When More Reformulations Hurt: Avoiding Drift using Ranker Feedback

    V Venktesh +2

  2. cs.LG 2026-05-01 reviewed
    Iterative tree merging lifts cross-document RAG F1 by 25.9% over RAPTOR

    Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation

    Ziwen Zhao +1

  3. cs.IR 2026-05-01 reviewed
    Denoising emerges as main bottleneck for LLM retrieval

    LLM-Oriented Information Retrieval: A Denoising-First Perspective

    Lu Dai +6

  4. cs.IR 2026-05-01 reviewed
    Denoising becomes the main bottleneck for LLM retrieval

    LLM-Oriented Information Retrieval: A Denoising-First Perspective

    Lu Dai +6

  5. cs.IR 2026-05-01 reviewed
    Dual experts separate habits from discovery in basket predictions

    Time-Interval-Aware Disentangled Expert Modeling for Next-Basket Recommendation

    Zhiying Deng +5

  6. cs.IR 2026-05-01 reviewed
    SCARV stabilizes rankings in redundant NLP datasets

    SCARV: Structure-Constrained Aggregation for Stable Sample Ranking in Redundant NLP Datasets

    Xu Zheng +4

  7. cs.IR 2026-05-01 reviewed
    Table retrievers ignore explicit instructions on content and columns

    FollowTable: A Benchmark for Instruction-Following Table Retrieval

    Rihui Jin +9

  8. cs.IR 2026-05-01 reviewed
    Taxonomy negatives raise offline accuracy 2.6% but no online gain

    Negative Data Mining for Contrastive Learning in Dense Retrieval at IKEA.com

    Eva Agapaki +1

  9. cs.IR 2026-05-01 reviewed
    Dynamic negative selection prevents DPO collapse

    DynamicPO: Dynamic Preference Optimization for Recommendation

    Xingyu Hu +9

  10. cs.IR 2026-05-01 reviewed
    Gradual feature fading speeds efficiency rollouts by 5x

    Intelligent Elastic Feature Fading: Enabling Model Retrain-Free Feature Efficiency Rollouts at Scale

    Jieming Di +23

  11. cs.CL 2026-05-01 reviewed
    Row-aware chunking slashes table fragments up to 56%

    Structure-Aware Chunking for Tabular Data in Retrieval-Augmented Generation

    Pooja Guttal +5

  12. cs.CL 2026-04-30 reviewed
    14B RAG model reaches 68.75% of GPT-4o on CA tasks

    Retrieval-Augmented Reasoning for Chartered Accountancy

    Jatin Gupta +3

  13. cs.CL 2026-04-30 reviewed
    RSAT trains small language models to output step-by-step table reasoning in structured…

    RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners

    Jugal Gajjar +1

  14. cs.CL 2026-04-30 reviewed
    Integrated cell citations raise small-model faithfulness 3.7x on tables

    RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners

    Jugal Gajjar +1

  15. cs.NI 2026-04-30 reviewed
    LLM-dominant sites are prevalent and growing on the web

    DeGenTWeb: A First Look at LLM-dominant Websites

    Sichang Steven He +3

  16. cs.IR 2026-04-30 reviewed
    Token-aware clustering makes multivector search nearly 10x faster

    Efficient Multivector Retrieval with Token-Aware Clustering and Hierarchical Indexing

    Silvio Martinico +3

  17. cs.CL 2026-04-30 reviewed
    Templates from past queries boost Text-to-SQL accuracy 36%

    Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding

    Smit Jivani +2

  18. cs.IR 2026-04-30 reviewed
    Human-likeness test fails to predict simulator ranking validity

    SimEval-IR: A Unified Toolkit and Benchmark Suite for Evaluating User Simulators and Search Sessions

    Saber Zerhoudi

  19. cs.IR 2026-04-30 reviewed
    Evidence chains raise RAG reasoning accuracy at under 20% token cost

    NeocorRAG: Less Irrelevant Information, More Explicit Evidence, and More Effective Recall via Evidence Chains

    Shiyao Peng +9

  20. cs.AI 2026-04-30 reviewed
    ObjectGraph cuts agent document tokens by 95% without accuracy loss

    ObjectGraph: From Document Injection to Knowledge Traversal -- A Native File Format for the Agentic Era

    Mohit Dubey +1

  21. cs.IR 2026-04-30 reviewed
    AI Overviews appear for 51.5% of queries with different sources

    How Generative AI Disrupts Search: An Empirical Study of Google Search, Gemini, and AI Overviews

    Riley Grossman +5

  22. cs.IR 2026-04-30 reviewed
    Position embeddings speed LLM list recommendation 3x

    Position-Aware Drafting for Inference Acceleration in LLM-Based Generative List-Wise Recommendation

    Jiaju Chen +6

  23. cs.CL 2026-04-30 reviewed
    One hub text outscores real captions for many images in CLIP

    One Single Hub Text Breaks CLIP: Identifying Vulnerabilities in Cross-Modal Encoders via Hubness

    Hiroyuki Deguchi +2

  24. cs.IR 2026-04-30 reviewed
    Multimodal RAG gains 27% CIDEr by selecting fragments

    Purifying Multimodal Retrieval: Fragment-Level Evidence Selection for RAG

    Xihang Wang +6

  25. cs.IR 2026-04-30 reviewed
    LLM rerankers produce stable rankings regardless of candidate order

    One Pass, Any Order: Position-Invariant Listwise Reranking for LLM-Based Recommendation

    Ethan Bito +2

  26. cs.IR 2026-04-30 reviewed
    Survey maps benchmarks and taxonomy for reasoning-intensive retrieval

    A Survey of Reasoning-Intensive Retrieval: Progress and Challenges

    Yiyang Wei +3

  27. cs.IR 2026-04-30 reviewed
    Adaptive reranking via corpus graph lifts reasoning retrieval

    Reproducing Adaptive Reranking for Reasoning-Intensive IR

    Mandeep Rathee +3

  28. cs.IR 2026-04-30 reviewed
    LLM reformulation gains vanish on neural retrievers

    A Reproducibility Study of LLM-Based Query Reformulation

    Amin Bigdeli +6

  29. cs.IR 2026-04-30 reviewed
    LLM attribute graphs boost zero-shot ranking precision over 5%

    From Unstructured to Structured: LLM-Guided Attribute Graphs for Entity Search and Ranking

    Yilun Zhu +2

  30. cs.CR 2026-04-30 reviewed
    LLM framework cuts SOC triage time to under 10 minutes

    Toward Autonomous SOC Operations: End-to-End LLM Framework for Threat Detection, Query Generation, and Resolution in Security Operations

    Md Hasan Saju +1

  31. cs.IR 2026-04-30 reviewed
    Managed atomic nuggets raise RAG recall 42% and cut conflicts 55%

    NuggetIndex: Governed Atomic Retrieval for Maintainable RAG

    Saber Zerhoudi +2

  32. cs.IR 2026-04-29 reviewed
    Log-retrieved queries plus LLM variants raise QPP accuracy up to 30%

    RAQG-QPP: Query Performance Prediction with Retrieved Query Variants and Retrieval Augmented Query Generation

    Fangzheng Tian +2

  33. physics.ao-ph 2026-04-29 reviewed
    Multi-sensor system maps Pakistan floods daily in near real time

    Continuous Flood Nowcasting in South Asia: A Multi-Sensor Ensemble Remote Sensing Framework for Flood Extent

    Usman Nazir +3

  34. cs.IR 2026-04-29 reviewed
    LLM pipeline spots Snapchat trends at production scale

    LLM-Enhanced Topical Trend Detection at Snapchat

    Hangqi Zhao +8

  35. cs.IR 2026-04-29 reviewed
    Gated contrastive model boosts ranking metrics for review recommenders

    A Gated Hybrid Contrastive Collaborative Filtering Recommendation

    Eduardo Ferreira da Silva +8

  36. cs.IR 2026-04-29 reviewed
    Reproduction confirms Hypencoder beats bi-encoders with faster search

    Hypencoder Revisited: Reproducibility and Analysis of Non-Linear Scoring for First-Stage Retrieval

    Arne Eichholtz +4

  37. cs.IR 2026-04-29 reviewed
    Multiple latent factors boost LLM recommendations

    Factorized Latent Reasoning for LLM-based Recommendation

    Tianqi Gao +5

  38. cs.IR 2026-04-29 reviewed
    AgentSim builds 100k+ verifiable reasoning traces for RAG agents

    AgentSim: A Platform for Verifiable Agent-Trace Simulation

    Saber Zerhoudi +2

  39. cs.IR 2026-04-29 reviewed
    User state representation beats algorithm choice in recommenders

    The Bandit's Blind Spot: The Critical Role of User State Representation in Recommender Systems

    Pedro R. Pires +4

  40. cs.IR 2026-04-29 reviewed
    Uncertainty-triggered retrieval raises F1 10% with 47% fewer calls

    When to Retrieve During Reasoning: Adaptive Retrieval for Large Reasoning Models

    Dongxin Guo +2

  41. cs.LG 2026-04-29 reviewed
    DNNs prevent embedding collapse in feature interaction models

    Understanding DNNs in Feature Interaction Models: A Dimensional Collapse Perspective

    Jiancheng Wang +3

  42. cs.IR 2026-04-29 reviewed
    Compressed embeddings let 8B reranker run 3-18x faster than smaller models

    Efficient Listwise Reranking with Compressed Document Representations

    Hervé Déjean +1

  43. cs.IR 2026-04-29 reviewed
    CARD introduces a generative recommendation framework that unifies textual

    CARD: Non-Uniform Quantization of Visual Semantic Unit for Generative Recommendation

    Yibiao Wei +6

  44. cs.IR 2026-04-29 reviewed
    Targeted privacy noise plus meta-learning raises rec accuracy

    Meta-Learning and Targeted Differential Privacy to Improve the Accuracy-Privacy Trade-off in Recommendations

    Peter Müllner +3

  45. cs.IR 2026-04-29 reviewed
    Targeted DP plus meta-learning lifts recsys accuracy

    Meta-Learning and Targeted Differential Privacy to Improve the Accuracy-Privacy Trade-off in Recommendations

    Peter Müllner +3

  46. cs.CL 2026-04-29 reviewed
    Document AI stages barely correlate

    Benchmarking Complex Multimodal Document Processing Pipelines: A Unified Evaluation Framework for Enterprise AI

    Saurabh K. Singh +1

  47. cs.CL 2026-04-29 reviewed
    Query-adaptive chunking boosts RAG F1 to 0.85

    Query-Adaptive Semantic Chunking for Retrieval-Augmented Generation: A Dynamic Strategy with Contextual Window Expansion

    Mudit Rastogi

  48. cs.IR 2026-04-29 reviewed
    Reflexive prompting fixes LLM recommender drift on complex domains

    A Reproducibility Analysis of PO4ISR: Diagnosing and Mitigating Semantic Drift in LLM-Based Session Recommendation

    Aditya Tiwari +2

  49. cs.IR 2026-04-29 reviewed
    Measure classification yields exact attribution formulas for some cases

    Explaining the "Why": A Unified Framework for the Additive Attribution of Changes in Arbitrary Measures

    Changsheng Zhou +5

  50. cs.IR 2026-04-29 reviewed
    Recency as spectral operator adapts multimodal recommendations

    TimeMM: Time-as-Operator Spectral Filtering for Dynamic Multimodal Recommendation

    Wei Yang +6