archive

Every paper Pith has read. Search by title, abstract, or pith.

1286 papers in cs.IR · page 3

  1. cs.CR 2026-05-13 reviewed
    Small rotations hide data in embeddings undetected

    VectorSmuggle: Steganographic Exfiltration in Embedding Stores and a Cryptographic Provenance Defense

    Jascha Wanger

  2. cs.IR 2026-05-13 reviewed
    The paper describes benchmarks of XRootD and Pelican services in the Open Science Data…

    Benchmarking the Open Science Data Federation services to develop XRootD best practices

    Fabio Andrijauskas +2

  3. cs.IR 2026-05-13 reviewed
    Granite R2 models lead multilingual retrieval in 200+ languages

    Granite Embedding Multilingual R2 Models

    Parul Awasthy +17

  4. cs.IR 2026-05-13 reviewed
    LLM profiles boost recommender simulation ranking by 7%

    Task-Aware Automated User Profile Generation for Recommendation Simulation Using Large Language Models

    Xinye Wanyan +4

  5. cs.AI 2026-05-13 reviewed
    Graph links convergent claims from multiple innovation methods

    IdeaForge: A Knowledge Graph-Grounded Multi-Agent Framework for Cross-Methodology Innovation Analysis and Patent Claim Generation

    Joy Bose

  6. cs.DL 2026-05-13 reviewed
    Graph links 200k research repos to papers and artifacts

    SemRepo: A Knowledge Graph for Research Software and Its Scholarly Ecosystem

    Abdul Rafay +3

  7. cs.CL 2026-05-13 reviewed
    Parallel dataset gives medical dialogues in nine Indic languages

    IndicMedDialog: A Parallel Multi-Turn Medical Dialogue Dataset for Accessible Healthcare in Indic Languages

    Shubham Kumar Nigam +2

  8. cs.CL 2026-05-13 reviewed
    Latent info gain ranks visual evidence for better multimodal RAG

    Utility-Oriented Visual Evidence Selection for Multimodal Retrieval-Augmented Generation

    Weiqing Luo +5

  9. cs.IR 2026-05-13 reviewed
    AI assistant retrieves Kadi research data under privacy controls

    KadiAssistant: A conversational AI Agent for information retrieval in Kadi4Mat

    Adrian Cierpka +5

  10. cs.IR 2026-05-13 reviewed
    LeanSearch v2 lifts Lean 4 proof success to 20%

    LeanSearch v2: Global Premise Retrieval for Lean 4 Theorem Proving

    Guoxiong Gao +7

  12. cs.CL 2026-05-13 reviewed
    Knowledge base lifts Text-to-SQL accuracy when data is scarce

    Knowledge Distillation for Low-Resource Open-source Text-to-SQL Model

    Tianhao Qiu +1

  13. cs.MA 2026-05-13 reviewed
    Multi-agent system automates VC due diligence

    A Multi-Agent Orchestration Framework for Venture Capital Due Diligence

    Grigorios Alexandrou +1

  14. cs.IR 2026-05-13 reviewed
    Half of ReDial CRS accuracy traced to repetition shortcuts

    A Standardized Re-evaluation of Conversational Recommender Systems on the ReDial Dataset

    Ivica Kostric +1

  16. cs.IR 2026-05-13 reviewed
    LLMs predict query-specific validity horizons for web content

    RAG-Enhanced Large Language Models for Dynamic Content Expiration Prediction in Web Search

    Tingyu Chen +6

  17. cs.CV 2026-05-13 reviewed
    Source figures become verifiable evidence in deep research reports

    ViDR: Grounding Multimodal Deep Research Reports in Source Visual Evidence

    Zhuofan Shi +6

  18. cs.AI 2026-05-13 reviewed
    KITE tutor raises simulated student accuracy on algorithm tasks

    Retrieval-Augmented Tutoring for Algorithm Tracing and Problem-Solving in AI Education

    Mragisha Jain +8

  19. cs.IR 2026-05-13 reviewed
    Context changes what the same image means for retrieval

    Same Image, Different Meanings: Toward Retrieval of Context-Dependent Meanings

    Ayuto Tsutsumi +1

  20. cs.IR 2026-05-13 reviewed
    Linked page ecosystems steer LLM agents to target recommendations

    EcoGEO: Trajectory-Aware Evidence Ecosystems for Web-Enabled LLM Search Agents

    Hengwei Ye +3

  21. cs.IR 2026-05-12 reviewed
    Code scaffolds raise small model MCQA accuracy by 28 points

    Code-Guided Reasoning for Small Language Models: Evaluating Executable MCQA Scaffolds

    Prateek Biswas +4

  22. cs.IR 2026-05-12 reviewed
    MLP distillation accelerates generative recommenders 8.74 times

    MLPs are Efficient Distilled Generative Recommenders

    Zitian Guo +4

  23. cs.HC 2026-05-12 reviewed
    Admins like AI help writing WhatsApp rules but fear trust breaches

    Creating Group Rules with AI: Human-AI Collaboration in WhatsApp Moderation

    Gauri Nayak +3

  24. cs.CL 2026-05-12 reviewed
    LLM refines embeddings at test time for up to 25% gains

    Task-Adaptive Embedding Refinement via Test-time LLM Guidance

    Ariel Gera +4

  25. cs.CL 2026-05-12 reviewed
    This paper proposes ORBIT, a method that tracks how far a fine-tuned generative retrieval…

    ORBIT: Preserving Foundational Language Capabilities in GenRetrieval via Origin-Regulated Merging

    Neha Verma +9

  26. cs.CL 2026-05-12 reviewed
    Entropy of plausibility scores estimates LLM question difficulty

    Question Difficulty Estimation for Large Language Models via Answer Plausibility Scoring

    Jamshid Mozafari +2

  27. cs.CL 2026-05-12 reviewed
    High-convergence sentences lift LLM accuracy on inferential questions

    Context Convergence Improves Answering Inferential Questions

    Jamshid Mozafari +2

  28. cs.CL 2026-05-12 reviewed
    Benchmark forces models to combine facts from two articles

    MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering

    Rezarta Islamaj +15

  29. cs.IR 2026-05-12 reviewed
    Prototype-guided retrieval improves EHR clinical predictions

    EHR-RAGp: Retrieval-Augmented Prototype-Guided Foundation Model for Electronic Health Records

    Saeed Shurrab +3

  30. cs.CL 2026-05-12 reviewed
    Retrieval lifts two-hop medical QA to 89% conceptual accuracy

    Overview of the MedHopQA track at BioCreative IX: track description, participation and evaluation of systems for multi-hop medical question answering

    Rezarta Islamaj +15

  31. cs.IR 2026-05-12 reviewed
    BatchBench framework equalizes autoscaling policy tests

    BatchBench: Toward a Workload-Aware Benchmark for Autoscaling Policies in Big Data Batch Processing -- A Proposed Framework

    Venkata Krishna Prasanth Budigi +1

  32. cs.IR 2026-05-12 reviewed
    Crowdsourcing validates LLM ontology mappings at scale

    Unlocking Crowdsourcing for Ontology Matching Validation

    Zhangcheng Qiang

  33. cs.IR 2026-05-12 reviewed
    Three mechanisms make crowdsourcing reliable for ontology match validation

    Unlocking Crowdsourcing for Ontology Matching Validation

    Zhangcheng Qiang

  34. cs.CV 2026-05-12 reviewed
    One autoregressive model makes personalized ad images and text

    Design Your Ad: Personalized Advertising Image and Text Generation with Unified Autoregressive Models

    Yexing Xu +17

  35. cs.CL 2026-05-12 reviewed
    Three-stage retrieval pipeline ranks 8th in SemEval multi-turn task

    Caraman at SemEval-2026 Task 8: Three-Stage Multi-Turn Retrieval with Query Rewriting, Hybrid Search, and Cross-Encoder Reranking

    David-Maximilian Caraman +1

  36. cs.IR 2026-05-12 reviewed
    Health record trajectories improve image-based disease forecasts

    From Trajectories to Phenotypes: Disease Progression as Structural Priors for Multi-organ Imaging Representation Learning

    Zian Wang +11

  37. cs.DS 2026-05-12 reviewed
    Ulam similarity admits O(n/sqrt(log n)) LSH distortion

    On the LSH Distortion of Ulam and Cayley Similarities

    Flavio Chierichetti +3

  38. cs.IR 2026-05-12 reviewed
    Benchmark with 1M entries tests multi-dimensional rewards for recommender agents

    RecRM-Bench: Benchmarking Multidimensional Reward Modeling for Agentic Recommender Systems

    Wenwen Zeng +12

  39. cs.IR 2026-05-12 reviewed
    ZipRerank matches top multimodal rerankers at 10x lower latency

    Very Efficient Listwise Multimodal Reranking for Long Documents

    Yiqun Sun +2

  40. cs.LG 2026-05-12 reviewed
    Single max nonconformity score covers every pipeline stage at 1-alpha

    PASC: Pipeline-Aware Conformal Prediction with Joint Coverage Guarantees for Multi-Stage NLP and LLM Pipelines

    Varun Kotte

  41. cs.IR 2026-05-12 reviewed
    Critic and generator agents iteratively refine research outlines

    AgentDisCo: Towards Disentanglement and Collaboration in Open-ended Deep Research Agents

    Jiarui Jin +4

  42. cs.IR 2026-05-12 reviewed
    Dual-context views with quality weights boost sequential recs

    Quality-Aware Collaborative Multi-Positive Contrastive Learning for Sequential Recommendation

    Wei Wang

  43. cs.IR 2026-05-12 reviewed
    Staged mining and activity grouping boost LLM recommendations

    HSUGA: LLM-Enhanced Recommendation with Hierarchical Semantic Understanding and Group-Aware Alignment

    Guorui Li +4

  44. cs.SD 2026-05-12 reviewed
    Computational graphs map Rossini arietta revisions

    Advanced Scientific Methodology Plays Rossini

    Silvia Licciardi +3

  45. cs.IR 2026-05-12 reviewed
    Planner picks slow reasoning only when it improves recommendations

    TwiSTAR: Think Fast, Think Slow, Then Act, Generative Recommendation with Adaptive Reasoning

    Shiteng Cao +3

  46. cs.IR 2026-05-12 reviewed
    Conditional memory fixes SID representation conflicts in generative recommendation

    Conditional Memory Enhanced Item Representation for Generative Recommendation

    Ziwei Liu +4

  47. cs.IR 2026-05-12 reviewed
    Codebooks quantize signals to boost multi-market CTR privately

    FedMM: Federated Collaborative Signal Quantization for Multi-Market CTR Prediction

    Jun Zhang +4

  48. cs.LG 2026-05-12 reviewed
    Test-time algebra boosts frozen embedding retrieval

    Test-Time Compute for Frozen Embedding Models through Agentic Program Search

    Han Xiao

  49. cs.LG 2026-05-12 reviewed
    Centroid interpolation lifts nDCG for any frozen embedder

    Test-Time Compute for Frozen Embedding Models through Agentic Program Search

    Han Xiao

  50. cs.CL 2026-05-12 reviewed
    LLMs extract causal relations from disaster social media

    Large Language Models for Causal Relations Extraction in Social Media: A Validation Framework for Disaster Intelligence

    Ujun Jeong +5