Pith Number

pith:RY2TXKHF

pith:2026:RY2TXKHF67AJGWLTCK37JRCI5M
not attested · not anchored · not stored · refs pending

LLM-Oriented Information Retrieval: A Denoising-First Perspective

Cehao Yang, Fanpu Cao, Hao Liu, Hui Xiong, Liang Sun, Lu Dai, Ziyang Rao

Denoising to maximize evidence density and verifiability becomes the central task in information retrieval for large language models.

arxiv:2605.00505 v2 · 2026-05-01 · cs.IR · cs.AI · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{RY2TXKHF67AJGWLTCK37JRCI5M}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Denoising (maximizing usable evidence density and verifiability within a context window) is becoming the primary bottleneck across the full information access pipeline.

C2 weakest assumption

That LLMs' limited attention budgets and unique vulnerability to noise represent a fundamental paradigm shift in IR that requires a new denoising-first framework, rather than incremental extensions of existing relevance and quality techniques.

C3 one line summary

Denoising to maximize usable evidence density and verifiability is becoming the primary bottleneck in LLM-oriented information retrieval, conceptualized via a four-stage framework and addressed through a pipeline taxonomy of optimization techniques.

Receipt and verification
First computed 2026-05-20T00:04:33.298604Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

8e353ba8e5f7c093597312b7f4c448eb0e6f55ca850c9e2852adc558b68adb24

Aliases

arxiv: 2605.00505 · arxiv_version: 2605.00505v2 · doi: 10.48550/arxiv.2605.00505 · pith_short_12: RY2TXKHF67AJ · pith_short_16: RY2TXKHF67AJGWLT · pith_short_8: RY2TXKHF
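The short aliases listed here are prefixes of the full identifier. The identifier itself appears to be the unpadded Base32 (RFC 4648) encoding of the first 16 bytes of the canonical hash. That relationship is inferred from the values shown on this page, not from a published specification, so the sketch below is illustrative only. The helper names `pith_from_hash` and `short_alias` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "8e353ba8e5f7c093597312b7f4c448eb0e6f55ca850c9e2852adc558b68adb24"
FULL_ID = "RY2TXKHF67AJGWLTCK37JRCI5M"


def pith_from_hash(hex_digest: str, n_bytes: int = 16) -> str:
    """Base32-encode the first n_bytes of a hex digest, dropping '=' padding.

    Assumption: this mirrors how the identifier is derived; it is inferred
    from this page's values, not from documentation.
    """
    raw = bytes.fromhex(hex_digest)[:n_bytes]
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def short_alias(full_id: str, length: int) -> str:
    """Return the leading `length` characters of the identifier."""
    return full_id[:length]


if __name__ == "__main__":
    derived = pith_from_hash(CANONICAL_HASH)
    print("derived identifier:", derived)
    print("matches page identifier:", derived == FULL_ID)
    for n in (8, 12, 16):
        print(f"pith_short_{n}:", short_alias(FULL_ID, n))
```

If the inferred rule holds, anyone holding only the canonical hash can reconstruct the identifier and every short alias offline.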
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/RY2TXKHF67AJGWLTCK37JRCI5M \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 8e353ba8e5f7c093597312b7f4c448eb0e6f55ca850c9e2852adc558b68adb24
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "e52c92cee4d72406a8cecf1b8f11ec55dc3251ac112d14b0cafb53ece80485d1",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.IR",
    "submitted_at": "2026-05-01T08:30:52Z",
    "title_canon_sha256": "c8d19d1e2d7b2dc5b904e4265d694d77ace67307f6666cafb5a0645780d59049"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.00505",
    "kind": "arxiv",
    "version": 2
  }
}
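The verification command above can also be run offline against a saved copy of the canonical record. The sketch below uses the same serialization settings as the `python3 -c` one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8). The function names `canonical_bytes` and `canonical_sha256` are hypothetical. Whether the digest matches depends on the record here being byte-for-byte what the API returns, so the script reports the comparison rather than assuming it.

```python
import hashlib
import json

# Expected digest, copied from the "Canonical hash" field of this record.
EXPECTED = "8e353ba8e5f7c093597312b7f4c448eb0e6f55ca850c9e2852adc558b68adb24"

# Canonical record as printed on this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "e52c92cee4d72406a8cecf1b8f11ec55dc3251ac112d14b0cafb53ece80485d1",
        "cross_cats_sorted": ["cs.AI", "cs.CL"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.IR",
        "submitted_at": "2026-05-01T08:30:52Z",
        "title_canon_sha256": "c8d19d1e2d7b2dc5b904e4265d694d77ace67307f6666cafb5a0645780d59049",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.00505", "kind": "arxiv", "version": 2},
}


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no extra whitespace, then UTF-8 encode."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_sha256(record: dict) -> str:
    """Hex SHA-256 of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_sha256(RECORD)
    print("computed:", digest)
    print("matches expected:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror stores or emits the record's fields.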