Pith Number

pith:VVAKTGNX

pith:2026:VVAKTGNXZJT4UCU5BKB6GVKP7I
not attested · not anchored · not stored · refs resolved

When Evidence Conflicts: Uncertainty and Order Effects in Retrieval-Augmented Biomedical Question Answering

Halil Kilicoglu, Mengfei Lan, Yikun Han

Reversing the order of conflicting biomedical documents causes large language models to flip their answers in 11 to 25 percent of cases.

arxiv:2605.14115 v1 · 2026-05-13 · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{VVAKTGNXZJT4UCU5BKB6GVKP7I}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

In this conflicting-evidence order contrast, where the same two documents are both present and only their order is reversed, accuracy drops for every model and 11.4% to 25.2% of predictions flip.
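The flip percentage in this claim can be read as the share of questions whose predicted answer changes between the two document orders. The following is a minimal sketch of that arithmetic, not the authors' evaluation code; the function names `flip_rate` and `accuracy` and the toy labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def flip_rate(preds_original: Sequence[str], preds_reversed: Sequence[str]) -> float:
    """Fraction of questions whose prediction changes when document order is reversed."""
    if len(preds_original) != len(preds_reversed):
        raise ValueError("prediction lists must be aligned question by question")
    if not preds_original:
        raise ValueError("need at least one prediction")
    flips = sum(a != b for a, b in zip(preds_original, preds_reversed))
    return flips / len(preds_original)


def accuracy(preds: Sequence[str], gold: Sequence[str]) -> float:
    """Plain accuracy of predictions against gold labels."""
    if len(preds) != len(gold) or not gold:
        raise ValueError("predictions and gold labels must be non-empty and aligned")
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)


# Toy, made-up labels (NOT data from the paper).
gold = ["yes", "no", "yes", "no", "yes"]
preds_ab = ["yes", "no", "yes", "yes", "yes"]  # documents shown in order A, B
preds_ba = ["yes", "no", "no", "yes", "no"]    # same documents, order B, A

print("flip rate:", flip_rate(preds_ab, preds_ba))
print("accuracy A,B:", accuracy(preds_ab, gold))
print("accuracy B,A:", accuracy(preds_ba, gold))
```

Note that a flip is counted whether or not either prediction is correct, so the flip rate and the accuracy drop measure different things.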

C2 weakest assumption

That the controlled conflicting-evidence conditions created in HealthContradict are representative of the contradictory or incomplete evidence that real-world biomedical RAG systems encounter.

C3 one line summary

Conflicting biomedical evidence triggers order-dependent prediction flips in RAG LLMs, and a new abstention score combining confidence with conflict detection raises selective accuracy by 7-33 points in the hardest conditions.
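Selective accuracy, as used in this summary, is accuracy measured only on the questions a system chooses to answer. The sketch below illustrates the idea under stated assumptions: this record does not give the paper's scoring formula, so `abstention_score` uses a hypothetical product-style combination of model confidence and a conflict-detection probability, and the threshold and toy numbers are invented for illustration.

```python
from typing import Sequence, Tuple


def abstention_score(confidence: float, conflict_prob: float) -> float:
    """Hypothetical combination (an assumption, not the paper's formula).

    The score is high when the model is unsure or the evidence looks conflicting.
    """
    return 1.0 - confidence * (1.0 - conflict_prob)


def selective_accuracy(
    correct: Sequence[bool], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (accuracy over answered items, coverage).

    An item is answered when its abstention score is below the threshold.
    """
    if len(correct) != len(scores) or not correct:
        raise ValueError("inputs must be non-empty and aligned")
    answered = [c for c, s in zip(correct, scores) if s < threshold]
    coverage = len(answered) / len(correct)
    acc = sum(answered) / len(answered) if answered else float("nan")
    return acc, coverage


# Toy, made-up values (NOT results from the paper).
confidence = [0.9, 0.8, 0.6, 0.95]
conflict = [0.1, 0.7, 0.2, 0.9]
correct = [True, False, True, False]

scores = [abstention_score(c, p) for c, p in zip(confidence, conflict)]
acc, cov = selective_accuracy(correct, scores, threshold=0.6)
print("selective accuracy:", acc, "coverage:", cov)
```

In this toy case the two items flagged as likely conflicts are the ones the system abstains on, so accuracy on the answered half is higher than accuracy over all four items. Any real comparison would need to report coverage alongside selective accuracy.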

References

24 extracted · 24 resolved · 4 Pith anchors

Receipt and verification
First computed: 2026-05-17T23:39:11.957796Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

ad40a999b7ca67ca0a9d0a83e3554ffa277f375d1753ecd6457faa5be0708700

Aliases

arxiv: 2605.14115 · arxiv_version: 2605.14115v1 · doi: 10.48550/arxiv.2605.14115 · pith_short_12: VVAKTGNXZJT4 · pith_short_16: VVAKTGNXZJT4UCU5 · pith_short_8: VVAKTGNX
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/VVAKTGNXZJT4UCU5BKB6GVKP7I \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ad40a999b7ca67ca0a9d0a83e3554ffa277f375d1753ecd6457faa5be0708700
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "2205b5d8472ebac2bf3d479bf63e770923d2c4bb13a523ad3ac752f023b4bc44",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-13T21:02:24Z",
    "title_canon_sha256": "8520796670a6a585869558ab1278ab8a1994dad5f0231d1e2814ac587e7cbd43"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14115",
    "kind": "arxiv",
    "version": 1
  }
}
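The shell one-liner under "Verify this Pith Number yourself" serializes the canonical record with sorted keys and compact separators, then takes the SHA-256 of the UTF-8 bytes. A minimal offline sketch of the same steps is shown here, assuming the record printed above is exactly what the endpoint returns in `.canonical_record`; the helper name `canonical_sha256` is hypothetical.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the same way as the page's shell one-liner.

    Keys are sorted, separators are compact, and non-ASCII is kept as-is
    before UTF-8 encoding.
    """
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "2205b5d8472ebac2bf3d479bf63e770923d2c4bb13a523ad3ac752f023b4bc44",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-13T21:02:24Z",
        "title_canon_sha256": "8520796670a6a585869558ab1278ab8a1994dad5f0231d1e2814ac587e7cbd43",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14115", "kind": "arxiv", "version": 1},
}

# Canonical hash as listed on this page.
expected = "ad40a999b7ca67ca0a9d0a83e3554ffa277f375d1753ecd6457faa5be0708700"

digest = canonical_sha256(record)
print("computed:", digest)
print("matches listed canonical hash:", digest == expected)
```

Because keys are sorted before hashing, the order in which the record's fields are written does not affect the digest; any change to a value does.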