pith:DAATHSCB
Reasoning with Exploration: An Entropy Perspective
Augmenting the RL advantage function with an entropy term improves LLM reasoning on Pass@K by encouraging longer exploratory chains.
arxiv:2506.14758 v4 · 2025-06-17 · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{DAATHSCBPPTSQC6F3DVN7HDE2A}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Our method achieves significant gains on the Pass@K metric, an upper-bound estimator of LLM reasoning capabilities, even when evaluated with extremely large K values, pushing the boundaries of LLM reasoning.
The observed positive correlations between high-entropy regions and beneficial exploratory actions (pivotal tokens, reflection, rare behaviors) will translate into improved downstream reasoning performance when the entropy term is added to the advantage function.
Augmenting the RL advantage with an entropy term promotes deeper LLM reasoning chains and raises Pass@K scores.
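The claims above describe adding an entropy term to the RL advantage, but this record does not state the exact formula. The following is a minimal sketch under stated assumptions: the bonus is taken to be a scaled per-token entropy, clipped so it never exceeds a fraction of the advantage's magnitude. The names `token_entropy` and `shape_advantages` and the parameters `alpha` and `kappa` are hypothetical and chosen only for illustration.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def shape_advantages(advantages, entropies, alpha=0.4, kappa=2.0):
    """Add a clipped entropy bonus to each per-token advantage.

    ASSUMPTION: this form is illustrative, not taken from the record.
    alpha scales the entropy; kappa (> 1) caps the bonus at |A| / kappa,
    so the bonus can shrink a negative advantage but never flip its sign.
    In a gradient-based trainer the bonus would presumably be treated as
    a constant rather than differentiated through.
    """
    shaped = []
    for adv, ent in zip(advantages, entropies):
        bonus = min(alpha * ent, abs(adv) / kappa)
        shaped.append(adv + bonus)
    return shaped


if __name__ == "__main__":
    uncertain = token_entropy([0.25, 0.25, 0.25, 0.25])  # high-entropy token
    confident = token_entropy([1.0, 0.0, 0.0, 0.0])      # zero-entropy token
    print(shape_advantages([1.0, 1.0], [uncertain, confident]))
```

Under these assumptions a high-entropy token receives a larger advantage than a confident token with the same base advantage, which is one way to read the claim that the entropy term encourages exploratory behavior.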
Receipt and verification
| First computed | 2026-05-17T23:38:48.849568Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
180133c8417be7280bc5d8eadf9c64d018a7830496441949e808ed3313acc502
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/DAATHSCBPPTSQC6F3DVN7HDE2A \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 180133c8417be7280bc5d8eadf9c64d018a7830496441949e808ed3313acc502
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "5bec794faeb56b17b0c7c956ea9d005937306fb3a32dab0f0ba478a155f70bf3",
"cross_cats_sorted": [],
"license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
"primary_cat": "cs.CL",
"submitted_at": "2025-06-17T17:54:03Z",
"title_canon_sha256": "034df4168332dcc08d8cea9f107cc98a5b3c3e9ff5e87576f78fdb2d99b5faf0"
},
"schema_version": "1.0",
"source": {
"id": "2506.14758",
"kind": "arxiv",
"version": 4
}
}
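The canonicalisation step in the verification one-liner above can also be written as a standalone function. This is a sketch that mirrors the `json.dumps` arguments shown in that command (sorted keys, compact separators, non-ASCII preserved); the function name `canonical_sha256` is hypothetical, and the record literal is copied from the canonical record JSON above.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a record after serialising it with sorted keys and no whitespace."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode()  # str.encode() defaults to UTF-8
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    record = {
        "metadata": {
            "abstract_canon_sha256": "5bec794faeb56b17b0c7c956ea9d005937306fb3a32dab0f0ba478a155f70bf3",
            "cross_cats_sorted": [],
            "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
            "primary_cat": "cs.CL",
            "submitted_at": "2025-06-17T17:54:03Z",
            "title_canon_sha256": "034df4168332dcc08d8cea9f107cc98a5b3c3e9ff5e87576f78fdb2d99b5faf0",
        },
        "schema_version": "1.0",
        "source": {"id": "2506.14758", "kind": "arxiv", "version": 4},
    }
    # Compare the printed digest with the value listed under "Canonical hash".
    print(canonical_sha256(record))
```

Because keys are sorted before hashing, the digest does not depend on the order in which fields appear in the fetched JSON.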