Pith Number

pith:KATRKKFL

pith:2026:KATRKKFLRXSSKV4IGUQUS2OXEN
not attested · not anchored · not stored · refs resolved

From Feedback Loops to Policy Updates: Reinforcement Fine-Tuning for LLM-Based Alpha Factor Discovery

Chiming Duan, Lingzhe Zhang, Minghua He, Philip S. Yu, Tong Jia, Ying Li, Yunpeng Zhai, Zixuan Xie

Reinforcement fine-tuning converts quantitative evaluations into policy updates so an LLM internalizes alpha factor optimization experience instead of accumulating prompt feedback.

arxiv:2605.15412 v1 · 2026-05-14 · cs.CE · cs.AI · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{KATRKKFLRXSSKV4IGUQUS2OXEN}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

QuantEvolver consistently improves the primary evaluation metric of each task over existing LLM-based alpha factor discovery baselines and produces higher-quality, more complementary factor pools.

C2 weakest assumption

The assumption that converting executable quantitative evaluation results into reinforcement policy updates lets the Miner LLM internalize historical optimization experience without introducing new biases, and that the resulting policy generalizes beyond the market regimes backtested during training.

C3 one-line summary

QuantEvolver applies reinforcement fine-tuning to evolve an LLM policy for generating executable alpha factor expressions, yielding higher-quality and more complementary factors than prompt-based baselines on market benchmarks.

References

85 extracted · 85 resolved · 9 Pith anchors

[1] Autoalpha: an efficient hierarchical evolutionary algorithm for mining alpha factors in quantitative investment, 2002
[2] Alpha mining and enhancing via warm start genetic programming for quantitative investment, 2024
[3] Z. Kakushadze, “101 formulaic alphas,” Wilmott, vol. 2016, no. 84, pp. 72–81, 2016
[4] Multiple regression genetic programming, 2014
[5] Alpha discovery via grammar-guided learning and search, 2026

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:00:57.298359Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

50271528ab8de525578835214969d7237388e4425fe23cc0362480fae7afa191

Aliases

arxiv: 2605.15412 · arxiv_version: 2605.15412v1 · doi: 10.48550/arxiv.2605.15412 · pith_short_12: KATRKKFLRXSS · pith_short_16: KATRKKFLRXSSKV4I · pith_short_8: KATRKKFL
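The aliases and the canonical hash on this page appear to be related in a simple way: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the SHA-256 digest, and each `pith_short_N` alias is the first N characters of that number. The sketch below checks this relationship for this record only. Treating it as the general derivation rule is an assumption, and the helper names `pith_from_hash` and `short_alias` are hypothetical, not part of any Pith API.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "50271528ab8de525578835214969d7237388e4425fe23cc0362480fae7afa191"
PITH_NUMBER = "KATRKKFLRXSSKV4IGUQUS2OXEN"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: this prefix rule is inferred from the values on this page,
    not taken from a published specification.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_alias(pith_number: str, n: int) -> str:
    """Return the first `n` characters, as the pith_short_N aliases appear to do."""
    return pith_number[:n]


if __name__ == "__main__":
    print(pith_from_hash(CANONICAL_HASH) == PITH_NUMBER)
    for n in (8, 12, 16):
        print(f"pith_short_{n}: {short_alias(PITH_NUMBER, n)}")
```

If the relationship holds for other records, a short alias can be recomputed locally from the canonical hash alone, without contacting the registry.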
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/KATRKKFLRXSSKV4IGUQUS2OXEN \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 50271528ab8de525578835214969d7237388e4425fe23cc0362480fae7afa191
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "0050df30ec955b63eaddfad66649b248143a5f57d70eadcc71fe842a910bd4a3",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CE",
    "submitted_at": "2026-05-14T20:54:40Z",
    "title_canon_sha256": "d1e0186de0a1193da710b18dcf521f68e6ae266dea061730c86d009283934046"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15412",
    "kind": "arxiv",
    "version": 1
  }
}
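
For readers who prefer a script to the shell one-liner, the following is a minimal Python sketch of the same check run offline against the record shown above. It uses the serialization settings from the one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8). The function name `canonical_sha256` is hypothetical, and the record is retyped from this page, so a mismatch may only indicate a transcription difference.

```python
import hashlib
import json

# Canonical record as displayed on this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "0050df30ec955b63eaddfad66649b248143a5f57d70eadcc71fe842a910bd4a3",
        "cross_cats_sorted": ["cs.AI", "cs.CL"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CE",
        "submitted_at": "2026-05-14T20:54:40Z",
        "title_canon_sha256": "d1e0186de0a1193da710b18dcf521f68e6ae266dea061730c86d009283934046",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15412", "kind": "arxiv", "version": 1},
}

# Hash the page says to expect.
EXPECTED = "50271528ab8de525578835214969d7237388e4425fe23cc0362480fae7afa191"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    digest = canonical_sha256(RECORD)
    print(digest)
    print("matches expected hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror stores or emits the record's fields.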