pith:4AFFH5ZO
Reframing preprocessing selection as model-internal calibration in near-infrared spectroscopy: A large-scale benchmark of operator-adaptive PLS and Ridge models
Operator-adaptive models that fold preprocessing selection inside calibration outperform standard PLS and Ridge on most NIRS datasets.
arxiv:2605.13587 v1 · 2026-05-13 · stat.ML · cs.LG · eess.SP
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{4AFFH5ZOLJAPPXRLMCLKQT4V5A}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Compact operator-adaptive PLS with ASLS branch preprocessing achieved a median RMSEP/PLS ratio of 0.960 with 42 wins on 57 datasets, while a deployable AOM-Ridge selector improved over tuned Ridge by a median 2.22% with 35 wins on 52 datasets.
These results rest on the assumption that treating nonlinear or sample-adaptive corrections (SNV, MSC, ASLS) as fold-local branches fully prevents information leakage, while still letting the model adaptively select effective preprocessing without introducing bias or overfitting to the specific dataset splits.
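The record does not show how the paper implements fold-local branches. As a minimal sketch of one common reading, assume "fold-local" means that any fitted statistic of a correction is estimated from the training fold only and then applied unchanged to the held-out fold. The example uses MSC, whose reference spectrum is such a fitted statistic. The function names and toy spectra are hypothetical, and ASLS and the branch selection itself are not covered.

```python
import numpy as np

def fit_msc_reference(X_train):
    """Estimate the MSC reference spectrum from training spectra only."""
    return np.asarray(X_train, dtype=float).mean(axis=0)

def apply_msc(X, reference):
    """Regress each spectrum on the reference, then remove offset and slope."""
    X = np.asarray(X, dtype=float)
    corrected = np.empty_like(X)
    for i, spectrum in enumerate(X):
        # np.polyfit returns coefficients from highest degree down: slope, intercept
        slope, intercept = np.polyfit(reference, spectrum, 1)
        corrected[i] = (spectrum - intercept) / slope
    return corrected

def fold_local_msc(X, train_idx, test_idx):
    """Fit the reference on the training fold, apply it to both folds."""
    reference = fit_msc_reference(X[train_idx])
    return apply_msc(X[train_idx], reference), apply_msc(X[test_idx], reference)

# Hypothetical toy spectra: one shared shape with per-sample offset and gain.
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0.0, 3.0, 50)) + 2.0
offsets = rng.uniform(-0.2, 0.2, size=10)
gains = rng.uniform(0.8, 1.2, size=10)
X = offsets[:, None] + gains[:, None] * base

train_idx = np.arange(0, 8)
test_idx = np.arange(8, 10)
X_train_msc, X_test_msc = fold_local_msc(X, train_idx, test_idx)
```

The held-out spectra never contribute to the reference, which is the property the assumption depends on. SNV, by contrast, standardizes each spectrum by its own mean and standard deviation, so it has no fitted statistic to leak.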
Operator-adaptive PLS and Ridge models internalize preprocessing selection via linear operators and fold-local branches, achieving a median RMSEP/PLS ratio of 0.960 on 57 datasets and a median 2.22% improvement over tuned Ridge on 52 datasets.
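The headline figures combine a median ratio with a win count. The sketch below shows how such summaries could be computed from per-dataset RMSEP values. It assumes that a "win" means a strictly lower RMSEP than the baseline on a dataset, which the record does not define. The function name and the numbers are hypothetical and are not the paper's data.

```python
import numpy as np

def summarize_against_baseline(rmsep_model, rmsep_baseline):
    """Summarize per-dataset RMSEP of a model relative to a baseline."""
    model = np.asarray(rmsep_model, dtype=float)
    baseline = np.asarray(rmsep_baseline, dtype=float)
    ratios = model / baseline  # values below 1.0 mean the model beats the baseline
    median_ratio = float(np.median(ratios))
    return {
        "median_ratio": median_ratio,
        "median_improvement_pct": 100.0 * (1.0 - median_ratio),
        "wins": int(np.sum(ratios < 1.0)),  # ties are not counted as wins
        "n_datasets": int(ratios.size),
    }

# Hypothetical RMSEP values for five datasets.
summary = summarize_against_baseline(
    rmsep_model=[0.90, 1.10, 0.80, 1.00, 0.95],
    rmsep_baseline=[1.00, 1.00, 1.00, 1.00, 1.00],
)
print(summary)
```

Under this reading, a median ratio of 0.960 corresponds to a median improvement of 4.0%, and the 2.22% median improvement over tuned Ridge corresponds to a median ratio of 0.9778.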
Receipt and verification
| First computed | 2026-05-18T02:44:23.132509Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
e00a53f72e5a40f7de2b6096a84f95e82bb1fd01106a9f8946b1ef558ce12d6f
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/4AFFH5ZOLJAPPXRLMCLKQT4V5A \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e00a53f72e5a40f7de2b6096a84f95e82bb1fd01106a9f8946b1ef558ce12d6f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "78db40950582b1cef379628dd31244c0819021c541401447a504683bd7a126e4",
    "cross_cats_sorted": [
      "cs.LG",
      "eess.SP"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "stat.ML",
    "submitted_at": "2026-05-13T14:23:00Z",
    "title_canon_sha256": "a1552dc44d73c40b5fdf6ea1b93318c79cab22b0fb35be8a92443ffbca907b5c"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13587",
    "kind": "arxiv",
    "version": 1
  }
}
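The verification command earlier on this page canonicalizes the record (sorted keys, compact separators, UTF-8) and hashes it with SHA-256. The same step can be sketched offline in Python against the record printed above. The function name `canonical_sha256` is hypothetical. The script has not been run against the live API here, so the comparison with the published hash is left to the reader.

```python
import hashlib
import json

def canonical_sha256(record):
    """Hash a record using the same canonical form as the shell one-liner."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Canonical record as printed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "78db40950582b1cef379628dd31244c0819021c541401447a504683bd7a126e4",
        "cross_cats_sorted": ["cs.LG", "eess.SP"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "stat.ML",
        "submitted_at": "2026-05-13T14:23:00Z",
        "title_canon_sha256": "a1552dc44d73c40b5fdf6ea1b93318c79cab22b0fb35be8a92443ffbca907b5c",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13587", "kind": "arxiv", "version": 1},
}

# Hash published on this page, copied verbatim for comparison.
expected = "e00a53f72e5a40f7de2b6096a84f95e82bb1fd01106a9f8946b1ef558ce12d6f"

digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == expected)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields are written. Any change to a value, including whitespace inside a string, produces a different digest.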