{"paper":{"title":"Farthest Point Sampling in Property Designated Chemical Feature Space as a General Strategy for Enhancing the Machine Learning Model Performance for Small Scale Chemical Dataset","license":"http://creativecommons.org/licenses/by-nc-sa/4.0/","headline":"","cross_cats":["physics.data-an"],"primary_cat":"physics.chem-ph","authors_text":"Xi Yu, Yuze Liu","submitted_at":"2024-04-17T13:07:10Z","abstract_excerpt":"Machine learning model development in chemistry and materials science often grapples with the challenge of small scale, unbalanced labelled datasets, a common limitation in scientific experiments. These dataset imbalances can precipitate overfitting and diminish model generalization. Our study explores the efficacy of the farthest point sampling (FPS) strategy within targeted chemical feature spaces, demonstrating its capacity to generate well-distributed training datasets and consequently enhance model performance. We rigorously evaluated this strategy across various machine learning models"},"claims":{"count":0,"items":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"2404.11348","kind":"arxiv","version":1},"verdict":{"id":null,"model_set":{},"created_at":null,"strongest_claim":"","one_line_summary":"","pipeline_version":null,"weakest_assumption":"","pith_extraction_headline":""},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2404.11348/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}