{"paper":{"title":"Approximate Label Symmetries Improve Data Scaling","license":"http://creativecommons.org/licenses/by/4.0/","headline":"","cross_cats":[],"primary_cat":"physics.chem-ph","authors_text":"Mathis Lechaume-Robert, O. Anatole von Lilienfeld, Scott Y. H. Kim","submitted_at":"2026-05-27T09:53:59Z","abstract_excerpt":"Enforcing universal symmetries in machine learning (ML) models is a common strategy to mitigate data scarcity. We show that exploiting exact, as well as approximate, label symmetries can benefit scaling laws. We illustrate the idea for the s, p, d orbital densities of the electron in the hydrogen atom, for the three vibrational normal modes of the water molecule, as well as its full 3D potential energy hypersurface. Resulting ML models of electron density and potential energies exhibit superior learning curves, demonstrating improved generalization efficiency. When label symmetries are not exa"},"claims":{"count":0,"items":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"2605.28238","kind":"arxiv","version":1},"verdict":{"id":null,"model_set":{},"created_at":null,"strongest_claim":"","one_line_summary":"","pipeline_version":null,"weakest_assumption":"","pith_extraction_headline":""},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.28238/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}