{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2017:UWHQ6MGHYYUW36Y2HSEXDXBE6L","short_pith_number":"pith:UWHQ6MGH","schema_version":"1.0","canonical_sha256":"a58f0f30c7c6296dfb1a3c8971dc24f2d4ee07dce3fede472bbd332e2de68a70","source":{"kind":"arxiv","id":"1711.09846","version":2},"attestation_state":"computed","paper":{"title":"Population Based Training of Neural Networks","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"","cross_cats":["cs.NE"],"primary_cat":"cs.LG","authors_text":"Ali Razavi, Chrisantha Fernando, Iain Dunning, Jeff Donahue, Karen Simonyan, Koray Kavukcuoglu, Max Jaderberg, Oriol Vinyals, Simon Osindero, Tim Green, Valentin Dalibard, Wojciech M. Czarnecki","submitted_at":"2017-11-27T17:33:27Z","abstract_excerpt":"Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm. In this work we present \\emph{Population Based Training (PBT)}, a simple asynchronous optimisation algorithm which effectively utilises a fixed computational budget to jointly optimise a population of models and their hyperparameters to maximise performance. Importantly, PBT discovers a schedule of hyperparameter settings rather than following the generally sub-"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":false,"formal_links_present":false},"canonical_record":{"source":{"id":"1711.09846","kind":"arxiv","version":2},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.LG","submitted_at":"2017-11-27T17:33:27Z","cross_cats_sorted":["cs.NE"],"title_canon_sha256":"cc024fb9ba93d2317a58dff4c462349f56756854d43a0d422cfe4465eaa91cd8","abstract_canon_sha256":"b4e11b54b85e94c8c0742aa98f91c07e72e321da630acaec0d7683ae0f67b651"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-18T00:29:23.912121Z","signature_b64":"5vc+/rCMbXpQRPjdZHB7TG9RpUu2TIKSwaHoUePxKFoXJEAF8vwYh6SS+Ad8rPgVffylwnMsjxQ+nYsn+Rb7Aw==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"a58f0f30c7c6296dfb1a3c8971dc24f2d4ee07dce3fede472bbd332e2de68a70","last_reissued_at":"2026-05-18T00:29:23.911416Z","signature_status":"signed_v1","first_computed_at":"2026-05-18T00:29:23.911416Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"Population Based Training of Neural Networks","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"","cross_cats":["cs.NE"],"primary_cat":"cs.LG","authors_text":"Ali Razavi, Chrisantha Fernando, Iain Dunning, Jeff Donahue, Karen Simonyan, Koray Kavukcuoglu, Max Jaderberg, Oriol Vinyals, Simon Osindero, Tim Green, Valentin Dalibard, Wojciech M. Czarnecki","submitted_at":"2017-11-27T17:33:27Z","abstract_excerpt":"Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm. In this work we present \\emph{Population Based Training (PBT)}, a simple asynchronous optimisation algorithm which effectively utilises a fixed computational budget to jointly optimise a population of models and their hyperparameters to maximise performance. Importantly, PBT discovers a schedule of hyperparameter settings rather than following the generally sub-"},"claims":{"count":0,"items":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"1711.09846","kind":"arxiv","version":2},"verdict":{"id":null,"model_set":{},"created_at":null,"strongest_claim":"","one_line_summary":"","pipeline_version":null,"weakest_assumption":"","pith_extraction_headline":""},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"1711.09846","created_at":"2026-05-18T00:29:23.911539+00:00"},{"alias_kind":"arxiv_version","alias_value":"1711.09846v2","created_at":"2026-05-18T00:29:23.911539+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.1711.09846","created_at":"2026-05-18T00:29:23.911539+00:00"},{"alias_kind":"pith_short_12","alias_value":"UWHQ6MGHYYUW","created_at":"2026-05-18T12:31:49.984773+00:00"},{"alias_kind":"pith_short_16","alias_value":"UWHQ6MGHYYUW36Y2","created_at":"2026-05-18T12:31:49.984773+00:00"},{"alias_kind":"pith_short_8","alias_value":"UWHQ6MGH","created_at":"2026-05-18T12:31:49.984773+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":22,"internal_anchor_count":14,"sample":[{"citing_arxiv_id":"1906.12213","citing_title":"On the notion of number in humans and machines","ref_index":31,"is_internal_anchor":true},{"citing_arxiv_id":"1906.12266","citing_title":"Growing Action Spaces","ref_index":4,"is_internal_anchor":true},{"citing_arxiv_id":"1907.02874","citing_title":"Attentive Multi-Task Deep Reinforcement Learning","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"1907.04882","citing_title":"Acoustic Model Optimization Based On Evolutionary Stochastic Gradient Descent with Anchors for Automatic Speech Recognition","ref_index":9,"is_internal_anchor":true},{"citing_arxiv_id":"1907.08392","citing_title":"Automated Machine Learning in Practice: State of the Art and Recent Results","ref_index":38,"is_internal_anchor":true},{"citing_arxiv_id":"2310.02540","citing_title":"Auto-FP: An Experimental Study of Automated Feature Preprocessing for Tabular Data","ref_index":39,"is_internal_anchor":true},{"citing_arxiv_id":"2407.15134","citing_title":"Proximal Policy Distillation","ref_index":8,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16233","citing_title":"FORGE: Self-Evolving Agent Memory With No Weight Updates via Population Broadcast","ref_index":7,"is_internal_anchor":true},{"citing_arxiv_id":"2605.20086","citing_title":"What Do Evolutionary Coding Agents Evolve?","ref_index":51,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16727","citing_title":"PopuLoRA: Co-Evolving LLM Populations for Reasoning Self-Play","ref_index":29,"is_internal_anchor":true},{"citing_arxiv_id":"2603.25099","citing_title":"Large Language Models as Optimization Controllers: Adaptive Continuation for SIMP Topology Optimization","ref_index":37,"is_internal_anchor":true},{"citing_arxiv_id":"2605.15400","citing_title":"Beyond Partner Diversity: An Influence-Based Team Steering Framework for Zero-Shot Human-Machine Teaming","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2512.13399","citing_title":"Differentiable Evolutionary Reinforcement Learning","ref_index":11,"is_internal_anchor":true},{"citing_arxiv_id":"2309.16797","citing_title":"Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution","ref_index":159,"is_internal_anchor":true},{"citing_arxiv_id":"2604.03472","citing_title":"Vocabulary Dropout for Curriculum Diversity in LLM Co-Evolution","ref_index":16,"is_internal_anchor":false},{"citing_arxiv_id":"2604.27667","citing_title":"Can Tabular Foundation Models Guide Exploration in Robot Policy Learning?","ref_index":14,"is_internal_anchor":false},{"citing_arxiv_id":"2604.24708","citing_title":"Scalable Hyperparameter-Divergent Ensemble Training with Automatic Learning Rate Exploration for Large Models","ref_index":8,"is_internal_anchor":false},{"citing_arxiv_id":"2605.04531","citing_title":"Reward-Guided Semantic Evolution for Test-time Adaptive Object Detection","ref_index":25,"is_internal_anchor":false},{"citing_arxiv_id":"2604.11302","citing_title":"3D-Anchored Lookahead Planning for Persistent Robotic Scene Memory via World-Model-Based MCTS","ref_index":8,"is_internal_anchor":false},{"citing_arxiv_id":"2604.13130","citing_title":"Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates","ref_index":7,"is_internal_anchor":false},{"citing_arxiv_id":"2511.04831","citing_title":"Isaac Lab: A GPU-Accelerated Simulation Framework for Multi-Modal Robot Learning","ref_index":35,"is_internal_anchor":false},{"citing_arxiv_id":"2604.10911","citing_title":"EvoNash-MARL: A Closed-Loop Multi-Agent Reinforcement Learning Framework for Medium-Horizon Equity Allocation","ref_index":16,"is_internal_anchor":false}]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/UWHQ6MGHYYUW36Y2HSEXDXBE6L","json":"https://pith.science/pith/UWHQ6MGHYYUW36Y2HSEXDXBE6L.json","graph_json":"https://pith.science/api/pith-number/UWHQ6MGHYYUW36Y2HSEXDXBE6L/graph.json","events_json":"https://pith.science/api/pith-number/UWHQ6MGHYYUW36Y2HSEXDXBE6L/events.json","paper":"https://pith.science/paper/UWHQ6MGH"},"agent_actions":{"view_html":"https://pith.science/pith/UWHQ6MGHYYUW36Y2HSEXDXBE6L","download_json":"https://pith.science/pith/UWHQ6MGHYYUW36Y2HSEXDXBE6L.json","view_paper":"https://pith.science/paper/UWHQ6MGH","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=1711.09846&json=true","fetch_graph":"https://pith.science/api/pith-number/UWHQ6MGHYYUW36Y2HSEXDXBE6L/graph.json","fetch_events":"https://pith.science/api/pith-number/UWHQ6MGHYYUW36Y2HSEXDXBE6L/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/UWHQ6MGHYYUW36Y2HSEXDXBE6L/action/timestamp_anchor","attest_storage":"https://pith.science/pith/UWHQ6MGHYYUW36Y2HSEXDXBE6L/action/storage_attestation","attest_author":"https://pith.science/pith/UWHQ6MGHYYUW36Y2HSEXDXBE6L/action/author_attestation","sign_citation":"https://pith.science/pith/UWHQ6MGHYYUW36Y2HSEXDXBE6L/action/citation_signature","submit_replication":"https://pith.science/pith/UWHQ6MGHYYUW36Y2HSEXDXBE6L/action/replication_record"}},"created_at":"2026-05-18T00:29:23.911539+00:00","updated_at":"2026-05-18T00:29:23.911539+00:00"}