{"paper":{"title":"Unsolved Problems in ML Safety","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Machine learning safety should focus on four research areas as models scale and deploy in critical settings.","cross_cats":["cs.AI","cs.CL","cs.CV"],"primary_cat":"cs.LG","authors_text":"Dan Hendrycks, Jacob Steinhardt, John Schulman, Nicholas Carlini","submitted_at":"2021-09-28T17:59:36Z","abstract_excerpt":"Machine learning (ML) systems are rapidly increasing in size, are acquiring new capabilities, and are increasingly deployed in high-stakes settings. As with other powerful technologies, safety for ML should be a leading research priority. In response to emerging safety challenges in ML, such as those introduced by recent large-scale models, we provide a new roadmap for ML Safety and refine the technical problems that the field needs to address. We present four problems ready for research, namely withstanding hazards (\"Robustness\"), identifying hazards (\"Monitoring\"), reducing inherent model ha"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We present four problems ready for research, namely withstanding hazards (Robustness), identifying hazards (Monitoring), reducing inherent model hazards (Alignment), and reducing systemic hazards (Systemic Safety).","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the four categories comprehensively capture the primary safety challenges without major omissions or overlaps that would require a different organizing structure.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"The paper presents a roadmap that identifies four unsolved problems in ML safety: robustness against hazards, monitoring for hazards, alignment of model goals with human intent, and systemic safety.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Machine learning safety should focus on four research areas as models scale and deploy in critical settings.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"50b09283156682c4430d6abb6d29e8cbdf1634ce34f2d84a17582b3fa30c5b5a"},"source":{"id":"2109.13916","kind":"arxiv","version":5},"verdict":{"id":"d2189e3c-86b6-4abf-9f81-8c8df98e94c3","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T20:42:44.186395Z","strongest_claim":"We present four problems ready for research, namely withstanding hazards (Robustness), identifying hazards (Monitoring), reducing inherent model hazards (Alignment), and reducing systemic hazards (Systemic Safety).","one_line_summary":"The paper presents a roadmap that identifies four unsolved problems in ML safety: robustness against hazards, monitoring for hazards, alignment of model goals with human intent, and systemic safety.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the four categories comprehensively capture the primary safety challenges without major omissions or overlaps that would require a different organizing structure.","pith_extraction_headline":"Machine learning safety should focus on four research areas as models scale and deploy in critical settings."},"references":{"count":228,"sample":[{"doi":"","year":2000,"title":"Asilomar AI Principles","work_id":"03f6baad-25b0-48b1-8a06-a27fa1400be4","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2015,"title":"Autonomous Weapons: An Open Letter from AI and Robotics Researchers","work_id":"d7272ec7-47d2-4699-a4ee-a9f04aa6d6d1","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2016,"title":"Deep Learning with Differential Privacy","work_id":"4eef2d5a-f5d7-41db-bcc1-267fd6da556f","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2021,"title":"Network intrusion detection system: A systematic study of machine learning and deep learning approaches","work_id":"ef8751e2-9b84-4735-b05f-ad68f7914fee","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2016,"title":"Concrete Problems in AI Safety","work_id":"08cbe17e-9d7e-44fb-858c-a0ac0590f206","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":228,"snapshot_sha256":"bf208f6b914ca6cfebba77df22cacd0f8a0d3b6842a507897bfba419a19076c0","internal_anchors":6},"formal_canon":{"evidence_count":1,"snapshot_sha256":"c1d24e7e2dfba039bb7d821d511fc92df8c3ea44e02c07a98474796c051d4f00"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}