Pith Number

pith:3W6OWNBD

pith:2021:3W6OWNBDN5XXVX7QEVJJUIODVF
not attested · not anchored · not stored · refs resolved

Unsolved Problems in ML Safety

Dan Hendrycks, Jacob Steinhardt, John Schulman, Nicholas Carlini

Machine learning safety should focus on four research areas as models scale and are deployed in critical settings.

arxiv:2109.13916 v5 · 2021-09-28 · cs.LG · cs.AI · cs.CL · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{3W6OWNBDN5XXVX7QEVJJUIODVF}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

We present four problems ready for research, namely withstanding hazards (Robustness), identifying hazards (Monitoring), reducing inherent model hazards (Alignment), and reducing systemic hazards (Systemic Safety).

C2 weakest assumption

That the four categories comprehensively capture the primary safety challenges without major omissions or overlaps that would require a different organizing structure.

C3 one line summary

The paper presents a roadmap that identifies four unsolved problems in ML safety: robustness against hazards, monitoring for hazards, alignment of model goals with human intent, and systemic safety.

References

228 extracted · 228 resolved · 6 Pith anchors

[1] Asilomar AI Principles 2000
[2] Autonomous Weapons: An Open Letter from AI and Robotics Researchers 2015
[3] Deep Learning with Differential Privacy 2016
[4] Network intrusion detection system: A systematic study of machine learning and deep learning approaches 2021
[5] Concrete Problems in AI Safety 2016

Formal links

1 machine-checked theorem link

Cited by

25 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:46.627509Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

ddbceb34236f6f7adff025529a21c3a94cf19d20a37c0276d159bdd99b0dbe03

Aliases

arxiv: 2109.13916 · arxiv_version: 2109.13916v5 · doi: 10.48550/arxiv.2109.13916 · pith_short_12: 3W6OWNBDN5XX · pith_short_16: 3W6OWNBDN5XXVX7Q · pith_short_8: 3W6OWNBD
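
The values on this page suggest how the identifiers relate: each short alias is a prefix of the full Pith Number, and the Pith Number itself is consistent with the RFC 4648 Base32 encoding of the canonical hash, truncated to 26 characters. The sketch below checks that relationship for this record. It is inferred from the values shown here, not taken from a published Pith specification, and the function names `pith_number_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "ddbceb34236f6f7adff025529a21c3a94cf19d20a37c0276d159bdd99b0dbe03"
PITH_NUMBER = "3W6OWNBDN5XXVX7QEVJJUIODVF"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: this mapping is inferred from the values on this page,
    not from a documented Pith specification.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict:
    """Return the 8-, 12-, and 16-character prefixes listed as aliases."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


derived = pith_number_from_hash(CANONICAL_HASH)
print(derived == PITH_NUMBER)   # True for this record
print(short_aliases(derived))
```

If the relationship holds in general, a client could recover the Pith Number from the canonical hash alone, but that should be confirmed against the schema before relying on it.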
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/3W6OWNBDN5XXVX7QEVJJUIODVF \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ddbceb34236f6f7adff025529a21c3a94cf19d20a37c0276d159bdd99b0dbe03
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "ec2de830542c9c602b251629731834dfbf6d2dfc1a53e75a7221d1c506944fdc",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL",
      "cs.CV"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2021-09-28T17:59:36Z",
    "title_canon_sha256": "5a4ab8343ed3723538f3941013ab53844ebf2c24c850b5375261ae1bfc81ae59"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2109.13916",
    "kind": "arxiv",
    "version": 5
  }
}
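
The verification one-liner above can also be run offline against the canonical record as printed here. The following is a minimal sketch that uses the same serialization settings as that command (sorted keys, compact separators, no ASCII escaping) and then hashes the result with SHA-256. The helper names `canonical_bytes` and `canonical_hash` are illustrative, and whether the digest matches depends on the record above being a byte-faithful copy of what the server returns.

```python
import hashlib
import json

# Canonical record copied from the listing above.
CANONICAL_RECORD = {
    "metadata": {
        "abstract_canon_sha256": "ec2de830542c9c602b251629731834dfbf6d2dfc1a53e75a7221d1c506944fdc",
        "cross_cats_sorted": ["cs.AI", "cs.CL", "cs.CV"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2021-09-28T17:59:36Z",
        "title_canon_sha256": "5a4ab8343ed3723538f3941013ab53844ebf2c24c850b5375261ae1bfc81ae59",
    },
    "schema_version": "1.0",
    "source": {"id": "2109.13916", "kind": "arxiv", "version": 5},
}

# Hash published on this page.
EXPECTED_HASH = "ddbceb34236f6f7adff025529a21c3a94cf19d20a37c0276d159bdd99b0dbe03"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and compact separators, as in the one-liner."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


digest = canonical_hash(CANONICAL_RECORD)
print(digest)
print("matches published hash:", digest == EXPECTED_HASH)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the JSON fields happen to be written, which is what lets independent mirrors arrive at the same value.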