Does this advance safety along with, or as a consequence of, advancing other capabilities or the study of AI?

Safety via Capabilities

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

cs.LG · 2024-03-05 · unverdicted · novelty 6.0

WMDP is a public benchmark measuring hazardous LLM knowledge across biosecurity, cybersecurity, and chemical security, paired with RMU unlearning that reduces WMDP performance without degrading general capabilities.

citing papers explorer

Showing 1 of 1 citing paper.

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

cs.LG · 2024-03-05 · unverdicted · none · ref 15

