Review history

arxiv: 2605.22373 · 2 revisions

Boundary-targeted Membership Inference Attacks on Safety Classifiers

  1. 2026-05-25 UNVERDICTED LOW v0.9.0 novelty 6.0
    45173 ms 5769 in 1034 out 2026-05-25T05:41:06.639708+00:00
  2. 2026-05-22 UNVERDICTED LOW v0.9.0 novelty 6.0
    29169 ms 5769 in 1153 out 2026-05-22T08:03:06.390668+00:00