Pith Number

pith:LB6M4H2Y

pith:2026:LB6M4H2YOBUYZTQQJZFXAT43CI
not attested · not anchored · not stored · refs resolved

Fair and Calibrated Toxicity Detection with Robust Training and Abstention

Mokshit Surana

Toxicity detectors hide calibration unfairness across identity subgroups despite near-perfect overall scores.

arxiv:2605.14074 v1 · 2026-05-13 · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{LB6M4H2YOBUYZTQQJZFXAT43CI}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Calibration disparity is a hidden fairness violation. The ERM (empirical risk minimization) baseline has near-perfect aggregate calibration (0.013) but is significantly miscalibrated across all identity subgroups (+0.029 to +0.134). Training interventions reshape rather than eliminate this disparity, and abstention itself is unfair.
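To make the aggregate-versus-subgroup contrast in this claim concrete, here is a minimal Python sketch of how per-subgroup calibration gaps could be measured. It is not the paper's code. It assumes an equal-width-bin ECE on the predicted probability of the positive (toxic) class, assumes each example carries a single subgroup label, and assumes the gap is reported as subgroup ECE minus aggregate ECE; the function names `expected_calibration_error` and `subgroup_ece_gaps` are hypothetical.

```python
from typing import Dict, Hashable, Sequence, Tuple


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Equal-width-bin ECE for binary positive-class probabilities (assumed variant)."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and the same length")
    conf_sum = [0.0] * n_bins   # sum of predicted probabilities per bin
    pos_sum = [0.0] * n_bins    # number of positive labels per bin
    count = [0] * n_bins        # number of examples per bin
    for p, y in zip(probs, labels):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        conf_sum[idx] += p
        pos_sum[idx] += y
        count[idx] += 1
    n = len(probs)
    ece = 0.0
    for c, pos, k in zip(conf_sum, pos_sum, count):
        if k:
            # weight each bin by its share of examples
            ece += (k / n) * abs(pos / k - c / k)
    return ece


def subgroup_ece_gaps(
    probs: Sequence[float],
    labels: Sequence[int],
    groups: Sequence[Hashable],
    n_bins: int = 10,
) -> Tuple[float, Dict[Hashable, float]]:
    """Return (aggregate ECE, {group: subgroup ECE - aggregate ECE})."""
    overall = expected_calibration_error(probs, labels, n_bins)
    gaps: Dict[Hashable, float] = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        sub = expected_calibration_error(
            [probs[i] for i in idx], [labels[i] for i in idx], n_bins
        )
        gaps[g] = sub - overall
    return overall, gaps
```

A small aggregate ECE alongside large positive gaps for individual groups is the pattern the claim describes: subgroup errors can be diluted when pooled with a much larger, better-calibrated background.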

C2 weakest assumption

That the chosen subgroup definitions, metrics (subgroup AUC, BPSN/BNSP AUC, ECE), and bootstrap CIs fully capture real-world fairness harms and that post-hoc methods can be evaluated independently of training choices.
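For readers unfamiliar with the metrics named in this assumption, the following Python sketch illustrates subgroup AUC, BPSN AUC, and BNSP AUC as they are commonly defined following Borkan et al. (2019). It is an illustration, not the paper's evaluation code: the helper names `roc_auc` and `bias_aucs` are hypothetical, the AUC is computed by brute-force pair counting for clarity rather than speed, and subgroup membership is assumed to be a single boolean flag per example.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bias_aucs(
    scores: Sequence[float], labels: Sequence[int], in_subgroup: Sequence[bool]
) -> Dict[str, float]:
    """Compute the three per-subgroup AUC variants on one identity subgroup."""
    rows: List[Tuple[float, int, bool]] = list(zip(scores, labels, in_subgroup))

    def auc_where(keep: Callable[[int, bool], bool]) -> float:
        kept = [(s, y) for s, y, g in rows if keep(y, g)]
        return roc_auc([s for s, _ in kept], [y for _, y in kept])

    return {
        # only examples that belong to the subgroup
        "subgroup_auc": auc_where(lambda y, g: g),
        # Background Positive, Subgroup Negative: low values suggest
        # non-toxic subgroup text is scored above toxic background text
        "bpsn_auc": auc_where(lambda y, g: (g and y == 0) or (not g and y == 1)),
        # Background Negative, Subgroup Positive: low values suggest
        # toxic subgroup text is scored below non-toxic background text
        "bnsp_auc": auc_where(lambda y, g: (g and y == 1) or (not g and y == 0)),
    }
```

All three are ranking metrics, so they say nothing about whether the scores are calibrated probabilities; that is why ECE appears in the list as a separate axis.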

C3 one line summary

Training interventions reshape rather than eliminate calibration and abstention disparities in toxicity detection, requiring a multi-axis fairness framework.
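One way to see what an abstention disparity means is to compare how often a detector declines to predict for each subgroup. The Python sketch below assumes a simple confidence-threshold rule in the spirit of selective classification; this record does not describe the paper's actual abstention mechanism, and the function name `abstention_rates`, the `threshold` default, and the single-label `groups` input are all hypothetical.

```python
from typing import Dict, Hashable, Sequence


def abstention_rates(
    probs: Sequence[float], groups: Sequence[Hashable], threshold: float = 0.8
) -> Dict[Hashable, float]:
    """Fraction of examples per group on which the model would abstain.

    Assumed rule: abstain when the confidence max(p, 1 - p) of a binary
    classifier falls below `threshold`.
    """
    if len(probs) != len(groups):
        raise ValueError("probs and groups must be the same length")
    totals: Dict[Hashable, int] = {}
    abstained: Dict[Hashable, int] = {}
    for p, g in zip(probs, groups):
        confidence = max(p, 1.0 - p)
        totals[g] = totals.get(g, 0) + 1
        if confidence < threshold:
            abstained[g] = abstained.get(g, 0) + 1
    return {g: abstained.get(g, 0) / totals[g] for g in totals}
```

If one group's rate is much higher than another's, that group's content is disproportionately deferred (for example, to human review), which is a fairness axis separate from both ranking quality and calibration.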

References

11 extracted · 11 resolved · 0 Pith anchors

[1] Borkan, D., Dixon, L., Sorensen, J., Thain, N., and Vasserman, L. (2019). Nuanced metrics for measuring unintended bias with real data for text classification. WWW Companion 2019
[2] Dixon, L., Li, J., Sorensen, J., Thain, N., and Vasserman, L. (2018). Measuring and mitigating unintended bias in text classification. AAAI/ACM AIES 2018
[3] Geifman, Y., and El-Yaniv, R. (2017). Selective classification for deep neural networks. NeurIPS 2017
[4] Guo, C., Pleiss, G., Sun, Y., and Weinberger, K. Q. (2017). On calibration of modern neural networks. ICML 2017
[5] Y ., Arjovsky, M., Pezeshki, M., and Lopez-Paz, D 2022

Formal links

1 machine-checked theorem link

Receipt and verification
First computed 2026-05-17T23:39:12.382333Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

587cce1f5870698cce104e4b704f9b1212246c162f98492efdd1460b0bfacb21

Aliases

arxiv: 2605.14074 · arxiv_version: 2605.14074v1 · doi: 10.48550/arxiv.2605.14074 · pith_short_12: LB6M4H2YOBUY · pith_short_16: LB6M4H2YOBUYZTQQ · pith_short_8: LB6M4H2Y
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/LB6M4H2YOBUYZTQQJZFXAT43CI \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 587cce1f5870698cce104e4b704f9b1212246c162f98492efdd1460b0bfacb21
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "3ecd350b56951bc7424beb6da781a938e660910527815ad3d5ee02497cd04d29",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-13T19:50:35Z",
    "title_canon_sha256": "6474fa5dc1d69786de7cc7a9010ec3710929a3e99b89268e89646fbe05119328"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14074",
    "kind": "arxiv",
    "version": 1
  }
}
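The shell check above can also be run offline. The Python sketch below applies the same canonicalization the one-liner uses (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding) to the record as printed on this page. The function name `canonical_sha256` is hypothetical, and whether the digest matches depends on the printed record being identical to what the server returns, so no result is asserted here.

```python
import hashlib
import json

# Canonical record copied from this page (assumed to be complete and unmodified).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "3ecd350b56951bc7424beb6da781a938e660910527815ad3d5ee02497cd04d29",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-13T19:50:35Z",
    "title_canon_sha256": "6474fa5dc1d69786de7cc7a9010ec3710929a3e99b89268e89646fbe05119328"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14074",
    "kind": "arxiv",
    "version": 1
  }
}
"""

# Hash published on this page under "Canonical hash".
EXPECTED = "587cce1f5870698cce104e4b704f9b1212246c162f98492efdd1460b0bfacb21"


def canonical_sha256(record: dict) -> str:
    """SHA-256 of the record serialized with sorted keys and no extra whitespace."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_sha256(json.loads(RECORD_JSON))
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the fields, only on their names and values.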