Pith Number
pith:TO7S2ELH
pith:2019:TO7S2ELHTBHW4FYY2WRTOASAOK
not attested
not anchored
not stored
refs resolved
PIQA: Reasoning about Physical Commonsense in Natural Language
Large pretrained models reach only 77 percent accuracy on physical commonsense questions that humans answer with 95 percent accuracy.
arxiv:1911.11641 v1 · 2019-11-26 · cs.CL · cs.AI · cs.LG
Record completeness
1. Bitcoin timestamp
2. Internet Archive
3. Author claim
4. Citations
5. Replications
Claims
C1 strongest claim
large pretrained models struggle (77%). We provide analysis about the dimensions of knowledge that existing models lack, which offers significant opportunities for future research.
C2 weakest assumption
That the collected PIQA questions genuinely require physical commonsense reasoning and cannot be solved primarily through linguistic patterns or reporting bias present in the training data.
C3 one line summary
PIQA is a new benchmark showing that current AI models achieve 77% on physical commonsense questions versus humans at 95%.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:13.765123Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
9bbf2d1167984f6e1718d5a337024072854ec4e228b8b3380322fd6fb7d9eff6
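The identifier shown at the top of this record appears to be derived from the canonical hash: base32-encoding the SHA-256 digest and keeping the first 26 characters reproduces `TO7S2ELHTBHW4FYY2WRTOASAOK`, and the first 8 give the short form `TO7S2ELH`. This is an observation from the values on this page, not a documented rule of the schema. A minimal sketch, where the function name `pith_id_from_hash` and the default length of 26 are assumptions:

```python
import base64

def pith_id_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a hex digest and keep the leading characters.

    Hypothetical helper: the truncation length is inferred from this
    record, not taken from a published specification.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]

CANONICAL_HASH = "9bbf2d1167984f6e1718d5a337024072854ec4e228b8b3380322fd6fb7d9eff6"

full_id = pith_id_from_hash(CANONICAL_HASH)
short_id = full_id[:8]

print(full_id)   # TO7S2ELHTBHW4FYY2WRTOASAOK
print(short_id)  # TO7S2ELH
```

If the relationship holds for other records, an identifier can be checked against its hash without any network access.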
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/TO7S2ELHTBHW4FYY2WRTOASAOK \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 9bbf2d1167984f6e1718d5a337024072854ec4e228b8b3380322fd6fb7d9eff6
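The Python step of the command above can be read as a small function: serialize the record with sorted keys and compact separators, encode it as UTF-8, and hash the bytes with SHA-256. The sketch below isolates that step so it can be run on a record already saved locally. The function name `canonical_sha256` and the two sample dictionaries are illustrative assumptions, and the sketch does not claim that any particular input reproduces the hash published on this page.

```python
import hashlib
import json

def canonical_sha256(record: dict) -> str:
    """Hash a record using the same serialization as the one-liner above."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as-is
    )
    return hashlib.sha256(canonical.encode()).hexdigest()

# Two illustrative records with the same content in a different key order.
a = {"source": {"id": "1911.11641", "kind": "arxiv", "version": 1}, "schema_version": "1.0"}
b = {"schema_version": "1.0", "source": {"version": 1, "kind": "arxiv", "id": "1911.11641"}}

# Sorting keys makes the digest independent of insertion order.
assert canonical_sha256(a) == canonical_sha256(b)
```

Sorting keys and fixing the separators is what makes the digest reproducible across JSON libraries that would otherwise emit different whitespace or key order.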
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "8c17cda83691e282f942aa64ce5ea21bb14aa866bfa39c594f09daff64f57594",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2019-11-26T15:31:46Z",
    "title_canon_sha256": "e3a4aee1cb2d205fde52f4d934479050fda3d115f107a611a5e692b7161df23d"
  },
  "schema_version": "1.0",
  "source": {
    "id": "1911.11641",
    "kind": "arxiv",
    "version": 1
  }
}