Pith Number

pith:YCKTO57A

pith:2019:YCKTO57ACJOSYV2C5II5FNCVKL
not attested · not anchored · not stored · refs resolved

CLEVRER: CoLlision Events for Video REpresentation and Reasoning

Antonio Torralba, Chuang Gan, Jiajun Wu, Joshua B. Tenenbaum, Kexin Yi, Pushmeet Kohli, Yunzhu Li

CLEVRER shows that video models describe collisions accurately but perform poorly at explaining causes, predicting outcomes, and reasoning about counterfactual alternatives.

arxiv:1910.01442 v2 · 2019-10-03 · cs.CV · cs.AI · cs.CL · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{YCKTO57ACJOSYV2C5II5FNCVKL}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

While these models thrive on the perception-based task (descriptive), they perform poorly on the causal tasks (explanatory, predictive and counterfactual), suggesting that a principled approach for causal reasoning should incorporate the capability of both perceiving complex visual and language inputs, and understanding the underlying dynamics and causal relations.

C2 weakest assumption

That the observed poor performance on causal tasks stems primarily from a lack of causal reasoning capability in the models rather than from dataset-specific artifacts, insufficient training regimes, or other unmeasured factors.

C3 one-line summary

CLEVRER introduces a diagnostic dataset for evaluating video models on causal reasoning via descriptive, explanatory, predictive, and counterfactual questions about object collision events.

References

300 extracted · 300 resolved · 1 Pith anchor


Formal links

2 machine-checked theorem links

Cited by

31 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:47.042917Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

c0953777e0125d2c5742ea11d2b45552e8fa51ab923359a76c6f70b0ca81b66d

Aliases

arxiv: 1910.01442 · arxiv_version: 1910.01442v2 · doi: 10.48550/arxiv.1910.01442 · pith_short_12: YCKTO57ACJOS · pith_short_16: YCKTO57ACJOSYV2C · pith_short_8: YCKTO57A
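Comparing the identifiers on this page suggests how they relate. The 26-character Pith Number matches the first 26 characters of the RFC 4648 base32 encoding of the canonical SHA-256 hash, and each pith_short_N alias is the first N characters of that number. This is an observation from this one record, not a documented rule, and the function names below are hypothetical. A minimal Python sketch under that assumption:

```python
import base64

# Canonical hash as printed on this page.
CANONICAL_HASH = "c0953777e0125d2c5742ea11d2b45552e8fa51ab923359a76c6f70b0ca81b66d"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: this derivation is inferred from the identifiers shown on
    this record; it is not a published specification.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict:
    """Return the 8-, 12-, and 16-character prefixes listed under Aliases."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


pith = pith_number_from_hash(CANONICAL_HASH)
print(pith)  # YCKTO57ACJOSYV2C5II5FNCVKL
print(short_aliases(pith))
```

If the relationship holds for other records, any short alias can be checked against the canonical hash without contacting the server.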
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/YCKTO57ACJOSYV2C5II5FNCVKL \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: c0953777e0125d2c5742ea11d2b45552e8fa51ab923359a76c6f70b0ca81b66d
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "8930a24c1e57e3b6af8721f047049d228eaf6ed7c52e69b0ab201473d80f3b2e",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2019-10-03T13:16:36Z",
    "title_canon_sha256": "ce3702a80e6002b7aa4f141b07db198195323413a04f96bffd93cfe879324433"
  },
  "schema_version": "1.0",
  "source": {
    "id": "1910.01442",
    "kind": "arxiv",
    "version": 2
  }
}
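
The shell one-liner under "Verify this Pith Number yourself" can be unpacked into a small offline check. The sketch below applies the same serialization the one-liner uses (sorted keys, compact separators, non-ASCII characters kept as-is) to the record printed above and hashes the UTF-8 bytes. The function name canonical_sha256 is hypothetical, and whether the record as rendered on this page is byte-for-byte what the server returns is an assumption, so compare the result with the Canonical hash rather than relying on it blindly.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the bytes."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "8930a24c1e57e3b6af8721f047049d228eaf6ed7c52e69b0ab201473d80f3b2e",
        "cross_cats_sorted": ["cs.AI", "cs.CL", "cs.LG"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2019-10-03T13:16:36Z",
        "title_canon_sha256": "ce3702a80e6002b7aa4f141b07db198195323413a04f96bffd93cfe879324433",
    },
    "schema_version": "1.0",
    "source": {"id": "1910.01442", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)  # compare with the Canonical hash shown on this page
```

Because keys are sorted before hashing, the order in which a mirror stores the fields does not change the digest; only the values do.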