NIPS-W

Automatic differentiation in PyTorch

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

SocialIQA: Commonsense Reasoning about Social Interactions

cs.CL · 2019-04-22 · unverdicted · novelty 7.0

SocialIQA is the first large-scale benchmark with 38k crowdsourced questions testing commonsense about social interactions, where pretrained language models trail humans by over 20% but transfer to improve performance on Winograd Schemas and COPA.

The Recurrent Transformer: Greater Effective Depth and Efficient Decoding

cs.LG · 2026-04-23 · unverdicted · novelty 6.0

Recurrent Transformers add per-layer recurrent memory via self-attention on own activations plus a tiling algorithm that reduces training memory traffic, yielding better C4 pretraining cross-entropy than parameter-matched standard transformers with fewer layers.

The Falcon Series of Open Language Models

cs.CL · 2023-11-28 · conditional · novelty 6.0

Falcon-180B is a 180B-parameter open decoder-only model trained on 3.5 trillion tokens that approaches PaLM-2-Large performance at lower cost and is released with dataset extracts.

PIQA: Reasoning about Physical Commonsense in Natural Language

cs.CL · 2019-11-26 · accept · novelty 6.0

PIQA is a new benchmark showing that current AI models reach 77% accuracy on physical commonsense questions, versus 95% for humans.

citing papers explorer

Showing 4 of 4 citing papers.

SocialIQA: Commonsense Reasoning about Social Interactions cs.CL · 2019-04-22 · unverdicted · none · ref 119
The Recurrent Transformer: Greater Effective Depth and Efficient Decoding cs.LG · 2026-04-23 · unverdicted · none · ref 22
The Falcon Series of Open Language Models cs.CL · 2023-11-28 · conditional · none · ref 108
PIQA: Reasoning about Physical Commonsense in Natural Language cs.CL · 2019-11-26 · accept · none · ref 18
