NIPS-W , year=

4 Pith papers cite this work. Polarity classification is still indexing.

fields

cs.CL 3 · cs.LG 1

representative citing papers

SocialIQA: Commonsense Reasoning about Social Interactions

cs.CL · 2019-04-22 · unverdicted · novelty 7.0

SocialIQA is the first large-scale benchmark for commonsense reasoning about social interactions, with 38k crowdsourced questions; pretrained language models trail humans by over 20%, yet models trained on it transfer to improve performance on Winograd Schemas and COPA.

The Recurrent Transformer: Greater Effective Depth and Efficient Decoding

cs.LG · 2026-04-23 · unverdicted · novelty 6.0

Recurrent Transformers add per-layer recurrent memory via self-attention on own activations plus a tiling algorithm that reduces training memory traffic, yielding better C4 pretraining cross-entropy than parameter-matched standard transformers with fewer layers.

The Falcon Series of Open Language Models

cs.CL · 2023-11-28 · conditional · novelty 6.0

Falcon-180B is a 180B-parameter open decoder-only model trained on 3.5 trillion tokens that approaches PaLM-2-Large performance at lower cost and is released with dataset extracts.

citing papers explorer

  • SocialIQA: Commonsense Reasoning about Social Interactions · cs.CL · 2019-04-22 · unverdicted · none · ref 119

  • The Recurrent Transformer: Greater Effective Depth and Efficient Decoding · cs.LG · 2026-04-23 · unverdicted · none · ref 22

  • The Falcon Series of Open Language Models · cs.CL · 2023-11-28 · conditional · none · ref 108

  • PIQA: Reasoning about Physical Commonsense in Natural Language · cs.CL · 2019-11-26 · accept · none · ref 18

    PIQA is a new benchmark showing that current AI models achieve 77% on physical commonsense questions versus humans at 95%.