Pith Number

pith:Y2PRIO4E

pith:2021:Y2PRIO4E3LEENHWXTSUJMEA4GQ

not attested · not anchored · not stored · refs resolved

Are NLP Models really able to Solve Simple Math Word Problems?

Arkil Patel, Navin Goyal, Satwik Bhattamishra

NLP solvers for simple math word problems achieve high benchmark scores by exploiting shallow patterns instead of actual reasoning.

arxiv:2103.07191 v2 · 2021-03-12 · cs.CL

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{Y2PRIO4E3LEENHWXTSUJMEA4GQ}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

MWP solvers that do not have access to the question asked in the MWP can still solve a large fraction of MWPs. Similarly, models that treat MWPs as bag-of-words can also achieve surprisingly high accuracy. The best accuracy achieved by state-of-the-art models is substantially lower on SVAMP.

C2 weakest assumption

That the carefully chosen variations used to create SVAMP are sufficient to block all shallow heuristics while still testing the intended arithmetic reasoning.

C3 one line summary

NLP models for elementary math word problems rely on shallow heuristics rather than genuine understanding: they perform well even when the question is withheld or the problem is treated as a bag of words, but their accuracy drops substantially on the new SVAMP variation dataset.

References

12 extracted · 12 resolved · 0 Pith anchors

[1] Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel Bowman, and Noah A 2018

[2] In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 975–984, Online 2020

[3] In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 3702–3710, Online 2020

[4] IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(9):2287–2305 2020

[5] B Implementation Details We use 8 NVIDIA Tesla P100 GPUs each with 16 GB memory to run our experiments

Formal links

1 machine-checked theorem link

Cited by

40 papers in Pith

AdaSwitch: Adaptive Switching between Small and Large Agents for Effective Cloud-Local Collaborative Learning

Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models

LightReasoner: Can Small Language Models Teach Large Language Models Reasoning?

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

Factored Causal Representation Learning for Robust Reward Modeling in RLHF

Receipt and verification

First computed	2026-05-17T23:38:47.115752Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

c69f143b84dac8469ed79ca896101c343937602e2da37b54f59641f4f9c4056c

Aliases

arxiv: 2103.07191 · arxiv_version: 2103.07191v2 · doi: 10.48550/arxiv.2103.07191 · pith_short_12: Y2PRIO4E3LEE · pith_short_16: Y2PRIO4E3LEENHWX · pith_short_8: Y2PRIO4E
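The aliases above appear to be related in a simple way: on this record, the 26-character Pith identifier matches the unpadded Base32 encoding of the first 16 bytes of the canonical hash, and each `pith_short_N` alias is the first N characters of that identifier. The sketch below checks this relationship for this record only. The helper names `pith_id_from_hash` and `short_aliases` are hypothetical, and the derivation rule is an observation from the values shown here, not a statement of the schema's definition.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "c69f143b84dac8469ed79ca896101c343937602e2da37b54f59641f4f9c4056c"
PITH_ID = "Y2PRIO4E3LEENHWXTSUJMEA4GQ"


def pith_id_from_hash(hex_digest: str) -> str:
    """Hypothetical helper: Base32-encode the first 16 bytes of a hex digest.

    Assumption: the identifier is the unpadded RFC 4648 Base32 form of the
    leading 128 bits of the canonical hash. This holds for the record above
    but is not confirmed by the page as the general rule.
    """
    head = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(head).decode("ascii").rstrip("=")


def short_aliases(pith_id: str) -> dict:
    """Hypothetical helper: build the pith_short_N aliases as prefixes."""
    return {f"pith_short_{n}": pith_id[:n] for n in (8, 12, 16)}


derived = pith_id_from_hash(CANONICAL_HASH)
aliases = short_aliases(derived)

print(derived == PITH_ID)
print(aliases)
```

If the relationship holds for other records as well, a short alias can be checked against a canonical hash without contacting the resolver.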

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/Y2PRIO4E3LEENHWXTSUJMEA4GQ \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: c69f143b84dac8469ed79ca896101c343937602e2da37b54f59641f4f9c4056c

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "620056386ef10602f08752d34cf4fdc9b37ad1f22161507fa11deb87ca0ad34a",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2021-03-12T10:23:47Z",
    "title_canon_sha256": "27961b34ee3f17ffb2c39dc34923abd3cbb8e795187eb39ceb96d48b3f33953d"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2103.07191",
    "kind": "arxiv",
    "version": 2
  }
}
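
The verification pipeline shown earlier can also be run offline against the canonical record JSON printed above. The following is a minimal sketch that applies the same serialization settings as the one-liner (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding). The function name `canonical_sha256` is hypothetical, and the comparison only succeeds if the JSON displayed on this page is exactly the record the resolver serves.

```python
import hashlib
import json

# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "620056386ef10602f08752d34cf4fdc9b37ad1f22161507fa11deb87ca0ad34a",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2021-03-12T10:23:47Z",
        "title_canon_sha256": "27961b34ee3f17ffb2c39dc34923abd3cbb8e795187eb39ceb96d48b3f33953d",
    },
    "schema_version": "1.0",
    "source": {"id": "2103.07191", "kind": "arxiv", "version": 2},
}

# Hash stated under "Canonical hash" on this page.
EXPECTED = "c69f143b84dac8469ed79ca896101c343937602e2da37b54f59641f4f9c4056c"


def canonical_sha256(obj) -> str:
    """Hypothetical helper: SHA-256 over a key-sorted, compact JSON encoding."""
    blob = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


digest = canonical_sha256(record)
print(digest)
print("matches stated hash:", digest == EXPECTED)
```

Because the keys are sorted before hashing, the digest does not depend on the order in which the fields appear in the source JSON.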