Pith Number
pith:RLI6ARNP
pith:2023:RLI6ARNPX5PWAUF7QWZAPSSCXW
not attested
not anchored
not stored
refs resolved
Evaluating the Performance of Large Language Models on GAOKAO Benchmark
Large language models achieve competitive scores on the Chinese GAOKAO exam but vary widely by subject.
arxiv:2305.12474 v3 · 2023-05-21 · cs.CL · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{RLI6ARNPX5PWAUF7QWZAPSSCXW}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Record completeness
1. Bitcoin timestamp
2. Internet Archive
3. Author claim
4. Citations
5. Replications
✓ Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same
current state with the deterministic merge algorithm.
Claims
C1 · strongest claim
Our findings reveal that LLMs have achieved competitive scores in Chinese GAOKAO examination, while they exhibit significant performance disparities across various subjects.
C2 · weakest assumption
That zero-shot prompting on GAOKAO questions produces answers whose quality can be fairly compared to human exam performance via human grading.
C3 · one-line summary
LLMs achieve competitive scores on GAOKAO exam questions but display large performance gaps across subjects.
References
[1] Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt
[2] GPT-4 Technical Report
[3] Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
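[4] Strengthen management: manage visitors within the Fengyan Ancient Terraces protected area, install the necessary warning signs, and prohibit acts such as damaging the terraces or picking plants. At the same time, strengthen protection of cultural relics such as clusters of historic residential buildings, ancient fortified villages, ancient temples, ancient weirs and canals, and ancient ponds and dams, to prevent visitors from damaging these relics during their visits.
[5] Promote popular science: set up science education display boards within the Fengyan Ancient Terraces protected area to introduce visitors to the history, culture, and ecological environment of the terraces, raising visitors' cultural literacy and environmental awareness and reducing visitor damage to the terraces.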
Formal links
Cited by
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:14.110393Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
8ad1e045afbf5f6050bf85b207ca42bda5f27be94e10621488196130a1286ac1
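On this record, the 26-character Pith number coincides with the leading characters of the Base32 encoding of the canonical hash, and the short form `RLI6ARNP` with the first 8. This is an observation about the values shown on this page, not a rule the page documents, so the sketch below only checks the relationship for this one record. The helper name `base32_prefix` is hypothetical.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "8ad1e045afbf5f6050bf85b207ca42bda5f27be94e10621488196130a1286ac1"
PITH_NUMBER = "RLI6ARNPX5PWAUF7QWZAPSSCXW"
PITH_SHORT = "RLI6ARNP"


def base32_prefix(hex_digest: str, length: int) -> str:
    """Return the first `length` characters of the RFC 4648 Base32 encoding of a hex digest."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


# Assumption: the identifier is a plain Base32 prefix of the hash.
# Checked here only against the values on this page.
matches_full = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER)) == PITH_NUMBER
matches_short = base32_prefix(CANONICAL_HASH, len(PITH_SHORT)) == PITH_SHORT
print(matches_full, matches_short)
```

Both comparisons hold for this record. Whether the same derivation applies to other Pith numbers is not stated on the page.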
Aliases
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/RLI6ARNPX5PWAUF7QWZAPSSCXW \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 8ad1e045afbf5f6050bf85b207ca42bda5f27be94e10621488196130a1286ac1
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "074e0a9cd4babe7f97f7ce2bdaf6a075a0d6b98e6d8b35c99eb9422629c11d85",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-05-21T14:39:28Z",
    "title_canon_sha256": "e837fe6ef607afdc534a14dee6ec69abb75d423d278e34ba67aff1c544e6c953"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2305.12474",
    "kind": "arxiv",
    "version": 3
  }
}
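The Python one-liner in the verification command above can be unpacked into a small offline function. The following is a minimal sketch that mirrors the same steps (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes, SHA-256) on the record as printed on this page. The function name `canonical_sha256` is hypothetical, and the sketch assumes the JSON displayed here is identical to what the API returns.

```python
import hashlib
import json

# Value listed under "Canonical hash" on this page.
EXPECTED_HASH = "8ad1e045afbf5f6050bf85b207ca42bda5f27be94e10621488196130a1286ac1"

# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "074e0a9cd4babe7f97f7ce2bdaf6a075a0d6b98e6d8b35c99eb9422629c11d85",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2023-05-21T14:39:28Z",
        "title_canon_sha256": "e837fe6ef607afdc534a14dee6ec69abb75d423d278e34ba67aff1c544e6c953",
    },
    "schema_version": "1.0",
    "source": {"id": "2305.12474", "kind": "arxiv", "version": 3},
}


def canonical_sha256(obj: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


digest = canonical_sha256(record)
print(digest)
# Per the page, a faithful copy of the record should reproduce EXPECTED_HASH.
print("matches page:", digest == EXPECTED_HASH)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the fields.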