pith. machine review for the scientific record.
Pith Number

pith:BT6II3QL

pith:2025:BT6II3QL27YCFVKH5JUSPUGTZP
not attested · not anchored · not stored · refs pending

MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark

Dingdong Wang, Dongchao Yang, Helen Meng, Jincenzi Wu, Junan Li, Tianhua Zhang, Xueyuan Chen

MMSU benchmark shows current SpeechLLMs have substantial room for improvement in fine-grained spoken language understanding and reasoning.

arxiv:2506.04779 v3 · 2025-06-05 · cs.CL · cs.SD · eess.AS

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Through a rigorous evaluation of 14 advanced SpeechLLMs, we identify substantial room for improvement in existing models, highlighting meaningful directions for future optimization.

C2 weakest assumption

The 5,000 audio-question-answer triplets have been meticulously curated to fairly and comprehensively represent the targeted linguistic phenomena without introducing selection bias or annotation artifacts that would distort model comparisons.

C3 one line summary

MMSU is a new benchmark of 5,000 curated audio-question-answer triplets across 47 linguistically grounded tasks. It reveals substantial limitations in existing SpeechLLMs for fine-grained spoken language understanding and reasoning.

Formal links

2 machine-checked theorem links

Cited by

17 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:13.515980Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

0cfc846e0bd7f022d547ea6927d0d3cbfd89940e84fdef2238d94e6032e07f58

Aliases

arxiv: 2506.04779 · arxiv_version: 2506.04779v3 · doi: 10.48550/arxiv.2506.04779 · pith_short_12: BT6II3QL27YC · pith_short_16: BT6II3QL27YCFVKH · pith_short_8: BT6II3QL
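
The short aliases listed above are prefixes of the full Pith Number. For this record, the Pith Number itself also matches the first 26 characters of the Base32 encoding of the canonical hash. The sketch below checks both relationships in Python. The helper names `pith_number_from_hash` and `short_aliases` are hypothetical, and the Base32 rule is an observation about this one record, not a documented part of the `pith-number/v1.0` schema.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "0cfc846e0bd7f022d547ea6927d0d3cbfd89940e84fdef2238d94e6032e07f58"
PITH_NUMBER = "BT6II3QL27YCFVKH5JUSPUGTZP"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Hypothetical helper: Base32-encode a SHA-256 digest and keep a prefix.

    Assumption: the 26-character length is inferred from this record only.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str, lengths=(8, 12, 16)) -> dict:
    """Hypothetical helper: build the pith_short_N aliases as plain prefixes."""
    return {f"pith_short_{n}": pith_number[:n] for n in lengths}


derived = pith_number_from_hash(CANONICAL_HASH)
aliases = short_aliases(PITH_NUMBER)

print(derived == PITH_NUMBER)
for name, value in aliases.items():
    print(name, value)
```

If the Base32 relationship holds in general, a client could recompute the identifier from the canonical hash alone. Treat that as an assumption until the schema confirms it.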
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/BT6II3QL27YCFVKH5JUSPUGTZP \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 0cfc846e0bd7f022d547ea6927d0d3cbfd89940e84fdef2238d94e6032e07f58
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "481749de47ba8e9209a5209b8f90181c4c35f07e514039babb3b91f7d9116e68",
    "cross_cats_sorted": [
      "cs.SD",
      "eess.AS"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-06-05T09:09:36Z",
    "title_canon_sha256": "34a512c230e4ac979ff8cefdeccb4c211d40636bd1a37df4ddf69434f28af1f3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2506.04779",
    "kind": "arxiv",
    "version": 3
  }
}
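
The verification command above hashes a canonical serialization of this record: keys sorted, compact separators, non-ASCII characters left unescaped, UTF-8 bytes. A minimal offline sketch of the same step in Python follows, with the record shown above embedded as a literal. The function name `canonical_sha256` is hypothetical. The page states the expected digest, and the script compares against it without assuming the result.

```python
import hashlib
import json

# Expected digest as stated on this page.
EXPECTED = "0cfc846e0bd7f022d547ea6927d0d3cbfd89940e84fdef2238d94e6032e07f58"

# Canonical record copied from the JSON shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "481749de47ba8e9209a5209b8f90181c4c35f07e514039babb3b91f7d9116e68",
        "cross_cats_sorted": ["cs.SD", "eess.AS"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2025-06-05T09:09:36Z",
        "title_canon_sha256": "34a512c230e4ac979ff8cefdeccb4c211d40636bd1a37df4ddf69434f28af1f3",
    },
    "schema_version": "1.0",
    "source": {"id": "2506.04779", "kind": "arxiv", "version": 3},
}


def canonical_sha256(obj: dict) -> str:
    """Hypothetical helper mirroring the one-liner in the curl pipeline."""
    payload = json.dumps(
        obj,
        sort_keys=True,          # key order must not affect the digest
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as-is
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_sha256(record)
print(digest)
print("matches page:", digest == EXPECTED)
```

Because the keys are sorted before hashing, the order in which a mirror stores or transmits the fields should not change the digest.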