pith. the verified trust layer for science.
Pith Number

pith:D4TUMAVG

pith:2025:D4TUMAVGLHEMVPYXJ3GLZSKBVT
not attested · not anchored · not stored · refs pending

Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens

Jiahao Pan, Lei Xie, Linqin Li, Liumeng Xue, Mingqi Jiang, Pengcheng Zhu, Qixi Zheng, Ruibin Yuan, Rui Wang, Sitong Cheng, Songxiang Liu, Wei Xue, Weizhen Bian, Xiaoqin Feng, Xie Chen, Xinfa Zhu, Xinsheng Wang, Yike Guo, Yunlin Chen, Zheng Liang, Zhen Ye, Zhifei Li, Zhixian Zhao, Ziyang Ma, Ziyu Zhang

A single-stream speech codec decouples content from speaker traits to let an LLM deliver both zero-shot cloning and fine voice control.

arxiv:2503.01710 v1 · 2025-03-03 · cs.SD · cs.AI · eess.AS

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{D4TUMAVGLHEMVPYXJ3GLZSKBVT}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
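The page does not describe the merge algorithm itself, so the following is only a minimal sketch of what "deterministic merge" generally means: events are de-duplicated and folded in a fixed total order, so any mirror arrives at the same state regardless of the order in which it received them. The event shape (`id`, `ts`, `set`), the field names, and the sample values are all hypothetical and are not taken from Pith's bundle format. Signature checking is left out.

```python
def merge_events(record: dict, events: list[dict]) -> dict:
    """Fold events into a record in a fixed total order.

    Assumed (hypothetical) event shape: {"id": str, "ts": str, "set": dict},
    where "ts" is an ISO-8601 UTC timestamp in a single fixed format.
    """
    # De-duplicate by id so a mirror that received an event twice still converges.
    unique = {event["id"]: event for event in events}
    # Sort by (timestamp, id): a total order independent of arrival order.
    ordered = sorted(unique.values(), key=lambda e: (e["ts"], e["id"]))
    state = dict(record)
    for event in ordered:
        state.update(event["set"])
    return state


# Illustrative data only; none of these fields come from a real bundle.
record = {"field_a": "initial"}
events = [
    {"id": "e2", "ts": "2026-05-18T00:00:00Z", "set": {"field_a": "later"}},
    {"id": "e1", "ts": "2026-05-17T23:40:00Z", "set": {"field_b": "added"}},
]

state_forward = merge_events(record, events)
state_shuffled = merge_events(record, list(reversed(events)) + events)
print(state_forward == state_shuffled)
```

The design point is that the result depends only on the set of events, never on delivery order or duplication, which is what lets independent mirrors agree without coordinating.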

Claims

C1 strongest claim

Spark-TTS not only achieves state-of-the-art zero-shot voice cloning but also generates highly customizable voices that surpass the limitations of reference-based synthesis.

C2 weakest assumption

That BiCodec's decomposition into semantic and global tokens provides clean, independent control over linguistic content and speaker attributes without quality loss or unwanted interactions between the two token types.

C3 one-line summary

Spark-TTS combines BiCodec's single-stream decoupled tokens with the Qwen2.5 LLM and chain-of-thought (CoT) generation to deliver efficient, state-of-the-art zero-shot voice cloning and fine-grained voice control.

Formal links

1 machine-checked theorem link

Cited by

18 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:14.473782Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

1f274602a659c8cabf174eccbcc941acd025ec017bd5fd9ef70fe12649d84e9a

Aliases

arxiv: 2503.01710 · arxiv_version: 2503.01710v1 · doi: 10.48550/arxiv.2503.01710 · pith_short_12: D4TUMAVGLHEM · pith_short_16: D4TUMAVGLHEMVPYX · pith_short_8: D4TUMAVG
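
For this record, the identifier and its short aliases line up with the canonical hash above: Base32-encoding the raw SHA-256 digest (RFC 4648 alphabet) and keeping the first 26 characters yields the full Pith Number, and the `pith_short_*` aliases are its 8-, 12-, and 16-character prefixes. The sketch below reproduces that relationship. That this is Pith's documented derivation rule is an assumption based on this one record, and the function name is hypothetical.

```python
import base64

CANONICAL_HASH = "1f274602a659c8cabf174eccbcc941acd025ec017bd5fd9ef70fe12649d84e9a"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest and keep the first `length` characters.

    Assumption: observed on this record only, not taken from a specification.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


full = pith_number_from_hash(CANONICAL_HASH)
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}

print(full)  # D4TUMAVGLHEMVPYXJ3GLZSKBVT
for name, value in aliases.items():
    print(f"{name}: {value}")
```

If the relationship holds in general, a short alias can be checked against a record without any network call, since it is a prefix of a value derived from the hash.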
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/D4TUMAVGLHEMVPYXJ3GLZSKBVT \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 1f274602a659c8cabf174eccbcc941acd025ec017bd5fd9ef70fe12649d84e9a
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "b4bc77c613949cbe690d43096fb51fdfe685e9c1aa7a5ef3b595224bdee14e36",
    "cross_cats_sorted": [
      "cs.AI",
      "eess.AS"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.SD",
    "submitted_at": "2025-03-03T16:23:10Z",
    "title_canon_sha256": "52966132713a950926ab1d240d92326b9bcb9bdd2cca5cafe7fe5f468f605054"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2503.01710",
    "kind": "arxiv",
    "version": 1
  }
}
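
The verification one-liner above can also be run offline against the record as printed here. The following is a minimal sketch that applies the same serialization settings as the shell pipeline (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) and compares the result with the published canonical hash. Whether it matches depends on the JSON shown on this page being identical to what the API serves, which is not verified here. The function name `canonical_sha256` is hypothetical.

```python
import hashlib
import json

EXPECTED = "1f274602a659c8cabf174eccbcc941acd025ec017bd5fd9ef70fe12649d84e9a"

RECORD = {
    "metadata": {
        "abstract_canon_sha256": "b4bc77c613949cbe690d43096fb51fdfe685e9c1aa7a5ef3b595224bdee14e36",
        "cross_cats_sorted": ["cs.AI", "eess.AS"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.SD",
        "submitted_at": "2025-03-03T16:23:10Z",
        "title_canon_sha256": "52966132713a950926ab1d240d92326b9bcb9bdd2cca5cafe7fe5f468f605054",
    },
    "schema_version": "1.0",
    "source": {"id": "2503.01710", "kind": "arxiv", "version": 1},
}


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash as UTF-8."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


digest = canonical_sha256(RECORD)
print(digest)
print("matches published hash" if digest == EXPECTED else "differs from published hash")
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields were written, only on their names and values.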