pith:IG5772UI
Voice "Cloning" is Style Transfer
Voice cloning models apply style transfer to source voices rather than faithfully replicating them.
arxiv:2605.16578 v1 · 2026-05-15 · cs.SD · cs.AI · cs.HC · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{IG5772UIP6IQ3QCAS2AHWQWHGO}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Widely used voice cloning models systematically apply style transfer to source voices. As rated by human annotators, cloned voices are perceived as more authoritative, warm, customer-service-like, and human-like than their sources. Human annotators also report greater trust in cloned voices than in source voices, and a greater willingness to disclose sensitive personal information to them. Voice cloning also leads to homogenization of speaker characteristics, as measured by reduced variance in accent, speaking rate, and the audio embedding space.
These claims rest on the assumption that the differences in human ratings and the reduced variance are caused by style transfer inherent to the cloning models, rather than by specific training data choices, model architectures, or unmeasured confounding factors in the evaluation setup.
Voice cloning models perform style transfer rather than faithful cloning, producing voices rated as more authoritative and warm with reduced variance in accent and speaking rate.
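The homogenization claim (reduced variance in the audio embedding space) can be illustrated with a small sketch. This is not the paper's metric, which this record does not specify. It assumes, purely for illustration, that each speaker is represented by a fixed-length embedding vector and that spread is summarized as the sum of per-dimension variances. The function names and the toy 2-D vectors are hypothetical.

```python
from statistics import pvariance


def total_variance(embeddings):
    """Sum of per-dimension population variances across speakers."""
    # zip(*embeddings) regroups the rows (speakers) into columns (dimensions).
    return sum(pvariance(dim) for dim in zip(*embeddings))


def variance_ratio(source, cloned):
    """Ratio below 1 means the cloned voices are less spread out than the sources."""
    return total_variance(cloned) / total_variance(source)


# Hypothetical toy "embeddings": four speakers, two dimensions each.
# The cloned set is the source set pulled halfway toward its mean.
source = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
cloned = [[0.5, 0.5], [1.5, 0.5], [0.5, 1.5], [1.5, 1.5]]

ratio = variance_ratio(source, cloned)
print(f"variance ratio (cloned / source): {ratio:.2f}")
```

A ratio below 1 on real data would be consistent with homogenization, though it would not by itself separate model-inherent style transfer from the confounders listed above.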
Receipt and verification
| First computed | 2026-05-20T00:02:30.705299Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
41bbffea887f910dc04096807b42c73397c8ef9a43aa60c321615186fd1cd664
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/IG5772UIP6IQ3QCAS2AHWQWHGO \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 41bbffea887f910dc04096807b42c73397c8ef9a43aa60c321615186fd1cd664
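The canonicalization step in the one-liner above can also be written as a standalone function, which may be easier to read and reuse. The sketch below mirrors the same `json.dumps` arguments (sorted keys, compact separators, `ensure_ascii=False`) and hashes the UTF-8 bytes with SHA-256. The name `canonical_hash` and the toy records are hypothetical, and fetching the record from the API is left out.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """SHA-256 hex digest of the record serialized with sorted keys and no extra whitespace."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order in the input must not affect the hash
        separators=(",", ":"),   # no spaces after commas or colons
        ensure_ascii=False,      # keep non-ASCII characters as-is before encoding
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Hypothetical toy records (not the real canonical record): same content, different key order.
record_a = {"source": {"id": "0000.00000", "kind": "arxiv", "version": 1}, "schema_version": "1.0"}
record_b = {"schema_version": "1.0", "source": {"version": 1, "kind": "arxiv", "id": "0000.00000"}}

print(canonical_hash(record_a) == canonical_hash(record_b))
```

To verify this record, pass the parsed `canonical_record` object to `canonical_hash` and compare the result with the canonical hash listed on this page.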
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "7994ed9c27f65e8a8a945f8d938883ff7e00683fa643b4e689b7a093a42ceff2",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.HC",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.SD",
    "submitted_at": "2026-05-15T19:32:28Z",
    "title_canon_sha256": "a516748f6815ebcfd4bc0425f30462ef22fe4b8fe48f25d6e95d145c7c19c8f3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16578",
    "kind": "arxiv",
    "version": 1
  }
}