pith:F5776M4Z
Cross-modal Consistency Guidance for Robust Emotion Control in Auto-Regressive TTS Models
An adaptive guidance scheme detects and compensates for mismatches between desired emotions and text meaning to enable better emotional control in auto-regressive text-to-speech models.
arxiv:2510.13293 v3 · 2025-10-15 · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{F5776M4ZSPX4YTWUKU3MXKTEVO}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Our results demonstrate that the proposed adaptive CFG scheme improves the emotional expressiveness of the AR TTS model while maintaining audio quality and intelligibility.
Mismatch between the desired emotion style prompt and the semantic content of the text can be reliably detected and quantified by large language models or natural language inference models, in a way that permits effective, quality-preserving adaptation of CFG strength.
An adaptive CFG method that tunes guidance based on LLM-detected mismatch between emotion prompts and text semantics improves emotional expressiveness in AR TTS while preserving audio quality and intelligibility.
Receipt and verification
| First computed | 2026-05-20T01:05:00.120309Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
2f7fff339993efcc4ed45536cbaa64abb0a137e1c601b53c9fe86fd47b6a8e84
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/F5776M4ZSPX4YTWUKU3MXKTEVO \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 2f7fff339993efcc4ed45536cbaa64abb0a137e1c601b53c9fe86fd47b6a8e84
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "7dfad2d64f20900e50b2d03ff123953e260a7d036311d5f8ccbc3b2bae479cc4",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-10-15T08:37:16Z",
    "title_canon_sha256": "dc34796605ac654dc68f83bf355882049062e59deb2c319530eca3b5f83f4414"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2510.13293",
    "kind": "arxiv",
    "version": 3
  }
}
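
The verification command above serializes the canonical record with sorted keys and compact separators, then hashes the UTF-8 bytes with SHA-256. The same steps can be sketched offline in Python against the record printed on this page. The function name `canonical_hash` and the variable names are illustrative assumptions, not part of any Pith tooling, and the digest only matches the canonical hash if the record served by the endpoint is exactly the one shown here.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """Hash a record the way the one-liner above does (hypothetical helper).

    Keys are sorted, separators are compact (',' and ':'), non-ASCII
    characters are kept as-is, and the result is encoded as UTF-8.
    """
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "7dfad2d64f20900e50b2d03ff123953e260a7d036311d5f8ccbc3b2bae479cc4",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2025-10-15T08:37:16Z",
        "title_canon_sha256": "dc34796605ac654dc68f83bf355882049062e59deb2c319530eca3b5f83f4414",
    },
    "schema_version": "1.0",
    "source": {"id": "2510.13293", "kind": "arxiv", "version": 3},
}

# Canonical hash as published on this page.
EXPECTED = "2f7fff339993efcc4ed45536cbaa64abb0a137e1c601b53c9fe86fd47b6a8e84"

digest = canonical_hash(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because the keys are sorted before hashing, the order in which the fields appear in the JSON text does not affect the digest; only the keys, values, and value types do.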