Pith Number

pith:F5776M4Z

pith:2025:F5776M4ZSPX4YTWUKU3MXKTEVO
not attested · not anchored · not stored · refs pending

Cross-modal Consistency Guidance for Robust Emotion Control in Auto-Regressive TTS Models

Bin Ma, Chongjia Ni, Chong Zhang, Eng Siong Chng, Yi-Wen Chao, Yizhou Peng, Yukun Ma

An adaptive guidance scheme detects and compensates for mismatches between desired emotions and text meaning to enable better emotional control in auto-regressive text-to-speech models.

arxiv:2510.13293 v3 · 2025-10-15 · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{F5776M4ZSPX4YTWUKU3MXKTEVO}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Our results demonstrate that the proposed adaptive CFG scheme improves the emotional expressiveness of the AR TTS model while maintaining audio quality and intelligibility.

C2 · weakest assumption

That mismatch between the desired emotion style prompt and the semantic content of the text can be reliably detected and quantified by large language models or natural language inference models in a manner that permits effective, quality-preserving adaptation of CFG strength.

C3 · one-line summary

An adaptive CFG method that tunes guidance based on LLM-detected mismatch between emotion prompts and text semantics improves emotional expressiveness in AR TTS while preserving audio quality and intelligibility.
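
For orientation, classifier-free guidance (CFG) in its common form blends conditional and unconditional predictions as `uncond + w * (cond - uncond)`. The sketch below pairs that standard formula with a purely hypothetical linear mapping from a mismatch score to the guidance weight `w`. The names `adaptive_scale`, `w_min`, and `w_max`, and the choice to raise guidance as mismatch grows, are illustrative assumptions and not the paper's actual scheme.

```python
from typing import List, Sequence


def cfg_logits(cond: Sequence[float], uncond: Sequence[float], scale: float) -> List[float]:
    """Standard CFG blend: uncond + scale * (cond - uncond), element-wise."""
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def adaptive_scale(mismatch: float, w_min: float = 1.0, w_max: float = 3.0) -> float:
    """Hypothetical mapping (an assumption): clamp mismatch to [0, 1],
    then interpolate linearly between w_min and w_max."""
    m = min(1.0, max(0.0, mismatch))
    return w_min + m * (w_max - w_min)


# Toy next-token logits with and without the emotion prompt (made-up numbers).
cond = [2.0, 0.0]
uncond = [1.0, 1.0]

w = adaptive_scale(0.5)            # mid-level mismatch gives a mid-range weight
guided = cfg_logits(cond, uncond, w)
print(w, guided)
```

In a real system the mismatch score would come from an LLM or NLI model, as the claims describe; here it is simply passed in as a number.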

Formal links

2 machine-checked theorem links

Cited by

2 papers in Pith

Receipt and verification
First computed 2026-05-20T01:05:00.120309Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

2f7fff339993efcc4ed45536cbaa64abb0a137e1c601b53c9fe86fd47b6a8e84

Aliases

arxiv: 2510.13293 · arxiv_version: 2510.13293v3 · doi: 10.48550/arxiv.2510.13293 · pith_short_12: F5776M4ZSPX4 · pith_short_16: F5776M4ZSPX4YTWU · pith_short_8: F5776M4Z
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/F5776M4ZSPX4YTWUKU3MXKTEVO \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 2f7fff339993efcc4ed45536cbaa64abb0a137e1c601b53c9fe86fd47b6a8e84
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "7dfad2d64f20900e50b2d03ff123953e260a7d036311d5f8ccbc3b2bae479cc4",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-10-15T08:37:16Z",
    "title_canon_sha256": "dc34796605ac654dc68f83bf355882049062e59deb2c319530eca3b5f83f4414"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2510.13293",
    "kind": "arxiv",
    "version": 3
  }
}
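
The verification one-liner in the Agent API section can be unpacked into a small offline script. This is a minimal sketch that applies the same serialization settings (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) to the record shown above; the function name `canonical_sha256` is an illustrative choice, and the digest should equal the canonical hash only if this copy of the record is identical to the one the service returns.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """SHA-256 over compact, key-sorted, UTF-8 JSON (same settings as the one-liner)."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Record copied from the "Canonical record JSON" section.
record = json.loads("""
{
  "metadata": {
    "abstract_canon_sha256": "7dfad2d64f20900e50b2d03ff123953e260a7d036311d5f8ccbc3b2bae479cc4",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-10-15T08:37:16Z",
    "title_canon_sha256": "dc34796605ac654dc68f83bf355882049062e59deb2c319530eca3b5f83f4414"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2510.13293",
    "kind": "arxiv",
    "version": 3
  }
}
""")

digest = canonical_sha256(record)
print(digest)  # compare by eye against the "Canonical hash" value
```

Because keys are sorted before hashing, the digest does not depend on the order in which fields appear in the JSON text.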