DeSRPA: Decoupled Speech Role-Playing Agent via Inference-Time Intervention

· 2026 · cs.SD · arXiv 2606.17669

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

While Large Language Models (LLMs) have revolutionized text-based role-playing, creating immersive Speech Role-Playing Agents (SRPAs) requires a seamless bridge between cognitive reasoning and paralinguistic nuances. Current SRPAs primarily rely on end-to-end (E2E) fine-tuning. However, this paradigm suffers from poor generalization to unseen characters due to its reliance on role-specific data, while imposing a "modality alignment tax" that degrades intrinsic LLM reasoning capabilities. We propose DeSRPA, an agentic framework for character role play via inference-time intervention on frozen backbones. DeSRPA employs a dual-level control vector mechanism, Internal Cognitive Steering and External Expressive Rendering, to synchronize "mind" and "voice". Experiments on SpeechRole and OmniCharacter benchmarks demonstrate that DeSRPA significantly outperforms E2E baselines in personality and emotional consistency. It achieves high speech naturalness, narrowing the gap with proprietary models like GPT-4o Audio, while remaining a scalable and training-free paradigm.
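The abstract does not say how DeSRPA's control vectors are computed or applied. As a rough illustration of inference-time intervention on a frozen model in general, the sketch below uses a common activation-steering recipe: take the difference between mean hidden activations for in-character and neutral prompts, then add a scaled copy of that direction to a hidden state at inference. The function names, the toy two-dimensional activations, and the scaling factor `alpha` are all hypothetical and are not taken from the paper.

```python
from typing import List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def control_vector(in_character: List[Vector], neutral: List[Vector]) -> Vector:
    """Difference of mean activations (assumed recipe, not confirmed by the paper)."""
    positive, negative = mean(in_character), mean(neutral)
    return [p - q for p, q in zip(positive, negative)]


def steer(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Shift a hidden state along the control direction; model weights stay untouched."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Toy activations standing in for hidden states of a frozen backbone (hypothetical data).
in_character = [[1.0, 2.0], [3.0, 4.0]]
neutral = [[0.0, 1.0], [2.0, 1.0]]

v = control_vector(in_character, neutral)   # [1.0, 2.0]
steered = steer([0.5, 0.5], v, alpha=2.0)   # [2.5, 4.5]
```

Because only activations are shifted, no parameters are updated, which is the sense in which such an approach is training-free. In a real model the shift would typically be applied inside a chosen layer during the forward pass.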

representative citing papers

DeSRPA: Decoupled Speech Role-Playing Agent via Inference-Time Intervention

cs.SD · 2026-06-16 · unverdicted · novelty 6.0

DeSRPA introduces a dual-level control vector method for inference-time intervention on frozen backbones to improve personality consistency and speech naturalness in role-playing agents over end-to-end fine-tuned baselines.
