VITA-QinYu: Expressive Spoken Language Model for Role-Playing and Singing
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-11 00:56 UTC · model grok-4.3
The pith
VITA-QinYu is the first end-to-end spoken language model that generates role-playing speech and singing alongside natural conversation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
VITA-QinYu adopts a hybrid speech-text paradigm that extends interleaved text-audio modeling with multi-codebook audio tokens. This design supports richer paralinguistic representation while keeping a clear separation between modalities. The model is trained on 15.8K hours of synthesized natural conversation, role-playing, and singing data, allowing it to produce speech that conveys personality, mood, or performance elements.
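For scale, a back-of-envelope audio-token budget implied by the reported figures; the 8 × 12.5 Hz codebook configuration is taken from the passage quoted in the Lean-theorem section below, and the rest is simple arithmetic rather than anything stated in the paper:

```python
# Rough audio-token budget implied by the reported numbers (not from the paper).
hours = 15_800                # synthesized training data, per the abstract
tokens_per_second = 8 * 12.5  # eight 12.5 Hz codebooks = 100 audio tokens/s
total_tokens = hours * 3600 * tokens_per_second
print(f"~{total_tokens / 1e9:.1f}B audio tokens")  # ~5.7B
```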
What carries the argument
A hybrid speech-text paradigm with multi-codebook audio tokens, which separates linguistic content from paralinguistic features such as tone and expression.
If this is right
- Speech output can now carry personality and mood for role-playing tasks inside one model.
- Singing becomes possible in the same end-to-end system used for ordinary conversation.
- Conversational accuracy and fluency remain at or above prior state-of-the-art levels.
- Full-duplex streaming interaction becomes available for expressive spoken exchanges.
Where Pith is reading between the lines
- The same separation of text and audio codebooks could be tested on other paralinguistic tasks such as emotional or accented speech.
- Open-sourced streaming support may allow quick integration into dialogue systems that need both information and performance.
- Scaling the synthetic data pipeline to narrower domains like storytelling or education could produce specialized expressive voices.
Load-bearing premise
The synthetic data pipeline produces training examples whose expressiveness and distribution closely match real human role-playing and singing speech.
What would settle it
Human listening tests on real, non-synthetic role-playing and singing recordings in which VITA-QinYu scores lower than a baseline spoken language model on naturalness or expressiveness.
Original abstract
Human speech conveys expressiveness beyond linguistic content, including personality, mood, or performance elements, such as a comforting tone or humming a song, which we formalize as role-playing and singing. We present VITA-QinYu, the first expressive end-to-end (E2E) spoken language model (SLM) that goes beyond natural conversation to support both role-playing and singing generation. VITA-QinYu adopts a hybrid speech-text paradigm that extends interleaved text-audio modeling with multi-codebook audio tokens, a design enabling richer paralinguistic representation while preserving a clear separation between modalities to avoid interference. We further develop a comprehensive data generation pipeline to synthesize a total of 15.8K hours of natural conversation, role-playing, and singing data for training. VITA-QinYu demonstrates superior expressiveness, outperforming peer SLMs by 7 percentage points on objective role-playing benchmarks, and surpassing peer models by 0.13 points on a 5-point MOS scale for singing. Simultaneously, it achieves state-of-the-art conversational accuracy and fluency, exceeding prior SLMs by 1.38 and 4.98 percentage points on the C3 and URO benchmarks, respectively. We open-source our code and models and provide an easy-to-use demo with full-stack support for streaming and full-duplex interaction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces VITA-QinYu as the first expressive end-to-end spoken language model (SLM) supporting role-playing and singing in addition to natural conversation. It employs a hybrid interleaved text-audio modeling approach with multi-codebook audio tokens to enable richer paralinguistic representations while avoiding modality interference. A custom data generation pipeline is used to synthesize 15.8K hours of training data covering natural conversation, role-playing, and singing. The model is reported to outperform peer SLMs by 7 percentage points on objective role-playing benchmarks, by 0.13 points on a 5-point MOS scale for singing, and to achieve state-of-the-art results on the C3 (+1.38 pp) and URO (+4.98 pp) conversational benchmarks. Code, models, and a streaming demo are open-sourced.
Significance. If the central claims hold after addressing the evaluation gaps, this would represent a meaningful advance in spoken language modeling by extending capabilities beyond standard conversation to expressive tasks like role-playing and singing. The hybrid multi-codebook architecture and large-scale synthetic data pipeline are potentially enabling contributions for paralinguistic modeling. The open-sourcing of code and models strengthens reproducibility and could accelerate follow-on work in interactive AI applications.
major comments (2)
- [Method / Data Generation Pipeline] Data generation pipeline (described in the method section): The paper states that all training uses 15.8K hours of synthetically generated data but provides no quantitative validation of the pipeline's fidelity, such as human perceptual ratings, acoustic feature histograms (e.g., prosody, pitch, timbre distributions), or controlled real-vs-synthetic benchmark splits. This is load-bearing for the central claims because the reported 7 pp role-playing gain, 0.13 MOS singing improvement, and conversational SOTA results could arise from synthetic data artifacts or distribution shifts rather than the hybrid architecture.
- [Experiments / Results] Experimental evaluation (§4 / Results): Benchmark improvements are presented (7 pp on role-playing, +1.38 pp C3, +4.98 pp URO, 0.13 MOS) without details on evaluation protocols, statistical significance testing, data splits, inter-rater reliability for MOS, error bars, or ablations isolating the multi-codebook design from data effects. This prevents assessment of whether the gains are robust or confounded by the synthetic training distribution.
minor comments (2)
- [Abstract] The abstract refers to 'peer SLMs' and 'prior SLMs' without naming the specific baselines or providing citations; these should be explicitly listed with references in §4 and Table 1 or equivalent.
- [Method] Notation for the multi-codebook audio tokens and hybrid interleaving scheme could be clarified with a diagram or pseudocode in the method section to improve reproducibility.
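To make the pseudocode request concrete, here is a minimal sketch of one way a hybrid interleaved text-audio stream with multi-codebook frames could be laid out. The chunk sizes, vocabulary offsets, and per-codebook ID ranges are illustrative assumptions, not VITA-QinYu's actual scheme; only the eight-codebook count comes from the paper.

```python
# Hypothetical illustration of hybrid text-audio interleaving with
# multi-codebook audio tokens. Names, offsets, and chunk sizes are
# assumptions for exposition, not the paper's actual configuration.
from typing import List

NUM_CODEBOOKS = 8      # e.g., eight 12.5 Hz codebooks (100 tokens/s total)
TEXT_CHUNK = 4         # text tokens emitted per interleaving step (assumed)
AUDIO_CHUNK = 2        # audio frames emitted per interleaving step (assumed)
AUDIO_OFFSET = 50_000  # keeps audio IDs disjoint from text IDs (assumed)

def flatten_audio_frame(frame: List[int]) -> List[int]:
    """Map one multi-codebook frame (one ID per codebook) into a shared
    vocabulary, giving each codebook its own disjoint ID range."""
    assert len(frame) == NUM_CODEBOOKS
    return [AUDIO_OFFSET + k * 1024 + tok for k, tok in enumerate(frame)]

def interleave(text_ids: List[int], audio_frames: List[List[int]]) -> List[int]:
    """Alternate fixed-size chunks of text tokens and flattened audio frames,
    so one autoregressive stream carries both modalities."""
    out, t, a = [], 0, 0
    while t < len(text_ids) or a < len(audio_frames):
        out.extend(text_ids[t:t + TEXT_CHUNK]); t += TEXT_CHUNK
        for frame in audio_frames[a:a + AUDIO_CHUNK]:
            out.extend(flatten_audio_frame(frame))
        a += AUDIO_CHUNK
    return out

# Example: 8 text tokens interleaved with 3 audio frames of 8 codebook IDs each.
seq = interleave(list(range(8)), [[7] * NUM_CODEBOOKS for _ in range(3)])
print(len(seq))  # 8 text tokens + 3 frames * 8 codebook tokens = 32
```

The design point the paper emphasizes, modality separation, appears here as disjoint ID ranges for text and for each codebook, so the two streams never collide in the shared vocabulary.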
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The concerns regarding validation of the synthetic data pipeline and transparency in experimental reporting are well-taken. We address each point below and will revise the manuscript to incorporate additional analyses and details for improved rigor.
Point-by-point responses
Referee: [Method / Data Generation Pipeline] Data generation pipeline (described in the method section): The paper states that all training uses 15.8K hours of synthetically generated data but provides no quantitative validation of the pipeline's fidelity, such as human perceptual ratings, acoustic feature histograms (e.g., prosody, pitch, timbre distributions), or controlled real-vs-synthetic benchmark splits. This is load-bearing for the central claims because the reported 7 pp role-playing gain, 0.13 MOS singing improvement, and conversational SOTA results could arise from synthetic data artifacts or distribution shifts rather than the hybrid architecture.
Authors: We agree that explicit quantitative validation of the data pipeline would strengthen the manuscript. The pipeline employs state-of-the-art TTS and voice conversion to generate expressive data, and downstream SOTA results provide indirect evidence of quality. In the revised version, we will add: acoustic feature histograms (pitch, prosody, timbre) comparing synthetic samples to real speech corpora; human perceptual ratings (naturalness and expressiveness) on a held-out subset of generated data; and evaluations on real-world test sets to demonstrate generalization beyond synthetic distributions. These additions will help rule out artifacts as the source of gains. (Revision planned: yes.)
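What such a validation could look like in practice: a minimal sketch of a synthetic-vs-real pitch-histogram comparison, assuming librosa for F0 extraction. The file paths, bin settings, and divergence check are hypothetical; the authors' actual validation pipeline is not described in the paper.

```python
# Hypothetical sketch: compare pitch (F0) distributions of synthetic vs. real
# speech, one of the acoustic-feature histograms the rebuttal proposes.
# Paths and histogram settings are illustrative assumptions.
import numpy as np
import librosa

def f0_values(path: str, sr: int = 16_000) -> np.ndarray:
    """Extract voiced F0 estimates (Hz) from one utterance with pYIN."""
    y, _ = librosa.load(path, sr=sr)
    f0, voiced, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
    )
    return f0[voiced]  # keep only frames judged voiced

def pitch_histogram(paths: List[str], bins: np.ndarray) -> np.ndarray:
    """Pool F0 values across utterances and normalize to a distribution."""
    pooled = np.concatenate([f0_values(p) for p in paths])
    hist, _ = np.histogram(pooled, bins=bins)
    return hist / max(hist.sum(), 1)

from typing import List  # noqa: E402 (kept near use for readability)

bins = np.linspace(60, 600, 55)  # 10 Hz bins over a speech/singing range
real = pitch_histogram(["real_001.wav", "real_002.wav"], bins)    # hypothetical files
synth = pitch_histogram(["synth_001.wav", "synth_002.wav"], bins) # hypothetical files

# Jensen-Shannon divergence between the two normalized histograms:
# low values would indicate the synthetic pitch distribution tracks real speech.
eps = 1e-12
m = 0.5 * (real + synth)
js = 0.5 * np.sum(real * np.log((real + eps) / (m + eps))) \
   + 0.5 * np.sum(synth * np.log((synth + eps) / (m + eps)))
print(f"JS divergence (nats): {js:.4f}")
```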
Referee: [Experiments / Results] Experimental evaluation (§4 / Results): Benchmark improvements are presented (7 pp on role-playing, +1.38 pp C3, +4.98 pp URO, 0.13 MOS) without details on evaluation protocols, statistical significance testing, data splits, inter-rater reliability for MOS, error bars, or ablations isolating the multi-codebook design from data effects. This prevents assessment of whether the gains are robust or confounded by the synthetic training distribution.
Authors: We acknowledge the need for greater transparency in reporting. Section 4 describes the benchmarks and protocols, but we will expand it in revision to include: full details on data splits and evaluation procedures; statistical significance testing with p-values and confidence intervals for all reported improvements; error bars on objective metrics; inter-rater reliability (e.g., Krippendorff's alpha) for MOS scores; and ablations comparing multi-codebook vs. single-codebook variants trained on identical data to isolate architectural contributions from data effects. This should clarify the robustness of the results. (Revision planned: yes.)
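One concrete form the promised statistics could take: a numpy-only paired-bootstrap confidence interval for a MOS gap, with fabricated placeholder ratings. Whether the reported +0.13 MOS gain clears such a test depends entirely on the study's real listening data; inter-rater reliability such as Krippendorff's alpha would additionally need a dedicated implementation.

```python
# Hypothetical sketch: paired bootstrap CI for a MOS difference between two
# systems rated by the same listeners on the same items. Ratings below are
# fabricated placeholders; the real analysis would use the study's data.
import numpy as np

rng = np.random.default_rng(0)

# Shape (n_items,): per-item mean opinion scores for each system (assumed data).
mos_ours = rng.normal(4.05, 0.3, size=200).clip(1, 5)
mos_base = rng.normal(3.92, 0.3, size=200).clip(1, 5)

def paired_bootstrap_ci(a, b, n_boot=10_000, alpha=0.05, rng=rng):
    """Resample items with replacement and return a CI for mean(a - b)."""
    diffs = a - b
    idx = rng.integers(0, len(diffs), size=(n_boot, len(diffs)))
    boot_means = diffs[idx].mean(axis=1)
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return diffs.mean(), lo, hi

mean_gap, lo, hi = paired_bootstrap_ci(mos_ours, mos_base)
print(f"MOS gap = {mean_gap:+.3f}, 95% CI [{lo:+.3f}, {hi:+.3f}]")
# The reported +0.13 MOS gain would be credible if such a CI excludes zero.
```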
Circularity Check
No circularity: empirical SLM with external benchmark results
full rationale
The paper proposes a hybrid multi-codebook architecture and a synthetic data pipeline, then reports performance deltas on independent benchmarks (role-playing, MOS singing, C3, URO). No equations, first-principles derivations, or predictions are presented that reduce by construction to fitted parameters, self-definitions, or self-citation chains. The central claims rest on experimental outcomes measured against external test sets rather than tautological renaming or input-output equivalence.
Axiom & Free-Parameter Ledger
free parameters (2)
- multi-codebook audio token configuration
- data synthesis hyperparameters
axioms (2)
- Domain assumption: Hybrid interleaved text-audio modeling with multi-codebook tokens preserves clear modality separation without interference.
- Domain assumption: The synthesized data distribution matches real human expressiveness for role-playing and singing.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · reality_from_one_distinction · tagged unclear
Relation between the paper passage and the cited Recognition theorem is unclear.
Passage: "VITA-QinYu adopts a hybrid speech-text paradigm that extends interleaved text-audio modeling with multi-codebook audio tokens... eight 12.5 Hz codebooks (100 Hz total)"
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tagged unclear
Relation between the paper passage and the cited Recognition theorem is unclear.
Passage: "We further develop a comprehensive data generation pipeline to synthesize a total of 15.8K hours of natural conversation, role-playing, and singing data"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.