Virtual Speech Therapist: A Clinician-in-the-Loop AI Speech Therapy Agent for Personalized and Supervised Therapy

Bjorn W Schuller; Fabrice Hirsch; Goncalo Leal; Md Sahidullah; Patrick Marmaroli; Shakeel Sheikh; Slim Ouni

arXiv: 2605.01101 · v1 · submitted 2026-05-01 · cs.AI · cs.CL · cs.SD · eess.AS

Pith reviewed 2026-05-09 19:08 UTC · model grok-4.3

keywords: stuttering therapy · AI speech therapy · multi-agent LLM · clinician-in-the-loop · personalized therapy planning · deep learning classification · speech impairment

The pith

An AI platform called Virtual Speech Therapist combines speech classification with multi-agent reasoning to draft personalized stuttering therapy plans for clinician review.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Virtual Speech Therapist, a system that first uses deep learning to classify stuttering from patient speech samples and then deploys multiple large language model agents to generate and refine individualized therapy plans. A dedicated critic agent checks each plan for safety, methodological soundness, and consistency with peer-reviewed evidence before passing it to a human clinician for feedback and final approval. This clinician-in-the-loop design keeps professional oversight intact while automating the repetitive parts of assessment and initial planning. A sympathetic reader would care because the approach could let therapists spend more time on direct patient interaction and reach more people with speech impairments.

Core claim

Virtual Speech Therapist integrates deep learning-based stuttering classification with a multi-agent LLM reasoning process in which specialized agents autonomously generate, critique, and iteratively refine individualized therapy plans. A critic agent evaluates all plans for clinical safety and alignment with established professional guidelines. The resulting draft is reviewed by a clinician who supplies feedback, after which the system produces a finalized plan. Experimental evaluation by expert speech therapists confirms that VST consistently generates high-quality, evidence-based therapy recommendations.

What carries the argument

The multi-agent LLM reasoning workflow with a dedicated critic agent that generates, evaluates, and refines therapy plans for safety and evidence alignment before clinician input.
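The generate, critique, refine loop with a final clinician gate can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Critique`, `draft_plan`, `release`, the `max_rounds` limit, and the toy agents are all hypothetical, and a real system would put LLM calls behind the `Generator` and `Critic` interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Critique:
    """Verdict returned by the (hypothetical) critic agent."""
    approved: bool
    issues: List[str] = field(default_factory=list)


# Assumed agent interfaces; a real system would wrap LLM calls behind them.
Generator = Callable[[str, List[str]], str]  # (stuttering label, critic issues) -> draft
Critic = Callable[[str], Critique]           # draft -> verdict


def draft_plan(label: str, generate: Generator, critique: Critic,
               max_rounds: int = 3) -> Tuple[str, Critique]:
    """Generate a draft, then revise it until the critic approves or rounds run out."""
    plan = generate(label, [])
    verdict = critique(plan)
    for _ in range(max_rounds - 1):
        if verdict.approved:
            break
        plan = generate(label, verdict.issues)
        verdict = critique(plan)
    return plan, verdict


def release(plan: str, verdict: Critique,
            clinician_approves: Callable[[str], bool]) -> Optional[str]:
    """Return the plan only if both the critic and the clinician accept it."""
    if verdict.approved and clinician_approves(plan):
        return plan
    return None


# Toy stand-ins so the sketch runs without any model.
def toy_generate(label: str, issues: List[str]) -> str:
    draft = f"Draft plan for stuttering type '{label}'"
    if issues:
        draft += " (revised to address: " + "; ".join(issues) + ")"
    return draft


def toy_critique(plan: str) -> Critique:
    if "revised" in plan:
        return Critique(approved=True)
    return Critique(approved=False, issues=["no guideline source cited"])


plan, verdict = draft_plan("block", toy_generate, toy_critique)
print(plan)
print(release(plan, verdict, clinician_approves=lambda p: True) is not None)
```

The point of keeping `release` separate from `draft_plan` is that a critic approval alone never delivers a plan; the clinician callback has the final say, which mirrors the clinician-in-the-loop design described here.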

If this is right

Clinicians receive ready-to-review therapy drafts rather than starting from scratch, which can lower administrative workload.
Therapy plans are tailored to the specific stuttering classification obtained from each patient's speech sample.
The critic agent and clinician feedback together keep final plans under professional supervision.
The system can support consistent application of evidence-based practices across different therapists.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same agent-critic structure could be adapted to other speech or language disorders if appropriate classification models and guidelines exist.
Adding longitudinal patient data might allow the agents to update plans as therapy progresses.
Deployment in regions with limited access to speech therapists could expand service reach while preserving oversight.
Controlled trials measuring actual patient outcomes and clinician time savings would quantify the practical benefit.

Load-bearing premise

The multi-agent LLM reasoning and critic agent will reliably produce plans that align with peer-reviewed evidence and professional guidelines without introducing clinically unsafe suggestions.

What would settle it

A blinded review in which expert speech therapists examine a representative sample of VST-generated plans and find that a substantial fraction contain recommendations unsupported by guidelines or carrying clinical risk.
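One way to make "a substantial fraction" concrete is to report the share of flagged plans together with a confidence interval. The sketch below uses the Wilson score interval, a standard choice for proportions from small samples. The function name and the counts are hypothetical; the paper reports no such numbers.

```python
import math
from typing import Tuple


def wilson_interval(flagged: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for the proportion flagged/total (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= flagged <= total:
        raise ValueError("flagged must lie between 0 and total")
    p = flagged / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical review: 5 of 50 sampled plans flagged as unsupported or risky.
low, high = wilson_interval(5, 50)
print(f"flagged share 10%, 95% interval {low:.3f} to {high:.3f}")
```

With a small review sample the interval stays wide, which is why the number of plans reviewed matters as much as the observed rate.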

Original abstract

This paper develops Virtual Speech Therapist (VST), an intelligent agent-based platform that streamlines stuttering assessment and delivers customized therapy planning through automated and adaptive AI-driven workflows. VST integrates state-of-the-art deep learning-based stuttering classification, and multi-agent large language model (LLM) reasoning to support evidence-based clinical decision-making. The VST begins with the acquisition and feature extraction of patient speech samples, followed by robust classification of stuttering types. Building on these outputs, VST initiates an agentic reasoning process in which specialized LLM agents autonomously generate, critique, and iteratively refine individualized therapy plans. A dedicated critic agent evaluates all generated therapy plans to ensure clinical safety, methodological soundness, and alignment with peer-reviewed evidence and established professional guidelines. The resulting output is a comprehensive, patient-specific therapy draft intended for clinician review. Incorporating clinician feedback, the system then produces a finalized therapy plan suitable for patient delivery, thereby maintaining a clinician-in-the-loop paradigm. Experimental evaluation by expert speech therapists confirms that VST consistently generates high-quality, evidence-based therapy recommendations. These findings demonstrate the system's potential to augment clinical workflows, reduce clinician burden, and improve therapeutic outcomes for individuals with speech impairments. An interactive user interface for the proposed system is available online at: https://vocametrix.com/ai/stuttering-therapy-planning-agent, facilitating real-time stuttering assessment and personalized therapy planning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper describes a stuttering therapy workflow that pairs deep-learning classification with a multi-agent LLM planner and critic plus clinician oversight, but the key claim of expert validation has no numbers, methods, or sample details attached.

The main takeaway is a concrete engineering integration: speech samples go through feature extraction and a stuttering classifier, then a generator agent drafts therapy plans, a critic agent checks them against guidelines and safety, and a clinician reviews before final output. The online interface at vocametrix.com is a practical addition that lets people try the flow in real time. This specific combination for stuttering, with the explicit critic step and closed clinician loop, looks new relative to earlier AI speech-therapy papers that mostly stopped at classification or simple chatbots. The architecture is described clearly enough that someone could replicate the high-level structure using off-the-shelf components and existing clinical guidelines. That is the useful part.

The soft spot is the evaluation. The abstract states that expert therapists judged the plans high-quality and evidence-based, yet it gives no count of therapists, no count of patient cases or plans reviewed, no scoring rubric, no inter-rater numbers, and no comparison to unaided clinician plans. The critic agent is said to filter unsafe suggestions, but there is no ablation, no failure-case review, and no data on how often it actually changed outputs. Without those details the central claim stays an assertion rather than evidence.

The work is aimed at speech-language pathologists who want to explore AI assistance for initial planning and at applied AI researchers who build clinician-in-the-loop systems. A reader looking for a working prototype idea or a starting architecture will get something concrete; anyone expecting quantified performance or safety data will not. I would send it to peer review once the authors supply the missing evaluation protocol and results, because the underlying workflow is a reasonable next step in this narrow domain and the gaps are fixable rather than fatal.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces the Virtual Speech Therapist (VST), a clinician-in-the-loop AI platform that integrates deep learning-based stuttering classification from speech samples with a multi-agent LLM system. Specialized agents generate, critique, and refine individualized therapy plans aligned with clinical guidelines, with a dedicated critic agent enforcing safety and evidence-based standards before clinician review and finalization. The central claim is that expert speech therapists have evaluated the VST outputs as consistently high-quality and evidence-based.

Significance. The clinician-in-the-loop design combined with an explicit critic agent for guideline alignment represents a practical strength in applying agentic AI to a clinical domain while prioritizing safety. The availability of a public interactive UI supports transparency and further testing. If the evaluation were properly detailed, the work could usefully illustrate how existing LLM and DL components can be orchestrated to reduce clinician burden in speech therapy without replacing human judgment.

major comments (1)

[Abstract] Abstract: The claim that 'Experimental evaluation by expert speech therapists confirms that VST consistently generates high-quality, evidence-based therapy recommendations' is unsupported by any reported methodology. No information is provided on the number of therapists, number of patient cases or plans reviewed, scoring rubrics (e.g., alignment with ASHA guidelines or safety checklists), inter-rater agreement, quantitative metrics (e.g., Likert scores or error rates), or baseline comparisons. This is load-bearing for the paper's central assertion.
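As an illustration of the inter-rater agreement statistic the comment asks for, the sketch below computes Cohen's kappa for two raters who each assign one categorical label per plan. The function name, the labels `"ok"` and `"risk"`, and the ratings are hypothetical; the manuscript reports no rating data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of plans")
    n = len(rater_a)
    # Observed agreement: share of plans given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four plans by two therapists.
therapist_1 = ["ok", "ok", "risk", "risk"]
therapist_2 = ["ok", "ok", "risk", "ok"]
kappa = cohens_kappa(therapist_1, therapist_2)
print(f"kappa = {kappa:.2f}")
```

Here the raters agree on three of four plans, yet kappa is only 0.5 because half of that agreement would be expected by chance given their label frequencies.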

minor comments (2)

[Methods] The description of the multi-agent pipeline would benefit from an explicit ablation or failure-case analysis of the critic agent's rejections to demonstrate its effectiveness.
[Introduction] Ensure all acronyms (e.g., VST, LLM) are defined at first use and that references to 'peer-reviewed evidence' and 'professional guidelines' cite specific sources.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and detailed review of our manuscript. We address the major comment below and commit to revisions that strengthen the paper without overstating our current results.

Referee: [Abstract] Abstract: The claim that 'Experimental evaluation by expert speech therapists confirms that VST consistently generates high-quality, evidence-based therapy recommendations' is unsupported by any reported methodology. No information is provided on the number of therapists, number of patient cases or plans reviewed, scoring rubrics (e.g., alignment with ASHA guidelines or safety checklists), inter-rater agreement, quantitative metrics (e.g., Likert scores or error rates), or baseline comparisons. This is load-bearing for the paper's central assertion.

Authors: We agree that the abstract claim is not supported by the methodological details requested. The current manuscript mentions evaluation by expert speech therapists but does not report the number of therapists, cases reviewed, rubrics, inter-rater agreement, quantitative scores, or baselines. We will revise the abstract to remove or appropriately qualify this statement. In the revised manuscript we will also expand any existing evaluation description to include these specifics or, if no such data exist, clearly state the preliminary nature of the therapist feedback and the availability of the public UI for independent verification.

Revision: yes

Circularity Check

0 steps flagged

No circularity: engineering integration of independent components

The paper describes an applied system that combines existing deep-learning stuttering classifiers with multi-agent LLM workflows and clinician oversight. No equations, parameter fitting, or derivation steps are presented. No self-citations are invoked to justify uniqueness, ansatzes, or load-bearing premises. The evaluation claim is an external assertion rather than a mathematical reduction to the system's own inputs. The architecture is therefore self-contained against external benchmarks and contains no circular steps of the enumerated kinds.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The system relies on pre-existing deep-learning models for stuttering classification and general-purpose LLMs; no new mathematical axioms, free parameters fitted in this work, or invented physical entities are introduced.

axioms (2)

Domain assumption: Existing deep-learning models can accurately classify stuttering types from speech features.
Invoked in the description of the classification stage; treated as given rather than re-derived.
Domain assumption: LLM agents can generate and critique therapy plans that align with peer-reviewed clinical guidelines.
Central to the agentic reasoning process; no independent verification mechanism beyond the critic agent is described.

pith-pipeline@v0.9.0 · 5580 in / 1322 out tokens · 41681 ms · 2026-05-09T19:08:16.095736+00:00 · methodology


[1] [1]

Next-generation agentic

Karunanayake, Nalan , journal=. Next-generation agentic. 2025 , publisher=

work page 2025

[2] [2]

2025 , publisher=

Navigating Childhood Stuttering: A Guide to Management of Stuttering at Home and School , author=. 2025 , publisher=

work page 2025

[3] [3]

Current Research in Neurobiology , volume=

Stuttering as a spectrum disorder: A hypothesis , author=. Current Research in Neurobiology , volume=. 2023 , publisher=

work page 2023

[4] [4]

Dialogue without barriers: a comprehensive approach to dealing with stuttering , pages=

Becoming an effective clinician specialized in fluency disorders , author=. Dialogue without barriers: a comprehensive approach to dealing with stuttering , pages=. 2023 , publisher=

work page 2023

[5] [5]

2025 , publisher=

Stuttering: Foundations and clinical applications , author=. 2025 , publisher=

work page 2025

[6] [6]

American Journal of Speech-Language Pathology , volume=

Defining, identifying, and evaluating clinical trials of stuttering treatments: A tutorial for clinicians , author=. American Journal of Speech-Language Pathology , volume=

work page

[7] [7]

Asia Pacific Journal of Speech, Language and Hearing , volume=

Clinical identification of early stuttering: Methods, issues, and future directions , author=. Asia Pacific Journal of Speech, Language and Hearing , volume=. 2007 , publisher=

work page 2007

[8] [8]

American Journal of Speech-Language Pathology , volume=

Identification of early stuttering: Issues and suggested strategies , author=. American Journal of Speech-Language Pathology , volume=. 1992 , publisher=

work page 1992

[9] [9]

Clinician-in-the-loop decision making: Reinforcement learning with near-optimal set-valued policies , author=. Proc. of International Conference on Machine Learning , pages=. 2020 , organization=

work page 2020

[10] [10]

Language, speech, and hearing services in schools , volume=

Stuttering in school-age children: A comprehensive approach to treatment , author=. Language, speech, and hearing services in schools , volume=

work page

[11] [11]

Schuller, Bj. The. Proc. of the 30th ACM International Conference on Multimedia , pages=

work page

[12] [12]

Classification of stuttering--The

Bayerl, Sebastian P and Gerczuk, Maurice and Batliner, Anton and Bergler, Christian and Amiriparian, Shahin and Schuller, Bj. Classification of stuttering--The. Computer Speech & Language , volume=. 2023 , publisher=

work page 2023

[13] [13]

Journal of Fluency Disorders , volume=

Classification of stuttering symptoms using neural network models , author=. Journal of Fluency Disorders , volume=. 2010 , publisher=

work page 2010

[14] [14]

Journal of Speech, Language, and Hearing Research , volume=

Acoustic analysis of stutterers' fluent speech before and after therapy , author=. Journal of Speech, Language, and Hearing Research , volume=. 1983 , publisher=

work page 1983

[15] [15]

Detecting multiple speech disfluencies using a deep residual network with bidirectional long short-term memory , author=. Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pages=. 2020 , organization=

work page 2020

[16] [16]

IEEE/ACM Transactions on Audio, Speech, and Language Processing , volume=

Fluentnet: End-to-end detection of stuttered speech disfluencies with deep learning , author=. IEEE/ACM Transactions on Audio, Speech, and Language Processing , volume=. 2021 , publisher=

work page 2021

[17] [17]

Attention Is All You Need , author=. Proc. of Advances in Neural Information Processing Systems (NeurIPS) , pages=

work page

[18] [18]

Sensors , volume=

TranStutter: A convolution-free transformer-based deep learning method to classify stuttered speech using 2D mel-spectrogram visualization and attention-based feature representation , author=. Sensors , volume=. 2023 , publisher=

work page 2023

[19] [19]

Multi-task Learning for Automatic Stuttering Recognition and Severity Estimation , author=. Proc. of Interspeech , pages=

work page

[20] [20]

Journal of Speech, Language, and Hearing Research , volume=

Stuttering: A Motor Control Perspective , author=. Journal of Speech, Language, and Hearing Research , volume=. 2010 , publisher=

work page 2010

[21] [21]

Advances in neural information processing systems , volume=

wav2vec 2.0: A framework for self-supervised learning of speech representations , author=. Advances in neural information processing systems , volume=

work page

[22] [22]

Journal of Fluency Disorders , volume=

Epidemiology of stuttering: 21st century advances , author=. Journal of Fluency Disorders , volume=. 2013 , publisher=

work page 2013

[23] [23]

2013 , publisher=

Stuttering: An Integrated Approach to Its Nature and Treatment , author=. 2013 , publisher=

work page 2013

[24] [24]

Pediatrics , volume=

Natural history of stuttering to 4 years of age: A prospective community-based study , author=. Pediatrics , volume=. 2013 , publisher=

work page 2013

[25] [25]

Journal of Speech, Language, and Hearing Research , volume=

Stuttering: A motor control perspective , author=. Journal of Speech, Language, and Hearing Research , volume=. 2018 , publisher=

work page 2018

[26] [26]

NeuroImage , volume=

Neural bases of stuttering and speech motor control , author=. NeuroImage , volume=. 2015 , publisher=

work page 2015

[27] [27]

Folia Phoniatrica et Logopaedica , volume=

Laryngeal function in people who stutter: Evidence from electroglottography , author=. Folia Phoniatrica et Logopaedica , volume=. 2014 , publisher=

work page 2014

[28] [28]

Automatic Detection of Speech Disfluencies Using Spectro-Temporal Features and Deep Neural Networks. Journal of Fluency Disorders, 2020.

[29] [29]

Vision-Based Detection of Facial and Articulatory Cues in Stuttering. Proc. of Interspeech.

[30] [30]

Electroglottography and Its Clinical Applications in Fluency Disorders. Folia Phoniatrica et Logopaedica, 2018.

[31] [31]

Disfluency Detection Using a Bidirectional LSTM. Proc. of NAACL-HLT.

[32] [32]

Real-Time Magnetic Resonance Imaging and Its Application to Speech Science. Topics in Cognitive Science, 2017.

[33] [33]

Respiratory Control in Speech Production: Effects in People Who Stutter. Journal of Speech, Language, and Hearing Research, 2015.

[34] [35]

Dietrich, Nicholas. Agentic. 2025.

[35] [36]

Agentic artificial intelligence: the power to change medicine and our world. Journal of the American College of Radiology.

[36] [37]

The rise of agentic AI teammates in medicine. The Lancet, 2025.

[37] [38]

Liu, Xiaokang and Li, Xingfeng and Yang, Yudong and Wang, Lan and Yan, Nan. Addressing Task Conflicts in Stuttering Detection via

[38] [39]

Shakeel Ahmad Sheikh and Md Sahidullah and Fabrice Hirsch and Slim Ouni. Proc. of the ACM Multimedia 2022.

[39] [40]

Machine learning for stuttering identification: Review, challenges and future directions. Neurocomputing, 2022.

[40] [41]

Sheikh, Shakeel Ahmad and Sahidullah, Md and Hirsch, Fabrice and Ouni, Slim. Introducing

[41] [42]

Shakeel Ahmad Sheikh. 2023.

[42] [43]

Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. arXiv preprint arXiv:2507.06261.

[43] [44]

Sebastian P. Bayerl and Dominik Wagner and Elmar Nöth and Tobias Bocklet and Korbinian Riedhammer. The Influence of Dataset-Partitioning on Dysfluency. Proc. of Text,

[44] [45]

Sep-28k: A dataset for stuttering event detection from podcasts with people who stutter. Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.

[45] [46]

Stuttering detection using speaker representations and self-supervised contextual embeddings. International Journal of Speech Technology, 2023.

[46] [47]

Unsupervised cross-lingual representation learning for speech recognition. arXiv preprint arXiv:2006.13979.

[47] [48]

Mohamed, Abdelrahman and Lee, Hung-yi and Borgholt, Lasse and Havtorn, Jakob D. and Edin, Joakim and Igel, Christian and Kirchhoff, Katrin and Li, Shang-Wen and Livescu, Karen and Maaløe, Lars and Sainath, Tara N. and Watanabe, Shinji. Self-Supervised Speech Representation Learning: A Review.

[48] [49]

Robust stuttering detection via multi-task and adversarial learning. Proc. of 30th European Signal Processing Conference (EUSIPCO), 2022.

[49] [50]

Overview of Automatic Speech Analysis and Technologies for Neurodegenerative Disorders: Diagnosis and Assistive Applications. IEEE Journal of Selected Topics in Signal Processing.

[50] [51]

The effect of sampling temperature on problem solving in large language models. Proc. of Findings of the Association for Computational Linguistics: EMNLP 2024.

[51] [52]

Psychosocial impact of living with a stuttering disorder: Knowing is not enough. Seminars in speech and language, 2014.

[52] [53]

More than meets the eye: Self-rated covert stuttering is linked to reduced psychosocial and communicative outcomes. Journal of Fluency Disorders, 2025.

[53] [54]

Yaruss, J Scott and Quesal, Robert W. Overall Assessment of the Speaker's Experience of Stuttering (. 2006.

[54] [55]

Social anxiety disorder and stuttering: Current status and future directions. Journal of Fluency Disorders, 2014.

[55] [56]

S. What works for whom? Multidimensional individualized stuttering therapy (. Journal of Communication Disorders, 2020.

[56] [57]

Stuttering: Our current knowledge, research opportunities, and ways to address critical gaps. Neurobiology of Language, 2025.

[57] [58]

Systematic review of machine learning approaches for detecting developmental stuttering. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022.

[58] [59]

Sheikh, Shakeel A and Sahidullah, Md and Hirsch, Fabrice and Ouni, Slim. 2021.

[59] [60]

Advancing stuttering detection via data augmentation, class-balanced loss and multi-contextual deep learning. IEEE Journal of Biomedical and Health Informatics.