arxiv: 2604.20535 · v1 · submitted 2026-04-22 · 💻 cs.CL · cs.HC

Recognition: unknown

Aligning Stuttered-Speech Research with End-User Needs: Scoping Review, Survey, and Guidelines

Hawau Olamide Toyin , Mutiah Apampa , Toluwani Aremu , Humaid Alblooshi , Ana Rita Valente , Gon\c{c}alo Leal , Zhengjun Yue , Zeerak Talat

show 1 more author

Hanan Aldarmaki

Authors on Pith no claims yet

Pith reviewed 2026-05-09 23:49 UTC · model grok-4.3

classification 💻 cs.CL cs.HC

keywords stuttered speechspeech technologyscoping reviewstakeholder surveyend-user needsresearch taxonomyguidelines for alignment

0 comments

The pith

A scoping review and stakeholder survey show that stuttered-speech research often fails to address the priorities of adults who stutter and speech-language pathologists.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines current research on stuttered speech through a scoping review of relevant papers and a survey of 70 stakeholders, such as adults who stutter and speech-language pathologists. It creates a taxonomy to categorize the research efforts and points out specific areas where research directions do not match what stakeholders say they need. The work then provides concrete guidelines for future studies and technology development to better serve the stuttering community. Understanding these gaps matters because speech technologies like recognition systems could become more practical and helpful if they reflect real user experiences instead of researcher assumptions.

Core claim

By conducting a scoping review of the literature on stuttered speech and surveying stakeholders including adults who stutter and speech-language pathologists, the authors develop a taxonomy of the research and identify divergences from end-user needs, leading to guidelines for aligning future work with the actual requirements of the stuttering community.

What carries the argument

A taxonomy of stuttered-speech research derived from the scoping review and stakeholder survey, used to map and contrast research priorities with end-user needs.

If this is right

Current research can be classified using the proposed taxonomy to reveal overlooked areas.
Specific divergences, such as in evaluation methods and priorities, can be addressed by following the outlined guidelines.
Future speech technology research will better support end-users if it incorporates the identified stakeholder needs.
The guidelines provide directions for interdisciplinary dialogue between researchers and the stuttering community.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Speech recognition systems for stuttered speech could improve in real-world performance if research follows these guidelines.
The approach of combining literature review with direct stakeholder input could be applied to other forms of atypical speech or communication disorders.
Developers might create more inclusive tools by consulting the taxonomy and guidelines when setting research agendas.

Load-bearing premise

The scoping review captures the full scope of relevant literature on stuttered speech and the survey responses from 70 stakeholders offer representative insights into the needs of the broader stuttering community.

What would settle it

A comprehensive re-examination of the literature that uncovers major missed papers altering the taxonomy, or a larger-scale survey yielding substantially different needs and priorities from those reported.

Figures

Figures reproduced from arXiv: 2604.20535 by Ana Rita Valente, Gon\c{c}alo Leal, Hanan Aldarmaki, Hawau Olamide Toyin, Humaid Alblooshi, Mutiah Apampa, Toluwani Aremu, Zeerak Talat, Zhengjun Yue.

**Figure 1.** Figure 1: Variation in inputs, task formulation and outputs for the stutter identification research area. We propose a task taxonomy (see Section 2.2.1) to handle variation in output forms and standardise research nomenclature. users (i.e., SLPs and PWS) for data collection and annotation, we find that relatively few studies report end user participation in problem formulation, task definition, annotation design, e… view at source ↗

**Figure 2.** Figure 2: Task Combinations in Literature Survey. The left bar chart shows distribution of frequency of research areas in the survey. The top bar chart shows the frequency of combination of research areas. distinguish between intended and verbatim speech recognition. Intended speech recognition aims to remove all stuttering events from the transcription to convey the intended message more effectively, whereas verba… view at source ↗

**Figure 4.** Figure 4: Number of papers published by year. PWS means papers involving PWS collaboration; SLPs means papers involving SLPs collaboration; All means all papers published. Fewer than 20% of papers explicitly report interdisciplinary collaboration with stakeholders, indicating that much of the research area combinations remain more technically focused rather than truly human-centred. Stakeholders are collaboratin… view at source ↗

**Figure 3.** Figure 3: Languages covered in literature shows stutteredspeech AI research is English-centric. The long-tail indicates notable but minimal efforts towards language diversity. The language distribution in the stuttered-speech technology literature is highly skewed towards English with 152 papers focusing on English. In contrast, the next four most common languages are German (22 papers), Mandarin (17 papers), Hi… view at source ↗

**Figure 5.** Figure 5: PWS responses to Likert-scale questions on stuttering variability. PWS respondents formed a young and multilingual group. Most were between 18 and 44 years old, and more than 90% reported speaking at least two languages; around one-third spoke three or more. On a five-point scale, most PWS rated their current stuttering severity as mild to moderate (levels 2–3; Fig- [PITH_FULL_IMAGE:figures/full_fig_p00… view at source ↗

read the original abstract

Atypical speech is receiving greater attention in speech technology research, but much of this work unfolds with limited interdisciplinary dialogue. For stuttered speech in particular, it is widely recognised that current speech recognition systems fall short in practice, and current evaluation methods and research priorities are not systematically grounded in end-user experiences and needs. In this work, we analyse these gaps through 1) a scoping review of papers that deal with stuttered speech and 2) a survey of 70 stakeholders, including adults who stutter and speech-language pathologists. By analysing these two perspectives, we propose a taxonomy of stuttered-speech research, identify where current research directions diverge from the needs articulated by stakeholders, and conclude by outlining concrete guidelines and directions towards addressing the real needs of the stuttering community.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper's taxonomy and guidelines come from pairing a scoping review with a 70-person stakeholder survey, but the survey's reach and the review's search details are the parts that need checking before the recommendations can be taken as firm.

read the letter

The paper pairs a scoping review of stuttered-speech papers with a survey of 70 adults who stutter and speech-language pathologists. From that it builds a taxonomy of current research topics and flags places where the work diverges from what the stakeholders say they need, then lists concrete guidelines for closing those gaps. That combination is the main new piece: prior calls for user involvement existed, but this one ties specific literature patterns to fresh survey responses in one place. It does a clear job laying out the mismatches, such as heavy focus on recognition accuracy versus user priorities around real-world use and social impact. The approach is straightforward and stays grounded in the two data sources rather than abstract claims. The survey sample is small and the abstract gives no detail on how participants were found or whether the group covers different ages, severities, languages, or regions. A scoping review is only as good as its search terms, databases, and screening rules; without seeing those steps it is hard to judge whether the claimed divergences rest on a complete picture or a selective one. If the full methods section shows convenience sampling or narrow inclusion criteria, the guidelines become more like informed suggestions than evidence-based directions. This is useful reading for speech-tech researchers who work on atypical speech and for clinicians who want to steer projects toward practical outcomes. HCI or accessibility people would also find the taxonomy helpful as a quick map of the field. It is not a breakthrough in method or theory, but the topic is live and the user-centered angle is worth referee time. I would send it for review with instructions to examine the recruitment and search protocol closely.

Referee Report

2 major / 2 minor

Summary. The manuscript conducts a scoping review of the literature on stuttered speech in speech technology research together with a survey of 70 stakeholders (adults who stutter and speech-language pathologists). From these two sources it derives a taxonomy of current research, identifies divergences between published work and stakeholder-articulated needs, and offers concrete guidelines for future research directions.

Significance. If the review protocol and survey sampling are shown to be systematic and representative, the work would provide a valuable bridge between technical speech-processing research and the practical requirements of the stuttering community. Explicit stakeholder input and the resulting taxonomy/guidelines could help reorient evaluation metrics and system design priorities toward more usable and inclusive ASR systems.

major comments (2)

[§3] §3 (Scoping Review): The search strategy, databases, exact query strings, inclusion/exclusion criteria, and screening process are described at too high a level to permit replication or independent assessment of literature completeness. Because the central claim of 'divergences' between research directions and stakeholder needs rests on the review having captured the relevant corpus, this omission is load-bearing.
[§4] §4 (Survey): Recruitment channels, response rate, demographic stratification (age, stuttering severity, gender, geography, language), and any steps taken to reduce self-selection bias are not reported for the n=70 sample. Without these details the claim that the responses represent 'the stuttering community' and therefore justify the taxonomy and guidelines cannot be evaluated.

minor comments (2)

[§3] A PRISMA-style flow diagram or explicit table summarizing the number of papers screened, included, and excluded at each stage would improve transparency of the scoping review.
[§5] The taxonomy presentation would be clearer if accompanied by a single summary table or figure that maps each category to the specific review findings and survey items that support it.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback, which highlights important areas for improving the transparency and replicability of our methods. We address each major comment below and commit to substantial revisions that will strengthen the manuscript without altering its core claims or findings.

read point-by-point responses

Referee: [§3] §3 (Scoping Review): The search strategy, databases, exact query strings, inclusion/exclusion criteria, and screening process are described at too high a level to permit replication or independent assessment of literature completeness. Because the central claim of 'divergences' between research directions and stakeholder needs rests on the review having captured the relevant corpus, this omission is load-bearing.

Authors: We agree that the current description in §3 is insufficiently detailed for replication. In the revised manuscript we will expand this section to report the complete search strategy, including all databases queried (ACM Digital Library, IEEE Xplore, PubMed, Google Scholar, and arXiv), the exact Boolean query strings, the full list of inclusion and exclusion criteria with justifications, the number of records screened at each stage, and a PRISMA flow diagram. These additions will directly address the load-bearing concern and allow independent verification of corpus completeness. revision: yes
Referee: [§4] §4 (Survey): Recruitment channels, response rate, demographic stratification (age, stuttering severity, gender, geography, language), and any steps taken to reduce self-selection bias are not reported for the n=70 sample. Without these details the claim that the responses represent 'the stuttering community' and therefore justify the taxonomy and guidelines cannot be evaluated.

Authors: We acknowledge that the survey methods section currently lacks the requested granularity. The revised version will include explicit details on recruitment channels (stuttering advocacy organizations, social media groups, professional SLP networks, and university clinics), the overall response rate, a demographic table breaking down the 70 participants by age, self-reported stuttering severity, gender, geographic region, and primary language, and the specific steps taken to mitigate self-selection bias (targeted outreach to underrepresented groups and inclusion of both clinical and community-based respondents). We will also add a limitations subsection discussing the sample's representativeness. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation rests on external literature review and independent survey data

full rationale

The paper's chain proceeds from a scoping review of external papers plus a newly collected survey of 70 stakeholders to a taxonomy, divergence identification, and guidelines. No equations, fitted parameters, or predictions are defined within the paper that later reappear as outputs. No self-citation is invoked as a load-bearing uniqueness theorem or ansatz. The work is self-contained against external benchmarks (published literature and fresh stakeholder responses) and does not reduce any central claim to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

This is a scoping review and survey paper with no mathematical derivations, fitted parameters, or invented physical entities. The central claims rest on standard domain assumptions about the completeness of literature searches and the representativeness of stakeholder input.

axioms (2)

domain assumption The scoping review process comprehensively identifies and categorizes relevant papers on stuttered speech.
Invoked to support the claim that current research directions can be reliably mapped and compared to user needs.
domain assumption The survey responses from 70 stakeholders accurately reflect the broader needs and priorities of adults who stutter and speech-language pathologists.
Central to identifying divergences and formulating guidelines; no details on sampling frame or bias mitigation are provided in the abstract.

pith-pipeline@v0.9.0 · 5476 in / 1509 out tokens · 58138 ms · 2026-05-09T23:49:06.261283+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

44 extracted references · 2 canonical work pages · 1 internal anchor

[1]

Aligning Stuttered-Speech Research with End-User Needs: Scoping Review, Survey, and Guidelines

Introduction In recent years, research at the intersection of speech technol- ogy and atypical speech, including stuttering, has increased, with applications of automatic speech recognition (ASR) [1], speech synthesis [2], and event classification [3]. Stuttering is a variable, multidimensional speech disorder that disrupts speech fluency, leading to the ...

work page internal anchor Pith review Pith/arXiv arXiv 2026
[2]

disfluency/dysfluency

Literature Mapping 2.1. Search and Annotation Protocol We conducted a scoping review of research work on stuttered speech processing to characterise the current state of research. To identify relevant papers, we performed a keyword search resulting in680papers published between 2010 and October 2025 across all open access publications indexed by Semantic ...

2010
[3]

what gets researched

Stakeholder Survey 3.1. Survey Methodology One of the main goals of this work is to characterise what stakeholders’ expectations are and identify how stuttered- speech research can better serve stakeholders. We designed two complementary online questionnaires: one targetingPWSand one targetingSLPs. Both questionnaires were co-designed iteratively with two...
[4]

Impatient ASR

Stakeholder Insights This section highlights the needs ofPWSandSLPsfrom stuttered-speech technologies by analysing their responses to the corresponding questionnaires. We include a small num- ber of anonymised, verbatim excerpts from open-text survey re- sponses to illustrate recurring themes and to contextualise the quantitative results. Quotes are label...
[5]

when/where

Alignment Analysis To move beyond a stand-alone literature survey and a stand- alone stakeholder survey, we explicitly analyse alignment be- tween research priorities and stakeholder needs in this section and interpret what these gaps imply for stuttering and research community. 5.1. The Gaps in Alignment Table 2 summarises the main gaps identified in our...
[6]

detection

Moving Forward Our analysis reveals that current research priorities and task formulations only partially overlap with whatPWSandSLPs identify as valuable, and fewer than20%of works report any stakeholder involvement. As a result, many research outputs remain difficult to adapt and insufficiently user-centred, limit- ing their usefulness in real clinical,...
[7]

Generative AI tools Generative artificial intelligence tools were used solely to assist with language editing for clarity of presentation. All research questions, annotations, and interpretations were conceived and carried out by the authors, who take full responsibility for the originality, validity, and integrity of the work
[8]

Inclusive ASR for Disfluent Speech: Cascaded Large- Scale Self-Supervised Learning with Targeted Fine-Tuning and Data Augmentation,

D. Mujtaba, N. R. Mahapatra, M. Arney, J. S. Yaruss, C. Herring, and J. Bin, “Inclusive ASR for Disfluent Speech: Cascaded Large- Scale Self-Supervised Learning with Targeted Fine-Tuning and Data Augmentation,” inInterspeech, 2024, pp. 1275–1279

2024
[9]

Adversarial training for low-resource disfluency correction,

V . Bhat, P. Jyothi, and P. Bhattacharyya, “Adversarial training for low-resource disfluency correction,” inAnnual Meeting of the Association for Computational Linguistics, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:259138602

2023
[10]

Clinical Annotations for Automatic Stuttering Severity Assessment,

A. Valente, R. Marew, H. Toyin, H. Al-Ali, A. Bohnen, I. Becerra, E. Soares, G. Leal, and H. Aldarmaki, “Clinical Annotations for Automatic Stuttering Severity Assessment,” inInterspeech 2025, pp. 4318–4322

2025
[11]

From user perceptions to technical improvement: Enabling people who stutter to better use speech recognition,

C. S. Lea, Z. Huang, J. Narain, L. Tooley, D. Yee, D. T. Tran, P. G. Georgiou, J. P. Bigham, and L. Findlater, “From user perceptions to technical improvement: Enabling people who stutter to better use speech recognition,”Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023. [Online]. Available: https://api.semanticscholar.or...

2023
[12]

Our collective voices: The social and technical values of a grassroots chinese stuttered speech dataset,

J. Li, Q. Li, R. Gong, L. Wang, and S. Wu, “Our collective voices: The social and technical values of a grassroots chinese stuttered speech dataset,”Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency, 2025. [Online]. Available: https://api.semanticscholar.org/CorpusID:279469435

2025
[13]

Lost in transcription: Identifying and quantifying the accuracy biases of automatic speech recognition systems against disfluent speech

D. Mujtaba, N. Mahapatra, M. Arney, J. Yaruss, H. Gerlach- Houck, C. Herring, and J. Bin, “Lost in transcription: Identifying and quantifying the accuracy biases of automatic speech recognition systems against disfluent speech.” Association for Computational Linguistics, Jun. 2024, pp. 4795–4809. [Online]. Available: https://aclanthology.org/2024.naacl-long.269/

2024
[14]

Analysis and tuning of a voice assistant system for dysfluent speech,

V . Mitra, Z. Huang, C. S. Lea, L. Tooley, S. Wu, D. Botten, A. Palekar, S. Thelapurath, P. G. Georgiou, S. S. Kajarekar, and J. Bigham, “Analysis and tuning of a voice assistant system for dysfluent speech,” inInterspeech, 2021. [Online]. Available: https://api.semanticscholar.org/CorpusID:235593228

2021
[15]

J-j-j-just stutter: Benchmark- ing whisper’s performance disparities on different stut- tering patterns,

C. Sridhar and S. Wu, “J-j-j-just stutter: Benchmark- ing whisper’s performance disparities on different stut- tering patterns,”Interspeech, 2025. [Online]. Available: https://api.semanticscholar.org/CorpusID:281308675

2025
[16]

Govern with, not for: Understanding the stuttering community’s preferences and goals for speech ai data governance in the us and china,

J. Li, P. Liu, R. Lietz, N. Tang, N. M. Su, and S. Wu, “Govern with, not for: Understanding the stuttering community’s preferences and goals for speech ai data governance in the us and china,”Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 2025. [Online]. Available: https: //api.semanticscholar.org/CorpusID:282142736

2025
[17]

doi: 10.1016/j.neucom

S. A. Sheikh, M. Sahidullah, F. Hirsch, and S. Ouni, “Machine learning for stuttering identification: Review, challenges and future directions,”Neurocomput., vol. 514, no. C, p. 385–402, Dec. 2022. [Online]. Available: https://doi.org/10.1016/j.neucom. 2022.10.015

work page doi:10.1016/j.neucom 2022
[18]

A comparative study of the techniques for feature extraction and classification in stuttering,

S. Khara, S. Singh, and D. Vir, “A comparative study of the techniques for feature extraction and classification in stuttering,”2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), pp. 887–893, 2018. [Online]. Available: https://api.semanticscholar. org/CorpusID:52899878

2018
[19]

Fluencybank timestamped: An updated data set for disfluency detection and automatic intended speech recognition,

A. Romana, M. Niu, M. Perez, and E. M. Provost, “Fluencybank timestamped: An updated data set for disfluency detection and automatic intended speech recognition,”Journal of Speech, Language, and Hearing Research : JSLHR, vol. 67, pp. 4203 – 4215, 2024. [Online]. Available: https://api.semanticscholar.org/ CorpusID:273200647

2024
[20]

Improved robustness to disfluencies in rnn-transducer based speech recognition,

V . Mendelev, T. Raissi, G. Camporese, and M. Giollo, “Improved robustness to disfluencies in rnn-transducer based speech recognition,”IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6878–6882, 2021. [Online]. Available: https://api.semanticscholar.org/CorpusID: 228376178

2021
[21]

Stuttering detection based on self-attention weights of temporal acoustic vector sequence,

G. Miyahara, T. Kato, and A. Tamura, “Stuttering detection based on self-attention weights of temporal acoustic vector sequence,”Interspeech, 2025. [Online]. Available: https: //api.semanticscholar.org/CorpusID:281393485

2025
[22]

Towards classification of typical and atypical disfluencies: A self supervised represen- tation approach,

P. Kommagouni, P. Khanna, V . Narasinga, A. Bocha, and A. K. Vuppala, “Towards classification of typical and atypical disfluencies: A self supervised represen- tation approach,”Interspeech, 2025. [Online]. Available: https://api.semanticscholar.org/CorpusID:282040462

2025
[23]

Stutter-solver: End-to-end multi-lingual dysfluency detection,

X. Zhou, C. J. Cho, A. Sharma, B. Morin, D. Baquirin, J. M. J. V onk, Z. Ezzes, Z. Miller, B. L. Tee, M. L. Gorno-Tempini, J. Lian, and G. K. Anumanchipalli, “Stutter-solver: End-to-end multi-lingual dysfluency detection,”2024 IEEE Spoken Language Technology Workshop (SLT), pp. 1039–1046, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID...

2024
[24]

Fluent trans- lations from disfluent speech in end-to-end speech trans- lation,

E. Salesky, M. Sperber, and A. H. Waibel, “Fluent trans- lations from disfluent speech in end-to-end speech trans- lation,” inNorth American Chapter of the Association for Computational Linguistics, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:173990664

2019
[25]

Towards fluent translations from disfluent speech,

E. Salesky, S. Burger, J. Niehues, and A. H. Waibel, “Towards fluent translations from disfluent speech,”2018 IEEE Spoken Language Technology Workshop (SLT), pp. 921–926, 2018. [Online]. Available: https://api.semanticscholar.org/CorpusID: 53208312

2018
[26]

Generating fluent translations from disfluent text without access to fluent references: Iit bombay@iwslt2020,

N. Saini, J. Khatri, P. Jyothi, and P. Bhattacharyya, “Generating fluent translations from disfluent text without access to fluent references: Iit bombay@iwslt2020,” inInternational Workshop on Spoken Language Translation, 2020. [Online]. Available: https://api.semanticscholar.org/CorpusID:220058607

2020
[27]

Efficient recognition and classification of stuttered word from speech signal using deep learning technique,

K. Murugan, N. K. Cherukuri, and S. S. Donthu, “Efficient recognition and classification of stuttered word from speech signal using deep learning technique,”2022 IEEE World Conference on Applied Intelligence and Computing (AIC), pp. 774–781, 2022. [Online]. Available: https://api.semanticscholar.org/CorpusID: 251763193

2022
[28]

Speech stuttering detection and removal using deep neural networks,

S. Rajput, R. Nersisson, A. N. J. Raj, A. M. Mekala, O. Frolova, and E. E. Lyakso, “Speech stuttering detection and removal using deep neural networks,”Proceedings of the 11th International Conference on Computer Engineering and Networks, 2021. [Online]. Available: https://api.semanticscholar.org/CorpusID: 244066468

2021
[29]

Automatic speech recognition with stuttering speech removal using long short-term memory (lstm),

“Automatic speech recognition with stuttering speech removal using long short-term memory (lstm),”International Journal of Recent Technology and Engineering, 2020. [Online]. Available: https://api.semanticscholar.org/CorpusID:242465818

2020
[30]

Disco: A large scale human annotated corpus for disfluency correction in indo-european languages,

V . Bhat, P. Jyothi, and P. Bhattacharyya, “Disco: A large scale human annotated corpus for disfluency correction in indo-european languages,” inConference on Empirical Methods in Natural Language Processing, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:264451744

2023
[31]

Demo: Easetalk: An llm-driven speech practice tool for real-life scenarios,

M. Faggiani, M. M. Qirtas, P. Frizelle, F. Ryan, N. Muller, and A. Visentin, “Demo: Easetalk: An llm-driven speech practice tool for real-life scenarios,”2025 IEEE International Conference on Smart Computing (SMARTCOMP), pp. 246–248, 2025. [Online]. Available: https://api.semanticscholar.org/CorpusID:279899214

2025
[32]

Speak in public: an innovative tool for the treatment of stuttering through virtual reality, biosensors, and speech emotion recognition,

F. V ona, F. Pentimalli, F. Catania, A. Patti, and F. Garzotto, “Speak in public: an innovative tool for the treatment of stuttering through virtual reality, biosensors, and speech emotion recognition,”Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:258217035

2023
[33]

Computer-assisted disfluency counts for stuttered speech,

P. A. Heeman, A. McMillin, and J. S. Yaruss, “Computer-assisted disfluency counts for stuttered speech,” inInterspeech 2011. [Online]. Available: https://api.semanticscholar.org/CorpusID: 5361988

2011
[34]

Using clinician annotations to improve automatic speech recognition of stuttered speech,

P. A. Heeman, R. Lunsford, A. McMillin, and J. S. Yaruss, “Using clinician annotations to improve automatic speech recognition of stuttered speech,” inInterspeech 2016. [Online]. Available: https://api.semanticscholar.org/CorpusID:1906213

2016
[35]

Boli: A dataset for understanding stuttering experience and analyzing stuttered speech,

A. Batra, M. narang, N. K. Sharma, and P. K. Das, “Boli: A dataset for understanding stuttering experience and analyzing stuttered speech,”ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–4. [Online]. Available: https: //api.semanticscholar.org/CorpusID:275920628

2025
[36]

KSoF: The kassel state of fluency dataset – a therapy centered dataset of stuttering,

S. Bayerl, A. Wolff von Gudenberg, F. H ¨onig, E. Noeth, and K. Riedhammer, “KSoF: The kassel state of fluency dataset – a therapy centered dataset of stuttering,” inProceedings of the Thirteenth Language Resources and Evaluation Conference, N. Calzolari, F. B ´echet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mari...

2022
[37]

Analysis and Evaluation of Synthetic Data Generation in Speech Dysfluency Detection,

J. Zhang, X. Zhou, J. Lian, S. Li, W. Li, Z. Ezzes, R. Bogley, L. Wauters, Z. Miller, J. V onk, B. Morin, M. Gorno-Tempini, and G. Anumanchipalli, “Analysis and Evaluation of Synthetic Data Generation in Speech Dysfluency Detection,” inInterspeech 2025, pp. 1853–1857

2025
[38]

Fluentnet: End-to- end detection of stuttered speech disfluencies with deep learning,

T. Kourkounakis, A. Hajavi, and A. Etemad, “Fluentnet: End-to- end detection of stuttered speech disfluencies with deep learning,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2986–2999, 2021. [Online]. Available: https://api.semanticscholar.org/CorpusID:237619436

2021
[39]

Semantic parsing of disfluent speech,

P. Sen and I. Groves, “Semantic parsing of disfluent speech,” inConference of the European Chapter of the Association for Computational Linguistics, 2021. [Online]. Available: https://api.semanticscholar.org/CorpusID:233189587

2021
[40]

Anonymization of stuttered speech – removing speaker information while preserving the utterance,

J. Hintz, S. P. Bayerl, Y . Sinha, S. Ghosh, M. Schubert, S. Stober, K. Riedhammer, and I. Siegert, “Anonymization of stuttered speech – removing speaker information while preserving the utterance,”3rd Symposium on Security and Privacy in Speech Communication, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:261852104

2023
[41]

Distilling distributional uncertainty from a gaussian process,

J. H. M. Wong and N. F. Chen, “Distilling distributional uncertainty from a gaussian process,”ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 9956–9960. [Online]. Available: https://api.semanticscholar.org/CorpusID:268567593

2024
[42]

Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling,

T. Kouzelis, G. Paraskevopoulos, A. Katsamanis, and V . Kat- souros, “Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling,” inInterspeech 2023, pp. 1563– 1567

2023
[43]

Investigating ai applications in communication tools for individuals with speech impairments: An in-depth analysis,

S. B. Evangeline and A. D. Moorthy, “Investigating ai applications in communication tools for individuals with speech impairments: An in-depth analysis,”2024 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (IATMSI), vol. 2, pp. 1–6, 2024. [Online]. Available: https://api.semanticscholar.org/...

2024
[44]

Yolo- stutter: End-to-end region-wise speech dysfluency detection,

X. Zhou, A. Kashyap, S. Li, A. Sharma, B. Morin, D. Baquirin, J. M. J. V onk, Z. Ezzes, Z. A. Miller, M. L. Gorno-Tempini, J. Lian, and G. K. Anumanchipalli, “Yolo- stutter: End-to-end region-wise speech dysfluency detection,” Interspeech, vol. 2024, pp. 937–941, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:271974383

2024