ASTRA: A Scalable Next-Generation ATCO Training Simulator with Autonomous Simpilots

Brandon Koh; Caden Toh; Darius Koh; Enjia Wu; Ethan Chew; Galen Tay; Ian Lim; Iruss Eng; Jonathan Koong; Kaleb Nim

arxiv: 2606.18319 · v2 · pith:OQ577YHKnew · submitted 2026-06-16 · 💻 cs.LG · cs.AI· cs.HC· cs.SE

ASTRA: A Scalable Next-Generation ATCO Training Simulator with Autonomous Simpilots

Ethan Chew , Enjia Wu , Iruss Eng , Ian Lim , Ranen Sim , Brandon Koh , Kaleb Nim , Caden Toh

show 6 more authors

Wei Dong Soin Darius Koh Galen Tay Prannaya Gupta Jonathan Koong Yong Zhi Lim

This is my paper

Pith reviewed 2026-06-27 01:02 UTC · model grok-4.3

classification 💻 cs.LG cs.AIcs.HCcs.SE

keywords ATCO trainingautomatic speech recognitionaviation simulatorSingapore accentradiotelephony evaluationAI performance assessmentspeech adaptation

0 comments

The pith

ASTRA adapts speech recognition to cut word error rates on Singaporean aviation speech to 23.45 percent while adding AI scoring of trainee communications.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ASTRA as an end-to-end simulator that automates simpilot roles by transcribing ATCO speech, interpreting instructions, and generating responses with locally adapted voice models. It establishes that a fine-tuned ASR pipeline reduces WER to 23.45 percent on Singaporean-accented aviation speech, far below off-the-shelf rates exceeding 100 percent. The system adds an AI-assisted evaluation framework that scores trainee radiotelephony on accuracy, brevity, and completeness at 91.7, 88.2, and 86.9 percent after optimization. A sympathetic reader would care because the approach promises to scale ATCO training capacity and reduce dependence on scarce human trainers in regional contexts.

Core claim

ASTRA is an end-to-end training simulator that automates simpilot roles through a pipeline that transcribes ATCO speech, interprets instructions, and generates appropriate pilot and ATCO responses using locally adapted voice models. The fine-tuned ASR pipeline reduces WER to 23.45 percent. Beyond traffic simulation, ASTRA incorporates an AI-assisted performance evaluation framework that assesses trainee radiotelephony communications across accuracy, brevity, and completeness, achieving post-optimization scores of 91.7 percent, 88.2 percent, and 86.9 percent.

What carries the argument

The fine-tuned Automatic Speech Recognition pipeline combined with an AI-assisted performance evaluation framework that scores radiotelephony communications on accuracy, brevity, and completeness.

Load-bearing premise

The locally adapted ASR and voice models plus the post-optimization evaluation framework will maintain reported performance when deployed in actual Singaporean operational contexts rather than the development data.

What would settle it

Deploy the full ASTRA system on new live recordings from Singapore ATCO training sessions and measure whether WER remains near 23.45 percent and evaluation scores stay above 86 percent across the three metrics.

read the original abstract

Air Traffic Control Operators (ATCOs) are vital in ensuring the safe, orderly, and efficient flow of air traffic, yet training capacity is constrained by reliance on specialized human trainers known as simpilots, who must role-play both pilots and ATCOs in a simulated airspace. Existing automated solutions rely on Western-centric speech models that perform poorly in Singaporean operational contexts, with off-the-shelf systems exhibiting Word Error Rates (WER) of up to 107.80% on Singaporean-accented aviation speech. We introduce ASTRA, an end-to-end training simulator that automates these simpilot roles through a pipeline that transcribes ATCO speech, interprets instructions, and generates appropriate pilot and ATCO responses using locally adapted voice models. Our fine-tuned Automatic Speech Recognition (ASR) pipeline reduces WER to 23.45%, substantially outperforming existing approaches in this domain. Beyond traffic simulation, ASTRA incorporates an AI-assisted performance evaluation framework that assesses trainee radiotelephony communications across accuracy, brevity, and completeness, achieving post-optimization scores of 91.7%, 88.2%, and 86.9%, respectively. Built on open-source foundations such as DSPy and Unsloth, this approach enables scalable, standardized ATCO assessment while reducing instructor workload.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ASTRA integrates fine-tuned ASR and orchestration for Singapore-accented ATCO training but the reported gains rest on thin experimental reporting with no visible data splits or held-out validation.

read the letter

The paper's core move is taking standard ASR fine-tuning plus a DSPy-style pipeline and pointing it at Singapore aviation radiotelephony, where off-the-shelf models fail badly. That produces a usable end-to-end simpilot simulator plus an automated scorer on accuracy, brevity, and completeness.

What stands out is the domain choice and the concrete numbers: WER down from 107.80% to 23.45%, and post-optimization evaluator scores in the high 80s to low 90s. The authors also ship the work on open-source bases, which helps anyone who wants to replicate the setup.

The soft spot is exactly what the stress-test flags. The abstract gives no dataset size, no train/test split description, no mention of whether the test utterances were held out from the adaptation data, and no statistical comparison to baselines. Without those, the 23.45% WER and the evaluator scores cannot be read as evidence of generalization to live Singapore operations rather than performance on data close to the training distribution. That gap is load-bearing for the main claim.

The work is aimed at applied groups doing speech tech for safety-critical or accent-specific domains, or at aviation training organizations looking for scalable assessment tools. A reader who needs a working prototype description will find the pipeline useful; someone looking for a new method or rigorous benchmark will not.

It deserves peer review. The application is real and the engineering choices are transparent enough that referees can ask for the missing experimental controls without starting from zero.

Referee Report

2 major / 0 minor

Summary. The paper introduces ASTRA, an end-to-end ATCO training simulator that automates simpilot roles via an ASR pipeline for transcribing Singaporean-accented radiotelephony, instruction interpretation, and response generation with locally adapted voice models. It claims the fine-tuned ASR achieves 23.45% WER (vs. 107.80% for off-the-shelf systems) and that an AI-assisted evaluation framework scores trainee communications at 91.7% accuracy, 88.2% brevity, and 86.9% completeness after post-optimization, all built on open-source tools like DSPy and Unsloth.

Significance. If the performance claims are substantiated with proper held-out validation, the work would offer a practical advance in scalable, standardized ATCO training that addresses accent-specific challenges in aviation speech and reduces instructor workload.

major comments (2)

[Abstract] Abstract: the central performance claims (WER reduced to 23.45%, evaluation scores of 91.7/88.2/86.9) are presented without any information on dataset size, train/test splits, whether the test set is held-out from the adaptation data, baseline system details beyond the single off-the-shelf WER figure, or statistical significance; these omissions make the outperformance claim impossible to assess and are load-bearing for the paper's main contribution.
[Abstract] Abstract: the phrase 'post-optimization' for the evaluation framework is undefined; no description is given of the optimization procedure, the data on which scores were computed, or any external validation set from live Singaporean operations, leaving open the possibility that reported numbers reflect in-sample performance rather than generalization.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on the abstract. We agree that the performance claims require additional context to be properly assessed and that the term 'post-optimization' needs definition. We will revise the abstract and supporting sections to address these points directly.

read point-by-point responses

Referee: [Abstract] Abstract: the central performance claims (WER reduced to 23.45%, evaluation scores of 91.7/88.2/86.9) are presented without any information on dataset size, train/test splits, whether the test set is held-out from the adaptation data, baseline system details beyond the single off-the-shelf WER figure, or statistical significance; these omissions make the outperformance claim impossible to assess and are load-bearing for the paper's main contribution.

Authors: We agree that the abstract as currently written does not provide enough information for readers to evaluate the claims. The full manuscript reports dataset details, train/test splits, held-out status, additional baselines, and significance testing in the Experiments section. In the revised version we will add concise statements to the abstract covering dataset size, confirmation that the test set is held-out, reference to multiple baselines, and mention of statistical significance so that the outperformance claim can be assessed from the abstract alone. revision: yes
Referee: [Abstract] Abstract: the phrase 'post-optimization' for the evaluation framework is undefined; no description is given of the optimization procedure, the data on which scores were computed, or any external validation set from live Singaporean operations, leaving open the possibility that reported numbers reflect in-sample performance rather than generalization.

Authors: We will define 'post-optimization' explicitly in the revised abstract as the result of DSPy prompt optimization applied to the evaluator. We will state that the reported scores were obtained on the held-out test portion of our Singaporean aviation speech corpus and briefly describe the optimization procedure. We note that live operational recordings from Singapore ATC are not available to us for external validation due to access and privacy constraints; the held-out test set therefore serves as the primary evidence of generalization within the collected domain. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical results reported directly from fine-tuning

full rationale

The manuscript contains no equations, derivations, or load-bearing self-citations. Performance numbers (WER 23.45%, evaluation scores 91.7/88.2/86.9) are presented as measured outcomes of the described fine-tuning and post-optimization steps on the authors' data. No step reduces a claimed prediction or uniqueness result to a fitted parameter or prior self-citation by construction. The paper is self-contained against external benchmarks in the sense that its claims rest on reported empirical measurements rather than internal redefinitions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, mathematical axioms, or new postulated entities; the system description rests on standard supervised fine-tuning assumptions that are not detailed.

pith-pipeline@v0.9.1-grok · 5813 in / 1083 out tokens · 49557 ms · 2026-06-27T01:02:31.415875+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

63 extracted references · 10 canonical work pages · 1 internal anchor

[1]

2025 , file =

John Kelly , date =. 2025 , file =

2025
[2]

Chris Isidore , date=
[3]

Justin Ong Guang-Xi , date =
[4]

FAA air traffic overtime costs soar as hiring lags, report says , author=
[5]

Hofbauer, Konrad and Petrik, Stefan and Hering, Horst , booktitle=
[6]

arXiv preprint arXiv:2006.10304 , year=

Automatic speech recognition benchmark for air-traffic communications , author=. arXiv preprint arXiv:2006.10304 , year=

work page arXiv 2006
[7]

2022 IEEE Spoken Language Technology Workshop (SLT) , pages=

How does pre-trained wav2vec 2.0 perform on domain-shifted asr? an extensive benchmark on air traffic control communications , author=. 2022 IEEE Spoken Language Technology Workshop (SLT) , pages=. 2023 , organization=

2022
[8]

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) , pages=

CB-Whisper: Contextual biasing Whisper using open-vocabulary keyword-spotting , author=. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) , pages=

2024
[9]

Interspeech , volume=

Adding user feedback to enhance CB-Whisper , author=. Interspeech , volume=
[10]

Brudnicki, Dan and Ethier, Bob and Chastain, Kerri , year=
[11]

2021 , volume=

Lin, Yi and Wu, YuanKai and Guo, Dongyue and Zhang, Pan and Yin, Changyu and Yang, Bo and Zhang, Jianwei , journal=. 2021 , volume=

2021
[12]

Prasad, Amrutha and Zuluaga-Gomez, Juan and Motlicek, Petr and Sarfjoo, Saeed and Nigmatulina, Iuliia and Vesely, Karel , journal=
[13]

Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina , booktitle=
[14]

2023 , publisher=

Zuluaga-Gomez, Juan and Prasad, Amrutha and Nigmatulina, Iuliia and Motlicek, Petr and Kleinert, Matthias , journal=. 2023 , publisher=

2023
[15]

AIAA AVIATION 2023 Forum , pages=

SafeAeroBERT: Towards a safety-informed aerospace-specific language model , author=. AIAA AVIATION 2023 Forum , pages=

2023
[16]

2024 , publisher=

Jiang, Peiyuan and Zeng, Chen and Pan, Weijun and Han, Boyuan and Zhang, Jian , journal=. 2024 , publisher=

2024
[17]

arXiv preprint arXiv:2409.09717 , year=

Andriu. arXiv preprint arXiv:2409.09717 , year=

work page arXiv
[18]

Air Traffic Control – Chapter 5, Section 4: Transfer of Radar Identification , author=. n.d. , url=
[19]

2014 , url=

Target Generation Facility (TGF) Simulation Pilot Operations Guide Fifteenth Edition , author=. 2014 , url=

2014
[20]

Gheorghe Comanici and et al. , year=. 2507.06261 , archivePrefix=

work page internal anchor Pith review Pith/arXiv arXiv
[21]

and Moazam, Hanna and Miller, Heather and Zaharia, Matei and Potts, Christopher , journal=

Khattab, Omar and Singhvi, Arnav and Maheshwari, Paridhi and Zhang, Zhiyuan and Santhanam, Keshav and Vardhamanan, Sri and Haq, Saiful and Sharma, Ashutosh and Joshi, Thomas T. and Moazam, Hanna and Miller, Heather and Zaharia, Matei and Potts, Christopher , journal=
[22]

Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles , year=

Efficient Memory Management for Large Language Model Serving with PagedAttention , author=. Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles , year=
[23]

2023 , organization=

Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya , booktitle=. 2023 , organization=

2023
[24]

Sekoyan, Monica and Koluguri, Nithin Rao and Tadevosyan, Nune and Zelasko, Piotr and Bartley, Travis and Karpov, Nikolay and Balam, Jagadeesh and Ginsburg, Boris , journal=
[25]

He, Yingxu and Liu, Zhuohan and Sun, Shuo and Wang, Bin and Zhang, Wenyu and Zou, Xunlong and Chen, Nancy F and Aw, Ai Ti , journal=
[26]

Wang, Bin and Zou, Xunlong and Sun, Shuo and Zhang, Wenyu and He, Yingxu and Liu, Zhuohan and Wei, Chengwei and Chen, Nancy F and Aw, AiTi , journal=
[27]

2025 , url=

Anonymous , booktitle=. 2025 , url=

2025
[28]

Hoekstra and Patrick Jonk and de Vries , Vincent

van Doorn , Jan and Junzi Sun and J.M. Hoekstra and Patrick Jonk and de Vries , Vincent. 2024 International Conference on Research in Air Transportation (ICRAT) , year=

2024
[29]

2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pages=

Tree-constrained pointer generator for end-to-end contextual speech recognition , author=. 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pages=. 2021 , organization=

2021
[30]

Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition , author=. Proc. Interspeech 2022 , pages=

2022
[31]

IEEE/ACM Transactions on Audio, Speech, and Language Processing , volume=

Minimising biasing word errors for contextual ASR with the tree-constrained pointer generator , author=. IEEE/ACM Transactions on Audio, Speech, and Language Processing , volume=. 2022 , publisher=

2022
[32]

Proceedings of the 20th International Workshop on Multimedia Signal Processing (MMSP) , year=

A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement , author=. Proceedings of the 20th International Workshop on Multimedia Signal Processing (MMSP) , year=
[33]

Can Contextual Biasing Remain Effective with Whisper and GPT-2? , author=. Proc. Interspeech 2023 , pages=

2023
[34]

Improving Synthetic Data Training for Contextual Biasing Models with a Keyword-Aware Cost Function , author=. Proc. Interspeech 2025 , pages=

2025
[35]

2024 7th International Conference on Machine Learning and Natural Language Processing (MLNLP) , pages=

Contextual biasing to improve domain-specific custom vocabulary audio transcription without explicit fine-tuning of whisper model , author=. 2024 7th International Conference on Machine Learning and Natural Language Processing (MLNLP) , pages=. 2024 , organization=

2024
[36]

arXiv preprint arXiv:2211.04054 , year=

Zuluaga-Gomez, Juan and Vesel. arXiv preprint arXiv:2211.04054 , year=

work page arXiv
[37]

2025 , organization=

Wee, Marcus Yu Zhe and Wong, Justin Juin Hng and Lim, Lynus and Tan, Joe Yu Wei and Gupta, Prannaya and Lim, Dillion and Tew, En Hao and Han, Aloysius Keng Siew and Lim, Yong Zhi , booktitle=. 2025 , organization=

2025
[38]

2021 , volume=

Kleinert, Matthias and Helmke, Hartmut and Shetty, Shruthi and Ohneiser, Oliver and Ehr, Heiko and Prasad, Amrutha and Motlicek, Petr and Harfmann, Julia , booktitle=. 2021 , volume=

2021
[39]

Hartmut Helmke, Matthias Kleinert, Oliver Ohneiser , title =
[40]

XTTS: a massively multilingual zero-shot text-to-speech model,

Casanova, Edresson and Davis, Kelly and G. arXiv preprint arXiv:2406.04904 , year=

work page arXiv
[41]

arXiv preprint arXiv:2509.15969 , year=

VoXtream: Full-Stream Text-to-Speech with Extremely Low Latency , author=. arXiv preprint arXiv:2509.15969 , year=

work page arXiv
[42]

2021 , organization=

Kim, Jaehyeon and Kong, Jungil and Son, Juhee , booktitle=. 2021 , organization=

2021
[43]

Advances in neural information processing systems , volume=

Fastspeech: Fast, robust and controllable text to speech , author=. Advances in neural information processing systems , volume=
[44]

2025 , volume=

Ohneiser, Oliver and Ahmed, Umair , journal=. 2025 , volume=

2025
[45]

Journal of Machine Learning Research , volume=

Scaling speech technology to 1,000+ languages , author=. Journal of Machine Learning Research , volume=
[46]

2024 , issue_date =

Zhou, Xuehao and Zhang, Mingyang and Zhou, Yi and Wu, Zhizheng and Li, Haizhou , title =. 2024 , issue_date =. doi:10.1109/TASLP.2024.3363414 , journal =

work page doi:10.1109/taslp.2024.3363414 2024
[47]

Chu, Min and Li, Chun and Peng, Hu and Chang, Eric , year =
[48]

Hu, Edward J and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu and others , journal=
[49]

Hu, Junjie and Xia, Mengzhou and Neubig, Graham and Carbonell, Jaime G , booktitle=
[50]

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing , pages=

Mitigating hallucinations in lm-based tts models via distribution alignment using gflownets , author=. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing , pages=

2025
[51]

Kirkland, Ambika and Mehta, Shivam and Lameris, Harm and Henter, Gustav and Szekely, Eva and Gustafson, Joakim , year =
[52]

2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pages=

Salesky, Elizabeth and M. 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pages=. 2021 , organization=

2021
[53]

2025 , organization=

Wang, Siyin and Yu, Wenyi and Yang, Yudong and Tang, Changli and Li, Yixuan and Zhuang, Jimin and Chen, Xianzhao and Tian, Xiaohai and Zhang, Jun and Sun, Guangzhi and others , booktitle=. 2025 , organization=

2025
[54]

2024 , month=

Nvidia Developer , author=. 2024 , month=

2024
[55]

2506.22023 , archivePrefix=

Bohan Li and Zhihan Li and Haoran Wang and Hanglei Zhang and Yiwei Guo and Hankun Wang and Xie Chen and Kai Yu , year=. 2506.22023 , archivePrefix=

work page arXiv
[56]

Application of rule-based expert system in ATC simulator evaluation system , year=

Wu, Xisheng , booktitle=. Application of rule-based expert system in ATC simulator evaluation system , year=. doi:10.1109/ICVRIS51417.2020.00059 , organization=

work page doi:10.1109/icvris51417.2020.00059 2020
[57]

and Artzi, Yoav , booktitle=

Zhang, Tianyi and Kishore, Varsha and Wu, Felix and Weinberger, Kilian Q. and Artzi, Yoav , booktitle=. 2020 , url=

2020
[58]

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena , doi =

Chiang, Wei-Lin and Gonzalez, Joseph and Li, Dacheng and Zhuohan, Li and Lin, Zi and Sheng, Ying and Stoica, Ion and Wu, Zhanghao and Xing, Eric and Zhang, Hao and Zheng, Lianmin and Zhuang, Siyuan and Zhuang, Yonghao , year =. Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena , doi =
[59]

Ergonomics and Human Factors 2024 , year=

Identifying Human Performance Metrics in Air Traffic Control , author=. Ergonomics and Human Factors 2024 , year=

2024
[60]

Recommendations for next generation air traffic control training , year=

Updegrove, Jessica and Jafer, Shafagh , booktitle=. Recommendations for next generation air traffic control training , year=. doi:10.1109/DASC.2017.8102129 , organization=

work page doi:10.1109/dasc.2017.8102129 2017
[61]

2021 , howpublished =

Silero Team , title =. 2021 , howpublished =

2021
[62]

2023 , howpublished =

SYSTRAN , title =. 2023 , howpublished =

2023
[63]

2020 , howpublished =

Klein, Guillaume , title =. 2020 , howpublished =

2020

[1] [1]

2025 , file =

John Kelly , date =. 2025 , file =

2025

[2] [2]

Chris Isidore , date=

[3] [3]

Justin Ong Guang-Xi , date =

[4] [4]

FAA air traffic overtime costs soar as hiring lags, report says , author=

[5] [5]

Hofbauer, Konrad and Petrik, Stefan and Hering, Horst , booktitle=

[6] [6]

arXiv preprint arXiv:2006.10304 , year=

Automatic speech recognition benchmark for air-traffic communications , author=. arXiv preprint arXiv:2006.10304 , year=

work page arXiv 2006

[7] [7]

2022 IEEE Spoken Language Technology Workshop (SLT) , pages=

How does pre-trained wav2vec 2.0 perform on domain-shifted asr? an extensive benchmark on air traffic control communications , author=. 2022 IEEE Spoken Language Technology Workshop (SLT) , pages=. 2023 , organization=

2022

[8] [8]

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) , pages=

CB-Whisper: Contextual biasing Whisper using open-vocabulary keyword-spotting , author=. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) , pages=

2024

[9] [9]

Interspeech , volume=

Adding user feedback to enhance CB-Whisper , author=. Interspeech , volume=

[10] [10]

Brudnicki, Dan and Ethier, Bob and Chastain, Kerri , year=

[11] [11]

2021 , volume=

Lin, Yi and Wu, YuanKai and Guo, Dongyue and Zhang, Pan and Yin, Changyu and Yang, Bo and Zhang, Jianwei , journal=. 2021 , volume=

2021

[12] [12]

Prasad, Amrutha and Zuluaga-Gomez, Juan and Motlicek, Petr and Sarfjoo, Saeed and Nigmatulina, Iuliia and Vesely, Karel , journal=

[13] [13]

Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina , booktitle=

[14] [14]

2023 , publisher=

Zuluaga-Gomez, Juan and Prasad, Amrutha and Nigmatulina, Iuliia and Motlicek, Petr and Kleinert, Matthias , journal=. 2023 , publisher=

2023

[15] [15]

AIAA AVIATION 2023 Forum , pages=

SafeAeroBERT: Towards a safety-informed aerospace-specific language model , author=. AIAA AVIATION 2023 Forum , pages=

2023

[16] [16]

2024 , publisher=

Jiang, Peiyuan and Zeng, Chen and Pan, Weijun and Han, Boyuan and Zhang, Jian , journal=. 2024 , publisher=

2024

[17] [17]

arXiv preprint arXiv:2409.09717 , year=

Andriu. arXiv preprint arXiv:2409.09717 , year=

work page arXiv

[18] [18]

Air Traffic Control – Chapter 5, Section 4: Transfer of Radar Identification , author=. n.d. , url=

[19] [19]

2014 , url=

Target Generation Facility (TGF) Simulation Pilot Operations Guide Fifteenth Edition , author=. 2014 , url=

2014

[20] [20]

Gheorghe Comanici and et al. , year=. 2507.06261 , archivePrefix=

work page internal anchor Pith review Pith/arXiv arXiv

[21] [21]

and Moazam, Hanna and Miller, Heather and Zaharia, Matei and Potts, Christopher , journal=

Khattab, Omar and Singhvi, Arnav and Maheshwari, Paridhi and Zhang, Zhiyuan and Santhanam, Keshav and Vardhamanan, Sri and Haq, Saiful and Sharma, Ashutosh and Joshi, Thomas T. and Moazam, Hanna and Miller, Heather and Zaharia, Matei and Potts, Christopher , journal=

[22] [22]

Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles , year=

Efficient Memory Management for Large Language Model Serving with PagedAttention , author=. Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles , year=

[23] [23]

2023 , organization=

Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya , booktitle=. 2023 , organization=

2023

[24] [24]

Sekoyan, Monica and Koluguri, Nithin Rao and Tadevosyan, Nune and Zelasko, Piotr and Bartley, Travis and Karpov, Nikolay and Balam, Jagadeesh and Ginsburg, Boris , journal=

[25] [25]

He, Yingxu and Liu, Zhuohan and Sun, Shuo and Wang, Bin and Zhang, Wenyu and Zou, Xunlong and Chen, Nancy F and Aw, Ai Ti , journal=

[26] [26]

Wang, Bin and Zou, Xunlong and Sun, Shuo and Zhang, Wenyu and He, Yingxu and Liu, Zhuohan and Wei, Chengwei and Chen, Nancy F and Aw, AiTi , journal=

[27] [27]

2025 , url=

Anonymous , booktitle=. 2025 , url=

2025

[28] [28]

Hoekstra and Patrick Jonk and de Vries , Vincent

van Doorn , Jan and Junzi Sun and J.M. Hoekstra and Patrick Jonk and de Vries , Vincent. 2024 International Conference on Research in Air Transportation (ICRAT) , year=

2024

[29] [29]

2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pages=

Tree-constrained pointer generator for end-to-end contextual speech recognition , author=. 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pages=. 2021 , organization=

2021

[30] [30]

Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition , author=. Proc. Interspeech 2022 , pages=

2022

[31] [31]

IEEE/ACM Transactions on Audio, Speech, and Language Processing , volume=

Minimising biasing word errors for contextual ASR with the tree-constrained pointer generator , author=. IEEE/ACM Transactions on Audio, Speech, and Language Processing , volume=. 2022 , publisher=

2022

[32] [32]

Proceedings of the 20th International Workshop on Multimedia Signal Processing (MMSP) , year=

A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement , author=. Proceedings of the 20th International Workshop on Multimedia Signal Processing (MMSP) , year=

[33] [33]

Can Contextual Biasing Remain Effective with Whisper and GPT-2? , author=. Proc. Interspeech 2023 , pages=

2023

[34] [34]

Improving Synthetic Data Training for Contextual Biasing Models with a Keyword-Aware Cost Function , author=. Proc. Interspeech 2025 , pages=

2025

[35] [35]

2024 7th International Conference on Machine Learning and Natural Language Processing (MLNLP) , pages=

Contextual biasing to improve domain-specific custom vocabulary audio transcription without explicit fine-tuning of whisper model , author=. 2024 7th International Conference on Machine Learning and Natural Language Processing (MLNLP) , pages=. 2024 , organization=

2024

[36] [36]

arXiv preprint arXiv:2211.04054 , year=

Zuluaga-Gomez, Juan and Vesel. arXiv preprint arXiv:2211.04054 , year=

work page arXiv

[37] [37]

2025 , organization=

Wee, Marcus Yu Zhe and Wong, Justin Juin Hng and Lim, Lynus and Tan, Joe Yu Wei and Gupta, Prannaya and Lim, Dillion and Tew, En Hao and Han, Aloysius Keng Siew and Lim, Yong Zhi , booktitle=. 2025 , organization=

2025

[38] [38]

2021 , volume=

Kleinert, Matthias and Helmke, Hartmut and Shetty, Shruthi and Ohneiser, Oliver and Ehr, Heiko and Prasad, Amrutha and Motlicek, Petr and Harfmann, Julia , booktitle=. 2021 , volume=

2021

[39] [39]

Hartmut Helmke, Matthias Kleinert, Oliver Ohneiser , title =

[40] [40]

XTTS: a massively multilingual zero-shot text-to-speech model,

Casanova, Edresson and Davis, Kelly and G. arXiv preprint arXiv:2406.04904 , year=

work page arXiv

[41] [41]

arXiv preprint arXiv:2509.15969 , year=

VoXtream: Full-Stream Text-to-Speech with Extremely Low Latency , author=. arXiv preprint arXiv:2509.15969 , year=

work page arXiv

[42] [42]

2021 , organization=

Kim, Jaehyeon and Kong, Jungil and Son, Juhee , booktitle=. 2021 , organization=

2021

[43] [43]

Advances in neural information processing systems , volume=

Fastspeech: Fast, robust and controllable text to speech , author=. Advances in neural information processing systems , volume=

[44] [44]

2025 , volume=

Ohneiser, Oliver and Ahmed, Umair , journal=. 2025 , volume=

2025

[45] [45]

Journal of Machine Learning Research , volume=

Scaling speech technology to 1,000+ languages , author=. Journal of Machine Learning Research , volume=

[46] [46]

2024 , issue_date =

Zhou, Xuehao and Zhang, Mingyang and Zhou, Yi and Wu, Zhizheng and Li, Haizhou , title =. 2024 , issue_date =. doi:10.1109/TASLP.2024.3363414 , journal =

work page doi:10.1109/taslp.2024.3363414 2024

[47] [47]

Chu, Min and Li, Chun and Peng, Hu and Chang, Eric , year =

[48] [48]

Hu, Edward J and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu and others , journal=

[49] [49]

Hu, Junjie and Xia, Mengzhou and Neubig, Graham and Carbonell, Jaime G , booktitle=

[50] [50]

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing , pages=

Mitigating hallucinations in lm-based tts models via distribution alignment using gflownets , author=. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing , pages=

2025

[51] [51]

Kirkland, Ambika and Mehta, Shivam and Lameris, Harm and Henter, Gustav and Szekely, Eva and Gustafson, Joakim , year =

[52] [52]

2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pages=

Salesky, Elizabeth and M. 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pages=. 2021 , organization=

2021

[53] [53]

2025 , organization=

Wang, Siyin and Yu, Wenyi and Yang, Yudong and Tang, Changli and Li, Yixuan and Zhuang, Jimin and Chen, Xianzhao and Tian, Xiaohai and Zhang, Jun and Sun, Guangzhi and others , booktitle=. 2025 , organization=

2025

[54] [54]

2024 , month=

Nvidia Developer , author=. 2024 , month=

2024

[55] [55]

2506.22023 , archivePrefix=

Bohan Li and Zhihan Li and Haoran Wang and Hanglei Zhang and Yiwei Guo and Hankun Wang and Xie Chen and Kai Yu , year=. 2506.22023 , archivePrefix=

work page arXiv

[56] [56]

Application of rule-based expert system in ATC simulator evaluation system , year=

Wu, Xisheng , booktitle=. Application of rule-based expert system in ATC simulator evaluation system , year=. doi:10.1109/ICVRIS51417.2020.00059 , organization=

work page doi:10.1109/icvris51417.2020.00059 2020

[57] [57]

and Artzi, Yoav , booktitle=

Zhang, Tianyi and Kishore, Varsha and Wu, Felix and Weinberger, Kilian Q. and Artzi, Yoav , booktitle=. 2020 , url=

2020

[58] [58]

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena , doi =

Chiang, Wei-Lin and Gonzalez, Joseph and Li, Dacheng and Zhuohan, Li and Lin, Zi and Sheng, Ying and Stoica, Ion and Wu, Zhanghao and Xing, Eric and Zhang, Hao and Zheng, Lianmin and Zhuang, Siyuan and Zhuang, Yonghao , year =. Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena , doi =

[59] [59]

Ergonomics and Human Factors 2024 , year=

Identifying Human Performance Metrics in Air Traffic Control , author=. Ergonomics and Human Factors 2024 , year=

2024

[60] [60]

Recommendations for next generation air traffic control training , year=

Updegrove, Jessica and Jafer, Shafagh , booktitle=. Recommendations for next generation air traffic control training , year=. doi:10.1109/DASC.2017.8102129 , organization=

work page doi:10.1109/dasc.2017.8102129 2017

[61] [61]

2021 , howpublished =

Silero Team , title =. 2021 , howpublished =

2021

[62] [62]

2023 , howpublished =

SYSTRAN , title =. 2023 , howpublished =

2023

[63] [63]

2020 , howpublished =

Klein, Guillaume , title =. 2020 , howpublished =

2020