LLMs in the Real World: Evaluating "AI" in Emergency Contexts

Lara Downing; Micha Elsner; Sara Court

arXiv: 2607.00019 · v1 · pith:M6G6S3RU · submitted 2026-05-29 · 💻 cs.CY · cs.AI

Pith reviewed 2026-07-02 22:44 UTC · model grok-4.3

keywords: LLM deployment · emergency services · machine translation · public communication · AI misconceptions · text-to-911 · real-world evaluation · best practices

The pith

Researchers should explain their findings on simple AI uses to the public to avoid misconceptions in emergency services.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper urges AI researchers to take more responsibility for communicating results beyond academic circles. It presents a case study of an LLM-based translation system for texting 911 operators across 55 languages to demonstrate how such tools are often misunderstood in practice. The authors highlight several common misconceptions and end with specific recommendations for everyone involved in building and rolling out these systems. They contend that attention tends to go to difficult technical challenges while everyday deployment issues receive less scrutiny.

Core claim

While scientific progress often centers on solving the hard problems, it is often the easy ones—problems for which the latest technology is often unnecessary—that are most overlooked, as shown by misconceptions surrounding the initial deployment of an LLM-based text-2-911 system in 55 languages.

What carries the argument

The case study of the initial deployment of an LLM-based machine translation application for a text-2-911 system, used to surface common misconceptions about such technologies in emergency contexts.

If this is right

Stakeholders across the development and deployment pipeline should adopt the recommended best practices to reduce risks in emergency AI applications.
Greater public articulation of research findings can correct misunderstandings before they affect real emergency responses.
Shifting focus to overlooked simple problems can improve the safety of AI tools in critical settings where advanced capabilities are not required.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar communication gaps may affect AI use in other high-stakes areas such as medical advice or legal aid for non-native speakers.
Developers could test whether basic rule-based translation suffices for many emergency phrases before introducing LLMs.
Public education efforts might reduce over-reliance on machine translation when human operators remain available.

Load-bearing premise

The case study of the text-2-911 deployment accurately identifies the misconceptions people hold about these technologies.

What would settle it

A survey or log analysis showing that the misconceptions described do not appear among actual users or operators of similar emergency translation systems would undermine the argument.

Original abstract

This paper offers a call to action. We urge our colleagues in the research community to play a greater role in the articulation of our findings to the public. To illustrate the stakes we present a case study on the initial stages of an LLM-based machine translation application's deployment in a real-world context: a text-2-911 system advertising capabilities in 55 languages for use in emergencies in which it may be difficult to call operators directly. We identify a number of common misconceptions about technologies such as these, concluding with a set of concrete recommendations and best practices for stakeholders at every stage of the development and deployment pipeline. While the advancement of scientific research often lies in solving the "hard" problems, we argue it is often the "easy" ones -- problems for which the latest technology is often unnecessary -- that are most overlooked.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Position paper urging better public communication of AI limits in emergencies, illustrated by a text-2-911 case study but thin on any supporting details.

The paper's main point is that AI researchers should explain limitations more clearly to the public, using a case study of an LLM translation tool for texting 911 in 55 languages to show how misconceptions arise in emergency settings. It argues that overlooked 'easy' problems in deployment often matter more than chasing hard technical advances.

What it does reasonably is flag real risks when hype meets high-stakes use like emergency services, where language barriers can be life-critical. The recommendations for stakeholders across the pipeline are concrete and practical, which gives the piece some utility for practitioners.

The weakness is that the case study is only referenced, not shown. No specifics appear on what misconceptions were observed, how the deployment played out, or any evidence collected. The argument therefore rests on the authors' unelaborated observations rather than documented findings. This makes the central claim about easy problems being overlooked feel more asserted than demonstrated.

The work is aimed at the AI ethics and responsible deployment community rather than core technical researchers. Someone working on public-facing AI systems or policy might pick up useful framing or discussion points, but it adds no new framework or data.

It is worth sending for peer review as a position piece so the recommendations can be stress-tested, though the lack of evidence for the case study will likely draw comments.

Referee Report

1 major / 0 minor

Summary. This position paper issues a call to action for the research community to better communicate findings to the public. It illustrates the stakes via a case study of the initial deployment of an LLM-based text-2-911 translation system advertised in 55 languages for emergency use, identifies common misconceptions about such technologies, and offers concrete recommendations and best practices. The authors contend that 'easy' problems (those not requiring the latest technology) are frequently overlooked compared to 'hard' problems.

Significance. If the case-study observations are representative, the paper usefully draws attention to risks of deploying machine translation in life-critical emergency contexts and the value of researcher involvement in public articulation of limitations. As a qualitative call to action rather than an empirical study, its contribution lies in framing and recommendations rather than new data or formal results.

major comments (1)

[Abstract] Abstract and case-study description: the manuscript references a case study of an LLM text-2-911 deployment and states that the authors 'identify a number of common misconceptions,' yet it provides no specific observations, error examples, deployment metrics, user reports, or methodological details to ground those misconceptions. This absence is load-bearing for the central claim that the case study illustrates overlooked 'easy' problems.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their review and for highlighting an opportunity to strengthen the abstract's grounding of the case study. We address the comment below.

Referee: [Abstract] Abstract and case-study description: the manuscript references a case study of an LLM text-2-911 deployment and states that the authors 'identify a number of common misconceptions,' yet it provides no specific observations, error examples, deployment metrics, user reports, or methodological details to ground those misconceptions. This absence is load-bearing for the central claim that the case study illustrates overlooked 'easy' problems.

Authors: We agree that the abstract would be improved by briefly referencing one concrete observation from the case study to illustrate the misconceptions. In revision we will add a short clause noting, for example, the observed mismatch between advertised 55-language coverage and reliable performance on time-critical emergency queries. The body of the paper already contains the qualitative description of the deployment and the specific misconceptions encountered; the abstract change will make this connection explicit without converting the position paper into an empirical report.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; qualitative position paper with no derivations or fitted claims

The paper is explicitly a call to action and qualitative position piece that presents opinions and a descriptive case study on misconceptions around an LLM text-2-911 deployment. It contains no equations, parameters, predictions, uniqueness theorems, or technical derivations that could reduce to their own inputs by construction. No self-citations are invoked as load-bearing premises for any formal result. The central argument is presented as interpretive framing rather than a derived claim, making the work self-contained against external benchmarks with no circular steps present.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is a position piece without technical derivations, free parameters, or invented entities. It rests on domain assumptions about researcher responsibilities and the representativeness of the case study.

axioms (1)

Domain assumption: Researchers in AI have an obligation to articulate their findings and limitations to the public.
This underpins the entire call to action in the abstract.

pith-pipeline@v0.9.1-grok · 5666 in / 1197 out tokens · 40821 ms · 2026-07-02T22:44:26.653763+00:00 · methodology

Reference graph

Works this paper leans on

82 extracted references · 34 canonical work pages · 1 internal anchor

[1]

Mohamed Abdalla, Jan Philip Wahle, Terry Ruas, Aur \'e lie N \'e v \'e ol, Fanny Ducel, Saif Mohammad, and Karen Fort. 2023. https://doi.org/10.18653/v1/2023.acl-long.734 The elephant in the room: Analyzing the presence of big tech in natural language processing research . In Proceedings of the 61st Annual Meeting of the Association for Computational Ling...

work page doi:10.18653/v1/2023.acl-long.734 2023
[2]

Ali Al-Laith and Rachida Kebdani. 2025. https://aclanthology.org/2025.wacl-1.8.pdf E valuating Calibration of Arabic Pre-trained Language Models on Dialectal Text . In Proceedings of the 4th Workshop on Arabic Corpus Linguistics (WACL-4) , pages 68--76

2025
[3]

Anyaegbuna, N

C. Anyaegbuna, N. Steele, A. S. Liang, S. P. Ma, I. Lopez, N. Chilukuri, K. Patel, K. Schulman, and J. H. Chen. 2026. https://doi.org/10.1136/bmjhci-2025-102007 Artificial intelligence translation in healthcare: an urgent call for evidence-informed policy frameworks . BMJ Health Care Informatics, 33(1):e102007

work page doi:10.1136/bmjhci-2025-102007 2026
[4]

Seth Aycock, David Stap, Di Wu, Christof Monz, and Khalil Sima'an. 2025. https://openreview.net/forum?id=aMBSY2ebPw Can LLM s Really Learn to Translate a Low-Resource Language from One Grammar Book ? In The Thirteenth International Conference on Learning Representations

2025
[5]

Benjamin

R. Benjamin. 2019. https://books.google.com/books?id=G6-hDwAAQBAJ Race After Technology: Abolitionist Tools for the New Jim Code . Polity Press

2019
[7]

Richard A Berk. 2021. https://doi.org/10.1146/annurev-criminol-051520-012342 Artificial I ntelligence, P redictive P olicing, and R isk A ssessment for L aw E nforcement . Annual Review of Criminology, 4(1):209--237

work page doi:10.1146/annurev-criminol-051520-012342 2021
[8]

Johana Bhuiyan. 2023. https://www.theguardian.com/us-news/2023/sep/07/ai-translation-app-asylum-application Lost in AI translation: Growing reliance on language apps jeopardizes some asylum applications . The Guardian

2023
[9]

Su Lin Blodgett, Solon Barocas, Hal Daumé III, and Hanna Wallach. 2020. https://doi.org/10.48550/arXiv.2005.14050 Language ( Technology) is Power: A Critical Survey of “Bias” in NLP . (arXiv:2005.14050). ArXiv:2005.14050 [cs]

work page doi:10.48550/arxiv.2005.14050 2020
[10]

Anna Burns. 2025. https://mapleridgenews.com/2025/10/02/surrey-police-shooting-death-prompts-calls-for-interpreter-access/ Surrey police shooting death prompts calls for interpreter access . Maple Ridge News

2025
[11]

Greta Byrum and Ruha Benjamin. 2022. https://doi.org/10.48558/9SEV-4D26 Disrupting the Gospel of Tech Solutionism to Build Tech Justice . Stanford Social Innovation Review

work page doi:10.48558/9sev-4d26 2022
[12]

CalMatters. 2025. https://calmatters.org/justice/2025/07/ice-detention-deaf-asylum-seeker/ Deaf Mongolian Immigrant Held by ICE in California for 4 Months with No Access to I nterpreter

2025
[13]

Center for Democracy & Technology . 2025 a . https://cdt.org/insights/content-moderation-in-the-global-south-a-comparative-study-of-four-low-resource-languages/ Content Moderation in the Global South: A Comparative Study of Four Low-Resource Languages

2025
[14]

Center for Democracy & Technology . 2025 b . https://cdt.org/wp-content/uploads/2025/09/2025-09-22-Humans-in-the-Loop-CDT-Civic-Tech-report-final.pdf Humans in the loop . Civic tech report, Center for Democracy & Technology

2025
[15]

Central Ohio Hospital Council , Columbus Public Health , and Franklin County Public Health . 2025. https://centralohiohospitals.org/wp-content/uploads/2025/06/HM2025.FINAL2_.pdf Franklin County HealthMap2025: Community Health Needs Assessment

2025
[16]

Amit Choudhari, Sylvain Guilley, and Khaled Karray. 2021. https://doi.org/10.1109/NICS54270.2021.9701469 Cryscanner: Finding cryptographic libraries misuse . In 2021 8th NAFOSTED Conference on Information and Computer Science (NICS), pages 230--235

work page doi:10.1109/nics54270.2021.9701469 2021
[17]

Colorado General Assembly . 2024. https://leg.colorado.gov/bills/sb24-205 Concerning consumer protections in interactions with artificial intelligence systems . Signed into law May 17, 2024; effective February 1, 2026. Codified at Colo.\ Rev.\ Stat.\ 6-1-1701 et seq

2024
[18]

Ângela Costa, Wang Ling, Tiago Luís, Rui Correia, and Luísa Coheur. 2015. https://doi.org/10.1007/s10590-015-9169-0 A linguistically motivated taxonomy for machine translation error analysis . Mach. Transl., 29(2):127--161

work page doi:10.1007/s10590-015-9169-0 2015
[19]

Sara Court and Micha Elsner. 2024. https://doi.org/10.18653/v1/2024.wmt-1.125 Shortcomings of LLM s for low-resource translation: Retrieval and understanding are both the problem . In Proceedings of the Ninth Conference on Machine Translation, pages 1332--1354, Miami, Florida, USA. Association for Computational Linguistics

work page doi:10.18653/v1/2024.wmt-1.125 2024
[20]

William De Brugger. 2023. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/ Chat GPT sets record for fastest growing user base: Analyst note . Accessed October 4, 2025

2023
[21]

Andrew Deck. 2023. https://restofworld.org/2023/ai-translation-errors-afghan-refugees-asylum/ AI Translation Is Jeopardizing Afghan Asylum Claims . Rest of World

2023
[22]

Ameet Deshpande, Tanmay Rajpurohit, Karthik Narasimhan, and Ashwin Kalyan. 2023. https://doi.org/10.18653/v1/2023.nllp-1.1 Anthropomorphization of AI : Opportunities and risks . In Proceedings of the Natural Legal Language Processing Workshop 2023, pages 1--7, Singapore. Association for Computational Linguistics

work page doi:10.18653/v1/2023.nllp-1.1 2023
[23]

K. N. Dew, A. M. Turner, Y. K. Choi, A. Bosold, and K. Kirchhoff. 2018. https://doi.org/10.1016/j.jbi.2018.07.018 Development of machine translation technology for assisting health communication: A systematic review . Journal of Biomedical Informatics, 85:56--67

work page doi:10.1016/j.jbi.2018.07.018 2018
[24]

Lelia Erscoi, Annelies Véronique Kleinherenbrink, and Olivia Guest. 2023. https://doi.org/10.31235/osf.io/jqxb6 Pygmalion displacement: When humanising AI dehumanises women

work page doi:10.31235/osf.io/jqxb6 2023
[25]

Virginia Eubanks. 2018. Automating inequality: How high-tech tools profile, police, and punish the poor. St. Martin's Press

2018
[26]

European Parliament and Council of the European Union . 2024. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=OJ:L_202401689 Regulation ( EU ) 2024/1689 of the European Parliament and of the Council of 13 june 2024 laying down harmonised rules on artificial intelligence (artificial intelligence act)

2024
[28]

Franklin County Board of Commissioners . 2019. Residents can now text-to-911 in an emergency. Press release. Available at: https://www.franklincountyohio.gov/files/assets/public/v/1/emergency-management/documents/text-911-news-release.pdf (accessed [11/20/2025])

2019
[29]

Markus Freitag, George Foster, David Grangier, Viresh Ratnakar, Qijun Tan, and Wolfgang Macherey. 2021. https://doi.org/10.1162/tacl_a_00437 Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation . Transactions of the Association for Computational Linguistics, 9:1460--1474

work page doi:10.1162/tacl_a_00437 2021
[30]

Markus Freitag, Nitika Mathur, Daniel Deutsch, Chi-Kiu Lo, Eleftherios Avramidis, Ricardo Rei, Brian Thompson, Frederic Blain, Tom Kocmi, Jiayi Wang, David Ifeoluwa Adelani, Marianna Buchicchio, Chrysoula Zerva, and Alon Lavie. 2024. https://doi.org/10.18653/v1/2024.wmt-1.2 Are LLM s breaking MT metrics? results of the WMT 24 metrics shared task . In Proc...

work page doi:10.18653/v1/2024.wmt-1.2 2024
[31]

Valentin Hofmann, Pratyusha Ria Kalluri, Dan Jurafsky, and Sharese King. 2024. https://doi.org/10.1038/s41586-024-07856-5 AI generates covertly racist decisions about people based on their dialect . Nature, 633(8028):147--154. Epub 2024 Aug 28

work page doi:10.1038/s41586-024-07856-5 2024
[32]

Jess Hohenstein and Malte Jung. 2020. https://doi.org/10.1016/j.chb.2019.106190 AI as a moral crumple zone: The effects of AI -mediated communication on attribution and trust . 106:106190

work page doi:10.1016/j.chb.2019.106190 2020
[33]

Anne H Charity Hudley, Christine Mallinson, and Mary Bucholtz. 2024. Decolonizing linguistics. Oxford University Press

2024
[34]

International Association of Privacy Professionals . 2025. https://iapp.org/news/a/italy-s-dpa-reaffirms-ban-on-replika-over-ai-and-children-s-privacy-concerns Italy's DPA reaffirms ban on Replika over AI and children's privacy concerns

2025
[35]

Marie-Odile Junker. 2024. https://aclanthology.org/2024.computel-1.8/ Data-mining and extraction: the gold rush of AI on I ndigenous languages . In Proceedings of the Seventh Workshop on the Use of Computational Methods in the Study of Endangered Languages, pages 52--57, St. Julians, Malta. Association for Computational Linguistics

2024
[36]

Cecilia Kang. 2025. https://www.nytimes.com/2025/03/24/technology/trump-ai-regulation.html Trump Unveils Plan to Overhaul A.I. Regulation . The New York Times. Accessed: 2025-11-16

2025
[37]

Kapur, Michael Pecht, and Andrew P

Kailash C. Kapur, Michael Pecht, and Andrew P. Sage. 2014. Reliability engineering. Wiley

2014
[38]

Antonia Karamolegkou, Sandrine Schiller Hansen, Ariadni Christopoulou, Filippos Stamatiou, Anne Lauscher, and Anders S gaard. 2025. https://doi.org/10.18653/v1/2025.naacl-long.580 Ethical concern identification in NLP : A corpus of ACL A nthology ethics statements . In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Associ...

work page doi:10.18653/v1/2025.naacl-long.580 2025
[39]

Aliah Keller. 2025. https://spectrumnews1.com/oh/columbus/news/2025/06/04/columbus-police-break-language-barriers- Columbus police break language barriers in emergencies with new tools . Spectrum News 1. Published 5:02 AM ET

2025
[40]

Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, and Yulia Tsvetkov. 2023. https://aclanthology.org/2023.eacl-main.241/ Language generation models can cause harm: So what can we do about it? an actionable survey . In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, page...

2023
[41]

Jordan Laird. 2025. https://www.dispatch.com/story/news/local/2025/04/23/columbus-911-text-translation-facetime-video-update/83229379007/ Columbus upgrades 911 system with text translation in 55 languages, 'one-way facetime' . The Columbus Dispatch

work page arXiv 2025
[42]

Richard N Landers and Tara S Behrend. 2023. https://doi.org/10.1037/amp0000972 Auditing the AI auditors: A framework for evaluating fairness and bias in high stakes AI predictive models. American Psychologist, 78(1):36

work page doi:10.1037/amp0000972 2023
[43]

David Lazar, Haogang Chen, Xi Wang, and Nickolai Zeldovich. 2014. https://doi.org/10.1145/2637166.2637237 Why does cryptographic software fail? a case study and open problems . In Proceedings of 5th Asia-Pacific Workshop on Systems, APSys '14, New York, NY, USA. Association for Computing Machinery

work page doi:10.1145/2637166.2637237 2014
[44]

Karim Lekadir, Alejandro F Frangi, Antonio R Porras, Ben Glocker, Celia Cintas, Curtis P Langlotz, Eva Weicken, Folkert W Asselbergs, Fred Prior, Gary S Collins, and 1 others. 2025. Future-ai: international consensus guideline for trustworthy and deployable artificial intelligence in healthcare. bmj, 388

2025
[45]

Bryan Li, Jiaming Luo, Eleftheria Briakou, and Colin Cherry. 2025. https://doi.org/10.18653/v1/2025.knowledgenlp-1.7 Leveraging domain knowledge at inference time for LLM translation: Retrieval versus generation . In Proceedings of the 4th International Workshop on Knowledge-Augmented Methods for Natural Language Processing, pages 91--106, Albuquerque, Ne...

work page doi:10.18653/v1/2025.knowledgenlp-1.7 2025
[46]

https://www.lsadc.org/linguistics_language_and_the_public_award Linguistics, Language, and the Public Award

Linguistic Society of America . https://www.lsadc.org/linguistics_language_and_the_public_award Linguistics, Language, and the Public Award
[47]

Lopez, D

I. Lopez, D. E. Velasquez, J. H. Chen, and J. A. Rodriguez. 2025. https://doi.org/10.1038/s41746-025-01944-0 Operationalizing machine-assisted translation in healthcare . npj Digital Medicine, 8(1):584

work page doi:10.1038/s41746-025-01944-0 2025
[48]

Elisabeth Mahase. 2023. Babylon looks to sell gp at hand and other uk business amid financial issues. BMJ: British Medical Journal (Online), 382:p1835

2023
[49]

Kyle Mahowald, Anna A Ivanova, Idan A Blank, Nancy Kanwisher, Joshua B Tenenbaum, and Evelina Fedorenko. 2024. https://www.evlab.mit.edu/s/Mahowald_Ivanova_et_al_2024_TiCS.pdf Dissociating language and thought in large language models . Trends in cognitive sciences, 28(6):517--540

2024
[50]

Jonibek Mansurov, Akhmed Sakip, and Alham Fikri Aji. 2025. https://doi.org/10.18653/v1/2025.acl-long.407 Data laundering: Artificially boosting benchmark results through knowledge distillation . In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8332--8345, Vienna, Austria. Association...

work page doi:10.18653/v1/2025.acl-long.407 2025
[51]

Nikita Mehandru, Sweta Agrawal, Yimin Xiao, Ge Gao, Elaine Khoong, Marine Carpuat, and Niloufar Salehi. 2023. https://doi.org/10.18653/v1/2023.emnlp-main.712 Physician detection of clinical harm in machine translation: Quality estimation aids in reliance and backtranslation identifies critical errors . In Proceedings of the 2023 Conference on Empirical Me...

work page doi:10.18653/v1/2023.emnlp-main.712 2023
[52]

Timothee Mickus, Elaine Zosa, Raul Vazquez, Teemu Vahtola, J \"o rg Tiedemann, Vincent Segonne, Alessandro Raganato, and Marianna Apidianaki. 2024. https://doi.org/10.18653/v1/2024.semeval-1.273 S em E val-2024 task 6: SHROOM , a shared-task on hallucinations and related observable overgeneration mistakes . In Proceedings of the 18th International Worksho...

work page doi:10.18653/v1/2024.semeval-1.273 2024
[53]

Venkatesh Mishra, Bimsara Pathiraja, Mihir Parmar, Sat Chidananda, Jayanth Srinivasa, Gaowen Liu, Ali Payani, and Chitta Baral. 2025. Investigating the Shortcomings of LLM s in Step-by-Step Legal Reasoning . arXiv preprint arXiv:2502.05675

work page arXiv 2025
[54]

Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. Model cards for model reporting. In Proceedings of the conference on fairness, accountability, and transparency, pages 220--229

2019
[55]

Melanie Mitchell. 2024. https://doi.org/10.1126/science.adt6140 The metaphors of artificial intelligence . Science, 386(6723):eadt6140

work page doi:10.1126/science.adt6140 2024
[56]

Sabrina Moreno. 2021. https://richmond.com/news/local/virginia-uses-google-translate-for-covid-vaccine-information-heres-how-that-magnifies-language-barriers-misinformation/article_715cb81a-d880-5c98-aac5-6b30b378bbd3.html Virginia Uses Google Translate for COVID Vaccine Information. Here’s How That Magnifies Language Barriers, Misinformation . Richmond T...

2021
[57]

Evgeny Morozov. 2013. To save everything, click here: The folly of technological solutionism. Public Affairs

2013
[58]

Denis Moser, Nikola Stanic, and Murat Sariyar. 2025. https://doi.org/10.1093/jamiaopen/ooaf147 Benchmarking speech-to-text robustness in noisy emergency medical dialogues: an evaluation of models under realistic acoustic conditions . JAMIA Open, 8(6):ooaf147

work page doi:10.1093/jamiaopen/ooaf147 2025
[59]

National Immigrant Women’s Advocacy Project (NiWAP) and American University Washington College of Law . 2013. https://niwaplibrary.wcl.american.edu/wp-content/uploads/IMM-Qref-LangAccessUVisaCollaboration.pdf Immigrant and limited english proficient victims’ access to the criminal justice system: The importance of collaboration . Technical report, America...

2013
[60]

National Institute of Standards and Technology . 2023. https://doi.org/10.6028/NIST.AI.100-1 AI risk management framework ( AI RMF 1.0) . Technical Report NIST AI 100-1, National Institute of Standards and Technology, Gaithersburg, MD

work page doi:10.6028/nist.ai.100-1 2023
[61]

National Institute of Standards and Technology . 2026. https://www.nist.gov/programs-projects/concept-note-ai-rmf-profile-trustworthy-ai-critical-infrastructure Profile on trustworthy AI in critical infrastructure . Technical report, National Institute of Standards and Technology, Gaithersburg, MD. Details forthcoming at time of writing

2026
[62]

Von Nessen

Joseph C. Von Nessen. 2025. https://www.odvn.org/wp-content/uploads/2025/02/19Feb_EconImpact_release.pdf The Economic Impact of Intimate Partner Violence in Ohio . Report commissioned by Ohio Domestic Violence Network, released Feb. 24, 2025

2025
[63]

Elizabeth Nielsen, Isaac Rayburn Caswell, Jiaming Luo, and Colin Cherry. 2025. https://aclanthology.org/2025.naacl-short.18/ Alligators all around: Mitigating lexical confusion in low-resource machine translation . In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language ...

2025
[64]

Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, and 1 others. 2022. https://papers.neurips.cc/paper_files/paper/2022/file/b1efde53be364a73914f58805a001731-Paper-Conference.pdf Training language models to follow instructions with human feedback . Advances in neur...

2022
[65]

Tekendra Parmar. 2025. https://www.motherjones.com/criminal-justice/2025/08/axon-police-ai-draft-one-foia/ Axon’s Draft One Is Designed to Defy Transparency . Mother Jones. Accessed: 2025‑10‑20

2025
[66]

Sofia Quaglia. 2022. https://slate.com/technology/2022/09/machine-translation-accuracy-government-danger.html Death by machine translation? Slate. Archived at https://perma.cc/6RD2-3TY3

2022
[67]

Kevin Roose. 2023. https://www.nytimes.com/2023/02/16/technology/bing-chatbot-transcript.html Bing’s A.I. Chat Reveals Its Feelings: ‘I Want to Be Alive. ’ . The New York Times. Accessed: 2025‑10‑19

2023
[68]

SAFE-AI Task Force . 2024. https://safeaitf.org/wp-content/uploads/2024/07/SAFE-AI-Guidance-07-01-24.pdf Interpreting safe AI task force guidance: AI and interpreting services . Technical report, Stakeholders Advocating for Fair and Ethical AI in Interpreting. Version dated July 1, 2024

2024
[69]

SAFE AI Task Force and CoSET . 2025. https://safeaitf.org/wp-content/uploads/2025/09/AI-Interpreting-Solutions-Evaluation-Toolkit_Part-A.pdf AI Interpreting Solutions Evaluation Toolkit, Part A: Organization, Implementation and Management . Technical report, SAFE AI Task Force and the Coalition for Sign Language Equity in Technology (CoSET)

2025
[70]

Thomas W Sanchez, Marc Brenman, and Xinyue Ye. 2025. The ethical concerns of artificial intelligence in urban planning. Journal of the American Planning Association, 91(2):294--307

2025
[71]

Danielle Saunders. 2022. Domain adaptation and multi-domain adaptation for neural machine translation: A survey. Journal of Artificial Intelligence Research, 75:351--424

2022
[72]

Forcada, Miquel Espl \`a -Gomis, and Lucia Specia

Scarton Scarton, Mikel L. Forcada, Miquel Espl \`a -Gomis, and Lucia Specia. 2019. https://aclanthology.org/2019.iwslt-1.23/ Estimating post-editing effort: a study on human judgements, task-based and reference-based metrics of MT quality . In Proceedings of the 16th International Conference on Spoken Language Translation, Hong Kong. Association for Compu...

2019
[73]

Behzad Shayegh, Jan-Thorsten Peter, David Vilar, Tobias Domhan, Juraj Juraska, Markus Freitag, and Lili Mou. 2025. https://arxiv.org/pdf/2503.24013? Feeding two birds or favoring one? adequacy--fluency tradeoffs in evaluation and meta-evaluation of machine translation . In Proceedings of the Tenth Conference on Machine Translation (WMT), Volume 1: Researc...

work page arXiv 2025
[74]

Ana Silva, Nikit Srivastava, Tatiana Moteu Ngoli, Michael R \"o der, Diego Moussallem, and Axel-Cyrille Ngonga Ngomo. 2024. Benchmarking low-resource machine translation systems. In Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024), pages 175--185

2024
[75]

State of Ohio . 2023. https://das.ohio.gov/wps/wcm/connect/gov/de987825-6f6d-41e7-86b9-31c957551975/IT-17.pdf?MOD=AJPERES&CONVERT_TO=url&CACHEID=ROOTWORKSPACE.Z18_K9I401S01H7F40QBNJU3SO1F56-de987825-6f6d-41e7-86b9-31c957551975-oWr6g0E Use of Artificial Intelligence in State of Ohio Solutions . Administrative policy it-17, Ohio Department of Administrative...

2023
[76]

Taira, Valerie Kreger, Amanda Orue, and Lisa C

Breena R. Taira, Valerie Kreger, Amanda Orue, and Lisa C. Diamond. 2021. https://doi.org/10.1007/s11606-021-06666-z A pragmatic assessment of google translate for emergency department instructions . Journal of General Internal Medicine, 36(11):3361--3365

work page doi:10.1007/s11606-021-06666-z 2021
[77]

Alan M. Turing. 1950. Computing machinery and intelligence. Mind, 59(236):433

1950
[78]

Cruz-Zamora

United States v. Cruz-Zamora . 2018. United states vs. omar cruz-zamora. The United States District Court for the District of Kansas. Retrieved from https://ecf.ksd.uscourts.gov/cgi-bin/show_public_doc?2017cr40100-24

2018
[79]

Ashok Urlana, Charaka Vinayak Kumar, Bala Mallikarjunarao Garlapati, Ajeet Kumar Singh, and Rahul Mishra. 2025. No size fits all: The perils and pitfalls of leveraging LLM s vary with company size. In Proceedings of the 31st International Conference on Computational Linguistics: Industry Track, pages 187--203

[80]

Baptiste Vasey, Myura Nagendran, Bruce Campbell, David A Clifton, Gary S Collins, Spiros Denaxas, Alastair K Denniston, Livia Faes, Bart Geerts, Mudathir Ibrahim, Xiaoxuan Liu, Bilal A Mateen, Piyush Mathur, Melissa D McCradden, Lauren Morgan, Johan Ordish, Chris Rogers, Suchi Saria, Daniel Shu Wei Ting, and 4 others. 2022. https://doi.org/10.1038/s41591-...

[81]

Lucas Nunes Vieira. 2020. https://doi.org/10.1075/ts.00023.nun Machine translation in the news: A framing analysis of the written press. Translation Spaces, 9(1):98–122.

[82]

Lucas Nunes Vieira, Minako O'Hagan, and Carol O'Sullivan. 2021. https://doi.org/10.1080/1369118X.2020.1776370 Understanding the societal impacts of machine translation: A critical review of the literature on medical and legal use cases. Information, Communication & Society, 24(11):1515–1532.

