LLMs in Qualitative Research: Opportunities, Limitations, and Practical Considerations

Alexandra Coso Strong; Henry Salgado; Martine Ceberio; Meagan R. Kendall

arxiv: 2605.16538 · v1 · pith:RYLKVRDUnew · submitted 2026-05-15 · 💻 cs.HC · cs.CL

Pith reviewed 2026-05-20 15:56 UTC · model grok-4.3

keywords: large language models, qualitative research, AI in research methods, research epistemology, reflexivity, interpretive analysis, prompt engineering

The pith

Responsible integration of LLMs into qualitative research requires critical engagement with specific technical parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that large language models can support qualitative research only when researchers deliberately manage a focused set of technical controls to protect the interpretive core of the work. It shows how the black-box quality of current LLMs creates distinct demands compared with earlier text-analysis tools, requiring choices that respect reflexivity, positionality, and judgment rather than treating outputs as neutral. Readers would care because the guidance offers a practical way to adopt new tools without eroding the standards that define rigorous qualitative inquiry.

Core claim

The paper claims that responsible integration of LLMs into qualitative workflows requires researchers to engage critically with context window constraints, temperature and top-p sampling settings, user and system prompt design, and model documentation in the form of system cards, situating these choices within qualitative research's commitments to reflexivity, positionality, and interpretive judgment while noting that LLM opacity differs from earlier tools such as topic models and lexicon-based sentiment analyzers.

What carries the argument

The curated set of technical parameters—context window constraints, temperature and top-p sampling settings, user and system prompt design, and system cards—which researchers must examine critically to keep LLM assistance aligned with qualitative epistemology.
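As a rough illustration of what the temperature and top-p settings named above change, the following Python sketch applies both to a toy next-token distribution. The candidate labels and logit values are invented for illustration and come from neither the paper nor any real model. The helper names `apply_temperature` and `top_p_filter` are hypothetical; they implement the standard temperature-scaled softmax and nucleus (top-p) truncation, not any particular vendor's sampler.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw logits into probabilities using a temperature-scaled softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalise the survivors."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]


# Invented candidate code labels and logits (illustrative only).
labels = ["belonging", "workload", "mentoring", "funding"]
logits = [2.0, 1.0, 0.5, -1.0]

for temperature in (0.2, 1.0, 1.5):
    probs = apply_temperature(logits, temperature)
    nucleus = top_p_filter(probs, top_p=0.9)
    summary = ", ".join(f"{lab}={p:.2f}" for lab, p in zip(labels, nucleus))
    print(f"T={temperature}: {summary}")
```

Lower temperatures concentrate probability on the most likely label, and a smaller top-p removes low-probability candidates before sampling. Both choices trade output variability for repeatability, which is the kind of trade-off the paper asks researchers to examine rather than leave at a default.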

If this is right

Researchers will document and justify their temperature, top-p, and prompt choices as part of standard methodological reporting.
Prompt design will be reframed as an exercise in positionality rather than a purely technical step.
Consultation of system cards will become routine when selecting models for interpretive tasks.
Context-window limits will shape decisions about which data excerpts enter analysis prompts.
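One way the reporting and context-window implications above might look in practice is sketched below in Python. Everything here is an assumption made for illustration: the `RunRecord` fields, the `select_excerpts` helper, the model name, and the placeholder excerpts are hypothetical, and `estimate_tokens` uses a crude characters-per-token guess rather than a real tokenizer, so actual counts will differ by model and language.

```python
import json
from dataclasses import dataclass, field, asdict


def estimate_tokens(text):
    """Crude stand-in for a tokenizer: assumes roughly 4 characters per token."""
    return max(1, len(text) // 4)


def select_excerpts(excerpts, budget_tokens):
    """Greedy, in-order packing: an excerpt is included only if it still
    fits in the remaining budget; everything else is listed as omitted."""
    chosen, omitted, used = [], [], 0
    for excerpt_id, text in excerpts:
        cost = estimate_tokens(text)
        if used + cost <= budget_tokens:
            chosen.append(excerpt_id)
            used += cost
        else:
            omitted.append(excerpt_id)
    return chosen, omitted, used


@dataclass
class RunRecord:
    """Hypothetical record of the choices a methods section could report."""
    model_name: str
    temperature: float
    top_p: float
    system_prompt: str
    user_prompt: str
    system_card_consulted: bool
    rationale: str
    included_excerpts: list = field(default_factory=list)
    omitted_excerpts: list = field(default_factory=list)


# Placeholder excerpts, not real interview data.
excerpts = [
    ("P01", "placeholder transcript text " * 20),
    ("P02", "placeholder transcript text " * 60),
    ("P03", "placeholder transcript text " * 10),
]
included, omitted, used = select_excerpts(excerpts, budget_tokens=250)

record = RunRecord(
    model_name="example-model",  # hypothetical model identifier
    temperature=0.2,
    top_p=0.9,
    system_prompt="You assist with first-pass descriptive coding.",
    user_prompt="Suggest candidate codes for the excerpts below.",
    system_card_consulted=True,
    rationale="Low temperature chosen so suggestions are easier to trace.",
    included_excerpts=included,
    omitted_excerpts=omitted,
)
print(json.dumps(asdict(record), indent=2))
```

Recording the omitted excerpts alongside the included ones keeps visible what the model never saw, which matters when the context window rather than the researcher decides what enters analysis.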

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The framework could support development of qualitative-specific LLM interfaces that surface parameter effects in real time.
Training programs for new researchers may begin to include modules on aligning technical settings with epistemological stance.
The emphasis on critical engagement offers a model for other social-science fields facing similar automation pressures.

Load-bearing premise

The opacity of contemporary LLMs differs from earlier natural language processing tools in ways that require specific alignment with qualitative research commitments such as reflexivity, positionality, and interpretive judgment.

What would settle it

A controlled comparison in which one set of qualitative researchers applies explicit critical review of the listed technical parameters while another does not, then measuring whether the groups produce measurably different levels of documented reflexivity or interpretive depth in their final analyses.
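The comparison described above could be analysed in many ways. One simple option, which the paper does not prescribe, is an exact permutation test on a per-analysis score such as a rubric rating of documented reflexivity. The Python sketch below assumes such a score exists; the function name `permutation_p_value` is hypothetical and the numbers are made-up placeholders, not results from any study.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means.

    Enumerates every way of splitting the pooled scores into groups of the
    original sizes and returns the share of splits whose absolute mean
    difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point noise in ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Placeholder rubric scores (invented for illustration only).
with_parameter_review = [4, 5, 3, 4, 5]
without_parameter_review = [3, 2, 4, 3, 2]

p = permutation_p_value(with_parameter_review, without_parameter_review)
print(f"two-sided permutation p-value: {p:.3f}")
```

Full enumeration is only practical for small groups; larger studies would sample random relabelings instead. The harder design question, which this sketch does not address, is how reflexivity or interpretive depth would be scored in the first place.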

Original abstract

This paper examines the opportunities, limitations, and practical considerations associated with the use of large language models (LLMs) in qualitative research. Drawing on a multidisciplinary perspective that combines expertise in qualitative methods and explainable AI, the paper argues that responsible integration of LLMs into qualitative workflows requires researchers to engage critically with a curated set of technical parameters, that is, context window constraints, temperature and top-p sampling settings, user and system prompt design, and model documentation in the form of system cards. The paper situates these considerations within the epistemological commitments of qualitative research, including reflexivity, positionality, and interpretive judgment, and discusses how the opacity of contemporary LLMs differs from earlier natural language processing tools such as topic models and lexicon-based sentiment analyzers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a discussion paper that lists key technical parameters for LLM use in qualitative work but does not show how those parameters connect to reflexivity or interpretive judgment.

The main takeaway is that the paper collects standard cautions about LLMs in qualitative research and frames them around a short list of technical choices: context windows, temperature and top-p, prompt design, and system cards. It does not introduce new data, mechanisms, or empirical tests.

What it handles reasonably is the multidisciplinary angle. The authors combine qualitative methods background with explainable AI to note that current LLMs are more opaque than earlier tools such as topic models or lexicon-based sentiment analysis, and they place those technical parameters inside the usual commitments of reflexivity, positionality, and interpretive judgment. That contrast is clear and worth stating.

The softer spot is exactly the one the stress-test flags. The central claim that responsible integration requires critical engagement with those parameters rests on an assumption rather than a demonstration. There is no example or step-by-step account showing how, say, a particular temperature setting or prompt structure actually supports or undermines reflexivity during coding or theme development. Without that mapping the advice stays at the level of general good sense.

The paper is aimed at qualitative researchers and HCI people who are starting to try LLMs in their workflows. A reader who wants a concise reminder of the epistemological stakes and a short checklist of things to watch will find it useful as an orientation piece. It will not supply tested procedures or new theory. Given the timeliness of the topic and the authors' combined expertise, it deserves a serious referee. Feedback could push the authors to add the missing concrete links between the listed parameters and the qualitative commitments they invoke.

Referee Report

1 major / 1 minor

Summary. The paper examines opportunities, limitations, and practical considerations of using LLMs in qualitative research from a multidisciplinary perspective combining qualitative methods and explainable AI. It argues that responsible integration requires researchers to critically engage with a curated set of technical parameters—context window constraints, temperature and top-p sampling settings, user and system prompt design, and model documentation via system cards—while situating these within qualitative epistemological commitments such as reflexivity, positionality, and interpretive judgment. The paper further contrasts the opacity of contemporary LLMs with earlier NLP tools like topic models and lexicon-based sentiment analyzers.

Significance. If the central argument holds, the manuscript provides a timely framework for aligning LLM-assisted workflows with qualitative research standards, potentially reducing risks of unreflexive analysis in social science applications. The multidisciplinary grounding in both qualitative epistemology and AI documentation practices is a clear strength, offering practical guidance that could inform training and tool development in HCI and related fields.

major comments (1)

[Abstract and discussion of practical considerations] The central claim that responsible integration 'requires' critical engagement with context window constraints, temperature/top-p settings, prompt design, and system cards to support reflexivity, positionality, and interpretive judgment lacks an explicit mapping or mechanism. No section derives or illustrates how a concrete parameter choice (e.g., lowering temperature or constraining context length) directly enables or strengthens these epistemological commitments during tasks such as coding or theme interpretation.

minor comments (1)

The abstract would be strengthened by including one brief, concrete example of how a specific parameter setting interacts with qualitative analysis.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and detailed review, which identifies a valuable opportunity to clarify the linkages in our argument. We appreciate the positive assessment of the manuscript's multidisciplinary grounding and its potential contributions to HCI and qualitative methods. We respond to the major comment below and commit to revisions that strengthen the explicit connections without altering the core claims.

Referee: [Abstract and discussion of practical considerations] The central claim that responsible integration 'requires' critical engagement with context window constraints, temperature/top-p settings, prompt design, and system cards to support reflexivity, positionality, and interpretive judgment lacks an explicit mapping or mechanism. No section derives or illustrates how a concrete parameter choice (e.g., lowering temperature or constraining context length) directly enables or strengthens these epistemological commitments during tasks such as coding or theme interpretation.

Authors: We acknowledge the referee's observation that while the manuscript situates each technical parameter within qualitative epistemological commitments across the practical considerations sections, the linkages could be made more explicit through direct mappings and illustrative examples. The current text explains the relevance of parameters such as temperature settings for controlling output variability (which can aid consistent interpretive judgment) and context window constraints for maintaining focus in extended coding tasks, but we agree that a dedicated mapping would better demonstrate the mechanisms. In revision, we will add a table and accompanying examples in the Practical Considerations section that explicitly maps each parameter to specific commitments (e.g., how lowering temperature supports reflexivity by enabling more predictable and traceable theme generation) and illustrates their application to tasks like coding and theme interpretation. This addition will derive the connections more clearly while remaining grounded in the paper's existing discussion of opacity versus earlier NLP tools.

Revision: yes

Circularity Check

0 steps flagged

No circularity; discursive recommendations rest on external epistemological commitments rather than self-referential reduction

The paper advances a position that responsible LLM integration in qualitative work requires critical engagement with context-window limits, sampling parameters, prompt design, and system cards to support reflexivity, positionality, and interpretive judgment. This claim is advanced through multidisciplinary argument drawing on established qualitative-methods literature and known LLM properties; it contains no equations, fitted parameters, derivations, or self-citation chains that reduce the conclusion to its own inputs by construction. The listed technical parameters are treated as independent considerations whose alignment with qualitative commitments is asserted on substantive grounds rather than defined into existence or statistically forced. No uniqueness theorems, ansatzes smuggled via prior work, or renamings of known results appear. The derivation chain is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper draws on standard domain assumptions from qualitative research without introducing fitted parameters or new entities; it relies on background knowledge about epistemological commitments and LLM technical features.

axioms (1)

Domain assumption: Qualitative research is defined by commitments to reflexivity, positionality, and interpretive judgment.
Invoked directly in the abstract as the epistemological context for LLM integration.

pith-pipeline@v0.9.0 · 5661 in / 1209 out tokens · 42475 ms · 2026-05-20T15:56:01.751140+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "responsible integration of LLMs into qualitative workflows requires researchers to engage critically with a curated set of technical parameters, that is, context window constraints, temperature and top-p sampling settings, user and system prompt design, and model documentation in the form of system cards"

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · absolute_floor_iff_bare_distinguishability
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "the opacity of contemporary LLMs differs from earlier natural language processing tools such as topic models and lexicon-based sentiment analyzers"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

61 extracted references · 61 canonical work pages · 5 internal anchors

[1]

How People Use ChatGPT

A. Chatterji et al., “How People Use ChatGPT,” Sep. 2025, National Bureau of Economic Research: 34255. doi: 10.3386/w34255

work page doi:10.3386/w34255 2025
[2]

Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell

E. M. Bender, T. Gebru, A. McMillan-Major, and S. Shmitchell, “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?,” in Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, in FAccT ’21. New York, NY, USA: Association for Computing Machinery, Mar. 2021, pp. 610–623. doi: 10.1145/3442188.3445922

work page doi:10.1145/3442188.3445922 2021
[3]

Beyond Individual Accountability: (Re-)Asserting Democratic Control of AI

S. Luccioni, Y. Jernite, and E. Strubell, “Power Hungry Processing: Watts Driving the Cost of AI Deployment?,” in Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, in FAccT ’24. New York, NY, USA: Association for Computing Machinery, Jun. 2024, pp. 85–99. doi: 10.1145/3630106.3658542

work page doi:10.1145/3630106.3658542 2024
[4]

Carbon Emissions and Large Neural Network Training

D. Patterson et al., “Carbon Emissions and Large Neural Network Training,” Apr. 23, 2021, arXiv: arXiv:2104.10350. doi: 10.48550/arXiv.2104.10350

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2104.10350 2021
[5]

Intellectual property issues in artificial intelligence trained on scraped data,

OECD, “Intellectual property issues in artificial intelligence trained on scraped data,” OECD Artificial Intelligence Papers, Feb. 2025, doi: 10.1787/d5241a23-en

work page doi:10.1787/d5241a23-en 2025
[6]

Rethinking open source generative AI: open-washing and the EU AI Act,

A. Liesenfeld and M. Dingemanse, “Rethinking open source generative AI: open-washing and the EU AI Act,” in Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, in FAccT ’24. New York, NY, USA: Association for Computing Machinery, Jun. 2024, pp. 1774–1787. doi: 10.1145/3630106.3659005

work page doi:10.1145/3630106.3659005 2024
[7]

A review of topic modeling methods,

I. Vayansky and S. A. P. Kumar, “A review of topic modeling methods,” Information Systems, vol. 94, p. 101582, Dec. 2020, doi: 10.1016/j.is.2020.101582

work page doi:10.1016/j.is.2020.101582 2020
[8]

A survey on sentiment analysis methods, applications, and challenges,

M. Wankhade, A. C. S. Rao, and C. Kulkarni, “A survey on sentiment analysis methods, applications, and challenges,” Artif Intell Rev, vol. 55, no. 7, pp. 5731–5780, Oct. 2022, doi: 10.1007/s10462-022-10144-1

work page doi:10.1007/s10462-022-10144-1 2022
[9]

Mapping Engineering Leadership Research through an AI-enabled Systematic Literature Review,

M. Kendall, B. Novoselich, M. Handley, and M. Dabkowski, “Mapping Engineering Leadership Research through an AI-enabled Systematic Literature Review,” presented at the 2022 ASEE Annual Conference & Exposition, Aug. 2022. Accessed: Feb. 20, 2026. [Online]. Available: https://peer.asee.org/mapping-engineering-leadership-research- through-an-ai-enabled-syste...

work page 2022
[10]

Deep Neural Networks, Explanations, and Rationality,

E. A. Lee, “Deep Neural Networks, Explanations, and Rationality,” in Bridging the Gap Between AI and Reality, B. Steffen, Ed., Cham: Springer Nature Switzerland, 2024, pp. 11–

work page 2024
[11]

doi: 10.1007/978-3-031-46002-9_1

work page doi:10.1007/978-3-031-46002-9_1
[12]

Barocas, M

S. Barocas, M. Hardt, and A. Narayanan, Fairness and Machine Learning: Limitations and Opportunities. MIT Press, 2023. Accessed: Jan. 07, 2026. [Online]. Available: https://fairmlbook.org/

work page 2023
[13]

Large language models associate Muslims with violence,

A. Abid, M. Farooqi, and J. Zou, “Large language models associate Muslims with violence,” Nat Mach Intell, vol. 3, no. 6, pp. 461–463, Jun. 2021, doi: 10.1038/s42256-021- 00359-2

work page doi:10.1038/s42256-021- 2021
[14]

Huggingface

T. Hu, Y. Kyrychenko, S. Rathje, N. Collier, S. van der Linden, and J. Roozenbeek, “Generative language models exhibit social identity biases,” Nat Comput Sci, vol. 5, no. 1, pp. 65–75, Jan. 2025, doi: 10.1038/s43588-024-00741-1

work page doi:10.1038/s43588-024-00741-1 2025
[15]

Detecting and Evaluating Bias in Large Language Models: Concepts, Methods, and Challenges,

Z. Gao, L. Tong, and Z. Zhang, “Detecting and Evaluating Bias in Large Language Models: Concepts, Methods, and Challenges,” Journal of Behavioral Data Science, vol. 6, no. 1, pp. 1–68, Feb. 2026, doi: 10.35566/jbds/gao

work page doi:10.35566/jbds/gao 2026
[16]

Reasoning Models Don't Always Say What They Think

Y. Chen et al., “Reasoning Models Don’t Always Say What They Think,” May 08, 2025, arXiv: arXiv:2505.05410. doi: 10.48550/arXiv.2505.05410

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2505.05410 2025
[17]

Survey on the Role of Mechanistic Interpretability in Generative AI,

L. Ranaldi, “Survey on the Role of Mechanistic Interpretability in Generative AI,” Big Data and Cognitive Computing, vol. 9, no. 8, Jul. 2025, doi: 10.3390/bdcc9080193

work page doi:10.3390/bdcc9080193 2025
[18]

On the Biology of a Large Language Model,

A. J. Lindsey† et al., “On the Biology of a Large Language Model,” Transformer Circuits. Accessed: Jul. 20, 2025. [Online]. Available: https://transformer- circuits.pub/2025/attribution-graphs/biology.html

work page 2025
[19]

Causal Discovery for Explainable AI: A Dual-Encoding Approach,

H. Salgado, M. R. Kendall, and M. Ceberio, “Causal Discovery for Explainable AI: A Dual-Encoding Approach,” in The 17th International Conference on Ambient Systems, Networks and Technologies (ANT 2023) / The 3rd International Workshop on Causality, Agents and Large Models (CALM-26), in Procedia Computer Science. Istanbul, Turkey: Springer, Apr. 2026. doi:...

work page doi:10.48550/arxiv.2601.21221 2023
[20]

Braun and V

V. Braun and V. Clarke, Thematic Analysis: A Practical Guide. SAGE Publications, 2021

work page 2021
[21]

Case Study Research in Education

S. B. Merriam, Qualitative Research and Case Study Applications in Education. Revised and Expanded from" Case Study Research in Education.". ERIC, 1998

work page 1998
[22]

Positionality practices and dimensions of impact on equity research: A collaborative inquiry and call to the community,

S. Secules et al., “Positionality practices and dimensions of impact on equity research: A collaborative inquiry and call to the community,” Journal of Engineering Education, vol. 110, no. 1, pp. 19–43, 2021, doi: https://doi.org/10.1002/jee.20377

work page doi:10.1002/jee.20377 2021
[23]

Quantitative, Qualitative, and Mixed Research Methods in Engineering Education,

M. Borrego, E. P. Douglas, and C. T. Amelink, “Quantitative, Qualitative, and Mixed Research Methods in Engineering Education,” Journal of Engineering Education, vol. 98, no. 1, pp. 53–66, 2009, doi: 10.1002/j.2168-9830.2009.tb01005.x

work page doi:10.1002/j.2168-9830.2009.tb01005.x 2009
[24]

We reject the use of generative artificial intelligence for reflexive qualitative researc,

T. Jowsey, V. Braun, V. Clarke, D. Lupton, and M. Fine, “We reject the use of generative artificial intelligence for reflexive qualitative researc,” Oct. 20, 2025, Social Science Research Network, Rochester, NY: 5676462. doi: 10.2139/ssrn.5676462

work page doi:10.2139/ssrn.5676462 2025
[25]

Interrogating the Use of Large Language Models in Qualitative Research Using the Qualifying Qualitative Research Quality Framework,

D. Reeping, C. Hampton, and D. Özkan, “Interrogating the Use of Large Language Models in Qualitative Research Using the Qualifying Qualitative Research Quality Framework,” Studies in Engineering Education, vol. 6, no. 2, Jul. 2025, doi: 10.21061/see.174

work page doi:10.21061/see.174 2025
[26]

Cultural bias and cultural alignment of large language models,

Y. Tao, O. Viberg, R. S. Baker, and R. F. Kizilcec, “Cultural bias and cultural alignment of large language models,” PNAS Nexus, vol. 3, no. 9, p. pgae346, Sep. 2024, doi: 10.1093/pnasnexus/pgae346

work page doi:10.1093/pnasnexus/pgae346 2024
[27]

Qualitative Research Quality: A Collaborative Inquiry Across Multiple Methodological Perspectives,

J. Walther et al., “Qualitative Research Quality: A Collaborative Inquiry Across Multiple Methodological Perspectives,” Journal of Engineering Education, vol. 106, no. 3, pp. 398– 430, 2017, doi: 10.1002/jee.20170

work page doi:10.1002/jee.20170 2017
[28]

Inference to the Best Explanation in Large Language Models,

D. Dalal, M. Valentino, A. Freitas, and P. Buitelaar, “Inference to the Best Explanation in Large Language Models,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. 217–235. doi: 10.18653/v1/2024.acl-long.14

work page doi:10.18653/v1/2024.acl-long.14 2024
[29]

BERTopic: Neural topic modeling with a class-based TF-IDF procedure

M. Grootendorst, “BERTopic: Neural topic modeling with a class-based TF-IDF procedure,” Mar. 11, 2022, arXiv: arXiv:2203.05794. doi: 10.48550/arXiv.2203.05794

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2203.05794 2022
[30]

CollabCoder: A Lower-barrier, Rigorous Workflow for Inductive Collaborative Qualitative Analysis with Large Language Models,

J. Gao et al., “CollabCoder: A Lower-barrier, Rigorous Workflow for Inductive Collaborative Qualitative Analysis with Large Language Models,” Jan. 22, 2024, arXiv: arXiv:2304.07366. doi: 10.48550/arXiv.2304.07366

work page doi:10.48550/arxiv.2304.07366 2024
[31]

Vera Liao, Rania Abdelghani, and Pierre-Yves Oudeyer

Z. Xiao, X. Yuan, Q. V. Liao, R. Abdelghani, and P.-Y. Oudeyer, “Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding,” in 28th International Conference on Intelligent User Interfaces, Mar. 2023, pp. 75–78. doi: 10.1145/3581754.3584136

work page doi:10.1145/3581754.3584136 2023
[32]

Leveraging Generative Text Models and Natural Language Processing to Perform Traditional Thematic Data Analysis,

I. Anakok, A. Katz, K. J. Chew, and H. Matusovich, “Leveraging Generative Text Models and Natural Language Processing to Perform Traditional Thematic Data Analysis,” International Journal of Qualitative Methods, vol. 24, p. 16094069251338898, Apr. 2025, doi: 10.1177/16094069251338898

work page doi:10.1177/16094069251338898 2025
[33]

Performing an Inductive Thematic Analysis of Semi-Structured Interviews With a Large Language Model: An Exploration and Provocation on the Limits of the Approach,

S. De Paoli, “Performing an Inductive Thematic Analysis of Semi-Structured Interviews With a Large Language Model: An Exploration and Provocation on the Limits of the Approach,” 2024, doi: 10.1177/08944393231220483

work page doi:10.1177/08944393231220483 2024
[34]

Thematic analysis with open-source generative AI and machine learning: a new method for inductive qualitative codebook development,

A. Katz, G. C. Fleming, and J. B. Main, “Thematic analysis with open-source generative AI and machine learning: a new method for inductive qualitative codebook development,” Humanit Soc Sci Commun, Jan. 2026, doi: 10.1057/s41599-026-06508-5

work page doi:10.1057/s41599-026-06508-5 2026
[35]

Using generative AI for large-scale qualitative analysis of social media posts to understand why people leave computer science,

A. Ross and A. Katz, “Using generative AI for large-scale qualitative analysis of social media posts to understand why people leave computer science,” Journal of Engineering Education, vol. 114, no. 4, p. e70036, 2025, doi: 10.1002/jee.70036

work page doi:10.1002/jee.70036 2025
[36]

Generative AI for thematic analysis in a maternal health study: coding semistructured interviews using large language models,

S. Qiao, X. Fang, J. Wang, R. Zhang, X. Li, and Y. Kang, “Generative AI for thematic analysis in a maternal health study: coding semistructured interviews using large language models,” Applied Psychology: Health and Well-Being, vol. 17, no. 3, p. e70038, 2025, doi: 10.1111/aphw.70038

work page doi:10.1111/aphw.70038 2025
[37]

Advancing Qualitative Analysis: An Exploration of the Potential of Generative AI and NLP in Thematic Coding,

Y. Gamieldien, J. M. Case, and A. Katz, “Advancing Qualitative Analysis: An Exploration of the Potential of Generative AI and NLP in Thematic Coding,” Jun. 21, 2023, Social Science Research Network, Rochester, NY: 4487768. doi: 10.2139/ssrn.4487768

work page doi:10.2139/ssrn.4487768 2023
[38]

The Use of Artificial Intelligence for Qualitative Data Analysis: ChatGPT,

I.-D. Lixandru, “The Use of Artificial Intelligence for Qualitative Data Analysis: ChatGPT,” IE, vol. 28, no. 1/2024, pp. 57–67, Mar. 2024, doi: 10.24818/issn14531305/28.1.2024.05

work page doi:10.24818/issn14531305/28.1.2024.05 2024
[39]

A practical guide to implementing ChatGPT as a secondary coder in qualitative research,

E. Blondeel, P. Everaert, and E. Opdecam, “A practical guide to implementing ChatGPT as a secondary coder in qualitative research,” International Journal of Accounting Information Systems, vol. 56, p. 100754, Dec. 2025, doi: 10.1016/j.accinf.2025.100754

work page doi:10.1016/j.accinf.2025.100754 2025
[40]

Using large language models to complement humans for the coding of social media interactions between science teachers,

R. Burgess, K. Waters, E. Spray, and E. Prieto-Rodriguez, “Using large language models to complement humans for the coding of social media interactions between science teachers,” Discov Educ, vol. 5, no. 1, p. 81, Feb. 2026, doi: 10.1007/s44217-025-00868-x

work page doi:10.1007/s44217-025-00868-x 2026
[41]

Can large language models be used to code text for thematic analysis? An explorative study,

Z. Han et al., “Can large language models be used to code text for thematic analysis? An explorative study,” Discov Artif Intell, vol. 5, no. 1, p. 171, Jul. 2025, doi: 10.1007/s44163- 025-00441-3

work page doi:10.1007/s44163- 2025
[42]

Artificial Intelligence for Literature Reviews: Opportunities and Challenges,

F. Bolanos, A. Salatino, F. Osborne, and E. Motta, “Artificial Intelligence for Literature Reviews: Opportunities and Challenges,” Aug. 06, 2024, arXiv: arXiv:2402.08565. doi: 10.48550/arXiv.2402.08565

work page doi:10.48550/arxiv.2402.08565 2024
[43]

Thematic analysis of interview data with ChatGPT: designing and testing a reliable research protocol for qualitative research,

M. Goyanes, C. Lopezosa, and B. Jordá, “Thematic analysis of interview data with ChatGPT: designing and testing a reliable research protocol for qualitative research,” Qual Quant, vol. 59, no. 6, pp. 5491–5510, Dec. 2025, doi: 10.1007/s11135-025-02199-3

work page doi:10.1007/s11135-025-02199-3 2025
[44]

Methodological foundations for artificial intelligence-driven survey question generation,

T. K. Mburu, K. Rong, C. J. McColley, and A. Werth, “Methodological foundations for artificial intelligence-driven survey question generation,” Journal of Engineering Education, vol. 114, no. 3, p. e70012, 2025, doi: 10.1002/jee.70012

work page doi:10.1002/jee.70012 2025
[45]

Exploring AI Bots as Simulators in Human Subject Research: A Novel Approach to Ethical and Efficient Experimentation in Engineering Education Research,

J. Strobel, M. Medina, E. S. Guzman, and M. van den Bogaard, “Exploring AI Bots as Simulators in Human Subject Research: A Novel Approach to Ethical and Efficient Experimentation in Engineering Education Research,” in 2024 IEEE Frontiers in Education Conference (FIE), Oct. 2024, pp. 1–9. doi: 10.1109/FIE61694.2024.10893007

work page doi:10.1109/fie61694.2024.10893007 2024
[46]

Using Chat GPT to Clean Qualitative Interview Transcriptions: A Usability and Feasibility Analysis,

Z. Taylor, “Using Chat GPT to Clean Qualitative Interview Transcriptions: A Usability and Feasibility Analysis,” AM J QUALITATIVE RES, vol. 8, no. 2, pp. 153–160, Apr. 2024, doi: 10.29333/ajqr/14487

work page doi:10.29333/ajqr/14487 2024
[47]

(PDF) Using Generative AI for Qualitative Coding

“(PDF) Using Generative AI for Qualitative Coding.” Accessed: Jan. 16, 2026. [Online]. Available: https://www.researchgate.net/publication/392174927_Using_Generative_AI_for_Qualitativ e_Coding

work page arXiv 2026
[48]

Sakaguchi, R

K. Sakaguchi, R. Sakama, and T. Watari, “Evaluating ChatGPT in Qualitative Thematic Analysis With Human Researchers in the Japanese Clinical Context and Its Cultural Interpretation Challenges: Comparative Qualitative Study,” J Med Internet Res, vol. 27, p. e71521, Apr. 2025, doi: 10.2196/71521

work page doi:10.2196/71521 2025
[49]

Model Cards for Model Reporting,

M. Mitchell et al., “Model Cards for Model Reporting,” in Proceedings of the Conference on Fairness, Accountability, and Transparency, Jan. 2019, pp. 220–229. doi: 10.1145/3287560.3287596

work page doi:10.1145/3287560.3287596 2019
[50]

Datasheets for Datasets

T. Gebru et al., “Datasheets for Datasets,” Dec. 01, 2021, arXiv: arXiv:1803.09010. doi: 10.48550/arXiv.1803.09010

[51] A. Liesenfeld, A. Lopez, and M. Dingemanse, “Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators,” in Proceedings of the 5th International Conference on Conversational User Interfaces, in CUI ’23. New York, NY, USA: Association for Computing Machinery, Jul. 2023, pp. 1–6. doi: 10.1145/3571884.3604316

[52] N. F. Liu et al., “Lost in the Middle: How Language Models Use Long Contexts,” Transactions of the Association for Computational Linguistics, vol. 12, pp. 157–173, 2024, doi: 10.1162/tacl_a_00638

[53] A. Petrov, E. La Malfa, P. H. S. Torr, and A. Bibi, “Language Model Tokenizers Introduce Unfairness Between Languages,” Oct. 20, 2023, arXiv:2305.15425. doi: 10.48550/arXiv.2305.15425

[54] “The Tokenizer Playground - a Hugging Face Space by Xenova.” Accessed: Feb. 20, 2026. [Online]. Available: https://huggingface.co/spaces/Xenova/the-tokenizer-playground

[55] M. Renze, “The Effect of Sampling Temperature on Problem Solving in Large Language Models,” in Findings of the Association for Computational Linguistics: EMNLP 2024, Miami, Florida, USA: Association for Computational Linguistics, 2024, pp. 7346–7356. doi: 10.18653/v1/2024.findings-emnlp.432

[56] S. Troshin, W. Mohammed, Y. Meng, C. Monz, A. Fokkens, and V. Niculae, “Control the Temperature: Selective Sampling for Diverse and High-Quality LLM Outputs,” Sep. 20, 2025, arXiv:2510.01218. doi: 10.48550/arXiv.2510.01218

[57] M. Nguyen, A. Baker, C. Neo, A. Roush, A. Kirsch, and R. Shwartz-Ziv, “Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs,” Oct. 13, 2024, arXiv:2407.01082. doi: 10.48550/arXiv.2407.01082

[58] Z. Ji et al., “Survey of Hallucination in Natural Language Generation,” ACM Comput. Surv., vol. 55, no. 12, pp. 248:1–248:38, Mar. 2023, doi: 10.1145/3571730

[59] J. White et al., “A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT,” Feb. 21, 2023, arXiv:2302.11382. doi: 10.48550/arXiv.2302.11382

[60] S. Schulhoff et al., “The Prompt Report: A Systematic Survey of Prompt Engineering Techniques,” Feb. 26, 2025, arXiv:2406.06608. doi: 10.48550/arXiv.2406.06608

[61] D. Reeping and A. Shah, “Board 50: Work in Progress: A Systematic Review of Embedding Large Language Models in Engineering and Computing Education,” presented at the 2024 ASEE Annual Conference & Exposition, Jun. 2024. Accessed: Feb. 18, 2026. [Online]. Available: https://peer.asee.org/board-50-work-in-progress-a-systematic-review-of-embedding-large-lang...

