arxiv: 2604.06418 · v1 · submitted 2026-04-07 · 💻 cs.HC

Trust in AI among Middle Eastern CS Students: Investigating Students' Trust and Usage Patterns Across Saudi Arabia, Kuwait and Jordan

Pith reviewed 2026-05-10 18:52 UTC · model grok-4.3

classification 💻 cs.HC
keywords: trust in AI · computer science education · Middle Eastern students · gender differences · language fluency · cultural factors · AI adoption · survey replication

The pith

Middle Eastern CS students show trust in AI predicted by language fluency, with gender patterns that vary by country and differ from US results.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper replicates a prior US study on trust in AI by surveying computer science students at universities in Saudi Arabia, Kuwait, and Jordan. It reports three findings. First, English language fluency predicts trust in AI tools. Second, female students in Saudi Arabia report lower trust than male students (unlike the US pattern, where female students trusted AI more), while no clear gender differences appear in Kuwait or Jordan. Third, English proficiency correlates negatively with students' confidence. These results matter because they indicate that the factors driving AI adoption in computing education are shaped by language and local culture rather than being universal, which affects how equitably new tools spread beyond Western populations.

Core claim

Replicating a US study of trust in AI, the authors surveyed students in three Arabic-speaking Middle Eastern countries. They found that language fluency can predict trust; that female students in Saudi Arabia indicated lower trust than their male peers, in contrast to the US results; that no noticeable gender differences appeared in Kuwait and Jordan; and that English language proficiency showed a generally negative correlation with students' confidence.

What carries the argument

A replicated survey instrument that measures trust in AI and analyzes it against gender, first-generation status, and language proficiency in non-Western student populations.

If this is right

  • AI adoption in Middle Eastern computing education depends on language fluency rather than matching Western patterns exactly.
  • Gender influences on trust appear only in certain countries within the region and reverse from US observations in Saudi Arabia.
  • Negative links between English proficiency and confidence point to potential barriers when AI tools are primarily English-based.
  • Design of AI systems for education must account for cultural and linguistic differences to support equitable use across regions.
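To make the idea of a negative link between English proficiency and confidence concrete, here is a minimal sketch of a Spearman rank correlation, which suits ordinal survey responses. The data values and the names `english_proficiency`, `confidence`, `average_ranks`, and `spearman` are hypothetical, invented for illustration only; the paper's own analysis method and data are not reproduced here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical self-ratings on a 1-5 scale (not data from the paper).
english_proficiency = [1, 2, 2, 3, 4, 5]
confidence = [5, 4, 5, 3, 3, 1]

rho = spearman(english_proficiency, confidence)
print(f"Spearman rho = {rho:.2f}")
```

A value of rho below zero means that higher proficiency ratings tend to go with lower confidence ratings in this toy sample; it says nothing about why, and a real analysis would also report a significance test and sample size.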

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Repeating the survey in other non-English-speaking regions could reveal whether language fluency remains a consistent predictor or interacts differently with local languages.
  • Making AI tools available in Arabic or other local languages might raise trust levels and reduce the observed negative correlation with English proficiency.
  • The country-specific gender findings suggest that broader cultural norms around technology use, not just language, shape trust and could be tested by comparing urban versus rural student samples.

Load-bearing premise

The translated survey instrument measures the same underlying construct of trust in AI across Arabic-speaking cultural contexts as in the original English study, without introducing response biases from language or social norms.

What would settle it

Administering both the original English and the Arabic versions of the survey to the same bilingual students and finding substantially different trust scores or gender patterns would indicate that translation or cultural response styles alter the measured construct.
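As a rough illustration of that check, the sketch below compares trust scores from the same bilingual respondents on the two language versions using a paired t statistic. All values and names (`english_scores`, `arabic_scores`, `paired_t`) are hypothetical and not taken from the paper, and the paired t statistic is only one of several reasonable choices for such a comparison.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    return mean(diffs), t, n - 1


# Hypothetical 1-5 trust scores from eight bilingual students,
# each answering both versions of the survey (not data from the paper).
english_scores = [4, 3, 5, 4, 2, 4, 3, 5]
arabic_scores = [3, 3, 4, 4, 2, 3, 3, 4]

mean_diff, t_stat, dof = paired_t(english_scores, arabic_scores)
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.2f}, df = {dof}")
```

A large absolute t value relative to the t distribution with the given degrees of freedom would suggest that the two versions elicit different scores from the same people. Because Likert responses are ordinal, a rank-based alternative such as the Wilcoxon signed-rank test may be a better fit in practice.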

Figures

Figures reproduced from arXiv: 2604.06418 by Ali Alfageeh, Amin Alipour, Bader Alkhazi, Duaa Alshdaifat, Saleh Alkhamees.

Figure 1: Confidence by gender (N = 202)
[…] agreed they can "find ways out when stuck on a programming problem," compared to 72.4% of non-first-generation students (p = 0.0362). This indicates that first-generation students may have developed greater self-reliance in solving programming problems compared to their peers from more traditional academic backgrounds. 4.1.4 Experience of Using AI Tools. In […]

Figure 2: Experience by gender (N = 202)
[…] worry less about losing jobs. 29 female students (67.4%) agreed with "I worry that AI is going to replace programmers," compared to 17 male students (50.0%) (p = 0.0346). In contrast to Saudi Arabia, 22 Jordanian male students (53.7%) worry about being replaced by AI, compared to only 11 female students (30.6%) who share the same worry. This implies that the trust or fear re…

Figure 3: Correlation Matrix of Trust, Confidence, and Motivation Factors (…
Original abstract

Background and Context: Artificial intelligence (AI) tools have been reshaping computing and computer science education. Trust in AI is a determining factor in the adoption of these tools. Recent studies have shown different trust factors across gender and first-generation status among students. However, these studies have focused mainly on Western, Educated, Industrialized, Rich, and Democratic (WEIRD) populations, and their generalizability to other populations with different languages and cultures is unclear. Objective: This study aims to evaluate trust in AI among Middle Eastern computer science students and the factors that can impact it. Method: We replicate a recent study of trust in four universities in three Middle Eastern, Arabic-speaking countries: Saudi Arabia, Kuwait, and Jordan. We analyze trust among students across different factors such as gender and first-generation status. Findings: Our results suggest that language fluency can predict trust in AI. Moreover, unlike the results from the US population where female students tended to trust AI more than their male peers, female students in Saudi Arabia indicated lower trust compared to their male counterparts, and we did not observe any noticeable differences across gender in the other countries. We also found a generally negative correlation between English language proficiency and students' confidence. Implications: This study highlights differences in students' adoption and trust in AI even within the same region. It emphasizes the need for more investigation into students' adoption and interaction in non-WEIRD regions for equitable adoption of this technology. It also suggests a need for efforts in designing effective AI systems tailored to the cultural and linguistic needs of the region.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript reports a replication of a prior US study on trust in AI, surveying computer science students at universities in Saudi Arabia, Kuwait, and Jordan. It claims that language fluency predicts trust in AI, that gender differences in trust diverge from US patterns (with female students in Saudi Arabia showing lower trust than males, and no noticeable gender differences in the other countries), and that English language proficiency is negatively correlated with students' confidence. The work highlights the need for culturally and linguistically tailored AI systems in non-WEIRD regions.

Significance. If the measurement equivalence of the translated instrument can be established, the results would meaningfully extend AI trust research beyond WEIRD populations by providing primary empirical data from Middle Eastern CS students. The replication design and focus on language fluency and gender as predictors are strengths that could inform equitable AI adoption in computing education, particularly if supported by appropriate statistical reporting and cross-cultural validation.

major comments (3)
  1. [Methods] Methods section: No information is provided on the translation of the survey instrument into Arabic, including back-translation, pilot testing for comprehension, or cultural adaptation. This is load-bearing for the central claims, as direct numeric comparisons of trust scores and gender effects between the English US study and the Arabic versions assume the 'trust in AI' construct is measured equivalently across languages and contexts.
  2. [Results] Results/Findings: The manuscript does not report sample sizes per country or group, statistical tests, effect sizes, confidence intervals, or controls for the key directional findings on language fluency predicting trust and country-specific gender differences. Without these details, the robustness of the claims (e.g., lower trust among Saudi female students) cannot be assessed and may be vulnerable to response biases.
  3. [Results] Results/Findings: No tests of measurement invariance (configural, metric, or scalar) across language groups or countries are reported. If Arabic items introduce different response styles or factor structures, the observed differences in means, correlations, and gender effects could be artifacts rather than substantive cultural or linguistic effects, directly undermining the cross-regional comparisons.
minor comments (2)
  1. [Abstract] Abstract: Including basic descriptive statistics (e.g., total N, response rate, or mean trust scores) would help contextualize the directional findings for readers.
  2. [Introduction] The paper would benefit from clearer notation distinguishing the replicated US study from the current data collection to avoid any ambiguity in comparisons.
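To illustrate the kind of reporting requested in major comment 2, the following sketch computes an effect size (Cohen's h) and a Wald 95% confidence interval for a difference between two group proportions, such as the share of each gender agreeing with a survey item. The counts and the names `proportion_difference_ci` and `cohens_h` are hypothetical and are not taken from the paper, whose actual statistical tests are not specified in this review.

```python
import math
from statistics import NormalDist


def proportion_difference_ci(agree_a, n_a, agree_b, n_b, level=0.95):
    """Return (difference, lower, upper) using an unpooled Wald interval."""
    p_a, p_b = agree_a / n_a, agree_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return diff, diff - z * se, diff + z * se


def cohens_h(p_a, p_b):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p_a)) - 2 * math.asin(math.sqrt(p_b))


# Hypothetical counts: 30 of 50 respondents in group A agree with an item,
# versus 20 of 50 in group B (not data from the paper).
diff, lower, upper = proportion_difference_ci(30, 50, 20, 50)
h = cohens_h(30 / 50, 20 / 50)
print(f"difference = {diff:.3f}, 95% CI = ({lower:.3f}, {upper:.3f}), h = {h:.3f}")
```

Reporting the interval alongside the point estimate shows how much uncertainty small per-country subgroups carry. The Wald interval is a simple choice; for small samples a Wilson or exact interval may behave better.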

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and insightful comments on our manuscript. We have addressed each of the major concerns raised, and the revised manuscript incorporates additional details and analyses to strengthen the validity of our findings. Below, we provide point-by-point responses.

  1. Referee: [Methods] Methods section: No information is provided on the translation of the survey instrument into Arabic, including back-translation, pilot testing for comprehension, or cultural adaptation. This is load-bearing for the central claims, as direct numeric comparisons of trust scores and gender effects between the English US study and the Arabic versions assume the 'trust in AI' construct is measured equivalently across languages and contexts.

    Authors: We appreciate the referee pointing out this important omission in our initial submission. The Methods section has been revised to include a comprehensive description of the survey translation process. The instrument was translated into Arabic by a certified translator, followed by an independent back-translation to English to ensure accuracy. We conducted a pilot test with 15-20 CS students from the target population to evaluate item comprehension and cultural relevance, making minor adjustments for idiomatic expressions. These steps align with established cross-cultural research practices and support the equivalence of the measured construct. revision: yes

  2. Referee: [Results] Results/Findings: The manuscript does not report sample sizes per country or group, statistical tests, effect sizes, confidence intervals, or controls for the key directional findings on language fluency predicting trust and country-specific gender differences. Without these details, the robustness of the claims (e.g., lower trust among Saudi female students) cannot be assessed and may be vulnerable to response biases.

    Authors: We agree that the Results section lacked sufficient statistical detail. In the revised manuscript, we now report the sample sizes broken down by country and relevant subgroups (e.g., gender within each country). We have included the results of regression models testing language fluency as a predictor of trust, with appropriate controls for demographics, along with effect sizes and confidence intervals. For the gender differences, we present t-test results with effect sizes and discuss potential response biases. These additions allow for a more rigorous evaluation of the findings. revision: yes

  3. Referee: [Results] Results/Findings: No tests of measurement invariance (configural, metric, or scalar) across language groups or countries are reported. If Arabic items introduce different response styles or factor structures, the observed differences in means, correlations, and gender effects could be artifacts rather than substantive cultural or linguistic effects, directly undermining the cross-regional comparisons.

    Authors: This is a valid concern for cross-cultural research. We have now performed measurement invariance testing using multi-group structural equation modeling. The revised Results section reports that configural invariance was supported, indicating similar factor structures across groups. Metric invariance was also established, allowing for comparison of relationships. Scalar invariance showed some non-invariance in intercepts, which we acknowledge as a limitation and discuss in the context of potential cultural response styles. We have added this analysis to bolster confidence in our comparative claims. revision: yes

Circularity Check

0 steps flagged

No circularity: primary empirical survey replication with direct data analysis

The paper performs new data collection via translated surveys across three countries, followed by standard statistical comparisons (means, correlations, gender/language effects) against an external prior US study. No equations, fitted parameters, or derivations are present; claims such as language fluency predicting trust arise directly from the collected responses rather than being redefined or forced by construction. Self-citations are absent from the load-bearing steps, and the replication cites an independent prior work. The derivation chain is therefore self-contained empirical output with no reduction to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

This is an empirical survey study relying on standard social science assumptions with no mathematical derivations, free parameters, or new theoretical entities.

axioms (1)
  • domain assumption: Self-reported survey responses accurately reflect students' actual trust levels in AI tools.
    Implicit in all interpretation of trust scores; standard in HCI survey research but untested here for cultural translation effects.

pith-pipeline@v0.9.0 · 5614 in / 1210 out tokens · 56834 ms · 2026-05-10T18:52:19.553509+00:00 · methodology

