A Study on the Framework for Evaluating the Ethics and Trustworthiness of Generative AI

Cheonsu Jeong; Seonhee Jeong; Seunghyun Lee; Sungsu Kim

arXiv: 2509.00398 · v4 · submitted 2025-08-30 · cs.CY · cs.AI


Pith reviewed 2026-05-18 19:49 UTC · model grok-4.3

keywords: generative AI, AI ethics, trustworthiness, evaluation framework, fairness, transparency, accountability, privacy

The pith

A framework of eleven dimensions with detailed indicators evaluates the ethics and trustworthiness of generative AI across its lifecycle.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper tries to establish a systematic approach to judging generative AI on ethical and trustworthiness grounds rather than performance alone. Current methods overlook problems such as bias, privacy violations, copyright issues, and hallucinations, so the authors build a framework that adds human-centered and social-impact criteria. They define eleven dimensions and create concrete indicators plus assessment methods for each one. The framework is meant to work throughout the entire AI development process and to combine technical checks with broader perspectives. If it holds, developers and regulators would gain practical ways to spot and reduce ethical risks before deployment.

Core claim

The authors state that the ethics and trustworthiness of generative AI can be evaluated through a framework built around eleven dimensions: fairness, transparency, accountability, safety, privacy, accuracy, consistency, robustness, explainability, copyright and intellectual property protection, and source traceability. Each dimension is equipped with specific indicators and assessment methodologies. The framework is informed by a comparison of policies from South Korea, the United States, the European Union, and China, and it is designed for use across the full AI lifecycle so that technical and multidisciplinary views are integrated.

What carries the argument

The proposed evaluation framework built from eleven dimensions, each with its own indicators and assessment methodologies.

If this is right

Supplies practical tools to identify and manage ethical risks in actual AI applications.
Gives policymakers, developers, and users concrete guidance for responsible AI decisions.
Helps steer generative AI toward positive contributions to society.
Creates a shared academic base for ongoing work on trustworthy AI systems.
Combines technical evaluation with social and policy perspectives in one structure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The framework could be tested by running it on current large language models to see which indicators need adjustment.
It might serve as a starting point for creating standardized checklists used by regulators in multiple countries.
Extending the indicators to include measurable scores could make the assessments more repeatable.
The policy comparison section suggests the framework may adapt differently depending on regional legal contexts.
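
As a rough illustration of what measurable scores could look like, the sketch below averages indicator ratings per dimension and flags dimensions that are weak or unrated. Only the eleven dimension names come from the paper. The 1 to 3 rating scale, the 2.0 threshold, the function name score_dimensions, and the example ratings are all assumptions made for this sketch, not the authors' method.

```python
from statistics import mean

# Dimension names are taken from the paper; everything else here is illustrative.
DIMENSIONS = [
    "fairness", "transparency", "accountability", "safety", "privacy",
    "accuracy", "consistency", "robustness", "explainability",
    "copyright and intellectual property protection", "source traceability",
]


def score_dimensions(ratings, threshold=2.0):
    """Average 1-3 indicator ratings per dimension and flag weak or unrated ones.

    `ratings` maps a dimension name to a list of integer ratings (assumed scale:
    1 = poor, 2 = partial, 3 = good). The scale and threshold are assumptions.
    """
    report = {}
    for dim in DIMENSIONS:
        values = ratings.get(dim, [])
        if any(v not in (1, 2, 3) for v in values):
            raise ValueError(f"ratings for {dim!r} must be 1, 2, or 3")
        avg = mean(values) if values else None
        # A dimension with no ratings is flagged so that gaps stay visible.
        report[dim] = {"mean": avg, "flagged": avg is None or avg < threshold}
    return report


# Hypothetical ratings for three dimensions; the other eight are left unrated.
example_ratings = {
    "fairness": [3, 2, 3],
    "privacy": [1, 2],
    "accuracy": [2, 2, 2],
}
report = score_dimensions(example_ratings)
flagged = [dim for dim, row in report.items() if row["flagged"]]
```

Flagging unrated dimensions alongside low-scoring ones is a deliberate choice in this sketch: a repeatable assessment should make missing evidence as visible as poor evidence.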

Load-bearing premise

The eleven dimensions and their indicators are complete enough to cover all relevant ethical and trustworthiness issues and can be applied across the AI lifecycle without further real-world testing.

What would settle it

Apply the framework to a deployed generative AI system and check whether it misses or fails to provide remedies for a major ethical failure such as undetected privacy leakage or widespread hallucinated facts.
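
The check described above can be sketched as a small audit over observed incidents. The Incident record, the audit function, and the two example incidents are hypothetical and exist only to show the shape of the test: each incident lists the dimensions whose indicators flagged it, and an empty list means the framework missed it.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """One observed ethical failure in a deployed system (hypothetical record)."""
    description: str
    # Names of the framework dimensions whose indicators caught this incident.
    flagged_by: list = field(default_factory=list)


def audit(incidents):
    """Return the incidents no dimension caught, plus the overall miss rate."""
    missed = [i.description for i in incidents if not i.flagged_by]
    miss_rate = len(missed) / len(incidents) if incidents else 0.0
    return missed, miss_rate


# Illustrative data only; these are not results from any real evaluation.
incidents = [
    Incident("privacy leakage in generated output", ["privacy"]),
    Incident("hallucinated facts", []),
]
missed, miss_rate = audit(incidents)
```

A nonzero miss rate on major incidents would count against the completeness premise, while a zero rate on a small sample would not prove it.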

Original abstract

This study provides an in_depth analysis of the ethical and trustworthiness challenges emerging alongside the rapid advancement of generative artificial intelligence (AI) technologies and proposes a comprehensive framework for their systematic evaluation. While generative AI, such as ChatGPT, demonstrates remarkable innovative potential, it simultaneously raises ethical and social concerns, including bias, harmfulness, copyright infringement, privacy violations, and hallucination. Current AI evaluation methodologies, which mainly focus on performance and accuracy, are insufficient to address these multifaceted issues. Thus, this study emphasizes the need for new human_centered criteria that also reflect social impact. To this end, it identifies key dimensions for evaluating the ethics and trustworthiness of generative AI_fairness, transparency, accountability, safety, privacy, accuracy, consistency, robustness, explainability, copyright and intellectual property protection, and source traceability and develops detailed indicators and assessment methodologies for each. Moreover, it provides a comparative analysis of AI ethics policies and guidelines in South Korea, the United States, the European Union, and China, deriving key approaches and implications from each. The proposed framework applies across the AI lifecycle and integrates technical assessments with multidisciplinary perspectives, thereby offering practical means to identify and manage ethical risks in real_world contexts. Ultimately, the study establishes an academic foundation for the responsible advancement of generative AI and delivers actionable insights for policymakers, developers, users, and other stakeholders, supporting the positive societal contributions of AI technologies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper synthesizes eleven ethics dimensions for generative AI with a four-country policy comparison, but offers no testing of the framework in practice.

The main thing to know is that the authors collect existing ethical concerns around generative AI into a list of eleven dimensions (fairness, transparency, accountability, safety, privacy, accuracy, consistency, robustness, explainability, copyright protection, and source traceability), then add indicators and methods for each, plus a side-by-side look at policies from South Korea, the US, the EU, and China. The policy section pulls out some concrete implications that go beyond a pure literature summary. They organize the material clearly and tie it to the full AI lifecycle, which makes the proposal easy to follow on paper. That synthesis and the regional comparison are the parts that feel most useful right now. The indicators themselves read as reasonable extensions of prior work rather than brand-new inventions.

The paper includes no application of the framework to a real model, no pilot evaluation, and no check on whether the eleven areas overlap, miss key issues, or can actually be measured without extra tools. Because of that, the claim that it supplies practical means for managing risks stays untested. The dimensions and indicators are asserted to be sufficient and operational, but nothing in the manuscript demonstrates completeness or feasibility.

This is the kind of structured reference that policy teams or developers might pick up as a starting checklist when they need to cover ethics without starting from scratch. Readers hunting for new theory or quantitative results will not find much here. The work engages the literature directly and avoids obvious internal contradictions, so it is worth sending to referees who can press on validation and suggest concrete ways to ground the indicators. I would recommend peer review rather than a desk reject.

Referee Report

3 major / 2 minor

Summary. The paper claims to provide an in-depth analysis of ethical and trustworthiness challenges in generative AI and proposes a comprehensive framework identifying 11 key dimensions—fairness, transparency, accountability, safety, privacy, accuracy, consistency, robustness, explainability, copyright and intellectual property protection, and source traceability—along with detailed indicators and assessment methodologies for each. It includes a comparative analysis of AI ethics policies and guidelines from South Korea, the United States, the European Union, and China, deriving implications from each, and asserts that the framework applies across the full AI lifecycle while integrating technical assessments with multidisciplinary perspectives to offer practical means for identifying and managing ethical risks in real-world contexts.

Significance. If the 11 dimensions and associated indicators prove both comprehensive and operationalizable, the work could serve as a useful synthesized reference for policymakers, developers, and stakeholders by consolidating existing policy documents and literature into a structured evaluation approach with explicit assessment methods. The comparative policy analysis across four jurisdictions adds value by surfacing regional differences and common themes. However, the significance is currently constrained by the purely synthetic nature of the contribution, with no demonstrated application or validation to establish practical utility.

major comments (3)

Abstract: The central claim that the framework 'applies across the AI lifecycle and integrates technical assessments with multidisciplinary perspectives, thereby offering practical means to identify and manage ethical risks in real-world contexts' is load-bearing but unsupported, as the manuscript contains no application of the 11 dimensions or indicators to any concrete generative AI system (e.g., ChatGPT or similar), no pilot evaluation, and no check for completeness or feasibility.
Framework proposal and indicators section: The sufficiency of the listed 11 dimensions to capture all relevant ethical and trustworthiness issues is asserted without addressing potential overlaps (such as between transparency and explainability) or gaps relative to broader AI ethics literature, which directly affects the claim of comprehensiveness.
Policy comparison section: While the analysis of policies from South Korea, the US, the EU, and China is informative, the manuscript does not explicitly derive or map the 11 dimensions and their indicators from specific policy elements, leaving the integration of these sources into the framework opaque and difficult to verify.

minor comments (2)

Abstract: Formatting artifacts such as 'in_depth', 'human_centered', and 'real_world' should be corrected to standard hyphenation or spacing for professional presentation.
Throughout the manuscript: Ensure the 11 dimensions are introduced and referenced in a consistent order and with uniform terminology to improve clarity and readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We appreciate the referee's detailed and constructive feedback on our manuscript. We have carefully considered each major comment and provide point-by-point responses below. Where appropriate, we outline revisions to address the concerns raised.

Referee: Abstract: The central claim that the framework 'applies across the AI lifecycle and integrates technical assessments with multidisciplinary perspectives, thereby offering practical means to identify and manage ethical risks in real-world contexts' is load-bearing but unsupported, as the manuscript contains no application of the 11 dimensions or indicators to any concrete generative AI system (e.g., ChatGPT or similar), no pilot evaluation, and no check for completeness or feasibility.

Authors: We acknowledge that the manuscript proposes the framework without including a concrete application or pilot study on a specific generative AI system. The claim in the abstract reflects the intended scope and design of the framework, which is derived from a synthesis of policy documents and literature to cover the AI lifecycle. However, to strengthen the manuscript and avoid overstatement, we will revise the abstract to clarify that the framework provides a structured approach for such identification and management, with practical utility to be validated in future work. Additionally, we will add a brief discussion section on potential applications and feasibility considerations.
Revision: partial

Referee: Framework proposal and indicators section: The sufficiency of the listed 11 dimensions to capture all relevant ethical and trustworthiness issues is asserted without addressing potential overlaps (such as between transparency and explainability) or gaps relative to broader AI ethics literature, which directly affects the claim of comprehensiveness.

Authors: We agree that explicitly addressing potential overlaps and justifying the comprehensiveness is important. In the revised manuscript, we will include a new subsection in the framework section that discusses the rationale for selecting these 11 dimensions, drawing on key references from the broader AI ethics literature (e.g., works on AI principles from OECD, UNESCO). We will also analyze overlaps, such as between transparency and explainability, explaining how they are distinguished in our framework: transparency refers to openness about system operations, while explainability focuses on providing understandable reasons for outputs. This will help substantiate the claim of comprehensiveness.
Revision: yes

Referee: Policy comparison section: While the analysis of policies from South Korea, the US, the EU, and China is informative, the manuscript does not explicitly derive or map the 11 dimensions and their indicators from specific policy elements, leaving the integration of these sources into the framework opaque and difficult to verify.

Authors: We thank the referee for pointing this out. To improve transparency, we will revise the policy comparison section to include explicit mappings. This could be achieved by adding a summary table or detailed explanations linking specific policy elements from each jurisdiction to the corresponding dimensions in our framework. For example, we will highlight how the EU's AI Act informs the accountability and safety dimensions, and similarly for other regions. This will make the derivation process clearer and verifiable.
Revision: yes
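
One way to picture the promised mapping table is as a small lookup structure with a gap check. In the sketch below, the only mapping taken from the text is the EU AI Act informing accountability and safety. The names POLICY_MAP, coverage_gaps, and as_table are hypothetical, and rows for the other jurisdictions are deliberately left out because the page does not state them.

```python
# Dimension and jurisdiction names come from the paper as summarized on this page.
DIMENSIONS = (
    "fairness", "transparency", "accountability", "safety", "privacy",
    "accuracy", "consistency", "robustness", "explainability",
    "copyright and intellectual property protection", "source traceability",
)
JURISDICTIONS = ("South Korea", "United States", "European Union", "China")

# (jurisdiction, policy element) -> dimensions it informs.
# Only this one row is stated in the rebuttal; all other rows are unknown here.
POLICY_MAP = {
    ("European Union", "AI Act"): ("accountability", "safety"),
}


def coverage_gaps(policy_map):
    """List dimensions and jurisdictions that have no mapping yet."""
    mapped_dims = {d for dims in policy_map.values() for d in dims}
    unknown = mapped_dims - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions in mapping: {sorted(unknown)}")
    mapped_jurisdictions = {jurisdiction for jurisdiction, _ in policy_map}
    return {
        "dimensions": [d for d in DIMENSIONS if d not in mapped_dims],
        "jurisdictions": [j for j in JURISDICTIONS if j not in mapped_jurisdictions],
    }


def as_table(policy_map):
    """Render the mapping as plain-text rows, one per policy element."""
    lines = ["jurisdiction | policy element | dimensions"]
    for (jurisdiction, element), dims in sorted(policy_map.items()):
        lines.append(f"{jurisdiction} | {element} | {', '.join(dims)}")
    return "\n".join(lines)


gaps = coverage_gaps(POLICY_MAP)
table = as_table(POLICY_MAP)
```

Running the gap check on the partial mapping makes the referee's concern concrete: until every dimension traces back to at least one policy element, the derivation cannot be fully verified.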

Circularity Check

0 steps flagged

Framework synthesized from external policies and literature without self-referential reduction.

The manuscript builds its 11-dimension evaluation framework via comparative review of South Korean, US, EU, and Chinese AI ethics policies plus general literature on generative AI risks. Dimensions and indicators are explicitly drawn from these external sources rather than from any internal definitions, fitted parameters, or self-citation chains that would render the output equivalent to the input by construction. No equations, uniqueness theorems, or predictions are present that collapse back onto the paper's own assumptions; the contribution is therefore a synthesis that remains independent of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central proposal rests on the domain assumption that the enumerated dimensions comprehensively address generative AI ethics and that the derived indicators can be operationalized without further empirical grounding.

axioms (1)

Domain assumption: The eleven dimensions (fairness, transparency, accountability, safety, privacy, accuracy, consistency, robustness, explainability, copyright and intellectual property protection, and source traceability) are the key and sufficient criteria for evaluating ethics and trustworthiness.
Invoked when the framework is presented as comprehensive without explicit justification for the selection or exclusion of other possible criteria.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.

identifies key dimensions for evaluating the ethics and trustworthiness of generative AI—fairness, transparency, accountability, safety, privacy, accuracy, consistency, robustness, explainability, copyright and intellectual property protection, and source traceability—and develops detailed indicators and assessment methodologies for each

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat recovery · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.

The proposed framework applies across the AI lifecycle and integrates technical assessments with multidisciplinary perspectives

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages · 1 internal anchor

[1] Jeong, C. (2023). A Study on the Implementation of Generative AI Services Using an Enterprise Data-Based LLM Application Architecture. Advances in Artificial Intelligence and Machine Learning, 3(4), 1588-1618. https://dx.doi.org/10.54364/AAIML.2023.1191
[2] Jeong, C. (2023). Generative AI service implementation using LLM application architecture: based on RAG model and LangChain framework. Journal of Intelligence and Information Systems, 19(4), 129-164. https://doi.org/10.13088/jiis.2023.29.4.129
[3] Ziv, B.Z. Nature. (2025, July 1). Why we need mandatory safeguards for emotionally responsive AI. Retrieved from https://www.nature.com/articles/d41586-025-02031-w
[4] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative Adversarial Networks. Advances in Neural Information Processing Systems, 27.
[5] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems, 30.
[6] An, J. & Park, H. (2023). Development of a case-based nursing education program using generative artificial intelligence. Journal of Korean Academy of Nursing Education, 29(3), 234–246. https://doi.org/10.5977/jkasne.2023.29.3.234
[7] Adam, M., Wessel, M., & Benlian, A. (2021). AI-based chatbots in customer service and their effects on user compliance. Electronic Markets, 31(2), 427-445.
[8] Przegalinska, A., Ciechanowski, L., Stroz, A., Gloor, P., & Mazurek, G. (2019). In bot we trust: A new methodology of chatbot performance measures. Business Horizons, 62(6), 785-797.
[9] Park, E. (2024). The effects of customers’ regulatory focus and familiarity with generative AI-based chatbots on their intention to disclose personal information: Focusing on privacy calculus theory. Knowledge Management Research, 25(2), 49–68. https://doi.org/10.15813/kmr.2024.25.2.003
[10] Sánchez-Díaz, X., Ayala-Bastidas, G., Fonseca-Ortiz, P., Garrido, L. (2018). A Knowledge-Based Methodology for Building a Conversational Chatbot as an Intelligent Tutor. Advances in Computational Intelligence, Vol. 11289, 165-175. https://doi.org/10.1007/978-3-030-04497-8_14
[11] Jeong, C. & Jeong, J. (2020). A study on building AI chatbots for the post-COVID-19 untact era. Journal of the Korea Institute of IT Service, 19(4), 31–47. https://doi.org/10.9716/KITS.2020.19.4.031
[12] Yang, et al., “Foundation Models for Decision Making: Problems, Methods, and Opportunities”, 2023. arXiv preprint arXiv:2303.04129
[13] Karpas, E., et al., “AgentBench: Evaluating LLMs as Agents”, 2023. arXiv preprint arXiv:2308.03688
[14] Jeong, C. (2025). Beyond Text: Implementing Multimodal Large Language Model-Powered Multi-Agent Systems Using a No-Code Platform. Journal of Intelligence and Information Systems, 31(1), 191-231. https://doi.org/10.13088/jiis.2025.31.1.191
[15] Jeong, C. (2025). A Practical MCP×A2A Integration Framework for Interoperability in LLM-Based Autonomous Multi-Agent Systems. Journal of Intelligence and Information Systems, 31(3), 141-170. https://dx.doi.org/10.13088/jiis.2025.31.3.141
[16] Jeong, C. S., Sim, S. M., Cho, H. Y., Kim, S. S., & Shin, B. K. (2025). E2E Process Automation Leveraging Generative AI and IDP-Based Automation Agent: A Case Study on Corporate Expense Processing. Artificial Intelligence and Applications. https://doi.org/10.47852/bonviewAIA52026307
[17] Schlagwein, D., & Willcocks, L. (2023). ‘ChatGPT et al.’: The ethics of using (generative) artificial intelligence in research and science. Journal of Information Technology, 38(3), 232-238.
[18] Teubner, T., Flath, C. M., Weinhardt, C., van der Aalst, W., & Hinz, O. (2023). Willkommen im Zeitalter von ChatGPT & Co.: Die Chancen großer Sprachmodelle. Business & Information Systems Engineering, 65(2), 95-101.
[19] National Research Foundation of Korea. (2025, July 5). Researchers' Perceptions on Generative AI and Research Ethics. Retrieved from https://kenss.or.kr/board/data/article/252645
[20] European Commission. (2019). Ethics Guidelines for Trustworthy AI. High-Level Expert Group on Artificial Intelligence.
[21] National Institute of Standards and Technology (NIST). (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). NIST AI 100-1.
[22] Korea Institute for Industrial Technology Promotion. (2025, July 5). China Establishes Next-Generation AI Ethical Guidelines (China Ministry of Science and Technology, September 26). Retrieved from https://eiec.kdi.re.kr/policy/domesticView.do?ac=0000159203&issus=
[23] Ministry of Science and ICT. (2020). Principles of Artificial Intelligence Ethics Centered on People.
[24] SPRi. (2025, April 30). Research on AI Reliability and Ethical Systems. Retrieved from https://spri.kr/posts/view/23864?code=research
[25] Korea Information Society Development Institute (KISDI). (2023). Self-Checklist for Implementing the 2023 AI Ethics Guidelines. Retrieved from https://www.kisdi.re.kr/bbs/view.do?bbsSn=114068&key=m2101113055944
[26] The White House. (2022). Blueprint for an AI Bill of Rights.
[27] State Council of the People's Republic of China. (2017). New Generation Artificial Intelligence Development Plan.
[28] Administration of Cyberspace of China (CAC). (2022). Regulations on the Management of Algorithm Recommendation Services.
[29] Lee, J. (2022). A Study on the Ethics Policy of Artificial Intelligence (AI) in China. The Korean Association of Chinese Studies, no. 80, 69–87. http://dx.doi.org/10.14378/KACS.2022.80.80.4
[30] FAIR AI. (2025, July 5). AI Ethics Guidelines by Country. Retrieved from https://fairai.or.kr/updates/guidelines
[31] Jeong, C. (2025). Design and Evaluation Methods for LLM-Based Explainable AI (XAI)-Based Human-AI Collaboration Systems. Advances in Artificial Intelligence and Machine Learning, 5(3), 4308-4341. https://dx.doi.org/10.54364/AAIML.2025.53240
[32] Jacoby, J., & Matell, M. S. (1971). Three-Point Likert Scales Are Good Enough. Journal of Marketing Research, 8(4), 495-
[33] https://doi.org/10.2307/3150242
[34] Firstpagesage. (2025, September 11). Top Generative AI Chatbots by Market Share – September 2025. Retrieved from https://firstpagesage.com/reports/top-generative-ai-chatbots/
