Evaluation of AI Ethics Tools in Language Models: A Developers' Perspective Case Stud

arXiv: 2512.15791 · v1 · submitted 2025-12-16 · cs.CY · cs.AI · cs.CL

Jhessica Silva, Diego A. B. Moreira, Gabriel O. dos Santos, Alef Ferreira, Helena Maia, Sandra Avila, Helio Pedrini

Pith reviewed 2026-05-16 22:34 UTC · model grok-4.3

keywords AI ethics tools · language models · Portuguese language · developers perspective · Model Cards · ALTAI · FactSheets · Harms Modeling

The pith

AI ethics tools guide developers on general language model issues but miss language-specific aspects such as idiomatic expressions, and they did not help surface negative impacts for Portuguese.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper surveys 213 AI ethics tools, narrows to four (Model Cards, ALTAI, FactSheets, Harms Modeling), and applies them to Portuguese language models through 35 hours of developer interviews. It concludes these tools offer a useful starting framework for broad ethical questions yet overlook model-unique features such as idiomatic expressions and fail to surface negative impacts on the Portuguese language. A reader would care because many AI systems now serve non-English users, and incomplete tools could leave cultural and linguistic risks unaddressed during development.

Core claim

The applied AIETs serve as a guide for formulating general ethical considerations about language models. However, they do not address unique aspects of these models, such as idiomatic expressions. Additionally, these AIETs did not help to identify potential negative impacts of models for the Portuguese language.

What carries the argument

Selection of four AIETs after screening 213 publications, followed by their direct application to Portuguese language models and structured interviews with developers to rate usefulness and gaps.
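
The screening step described above can be pictured as a simple filter over the surveyed tools. The following is a minimal Python sketch, not the authors' procedure: the abstract does not state the actual inclusion and exclusion criteria, so the `Tool` record, the two criteria (`has_documentation`, `applies_to_language_models`), and the flag values on the example entries are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Tool:
    """One surveyed AI ethics tool (fields are hypothetical)."""
    name: str
    has_documentation: bool           # hypothetical inclusion criterion
    applies_to_language_models: bool  # hypothetical inclusion criterion


Criterion = Callable[[Tool], bool]


def screen(tools: List[Tool], criteria: List[Criterion]) -> List[Tool]:
    """Keep only the tools that satisfy every criterion."""
    return [tool for tool in tools if all(check(tool) for check in criteria)]


# Hypothetical criteria; the paper's real ones are not given in the abstract.
criteria: List[Criterion] = [
    lambda tool: tool.has_documentation,
    lambda tool: tool.applies_to_language_models,
]

# Illustrative entries only; the flag values are assumptions, not survey data.
candidates = [
    Tool("Model Cards", True, True),
    Tool("Hypothetical Tool A", False, True),
    Tool("Hypothetical Tool B", True, False),
]

selected = screen(candidates, criteria)
selected_names = [tool.name for tool in selected]
print(selected_names)
```

Expressing each criterion as a separate predicate keeps the screening auditable: a reader can see which rule excluded which tool, which is the kind of detail the referee report asks the authors to document.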

If this is right

Developers working on language models in specific languages will need supplementary methods to catch harms not covered by current AIETs.
AI ethics tools require updates to incorporate checks for idiomatic expressions and cultural or linguistic nuances.
Evaluations of AIETs should routinely include developers from underrepresented languages rather than relying on general-purpose testing.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Gaps found here may appear in other low-resource languages, suggesting standard tools favor dominant-language assumptions.
Tool creators could test revised versions by repeating the developer-interview process with new language-specific prompts.

Load-bearing premise

That interviews with the developers of these specific Portuguese models, using only the four chosen tools, provide enough evidence to judge how well AI ethics tools work for language models in general.

What would settle it

A follow-up study that applies the same four tools to English or other language models and finds they successfully surface unique aspects and negative impacts would contradict the reported limitations.

Original abstract

In Artificial Intelligence (AI), language models have gained significant importance due to the widespread adoption of systems capable of simulating realistic conversations with humans through text generation. Because of their impact on society, developing and deploying these language models must be done responsibly, with attention to their negative impacts and possible harms. In this scenario, the number of AI Ethics Tools (AIETs) publications has recently increased. These AIETs are designed to help developers, companies, governments, and other stakeholders establish trust, transparency, and responsibility with their technologies by bringing accepted values to guide AI's design, development, and use stages. However, many AIETs lack good documentation, examples of use, and proof of their effectiveness in practice. This paper presents a methodology for evaluating AIETs in language models. Our approach involved an extensive literature survey on 213 AIETs, and after applying inclusion and exclusion criteria, we selected four AIETs: Model Cards, ALTAI, FactSheets, and Harms Modeling. For evaluation, we applied AIETs to language models developed for the Portuguese language, conducting 35 hours of interviews with their developers. The evaluation considered the developers' perspective on the AIETs' use and quality in helping to identify ethical considerations about their model. The results suggest that the applied AIETs serve as a guide for formulating general ethical considerations about language models. However, we note that they do not address unique aspects of these models, such as idiomatic expressions. Additionally, these AIETs did not help to identify potential negative impacts of models for the Portuguese language.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a narrow developer-interview case study showing four ethics tools miss Portuguese-specific issues like idioms, but the methods details are too thin to verify the claims.

The main thing to know is that the paper applies four ethics tools (Model Cards, ALTAI, FactSheets, Harms Modeling) to Portuguese language models and reports, based on 35 hours of developer interviews, that the tools guide only general ethical considerations and miss language-specific problems such as idiomatic expressions and negative impacts on Portuguese. It starts from a survey of 213 tools, narrows them with inclusion and exclusion criteria, then gathers developer perspectives on usability for their own models. That developer lens and the non-English focus are the actual new pieces; most prior work on these tools stays at the English or abstract level. The survey itself is a reasonable way to pick the four, and the qualitative approach fits the goal of capturing real-world use.

The soft spots sit in the missing mechanics. The abstract gives no interview protocol, no participant count, no step-by-step description of how each tool was run on the models, and no coding or saturation details for the responses. Without those, the central claim that the tools failed to surface Portuguese issues rests on uncheckable interpretations of the interviews. It is possible that the evaluation simply did not ask the right questions, rather than the tools being inherently blind to the issues. The scope is also small (one language, four tools), so the findings stay incremental.

This paper is for practitioners building or auditing ethics tools for non-English models and for researchers who want an empirical data point on tool gaps. It deserves a serious referee because the topic is timely and the developer feedback could be useful if the methods are filled in properly. I would send it to review rather than desk reject, with the expectation that the authors supply the missing protocol and analysis steps.

Referee Report

2 major / 1 minor

Summary. The paper surveys 213 AI Ethics Tools (AIETs), applies inclusion/exclusion criteria to select four (Model Cards, ALTAI, FactSheets, Harms Modeling), applies them to Portuguese-language models, and reports findings from 35 hours of developer interviews. It claims these tools guide only general ethical considerations and fail to address unique aspects such as idiomatic expressions or negative impacts specific to the Portuguese language.

Significance. If the findings are substantiated with transparent methods, the work would usefully map the AIET landscape via the 213-tool survey and provide practical developer perspectives on tool limitations for non-English models, highlighting the need for linguistically and culturally tailored ethics guidance.

major comments (2)

[Methods] No details are supplied on the interview protocol, participant count or selection, exact procedure for applying each of the four AIETs to the models, qualitative analysis approach (coding scheme, saturation criteria), or how exclusion criteria were operationalized on the 213 tools. This absence directly undermines verifiability of the central claim that the tools failed to surface Portuguese-specific issues.

[Results/Discussion] The claim that the AIETs 'did not help to identify potential negative impacts of models for the Portuguese language' rests entirely on unverified developer self-reports. No cross-validation against model outputs, external audits, or documented harms is presented, making the evidence load-bearing for the failure conclusion but insufficiently grounded.

minor comments (1)

[Title] Title is truncated ('Case Stud'); complete to 'Case Study'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive comments. We address each major point below and will revise the manuscript to improve transparency and qualify our claims where appropriate.

Referee: [Methods] No details are supplied on the interview protocol, participant count or selection, exact procedure for applying each of the four AIETs to the models, qualitative analysis approach (coding scheme, saturation criteria), or how exclusion criteria were operationalized on the 213 tools. This absence directly undermines verifiability of the central claim that the tools failed to surface Portuguese-specific issues.

Authors: We agree that the current Methods section lacks sufficient detail for full verifiability. In the revised manuscript we will expand it to describe the semi-structured interview protocol, the number of developer participants and their selection criteria, the step-by-step procedure for applying each of the four AIETs, the qualitative coding scheme and saturation criteria used, and the precise operationalization of the inclusion/exclusion criteria applied to the 213 tools.

Revision: yes

Referee: [Results/Discussion] The claim that the AIETs 'did not help to identify potential negative impacts of models for the Portuguese language' rests entirely on unverified developer self-reports. No cross-validation against model outputs, external audits, or documented harms is presented, making the evidence load-bearing for the failure conclusion but insufficiently grounded.

Authors: The study is framed from the developers' perspective, so developer self-reports constitute the primary data. We acknowledge the absence of external validation as a limitation. In the revision we will add an explicit discussion of this limitation, include additional illustrative quotes from the interviews showing Portuguese-specific concerns raised by developers, and qualify the relevant claims to reflect that they are based on developer perceptions rather than independent verification.

Revision: partial

Circularity Check

0 steps flagged

No significant circularity; claims rest on external survey and interviews

The paper conducts a literature survey of 213 AIETs, applies inclusion/exclusion criteria to select four tools, and evaluates them via 35 hours of developer interviews on Portuguese-language models. All central claims (that the tools guide only general ethics and miss idiomatic expressions or Portuguese-specific impacts) are presented as direct outcomes of this empirical process. No equations, fitted parameters, self-definitional reductions, or load-bearing self-citations appear in the abstract or described methodology. The derivation chain is self-contained against external benchmarks and does not reduce any result to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that the selected tools are representative after screening and that interview data validly measures tool effectiveness for language-specific issues.

axioms (1)

domain assumption: The four selected AIETs are representative of tools that should address language-specific ethical concerns.
Stated in the abstract, after the inclusion/exclusion criteria are applied to the 213 tools.

pith-pipeline@v0.9.0 · 5618 in / 1156 out tokens · 62148 ms · 2026-05-16T22:34:03.956172+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We applied AIETs to language models developed for the Portuguese language, conducting 35 hours of interviews with their developers... they do not address unique aspects of these models, such as idiomatic expressions."

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "The results suggest that the applied AIETs serve as a guide for formulating general ethical considerations about language models."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

300 extracted references · 300 canonical work pages · 5 internal anchors

[1]

OpenAI.: Introducing ChatGPT. 2022. https://openai.com/blog/chatgpt. Available from: https: //openai.com/blog/chatgpt

work page 2022
[2]

Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data

Bender EM, Koller A. Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data. In: Jurafsky D, Chai J, Schluter N, Tetreault J, editors. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online: Association for Computational Linguistics

work page
[3]

5185–5198

p. 5185–5198. Available from: https://aclanthology.org/2020.acl-main.463/

work page 2020
[4]

The Social Impact of Natural Language Processing

Hovy D, Spruit SL. The Social Impact of Natural Language Processing. In: Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers); 2016. p. 591–598

work page 2016
[5]

On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In: ACM Conference on Fairness, Accountability, and Transparency

Bender EM, Gebru T, McMillan-Major A, Shmitchell S. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In: ACM Conference on Fairness, Accountability, and Transparency

work page
[6]

Ethical and Social Risks of Harm from Language Models

Weidinger L, Mellor J, Rauh M, Griffin C, Uesato J, Huang PS, et al. Ethical and Social Risks of Harm from Language Models. arXiv:211204359. 2021 Dec

work page 2021
[7]

Lost in Translation: Large Language Models in Non-English Content Analysis

Nicholas G, Bhatia A. Lost in Translation: Large Language Models in Non-English Content Analysis. arXiv:230607377. 2023;[cs.CL]

work page 2023
[8]

CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages

Santos GO, Moreira DAB, Ferreira AI, Silva J, Pereira L, Bueno P, et al. CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages. In: Workshop on Multi-lingual Representation Learning (MRL), Conference on Empirical Methods in Natural Language Processing (EMNLP); 2023. p. 184–207

work page 2023
[9]

The Ghost in the Machine Has an American Accent: Value Conflict in GPT-3

Johnson RL, Pistilli G, Men´ edez-Gonz´ alez N, Duran LDD, Panai E, Kalpokiene J, et al. The Ghost in the Machine Has an American Accent: Value Conflict in GPT-3. arXiv:220307785. 2022 Mar

work page 2022
[10]

Five sources of bias in natural language processing

Hovy D, Prabhumoye S. Five sources of bias in natural language processing. Language and Linguistics Compass. 2021;15(8):e12432

work page 2021
[11]

Teaching ethics in computing: a systematic literature review of ACM computer science education publications

Brown N, Xie B, Sarder E, Fiesler C, Wiese ES. Teaching ethics in computing: a systematic literature review of ACM computer science education publications. ACM Transactions on Computing Education. 2024;24(1):1–36

work page 2024
[12]

Integrating ethics into computer science education: Multi-, inter-, and transdisciplinary approaches

Goetze TS. Integrating ethics into computer science education: Multi-, inter-, and transdisciplinary approaches. In: 54th ACM Technical Symposium on Computer Science Education; 2023. p. 645–651. 29

work page 2023
[13]

Model Cards for Model Reporting

Mitchell M, Wu S, Zaldivar A, Barnes P, Vasserman L, Hutchinson B, et al. Model Cards for Model Reporting. In: Conference on Fairness, Accountability, and Transparency; 2019. p. 220–229

work page 2019
[14]

The Assessment List for Trustworthy Artificial Intelligence (ALTAI)

High-Level Expert Group on Artificial Intelligence. The Assessment List for Trustworthy Artificial Intelligence (ALTAI). Brussels: European Commission; 2020. Available from: https://digital-strategy. ec.europa.eu/pt/node/806

work page 2020
[15]

Microsoft.: Harms Modeling - Azure Application Architecture Guide. 2022. Available from: https: //learn.microsoft.com/en-us/azure/architecture/guide/responsible-innovation/harms-modeling/

work page 2022
[16]

doi: 10.1147/JRD.2019.2942288

Arnold M, Bellamy RKE, Hind M, Houde S, Mehta S, Mojsilovi´ c A, et al. FactSheets: Increasing trust in AI services through supplier’s declarations of conformity. IBM Journal of Research and Development. 2019;63(4/5):6:1–6:13. https://doi.org/10.1147/JRD.2019.2942288

work page doi:10.1147/jrd.2019.2942288 2019
[17]

Putting AI Ethics to Work: Are the Tools Fit for Purpose? AI and Ethics

Ayling J, Chapman A. Putting AI Ethics to Work: Are the Tools Fit for Purpose? AI and Ethics. 2022;2(3):405–429

work page 2022
[18]

Seeing Like a Toolkit: How Toolkits Envision the Work of AI Ethics

Wong RY, Madaio MA, Merrill N. Seeing Like a Toolkit: How Toolkits Envision the Work of AI Ethics. ACM on Human-Computer Interaction. 2023;7(CSCW1):1–27

work page 2023
[19]

From Ethical AI Frameworks to Tools: A Review of Approaches

Prem E. From Ethical AI Frameworks to Tools: A Review of Approaches. AI and Ethics. 2023;3:1–18

work page 2023
[20]

No Such Thing as One-Size-Fits-All in AI Ethics Frameworks: A Comparative Case Study

Qiang V, Rhim J, Moon Aj. No Such Thing as One-Size-Fits-All in AI Ethics Frameworks: A Comparative Case Study. AI & Society. 2023;6:1–20

work page 2023
[21]

From What to How: An Initial Review of Publicly Available AI Ethics Tools, Methods and Research to Translate Principles into Practices

Morley J, Floridi L, Kinsey L, Elhalal A. From What to How: An Initial Review of Publicly Available AI Ethics Tools, Methods and Research to Translate Principles into Practices. Science and Engineering Ethics. 2020;26(4):2141–2168

work page 2020
[22]

A ‘Biased’ Emerging Governance Regime for Artificial Intelligence? How AI Ethics Get Skewed Moving from Principles to Practices

Palladino N. A ‘Biased’ Emerging Governance Regime for Artificial Intelligence? How AI Ethics Get Skewed Moving from Principles to Practices. Telecommunications Policy. 2022;47(5):102479

work page 2022
[23]

Applying the ethics of AI: a systematic review of tools for developing and assessing AI-based systems

Ortega-Bola˜ nos R, Bernal-Salcedo J, Germ´ an Ortiz M, Galeano Sarmiento J, Ruz GA, Tabares-Soto R. Applying the ethics of AI: a systematic review of tools for developing and assessing AI-based systems. Artificial Intelligence Review. 2024;57(5):110. https://doi.org/https://doi.org/10.1007/ s10462-024-10740-3

work page 2024
[24]

IDEO.: IDEO’s AI Ethics Cards. 2019. Available from: https://www.ideo.com/journal/ ai-needs-an-ethical-compass-this-tool-can-help

work page 2019
[25]

Corporate digital responsibility

Lobschat L, Mueller B, Eggers F, Brandimarte L, Diefenbach S, Kroschke M, et al. Corporate digital responsibility. Journal of Business Research. 2021;122:875–888. https://doi.org/10.1016/j.jbusres. 2019.10.006

work page doi:10.1016/j.jbusres 2021
[26]

for Designers E.: Ethics for Designers — The Toolkit. 2017. Available from: https://www. ethicsfordesigners.com/tools

work page 2017
[27]

ACM64, 12 (2021), 86–92

Gebru T, Morgenstern J, Vecchione B, Vaughan JW, Wallach H, III HD, et al. Datasheets for datasets. Communications of the ACM. 2021 Nov;64(12):86–92. https://doi.org/10.1145/3458723

work page doi:10.1145/3458723 2021
[28]

Data statements for natural language processing: Toward mitigating system bias and enabling better science

Bender EM, Friedman B. Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics. 2018;6:587–604

work page 2018
[29]

The Dataset Nutrition Label: A Framework To Drive Higher Data Quality Standards

Holland S, Hosny A, Newman S, Joseph J, Chmielinski K. The Dataset Nutrition Label: A Framework To Drive Higher Data Quality Standards. arXiv:180503677. 2018 May;[cs]

work page 2018
[30]

Aequitas: A Bias and Fairness Audit Toolkit

Saleiro P, Kuester B, Hinkson L, London J, Stevens A, Anisfeld A, et al. Aequitas: A Bias and Fairness Audit Toolkit. arXiv:181105577. 2019;[cs.LG]. 30

work page 2019
[31]

AI Explainability 360 Toolkit

Arya V, Bellamy RKE, Chen PY, Dhurandhar A, Hind M, Hoffman SC, et al. AI Explainability 360 Toolkit. In: 3rd ACM India Joint International Conference on Data Science & Management of Data (8th ACM IKDD CODS & 26th COMAD). CODS-COMAD ’21. New York, NY, USA: Association for Computing Machinery; 2021. p. 376–379

work page 2021
[32]

Research PA.: What-if Tool. 2018. Available from: https://pair-code.github.io/what-if-tool/

work page 2018
[33]

for Ethical AI & Machine Learning TI.: AI-RFX Procurement Framework. 2019. Available from: https://ethical.institute/rfx.html

work page 2019
[34]

Zeno: An Interactive Frame- work for Behavioral Evaluation of Machine Learning

Cabrera AA, Fu E, Bertucci D, Holstein K, Talwalkar A, Hong JI, et al. Zeno: An Interactive Frame- work for Behavioral Evaluation of Machine Learning. In: CHI Conference on Human Factors in Computing Systems. CHI ’23. New York, NY, USA: Association for Computing Machinery; 2023

work page 2023
[35]

Why Should I Trust You?

Ribeiro MT, Singh S, Guestrin C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’16. New York, NY, USA: Association for Computing Machinery; 2016. p. 1135–1144

work page 2016
[36]

A Unified Approach to Interpreting Model Predictions

Lundberg SM, Lee SI. A Unified Approach to Interpreting Model Predictions. In: Advances in Neural Information Processing Systems. vol. 30; 2017

work page 2017
[37]

BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation

Dhamala J, Sun T, Kumar V, Krishna S, Pruksachatkun Y, Chang KW, et al. BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation. In: ACM Conference on Fair- ness, Accountability, and Transparency. FAccT ’21. New York, NY, USA: Association for Computing Machinery; 2021. p. 862–872

work page 2021
[38]

for Ethical AI & Machine Learning TI.: XAI - An eXplainability toolbox for machine learning. 2021. Available from: https://github.com/EthicalML/xai

work page 2021
[39]

Auditing large language models: A three-layered approach

M¨ okander J, Schuett J, Kirk HR, Floridi L. Auditing large language models: A three-layered approach. AI and Ethics. 2024;4(4):1085–1115. https://doi.org/10.1007/s43681-023-00289-2

work page doi:10.1007/s43681-023-00289-2 2024
[40]

Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing

Raji ID, Smart A, White RN, Mitchell M, Gebru T, Hutchinson B, et al. Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing. In: Conference on Fair- ness, Accountability, and Transparency. FAT* ’20. New York, NY, USA: Association for Computing Machinery; 2020. p. 33–44

work page 2020
[41]

A seven-layer model with checklists for standardising fairness assess- ment throughout the AI lifecycle

Agarwal A, Agarwal H. A seven-layer model with checklists for standardising fairness assess- ment throughout the AI lifecycle. AI and Ethics. 2023;4(2):299–314. https://doi.org/10.1007/ s43681-023-00266-9

work page 2023
[42]

PROBAST: a tool to assess the risk of bias and applicability of prediction model studies

Wolff RF, Moons KG, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Annals of Internal Medicine. 2019;170(1):51–58

work page 2019
[43]

Judgment Call the Game: Using Value Sensitive Design and Design Fiction to Surface Ethical Concerns Related to Technology

Ballard S, Chappell KM, Kennedy K. Judgment Call the Game: Using Value Sensitive Design and Design Fiction to Surface Ethical Concerns Related to Technology. In: Designing Interactive Systems Conference; 2019. p. 421–433

work page 2019
[44]

Microsoft.: Community Jury - Azure Application Architecture Guide. 2022. Available from: https: //learn.microsoft.com/en-us/azure/architecture/guide/responsible-innovation/community-jury/

work page 2022
[45]

Privacy T.: TensorFlow Privacy. 2019. Available from: https://github.com/tensorflow/privacy

work page 2019
[46]

Adversarial Robustness Toolbox v1.0.0

Nicolae MI, Sinn M, Tran MN, Buesser B, Rawat A, Wistuba M, et al. Adversarial Robustness Toolbox v1.0.0. arXiv180701069. 2019;[cs.LG]

work page 2019
[47]

Doteveryone.: Consequence Scanning: An Agile event for Responsible Innova- tors. 2019. Available from: https://doteveryone.org.uk/wp-content/uploads/2021/02/ 31 Consequence-Scanning-Agile-Event-Manual-TechTransformed-Doteveryone-2.pdf

work page 2019
[48]

Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems

Kiritchenko S, Mohammad SM. Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems. NAACL HLT 2018. 2018;p. 43

work page 2018
[49]

Should I disclose my dataset? Caveats between reproducibility and individual data rights

Benatti RM, Villarroel CML, Avila S, Colombini EL, Severi F. Should I disclose my dataset? Caveats between reproducibility and individual data rights. In: Natural Legal Language Processing Workshop. Association for Computational Linguistics; 2022. p. 228–237

work page 2022
[50]

TuringBox: An Experimental Plat- form for the Evaluation of AI Systems

Epstein Z, Payne BH, Shen JH, Hong CJ, Felbo B, Dubey A, et al. TuringBox: An Experimental Plat- form for the Evaluation of AI Systems. In: Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18. International Joint Conferences on Artificial Intelligence Organization; 2018. p. 5826–5828

work page 2018
[51]

AI Audit: A Card Game to Reflect on Everyday AI Systems

Ali S, Kumar V, Breazeal C. AI Audit: A Card Game to Reflect on Everyday AI Systems. AAAI Con- ference on Artificial Intelligence. 2024 Jul;37(13):15981–15989. https://doi.org/10.1609/aaai.v37i13. 26897

work page doi:10.1609/aaai.v37i13 2024
[52]

A Survey on Ethical Principles of AI and Implementations

Zhou J, Chen F, Berry A, Reed M, Zhang S, Savage S. A Survey on Ethical Principles of AI and Implementations. In: IEEE Symposium Series on Computational Intelligence. Canberra, Australia: IEEE; 2020. p. 3010–3017

work page 2020
[53]

What’s next for AI ethics, policy, and governance? a global overview

Schiff D, Biddle J, Borenstein J, Laas K. What’s next for AI ethics, policy, and governance? a global overview. In: AAAI/ACM Conference on AI, Ethics, and Society. New York City, NY, USA: ACM

work page
[54]

Translating principles into practices of digital ethics: Five risks of being unethical

Floridi L. Translating principles into practices of digital ethics: Five risks of being unethical. Philosophy & Technology. 2019;32(2):185–193

work page 2019
[55]

Ethics of AI: A systematic literature review of principles and challenges

Khan AA, Badshah S, Liang P, Waseem M, Khan B, Ahmad A, et al. Ethics of AI: A systematic literature review of principles and challenges. In: 26th International Conference on Evaluation and Assessment in Software Engineering. Gothenburg, Sweden: ACM; 2022. p. 383–392

work page 2022
[56]

Worldwide AI ethics: A review of 200 guidelines and recommendations for AI governance

Corrˆ ea NK, Galv˜ ao C, Santos JW, Del Pino C, Pinto EP, Barbosa C, et al. Worldwide AI ethics: A review of 200 guidelines and recommendations for AI governance. Patterns. 2023;4(10)

work page 2023
[57]

Artificial Intelligence: The Global Landscape of Ethics Guidelines

Jobin A, Ienca M, Vayena E. Artificial Intelligence: The Global Landscape of Ethics Guidelines. Nature Machine Intelligence. 2019;1(9):389–399. [cs]

work page 2019
[58]

Artificial Intelligence Ethics Guidelines for Developers and Users: Clarifying Their Content and Normative Implications

Ryan M, Stahl BC. Artificial Intelligence Ethics Guidelines for Developers and Users: Clarifying Their Content and Normative Implications. Journal of Information, Communication and Ethics in Society. 2020 Jan;19(1):61–86

work page 2020
[59]

Responsible AI: Two Frameworks for Ethical Design Practice

Peters D, Vold K, Robinson D, Calvo RA. Responsible AI: Two Frameworks for Ethical Design Practice. IEEE Transactions on Technology and Society. 2020;1(1):34–47

work page 2020
[60]

High-Level Expert Group on Artificial Intelligence.: Ethics Guidelines for Trustworthy AI. 2019. Available from: https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai

work page 2019
[61]

Anderson D, Bonaguro J, McKinney M, Nicklin A, Wiseman J.: Ethics & Algorithms Toolkit (beta)

work page
[62]

Available from: https://ethicstoolkit.ai/

work page
[63]

Treasury Board of Canada.: Algorithmic Impact Assessment Tool. 2021. Available from: https://www.canada.ca/en/government/system/digital-government/digital-government-innovations/ responsible-use-ai/algorithmic-impact-assessment.html

work page 2021
[64]

Community TW.: The Turing Way: A Handbook for Reproducible, Ethical and Collaborative Research. Zenodo. 2021. 32

work page 2021
[65]

Algorithmic Impact Assessments: A Practical Framework for Public Agency Accountability

Reisman D, Schultz J, Crawford K, Whittaker M. Algorithmic Impact Assessments: A Practical Framework for Public Agency Accountability. AI Now Intitute. 2018;1(1):1–22

work page 2018
[66]

Explanations based on the missing: Towards contrastive explanations with pertinent negatives

Dhurandhar A, Chen PY, Luss R, Tu CC, Ting P, Shanmugam K, et al. Explanations based on the missing: Towards contrastive explanations with pertinent negatives. Advances in Neural Information Processing Systems. 2018;31

work page 2018
[67]

Advbox: A toolbox to generate adversarial examples that fool neural networks

Goodman D, Xin H, Yang W, Yuesheng W, Junfeng X, Huan Z. Advbox: A toolbox to generate adversarial examples that fool neural networks. arXiv:2001.05574. 2020;[cs.LG]

[68]

Fairness in Design: A Framework for Facilitating Ethical Artificial Intelligence Designs

Zhang J, Shu Y, Yu H. Fairness in Design: A Framework for Facilitating Ethical Artificial Intelligence Designs. International Journal of Crowd Science. 2023;7(1):32–39. https://doi.org/10.26599/IJCS.2022.9100033

[69]

AI Privacy Toolkit

Goldsteen A, Saadi O, Shmelkin R, Shachor S, Razinkov N. AI Privacy Toolkit. SoftwareX. 2023;22:101352. https://doi.org/10.1016/j.softx.2023.101352

[70]

ICO.: Guide to the UK General Data Protection Regulation (UK GDPR). 2020

[71]

OECD Framework for the Classification of AI Systems

OECD Digital Economy Papers.: OECD Framework for the Classification of AI Systems. Paris: OECD Publishing. 2022

[72]

World Economic Forum.: AI Procurement in a Box. 2020. Available from: https://www3.weforum.org/docs/WEF_AI_Procurement_in_a_Box_Project_Overview_2020.pdf

[73]

The Institute for Ethical AI & Machine Learning.: Machine Learning Maturity Model, AI & Machine Learning Solutions. 2019. Available from: https://ethical.institute/mlmm

[74]

Guillou P.: GPorTuguese-2 (Portuguese GPT-2 small): a Language Model for Portuguese text generation (and more NLP tasks...). 2020. Available from: https://huggingface.co/pierreguillou/gpt2-small-portuguese

[75]

PTT5: Pretraining and validating the T5 model on Brazilian Portuguese data

Carmo D, Piau M, Campiotti I, Nogueira R, Lotufo R. PTT5: Pretraining and validating the T5 model on Brazilian Portuguese data. arXiv:2008.09144. 2020 Oct

[76]

BERTimbau: Pretrained BERT Models for Brazilian Portuguese

Souza F, Nogueira R, Lotufo R. BERTimbau: Pretrained BERT Models for Brazilian Portuguese. In: Cerri R, Prati RC, editors. Intelligent Systems. Cham: Springer International Publishing; 2020. p. 403–417

[77]

BioBERTpt - A Portuguese Neural Language Model for Clinical Named Entity Recognition

Schneider ETR, de Souza JVA, Knafou J, Oliveira LESe, Copara J, Gumiel YB, et al. BioBERTpt - A Portuguese Neural Language Model for Clinical Named Entity Recognition. In: 3rd Clinical Natural Language Processing Workshop; 2020. p. 65–72

[78]

BERTaú: Itaú BERT for Digital Customer Service

Finardi P, Viegas JD, Ferreira GT, Mansano AF, Caridá VF. BERTaú: Itaú BERT for Digital Customer Service. arXiv:2101.12015. 2021 Jul

[79]

A GPT-2 Language Model for Biomedical Texts in Portuguese

Schneider ETR, de Souza JVA, Gumiel YB, Moro C, Paraiso EC. A GPT-2 Language Model for Biomedical Texts in Portuguese. In: IEEE 34th International Symposium on Computer-Based Medical Systems; 2021. p. 474–479

[80]

LegalNLP – Natural Language Processing methods for the Brazilian Legal Language

Polo FM, Mendonça GCF, Parreira KCJ, Gianvechio L, Cordeiro P, Ferreira JB, et al. LegalNLP – Natural Language Processing methods for the Brazilian Legal Language. arXiv:2110.15709. 2021;[cs.CL]

