LicenseGPT: A Fine-tuned Foundation Model for Publicly Available Dataset License Compliance

Ahmed E. Hassan; Dan Li; Gopi Krishnan Rajbahadur; Jianshan Lin; Jingwen Tan; Xiangfu Song; Zibin Zheng; Zi Li

arXiv: 2501.00106 · v2 · submitted 2024-12-30 · cs.SE · cs.AI

Jingwen Tan, Gopi Krishnan Rajbahadur, Zi Li, Xiangfu Song, Jianshan Lin, Dan Li, Zibin Zheng, Ahmed E. Hassan

Pith reviewed 2026-05-23 05:58 UTC · model grok-4.3

keywords: dataset license compliance · fine-tuned foundation model · legal AI · license interpretation · software intellectual property · prediction agreement · user study

The pith

LicenseGPT, a model fine-tuned on 500 expert-annotated licenses, achieves 64.3% prediction agreement on dataset license compliance and reduces analysis time by over 94%.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces LicenseGPT to help interpret ambiguous dataset licenses for commercial AI development. The best existing legal foundation model reaches only 43.75% agreement with expert judgments. By fine-tuning on a curated set of 500 licenses labeled by legal experts, LicenseGPT reaches 64.30% agreement. An A/B test and user study with software IP lawyers show it cuts analysis time from 108 seconds to 6 seconds per license while maintaining accuracy. Lawyers see it as a useful aid that still requires human review for hard cases.

Core claim

LicenseGPT is created by fine-tuning a foundation model on 500 licenses annotated by legal experts. It improves Prediction Agreement from the best legal FM's 43.75% to 64.30%. In A/B tests and user studies, it reduces the time for software IP lawyers to analyze each license from 108 seconds to 6 seconds without loss of accuracy.

What carries the argument

LicenseGPT, a fine-tuned foundation model trained on expert-annotated dataset licenses to predict compliance.

If this is right

LicenseGPT outperforms both specialized legal models and general-purpose models on the task.
Analysis time drops by 94.44% per license in controlled tests.
Lawyers perceive the tool as valuable but still require oversight for complex cases.
The model provides a publicly available resource for practitioners.
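
The headline figures can be checked with simple arithmetic. The sketch below recomputes the relative time reduction and the PA gain from the numbers quoted in the abstract. The function name `relative_reduction` is ours for illustration and does not come from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction when going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Figures quoted in the abstract.
time_before_s, time_after_s = 108.0, 6.0
pa_baseline, pa_licensegpt = 43.75, 64.30

time_cut_pct = round(100 * relative_reduction(time_before_s, time_after_s), 2)
pa_gain_points = round(pa_licensegpt - pa_baseline, 2)

print(f"time reduction: {time_cut_pct}%")              # time reduction: 94.44%
print(f"PA gain: {pa_gain_points} percentage points")  # PA gain: 20.55 percentage points
```

Note that the PA improvement is a difference in percentage points, while the time figure is a relative reduction, so the two numbers are not directly comparable.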

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar fine-tuning approaches could apply to other areas of legal document analysis beyond licenses.
Expanding the annotated dataset beyond 500 examples might further improve performance on rare license types.
Integration into automated pipelines could change how datasets are selected for AI training.

Load-bearing premise

The 500 expert-annotated licenses represent the full range of ambiguities found in real-world dataset licenses, and the prediction agreement metric reflects actual legal usefulness.

What would settle it

Evaluating LicenseGPT on a fresh collection of 100 dataset licenses not included in the original 500 annotations, measured against new expert judgments.
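
A check of this kind is easier to interpret with an uncertainty band attached. The sketch below assumes PA is the plain fraction of licenses where the model's label matches the expert's label (the abstract does not define the metric) and attaches a 95% Wilson score interval. The labels, counts, and function names are hypothetical.

```python
import math


def agreement(predicted: list[str], expert: list[str]) -> float:
    """Fraction of items where the model label equals the expert label."""
    if len(predicted) != len(expert) or not expert:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(p == e for p, e in zip(predicted, expert))
    return matches / len(expert)


def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical held-out set: the model agrees with experts on 64 of 100 licenses.
expert = ["commercial-ok"] * 100
predicted = ["commercial-ok"] * 64 + ["not-ok"] * 36

pa = agreement(predicted, expert)
low, high = wilson_interval(pa, len(expert))
print(f"PA = {pa:.2%}, 95% interval = [{low:.2%}, {high:.2%}]")
```

With these made-up counts the interval is roughly 18 percentage points wide, so a 100-license check would confirm a large gain more readily than a small one.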

Figures

Figures reproduced from arXiv: 2501.00106 by Ahmed E. Hassan, Dan Li, Gopi Krishnan Rajbahadur, Jianshan Lin, Jingwen Tan, Xiangfu Song, Zibin Zheng, Zi Li.

**Figure 1.** Overview of our study design.

**Figure 2.** Heatmap of PA on studied system and user prompts.

Original abstract

Dataset license compliance is a critical yet complex aspect of developing commercial AI products, particularly with the increasing use of publicly available datasets. Ambiguities in dataset licenses pose significant legal risks, making it challenging even for software IP lawyers to accurately interpret rights and obligations. In this paper, we introduce LicenseGPT, a fine-tuned foundation model (FM) specifically designed for dataset license compliance analysis. We first evaluate existing legal FMs (i.e., FMs specialized in understanding and processing legal texts) and find that the best-performing model achieves a Prediction Agreement (PA) of only 43.75%. LicenseGPT, fine-tuned on a curated dataset of 500 licenses annotated by legal experts, significantly improves PA to 64.30%, outperforming both legal and general-purpose FMs. Through an A/B test and user study with software IP lawyers, we demonstrate that LicenseGPT reduces analysis time by 94.44%, from 108 seconds to 6 seconds per license, without compromising accuracy. Software IP lawyers perceive LicenseGPT as a valuable supplementary tool that enhances efficiency while acknowledging the need for human oversight in complex cases. Our work underscores the potential of specialized AI tools in legal practice and offers a publicly available resource for practitioners and researchers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

LicenseGPT reports concrete gains on a niche legal task but the evaluation lacks the details needed to trust the numbers.

LicenseGPT fine-tunes a foundation model on 500 expert-annotated dataset licenses and claims it lifts prediction agreement from 43.75% to 64.30% while cutting lawyer review time by 94.44% in an A/B test. That is the main result worth noting up front. The paper also includes a small user study where software IP lawyers view the tool as a helpful supplement that still requires human oversight on hard cases.

The work is new in the narrow sense that it targets dataset license compliance as a distinct application rather than generic legal text. It does a reasonable job showing that off-the-shelf legal models fall short on this material and that domain-specific fine-tuning moves the needle on their chosen metric. The user study adds a practical angle that pure benchmark papers often skip.

The soft spots are exactly where the stress-test note flags them. The abstract supplies no selection criteria for the 500 licenses, no inter-annotator agreement numbers, no protocol for handling ambiguity, and no confirmation that the evaluation split is clean. Without those pieces it is impossible to tell whether the reported gains are robust or tied to a narrow slice of licenses that happened to be easy to annotate consistently. The prediction agreement metric itself is only as good as the ground truth, and that ground truth is not described.

This paper is for people building or evaluating legal compliance tools for public datasets. A practitioner who needs a quick starting point on fine-tuning for license text might extract some value from the high-level approach and the lawyer feedback. A methods-focused reader will find the current version frustrating. The paper shows straightforward engagement with the problem and does not overclaim the need for human review, so it meets the bar for serious thinking even if the evidence is incomplete. I would send it to peer review so that reviewers can press on the data construction and metric validity; the topic is relevant enough that the extra scrutiny is worth the time.

Referee Report

2 major / 1 minor

Summary. The paper introduces LicenseGPT, a fine-tuned foundation model for dataset license compliance analysis. It reports that the best existing legal FM achieves only 43.75% Prediction Agreement (PA) with expert annotations, while LicenseGPT, trained on 500 expert-annotated licenses, reaches 64.30% PA and outperforms both legal and general-purpose models. An A/B test and user study with software IP lawyers are claimed to show a 94.44% reduction in analysis time (108 s to 6 s per license) without accuracy loss, with lawyers viewing the model as a useful supplementary tool that requires human oversight.

Significance. If the 500-license dataset is representative and the evaluation metrics valid, the result would be significant for reducing legal risks in commercial AI development using public datasets. The combination of quantitative PA gains with a practical user study on time savings provides applied value beyond pure model performance. However, the absence of methodological details on data curation and evaluation prevents confirming whether the gains reflect genuine legal utility or artifacts of the experimental setup.

major comments (2)

[Abstract] Abstract: The central claim of PA improvement from 43.75% to 64.30% rests on a 'curated dataset of 500 licenses annotated by legal experts,' but the abstract (and available text) supplies no selection criteria, ambiguity-handling protocol, inter-annotator agreement statistics, or confirmation that the evaluation split is disjoint from fine-tuning data. This directly undermines assessment of whether PA captures real-world license ambiguities rather than annotation artifacts.
[Abstract] Abstract (user study paragraph): The A/B test and user study demonstrating 94.44% time reduction is presented without details on participant recruitment, task construction, how 'accuracy' was independently verified, or controls for selection/expectation bias. These elements are load-bearing for the practical-utility claim and the assertion that accuracy is not compromised.
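
Two of the items requested above are cheap to report. The sketch below shows one conventional choice for each: Cohen's kappa for agreement between two annotators, and a set-intersection check that no license appears in both the fine-tuning and evaluation splits. Neither procedure is described in the paper; the labels, license IDs, and function names are illustrative assumptions.

```python
from collections import Counter


def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both annotators labeled independently at their own rates.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a.keys() | counts_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


def leaked_ids(train_ids: list[str], eval_ids: list[str]) -> set[str]:
    """License IDs that appear in both splits; empty means the splits are disjoint."""
    return set(train_ids) & set(eval_ids)


# Hypothetical annotations for eight licenses ("yes" = commercial use allowed).
annotator_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
annotator_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes"]
kappa = cohens_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.2f}")  # kappa = 0.50

overlap = leaked_ids(["lic-001", "lic-002"], ["lic-002", "lic-003"])
print(f"leaked licenses: {sorted(overlap)}")  # leaked licenses: ['lic-002']
```

In this toy example the annotators agree on 6 of 8 items (75%), but kappa is only 0.50 because half of that agreement would be expected by chance. This is why raw agreement alone says little about label quality.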

minor comments (1)

[Abstract] The abstract refers to 'Prediction Agreement (PA)' without defining the metric or how it differs from standard accuracy/F1; a brief definition would improve clarity even if expanded in the main text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on methodological transparency. We agree that the current manuscript lacks sufficient detail on data curation and user study design, which limits evaluation of the claims. We will revise the manuscript to address these points.

Referee: [Abstract] Abstract: The central claim of PA improvement from 43.75% to 64.30% rests on a 'curated dataset of 500 licenses annotated by legal experts,' but the abstract (and available text) supplies no selection criteria, ambiguity-handling protocol, inter-annotator agreement statistics, or confirmation that the evaluation split is disjoint from fine-tuning data. This directly undermines assessment of whether PA captures real-world license ambiguities rather than annotation artifacts.

Authors: We agree that these details are missing from the current version and are necessary for readers to assess whether the PA gains reflect genuine improvements. In the revised manuscript we will add a Methods subsection describing the license selection criteria, the ambiguity-handling protocol used by the legal experts, the inter-annotator agreement statistics obtained during annotation, and explicit confirmation that the evaluation split was held out from the fine-tuning data. revision: yes

Referee: [Abstract] Abstract (user study paragraph): The A/B test and user study demonstrating 94.44% time reduction is presented without details on participant recruitment, task construction, how 'accuracy' was independently verified, or controls for selection/expectation bias. These elements are load-bearing for the practical-utility claim and the assertion that accuracy is not compromised.

Authors: We agree that the user-study description is insufficiently detailed. The revised manuscript will expand the relevant section to cover participant recruitment criteria and process, the construction of the A/B test tasks, the independent verification procedure for accuracy, and the controls implemented to mitigate selection and expectation bias. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical results rest on external expert annotations and baselines.

The paper's core claims rest on fine-tuning a model on 500 expert-annotated licenses and measuring Prediction Agreement (PA) plus time reduction against those same annotations and external baselines. No equations, self-definitional metrics, fitted-input predictions, or load-bearing self-citations appear in the provided text. The reported gains (43.75% to 64.30% PA; 94.44% time reduction) are presented as direct empirical comparisons rather than quantities derived by construction from the inputs themselves. The derivation chain is therefore self-contained against independent annotations.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Only the abstract is available, so free parameters and axioms cannot be listed exhaustively; the main implicit assumption is that fine-tuning on a modest expert-annotated set transfers to real legal practice.

free parameters (1)

fine-tuning hyperparameters and data split
Not specified in abstract but required for the reported performance.

axioms (1)

domain assumption: Existing legal foundation models provide a suitable base that can be improved via fine-tuning on domain-specific annotated licenses.
Invoked by the choice to start from legal FMs and fine-tune rather than train from scratch.

Reference graph

Works this paper leans on

93 extracted references · 93 canonical work pages · 6 internal anchors

[1]

[Online]

Lexilaw. [Online]. Available: https://github.com/CSHaitao/LexiLaw

work page
[2]

[Online]

wisdominterrogatory. [Online]. Available: https://github.com/zhihaiLLM/ wisdomInterrogatory

work page
[3]

Jacobsen v. katzer,

“Jacobsen v. katzer,” pp. 1373–1381, 2008

work page 2008
[4]

Open data commons public domain dedication and license (pddl),

“Open data commons public domain dedication and license (pddl),” 2018, open Data Commons License. [Online]. Available: https://opendatacommons.org/ licenses/pddl/

work page 2018
[5]

Amazon s3,

“Amazon s3,” 2023, accessed: 2024-10-02. [Online]. Available: https://aws. amazon.com/s3/

work page 2023
[6]

Datahub,

“Datahub,” 2023, accessed: 2024-10-02. [Online]. Available: https://datahub.io/

work page 2023
[7]

Figshare,

“Figshare,” 2023, accessed: 2024-10-02. [Online]. Available: https://figshare.com/

work page 2023
[8]

[Online]

“Github,” 2023, accessed: 2024-10-02. [Online]. Available: https://github.com/

work page 2023
[9]

[Online]

“Gitlab,” 2023, accessed: 2024-10-02. [Online]. Available: https://gitlab.com/

work page 2023
[10]

Google cloud,

“Google cloud,” 2023, accessed: 2024-10-02. [Online]. Available: https: //cloud.google.com/

work page 2023
[11]

Hugging face,

“Hugging face,” 2023, accessed: 2024-10-02. [Online]. Available: https: //huggingface.co/

work page 2023
[12]

[Online]

“Kaggle,” 2023, accessed: 2024-10-02. [Online]. Available: https://www.kaggle. com/

work page 2023
[13]

Microsoft azure,

“Microsoft azure,” 2023, accessed: 2024-10-02. [Online]. Available: https: //azure.microsoft.com/

work page 2023
[14]

Opendataology,

“Opendataology,” 2023, accessed: 2024-10-02. [Online]. Available: http://www. opendataology.com:30800/#/dataSetAll

work page 2023
[15]

SPDX 3.0 Dataset Profile,

“SPDX 3.0 Dataset Profile,” 2023, accessed: 2024-10-11. [Online]. Available: https://spdx.github.io/spdx-spec/v3.0/model/Dataset/Dataset/

work page 2023
[16]

[Online]

“Zenodo,” 2023, accessed: 2024-10-02. [Online]. Available: https://zenodo.org/

work page 2023
[17]

Github licensing guide,

“Github licensing guide,” 2024, https://docs.github.com/en/repositories/ managing-your-repositorys-settings-and-features/customizing-your-repository/ licensing-a-repository

work page 2024
[18]

Open source initiative,

“Open source initiative,” 2024, available at: https://opensource.org/licenses

work page 2024
[19]

SPDX AI - Areas of Interest,

“SPDX AI - Areas of Interest,” 2024, accessed: 2024-10-11. [Online]. Available: https://spdx.dev/learn/areas-of-interest/ai/

work page 2024
[20]

Tldrlegal: Understand open source licenses,

“Tldrlegal: Understand open source licenses,” 2024, available at: https://www. tldrlegal.com/

work page 2024
[21]

Qwen: Open-source pretrained large-scale language model,

A. D. Academy, “Qwen: Open-source pretrained large-scale language model,” https: //modelscope.cn/models/damo, 2023, accessed: 2024-10-04

work page 2023
[22]

Llama-2: Open and efficient foundation language models,

M. AI, “Llama-2: Open and efficient foundation language models,” https://ai.meta. com/llama, 2023, accessed: 2024-10-04

work page 2023
[23]

Zero-shot learning: What, how, and why it matters for nlp,

N. AI, “Zero-shot learning: What, how, and why it matters for nlp,” 2023, accessed: 2024-10-05. [Online]. Available: https://neptune.ai/blog/zero-shot-learning

work page 2023
[24]

Software engineering for machine learning: A case study,

S. Amershi, A. Begel, C. Bird, R. DeLine, H. Gall, E. Kamar, N. Nagappan, B. Nushi, and T. Zimmermann, “Software engineering for machine learning: A case study,” in2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) . IEEE, 2019, pp. 291–300

work page 2019
[25]

Factsheets: Increasing trust in ai services through supplier’s declarations of conformity,

M. Arnold, R. K. Bellamy, M. Hind, S. Houde, S. Mehta, A. Mojsilovi ´c, R. Nair, K. N. Ramamurthy, A. Olteanu, D. Piorkowskiet al., “Factsheets: Increasing trust in ai services through supplier’s declarations of conformity,”IBM Journal of Research and Development, vol. 63, no. 4/5, pp. 6–1, 2019

work page 2019
[26]

Promptsource: An integrated development environment and repository for natural language prompts,

S. H. Bach et al. , “Promptsource: An integrated development environment and repository for natural language prompts,” 2022

work page 2022
[27]

Towards Traceability in Data Ecosystems using a Bill of Materials Model

I. Barclay, A. Preece, I. Taylor, and D. Verma, “Towards traceability in data ecosystems using a bill of materials model,” arXiv preprint arXiv:1904.04253 , 2019

work page internal anchor Pith review Pith/arXiv arXiv 1904
[28]

Towards Standardization of Data Licenses: The Montreal Data License

M. Benjamin, P. Gagnon, N. Rostamzadeh, C. Pal, Y . Bengio, and A. Shee, “Towards standardization of data licenses: The montreal data license,” arXiv preprint arXiv:1903.12262, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1903
[29]

Chatgpt-4 performance on legal benchmarks: Evalu- ating its applicability for specialized tasks,

M. Bommarito and D. Katz, “Chatgpt-4 performance on legal benchmarks: Evalu- ating its applicability for specialized tasks,” Artificial Intelligence and Law , 2023. [Online]. Available: https://link.springer.com/article/10.1007/s10506-023-09356-y

work page doi:10.1007/s10506-023-09356-y 2023
[30]

Analyzing regulatory rules for privacy and security requirements,

Breaux et al., “Analyzing regulatory rules for privacy and security requirements,” IEEE Transactions on Software Engineering , 2008

work page 2008
[31]

Language models are few-shot learners,

T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakan- tan, P. Shyam, G. Sastry, A. Askell et al., “Language models are few-shot learners,” Advances in neural information processing systems , vol. 33, pp. 1877–1901, 2020

work page 1901
[32]

Objectives and key results in software teams: Challenges, opportunities and impact on development,

J. L. Butler, T. Zimmermann, and C. Bird, “Objectives and key results in software teams: Challenges, opportunities and impact on development,” in Proceedings of the 46th International Conference on Software Engineering: Software Engineering in Practice, 2024, pp. 358–368

work page 2024
[33]

Cohen, Statistical Power Analysis for the Behavioral Sciences, 2nd ed

J. Cohen, Statistical Power Analysis for the Behavioral Sciences, 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates, 1988

work page 1988
[34]

Creative commons attribution license (cc by),

C. Commons, “Creative commons attribution license (cc by),” 2013, creative Commons License. [Online]. Available: https://creativecommons.org/licenses/by/4. 0/

work page 2013
[35]

Chatlaw: A multi-agent collaborative legal assistant with knowledge graph enhanced mixture-of-experts large language model, 2024

J. Cui, Z. Li, Y . Yan, B. Chen, and L. Yuan, “Chatlaw: Open-source legal large language model with integrated external knowledge bases,” arXiv preprint arXiv:2306.16092, 2023

work page arXiv 2023
[36]

Efficient and effective text encoding for chinese llama and alpaca,

Y . Cui, Z. Yang, and X. Yao, “Efficient and effective text encoding for chinese llama and alpaca,” arXiv preprint arXiv:2304.08177 , 2023

work page arXiv 2023
[37]

Glm: General language model pretraining with autoregressive blank infilling,

Z. Du, Y . Qian, X. Liu, M. Ding, J. Qiu, Z. Yang, and J. Tang, “Glm: General language model pretraining with autoregressive blank infilling,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022, pp. 320–335

work page 2022
[38]

Multiple comparisons among means,

O. J. Dunn, “Multiple comparisons among means,” Journal of the American statistical association, vol. 56, no. 293, pp. 52–64, 1961

work page 1961
[39]

What is zero-shot classification?

H. Face, “What is zero-shot classification?” 2023, accessed: 2024-10-

work page 2023
[40]

Available: https://huggingface.co/docs/transformers/main/en/task_ summary#zero-shot-classification

[Online]. Available: https://huggingface.co/docs/transformers/main/en/task_ summary#zero-shot-classification

work page
[41]

Datasheets for datasets,

T. Gebru, J. Morgenstern, B. Vecchione, J. W. Vaughan, H. Wallach, H. D. Iii, and K. Crawford, “Datasheets for datasets,” Communications of the ACM , vol. 64, no. 12, pp. 86–92, 2021

work page 2021
[42]

A method for open source license compliance of java applications,

D. German and M. Di Penta, “A method for open source license compliance of java applications,” IEEE software, vol. 29, no. 3, pp. 58–63, 2012

work page 2012
[43]

License integration patterns: Addressing license mismatches in component-based development,

D. M. German and A. E. Hassan, “License integration patterns: Addressing license mismatches in component-based development,” in 2009 IEEE 31st international conference on software engineering . IEEE, 2009, pp. 188–198

work page 2009
[44]

Large language models: The legal aspects of licensing for commercial purposes,

GetInData, “Large language models: The legal aspects of licensing for commercial purposes,” 2023, accessed: 2024-10-02. [Online]. Available: https://getindata.com/ blog/large-language-models-legal-aspects-licensing-commercial-purposes/

work page 2023
[45]

Github copilot: Your ai pair programmer,

GitHub, “Github copilot: Your ai pair programmer,” https://copilot.github.com, 2021, accessed: 2024-07-03

work page 2021
[46]

Infringement of copyright and moral rights and exceptions to infringement (continued),

Government of Canada, “Infringement of copyright and moral rights and exceptions to infringement (continued),” 2021, [Last visited on 09-25-2024]. [Online]. Available: https://laws-lois.justice.gc.ca/eng/acts/c-42/page-9.html

work page 2021
[47]

Legalbench: A collaboratively built benchmark for measuring legal reasoning in large language models,

N. Guha, Nyarko et al. , “Legalbench: A collaboratively built benchmark for measuring legal reasoning in large language models,” in Advances in Neural Information Processing Systems , A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, Eds., vol. 36. Curran Associates, Inc., 2023, pp. 44 123– 44 279. [Online]. Available: https://proceedin...

work page 2023
[48]

Fossology: A license compli- ance tool,

F. Hansen, B. Becker, C. Chamas, and P. Germain, “Fossology: A license compli- ance tool,” in IFIP International Conference on Open Source Systems . Springer, 2010, pp. 47–62

work page 2010
[49]

Key issues in writers’ case against openai explained,

Harvard Gazette, “Key issues in writers’ case against openai explained,” Sep. 2023. [Online]. Available: https://news.harvard.edu/gazette/story/2023/09/ key-issues-in-writers-case-against-openai-explained/

work page 2023
[50]

Rethinking software engineering in the foundation model era: From task-driven ai copilots to goal-driven ai pair programmers,

A. E. Hassan, G. A. Oliva, D. Lin, B. Chen, Z. Ming et al., “Rethinking software engineering in the foundation model era: From task-driven ai copilots to goal-driven ai pair programmers,” arXiv preprint arXiv:2404.10225 , 2024

work page arXiv 2024
[51]

Lawyer llama technical report,

Q. Huang, M. Tao, Z. An, C. Zhang, C. Jiang, Z. Chen, Z. Wu, and Y . Feng, “Lawyer llama technical report,” arXiv preprint arXiv:2305.15062 , 2023

work page arXiv 2023
[52]

Arguing regulatory compliance of software requirements,

S. Ingolfo, A. Siena, J. Mylopoulos, A. Susi, and A. Perini, “Arguing regulatory compliance of software requirements,” Data & Knowledge Engineering , vol. 87, pp. 279–296, 2013

work page 2013
[53]

Fossology: The open source license compliance tool,

M. C. Jaeger, G. J. Herzwurm, and J. Böhm, “Fossology: The open source license compliance tool,”International Free and Open Source Software Law Review, vol. 1, no. 2, pp. 153–171, 2009

work page 2009
[54]

Automating the license compati- bility process in open source software with spdx,

G. M. Kapitsaki, F. Kramer, and N. D. Tselikas, “Automating the license compati- bility process in open source software with spdx,” Journal of systems and software, vol. 131, pp. 386–401, 2017

work page 2017
[55]

Automating the extraction of rights and obligations for regulatory compliance,

N. Kiyavitskaya, N. Zeni, T. D. Breaux, A. I. Ant ’on, J. R. Cordy, L. Mich, and J. Mylopoulos, “Automating the extraction of rights and obligations for regulatory compliance,” inProceedings of the 27th International Conference on Conceptual Modeling, Barcelona, Spain, October 20-24 , 2008

work page 2008
[56]

Enforcing the gpl and open source software licenses in the us after jacobsen v. katzer,

B. M. Kuhn and K. M. Sandler, “Enforcing the gpl and open source software licenses in the us after jacobsen v. katzer,” Berkeley Technology Law Journal , vol. 27, pp. 231–274, 2012

work page 2012
[57]

Legal documents drafting with fine-tuned pre-trained large language model,

C.-H. Lin and P.-J. Cheng, “Legal documents drafting with fine-tuned pre-trained large language model,” arXiv preprint arXiv:2406.04202 , 2024

work page arXiv 2024
[58]

Lawgpt:chinese legal model,

M. LiuHongcheng, LiaoYusheng and WangYuhao, “Lawgpt:chinese legal model,”

work page
[59]

Available: https://github.com/LiuHC0428/LAW_GPT

[Online]. Available: https://github.com/LiuHC0428/LAW_GPT

work page
[60]

The data provenance initiative: A large scale audit of dataset licensing & attribution in ai,

S. Longpre, R. Mahari, A. Chen, N. Obeng-Marnu, D. Sileo, W. Brannon, N. Muennighoff, N. Khazam, J. Kabbara, K. Perisetla et al., “The data provenance initiative: A large scale audit of dataset licensing & attribution in ai,” arXiv preprint arXiv:2310.16787, 2023

work page arXiv 2023
[61]

Model cards for model reporting,

M. Mitchell, S. Wu, A. Zaldivar, P. Barnes, L. Vasserman, B. Hutchinson, E. Spitzer, I. D. Raji, and T. Gebru, “Model cards for model reporting,” in Proceedings of the conference on fairness, accountability, and transparency , 2019, pp. 220–229. 11

work page 2019
[62]

The rise of open source program office,

H. Munir and C.-E. Mols, “The rise of open source program office,” IT Professional, vol. 23, no. 1, pp. 27–33, 2021

work page 2021
[63]

A guide to copyright,

G. of Canada, “A guide to copyright,” 2021, [Last visited on 09-25-2024]. [Online]. Available: https://laws-lois.justice.gc.ca/eng/acts/c-42/page-9.html

work page 2021
[64]

More information on fair use,

U. C. Office, “More information on fair use,” 2021, [Last visited on 09-25-2024]. [Online]. Available: https://www.copyright.gov/fair-use/more-info.html

work page 2021
[65]

OpenAI, “Gpt-4,” https://openai.com/gpt-4, 2023, accessed: 2024-10-04

work page 2023
[66]

OpenChain AI Study Group Monthly Workshop for North America and Europe: Full Recording,

OpenChain Project, “OpenChain AI Study Group Monthly Workshop for North America and Europe: Full Recording,” https://openchainproject.org/ news/2024/04/09/openchain-ai-study-group-monthly-workshop-for-north-/ america-and-europe-2024-04-02-full-recording, 2024, last accessed: October 10, 2024

work page 2024
[67]

Openchain project,

“Openchain project,” https://openchainproject.org/, OpenChain Project, 2024, ac- cessed: 2024-10-10

work page 2024
[68]

LicenseGPT,

OpenDataology, “LicenseGPT,” https://github.com/OpenDataology/LicenseGPT, 2024, gitHub repository, Last accessed: 2024-10-11

work page 2024
[69]

A review of current trends, techniques, and challenges in large language models (llms),

R. Patil and V . Gudivada, “A review of current trends, techniques, and challenges in large language models (llms),” Applied Sciences, vol. 14, no. 5, p. 2074, 2024

work page 2074
[70]

Mitigating dataset harms requires stewardship: Lessons from 1000 papers,

K. Peng, A. Mathur, and A. Narayanan, “Mitigating dataset harms requires stewardship: Lessons from 1000 papers,” arXiv preprint arXiv:2108.02922 , 2021

work page arXiv 2021
[71]

Can i use this publicly available dataset to build commercial ai software? most likely not,

G. K. Rajbahadur, E. Tuck, L. Zi, Z. Wei, D. Lin, B. Chen, Z. M. Jiang, and D. M. German, “Can i use this publicly available dataset to build commercial ai software? most likely not,” CoRR, abs/2111.02374, pp. 1–1, 2021

work page arXiv 2021
[72]

Self-reflective chain-of-thought reasoning in large language mod- els,

T. Researcher, “Self-reflective chain-of-thought reasoning in large language mod- els,” 2023

work page 2023
[73]

W. S. G. . Rosati. (2017) Open source software: Risks, compliance, and best practices. [Online]. Available: https://www.wsgr.com/en/insights/ open-source-software-risks-compliance-and-best-practices.html

work page 2017
[74]

Simultaneous statistical inference,

G. Rupert Jr et al., “Simultaneous statistical inference,” 2012

work page 2012
[75]

fuzi.mingcha,

Z. Z. Shiguang Wu, Zhongkun Liu et al. , “fuzi.mingcha,” 2023. [Online]. Available: https://github.com/irlab-sdu/fuzi.mingcha

work page 2023
[76]

B. D. Software. (2023) Open source security and license compliance management. [Online]. Available: https://www.blackducksoftware.com

work page 2023
[77]

Lawgpt: Chinese-llama tuned with chinese legal knowledge,

Z. Z. Song Pengxiao and cainiao, “Lawgpt: Chinese-llama tuned with chinese legal knowledge,” 2023. [Online]. Available: https://github.com/pengxiao-song/LaWGPT

work page 2023
[78]

Responsible ai licenses-a real alternative to generally applicable laws?

K. Szpyt, “Responsible ai licenses-a real alternative to generally applicable laws?” Revista Ibérica do Direito , vol. 1, no. 2, pp. 178–186, 2020

work page 2020
[79]

The impact of automated parameter optimization for defect prediction models,

C. Tantithamthavorn, S. McIntosh, A. E. Hassan, and K. Matsumoto, “The impact of automated parameter optimization for defect prediction models,”IEEE Transactions on Software Engineering , 2018

work page 2018
[80]

H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar et al., “LLaMA: Open and efficient foundation language models,” arXiv preprint arXiv:2302.13971, 2023

