LLM-based Generation of Semantically Diverse and Realistic Domain Model Instances

Andrei Coman; Dominik Bork; Lola Burgueño; Manuel Wimmer

arxiv: 2604.10350 · v1 · submitted 2026-04-11 · 💻 cs.SE

Pith reviewed 2026-05-10 15:21 UTC · model grok-4.3

keywords: LLM · domain model · UML class diagram · instance generation · semantic realism · model validation · diversity · prompting strategies

The pith

Large language models can generate mostly correct and semantically realistic instances of UML domain models when prompted with class diagram descriptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how large language models can be prompted to create concrete instances of domain models from UML class diagrams, going beyond structural validity to include realistic values and semantic diversity. The method pairs two prompting strategies with standard model validation tools to check for syntactic correctness, conformance to the model, and semantic coherence within each generated instance. Experiments on educational and published models from multiple domains produced instances that were largely correct with only a few semantic errors and values that varied while staying true to the domain and internally consistent. If this holds, modelers gain a way to obtain human-understandable examples automatically instead of constructing them by hand. Such instances would then support teaching domain concepts and supplying varied data for research without violating the original model rules.

Core claim

LLMs prompted with class diagram descriptions, used together with existing validation tools, produce instances that are mostly syntactically correct, conform to the domain model, contain only a few semantic errors, and exhibit semantically diverse realistic values whose combinations within each instance remain coherent.

What carries the argument

The combination of large language models with two prompting strategies applied to class diagram descriptions, followed by validation with existing model-checking tools.
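
The generation-then-validation pattern described above can be sketched as a simple retry loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `generate_valid_instance`, the `generate` callable (standing in for an LLM call), and the `validate` callable (standing in for a model validation tool) are hypothetical names, and passing validation errors back into the next attempt is an assumption rather than something the summary confirms.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signatures: `generate` stands in for an LLM call and
# `validate` for an external model validation tool. Neither is taken
# from the paper.
Generate = Callable[[str, List[str]], str]
Validate = Callable[[str], List[str]]


def generate_valid_instance(
    diagram_description: str,
    generate: Generate,
    validate: Validate,
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Return (instance, attempts_used); instance is None if every attempt failed."""
    errors: List[str] = []
    for attempt in range(1, max_attempts + 1):
        # Assumption: errors from the previous attempt are handed back to the generator.
        candidate = generate(diagram_description, errors)
        # An empty error list means the candidate passed validation.
        errors = validate(candidate)
        if not errors:
            return candidate, attempt
    return None, max_attempts
```

Keeping the generator and the validator as injected callables means the loop can be exercised with toy stand-ins before any real LLM or validation tool is wired in.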

If this is right

Educators obtain ready-to-use concrete examples for teaching domain modeling without manual construction.
Research projects can draw on diverse yet model-conformant data sets for analysis or simulation.
Modeling environments can automate the creation of test populations that respect both structure and domain meaning.
The effort to prepare example instances for validation or demonstration drops while preserving semantic realism.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same prompting-plus-validation pattern might apply to other diagram types such as state machines or activity diagrams.
Adding domain-specific knowledge bases could further reduce the remaining semantic errors observed in the experiments.
Scaling tests to larger industrial models would show whether the approach remains practical when class counts and constraints grow.
Generated instances could serve as training data for other model-related machine learning tasks that need realistic examples.

Load-bearing premise

Large language models given only class diagram descriptions can reliably infer real-world domain semantics and produce coherent value sets without extra domain training or knowledge bases.

What would settle it

Running the generation process on a fresh collection of domain models and finding that most resulting instances contain multiple semantic inconsistencies or unrealistic value combinations.

Figures

Figures reproduced from arXiv: 2604.10350 by Andrei Coman, Dominik Bork, Lola Burgueño, Manuel Wimmer.

**Figure 2.** Overview of the two instance generation strategies

**Figure 3.** Two generated Bank domain model instances, the left one following the Instruction Learning approach, the right one …

**Figure 4.** Two generated Bank domain model instances, the left one following the Chain-of-Thought approach in the edge scenario, …

Original abstract

Large Language Models (LLMs) have been recently proposed for supporting domain modeling tasks mostly related to the completion of partial models by recommending additional model elements. However, there are many more modeling tasks, one of them being the instantiation of domain models to represent concrete domain objects. While there is considerable work supporting the generation of structurally valid instantiations, there are still open challenges to incorporating real-world semantics by having realistic values contained in instances and ensuring the generation of semantically diverse models. Only then will such generated models become human-understandable and helpful in educational or data-driven research contexts. To tackle these challenges, this paper presents an approach that employs LLMs and two prompting strategies in combination with existing model validation tools for instantiating semantically realistic and diverse domain models expressed as UML class diagrams. We have applied our approach to models used in education and available in the literature from different domains and evaluated the generated instances in terms of syntactic correctness, model conformance, semantic correctness, and diversity of the generated values. The results show that the generated instances are mostly syntactically correct, that they conform to the domain model, and that there are only a few semantic errors. Moreover, the generated instance values are semantically diverse, i.e., concrete realistic examples in line with the domain and the combination of the values within one model are semantically coherent.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes an LLM-based method using two prompting strategies combined with existing model validation tools to instantiate UML class diagrams. The generated instances are evaluated for syntactic correctness, conformance to the domain model, semantic correctness (few errors), and semantic diversity (realistic, domain-appropriate values with coherent combinations within each instance). The approach is tested on educational and literature models from multiple domains, with results indicating mostly positive outcomes on all four criteria.

Significance. If the semantic realism and diversity claims hold under rigorous evaluation, the work would address a clear gap in domain model instantiation: moving beyond structural validity to produce human-understandable, realistic examples useful for education and data-driven research. The combination of LLMs with validation tools is a practical contribution, but its impact depends on demonstrating that the semantic properties are reproducible and not artifacts of subjective assessment.

major comments (2)

[Evaluation] Evaluation section: the abstract and results claim 'mostly syntactically correct,' 'conform to the domain model,' 'only a few semantic errors,' and 'semantically diverse' values that are 'concrete realistic examples' with 'semantically coherent' combinations, yet no exact metrics, trial counts, error definitions, or inter-rater reliability statistics are provided. This makes the central empirical claims difficult to reproduce or falsify.
[Results] Results section: semantic correctness and diversity are assessed without reported baselines (e.g., random or template-based value assignment), control conditions, or comparison to prior non-LLM instantiation techniques. Without these, it is unclear whether the LLM prompting genuinely improves semantic realism or merely produces plausible output.
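
The inter-rater reliability statistic the referee asks for could be computed with Cohen's kappa when two raters label the same instances. The sketch below is illustrative only: the function name `cohens_kappa` and the label values are assumptions, and the summary does not say which statistic, if any, the authors would adopt.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both raters use one identical label throughout;
        # this sketch returns 1.0 in that case as a convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

For example, two raters labelling four instances as `["ok", "ok", "err", "ok"]` and `["ok", "err", "err", "ok"]` agree on three of four items, with a chance agreement of 0.5, which gives a kappa of 0.5.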

minor comments (2)

[Approach] The description of the two prompting strategies would benefit from explicit examples of the prompts used and how they differ in handling class attributes versus associations.
[Evaluation] Consider reporting the exact number of models tested, their sizes (number of classes/attributes), and the domains represented to allow readers to assess generalizability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments, which highlight important opportunities to strengthen the empirical rigor of our work. We address each major comment below and commit to revisions that improve reproducibility and contextualization without altering the core claims or contributions of the manuscript.

Referee: [Evaluation] Evaluation section: the abstract and results claim 'mostly syntactically correct,' 'conform to the domain model,' 'only a few semantic errors,' and 'semantically diverse' values that are 'concrete realistic examples' with 'semantically coherent' combinations, yet no exact metrics, trial counts, error definitions, or inter-rater reliability statistics are provided. This makes the central empirical claims difficult to reproduce or falsify.

Authors: We agree that the current Evaluation section would benefit from greater quantitative detail and explicit definitions to support reproducibility. In the revised manuscript, we will expand this section to report the precise number of generation trials and instances produced for each domain model, exact counts and percentages for syntactic correctness and model conformance (e.g., number of instances passing parser checks and OCL validation), clear operational definitions of semantic errors (e.g., attribute values that are implausible given domain knowledge or combinations that violate real-world coherence), and the assessment procedure for semantic diversity (manual review of value variety and intra-instance coherence). The semantic evaluation was performed by the authors with iterative cross-checking for consensus; we will describe this process in detail and acknowledge the absence of formal inter-rater reliability metrics as a limitation. revision: yes

Referee: [Results] Results section: semantic correctness and diversity are assessed without reported baselines (e.g., random or template-based value assignment), control conditions, or comparison to prior non-LLM instantiation techniques. Without these, it is unclear whether the LLM prompting genuinely improves semantic realism or merely produces plausible output.

Authors: Our primary aim was to demonstrate the feasibility of the LLM-based approach combined with validation tools for producing instances with the targeted semantic properties, rather than to perform a full comparative benchmark. We acknowledge that the absence of explicit baselines leaves open questions about relative improvement. In the revision, we will add a dedicated subsection in Results that compares our outputs to prior non-LLM techniques discussed in the related work (e.g., random instantiation and constraint-based solvers), explaining that such methods reliably achieve structural validity but typically produce semantically unrealistic or repetitive values. We will also include a small illustrative baseline using template-based random assignment on one of the evaluated models to contrast semantic quality, while noting that a comprehensive controlled experiment lies beyond the scope of this feasibility study. revision: partial
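
A random value-assignment baseline of the kind mentioned in this response might look like the sketch below. It is a toy illustration, not the baseline the authors plan to add: `ACCOUNT_SCHEMA`, `random_instance`, the attribute names, and the value ranges are all hypothetical. The point it makes is that values can be type-correct while carrying no domain meaning.

```python
import random
import string
from typing import Dict, Union

Value = Union[int, str, bool]

# Hypothetical attribute schema for a toy class; not taken from the paper's models.
ACCOUNT_SCHEMA: Dict[str, str] = {"iban": "String", "balance": "Integer", "active": "Boolean"}


def random_instance(schema: Dict[str, str], rng: random.Random) -> Dict[str, Value]:
    """Fill each attribute with a type-correct but meaning-free random value."""
    instance: Dict[str, Value] = {}
    for name, type_name in schema.items():
        if type_name == "Integer":
            instance[name] = rng.randint(-1000, 1000)
        elif type_name == "Boolean":
            instance[name] = rng.random() < 0.5
        elif type_name == "String":
            # Eight random lowercase letters: a valid string, not a plausible IBAN.
            instance[name] = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
        else:
            raise ValueError(f"unsupported attribute type: {type_name}")
    return instance
```

Calling `random_instance(ACCOUNT_SCHEMA, random.Random(0))` yields a dictionary with all three attributes filled in. Passing a seeded `random.Random` keeps the baseline reproducible across runs.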

Circularity Check

0 steps flagged

No circularity: empirical evaluation is independent of inputs

The paper presents a prompting-based LLM method for generating UML class diagram instances, then evaluates outputs via mechanical validation tools for syntactic correctness and model conformance plus separate checks for semantic correctness and value diversity. No equations, fitted parameters, predictions derived from those parameters, self-definitional constructs, uniqueness theorems, or ansatzes appear in the abstract or described approach. Central claims rest on external validation steps rather than reducing to the generation process by construction, so the derivation chain is self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms or invented entities are introduced in the abstract; the work relies on off-the-shelf LLMs and existing model validation tools.

[1] [1]

Large language model assisted software engineering: prospects, challenges, and a case study,

L. Belzner, T. Gabor, and M. Wirsing, “Large language model assisted software engineering: prospects, challenges, and a case study,” inIn- ternational Conference on Bridging the Gap between AI and Reality. Springer, 2023, pp. 355–374

work page 2023

[2] [2]

Large language models for software engi- neering: A systematic literature review,

X. Hou, Y . Zhao, Y . Liu, Z. Yang, K. Wang, L. Li, X. Luo, D. Lo, J. Grundy, and H. Wang, “Large language models for software engi- neering: A systematic literature review,”ACM Transactions on Software Engineering and Methodology, 2023

work page 2023

[3] [3]

Conceptual modeling and artificial intelligence: A systematic mapping study,

D. Bork, S. J. Ali, and B. Roelens, “Conceptual modeling and artificial intelligence: A systematic mapping study,”CoRR, vol. abs/2303.06758,

work page arXiv

[4] [4]

Conceptual modeling and artificial intelligence: A systematic mapping study,

[Online]. Available: https://doi.org/10.48550/arXiv.2303.06758

work page doi:10.48550/arxiv.2303.06758

[5] [5]

Bridging MDE and AI: a systematic review of domain-specific lan- guages and model-driven practices in AI software systems engineering,

S. Rädler, L. Berardinelli, K. Winter, A. Rahimi, and S. Rinderle-Ma, “Bridging MDE and AI: a systematic review of domain-specific lan- guages and model-driven practices in AI software systems engineering,” Software and Systems Modeling, 2024

work page 2024

[6] [6]

Towards using few-shot prompt learning for automating model completion,

M. B. Chaaben, L. Burgueño, and H. A. Sahraoui, “Towards using few-shot prompt learning for automating model completion,” in45th IEEE/ACM International Conference on Software Engineering: New Ideas and Emerging Results, NIER@ICSE, Melbourne, Australia, May 14-20, 2023. IEEE, 2023, pp. 7–12

work page 2023

[7] [7]

Automated domain modeling with large language models: A comparative study,

K. Chen, Y . Yang, B. Chen, J. A. H. López, G. Mussbacher, and D. Varró, “Automated domain modeling with large language models: A comparative study,” in26th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2023, Västerås, Sweden, October 1-6, 2023. IEEE, 2023, pp. 162–172

work page 2023

[8] [8]

A systematic approach to generate diverse instantiations for conceptual schemas,

L. Burgueño, J. Cabot, R. Clarisó, and M. Gogolla, “A systematic approach to generate diverse instantiations for conceptual schemas,” inConceptual Modeling - 38th International Conference, ER 2019, Salvador, Brazil, November 4-7, 2019, Proceedings, ser. Lecture Notes in Computer Science, A. H. F. Laender, B. Pernici, E. Lim, and J. P. M. de Oliveira, Eds....

work page doi:10.1007/978-3-030-33223-5_42 2019

[9] [9]

Yekta: A low-code framework for automated test models generation,

M. Karimi, S. Kolahdouz-Rahimi, and J. Troya, “Yekta: A low-code framework for automated test models generation,”SoftwareX, vol. 27, p. 101850, 2024

work page 2024

[10] O. Semeráth, A. A. Babikian, S. Pilarski, and D. Varró, “Viatra solver: A framework for the automated generation of consistent domain-specific models,” in 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), 2019, pp. 43–46.

[11] O. Semeráth, A. S. Nagy, and D. Varró, “A graph solver for the automated generation of consistent domain-specific models,” in Proceedings of the 40th International Conference on Software Engineering, ICSE 2018, Gothenburg, Sweden, May 27 - June 03, 2018. ACM, 2018, pp. 969–980.

[12] K. Ehrig, J. M. Küster, and G. Taentzer, “Generating instance models from meta models,” Softw. Syst. Model., vol. 8, no. 4, pp. 479–500, 2009.

[13] N. Nassar, J. Kosiol, T. Kehrer, and G. Taentzer, “Generating large EMF models efficiently - A rule-based, configurable approach,” in Fundamental Approaches to Software Engineering - 23rd International Conference, FASE 2020. Springer, 2020, pp. 224–244.

[14] O. Semeráth, R. Farkas, G. Bergmann, and D. Varró, “Diversity of graph models and graph generators in mutation testing,” Int. J. Softw. Tools Technol. Transf., vol. 22, no. 1, pp. 57–78, 2020.

[15] J. A. H. López and J. S. Cuadrado, “Generating structurally realistic models with deep autoregressive networks,” IEEE Trans. Software Eng., vol. 49, no. 4, pp. 2661–2676, 2023.

[16] D. Budgen, A. J. Burn, O. P. Brereton, B. A. Kitchenham, and R. Pretorius, “Empirical evidence about the UML: a systematic literature review,” Softw. Pract. Exp., vol. 41, no. 4, pp. 363–392, 2011.

[17] J. Hutchinson, J. Whittle, and M. Rouncefield, “Model-driven engineering practices in industry: Social, organizational and managerial factors that lead to success or failure,” Science of Computer Programming, vol. 89, pp. 144–161, 2014.

[18] B. Hamid, “A model-driven approach for developing a model repository: Methodology and tool support,” Future Gener. Comput. Syst., vol. 68, pp. 473–490, 2017.

[19] J. A. H. López, J. L. C. Izquierdo, and J. S. Cuadrado, “Modelset: a dataset for machine learning in model-driven engineering,” Softw. Syst. Model., vol. 21, no. 3, pp. 967–986, 2022.

[20] P.-L. Glaser, E. Sallinger, and D. Bork, “The extended EA ModelSet—a FAIR dataset for researching and reasoning enterprise architecture modeling practices,” Software and Systems Modeling, 2025.

[21] R. Laue and M. Läuter, “Beobachtungen und Einsichten zu Repositorys von BPMN-Modellen,” in Modellierung 2024, Potsdam, Germany, March 12-15, 2024, ser. LNI, vol. P-348. Gesellschaft für Informatik e.V., 2024, pp. 157–173.

[22] M. Gogolla, F. Büttner, and M. Richters, “USE: A UML-based specification environment for validating UML and OCL,” Sci. Comput. Program., vol. 69, no. 1-3, pp. 27–34, 2007.

[23] S. Joel, J. J. Wu, and F. H. Fard, “A Survey on LLM-based Code Generation for Low-Resource and Domain-Specific Programming Languages,” Nov. 2024, arXiv:2410.03981 [cs]. [Online]. Available: http://arxiv.org/abs/2410.03981

[24] Anonymous, “Accompanying software repository for the paper: LLM-based generation of semantically diverse and realistic domain model instances,” https://anonymous.4open.science/r/instance-generation-MODELS25/, 2025, accessed: 2025-04-03.

[25] J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. H. Chi, Q. V. Le, and D. Zhou, “Chain-of-thought prompting elicits reasoning in large language models,” in Proceedings of the 36th International Conference on Neural Information Processing Systems, ser. NIPS ’22. Red Hook, NY, USA: Curran Associates Inc., 2022.

[26] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, and B. Regnell, Experimentation in Software Engineering. Springer, 2012.

[27] Y. Song, G. Wang, S. Li, and B. Y. Lin, “The good, the bad, and the greedy: Evaluation of llms should not ignore non-determinism,” arXiv preprint arXiv:2407.10457, 2024.

[28] E. Torlak and D. Jackson, “Kodkod: A relational model finder,” in Tools and Algorithms for the Construction and Analysis of Systems, 13th International Conference, TACAS 2007, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2007 Braga, Portugal, March 24 - April 1, 2007, Proceedings, ser. Lecture Notes in Computer ...

[29] F. Hilken, M. Gogolla, L. Burgueño, and A. Vallecillo, “Testing models and model transformations using classifying terms,” Softw. Syst. Model., vol. 17, no. 3, pp. 885–912, 2018.

[30] R. Clarisó and J. Cabot, “Fixing defects in integrity constraints via constraint mutation,” in 11th International Conference on the Quality of Information and Communications Technology, QUATIC 2018, Coimbra, Portugal, September 4-7, 2018, A. Bertolino, V. Amaral, P. Rupino, and M. Vieira, Eds. IEEE Computer Society, 2018, pp. 74–82.

[31] S. J. Ali, I. Reinhartz-Berger, and D. Bork, “How are LLMs Used for Conceptual Modeling? An Exploratory Study on Interaction Behavior and User Perception,” in Conceptual Modeling - 43rd International Conference, ER 2024, Pittsburgh, PA, USA, October 28-31, 2024, Proceedings, ser. Lecture Notes in Computer Science, W. Maass, H. Han, H. Yasar, and N. J. Mult...

[32] J. Cámara, J. Troya, L. Burgueño, and A. Vallecillo, “On the assessment of generative AI in modeling tasks: an experience report with ChatGPT and UML,” Softw. Syst. Model., vol. 22, no. 3, pp. 781–793, 2023.

[33] J. Silva, Q. Ma, J. Cabot, P. Kelsen, and H. A. Proper, “Application of the Tree-of-Thoughts Framework to LLM-Enabled Domain Modeling,” in Conceptual Modeling - 43rd International Conference, ER 2024, Pittsburgh, PA, USA, October 28-31, 2024, Proceedings, ser. Lecture Notes in Computer Science, W. Maass, H. Han, H. Yasar, and N. J. Multari, Eds., vol. 1523...

[34] G. Ciatto, A. Agiollo, M. Magnini, and A. Omicini, “Large language models as oracles for instantiating ontologies with domain-specific knowledge,” CoRR, vol. abs/2404.04108, 2024.

[35] A. B. Benevides, G. Guizzardi, B. F. B. Braga, and J. P. A. Almeida, “Validating Modal Aspects of OntoUML Conceptual Models Using Automatically Generated Visual World Structures,” J. Univers. Comput. Sci., vol. 16, no. 20, pp. 2904–2933, 2010.

[36] D. Jackson, Software Abstractions - Logic, Language, and Analysis. MIT Press, 2006. [Online]. Available: http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=10928

[37] T. P. Sales and G. Guizzardi, “Ontological anti-patterns: empirically uncovered error-prone structures in ontology-driven conceptual models,” Data Knowl. Eng., vol. 99, pp. 72–104, 2015.

[38] K. Anastasakis, B. Bordbar, G. Georg, and I. Ray, “Uml2alloy: A challenging model transformation,” in Model Driven Engineering Languages and Systems, 10th International Conference, MoDELS 2007, Nashville, USA, September 30 - October 5, 2007, Proceedings, ser. Lecture Notes in Computer Science, G. Engels, B. Opdyke, D. C. Schmidt, and F. Weil, Eds., vol. ...

[39] P. Pietsch, H. S. Yazdi, and U. Kelter, “Generating realistic test models for model processing tools,” in 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011), Lawrence, KS, USA, November 6-10, 2011, P. Alexander, C. S. Pasareanu, and J. G. Hosking, Eds. IEEE Computer Society, 2011, pp. 620–623.

[40] A. Mougenot, A. Darrasse, X. Blanc, and M. Soria, “Uniform random generation of huge metamodel instances,” in Model Driven Architecture - Foundations and Applications, 5th European Conference, ECMDA-FA 2009, Enschede, The Netherlands, June 23-26, 2009. Proceedings, ser. Lecture Notes in Computer Science, R. F. Paige, A. Hartman, and A. Rensink, Eds., vol....

[41] P. Duchon, P. Flajolet, G. Louchard, and G. Schaeffer, “Boltzmann samplers for the random generation of combinatorial structures,” Comb. Probab. Comput., vol. 13, no. 4-5, pp. 577–625, 2004.

[42] AtlanMod Team, “EMF– random instantiator.” [Online]. Available: https://github.com/atlanmod/mondo-atlzoo-benchmark/tree/master/fr.inria.atlanmod.instantiator

[43] M. Scheidgen, “Generation of large random models for benchmarking,” in Proceedings of the 3rd Workshop on Scalable Model Driven Engineering part of the Software Technologies: Applications and Foundations (STAF 2015) federation of conferences, L’Aquila, Italy, July 23, 2015, ser. CEUR Workshop Proceedings, D. S. Kolovos, D. D. Ruscio, N. D. Matragkas, J. S....

[44] S. Popoola, D. S. Kolovos, and H. H. Rodriguez, “EMG: A domain-specific transformation language for synthetic model generation,” in Theory and Practice of Model Transformations - 9th International Conference, ICMT@STAF 2016, Vienna, Austria, July 4-5, 2016, Proceedings, ser. Lecture Notes in Computer Science, P. V. Gorp and G. Engels, Eds., vol. 9765....

[45] G. Szárnyas, Z. Kovári, Á. Salánki, and D. Varró, “Towards the characterization of realistic models: evaluation of multidisciplinary graph metrics,” in Proceedings of the ACM/IEEE 19th International Conference on Model Driven Engineering Languages and Systems, Saint-Malo, France, October 2-7, 2016, B. Baudry and B. Combemale, Eds. ACM, 2016, pp. 87–94. [On...