How LLMs Aid in UML Modeling: An Exploratory Study with Novice Analysts, 2024
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.SE (2)
Verdicts: UNVERDICTED (2)
Citing papers
- MermaidSeqBench: An Evaluation Benchmark for NL-to-Mermaid Sequence Diagram Generation
  MermaidSeqBench is a new human-verified benchmark for evaluating LLMs on natural language to Mermaid sequence diagram generation, revealing significant capability gaps across models.
- Understanding the Human-LLM Dynamic: A Literature Survey of LLM Use in Programming Tasks
  A survey of user studies on LLM use in programming that identifies interaction behaviors, mixed benefits and weaknesses, and factors influencing human and task performance.