AEON: A Method for Automatic Evaluation of NLP Test Cases
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.SE 3
years: 2026 3
verdicts: UNVERDICTED 3
roles: background 1
polarities: background 1
citing papers explorer
- LLM-based vs. Search-based Merge Conflict Resolution: An Empirical Study of Competing Paradigms
  LLM-based merge conflict resolution performs well on imbalanced conflicts but struggles with large or non-English inputs, while search-based methods show better generalization and strength on balanced conflicts.
- Raven: Rethinking Automated Assessment for Scratch Programs via Video-Grounded Evaluation
  Raven automates Scratch program assessment by having instructors specify task-level video generation rules and using LLMs to analyze resulting videos for behavioral compliance, outperforming prior tools on real student submissions.
- SemLink: A Semantic-Aware Automated Test Oracle for Hyperlink Verification using Siamese Sentence-BERT
  SemLink applies a Siamese SBERT model to detect semantic drift in hyperlinks, achieving 96% recall at 47.5 times the speed of GPT-5.2 using a new 60k-pair dataset.