Developers using AI assistants exhibit more stable emotions and greater focus on code creation, evaluation, and verification, as captured by a new four-dimensional S-IASE model built from retrospective labeling of screen recordings, surveys, and interviews.
9 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.SE (9)
Citation roles: background (3)
Citing papers explorer
-
How Do Developers Interact with AI? An Exploratory Study on Modeling Developer Programming Behavior
Developers using AI assistants exhibit more stable emotions and greater focus on code creation, evaluation, and verification, as captured by a new four-dimensional S-IASE model built from retrospective labeling of screen recordings, surveys, and interviews.
-
ClarifySTL: An Interactive LLM Agent Framework for STL Transformation through Requirements Clarification
ClarifySTL uses LLM agents to interactively detect and resolve vagueness and ambiguity in natural language requirements via clarification queries before generating STL formulas, with evaluations on existing and new benchmarks demonstrating its effectiveness.
-
Quality-Driven Selective Mutation for Deep Learning
A dual-axis quality framework ranks DL mutation operators by statistical resistance and Jaccard-based realism to real faults, enabling up to 55.6% fewer mutants on held-out validation data without dropping baseline performance.
-
Ethics Testing: Proactive Identification of Generative AI System Harms
Ethics testing is introduced as a systematic approach to generate tests that identify software harms induced by unethical behavior in generative AI outputs.
-
LDMDroid: Leveraging LLMs for Detecting Data Manipulation Errors in Android Apps
LDMDroid applies LLMs in a state-aware process to trigger data manipulation functions and uses visual cues to detect errors, finding 17 bugs across 24 Android apps, 14 of which developers confirmed.
-
Knowledge-Graph-Driven Data Synthesis for Low-Resource Software Development: A HarmonyOS Case Study
APIKG4Syn synthesizes API-oriented training data via knowledge graphs and Monte Carlo search to fine-tune a 7B model that reaches 25% pass@1 on HarmonyOS code generation, beating untuned GPT-4o at 17.59%.
-
How Do Software Engineering Students Use Generative AI in Real-World Capstone Projects? An Empirical Baseline Study
This empirical baseline study characterizes generative AI usage across the software lifecycle in capstone projects, student-recommended responsible practices, and client expectations for understanding and quality.
-
Assessing REST API Test Generation Strategies with Log Coverage
Log coverage metrics show Claude Opus 4.6 tests uncover 28.4% more unique log templates than human-written Locust tests on Light-OAuth2, while combinations of strategies increase observed coverage by 30-105%.
-
From Helpful to Trustworthy: LLM Agents for Pair Programming
A research proposal for three studies on multi-agent LLM pair programming that externalizes intent and uses automated validation to increase trustworthiness.