Chain-of-thought prompting elicits reasoning in large language models
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.SE (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
GuidedFewShot prompts embedding a mutation taxonomy achieve the highest single-run failure-mode coverage in LLM-generated robustness tests for microservices, with prompt strategy explaining more diversity variation than model size.
citing papers explorer
- Story Point Estimation Using Large Language Models
  LLMs predict story points better in zero-shot prompting than supervised deep learning models trained on 80% of project data, with few-shot examples and comparative judgments further improving performance.
- LLM-Based Robustness Testing of Microservice Applications: An Empirical Study
  GuidedFewShot prompts embedding a mutation taxonomy achieve the highest single-run failure-mode coverage in LLM-generated robustness tests for microservices, with prompt strategy explaining more diversity variation than model size.