Empirical analysis of 338 PRs with self-admitted ChatGPT usage shows low full integration (median 25%), selective adaptation patterns, and broader influence on developer reasoning during reviews.
arXiv preprint arXiv:2305.00418 (2023)
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.SE 2representative citing papers
LiveCodeBench collects 400 recent contest problems to create a contamination-free benchmark evaluating LLMs on code generation and related capabilities like self-repair and execution.
citing papers explorer
-
PatchTrack: A Comprehensive Analysis of ChatGPT's Influence on Pull Request Outcomes
Empirical analysis of 338 PRs with self-admitted ChatGPT usage shows low full integration (median 25%), selective adaptation patterns, and broader influence on developer reasoning during reviews.
-
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
LiveCodeBench collects 400 recent contest problems to create a contamination-free benchmark evaluating LLMs on code generation and related capabilities like self-repair and execution.