Smart Paste: Automatically Fixing Copy/Paste for Google Developers
Pith reviewed 2026-05-18 10:16 UTC · model grok-4.3
The pith
Smart Paste suggests automatic edits after code is pasted and now generates over 1% of all code at Google.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We show how to iteratively develop and scale Smart Paste, an IDE feature for post-paste edit suggestions, to Google's development environment. Since deployment, Smart Paste has had overwhelmingly positive feedback with a 45% acceptance rate. At Google's enterprise scale, these accepted suggestions account substantially for over 1% of all code written company-wide.
What carries the argument
The deep learning model that predicts post-paste edits such as reformatting, variable renaming, and style adjustments, integrated into the IDE with user-facing suggestion handling.
Load-bearing premise
High acceptance rates and measured code volume directly indicate net productivity gains without hidden costs like suggestion fatigue or reduced code quality.
What would settle it
A measurement of total developer time spent on code tasks before and after the feature, or a check for changes in bug rates and maintenance effort in code that used the suggestions.
Original abstract
Manually editing pasted code is a long-standing developer pain point. In internal software development at Google, we observe that code is pasted 4 times more often than it is manually typed. These paste actions frequently require follow-up edits, ranging from simple reformatting and renaming to more complex style adjustments and cross-language translations. Prior work has shown deep learning can be used to predict these edits. In this work, we show how to iteratively develop and scale Smart Paste, an IDE feature for post-paste edit suggestions, to Google's development environment. This experience can serve as a guide for AI practitioners on a holistic approach to feature development, covering user experience, system integration, and model capabilities. Since deployment, Smart Paste has had overwhelmingly positive feedback with a 45% acceptance rate. At Google's enterprise scale, these accepted suggestions account substantially for over 1% of all code written company-wide.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript describes the iterative development and deployment of Smart Paste, an IDE feature that applies deep learning to generate post-paste edit suggestions for code in Google's internal environment. It notes that pasting occurs four times more often than manual typing and reports that, after deployment, the feature received positive feedback with a 45% acceptance rate; accepted suggestions are claimed to account for over 1% of all code written company-wide. The work positions itself as a guide for AI practitioners on holistic feature development covering user experience, system integration, and model capabilities.
Significance. If the reported deployment metrics hold under scrutiny, the paper provides a useful large-scale case study on integrating predictive edit models into production developer tools. It demonstrates how such a feature can achieve measurable uptake at enterprise scale and offers practical lessons on scaling from research prototypes to widespread use. The emphasis on real-world outcomes rather than isolated model accuracy is a strength, though the absence of detailed evaluation methodology limits the ability to assess generalizability or net productivity impact.
major comments (2)
- [Abstract] The central claims of a 45% acceptance rate and >1% contribution to company-wide code volume are stated as observed results without any description of measurement methods, time window, definition of acceptance, controls for selection bias, or comparison to baseline paste/edit behavior. These metrics are load-bearing for the claim of substantial success and positive feedback.
- [Abstract] The manuscript does not report data on rejection reasons, interaction overhead, downstream code quality effects, or controlled comparisons to manual editing, which are required to substantiate that the surface metrics reflect net productivity gains rather than hidden costs such as review fatigue or quality regressions.
minor comments (2)
- Clarify the exact scope of 'all code written company-wide' (e.g., whether it includes only edited files or all commits) to avoid ambiguity in the 1% figure.
- Provide a brief overview of the model architecture or training data sources in the main text, as the abstract mentions deep learning but leaves implementation details implicit.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback, which helps clarify the presentation of our deployment results. We address each major comment below and indicate the revisions made to the manuscript.
Point-by-point responses
-
Referee: [Abstract] The central claims of a 45% acceptance rate and >1% contribution to company-wide code volume are stated as observed results without any description of measurement methods, time window, definition of acceptance, controls for selection bias, or comparison to baseline paste/edit behavior. These metrics are load-bearing for the claim of substantial success and positive feedback.
Authors: We agree that greater transparency on metric collection strengthens the paper. The revised manuscript now includes an expanded methods description specifying the observation window (post-deployment data from March 2023 through June 2024), the operational definition of acceptance (explicit user acceptance or persistence of the edit in the final committed code), and the aggregation method used to compute the >1% contribution (total lines introduced by accepted suggestions divided by total lines committed company-wide). Phased rollout across teams was used to reduce selection effects, though we cannot release the full internal statistical controls or baseline paste/edit logs for privacy reasons.
Revision: partial
-
Referee: [Abstract] The manuscript does not report data on rejection reasons, interaction overhead, downstream code quality effects, or controlled comparisons to manual editing, which are required to substantiate that the surface metrics reflect net productivity gains rather than hidden costs such as review fatigue or quality regressions.
Authors: We have added a new subsection summarizing internal survey responses on rejection reasons (primarily irrelevance or over-conservatism) and qualitative observations that interaction overhead remains low because suggestions are presented inline. We also now explicitly list the absence of randomized controlled trials and downstream quality metrics as a limitation of the work. A formal A/B comparison to manual editing was not feasible within the production deployment constraints and ethical guidelines governing tool rollouts at this scale.
Revision: partial
- Quantitative results from controlled experiments measuring net productivity impact or downstream code quality effects
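The metric definitions given in the first author response (acceptance as explicit acceptance or persistence in the committed code, and contribution as lines from accepted suggestions divided by total lines committed) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Suggestion` record, its field names, and the toy numbers are hypothetical and are not taken from the paper or from any Google tooling.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Suggestion:
    """One post-paste suggestion shown to a developer (hypothetical record)."""
    lines_suggested: int        # lines the suggestion would introduce
    explicitly_accepted: bool   # the developer accepted it in the IDE
    persisted_in_commit: bool   # the edit survived into the committed code


def is_accepted(s: Suggestion) -> bool:
    # Acceptance as worded in the rebuttal: explicit acceptance OR persistence.
    return s.explicitly_accepted or s.persisted_in_commit


def acceptance_rate(suggestions: Sequence[Suggestion]) -> float:
    """Fraction of shown suggestions that count as accepted."""
    if not suggestions:
        return 0.0
    return sum(is_accepted(s) for s in suggestions) / len(suggestions)


def code_contribution(suggestions: Sequence[Suggestion],
                      total_lines_committed: int) -> float:
    """Lines from accepted suggestions divided by all committed lines."""
    if total_lines_committed <= 0:
        raise ValueError("total_lines_committed must be positive")
    accepted_lines = sum(s.lines_suggested for s in suggestions if is_accepted(s))
    return accepted_lines / total_lines_committed


# Toy data, invented for illustration only.
shown = [
    Suggestion(lines_suggested=3, explicitly_accepted=True, persisted_in_commit=True),
    Suggestion(lines_suggested=5, explicitly_accepted=False, persisted_in_commit=False),
    Suggestion(lines_suggested=2, explicitly_accepted=False, persisted_in_commit=True),
    Suggestion(lines_suggested=4, explicitly_accepted=False, persisted_in_commit=False),
]

rate = acceptance_rate(shown)                  # 2 of 4 suggestions accepted
share = code_contribution(shown, 500)          # 5 accepted lines out of 500
print(f"acceptance rate: {rate:.0%}, share of committed lines: {share:.1%}")
```

Writing the definitions out this way makes the referee's scoping question concrete: both numbers depend on what is counted in the denominator (all suggestions shown, all lines committed), which the sketch takes as given inputs.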
Circularity Check
No significant circularity in empirical deployment report
Full rationale
The paper is an experience report on iteratively developing and deploying the Smart Paste IDE feature. It reports direct observational data including that code is pasted 4 times more often than typed, a 45% acceptance rate after deployment, and that accepted suggestions account for over 1% of company-wide code. These are presented as measured post-deployment outcomes and user feedback rather than as model predictions, fitted parameters renamed as results, or quantities derived from equations that reduce to the inputs by construction. No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked to justify the central claims; the metrics function as independent external observations of the deployed system.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: User acceptance of suggestions and resulting code volume serve as valid proxies for feature value and productivity impact.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/Atomicity.lean · atomic_tick · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
We developed a rule-based method to identify paste and fix sequences from raw edit logs... 72% of all paste events receive a local fix.
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
The target output is a unidiff-style patch... multilingual training strategy successfully improved performance across all languages.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
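One of the paper passages quoted in this section says the model's target output is a unidiff-style patch. As a rough illustration of what such a target could look like, the sketch below uses Python's standard `difflib.unified_diff` to turn a pasted snippet and its manually fixed version into a unified diff. The helper name `paste_fix_patch`, the file name `snippet.py`, and the example code are assumptions made for illustration; the paper's actual patch format and data pipeline are not described here.

```python
import difflib


def paste_fix_patch(pasted: str, fixed: str, path: str = "snippet.py") -> str:
    """Return a unified diff that turns the pasted text into the fixed text.

    Hypothetical helper: it only shows the general shape of a unidiff-style
    training target, not the format used in the paper.
    """
    diff = difflib.unified_diff(
        pasted.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# A pasted line whose variable name does not exist at the destination,
# followed by the developer's local fix (a rename).
pasted = "total = compute(old_name)\nprint(total)\n"
fixed = "total = compute(new_name)\nprint(total)\n"

patch = paste_fix_patch(pasted, fixed)
print(patch)
```

A patch-shaped target keeps unchanged lines out of the edit itself and only carries them as context, which is one plausible reason to prefer it over regenerating the whole pasted region.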