Enhancing Python Compiler Error Messages via Stack Overflow

Christoph Treude; Emillie Thiselton

arxiv: 1906.11456 · v1 · pith:QQNTKRAWnew · submitted 2019-06-27 · 💻 cs.SE

Pith reviewed 2026-05-25 14:55 UTC · model grok-4.3

keywords: compiler error messages · Stack Overflow · Python · IDE plugin · user study · error message enhancement · Pycee

The pith

Stack Overflow threads can be automatically mined and summarized to enhance Python compiler error messages inside an IDE.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests the idea that discussions on Stack Overflow about Python errors contain usable information that can be collected and shown to programmers without them leaving their editor. The authors built Pycee, a Sublime Text plugin that queries Stack Overflow for each error and displays a custom summary. In a think-aloud study, 16 programmers completed tasks with Pycee and most said it was helpful, preferring it to a version that pulled from official Python documentation because it gave concrete fixes and code examples. The work shows that crowd-sourced Q&A content can be reused automatically to make error messages more actionable.

Core claim

Pycee automatically queries Stack Overflow to provide customised and summarised information about Python compiler errors within the Sublime Text IDE. When evaluated in a user study, the majority of the 16 participants agreed that Pycee was helpful, and they generally preferred it to a baseline using official Python documentation due to its concrete suggestions for fixes and example code.

What carries the argument

Pycee, an IDE plugin that automatically queries Stack Overflow and repackages relevant thread content as enhanced error messages.
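
As a rough illustration of the querying step, the sketch below turns the final line of a Python traceback into a Stack Overflow search URL. It is not Pycee's actual implementation: the helper names `build_query` and `build_search_url`, the decision to drop quoted identifiers, and the chosen parameters of the public Stack Exchange API search endpoint are all assumptions made for this example.

```python
import re
from urllib.parse import urlencode

# Public Stack Exchange API search endpoint. The parameter choices below
# are assumptions for this sketch, not settings taken from the paper.
API_URL = "https://api.stackexchange.com/2.3/search/advanced"


def build_query(error_line: str) -> str:
    """Turn the last traceback line into a search string (hypothetical helper).

    Keeps the exception type and message but drops quoted identifiers,
    since names from the user's own code rarely appear in other threads.
    """
    exc_type, _, message = error_line.partition(":")
    # Remove quoted identifiers such as 'my_variable' or 'module.attr'.
    message = re.sub(r"'[\w.]+'", "", message)
    # Collapse the whitespace left behind by the removal.
    message = " ".join(message.split())
    return f"python {exc_type.strip()} {message}".strip()


def build_search_url(error_line: str, page_size: int = 5) -> str:
    """Build a search URL restricted to Python-tagged Stack Overflow questions."""
    params = {
        "order": "desc",
        "sort": "relevance",
        "q": build_query(error_line),
        "tagged": "python",
        "site": "stackoverflow",
        "pagesize": page_size,
    }
    return f"{API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("NameError: name 'foo' is not defined"))
```

The sketch only builds the URL. Fetching and parsing the JSON response, ranking the returned threads, and handling rate limits are left out.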

If this is right

Programmers receive fix suggestions and examples directly in the editor instead of searching separately.
Official documentation is no longer the only source for improving error messages.
Time spent resolving common Python errors can decrease for users of the enhanced messages.
The same reuse of online Q&A content becomes feasible for other programming tasks beyond error messages.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could be tested on languages other than Python where Stack Overflow has dense error discussions.
If the summarization step is made more robust, the same pipeline might apply to runtime errors or warnings.
Integration into other editors would let the benefit reach programmers who do not use Sublime Text.

Load-bearing premise

Stack Overflow threads contain accurate, relevant, and summarizable information about Python errors that improves programmer understanding without introducing new confusion or incorrect advice.
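
To make "summarizable" concrete, the sketch below shows one simple extractive technique: score each sentence by the average corpus frequency of its non-stop-words and keep the top few in their original order. This is a frequency-based heuristic in the spirit of Luhn's 1958 work (which appears in the paper's reference list), not Luhn's exact measure, and nothing here confirms that Pycee summarizes answers this way. The stop-word list, the sentence splitter, and the function names are assumptions for illustration.

```python
import re
from collections import Counter
from typing import List

# Tiny illustrative stop-word list (an assumption; real systems use larger ones).
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "in", "it",
              "that", "this", "you", "for", "on", "with", "as", "be", "or"}


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence: str) -> List[str]:
    """Lower-cased alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z_]+", sentence.lower())
            if w not in STOP_WORDS]


def summarize(text: str, max_sentences: int = 2) -> str:
    """Keep the highest-scoring sentences, preserving their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence: str) -> float:
        ws = content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in ws) / len(ws) if ws else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    answer = ("Indentation matters in Python. "
              "An unexpected indent error means a line has extra indentation. "
              "I had lunch today. "
              "Remove the extra indentation so the line matches its block.")
    print(summarize(answer, max_sentences=3))
```

A heuristic like this has no notion of correctness: it will happily surface a confidently worded but wrong answer, which is exactly the risk the premise above has to absorb.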

What would settle it

A controlled study in which programmers using the Stack Overflow summaries take longer to fix errors or introduce more new bugs than those using only the official documentation would falsify the central claim.

Figures

Figures reproduced from arXiv: 1906.11456 by Christoph Treude, Emillie Thiselton.

**Figure 1.** Screenshot of PYCEE. The first few lines on white background show the original compiler error message produced by Python, the additional lines show the enhanced error message produced by PYCEE. The message provided by PYCEE is a summary of Stack Overflow answer 2395167. Note that in the screenshot, the offending line has already been corrected. […] by adding related verbs and syntax from other programming lang…

**Figure 2.** Participant experience in years (log scale).

**Figure 3.** Perceived helpfulness of PYCEE variants. […] formal (3), e.g., P3 noted “This is not clear at all, I want plain English” after encountering an IndentationError. There was a general feeling among participants that a tool for enhancing compiler error messages should focus on common errors, as stated by P3: “You should use common (basic) errors as test cases when testing this plugin”. When using PYCEE, participants…

**Figure 5.** User satisfaction for PYCEE variants. […] answers by referring to the style of the enhanced error messages (4) and the presence of code examples (4). For example, P4 explained: “I liked the code examples and full sentences in normal English, written for humans”. On the other hand, one of the disadvantages of PYCEE is that it relies on information from Stack Overflow, which may or may not be correct. Several parti…

Original abstract

Background: Compilers tend to produce cryptic and uninformative error messages, leaving programmers confused and requiring them to spend precious time to resolve the underlying error. To find help, programmers often take to online question-and-answer forums such as Stack Overflow to start discussion threads about the errors they encountered. Aims: We conjecture that information from Stack Overflow threads which discuss compiler errors can be automatically collected and repackaged to provide programmers with enhanced compiler error messages, thus saving programmers' time and energy. Method: We present Pycee, a plugin integrated with the popular Sublime Text IDE to provide enhanced compiler error messages for the Python programming language. Pycee automatically queries Stack Overflow to provide customised and summarised information within the IDE. We evaluated two Pycee variants through a think-aloud user study during which 16 programmers completed Python programming tasks while using Pycee. Results: The majority of participants agreed that Pycee was helpful while completing the study tasks. When compared to a baseline relying on the official Python documentation to enhance compiler error messages, participants generally preferred Pycee in terms of helpfulness, citing concrete suggestions for fixes and example code as major benefits. Conclusions: Our results confirm that data from online sources such as Stack Overflow can be successfully used to automatically enhance compiler error messages. Our work opens up venues for future work to further enhance compiler error messages as well as to automatically reuse content from Stack Overflow for other aspects of programming.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper presents Pycee, a Sublime Text IDE plugin that automatically queries Stack Overflow to retrieve, summarize, and display customized information alongside Python compiler error messages. Two variants are evaluated in a think-aloud study with 16 participants who completed programming tasks; results show that the majority found Pycee helpful and generally preferred it to a baseline that enhanced messages using official Python documentation, primarily due to concrete fix suggestions and example code. The authors conclude that Stack Overflow data can be successfully reused to enhance compiler error messages.

Significance. If the central claim holds, the work demonstrates a practical approach to repurposing online Q&A content for IDE tooling, supported by an empirical user study with direct baseline comparison. This provides qualitative evidence of user preference and opens directions for similar applications to other languages or programming activities. The inclusion of a controlled think-aloud protocol with participant feedback is a positive aspect of the evaluation design.

major comments (2)

[Results / User Study] The claim that Stack Overflow data can be 'successfully used' to enhance error messages is supported only by subjective reports of helpfulness and preference from 16 participants. No objective metrics (task completion rates, time-to-fix, or pre/post understanding scores) are reported, so the evidence does not directly address whether the SO-derived content improves error resolution or merely appears appealing.
[Method] The baseline condition uses 'official Python documentation to enhance compiler error messages,' yet the paper provides no description of how this baseline was implemented or how its content was selected and presented, preventing assessment of whether the observed preference is attributable to SO content specifically or to differences in summarization style.

minor comments (1)

[Abstract] The abstract states that 'two Pycee variants' were evaluated but does not indicate what distinguishes the variants or which results apply to each.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments below, indicating planned revisions where appropriate.

Referee: [Results / User Study] The claim that Stack Overflow data can be 'successfully used' to enhance error messages is supported only by subjective reports of helpfulness and preference from 16 participants. No objective metrics (task completion rates, time-to-fix, or pre/post understanding scores) are reported, so the evidence does not directly address whether the SO-derived content improves error resolution or merely appears appealing.

Authors: We agree that the evaluation relies on subjective participant reports from a think-aloud study with 16 programmers rather than objective measures such as task completion time or error resolution accuracy. The study design prioritised qualitative insights into perceived helpfulness and preference under realistic conditions, which aligns with the goal of assessing tool usability. However, the concluding claim that Stack Overflow data can be 'successfully used' is stronger than the subjective evidence warrants. We will revise the Conclusions section to state that the results provide evidence of user preference for the SO-enhanced messages, without claiming objective improvements in error resolution.

revision: partial

Referee: [Method] The baseline condition uses 'official Python documentation to enhance compiler error messages,' yet the paper provides no description of how this baseline was implemented or how its content was selected and presented, preventing assessment of whether the observed preference is attributable to SO content specifically or to differences in summarization style.

Authors: We accept this criticism. The baseline was created by selecting relevant excerpts from the official Python documentation for each encountered error and formatting them similarly to the Pycee output. We will expand the Method section with a full description of baseline content selection, summarisation approach, and presentation format to enable clearer comparison.

revision: yes

Circularity Check

0 steps flagged

No circularity: empirical evaluation with external user feedback

The paper describes an empirical tool (Pycee) that queries Stack Overflow and a think-aloud user study with 16 participants comparing it to official documentation. No equations, fitted parameters, predictions, or derivations appear in the abstract or described method. The central claim rests on participant preference ratings, which constitute external feedback rather than any self-referential reduction. No self-citations, uniqueness theorems, or ansatzes are invoked as load-bearing steps. The work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the domain assumption that community-generated Stack Overflow content is suitable for automated summarization and display in error messages. No free parameters or invented entities are introduced.

axioms (1)

domain assumption: Stack Overflow threads contain accurate and relevant information about Python compiler errors that can be automatically retrieved and summarized to help programmers.
This premise underpins both the tool construction and the claim of successful enhancement; it is tested indirectly via user preference but not independently verified.

pith-pipeline@v0.9.0 · 5782 in / 1226 out tokens · 34007 ms · 2026-05-25T14:55:57.713700+00:00 · methodology

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages

[1] R. L. Wexelblat, “Maxims for malfeasant designers, or how to design languages to make programming as difficult as possible,” in Proceedings of the International Conference on Software Engineering, 1976, pp. 331–336.
[2] V. J. Traver, “On compiler error messages: What they say and what they mean,” Advances in Human-Computer Interaction, vol. 2010, pp. 3:1–3:26, 2010.
[3] B. A. Becker, “An effective approach to enhancing compiler error messages,” in Proceedings of the Technical Symposium on Computing Science Education, 2016, pp. 126–131.
[4] G. Marceau, K. Fisler, and S. Krishnamurthi, “Mind your language: On novices’ interactions with error messages,” in Proceedings of the Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, 2011, pp. 3–18.
[5] C. Treude, O. Barzilay, and M.-A. Storey, “How do programmers ask and answer questions on the web? (NIER track),” in Proceedings of the International Conference on Software Engineering, 2011, pp. 804–807.
[6] L. B. L. de Souza, E. C. Campos, and M. de Almeida Maia, “Ranking crowd knowledge to assist software development,” in Proceedings of the International Conference on Program Comprehension, 2014, pp. 72–82.
[7] S. M. Nasehi, J. Sillito, F. Maurer, and C. Burns, “What makes a good code example?: A study of programming Q&A in StackOverflow,” in Proceedings of the International Conference on Software Maintenance, 2012, pp. 25–34.
[8] F. M. Delfim, K. V. R. Paixão, D. Cassou, and M. de Almeida Maia, “Redocumenting APIs with crowd knowledge: a coverage analysis based on question types,” Journal of the Brazilian Computer Society, vol. 22, no. 1, 2016.
[9] P. Chatterjee, M. A. Nishi, K. Damevski, V. Augustine, L. Pollock, and N. A. Kraft, “What information about code snippets is available in different software-related documents? An exploratory study,” in Proceedings of the International Conference on Software Analysis, Evolution and Reengineering, 2017, pp. 382–386.
[10] L. Ponzanelli, “Holistic recommender systems for software engineering,” in Companion Proceedings of the International Conference on Software Engineering, 2014, pp. 686–689.
[11] C. Treude and M. P. Robillard, “Augmenting API documentation with insights from Stack Overflow,” in Proceedings of the International Conference on Software Engineering, 2016, pp. 392–403.
[12] B. A. Becker, G. Glanville, R. Iwashima, C. McDonnell, K. Goslin, and C. Mooney, “Effective compiler error message enhancement for novice programming students,” Computer Science Education, vol. 26, no. 2–3, pp. 148–175, 2016.
[13] S. Haiduc, G. Bavota, A. Marcus, R. Oliveto, A. De Lucia, and T. Menzies, “Automatic query reformulations for text retrieval in software engineering,” in Proceedings of the International Conference on Software Engineering, 2013, pp. 842–851.
[14] M. Lu, X. Sun, S. Wang, D. Lo, and Y. Duan, “Query expansion via WordNet for effective code search,” in Proceedings of the International Conference on Software Analysis, Evolution, and Reengineering, 2015, pp. 545–549.
[15] A. Stefik and S. Siebert, “An empirical investigation into programming language syntax,” ACM Transactions on Computing Education, vol. 13, no. 4, pp. 19:1–19:40, 2013.
[16] M. Kersten and G. C. Murphy, “Using task context to improve programmer productivity,” in Proceedings of the International Symposium on Foundations of Software Engineering, 2006, pp. 1–11.
[17] C. Treude, M. P. Robillard, and B. Dagenais, “Extracting development tasks to navigate software documentation,” IEEE Transactions on Software Engineering, vol. 41, no. 6, pp. 565–581, 2015.
[18] C. Treude, M. Sicard, M. Klocke, and M. P. Robillard, “Tasknav: Task-based navigation of software documentation,” in Proceedings of the International Conference on Software Engineering - Volume 2, 2015, pp. 649–652.
[19] Y. Tian, D. Lo, and J. Lawall, “Sewordsim: Software-specific word similarity database,” in Companion Proceedings of the International Conference on Software Engineering, 2014, pp. 568–571.
[20] P. J. Guo, “Online python tutor: Embeddable web-based program visualization for CS education,” in Proceeding of the Technical Symposium on Computer Science Education, 2013, pp. 579–584.
[21] M. Monperrus and A. Maia, “Debugging with the crowd: a debug recommendation system based on Stackoverflow,” Université Lille 1 - Sciences et Technologies, Tech. Rep. hal-00987395, 2014.
[22] H. P. Luhn, “The automatic creation of literature abstracts,” IBM Journal of Research and Development, vol. 2, no. 2, pp. 159–165, 1958.
[23] M.-H. Nienaltowski, M. Pedroni, and B. Meyer, “Compiler error messages: What can help novices?” in Proceedings of the Technical Symposium on Computer Science Education, 2008, pp. 168–172.
[24] L. Moreno, J. Aponte, G. Sridhara, A. Marcus, L. Pollock, and K. Vijay-Shanker, “Automatic generation of natural language summaries for Java classes,” in Proceedings of the International Conference on Program Comprehension, 2013, pp. 23–32.
[25] A. Strauss and J. Corbin, Basics of qualitative research: Techniques and procedures for developing grounded theory, 2nd ed. Sage Publications, Inc., 1998.
[26] K.-J. Stol, P. Ralph, and B. Fitzgerald, “Grounded theory in software engineering research: A critical review and guidelines,” in Proceedings of the International Conference on Software Engineering, 2016, pp. 120–131.
[27] P. Bazeley and K. Jackson, Qualitative data analysis with NVivo. Sage Publications Limited, 2013.
[28] C. Ragkhitwetsagul, J. Krinke, M. Paixao, G. Bianco, and R. Oliveto, “Toxic code snippets on Stack Overflow,” IEEE Transactions on Software Engineering, 2019, to appear.
[29] W. Maalej and M. P. Robillard, “Patterns of knowledge in API reference documentation,” IEEE Transactions on Software Engineering, vol. 39, no. 9, pp. 1264–1282, 2013.
[30] C. Parnin, C. Treude, L. Grammel, and M.-A. Storey, “Crowd documentation: Exploring the coverage and the dynamics of API discussions on Stack Overflow,” Georgia Institute of Technology, Tech. Rep., 2012.
[31] P. Antunes, V. Herskovic, S. F. Ochoa, and J. A. Pino, “Reviewing the quality of awareness support in collaborative applications,” Journal of Systems and Software, vol. 89, no. C, pp. 146–169, 2014.
[32] T. Barik, J. Witschey, B. Johnson, and E. Murphy-Hill, “Compiler error notifications revisited: An interaction-first approach for helping developers more effectively comprehend and resolve error notifications,” in Companion Proceedings of the International Conference on Software Engineering, 2014, pp. 536–539.
[33] J. Prather, R. Pettit, K. H. McMurry, A. Peters, J. Homer, N. Simone, and M. Cohen, “On novices’ interaction with compiler error messages: A human factors approach,” in Proceedings of the Conference on International Computing Education Research, 2017, pp. 74–82.
[34] A. Seffah, M. Donyaee, R. B. Kline, and H. K. Padda, “Usability measurement and metrics: A consolidated model,” Software Quality Journal, vol. 14, no. 2, pp. 159–178, 2006.
[35] M. Hristova, A. Misra, M. Rutter, and R. Mercuri, “Identifying and correcting Java programming errors for introductory computer science students,” in Proceedings of the Technical Symposium on Computer Science Education, 2003, pp. 153–156.
[36] L. Ponzanelli, A. Bacchelli, and M. Lanza, “Seahawk: Stack Overflow in the IDE,” in Proceedings of the International Conference on Software Engineering, 2013, pp. 1295–1298.
[37] L. Ponzanelli, G. Bavota, M. Di Penta, R. Oliveto, and M. Lanza, “Mining StackOverflow to turn the IDE into a self-confident programming prompter,” in Proceedings of the Working Conference on Mining Software Repositories, 2014, pp. 102–111.
[38] J. Cordeiro, B. Antunes, and P. Gomes, “Context-based recommendation to support problem solving in software development,” in Proceedings of the International Workshop on Recommendation Systems for Software Engineering, 2012, pp. 85–89.
[39] E. Wong, J. Yang, and L. Tan, “Autocomment: Mining question and answer sites for automatic comment generation,” in Proceedings of the International Conference on Automated Software Engineering, 2013, pp. 562–567.
[40] B. A. Campbell and C. Treude, “NLP2Code: Code snippet content assist via natural language tasks,” in Proceedings of the International Conference on Software Maintenance and Evolution, 2017, pp. 628–632.
[41] H. Zhang, A. Jain, G. Khandelwal, C. Kaushik, S. Ge, and W. Hu, “Bing developer assistant: Improving developer productivity by recommending sample code,” in Proceedings of the International Symposium on Foundations of Software Engineering, 2016, pp. 956–961.
[42] C. Treude and M. P. Robillard, “Understanding Stack Overflow code fragments,” in Proceedings of the International Conference on Software Maintenance and Evolution, 2017, pp. 509–513.
[43] S. Haiduc, J. Aponte, L. Moreno, and A. Marcus, “On the use of automated text summarization techniques for summarizing source code,” in Proceedings of the Working Conference on Reverse Engineering, 2010, pp. 35–44.
[44] P. W. McBurney and C. McMillan, “Automatic source code summarization of context for Java methods,” IEEE Transactions on Software Engineering, vol. 42, no. 2, pp. 103–119, 2016.
[45] A. Alqaimi, P. Thongtanunam, and C. Treude, “Automatically generating documentation for lambda expressions in Java,” in Proceedings of the International Conference on Mining Software Repositories, 2019, pp. 310–320.
[46] A. T. T. Ying and M. P. Robillard, “Code fragment summarization,” in Proceedings of the Joint Meeting on Foundations of Software Engineering, 2013, pp. 655–658.
[47] R. P. Buse and W. R. Weimer, “Automatic documentation inference for exceptions,” in Proceedings of the International Symposium on Software Testing and Analysis, 2008, pp. 273–282.
[48] S. Rastkar, G. C. Murphy, and A. W. J. Bradley, “Generating natural language summaries for crosscutting source code concerns,” in Proceedings of the International Conference on Software Maintenance, 2011, pp. 103–112.
[49] S. Rastkar, G. C. Murphy, and G. Murray, “Summarizing software artifacts: A case study of bug reports,” in Proceedings of the International Conference on Software Engineering - Volume 1, 2010, pp. 505–514.
[50] S. Rastkar, G. C. Murphy, and G. Murray, “Automatic summarization of bug reports,” IEEE Transactions on Software Engineering, vol. 40, no. 4, pp. 366–380, 2014.

[1] [1]

Maxims for malfeasant designers, or how to design languages to make programming as difﬁcult as possible,

R. L. Wexelblat, “Maxims for malfeasant designers, or how to design languages to make programming as difﬁcult as possible,” inProceedings of the International Conference on Software Engineering, 1976, pp. 331– 336

work page 1976

[2] [2]

On compiler error messages: What they say and what they mean,

V . J. Traver, “On compiler error messages: What they say and what they mean,” Advances in Human-Computer Interaction , vol. 2010, pp. 3:1–3:26, 2010

work page 2010

[3] [3]

An effective approach to enhancing compiler error messages,

B. A. Becker, “An effective approach to enhancing compiler error messages,” in Proceedings of the Technical Symposium on Computing Science Education, 2016, pp. 126–131

work page 2016

[4] [4]

Mind your language: On novices’ interactions with error messages,

G. Marceau, K. Fisler, and S. Krishnamurthi, “Mind your language: On novices’ interactions with error messages,” in Proceedings of the Sym- posium on New Ideas, New Paradigms, and Reﬂections on Programming and Software, 2011, pp. 3–18

work page 2011

[5] [5]

How do programmers ask and answer questions on the web? (NIER track),

C. Treude, O. Barzilay, and M.-A. Storey, “How do programmers ask and answer questions on the web? (NIER track),” in Proceedings of the International Conference on Software Engineering , 2011, pp. 804–807

work page 2011

[6] [6]

Ranking crowd knowledge to assist software development,

L. B. L. de Souza, E. C. Campos, and M. de Almeida Maia, “Ranking crowd knowledge to assist software development,” in Proceedings of the International Conference on Program Comprehension, 2014, pp. 72–82

work page 2014

[7] [7]

What makes a good code example?: A study of programming Q&A in StackOverﬂow,

S. M. Nasehi, J. Sillito, F. Maurer, and C. Burns, “What makes a good code example?: A study of programming Q&A in StackOverﬂow,” in Proceedings of the International Conference on Software Maintenance , 2012, pp. 25–34

work page 2012

[8] [8]

Redocumenting APIs with crowd knowledge: a coverage analysis based on question types,

F. M. Delﬁm, K. V . R. Paix ˜ao, D. Cassou, and M. de Almeida Maia, “Redocumenting APIs with crowd knowledge: a coverage analysis based on question types,” Journal of the Brazilian Computer Society , vol. 22, no. 1, 2016

work page 2016

[9] [9]

What information about code snippets is available in differ- ent software-related documents? An exploratory study,

P. Chatterjee, M. A. Nishi, K. Damevski, V . Augustine, L. Pollock, and N. A. Kraft, “What information about code snippets is available in differ- ent software-related documents? An exploratory study,” in Proceedings of the International Conference on Software Analysis, Evolution and Reengineering, 2017, pp. 382–386

work page 2017

[10] [10]

Holistic recommender systems for software engineering,

L. Ponzanelli, “Holistic recommender systems for software engineering,” in Companion Proceedings of the International Conference on Software Engineering, 2014, pp. 686–689

work page 2014

[11] [11]

Augmenting API documentation with insights from Stack Overﬂow,

C. Treude and M. P. Robillard, “Augmenting API documentation with insights from Stack Overﬂow,” in Proceedings of the International Conference on Software Engineering , 2016, pp. 392–403

work page 2016

[12] [12]

Effective compiler error message enhancement for novice programming students,

B. A. Becker, G. Glanville, R. Iwashima, C. McDonnell, K. Goslin, and C. Mooney, “Effective compiler error message enhancement for novice programming students,” Computer Science Education , vol. 26, no. 2–3, pp. 148–175, 2016

work page 2016

[13] [13]

Automatic query reformulations for text retrieval in soft- ware engineering,

S. Haiduc, G. Bavota, A. Marcus, R. Oliveto, A. De Lucia, and T. Menzies, “Automatic query reformulations for text retrieval in soft- ware engineering,” in Proceedings of the International Conference on Software Engineering, 2013, pp. 842–851

work page 2013

[14] [14]

Query expansion via WordNet for effective code search,

M. Lu, X. Sun, S. Wang, D. Lo, and Y . Duan, “Query expansion via WordNet for effective code search,” in Proceedings of the International Conference on Software Analysis, Evolution, and Reengineering , 2015, pp. 545–549

work page 2015

[15] A. Stefik and S. Siebert, “An empirical investigation into programming language syntax,” ACM Transactions on Computing Education, vol. 13, no. 4, pp. 19:1–19:40, 2013

[16] M. Kersten and G. C. Murphy, “Using task context to improve programmer productivity,” in Proceedings of the International Symposium on Foundations of Software Engineering, 2006, pp. 1–11

[17] C. Treude, M. P. Robillard, and B. Dagenais, “Extracting development tasks to navigate software documentation,” IEEE Transactions on Software Engineering, vol. 41, no. 6, pp. 565–581, 2015

[18] C. Treude, M. Sicard, M. Klocke, and M. P. Robillard, “Tasknav: Task-based navigation of software documentation,” in Proceedings of the International Conference on Software Engineering - Volume 2, 2015, pp. 649–652

[19] Y. Tian, D. Lo, and J. Lawall, “Sewordsim: Software-specific word similarity database,” in Companion Proceedings of the International Conference on Software Engineering, 2014, pp. 568–571

[20] P. J. Guo, “Online python tutor: Embeddable web-based program visualization for CS education,” in Proceeding of the Technical Symposium on Computer Science Education, 2013, pp. 579–584

[21] M. Monperrus and A. Maia, “Debugging with the crowd: a debug recommendation system based on Stackoverflow,” Université Lille 1 - Sciences et Technologies, Tech. Rep. hal-00987395, 2014

[22] H. P. Luhn, “The automatic creation of literature abstracts,” IBM Journal of Research and Development, vol. 2, no. 2, pp. 159–165, 1958

[23] M.-H. Nienaltowski, M. Pedroni, and B. Meyer, “Compiler error messages: What can help novices?” in Proceedings of the Technical Symposium on Computer Science Education, 2008, pp. 168–172

[24] L. Moreno, J. Aponte, G. Sridhara, A. Marcus, L. Pollock, and K. Vijay-Shanker, “Automatic generation of natural language summaries for Java classes,” in Proceedings of the International Conference on Program Comprehension, 2013, pp. 23–32

[25] A. Strauss and J. Corbin, Basics of qualitative research: Techniques and procedures for developing grounded theory, 2nd ed. Sage Publications, Inc., 1998

[26] K.-J. Stol, P. Ralph, and B. Fitzgerald, “Grounded theory in software engineering research: A critical review and guidelines,” in Proceedings of the International Conference on Software Engineering, 2016, pp. 120–131

[27] P. Bazeley and K. Jackson, Qualitative data analysis with NVivo. Sage Publications Limited, 2013

[28] C. Ragkhitwetsagul, J. Krinke, M. Paixao, G. Bianco, and R. Oliveto, “Toxic code snippets on Stack Overflow,” IEEE Transactions on Software Engineering, 2019, to appear

[29] W. Maalej and M. P. Robillard, “Patterns of knowledge in API reference documentation,” IEEE Transactions on Software Engineering, vol. 39, no. 9, pp. 1264–1282, 2013

[30] C. Parnin, C. Treude, L. Grammel, and M.-A. Storey, “Crowd documentation: Exploring the coverage and the dynamics of API discussions on Stack Overflow,” Georgia Institute of Technology, Tech. Rep., 2012

[31] P. Antunes, V. Herskovic, S. F. Ochoa, and J. A. Pino, “Reviewing the quality of awareness support in collaborative applications,” Journal of Systems and Software, vol. 89, no. C, pp. 146–169, 2014

[32] T. Barik, J. Witschey, B. Johnson, and E. Murphy-Hill, “Compiler error notifications revisited: An interaction-first approach for helping developers more effectively comprehend and resolve error notifications,” in Companion Proceedings of the International Conference on Software Engineering, 2014, pp. 536–539

[33] J. Prather, R. Pettit, K. H. McMurry, A. Peters, J. Homer, N. Simone, and M. Cohen, “On novices’ interaction with compiler error messages: A human factors approach,” in Proceedings of the Conference on International Computing Education Research, 2017, pp. 74–82

[34] A. Seffah, M. Donyaee, R. B. Kline, and H. K. Padda, “Usability measurement and metrics: A consolidated model,” Software Quality Journal, vol. 14, no. 2, pp. 159–178, 2006

[35] M. Hristova, A. Misra, M. Rutter, and R. Mercuri, “Identifying and correcting Java programming errors for introductory computer science students,” in Proceedings of the Technical Symposium on Computer Science Education, 2003, pp. 153–156

[36] L. Ponzanelli, A. Bacchelli, and M. Lanza, “Seahawk: Stack Overflow in the IDE,” in Proceedings of the International Conference on Software Engineering, 2013, pp. 1295–1298

[37] L. Ponzanelli, G. Bavota, M. Di Penta, R. Oliveto, and M. Lanza, “Mining StackOverflow to turn the IDE into a self-confident programming prompter,” in Proceedings of the Working Conference on Mining Software Repositories, 2014, pp. 102–111

[38] J. Cordeiro, B. Antunes, and P. Gomes, “Context-based recommendation to support problem solving in software development,” in Proceedings of the International Workshop on Recommendation Systems for Software Engineering, 2012, pp. 85–89

[39] E. Wong, J. Yang, and L. Tan, “Autocomment: Mining question and answer sites for automatic comment generation,” in Proceedings of the International Conference on Automated Software Engineering, 2013, pp. 562–567

[40] B. A. Campbell and C. Treude, “NLP2Code: Code snippet content assist via natural language tasks,” in Proceedings of the International Conference on Software Maintenance and Evolution, 2017, pp. 628–632

[41] H. Zhang, A. Jain, G. Khandelwal, C. Kaushik, S. Ge, and W. Hu, “Bing developer assistant: Improving developer productivity by recommending sample code,” in Proceedings of the International Symposium on Foundations of Software Engineering, 2016, pp. 956–961

[42] C. Treude and M. P. Robillard, “Understanding Stack Overflow code fragments,” in Proceedings of the International Conference on Software Maintenance and Evolution, 2017, pp. 509–513

[43] S. Haiduc, J. Aponte, L. Moreno, and A. Marcus, “On the use of automated text summarization techniques for summarizing source code,” in Proceedings of the Working Conference on Reverse Engineering, 2010, pp. 35–44

[44] P. W. McBurney and C. McMillan, “Automatic source code summarization of context for Java methods,” IEEE Transactions on Software Engineering, vol. 42, no. 2, pp. 103–119, 2016

[45] A. Alqaimi, P. Thongtanunam, and C. Treude, “Automatically generating documentation for lambda expressions in Java,” in Proceedings of the International Conference on Mining Software Repositories, 2019, pp. 310–320

[46] A. T. T. Ying and M. P. Robillard, “Code fragment summarization,” in Proceedings of the Joint Meeting on Foundations of Software Engineering, 2013, pp. 655–658

[47] R. P. Buse and W. R. Weimer, “Automatic documentation inference for exceptions,” in Proceedings of the International Symposium on Software Testing and Analysis, 2008, pp. 273–282

[48] S. Rastkar, G. C. Murphy, and A. W. J. Bradley, “Generating natural language summaries for crosscutting source code concerns,” in Proceedings of the International Conference on Software Maintenance, 2011, pp. 103–112

[49] S. Rastkar, G. C. Murphy, and G. Murray, “Summarizing software artifacts: A case study of bug reports,” in Proceedings of the International Conference on Software Engineering - Volume 1, 2010, pp. 505–514

[50] ——, “Automatic summarization of bug reports,” IEEE Transactions on Software Engineering, vol. 40, no. 4, pp. 366–380, 2014