An Empirical Study of API Misuses of Data-Centric Libraries

Akalanka Galappaththi; Christoph Treude; Sarah Nadi

arxiv: 2408.15853 · v2 · submitted 2024-08-28 · 💻 cs.SE


Pith reviewed 2026-05-23 22:12 UTC · model grok-4.3

classification 💻 cs.SE

keywords API misuse · data-centric libraries · empirical study · Stack Overflow · GitHub · software engineering


The pith

Characteristics of API misuses in deep learning libraries extend to other data-centric libraries such as those for data processing and numerical computation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies API misuses across five data-centric libraries by examining Stack Overflow posts and GitHub issues. It finds that many misuse traits previously noted for deep learning APIs, including specific symptoms and causes, also appear here. Developers misuse these APIs regardless of whether the violated directive is stated in the library documentation. The work argues that the data-centric character of the APIs, rather than deep learning specifics, drives these patterns. The collected misuse examples and their classification provide a basis for improving detection methods beyond deep learning cases.

Core claim

Manual review of misuse instances from Stack Overflow and GitHub shows that the nature, symptoms, and root causes of misuses in the studied data-centric libraries closely match those observed for deep learning libraries, and that developers misuse APIs irrespective of whether usage directives appear in the documentation.

What carries the argument

Comparative empirical analysis of misuse reports drawn from Stack Overflow and GitHub for five data-centric libraries spanning data processing, numerical computation, machine learning, and visualization.

If this is right

Current API misuse detectors developed for traditional or deep-learning libraries may miss or misclassify errors in data-centric settings.
Adding documentation directives alone will not eliminate the observed misuse rates.
Detection tools should incorporate data-structure and workflow constraints typical of data-centric APIs.
Future studies can reuse the collected misuse dataset to test new detection approaches.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same analysis approach could be applied to libraries outside the five studied to test whether the pattern holds more broadly.
Tool builders might prioritize checks for data-type and parameter-interaction errors over general syntax violations.
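As a rough illustration of the kind of parameter-level check mentioned above, the following sketch uses Python's standard `ast` module to flag calls that omit a given keyword argument. It is not the paper's method or any existing detector: the function name `find_calls_missing_keyword` and the sample snippet are hypothetical, and the choice of `tf.constant` with `dtype` simply mirrors the TensorFlow example quoted in the figure excerpts. Omitting an optional parameter is only a misuse when later code depends on it, so the output is a list of candidates to review, not confirmed errors.

```python
import ast


def find_calls_missing_keyword(source, func_name, keyword):
    """Return sorted line numbers of calls to `func_name` that omit `keyword`.

    Hypothetical helper for illustration only. Known limitation: an argument
    passed positionally is not recognised, so such calls are still flagged.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both `constant(...)` and `tf.constant(...)` by the trailing name.
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name != func_name:
            continue
        has_keyword = any(kw.arg == keyword for kw in node.keywords)
        # A `**kwargs` splat might supply the keyword, so do not flag it.
        has_splat = any(kw.arg is None for kw in node.keywords)
        if not has_keyword and not has_splat:
            findings.append(node.lineno)
    return sorted(findings)


# The sample is parsed, never executed, so TensorFlow need not be installed.
SAMPLE = '''import tensorflow as tf
a = tf.constant([1, 2, 3])
b = tf.constant([1.0, 2.0], dtype=tf.float32)
c = tf.constant([4, 5], **opts)
'''

missing = find_calls_missing_keyword(SAMPLE, "constant", "dtype")
print(missing)
```

A purely syntactic check like this cannot tell whether a downstream API actually requires a specific data type, which is the condition that turns the omission into a misuse; a real tool would need data-flow or type information for that.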

Load-bearing premise

The misuses identified from Stack Overflow and GitHub posts represent typical developer errors with these libraries, and the five chosen libraries cover the main range of data-centric APIs.

What would settle it

A replication study that finds substantially different misuse characteristics or that shows developers follow documented directives at much higher rates than reported here would falsify the extension claim.

Figures

Figures reproduced from arXiv: 2408.15853 by Akalanka Galappaththi, Christoph Treude, Sarah Nadi.

**Figure 2.** Our updated misuse classification taxonomy, based …

**Figure 4.** Missing API parameter for seaborn’s distplot.

Surrounding text from the paper: "… other code context implies the necessity of setting a particular parameter (or its value). This is what this API misuse type refers to. When creating tensors in TensorFlow, the parameter dtype is optional. If a subsequent API requires a specific data type for the input tensor, failing to set dtype appropriately can lead to unexpected results and propagate erro…"

**Figure 3.** Redundant API call to seaborn’s FacetGrid when using lmplot.

Surrounding text from the paper: "… type of API misuse in deep learning libraries, as they are heavily reliant on tensor computations, we also observe similar misuses in our data set. For example, in pandas, failing to call pivot on a pandas dataframe before passing it to heatmap results in a runtime error, because the input is not in wide format as heatmap expects. It is important …"
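The pivot-before-heatmap example in the excerpt above can be sketched as follows. This is an illustration under assumptions, not code from the paper or its data set: the column names and values are made up, and the seaborn call is left as a comment so the snippet only depends on pandas.

```python
import pandas as pd

# Hypothetical long-format data: one row per (month, city) observation.
long_df = pd.DataFrame(
    {
        "month": ["Jan", "Jan", "Feb", "Feb"],
        "city": ["Oslo", "Rome", "Oslo", "Rome"],
        "temp": [-3.0, 8.0, -2.0, 9.0],
    }
)

# Passing long_df straight to a heatmap is the misuse described above:
# the plot expects a 2-D grid of numbers, not a table of records.

# Reshape to wide format first: rows = month, columns = city, cells = temp.
wide_df = long_df.pivot(index="month", columns="city", values="temp")
print(wide_df)

# With seaborn installed, the wide table can then be plotted, e.g.:
#   import seaborn as sns
#   sns.heatmap(wide_df)
```

The reshaped frame has one row per month and one column per city, which is the wide layout the excerpt says heatmap expects.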

Original abstract

Developers rely on third-party library Application Programming Interfaces (APIs) when developing software. However, libraries typically come with assumptions and API usage constraints, whose violation results in API misuse. API misuses may result in crashes or incorrect behavior. Even though API misuse is a well-studied area, a recent study of API misuse of deep learning libraries showed that the nature of these misuses and their symptoms are different from misuses of traditional libraries, and as a result highlighted potential shortcomings of current misuse detection tools. We speculate that these observations may not be limited to deep learning API misuses but may stem from the data-centric nature of these APIs. Data-centric libraries often deal with diverse data structures, intricate processing workflows, and a multitude of parameters, which can make them inherently more challenging to use correctly. Therefore, understanding the potential misuses of these libraries is important to avoid unexpected application behavior. To this end, this paper contributes an empirical study of API misuses of five data-centric libraries that cover areas such as data processing, numerical computation, machine learning, and visualization. We identify misuses of these libraries by analyzing data from both Stack Overflow and GitHub. Our results show that many of the characteristics of API misuses observed for deep learning libraries extend to misuses of the data-centric library APIs we study. We also find that developers tend to misuse APIs from data-centric libraries, regardless of whether the API directive appears in the documentation. Overall, our work exposes the challenges of API misuse in data-centric libraries, rather than only focusing on deep learning libraries. Our collected misuses and their characterization lay groundwork for future research to help reduce misuses of these libraries.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Extends the DL API misuse study to five other data-centric libraries with fresh SO and GitHub observations, but the sampling method leaves the representativeness of the extension claim open.


The main thing to know is that this paper mines Stack Overflow posts and GitHub issues or commits for misuses in five data-centric libraries and reports that many of the characteristics seen in the deep learning study carry over, including misuses happening regardless of whether directives appear in the docs. It also supplies new concrete examples across data processing, numerical computation, machine learning, and visualization libraries that were not covered before. The work does a clean job of applying the same data sources and basic comparison approach to a broader set of libraries, which adds usable empirical points without overclaiming novelty in method.

The central extension claim rests on observed distributions of misuse types and documentation presence. The stress-test concern holds up on reading: SO and GitHub data naturally surface cases that produce visible failures or questions, so the finding that misuses occur regardless of docs could partly reflect reporting bias rather than the intrinsic properties of the APIs. Without a check against random samples of API calls in active repositories or explicit handling of false positives, the generalization to all developer errors stays tentative. The abstract leaves the exact misuse identification criteria and inter-rater details implicit, though the paper appears to engage the prior literature directly rather than fitting results to a preconceived story.

This is for researchers already working on API misuse detection or library documentation in software engineering. A reader who needs fresh examples to motivate tooling or to compare against their own datasets would get value from the characterizations. It deserves peer review because the empirical extension is clear enough to warrant referee time, even if the sampling bias needs explicit discussion in revision.

Referee Report

2 major / 2 minor

Summary. The paper conducts an empirical study of API misuses across five data-centric libraries (covering data processing, numerical computation, machine learning, and visualization) by mining Stack Overflow posts and GitHub issues/commits. It claims that many characteristics of API misuses previously observed for deep learning libraries extend to these data-centric APIs, and that developers tend to misuse such APIs regardless of whether the relevant directive appears in the documentation. The work collects and characterizes a set of misuses to support future research on detection and prevention.

Significance. If the results hold after addressing sampling concerns, the study broadens API-misuse research beyond deep learning libraries to a wider class of data-centric APIs, supplying a reusable dataset of observed misuses and highlighting documentation-independent misuse patterns. The direct comparison to prior DL findings and the dual-platform mining approach are concrete strengths that could inform improved static-analysis tools.

major comments (2)

[§3] §3 (Study Design / Data Collection): the manuscript provides no explicit criteria for misuse identification, no inter-rater agreement statistics, and no procedure for estimating or bounding false positives in the SO/GitHub mining pipeline. These omissions are load-bearing for the central extension claim, because the reported distributions of misuse types and documentation presence could be artifacts of the identification process rather than intrinsic properties of the libraries.
[§4] §4 (Results) and §5 (Discussion): the claim that DL-library misuse characteristics 'extend' to the five data-centric libraries rests on the mined posts and commits being representative of actual developer errors. No comparison against a random sample of API invocations in active repositories is reported; therefore the observed frequencies (and the 'regardless of documentation' finding) remain vulnerable to the reporting bias noted in the stress-test note.

minor comments (2)

[Table 1] Table 1 (library selection) would benefit from an explicit justification of why these five libraries adequately span the data-centric space; a short paragraph on coverage gaps would improve clarity.
[§4] The paper cites prior DL-misuse work but does not include a side-by-side table of misuse-category frequencies; adding one would make the 'extension' claim easier to evaluate at a glance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive feedback. The comments highlight important aspects of study design transparency and the interpretation of our findings. We address each major comment below, indicating where revisions will be made.


Referee: [§3] §3 (Study Design / Data Collection): the manuscript provides no explicit criteria for misuse identification, no inter-rater agreement statistics, and no procedure for estimating or bounding false positives in the SO/GitHub mining pipeline. These omissions are load-bearing for the central extension claim, because the reported distributions of misuse types and documentation presence could be artifacts of the identification process rather than intrinsic properties of the libraries.

Authors: We agree that the current manuscript lacks sufficient detail on the misuse identification process. In the revision, we will expand §3 with an explicit subsection describing: (1) the concrete criteria used to label a post or commit as a misuse (e.g., violation of documented API constraints leading to incorrect behavior or errors), (2) the multi-author validation workflow, and (3) inter-rater agreement statistics (Cohen’s kappa) computed on a sampled subset. We will also report the manual validation procedure performed on a random sample of mined items to estimate and bound the false-positive rate. These additions will directly support the reliability of the reported distributions and the extension claim. revision: yes
Referee: [§4] §4 (Results) and §5 (Discussion): the claim that DL-library misuse characteristics 'extend' to the five data-centric libraries rests on the mined posts and commits being representative of actual developer errors. No comparison against a random sample of API invocations in active repositories is reported; therefore the observed frequencies (and the 'regardless of documentation' finding) remain vulnerable to the reporting bias noted in the stress-test note.

Authors: The study’s scope is the characterization of observed misuses reported on Stack Overflow and GitHub, following the same methodology as the prior DL-library study to which we compare. The extension claim concerns qualitative characteristics (misuse types, symptoms, and documentation presence) of the collected cases rather than their prevalence among all possible API invocations. We will revise §5 to explicitly acknowledge reporting bias as a limitation and clarify that the “regardless of documentation” observation applies to the misuses we identified. A random-sample comparison of all invocations would constitute a separate study on usage patterns; we therefore treat the current design as appropriate for the stated research questions while adding the requested caveat. revision: partial
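The inter-rater agreement statistic named in the first response (Cohen's kappa) can be computed as in the sketch below. It is a minimal pure-Python illustration, not the authors' procedure: the function name `cohens_kappa` and the two label lists are hypothetical and do not come from the paper's data. Kappa compares the observed agreement between two raters with the agreement expected by chance from each rater's label frequencies.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e the agreement expected by chance from each rater's label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; treat as perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for eight sampled posts (not taken from the paper).
rater_a = ["misuse", "misuse", "not", "not", "misuse", "not", "misuse", "not"]
rater_b = ["misuse", "not", "not", "not", "misuse", "not", "misuse", "misuse"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

In this made-up sample the raters agree on 6 of 8 items (0.75) and chance agreement is 0.5, so kappa is 0.5.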

Circularity Check

0 steps flagged

No circularity: purely empirical data collection and comparison


The paper conducts an empirical study by mining Stack Overflow posts and GitHub issues/commits to identify API misuses in five data-centric libraries, then compares observed characteristics to those reported in prior work on deep learning libraries. No equations, fitted parameters, predictions, or derivations are present. Central claims rest on direct observation of the mined data rather than any self-referential reduction or self-citation chain. Self-citations, if any, are incidental and not load-bearing for the extension claim. The analysis is self-contained against external benchmarks of misuse reporting.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that public forum data reliably surfaces representative misuses and that the chosen libraries are typical of the data-centric category.

axioms (1)

domain assumption: Misuses can be reliably identified and categorized from Stack Overflow posts and GitHub commits/issues without significant selection bias.
Invoked when the paper states it identifies misuses by analyzing data from both sources.




