Bash-Commenter: Leveraging Syntax-Aware Preference Optimization to Reinforce Large Language Model for Bash Code Comment Generation

Fengjun Zhang; Jiajia Ma; Jia Xu; Jingyuan Zhang; Lei Yu; Li Yang; Peng Wang; Xin Wang

arxiv: 2606.29709 · v1 · pith:PS5AUTWYnew · submitted 2026-06-29 · 💻 cs.SE

Lei Yu, Jingyuan Zhang, Xin Wang, Li Yang, Fengjun Zhang, Peng Wang, Jia Xu, Jiajia Ma

Pith reviewed 2026-06-30 05:46 UTC · model grok-4.3

classification 💻 cs.SE

keywords Bash script comment generation · Syntax-Aware Preference Optimization · Abstract Syntax Tree · Large Language Models · Continual Pre-training · Supervised Fine-tuning · Preference Optimization

The pith

Bash-Commenter improves LLM comment generation for Bash by training on minimal correct/incorrect pairs from script ASTs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Bash-Commenter, a method for generating comments for Bash scripts, which are hard to read because of Bash's free-form syntax and often ship without comments. It builds a dataset of multi-line scripts, applies continual pre-training and supervised fine-tuning to LLaMA-3.1-8B on Bash data, then applies Syntax-Aware Preference Optimization (SAPO). SAPO creates preference pairs by making small atomic changes to a script's abstract syntax tree, with the aim of teaching the model fine-grained semantics. The paper reports higher automatic metric scores and better human and LLM ratings for comment quality than prior methods.

Core claim

Syntax-Aware Preference Optimization constructs preference pairs by applying atomic operations to a Bash script's Abstract Syntax Tree to form minimal correct and subtly incorrect variants; after continual pre-training and supervised fine-tuning of LLaMA-3.1-8B, this produces comments scoring 33.40% BLEU-4, 58.26% METEOR and 57.03% ROUGE-L on single-line commands plus 22.15% BLEU-4, 43.89% METEOR and 32.80% ROUGE-L on multi-line scripts, with superior correctness, completeness and naturalness in human and LLM evaluations.

What carries the argument

Syntax-Aware Preference Optimization (SAPO), which builds preference pairs from atomic AST operations on Bash scripts to support fine-grained semantic learning.
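To make the idea of an atomic AST edit concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the paper's list of operations is not available here, so the toy `Command`/`Pipeline` tree, the `swap_argument` operation, and the example script are hypothetical and should not be read as the authors' implementation.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Command:
    """One simple command: a name plus its argument words."""
    name: str
    args: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Pipeline:
    """Commands joined by '|'. A toy stand-in for a real Bash AST."""
    commands: Tuple[Command, ...]


def render(p: Pipeline) -> str:
    """Turn the toy tree back into a one-line script."""
    return " | ".join(" ".join((c.name, *c.args)) for c in p.commands)


def swap_argument(p: Pipeline, cmd_idx: int, arg_idx: int, new_arg: str) -> Pipeline:
    """Hypothetical atomic operation: replace exactly one argument node."""
    cmd = p.commands[cmd_idx]
    args = cmd.args[:arg_idx] + (new_arg,) + cmd.args[arg_idx + 1:]
    cmds = p.commands[:cmd_idx] + (replace(cmd, args=args),) + p.commands[cmd_idx + 1:]
    return Pipeline(cmds)


original = Pipeline((
    Command("grep", ("-i", "error", "app.log")),
    Command("sort"),
    Command("uniq", ("-c",)),
))
# uniq -c prefixes each line with its count; uniq -d prints only repeated lines.
variant = swap_argument(original, cmd_idx=2, arg_idx=0, new_arg="-d")
pair = (render(original), render(variant))
```

The two rendered scripts differ by a single flag, which is the kind of minimal, behaviour-changing difference the premise depends on. How such script pairs are turned into preference data is not specified in the material summarized here.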

If this is right

Comment generation for single-line Bash commands reaches 33.40% BLEU-4, 58.26% METEOR and 57.03% ROUGE-L.
Comment generation for multi-line Bash scripts reaches 22.15% BLEU-4, 43.89% METEOR and 32.80% ROUGE-L.
Human and LLM judges rate the generated comments higher in correctness, completeness and naturalness than baseline methods.
Continual pre-training plus supervised fine-tuning on Bash data followed by SAPO strengthens the model's grasp of Bash syntax and semantics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The AST-pair construction step could be reused to create training signals for other code-generation tasks that require precise syntactic understanding.
The same preference-optimization pattern might transfer to comment generation in other shell or scripting languages that share Bash's syntactic flexibility.
Larger-scale application of the dataset-construction pipeline could reduce reliance on manual annotation for low-resource programming languages.

Load-bearing premise

That the preference pairs created by applying atomic operations to a script's AST produce minimal differences that reliably teach fine-grained Bash semantics rather than superficial syntax patterns.

What would settle it

Training an otherwise identical LLaMA model on the same dataset but with random or non-AST preference pairs and finding it matches or exceeds the reported BLEU/METEOR/ROUGE scores plus human ratings on the 1,064 single-line and 1,046 multi-line test sets.

Figures

Figures reproduced from arXiv: 2606.29709 by Fengjun Zhang, Jiajia Ma, Jia Xu, Jingyuan Zhang, Lei Yu, Li Yang, Peng Wang, Xin Wang.

**Figure 2.** Example of a Bash command pipeline for error log processing.

**Figure 3.** The Overview of our Bash-Commenter.

Adjacent paper text captured with the figure: "… additional 8,154 annotated multi-line Bash scripts from [7], focusing on diverse use cases and complexity levels. After quality control and deduplication, our final SFT dataset contained 16,623 high-quality ⟨Bash Code, Comment⟩ pairs. Our quality control employed a four-stage pipeline: (1) Jaccard-based deduplication (threshold 0.9); (2) LLM scoring (DeepSeek-R1) with 50…"

**Figure 4.** Case Study of Bash Code Comment Generation Using Bash-Commenter.
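The text captured with Figure 3 mentions Jaccard-based deduplication with a threshold of 0.9. A minimal Python sketch of that kind of filter follows. The tokenisation (whitespace split), the greedy keep-first strategy, and the choice to drop scripts at or above the threshold are assumptions, since the excerpt does not give those details.

```python
from typing import List, Set


def token_set(script: str) -> Set[str]:
    # Assumption: whitespace tokenisation; the excerpt does not say which tokens are used.
    return set(script.split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(scripts: List[str], threshold: float = 0.9) -> List[str]:
    """Keep a script only if it is below the threshold against every kept script."""
    kept: List[str] = []
    kept_sets: List[Set[str]] = []
    for s in scripts:
        ts = token_set(s)
        if all(jaccard(ts, k) < threshold for k in kept_sets):
            kept.append(s)
            kept_sets.append(ts)
    return kept


scripts = [
    "ls -la /tmp",
    "ls   -la /tmp",          # same tokens, different spacing
    "find . -name '*.sh'",
]
unique = deduplicate(scripts)
```

This pairwise comparison is quadratic in the number of scripts, which is fine for a sketch; a corpus of tens of thousands of scripts would more likely use an approximate method such as MinHash.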

Original abstract

Bash script comprehension is challenging due to Bash's syntactic freedom and complex command structures. Despite its critical role in system administration, Bash scripts often lack adequate comments, hindering readability and maintainability. Existing automated comment generation approaches face two main challenges: (1) limited training datasets that inadequately represent real-world Bash usage patterns; and (2) insufficient understanding of Bash-specific concepts by Large Language Models (LLMs). To address these, we propose Bash-Commenter, an advanced comment generation method based on LLaMA-3.1-8B. First, we construct a comprehensive dataset of complex, multi-line Bash scripts with high-quality comments. Second, we conduct Continual Pre-training (CPT) on large-scale Bash data, followed by Supervised Fine-tuning (SFT), strengthening the model's foundational knowledge of Bash syntax and semantics. Finally, we introduce Syntax-Aware Preference Optimization (SAPO), which constructs preference pairs by applying atomic operations to a script's Abstract Syntax Tree (AST), creating minimal pairs of correct and subtly incorrect scripts for fine-grained semantics learning. Our method outperforms state-of-the-art baselines, achieving 33.40% BLEU-4, 58.26% METEOR, and 57.03% ROUGE-L for 1,064 single-line commands, and 22.15% BLEU-4, 43.89% METEOR, and 32.80% ROUGE-L for 1,046 multi-line scripts. Human and LLM evaluations further confirm superior comment quality in correctness, completeness, and naturalness.
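For readers unfamiliar with the metrics quoted in the abstract, the following Python sketch shows how a sentence-level ROUGE-L score can be computed from the longest common subsequence (LCS). It is a simplified illustration: whitespace tokenisation and an F-measure with `beta = 1.0` are assumptions, and the paper's actual scoring script may differ.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: str, reference: str, beta: float = 1.0) -> float:
    """LCS-based F-measure between a generated comment and a reference comment."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


score = rouge_l("list all files in tmp", "list files in the tmp directory")
```

In this example the LCS is "list files in tmp" (4 tokens), giving precision 4/5, recall 4/6, and an F-measure of 8/11, roughly 0.727.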

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SAPO is a reasonable domain tweak on preference optimization for Bash comments, but the core claim that AST edits create semantically meaningful pairs needs direct validation.

The paper's main contribution is a three-stage pipeline on LLaMA-3.1-8B for Bash comment generation: continual pre-training on Bash corpora, supervised fine-tuning, then Syntax-Aware Preference Optimization that turns AST atomic edits into preference pairs. They also release a dataset of multi-line scripts with comments. On their held-out sets (roughly 1k single-line and 1k multi-line examples) the final model beats baselines on BLEU-4, METEOR, and ROUGE-L, and human plus LLM raters prefer the outputs on correctness, completeness, and naturalness.

They did the obvious but necessary work well: collecting real Bash usage data instead of relying on synthetic or Python-centric sets, and running the standard CPT-SFT alignment sequence before adding the SAPO step. The AST-based pair construction is a straightforward adaptation of existing preference optimization ideas to a syntax-heavy language.

The weakest part is exactly the one the stress-test flags. The abstract and method description give no list of the atomic operations, no check that the edited scripts actually differ in runtime behavior, and no ablation that isolates SAPO from the earlier stages. Bash's command substitution and quoting rules make it easy for small AST changes to leave semantics unchanged, so the preference signal could be mostly stylistic. Without that check, the reported gains could come from better syntax modeling rather than the claimed fine-grained semantics. The evaluation also lacks statistical significance numbers and details on how the baselines were trained or whether test data leaked into the CPT stage.

This is a niche but practical paper for people working on comment generation or domain-specific alignment for shell and scripting languages. It has concrete numbers, human evaluation, and a clear experimental setup, so it is worth sending to referees even though the SAPO validation will need to be strengthened.

Referee Report

2 major / 2 minor

Summary. The paper proposes Bash-Commenter, a pipeline based on LLaMA-3.1-8B that first builds a dataset of complex Bash scripts, then applies continual pre-training (CPT) on large-scale Bash data, supervised fine-tuning (SFT), and Syntax-Aware Preference Optimization (SAPO). SAPO constructs preference pairs via atomic AST edits to produce minimally differing correct/incorrect scripts for fine-grained learning. On held-out sets of 1,064 single-line commands and 1,046 multi-line scripts the method reports 33.40% BLEU-4 / 58.26% METEOR / 57.03% ROUGE-L and 22.15% BLEU-4 / 43.89% METEOR / 32.80% ROUGE-L respectively, with supporting human and LLM judgments on correctness, completeness and naturalness.

Significance. If the SAPO pairs demonstrably alter execution semantics while remaining syntactically close, the approach supplies a concrete, reproducible technique for preference optimization on scripting languages whose flexible syntax makes standard DPO pairs unreliable. The release of a sizable, high-quality Bash dataset would also be a lasting community resource. The evaluation is presented as direct measurement on held-out data with no circularity in the reported scores.

major comments (2)

[SAPO / method section] SAPO description (abstract and method section): the claim that atomic AST operations produce 'minimal pairs of correct and subtly incorrect scripts' for fine-grained semantics learning is load-bearing, yet the manuscript supplies neither the explicit list of operations nor any runtime validation (e.g., execution-trace or output-difference checks) that the edited pairs actually differ in behavior rather than surface syntax. Without this evidence the DPO step could be optimizing stylistic preferences instead.
[Results / evaluation section] Results section (automatic metrics): the reported gains (e.g., 33.40% BLEU-4 on single-line) are presented without statistical significance tests, without details on baseline training protocols or hyper-parameter parity, and without dataset-construction safeguards against leakage. These omissions make it impossible to judge whether the outperformance claim is robust.
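The output-difference check requested in the first major comment could look roughly like the Python sketch below. It is an assumption-laden illustration, not the authors' validation procedure: `run_bash`, `behaviour_differs`, and the idea of probing with a fixed list of stdin inputs are all hypothetical, and real scripts that touch files or the network would need a sandbox.

```python
import subprocess
from typing import Callable, Iterable, Tuple

Outcome = Tuple[int, str]  # (exit code, stdout)


def run_bash(script: str, stdin: str) -> Outcome:
    """Run a script with bash. Only use on trusted scripts or inside a sandbox."""
    proc = subprocess.run(
        ["bash", "-c", script], input=stdin, capture_output=True, text=True, timeout=5
    )
    return proc.returncode, proc.stdout


def behaviour_differs(
    original: str,
    variant: str,
    inputs: Iterable[str],
    run: Callable[[str, str], Outcome] = run_bash,
) -> bool:
    """True if any probe input yields a different exit code or stdout."""
    return any(run(original, x) != run(variant, x) for x in inputs)
```

A call such as `behaviour_differs("sort", "sort -r", ["b\na\n"])` would flag the pair as behaviourally different, while a single-line input would not distinguish them. The check is therefore only as strong as the probe inputs, and a pair that never differs on any probe is a candidate for a purely stylistic edit.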

minor comments (2)

[Abstract] Abstract: the dataset size and provenance are mentioned only qualitatively; a single sentence giving total script count and collection method would improve context.
[Evaluation section] Evaluation: it is unclear whether the human/LLM raters saw the identical test instances used for the automatic metrics; explicit alignment would strengthen the multi-faceted claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for the thoughtful and constructive feedback on our manuscript. The comments have helped us identify areas where additional details can strengthen the presentation of our method and results. Below, we provide point-by-point responses to the major comments.

Referee: [SAPO / method section] SAPO description (abstract and method section): the claim that atomic AST operations produce 'minimal pairs of correct and subtly incorrect scripts' for fine-grained semantics learning is load-bearing, yet the manuscript supplies neither the explicit list of operations nor any runtime validation (e.g., execution-trace or output-difference checks) that the edited pairs actually differ in behavior rather than surface syntax. Without this evidence the DPO step could be optimizing stylistic preferences instead.

Authors: We appreciate the referee's emphasis on the importance of validating that the preference pairs induce semantic differences. While the current manuscript describes the high-level idea of using atomic AST operations to create minimal pairs, we agree that an explicit list of operations and empirical validation of behavioral differences would provide stronger support for the claim. In the revised manuscript, we will add a dedicated subsection detailing the complete set of atomic AST operations employed (e.g., altering command arguments to change functionality or modifying control structures in ways that affect execution), along with quantitative results from runtime validation, such as execution trace comparisons and output difference metrics on a sample of pairs, demonstrating that the 'incorrect' scripts exhibit different behaviors. This will help confirm that SAPO targets semantic understanding. revision: yes

Referee: [Results / evaluation section] Results section (automatic metrics): the reported gains (e.g., 33.40% BLEU-4 on single-line) are presented without statistical significance tests, without details on baseline training protocols or hyper-parameter parity, and without dataset-construction safeguards against leakage. These omissions make it impossible to judge whether the outperformance claim is robust.

Authors: We acknowledge that the absence of statistical tests, baseline implementation details, and explicit anti-leakage measures limits the interpretability of the results. To address this, the revised version will include: (1) statistical significance testing using bootstrap resampling or paired tests with reported p-values for all metric comparisons; (2) a table or appendix detailing the training protocols, hyperparameters, and data sources for each baseline to ensure fair comparison; and (3) a description of the dataset construction process, including how the held-out sets were created with safeguards such as deduplication based on normalized script content and checks for no overlap with training data. These additions will allow for a more rigorous assessment of our claims. revision: yes
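The bootstrap test promised in the second response can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it works on per-example scores (corpus-level BLEU is not a plain mean of sentence scores, so a faithful version would recompute the metric on each resample), and the function name `paired_bootstrap` and its defaults are hypothetical.

```python
import random
from typing import Sequence


def paired_bootstrap(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    n_resamples: int = 10_000,
    seed: int = 0,
) -> float:
    """Fraction of resamples in which system A does NOT beat system B on total score."""
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired per test example")
    rng = random.Random(seed)
    n = len(scores_a)
    not_better = 0
    for _ in range(n_resamples):
        # Resample test examples with replacement, keeping the A/B pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        delta = sum(scores_a[i] - scores_b[i] for i in idx)
        if delta <= 0:
            not_better += 1
    return not_better / n_resamples
```

The returned fraction plays the role of a one-sided p-value: a small value means the advantage of system A survives resampling of the test set. Because both systems are scored on the same resampled examples, example difficulty cancels out of the comparison.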

Circularity Check

0 steps flagged

No circularity: empirical results on held-out data with independent methodological choices

The paper reports standard NLP metrics (BLEU-4, METEOR, ROUGE-L) measured on held-out test sets of 1,064 single-line and 1,046 multi-line Bash scripts. The core pipeline (CPT on Bash data, SFT, then SAPO via AST atomic operations to build preference pairs for DPO-style optimization) is a sequence of training steps whose outputs are evaluated externally; no equation or self-citation reduces the reported scores to quantities defined by the same fitted parameters. The abstract and described method contain no self-definitional loops, fitted-input predictions, load-bearing self-citations, uniqueness theorems, or ansatz smuggling. The reader's score of 1.0 reflects only a possible minor self-citation, which is non-circular and not load-bearing here.
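For context on what "DPO-style optimization" refers to, the Python sketch below computes the standard Direct Preference Optimization loss for a single preference pair from summed sequence log-probabilities. The source only says SAPO is DPO-style, so whether the paper uses exactly this objective is an assumption; the function name, argument names, and `beta = 0.1` default are illustrative.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one preference pair, from summed sequence log-probabilities."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) written as softplus(-x) in a numerically stable form.
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))
```

When the policy matches the reference model the loss equals log 2. It falls below that as the policy raises the chosen response's likelihood relative to the rejected one, which is why the quality of the chosen/rejected pairs matters so much for what the model ends up learning.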

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are stated beyond the high-level claim that the constructed dataset represents real-world patterns and that AST-derived pairs teach semantics.

Reference graph

Works this paper leans on

74 extracted references · 53 canonical work pages · 4 internal anchors

[1]

Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang. 2021. Unified Pre-training for Program Understanding and Generation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur,... doi:10.18653/v1/2021.naacl-main.211

[2]

Anthropic. 2025. Claude-3.7-Sonnet. https://www.anthropic.com/news/claude-3-7-sonnet

[3]

Qiuyuan Chen, Xin Xia, Han Hu, David Lo, and Shanping Li. 2021. Why My Code Summarization Model Does Not Work: Code Comment Improvement with Category Prediction. ACM Trans. Softw. Eng. Methodol. 30, 2, Article 25 (Feb. 2021), 29 pages. doi:10.1145/3434280

[4]

Shiqi Cheng, Chenjie Shen, Li Yang, Lei Yu, Fengjun Zhang, and Chun Zuo. 2025. AUVANA: An Efficient and Automatic Approach to Variable Rename Refactoring via Large Pre-trained Language Model. In 2025 IEEE 36th International Symposium on Software Reliability Engineering (ISSRE). 288–299. doi:10.1109/ISSRE66568.2025.00038

[5]

Stack Overflow Community. 2025. Newest 'bash' Questions - Stack Overflow. https://stackoverflow.com/questions/tagged/bash

[6]

Stack Overflow Community. 2025. Newest 'shell' Questions - Stack Overflow. https://stackoverflow.com/questions/tagged/shell

[7]

Yiwen Dong, Zheyang Li, Yongqiang Tian, Chengnian Sun, Michael W. Godfrey, and Meiyappan Nagappan. 2023. Bash in the Wild: Language Usage, Code Smells, and Bugs. ACM Trans. Softw. Eng. Methodol. 32, 1, Article 8 (Feb. 2023), 22 pages. doi:10.1145/3517193

[8]

Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, and Ming Zhou. 2020. CodeBERT: A Pre-Trained Model for Programming and Natural Languages. In Findings of the Association for Computational Linguistics: EMNLP 2020, Trevor Cohn, Yulan He, and Yang Liu (Eds.). Association for Computational ... doi:10.18653/v1/2020.findings-emnlp.139

[9]

Mingyang Geng, Shangwen Wang, Dezun Dong, Haotian Wang, Ge Li, Zhi Jin, Xiaoguang Mao, and Xiangke Liao. 2024. Large Language Models are Few-Shot Summarizers: Multi-Intent Comment Generation via In-Context Learning. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (Lisbon, Portugal) (ICSE ’24). Association for Computing Machinery, New York, NY, USA, Article 39, 13 pages. doi:10.1145/3597503.3608134

[11]

Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, Amy Yang, Angela Fan, et al. 2024. The Llama 3 Herd of Models. arXiv preprint arXiv:2407.21783 (July 2024). arXiv:2407.21783 [cs.AI] doi:10.48550/arXiv.2407.21783

[12]

Jiawei Gu, Xuhui Jiang, Zhichao Shi, Hexiang Tan, Xuehao Zhai, Chengjin Xu, Wei Li, Yinghan Shen, Shengjie Ma, Honghao Liu, Saizhuo Wang, Kun Zhang, Zhouchi Lin, Bowen Zhang, Lionel Ni, Wen Gao, Yuanzhuo Wang, and Jian Guo. 2026. A survey on LLM-as-a-judge. The Innovation 7, 6 (2026), 101253. doi:10.1016/j.xinn.2025.101253

[13]

Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O.K. Li. 2016. Incorporating Copying Mechanism in Sequence-to-Sequence Learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Berlin, Germany, 1631–1640. doi:10.18653/v1/P16-1154

[14]

Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou, and Jian Yin. 2022. UniXcoder: Unified Cross-Modal Pre-training for Code Representation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Smaranda Muresan, Preslav Nakov, and Aline Villavicencio (Eds.). Association for Computational Ling... doi:10.18653/v1/2022.acl-long.499

[15]

Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. 2025. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. Nature 645 (2025), 633–638. doi:10.1038/s41586-025-09422-z

[16]

Sonia Haiduc, Jairo Aponte, and Andrian Marcus. 2010. Supporting program comprehension with source code summarization. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 2 (Cape Town, South Africa) (ICSE ’10). Association for Computing Machinery, New York, NY, USA, 223–226. doi:10.1145/1810295.1810335

[17]

Sonia Haiduc, Jairo Aponte, Laura Moreno, and Andrian Marcus. 2010. On the Use of Automated Text Summarization Techniques for Summarizing Source Code. In Proceedings of the 2010 17th Working Conference on Reverse Engineering (WCRE ’10). IEEE Computer Society, USA, 35–44. doi:10.1109/WCRE.2010.13

[18]

Xing Hu, Ge Li, Xin Xia, David Lo, and Zhi Jin. 2020. Deep Code Comment Generation with Hybrid Lexical and Syntactical Information. Empirical Software Engineering 25, 3 (2020), 2179–2217. doi:10.1007/s10664-019-09730-9

[19]

Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, and Luke Zettlemoyer. 2016. Summarizing Source Code using a Neural Attention Model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Katrin Erk and Noah A. Smith (Eds.). Association for Computational Linguistics, Berlin, Germany, 2073–2083. do... doi:10.18653/v1/p16-1195

[20]

Denis Kocetkov, Raymond Li, Loubna Ben Allal, Jia Li, Chenghao Mou, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell, Sean Hughes, Thomas Wolf, et al. 2022. The stack: 3 tb of permissively licensed source code. arXiv preprint arXiv:2211.15533 (Nov. 2022). arXiv:2211.15533 [cs.CL] doi:10.48550/arXiv.2211.15533

[21]

Li Kuang, Cong Zhou, and Xiaoxian Yang. 2022. Code comment generation based on graph neural network enhanced transformer model for code understanding in open-source software ecosystems. Automated Software Engineering 29, 2 (2022), 43. doi:10.1007/s10515-022-00341-1

[22]

Dawei Li, Bohan Jiang, Liangjie Huang, Alimohammad Beigi, Chengshuai Zhao, Zhen Tan, Amrita Bhattacharjee, Yuxuan Jiang, Canyu Chen, Tianhao Wu, Kai Shu, Lu Cheng, and Huan Liu. 2025. From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Christ... doi:10.18653/v1/2025.emnlp-main.138

[23]

Xi Victoria Lin, Chenglong Wang, Deric Pang, Kevin Vu, Luke Zettlemoyer, and Michael D. Ernst. 2017. Program Synthesis from Natural Language Using Recurrent Neural Networks. Technical Report UW-CSE-17-03-01. University of Washington, Department of Computer Science and Engineering, Seattle, WA, USA. https://dericpang.com/tellina-tr170510.pdf

[24]

Xi Victoria Lin, Chenglong Wang, Luke Zettlemoyer, and Michael D Ernst. 2018. NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). https://aclanthology.org/L18-1491

[25]

Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, et al. 2024. DeepSeek-V3 Technical Report. arXiv:2412.19437 [cs.CL] doi:10.48550/arXiv.2412.19437

[26]

Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. 2023. Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation. In Advances in Neural Information Processing Systems, A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine (Eds.), Vol. 36. Curran Associates, Inc., 21558–...

[27]

Zhongxin Liu, Xin Xia, Ahmed E. Hassan, David Lo, Zhenchang Xing, and Xinyu Wang. 2018. Neural-machine-translation-based commit message generation: how far are we?. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (Montpellier, France) (ASE ’18). Association for Computing Machinery, New York, NY, USA, 373–384. d... doi:10.1145/3238147.3238190

[28]

Locutusque. 2024. UltraTextbooks Dataset. https://huggingface.co/datasets/Locutusque/UltraTextbooks

[29]

Shayne Longpre, Le Hou, Tu Vu, Albert Webson, Hyung Won Chung, Yi Tay, Denny Zhou, Quoc V Le, Barret Zoph, Jason Wei, and Adam Roberts. 2023. The Flan Collection: Designing Data and Methods for Effective Instruction Tuning. In Proceedings of the 40th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 202), Andreas ...

[30]

Junyi Lu, Xiaojia Li, Zihan Hua, Lei Yu, Shiqi Cheng, Li Yang, Fengjun Zhang, and Chun Zuo. 2025. Deepcrceval: Revisiting the evaluation of code review comment generation. In International Conference on Fundamental Approaches to Software Engineering. Springer, 43–64. doi:10.1007/978-3-031-90900-9_3

[31]

Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, and Chun Zuo. 2023. LLaMA-Reviewer: Advancing Code Review Automation with Large Language Models through Parameter-Efficient Fine-Tuning. In 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE). 647–658. doi:10.1109/ISSRE59848.2023.00026

[32]

Humza Naveed, Asad Ullah Khan, Shi Qiu, Muhammad Saqib, Saeed Anwar, Muhammad Usman, Naveed Akhtar, Nick Barnes, and Ajmal Mian. 2025. A Comprehensive Overview of Large Language Models. ACM Trans. Intell. Syst. Technol. 16, 5, Article 106 (Aug. 2025), 72 pages. doi:10.1145/3744746

[33]

Cameron Newham, Bill Rosenblatt, and Gigi Estabrook. 1998. Learning the Bash Shell. O’Reilly & Associates, Inc.

[34]

OpenAI. 2025. GPT-4.1. https://platform.openai.com/docs/models/gpt-4.1

[35]

Keiran Paster, Marco Dos Santos, Zhangir Azerbayev, and Jimmy Ba. 2024. OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text. In International Conference on Learning Representations, B. Kim, Y. Yue, S. Chaudhuri, K. Fragkiadaki, M. Khan, and Y. Sun (Eds.), Vol. 2024. 20357–20379. https://proceedings.iclr.cc/paper_files/paper/2024/file/5949a...

[36]

Long Phan, Hieu Tran, Daniel Le, Hieu Nguyen, James Annibal, Alec Peltekian, and Yanfang Ye. 2021. CoTexT: Multi-task Learning with Code-Text Transformer. In Proceedings of the 1st Workshop on Natural Language Processing for Programming (NLP4Prog 2021), Royi Lachmy, Ziyu Yao, Greg Durrett, Milos Gligoric, Junyi Jessy Li, Ray Mooney, Graham Neubig, Yu Su, H... doi:10.18653/v1/2021.nlp4prog-1.5

[37]

Sameer Pimparkhede, Mehant Kammakomati, Srikanth G. Tamilselvam, Prince Kumar, Ashok Pon Kumar, and Pushpak Bhattacharyya. 2024. DocCGen: Document-based Controlled Code Generation. In Proceedings of the 2024 Conference Proc. ACM Softw. Eng., Vol. 3, No. FSE, Article FSE095. Publication date: July 2026. Bash-Commenter: Leveraging Syntax-Aware Preference Opt... doi:10.18653/v1/2024.emnlp-main.1040

[38]

Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D Manning, Stefano Ermon, and Chelsea Finn. 2023. Direct Preference Optimization: Your Language Model is Secretly a Reward Model. In Advances in Neural Information Processing Systems, A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine (Eds.), Vol. 36. Curran Associates, Inc., 53728...

[39]

Chenjie Shen, Jie Zhu, Lei Yu, Li Yang, and Chun Zuo. 2024. Dependency-Aware Method Naming Framework with Generative Adversarial Sampling. In 2024 International Joint Conference on Neural Networks (IJCNN). 1–8. doi:10.1109/IJCNN60899.2024.10651109

[40]

Yiheng Shen, Xiaolin Ju, Xiang Chen, and Guang Yang. 2024. Bash comment generation via data augmentation and semantic-aware CodeBERT. Automated Software Engineering 31 (2024), 30. doi:10.1007/s10515-024-00431-2

[41]

Stack Overflow. 2026. Bash Questions (1134 days post-ChatGPT). https://stackoverflow.com/questions/tagged/bash?sort=Newest&days=1134. Data collected: 2026-01-07, covering November 30, 2022 to January 7, 2026

[42]

Stack Overflow. 2026. Bash Questions (3000 day baseline). https://stackoverflow.com/questions/tagged/bash?sort=Newest&days=3000. Data collected: 2026-01-07, covering October 21, 2017 to January 7, 2026

[43]

André Storhaug, Jingyue Li, and Tianyuan Hu. 2023. Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding. In 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE). 683–693. doi:10.1109/ISSRE59848.2023.00035

[44]

Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. In Advances in Neural Information Processing Systems, Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Weinberger (Eds.), Vol. 27. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2014/file/5a18e133cbf9f257297f410bb7eca...

[45]

Jiayue Tang, Li Yang, Lei Yu, Junyi Lu, Zhirong Huang, Fengjun Zhang, and Chun Zuo. 2025. Breaking Task Isolation: Enhancing Code Review Automation with Mixture-of-Experts Large Language Models. In 2025 IEEE 36th International Symposium on Software Reliability Engineering (ISSRE). 227–238. doi:10.1109/ISSRE66568.2025.00033

[46]

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17). Curran Associates Inc., Red Hook, NY, USA, 6000–6010.

[47]

Ruiqi Wang, Jiyu Guo, Cuiyun Gao, Guodong Fan, Chun Yong Chong, and Xin Xia. 2025. Can LLMs Replace Human Evaluators? An Empirical Study of LLM-as-a-Judge in Software Engineering. Proc. ACM Softw. Eng. 2, ISSTA, Article ISSTA086 (June 2025), 23 pages. doi:10.1145/3728963

[48]

Yue Wang, Weishi Wang, Shafiq Joty, and Steven C.H. Hoi. 2021. CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (Eds.). Association fo... doi:10.18653/v1/2021.emnlp-main.685

[49]

Martin Weyssow, Xin Zhou, Kisub Kim, David Lo, and Houari Sahraoui. 2025. Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models. ACM Trans. Softw. Eng. Methodol. 34, 7, Article 204 (Aug. 2025), 25 pages. doi:10.1145/3714461

[50]

Xin Xia, Lingfeng Bao, David Lo, Zhenchang Xing, Ahmed E. Hassan, and Shanping Li. 2018. Measuring program comprehension: a large-scale field study with professionals. In Proceedings of the 40th International Conference on Software Engineering (Gothenburg, Sweden) (ICSE ’18). Association for Computing Machinery, New York, NY, USA. doi:10.1145/3180155.3182538

[52]

Zhu Xiaoxuan, Xiong Zhuozhi, Zhang Lin, Ye Haoning, Gu Zhouhong, Li Zihan, Jiang Sihang, Feng Hongwei, Xiao Yanghua, Wang Zili, Yang Dongjie, and Wang Shusen. 2023. CodeGPT: A Code-Related Dialogue Dataset Generated by GPT and for GPT. https://github.com/zxx000728/CodeGPT

[53]

An Yang, Anfeng Li, Baosong Yang, et al. 2025. Qwen3 Technical Report. arXiv:2505.09388 [cs.CL] doi:10.48550/arXiv.2505.09388

[54]

An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, et al. 2024. Qwen2.5 Technical Report. arXiv:2412.15115 [cs.CL] doi:10.48550/arXiv.2412.15115

[55]

Yuanzhe Yang, Li Yang, Lingwei Li, Xiaoxiao Ma, Lei Yu, and Chun Zuo. 2023. DccGraph: Detecting Criminal Communities with Augmented Criminal Network Construction and Graph Neural Network. In 2023 International Joint Conference on Neural Networks (IJCNN). 1–8. doi:10.1109/IJCNN54540.2023.10191121

[56]

Chi Yu, Guang Yang, Xiang Chen, Ke Liu, and Yanlin Zhou. 2022. BashExplainer: Retrieval-Augmented Bash Code Comment Generation based on Fine-tuned CodeBERT. In 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME). 82–93. doi:10.1109/ICSME55016.2022.00016

[57]

Lei Yu, Shiqi Chen, Hang Yuan, Peng Wang, Zhirong Huang, Jingyuan Zhang, Chenjie Shen, Fengjun Zhang, Li Yang, and Jiajia Ma. 2024. Smart-LLaMA: Two-Stage Post-Training of Large Language Models for Smart Contract Vulnerability Detection and Explanation. arXiv preprint arXiv:2411.06221 (2024). doi:10.48550/arXiv.2411.06221

[58]

Lei Yu, Shiqi Cheng, Zhirong Huang, Jingyuan Zhang, Chenjie Shen, Junyi Lu, Li Yang, Fengjun Zhang, and Jiajia Ma. 2025. SAEL: Leveraging Large Language Models with Adaptive Mixture-of-Experts for Smart Contract Vulnerability Detection. In 2025 IEEE International Conference on Software Maintenance and Evolution (ICSME). 61–72. doi:10.1109/ICSME64153.2025.00016

[60]

Lei Yu, Zhirong Huang, Hang Yuan, Shiqi Cheng, Li Yang, Fengjun Zhang, Chenjie Shen, Jiajia Ma, Jingyuan Zhang, Junyi Lu, and Chun Zuo. 2025. Smart-LLaMA-DPO: Reinforced Large Language Model for Explainable Smart Contract Vulnerability Detection. Proc. ACM Softw. Eng. 2, ISSTA, Article ISSTA009 (June 2025), 24 pages. doi:10.1145/3728878

[61]

Lei Yu, Junyi Lu, Xianglong Liu, Li Yang, Fengjun Zhang, and Jiajia Ma. 2023. PSCVFinder: A Prompt-Tuning Based Framework for Smart Contract Vulnerability Detection. In 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE). 556–567. doi:10.1109/ISSRE59848.2023.00030

[62]

Lei Yu, Peng Wang, Jingyuan Zhang, Xin Wang, Jia Xu, Li Yang, Changzhi Deng, Jiajia Ma, and Fengjun Zhang. 2026. SQL-Commenter: Aligning Large Language Models for SQL Comment Generation with Direct Preference Optimization. arXiv preprint arXiv:2603.18606 (2026). doi:10.48550/arXiv.2603.18606

[63]

Lei Yu, Fengjun Zhang, Jiajia Ma, Li Yang, Yuanzhe Yang, and Wei Jia. 2023. Who Are the Money Launderers? Money Laundering Detection on Blockchain via Mutual Learning-Based Graph Neural Network. In 2023 International Joint Conference on Neural Networks (IJCNN). 1–8. doi:10.1109/IJCNN54540.2023.10191217

[64]

Lei Yu, Jingyuan Zhang, Xin Wang, Jiajia Ma, Li Yang, and Fengjun Zhang. 2025. Towards Secure and Explainable Smart Contract Generation with Security-Aware Group Relative Policy Optimization. arXiv preprint arXiv:2509.09942 (2025). doi:10.48550/arXiv.2509.09942

[65]

Hang Yuan, Xizhi Hou, Lei Yu, Li Yang, Jiayue Tang, Jiadong Xu, Yifei Liu, Fengjun Zhang, and Chun Zuo. 2025. Leveraging Mixture-of-Experts Framework for Smart Contract Vulnerability Repair with Large Language Model. In 2025 40th IEEE/ACM International Conference on Automated Software Engineering (ASE). 1667–1679. doi:10.1109/ASE63991.2025.00140

[66]

Hang Yuan, Lei Yu, Zhirong Huang, Jingyuan Zhang, Junyi Lu, Shiqi Cheng, Li Yang, Fengjun Zhang, Jiajia Ma, and Chun Zuo. 2025. Mos: Towards effective smart contract vulnerability detection through mixture-of-experts tuning of large language models. arXiv preprint arXiv:2504.12234 (2025). doi:10.48550/arXiv.2504.12234

[67]

Daoguang Zan, Zhirong Huang, Ailun Yu, Shaoxin Lin, Yifan Shi, Wei Liu, Dong Chen, Zongshuai Qi, Hao Yu, Lei Yu, et al. 2024. Swe-bench-java: A github issue resolving benchmark for java. arXiv preprint arXiv:2408.14354 (2024). doi:10.48550/arXiv.2408.14354

[68]

Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin, et al. 2024. MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series. arXiv:2405.19327 [cs.CL] doi:10.48550/arXiv.2405.19327

[69]

Jingyuan Zhang, Xin Wang, Lei Yu, Li Yang, and Fengjun Zhang. 2026. Binary Message Passing for Generalizable Semi-Supervised Graph Anomaly Detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 40. 16334–16342. doi:10.1609/aaai.v40i19.38671

[70]

Jian Zhang, Xu Wang, Hongyu Zhang, Hailong Sun, and Xudong Liu. 2020. Retrieval-based neural source code summarization. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (Seoul, South Korea) (ICSE ’20). Association for Computing Machinery, New York, NY, USA, 1385–1397. doi:10.1145/3377811.3380383

[71]

Jingyuan Zhang, Lei Yu, Zhirong Huang, Li Yang, and Fengjun Zhang. 2025. Topology Augmented Multi-Band and Multi-Scale Filtering for Graph Anomaly Detection. ACM Trans. Knowl. Discov. Data 19, 8, Article 151 (Sept. 2025), 27 pages. doi:10.1145/3748727

[72]

Jipeng Zhang, Jianshu Zhang, Yuanzhe Li, Renjie Pi, Rui Pan, Runtao Liu, Ziqiang Zheng, and Tong Zhang. 2024. Bridge-Coder: Unlocking LLMs’ Potential to Overcome Language Gaps in Low-Resource Code. arXiv:2410.18957 [cs.CL] doi:10.48550/arXiv.2410.18957

[73]

Junsan Zhang, Yang Zhu, Ao Lu, Yudie Yan, and Yao Wan. 2025. Bash command comment generation via multi-scale heterogeneous feature fusion. Automated Software Engineering 32 (2025), 28. doi:10.1007/s10515-025-00494-9

[74]

Yifan Zhang, Yifan Luo, Yang Yuan, and Andrew C Yao. 2024. Autonomous Data Selection with Language Models for Mathematical Texts. In ICLR 2024 Workshop on Navigating and Addressing Data Problems for Foundation Models. https://openreview.net/forum?id=bBF077z8LF

Received 2025-09-04; accepted 2026-03-24

Proc. ACM Softw. Eng., Vol. 3, No. FSE, Article FSE095. ...
