Reducing Labeling Effort in Architecture Technical Debt Detection through Active Learning and Explainable AI

Andrea Capiluppi; Edi Sutoyo; Paris Avgeriou

arxiv: 2603.02944 · v2 · pith:XRD5V2B5new · submitted 2026-03-03 · 💻 cs.SE

Edi Sutoyo, Paris Avgeriou, Andrea Capiluppi

classification 💻 cs.SE

keywords: technical, active, annotation, debt, effort, issues, learning, applied

Self-Admitted Technical Debt (SATD) refers to technical compromises explicitly admitted by developers in natural language artifacts, such as code comments, commit messages, and issue trackers. Among its types, Architecture Technical Debt (ATD) is particularly difficult to detect due to its abstract and context-dependent nature. Manual annotation of ATD is costly, time-consuming, and challenging to scale. To reduce labeling effort, this study combines keyword-based filtering, active learning, and explainable AI for ATD detection. We refined an existing dataset of ATD-related Jira issues to obtain an expert-validated seed set used to extract representative keywords. These keywords were then applied to identify more than 103k candidate issues across 10 open-source projects. To assess the reliability of keyword-based filtering, we qualitatively evaluated a statistically representative sample of labeled issues. Building on the resulting dataset, we applied active learning with multiple query strategies to prioritize informative samples for annotation. The results show that Breaking Ties achieved the best performance, with an F1-score of 0.72 and a 49% reduction in annotation effort. To improve transparency, we used SHAP and LIME to explain ATD classification results. Expert evaluation showed that both methods provided useful explanations, with LIME generally preferred for its clarity and ease of use.
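The Breaking Ties query strategy named in the abstract can be illustrated with a minimal sketch. It assumes the common margin-based formulation: the classifier yields a probability per class for each unlabeled issue, and the issues with the smallest gap between the two most probable classes are sent for annotation first. The function names, the batch size, and the probabilities below are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def breaking_ties_margin(probs: Sequence[float]) -> float:
    """Gap between the two highest class probabilities (smaller = more ambiguous)."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]


def select_queries(pool_probs: Sequence[Sequence[float]], batch_size: int) -> List[int]:
    """Return indices of the `batch_size` pool items with the smallest margins."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: breaking_ties_margin(pool_probs[i]),
    )
    return ranked[:batch_size]


# Hypothetical predicted probabilities [P(not ATD), P(ATD)] for four unlabeled issues.
pool_probs = [
    [0.95, 0.05],  # confident: not ATD
    [0.52, 0.48],  # nearly tied
    [0.30, 0.70],  # fairly confident: ATD
    [0.45, 0.55],  # close call
]

# Indices of the two most ambiguous issues, which would be annotated next.
selected = select_queries(pool_probs, batch_size=2)
print(selected)
```

In a full active learning loop, the selected issues would be labeled by an annotator, added to the training set, and the classifier retrained before the next round of selection.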

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

The Dangers of Non-Self-Fixed Architecture Technical Debt and Its Impact on Time-to-Fix
cs.SE · 2026-05 · conditional · novelty 5.0

Non-self-fixed architectural technical debt persists longer than self-fixed debt in Apache projects, with repayment speed linked to the spread of changes across developers.