TriagerX: Dual Transformers for Bug Triaging Tasks with Content and Interaction Based Rankings
Pith reviewed 2026-05-18 21:46 UTC · model grok-4.3
The pith
TriagerX improves bug triaging by using two transformers for content-based developer rankings and then refining those rankings with historical interaction data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TriagerX is a dual-transformer architecture for bug triaging that generates a robust content-based ranking by collecting recommendations from the last three layers of each of two transformers, then refines this ranking with an interaction-based methodology considering historical developer interactions with similar fixed bugs, leading to superior performance over single-transformer baselines.
What carries the argument
Dual-transformer architecture with last-three-layer aggregation for content ranking, followed by interaction-based refinement that scores developers according to their history with similar fixed bugs.
If this is right
- TriagerX raises Top-1 and Top-3 developer recommendation accuracy by more than 10 percent on five datasets compared with nine prior transformer methods.
- The same model delivers up to 10 percent gains for component recommendations and 54 percent gains for developer recommendations on an industrial dataset.
- The system was successfully deployed in a large partner’s development environment where both developer and component suggestions are required.
- Recommendations remain useful even when teams change or developers leave, because components act as proxies for team assignments.
Where Pith is reading between the lines
- The same dual-plus-interaction pattern could be tested on related tasks such as recommending code reviewers or assigning issues to teams.
- Adding richer interaction signals, for example commit patterns or discussion threads, might strengthen the refinement stage further.
- Performance is likely highest in projects that already contain many resolved bugs, since the interaction ranking depends on historical examples.
Load-bearing premise
The last three layers from each of the two transformers supply independent and complementary signals, and past interactions with similar fixed bugs supply reliable refinement signals that hold up outside the training data.
What would settle it
On a new bug dataset, if stripping away the interaction-based refinement step leaves Top-1 and Top-3 accuracy no better than the strongest single-transformer baseline, the added value of the dual-plus-interaction design would be falsified.
Figures
read the original abstract
Pretrained Language Models or PLMs are transformer-based architectures that can be used in bug triaging tasks. PLMs can better capture token semantics than traditional Machine Learning (ML) models that rely on statistical features (e.g., TF-IDF, bag of words). However, PLMs may still attend to less relevant tokens in a bug report, which can impact their effectiveness. In addition, the model can be sub-optimal with its recommendations when the interaction history of developers around similar bugs is not taken into account. We designed TriagerX to address these limitations. First, to assess token semantics more reliably, we leverage a dual-transformer architecture. Unlike current state-of-the-art (SOTA) baselines that employ a single transformer architecture, TriagerX collects recommendations from two transformers with each offering recommendations via its last three layers. This setup generates a robust content-based ranking of candidate developers. TriagerX then refines this ranking by employing a novel interaction-based ranking methodology, which considers developers' historical interactions with similar fixed bugs. Across five datasets, TriagerX surpasses all nine transformer-based methods, including SOTA baselines, often improving Top-1 and Top-3 developer recommendation accuracy by over 10%. We worked with our large industry partner to successfully deploy TriagerX in their development environment. The partner required both developer and component recommendations, with components acting as proxies for team assignments-particularly useful in cases of developer turnover or team changes. We trained TriagerX on the partner's dataset for both tasks, and it outperformed SOTA baselines by up to 10% for component recommendations and 54% for developer recommendations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces TriagerX, a dual-transformer architecture for bug triaging that produces content-based developer rankings by aggregating recommendations from the last three layers of each of two transformers, then refines the ranking via a novel interaction-based component that incorporates developers' historical interactions with similar fixed bugs. It reports consistent outperformance over nine transformer-based baselines (including SOTA methods) on five datasets, with Top-1 and Top-3 accuracy gains often exceeding 10%, and describes successful deployment at an industry partner for both developer and component recommendations.
Significance. If the reported gains prove robust, the work would advance software engineering research on PLM-based recommendations by showing how dual content encoders can be combined with interaction history to address token-attention limitations and improve practical triaging. The industrial deployment for component recommendations (as proxies for team assignments) adds applied value, particularly for handling developer turnover.
major comments (3)
- [§3.2] §3.2 (Dual-Transformer Architecture): The decision to aggregate only the last three layers from each transformer for the content ranking lacks motivation, ablation, or comparison to other layer selections. This choice is load-bearing for the claim of 'robust content-based ranking' because the complementarity of the two transformers rests on it; without evidence that three layers are optimal or that other depths yield inferior results, the architectural novelty is difficult to assess.
- [§4.3] §4.3 (Evaluation and Interaction Ranking): The manuscript does not specify whether dataset splits are temporal (to prevent leakage) or random, nor does it detail the similarity function used to retrieve 'similar fixed bugs' for the interaction-based refinement. If similarity is computed from the same content embeddings as the primary ranking or if splits permit future data to influence training, the >10% gains on Top-1/Top-3 metrics could be inflated rather than reflecting genuine complementarity of the interaction component.
- [Results] Results tables (e.g., Table 2 or equivalent): No statistical significance tests (paired t-test, McNemar, or bootstrap) or details on hyper-parameter tuning and validation folds are reported. Given that the central claim is consistent outperformance across five datasets, the absence of these controls makes it impossible to rule out post-hoc selection or overfitting as explanations for the observed margins.
minor comments (2)
- [Abstract] Abstract: The phrase 'often improving ... by over 10%' is vague; specifying the exact datasets, metrics, and range of improvements (or providing a summary table reference) would improve clarity for readers.
- [§2] §2 (Related Work): Several citations to recent PLM papers could be supplemented with foundational bug-triaging references (e.g., earlier ML-based assignee recommendation studies) to better situate the novelty of the dual + interaction design.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below, indicating revisions where the manuscript will be updated to improve clarity and rigor.
read point-by-point responses
-
Referee: [§3.2] §3.2 (Dual-Transformer Architecture): The decision to aggregate only the last three layers from each transformer for the content ranking lacks motivation, ablation, or comparison to other layer selections. This choice is load-bearing for the claim of 'robust content-based ranking' because the complementarity of the two transformers rests on it; without evidence that three layers are optimal or that other depths yield inferior results, the architectural novelty is difficult to assess.
Authors: The last three layers were selected because deeper transformer layers encode higher-level semantic representations suited to bug report semantics, while avoiding dilution from shallower syntactic features. We acknowledge the absence of supporting ablation. In the revision we will add an ablation study comparing last-1, last-2, last-3, and last-4 layers on all five datasets, demonstrating that three layers yield the strongest and most stable content rankings. revision: yes
-
Referee: [§4.3] §4.3 (Evaluation and Interaction Ranking): The manuscript does not specify whether dataset splits are temporal (to prevent leakage) or random, nor does it detail the similarity function used to retrieve 'similar fixed bugs' for the interaction-based refinement. If similarity is computed from the same content embeddings as the primary ranking or if splits permit future data to influence training, the >10% gains on Top-1/Top-3 metrics could be inflated rather than reflecting genuine complementarity of the interaction component.
Authors: All splits are strictly temporal (training on bugs reported before a cutoff date, testing on later bugs) to eliminate leakage. The similarity function for retrieving similar fixed bugs combines BM25 textual similarity with metadata overlap (component, priority, severity) and is computed independently of the dual-transformer content embeddings. We will document these choices explicitly in the revised §4.3 to confirm the interaction component supplies orthogonal signal. revision: yes
-
Referee: [Results] Results tables (e.g., Table 2 or equivalent): No statistical significance tests (paired t-test, McNemar, or bootstrap) or details on hyper-parameter tuning and validation folds are reported. Given that the central claim is consistent outperformance across five datasets, the absence of these controls makes it impossible to rule out post-hoc selection or overfitting as explanations for the observed margins.
Authors: Hyperparameters were selected via grid search with 5-fold cross-validation performed inside the training set of each dataset. We will add paired t-tests and McNemar tests on Top-1/Top-3 accuracy across repeated runs in the revised results section, reporting p-values to establish statistical significance of the observed margins. revision: yes
Circularity Check
No circularity: empirical performance claims rest on external benchmarks
full rationale
The paper introduces TriagerX as a dual-transformer model for content-based developer ranking followed by an interaction-based refinement step using historical bug interactions. Its central claims consist of measured accuracy improvements (Top-1/Top-3 gains >10% across five datasets versus nine baselines). These are obtained by training on one data partition and evaluating on held-out test reports, not by any algebraic reduction, fitted parameter renamed as prediction, or self-referential definition. No equations, uniqueness theorems, or ansatzes are presented that would make the reported rankings equivalent to the model's own inputs by construction. Design choices such as extracting signals from the last three layers of each transformer are architectural decisions justified by the goal of capturing token semantics, not circularly defined quantities. The results therefore remain self-contained against external baselines and do not trigger any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
free parameters (2)
- choice of last three layers
- interaction history weighting
axioms (2)
- domain assumption Transformer models capture token semantics better than TF-IDF or bag-of-words features.
- domain assumption Historical developer interactions with similar fixed bugs are predictive of future assignments.
Reference graph
Works this paper leans on
-
[1]
Automatic bug triage using text categorization,
D. Cubranic and G. C. Murphy, “Automatic bug triage using text categorization,” in International Conference on Software Engineering and Knowledge Engineering , 2004
work page 2004
-
[2]
J. Anvik, L. Hiew, and G. C. Murphy, “Who should fix this bug?” in Proceedings of the 28th International Conference on Software Engineering, ser. ICSE ’06. New York, NY , USA: Association for Computing Machinery, 2006, p. 361–370
work page 2006
-
[3]
Reducing the effort of bug report triage: Recommenders for development-oriented deci- sions,
J. Anvik and G. C. Murphy, “Reducing the effort of bug report triage: Recommenders for development-oriented deci- sions,” ACM Trans. Softw. Eng. Methodol. , vol. 20, no. 3, aug 2011. TABLE XI: Top-k accuracy of all considered baselines on different datasets compared to TriagerX CBR. Dataset Method Top-1 Top-3 Top-5 Top-10 Top-20 GoogleChromium TriagerX CB...
work page 2011
-
[4]
Ap- plying deep learning based automatic bug triager to industrial projects,
S.-R. Lee, M.-J. Heo, C.-G. Lee, M. Kim, and G. Jeong, “Ap- plying deep learning based automatic bug triager to industrial projects,” in Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering , ser. ESEC/FSE 2017. New York, NY , USA: Association for Computing Machinery, 2017, p. 926–931
work page 2017
-
[5]
DeepTriage: Exploring the Effectiveness of Deep Learning for Bug Triaging
S. Mani, A. Sankaran, and R. Aralikatte, “Deeptriage: Exploring the effectiveness of deep learning for bug triaging,” CoRR, vol. abs/1801.01275, 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[6]
S. F. A. Zaidi, F. M. Awan, M. Lee, H. Woo, and C.-G. Lee, “Applying convolutional neural networks with different word representation techniques to recommend bug fixers,”IEEE Access, vol. 8, pp. 213 729–213 747, 2020
work page 2020
-
[7]
A light bug triage framework for applying large pre-trained language model,
J. Lee, K. Han, and H. Yu, “A light bug triage framework for applying large pre-trained language model,” in Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering , ser. ASE ’22. New York, NY , USA: Association for Computing Machinery, 2023
work page 2023
-
[8]
Improving bug triaging with high confidence predictions at ericsson,
A. Sarkar, P. C. Rigby, and B. Bartalos, “Improving bug triaging with high confidence predictions at ericsson,” in 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), 2019, pp. 81–91
work page 2019
-
[9]
A comparative study of transformer-based neural text representation techniques on bug triaging,
A. K. Dipongkor and K. Moran, “A comparative study of transformer-based neural text representation techniques on bug triaging,” in 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE) , 2023, pp. 1012–1023
work page 2023
-
[10]
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
J. Devlin, M. Chang, K. Lee, and K. Toutanova, “BERT: pre-training of deep bidirectional transformers for language understanding,” CoRR, vol. abs/1810.04805, 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[11]
Deep contextualized word rep- resentations,
M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, “Deep contextualized word rep- resentations,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), M. Walker, H. Ji, and A. Stent, Eds. New Or- leans, Lo...
work page 2018
-
[12]
Efficient Estimation of Word Representations in Vector Space
T. Mikolov, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781, vol. 3781, 2013
work page internal anchor Pith review Pith/arXiv arXiv 2013
-
[13]
Distilling the knowledge in a neural network,
G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” 2015
work page 2015
-
[14]
Does knowledge distillation really work?
S. D. Stanton, P. Izmailov, P. Kirichenko, A. A. Alemi, and A. G. Wilson, “Does knowledge distillation really work?” inAd- vances in Neural Information Processing Systems , A. Beygelz- imer, Y . Dauphin, P. Liang, and J. W. Vaughan, Eds., 2021
work page 2021
-
[15]
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
R. T. McCoy, E. Pavlick, and T. Linzen, “Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference,” arXiv preprint arXiv:1902.01007 , 2019
work page internal anchor Pith review Pith/arXiv arXiv 1902
-
[16]
Effects of human adversarial and affable samples on bert generalization,
A. Elangovan, J. He, Y . Li, and K. Verspoor, “Effects of human adversarial and affable samples on bert generalization,” arXiv preprint arXiv:2310.08008, 2023
-
[17]
Ensemble learning for heterogeneous large language models with deep parallel collaboration,
Y . Huang, X. Feng, B. Li, Y . Xiang, H. Wang, T. Liu, and B. Qin, “Ensemble learning for heterogeneous large language models with deep parallel collaboration,” Advances in Neural Information Processing Systems , vol. 37, pp. 119 838–119 860, 2024
work page 2024
-
[18]
An ensemble method for bug triaging using large language models,
A. Kumar Dipongkor, “An ensemble method for bug triaging using large language models,” in Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engi- neering: Companion Proceedings , 2024, pp. 438–440
work page 2024
-
[19]
When does diversity help generalization in classification ensembles?
Y . Bian and H. Chen, “When does diversity help generalization in classification ensembles?” IEEE Transactions on Cybernet- ics, vol. 52, no. 9, pp. 9059–9075, 2022
work page 2022
-
[20]
Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy,
L. I. Kuncheva and C. J. Whitaker, “Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy,”Mach. Learn., vol. 51, no. 2, p. 181–207, may 2003
work page 2003
-
[21]
G. Yang, T. Zhang, and B. Lee, “Utilizing a multi-developer network-based developer recommendation algorithm to fix bugs effectively,” inProceedings of the 29th Annual ACM Symposium on Applied Computing , ser. SAC ’14. New York, NY , USA: Association for Computing Machinery, 2014, p. 1134–1139
work page 2014
-
[22]
Sentimental analysis of movie reviews using soft voting ensemble-based machine learning,
A. Athar, S. Ali, M. M. Sheeraz, S. Bhattachariee, and H.- C. Kim, “Sentimental analysis of movie reviews using soft voting ensemble-based machine learning,” in 2021 Eighth Inter- national Conference on Social Network Analysis, Management and Security (SNAMS) , 2021, pp. 01–05
work page 2021
-
[23]
S. Kumari, D. Kumar, and M. Mittal, “An ensemble approach for classification and prediction of diabetes mellitus using soft voting classifier,”International Journal of Cognitive Computing in Engineering, vol. 2, pp. 40–46, 2021
work page 2021
-
[24]
Catastrophic forgetting in connectionist net- works,
R. M. French, “Catastrophic forgetting in connectionist net- works,” Trends in cognitive sciences, vol. 3, no. 4, pp. 128–135, 1999
work page 1999
-
[25]
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” CoRR, vol. abs/1502.03167, 2015
work page internal anchor Pith review Pith/arXiv arXiv 2015
-
[26]
Dropout: A simple way to prevent neural networks from overfitting,
N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” Journal of Machine Learning Re- search, vol. 15, no. 56, pp. 1929–1958, 2014
work page 1929
-
[27]
Sentence-bert: Sentence em- beddings using siamese bert-networks,
N. Reimers and I. Gurevych, “Sentence-bert: Sentence em- beddings using siamese bert-networks,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 11 2019
work page 2019
-
[28]
Time weight collaborative filtering,
Y . Ding and X. Li, “Time weight collaborative filtering,” in Proceedings of the 14th ACM international conference on Information and knowledge management , 2005, pp. 485–492
work page 2005
-
[29]
Performance evaluation of time-based recommendation system in collaborative filtering technique,
G. Jain, T. Mahara, and S. Sharma, “Performance evaluation of time-based recommendation system in collaborative filtering technique,” Procedia Computer Science , vol. 218, pp. 1834– 1844, 2023
work page 2023
-
[30]
Cost- aware triage ranking algorithms for bug reporting systems,
J.-w. Park, M.-W. Lee, J. Kim, S.-w. Hwang, and S. Kim, “Cost- aware triage ranking algorithms for bug reporting systems,” Knowledge and Information Systems , vol. 48, pp. 679–705, 2016
work page 2016
-
[31]
Adptriage: Approximate dynamic programming for bug triage,
H. Jahanshahi, M. Cevik, K. Mousavi, and A. Bas ¸ar, “Adptriage: Approximate dynamic programming for bug triage,” IEEE Trans. Softw. Eng., vol. 49, no. 10, p. 4594–4609, Aug. 2023
work page 2023
-
[32]
Decoupled Weight Decay Regularization
I. Loshchilov and F. Hutter, “Fixing weight decay regularization in adam,” CoRR, vol. abs/1711.05101, 2017
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[33]
MPNet: Masked And Permuted Pre-Training For Language Understanding,
K. Song, X. Tan, T. Qin, J. Lu, and T. Liu, “Mpnet: Masked and permuted pre-training for language understanding,” CoRR, vol. abs/2004.09297, 2020
-
[34]
Multiple word embeddings for increased diver- sity of representation,
B. Lester, D. Pressel, A. Hemmeter, S. R. Choudhury, and S. Bangalore, “Multiple word embeddings for increased diver- sity of representation,” arXiv preprint arXiv:2009.14394, 2020
-
[35]
Comparison and combination of sentence embeddings derived from different supervision signals,
H. Tsukagoshi, R. Sasano, and K. Takeda, “Comparison and combination of sentence embeddings derived from different supervision signals,” in Proceedings of the 11th Joint Con- ference on Lexical and Computational Semantics , V . Nastase, E. Pavlick, M. T. Pilehvar, J. Camacho-Collados, and A. Ra- ganato, Eds. Seattle, Washington: Association for Computa- t...
work page 2022
-
[36]
R. E. Schapire, “Explaining adaboost,” in Empirical inference: festschrift in honor of vladimir N. Vapnik . Springer, 2013, pp. 37–52
work page 2013
-
[37]
Smote: synthetic minority over-sampling technique,
N. V . Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “Smote: synthetic minority over-sampling technique,” J. Artif. Int. Res., vol. 16, no. 1, p. 321–357, jun 2002
work page 2002
-
[38]
Focal loss for dense object detection,
T.-Y . Lin, P. Goyal, R. Girshick, K. He, and P. Doll ´ar, “Focal loss for dense object detection,” in Proceedings of the IEEE international conference on computer vision , 2017, pp. 2980– 2988
work page 2017
-
[39]
Developer activity motivated bug triaging: Via convolutional neural network,
S. Guo, X. Zhang, X. Yang, R. Chen, C. Guo, H. Li, and T. Li, “Developer activity motivated bug triaging: Via convolutional neural network,” Neural Process. Lett. , vol. 51, no. 3, p. 2589–2606, jun 2020
work page 2020
-
[40]
Efficient bug triage for industrial environments,
W. Zhang, “Efficient bug triage for industrial environments,” in 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME) , 2020, pp. 727–735
work page 2020
-
[41]
S. N. Ahsan, J. Ferzund, and F. Wotawa, “Automatic software bug triage system (bts) based on latent semantic indexing and support vector machine,” in 2009 Fourth International Conference on Software Engineering Advances , 2009, pp. 216– 221
work page 2009
-
[42]
Bug prioritization to facilitate bug report triage,
J. Kanwal and O. Maqbool, “Bug prioritization to facilitate bug report triage,” Journal of Computer Science and Technology , vol. 27, pp. 397 – 412, 2012
work page 2012
-
[43]
Automated bug triaging in an industrial context,
V . Ded´ık and B. Rossi, “Automated bug triaging in an industrial context,” in 2016 42th Euromicro Conference on Software Engi- neering and Advanced Applications (SEAA), 2016, pp. 363–367
work page 2016
-
[44]
Easy over hard: a case study on deep learning,
W. Fu and T. Menzies, “Easy over hard: a case study on deep learning,” in Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering , ser. ESEC/FSE 2017. New York, NY , USA: Association for Computing Machinery, 2017, p. 49–60
work page 2017
-
[45]
Automated bug assignment: Ensemble-based ma- chine learning in large scale industrial contexts,
L. Jonsson, M. Borg, D. Broman, K. Sandahl, S. Eldh, and P. Runeson, “Automated bug assignment: Ensemble-based ma- chine learning in large scale industrial contexts,” Empirical Softw. Engg., vol. 21, no. 4, p. 1533–1578, aug 2016
work page 2016
-
[46]
Triaging incoming change requests: Bug or commit history, or code authorship?
M. Linares-V ´asquez, K. Hossen, H. Dang, H. Kagdi, M. Geth- ers, and D. Poshyvanyk, “Triaging incoming change requests: Bug or commit history, or code authorship?” in 2012 28th IEEE International Conference on Software Maintenance (ICSM) , 2012, pp. 451–460
work page 2012
-
[47]
R. Shokripour, J. Anvik, Z. M. Kasirun, and S. Zamani, “Why so complicated? simple term filtering and weighting for location- based bug report assignment recommendation,” in 2013 10th Working Conference on Mining Software Repositories (MSR) , 2013, pp. 2–11
work page 2013
-
[48]
Automatic bug triage using hierarchical attention networks,
H. He and S. Yang, “Automatic bug triage using hierarchical attention networks,” in 2021 IEEE 21st International Confer- ence on Software Quality, Reliability and Security Companion 14 (QRS-C), 2021, pp. 1043–1049
work page 2021
-
[49]
A multi-label, dual-output deep neural network for automated bug triaging,
C. A. Choquette-Choo, D. Sheldon, J. Proppe, J. Alphonso- Gibbs, and H. Gupta, “A multi-label, dual-output deep neural network for automated bug triaging,” in 2019 18th IEEE Inter- national Conference On Machine Learning And Applications (ICMLA), 2019, pp. 937–944
work page 2019
-
[50]
Fuzzy set and cache-based approach for bug triaging,
A. Tamrawi, T. T. Nguyen, J. M. Al-Kofahi, and T. N. Nguyen, “Fuzzy set and cache-based approach for bug triaging,” in Pro- ceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering , ser. ESEC/FSE ’11. New York, NY , USA: Association for Computing Machinery, 2011, p. 365–375
work page 2011
-
[51]
Effective bug triage based on historical bug-fix information,
H. Hu, H. Zhang, J. Xuan, and W. Sun, “Effective bug triage based on historical bug-fix information,” in 2014 IEEE 25th International Symposium on Software Reliability Engineering , 2014, pp. 122–132
work page 2014
-
[52]
Learning to rank for bug report assignee recommendation,
Y . Tian, D. Wijedasa, D. Lo, and C. Le Goues, “Learning to rank for bug report assignee recommendation,” in 2016 IEEE 24th International Conference on Program Comprehension (ICPC) , 2016, pp. 1–10
work page 2016
-
[53]
D. M. Blei, A. Y . Ng, and M. I. Jordan, “Latent dirichlet allocation,” J. Mach. Learn. Res. , vol. 3, no. null, p. 993–1022, mar 2003
work page 2003
-
[54]
Dretom: developer recommendation based on topic models for bug resolution,
X. Xie, W. Zhang, Y . Yang, and Q. Wang, “Dretom: developer recommendation based on topic models for bug resolution,” in Proceedings of the 8th International Conference on Predictive Models in Software Engineering , ser. PROMISE ’12. New York, NY , USA: Association for Computing Machinery, 2012, p. 19–28
work page 2012
-
[55]
T. Zhang, G. Yang, B. Lee, and E. K. Lua, “A novel developer ranking algorithm for automatic bug triage using topic model and developer relations,” in 2014 21st Asia-Pacific Software Engineering Conference, vol. 1, 2014, pp. 223–230
work page 2014
-
[56]
G. Yang, T. Zhang, and B. Lee, “Towards semi-automatic bug triage and severity prediction based on topic model and multi- feature of bug reports,” in 2014 IEEE 38th Annual Computer Software and Applications Conference , 2014, pp. 97–106
work page 2014
-
[57]
Improving automated bug triaging with specialized topic model,
X. Xia, D. Lo, Y . Ding, J. M. Al-Kofahi, T. N. Nguyen, and X. Wang, “Improving automated bug triaging with specialized topic model,” IEEE Transactions on Software Engineering , vol. 43, no. 3, pp. 272–297, 2017
work page 2017
-
[58]
Developer prioritization in bug repositories,
J. Xuan, H. Jiang, Z. Ren, and W. Zou, “Developer prioritization in bug repositories,” in 2012 34th International Conference on Software Engineering (ICSE) , 2012, pp. 25–35
work page 2012
-
[59]
Heterogeneous network analysis of developer contribution in bug repositories,
W. Zhang, S. Wang, Y . Yang, and Q. Wang, “Heterogeneous network analysis of developer contribution in bug repositories,” in 2013 International Conference on Cloud and Service Com- puting, 2013, pp. 98–105
work page 2013
-
[60]
Improving bug triage with bug tossing graphs,
G. Jeong, S. Kim, and T. Zimmermann, “Improving bug triage with bug tossing graphs,” in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, ser. ESEC/FSE ’09. New York, NY , USA: Association for Computing Machinery, 2009, p. 111–120
work page 2009
-
[61]
Fine-grained incremental learning and multi-feature tossing graphs to improve bug triag- ing,
P. Bhattacharya and I. Neamtiu, “Fine-grained incremental learning and multi-feature tossing graphs to improve bug triag- ing,” in 2010 IEEE International Conference on Software Main- tenance, 2010, pp. 1–10
work page 2010
-
[62]
Bug triaging based on tossing sequence modeling,
S. Xi, Y . Yao, X. Xiao, F. Xu, and J. Lu, “Bug triaging based on tossing sequence modeling,” Journal of Computer Science and Technology, vol. 34, pp. 942 – 956, 2019
work page 2019
-
[63]
Reducing bug triaging confusion by learning from mistakes with a bug tossing knowledge graph,
Y . Su, Z. Xing, X. Peng, X. Xia, C. Wang, X. Xu, and L. Zhu, “Reducing bug triaging confusion by learning from mistakes with a bug tossing knowledge graph,” in 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2021, pp. 191–202
work page 2021
-
[64]
Assigning bug reports using a vocabulary-based expertise model of developers,
D. Matter, A. Kuhn, and O. Nierstrasz, “Assigning bug reports using a vocabulary-based expertise model of developers,” in 2009 6th IEEE International Working Conference on Mining Software Repositories, 2009, pp. 131–140
work page 2009
-
[65]
Bug report assignee recommendation using activity profiles,
H. Naguib, N. Narayan, B. Br ¨ugge, and D. Helal, “Bug report assignee recommendation using activity profiles,” in 2013 10th Working Conference on Mining Software Repositories (MSR) , 2013, pp. 22–30
work page 2013
-
[66]
Sustriage: Sustainable bug triage with multi-modal ensemble learning,
W. Zhang, J. Zhao, and S. Wang, “Sustriage: Sustainable bug triage with multi-modal ensemble learning,” in IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, ser. WI-IAT ’21. New York, NY , USA: Association for Computing Machinery, 2022, p. 441–448
work page 2022
-
[67]
Shallow-deep networks: Understanding and mitigating network overthinking,
Y . Kaya, S. Hong, and T. Dumitras, “Shallow-deep networks: Understanding and mitigating network overthinking,” in Inter- national conference on machine learning . PMLR, 2019, pp. 3301–3310
work page 2019
-
[68]
HuggingFace's Transformers: State-of-the-art Natural Language Processing
T. Wolf, L. Debut, V . Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, and J. Brew, “Huggingface’s transformers: State-of-the-art natural language processing,” CoRR, vol. abs/1910.03771, 2019
work page internal anchor Pith review Pith/arXiv arXiv 1910
-
[69]
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Y . Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V . Stoyanov, “Roberta: A robustly optimized BERT pretraining approach,” CoRR, vol. abs/1907.11692, 2019
work page internal anchor Pith review Pith/arXiv arXiv 1907
-
[70]
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
P. He, X. Liu, J. Gao, and W. Chen, “Deberta: Decoding- enhanced BERT with disentangled attention,” CoRR, vol. abs/2006.03654, 2020
work page internal anchor Pith review Pith/arXiv arXiv 2006
-
[71]
CodeBERT: A pre- trained model for programming and natural languages,
Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M. Gong, L. Shou, B. Qin, T. Liu, D. Jiang, and M. Zhou, “CodeBERT: A pre- trained model for programming and natural languages,” in Find- ings of the Association for Computational Linguistics: EMNLP 2020, T. Cohn, Y . He, and Y . Liu, Eds. Online: Association for Computational Linguistics, Nov. 2020, pp. 1536–1547
work page 2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.