Predicting Merge Conflicts in Collaborative Software Development

Julia Rubin; Moein Owhadi-Kareshk; Sarah Nadi

arxiv: 1907.06274 · v1 · pith:52MIOPABnew · submitted 2019-07-14 · 💻 cs.SE · cs.LG

Pith reviewed 2026-05-24 21:24 UTC · model grok-4.3

classification 💻 cs.SE cs.LG

keywords merge conflict prediction · machine learning · Git features · speculative merging · collaborative software development · version control · software merging · conflict detection

The pith

A classifier built on nine lightweight Git feature sets predicts safe merges with F1-scores of 0.95 to 0.97 across seven programming languages.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether merge conflicts can be predicted before branches are integrated, using machine learning rather than running a full merge every time. The predictor is built from nine lightweight feature sets already available in Git history, such as branch lengths and change patterns. Evaluation covers 267,657 merge scenarios drawn from 744 GitHub repositories spanning seven programming languages. The model identifies safe, conflict-free merges reliably (F1-scores of 0.95 to 0.97) but is weaker on actual conflicts (F1-scores of 0.57 to 0.68). This setup would let teams skip most background merge checks and spend effort only on the merges the model flags as risky.

Core claim

We design a classifier for predicting merge conflicts, based on 9 light-weight Git feature sets. To evaluate our predictor, we perform a large-scale study on 267,657 merge scenarios from 744 GitHub repositories in seven programming languages. Our results show that we achieve high f1-scores, varying from 0.95 to 0.97 for different programming languages, when predicting safe merge scenarios. The f1-score is between 0.57 and 0.68 for the conflicting merge scenarios. Predicting merge conflicts is feasible in practice, especially in the context of predicting safe merge scenarios as a pre-filtering step for speculative merging.

What carries the argument

A machine-learning classifier based on nine light-weight Git feature sets that distinguishes safe from conflicting merge scenarios.

If this is right

Speculative merging systems can safely skip most merges the classifier labels safe, cutting background computation.
Developers receive earlier warnings about likely conflicts before the changes grow large and complex.
The same nine feature sets deliver F1-scores of 0.95 to 0.97 on safe merges in all seven languages tested.
The technique is positioned as a practical pre-filter rather than a complete replacement for full merge simulation.
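
The pre-filter role described in the points above can be sketched as a thin wrapper around an existing speculative-merge loop. This is a minimal illustration, not the paper's implementation: `prefilter_merges`, `predict_safe`, and `run_merge` are hypothetical names, and the branch names and predictions are invented for the example.

```python
from itertools import combinations
from typing import Callable, Iterable, List, Tuple

MergeScenario = Tuple[str, str]  # (branch_a, branch_b); hypothetical representation


def prefilter_merges(
    scenarios: Iterable[MergeScenario],
    predict_safe: Callable[[MergeScenario], bool],
    run_merge: Callable[[MergeScenario], bool],
) -> Tuple[List[MergeScenario], int]:
    """Return (scenarios whose real merge conflicted, number of full merges run)."""
    conflicts: List[MergeScenario] = []
    merges_run = 0
    for scenario in scenarios:
        if predict_safe(scenario):
            continue  # trusted as conflict-free, so the expensive merge is skipped
        merges_run += 1
        if run_merge(scenario):  # True means the real merge produced a conflict
            conflicts.append(scenario)
    return conflicts, merges_run


# Toy data (invented): four branches give six pairwise merge scenarios.
branches = ["main", "feature-x", "feature-y", "bugfix-z"]
scenarios = list(combinations(branches, 2))
truly_conflicting = {("feature-x", "feature-y")}
predicted_risky = {("feature-x", "feature-y"), ("main", "bugfix-z")}

conflicts, merges_run = prefilter_merges(
    scenarios,
    predict_safe=lambda s: s not in predicted_risky,  # stand-in for the classifier
    run_merge=lambda s: s in truly_conflicting,       # stand-in for a real merge
)
print(f"{merges_run} of {len(scenarios)} merges run, conflicts found: {conflicts}")
```

In this toy run only two of the six pairs are merged in full, and the one real conflict is still found. The saving depends on how many conflicting pairs the predictor wrongly labels safe, since those pairs are never merged in the background.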

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same lightweight features could be extracted from other version-control systems to test whether the prediction approach generalizes beyond Git.
Adding simple change-type or file-overlap signals might raise the lower F1-scores observed for conflicting merges.
Embedding the classifier in continuous-integration pipelines could automatically route only risky merges to human review.
The gap between the safe-merge and conflicting-merge F1-scores suggests the features capture the absence of conflict more readily than its presence.
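
The file-overlap signal mentioned above can be as simple as intersecting the sets of paths each branch touched since the common ancestor. A minimal sketch, assuming the changed-file lists have already been collected (for example with `git diff --name-only`); the function names and the `min_shared` threshold are hypothetical, and nothing on this page says these signals are among the paper's nine feature sets.

```python
from typing import Iterable, Set


def shared_files(files_a: Iterable[str], files_b: Iterable[str]) -> Set[str]:
    """Paths changed on both branches since their common ancestor."""
    return set(files_a) & set(files_b)


def overlap_ratio(files_a: Iterable[str], files_b: Iterable[str]) -> float:
    """Jaccard similarity of the two changed-file sets (0.0 when both are empty)."""
    a, b = set(files_a), set(files_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_predicts_conflict(
    files_a: Iterable[str], files_b: Iterable[str], min_shared: int = 1
) -> bool:
    """Heuristic baseline: flag the merge when enough files were edited on both sides."""
    return len(shared_files(files_a, files_b)) >= min_shared


# Invented example: both branches edited b.py.
branch_a_files = ["a.py", "b.py"]
branch_b_files = ["b.py", "c.py"]
print(shared_files(branch_a_files, branch_b_files))
print(overlap_predicts_conflict(branch_a_files, branch_b_files))
```

Such a heuristic can serve either as an extra input feature or as the simple baseline against which a learned model is compared.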

Load-bearing premise

The nine light-weight Git feature sets provide sufficient information to distinguish between safe and conflicting merge scenarios with high accuracy.

What would settle it

Running the trained classifier on merge scenarios from a fresh collection of repositories outside the original 744 and finding that the F1-score for safe merges drops below 0.9 would challenge the feasibility result.

Figures

Figures reproduced from arXiv: 1907.06274 by Julia Rubin, Moein Owhadi-Kareshk, Sarah Nadi.

**Figure 2.** The Distribution of Merge Scenarios.

Original abstract

Background. During collaborative software development, developers often use branches to add features or fix bugs. When merging changes from two branches, conflicts may occur if the changes are inconsistent. Developers need to resolve these conflicts before completing the merge, which is an error-prone and time-consuming process. Early detection of merge conflicts, which warns developers about resolving conflicts before they become large and complicated, is among the ways of dealing with this problem. Existing techniques do this by continuously pulling and merging all combinations of branches in the background to notify developers as soon as a conflict occurs, which is a computationally expensive process. One potential way for reducing this cost is to use a machine-learning based conflict predictor that filters out the merge scenarios that are not likely to have conflicts, i.e., safe merge scenarios. Aims. In this paper, we assess if conflict prediction is feasible. Method. We design a classifier for predicting merge conflicts, based on 9 light-weight Git feature sets. To evaluate our predictor, we perform a large-scale study on 267,657 merge scenarios from 744 GitHub repositories in seven programming languages. Results. Our results show that we achieve high f1-scores, varying from 0.95 to 0.97 for different programming languages, when predicting safe merge scenarios. The f1-score is between 0.57 and 0.68 for the conflicting merge scenarios. Conclusions. Predicting merge conflicts is feasible in practice, especially in the context of predicting safe merge scenarios as a pre-filtering step for speculative merging.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that a machine-learning classifier based on 9 lightweight Git feature sets can predict merge conflicts with high F1 (0.95-0.97) on safe merges and moderate F1 (0.57-0.68) on conflicting merges. The evaluation uses 267,657 merge scenarios from 744 GitHub repositories across 7 languages. The central conclusion is that conflict prediction is feasible in practice, particularly as a pre-filter for safe merges to reduce the cost of speculative merging.

Significance. A reliable pre-filter for safe merges would meaningfully lower the computational overhead of continuous speculative merging in collaborative development. The scale of the empirical study (hundreds of thousands of real merges) is a positive aspect. However, the reported F1 disparity between classes limits the immediate practical significance unless the conflict-class performance can be shown to be adequate for the pre-filter role or the use-case is narrowed.

major comments (3)

[Abstract] The F1-scores of 0.57-0.68 on conflicting merges are moderate and weaken the pre-filtering claim: low recall on conflicts would let conflicting merges be labeled safe and skipped, so they go undetected, while low precision would send safe merges to the expensive speculative step unnecessarily.
[Method] Method/Results: No feature definitions, ablation studies, feature-importance analysis, or baseline comparisons (e.g., simple file-overlap heuristics) are described, so it is unclear whether the nine Git feature sets capture non-trivial conflict signals or merely reproduce obvious overlap statistics.
[Evaluation] The manuscript provides no details on the cross-validation procedure, class imbalance handling, or precision/recall breakdown per class, which are required to assess whether the reported F1 values generalize or are artifacts of majority-class bias.
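
The concern about majority-class bias in the comments above can be made concrete with a small worked example. The sketch below computes per-class precision, recall, and F1 from label lists; every count in it is invented for illustration and none comes from the paper, and `per_class_metrics` is a hypothetical helper name.

```python
from typing import Dict, List


def per_class_metrics(y_true: List[str], y_pred: List[str], label: str) -> Dict[str, float]:
    """Precision, recall, and F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented, imbalanced toy data: 900 safe merges and 100 conflicting ones.
y_true = ["safe"] * 900 + ["conflict"] * 100
# Hypothetical classifier: 30 safe merges flagged as conflicts, 40 conflicts missed.
y_model = ["safe"] * 870 + ["conflict"] * 30 + ["conflict"] * 60 + ["safe"] * 40
# Trivial baseline that never predicts a conflict.
y_always_safe = ["safe"] * 1000

for name, y_pred in (("model", y_model), ("always-safe", y_always_safe)):
    for label in ("safe", "conflict"):
        m = per_class_metrics(y_true, y_pred, label)
        print(f"{name:12s} {label:9s} P={m['precision']:.3f} R={m['recall']:.3f} F1={m['f1']:.3f}")
```

With these invented counts the hypothetical model scores an F1 of about 0.96 on safe merges and about 0.63 on conflicts, while the always-safe baseline still reaches about 0.95 on the safe class and 0 on the conflict class. A high safe-class F1 on imbalanced data therefore says little by itself, which is why the per-class breakdown and the class distribution matter.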

minor comments (2)

[Abstract] The abstract and conclusions should explicitly state the class distribution (safe vs. conflicting) to contextualize the F1 gap.
[Method] Notation for the nine feature sets should be introduced with a table or enumerated list for reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. The feedback highlights important areas for clarification and strengthening, particularly around the practical implications of the F1 scores, feature analysis, and evaluation details. We address each major comment below and will revise the manuscript to incorporate additional explanations, analyses, and details as outlined.

Referee: [Abstract] The F1-scores of 0.57-0.68 on conflicting merges are moderate and weaken the pre-filtering claim: low recall on conflicts would let conflicting merges be labeled safe and skipped, so they go undetected, while low precision would send safe merges to the expensive speculative step unnecessarily.

Authors: We agree that the moderate F1 on the conflict class requires careful interpretation for the pre-filter use case. However, the primary intended application is to identify safe merges with high confidence (F1 0.95-0.97) so they can be skipped in speculative merging, directly reducing computational cost. For the conflict class, even moderate performance provides value by catching some conflicts early; the system can still fall back to full speculative merging for uncertain cases. Low recall on conflicts would indeed allow some through, but this is acceptable if the goal is cost reduction rather than perfect filtering. We will revise the abstract, introduction, and discussion sections to explicitly frame the use case this way, add per-class precision/recall breakdowns, and discuss the precision-recall trade-offs to show when the pre-filter remains beneficial.
revision: yes

Referee: [Method] Method/Results: No feature definitions, ablation studies, feature-importance analysis, or baseline comparisons (e.g., simple file-overlap heuristics) are described, so it is unclear whether the nine Git feature sets capture non-trivial conflict signals or merely reproduce obvious overlap statistics.

Authors: The nine feature sets are introduced in Section 3 with high-level descriptions drawn from Git metadata (e.g., commit counts, file changes, branch divergence metrics). We acknowledge that explicit definitions, ablation results, feature-importance rankings, and a baseline comparison (such as a simple file-overlap heuristic) were not included. These additions would clarify whether the features provide non-trivial signals. We will expand Section 3 with precise feature definitions, add an ablation study removing feature groups, include feature-importance analysis (e.g., via permutation importance or SHAP), and compare against a file-overlap baseline to demonstrate the incremental value of the ML model.
revision: yes

Referee: [Evaluation] The manuscript provides no details on the cross-validation procedure, class imbalance handling, or precision/recall breakdown per class, which are required to assess whether the reported F1 values generalize or are artifacts of majority-class bias.

Authors: The evaluation used 10-fold cross-validation with stratification by repository to prevent leakage across projects, and class imbalance was addressed via class weighting in the classifier. We agree that these procedural details, along with full per-class precision, recall, and F1 scores for each language, are essential for assessing generalization and potential majority-class bias. We will add a dedicated subsection in the evaluation describing the CV procedure, imbalance handling method, and complete per-class metrics (including confusion matrices or PR curves) to allow readers to verify the results are not artifacts of imbalance.
revision: yes
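
The procedure described in this response can be illustrated without any ML library. The sketch below assumes that "stratification by repository" means keeping every repository's merges inside a single fold, and that "class weighting" means inverse-frequency weights (the heuristic scikit-learn applies for `class_weight="balanced"`). Both readings are assumptions, and `repository_folds` and `balanced_class_weights` are hypothetical helper names.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple


def repository_folds(repos: Sequence[str], n_folds: int) -> List[Tuple[List[int], List[int]]]:
    """Build (train_indices, test_indices) splits where no repository spans both sides."""
    by_repo: Dict[str, List[int]] = defaultdict(list)
    for idx, repo in enumerate(repos):
        by_repo[repo].append(idx)

    # Greedy balancing: place the largest repositories first, each into the smallest fold.
    fold_members: List[List[int]] = [[] for _ in range(n_folds)]
    for repo in sorted(by_repo, key=lambda r: (-len(by_repo[r]), r)):
        smallest = min(range(n_folds), key=lambda f: len(fold_members[f]))
        fold_members[smallest].extend(by_repo[repo])

    splits = []
    for f in range(n_folds):
        test = sorted(fold_members[f])
        train = sorted(i for g in range(n_folds) if g != f for i in fold_members[g])
        splits.append((train, test))
    return splits


def balanced_class_weights(labels: Sequence[str]) -> Dict[str, float]:
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Invented example: ten merge scenarios drawn from four repositories.
repo_of_sample = ["r1"] * 4 + ["r2"] * 3 + ["r3"] * 2 + ["r4"]
for train_idx, test_idx in repository_folds(repo_of_sample, n_folds=2):
    print("test repositories:", sorted({repo_of_sample[i] for i in test_idx}))

print(balanced_class_weights(["safe"] * 9 + ["conflict"]))
```

Grouping by repository means the reported score reflects performance on projects the model has not seen, which is the leakage concern the referee raises. The weights make each minority-class error cost more during training, and they do not change how the metrics themselves are computed.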

Circularity Check

0 steps flagged

No circularity: empirical ML evaluation on held-out repository data

The paper trains a classifier on nine Git-derived features and reports F1 scores on a large held-out set of 267k merge scenarios from 744 real GitHub repositories. No derivation chain exists; the central claim rests on standard supervised learning performance metrics rather than any self-definition, fitted-input-as-prediction, or self-citation load-bearing step. The evaluation protocol (train/test split across repositories) is externally falsifiable and independent of the reported numbers.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The work relies on standard supervised classification assumptions and the domain assumption about feature relevance.

free parameters (1)

ML model parameters
The classifier is trained on data, so parameters are fitted.

axioms (1)

domain assumption Light-weight Git features are predictive of merge conflicts
The paper assumes these 9 feature sets capture the necessary signals.

