Evaluating data-flow coverage in spectrum-based fault localization

Fabio Kon; Henrique Lemos Ribeiro; Higor Amario de Souza; Marcos Lordello Chaim; Roberto Paulo de Andrioli Araujo

arxiv: 1906.11715 · v1 · pith:ER3HEX2Vnew · submitted 2019-06-27 · 💻 cs.SE

Henrique Lemos Ribeiro, Higor Amario de Souza, Roberto Paulo de Andrioli Araujo, Marcos Lordello Chaim, Fabio Kon

Pith reviewed 2026-05-25 14:30 UTC · model grok-4.3

classification 💻 cs.SE

keywords spectrum-based fault localization · data-flow spectra · control-flow spectra · definition-use associations · fault ranking · software debugging · SFL metrics

The pith

Data-flow spectra place up to 50% more faults in the top-15 ranks than control-flow spectra in spectrum-based fault localization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares data-flow spectra, based on definition-use associations (DUAs), with control-flow spectra, based on lines, for spectrum-based fault localization (SFL) across ten ranking metrics. It evaluates both on 163 faults from five real-world open-source programs with large test suites. The results show that data-flow spectra improve fault rankings, placing more faults in the top positions. This suggests SFL can help developers locate bugs with less inspection effort, though collecting the spectra costs more computation. Data-flow spectra also provide information about suspicious variables.

Core claim

Using data-flow spectra, up to 50% more faults are ranked in the top-15 positions compared to control-flow spectra. Most SFL ranking metrics present better effectiveness using data-flow to inspect up to the top-40 positions. The execution cost of data-flow spectra is higher, with an average overhead of 353% compared to 102% for control-flow.
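To make the overhead figures concrete, the following minimal sketch converts a percentage overhead into a total run-time multiplier. It assumes, as is conventional but not stated on this page, that overhead is measured relative to the uninstrumented test run; the function name `runtime_multiplier` is hypothetical.

```python
def runtime_multiplier(overhead_percent):
    """Total run time as a multiple of the uninstrumented run time.

    Assumption: overhead is expressed relative to the uninstrumented baseline,
    so 100% overhead means the instrumented run takes twice as long.
    """
    return 1.0 + overhead_percent / 100.0


# Average overheads reported in the abstract.
dua_multiplier = runtime_multiplier(353)   # data-flow (DUA) spectra
line_multiplier = runtime_multiplier(102)  # control-flow (line) spectra

print(f"data-flow:    {dua_multiplier:.2f}x baseline")
print(f"control-flow: {line_multiplier:.2f}x baseline")
```

Under that assumption, the reported averages correspond to roughly 4.5 times versus 2 times the baseline run time.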

What carries the argument

Definition-use association (DUA) spectra versus line spectra applied to ten SFL ranking metrics on 163 faults.
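As an illustration of how a ranking metric is applied to either spectrum type, here is a minimal sketch using Ochiai, one of the metrics named in the paper's figure text. The formula is the standard Ochiai coefficient; the data layout, the encoding of a DUA as a `(variable, definition line, use line)` tuple, and the toy test data are all assumptions made for this example and are not taken from the paper or its tooling.

```python
import math


def ochiai(ef, ep, total_failed):
    """Standard Ochiai score: ef / sqrt(total_failed * (ef + ep)).

    ef: failing tests that cover the element
    ep: passing tests that cover the element
    """
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom > 0 else 0.0


def rank_elements(spectrum, outcomes):
    """Rank program elements from most to least suspicious.

    spectrum: {element: set of test ids that cover it}
    outcomes: {test id: True if the test passed, False if it failed}
    """
    failed = {t for t, passed in outcomes.items() if not passed}
    scores = {}
    for element, covering in spectrum.items():
        ef = len(covering & failed)
        ep = len(covering - failed)
        scores[element] = ochiai(ef, ep, len(failed))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for illustration only.
outcomes = {"t1": True, "t2": True, "t3": False}

# Control-flow spectrum: elements are lines.
line_spectrum = {
    "line 3": {"t1", "t2", "t3"},
    "line 5": {"t2", "t3"},
    "line 7": {"t1"},
}

# Data-flow spectrum: elements are DUAs, encoded here (hypothetically)
# as (variable, definition line, use line).
dua_spectrum = {
    ("x", 2, 5): {"t3"},
    ("x", 2, 7): {"t1"},
    ("y", 3, 5): {"t2", "t3"},
}

for name, spectrum in (("lines", line_spectrum), ("DUAs", dua_spectrum)):
    print(name, rank_elements(spectrum, outcomes))
```

The same ranking function works for both spectra because a metric only needs per-element coverage counts. What changes is the granularity of the elements, and a DUA additionally names the variable involved.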

If this is right

Developers may need to inspect less code to find faults when using data-flow spectra.
Most ranking metrics perform better with data-flow up to the top-40 positions.
Data-flow spectra provide additional information about suspicious variables that can aid fault localization.
The execution time for collecting data-flow spectra, from 22 seconds to under 9 minutes, remains practical for use.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Combining data-flow and control-flow spectra could yield even better results in hybrid SFL techniques.
Applying this to other types of faults or larger programs might reveal scalability limits.
Integration with variable-level analysis could further reduce the code developers need to review.

Load-bearing premise

The 163 faults and five open-source programs with their test suites are representative of typical software systems without systematic bias from data-flow instrumentation.

What would settle it

Running the same comparison on a new set of programs and faults and observing no increase in the number of faults ranked in the top-15 with data-flow spectra.
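A replication of that check could be tallied roughly as follows. This is a sketch under stated assumptions: each fault is represented by the rank of its best-ranked faulty element under each spectrum type, the rank lists are invented, and the helper names are hypothetical. The final line only re-derives the 87.5% difference quoted in the paper's figure text (75 versus 40 faults).

```python
def count_top_n(fault_ranks, n):
    """Count faults whose best-ranked faulty element sits at position n or better."""
    return sum(1 for rank in fault_ranks if rank <= n)


def relative_gain_percent(baseline, improved):
    """Relative change of `improved` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline count must be non-zero")
    return 100.0 * (improved - baseline) / baseline


# Hypothetical per-fault ranks (invented; same fault order in both lists).
line_ranks = [3, 18, 40, 12, 70, 9]
dua_ranks = [2, 14, 25, 11, 90, 8]

line_hits = count_top_n(line_ranks, 15)  # faults in the top-15 with line spectra
dua_hits = count_top_n(dua_ranks, 15)    # faults in the top-15 with DUA spectra

print(line_hits, dua_hits, relative_gain_percent(line_hits, dua_hits))

# Arithmetic check of the figure text: 75 faults versus 40 faults.
print(relative_gain_percent(40, 75))
```

If a new set of subjects yielded `dua_hits <= line_hits` at the top-15 cutoff, that would count against the headline claim.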

Figures

Figures reproduced from arXiv: 1906.11715 by Fabio Kon, Henrique Lemos Ribeiro, Higor Amario de Souza, Marcos Lordello Chaim, Roberto Paulo de Andrioli Araujo.

**Figure 1.** Code of max program.

Adjacent paper text captured with this figure (Section B, Control-flow spectra): Fault localization techniques use different types of control-flow spectra: statements are executable lines of code; basic blocks (or simply blocks) are sets of statements that are always executed together; branches are statements that transfer the control-flow execution among blocks. Control-flow information of a program is represented by a graph with nodes and e…

**Figure 3.** Effectiveness of DUA and line spectra in fault localization using different ranking metrics.

**Figure 4.** Effectiveness of DUA and line spectra in fault localization using different ranking metrics.

Adjacent paper text captured with this figure: … number of faults: 75 out of 163 using Ochiai and Zoltar, whilst line spectrum required the inspection of less code only for 40 and 39 faults, respectively, for the same ranking metrics, a difference of 87.5%. Around 20 out of 163 (12%) faults can be located by investigating only the top 5 lines, using either DUA or li…

Original abstract

Background: Debugging is a key task during the software development cycle. Spectrum-based Fault Localization (SFL) is a promising technique to improve and automate debugging. SFL techniques use control-flow spectra to pinpoint the most suspicious program elements. However, data-flow spectra provide more detailed information about the program execution, which may be useful for fault localization. Aims: We evaluate the effectiveness and efficiency of ten SFL ranking metrics using data-flow spectra. Method: We compare the performance of data- and control-flow spectra for SFL using 163 faults from 5 real-world open source programs, which contain from 468 to 4130 test cases. The data- and control-flow spectra types used in our evaluation are definition-use associations (DUAs) and lines, respectively. Results: Using data-flow spectra, up to 50% more faults are ranked in the top-15 positions compared to control-flow spectra. Also, most SFL ranking metrics present better effectiveness using data-flow to inspect up to the top-40 positions. The execution cost of data-flow spectra is higher than control-flow, taking from 22 seconds to less than 9 minutes. Data-flow has an average overhead of 353% for all programs, while the average overhead for control-flow is of 102%. Conclusions: The results suggest that SFL techniques can benefit from using data-flow spectra to classify faults in better positions, which may lead developers to inspect less code to find bugs. The execution cost to gather data-flow is higher compared to control-flow, but it is not prohibitive. Moreover, data-flow spectra also provide information about suspicious variables for fault localization, which may improve the developers' performance using SFL.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

An incremental empirical check on five programs shows that data-flow spectra lift SFL rankings, but it rests on a narrow sample with no reported statistical tests or bias checks.

The core finding here is a head-to-head run of ten SFL metrics on definition-use association spectra versus line spectra across 163 faults in five open-source programs. Data-flow versions place up to 50% more faults in the top-15 ranks and stay ahead through the top-40, while adding an average overhead of 353% instead of 102%, that is, roughly 3.5 times rather than 1 times the baseline execution time in extra cost. That is the concrete result the abstract supplies and the only quantitative claim that is new relative to prior SFL work on control-flow spectra alone.

The paper does the straightforward thing of instrumenting real test suites, collecting both spectrum types, and tabulating the ranking positions and wall-clock costs. That is useful bookkeeping for anyone who might want to try DUAs in a debugger tool.

The limitation is exactly the one the stress-test note flags: five programs and 163 faults give no evidence that the lift survives different languages, different fault distributions, or different test-suite strengths. The abstract mentions no statistical tests, no tie-breaking rule, and no check on whether the DUA instrumentation itself changes execution behavior or coverage. Those gaps make the 50% number hard to treat as more than an observation on this particular set of subjects.

The work is aimed at researchers and tool builders who already care about spectrum-based localization and are willing to pay the extra instrumentation cost for finer-grained data. It is not a foundational result, but the experiment is honest enough and the question practical enough that a serious editor should send it out for review rather than desk-reject. Ask the authors for more programs and basic significance tests before publication.

Referee Report

4 major / 2 minor

Summary. The paper evaluates spectrum-based fault localization (SFL) using data-flow spectra (definition-use associations, DUAs) versus control-flow spectra (program lines) across ten ranking metrics. On 163 faults from five open-source programs (468–4130 tests each), it reports that data-flow spectra place up to 50% more faults in the top-15 positions, yield better effectiveness for most metrics up to the top-40 positions, incur higher but feasible overhead (353% average vs. 102%), and supply additional variable-suspiciousness information.

Significance. If the empirical comparison holds after addressing methodological gaps, the result would be useful for SFL research by showing that richer execution spectra can improve ranking quality on real programs without prohibitive cost. The direct head-to-head measurement on actual faults and test suites is a concrete strength; however, the absence of statistical testing and limited subject selection limit the strength of the general claim that 'SFL techniques can benefit from using data-flow spectra'.

major comments (4)

[Method] Method section: no statistical significance tests (e.g., paired Wilcoxon or bootstrap) are reported for the top-15 and top-40 effectiveness differences that underpin the 'up to 50%' and 'better effectiveness' claims; without them the headline numbers cannot be distinguished from sampling variation.
[Method] Method / Results: the paper supplies no description of tie-breaking rules or how programs containing multiple faults are counted when computing the 'faults ranked in top-15' metric; both choices directly affect the reported percentages.
[Evaluation] Evaluation setup: the five programs and 163 faults are presented without explicit selection criteria, stratification by fault type, or threats-to-validity discussion of domain or language bias, making it impossible to assess whether the observed DUA advantage generalizes beyond the chosen subjects.
[Method] Method: potential systematic bias introduced by the DUA instrumentation itself (e.g., altered execution timing or coverage) is not measured or bounded, yet the spectra comparison treats the two kinds of spectra as directly comparable.
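The first major comment asks for a paired test or bootstrap on the ranking differences. A minimal, standard-library sketch of the bootstrap variant is shown below. It is not the paper's analysis: the per-fault ranks are invented, the function name is hypothetical, and a percentile bootstrap of the mean rank difference is only one of several reasonable choices (a Wilcoxon signed-rank test would be another).

```python
import random
from statistics import mean


def paired_bootstrap_ci(line_ranks, dua_ranks, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (line rank - DUA rank).

    A positive difference means the fault was ranked better (closer to the top)
    with DUA spectra. Returns (observed mean difference, lower bound, upper bound).
    """
    if len(line_ranks) != len(dua_ranks):
        raise ValueError("rank lists must be paired, one entry per fault")
    diffs = [l - d for l, d in zip(line_ranks, dua_ranks)]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    resampled_means = sorted(
        mean(rng.choices(diffs, k=len(diffs))) for _ in range(n_resamples)
    )
    lo = resampled_means[int((alpha / 2) * n_resamples)]
    hi = resampled_means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(diffs), lo, hi


# Hypothetical per-fault ranks (invented; same fault order in both lists).
line_ranks = [12, 30, 7, 45, 20]
dua_ranks = [5, 18, 6, 40, 9]

observed, lo, hi = paired_bootstrap_ci(line_ranks, dua_ranks)
print(f"mean rank improvement {observed:.1f}, 95% CI [{lo:.1f}, {hi:.1f}]")
```

An interval that excludes zero would indicate that the observed improvement is unlikely to be resampling noise on these subjects; it says nothing about whether the subjects themselves are representative.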

minor comments (2)

[Abstract] Abstract: the overhead sentence 'taking from 22 seconds to less than 9 minutes' should clarify whether these are per-program extremes or averages and should reference the corresponding table or figure.
[Results] Results: tables or figures comparing the ten metrics should include the raw counts of faults localized at each rank threshold rather than only relative percentages, to allow independent verification.

Simulated Author's Rebuttal

4 responses · 0 unresolved

Thank you for the constructive feedback. We have revised the manuscript to strengthen its methodological transparency and address all major concerns raised.

Referee: [Method] Method section: no statistical significance tests (e.g., paired Wilcoxon or bootstrap) are reported for the top-15 and top-40 effectiveness differences that underpin the 'up to 50%' and 'better effectiveness' claims; without them the headline numbers cannot be distinguished from sampling variation.

Authors: We agree this is a gap. The revised manuscript now includes paired Wilcoxon signed-rank tests on the per-metric effectiveness differences at top-15 and top-40 positions, with p-values and effect sizes reported in the Results section. Most differences remain statistically significant (p < 0.05). revision: yes
Referee: [Method] Method / Results: the paper supplies no description of tie-breaking rules or how programs containing multiple faults are counted when computing the 'faults ranked in top-15' metric; both choices directly affect the reported percentages.

Authors: We have added an explicit subsection in Method describing tie-breaking (average rank assigned to tied elements, standard in SFL) and clarified that the study uses single-fault versions of the programs, consistent with the majority of prior SFL benchmarks. Multi-fault handling is noted as out of scope. revision: yes
Referee: [Evaluation] Evaluation setup: the five programs and 163 faults are presented without explicit selection criteria, stratification by fault type, or threats-to-validity discussion of domain or language bias, making it impossible to assess whether the observed DUA advantage generalizes beyond the chosen subjects.

Authors: The revised Threats to Validity section now states the selection criteria (programs drawn from prior SFL studies with available test suites and real faults), notes lack of stratification by fault type, and explicitly discusses language (Java) and domain limitations on generalizability. revision: yes
Referee: [Method] Method: potential systematic bias introduced by the DUA instrumentation itself (e.g., altered execution timing or coverage) is not measured or bounded, yet the spectra comparison treats the two kinds of spectra as directly comparable.

Authors: We acknowledge the concern. The revision adds a paragraph in Method noting that both spectra are collected from the same instrumented executions (ensuring internal comparability) and bounds the timing impact via the separately reported overhead figures. We could not retroactively quantify any differential coverage distortion without new instrumentation experiments. revision: partial
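The tie-breaking rule mentioned in the second response (tied elements receive the average of the positions they occupy) can be sketched as follows. The function name and the toy scores are hypothetical, and this is an illustration of the general rule rather than the authors' implementation.

```python
def average_ranks(scores):
    """Assign 1-based ranks by descending suspiciousness.

    Elements with equal scores share the mean of the positions they occupy.
    scores: {element: suspiciousness score}
    """
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j to the last element tied with position i.
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        shared_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = shared_rank
        i = j + 1
    return ranks


# Toy scores, invented for illustration.
scores = {"a": 0.9, "b": 0.7, "c": 0.7, "d": 0.1}
print(average_ranks(scores))
```

The choice matters for threshold-based counts: with average ranks, a faulty element in a large tie group straddling position 15 may fall outside the top-15 even though a best-case rule would count it inside.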

Circularity Check

0 steps flagged

No circularity: direct empirical comparison of measured spectra on fixed subjects

The paper performs an empirical evaluation comparing data-flow (DUA) and control-flow (line) spectra for SFL ranking metrics across 163 faults in 5 open-source programs. No equations, fitted parameters, predictions, or derivations appear in the abstract or described method. Effectiveness claims (e.g., up to 50% more faults in top-15) are reported as direct observations from the experiment, not reduced by construction to any self-defined quantity or prior self-citation. The reader's assessment of score 1.0 is consistent; generalizability concerns exist but are orthogonal to circularity. No load-bearing self-citation chains or ansatzes are present.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper is an empirical evaluation study. It introduces no new mathematical entities or derivations and therefore carries no free parameters or invented entities. It rests on two standard domain assumptions common to software-testing experiments.

axioms (2)

domain assumption: The five selected open-source programs and their 163 faults are representative of real-world software and faults.
Generalization from the reported results depends on this premise.
domain assumption: Definition-use association spectra can be collected by instrumentation without introducing measurement bias or altering program behavior.
The comparison of spectra effectiveness assumes accurate collection.


[1] [1]

The economic impacts of inadequate infrastructure for software testing,

G. Tassey, “The economic impacts of inadequate infrastructure for software testing,” National Institute of Standards and Technology, RTI Project, vol. 7007, no. 011, 2002

work page 2002

[2] [2]

Slice-based statistical fault localization,

X. Mao, Y . Lei, Z. Dai, Y . Qi, and C. Wang, “Slice-based statistical fault localization,” Journal of Systems and Software , vol. 89, no. 0, pp. 51–62, 2014

work page 2014

[3] [3]

State dependency probabilistic model for fault localization,

G. Dandan, S. Xiaohong, W. Tiantian, M. Peijun, and Y . Wang, “State dependency probabilistic model for fault localization,” Information and Software Technology, vol. 57, no. 0, pp. 430–445, 2014

work page 2014

[4] [4]

Zeller, Why programs fail: A guide to systematic debugging , 2nd ed

A. Zeller, Why programs fail: A guide to systematic debugging , 2nd ed. Burlington, MA: Morgan Kaufmann Publishers, 2009

work page 2009

[5] [5]

Visualization of test informa- tion to assist fault localization,

J. A. Jones, M. J. Harrold, and J. Stasko, “Visualization of test informa- tion to assist fault localization,” in Proceedings of the 24th International Conference on Software Engineering , ser. ICSE’02, 2002, pp. 467–477

work page 2002

[6] [6]

Lightweight fault- localization using multiple coverage types,

R. Santelices, J. A. Jones, Y . Yu, and M. J. Harrold, “Lightweight fault- localization using multiple coverage types,” in Proceedings of the 31st International Conference on Software Engineering , ser. ICSE’09, 2009, pp. 56–66

work page 2009

[7] [7]

On the accuracy of spectrum-based fault localization,

R. Abreu, P. Zoeteweij, and A. J. C. van Gemund, “On the accuracy of spectrum-based fault localization,” in Proceedings of the Testing: Academic and Industrial Conference Practice and Research Techniques - MUTATION, ser. TAICPART-MUTATION’07, 2007, pp. 89–98

work page 2007

[8] [8]

Spectral debugging with weights and incremental ranking,

L. Naish, H. J. Lee, and K. Ramamohanarao, “Spectral debugging with weights and incremental ranking,” in Proceedings of the 16th Asia- Paciﬁc Software Engineering Conference , ser. APSEC’09, 2009, pp. 168–175

work page 2009

[9] [9]

A family of code coverage- based heuristics for effective fault localization,

W. E. Wong, V . Debroy, and B. Choi, “A family of code coverage- based heuristics for effective fault localization,” Journal of Systems and Software, vol. 83, no. 2, pp. 188–208, 2010

work page 2010

[10] [10]

Automatic error detection techniques based on dynamic invariants,

A. Gonzalez-Sanchez, “Automatic error detection techniques based on dynamic invariants,” Master’s thesis, Delft University of Technology, 2007

work page 2007

[11] [11]

Evaluating and improving fault localization,

S. Pearson, J. Campos, R. Just, G. Fraser, R. Abreu, M. D. Ernst, D. Pang, and B. Keller, “Evaluating and improving fault localization,” in Proceedings of the 39th International Conference on Software Engi- neering, ser. ICSE’17, 2017, pp. 609–620

work page 2017

[12] [12]

Uniformly evaluating and comparing ranking metrics for spectral fault localization,

C. Ma, Y . Zhang, T. Zhang, Y . Lu, and Q. Wang, “Uniformly evaluating and comparing ranking metrics for spectral fault localization,” in Pro- ceedings of the 14th International Conference on Quality Software , ser. QSIC’14, 2014, pp. 315–320

work page 2014

[13] [13]

Fault localization based on information ﬂow coverage,

W. Masri, “Fault localization based on information ﬂow coverage,” Software Testing, Veriﬁcation and Reliability , vol. 20, no. 2, pp. 121– 147, 2010

work page 2010

[14] [14]

Experiments of the effectiveness of dataﬂow- and controlﬂow-based test adequacy cri- teria,

M. Hutchins, H. Foster, T. Goradia, and T. Ostrand, “Experiments of the effectiveness of dataﬂow- and controlﬂow-based test adequacy cri- teria,” in Proceedings of the 16th International Conference on Software Engineering, ser. ICSE’94, 1994, pp. 191–200

work page 1994

[15] M. Renieris and S. P. Reiss, “Fault localization with nearest neighbor queries,” in Proceedings of the 18th IEEE International Conference on Automated Software Engineering, ser. ASE’03, 2003, pp. 30–39

[16] J. A. Jones, J. F. Bowring, and M. J. Harrold, “Debugging in parallel,” in Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA’07, 2007, pp. 16–26

[17] T. M. Chilimbi, B. Liblit, K. Mehra, A. V. Nori, and K. Vaswani, “HOLMES: Effective statistical debugging via efficient path profiling,” in Proceedings of the 31st International Conference on Software Engineering, ser. ICSE’09, 2009, pp. 34–44

[18] J. Misurda, J. A. Clause, J. L. Reed, B. R. Childers, and M. L. Soffa, “Demand-driven structural testing with dynamic instrumentation,” in Proceedings of the 27th International Conference on Software Engineering, ser. ICSE’05, 2005, pp. 156–165

[19] R. Santelices and M. J. Harrold, “Efficiently monitoring data-flow test coverage,” in Proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering, ser. ASE’07, 2007, pp. 343–352

[20] M. L. Chaim and R. P. A. d. Araujo, “An efficient bitwise algorithm for intra-procedural data-flow testing coverage,” Information Processing Letters, vol. 113, no. 8, pp. 293–300, 2013

[21] R. P. A. de Araujo and M. L. Chaim, “Data-flow testing in the large,” in Proceedings of the 7th IEEE International Conference on Software Testing, Verification and Validation, ser. ICST’14, 2014, pp. 81–90

[22] H. L. Ribeiro, H. A. de Souza, R. P. A. de Araujo, M. L. Chaim, and F. Kon, “Jaguar: A spectrum-based fault localization tool for real-world software,” in Proceedings of the 11th International Conference on Software Testing, Verification and Validation, ser. ICST’18, 2018, pp. 404–409

[23] K. Moir, “Releng of the nerds: Open source release engineering,” March 2011, SDK code coverage with JaCoCo. [Online]. Available: http://relengofthenerds.blogspot.com.br/2011/03/sdk-code-coverage-with-jacoco.html

[24] S. Rapps and E. J. Weyuker, “Selecting software test data using data flow information,” IEEE Transactions on Software Engineering, vol. 11, no. 4, pp. 367–375, 1985

[25] T. Reps, T. Ball, M. Das, and J. Larus, “The use of program profiling for software maintenance with applications to the year 2000 problem,” in Proceedings of the 6th European Software Engineering Conference Held Jointly with the 5th ACM SIGSOFT Symposium on the Foundations of Software Engineering, ser. ESEC/FSE’97, 1997, pp. 432–449

[26] S. Elbaum, D. Gable, and G. Rothermel, “The impact of software evolution on code coverage information,” in Proceedings of the 19th IEEE International Conference on Software Maintenance, ser. ICSM’01, 2001, pp. 170–179

[27] M. J. Harrold, G. Rothermel, R. Wu, and L. Yi, “An empirical investigation of program spectra,” SIGPLAN Notices, vol. 33, no. 7, pp. 83–90, 1998

[28] V. Debroy and W. E. Wong, “A consensus-based strategy to improve the quality of fault localization,” Software: Practice and Experience, vol. 43, no. 8, pp. 989–1011, 2013

[29] J. Xu, W. K. Chan, Z. Zhang, T. H. Tse, and S. Li, “A dynamic fault localization technique with noise reduction for java programs,” in Proceedings of the 11th International Conference on Quality Software, ser. QSIC’11, 2011, pp. 11–20

[30] M. L. Chaim, J. C. Maldonado, and M. Jino, “A debugging strategy based on requirements of testing,” in Proceedings of the 7th European Conference on Software Maintenance and Reengineering, ser. CSMR’03, 2003, pp. 160–169

[31] R. Just, D. Jalali, and M. D. Ernst, “Defects4j: A database of existing faults to enable controlled testing studies for java programs,” in Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA’14, 2014, pp. 437–440

[32] C. Parnin and A. Orso, “Are automated debugging techniques actually helping programmers?” in Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA’11, 2011, pp. 199–209

[33] P. S. Kochhar, X. Xia, D. Lo, and S. Li, “Practitioners’ expectations on automated fault localization,” in Proceedings of the 25th International Symposium on Software Testing and Analysis, ser. ISSTA’16, 2016, pp. 165–176

[34] T. W. Anderson and D. A. Darling, “A test of goodness of fit,” Journal of the American Statistical Association, vol. 49, no. 268, pp. 765–769, 1954

[35] F. Wilcoxon, “Individual comparisons by ranking methods,” Biometrics Bulletin, vol. 1, no. 6, pp. 80–83, 1945

[36] N. Cliff, “Dominance statistics: Ordinal analyses to answer ordinal questions,” Psychological Bulletin, vol. 114, no. 3, pp. 494–509, 1993

[37] H. A. de Souza, “Assessment of spectrum-based fault localization for practical use,” PhD thesis, Institute of Mathematics and Statistics – University of São Paulo, São Paulo, Brazil, April 2018

[38] Y. Lei, X. Mao, Z. Dai, and C. Wang, “Effective statistical fault localization using program slices,” in Proceedings of the IEEE 36th Annual International Computers, Software and Applications Conference, ser. COMPSAC’12, 2012, pp. 1–10

[39] X. Ju, S. Jiang, X. Chen, X. Wang, Y. Zhang, and H. Cao, “Hsfal: Effective fault localization using hybrid spectrum of full slices and execution slices,” Journal of Systems and Software, vol. 90, no. 0, pp. 3–17, 2014

[40] K. Yu, M. Lin, Q. Gao, H. Zhang, and X. Zhang, “Locating faults using multiple spectra-specific models,” in Proceedings of the 26th ACM Symposium on Applied Computing, ser. SAC’11, 2011, pp. 1404–1410

[41] F. Eichinger, K. Krogmann, R. Klug, and K. Böhm, “Software-defect localisation by mining dataflow-enabled call graphs,” in Proceedings of the Joint European Conference on Machine Learning and Principles and Practice on Knowledge Discovery in Databases, ser. ECML PKDD 2010, 2010, pp. 425–441

[42] H. Hemmati, “How effective are code coverage criteria?” in 2015 IEEE International Conference on Software Quality, Reliability and Security, ser. QRS’15, 2015, pp. 151–156