Network versus Code Metrics to Predict Defects: A Replication Study
by Rahul Premraj, Kim Herzig

Proceedings of the Fifth International Symposium on Empirical Software Engineering and Measurement (ESEM 2011), September 2011.


Abstract

Several defect prediction models have been proposed to identify which entities in a software system are likely to have defects before its release. This paper presents a replication of one such study conducted by Zimmermann and Nagappan [1] on Windows Server 2003 where the authors leveraged dependency relationships between software entities captured using social network metrics to predict whether they are likely to have defects. They found that network metrics perform significantly better than source code metrics at predicting defects. In order to corroborate the generality of their findings, we replicate their study on three open source Java projects, viz., JRuby, ArgoUML, and Eclipse. Our results are in agreement with the original study by Zimmermann and Nagappan when using a similar experimental setup as them (random sampling). However, when we evaluated the metrics using setups more suited for industrial use – forward-release and cross-project prediction – we found network metrics to offer no vantage over code metrics. Moreover, code metrics may be preferable to network metrics considering the data is easier to collect and we used only 8 code metrics compared to approximately 58 network metrics.

BibTeX Entry

@inproceedings{premraj-esem-2011,
    title = "Network versus Code Metrics to Predict Defects: A Replication Study",
    author = "Rahul Premraj and Kim Herzig",
    year = "2011",
    month = sep,
    booktitle = "Proceedings of the Fifth International Symposium on Empirical Software Engineering and Measurement (ESEM 2011)",
    location = "Banff, Canada",
}
