This is an archive of the former Software Engineering Chair at Saarland University. It is no longer up to date.

For the current work of Andreas Zeller and his group, visit andreas-zeller.info.
For the current Software Engineering Chair, visit www.se.cs.uni-saarland.de.

Mining Software Archives
Predicting Component Failures

Software Engineering Chair (Prof. Zeller)
Saarland University – Computer Science
Saarland Informatics Campus
Campus E9 1 (CISPA)
66123 Saarbrücken
E-mail: zeller @ cs.uni-saarland.de
Phone: +49 681 302-70970

The Software Evolution project at the Software Engineering Chair, Saarland University, analyzes version and bug databases to predict failure-prone modules, related changes, and future development activities.

What's new

We have mined the Microsoft bug databases to predict component failures.
- Read the paper (to appear at the International Conference on Software Engineering, Shanghai)
- Read the FAZ article: Wie anfällig wird "Vista"? (in German)
Be warned: Don't program on Fridays! Why not? Read the paper...

How do design decisions impact the quality of software?

In an empirical study of 52 ECLIPSE plug-ins, we found that the software design, as well as past failure history, can be used to build models that accurately predict failure-prone components in new programs. Our prediction only requires usage relationships between components, which are typically defined in the design phase; thus, designers can easily explore and assess design alternatives in terms of predicted quality. In the ECLIPSE study, 90% of the 5% most failure-prone components, as predicted by our model from design data, turned out to actually produce failures later; a random guess would have predicted only 33%.
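The comparison in the last sentence (hit rate among the top 5% of predictions versus a random guess) can be illustrated with a minimal sketch. The component names, risk scores, and failure set below are hypothetical and are not data from the ECLIPSE study; the function names are assumptions made for this illustration only.

```python
def precision_at_top(predicted_risk, failed, percent=5):
    """Share of the top `percent` % highest-ranked components that actually failed."""
    ranked = sorted(predicted_risk, key=predicted_risk.get, reverse=True)
    k = max(1, len(ranked) * percent // 100)  # size of the top slice, at least 1
    hits = sum(1 for name in ranked[:k] if name in failed)
    return hits / k


def base_rate(predicted_risk, failed):
    """Share of all components that failed, i.e. the hit rate of a random guess."""
    return sum(1 for name in predicted_risk if name in failed) / len(predicted_risk)


# Hypothetical data: 40 components with made-up risk scores.
predicted_risk = {f"component{i:02d}": i / 40 for i in range(40)}
failed = {"component39", "component38", "component20", "component05"}

top_precision = precision_at_top(predicted_risk, failed)
random_guess = base_rate(predicted_risk, failed)
print(top_precision, random_guess)
```

A model is useful in this sense when the first number is clearly above the second.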

What is it that makes software fail?

In an empirical study of the post-release defect history of five Microsoft software systems, we found that failure-prone software entities are statistically correlated with code complexity measures. However, there is no single set of complexity metrics that could act as a universally best defect predictor. Using principal component analysis on the code metrics, we built regression models that accurately predict the likelihood of post-release defects for new entities. The approach can easily be generalized to arbitrary projects; in particular, predictors obtained from one project can also be significant for new, similar projects.
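The general idea of combining principal component analysis with regression can be sketched as follows. This is a deliberately simplified illustration, not the models from the study: it assumes only the first principal component is used, fits an ordinary least-squares line to defect counts, and runs on hypothetical metrics (lines of code, cyclomatic complexity, fan-in) and hypothetical defect counts. All function names are assumptions.

```python
import math


def column_stats(rows):
    """Mean and standard deviation of every metric column."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(m)]
    return means, stds


def scale(row, means, stds):
    """Standardize one entity's metrics with the training statistics."""
    return [(x - mu) / sd for x, mu, sd in zip(row, means, stds)]


def first_component(z, iterations=200):
    """Dominant eigenvector of the covariance matrix, found by power iteration."""
    n, m = len(z), len(z[0])
    cov = [[sum(r[a] * r[b] for r in z) / n for b in range(m)] for a in range(m)]
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def train(metrics, defects):
    """Project entities onto the first component, then fit defects = a + b * score."""
    means, stds = column_stats(metrics)
    z = [scale(r, means, stds) for r in metrics]
    component = first_component(z)
    scores = [sum(a * b for a, b in zip(r, component)) for r in z]
    n = len(scores)
    mx, my = sum(scores) / n, sum(defects) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(scores, defects))
             / sum((x - mx) ** 2 for x in scores))
    return {"means": means, "stds": stds, "component": component,
            "slope": slope, "intercept": my - slope * mx}


def predict(model, row):
    """Predicted defect count for a new entity."""
    z = scale(row, model["means"], model["stds"])
    score = sum(a * b for a, b in zip(z, model["component"]))
    return model["intercept"] + model["slope"] * score


# Hypothetical training data: [lines of code, cyclomatic complexity, fan-in].
metrics = [[100, 5, 2], [250, 12, 4], [400, 20, 5],
           [800, 35, 9], [1200, 60, 12], [1500, 70, 15]]
defects = [0, 1, 1, 3, 5, 6]
model = train(metrics, defects)
print(predict(model, [2000, 90, 20]), predict(model, [50, 2, 1]))
```

Projecting onto principal components first addresses the fact that complexity metrics tend to be strongly correlated with one another, which would otherwise make the regression coefficients unstable.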

Papers

Predicting Component Failures at Design Time. A. Schröter, T. Zimmermann, A. Zeller. To appear in Proc. 5th International Symposium on Empirical Software Engineering (ISESE), Rio de Janeiro, Brazil, September 2006. [PDF]
Abstract. How do design decisions impact the quality of the resulting software? In an empirical study of 52 ECLIPSE plug-ins, we found that the software design, as well as past failure history, can be used to build models that accurately predict failure-prone components in new programs. Our prediction only requires usage relationships between components, which are typically defined in the design phase; thus, designers can easily explore and assess design alternatives in terms of predicted quality. In the ECLIPSE study, 90% of the 5% most failure-prone components, as predicted by our model from design data, turned out to actually produce failures later; a random guess would have predicted only 33%.
Mining Metrics to Predict Component Failures. N. Nagappan, T. Ball, A. Zeller. Proc. 28th International Conference on Software Engineering (ICSE), Shanghai, China, May 2006. [PDF]
Abstract. What is it that makes software fail? In an empirical study of the post-release defect history of five Microsoft software systems, we found that failure-prone software entities are statistically correlated with code complexity measures. However, there is no single set of complexity metrics that could act as a universally best defect predictor. Using principal component analysis on the code metrics, we built regression models that accurately predict the likelihood of post-release defects for new entities. The approach can easily be generalized to arbitrary projects; in particular, predictors obtained from one project can also be significant for new, similar projects.
HATARI: Raising Risk Awareness. J. Śliwerski, T. Zimmermann, A. Zeller. Research Demonstration. Proc. European Software Engineering Conference/ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE), Lisbon, Portugal, September 2005. [PDF]
Abstract. As a software system evolves, programmers make changes which sometimes lead to problems. The risk of later problems significantly depends on the location of the change. Which are the locations where changes impose the greatest risk? Our HATARI prototype relates a version history (such as CVS) to a bug database (such as BUGZILLA) to detect those locations where changes have been risky in the past. HATARI makes this risk visible for developers by annotating source code with color bars. Furthermore, HATARI provides views to browse through the most risky locations and to analyze the risk history of a particular location.
When do changes induce fixes? On Fridays. J. Śliwerski, T. Zimmermann, A. Zeller. Proc. International Workshop on Mining Software Repositories (MSR), Saint Louis, Missouri, USA, May 2005. [PDF]
Abstract. As a software system evolves, programmers make changes that sometimes cause problems. We analyze CVS archives for fix-inducing changes—changes that lead to problems, indicated by fixes. We show how to automatically locate fix-inducing changes by linking a version archive (such as CVS) to a bug database (such as BUGZILLA). In a first investigation of the MOZILLA and ECLIPSE history, it turns out that fix-inducing changes show distinct patterns with respect to their size and the day of week they were applied.
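
Linking a version archive to a bug database, as described in the last abstract, can be illustrated with a minimal sketch. The regular expression, the commit tuples, the bug numbers, and the function names are all assumptions made for this illustration; they are not taken from the paper. The step that traces a fix back to the earlier change that induced it is not shown: the set of fix-inducing changes is simply given as input here.

```python
import re
from collections import Counter
from datetime import date

# Assumed convention: commit messages mention bugs as "bug 123" or "fixes #123".
BUG_REF = re.compile(r"\b(?:bug|fix(?:es|ed)?)\s*#?\s*(\d+)", re.IGNORECASE)
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def find_fixes(commits, known_bugs):
    """Map each commit to the known bug reports its message refers to."""
    fixes = {}
    for commit_id, message, _day in commits:
        ids = [int(n) for n in BUG_REF.findall(message)]
        ids = [n for n in ids if n in known_bugs]  # ignore numbers not in the bug database
        if ids:
            fixes[commit_id] = ids
    return fixes


def weekday_share(commits, inducing_ids):
    """Per weekday, the fraction of commits that were flagged as fix-inducing."""
    total, flagged = Counter(), Counter()
    for commit_id, _message, day in commits:
        name = WEEKDAYS[day.weekday()]  # date.weekday(): Monday is 0
        total[name] += 1
        if commit_id in inducing_ids:
            flagged[name] += 1
    return {name: flagged[name] / total[name] for name in total}


# Hypothetical version archive: (commit id, message, commit date).
commits = [
    ("r1", "Add parser", date(2005, 5, 9)),                # a Monday
    ("r2", "Refactor cache", date(2005, 5, 13)),           # a Friday
    ("r3", "Fixed bug 4711 in cache", date(2005, 5, 16)),  # a Monday
    ("r4", "Cleanup, see bug 9999", date(2005, 5, 17)),    # a Tuesday
]
known_bugs = {4711}  # hypothetical bug database contents

fixes = find_fixes(commits, known_bugs)
shares = weekday_share(commits, inducing_ids={"r2"})  # r2 assumed to have induced the fix r3
print(fixes, shares)
```

In this toy archive only `r3` is recognized as a fix, because bug 9999 does not exist in the assumed bug database.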