Mining Software Archives
Recommending Related Changes

Lehrstuhl für Softwaretechnik (Prof. Zeller)
Universität des Saarlandes – Informatik
Informatik Campus des Saarlandes
Campus E9 1 (CISPA)
66123 Saarbrücken
E-mail: zeller @
Phone: +49 681 302-70970



The Software Evolution project at the Software Engineering Chair, Saarland University, analyzes version and bug databases to predict failure-prone modules, related changes, and future development activities.


Programmers who changed this function also changed...

If you browse the books at Amazon or a similar shop, you may have encountered suggestions of this type: "Customers who bought this book also bought...". Such suggestions stem from Amazon's purchase history: buying two or more books together establishes a relationship between them.

We implemented a similar feature for software: "Programmers who changed function X also changed function Y". For this purpose, we analyze the version histories of large software systems to identify commonalities and anomalies, and to guide programmers in understanding and maintaining the code.
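The idea can be illustrated with a minimal sketch in Python. It assumes the version history has already been reduced to transactions, each a set of entities changed together, and it derives pairwise rules ranked by support (how often both entities changed together) and confidence (the fraction of changes to X that also changed Y). The function names `mine_cochange_rules` and `suggest`, the thresholds, and the sample entities are hypothetical and chosen for illustration only; this is not the ROSE implementation.

```python
from collections import Counter
from itertools import permutations


def mine_cochange_rules(transactions, min_support=2, min_confidence=0.5):
    """Derive rules "X changed => Y changed" from sets of co-changed entities.

    Returns a list of (x, y, support, confidence) tuples, best rules first.
    """
    entity_count = Counter()  # number of transactions that touch each entity
    pair_count = Counter()    # number of transactions that touch both x and y
    for tx in transactions:
        entities = set(tx)
        entity_count.update(entities)
        # Every ordered pair (x, y) with x != y is a candidate rule x => y.
        pair_count.update(permutations(sorted(entities), 2))

    rules = []
    for (x, y), support in pair_count.items():
        confidence = support / entity_count[x]
        if support >= min_support and confidence >= min_confidence:
            rules.append((x, y, support, confidence))
    # Highest confidence first, then highest support, then by name.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


def suggest(rules, changed):
    """List the entities that were usually changed together with `changed`."""
    return [y for x, y, _, _ in rules if x == changed]


# Hypothetical history: each set is one transaction.
history = [
    {"parse()", "lex()"},
    {"parse()", "lex()", "emit()"},
    {"parse()", "README"},
    {"emit()"},
]
rules = mine_cochange_rules(history)
print(suggest(rules, "parse()"))
```

With this sample history, only the pair `parse()` and `lex()` meets the support threshold, so a programmer who changes `parse()` would be pointed to `lex()`. Pairwise rules keep the sketch short; rules with several entities on the left-hand side would need a frequent-itemset algorithm instead.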


  • Mining Version Histories to Guide Software Changes. T. Zimmermann, P. Weißgerber, S. Diehl, A. Zeller. Proc. 26th International Conference on Software Engineering (ICSE), Edinburgh, UK, May 2004. [PDF]
    Abstract. We apply data mining to version histories in order to guide programmers along related changes: "Programmers who changed these functions also changed...". Given a set of existing changes, such rules a) suggest and predict likely further changes, b) show up item coupling that is undetectable by program analysis, and c) prevent errors due to incomplete changes. Our evaluation shows that after an initial change, our ROSE prototype can correctly predict 26% of further files to be changed, and 15% of the precise functions or variables. 30% of the suggested files and 26% of the suggested functions or variables were correct predictions.

  • Preprocessing CVS Data for Fine-Grained Analysis. T. Zimmermann, P. Weißgerber. Proc. International Workshop on Mining Software Repositories (MSR), Edinburgh, UK, May 2004. [PDF]
    Abstract. All analyses of version archives have one phase in common: the preprocessing of data. Preprocessing has a direct impact on the quality of the results returned by an analysis. In this paper we discuss four essential preprocessing tasks necessary for a fine-grained analysis of CVS archives: data extraction, transaction recovery, mapping of changes to fine-grained entities, and data cleaning. We formalize the concept of sliding time windows and show how commit mails can relate revisions to transactions. We also present two approaches that map changes to the affected building blocks of a file, e.g. functions or sections.

  • How History Justifies System Architecture (or not). T. Zimmermann, S. Diehl, A. Zeller. Proc. International Workshop on Principles of Software Evolution (IWPSE 2003), Helsinki, Finland, September 2003. [PDF]
    Abstract. The revision history of a software system conveys important information about how and why the system evolved in time. The revision history can also tell us which parts of the system are coupled by common changes: "Whenever the database schema was changed, the sqlquery() method was altered, too." This "evolutionary" coupling can be compared with the coupling as imposed by the system architecture; differences indicate anomalies which may be subject to restructuring. Our ROSE prototype analyzes fine-grained coupling between software entities as indicated by common changes. It turns out that common changes are a good indicator for modularity, that evolutionary coupling should be determined between syntactical entities (rather than files or modules), and that common changes can indicate coupling between software entities and non-program artifacts that is unavailable to the analysis of a single version.
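The transaction recovery step mentioned in the preprocessing abstract above can be sketched as follows. CVS records each file revision separately, so revisions have to be regrouped into the commits they came from. This Python sketch assumes that revisions belong to the same transaction when they share author and log message and each one follows the previous one within a time window; because the window is measured from the most recent revision, it "slides" along with the transaction. The `Revision` type, the function name `recover_transactions`, and the 200-second default are assumptions for illustration, not the paper's exact formalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    path: str
    author: str
    message: str
    timestamp: int  # check-in time in seconds


def recover_transactions(revisions, window=200):
    """Group single-file revisions into transactions with a sliding time window.

    A revision joins the open transaction of the same author and log message
    if it was checked in at most `window` seconds after that transaction's
    latest revision; otherwise it starts a new transaction.
    """
    transactions = []
    open_tx = {}  # (author, message) -> most recent transaction for that key
    for rev in sorted(revisions, key=lambda r: r.timestamp):
        key = (rev.author, rev.message)
        current = open_tx.get(key)
        if current is not None and rev.timestamp - current[-1].timestamp <= window:
            current.append(rev)  # the window now slides to this revision
        else:
            current = [rev]
            open_tx[key] = current
            transactions.append(current)
    return transactions


# Hypothetical revision log.
log = [
    Revision("a.c", "alice", "fix parser", 0),
    Revision("b.c", "alice", "fix parser", 150),
    Revision("c.c", "alice", "fix parser", 300),   # 150 s after b.c: same transaction
    Revision("d.c", "alice", "fix parser", 1000),  # too late: new transaction
    Revision("e.c", "bob", "fix parser", 10),      # different author
]
for tx in recover_transactions(log):
    print([rev.path for rev in tx])
```

A fixed window measured from the first revision would split the `a.c`, `b.c`, `c.c` group, since `c.c` arrives 300 seconds after `a.c`; the sliding variant keeps long-running commits together.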


  • eROSE: Guiding Programmers in Eclipse



Last updated: 2017-01-03 21:10