Mining Software Archives
Predicting Vulnerable Components

Software Engineering Chair (Prof. Zeller)
Saarland University – Computer Science
Campus E9 1 (CISPA)
66123 Saarbrücken, Germany
E-mail: zeller @
Phone: +49 681 302-70970



The Software Evolution project at the Software Engineering Chair, Saarland University, analyzes version and bug databases to predict failure-prone modules, related changes, and future development activities.

What's new

  • We have mined the Mozilla history to predict vulnerabilities.
    • One of the most popular German computer magazines (c't) covered our tool (c't 2007, Issue 21, page 52)
    • Read the paper, accepted at the ACM conference on Computer and Communications Security 2007 (ACM CCS 07). Our paper is the only accepted paper by a German research group. Researchers submitted 303 papers to the conference, out of which 55 were accepted.
    • In January, we created a list of 10 components that we predicted were the most likely to contain unknown vulnerabilities. Five of those 10 components needed to be changed in the last six months because vulnerabilities were actually discovered.

Tell me what you import, and I'll tell you how vulnerable you are.

We observed that a component's domain, as expressed by the other components it interacts with, characterizes its vulnerability. In the case of Mozilla, for instance, we found that of the 14 components importing nsNodeUtils.h, 13 (93%) had to be patched because of security vulnerabilities. The situation is even worse for the 15 components that import nsIContent.h, nsIInterfaceRequestorUtils.h and nsContentUtils.h together: all of them had vulnerabilities. In other words: "Tell me what you import, and I'll tell you how vulnerable you are."
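The percentages above are conditional frequencies: among the components that import a given header (or set of headers), what fraction has had a vulnerability? A minimal Python sketch of that computation follows. The `imports` mapping, the `vulnerable` set, and the component names `compA` to `compD` are hypothetical toy data for illustration, not actual Mozilla results.

```python
def vulnerable_fraction(imports, vulnerable, headers):
    """Return (number of importers, vulnerable fraction) for `headers`.

    imports:    dict mapping component name -> set of imported headers
    vulnerable: set of component names with at least one known vulnerability
    headers:    headers that a component must import together
    """
    headers = set(headers)
    importers = [c for c, hs in imports.items() if headers <= hs]
    if not importers:
        return 0, None  # no component imports this combination
    hits = sum(1 for c in importers if c in vulnerable)
    return len(importers), hits / len(importers)


# Toy data (hypothetical, for illustration only).
imports = {
    "compA": {"nsNodeUtils.h", "nsIContent.h"},
    "compB": {"nsNodeUtils.h"},
    "compC": {"nsIContent.h"},
    "compD": {"nsNodeUtils.h", "nsContentUtils.h"},
}
vulnerable = {"compA", "compB"}

count, fraction = vulnerable_fraction(imports, vulnerable, {"nsNodeUtils.h"})
print(f"{count} importers, {fraction:.0%} vulnerable")
```

With the real figures quoted above, 13 vulnerable components out of 14 importers gives 13/14, which rounds to 93%.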

Our technique allows us to map related source files (called components) to vulnerabilities. When we do that, we get a map that tells us how vulnerable components have been in the past. (Click the map for a larger version; the map is also available in PDF.)

Vulnerability Map Thumbnail

In this map, components with no known vulnerabilities appear in white, and components with vulnerabilities appear in shades of red: the redder a component, the more vulnerabilities it has had in the past.
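One way such a map could be built is sketched below in Python. It rests on assumptions that are not stated on this page: a component is taken to be all files sharing a path once the extension is dropped, each vulnerability report is linked to the files changed by its fix, and the colour scale is linear. The report ids and the `fixes` data are hypothetical.

```python
from collections import Counter
import posixpath


def component_of(path):
    """Map a source file to its component by dropping the file extension."""
    root, _ext = posixpath.splitext(path)
    return root


def vulnerability_counts(fixes):
    """Count, per component, how many vulnerability reports touched it.

    fixes: dict mapping a report id -> list of files changed by its fix
    """
    counts = Counter()
    for files in fixes.values():
        # A report counts once per component, even if it touched
        # both the header and the implementation file.
        for component in {component_of(f) for f in files}:
            counts[component] += 1
    return counts


def shade(count, max_count):
    """Return an RGB hex colour: white for 0, full red for max_count."""
    if count <= 0 or max_count <= 0:
        return "#ffffff"
    fade = max(0, round(255 * (1 - count / max_count)))
    return "#ff{0:02x}{0:02x}".format(fade)


# Toy data (hypothetical report ids and file lists).
fixes = {
    "REPORT-1": ["js/src/jsscope.c", "js/src/jsscope.h"],
    "REPORT-2": ["js/src/jsscope.c", "layout/base/nsFrameManager.cpp"],
}
counts = vulnerability_counts(fixes)
worst = max(counts.values())
for component, n in counts.most_common():
    print(component, n, shade(n, worst))
```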

This mapping also lets us build a predictor that assesses new components as they are added to the Mozilla source code. In an evaluation, we found that it correctly identifies about half of the vulnerable components and that about 70% of its predictions are correct; in other words, about 30% of the components it flags turn out to be false positives. This compares very well to other approaches.
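The two evaluation figures correspond to what is usually called recall (the share of vulnerable components that are found) and precision (the share of flagged components that really are vulnerable); the 30% figure is one minus the precision. A small Python sketch with made-up component names, chosen only so that the numbers match the rounded figures above:

```python
def precision_recall(predicted, actual):
    """Precision and recall for two sets of component names."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Toy data (hypothetical): 14 vulnerable components, 10 flagged, 7 of them rightly.
actual = {f"vuln{i}" for i in range(14)}
predicted = {f"vuln{i}" for i in range(7)} | {"clean0", "clean1", "clean2"}

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f} "
      f"wrongly flagged={1 - precision:.2f}")
```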

In January, we created a list of the 10 components that our method flagged as most likely to be vulnerable. Since then, 5 of those components have needed fixes because of vulnerabilities; see the following table. This shows that we can actually predict unknown vulnerabilities.

Rank  Component
1     js/src/jsxdrapi *
2     js/src/jsscope *
7     layout/xul/base/src/nsSliderFrame *
9     layout/tables/nsTableRowFrame *
10    layout/base/nsFrameManager *
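A check of this kind can be sketched as follows: rank the components by predicted score, keep the top k, and mark which of them were fixed later. This page does not say how the ranking was scored, so the `scores` values, the component names, and the `later_fixed` set below are hypothetical.

```python
def top_k_hits(scores, later_fixed, k=10):
    """Rank components by score and flag which of the top k were fixed later.

    scores:      dict mapping component name -> predicted score (higher = riskier)
    later_fixed: set of components that needed a vulnerability fix afterwards
    """
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    return [(rank, name, name in later_fixed)
            for rank, name in enumerate(ranked, start=1)]


# Toy data (hypothetical scores and outcomes).
scores = {"compA": 0.9, "compB": 0.8, "compC": 0.7, "compD": 0.6}
later_fixed = {"compA", "compC"}

rows = top_k_hits(scores, later_fixed, k=3)
for rank, name, fixed in rows:
    print(rank, name, "*" if fixed else "")
print(sum(fixed for _, _, fixed in rows), "of", len(rows), "confirmed")
```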

Would a new Mozilla component importing nsNodeUtils.h be prone to vulnerabilities as well?


  • Predicting Vulnerable Software Components. S. Neuhaus, T. Zimmermann, C. Holler, A. Zeller. Universität des Saarlandes, Saarbrücken, Germany, February 2007. Accepted at ACM CCS 07, Alexandria, Virginia, USA. [PDF]
    Abstract. Where do most vulnerabilities occur in software? Our Vulture tool automatically mines existing vulnerability databases and version archives to map past vulnerabilities to components. The resulting ranking of the most vulnerable components is a perfect base for further investigations on what makes components vulnerable.

    In an investigation of the Mozilla vulnerability history, we surprisingly found that components that had a single vulnerability in the past were generally not likely to have further vulnerabilities. However, components that had similar imports or function calls were likely to be vulnerable.

    Based on this observation, we were able to extend Vulture by a simple predictor that correctly predicts about half of all vulnerable components, and about two thirds of all predictions are correct. This allows developers and project managers to focus their efforts where they are needed most: "We should look at nsXPInstallManager because it is likely to contain yet unknown vulnerabilities."


Interested in the raw data that was used for the statistical analysis? Just drop me a note.



Updated: 2014-03-23 23:45