Mining Patterns and Violations using Concept Analysis
by Christian Lindig

Universität des Saarlandes, Saarbrücken, Germany, June 2007. Unpublished manuscript.

See also

More information is available at http://www.st.cs.uni-saarland.de/publications/.

Abstract

Large programs develop patterns in their implementation and behavior that can be used for defect mining. Previous work used frequent itemset mining to detect such patterns and their violations, which correlate with defects. However, frequent itemset mining pays much more attention to patterns than to the instances of these patterns. We propose a more general framework to understand and mine purely structural patterns and violations. By combining patterns and their instances into blocks, we gain access to the rich theory of formal concepts. This results in a novel geometric interpretation, which helps to understand previous mining approaches. Blocks form a hierarchy in which each block corresponds to a pattern and each pair of neighboring blocks corresponds to a violation. Furthermore, blocks can be computed efficiently and searched for violations. Using our open-source tool Colibri, we mined patterns and violations from five open-source projects, including the Linux kernel, in less than a minute each.
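
To make the notion of blocks concrete, here is a minimal Python sketch. It is not Colibri's algorithm and does not use its interface. It assumes a context that maps each object (for example a function) to its set of features (for example the functions it calls). A block pairs a feature set with exactly the objects that share it. A violation is read off a pair of neighboring blocks, with confidence taken to be the ratio of their object counts. The names `blocks`, `find_violations`, `min_support` and `min_confidence`, the sample data, and the choice to skip superblocks with an empty feature set are all assumptions made for illustration. The enumeration is exponential in the number of features, so it suits toy inputs only.

```python
from itertools import combinations


def common_objects(features, context):
    """All objects whose feature set contains every feature in `features`."""
    return frozenset(o for o, fs in context.items() if features <= fs)


def common_features(objects, context, all_features):
    """All features shared by every object in `objects`."""
    shared = set(all_features)
    for o in objects:
        shared &= context[o]
    return frozenset(shared)


def blocks(context):
    """Naively enumerate all blocks (formal concepts) of a small context."""
    all_features = frozenset().union(*context.values())
    found = set()
    for k in range(len(all_features) + 1):
        for subset in combinations(sorted(all_features), k):
            objs = common_objects(frozenset(subset), context)
            feats = common_features(objs, context, all_features)
            found.add((objs, feats))
    return found


def find_violations(context, min_support=2, min_confidence=0.6):
    """Report (pattern, violating objects, confidence) for neighboring blocks.

    Assumption: confidence = |objects of block| / |objects of superblock|.
    """
    all_blocks = blocks(context)
    report = []
    for objs, feats in all_blocks:
        if len(objs) < min_support or not feats:
            continue
        for sup_objs, sup_feats in all_blocks:
            # The superblock must hold strictly more objects and share
            # at least one feature of the pattern (illustrative choice).
            if not (objs < sup_objs and sup_feats):
                continue
            # Neighbors only: no other block lies strictly in between.
            if any(objs < mid < sup_objs for mid, _ in all_blocks):
                continue
            confidence = len(objs) / len(sup_objs)
            if confidence >= min_confidence:
                report.append((feats, sup_objs - objs, confidence))
    return report


# Hypothetical data: functions f1..f5 and the calls they make.
context = {
    "f1": {"lock", "unlock"},
    "f2": {"lock", "unlock"},
    "f3": {"lock", "unlock"},
    "f4": {"lock"},
    "f5": {"open"},
}

for pattern, violators, confidence in find_violations(context):
    print(sorted(pattern), sorted(violators), confidence)
```

In this toy context three functions call both `lock` and `unlock`, while `f4` calls only `lock`. The block for the pattern `{lock, unlock}` and its neighbor for `{lock}` differ by exactly `f4`, so `f4` is reported as violating the pattern.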

BibTeX Entry

@techreport{lindig-tr-2007,
    title = "Mining Patterns and Violations using Concept Analysis",
    author = "Christian Lindig",
    year = "2007",
    month = jun,
    institution = "Universität des Saarlandes, Saarbrücken, Germany",
}