Delta Debugging
From automated testing to automated debugging

Software Engineering Chair (Prof. Zeller)
Saarland University – Computer Science
Campus E1 1
66123 Saarbrücken, Germany
E-mail: zeller @ cs.uni-saarland.de
Phone: +49 681 302-70970

Deutschsprachige Startseite Page d'acceuil en franšais English home page
  

The Delta Debugging project at Software Engineering Chair, Saarland University investigates an automated debugging approach based on systematic testing. With Delta Debugging, we can find failure-inducing circumstances automatically—circumstances such as the program input, changes to the program code, or program executions.

What's New

About Delta Debugging

  • Delta Debugging automates the scientific method of debugging. The basic idea of the scientific method is to establish a hypothesis on why something does not work. You test this hypothesis, and you refine or reject it depending on the test outcome. When debugging, people are doing this all the time. Manually. Delta Debugging automates this process. Read more...

    Narrowing down possible failure causes

  • The most exciting application so far is to isolate the entire cause-effect chain of a failure: "Initially, variable v1 was x1, thus variable v2 became x2, thus variable v3 became x3 ... and thus the program failed." This explains failure causes automatically and effectively, for programs as large as the GNU C compiler. This paper won the ACM SIGSOFT Distinguished Paper Award. Read more...

  • The locations at which a cause-effect chain changes its variables are actually likely defects. Thus allows us to locate defects that cause a given failure automatically. Read more...

  • As a simple application, consider a program that fails when given some input. With Delta Debugging, you can isolate and minimize the failure-inducing input automatically. For instance, if your browser crashes on a 10,000-line WWW page, Delta Debugging can determine the failure-inducing HTML tag. Read more...

  • As another application, consider a program and a number of changes to the program code. After applying the changes, the program no longer works. With Delta Debugging, you can identify the failure-inducing changes automatically. Read more...

  • Yet another application is the isolation of failure-inducing executed statements - that is, the events during execution which were critical for producing the failure. This work is at an early stage. Read more...

  • Further applications are the identification of failure-inducing schedules (e.g. race conditions due to nondeterministic behavior) or the isolation of failure-inducing control statements (i.e. which branches taken were relevant and which not).

Papers

  • Isolating Relevant Component Interactions with JINSI. Alessandro Orso, Shrinivas Joshi, Martin Burger, and Andreas Zeller; Proc. 2006 International Workshop on Dynamic Analysis (WODA 2006), Shanghai, China, May 2006.
    Abstract. When a component in a large system fails, developers encounter two problems: (1) reproducing the failure, and (2) investigating the causes of such a failure. Our JINSI tool lets developers capture and replay the interactions between a component and its environment, thus allowing for reproducing the failure at will. In addition, JINSI uses delta debugging to automatically isolate the subset of the interactions that is relevant for the failure. In a first study, JINSI has successfully isolated the relevant interaction of a JAVA component: "Out of the 32 interactions with the VendingMachine component, seven interactions suffice to produce the failure."

  • Locating Causes of Program Failures. Holger Cleve and Andreas Zeller; Proc. 27th International Conference on Software Engineering (ICSE 2005), St. Louis, Missouri, May 2005.
    Abstract. Which is the defect that causes a software failure? By comparing the program states of a failing and a passing run, we can identify the state differences that cause the failure. However, these state differences can occur all over the program run. Therefore, we focus in space on those variables and values that are relevant for the failure, and in time on those moments where cause transitions occur—moments where new relevant variables begin being failure causes: “Initially, variable argc was 3; therefore, at shell_sort(), variable a[2] was 0, and therefore, the program failed.” In our evaluation, cause transitions locate the failure-inducing defect twice as well as the best methods known so far.

  • Isolating Cause-Effect Chains from Computer Programs. Andreas Zeller; Proc. ACM SIGSOFT 10th International Symposium on the Foundations of Software Engineering (FSE-10), Charleston, South Carolina, November 2002.
    Abstract. Consider the execution of a failing program as a sequence of program states. Each state induces the following state, up to the failure. Which variables and values of a program state are relevant for the failure? We show how the Delta Debugging algorithm isolates the relevant variables and values by systematically narrowing the state difference between a passing run and a failing run - by assessing the outcome of altered executions to determine wether a change in the program state makes a difference in the test outcome. Applying Delta Debugging to multiple states of the program automatically reveals the cause-effect chain of the failure - that is, the variables and values that caused the failure.
    In a case study, our prototype implementation successfully narrowed down the cause-effect chain for a failure of the GNU C compiler: ``Initially, the C program to be compiled contained an addition of 1.0; this caused an addition operator in the intermediate RTL representation; this caused a cycle in the RTL tree - and this caused the compiler to crash.''

  • Isolating Failure-Inducing Thread Schedules. Jong-Deok Choi and Andreas Zeller; Proc. International Symposium on Software Testing and Analysis (ISSTA 2002), Rome, Italy, July 2002.
    Abstract. Consider a multi-threaded application that occasionally fails due to non-determinism. Using the DEJAVU capture/replay tool, it is possible to record the thread schedule and replay the application in a deterministic way. By systematically narrowing down the difference between a thread schedule that makes the program pass and another schedule that makes the program fail, the Delta Debugging approach can pinpoint the error location automatically - namely, the location(s) where a thread switch causes the program to fail. In a case study, Delta Debugging isolated the failure-inducing schedule difference from 3.8 billion differences in only 50 tests.

  • Automated Debugging: Are We Close? Andreas Zeller; IEEE Computer, November 2001.
    Abstract. Debugging is still one of the hardest, yet least systematic activities of software engineering. The Delta Debugging algorithm isolates failure causes automatically - by systematically narrowing down failure-inducing circumstances until a minimal set remains. Delta Debugging has been applied to isolate failure-inducing program input (e.g. a HTML page that makes a Web browser fail), failure-inducing user interaction (e.g. the keystrokes that make a program crash), or failure-inducing changes to the program code (e.g. after a failing regression test).
    Delta Debugging is fully automatic; all it requires is an automated test that detects whether the expected failure is present or not. The method is introduced using real-life programs with real-life bugs.

    Generating and testing hypotheses

  • Simplifying and Isolating Failure-Inducing Input. Andreas Zeller and Ralf Hildebrandt; IEEE Transactions on Software Engineering 28(2), February 2002, pp. 183-200.
    Abstract. Given some test case, a program fails. Which circumstances of the test case are responsible for the particular failure? The Delta Debugging algorithm generalizes and simplifies some failing test case to a minimal test case that still produces the failure; it also isolates the difference between a working and a failing test case.
    In a case study, the Mozilla web browser crashed after 95 user actions. Our prototype implementation automatically simplified the input to 3 relevant user actions. Likewise, it simplified 896 lines of HTML to the single line that caused the failure. The case study required 139 automated test runs, or 35 minutes on a 500 MHz PC.

  • Finding Failure Causes through Automated Testing. Holger Cleve, Andreas Zeller; Proc. Fourth International Workshop on Automated Debugging, Munich, Germany, 28-30 August 2000.
    Abstract. A program fails. Under which circumstances does this failure occur? One single algorithm, the delta debugging algorithm, suffices to determine these failure-inducing circumstances. Delta debugging applies the scientific method of debugging to narrow down the set of failure-inducing circumstances automatically - circumstances such as the program input, changes to the program code, or executed statements.

  • Yesterday, my program worked. Today, it does not. Why? Andreas Zeller; Proc. ESEC/FSE 99, Toulouse, France, September 1999, Vol. 1687 of LNCS, pp. 253-267.
    Abstract. Imagine some program and a number of changes. If none of these changes is applied (``yesterday''), the program works. If all changes are applied (``today''), the program does not work. Which change is responsible for the failure? We present an efficient algorithm that determines the minimal set of failure-inducing changes. Our delta debugging prototype tracked down a single failure-inducing change from 178,000 changed GDB lines within a few hours.

Software

We do our best to make all of our software available to the general public. Be aware, though, that they are prototypes. Here are some packages you can play with:
  • We are building a set of delta debugging plug-ins for the Eclipse programming environment.
  • The AskIgor debugging service is a public Web service around an extended HOWCOME prototype, isolating cause-effect chains from failing Linux programs.
  • Igor is also available for download as a command-line tool (written in Python + licensed under the GPL)
  • HOWCOME demonstrator. This demonstrator shows how to isolate cause-effect chains from computer programs. ZIP archive for Linux PCs.
  • DD.py, the Delta Debugging core module, realizes the Delta Debugging algorithm as a Python class. By providing your own subclass with its own _test method, you can realize arbitrary applications. See the tutorial for details.
  • XLAB-980917, the last public snapshot of Marc Vertes' XLAB. This is used for recording and playing back X user input.

Keep me posted

People

<webmaster@st.cs.uni-saarland.de> · http://www.st.cs.uni-saarland.de/dd/ · Updated: 2014-03-23 23:32