We are committed to reproducible research and to making our tools freely available. Since our tools, techniques and benchmark suites evolve and improve over time, there is no single all-inclusive package. Instead, for major papers, we try to make available a snapshot of the programs and benchmarks used in that paper, so that others can reproduce or extend those experiments.
There are several sources of code used in our publications:
ManyBugs and IntroClass
The RepairBenchmarks website documents the ManyBugs and IntroClass benchmarks, which are described in TSE 2015, and includes the baseline experimental results for GenProg, AE, and TRPAutoRepair.
105 GenProg ICSE 2012 Program Bugs
These scenarios and results were used for the systematic study on program repair published in ICSE 2012 (Paper), and the study of representation and operator choices in GECCO 2012 (Paper). Note: these benchmarks are deprecated. We include these results for completeness, but we discourage their use in future work. Instead, the TSE 2015 benchmarks release (above) includes important corrections.
Virtual Machine Images Buggy Programs
TSE 2012 Bugs
These programs were used in the experiments in the TSE 2012 paper; they are a superset of the programs/bugs used in ICSE 2009 and GECCO 2009. The virtual machine image demonstrates the wu-ftpd repair described in that paper. Instructions assume GenProg v1.0.
Virtual Machine Images VM Instructions Buggy Programs Workloads
In GECCO 2010, we investigated alternative fitness functions for test-guided APR. Instructions assume GenProg v1.0.
2009 Buggy Programs
These experiments cover the GenProg publications in both ICSE 2009 and GECCO 2009. Instructions assume GenProg v1.0.
The ASPLOS 2013 paper includes results on the Software-artifact Infrastructure Repository.
ASE 2013 (Paper)
These experiments relate to the Adaptive Equality repair algorithm that uses an approximation to program equivalence to reduce the search and introduces on-line learning strategies to order test cases and repairs.
Code Experimental Results
ICSE 2012 (Paper)
A systematic study of program repair. These experiments were conducted on AWS, using images that we have converted to VirtualBox format. The READMEs also point to a publicly available AMI. Please use ManyBugs for all future experiments.
Code Virtual Machine Images Experimental Results
These results cover the GenProg publications in both ICSE 2009 and GECCO 2009.
These include repair results for various genetic algorithm parameter values.
In GECCO 2010, we investigated alternative fitness functions for test-guided APR.
This dataset includes the subject code and questions presented to humans, as well as the human responses.
These results relate to neutral mutants and software mutational robustness. Experimental results for higher order neutral mutants are also available.
Benchmarks Sorting Programs Experimental Results
Pacific Graphics 2015
These experiments use GenProg-like approaches to automatically generate band-limited procedural shaders.
Dataset