Repeatability as a Core Value in March CACM: For Software and Education
Repeatability presumes evidence, which can itself be repeated. Computer scientists have not valued evidence and repeatability as much as we need to for rigor and scientific advancement, and that holds in computing education, too. One of my favorite papers by Michael Caspersen is his "Mental models and programming aptitude" paper from ITiCSE 2007, in which he and his colleagues attempt to replicate the results of the famous and controversial Dehnadi and Bornat paper. Caspersen and his colleagues are unable to replicate the result, and they propose a research method for understanding the differences. That is good science: attempting to replicate another's result, and then developing the next steps to understand the differences.
Science advances faster when we can build on existing results, and when new ideas can easily be measured against the state of the art. This is exceedingly difficult in an environment that does not reward the production of reusable software artifacts. Our goal is to get to the point where any published idea that has been evaluated, measured, or benchmarked is accompanied by the artifact that embodies it. Just as formal results are increasingly expected to come with mechanized proofs, empirical results should come with code.
If a paper makes, or implies, claims that require software, those claims must be backed up by the software itself.