Automatically grading programming homework: Echoes of Proust
I’d love to see this new system from MIT compared to Lewis Johnson’s Proust. Proust also found semantic bugs in students’ code. Lewis (and Elliot Soloway and Jim Spohrer) collected hundreds of bugs when students were working on the Rainfall Problem, then looked for those bugs in students’ programs. Proust caught about 85% of students’ semantic errors. That last 15% covered so many different bugs that it wasn’t worthwhile to encode the semantic check rules, since each rule would only fire once, ever. My guess is that Proust, which knew what problem the students were working on, would do better than the MIT homework checker, which can only encode general mistakes.
The new system does depend on a catalogue of the types of errors that student programmers tend to make. One such error is to begin counting from zero on one pass through a series of data items and from one in another; another is to forget to add the condition of equality to a comparison — as in, “If a is greater than or equal to b, do x.”
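To make those two catalogue entries concrete, here is a minimal Python sketch of what each mistake looks like in a student's code. It is my own illustration, not code from the MIT system, and every function name in it is hypothetical.

```python
def total_from_zero(items):
    # Visits every item: indices 0 .. len(items) - 1.
    return sum(items[i] for i in range(0, len(items)))


def total_from_one(items):
    # Same loop, but counting starts at one, so items[0] is silently skipped.
    return sum(items[i] for i in range(1, len(items)))


def at_least(a, b):
    # Intended comparison: a is greater than or equal to b.
    return a >= b


def at_least_missing_equality(a, b):
    # The equality condition was forgotten; the result is wrong only when a == b.
    return a > b
```

Both bugs are quiet: the off-by-one loop gives the right total whenever the first item happens to be zero, and the strict comparison misbehaves only on ties, which is part of why students tend to miss them.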
The first step for the researchers’ automated-grading algorithm is to identify all the spots in a student’s program where any of the common errors might have occurred. At each of those spots, the possible error establishes a range of variations in the program’s output: one output if counting begins at zero, for instance, another if it begins at one. Every possible combination of variations represents a different candidate for the corrected version of the student’s program.
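The enumeration described above can be sketched as a toy in Python. This is a rough illustration under several assumptions of mine, not the researchers' algorithm: the student program is hand-written with its two suspect spots exposed as parameters, candidates are checked against a reference solution on a handful of test inputs, and the passing candidate that changes the fewest spots is reported. All names (`student_count`, `CHOICES`, `smallest_fix`, and so on) are hypothetical.

```python
from itertools import product


def student_count(values, limit, start, inclusive):
    # Hypothetical student program: count the values that are at least `limit`.
    # The two spots where a common error might occur are exposed as parameters.
    count = 0
    for i in range(start, len(values)):
        hit = values[i] >= limit if inclusive else values[i] > limit
        if hit:
            count += 1
    return count


def reference_count(values, limit):
    # Instructor's reference solution (assumed to be available).
    return sum(1 for v in values if v >= limit)


# What the student actually wrote: counting from one, equality forgotten.
SUBMITTED = {"start": 1, "inclusive": False}

# The variations allowed at each suspect spot.
CHOICES = {"start": [0, 1], "inclusive": [True, False]}

# Assumed test inputs as (values, limit) pairs.
TESTS = [([5, 3, 8], 5), ([1, 2, 3], 3), ([7], 7), ([], 0)]


def candidates():
    # Every combination of variations is one candidate corrected program.
    names = list(CHOICES)
    for combo in product(*(CHOICES[name] for name in names)):
        yield dict(zip(names, combo))


def passes(candidate):
    # A candidate is accepted if it agrees with the reference on every test.
    return all(
        student_count(values, limit, **candidate) == reference_count(values, limit)
        for values, limit in TESTS
    )


def changes(candidate):
    # Number of spots where the candidate differs from the submission.
    return sum(candidate[name] != SUBMITTED[name] for name in SUBMITTED)


def smallest_fix():
    # Assumption: prefer the passing candidate that changes the fewest spots.
    working = [c for c in candidates() if passes(c)]
    return min(working, key=changes) if working else None
```

With two spots and two variations each there are only four candidates, but the count multiplies with every additional spot, so brute-force enumeration like this would not scale to real programs.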