## Recursion by Pirolli (1991)

I heard Greg Wilson’s request for me to talk about the papers I’m reading (especially since I owe him a chapter which is a review of literature), so I thought I’d talk about one that Barb and I have been thinking a lot about lately: Effects of Examples and Their Explanations in a Lesson on Recursion: A Production System Analysis by Peter Pirolli (1991), in Cognition and Instruction, 8(3), 207-259.  In this paper, Pirolli describes two studies where he explores what kinds of examples are useful in learning to write recursive functions, and how the characteristics of the examples influence what errors students make when they write their own recursive functions.  It’s a dense paper, with some sophisticated quantitative analysis of error rates.

I was interested in this paper as one of the first in computer science to build upon Sweller’s worked examples research.  Pirolli was explicitly trying to understand the role of examples in problem-solving and the usefulness of different kinds of examples.  The first interesting tidbit that I got from this paper is how many examples Pirolli sees as necessary to learn the basics of the language.  He’s teaching students to write recursive functions with a version of Lisp (called “SIMPLE”) with only 7 primitives. Here’s an example SIMPLE program:
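The program itself appeared as a figure in the original post. As a stand-in, here is a rough Python analogue (my own sketch — not actual SIMPLE syntax, which used Lisp-like prefix operators) of a recursive function built from only first/rest-style list primitives:

```python
# Hypothetical analogue of a SIMPLE-style recursive function, written
# with only "first"/"rest"-style list primitives. Function names here
# are my own, not SIMPLE's actual primitives.

def first(lst):
    return lst[0]

def rest(lst):
    return lst[1:]

def insert(item, lst):
    return [item] + lst

def firsts(lst):
    """Return a list of the first element of each sublist."""
    if lst == []:                      # base case: the empty list
        return []
    return insert(first(first(lst)),   # head of the first sublist
                  firsts(rest(lst)))   # recurse on the remaining sublists

print(firsts([[1, 2], [3, 4], [5, 6]]))  # → [1, 3, 5]
```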

During the training phase, when he’s just bringing people up to speed on these primitives, he provides 8 examples for each of the 7 primitives.  56 individual examples (where he shows the primitive applied to a list, the student is asked to guess the result, and if the guess fails, the system provides the result) is a lot of examples just to get students familiar with the language.  When you teach CS1, do you show 8 examples of for loops before students try to use them in an assignment?

The most interesting lesson I learned about recursive examples from this paper comes from the two conditions in his first experiment.  In one condition, the recursion examples that students work through are about how to write a recursive function (e.g., “here’s the base case” and “here’s how it recurses”).  In the other condition, the recursion examples are about the dynamics of how it works (e.g., “first there’s this call, then the same function is called but with this input, then the result is returned…”), like this:
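Pirolli’s materials appeared as figures in the original post. As my own illustration of the contrast (not Pirolli’s actual examples), here is the same function annotated in both styles:

```python
# "How to write" style: the example labels the structural parts.
def length(lst):
    if lst == []:               # here's the base case: empty list has length 0
        return 0
    return 1 + length(lst[1:])  # here's how it recurses: one more than the rest

# "How it works" style: the example walks through the dynamics of the calls.
#   length([7, 8, 9])
#     -> 1 + length([8, 9])
#          -> 1 + length([9])
#               -> 1 + length([])
#                    -> 0        (base case reached)
#               <- 1
#          <- 2
#     <- 3
print(length([7, 8, 9]))  # → 3
```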

Here’s the bottom line of what he finds: Getting students through the “how to write” examples took on average 57 minutes, while the “how it works” examples took an average of 85 minutes.  There was no statistical difference in performance on a post-test on writing recursive functions, though the “how to write” group had slightly fewer errors.

Even more intriguing is the discussion where Pirolli relates these findings to others in John Anderson’s group at the time which suggest, “that knowledge about how recursive functions are written is different from knowledge about how they work” and “that there is little transfer of how-it-works knowledge to function-writing tasks and, more interestingly, that extensive additional practice with simulating the evaluation of programs yields no significant benefit in debugging tasks when compared with extensive practice just coding programs.”  Writing code and tracing code are completely different tasks.

Barb is helping to teach an AP CS class this semester, and she’s teaching recursion right now.  She’s basing how she teaches recursion on Pirolli’s results.  Her first activities have the students looking at recursive functions and highlighting the base case and the recursive call — just figure out the structure.  Then they write some recursive functions. This is Pirolli’s Experiment #1 process, which takes students less time, giving them an early success with less frustration.  Next, she’ll get into debugging recursive functions, which Pirolli suggests is really a different task entirely.

Pirolli’s paper isn’t the definitive statement on teaching recursion or using worked examples.  If it were, he wouldn’t have gone on to write several more papers, including several with his students at Berkeley on using examples to learn recursion.  It is a nice paper that provides good evidence with some practical implications for how we teach.


• 1. Alan Kay  |  March 31, 2010 at 9:25 am

My guess (or maybe a little more than a guess) here is that there are several confounders for these results, including the S-expression syntax, and the insistence on using a stack for “how this works”. (And especially since there is undoubtedly already a version of “how this works” in operation for the “how to write” part.)

It’s highly likely that using a “textual expander” (TE) here is a better way to accomplish both ends, and to deal with both syntax and semantics issues. (The stack is a red herring because it is a pragmatic device that needlessly adds complexity.)

A TE is a program that shows a trace in an understandable form, by rewriting (and expanding) the code to flatten out the recursive calls as though they were written inline with the parameters in mind. There are lots of bad ways to show this (most traces are abysmal at being clear about what is going on), but there have been important exceptions (and in any case any decent UI designer, once primed for this as important, can devise just what the trace should look like).
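A minimal sketch of the idea (my own toy, not an existing tool): evaluate a recursive function while printing each call rewritten inline with its argument substituted, so the recursion reads as a flattened expansion:

```python
def expand_factorial(n, depth=0):
    """Evaluate factorial(n), printing each recursive call rewritten
    inline with its argument substituted, as a textual expander might."""
    indent = "  " * depth
    if n == 0:
        print(indent + "factorial(0) = 1")   # base case, fully expanded
        return 1
    print(indent + f"factorial({n}) = {n} * factorial({n - 1})")
    result = n * expand_factorial(n - 1, depth + 1)
    print(indent + f"factorial({n}) = {result}")
    return result

expand_factorial(3)
# factorial(3) = 3 * factorial(2)
#   factorial(2) = 2 * factorial(1)
#     factorial(1) = 1 * factorial(0)
#       factorial(0) = 1
#     factorial(1) = 1
#   factorial(2) = 2
# factorial(3) = 6
```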

An interesting question for some time is whether recursive functions pay their way. It is possible to write a nice little LOGO program to draw a tree with parameters controlling depth, etc. But a much more interesting program is a parallel distributed one that sprouts new branch objects and creates a “tree of objects” in parallel. This is a much stronger idea and technique to learn than recursive functions, and it more strongly illustrates the idea of recursive analysis and design (that “the parts have the same powers as the wholes”). Recursive functions are a rather weak subset of this more powerful idea.

Cheers,

Alan

• 2. Mark Guzdial  |  March 31, 2010 at 10:26 am

Alan, I screwed up here — I accidentally inserted artifacts into my explanation of Pirolli’s work which are inaccurate. I’ve just copied a couple of snapshots from his paper and inserted them above to make things clearer. First, his version of SIMPLE is not entirely S-expression based. I’ve included a snapshot. The operators are prefix S-expressions, but the control structure is not an S-expression. Second, he never says “stack” — that was something I added to explain his approach. I’ve now removed that, and included an actual “process” trace (as he calls it) so that you can see how he’s doing it.

Your points about using a textual expander, and about whether recursive functions pay their way are well taken. My comments about Lisp and stacks accidentally inserted artifacts into the discussion which shouldn’t have been there — they were part of my explanation in summarizing the paper, not part of Pete’s work.

Cheers,
Mark

• 3. Alan Kay  |  April 1, 2010 at 8:14 am

Hi Mark,

This seems better and more reasonable on both counts.

I think there is still a syntax issue, but not as big as with S-expressions. And his textual expander trace really needs to be designed much better (in the Ed Tufte sense — that is: even though it is of “text”, the explanation is graphical and in 2D, so better visual scaffolding should be used here to make the process more clear).

A good question for us all to ponder — and this goes right to the idea of “needing lots of examples” — is “what examples?”.

From my perspective, I’m suspicious of the idea that examples which are essentially sequential, and pretty easily thought of as iterative, make good initial examples for teaching recursion.

But those in which an iterative solution is really messy to think about, but where recursion really works — such as comparing and copying and modifying tree structures — are far more compelling. The pattern for a lot of these is to think of “consing up the result” as making a grammar for the result in which the fractal decomposition is simple and clear.
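For instance (my example, not Alan’s): copying a nested tree while modifying its leaves is awkward to think about iteratively, but by recursion the result is “consed up” to mirror the fractal structure of the input:

```python
def double_leaves(tree):
    """Return a copy of a nested-list tree with every leaf doubled.
    The shape of the result mirrors the shape of the input."""
    if isinstance(tree, list):
        return [double_leaves(branch) for branch in tree]  # rebuild each branch
    return tree * 2  # leaf: the base case

print(double_leaves([1, [2, [3, 4]], 5]))  # → [2, [4, [6, 8]], 10]
```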

Cheers,

Alan

• 4. Mark Guzdial  |  April 1, 2010 at 10:36 am

Hi Alan,

I STRONGLY agree! It’s an issue that I’m thinking a lot about these days. It’s one of the reasons why I assigned a paper for my class today on Learning from Examples: Instructional Principles from the Worked Examples Research, which is by a group of educational psychologists reviewing, critiquing, and summarizing what’s to be learned about what makes a good example. (As we all do, I’m assigning the papers that I want to spend time reading myself. 🙂) I agree — which examples (and how they’re structured and presented) is critical.

Cheers!
Mark

• 5. Greg Wilson  |  March 31, 2010 at 1:43 pm

Thanks! It’s one (Canadian) beer per paper discussed, right? 🙂

• 6. Raymond Lister  |  March 31, 2010 at 3:16 pm

Mark wrote, “The first interesting tidbit that I got from this paper is how many examples Pirolli sees as necessary to learn the basics of the language. … he provides 8 examples for each of the 7 primitives. … When you teach CS1, do you show 8 examples of for loops before students try to use them in an assignment?”

This is consistent with what I learned from an educational psychologist, a few years back. As a rule of thumb, she advised 7 examples. I doubt I’ve ever supplied 7 examples of anything before asking students to write their own code.

And what do intro programming students always ask for when they are surveyed? … more examples!

• 7. Raymond Lister  |  March 31, 2010 at 4:04 pm

Mark wrote, “Writing code and tracing code are completely different tasks.”

That’s a big inference to make from one paper.

Alan Kay has already pithily discussed two problems I have with that inference. Alan wrote, (1) “there is undoubtedly already a version of “how this works” in operation for the “how to write” part” and (2) “There are lots of bad ways to show [a trace] … any decent UI designer, once primed for this as important, can devise just what the trace should look like”.

The Pirolli paper is just one study. Furthermore, experiment 1 — the experiment Mark describes — collected data from only 19 subjects.

The BRACElet project (of which I am a member) has performed multiple studies regarding the relationship between writing and tracing. These studies have collected data from over a thousand subjects, from over a dozen educational institutions. These BRACElet studies have found a consistent link between tracing and writing. From those BRACElet studies, I make the inference that tracing is an enabling skill. That is, tracing is a necessary skill if students are to write code, but tracing skill alone is not sufficient to allow students to write code. The first BRACElet publication to talk about this was a poster at an NACCQ conference:

Philpott, A, Robbins, P., and Whalley, J. (2007) Assessing The Steps on the Road to Relational Thinking.
Proceedings of the 20th Annual NACCQ. Mann, S and Bridgeman, N. (eds), Nelson, NZ, July 8-11, 2007, p. 286.


The most recent BRACElet papers to explore the relationship between tracing and writing are:

Lister, R., Fidge C. and Teague, D. (2009) Further Evidence of a Relationship between Explaining, Tracing and Writing Skills in Introductory Programming. ITiCSE’09, July 6-9, 2009, Paris, France.
http://doi.acm.org/10.1145/1595496.1562930

Venables, A., Tan, G. and Lister, R. (2009) A Closer Look at Tracing, Explaining and Code Writing Skills in the Novice Programmer. The fifth International Computing Education Research Workshop (ICER 2009), Berkeley, California, August 10-11, 2009.
http://doi.acm.org/10.1145/1584322.1584336

The above two papers cite most earlier BRACElet papers.

In the Venables, Tan and Lister paper, take a look at Figure 9, which is a plot of student performance on tracing versus student performance on writing. As we wrote in that paper, “Figure 9 illustrates that students who score poorly on the tracing questions rarely score well on the code writing tasks, but there is no clear relationship with code writing for students who scored well on tracing questions. This suggests a causal relationship, where a minimal level of skill at tracing is necessary for code writing, but that minimal skill at tracing is not sufficient by itself to enable code writing”.

I’ve encountered a lot of intuitive resistance among academics to the idea that tracing is a necessary (but not sufficient) skill for code writing in novices. I think this intuitive resistance is due to the expertise reversal effect (another concept from cognitive load theory). While tracing is an essential skill for the novice, it is a skill that we all outgrow in our development as programmers. We move on to more sophisticated ways of understanding code than tracing it. Furthermore, having moved on in our thinking, we find it distasteful, and even difficult, to go back to tracing code. Our personal distaste leads us to minimize our pedagogic emphasis on tracing … to the cost of many of our students.

In the conclusion of the Lister, Fidge and Teague paper, we wrote:

It is our view that novices only begin to improve their code writing ability via extensive practice in code writing when their tracing and explaining skills are strong enough to support a systematic approach to code writing. Students who are weak at tracing and/or explaining cannot write systematically. Instead, in a desperate effort to pass their course, they will experiment haphazardly with their code − a behavior often reported by computing educators. Until students have acquired minimal competence in tracing and explaining, it may be counter productive to have them write a great deal of code. We do not advocate that students be forced to demonstrate minimal competence in tracing and explaining before they write any code. (Indeed, we suspect that such an approach would lead to motivational problems in students.) However, we do advocate that educators pay greater attention to tracing and explaining in the very early stages of introductory programming courses, and that educators discuss with their students how these skills are used to avoid a haphazard approach to writing code.

• 8. Mark Guzdial  |  March 31, 2010 at 4:35 pm

Hi Raymond! Point well-taken — my claim does go beyond what Pirolli is saying. Still, how do you explain the results of the cognitive tutors research (multiple studies, with n > 19), where tutor-using students learn to write programs (in Pascal and Lisp) better than students in more traditional classes, as measured by post-tests, while tutor-using students never run their programs, never debug, never even have to trace their code? It’s clear that tutor-using students don’t really learn to program, since they don’t learn to trace or debug. But on the other hand, they did well on assessments of writing code.

• 9. Raymond Lister  |  April 1, 2010 at 1:40 am

Could you provide references to the papers you have in mind? (Perhaps you might even review those papers here, as part of your drive to collect Canadian beers from Greg Wilson?) I’m happy to look at any papers that you nominate on this aspect of novice programmers – indeed, I’m happy to look at any papers that you nominate on any aspect of novice programmers.

However, my wife and I are about to embark on a long vacation, around parts of Canada, before returning to Australia (and work) in early-to-mid-June. I doubt I’ll be able to do justice to those papers before mid-June.

• 10. Mark Guzdial  |  April 1, 2010 at 10:29 am

Great suggestion, Raymond! I’ve made several claims about the Cognitive Tutors work that I should defend with a similar review of the papers (in order to chalk up my Canadian beer total 🙂 I do hope you have a great Canadian vacation!

• 11. Raymond Lister  |  April 1, 2010 at 4:06 pm

Here are two thoughts, after sleeping on this thread.

1) The term “tracing” means different things to different people. As Alan Kay pointed out, some presentations of a trace are better than others. I think of tracing hierarchically. At the bottom of the hierarchy is any concrete process where students merely follow the changing values in the variables. As a student begins to see regularities in code, they begin to skip over parts of the concrete trace. For example, a student may see immediately what the values in some variables will be at the completion of each iteration of a loop, without having to track every variable update inside the loop. That is a form of tracing higher up the hierarchy. (The need to move to higher levels of the hierarchy is driven by the student’s need to reduce cognitive load.) Given the idea of such a hierarchy of tracing, the skill of tracing and the skill of writing need no longer be thought of as dichotomous, and the process of learning to write code is related to acquiring skills higher in the tracing hierarchy.

2) Pirolli studied students learning recursion, whereas BRACElet studies students learning loops. What Pirolli saw, and what BRACElet saw, are not necessarily contradictory, as different skill sets may be required to learn these different programming concepts. I suspect that low level concrete tracing is near-essential in everything students learn prior to loops, and concrete tracing remains important when students first learn simple loops … and even (as Alan Kay kind of pointed out) loop-like recursion (i.e. tail recursion). However, by the time students reach more sophisticated and useful forms of recursion (and other programming concepts), low level concrete tracing may no longer be terribly useful. By that time, a student must be able to at least trace at a higher level of the tracing hierarchy.
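The tracing hierarchy in point 1 can be sketched with a simple loop (my illustration): a concrete trace follows every variable update, while a trace higher up the hierarchy jumps straight to what the loop accomplishes per iteration.

```python
# Concrete trace (bottom of the hierarchy): follow every variable update.
total = 0
for i in range(1, 5):
    total += i   # i=1,total=1 ; i=2,total=3 ; i=3,total=6 ; i=4,total=10

# Higher up the hierarchy: skip the step-by-step updates and see at once
# that after the loop, total holds the sum 1+2+3+4.
print(total)  # → 10
```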

• 12. Mark Guzdial  |  April 2, 2010 at 10:28 am

Intriguing ideas! I think it’s also possible to use loops without being able to trace them. The Media Computation students use an idiom like for pixel in getPixels(picture): for several assignments before tracing what a for loop really does. We introduced looping that way, on purpose, based on the work of John Pane and others that showed that non-programmers understood set operations more easily than iteration, so we explain the for loop initially as a set operation: “for all the pixels in the picture, make the pixel change like this.”
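As a rough sketch of that set-operation reading (using a plain list of brightness values as a stand-in for JES’s pictures and pixels):

```python
# Set-operation reading of a for loop: "for all the pixels in the
# picture, make the pixel change like this" — no tracing required.
# Plain integers stand in for pixel brightness values here.
pixels = [100, 150, 200]

brightened = []
for pixel in pixels:                         # "for all the pixels..."
    brightened.append(min(255, pixel + 50))  # "...change each like this"

print(brightened)  # → [150, 200, 250]
```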

I expect that you’re right — at some level, I can’t imagine that you can write more complex code without understanding it at the process level. Reconciling the BRACElet and Pirolli & Anderson results will lead to an interesting new perspective on the interplay between writing and tracing.

• 13. Raymond Lister  |  April 2, 2010 at 2:37 pm

Mark wrote, “The Media Computation students use an idiom” … “we explain the for loop initially as a set operation: “for all the pixels in the picture, make the pixel change like this” … [but] … at some level, I can’t imagine that you can write more complex code without understanding it at the process level”.

I agree that introducing a “for” loop idiomatically, as a set operation, is a great way to introduce the “for” loop. Essentially, those loops can be read declaratively.

What we’ve found is that students start to have problems when we get to loops involving (1) some selection logic inside the loop and/or (2) a variable that acquires a value in one iteration of a loop and the value is then used in a subsequent iteration of the loop … it’s hard to read such a loop declaratively. An example is a loop to find the smallest element in an array.
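A minimal sketch of that example: the variable acquires a value in one iteration and is used in the next, and there is selection logic inside the loop, so the loop resists a declarative reading.

```python
def smallest(values):
    """Find the smallest element. smallest_so_far carries state from
    one iteration into the next, defeating a declarative reading."""
    smallest_so_far = values[0]
    for v in values[1:]:
        if v < smallest_so_far:    # selection logic inside the loop
            smallest_so_far = v    # value is reused in later iterations
    return smallest_so_far

print(smallest([42, 7, 19, 3, 88]))  # → 3
```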

I think the low level tracking of variables is the concrete skill upon which most abstractions are built. That sort of low level tracing is tedious, and error prone, so students need to move past that form of reasoning, to the more abstract forms of reasoning. Some students automatically make the transition between methods of reasoning: from declarative to low level tracing, or from tracing to higher order thinking. But it is also very natural that some students resist making such changes. If you’ve been reasoning about code at one level for a week or two, and you’ve been making progress on the course materials, why abandon a successful strategy? For some students, we need to make the methods of reasoning explicit. Furthermore, it would help those students if we explained why we need to change to another method of reasoning.

• […] was thinking that would be great as the practice component of exploring a bunch of programs in a worked examples curriculum. Visualizing a program’s execution can aid understanding, but research suggests that […]

• 15. Taking a test helps with learning « Computing Education Blog  |  January 29, 2011 at 10:07 am

[…] result!  Flies in the face of the original Worked Examples research by Sweller et al., but not the later work that emphasized skills testing as well as examples. It supports the claims of Peer Instruction, the idea of lots of mini-quiz-like questions mixed […]

• 16. Show Me The Code « Computing Education Blog  |  April 18, 2011 at 10:32 am

[…] not enough time reading code examples.  Worked examples are a powerful approach to learning that we simply make too little use of in computer science. We emphasize so much that computer science is about “problem-solving” that we only […]

• […] on CS education has lots of examples of how reading and writing skills develop separately.  Pete Pirolli reported in the early 1980s that, in several studies, Anderson’s lab at CMU found no correlation between reading and […]

• […] something to themselves (or others), in their own words.  Pete Pirolli did studies where he had students use worked examples to study computer science (explicitly, recursion), and with Mimi Recker, prompted CS students to self-explain then studied the effect.  In their […]

• […] about the work by Anderson and Wiedenbeck supporting this argument. I have also written about the terrific work by Pirolli exploring different ways to teach recursion, which fed into the work by […]
