Posts tagged ‘notional machine’
So what’s a notional machine anyway? A guest blog post from Ben Shapiro
Last week, we had a Dagstuhl Seminar about the concept of notional machines, as I mentioned in an earlier blog post about the work of Ben Shapiro and his student Abbie Zimmermann-Niefield. There is an amazing amount being written about the seminar already (see the Twitter stream here), with a detailed description from Amy Ko here in her blog and several posts from Felienne on her blog. I have written my own summary statement on the CACM Blog (see post here). It seems appropriate to let Ben have the summary word here, since I started the seminar with a reference to his work.
I’m heading back to Boulder from a Dagstuhl seminar on Notional Machines and Programming Language Semantics in Education. The natural question to ask is: what is a notional machine?
I don’t think we converged on an answer, but here’s my take: A notional machine is an explanation of the rules of a programmable system. The rules account for what makes a program a valid one and how a system will execute it.
Why this definition? Well, for one, it’s consistent with how du Boulay, coiner of the term notional machine, defined it at the workshop (“the best lie that explains what the computer does”). Two, it has discriminant utility (i.e. precision): the definition allows us to say that some things are notional machines and some are not. Three, it is consistent with a reasonable definition of formal semantics, and thus lets us imagine a continuum of notional machines that include descriptions of formal semantics, but also descriptions that are too imprecise — too informal — to be formal semantics but that still have explanatory value.
The first affordance is desirable because it allows us to avoid a breaking change in nomenclature. It would be good if people reading research papers about notional machines (see Juha Sorva’s nice review), including work on how people understand them, how teachers generate or select them, and so on, didn’t need to wrestle with how contemporary uses of the term compare with how du Boulay used it thirty years ago. It may also make it easier for the research community to converge on a shared sense of notional machine, unlike, say, computational thinking, where such convergence has not been possible.
The second affordance, discriminant utility, is useful because it gives us a reason to want a term like notional machine in our vocabulary when we already have other useful and related terms like explanation, model, and pedagogical content knowledge. Why popularize a new term when you already have perfectly good ones? A good reason to do so is that you’d like to refer to a distinct set of things from those the existing terms refer to.
The scope of our workshop was explicitly pedagogical: it was about notional machines “in education.” It was common within the workshop for people to refer to notional machines as pedagogical devices. Notional machines are often invented for pedagogical purposes, but other contexts can give rise to them as well. Consider the case of Newtonian mechanics. Newton’s laws, and the representations that we construct around them (e.g., free body diagrams), were invented before Einstein described relativity. Newton’s laws weren’t intended as pedagogical tools but as tools to describe the laws of the universe, within the scales of size and velocity accessible to humans at the time. Today we sequence the physics curriculum to offer Newtonian physics before quantum mechanics because we believe it is easier to understand. But even experts who have studied (and hopefully understand) quantum physics continue to use Newtonian mechanics, because in many contexts the additional complexity of a quantum model offers no additional utility over the simpler abstractions that Newtonian physics provides: it doesn’t help one predict the behavior of a system any better within the context of use, but it does impose additional work on whoever is doing the calculation. So, while pedagogical contexts may be a primary locus for the generation, selection, and learning of notional machines, they are not solely of pedagogical value.
Within the workshop, I noticed that people often seemed to want their definitions, taxonomies, and examples of notional machines to include entities and details beyond those encompassed by the definition I have provided above. For example, some participants suggested that action rules can be, or be part of, notional machines. An example of an action rule might be “use descriptive variable names” or “make sure to check for None when programming in Python.” While both of these practices can be quite helpful, my definition of notional machines accepts neither of them. It rejects them because they aren’t about the rules by which a computer executes a program. In most languages, what one names variables does not matter, so long as one uses a name consistently within the appropriate scope. “Make sure to check for None” is a good heuristic for writing a correct program, but not an account of the rules a programming environment uses to run a program. In contrast, “dereferencing a null pointer causes a crash” is a valid notional machine, or at least a fragment of one.
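To make that distinction concrete, here is a small Python sketch of my own (not from the workshop): the first part illustrates the “check for None” heuristic, which is about writing good programs, while the last lines illustrate a rule of the machine itself, that using None as if it were an object always raises an error.
# A small illustrative sketch (my own example): "check for None" is a heuristic
# for writing correct programs; "using None as if it were an object raises an
# AttributeError" is a rule of the machine itself.
def first_word(text):
  if text is None:        # heuristic: guard against a missing value
    return ""
  return text.split()[0]
print(first_word("hello world"))   # prints: hello
print(first_word(None))            # prints an empty line; the guard saved us
try:
  None.split()                     # the machine's rule: None has no split method
except AttributeError:
  print("dereferencing None fails, no matter how well-named the variables are")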
Why do I want to exclude these things? Because (a) I think it’s valuable to have a term that refers specifically to the ways we communicate what programming languages are and how programs written in them will behave, and (b) a broader definition would refer to just about everything that has anything to do with the practice of programming. That doesn’t seem worth adding another term to our lexicon, and it would be less helpful for designing and interpreting research studies in computing education.
The third affordance is desirable because it may allow us to form stronger bridges to the programming languages research world. It allows us to examine — and value — the kinds of artifacts that they produce (programming languages and semantics for those languages) while also studying the contradictions between the values embedded in the production of those artifacts and the values that drive our own work. Programming languages (PL) researchers are generally quite focused on demonstrating the soundness of the designs they create, but typically pay little attention to the usability of the artifacts they produce. Research languages and formal semantics (written in Greek notation) have difficult user interfaces, at least to those of us sitting outside that community. How can we create a research community that includes the people, practices, and artifacts of PL and that conducts research on learning? One way is to treat the practices and artifacts of PL researchers, such as writing down formal semantics, as instances of something that computing education researchers care about: producing explanations of how programming systems work. PL researchers describing languages’ semantics aren’t doing something very different in kind from what educators do when they explain how programming languages work, though (I think) they usually do so with greater precision and less abstraction. Educators’ abstractions may be metaphorical (e.g. “There’s a little man inside the box that reads what you wrote, and follows your instructions, line by line…”), but at least under my definition, they are of the same category as the descriptions that semanticists write down. As such, the range of things that can be notional machines, in addition to the programming languages they describe, may serve as boundary objects to link our communities together. I think we can learn a lot from each other.
That overlap presents opportunities. It’s an opportunity for us to learn from each other and an opportunity to conduct new lines of research. Imagine that we are faced with the desire to explain a programming system. How would a semanticist explain this system? How would an experienced teacher? An inexperienced teacher? What do the teachers’ explanations tell us about what’s important? What does a semanticist’s explanation tell us about the kernel of truth that must be conveyed? How do these overlap? How do they diverge? What actually works for students? Can pedagogical explanations be more precise (and less metaphorical) and still be as helpful to students? Are more precise definitions actually more helpful to students than less precise ones? If so, what does one need to know to write a formal semantics? How does one learn to do that? How does one teach educators to do that? How can we design better programming languages, where better is defined as being easier to understand or use? How can we design better programming languages when we have different theories of what it means to program well? How do we support and assess learning of programming, and design programming languages and notional machines to explain them, when we have different goals for what’s important to accomplish with programming?
There are many other questions we could ask too. Several groups at the workshop held breakout sessions to brainstorm these, but I think it’s best to let them tell their own stories.
In summary, I think the term notional machine has value for computing education research, but only if we can come to a consensus about what the term means and what it doesn’t. That’s my definition, and that’s why I’ve scoped it the way I have. What’s your take?
If you’d like to read more (including viewpoints different than mine), make sure to check out Felienne’s and Amy’s blog posts on this same topic.
Thank you to Shriram, Mark, Jan, and Juha for organizing the workshop, and to the other participants in the workshop for many lively and generous conversations. Thanks as well to the wonderful Dagstuhl staff.
Code Smells might suggest a different and better Notional Machine: Maybe students want more than one main()
There is a body of research that looks for “code smells” in Scratch projects. “Code smells” are characteristics of code that suggest a deeper problem (see Wikipedia description here). I have argued that these shouldn’t be applied to Scratch, that we’re confusing software engineering with what students are doing with computing (see post here).
One of the smells is having code lying around that isn’t actually executed from the Go button, the green flag in Scratch. The argument is that code not executed from the Go button is unreachable. That’s a very main()-oriented definition of what matters. There was a discussion on Twitter about that “smell” and why it’s inappropriate to apply to Scratch. I know that when I program in GP (another block-based programming language), I often leave little bits of maintenance code lying around that I might use to set the world’s state.
There’s another possibility for code lying around that isn’t connected and thus doesn’t execute from the Go button: maybe the student expects it to execute. There’s evidence that novice students are pretty comfortable with the idea of programs/functions/code chunks executing in parallel. They want more than one main() at once. It’s our programming systems that can’t handle this idea well. Our languages need to step up to the notional machines that students can and want to use.
For example, in Squeak eToys, it’s pretty common to create multiple scripts to control one object. In the below example, one script is continually telling the car to turn, and the other script is continually telling the car to go forward. The overall effect is that the car turns in circles.
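Here is a rough Python sketch of that same notional machine (my analogy, not actual eToys code): two “forever” scripts share one car object, each running in its own thread, and their combined effect is to drive the car in a circle.
import math
import threading
import time
car = {"x": 0.0, "y": 0.0, "heading": 0.0}   # one object, shared by both scripts
def forward_script():
  while True:                                # "forever: car forward by 5"
    car["x"] += 5 * math.cos(math.radians(car["heading"]))
    car["y"] += 5 * math.sin(math.radians(car["heading"]))
    time.sleep(0.1)
def turn_script():
  while True:                                # "forever: car turn by 5"
    car["heading"] = (car["heading"] + 5) % 360
    time.sleep(0.1)
threading.Thread(target=forward_script, daemon=True).start()
threading.Thread(target=turn_script, daemon=True).start()
time.sleep(3)                                # let both scripts run for a moment
print(car)                                   # the car has been tracing a circle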
I was on Kayla DesPortes’ dissertation committee (she’s now at NYU!). She asked novice programmers to write a script to make two lights on an Arduino blink. She gave them the code to blink one light: in a Forever loop, they raise the voltage on a pin, wait a bit, then lower the voltage and wait a bit. That makes a single light blink.
The obvious thing that more than half of the participants in her study did was to duplicate the code, either placing the two copies in parallel or in sequence. One block blinked the light on one pin, and the other block blinked the light on the other pin. However, both blocks were Forever loops, and only one script can execute on an Arduino at a time.
On the Arduino, what the students did was buggy. It “smelled” because the second or parallel Forever block would never execute.
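Here is a minimal Python sketch (not actual Arduino C; set_pin is just a stand-in I made up for writing a voltage to a pin) of the execution model the students were actually running on: the first “forever” loop never returns, so the duplicated block for the second light can never run.
import time
HIGH, LOW = 1, 0
def set_pin(pin, level):
  pass                       # stand-in for writing a voltage to a pin
def blink_forever(pin):
  while True:                # the Forever loop from the starter code
    set_pin(pin, HIGH)
    time.sleep(0.5)
    set_pin(pin, LOW)
    time.sleep(0.5)
blink_forever(13)            # blinks the first light forever...
blink_forever(12)            # ...so this line is never reached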
These examples suggest that parallel execution of scripts might be normal and even expected for novices. Maybe parallel execution is an attribute of a notional machine that is natural and even easier for students than trying to figure out how to do everything in one loop. Maybe concurrency is more natural than sequentiality.
Something that “smells” to a software engineer might actually be easier to understand for a layperson.
Teaching to develop a mental model of program behavior: How do students learn the notional machine?
“To understand a program you must become both the machine and the program” – Perlis 1982, cited in Sorva 2013
I’ve been thinking for a few years now about an open research question in computing education: How do students come to understand how programs work? Put more technically: how do students develop their mental model of the language’s notional machine?
I have been thinking about this question in terms of Ashok Goel’s Structure-Behavior-Function (SBF) model of how people think about systems.
- Structure is the parts of the system — for us in CS, think about the code.
- Function is what the system does — for us in CS, the requirements or the description of what the program is supposed to do.
- Behavior is how the structural elements interact to achieve the function. It’s the understanding of the semantic model of the programming language (the notional machine) plus how that plays out in a specific program.
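To make those three terms concrete, here is a tiny Media Computation example of my own (assuming the JES built-ins getPixels, setColor, and white): the Structure is the code itself, the Function is “make the whole picture white,” and the Behavior is how the loop and setColor interact at run time to accomplish that.
def clearPicture(pic):
  # Structure: a loop over every pixel, with a setColor call in its body
  for pixel in getPixels(pic):
    # Behavior: each iteration takes one pixel and turns it white
    setColor(pixel, white)
  # Function: when the loop finishes, every pixel is white -- the picture is cleared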
There are studies of students learning notional machines (e.g., Raymond Lister and Juha Sorva are some of the top researchers here). What I don’t know is how that understanding develops and how to help it develop. Lister tells us the stages of development (e.g., of tracing skill). Sorva tells us about theories of how to teach the notional machine, but with little evidence. We have models for how people learn to read and write code (e.g., Elliot Soloway’s plans). But we do not have a cognitive model for how students develop a mental model of how the code works.
A Pedagogical Problem That I Faced
I’m teaching Media Computation (n=234) this semester, and students had a disappointing performance on two programming problems on a recent quiz. (We have a 30-minute quiz every other week.) They didn’t really bomb, but an average of 82% on one programming problem (Problem #3 below) and 76% on the second (Problem #4) was lower than I was hoping for. Those averages are mostly due to partial credit — only 25 of my 234 students got full credit on Problem #4. Worse yet, we had a “simple” matching problem where we offered four pictures and four programs — which program generated which picture? More than half the students got at least two wrong. The score on the matching problem was 72%, even lower than on the programming problems. My conclusion is that my students can’t yet read code and understand it.
How do I teach my students to understand code?
With my researcher hat on, I don’t have a solid answer. With my teacher hat on, I have to do something. So, I drew on what I know from the research to come up with a best guess solution.
I decided to drop the next two lecture topics from the schedule, to instead re-focus on manipulation of pictures. I know from the learning sciences literature that it’s much better to go deeper than broader. Teaching for mastery is more important than teaching for coverage. Things that students know well are more likely to persist and transfer than things that students are merely familiar with.
I decided to do a live-coded session revisiting the quiz. I had graded a bunch of the programming problems on the quiz. I saw several different strategies for solving those two problems. I had a unique teachable moment here — every student had attempted these two problems. They were primed for the right answer. I decided to solve the problems in a live-coding session (starting from a blank editor, and talking aloud as I wrote the code) in each of the ways that I saw students doing it — three ways for the first problem, four ways for the second problem. While I wrote, I drew pictures to describe the behavior, drawing from Sorva’s visualization approach and the SILC emphasis on sketching rather than diagrams. After writing each program, I tested it on a picture. Along the way, I answered questions and wrote additional examples based on those questions.
This idea is based on Marton’s Variation Theory. You have to vary critical aspects of examples for students to figure out the differences. Janet Kolodner talks about a similar idea when she emphasizes contrasting cases for learning informed by case-based reasoning. In SBF terms, I was keeping the Function constant, but varying the Structure and Behavior. In Goal-Plan-Code terms, I was achieving the same Goal, but varying the underlying Plan and Code.
Could an exploration of these variations/contrasts help students see how code changes relate to behavior changes? I don’t actually know how to evaluate the result as a researcher, but as a teacher, I got a good response from students. I’m looking forward to seeing how they do on similar problems on future quizzes.
The rest of this blog post is a static replay of the lecture. I’ll show you the slides I showed, the sketches I made, and the code I wrote (while talking aloud).
Problem clearTopHalf
Solution #1: Iterate through all pixels
def clearTopHalf1(pic):
  h = getHeight(pic)
  for pixel in getPixels(pic):
    y = getY(pixel)
    if y < h/2:
      setColor(pixel,white)
Solution #2: Iterate through half of the pixel indices
def clearTopHalf2(pic):
  all = getPixels(pic)
  for index in range(0,len(all)/2):
    pixel = all[index]
    setColor(pixel,white)
Solution #3: Iterate through all x and y positions in the top half
def clearTopHalf3(pic):
  h = getHeight(pic)
  w = getWidth(pic)
  for y in range(0,h/2):
    for x in range(0,w):
      pixel = getPixel(pic,x,y)
      setColor(pixel,white)
A student asked, “Could we do x first and then y?” Sure!
def clearTopHalf3b(pic):
  h = getHeight(pic)
  w = getWidth(pic)
  for x in range(0,w):
    for y in range(0,h/2):
      pixel = getPixel(pic,x,y)
      setColor(pixel,white)
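Here is a sketch of how one might test each variant in JES (assuming the usual built-ins pickAFile, makePicture, duplicatePicture, and explore): run each version on a fresh copy of the same picture and confirm that the results match.
# A testing sketch: each variant works on its own copy of the original picture
original = makePicture(pickAFile())
for solution in [clearTopHalf1, clearTopHalf2, clearTopHalf3, clearTopHalf3b]:
  pic = duplicatePicture(original)   # work on a fresh copy each time
  solution(pic)
  explore(pic)                       # all four should show the same cleared top half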
Pause for Reflection and Discussion
At this point, I asked students to turn to the person next to them and ask, “Which one do you prefer? Which makes the most sense to you?”
I always encourage students to discuss during peer instruction questions. I have never had such an explosion of noise as I did with this invitation. From wandering around the room, what I heard students discussing was, “This is what I did, and this is what I got wrong.”
When we had the whole class discussion, the first and third approaches (all pixels or coordinates) were the preferences. Students who spoke up disliked the index approach — it felt “like there’s too much indirection” (said one student).
Problem copyThirdDown
Solution #1: Iterating through all pixels
def copyAThird1(pic):
  h = getHeight(pic)
  for pixel in getPixels(pic):
    x = getX(pixel)
    y = getY(pixel)
    if y < h/3:
      targetPixel = getPixel(pic,x,y+(2*h/3))
      setColor(targetPixel,getColor(pixel))
Solution #2: Iterate through first 1/3 of pixels
def copyAThird2(pic):
  all = getPixels(pic)
  h = getHeight(pic)
  for index in range(0,len(all)/3):
    pixel = all[index]
    color = getColor(pixel)
    x = getX(pixel)
    y = getY(pixel)
    targetPixel = getPixel(pic,x,y+(2*h/3))
    setColor(targetPixel,color)
Solution #3: Iterate through top 1/3 of picture by coordinates
def copyAThird3(pic):
  h = getHeight(pic)
  w = getWidth(pic)
  for x in range(0,w):
    for y in range(0,h/3):
      pixel = getPixel(pic,x,y)
      color = getColor(pixel)
      # Copies
      targetPixel = getPixel(pic,x,y+(2*h/3))
      setColor(targetPixel,color)
At this point, someone said that they did it by subtracting y from the height. I showed them that this approach mirrors. This is the first incorrect solution that I demonstrated.
def copyAThird3b(pic):
  h = getHeight(pic)
  w = getWidth(pic)
  for x in range(0,w):
    for y in range(0,h/3):
      pixel = getPixel(pic,x,y)
      color = getColor(pixel)
      # Mirrors instead of copies
      targetPixel = getPixel(pic,x,h-y-1)
      setColor(targetPixel,color)
Solution #4: Iterating through the bottom 1/3 of the picture by x and y coordinates
This was an unusual approach that I saw a few students try: they used nested loops to iterate through the bottom third of the pixel coordinates (the target region), and computed the corresponding source pixel in the top third to copy from. They iterated through the target and computed the source.
def copyAThird4(pic):
  h = getHeight(pic)
  w = getWidth(pic)
  for x in range(0,w):
    for y in range(2*h/3,h):
      targetPixel = getPixel(pic,x,y)
      srcPixel = getPixel(pic,x,y-(2*h/3))
      setColor(targetPixel,getColor(srcPixel))
After I wrote that program, someone asked, “Couldn’t you make an empty picture and copy the top third into the new picture at the top and bottom?” With her guidance, I modified the program above to create a new version, which does exactly as she describes, leaving the middle third blank. So, this was the second incorrect version I wrote to respond to student queries.
def copyAThirdEmpty(pic):
  h = getHeight(pic)
  w = getWidth(pic)
  canvas = makeEmptyPicture(w,h)
  for x in range(0,w):
    for y in range(0,h/3):
      pixel = getPixel(pic,x,y)
      color = getColor(pixel)
      targetPixel = getPixel(canvas,x,y)
      setColor(targetPixel,color)
      targetPixel = getPixel(canvas,x,y+(2*h/3))
      setColor(targetPixel,color)
  explore(canvas)
When I asked students a second time which version made the most sense to them, there was a bigger split. Indexing the array continued to be the least preferred one, but students liked both versions with nested loops, and many still preferred the first version.
Katie Cunningham receives NSF fellowship: Studying how CS students use sketching and tracing
Katie Cunningham is a first-year PhD student working with me in computing education research. She just won an NSF graduate research fellowship, and the College of Computing interviewed her. She explains the direction she’s exploring now, which I think is super exciting.
“I’m interested in examining the kinds of things students draw and sketch when they trace through code,” she said. “Can certain types of sketching help students do better when they learn introductory programming?” She grew interested in this topic while working as a teacher for a program in California. As she watched students there work with code, she found that they worked solely with the numbers and text on their computer screen. “They weren’t really drawing,” she said. “I found that the drawing techniques we encouraged were really useful for those students, so I was inspired to study it at Georgia Tech.”
Essentially, the idea is that by drawing or sketching a visual representation of their work as they code, students may be able to better understand the operations of how the computer works. “It’s a term we call the ‘notional machine,’” Cunningham explained. “It’s this idea of how the computer processes the instructions. I think if students are drawing out the process for how their code is working, that can help them to fully understand how the instructions are working.” That’s one benefit. Another, she said, is better collaboration. If a student is sketching the process, she posits, the teacher can better see and understand what they’re thinking.
Source: IC Ph.D. student Katie Cunningham receives NSF fellowship | College of Computing