Why we teach how linked lists are made, and how memory representations work

March 24, 2011 at 6:56 am 11 comments

One of the ongoing battles in CS Education is whether or not to teach how data structures are built.  One school of thought goes something like this, “We have to teach how data structures are really implemented, because unless you realize how the runtime behavior differs between a singly-linked list and a doubly-linked list and a tree, you really can’t make informed choices between them.”  In other words, we teach the implementation to inform the use.  An opposing school of thought says, “It’s ridiculous to have anyone implement yet another of these complex data structures!  How many linked list implementations do we need?  Instead, we should focus on how to use them well.”  That school of thought sees data structures as a programming or even vocational benefit.  Several colleagues in computing education wrote a nice piece describing the intent of teaching data structures, which differ from teacher-to-teacher.

Based on interviews with Computer Science educators and analysis of CS literature, we identified five categories of intent: developing transferable thinking, improving students’ programming skills, knowing “what’s under the hood”, knowledge of software libraries, and component thinking.

I’ve come to realize that I have another reason for teaching the implementation of data structures, which I don’t think is covered by the five reasons in that piece.  It’s only through dealing with implementation structures that I (and my students) have seen their misconceptions highlighted, like the ones I talked about in my ontological categories post.  I continue to struggle with my students’ understanding of object, variable, reference, and “link” (the arrows we draw between boxes in data structure diagrams).  We have wonderful discussions now, where we talk about the best visual representation for various structures — great “meta-level” discussions to have.

I think they’re starting to get it (and I have these peer instruction results, so that I can make that claim with some confidence), but it’s a real struggle.  Moving from singly-linked lists to doubly-linked lists helped face the problem well: It’s too easy to simply say, “Well, that’s just the next node,” and gets more complicated when students have to say, “That node is the next of the previous, and the previous of the next” and now start to wonder what those words mean.  “Next” has a commonplace meaning.  We don’t talk about the “next” and the “previous” at the same time in our daily conversation, but we have to deal with both simultaneously when implementing them.

I had a chance to discuss these issues yesterday with some really great cognitive and learning scientists.  Some pointed out that the problem is all the harder because there’s no there there — all these words, all these ways to refer to objects, are just different perspectives, and there’s no real solid thing that’s being described.  I countered that there was a real, inspectable, testable thing: the actual values in real memory addresses. That argument didn’t hold much weight with them, because few of us really use that level.  Things are defined through their use, and different uses lead to different definitions, and might as well be different things.  “Explaining memory is just layering yet another perspective, and that one has even less usefulness than the others and is even more complicated. ”  They thought that it’s like teaching quantum mechanics to fourth graders so that they have a scientist’s understanding of heat.

I’m not convinced, but at a higher level, I’m interested by the kinds of arguments we’re having.  These questions of “what is our intent in our teaching?,” “what’s real?,” and “what does it mean for something to be ‘defined’?” seem esoteric, or worse, pointless for the in-the-classroom CS teacher.  Those are questions for navel-gazing philosophers, we might think.  Actually, they are the answer to some of our fierce debates in computing education.  Why do we teach data structures?  Why do we teach hardware, architecture, and memory structures?  I’m sure that we all have immediate answers that we whip out in undergraduate curriculum committee meetings.  What’s our evidence for our answers?

  • The cognitive scientists were asking us to consider, “Does it really make sense any more for computer scientists to understand the computer down to the metal?”
  • I’m asking us to consider that maybe the reason to teach data structure implementation has no practical or vocational benefit — an important purpose is just to highlight and correct what students don’t understand.

Entry filed under: Uncategorized.

A Definition of Computational Thinking from Jeannette Wing Impact of CS on female non-majors: Beyond a pipeline model

11 Comments Add your own

  • 1. Greg Wilson  |  March 24, 2011 at 7:04 am

    Double-plus yes 🙂 In my personal and teaching experience, the only way to find out what people actually understand is to see where they stumble when trying to apply what they have (supposedly) learned. I’m fascinated by the idea of collecting information about misunderstanding in real-time, whether it’s by multiple-choice clicker quizzes (as you’ve been describing) or real-time monitoring of small coding exercises; I wonder whether this could be done on a larger scale at e.g. the Khan Academy by looking at success/fail rates on exercises to tell which of the nodes and links in their learning graph are weakest…

    Reply
    • 2. Mark Guzdial  |  March 24, 2011 at 7:53 am

      Really interesting, Greg — I proposed something similar in my second talk at CMU. I’m interested in the problem of teaching high school CS teachers. Not clear we can use cognitive tutors for this. A worked examples approach is our best bet right now, and the folks at CMU agreed with that. I suggested that we could check the value of the worked examples by interspersing them with small exercises, then using credit/blame assignment to determine the most effective worked examples.

      Reply
  • 3. drj11  |  March 24, 2011 at 8:05 am

    When I did the Datastructures and Algorithms course I learnt some pretty nifty algorithms. Sure, it’s good that I can program FFT, and sort large data. But the real take home message for me was a much higher level one: some datastructures and algorithms are too clever and useful to discover by oneself (Graham scan, the FFT algorithm for multiplying large numbers, red–black trees are three that spring to mind). And the practical application of this is that sometimes for some problems it may be useful to go and do some research to find “an algorithm”.

    Reply
  • 4. drj11  |  March 24, 2011 at 8:09 am

    The “it’s like teaching quantum mechanics to fourth graders” comments you heard seem totally inappropriate to me. We’re not talking about teaching fourth graders, right? It’s university, it’s big boy stuff now. Should you teach undergraduate physicists about Quantum Mechanics so they understand heat? Absolutely. So I think the analogy with teaching the “bare metal” falls down.

    Reply
  • 5. Alan Kay  |  March 24, 2011 at 8:17 am

    Hi Mark

    What is the actual intent here with regard to the students?

    For example, are we selecting out of a population, or trying to educate all? (There is nothing complex about this subject for the 10% who find it obvious and easy)

    Are we dealing with mathematics or engineering? (Before and after, butfirst and butlast, etc., are relationships and can be implemented many ways, including as definitions on paper)

    Just these two dimensions give us 4 pathways.

    For children, we give them visible active structures — e.g. “Holders” in Etoys — and they learn how to do things with them. This is the “learning to use tools before learning physics and engineering” approach. The concrete visible and manipulable nature of the structures puts them into the “real world with a little mystery, but real predictability” category.

    For a general population where we are not selecting, I think the children’s approach is a good one. If the language/IDE allows all the “hoods” to be popped to (later) see how the mid-level tools are implemented, then one can contemplate a middle-out approach here:
    — down to specific implementations
    — up to more abstract definitions and uses

    I think going “up” from middle first is a good one, because those mechanisms can be learned and will be understood as one goes “down”.

    Another prejudice I have is that computing’s emphasis on data structures and algorithms is a bit of an artifact from the problems of the past on tiny computers and very little knowledge. So I think computing would be much better off being taught from the point of view of systems and architectures.

    This argues that an “older children’s language” on the lines of Etoys or Scratch, would be a very good idea.

    Cheers,

    Alan

    Reply
    • 6. Mark Guzdial  |  March 24, 2011 at 9:45 am

      Hi Alan,

      Your question is exactly the one that I think (and the colleagues who wrote the linked paper think) that the community needs to be asking itself. All of these approaches, including our depiction of memory, is an abstraction for the students. We need to think about learning intentions, and determine appropriate abstractions — which may be “down to the metal.”

      Cheers,
      Mark

      Reply
  • 7. Peter Boothe  |  March 24, 2011 at 12:56 pm

    In my software engineering course, the very first project is one of “misconception elimination”. I force the students to build a Java debugger, walk through the reachable memory of a paused program, and then draw the resulting object-reference graph on the screen. The best part comes when they have to figure out where to draw the name of a given variable – for many students it is actually a revelation that variable names should be drawn on the edges, and not on the vertices.

    I don’t anticipate that building a debugger will have much use later in their careers, or that their experience dealing withe the Java debugger API has any long-term benefit, but it serves as a fantastic tool for drawing out and fixing any misconceptions the students might have.

    Reply
  • 8. Why sweat the details? « Gas station without pumps  |  March 24, 2011 at 2:07 pm

    […] Mark Guzdial on his Computing Education Blog asks some interesting questions today in his post Why we teach how linked lists are made, and how memory representations work. […]

    Reply
  • 9. Alvin Yates  |  March 24, 2011 at 6:16 pm

    Slight background: I’m a ’07 GT BS in CompE and a ’10 MS Computational Math from Stanford. I spent 4 years in the UTA program in the old CoC, so most of my perspective comes from this. I should also note that this perspective has zero to do with any intro level faculty discussion, and is purely the posits of the TAs at the time.

    As far as “down to the metal”: As a CompE, I did have to go down to the metal. In other schools, that’s just how Computer Science is defined.

    The reason why I educate at all is because I want my students to be able to function and create things on their own. It doesn’t matter if it’s a robot, distributed system, or parallel algorithm. The goal is not to take what’s there and simply reuse, but advance. This isn’t a purely academic exercise. I’d hire people based on their ability to function and innovate on their own or in small groups (and recent job postings would agree with me on that).

    Two stories:
    – My last year as a student was TAing CS1332, which for the non-GT was pretty much new and the only dedicated Data Structures course we had in the into curriculum. It also served as the new merging point for the different intro CS tracks.

    Most of the work went to breaking the abstractions that prevented understanding of the new concepts, rather than new concepts themselves. It’s harder to appreciate the difference between arrays and linked lists without knowing what memory is on some level. It doesn’t preclude you from “using” these structures naively, which is the beauty of an abstraction. But it is 100% blocking on how you “innovate” data structures.

    Long term, the general problem becomes worse: How do you know what you don’t know, or where to find it?

    As it happens, one of my old Scheme students who had to take medical time off was also in the class. He later commented that Scheme (not the class necessarily) ended up changing how he thought about programming completely and made the rest of the CS easier to grasp as a whole.

    – My first experience with Linear Algebra was that the entire field is a square of numbers that you only do two things: Row reduce to get identity/inverses or eigenvalues/vectors. I went back to grad school in computational math specifically to learn the math that kept me from doing well in higher-level CS theory classes.

    More importantly, this abstraction is 100% useless for real applications. In numerical computation, you never find inverses (They take too long and are numerically poor at best) and row-reduction as a general rule typically performs worse compared to other techniques. Just as important for growth: Someone else needs to be familiar with this abstraction.

    One last note: I don’t think it’s necessarily a community question but an institutional one. One of the biggest distinctions you can make in a school is if you want your undergraduates to be practical engineers or academics, one trade-off being whether your students learn software engineering techniques/paradigms or mesh theory and abstract algebra.

    What do your students need to learn such that, as an educator, you feel as if they are equipped for the challenges ahead? At some point you cut a line, and it probably isn’t one size fits all.

    Reply
  • 10. anoj  |  February 21, 2012 at 4:46 am

    I have been teaching data structures with C for a while and found 100% pass percentage of students with the techniqes I use,
    First the student should know the basic C programming (data types, if-else, for loop, while loop and function creation and calling, Arrays and basics of Pointers and Structures)
    For convinience I take first few days of class to get the students brush up these skill by doing some workshop sessions in same.
    Then I start Data structures But I never say that I am teaching them data structurs (Stack, Queue,Linked List ,Tress or Graphs).
    I just start by taking a real Life Example say ( to create a list of all students in a class) First I tell how to create the node structure and Importance of creating the node poninter inside the structure.
    Then I tell how to keep adding the nodes to the first node by using insert function and using double pointer.
    I keep doing the same with many other examples (like a Bank Accuount Conunter, Line of people in bus stop, list of Things to purchase from market etc.) same creating a node and adding the nodes again and again.
    By this time the student is familiar with making the basic structure for any type of example and adding nodes to the end,Also understands the methods of manupulating the pointers to point to the next node.
    Then I take same examples again but this time teaching to delete a node from the list (for all the eamples previously taken).
    Then I take adding a node in between the two nodes and adding the node in front of the list. For same examples taken previously.
    Keep practicing the same examples again and again untill each and every student understands the concepts of manipulating the pointers and nodes to acheive the tasks.
    Repeating the same workshop again and againg for many days and classes actually enables the students to interact with each other. The Student who understands the concept tries to help other students in the workshop and hence improves his own skills as well as satisfaction levels. Other students get motivated and try to learn the same from others.
    Slowly then I take more complex examples each time to give other real time problems to solve (making a pile of chairs(i.e Stack), making a queue in ticket counter (i.e Queue FIFO), making a Call Log of the mobile phone (i.e Linked List) Making family History Diagram (i.e Tree)etc.)
    Since now the students are familiar with pointer and playing with pointer to making and creating links, They slowly able to understand that Every example I take was one of the data stuctures (Stack, Queue, Linked List or Tree).
    Happy Coding.

    Reply
  • 11. Carlos Andrade  |  November 27, 2013 at 11:35 pm

    In my classes I’ve been more lucky on tying visualizations to different groups of concepts within the course and how they interact. As a recent graduated student and recent temporary professor of a data structure classes, I’ve usually believed that there were an ‘underlying big picture’ behind all courses subjects that was not well explored by my professors, and have been trying to figure out it more properly now that I am one. This also actually follows from a class from a great journalist named Roberto Cairo in where one of his classes he calls attention that “once you move away from text in your article and show people a visualization, they won’t unconsciously think of them differently, rather they will see what you draw there”.

    In my understanding, that could help me avoid misconception by using proper visualization since the very first day and sticking to it all the time.

    For instance, my very first slide was an extension to a visualization from Goodrich’s book I found modified on a data structure course where boys enters a machine and girls comes out (the website seems to have added a clock to remind of the importance of T(n)). You can find it here:

    http://ww3.algorithmdesign.net/handouts/Analysis.pdf

    On the very first slide.

    I extended it with some text notes from my former professor and some thoughts on what could visually codify the concepts and as a end result I got this for the computational model, code file, algorithms as functions (C is the used language here), data structures as boxes, a computational problem and solution.

    In my case, teaching data structures is slight more complicated because students are on their second freshman semester, and were only taught pascal on the previous semester. Still, they are expected to use the concepts in C, perform all classic sorts and search on primary and secondary memories and know all basic data structures up to trees and associated searches. For a total of 68h class!

    Thanks for the interesting post!

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Trackback this post  |  Subscribe to the comments via RSS Feed


Recent Posts

March 2011
M T W T F S S
« Feb   Apr »
 123456
78910111213
14151617181920
21222324252627
28293031  

Feeds

Blog Stats

  • 1,324,448 hits

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 4,633 other followers

CS Teaching Tips


%d bloggers like this: