Even for Experts! What Makes Code Hard to Understand?

June 27, 2013 at 1:31 am 7 comments

When I visited Indiana earlier this year, I got a chance to meet with Rob Goldstone, who told me about the fascinating results that Michael Hansen describes in the blog post linked below: adding two blank lines to a Python program (which does not change how it executes) significantly changes how programmers understand the code. Are his participants getting confused because spacing matters horizontally in Python but not vertically?
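To make the horizontal-versus-vertical distinction concrete, here is a minimal sketch. The three snippets are hypothetical ones written for this illustration, not the programs from Michael's study, and the helper name `run` is likewise an assumption. The sketch runs each snippet and compares what it prints: the blank lines leave the output unchanged, while dedenting one line changes it.

```python
import contextlib
import io

# Hypothetical snippets for illustration only; NOT the study's programs.
COMPACT = """\
for i in [1, 2, 3]:
    print("The count is", i)
    print("Done counting")
"""

# Same statements, with two blank lines added inside the loop body.
SPACED = """\
for i in [1, 2, 3]:
    print("The count is", i)


    print("Done counting")
"""

# Dedenting the last line is a horizontal change, and it alters behavior.
DEDENTED = """\
for i in [1, 2, 3]:
    print("The count is", i)
print("Done counting")
"""


def run(source):
    """Execute the source text and return everything it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


# Vertical spacing: identical output with or without the blank lines.
same_output = run(COMPACT) == run(SPACED)

# Horizontal spacing: the number of "Done counting" lines differs.
done_lines_spaced = run(SPACED).count("Done counting")
done_lines_dedented = run(DEDENTED).count("Done counting")
```

In the spaced version the final print still belongs to the loop body, even though the blank lines may suggest to a reader that it comes after the loop.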

The other experiments that Michael describes, like the one I’m quoting below, are also amazing. Michael isn’t dealing with students: most of his participants are programmers with 2-10 years’ worth of experience and graduate degrees. How could they get this code so wrong, when the problem is the kind of thing we might give on a CS1 exam? Here’s one hypothesis: We really don’t know just how hard programming is, and both students and programmers understand it far less well than we expect.

Why did 50% of our participants get this program wrong? There is a strong expectation amongst programmers that you don’t include code that won’t be used. Elliot Soloway identified this and other maxims (or rules of discourse) in 1984. Like conversational norms, these unwritten rules can have a powerful influence on interpretation.

via What Makes Code Hard to Understand? | synesthesiam.


7 Comments

  • 1. alanone1  |  June 27, 2013 at 2:14 am

    As with pretty much all discourse, poor readability stems from: some percentage of poor language design (for all languages), some percentage of poor style (for all styles and writers) and the rest poor readers (for all readers).

    All three bear a lot of discussion — especially whether the judgements are forced to be relative, or could possibly have some absolute content (For example, in natural languages we tend to evaluate all against our skills with our native tongue rather than to try to identify absolute pluses and minuses. Relative judgements like these could obtain for many programmers, for many languages both natural and artificial.)

    But I also think that the programming languages of the last 60 years or so are actually poorly designed in absolute terms — in particular, they have proved to be very brittle under the stress of even modest scaling (some of this is really crazy e.g. most code really requires the human programmer to remember details over the body of the code whose size easily surpasses human memory limits).

    Cheers

    Alan

    • 2. Mark Guzdial  |  June 27, 2013 at 8:10 am

      One of the deep ideas in Michael’s work is that experts read code as a diagram (e.g., noting keystones and relations between them) and not as text (word-for-word). We know (from Richard Mayer’s work on multimedia instruction) that people learn differently with diagrams than with texts, e.g., people can follow explanations of diagrams more easily by *hearing* them than by *reading* them. So, when does programs-as-diagrams kick in? Should we be explaining programs to novices as audio narrations, rather than text explanations? That’s what Briana Morrison is testing this summer.

      I like the idea that you and Neil are suggesting, Alan, that these examples are “bad” or “poorly written” code. When combined with Michael’s insight about diagrams, it shifts the onus for unreadable code onto the writer. If the NYTimes had a visualization that few could understand, we would say that the NYTimes should have made it easier, not that the people should have been able to figure it out.

      It does bother me that the kinds of problems in Michael’s experiments are JUST like the kinds of problems on many CS1 exams. The first part of programming is learning the discourse rules, not learning to interpret malformed or unreadable code.

      Finally, your last point is one that hits home for me. I turned 50 last year, and that’s enough that my short-term memory is not as effective. Programming demands too much short-term memory. How do we design programming languages for reduced short-term memory? How do we design programming languages so that people can program productively in the second half of life?

      Cheers,
      Mark

    • 3. Seth Chaiken  |  June 27, 2013 at 2:35 pm

      We must not forget the lesson of Alan Turing that any general purpose programming language (under idealizations that are effectively true in practice) can express programs that are intrinsically impossible to fully analyze. More recent theoretical results (of Robertson and Seymour) tell us that low degree polynomial time solutions to some computational problems exist without providing an explicit algorithm. These lessons from theory lead us to realize that programs people write for each other, or even for themselves, must be specially written for understandability.

      And of course major software codes are way too big for any one human to have a workably detailed understanding of the whole thing.

      Cheers to my fellow mortals.
      Seth

  • 4. Neil Brown  |  June 27, 2013 at 2:49 am

    I don’t think it’s a problem with code being hard to understand. I think we tend to skim the code, and form an expectation of what it’s doing, without reading the fine details. The first mistake is quite interesting, because it shows the expectation (that “done” is only printed after the loop) overpowering knowledge of syntax. Both of the others are bad code, which misleads programmers.

    If you are marking a student’s code, you look over it in very great detail and would probably spot these mistakes. If you are skimming a colleague’s code to see what it’s doing (and don’t suspect a bug is present), then you are liable to overlook the problems, because you expect they followed these rules of discourse.

  • 5. alfredtwo  |  June 27, 2013 at 11:47 am

    This is why some of us still believe in comments in code.

  • 6. Affandy  |  July 2, 2013 at 12:34 am

    What about pretty printing, visual programming, program visualization, or algorithm visualization?
    Do they work in helping learners understand the program or programming in general?

  • 7. gailcarmichael  |  July 24, 2013 at 9:46 am

    Reading this today is timely for me – just yesterday I had a conversation with a software/computer engineering professor who has taught first year with Python for 10 years now. He was telling us that he found students got good at knowing how to write code, but not at “playing computer” (i.e. reading code and determining what’s happening). Even more depressing is how many students couldn’t correctly evaluate simple math expressions that require only an understanding of high school BEDMAS plus knowledge about integer division. I wonder if it is true that programmers in general tend to be better at writing code than reading it?
