Preview ICER 2016: Ebooks Design-Based Research and Replications in Assessment and Cognitive Load Studies

September 2, 2016 at 7:53 am 17 comments

The International Computing Education Research (ICER) Conference 2016 is September 8-12 in Melbourne, Australia (see website here). There were 102 papers submitted, and 26 papers accepted for a 25% acceptance rate. Georgia Tech computing education researchers are justifiably proud — we submitted three papers to ICER 2016, and we had three acceptances. We’re over 10% of all papers at ICER 2016.

One of the papers extends the ebook work that I’ve reported on here (see here where we made them available and our paper on usability and usage from WiPSCE 2015). Identifying Design Principles for CS Teacher Ebooks through Design-Based Research (click on the title to get to the ACM DL page), by Barbara Ericson, Kantwon Rogers, Miranda Parker, Briana Morrison, and me, takes a Design-Based Research perspective on our ebooks work. We describe our theory for the ebooks, then describe each design iteration: what we designed, what happened when we deployed it (drawing on usage data), and how we re-designed in response.

Two of our papers are replication studies — so grateful to the ICER reviewers and community for seeing the value of replication studies. The first is Replication, Validation, and Use of a Language Independent CS1 Knowledge Assessment by Miranda Parker, me, and Shelly Engleman. This is Miranda’s paper expanding on her SIGCSE 2016 poster introducing the SCS1, a validated and language-independent measure of CS1 knowledge. The paper does a great survey of validated measures of learning, explains her process, and then presents what one can and can’t claim with a validated instrument.

The second is Learning Loops: A Replication Study Illuminates Impact of HS Courses by Briana Morrison, Adrienne Decker, and Lauren Margulieux. Briana and Lauren have both now left Georgia Tech, but they were still here when they did this paper, so we’re claiming them. Readers of this blog may recall Briana and Lauren’s confusing SIGCSE 2016 results, which suggest that cognitive load in CS textual programming is so high that it blows away our experimental instructional treatments. Was that an aberration? With Adrienne Decker’s help (and her student participants), they replicated the study. I’ll give away the bottom line: it wasn’t an aberration. One new finding: with respect to understanding loops, students who did not have high school CS classes caught up during the experiment with those who did.

We’re sending three of our Human-Centered Computing PhD students to the ICER 2016 Doctoral Consortium. They will be in the DC on Sept 8 and will present posters to the conference on the afternoon of Sept 9.



17 Comments

  • 1. shriramkrishnamurthi  |  September 2, 2016 at 8:52 am

    Replication is great. But I do not see any sense in which this is “language-independent”. Please justify that phrase.

    • 2. Mark Guzdial  |  September 2, 2016 at 6:27 pm

      Allison tested with MATLAB, Java, and Python. Briana tested with programmers using C, C++, and Java. That’s not universally language independent, but it’s not language-dependent.

      • 3. shriramkrishnamurthi  |  September 3, 2016 at 11:49 am

        Sorry, that’s a basic logic error. Not-language-*specific* is not the same as language-*independent*. The latter is not the negation of the former. The former is a narrow claim; the latter is a sweepingly broad one and not justified here.

        In fact, your instrument is _very much_ language dependent. Indeed, it’s evident in the languages that you list: it assumes that the languages are inherently imperative. (To therefore be “independent”, it assumes that all languages are inherently imperative, but they are not.) To someone who hasn’t seen imperative programming, the correct reaction to most of these programs is: *error* (indeed, in many cases, “does not even parse”).

        If you at least narrow your claim to say it’s independent across the _imperative core_ of languages, sure, I’d probably grant you that. But if you aren’t even independent across imperative and functional programming, I don’t know how you can make any claims of “independence”.

        • 4. Mark Guzdial  |  September 3, 2016 at 5:13 pm

          It’s an empirical question. We are explicitly making the too-broad claim and welcoming research that proves it wrong. It’s possible that students using functional languages, given the definition of the pseudocode, might transfer their knowledge effectively to the test. We were already surprised that MATLAB students (who do little with FOR or WHILE loops, in class time or practice) did just as well as Python students on those parts of the test. We’re particularly interested in exploring the claim with blocks-based languages like Scratch and Snap!.

          I don’t think it’s a logic question. It’s something that we should test. It’s an issue of human cognition, not programming language paradigm.

          • 5. shriramkrishnamurthi  |  September 3, 2016 at 5:37 pm

            Sorry, what? Is this a general strategy we can all use? Because I’m pretty sure you routinely call out people for making broad claims they don’t justify. The burden is on _you_ to justify an extraordinarily broad claim: not on others to falsify it.

            Bringing up Matlab is not particularly relevant to me. I’m arguing that you’re dealing with an imperative core, which is common to all the languages you mentioned.

            Your programs are literally parse errors in the languages I’m talking about. Yes, it’s _possible_ that students who have had no prior exposure to imperative programming might be able to tell you what these programs mean. It’s even probable. But where’s the evidence? Why is the burden on others rather than on you?

            • 6. Mark Guzdial  |  September 3, 2016 at 6:22 pm

              You’re right that FCS1 and SCS1 are probably not broadly language independent. I don’t think we know the right qualifiers yet. Certainly the pseudocode would give parse errors in all the students’ languages — I believe that would be a characteristic of just about any pseudocode meant to cover a CS1 space.

              You and Peter are both right about the limitations in the coverage of the test. Allison wrote a paper (for SIGCSE, but I don’t have the year at hand) about how she defined the concepts she meant to cover, and it was purposely chosen to be a least common denominator, e.g., we barely touch objects (and not for all levels of her Bloom-inspired taxonomy). That’s true for any test, and we’ve tried to be explicit about it.

              We spend much of the SCS1 paper defining the limitations of the test, and what we’ve shown and what we haven’t shown. I hope we did a more careful job than I did in this blog post.

              The MATLAB example is relevant. It’s the most syntax-different from our pseudocode, and MATLAB students spent less actual practice time (time-on-task is relevant for learning) on WHILE and FOR than did the Python- or Java-using students, but did remarkably well on the test. Why? Is it because MATLAB leads to transferable knowledge? Because of the GT CS1371 curriculum? Because the GT students who study MATLAB are highly intelligent? I bet it’s some combination of those.

              Imagine that we gave SCS1 to MIT 6.001 students studying SICP. I would bet that they do quite well on SCS1. SICP aims at developing deep, transferable knowledge. A student who passes 6.001 would likely know the subset of CS1 knowledge that SCS1 tests. Is that because of Scheme? Of SICP? Of MIT student quality? I don’t know, but it’s worth exploring.

              As we say in the SCS1 paper, we are actively gathering data on use of SCS1. SCS1 is not truly language independent, but saying that it’s “imperative language independent” or attaching some similar qualifier may not be right either. I don’t want to qualify on language factors. I believe that the relevant qualifiers are socio-cognitive, and we don’t know what they are yet. We are trying to find out.

              • 7. shriramkrishnamurthi  |  September 3, 2016 at 7:18 pm

                That’s not my point about parse errors. Let me say it in a few more words to be really clear.

                You’re saying it’d be a parse error for _shallow_ reasons (e.g., in Pascal, assignment is by `:=` rather than `=`, ergo parse error). I’m saying it’s a parse error for _deep_ reasons: you can’t say `x = 1` followed by `x = 2`. That’s a _semantic_ error that _even_ manifests as a parse error. That is, the languages have fundamentally different models and with them notional machines. In other words, _your_ parse error problem is trivially solved by just translating the code from pseudocode to the languages you studied, in a highly local way. Mine is not. [There’s even a beautiful theory result backing up my claim, if someone’s truly skeptical.]
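                To make the deep/shallow distinction concrete, here is a minimal sketch in Haskell (a hypothetical fragment invented for illustration; it is not an SCS1 item, and Haskell stands in for any pure functional language):

                ```haskell
                -- Imperative pseudocode like
                --
                --   x = 1
                --   x = 2
                --   print x    -- an imperative notional machine prints 2
                --
                -- has no direct counterpart in a pure language. In Haskell,
                -- "x = ..." is an equation, not an assignment, so a second
                -- equation for x is a semantic error (a duplicate definition),
                -- not a shallow spelling difference like Pascal's := vs =.

                x :: Int
                x = 1
                -- x = 2   -- uncommenting this fails to compile:
                --         -- "Multiple declarations of 'x'"

                main :: IO ()
                main = print x   -- prints 1; there is no "later" value of x
                ```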

                You may disagree that “imperative language” is a relevant qualifier. However, again, it’s up to you to prove that it’s not. Given that it is patently a common characteristic of the languages you measured and of your pseudocode, it should clearly be considered a potential factor until proven otherwise. A title like “Replication, Validation, and Use of an Imperative Programming CS1 Knowledge Assessment” would be the honest thing to say from a scientific viewpoint, unless you want to engage in marketing over science. That’s a title I could really get behind.

                Everything else you’ve written is a could-be, may-be feint, but not important to me. Just stop saying “language-independent” until you’ve actually demonstrated it in a sufficiently credible way. That’s all I’m asking for. (The committees that reviewed your papers should have called you out on this term, too, but this just further illustrates — though no illustration was necessary — the total linguistic blindness of the CS ed community.)

                And please don’t justify that mistake (which you are now rolling back, thanks) by further saying “well, we can make overly broad claims, it’s up to others to prove us wrong”. You of all people know better than to say that, and given that you are one of the people who sets standards for the CS ed community, you should be especially careful about setting such an irresponsible standard (admittedly, only in a comment on a blog post).

                • 8. Mark Guzdial  |  September 4, 2016 at 6:15 am

                  By providing more explanation, I got lost. I’m not sure that we’re talking about the same things anymore, Shriram.

                  By language independence, we mean that a student could study language X and take SCS1 because SCS1 is based in a pseudocode. That alone is a contribution because other validated tests for CS1 knowledge (e.g., AP CS, Adrienne Decker’s exam) were language-dependent (both were in Java and asked Java-specific questions). We have tested this for a subset of languages X. We have not confirmed that students who study any language X can be successful at the SCS1, but I think that’s not a well-formed question. A student could be successful at the SCS1 for some language Y, even a non-imperative language Y, if they have a good teacher, or they learn Y so well that they can transfer their knowledge to the SCS1 pseudocode, or they have high intelligence, etc.

                  I thought that that’s what you were getting at: for what languages Y are students unprepared for taking the SCS1? I don’t know of any, but we’re gathering data on that. I do welcome others to come up with a strong definition of CS1 learning that specifies the requisite features of the learning context and language for students to succeed at learning the concepts of CS1 covered in SCS1. Are you suggesting that there are languages Y that *are* used for learning CS1 (as commonly defined) that *can’t* lead to learning SCS1 concepts?

                  By your response, I think (but I’m not sure) you’re suggesting that there are language features (like logic programming where “X=1” means “X=2” can’t be true?) that aren’t covered in SCS1. That’s totally true, but that’s not part of the language independence argument for SCS1. We already defined the small “least common subset” of CS1 knowledge that SCS1 covered. We’re not covering all languages that have been taught for CS1. We’re covering the parts of CS1 that are common in most CS1’s, and yes, that’s mostly imperative. That’s part of defining the content area that the test covers.

                  I separate the language learned by the student, the language of the test, and the concepts learned by the student as three separate things. I don’t buy a Whorfian hypothesis that students can only learn the concepts in the language. Even if students only use immutable data, they might be able to imagine mutable data and might be able to reason through SCS1 successfully. Again, teacher and curriculum play a role here.

                  The shallow/deep parse error is confusing me — I don’t understand how that influences whether SCS1 is language independent. Sorry.

  • 9. Peter Donaldson  |  September 2, 2016 at 12:31 pm

    I’d agree with shriramkrishnamurthi about the dangers of labelling it as language-independent. When I read the FCS1 papers and looked at the code snippets, they had precise and well-defined semantics, but the keywords were optimised for readability. What was interesting was the level of comparability between this language and the programming languages the students had been taught in. At the time that suggested that the concepts assessed probably had very similar semantics in several different languages. I’m not sure that the same results could be achieved for other concepts where there is much greater variation in how they are realised in particular languages.

    The original paper referred to the language-neutral questions as pseudocode, but in the paper “Code or (not Code) – Separating Formal and Natural Language in CS Education” I helped to make the case that pseudocode is an overloaded term with two very distinct meanings and uses. One is as a precisely defined reference language that is designed to be easier to read than some forms of programming syntax; the other is as a halfway step between the real-world problem and its implementation as an executable program. Algorithm textbooks in particular define a notation (sometimes influenced by mathematics) that they then use for demonstrating the implementation of particular algorithms in a “language-independent” manner. For any line there is only one precisely defined meaning for how the instruction will be executed, which sounds more like a high-level programming language without a working translator to me. That isn’t always the case when designing programs. An English-like line such as “Calculate the area using the height and width of the room” could be implemented as a function call or as a simple arithmetic expression (see the sketch below).
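    As a minimal sketch of that ambiguity (all names and numbers are invented for illustration; Haskell is used only for concreteness):

    ```haskell
    -- Two equally valid implementations of the English-like design line
    -- "Calculate the area using the height and width of the room".

    roomHeight, roomWidth :: Double
    roomHeight = 2.5
    roomWidth  = 4.0

    -- Reading 1: a simple arithmetic expression, written inline.
    areaInline :: Double
    areaInline = roomHeight * roomWidth

    -- Reading 2: a call to a separately defined function.
    calcArea :: Double -> Double -> Double
    calcArea h w = h * w

    areaViaCall :: Double
    areaViaCall = calcArea roomHeight roomWidth

    main :: IO ()
    main = print (areaInline, areaViaCall)   -- both readings yield 10.0
    ```

    A reference language in the first sense would pin down exactly one of these readings per line; design-level pseudocode deliberately leaves the choice open.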

    Greg Michaelson, Quintin Cutts, and Richard Connor found the SQA reference language relatively easy to design for testing core structured programming concepts, but it became much more challenging when it was extended to handle modular code and object orientation. When you start digging, there is a lot more variation in scoping rules and in how subprograms and objects are defined and created than there is between forms of fixed and conditional repetition.

  • 10. gasstationwithoutpumps  |  September 4, 2016 at 2:05 pm

    I think that shriramkrishnamurthi and Mark are talking past each other, because they have different definitions of “language-independent”.

    Mark is looking at syntax independence—all the languages tested have fundamentally similar semantics (at the level being tested), but somewhat different syntax. The test appears to be reasonably good for testing understanding of that semantics without being distracted by slight syntax variations.

    shriramkrishnamurthi pointed out that the concepts themselves are not universal to all programming languages, and that there are first courses in programming based in languages with different semantics, in which the questions don’t make sense. Claiming “language independence” is a stronger claim than “syntax independence”, and shriramkrishnamurthi rightly feels that this stronger claim cannot be met by the test.

    Reducing the claim to “syntax independence” or reducing the scope to “language-independent testing of imperative programming concepts” would probably be a fair statement.

    • 11. Mark Guzdial  |  September 4, 2016 at 6:08 pm

      Thanks very much, Kevin! That does help. Key for me is “there are first courses in programming based in languages with different semantics.” We’re building a least-common-denominator CS1 test. The LCD CS1 is all imperative. We are testing understanding of semantics at the definition, recognition, and use levels. We can’t test for everything — the test would be huge. As it is, it’s 28 questions and takes about an hour to complete. The semantics are given for us. We are language-independent for the most common CS1 languages. There are less-common languages we hope to test.

  • 12. News Roundup [September 9, 2016] | No Shelf Required  |  September 9, 2016 at 10:17 am

    […] Preview ICER 2016:Ebooks Design-Based Research & Replications in Assessment & Cognitive Load Studies,by @guzdial (Computing Education Blog) […]

  • […] mentioned in this blog previously that Briana Morrison and Lauren Margulieux had a replication study (see paper here), written with […]

  • […] Miranda Parker, and I will present our ebooks. I blogged about our ICER 2016 paper on ebooks here and our WiPSCE 2015 paper […]

  • […] Our ebooks run well on the Fire HD 8 tablet. I can program Python in our ebook using the tablet. Our approach in the ebooks emphasizes modification to existing programs, not just coding from scratch. Tweaking text works fine on the tablet. […]

  • […] has been building an ebook (like the ones we’ve been making for AP CSP, as mentioned here and here) for students studying Advanced Placement (AP) CS Level A. We wanted to write a blog post about it, […]

