Measuring progress on CS learning trajectories at the earliest stages

I’ve written in this blog (and talked about many times) how I admire and build upon the work of Katie Rich, Diana Franklin, and colleagues in the Learning Trajectories for Everyday Computing project at the University of Chicago (see blog posts here and here). They define the sequence of concepts and goals that K-8 students need to be able to write programs consisting of sequential statements, to write programs that contain iteration, and to debug programs. While they ground their work in K-8 literature and empirical work, I believe that their trajectories apply to all students learning to program.

Here are some of the skills that appear in the early stages of their trajectories:

  • Precision and completeness are important when writing instructions in advance. 
  • Different sets of instructions can produce the same outcome. 
  • Programs are made by assembling instructions from a limited set. 
  • Some tasks involve repeating actions. 
  • Programs use conditions to end loops.  
  • Outcomes can be used to decide whether or not there are errors.
  • Reproducing a bug can help find and fix it.
  • Step-by-step execution of instructions can help find and fix errors.

These feel fundamental and necessary — that you have to learn all of these to progress in programming. But it’s pretty clear that that’s not true. As I describe in my SIGCSE keynote talk (the relevant 4 minute segment is here), there is lots of valuable programming that doesn’t require all of these. For example, most students programming in Scratch don’t use conditions to end loops — still, millions of students find expressive power in Scratch. The Bootstrap: Algebra curriculum doesn’t have students write their own iteration at all — but they learn algebra, which means that there is learning power in even a subset of this list.

What I find most fascinating about this list is the evidence that CS students older than K-8 do not have all these concepts. One of my favorite papers at Koli Calling last year was It’s like computers speak a different language: Beginning Students’ Conceptions of Computer Science (see ACM DL link here — free downloads through June 30). The authors interviewed 14 university students about what they thought computer science was about. One of the explanations they labeled the “Interpreter.” Here’s an example quote exemplifying this perspective:

It’s like computers speak a different language. That’s how I always imagined it. Because I never understood exactly what was happening. I only saw what was happening. It’s like, for example, two people talking and suddenly one of them makes a somersault and the other doesn’t know why. And then I just learn the language to understand why he did the somersault. And so it was with the computers. 

This student finds the behavior of computers difficult to understand. They just do somersaults, and computer science is about coming to understand why they do somersaults? This doesn’t convey to me the belief that outcomes are completely and deterministically specified by the program.

I’ll write in June about Katie Cunningham’s paper to appear next month at the International Conference of the Learning Sciences. The short form is that she asked Data Science students at University to trace through a program. Two students refused, saying that they never traced code. They did not believe that “Step-by-step execution of instructions can help find and fix errors.” And yet, they were successful data science students.

You may not agree that these two examples (the Koli paper and Katie’s work) demonstrate that some University students do not have all the early concepts listed above, but that possibility brings us to the question that I’m really interested in: How would we know?

How can we assess whether students have these early concepts in the trajectories for learning programming? Just writing programs isn’t enough. 

  • How often do we ask students to write the same thing two ways? Do students realize that this is possible?
  • Students may realize that programming languages are “finicky” but may not realize that programming is about “precision and completeness.” 
  • Students re-run programs all the time (most often with no changes to the code in between!), but that’s not the same as seeing a value in reproducing a bug to help find and fix it. I have heard many students exclaim, “Okay, that bug went away — let’s turn it in.” (Or maybe that’s just a memory from when I said it as a student…)

These concepts really get at fundamental issues of transfer and plugged vs unplugged computing education. I bet that if students learn these concepts, the concepts would transfer. They address what Roy Pea called “language-independent bugs” in programming. If a student understands these ideas about the nature of programs and programming, they will likely recognize that those are true in any programming language. That’s a testable hypothesis. Is it even possible to learn these concepts in unplugged forms? Will students believe you about the nature of programs and programming if they never program?

I find questions like these much more interesting than trying to assess computational thinking. We can’t agree on what computational thinking is. We can’t agree on the value of computational thinking. Programming is an important skill, and these are the concepts that lead to success in programming. Let’s figure out how to assess these.

May 25, 2020 at 7:00 am 14 comments

Thought Experiments on Why Face-to-Face Teaching Beats On-Line Teaching: We are Humans, not Econs

With everything moving on-line, I’m seeing more discussion about whether this on-line life might just be better. Amy Ko recently blogged (see post here) about how virtual conferences are cheaper, more accessible, and have a lower carbon footprint than face-to-face conferences, ending with her conclusion that “it is hard to make the case to continue meeting in person.” My colleague, Sarita Yardi, has been tweeting about her exploration of “medium-independent classes,” where she considers (see tweet here), “Trying to use the block of class time just because that’s how we’ve always taught seems like something to revisit. Less synchronous time; support short, frequent individual/small group interaction, less class time.”

It’s hard to do on-line education well. I used to study this kind of learning a lot (see post on “What I have learned about on-line collaborative learning”). I recently wrote about how we’re mostly doing emergency remote teaching, not effective on-line learning (see post here). I am concerned that moving our classes on-line will most hurt the students who most need our help (see post here).

It should come as no surprise then that I don’t think we know how to do on-line teaching or on-line conferences in a way that is anywhere close to the effectiveness of face-to-face learning. I agree with both Amy’s and Sarita’s points; here, I’m only focusing on learning outcomes.

Let me offer a thought experiment on why face-to-face matters. How often do you…

  • Look at the movie trailer and not watch the movie.
  • Watch the first few minutes of a show on Netflix but never finish it.
  • Start a book and give up on it.
  • Start watching a YouTube video and immediately close it or click away.

Now contrast that with: How often do you…

  • Get up from a one-on-one meeting and walk out mid-discussion.
  • Get up in the middle of a small group discussion and leave.
  • Walk out of a class during a lecture.
  • Walk out of a conference session while the speaker is still presenting (not between talks or during Q&A).

For some people, the answers to the first set are like the answers for the second set. I tried this thought experiment on my family, and my wife pointed out that she finishes every book she starts. But for most people, the first set is much more likely to happen than the second set. This is particularly hard for professors and teachers to recognize. We are good at self-regulated learning. We liked school. We don’t understand as well the people who aren’t like us.

There are a lot of people who don’t really like school. There are good reasons for members of minority groups to distrust or dislike school. Most people engage in higher education for the economic benefit. That means that they place a huge value on the reward at the end, but they don’t particularly want to go through the process. We have to incentivize them to be part of the process.

Yes, of course, many students skip classes. Some students skip many classes. But the odds are still in favor of the face-to-face classes. If you are signed up for a face-to-face class, you are much more likely to show up for that class than for any lecture that is totally free and absolutely relevant to your interests, on-campus or on-line. Enrolling in a course is a nudge.

For most people, walking away from an asynchronous, impersonal event is much easier than walking away from a face-to-face, personal event. The odds of you learning from face-to-face learning are much higher simply because you are more likely to show up and less likely to walk out. It’s a great design challenge to make on-line learning opportunities just as compelling and “sticky” as face-to-face learning. We’re not there yet.

I would be all in favor of efforts to teach people to be more self-regulated. It would be great if we all were better at learning from books, lectures, and on-line resources. But we’re not. The learners with the best preparation are likely the most privileged students. They were the ones who were taught how to learn well, how to learn from school, and how to enjoy school.

Here’s a second thought experiment, for people who work at Universities. At any University, there are many interesting talks happening every week. For me, at least a couple of those talks each week are by faculty candidates, and I am highly encouraged to attend those. Now, they’re all on-line. How many of those did you attend when they were face-to-face, and how many do you attend on-line? My guess is that both are small numbers, but I’ll bet that the face-to-face number is at least double the on-line number. Other people see that you’re there face-to-face. There are snacks and people to visit with face-to-face. The incentives are far fewer on-line.

On-line learning is unlikely to ever be as effective as face-to-face learning. Yes, we can design great on-line learning, but we do that fighting against how most humans learn most things. Studies that show on-line learning to be as effective as (or even more effective than) face-to-face classes are holding all other variables equal. But holding all other variables equal takes real effort! To get people to show up just as much, to give people as much (or more) feedback, and to make sure that the demographics of the class stay the same on-line or face-to-face — that takes significant effort, which is invisible in studies that simply compare face-to-face vs on-line. The reality is that education is an economic endeavor. Yes, you can get similar learning outcomes, at a pretty high cost. At exactly the same cost, you’re unlikely to get the same learning outcomes.

We are wired to show up and learn from face-to-face events. I would love for all of us to be better self-regulated learners, to be better at learning from books and from lectures. But we’re not Econs, we’re Humans (to use the Richard Thaler distinction). We need incentives. We need prompts to reflect, like peer instruction. We need to see and be seen, and not just through a small box on a 2-D screen.

May 11, 2020 at 7:00 am 19 comments

SIGCSE 2020: Papers freely available, AP CSA over AP CSP for diversifying computing, and a tour of computing ed research in one hour

My Blog@CACM post for this month was about my first stop on my tour of SIGCSE 2020 papers (see link here). While the SIGCSE 2020 conference was cancelled, the papers are freely available now through the end of June — see all the proceedings here. I’ve started going through the proceedings myself. The obvious place to start such a tour is with the award-winning papers. My Blog@CACM post is on the paper from An Nguyen and Colleen M. Lewis of Harvey Mudd College on the negative impact of competitive enrollment policies (having students apply to get into CS, or requiring a higher-than-just-passing GPA to get into the computing major) on students’ sense of belonging, self-efficacy, and perceptions of the department.

I said that this was the first stop on my tour, but that’s not really true. I’d already looked up the paper Does AP CS Principles Broaden Participation in Computing?: An Analysis of APCSA and APCSP Participants (see link here), because I’d heard about it from co-author Joanna Goode. I was eager to see the result. They show that AP CS Principles is effectively recruiting a much more diverse set of students than the AP CS A course (which is mostly focused on Java programming). But AP CS A students end up with more confidence in computing and much more interest in computing majors and tech careers. Maybe CS A students had more interest to begin with — there is likely some selection bias here. This result reminds me of the Weston et al. result (see blog post here) showing that the female high school students they studied continued on to tech and computing majors and careers if they had programming classes.

I’ve been reading The Model of Domain Learning: Understanding the Development of Expertise (see Amazon link) which offers one explanation of what’s going on here. Pat Alexander’s Model of Domain Learning points out that domain knowledge is necessary to have sustained interest in a domain. You can draw students in with situational interest (having activities that are exciting and engage novices), but you only get sustained interest if they also learn enough about the domain. Maybe AP CSP has more situational interest, but doesn’t provide enough of the domain knowledge (like programming) that leads to continued success in computing.

In my SIGCSE 2020 Preview blog post (posted just two days before the conference was cancelled), I mentioned the cool session that Colleen Lewis was organizing, where she was going to get 25 authors to present the entire 700+ page Cambridge Handbook of Computing Education Research in 75 minutes. Unfortunately, that display of organizational magic didn’t occur. However, in a demonstration of approximately the same level of organizational magic, Colleen got the authors to submit videos, and she compiled a 55 minute version (which is still shorter than reading the entire tome) — see it on YouTube here.

There are lots of other great papers in the proceedings that I’m eager to get into. A couple that are high on my list:

  • Dual-Modality Instruction and Learning: A Case Study in CS1 from Jeremiah Blanchard, Christina Gardner-McCune, and Lisa Anthony from University of Florida, which provides evidence that a blocks-based version of Java leads to more and deeper learning on the same assessments as compared with students learning with textual Java (see link here).
  • Design Principles behind Beauty and Joy of Computing by Paul Goldenberg and others. I love design principles papers, because they explain why the authors and developers were doing what they were doing. I have been reading Paul since back in the Logo days. I’m eager to read his treatment of how BJC works (see link here).

Please share in the comments your favorite papers with links to them.

May 4, 2020 at 7:00 am 3 comments

Data science as a path to integrate computing into K-12 schools and achieve CS for All

My colleague Betsy DiSalvo is part of the team that just released Beats Empire, an educational game for assessing what students understand about middle school computer science and data science (https://info.beatsempire.org). The game was designed by researchers from Teachers College, Columbia University; Georgia Tech; University of Wisconsin, Madison; SRI International; Digital Promise; and Filament Games, in concert with the NYC Dept. of Education. Beats Empire is totally free; it has already won game design awards, and it is currently in use by thousands of students. Jeremy Roschelle was a consultant on the game, and he just wrote a CACM Blog post about the reasoning behind the game (see link here).

Beats Empire is an example of an important development in the effort to help more students get the opportunity to participate in computing education. Few students are taking CS classes, even when they’re offered — less than 5% in every state for which I’ve seen data (see blog post here). If we want students to see and use computing, we’ll need to put computing into other classes. Data science fits in well with other classes, especially social studies classes. Bootstrap: Data Science (see link here) is another example of a computing-rich data science curriculum that could fit into a social studies class.

Social studies is where we can reach the more diverse student populations who are not in our CS classes. I’ve written here about my work developing data visualization tools for history classes. For a recent NSF proposal, I looked up the exam participation in the two Advanced Placement exams in computer science (CS Principles and CS A) vs the two AP exams in history (US history and World history). AP CS Principles was 32% female, and AP CS A was 24% female in 2019. In contrast, AP US History was 55% female and AP World History was 56% female. Five times as many Black students took the AP US History exam as took the AP CS Principles exam. Fourteen times as many Hispanic students took the AP US History exam as took the AP CS Principles exam.

Data science may be key to providing CS for All in schools.

April 27, 2020 at 7:00 am Leave a comment

Active learning has differential benefits for underserved students

We have had evidence that active learning teaching methods have more benefit for underserved populations than for majority groups (for example, I discussed the differential impact of active learning here). Just published in March in the Proceedings of the National Academy of Science is a meta-analysis of over 40 studies giving us the strongest argument yet: “Active learning narrows achievement gaps for underrepresented students in undergraduate science, technology, engineering, and math” at https://www.pnas.org/content/117/12/6476. I’ll remind everyone that a terrific resource for peer instruction in computer science is here: http://peerinstruction4cs.com/

Achievement gaps increase income inequality and decrease workplace diversity by contributing to the attrition of underrepresented students from science, technology, engineering, and mathematics (STEM) majors. We collected data on exam scores and failure rates in a wide array of STEM courses that had been taught by the same instructor via both traditional lecturing and active learning, and analyzed how the change in teaching approach impacted underrepresented minority and low-income students. On average, active learning reduced achievement gaps in exam scores and passing rates. Active learning benefits all students but offers disproportionate benefits for individuals from underrepresented groups. Widespread implementation of high-quality active learning can help reduce or eliminate achievement gaps in STEM courses and promote equity in higher education.

April 20, 2020 at 7:00 am 5 comments

Checking our hubris with checklists: Learning a lesson from the XO Laptop

My Blog@CACM blog post for February was on Morgan Ames’ book The Charisma Machine (see post here). The book is well-written, and I do recommend it. In the post, I say that the OLPC opposition to HCI design practices is one of the themes in her book that I found most interesting:

It takes humility to design software that humans will use successfully. The human-computer interaction (HCI) community has developed a rich set of methods for figuring out what users need and might use, and for evaluating the potential of a new interface. To use these methods requires us to recognize our limitations — that we are unlikely to get the design right the first time and that our users know things that we don’t.

How do we get developers to have that humility? There are a lot of rewards for hubris. Making big promises that you probably can’t keep is one way to get grant and VC funding.

I just finished Atul Gawande’s The Checklist Manifesto (which I already blogged about here, before I even read it). It’s a short book which I highly recommend. I hadn’t realized before how much Gawande’s story overlaps with the OLPC story — or rather, how much it doesn’t but should have. Gawande is a surgeon. He came to the idea of checklists because of their success in reducing costs and improving patient outcomes in medicine. There, too, they had to deal with physician hubris: many physicians saw the checklists as busywork. As one physician said in opposition to checklists, “Forget the paperwork. Take care of the patient.”

The OLPC project couldn’t be bothered with user studies or pilot studies. They wanted to airdrop tablets into Ethiopia. They were so confident that they were going to (in Negroponte’s words) “eliminate poverty, create peace, and work on the environment.” They couldn’t be bothered with the details. They were taking care of the patient!

Gawande points out that checklists aren’t needed because physicians are dumb, but because they know SO much. We’re humans and not Econs. Our attention gets drawn this way or that. We forget about or skip a detail. Our knowledge and systems are so complex. Checklists help us to manage all the details.

We need checklists to check our hubris. We have confidence that we can build technology that changes users’ lives. The reality is that the odds are slim that we can have impact without going through an HCI design process, e.g., know the user, test often, and iterate. The OLPC Project could have used an HCI checklist.

The second-to-last chapter in Gawande’s Checklist Manifesto captures well the idea that we need checklists:

We are all plagued by failures—by missed subtleties, overlooked knowledge, and outright errors. For the most part, we have imagined that little can be done beyond working harder and harder to catch the problems and clean up after them. We are not in the habit of thinking the way the army pilots did as they looked upon their shiny new Model 299 bomber—a machine so complex no one was sure human beings could fly it. They too could have decided just to “try harder” or to dismiss a crash as the failings of a “weak” pilot. Instead they chose to accept their fallibilities. They recognized the simplicity and power of using a checklist. And so can we. Indeed, against the complexity of the world, we must. There is no other choice. When we look closely, we recognize the same balls being dropped over and over, even by those of great ability and determination. We know the patterns. We see the costs. It’s time to try something else. Try a checklist.

April 13, 2020 at 7:00 am 14 comments

How I’m lecturing during emergency remote teaching

Alfred Thompson (whom most of my readers already know) has a recent blog post requesting: Please blog about your emergency remote teaching (see post here). Alfred is right. We ought to be talking about what we’re doing and sharing our practices, so we get better at it. Reflecting and sharing our teaching practices is a terrific way to improve CS teaching, which Josh Tenenberg and Sally Fincher told us about in their Disciplinary Commons.

My CACM Blog Post this month is on our contingency plan that we created to give students an “out” in case they become ill or just can’t continue with the class — see post here. I encourage all my readers who are CS teachers to create such a contingency plan and make it explicit to your students.

I am writing to tell you what I’m doing in my lectures with my co-instructor Sai R. Gouravajhala. I can’t argue that this is a “best” practice. This stuff is hard. Eugene Wallingford has been blogging about his emergency remote teaching practice (see post here). The Chronicle of Higher Education recently ran an article about how difficult it is to teach via online video like Zoom or BlueJeans (see article here). We’re all being forced into this situation with little preparation. We just deal with it based on our goals for our teaching practice.

For me, keeping peer instruction was my top priority. I use the recommended peer instruction (PI) protocol from Eric Mazur’s group at Harvard, as was taught to me by Beth Simon, Leo Porter, and Cynthia Lee (see http://peerinstruction4cs.com/): I pose a question for everybody, then I encourage class discussion, then I pose the question again and ask for consensus answers. I use participation in that second question (typically gathered via app or clicker device) towards a participation grade in the class — not correct/incorrect, just participating. 

My plan was to do all of this in a synchronous lecture with Google Forms, based on a great recommendation from Chinmay Kulkarni. I would have a Google Form that everyone answered, then I’d encourage discussion. Students are working on team projects, and we have a campus license for Microsoft Teams, so I encouraged students to set that up before lecture and discuss with their teams. On a second Google Form with the same question, I also collect their email addresses. I wrote a script to give them participation credit if I get their email address at least once during the class PI questions.
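
The script itself is nothing fancy. Here is a sketch of the kind of thing I mean (not the actual script; the file names and column header are made up for illustration): it reads the CSV exports from the Google Forms and credits any roster email address that shows up at least once.

    # Sketch of a participation-credit script (illustrative, not the real one).
    # It reads CSV exports from the second-vote Google Forms and gives credit
    # to any student whose email address appears at least once.
    import csv
    import glob

    EMAIL_COLUMN = "Email Address"   # assumed column header in the Form export

    seen = set()
    for path in glob.glob("pi_responses/*.csv"):        # assumed export folder
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                email = row.get(EMAIL_COLUMN, "").strip().lower()
                if email:
                    seen.add(email)

    with open("roster.csv", newline="") as f:           # assumed class roster
        for row in csv.DictReader(f):
            email = row["email"].strip().lower()
            print(email + "," + ("1" if email in seen else "0"))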

Then the day before my first lecture, I was convinced on Twitter by David Feldon and Justin Reich that I should provide an asynchronous option (see thread here). I know that I have students who are back home overseas and are not in my timezone. They need to be able to watch the video at another time. I now know that I have students with little Internet access. So, I do all the same things, but I record the lecture and I leave the Google Forms open for 24 hours after the last class. The links to the Google Forms are in the posted slides and in the recorded lectures. To fill out the PI questions for participation, they would have to at least look at the lecture.

I’m so glad that I did. As I tweeted, I had 188 responses to the PI questions after the lectures ended. 24 hours later, I had 233 responses. About 20% of my students didn’t get the synchronous lecture, but still got some opportunity to learn through the asynchronous component. The numbers have been similar for every lecture since that first one.

I lecture, but typically only for 10-15 minutes between questions. I have 4-5 questions in an 85 minute lecture. The questions take longer now. I can’t just move the lecture along when most of the students answer, as I could with clickers. I typically give the 130+ students 90 seconds to get the link entered and answer the question. 

I have wondered if I should just go to a fully asynchronous lecture, so I asked my students via a PI question. 85% say that they want to see the lecturer in the video. They like that I can respond to chat and to answers in Google Forms. (I appreciate how Google Forms lets me see a summary of answers in real-time, so that I can respond to answers.) I’d love to have a real, synchronous give-and-take discussion, but my class is just too big. I typically get 130+ students synchronously participating in a lecture. It’s hard to have that many students participate in the chat, let alone see video streams for all of them.

We’re down to the last week of lecture, then we’ll have presentations of their final projects. They will prepare videos of their presentations, and receive peer comments. Each student has been assigned four teams to provide peer feedback on. Each team has a Google Doc to collect feedback on their project.

So, that’s my practice. In the comments, I’d welcome advice on improving the practice (though I do hope not to have to do this again anytime soon!), and your description of your practice. Let’s share.

April 6, 2020 at 7:00 am 5 comments

So much to learn about emergency remote teaching, but so little to claim about online learning

The Chronicle of Higher Education published an article by Jonathan Zimmerman on March 10 arguing that we should use the dramatic shift to online classes due to the COVID-19 pandemic as an opportunity to research online learning (see article here).

For the first time, entire student bodies have been compelled to take all of their classes online. So we can examine how they perform in these courses compared to the face-to-face kind, without worrying about the bias of self-selection.

It might be hard to get good data if the online instruction only lasts a few weeks. But at institutions that have moved to online-only for the rest of the semester, we should be able to measure how much students learn in that medium compared to the face-to-face instruction they received earlier.

To be sure, the abrupt and rushed shift to a new format might not make these courses representative of online instruction as a whole. And we also have to remember that many faculty members will be teaching online for the first time, so they’ll probably be less skilled than professors who have more experience with the medium. But these are the kinds of problems that a good social scientist can solve.

I strongly disagree with Zimmerman’s argument. There is a lot to study here. There is little to claim about online learning.

What we are doing right now is not even close to best practice for online learning. I recommend John Daniel’s book Mega-Universities (Amazon link). One of his analyses contrasts online learning structured as a “correspondence school” (e.g., send out high-quality materials, require student work, provide structured feedback) with online learning structured as a “remote classroom” (e.g., video-record lectures, replicate in-classroom structures). Remote classrooms tend to have lower retention, and their costs increase as the number of students grows. Correspondence-school models are expensive (in money and time) to produce, but they scale well and have low per-student costs at large numbers. What we’re doing is much closer to remote classrooms than correspondence school. Experience with MOOCs supports this analysis: doing online learning well takes time, is expensive, and is carefully structured. It’s not thrown together with less than a week’s notice.

My first thought when I read Zimmerman’s essay was about the ethics of any experiment comparing the enforced move to online classes with face-to-face classes. Students and faculty did not choose to be part of this study. They are being forced into online classes. How can we possibly compare face-to-face classes that have been carefully designed with hastily-assembled online versions that nobody wants, at a time when the world is suffering a crisis? This isn’t a fair or ethical comparison.

Ian Milligan recommends that we change our language to avoid these kinds of comparisons, and I agree. He writes (see link here) that we should stop calling this “online learning” and instead call it “emergency remote teaching.” Nobody would compare “business as usual” to an “emergency response” in terms of learning outcomes, efficiency, student satisfaction, and development of confidence and self-efficacy.

On the other hand, I do hope that education researchers, e.g., ethnographers, are tracking what happens. This is a first-of-its-kind event: moving classes online with little notice. We should watch what happens. We should track, reflect, and learn about the experience.

But we shouldn’t make claims about online learning. There is no experiment here. There is a crisis, and we are all trying to do our best under the circumstances.

March 30, 2020 at 10:20 am 7 comments

What I learned from taking a MOOC: Live Object Programming in Pharo

I wrote this post a month ago, before COVID-19 changed how a great many of us teach in higher education. It feels so long ago now. I thought about writing a different post for this week, one about how I’m managing my large (260+) Senior-level User Interface Development class with projects. But I realize — I have a ton of those kinds of posts in my to-read queue now. We’re all being bombarded with advice on how to take our classes on-line. I can’t read it all. I’m sure that you can’t either.

So instead, I decided to move this post up in the queue. It’s about taking the students’ perspective. I worry about what’s going to happen to students as we all move into on-line modes. I wrote my Blog@CACM post this week about how the lowest-performing students are the ones who will be most hurt by the move to on-line — you can find that post here. This is a related story: What I learned about MOOCs by taking a MOOC.

I received in February my certificate of success for the MOOC I took on Pharo. I have not, in general, been a big fan of MOOCs (among many other posts, here’s one I wrote in 2018 about MOOCs and ethics). This MOOC was perfect for what I needed and wanted. But I’m still not generally a MOOC fan.

I’m a long-time Smalltalk programmer and have written or edited a couple of books about Squeak. I’m building software again at the University of Michigan (see the task-specific programming environments I’ve posted about). Pharo is a terrific, modern Smalltalk that I’d like to use.

A MOOC on Pharo matched what I needed. I fit the demographics of a student who succeeds at a MOOC — I already know a lot about the material, and I’m looking for specific pieces of information. Pharo has a test-driven development model that is remarkable. You define your classes, then start writing tests, and then you execute them. You can then build your system from the Debugger! You get prompts like, “You’re referencing the instance variable window here, but it doesn’t exist. Shall I create it for you?” I’ve never programmed like that before, and it was great to learn all the support Pharo has for that style of programming.
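
For readers who have not seen that style, here is a rough analogy in Python’s unittest (my sketch, and only an analogy: the remarkable part in Pharo is that the debugger itself offers to create the missing class or instance variable at the moment the test fails).

    # A rough analogy in Python: write the test first, watch it fail, then
    # build the class. In Pharo, the debugger offers to create the missing
    # class and instance variable for you when the failing test is run.
    import unittest

    class Window(object):
        def __init__(self, title):
            self.title = title      # in Pharo, created from a debugger prompt

    class TestWindow(unittest.TestCase):
        def test_window_keeps_its_title(self):
            w = Window("Untitled")
            self.assertEqual(w.title, "Untitled")

    if __name__ == "__main__":
        unittest.main()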

Yes, it was in French. They provide versions of the videos dubbed in English, and the French version can display English captions. I preferred the latter. I took French as an undergraduate, which means that I didn’t understand everything, but I caught occasional words, which was enough to synchronize between the video and the captions and figure out what was going on.

My favorite part of the MOOC was just watching the videos of Stéphane Ducasse programming. He’s a very expert Smalltalk programmer. It’s great seeing how he works and hearing him think aloud while he’s programming. But he’s very, very expert — there were things he did that I had to re-watch in slow motion to figure out, “Okay, how did he do that?”

The MOOC was better than just a set of videos. The exercises made sure I actually tried to think about what the videos were saying. But it’s clear that the exercises were not developed by assessment experts. There were lots of fill-in-the-blank questions like “Name the class that does X.” Who cares? I can always look that up. It’s a problem that the exercises were developed by Smalltalk experts. Some of the problems would be simple if you knew the right tool or the right option (e.g., “Which of the below is not a message that instances of the class Y understand?”), but I often couldn’t remember or find the right tool. Tools can fall into the experts’ blind spot. Good assessments should scaffold me in figuring out the answer (e.g., worked examples or subgoal labels).

I ran into one of the problems that MOOCs suffer — they’re really expensive to make and update. The Pharo MOOC was written for Pharo 6.0. Pharo 8.0 was just released. Not all the packages in the MOOC still work in 8.0, or there are updated versions that aren’t exactly the same as in the videos. There were things in the MOOC that I couldn’t do in modern Pharo. It’s hard and costly to keep a MOOC updated over time.

My opinions about MOOCs haven’t changed. They’re a great way for experienced people to get a bit more knowledge. That’s where the Georgia Tech OMSCS works. But I still think that they are a terrible way to help people who need initial knowledge, and they don’t help to broaden participation in computing.

March 23, 2020 at 7:00 am Leave a comment

How do we test the cultural assumptions of our assessments?

I’m teaching a course on user interface software development for about 260 students this semester. We just had a Midterm where I felt I bobbled one of the assessment questions because I made cultural assumptions. I’m wondering how I could have avoided that.

I’m a big fan of multiple choice, fill-in-the-blank, and Parsons problems on my assessments. I use my Parsons problem generator a lot (see link here). For example, on this midterm, students had to arrange the scrambled parts of an HTML file in order to achieve a given DOM tree, and there were two programs in JavaScript (using constructors and prototypes) that they had to unscramble.
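
For readers who have not seen one: a Parsons problem gives students the correct lines of a program in scrambled order, and their task is to arrange the lines into a working whole. Here is a tiny sketch of the scrambling step (illustration only, not the generator linked above), using a JavaScript constructor-and-prototype solution of the kind that was on the midterm.

    # Minimal sketch of the scrambling step in a Parsons problem generator
    # (illustration only, not the actual tool linked above).
    import random

    solution = [
        "function Account(owner) {",
        "  this.owner = owner;",
        "  this.balance = 0;",
        "}",
        "Account.prototype.deposit = function (amount) {",
        "  this.balance += amount;",
        "};",
    ]

    scrambled = solution[:]      # copy the solution lines, then shuffle them
    random.shuffle(scrambled)
    for line in scrambled:
        print(line)
    # The student's task: drag the lines back into an order that works.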

I typically ask some definitional questions about user interfaces at the start, about ideas like signifiers, affordances, learned associations, and metaphors. Like Dan Garcia (see his CS-Ed Podcast), I believe in starting out the exam with some easy things, to buoy confidence. They’re typically only worth a couple points, and I try to make the distractors fun. Here’s an example:

Since we watched in lecture a nice video starring Don Norman explaining “Norman doors,” I was pretty sure that anyone who actually attended lecture that day would know that the answer was the first one in the list. Still, maybe a half-dozen students chose the second item.

Here’s the one that bothered me much more.

I meant for the answer to be the first item on the list. In fact, almost the exact words were on the midterm exam review, so that students who studied the review guide would know immediately what we wanted. (I do know that working memory doesn’t actually store more for experts — I made a simplification to make the definition easier to keep in mind.)

Perhaps a dozen students chose the second item: “Familiarity breeds contempt. Experts contempt for their user interfaces allows them to use them without a sense of cognitive overload.” I had several students ask me during the exam, “What’s contempt?” I realized that many of my students didn’t know the word or the famous phrase (which dates back to Chaucer).

Then one student actually wrote on his exam, “I’m assuming that contempt means learned contentment.” If you make that assumption, the item doesn’t sound ridiculous: “Familiarity breeds learned contentment. Experts learned contentment for their user interfaces allows them to use them without a sense of cognitive overload.”

I had accidentally created an assessment that expected a particular cultural context. The midterm was developed over several weeks, and reviewed by my co-instructor, graduate student instructor, five undergraduate assistants, and three undergraduate graders. We’re a pretty diverse bunch. We had found and fixed perhaps a dozen errors in the exam during the development period. We’d never noted this problem.

I’m not sure how I could have avoided this mistake. How does one remain aware of one’s own cultural assumptions? I’m thinking of the McLuhan quote: “I don’t know who discovered water, but it wasn’t a fish.” I feel bad for the students who got this problem wrong because they didn’t know the quote or the meaning of the word “contempt.” What do you think? How might I have discovered the cultural assumptions in my assessment?

March 16, 2020 at 1:57 pm 15 comments

Ebooks, Handbooks, Strong Themes, and Undergraduate Research: SIGCSE 2020 Preview

A few items on things that we’re doing at SIGCSE 2020. Yes, SIGCSE 2020 is still holding a face-to-face meeting. Attendance looks to be down by at least 30% because of coronavirus fears.

Barbara Ericson and Brad Miller (who won’t be there) are presenting a paper on their amazingly successful Runestone open-source platform for publishing ebooks: Runestone: A Platform for Free, On-line, and Interactive Ebooks on Sat Mar 14, 2020 11:10 AM – 11:35 AM in D135. They are also hosting a workshop to help others develop with Runestone: Workshop #401: Using and Customizing Ebooks for Computing Courses with Runestone Interactive on Sat Mar 14, 2020 3:30 PM – 6:30 PM in C120.

I’m part of the massive special session on Thursday 1:45 PM – 3:00 PM in B113 that Colleen Lewis is organizing: Session 2H: The Cambridge Handbook of Computing Education Research Summarized in 75 minutes. Colleen, who must have done graduate work in organizational management (or perhaps cat herding), has organized 25 authors (!) to present the entire Handbook in a single session. Even if I wasn’t one of the presenters, I’d go just to see if we can all pull it off! It’s going to be kind of like watching NASCAR — you’re on the edge of your seat as everyone tries to avoid crashing into one another.

Bravo to Bob Sloan who got this panel accepted this year: Session 6K: CS + X Meets CS 1: Strongly Themed Intro Courses on Fri Mar 13, 2020 3:45 PM – 5:00 PM in Portland Ball Room 255. The panelists are teachers and developers who have put together contextualized introductions to computing, like Media Computation. The panelists have done interesting classes, and I’m eager to hear what they have to say about them.

I am collaborating with Sindhu Kutty on her interesting summer reading group to engage undergraduates in CS research. (Read as: we meet occasionally to work on assessment, but Sindhu is really doing all the work.) The evidence suggests that she’s able to give undergraduates a better understanding of CS graduate research, at a larger scale (e.g., a couple dozen students to one faculty member) than typical undergraduate research programs. It seems like it might feel a bit safer and easier for female students to try. She was going to present a poster at RESPECT on Wednesday, Undergraduate Student Research With Low Faculty Cost, but it’s now going to be virtual. I’m not sure how it’s going to work right now.

March 11, 2020 at 7:00 am 10 comments

Defining CS Ed out of existence: Have we made CS too hard to learn and teach?

It was this quote in a tweet from Miles Berry that really made me sit up and take notice of the latest news about the Computing at School initiative:

“If computing increasingly means CS, it looks likely that hundreds of thousands of students, particularly girls and poorer students, will be disenfranchised from a digital education over the next few years.”

He was quoting an article from the New Statesman, which can be found here. It describes the history of the rise of the CS curriculum in England. The key paragraph for me is:

The new curriculum was failing. While a tougher course had been introduced, few students were taking it and even fewer teachers could teach it. In many cases, even those who could felt uncomfortable doing so.

The government read the reports and has decided to respond. There’s now an enormous investment in England in trying to train new teachers. The question is whether that’s the right investment.

Meanwhile, in Scotland, the headline of this May 2019 article is “Teachers and students in decline: the computing ‘crisis’ in Scotland’s schools.”

Experts are urging the Scottish Government to take radical steps to boost computing science education to prevent the subject from being squeezed out of schools.

The teaching of computing in schools is in “crisis”, practitioners have told The Ferret, with classes shrinking and teachers in short supply. The latest official data shows that the number of children studying the subject declined last year, while the number of teachers has fallen over the last decade.

Despite a national focus on delivering science and technology education and economic development, schools are finding it increasingly difficult to teach computing science to young people, critics say.

Let’s explicitly consider the questions raised in these two articles. Have we defined CS education in such a way that it’s too hard to teach? That it’s not interesting to learn? Maybe that it’s too hard to learn?

I’ve been writing in the last few months about the surprisingly low uptake of CS education in the United States (for example, in this CACM Blog post). No more than 5% of high school students in any US state are getting any CS classes, from the data available. There is value in setting high standards for CS education (as Alan Kay has been arguing), but that’s an argument for the end goal. Where do we start with CS education? How quickly can and will students learn CS? What does it mean for something to be too hard to teach or too hard to learn?

Overall, the US is following a strategy similar to England’s and Scotland’s for computing in K-12: standalone CS classes, heavy emphasis on in-service teacher development, and counting the number of students in CS classes and the number of teachers leading those classes. There is integrated CS in the US, but as far as I know, no state is tracking those numbers. Public policy tends to focus on things that can be measured. Most of the argument against integration says that too little CS is covered in integrated forms. But 95% of US students getting no CS at all is even less coverage than CS in integrated forms.

Let’s consider two hypotheses:

Hypothesis #1: We know how to teach computer science in such a way that all students can learn what they need to be technically-literate citizens, or even to develop the prerequisite knowledge they need to be software professionals. We have not yet achieved this goal because we do not have enough teachers to implement the curriculum. Larger investments in teacher development (perhaps including stipends or better pay to CS teachers) would allow us to scale CS Ed to reach everyone.

Hypothesis #2: We have defined computer science education in a way that is too hard to teach (so too few teachers are willing to teach it), and that is too hard to learn (which includes not being motivating enough to recruit students or engage student interest in order to achieve learning).

Given the evidence we have in the US, England, and Scotland, which hypothesis is better supported? You may have a Hypothesis #3 or #4 which is also well-supported by the evidence — I am very interested in hearing it.

In general, we tend to take the “insider view” of CS Ed, as Kahneman warned about (see excerpt here). If you step outside CS Ed, are we making progress along a trajectory that leads to CS education for all? And how long is that trajectory? If you were an Education faculty member and learned that CS had less than 5% of US high school students enrolled, wouldn’t it be reasonable to consider it a fad and likely to pass?

As I wrote in my blog post about what I got wrong in the last decade, I no longer think that CS for All is a matter of access. We have to figure out how to improve participation. I’m in support of Hypothesis #2. We need to re-think what and how we teach CS education. Because of my work these days, I suspect that we made a mistake at the design level. I was involved in the early days of the AP CS Principles (AP CSP) process. Most of the AP CSP curricula I’m aware of were developed by and tested with some of the best CS teachers in the US. That design and development process doesn’t promise a curriculum that many teachers can teach and that most students will learn from.

I just got back from a three-day visit in Norway, where they are about to roll out an integration of CS activities (explicitly programming) into mathematics, science, music, and arts & crafts classes. (See a workshop about this topic here.) Maybe that will result in more students learning some computer science. Did the US, England, and Scotland make a mistake by emphasizing standalone CS classes over integration?

March 9, 2020 at 7:41 am 22 comments

Final (likely) version of JES released, 18 years after first release

JES 6.0 is now available at https://github.com/gatech-csl/jes/releases/tag/6.0. JES is the Jython Environment for Students — it’s a Python IDE implemented in Java, with support for Media Computation built in. It was a lot of work for a bunch of people. Here are the notes from the release as a summary and an acknowledgement of all the effort that brought this version to fruition.

This is likely the final version of JES, unless a Jython 3.0 is developed.

This version was brought to completion by Nigel Charleston, based on the beta work of Veronica Day and Audrey Zhang (see discussion at this blog post https://computinged.wordpress.com/2019/07/22/beta-release-of-new-jes-jython-environment-for-students-now-available-media-computation-for-python-ide/). Many thanks to R. Benjamin Shapiro for helping us with many technical questions.

JES 6.0 updates Jython to 2.7beta, uses the latest version of JMusic (from https://jythonmusic.me/), fixes many bugs, will run with Java 8, and creates a new facility to generate pictures from a collection of pixels and sounds from a collection of samples.
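
To give a flavor of what that last item is about, here is a classic Media Computation sketch in the style JES supports. It runs inside JES, where these functions are built in; I am showing the long-standing pixel-by-pixel approach rather than asserting the exact names of the new collection-based functions.

    # Classic JES Media Computation: build a picture by setting each pixel's
    # color. (This runs inside JES, where makeEmptyPicture, getPixels, etc.
    # are built in. JES 6.0's new facility builds a picture directly from a
    # collection of pixels; this sketch shows the long-standing approach.)
    pic = makeEmptyPicture(200, 100)
    for p in getPixels(pic):
        gray = getX(p) % 256                  # a simple left-to-right gradient
        setColor(p, makeColor(gray, gray, gray))
    show(pic)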

The Mac version is a little more complicated to run than usual. You will need to have Java 8 installed to run JES. Thanks to Brian Howard and Michael Stewart for helping to figure this out.

The rest of the Mac version installation instructions can be found at the release page.

JES was originally written by a team of Georgia Tech undergraduates taking Senior Design in Summer 2002. It’s been in use and (sporadic) development for almost 18 years now. The previous version of JES was downloaded over 71K times (see counts here). I would not have predicted in 2002 that JES would still be used in 2020, with little maintenance and no additional funding. Software has to be continually maintained, right? I claim no great genius behind the design. How did it happen that it’s still working and being used?

An even more interesting example is our Squeak-based Wikis (Swikis) which were first developed in 1997. Jeff Rick created the version that we used in classes, and wrote about the process in what I think is the first ACM publication on wikis in 2000. Even after he graduated in 2007, they just kept going. The server http://coweb.cc.gatech.edu/ is still running today — I can find at least one Swiki there dating from 2002. I’ve patched the Swiki software only once or twice since Jeff graduated. Jeff did a great job designing Swiki, but I suspect that even he’d be surprised at how long they’ve run with essentially no maintenance.

What are the characteristics of educational technology that remains viable and usable (i.e., useful and actively used) with very little maintenance for well over a decade? Schools are under-resourced, as I talked about in the Thorndike vs Dewey blog post. It’s great to have educational software that just keeps going without maintenance. Maybe it’s a certain class of software that works like this. Is it that JES and Swiki do so little, such that they’re really just frameworks on which to hang others’ content? Maybe that’s why they’ve been able to keep going for so long?

Your thoughts would be welcome.

March 2, 2020 at 7:00 am 2 comments

BDSI – A New Validated Assessment for Basic Data Structures: Guest Blog Post from Leo Porter and colleagues

Leo Porter, Michael Clancy, Cynthia Lee, Soohyun Nam Liao, Cynthia Taylor, Kevin C. Webb, and Daniel Zingaro have developed a new concept inventory that they are making available to instructors and researchers. They have written this guest blog post to describe their new instrument and explain why you should use it. I’m grateful for their contribution!

We recently published a Concept Inventory for Basic Data Structures at ICER 2019 [1] and hope it will be of use to you in your classes and/or research.

The BDSI is a validated instrument to measure student knowledge of Basic Data Structure Concepts [1].  To validate the BDSI, we engaged faculty at a diverse set of institutions to decide on topics, help with question design, and ensure the questions are valued by instructors.  We also conducted over one hundred interviews with students in order to identify common misconceptions and to ensure students properly interpret the questions. Lastly, we ran pilots of the instrument at seven different institutions and performed a statistical evaluation of the instrument to ensure the questions are properly interpreted and discriminate between students’ abilities well.

What Our Assessment Measures

The BDSI measures student performance on Basic Data Structure concepts commonly found in a CS2 course.  To arrive at the topics and content of the exam, we worked with fifteen faculty at thirteen different institutions to ensure broad applicability.  The resulting topics on the CI include: Interfaces, Array-Based Lists, Linked-Lists, and Binary Search Trees. If you are curious about the learning goals or want more details on the process we used in arriving at these goals, please see our SIGCSE 2018 publication [2].

Why Validated Assessments are Great for Instructors

Suppose you want to know how well your students understand various topics in your CS2 course.  How could you figure out how much your students are learning relative to other schools? You could, perhaps, get a final exam from another school and use it in your class to compare results, but invariably, the final exam may not be a good fit.  Moreover, you may find flaws in some of the questions and wonder if students interpret them properly. Instead, you can use a validated assessment. The advantage of using a validated assessment is that there is general agreement that it measures what you want to measure and that it accurately measures student thinking.  As such, you can compare your findings to results from other schools that have used the instrument, to determine if your students are learning particular topics better or worse than cohorts at similar institutions.

Why Validated Assessments are Great for Researchers

As CS researchers, we often experiment with new ways to teach courses.  For example, many people use Media Computation or Peer Instruction (PI), two complementary pedagogical approaches developed over the past several decades.  It’s important to establish whether these changes are helping our students. Do more students pass? Do fewer students withdraw? Do more students continue studying CS?  Does it boost outcomes for under-represented groups? Answering these questions using a variety of courses can give us insight into whether what we do corresponds with our expectations.

One important question is: using our new approach, do students learn more than before?  Unfortunately, answering this is complicated by the lack of standardized, validated assessments.  If students score 5% higher on an exam when studying with PI vs. not studying with PI, all we know is that PI students did better on that exam.  But exams are designed by one instructor, for one course at one institution, not for the purposes of cross-institution, cross-cohort comparisons.  They are not validated. They do not take into account the perspectives of other CS experts. When students answer a question on an exam correctly, we assume that it’s because they know the material; when they answer incorrectly, we assume it’s because they don’t know the material.  But we don’t know: maybe the exam contains incidental cues that subtly influence how students respond.

A Concept Inventory (CI) solves these problems.  Its rigorous design process leads to an assessment that can be used across schools and cohorts, and can be used to validly compare teaching approaches.

How to Obtain the BDSI

The BDSI is available via the Google group.  If you’re interested in using it, please join the group and add a post with your name, institution, and how you plan to use the BDSI.

How to Use the BDSI

The BDSI is designed to be given as a post-test after students have completed the covered material.  Because the BDSI was validated as a full instrument, it is important to use the entire assessment, and not alter or remove any of the questions.  We ask that instructors not make copies of the assessment available to students after giving the BDSI, to try to avoid the questions becoming public.  We likewise recommend giving participation credit, but not correctness credit, to students for taking the BDSI, to avoid incentivizing cheating.  We have found giving the BDSI as part of a final review session, collecting the assessment from students, and then going over the answers to be a successful methodology for having students take it. 

Want to Learn More?

If you’re interested in learning more about how to build a CI, please come to our talk at SIGCSE 2020 (from 3:45-4:10pm on Thursday, March 12th) or read our paper [3].  If you are interested in learning more about how to use validated assessments, please come to our Birds of a Feather session on “Using Validated Assessments to Learn About Your Students” at SIGCSE 2020 (5:30-6:20pm on Thursday, March 12th) or our tutorial on using the BDSI at CCSC-SW 2020 (March 20-21).

References:

[1] Leo Porter, Daniel Zingaro, Soohyun Nam Liao, Cynthia Taylor, Kevin C. Webb, Cynthia Lee, and Michael Clancy. 2019. BDSI: A Validated Concept Inventory for Basic Data Structures. In Proceedings of the 2019 ACM Conference on International Computing Education Research (ICER ’19).

[2] Leo Porter, Daniel Zingaro, Cynthia Lee, Cynthia Taylor, Kevin C. Webb, and Michael Clancy. 2018. Developing Course-Level Learning Goals for Basic Data Structures in CS2. In Proceedings of the 49th ACM Technical Symposium on Computer Science Education (SIGCSE ’18).

[3] Cynthia Taylor, Michael Clancy, Kevin C. Webb, Daniel Zingaro, Cynthia Lee, and Leo Porter. 2020. The Practical Details of Building a CS Concept Inventory. In Proceedings of the 51st ACM Technical Symposium on Computer Science Education (SIGCSE ’20).

February 24, 2020 at 7:00 am Leave a comment
