Archive for August 19, 2010
Mobile technology to improve student retention
Georgia Gwinnett College (just up the road from us) is battling low student retention by giving smartphones to its faculty. The phone numbers are listed on the syllabi, and students are encouraged to contact their faculty for help. It’s a novel use of “educational” technology, but it places quite a burden on the faculty.
And so far, they say, it is working. The retention rate for returning sophomores at Georgia Gwinnett stands at 75 percent. That is about double the average rate for noncompetitive-admissions colleges in Georgia, according to Tom Mundie, dean of the school of science and technology at Georgia Gwinnett, and on par with many public institutions that have competitive admissions. In engagement surveys, Mundie says, students have reported “feeling that faculty care about and are accessible to them.”
A New Classroom for a New Kind of Computing Student: Brian Defends
Brian Dorn is defending his dissertation this week. For several years now, he has been studying graphic designers who program. He started by studying the information foraging behavior of the graphic designer in the wild. First, he characterized who they were and what kinds of coding they were doing (ICER 2006). Next, he studied one of the information sources that they frequented (Adobe’s Photoshop scripting repository) to figure out what kind of informational nourishment they were getting (VL/HCC 2007). He did careful assessment and interviews with graphic designers to figure out what they knew and what they wanted to learn (CHI 2010), and most recently (the paper I presented at ICER 2010 last week), he did an interview study to figure out why they won’t enter our classes to get the information they need.
At this point, Brian knew who his subjects were, what they were looking for, where they were looking, and why they wouldn’t go where he knew the information was. Now, there is a consortium of researchers studying end-user programmers, but for the most part, they’re coming at it from an HCI perspective. How do we make the tools better? Brian wanted to come at it from a learning perspective. How do we make the people better? How do we teach people where they are with what they need? Continuing my (now tired) metaphor, how can he vitamin-fortify (“Now with Vitamin CS!”) the places where they were foraging?
Brian built two different kinds of code repositories. In one, he had only code, just like the repositories that his designers were already using (like the one Adobe hosted). In the other, he provided real cases (based on the design that Mike Clancy and Marcia Linn created), and he included lots of conceptual information about computer science. He wanted to see if his graphic designers would like the cases just as much, would be just as effective at writing code, but would also learn something. If adding the CS content made it less pleasant or hurt their productivity, it wouldn’t get used.
He ran everyone through a task, where they had to write some code and answer some conceptual questions, using whatever resources they would normally use. Next, he split the pool into two groups, of roughly equal performance on code and concepts, and gave each group one version of “ScriptABLE” (his tool) — either the code repository form, or the case library form. They did an isomorphic task: About the same complexity, same kind of code to write, same kind of concepts to answer about.
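One way to picture a split into two groups "of roughly equal performance" is the sketch below. It is not Brian's actual procedure, which the post does not describe. It assumes a simple matched-pairs approach: rank participants by combined pretest score, pair up neighbors, and randomly send one member of each pair to each group. The names `matched_split` and `pool`, and all scores, are made up for illustration.

```python
import random

def matched_split(participants, seed=0):
    """Split participants into two groups with similar pretest performance.

    participants: list of (name, code_score, concept_score) tuples.
    Assumption: combined score (code + concepts) is the matching criterion.
    """
    rng = random.Random(seed)
    # Rank from strongest to weakest on the combined pretest score.
    ranked = sorted(participants, key=lambda p: p[1] + p[2], reverse=True)
    group_a, group_b = [], []
    # Walk down the ranking two at a time; neighbors have similar scores.
    for i in range(0, len(ranked), 2):
        pair = ranked[i:i + 2]
        rng.shuffle(pair)  # random assignment within each matched pair
        group_a.append(pair[0])
        if len(pair) == 2:  # an odd-sized pool leaves one unpaired person
            group_b.append(pair[1])
    return group_a, group_b

# Hypothetical pretest results: (name, code score, concept score).
pool = [
    ("d1", 9, 8), ("d2", 8, 8), ("d3", 7, 6), ("d4", 6, 6),
    ("d5", 5, 5), ("d6", 5, 4), ("d7", 3, 3), ("d8", 3, 2),
]

group_a, group_b = matched_split(pool, seed=1)

def total(group):
    return sum(code + concept for _, code, concept in group)

print("Repository group:", [name for name, _, _ in group_a], total(group_a))
print("Case library group:", [name for name, _, _ in group_b], total(group_b))
```

Because each pair differs by only one point in this toy pool, the two group totals can never differ by more than four points, whichever way the random assignment falls.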
Huge win: Each group liked their resources. No difference in code writing. Statistically significant better learning by the case library-using group.
There are lots of reasons to be excited by this work. First, it’s a study of a seriously non-STEM group of programmers. He has made computing education work with people who have mostly only studied art, with a disdain for computer science. Second, it’s an audience that is much more gender-balanced than most of STEM. Brian now has an approach that works well for increasing the computing knowledge of art-oriented, female professionals who are pretty darn hostile to normal CS classes. That’s quite an accomplishment. Brian’s work is very important for the CS10K effort, because (as I’ve argued previously) on-line learning is critical to achieve that goal.
For our field, it’s a whole new world for computing education. It’s about making things better for computing learning outside the classroom, with people who aren’t CS majors. We mostly look at classrooms, and mostly CS majors. There are many more non-CS majors interested in learning about computing, and most of them won’t enter our classrooms. Brian is showing us a new space for us to work, providing a process for studying our new “students” and new kinds of “classrooms,” and giving us an example of a successful first attempt. Brian has already started his new job as an Assistant Professor at the University of Hartford.
The First Multi-Lingual, Valid Measure of CS1 Knowledge: Allison Tew Defends
Allison Elliott Tew has been working for five years to figure out how we can compare different approaches to teaching CS1. As Alan Kay noted in his comments on my recent post on computing education research, there are lots of factors, like who is taking the class and what they’re doing in the class. But to make a fair comparison in terms of the inputs, we need a stable measure of the output. Allison made a pass in 2005, but became worried when she couldn’t replicate her results in later semesters. She decided that the problem was that we had no scientific tool that we could rely on to measure CS1 knowledge. We have had no reliable and valid way of measuring what students learn in CS1 that is independent of language or approach. Allison set out to create one.
Allison defends this week. She took a huge gamble: at the end of her dissertation work, she collected two multiple-choice exams from each of 952 subjects. If you get that wrong, you can’t really try again.
She doesn’t need to. She won.
Her dissertation had three main questions.
(1) How do you do this? All the standard educational assessment methods involve comparing new methods to old methods in order to validate them. How do you bootstrap a new test when one has never been created before? She developed a multi-step process for validating her exam, and she carefully defined the range of the test using a combination of text analysis and curriculum standards.
(2) Can you use pseudo-code to make the test language-independent? First, she developed 3 open-ended versions of her test in MATLAB, Python, and Java, then had subjects take those. By analyzing those, she was able to find three distractors (wrong answers) for every question that covered the top three wrong answers in each language — which by itself was pretty amazing. I wouldn’t have guessed that the same mistakes would be made in all three languages.
Then she developed her pseudo-code test. She ran subjects through two sessions (counter-balanced). In one session, they took the test in their “native” language (whatever their CS1 was in), and in another (a week later, to avoid learning effects), the pseudo-code version.
The pseudo-code and native language tests were strongly correlated. The social scientists say that, in this kind of comparison, a correlation r above 0.37 means the two can be considered the same test. She beat that for every language.
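For readers who want to see what that comparison involves, here is a minimal sketch of a Pearson correlation between paired scores, checked against the 0.37 cutoff quoted above. The scores are invented for illustration and are not Allison's data, and the helper name `pearson_r` is my own. Her actual analysis may have used a different statistic or tool.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)

# Made-up scores: each position is one subject who took both versions.
native_scores = [12, 15, 9, 20, 17, 11]
pseudo_scores = [8, 11, 7, 16, 10, 9]

THRESHOLD = 0.37  # the cutoff quoted in the post

r = pearson_r(native_scores, pseudo_scores)
print(f"r = {r:.3f}; above threshold: {r > THRESHOLD}")
```

The key point is that the scores are paired by subject: each person contributes one native-language score and one pseudo-code score, and r measures how well one predicts the other.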
Notice that the Python correlation was only .415. She then split out the Python CS1 with only CS majors, from the one with mostly non-majors. That’s the .615 vs. the .372 — CS majors will always beat non-majors. One of her hypotheses was that this transfer from native code to pseudo-code would work best for the best students. She found that that was true. She split her subjects into quartiles and the top quartile was significantly different than the third, the third from the second, and so on. I think that this is really important for all those folks who might say, “Oh sure, your students did badly. Our students would rock that exam!” (As I mentioned, the average score on the pseudo-code test was 33.78%, and 48.61% on the “native” language test.) Excellent! Allison’s test works even better as a proxy test for really good students. Do show us better results, then publish it and tell us how you did it!
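The quartile comparison can be sketched as follows. This is only an illustration under assumptions: the post does not say which score the quartiles were based on, so the sketch ranks subjects by their native-language score and then looks at the mean pseudo-code score in each quartile. The function name `split_into_quartiles` and all scores are hypothetical, and the significance testing she ran between quartiles is not shown.

```python
from statistics import mean

def split_into_quartiles(scores):
    """Return four lists of subject ids, lowest-scoring quartile first.

    scores: dict mapping subject id -> score used for ranking.
    Assumption: quartiles are equal-sized slices of the ranked subjects.
    """
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    return [ranked[n * k // 4: n * (k + 1) // 4] for k in range(4)]

# Made-up percentage scores for eight subjects.
native = {"s1": 30, "s2": 45, "s3": 50, "s4": 55,
          "s5": 60, "s6": 70, "s7": 80, "s8": 90}
pseudo = {"s1": 20, "s2": 25, "s3": 30, "s4": 35,
          "s5": 45, "s6": 50, "s7": 70, "s8": 85}

quartiles = split_into_quartiles(native)

# Mean pseudo-code score within each native-language quartile.
quartile_means = [mean(pseudo[s] for s in q) for q in quartiles]

for number, (members, avg) in enumerate(zip(quartiles, quartile_means), start=1):
    print(f"Quartile {number} (lowest first): {members}, mean pseudo-code score {avg}")
```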
(3) Then comes the validity argument: is this test really testing what’s important? Is it a good test? As I said, she had a multi-step process. First, she had a panel of experts review her test for reasonableness of coverage. Second, she did think-alouds with 12 students to make sure that they were reading the exam the way she intended. Third, she ran IRT analysis to show that her problems were reasonable. Finally, she correlated performance on her pseudo-code test (FCS1) with the final exam grades. That one is the big test for me: is this test measuring what we think is important, across two universities and four different classes? Another highly significant set of correlations, but it’s this scatterplot that really tells the story for me.
Next, Allison defends, and takes a job as a post-doc at University of British Columbia. She plans to make her exam available for other researchers to use — in comparison of CS1 approaches and languages. Want to know if your new Python class is leading to the same learning as your old Java class? This is your test! But she’ll never post it for free on the Internet. If there’s any chance that a student has seen the problems first, the argument for validity fails. So, she’ll be carefully controlling access to the test.
Allison’s work is a big deal. We need it in our “Georgia Computes!” work, as do our teachers. As we change our approaches to broaden participation, we need to show that learning isn’t impacted. In general, we need it in computing education research. We finally have a yardstick by which we can start comparing learning. This isn’t the final and end-all assessment. For example, there are no objects in this test, and we don’t know if it’ll be valid for graphical languages. But it’s the first test like this, and that’s a big step. I hope that others will follow the trail Allison made so that we end up with lots of great learning measures in computing education research.