How do we test the cultural assumptions of our assessments?

March 16, 2020 at 1:57 pm

I’m teaching a course on user interface software development for about 260 students this semester. We just had a midterm where I felt I bobbled one of the assessment questions because I made cultural assumptions. I’m wondering how I could have avoided that.

I’m a big fan of multiple choice, fill-in-the-blank, and Parsons problems on my assessments. I use my Parsons problem generator a lot (see link here). For example, on this midterm, students had to arrange the scrambled parts of an HTML file in order to achieve a given DOM tree, and there were two programs in JavaScript (using constructors and prototypes) that they had to unscramble.
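To illustrate the format for readers who haven't seen a Parsons problem: students receive lines like the following in scrambled order and must reassemble them into a working program. This is a hypothetical example in the constructor-and-prototype JavaScript style the post describes, not an actual exam question, shown here already in correct order:

```javascript
// Hypothetical Parsons problem content (shown solved): a constructor
// plus a method shared via the prototype, the JavaScript style the
// exam's unscrambling questions used.
function Button(label) {
  this.label = label; // per-instance state set by the constructor
}

// Shared behavior lives on the prototype, not on each instance.
Button.prototype.render = function () {
  return "<button>" + this.label + "</button>";
};

var ok = new Button("OK");
console.log(ok.render()); // → <button>OK</button>
```

In the exam version, these lines would be presented out of order, and students would earn credit for reconstructing a working ordering.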

I typically ask some definitional questions about user interfaces at the start, about ideas like signifiers, affordances, learned associations, and metaphors. Like Dan Garcia (see his CS-Ed Podcast), I believe in starting out the exam with some easy things, to buoy confidence. They’re typically only worth a couple points, and I try to make the distractors fun. Here’s an example:

Since we watched in lecture a nice video starring Don Norman explaining “Norman doors,” I was pretty sure that anyone who actually attended lecture that day would know that the answer was the first one in the list. Still, maybe a half-dozen students chose the second item.

Here’s the one that bothered me much more.

I meant for the answer to be the first item on the list. In fact, almost the exact words were on the midterm exam review, so that students who studied the review guide would know immediately what we wanted. (I do know that working memory doesn’t actually store more for experts — I made a simplification to make the definition easier to keep in mind.)

Perhaps a dozen students chose the second item: “Familiarity breeds contempt. Experts’ contempt for their user interfaces allows them to use them without a sense of cognitive overload.” I had several students ask me during the exam, “What’s contempt?” I realized that many of my students didn’t know the word or the famous phrase (which dates back to Chaucer).

Then one student actually wrote on his exam, “I’m assuming that contempt means learned contentment.” If you make that assumption, the item doesn’t sound ridiculous: “Familiarity breeds learned contentment. Experts’ learned contentment for their user interfaces allows them to use them without a sense of cognitive overload.”

I had accidentally created an assessment that expected a particular cultural context. The midterm was developed over several weeks, and reviewed by my co-instructor, graduate student instructor, five undergraduate assistants, and three undergraduate graders. We’re a pretty diverse bunch. We had found and fixed perhaps a dozen errors in the exam during the development period. We’d never noted this problem.

I’m not sure how I could have avoided this mistake. How does one remain aware of one’s own cultural assumptions? I’m thinking of the McLuhan quote: “I don’t know who discovered water, but it wasn’t a fish.” I feel bad for the students who got this problem wrong because they didn’t know the quote or the meaning of the word “contempt.” What do you think? How might I have discovered the cultural assumptions in my assessment?



15 Comments

  • 1. alanone1  |  March 16, 2020 at 2:19 pm

    Theory, practice, reality, etc.

    The NAEP has shown that only around 30-40 percent of graduates of four-year colleges can read at what is called the “proficient” level, which, if you go see how they define it, is not what most of us would call “proficient,” but something much less.

    This seems such a widespread problem that I wouldn’t call it “cultural” (maybe “pop cultural”?).

    “In theory”, being highly competent in the main language used for reading, writing, and some of the thinking that can be done with language, is a very reasonable graduation requirement, not just from college but also from high school.

    I’m not against the idea of remedial courses in college — and we could contemplate here the pros and cons of “directed remediation”. This would be in the spirit of college as a “lifting place” for many important human aspects, especially intellectual richness.

    I wonder whether this comes close to “practice” and “reality” today: so many colleges seem much more about selling degrees than about actual education.

    A 260-student class sounds like an undergrad lower-division course, and again in theory, the lower division includes mandatory courses in “larger ideas, and larger skills”.

    Seems like a good time to help the students get above threshold in the main language being used.

    • 2. Mark Guzdial  |  March 16, 2020 at 3:12 pm

      Hi Alan,

      No, this is EECS 493 — a senior-level course. Due to the enrollment boom in CS, triple-digit enrollment in upper-level courses is the norm. Lower division is quadruple-digit.

      This is one of the blog posts grounded in my experience as a teacher. I teach large classes regularly (as of today, all online). I have to assess student knowledge, and I’m trying to do it fairly. I get the theory argument, but I’m also a practicing teacher.

      • 3. gasstationwithoutpumps  |  March 16, 2020 at 9:57 pm

        A senior in college who does not know “contempt” is going to experience a lot of it from people he wants to be employed by.

  • 4. Raul Miller  |  March 16, 2020 at 2:58 pm

    In my experience, you can’t prevent people from making mistakes. If you want to teach them, you have to help them work through the issues…

    We try to establish a common background, to minimize this, but that can only go so far.

    In the example here, some didn’t understand the intended meaning of a word — and that’s a relatively frequent issue. But it’s not the only issue (and, in addition to quizzes, practical exams are one of the ways to work through these things).

    Anyways… I don’t know how to say this so that it makes sense and isn’t trite, but: expect perfection, just don’t expect perfection. (Something maybe about goals and ideals and approaching them?)

  • 5. Daniel Hickey  |  March 16, 2020 at 3:42 pm

    Nice post Mark. You are all up into my world. I too like the efficiency of a good multiple choice exam, as long as one’s instruction targets only the concept and not the specific association. For typical classroom assessments, you did the easiest thing, which is to get feedback from students and be open to it. But after the fact, you can examine the d (discrimination) index in the LMS; it would have revealed the problem as well.

    • 6. Mark Guzdial  |  March 17, 2020 at 11:47 am

      This was on one of those pre-COVID, paper-based exams. All our quizzes and our final exam are Canvas quizzes, where I can easily crunch the numbers.
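For a paper exam with no LMS report, the d (discrimination) index Daniel mentions can be computed by hand from item responses. Here is a minimal sketch in JavaScript, using the conventional upper-minus-lower-27% formula and made-up data (the data layout and names below are illustrative assumptions, not Canvas's format):

```javascript
// Discrimination index: (correct in top-scoring group - correct in
// bottom-scoring group) / group size. Values near zero or negative
// flag items that confuse strong students as much as weak ones.
function discriminationIndex(students, item) {
  // students: [{ total: examScore, answers: { itemId: true/false } }, ...]
  const sorted = [...students].sort((a, b) => b.total - a.total);
  const n = Math.floor(sorted.length * 0.27); // conventional 27% groups
  const correct = group => group.filter(s => s.answers[item]).length;
  const top = sorted.slice(0, n);
  const bottom = sorted.slice(-n);
  return (correct(top) - correct(bottom)) / n;
}

// Hypothetical data: four students, one item "q2"
const students = [
  { total: 95, answers: { q2: true } },
  { total: 80, answers: { q2: true } },
  { total: 60, answers: { q2: false } },
  { total: 40, answers: { q2: false } },
];
console.log(discriminationIndex(students, "q2")); // → 1
```

A low or negative index on the “contempt” item would have flagged it as a question where overall exam performance did not predict getting the item right.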

  • 7. Phil Barry  |  March 16, 2020 at 4:24 pm

    I’m wondering if the problem is not so much the word “contempt,” but the overall phrasing of the item. All the other items seem straightforward; for example, in the third item, it makes sense that headphones can remove distractions, which in turn leads to better focus. For the second item, how does an expert being contemptuous of an interface lead to using it better? That seems counterintuitive, and so I can see how the student mentioned misinterpreted “contempt” as “learned contentment.”

    • 8. alanone1  |  March 17, 2020 at 3:02 am

      This presumes that the student was not actually reading, but only trying to guess from context — this could be a general case that is the result of decades of simple multiple choice tests for reading, which encourage trying to figure out the answer by “mining” a paragraph rather than reading it.

      I’ve been suspecting this for some time on the basis of many comments I’ve seen online that reveal the commenters have not actually read what they are commenting on.

      Guessing instead of reading is also the prime strategy of the older functional illiterate.

      It’s against my notion of “college” that this be skated by.

  • 9. Megan Lutz  |  March 17, 2020 at 7:21 am

    I suggest that this is, as mentioned above, an issue with the phrasing and students’ reading, rather than purely cultural. The distractor is written differently than the others, so gleaning the meaning from contextual cues is difficult. A different form makes it attractive to students, particularly those with lower ability, so strive for parallel structure among alternatives. Separately, though the quotation is correct, it is not firmly based in science (social psychologists find that “familiarity [of people] fosters fondness”), so that is also misleading.

    Apologies if I am stating what you already know (likely), but a key to good item writing, for both stems and alternatives, is to eliminate all context-irrelevant “bonus words”, to make the item as readable and understandable as possible, so that you are only testing item content. If you care about the speed of trains, the number of passengers and points of departure don’t matter; they tax working memory capacity (WMC), distracting from the task and making items arbitrarily more difficult. That may be what happened here.

    Absent the opening quotation for distractor 2, students might have been more likely to recognize their misunderstanding of vocabulary and ask for clarification on the word (woe!) or suss out the correct answer by elimination of the other options. Or, not knowing the term, focus more on the other alternatives.

    Finally, while we would love our students to know all the words (sigh), we should write tests at a lower reading level, to again be sure that we are testing the construct of interest and not reading comprehension.

    You did everything right, having so much review, but you need to do it righter 🙂

    • 10. gasstationwithoutpumps  |  March 17, 2020 at 11:45 am

      I disagree with the idea that questions should be written with no extraneous information—often what needs to be tested (and is very hard to test with multiple-choice) is the ability of students to extract relevant information from a sea of irrelevancies and synthesize a solution. I want students able to find information on a datasheet, even when there are hundreds of parameters that are not relevant to their question. I want students able to choose the right function from a large library, and to be able to put together systems of many parts. A question that is carefully tailored to ask for recall of one tiny factoid with no extraneous information is really of no use whatsoever in testing the skills I care about in my students.

      The futility of multiple-choice questions for asking anything really meaningful is why I gave up on them decades ago.

      • 11. Megan Lutz  |  March 17, 2020 at 4:07 pm

        Perhaps you misunderstood what I meant. If the test is about computer science, using higher-level, non-CS-relevant vocabulary makes the test arbitrarily harder for the student. You are testing, at that point, their language skills as well as their CS knowledge, and confounding the results. This is true for MC tests and any other format. If a student can’t understand what you are asking because of obscure language, for example, you aren’t testing the construct.

        The short version is: writing items is hard. Writing items that measure what you want them to measure, and only that, is hard. True for any item and test format.

        • 12. gasstationwithoutpumps  |  March 18, 2020 at 1:39 am

          Agreed, though writing meaningful multiple-choice tests is much harder than assessing work that more closely resembles the skill one is interested in. My courses have most of the assessment weight on design reports or programs, not on quizzes, because anything I can ask on a quiz or test bears little relationship to the skills I really want students to develop.

          • 13. alanone1  |  March 18, 2020 at 2:01 am

            Yes, this is my notion also.

            The class sizes that Mark mentioned are daunting, to say the least, and the ratio of helpers to students has probably not kept pace … so doing “reasonable assessment” is an incredible burden on everyone.

            Still, I think your aim at real skills and fluencies, and assessing these, makes so much more sense in technical fields than the really poor and inadequate multiple-choice route (imagine multiple-choice tests in music … yikes!)

            • 14. gasstationwithoutpumps  |  March 18, 2020 at 1:27 pm

              The over-reliance on tests has another side effect—students justify cheating on the homework by claiming it is too much work and the tests show that they learned the material. (I’ve seen precisely that claim made on Reddit.) If the tests don’t really test the skill, and the skill is not developed through doing the homework honestly, then the students are not learning what we want them to learn, and they aren’t even aware that they aren’t learning.

              • 15. npslagle  |  March 30, 2020 at 3:16 pm

                Exactly my experience. I’m becoming increasingly convinced that tests are obsolete, and collaborative activities and laboratories are more effective. I was away from school for some years, and my return has left me quite surprised at how little has changed, despite the leap forward in technology. For instance, we use the calculators of a quarter century ago, along with overhead projectors. The one substantial change I’ve seen is mass course coordination, overseen by non-academic bureaucrats akin to the city manager model. Accountability seems out the window.

