Student evaluations of teaching don’t correlate with learning gains

May 20, 2011 at 8:19 am 15 comments

The best part of this post from Hake comes at the end, where he cites six published accounts of dramatic improvement in learning with dramatic decline in student teaching evaluations. Administrators rely heavily on student evaluations of teaching, but the reality is, they don’t correlate with good teaching. Students don’t necessarily “like” teaching that makes them think.

Unfortunately for my academic career, I gradually caught on to the fact that students’ conceptual understanding of physics was not substantively increased by traditional pedagogy. As described in Hake (1987, 1991, 1992, 2002c) and Tobias & Hake (1988), I converted to the “Arons Advocated Method” [Hake (2004c)] of “interactive engagement.” This resulted in average normalized gains on the “Mechanics Diagnostic” test or “Force Concept Inventory” that ranged from 0.54 to 0.65 [Hake (1998b), Table 1c] as compared to the gain of about 0.2 typically obtained in traditional introductory mechanics courses [Hake (1998a)].

But my EPA’s for “overall evaluation of professor,” sometimes dipped to as low as 1.67 (C-), and never returned to the 3.38 high that I had garnered by using traditional ineffective methods of introductory physics instruction. My department chair and his executive committee, convinced by the likes of Peter Cohen (1981, 1990) that SET’s are valid measures of the cognitive impact of introductory courses, took a very dim view of both my teaching and my educational activities.

via LISTSERV 16.0 – AERA-L Archives.
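
For readers unfamiliar with the metric: the "average normalized gain" Hake mentions is conventionally defined as the class-average gain divided by the maximum possible gain, (post - pre) / (100 - pre), with test scores expressed as percentages. A minimal Python sketch follows. The function name and the class averages are hypothetical, chosen only to illustrate the arithmetic, and are not taken from Hake's data.

```python
def normalized_gain(pre_percent: float, post_percent: float) -> float:
    """Fraction of the possible improvement that was actually achieved.

    Both arguments are class-average scores on a 0-100 percentage scale.
    """
    if not 0 <= pre_percent < 100:
        raise ValueError("pre_percent must be in [0, 100)")
    if not 0 <= post_percent <= 100:
        raise ValueError("post_percent must be in [0, 100]")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


# Hypothetical class averages (illustration only, not real data).
traditional = normalized_gain(50.0, 60.0)   # gained 10 of a possible 50 points
interactive = normalized_gain(50.0, 80.0)   # gained 30 of a possible 50 points

print(f"traditional: {traditional:.2f}")  # traditional: 0.20
print(f"interactive: {interactive:.2f}")  # interactive: 0.60
```

Dividing by the room left to improve lets classes with different pretest averages be compared on the same scale, which is why a gain of 0.6 versus 0.2 is a large difference.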


15 Comments

  • […] Student evaluations of teaching don’t correlate with learning gains « Computing Education Blog. […]

  • 2. Stephen Downes  |  May 20, 2011 at 3:55 pm

    If you focus instruction strictly on the retention of certain concepts, then you can force an increased retention of those concepts by non-traditional (clicker-based question and response prompting) means, as demonstrated (weakly) by the research.

    The students correctly perceive, however, that learning physics is about more than just the memory of some concepts, and in my mind rightly reject the methodology of a professor that is thus singly focused. Being a little ‘physicist-automaton’ was never high on the list of priorities for anyone who signed up to study physics, and students rightly resent being so treated.

    It is said that, “what is measured, is managed” (or, “what is measured, is done”). To me the main lesson of such slogans is that you have to be *very* careful about what you measure, lest you roll over the cliff that escaped your measurements.

    • 3. Mark Hammond  |  May 23, 2011 at 7:57 am

      If one does not retain “certain concepts” in physics, then all of physics becomes a struggle to memorize certain problems and their solutions. Many successful students in traditional classes become quite good at figuring out “which kind of problem is this,” with no recognition of the underlying fundamental principles; thus, when they are asked quite easy conceptual questions, they are utterly lost. Nor can these students approach new problems in new contexts, because they haven’t seen a solution before and they have no deep understanding of the fundamental principles upon which to draw. The users of traditional instruction fall back on the fact that a small percentage of their students actually “get it,” and the others must be lazy or incompetent; however, that is the rationalization of a lazy or incompetent professor.

      Why a student likes a class is a complex issue that includes much more than just how much they learned. What we do know is that changing one’s brain structure through repeated practice is hard work and not fun. The rewards are subtle and might not occur for a long time. Derek Muller’s thesis research shows that students who watch ineffective teaching videos (as measured by whether the videos actually teach the target concept) think they have learned, enjoyed the video, and were very positive about the experience. But no learning occurred, whatsoever. Videos that resulted in an overthrow of the students’ preconceptions and successfully established new understandings were rated as hard to understand, confusing, not so much fun. But they worked.

  • 4. Alfred Thompson  |  May 20, 2011 at 7:51 pm

    I’ve run into a lot of students whose idea of who their best (most effective) teachers were changed a year or two after graduation. I know an HS English teacher, for example, who was not a favorite while students were in her class, but a year later those same students had really good things to say about her, once they had come to appreciate all they had learned in her class.

  • 5. Cynthia  |  May 21, 2011 at 1:41 am

    While I concur with all the criticisms leveled against SETs in the linked article, I’m wary of wholesale rejection of student input.

    Many professors really are bad, and we (R1s especially) should be paying more attention when students tell us that, not less. Every true criticism of SETs has also been used as an excuse for inaction by a professor whose approach to teaching needs improvement.

    Students have good things to say, and they’re often very sophisticated about knowing what they need and when they’re not getting it. I get very insightful observations and suggestions from students whenever I do informal mid-quarter feedback.

    So I think the question needs to be, how can we improve the signal-to-noise ratio of student input? It seems to me that the more specific students are in describing what they like and dislike, the more likely they are to be accurate (and the easier it is to spot unwarranted whining). Are there changes to the questions we ask students, or, as Alfred Thompson notes, different times we can ask questions, which would improve the fidelity of the student input signal?

    • 6. Mark Guzdial  |  May 21, 2011 at 2:06 pm

      I do agree, Cynthia. I think it’s key to have good evaluation, so that we can reward teacher success. I’ve also been thinking a lot about Stephen’s note. We do want students to do more than memorize. Standardized tests don’t measure that “more.” How do we assess that higher-level content?

      At the same time, we don’t want to lose competencies. I don’t believe that students learn the higher-level content without mastering the lower-level content (but not necessarily ALL of it) — you can’t problem-solve or be creative without knowing something. The trick is doing both. I worry about a focus on problem-solving making it too easy to ignore the content, to “fake it.”

      I’m in Mexico City at a Pearson Education event, and heard a great talk by a Cuban education professor, Julio Pimienta, about the challenge of doing both. Maybe the problem is that we don’t yet have good mechanisms for students to learn the content in the face of a relevant context/problem. Brian Dorn’s dissertation points that out for CS: graphic designers with a real problem have a hard time learning from the CS learning resources available to them. I’ll try to blog on this when I get back and say more about Pimienta’s talk.

  • 7. Mike Byrne  |  May 26, 2011 at 12:33 pm

    There is, in fact, a sizable research literature on what, exactly, student evaluations in college courses actually measure. (The numerical ratings part, anyway.)

    The largest correlates of ratings tend to be things like expected grade, how much students like the instructor’s personality, and, yes, physical attractiveness of the instructor.

    What’s interesting is that most of the research in this area doesn’t look at the relationship with actual student learning, (a) because that’s hard to assess, and (b) because there’s so little variance left in the measures that even if they did, the effects would be small anyway. Physical attractiveness, for example, can account for over 20% of the variance in such scores. Heck, as a project in one of my classes, a student did a factor analysis of the teaching evals here at Rice and found a two-factor solution, with the primary factor being the one loaded on most heavily by “expected grade,” and that factor accounted for about 50% of the variance.

    So, the fact that evaluations and learning aren’t closely related, regardless of issues with what is being learned and how it’s being assessed, is not news.

    Cynthia has a point that, in fact, students can and do give useful feedback. At the end of each semester I appeal to students to PLEASE provide written feedback about what they thought worked and didn’t work, and I do look carefully at those comments.

    But the numerical ratings are pretty bad, and we all know that university administrators use those, rather than the written comments, when assessing teaching.

    • 8. Charles Isbell  |  June 4, 2011 at 9:33 pm

      Well, at least at my institution, administrators are not allowed to see the written comments. This situation developed long before I ever showed up, but I’m told that this was a condition of the deal struck with the faculty senate for allowing such evaluations (or perhaps just online evaluations) in the first place. So, there you go.

    • 9. Mark Guzdial  |  June 6, 2011 at 10:49 am

      We know that correlation isn’t causation, and other variables may also correlate and may be more causal. Here’s one that I have been wondering about: Teacher effort. If a teacher works at his or her teaching, do course survey results rise? If survey results are poor, does it suggest a teacher who isn’t working at improving his or her course?

      • 10. Mike Byrne  |  June 6, 2011 at 11:37 am

        No, correlation doesn’t guarantee causation. However, if there is causation, there is also correlation. What the research literature suggests is that if there are other causes, they aren’t responsible for a great deal of the variance. This doesn’t mean they aren’t important or that we shouldn’t strive to identify them, but that it’s unlikely that they can be hugely influential all by themselves.

        Even if “effort” were something you could easily quantify in a meaningful way and could determine was a significant predictor, its share of the pie is unlikely to be even a majority share. You couldn’t conclude that low scores imply low effort unless you were controlling for all the other variables. Maybe the instructor is making an effort, but is a really tough grader and doesn’t get on well personally with the students. You’d expect that person to get relatively low scores regardless of their level of effort.

  • […] finding: Students praise teachers who give clear lectures, who reduce confusion.  Student evaluations of teaching reward that clarity.  Students prefer not to be confused.  Is that always a good thing?  Mazur […]

  • […] Another way of interpreting my students’ comments which is much more intellectually challenging is that the difference between an effective and expert teacher is hard to see.  A recent NYTimes article speaks to the enormous value of expert teachers — over a student’s lifetime.  Barbara has pointed out that, in her experience, the first year that a teacher teaches AP CS, none of his or her students will pass the AP CS (with a score of 3 or better).  Even some veteran teachers have few test-passers, but all the teachers who get many test-passers are veterans with real teaching expertise.  But how do you make those successes visible?  As we’ve talked about here before: How do we measure good teaching? […]

  • […]  Making “production” better doesn’t make the teaching more effective.  Student engagement pedagogies are likely to make teaching more effective, but it’s still an open question how to make those […]

  • 15. Michole Washington  |  September 12, 2019 at 2:28 pm

    This really made me chuckle. One of the many cruxes with trying new pedagogies is not giving students enough time to adjust. And when I say time I mean something comparable to the amount of time they may have spent at a K-12 school where their teachers overutilized behavioral methods (e.g., listen to me lecture allllllll day and you should know everything about this topic) to teach. It’s even more comical that administrators assessing student feedback on a course do not take this into consideration.
