Unreliable research: How replicable is Stereotype Threat?

October 23, 2013 at 1:27 am

The Economist has an article in a recent issue that’s leading to lots of discussion: Are we making mistakes with science?  Can scientists really tell the good stuff from the bad stuff?  Are we really making sure that our key results are replicable?

One of the topics that they explore is “priming” research.

“I SEE a train wreck looming,” warned Daniel Kahneman, an eminent psychologist, in an open letter last year. The premonition concerned research on a phenomenon known as “priming”. Priming studies suggest that decisions can be influenced by apparently irrelevant actions or events that took place just before the cusp of choice. They have been a boom area in psychology over the past decade, and some of their insights have already made it out of the lab and into the toolkits of policy wonks keen on “nudging” the populace.

Dr Kahneman and a growing number of his colleagues fear that a lot of this priming research is poorly founded. Over the past few years various researchers have made systematic attempts to replicate some of the more widely cited priming experiments. Many of these replications have failed. In April, for instance, a paper in PLoS ONE, a journal, reported that nine separate experiments had not managed to reproduce the results of a famous study from 1998 purporting to show that thinking about a professor before taking an intelligence test leads to a higher score than imagining a football hooligan.

via Unreliable research: Trouble at the lab | The Economist.

Stereotype threat is a kind of priming effect.  It occurs when you remind someone of a negative stereotype associated with a group that the person belongs to, and that reminder hurts their performance.  The argument is that stereotype threat might be contributing to the performance gaps between races and genders.

A common situation of stereotype threat for girls and women is when they are tested on their knowledge of math or science. The Educational Testing Services performed an experiment to see if girls performed better or worse on a math exam if they were asked their gender either before or after the exam. Researchers found that the group of girls who were asked their gender before the exam scored several points lower than the boys, while girls who were asked their gender after the exam scored on par with the boys.

via The Stereotype Threat and How It Affects Women in Computing » Anita Borg Institute.

With questions being raised about “priming” research, I wondered whether anyone was checking the reliability of the stereotype threat research.  They are, and the results are not promising.

Men and women score similarly in most areas of mathematics, but a gap favoring men is consistently found at the high end of performance. One explanation for this gap, stereotype threat, was first proposed by Spencer, Steele, and Quinn 1999 and has received much attention. We discuss merits and shortcomings of this study and review replication attempts. Only 55% of the articles with experimental designs that could have replicated the original results did so. But half of these were confounded by statistical adjustment of preexisting mathematics exam scores. Of the unconfounded experiments, only 30% replicated the original. A meta-analysis of these effects confirmed that only the group of studies with adjusted mathematics scores displayed the stereotype threat effect. We conclude that although stereotype threat may affect some women, the existing state of knowledge does not support the current level of enthusiasm for this as a mechanism underlying the gender gap in mathematics. We argue there are many reasons to close this gap, and that too much weight on the stereotype explanation may hamper research and implementation of effective interventions.

via Dienekes’ Anthropology Blog: Shortage of female math geniuses not due to “stereotype threat”.

As I dug into this further, I found that there has been a lot of misinterpretation of the research on stereotype threat.  There is already a gap between genders and between races on many of these tests.  If you remind someone of a negative stereotype, that can make the gap larger.  But if you don’t remind someone of the stereotype, the gap doesn’t go away: it was already there.  Only if you adjust the scores for pre-test differences (that’s the “statistical adjustment of preexisting mathematics exam scores” referenced above) do you find no difference absent the threat invocation.  The measured effect of stereotype threat has shown up when the test-takers are consciously aware of the threat.  The blog post cited below goes into a lot of detail on the efforts to replicate, the problems with interpreting the results, and how the methodology of the experiment matters.

Thus, rather than showing that eliminating threat eliminates the large score gap on standardized tests, the research actually shows something very different. Specifically, absent stereotype threat, the African American–White difference is just what one would expect based on the African American–White difference in SAT scores, whereas in the presence of stereotype threat, the difference is larger than would be expected based on the difference in SAT scores.

via Race and IQ : Stereotype Threat R.I.P. « Meng Hu’s Blog.
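
To make the adjustment argument concrete, here is a minimal sketch in Python. Every number in it is made up for illustration and comes from none of the studies mentioned in this post, and it assumes the simplest possible adjustment: subtracting the pre-existing gap point for point. Real studies adjust with regression on prior scores, so treat this as arithmetic, not a reanalysis.

```python
# Hypothetical numbers, invented only to illustrate the arithmetic.
# They are NOT taken from any study discussed in this post.
PRIOR_GAP = 10.0       # assumed pre-existing gap on an earlier exam, in points
THREAT_PENALTY = 4.0   # assumed extra drop for the stereotyped group under threat


def raw_gap(threat: bool) -> float:
    """Gap on the experimental test before any statistical adjustment."""
    return PRIOR_GAP + (THREAT_PENALTY if threat else 0.0)


def adjusted_gap(threat: bool) -> float:
    """Gap left over after subtracting what the prior scores already predict.

    Assumption: a point-for-point adjustment, which is a simplification
    of the regression-based adjustment used in actual studies.
    """
    return raw_gap(threat) - PRIOR_GAP


for threat in (False, True):
    print(
        f"threat={threat}: raw gap={raw_gap(threat):.1f}, "
        f"adjusted gap={adjusted_gap(threat):.1f}"
    )
```

With these made-up values, the adjusted gap is zero without threat, which can be misread as "removing the threat removes the gap." The raw gap without threat is still the full pre-existing gap; the threat condition only adds to it.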

I come away with the opinion that stereotype threat is real, but that we need more experimentation to understand just how reliable the effect is and what triggers it. Its impact is probably small, more like that of general test anxiety than an explanation for much of the gap between genders and races.


1 Comment

  • 1. Michelle  |  October 23, 2013 at 1:54 am

    I need to think more about this when I’m less tired, but one seemingly little-understood aspect of stereotype threat is that it is only hypothesized to be a factor for people who are *highly identified with the domain*. If you’re a woman who thinks you’re bad at math and doesn’t care, stereotype threat doesn’t affect you. It’s only if you are a woman who wants to be a mathematician that it is expected to affect performance.

    I end up wondering how good some of the replication studies are: do they truly replicate the original study, or are they only partial replications? That isn’t to say that psychology experiments are all perfect or that even highly-cited studies don’t have flaws. But if significant elements of a theory or key parts of the procedure are overlooked, the lack of replication isn’t significant either. For example, the study on the SAT and priming gender by asking demographic information before or after the test didn’t seem to distinguish between levels of identification with the domain. (I also heard an explanation that the SAT is high-stakes enough and known to have gender effects such that the priming element may be irrelevant – girls know they’re girls and they’re not expected to do as well as boys whether or not you remind them.)
