Archive for April 14, 2014

Big data: are we making a big mistake? Yes, especially in education

This is an important article that gets at some of my concerns about using MOOCs to inform education research. The sampling bias described in the article below is one of my responses to the claim that analyzing MOOC results can inform education research. We can only learn from the data of the students who participate. If 90% of the students leave, we can’t learn anything about them. Making claims about computing education based on the 10% who complete a CS MOOC (and who are mostly white/Asian, male, wealthy, and well-educated at that) is bad science.

Cheerleaders for big data have made four exciting claims, each one reflected in the success of Google Flu Trends: that data analysis produces uncannily accurate results; that every single data point can be captured, making old statistical sampling techniques obsolete; that it is passé to fret about what causes what, because statistical correlation tells us what we need to know; and that scientific or statistical models aren’t needed because, to quote “The End of Theory”, a provocative essay published in Wired in 2008, “with enough data, the numbers speak for themselves”.

Unfortunately, these four articles of faith are at best optimistic oversimplifications. At worst, according to David Spiegelhalter, Winton Professor of the Public Understanding of Risk at Cambridge university, they can be “complete bollocks. Absolute nonsense.”

via Big data: are we making a big mistake? – FT.com.

April 14, 2014 at 8:59 am

