Archive for August 16, 2012
I’ve told you a bit about how the Media Computation class went this summer, with the new things that I tried. Let me tell you something about how the “Computational Freakonomics” (CompFreak) class went.
The CompFreak class wasn’t new. Richard Catrambone and I taught it once in 2006. But we’ve never taught it since then, and I’d never taught it before on my own, so it was “new” for me. There were six weeks in the term at Oxford. Each week was roughly the same:
- On Monday, we discussed a chapter from the “Freakonomics” book.
- We then discussed social science issues related to that chapter, from the nature of science, through t-tests and ANOVA, up to multiple linear regression. Sometimes, we did a debate about issues in the chapter (e.g., on “Atlanta is a crime-ridden city” and on “Roe v. Wade is the most significant explanation for the drop in crime in the 1990s.”)
- Then I showed them how to implement the methods in SciPy to do real analysis of some Internet-based data sets. I gave them a bunch of example data sets, and showed them how to read data from flat text files and from CSV files.
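To give a flavor of that last step, here is a minimal sketch of reading two columns from a CSV file and from a flat text file, then running a correlation in SciPy. This is not the actual course code: the data, the column names (`temperature`, `sales`), and the helper functions are all made up for illustration, and the "files" are in-memory strings so the example runs on its own.

```python
import csv
import io

from scipy import stats

# Hypothetical sample data standing in for a downloaded CSV file.
CSV_TEXT = """day,temperature,sales
1,5.1,31000
2,6.3,28500
3,7.8,25000
4,9.0,23500
5,10.2,20000
"""

# The same made-up numbers as a flat text file: whitespace-separated fields.
FLAT_TEXT = """5.1 31000
6.3 28500
7.8 25000
9.0 23500
10.2 20000
"""


def read_csv_columns(handle, x_name, y_name):
    """Return two lists of floats taken from the named CSV columns."""
    xs, ys = [], []
    for row in csv.DictReader(handle):
        xs.append(float(row[x_name]))
        ys.append(float(row[y_name]))
    return xs, ys


def read_flat_columns(handle):
    """Return two lists of floats from a two-column flat text file."""
    xs, ys = [], []
    for line in handle:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        xs.append(float(fields[0]))
        ys.append(float(fields[1]))
    return xs, ys


# io.StringIO stands in for open("some_file.csv"); swap in a real file handle.
csv_x, csv_y = read_csv_columns(io.StringIO(CSV_TEXT), "temperature", "sales")
flat_x, flat_y = read_flat_columns(io.StringIO(FLAT_TEXT))

# Pearson correlation between the two columns, with its p-value.
r, p_value = stats.pearsonr(csv_x, csv_y)
print(f"r = {r:.3f}, p = {p_value:.4f}")
```

With a real data set, the only change should be replacing the `io.StringIO(...)` wrappers with `open(...)` on the file in question.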
At the end of the course, students do a project where they ask a question, any question they want from any database. Then, they do it again, but in pairs, after a bunch of feedback from me (both on the first project, and on their proposal for the final project). The idea is that the final projects are better than the first round, since they get feedback and combine efforts in the pair. And they were.
- One team looked at the so-called “medal slump” after a country hosts the Olympics. The “medal slump” got mentioned in some UK newspapers this summer. One member of the team had found in his first project that, indeed, the host country wins statistically significantly fewer medals in the following year. But as a pair of students, they found that there was no medal “slump.” Instead, during the Olympics of hosting, there was a huge medal “bump”! When hosting, the country gets more medals, but the prior two and following two Olympics all follow the same trends in terms of medals won.
- Another team looked at Eurozone countries and how their GDP changes tracked one another after moving to the Euro, then tried to explain that in terms of monetary policy and internal trading. It is the case that Eurozone countries that did move to the Euro found that their GDP started correlating with one another, much more than with non-Euro Eurozone countries or with other countries of similar GDP size. But the team couldn’t figure out a good explanation for why, e.g., was it because internal trading was facilitated, or because of joint monetary policy, or something else?
- One team figured out the Facebook API (which they said was awful) and looked at different companies’ “likes” versus their stock price over time. Strongly correlated, but “likes” are basically linear: almost nobody un-likes a company. Since stock prices generally rise, it’s a clear correlation, but not meaningful.
- Another team looked at the impact of new consoles on the video game market. Video game consoles are a huge hit on the stock price of the developing company in the year of release, while the game manufacturers’ stock rises dramatically. But the team realized a weakness of their study: They looked at the year of a console’s release. The real benefit of a new console is in the long lifespan. The year that the PS3 came out, it was outsold by the PS2. But that’s hard to see in stock prices.
- The last team looked at the impact of the Olympics on the host country’s GDP. No correlation at all between hosting and changes in GDP. The Olympics are a big deal, but they’re still a small drop in the overall country’s economy.
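The “likes” versus stock price result is worth a quick illustration, because any two series that both trend upward will correlate, whether or not they have anything to do with each other. The sketch below uses simulated data (not the students' data, and not anything from the Facebook API): two independent series that each drift upward. One common check, which I'm assuming here rather than reporting as what the team did, is to correlate the period-to-period changes instead of the raw levels.

```python
import numpy as np
from scipy import stats

# Fixed seed so the simulated series are reproducible.
rng = np.random.default_rng(0)
n_days = 200

# Simulated "likes": only ever increases, since almost nobody un-likes.
likes = np.cumsum(rng.uniform(0.0, 2.0, n_days))

# Simulated "stock price": a random walk with upward drift,
# generated independently of the likes series.
price = 100.0 + np.cumsum(rng.normal(0.5, 0.5, n_days))

# Correlating the raw levels mostly measures the shared upward trend.
r_levels, p_levels = stats.pearsonr(likes, price)

# Correlating day-to-day changes removes the trend from both series.
r_changes, p_changes = stats.pearsonr(np.diff(likes), np.diff(price))

print(f"levels:  r = {r_levels:.3f}")
print(f"changes: r = {r_changes:.3f}")
```

The levels should come out strongly correlated even though the two series were generated independently, while the changes should show little or no correlation. That is the sense in which the correlation is clear, but not meaningful.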
One of my favorite observations from their presentations: Their honesty. Most of the groups found nothing significant, or they got it wrong — and they all admitted that. Maybe it was because it was a class context, versus a tenure-race-influenced conference. They had a wonderful honesty about what they found and what they didn’t.
I’ve posted the syllabus, course notes, slides that I used (Richard never used PowerPoint, but I needed PowerPoint to prop up my efforts to be Richard), and the final exam that I used on the CompFreak Swiki. I also posted the student course-instructor opinion survey results, which are interesting to read in terms of what didn’t work.
- Clearly, I was no Richard Catrambone. Richard is known around campus for how well he explains statistics, and I learned a lot from listening to his lectures in 2006. Students found my discussion of inferential statistics to be the most boring part.
- They wanted more in-class coding! I had them code in-class every week. After each new test I showed them (correlation, t-test, ANOVA, etc.), I made them code it in pairs (with any data they wanted), and then we all discussed what they found in the last five minutes of class. I felt guilty that they were just programming away while I worked with pairs that had questions or read email. I guess they liked that part and wanted more.
- I get credit from the students for something that Richard taught me to do. Richard pointed out that his reading of cognitive overload suggests that nobody can pay attention for 90 minutes straight. Our classes were 90 minutes a day, four days a week. In a 90-minute class, I made them get up halfway through and go outside (when it wasn’t raining). They liked that part.
- Students did learn more about computing, inspired by the questions that they were trying to answer. They talk in their survey comments about studying more Python on their own and wishing I’d covered more Python and computing.
- In general, though, they seemed to like the class, and encouraged us to offer it on-campus, which we’ve not yet done.
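For readers wondering what the in-class coding looked like, the tests mentioned above each come down to a single SciPy call. Here is a rough sketch of a t-test and a one-way ANOVA. The three groups are made-up numbers for illustration only, not data from the class.

```python
from scipy import stats

# Made-up measurements for three hypothetical groups.
group_a = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]
group_b = [13.0, 13.4, 12.9, 13.2, 13.1, 13.5]
group_c = [12.2, 12.0, 12.4, 11.9, 12.1, 12.3]

# Independent two-sample t-test: do two groups differ in their means?
t_ab = stats.ttest_ind(group_a, group_b)
t_ac = stats.ttest_ind(group_a, group_c)

# One-way ANOVA: is there any difference in means among all three groups?
anova = stats.f_oneway(group_a, group_b, group_c)

print(f"A vs B: t = {t_ab.statistic:.2f}, p = {t_ab.pvalue:.4f}")
print(f"A vs C: t = {t_ac.statistic:.2f}, p = {t_ac.pvalue:.4f}")
print(f"ANOVA:  F = {anova.statistic:.2f}, p = {anova.pvalue:.4f}")
```

Note that `ttest_ind` assumes equal variances by default; passing `equal_var=False` gives Welch's t-test instead, which is often the safer choice with real data.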
Students who talked to me about the class at the end said that they found it interesting to use statistics for something. Turns out that I happened to get a bunch of students who had taken a lot of statistics before (e.g., high school AP Statistics). But they still liked the class because of (a) the coding and (b) applying statistics to real datasets. My students asked all kinds of questions, from what factors influenced money earned by golf pros, to the influences on attendance at Braves games (unemployment is much more significant than how much the team is in contention for the playoffs). One of the other more interesting findings for me: GDP correlates strongly and significantly with number of Olympic gold medals that a country wins, i.e., rich countries win more medals. However, GDP-per-capita has almost no correlation. One interpretation: To win in the Olympics, you need lots of rich people (vs. a large middle class).
Anyway, I still don’t know if we’ll ever offer this class again, on-campus or study-abroad. It was great fun to teach. It’s particularly fun for me as an exploration of other contexts in contextualized computing education. This isn’t robotics or video games. This is “studying the world, computationally and quantitatively” as a reason for learning more about computing.