## Statistics worrying about losing ground to CS: Claim that CS isn’t worthy

The linked blog post below bemoans the fact that AP CS is growing, perhaps at the expense of growth in AP Statistics.  AP Stats is still enormously successful, but the most interesting part of the post is the author’s complaints about what’s wrong with CS.  I read it as, “Students should know that CS is not worthy of their attention.”

It’s always worthwhile to consider thoughtful critiques seriously.  The author’s point about CS being mostly free of models and theories is well taken.  I do believe that there are theories and models used in many areas of CS, like networking, programming languages, and HCI. I don’t believe that most CS papers draw on them or build on them. It’s an empirical question, and unfortunately, we have the answer for computing education research.  A recent multi-national study concluded that less than half of the papers in computing education research draw on or build on any theory (see paper here).

Though the Stat leaders seem to regard all this as something of an existential threat to the well-being of their profession, I view it as much worse than that. The problem is not that CS people are doing Statistics, but rather that they are doing it poorly: Generally the quality of CS work in Stat is weak. It is not a problem of quality of the researchers themselves; indeed, many of them are very highly talented. Instead, there are a number of systemic reasons for this, structural problems with the CS research “business model.”


• 1. John "Z-Bo" Zabroski  |  September 16, 2015 at 10:21 am

Ah! The classical trade-off – take AP Stats, or AP CS. As an engineer, I am taught to resolve such trade-offs through innovations that bypass the shortcomings of both. Proposed solution: a geometric approach advocated by Wickens [1] would allow for an AP CS-Stats course! This is especially true when you consider we now have quantum computing chips and can tackle really large multivariate stat problems.

[1] The Geometry of Multivariate Statistics, by Thomas Wickens, published May 2015. Note: Wickens died shortly after finishing his magnum opus. It will likely be regarded as the greatest stats presentation to students ever written.
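To make the geometric view concrete: in Wickens’s framework, each variable is a vector in “subject space,” and the correlation between two variables is just the cosine of the angle between their mean-centered vectors. A minimal sketch of that idea (the function names and sample data here are illustrative, not from Wickens):

```python
import math

def centered(xs):
    """Center a variable by subtracting its mean."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def correlation_as_cosine(xs, ys):
    """Geometric view: correlation = cosine of the angle between
    the two mean-centered variable vectors."""
    cx, cy = centered(xs), centered(ys)
    dot = sum(a * b for a, b in zip(cx, cy))
    norms = math.sqrt(sum(a * a for a in cx)) * math.sqrt(sum(b * b for b in cy))
    return dot / norms

# A perfectly linear relationship puts the vectors at angle 0,
# so the cosine (and hence the correlation) is 1.
print(round(correlation_as_cosine([1, 2, 3, 4], [2, 4, 6, 8]), 6))  # → 1.0
```

One could imagine a combined course built around exactly this kind of exercise: students implement the geometry in code, and the statistics falls out of the linear algebra.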

Z-Bo
P.S. Thanks for keeping up your blog Mark. I still read most of the lead-ins via WordPress subscriptions, even if I don’t comment as often as I used to years ago.

• 2. John "Z-Bo" Zabroski  |  September 16, 2015 at 10:31 am

Also, I think the earliest (and best) work on Computing Education Research had deep theoretical underpinnings from the work done by Jean Piaget, Jerome Bruner, and Lev Vygotsky. Alan Kay famously suggests reading, in particular:

Jean Piaget (any of his books)
– The psychology of the child
– To understand is to invent

Jerome Bruner (any of his books)
– Towards a theory of instruction
– The relevance of education

Lev Vygotsky (any of his books)
– Thought and language
– Mind in society
– The psychology of art

Especially when you consider that Seymour Papert’s Mindstorms and The Children’s Machine both draw heavily from the above three giants.

The fact that current academic Computing Education Research has lost its way is sad, and I don’t have any real answers for why or how it has happened. My only guess is that the stakes are too small for those involved, and the quality is torn apart by vicious politics. Not because I have first-hand experience, but simply because that is the usual reason why societies breed ideas without deep principles.

• 3. nickfalkner  |  September 16, 2015 at 7:58 pm

Current Computing Education Research has not lost its way, but the perception that it is somehow ‘lesser than it was’ or ‘less worthy than other research’ certainly does get in the way of disseminating the knowledge that we have managed to obtain through careful research, scholarship and practice. Yes, people could be citing more, but it’s hard when a large amount of CS Ed research is questioned purely because of its name rather than the quality of the research; it is not the most rewarding field to be working in, sometimes.

The stakes are high. The quality is reinforced by a strong, collaborative and critical community. Our problem is not the quality of Computing Education Research but the scientific commitment of people who can look at good research and find it wanting merely because it features the word “education”…

• 4. dennisfrailey  |  September 16, 2015 at 6:48 pm

I went to the original article and found that all comments are now closed, so I’ll comment here. What I find interesting about this whole issue is that CS is one of the most important areas where statistical techniques are being applied these days. The complaints that people are going to CS instead of statistics seem to really be saying that people are applying statistics instead of developing the theory. And the students appear to be attracted to (taken in by?) the appeal of applying statistics rather than doing the theoretical work. There’s a sort of snide implication that “engineering” methods are sloppy compared with the precise techniques of the statistical theoretician, but shouldn’t the statisticians be rejoicing that there’s such growth in application of statistical techniques?

This all reminds me of a fight that occurred about 15 years ago over software engineering vs CS. The specific issue was whether software engineering should be taught in CS departments (preferably in “pure” schools such as science or arts and sciences) rather than in (ugh) engineering schools, where it might end up being taught by electrical engineers or others of that ilk. Heavens to Betsy, we might have to worry about things like accreditation and licensing and other horrors! The CS experts were afraid that software engineering was going to supplant CS, take away all the students, move them to (ick) schools of engineering, and leave CS departments without enough students to justify the number of faculty members they wanted. Of course, the way things worked out, CS kept the enrollments, software engineering programs never became as widespread as had been feared, and so we still have CS departments teaching software engineering (but calling it CS). Why did CS vs software engineering work out one way and statistics vs CS applications of statistics work out differently? I don’t know but I suspect it has a lot to do with politics and with how the CS establishment fought to keep software engineering instead of complaining about it.

If I stand back, I recall Bill Wulf’s Turing lecture where he observed that there is only one nature and that division into academic disciplines is purely an artificial thing. There’s nothing intrinsic in nature, for example, that says chemistry should be taught in a school of science and chemical engineering should be taught in a school of engineering. It’s just the way we humans have chosen to divide things for our own convenience. This tendency to categorize does not always serve us well because it sets up artificial boundaries. There are so many things that cross over or straddle the boundaries. In fact, many of the greatest scientific breakthroughs have come when people straddled boundaries. Consider plate tectonics (continental drift), which was discovered and nurtured by a meteorologist (weather expert) when the conventional scientific establishments in all of the related disciplines thought it was a crackpot idea.

So too we have an artificial boundary between “pure” statistics (conventionally viewed as a sort of bastard child of mathematics) and “applied” statistics, which today is increasingly and almost universally performed using computers and thus has tended to be perceived as a branch of computer science.

As I see it, the world is constantly changing, our knowledge is constantly growing, our areas of focus are constantly evolving, and hence the boundaries between conventional academic disciplines are constantly being redrawn. Computer science was once a branch of mathematics (I know because that’s when I got my degree), and the mathematicians let it go on its own because it was deemed to be too imprecise and applied and (essentially) impure. When I wanted to go to graduate school in CS, my math professors told me I was throwing away my potential and wasting my brain. Now, math departments are wondering where all the students went (a lot of them went to CS, where they apply mathematics). If the statistics faculty don’t face reality, they may find themselves in a similar boat before long and, meanwhile, applied statistics will be flourishing but under the banner of “big data” or something else.

This is not to say that we shouldn’t be more precise and careful about the way we apply mathematics and statistics and other techniques in CS. I’m teaching a course right now where part of the curriculum is to point out the theoretical problems of many popular CS/software engineering techniques (function point analysis, quality function deployment, and several others) and what risks this brings to our use of these techniques. I, for one, hope that solid theoretical experts and accomplished applied experts can find a way to cooperate. The scientists and engineers do this very well when they make things like particle accelerators. One would think the statisticians and CS experts could throw away the boundaries and just work together as well. We might have to fight a few deans and higher university officials, who defend their respective turf very effectively, but if we want to be at the forefront of things we may have to fight that fight.

• 5. shriramkrishnamurthi  |  September 17, 2015 at 1:11 pm

> I do believe that there are theories and models used in many areas of CS, like networking, programming languages, and HCI. I don’t believe that most CS papers draw on them or build on them. It’s an empirical question, and unfortunately, we have the answer for computing education research.

Translation: “I don’t know, I sort of have a hunch, but I’ll still make random statements like `don’t believe that _most_ CS papers …'”. You wouldn’t accept this kind of sloppiness in anyone else; you shouldn’t practice it yourself.

When’s the last time you read a PL or networking proceedings cover to cover, to have formulated an opinion about “most”?

• 6. Mark Guzdial  |  September 18, 2015 at 8:55 am

Fair criticism. I have not read a PL proceedings cover-to-cover for at least a decade, and never a networking proceedings. I do have a bit more information than a hunch, though. I have been a faculty member in a College of Computing for 22 years, and hear lots of faculty, candidate, and visitor talks. If anything, I would expect that these would be biased in favor of having the most theory and models, but I’d still say less than half of those I hear do. Nonetheless, it’s an empirical question, and I don’t have the data to be confident.

• 7. shriramkrishnamurthi  |  September 18, 2015 at 9:12 am

That’s a fair response. Though:

1. I’m not sure faculty _talks_ are the most instructive source. People feel the need to “sell” more in talks, and that’s not conducive to talking about foundations. Often that comes about more from one-on-one discussion. So I wouldn’t read too much into the talk presentations, especially these days, when glitz seems to dominate.

2. You may be looking at the world through Georgia Tech-colored lenses. I don’t think of Tech as generally being a very foundations-minded place (though it certainly has some outstanding people in those fields). So there’s a strong selection bias there (just as there is in other directions in my department, and everyone else’s). That’s why I think conference proceedings give you a much better sense of how a community thinks.
