A computational biologist’s personal toolbox : What a scientist will really do with programming
May 2, 2012 at 8:55 am
Here’s a great piece to read when wondering about the questions, “Do scientists really all need to learn to program? Surely they’re not going to program, are they? What would they do?” What they’ll do is patch together pieces of other people’s code, with lots of data transformation in between. What do they need to know? A robust mental model of how the modules work and what data each one needs. This is beyond computational thinking.
In my past 20 years as a programmer, I’ve seen the rise of object-oriented programming and ‘modularity’ is something that was hammered onto my forehead. These days, I organise my entire life as a computational biologist around little modules that I re-use in almost every workflow. Yes, sure, you may call me a one-trick pony, but in terms of productivity, call me plough horse.
My core modules are ACQUISITION, COMPUTATION, VISUALISATION, and usually I glue those together with a few lines of Perl or the Unix command line. Here come the constraints again: To overcome the limitations of the software that I’m often “misusing”, I use my own scripts to shove data from one format into the next, and back again. I think every biologist who deals with lots of data, not only us computational folk, should know a few handy lines to quickly turn comma-separated files into tab-delimited, strip a table of empty quotes or grep some essential info.
via Soapbox Science: Tool Tales: A computational biologist’s personal toolbox : Soapbox Science.
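The “few handy lines” mentioned in the quoted passage might look something like the following. This is only a minimal sketch in Python rather than the Perl or Unix one-liners the author actually uses; the function names (`csv_to_tsv`, `strip_empty_quotes`, `grep`) and the sample table are made up for illustration and do not come from the original article.

```python
import csv
import io
import re


def csv_to_tsv(text):
    """Convert comma-separated text to tab-delimited text.

    The csv module handles quoted fields, so a comma inside
    quotes does not split the field.
    """
    rows = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()


def strip_empty_quotes(tsv_text):
    """Blank out tab-delimited fields that consist only of two double quotes."""
    cleaned = []
    for line in tsv_text.splitlines():
        fields = ["" if field == '""' else field for field in line.split("\t")]
        cleaned.append("\t".join(fields))
    return "\n".join(cleaned) + "\n"


def grep(pattern, text):
    """Return the lines of text that match a regular expression."""
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]


if __name__ == "__main__":
    # Hypothetical sample table, invented for this example.
    sample = (
        "gene,score,note\n"
        'BRCA1,0.9,"DNA repair, tumour suppressor"\n'
        "TP53,0.7,\n"
    )
    table = csv_to_tsv(sample)
    print(table, end="")
    print(grep(r"^BRCA", table))
```

Each function takes text and returns text (or a list of lines), so the steps can be chained in whatever order a given tool’s input format demands, much like small commands joined by pipes.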
Entry filed under: Uncategorized. Tags: computational thinking, computing for everyone, science education.
1.
Kathi Fisler | May 4, 2012 at 1:11 am
So take this one step further: assume a university is designing a
single semester course to prepare students with no prior programming
experience for this style of practical programming for scientists.
What are good, yet realistic, learning outcomes for the course? It
has to go farther than mastery of basic programming concepts
(conditionals, functions, etc). Sounds like it has to convey some
notion of modularity, and the concept of data formats/schemas and how
to translate from one to the other.
This sounds ambitious for novice programmers who likely haven’t yet
needed (or appreciated) these skills for scientific careers.
I’m familiar with recent works on learning outcomes for CS1, but not
for this kind of intro CS course. Does anyone have pointers to
researchers or literature on this?
thanks,
Kathi
2.
Mark Guzdial | May 4, 2012 at 8:36 am
Hi Kathi,
There are a couple of pointers I can offer you. First, Greg Wilson’s Software Carpentry is aimed at that goal. Second, I’m updating the notes for Computational Freakonomics which I’ll be teaching this summer. I wonder if that’s the kind of course that you’re looking for. In general, I agree — preparing students to use computing for understanding science and math is a really important goal, probably more important than teaching students to be software engineers (and learning to be a computational scientist doesn’t preclude learning to be a software engineer later).
Cheers,
Mark
3.
Rob St. Amant | May 4, 2012 at 11:29 am
Thanks for the pointer to the article. I think that Boris’s work actually is a good example of computational thinking. Modularity is a core concept, which he’s clearly relying on. He’s passing data between otherwise incompatible modules by recoding and filtering outputs to match inputs. The software components he relies on are complex, but his general strategy reminds me of parts of the Unix philosophy (though he’s rolling his own pipes, by hand).
But I’d agree that the higher-level goals of the work are domain specific and go beyond computational thinking.