Microsoft is all about cloud computing and parallelism

March 25, 2010 at 7:20 am

I don’t know a lot about parallel computing, so there may be no conflict here. Let me explain my confusion. First this:

Microsoft is “all in” for cloud computing, Microsoft CEO Steve Ballmer told a large crowd at the University of Washington’s Paul G. Allen Center for Computer Science & Engineering early this month.

Currently, about 70 percent of Microsoft’s 40,000 employees are “doing things that are entirely cloud-based or cloud-inspired,” he said, adding: “And by a year from now, that [number] will be 90 percent.”

via Technical Career News.

Then this: At the final plenary of SIGCSE 2010, Michael Wrinn of Intel exhorted, cajoled, and insisted that all the educators in the audience teach parallelism to students. For example, he encouraged us to move away from languages like Java and C, towards C#, but even better, toward functional languages like Haskell and OCaml because these can be more easily and more efficiently parallelized. Microsoft and Intel have funded two large research centers (about $10M each, I understand) to improve our ability to program multi-core, parallel computers, at Berkeley and at UIUC.

Here’s where I’m confused: As I understand it, clouds are massive server farms, but each server could be programmed in traditional serial style (even if they are multi-core, which they most certainly will be in the future), right? So is Microsoft hedging its bets, by going “all in” on the Cloud and pushing toward more parallel programming? The answer matters to us as educators: moving our curricula to all-functional languages is (for most schools) a big change. Is the future about parallelism, or does the cloud make an emphasis on parallelism less critical, except as an optimization technique to better utilize the multiple cores?

Entry filed under: Uncategorized. Tags: computing education, curriculum.


8 Comments

1. Tom Hoffman | March 25, 2010 at 9:37 am

They’d like to see more parallel architectures at all levels: servers, PCs, portable devices, etc. Parallel hardware is more energy efficient, among other benefits. But the roadblock is knowledge of programming techniques.
2. Darrin Thompson | March 25, 2010 at 11:20 am

It’s kind of a weird request. Haskell is good at parallelism, but what it does well is bring sanity to shared-state concurrency with software transactional memory.

That isn’t exactly the cloud silver bullet. STM in Haskell is just a single process thing last I checked.

Erlang emphasizes multiple nodes, crash on error, and making libraries for sophisticated recovery, and it does so in a message passing environment.

That’s the kind of thing you could say you want in a cloud but the request doesn’t mention that.

As far as parallelism in education goes, there’s somebody somewhere advocating something like this: parallelism is easier for some people to learn than strictly sequential computation, at least in the right circumstances.

Slowly it’s dawning on me that I think it was in some recent slides from Guy Steele. Sorry I’m so fuzzy on the details.
3. Daniel | March 25, 2010 at 12:09 pm

A very practical, non-academic point of view:

I work at a different Seattle-based computing giant, and I can tell you that on this rare occasion, I agree with Ballmer. This is more based upon the construction of cloud computing platforms than using them, but the new grads we interview and/or acquire take months or years to get up to speed on:

– Distributed computing
– Parallel algorithms
– Advanced threading, locking, semaphores, gates, etc

Distributed Computing: The problem is, most everyone knows what MapReduce is, but few can take a massive data-analysis problem and actually implement a solution for it using MR or any other tool. Ideas like leader election, distributed locking, work partitioning, and the C.A.P. theorem are all must-haves.

Parallel algorithms: Even in local (non-distributed) situations, there is huge importance now for code that is as parallel as possible – not so much due to the advent of multi-core machines, but because cloud computing tends to imply that we must process data from multiple, geographically distributed sources, and do so without letting that increase process latency. How do you process, store, route, or cache data when the data are coming from and going to multiple places? You can spin up multiple processes of essentially serial code, sure, but that only scales so far.

Advanced threading, etc, etc: This is where more parallel, scalable languages might come into play. Haskell or Scala would be nice, but just as good would be a strong concurrency skillset in Java, C++, Ruby, or Python. Knowing how to start threads isn’t enough. We even have to go beyond pooling, joining, and whole-method synchronization. Operations need to be made atomic, but only when required. Datasets need to be split into parallelizable units of work. Single threads need to know how to split, spawn, wait, timeout, and merge those UoWs. Common utility functionalities such as IPC, logging, auditing, and instrumentation all need to keep working when code is massively parallel. Unit tests need to be able to test concurrent libraries in such a way that all temporal code paths are exercised.
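To make the "split, spawn, wait, timeout, and merge" cycle concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names (`chunked`, `process_unit`, `parallel_sum_of_squares`), the sum-of-squares workload, and the default sizes are all hypothetical and not taken from any system described in this comment. It uses threads from the standard library, which in CPython show the structure of the pattern rather than a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def chunked(data, size):
    """Split: cut a list into units of work of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_unit(unit):
    """Hypothetical per-unit computation: sum of squares of the unit."""
    return sum(x * x for x in unit)


def parallel_sum_of_squares(data, unit_size=1000, workers=4, timeout_s=5.0):
    """Split the data, spawn one task per unit, wait with a timeout, merge."""
    units = chunked(data, unit_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Spawn: one future per unit of work.
        futures = [pool.submit(process_unit, unit) for unit in units]
        # Wait: block until all units finish or the timeout expires.
        done, not_done = wait(futures, timeout=timeout_s)
        if not_done:
            # Timeout: cancel units that have not started yet. Units already
            # running cannot be interrupted and finish as the pool shuts down.
            for future in not_done:
                future.cancel()
            raise TimeoutError(f"{len(not_done)} units of work did not finish")
        # Merge: combine the partial results.
        return sum(future.result() for future in done)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10)), unit_size=3))
```

The merge step works here because addition does not care about the order in which units complete; a workload whose results must be combined in order would need to keep the futures in their original sequence instead of iterating over the `done` set.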

To be sure, the sum of these things may be outside of the scope of an undergraduate’s four years. That said, we need fresh minds that can pick them up as quickly as possible – who can at least talk about the concepts with confidence, even if they aren’t known in one particular language. Race conditions, atomicity, stable hashing, bucketing, queueing, leadership, network partitioning, local retries, global retries – none of these words should be foreign to new grads.
4. Suzanne Rivoire | March 25, 2010 at 12:11 pm

I think that what makes this confusing is that the initial push toward the cloud has not really been motivated by performance — it’s more been about ubiquitous access to your data and about manageability. However, the cloud does present an opportunity for scaling performance as well.

The problem is that in any class of computer — whether it’s your laptop or a server in the cloud — hardware single-thread performance is just not going to scale like it did before 2005. You won’t be able to make sequential code (much) faster by buying the latest, greatest processor. Therefore, programmers need to write parallel code. The performance of this code *can* scale — if you increase the number of cores you throw at it. Once you have a parallelized application, the cloud presents a great opportunity for scaling performance because there’s a ton of hardware to throw at the problem.

So there’s nothing inconsistent about saying that we’re moving toward the cloud while simultaneously emphasizing parallelism.
5. Rhodes Brown | March 25, 2010 at 12:21 pm

At the risk of oversimplifying, let me attempt to explain what (in my understanding) is at the root of this strategy. In short, a cloud, given the right input code, is a very versatile software pipeline.

Consider a simple example: Suppose you have a program for producing widgets (outputs) that is composed from three computing stages which take roughly the same time. So, if you want to make 10 widgets it will take 30 units of time working sequentially. On the other hand if you have 10 workers, you can finish in 3 time units working in parallel. The interesting possibilities, however, are those between these two points. If you can arrange 3 workers in an assembly line (a pipeline), one for each stage of widget building, then you can crank out your 10 widgets in 12 time units. Now, consider a twist. Suppose the second stage takes twice as long as the others. Now your first and last workers regularly sit idle waiting for the middle worker to finish. The best you can do in this scenario is 22 time units to build 10 widgets. However, if you add another worker at stage 2–working in parallel with the other–you can finish in 13 time units (vs. 40 units for a sequential job).
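The arithmetic in the widget example can be checked with a small simulation. The sketch below is an assumption-laden model, not anything from Microsoft's tooling: the function name `pipeline_makespan` and its parameters are made up for illustration, every worker at a stage is assumed to take the same time, and widgets are handed from stage to stage in first-in, first-out order with no transfer cost.

```python
import heapq


def pipeline_makespan(n_items, stage_times, workers_per_stage=None):
    """Return the time needed to push n_items through a staged pipeline.

    stage_times[i] is how long stage i takes per item; workers_per_stage[i]
    is how many identical workers serve stage i (default: one per stage).
    """
    if workers_per_stage is None:
        workers_per_stage = [1] * len(stage_times)
    # ready[k] is the time at which item k becomes available to the next stage.
    ready = [0] * n_items
    for duration, workers in zip(stage_times, workers_per_stage):
        # Min-heap of the times at which each worker of this stage is free.
        free_at = [0] * workers
        heapq.heapify(free_at)
        finished = []
        for arrival in ready:
            # An item starts when it has arrived and some worker is free.
            start = max(arrival, heapq.heappop(free_at))
            finish = start + duration
            heapq.heappush(free_at, finish)
            finished.append(finish)
        ready = finished
    return max(ready)


if __name__ == "__main__":
    print(pipeline_makespan(10, [1, 1, 1]))                # 3-worker assembly line
    print(pipeline_makespan(10, [1, 2, 1]))                # slow middle stage
    print(pipeline_makespan(10, [1, 2, 1], [1, 2, 1]))     # second worker at stage 2
    print(pipeline_makespan(10, [1, 1, 1], [10, 10, 10]))  # ten workers throughout
```

Under these assumptions the four calls give 12, 22, 13, and 3 time units, matching the figures above; the sequential baselines are simply 10 × 3 = 30 and 10 × 4 = 40.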

Here’s the takeaway: when you are able to divide a program into fine-grained tasks without global dependencies (something functional programming does well), then a cloud system can do a better job of allocating its resources. Overall, I think Microsoft’s ultimate strategy is to simply get more people thinking about programming in this way. Obviously they have a stake in their own systems like Dryad, PLINQ, and F#, but they may get better long-term adoption if they simply push concurrent paradigms in general.
6. Barry Brown | March 25, 2010 at 1:45 pm

If Microsoft and Intel are right, there is lots of work to be done. Some twenty years after object-orientation went mainstream in academia, we are still arguing over “objects early,” “objects late,” “functional first,” ad nauseam.

With parallelism thrown into the mix, we’ll have “parallel early,” “cloud first,” “multicore late” to contend with. And we’ll need pedagogical languages to support these teaching styles.
7. Neil Conway | March 27, 2010 at 3:29 pm

It seems strange to me to assume that the right long-term programming model for cloud computing is to program each server “in a serial style”: that would not effectively utilize the true power of distributed, pay-as-you-go computing. Such a conservative approach might be more pragmatic in the short term, of course — but ultimately, taking advantage of cloud computing will require new programming models and techniques (Erlang is a step in the right direction).
8. Michael Wrinn | March 31, 2010 at 4:16 pm

Hi Mark, thanks for the mention and the continuing conversation.
On the language front: apologies for apparently overselling functional languages. They illustrate a broader point (alluded to here by Darrin Thompson): the worst of the shared-state concurrency issues are removed. The same goal can be achieved with message-passing in imperative languages. Yet another option: extend favorite languages with deterministic constructs (such as the Ct project I mentioned). Bottom line: take away race conditions, and the parallel beast looks a lot more tame.
From a teaching standpoint, I’d say the key impression to make is there are many ways to solve this; threads are simply one choice.
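As one small classroom-sized illustration of the message-passing option in an imperative language, here is a Python sketch. Everything in it is an assumption made for the example (the function name `count_with_messages`, the counting task, the use of `None` as an end-of-work marker); it is not the Ct project or any Intel API. Workers never touch the running total. They only send messages, and a single owner thread applies them, so there is no shared counter to race on.

```python
import queue
import threading


def count_with_messages(n_workers=4, increments_per_worker=10_000):
    """Count events from several threads without sharing the counter."""
    inbox = queue.Queue()  # thread-safe channel from workers to the owner

    def worker():
        for _ in range(increments_per_worker):
            inbox.put(1)   # send an increment message
        inbox.put(None)    # sentinel: this worker has finished

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()

    # Only this thread reads or writes `total`, so no lock is needed for it.
    total = 0
    finished = 0
    while finished < n_workers:
        message = inbox.get()
        if message is None:
            finished += 1
        else:
            total += message

    for thread in threads:
        thread.join()
    return total


if __name__ == "__main__":
    print(count_with_messages())
```

The result is the same on every run (workers times increments) regardless of how the threads are scheduled, which is the property students tend to miss when several threads update one shared variable directly.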


Computing Ed Research – Guzdial's Take