28 April 2008

Microsoft comes over to play

Monday and Tuesday this week, a new (three months old!) numerical libraries group from Microsoft came over to speak with us linear algebra hackers and parallel performance tuners. Today we did most of the talking, but we learned something from them: they aren't from MS Research, and they aim to do applied research (not "blue-sky research," as the group's manager put it) with a 2-5 year horizon, and then transition successful prototypes that people actually want out into a production group. They've been incubating some scientific libraries for about two years, and they want to push them out into a core scientific library (sparse and dense linear algebra for now). Target hardware is single-node multicore -- no network stuff -- and they are especially interested in higher-level interface design for languages like C++, C#, and IronPython, built on both native and managed code. ("Managed code" means code that runs on heavyweight runtimes like the JVM and .NET -- .NET is a big thing for them in improving programmer productivity, and these runtimes have a growing ecosystem of friendly higher-level languages.) Their group is pretty small now, but they are actively hiring (in case any of you are looking for a job in Redmond-world), and they have some bright folks on their team.

One of our biggest questions was, and probably still is, "why?" -- why wouldn't third-party applications suffice? Then again, one could ask the same of a company like Intel -- why do they need a large in-house group of HPC hackers? However, there's a difference: MS doesn't have the personpower or expertise to contribute as much to, say, ScaLAPACK development as Intel has, nor do they intend to grow their group that much. This team seems to be mainly focused on interface design: how to wrap efficient but hard-to-use scientific codes so that coders in the MS world can exploit them. In that context, my advisor showed one of his favorite slides: three different interfaces to a linear solver (solve Ax = b for the vector x, where b is a known vector and A is a matrix). One is Matlab's: "A \ b". The other two are two possible invocations of ScaLAPACK's parallel linear solver. The second of these has nearly twenty obscurely named arguments relating to the data distribution (it's a parallel distributed-memory routine) and to iterative refinement -- clearly not what you want to give to a 22-year-old n00b fresh out of an undergrad CS curriculum who knows barely enough math to balance a checkbook. Ultimately, MS has to design programmer interfaces for these people, as well as for gurus -- which is something that the gurus often forget.
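To make the interface gap concrete, here is a minimal sketch in plain Python. It is not the code from the slide and not ScaLAPACK's API: `gesv_like` is a hypothetical serial routine whose argument list is only loosely modeled on LAPACK's `dgesv` driver (flat column-major arrays, leading dimensions, a pivot array, an integer status code), and `solve` is a hypothetical wrapper that hides all of that behind something closer in spirit to Matlab's "A \ b".

```python
from typing import List, Sequence


def gesv_like(n: int, nrhs: int, a: List[float], lda: int,
              ipiv: List[int], b: List[float], ldb: int) -> int:
    """Hypothetical LAPACK-style driver: solve A X = B in place.

    a and b are flat column-major arrays, so entry (i, j) of A lives at
    a[i + j * lda]. On return, b holds the solution and ipiv the pivot rows.
    Returns 0 on success, or k > 0 if the k-th pivot is exactly zero.
    """
    # Gaussian elimination with partial pivoting, applied to b as we go.
    for k in range(n):
        # Pick the largest-magnitude entry in column k at or below the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i + k * lda]))
        ipiv[k] = p
        if a[p + k * lda] == 0.0:
            return k + 1
        if p != k:
            # Swap rows k and p of both A and B.
            for j in range(n):
                a[k + j * lda], a[p + j * lda] = a[p + j * lda], a[k + j * lda]
            for j in range(nrhs):
                b[k + j * ldb], b[p + j * ldb] = b[p + j * ldb], b[k + j * ldb]
        for i in range(k + 1, n):
            m = a[i + k * lda] / a[k + k * lda]
            a[i + k * lda] = m  # store the multiplier, as LU codes usually do
            for j in range(k + 1, n):
                a[i + j * lda] -= m * a[k + j * lda]
            for j in range(nrhs):
                b[i + j * ldb] -= m * b[k + j * ldb]
    # Back substitution with the upper triangle left in a.
    for j in range(nrhs):
        for i in range(n - 1, -1, -1):
            s = b[i + j * ldb]
            for c in range(i + 1, n):
                s -= a[i + c * lda] * b[c + j * ldb]
            b[i + j * ldb] = s / a[i + i * lda]
    return 0


def solve(A: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Hypothetical friendly wrapper: returns x such that A x = b."""
    n = len(A)
    if any(len(row) != n for row in A) or len(b) != n:
        raise ValueError("A must be square and match the length of b")
    # Copy into the flat column-major layout the low-level routine expects.
    a = [float(A[i][j]) for j in range(n) for i in range(n)]
    x = [float(v) for v in b]
    info = gesv_like(n, 1, a, n, [0] * n, x, n)
    if info != 0:
        raise ValueError("matrix is singular (zero pivot at step %d)" % info)
    return x


if __name__ == "__main__":
    print(solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))
```

The caller of `solve` never sees the storage layout, the pivot array, or the status code. A real distributed-memory routine would add data-distribution arguments on top of these seven, which is where the nearly twenty arguments on the slide come from.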

Another possible answer to the "why" is that high-performance mathematical calculations are a compelling reason to buy new computing hardware and software. There are interesting performance-bound consumer applications being developed, most of which have some kind of math kernel(s) at their core. MS presumably wants to get in on that action, especially as it is starting to lose out on the shrink-wrapped OS-and-Office software market as well as the Web 2.0 / search market.

It's interesting to watch a large, bureaucratic company like Microsoft struggling to evolve. IBM managed this sort of transition pretty well, from what I understand. They still sell mainframes (and supercomputers!), but they also sell services, and maintain a huge and successful research effort on many fronts. MS Research is also a powerhouse, but somehow we don't see the research transitioning into products, or even driving the brand, as it does in IBM's case (think about the Kasparov-defeating chess computer Deep Blue, for example). Maybe it does drive the products, but somehow the marketing has failed to convey this. I kind of feel for them, just like the "Mac guy" feels for the "PC guy" in Apple's ad series: They have to struggle not only to command a new market, but also to reinvent their image and command a brand.
