Sunday, April 20, 2008

The Future Has a Way of Happening

In his blog Coding Horror, Jeff Atwood answers the question: Should All Developers Have Manycore CPUs? Probably to no one's surprise, the answer is "yeah, duh". Jeff concentrates mostly on the performance and productivity issues. For me, it's all about economics too, but not necessarily the economics Jeff is talking about.

Let's suppose a company has a product that generates, directly or indirectly (as, through service contracts) the bulk of its revenue. I'm talking hundreds of millions of dollars a year. It's the kind of revenue stream most of us poor sods working for tiny start-ups just dream about. This single product is without question the crown jewel of their product line. The installed user base is huge, global, and growing. It is considered by its customers to be a critical and strategic piece of their enterprise infrastructure, and has been for decades.

But, see, here's the thing: this product evolved from code written as far back as the late 1970s. By some estimates, the code base is eight million lines of code, mostly C. Much of it was designed and implemented long before multi-processor servers were cost effective, existing only here or there in various corporate and government labs. Although the application is multi-threaded, most of it was written before there was any way to run it on a platform where the race conditions that might occur in a truly concurrent environment could be exercised.

It wasn't (necessarily) that the developers -- and there would be hundreds of them over the years -- knowingly wrote code that wouldn't work on a multi-core platform. It's just that, well, it wasn't an issue. These developers were more worried about being attacked by a saber-toothed tiger or trampled by a herd of mastadon on the long commute home across the ice pack, then some hypothetical science-fictional multi-core future.

And, of course, even if the occasional forward thinking developer was worried, there was no way to test it.

Fast forward thirty years later. Multi-core servers are now cheap. So cheap, server manufacturers will soon see no point in making a server that isn't multi-core. Even if they wanted to, the day is coming when the microprocessor chip manufacturers will no longer make single-core microprocessors except perhaps for specialized, embedded applications. Chip manufacturers have been pushed into the multi-core future by having hit the wall for CPU clock rates, due mainly to issues in power dissipation. And they're dragging the rest of us in to the multi-core future, some of us kicking and screaming.

This hypothetical company's product, the thing that pays all the bills, won't run reliably on the current generation of servers. And they can't make it work. Finding all the race conditions that show up in a thousand lines of code is a challenge. Doing so in an eight million line code base is for all practical purposes impossible. They've also hit the CPU clock speed wall, and are reduced to limiting their application to run on a single core, while the other cores run (occasionally) an Apache web server for the system administration interface, or maybe blinks some LEDs.

In other words, they are screwed.

I don't want to make it sound like anyone is at fault here. For sure, my crystal ball hasn't been any better than theirs was over the past thirty years. At one time, I actually worked at a national lab which had bunch of those multi-processor, shared-memory systems. We called them supercomputers. But it never occurred to me that I'd have one of those systems in my home. In my basement. Under a table. And that it would have cost only about a grand, U.S. That's just crazy talk.

But for this company, the future caught up with them. When it came to designing for future multi-processor architectures, they effectively, and correctly, practiced an agile development methodology. Unfortunately, the future has a way of happening, and the agile principle of "you ain't gonna need it" eventually turned into "you need it right now, and there ain't no way to get it".

(There is some irony here. This same hypothetical company let go -- or otherwise drove away or simply ignored -- nearly all of their firmware developers. These were developers who were quite experienced in designing code for multi-processor shared-memory architectures, albeit for embedded applications. These guys and gals dreamed concurrently in their sleep, and could have been a valuable development resource for the server-side application architecture. But that amazing short-sightedness is a topic for another article.)

So with all due respect to Jeff Atwood, here's why I think all of your developers and testers should have a multi-core platform on which to write and test their applications: if they don't, and if your product has any success at all in the marketplace, sooner or later you will be screwed too. It's simply a matter of developing and testing on the platform you expect to deploy to the field. Or, as the rocket scientists say, "test what you ship, ship what you test".

I've written about some of the issues in the multi-core future in Small Is Beautiful, But Many Is Scary and its follow-up, and my alter-ego has given a talk on problems software developers are running into in writing multi-threaded code for multi-core (and, in the case of hyper-threading, sometimes even single-core) platforms. The company that issues me my paycheck, Digital Aggregates, put its money where my mouth is: we just added a Dell 530 with an Intel 6600 2.4 GHz quad-core processor to the server farm.

1 comment:

Anonymous said...

I definitely found having a multi-core workstation to be beneficial over the years. I think I got my first one about 7 years ago and used it to test the multi-threaded code I was developing. It also lets me play my tunes without interruption while compiling code. :-)