Is it Time to Throw Away your Supercomputers – or How Folding@Home is Dominated by the New Kids in Town
Jeff Atwood at Coding Horror is not the first, but maybe the most popular blog yet to comment on the recent inroads the Playstation 3 has made into the Folding@Home world. Although the console has only been released for a few months, it has already contributed 72 percent of the computing power in the entire Folding@Home project. Not to mention that it was only released in all of Europe the day before yesterday. And it has not exactly taken the world by storm either, if you look at the sales figures (the Wii, the PS2 and the Xbox 360 have all been selling better in recent months in the US). Still, the PS3 has shown all the other contestants where they stand with regard to Folding@Home. Therefore I have asked myself the question: is it time yet to throw away the traditional supercomputers and start again from scratch with the Cell?
Short answer: no. Not yet, at least. Maybe later :P. Long answer: there are a couple of things wrong with the simple picture painted by the Folding@Home OS stats.
Die, FLOPS, Die!
As Jeff already points out in his article, what is being compared here are FLOPS: floating-point operations per second. Which means exactly this much: not too much. FLOPS have been used as a marketing instrument to boast about the power of a new supercomputer or CPU basically forever. Yet I am not interested in how many simple floating-point additions in a row a CPU can perform, because I am never going to get that kind of workload in a real-world program anyway. My programs have branches, memory accesses and all kinds of other operations that keep the CPU far away from its theoretical peak at all times, even for the most basic data-parallel operations. That is why the supercomputing centers invest heavily in benchmarks that resemble their actual workloads; if FLOPS meant much to them, they could just as well save their money.
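To see where such a peak number even comes from, here is a quick back-of-the-envelope sketch. The Cell figures below are the commonly quoted ones for its eight SPEs (a PS3 exposes fewer of them to applications), so treat the result as a rough upper bound rather than anything you would ever measure:

```c
#include <stdio.h>

/* Theoretical peak = cores x clock x SIMD width x FLOPs per lane per cycle.
 * The numbers are the commonly quoted ones for the Cell's SPEs:
 * 8 SPEs at 3.2 GHz, 4-wide single-precision SIMD, fused multiply-add. */
int main(void)
{
    double spes        = 8.0;
    double clock_ghz   = 3.2;
    double simd_width  = 4.0;  /* single-precision lanes per SPE */
    double ops_per_fma = 2.0;  /* a fused multiply-add counts as two FLOPs */

    double peak_gflops = spes * clock_ghz * simd_width * ops_per_fma;
    printf("theoretical peak: %.1f GFLOPS single precision\n", peak_gflops);
    /* ~204.8 GFLOPS -- a number no branchy, memory-bound real-world
     * code will ever get close to. */
    return 0;
}
```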
A more interesting number in this context would be how many work units of the Folding@Home project have been completed by each architecture, a number that is unfortunately not published.
Not All Operations Are Created Equal
The Cell processor has very impressive performance for single-precision floating-point operations. Unfortunately, its performance sucks when it comes to double precision. In computer games this does not matter much, and that is what the first-generation Cell processor was optimized for. The picture is different for many traditional supercomputing workloads, where double precision is required. And although there are ways to achieve double-precision accuracy on single-precision hardware, as far as I know they are not commonly employed. There are rumors out there that the next-generation Cell processor will not have this problem, but until it is out and they can be confirmed, this remains a serious weakness of the Cell.
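For the curious: the best-known of these tricks is so-called double-single (or native-pair) arithmetic, where a value is carried around as the unevaluated sum of two single-precision floats and error-free transformations keep track of the rounding errors. The sketch below shows only the addition step and uses names I made up (dsfloat, ds_add); it is meant to illustrate the idea, not to be production code:

```c
#include <stdio.h>

/* A double-single value: hi holds the leading bits, lo the rounding error. */
typedef struct { float hi, lo; } dsfloat;

/* Knuth's two-sum: returns a+b rounded to float plus the exact rounding error.
 * (In real code you must make sure the compiler does not keep these
 * intermediates in higher precision, e.g. via the right compiler flags.) */
static void two_sum(float a, float b, float *s, float *e)
{
    *s = a + b;
    float v = *s - a;
    *e = (a - (*s - v)) + (b - v);
}

/* Add two double-single numbers. */
static dsfloat ds_add(dsfloat a, dsfloat b)
{
    float s, e;
    two_sum(a.hi, b.hi, &s, &e);
    e += a.lo + b.lo;

    dsfloat r;
    r.hi = s + e;           /* renormalize so hi again holds the */
    r.lo = e - (r.hi - s);  /* leading bits and lo the remainder */
    return r;
}

int main(void)
{
    dsfloat a = { 1.0f, 1e-9f };  /* roughly 1 + 1e-9: more precision */
    dsfloat b = { 1.0f, 2e-9f };  /* than a single float can hold     */
    dsfloat c = ds_add(a, b);
    printf("hi = %.9g, lo = %.9g\n", (double)c.hi, (double)c.lo);
    return 0;
}
```

The catch, and a big part of the reason this is not commonly used, is that every such addition costs on the order of ten single-precision instructions instead of one, which eats a good chunk of the theoretical advantage.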
Why Settle for the Cell, when You Can Have a GPU?
If you take another look at the numbers in the articles I linked to, the real star is not the Playstation 3. It is the GPUs. With only 733 active GPUs at the time, the computing power they contribute equals that of 25,239 traditional CPUs running Linux, which works out to roughly 34 CPUs' worth of FLOPS per GPU. Of course, this is comparing apples and oranges again, as what is compared are FLOPS (see above) and half of these Linux CPUs might as well be 386s. Or worse :P. We will never know exactly, but it is obvious that the GPUs are operating in a whole different dimension. They do, however, share the Cell's double-precision weakness at this time.
The Good
So let’s try to rephrase the original question slightly: is it time yet to throw away the traditional supercomputers and start again from scratch with PS3s or run-of-the-mill GPUs? There are many points in favor of this proposition:
- They are dirt cheap. Economies of scale really help here, and in a year or two we will see them in virtually every PC or game console sold.
- They are power-efficient. A PS3 draws about the same 200 W as a traditional PC (and I have seen gaming PCs that take much more). GPUs have grown more power-hungry over the past few years as well, but relative to their computing power they are still a bargain when it comes to power consumption (see the back-of-the-envelope calculation after this list).
- They are blazing fast. How fast depends on your workload and a whole lot of other factors, but amazing things are possible with them.
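To put the power-efficiency point in numbers, here is the calculation promised above. The 200 W figure is the one from the list, the PS3 peak is the single-precision SPE number computed earlier, and the desktop peak is my own rough assumption for a contemporary dual-core CPU; everything here is a peak value, so take the result as an order-of-magnitude statement only:

```c
#include <stdio.h>

/* Back-of-the-envelope FLOPS-per-watt comparison.  The 200 W figures come
 * from the list above; the PS3 peak is the SPE number computed earlier,
 * and the desktop peak is an assumed ballpark figure for a current
 * dual-core CPU with SSE -- adjust to taste. */
int main(void)
{
    double ps3_peak_gflops = 204.8, ps3_watts = 200.0;
    double pc_peak_gflops  = 30.0,  pc_watts  = 200.0;  /* assumed */

    printf("PS3:        %.2f GFLOPS/W (peak)\n", ps3_peak_gflops / ps3_watts);
    printf("desktop PC: %.2f GFLOPS/W (peak)\n", pc_peak_gflops / pc_watts);
    return 0;
}
```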
Some pretty strong points in favor of the specialized processors. Now let’s take a look at the bad:
The Bad
Regular readers of my blog know what’s coming now. There are decades of experience with programming traditional CPUs. By now we have pretty good compilers and people who know how to program them efficiently. None of this exists for the new architectures. There are compilers for the Cell out there, but I hear they are not exactly bug-free or easy to work with. And how could they be? The architecture is so new, there just have to be problems! The same is true for the GPUs.
And let’s not even get started on legacy codes. I just don’t see many scientists reprogramming their applications for the new architectures tomorrow. They want to get their science done, and many of them are happy enough to have their simulations or calculations or whatever running on the architectures they have today.
These obstacles can and will be overcome by the big players in supercomputing: the ones with huge staffs to help users reprogram their applications, or who will even do it for them. The normal supercomputing center at your university? I guess not. Or at least not tomorrow.
Of course, as always, I can only speak from my limited experience, so here is my question: do you see specialized processors appearing in your computing center of choice in the near future? And if so, would you actually put the extra processing power to good use?