Tuesday, March 11, 2008

Wrap-up from Rice's HPC for Oil & Gas

I learned something while live-blogging the O&G HPC event at Rice: you can't simultaneously report and analyze. I owe myself (and others) a short reflection on the event. So here goes...

The current accelerated computing work is going full force. The options exist and the barrier to experimentation is very low. This is very good for Accelerated Computing.

However, I don't think the motivations are pure. The chief reason for working on silicon outside mainstream x86 is fear of many-core. There is an expectation that x86 complexity is increasing dramatically while performance stagnates, and the cost of overcoming x86 many-core complexity is unknown.

The presentations were by smart, motivated people exploring the alternatives. What they had in common seemed to be the following:
  1. Current scaling options are running out. All presenters showed scale-up results on dual- and quad-core x86 CPUs, and the curves are all asymptotic.
  2. The compute is data-driven. That is to say, there is a lot of data to be worked on, and it is increasing.
  3. Squeezing more performance out of x86 cores at greater scale is going to be more expensive than historical trends suggest. The complexity of application management is emerging as both a motivator for and a barrier to Accelerated Computing.
  4. They need to touch the compute kernels anyway. If they are going to rewrite the compute-intensive sections, why not try the code on a different piece of silicon or ISA? They have been moving away from hardware-specific code in any case.
The optimal point on the curve for the O&G group was compute-kernel code that looks like human-readable C (or Fortran), integrates with an existing x86 cluster, and returns >10x the single-thread performance of a current quad-core.
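To make that concrete, here is a minimal sketch of the kind of kernel in question: a 1-D acoustic wave stencil in plain, portable C. This is my illustration, not any presenter's code, and the function name and parameters are hypothetical; the point is that a loop like this carries no hardware-specific intrinsics, so it can be retargeted to a different piece of silicon with comparatively little rework.

    #include <stddef.h>

    /* Illustrative 1-D acoustic wave stencil in plain, portable C.
       Names and parameters are hypothetical. Update rule:
       next = 2*cur - prev + vel^2 * (dt^2/dx^2) * laplacian(cur) */
    void wave_step(const float *prev, const float *cur, float *next,
                   const float *vel2, size_t n, float dt2_over_dx2)
    {
        for (size_t i = 1; i + 1 < n; i++) {
            float laplacian = cur[i - 1] - 2.0f * cur[i] + cur[i + 1];
            next[i] = 2.0f * cur[i] - prev[i]
                    + vel2[i] * dt2_over_dx2 * laplacian;
        }
    }

An accelerator port would replace just this loop, leaving the surrounding cluster code (MPI, I/O) on x86, which is the integration story the group was optimizing for.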

Mainstreaming Accelerated Computing will not happen without addressing the complexity of systems and application management. I don't know who is really working on this... Do you?

5 comments:

Anonymous said...

Mainstreaming Accelerated Computing will not happen without addressing the complexity of systems and application management. I don't know who is really working on this... Do you?

I might have misunderstood the question, but how about Intel and the AAL?

http://download.intel.com/technology/platforms/quickassist/quickassist_aal_whitepaper.pdf

Unknown said...

I know AAL, but it doesn't address knowing whether the accelerator (FPGA) is up, down, or stuck, or whether it is throwing random bit errors.

Those are the system management requirements that corporations need met in order to compute at scale.
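For illustration only, here is a minimal sketch of the kind of per-board status a management layer would need to expose; none of this exists in AAL as far as I know, and every name here is hypothetical:

    #include <stdio.h>

    /* Hypothetical accelerator health states (illustrative only). */
    enum accel_state { ACCEL_UP, ACCEL_DOWN, ACCEL_STUCK };

    /* Hypothetical per-board status snapshot a management layer
       would surface for every accelerator in a cluster. */
    struct accel_status {
        enum accel_state state;
        unsigned long bit_error_count; /* e.g. from ECC/CRC counters */
    };

    /* Illustrative watchdog check: flag boards that are down, stuck,
       or accumulating random bit errors. */
    int accel_healthy(const struct accel_status *s)
    {
        if (s->state != ACCEL_UP) {
            fprintf(stderr, "accelerator not up (state=%d)\n", (int)s->state);
            return 0;
        }
        if (s->bit_error_count > 0) {
            fprintf(stderr, "bit errors: %lu\n", s->bit_error_count);
            return 0;
        }
        return 1;
    }

Multiply that by thousands of boards and you have the scale problem I mean.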

Richard Kaufmann said...

OK, why are folks scared of multicore? Is it because:

* Memory bandwidth ratios aren’t tracking performance (actually a problem that predates multicore)?

* It’s hard to think in parallel, and they think that the compiler providers can’t/won’t figure out how to automatically parallelize within a socket?

* There’s something inherently evil in SIMD, and they’d prefer some kind of data-parallel machine?

So far the techniques for programming accelerators feel a lot more like assembly programming to me (human readable, hah!), and I am not at all surprised that their use is restricted to slaves (graduate students) and the brave (those deploying huge embarrassingly parallel farms).

Other quick comments:

* What option does a program have to being “data driven”? :-)

* If someone is rewriting a compute kernel, what language/model are they going to use that will be (even somewhat) future proof?

* HPC folks only hear about the applications that aren’t scaling on the normal roadmaps. Lots of folks are happy as clams with multicore; we only hear from those who aren’t. Perhaps you meant the problem starts at 32-core?

Managing accelerators is easy once you figure out:

* How to program the darned things in the first place
* How to make them reliable at scale (as you imply, they’re designed for single-socket implementations -- not large, parallel machines)

Richard Kaufmann said...

Yeah, I meant MIMD!

Unknown said...

Adding to the thread, though this certainly deserves a post of its own...

(I can give Richard Posting rights *and* he should elaborate on this!)

Managing and scaling accelerators is harder than managing 100s of MPI processes, and 100s of MPI processes are the future of these HPC apps.

Yes, I believe there are people who are happy on multi-core, and they should exist beyond HPC, but I can't find them. Your comment on memory bandwidth (and others on I/O scalability) looms large. More cores don't make a database or a virus scan run faster.