Monday, April 28, 2008

GPGPU Cluster for France

French Supercomputer to be Intel plus Nvidia's Tesla...

I didn't see this covered very broadly, but it is a notable event.

French hybrid supercomputer to exceed 300 TFLOPS by 2009

French supercomputing institute Grand Equipement National de Calcul Intensif (GENCI), along with former nuclear research institute CEA (Commissariat à l'Energie Atomique), has commissioned Bull to build the first hybrid PC cluster in Europe. The new machine will be housed south of Paris in Bruyères le Châtel, a data centre also used by military institute CEA-DAM. The Bull Novascale series machine will comprise 1068 cluster nodes, each with eight Intel processor cores, plus an additional 48 GPU application accelerators with 512 cores each. The supercomputer will also have 25 TB of RAM and 1 PB of hard drive storage.
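For scale, the core counts quoted above multiply out as follows (a quick sanity check using only the figures in the article):

```python
# Core counts for the GENCI/CEA hybrid cluster, from the figures above.
cpu_nodes = 1068             # cluster nodes
cores_per_node = 8           # Intel cores per node
gpu_accelerators = 48        # GPU application accelerators
cores_per_accelerator = 512  # cores per accelerator

cpu_cores = cpu_nodes * cores_per_node
gpu_cores = gpu_accelerators * cores_per_accelerator

print(cpu_cores)   # 8544 conventional cores
print(gpu_cores)   # 24576 GPU cores
```

So roughly three quarters of the machine's cores sit in the accelerators, which is the whole point of the hybrid design.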

Friday, April 25, 2008

Tweaking the Parts

Back on March 26, Jay provided his commentary on HPCC & Steven Wheat's keynote in these pages.

Now John E. West has added to the discussion instigated by Wheat. "If high performance computing wants to continue to be a distinguishable market space, it needs its own research and development activities." So, when is that funding coming?

The HPC Wire article offers numerous suggestions for improving the process, including several ways to fix procurement and a call by Dan Reed, now of MSFT, for a coordinated national HPC R&D effort.[1]

The bottom line is that the economics aren't working right now.

Ed Turkel's statement about tweaks on commodity systems is naive about the real economic costs of delivering commodity components. By definition, commodity systems are mature markets with brutal margins; when you focus on holding the margin, even the most modest tweak to the components is expensive. To support tweaks, you either need modular designs that isolate the HPC embellishments from your mainstream delivery, or an industry that believes those tweaks will be table stakes in the near future.

We're going to see how expensive and sustainable silicon tweaks are in the GPGPU market. AMD & NVIDIA are delivering GPUs with functionality that no video display system will ever use. It's a real, live experiment in action.
I just hope someone is watching... Oh yeah, we are :)

[1] Dear President, I want America to spend more money on really big computers... See Michael Feldman's HPC Wire editorial on that one. I won't touch that discussion in these pages. At least not yet.

Tuesday, April 8, 2008

HPCSW: Since when are MPI & SPMD the same?

YAHPCC: Yet Another High Performance Computing Conference

I go to a number of these, and there seem to be more to attend every year. This post is about the HPC Science Week (HPCSW), sponsored by several government agencies plus some vendors.

However, as I prepped this I realized I also missed another one the same week: Suzy Tichenor and the Council on Competitiveness hosted an HPC Application Summit. Here's the coverage from HPC Wire. My highlight from Michael's write-up:
"There was also extensive discussion of how best to conceive a software framework for integrating codes that must work together to run multiphysics models. Codes written in-house have to work with codes provided by independent software vendors and open-source codes being built by far-flung communities. A software framework could be the solution."

This was also a theme at HPCSW. Some people make rash statements about 1000-core chips and others talk about new programming languages, but everyone agrees that humans cannot program at the anticipated level of future complexity. Since we can't go faster, we're going to have to use more, and there is a limit to how much more a person can manage.

There are a few themes that are becoming inescapable.

There will be experts, near-experts, and the rest of us. Experts will want to be as close to the silicon as possible, while the rest of us are mostly interested in it just working. This is going to require layers.

Data Movement is Expensive
We have plenty of computation, but moving the data to it is hard. We need methods that minimize the cost of data movement. Asynchronous threads, hardware synchronization and message queues are in vogue.
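The flavor of the asynchronous-threads-plus-message-queues approach can be sketched in a few lines (a toy illustration; all names and data here are invented, not from any talk). A prefetch thread feeds a bounded queue while the compute loop drains it, so transfer latency overlaps with computation instead of serializing with it:

```python
import queue
import threading

def prefetch(chunks, q):
    """Producer: simulates moving data chunks toward the compute unit."""
    for chunk in chunks:
        q.put(chunk)          # 'transfer' overlaps with compute below
    q.put(None)               # sentinel: no more data

def compute(q):
    """Consumer: processes chunks as they arrive, never waiting for all."""
    total = 0
    while True:
        chunk = q.get()
        if chunk is None:
            break
        total += sum(chunk)   # stand-in for real computation
    return total

chunks = [[1, 2, 3], [4, 5], [6]]
q = queue.Queue(maxsize=2)    # bounded queue throttles the producer
t = threading.Thread(target=prefetch, args=(chunks, q))
t.start()
result = compute(q)
t.join()
print(result)                 # 21
```

The bounded queue is the important design choice: it keeps the producer from racing ahead and filling memory, which is exactly the throttling role hardware synchronization plays in the real systems being discussed.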

Is it a Framework, or a Library?
The data is also more complicated, so a library doesn't seem to be sufficient. Programmers need constructs that handle data parameters and other odd bits. Libraries take rigor. Frameworks are application modules with architecture and external hooks. (See Guido's blog & comments for more on this.) Accelerated computing will take frameworks.
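One way to see the library/framework distinction in code (a minimal sketch with invented names, not any real accelerator API): a library is something your code calls, while a framework owns the architecture and calls your code back through hooks.

```python
# Library style: you own the control flow and call in.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

result_lib = dot([1, 2], [3, 4])   # 11

# Framework style: the framework owns the control flow; you register
# a kernel and it decides when and how to invoke it, handling the
# "odd bits" (setup, data staging, teardown) for you.
class AcceleratorFramework:
    def __init__(self):
        self.kernels = []

    def register(self, fn):
        self.kernels.append(fn)
        return fn

    def run(self, data):
        # A real framework could stage data to an accelerator here.
        results = {}
        for fn in self.kernels:
            results[fn.__name__] = fn(data)
        return results

fw = AcceleratorFramework()

@fw.register
def total(data):
    return sum(data)

result_fw = fw.run([1, 2, 3])      # {'total': 6}
```

The inversion of control is the point: the framework can insert data movement, scheduling, and teardown around your kernel without you rewriting anything.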

More User Control
Only the programmer knows... According to the pundits, future operating systems will let the user schedule threads, manage data access patterns, etc.

More on all these in the near future...

Friday, April 4, 2008

Belfast: Day 2

Eric still can't get online...

Many Core Reconfigurable Supercomputing Conference update
ECIT, Belfast, Northern Ireland
April 1-3, 2008

In short, the conference consensus is that accelerators are going to be an integral part of the future computing paradigm. This isn't surprising given the nature of the conference, but rather than speculative statements, there was growing evidence of community acceptance of heterogeneous computing as the next wave of innovation and performance.

Several presentations were made by vendors (SGI, Mitrion and Clearspeed) demonstrating cases where they have had success proving out performance in real applications: Mitrion with BLAST, Clearspeed with everything from quantum chemistry to financial modeling (all floating-point intensive), and SGI partnering with both. SGI's presentation provided several interesting perspectives.
  • It took a big case (proving 70 FPGAs working together) to begin to draw out interest by many companies in the technology.
  • Now people are approaching SGI about what it can do for them with FPGAs and accelerators. This isn't surprising either: with this demonstration, SGI stretched the size of the box that was constraining interest in FPGAs.
  • SGI has developed several examples using Quick Assist, but unfortunately, the details of the implementation and interface were not available.
  • It was important to note that Quick Assist focuses on single node acceleration, which is potentially limiting.

Mitrion presented on their language and BLAST example. A primary take-home point is that the parallel programming mindset needs to be developed earlier, for scientists and programmers alike. Mitrion-C helps enforce this mindset. Of course, Mitrion also emphasized portability across parallel processor types.

Clearspeed was very interesting because of the speedup and density of performance they are able to achieve. Admittedly SIMD in nature and focused on floating point, the accelerator has a valuable niche, but isn't universal. It seems that Clearspeed is the CM5 coming around again with updated technology. A notable point from Clearspeed was a call for common standards for acceleration, something akin to OpenMP but not OpenMP. Also notable was the availability of codes that already have Clearspeed implementations.

Several other presentations were given by Alan George on CHREC, Craig Steffen from NCSA, and Olaf Storaasli from ORNL. Alan talked primarily about CHREC's effort to move the thought process up to a strategy for computing the solution, instead of low-level optimization on a specific processor. A good direction, because it provides more common ground for domain scientists to interact with application performance experts.

Olaf talked about his work parallelizing Smith-Waterman across many, many FPGAs; with enough FPGAs, he achieved a 1000x speedup over a single CPU. This is another example of big cases providing visibility, and it shows that FPGA computing still has a ways to go before finding its limits.
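For readers unfamiliar with it, Smith-Waterman is a dynamic-programming local alignment algorithm; each cell of its scoring matrix depends only on three neighbors, which is what makes it such a natural fit for FPGAs. A minimal scalar sketch (the scoring parameters here are illustrative defaults, not the ones from Olaf's talk):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Score the best local alignment of strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            h[i][j] = max(0,                 # local alignment: floor at zero
                          diag,              # align a[i-1] with b[j-1]
                          h[i-1][j] + gap,   # gap in b
                          h[i][j-1] + gap)   # gap in a
            best = max(best, h[i][j])
    return best

print(smith_waterman("ACGT", "ACGT"))  # 8: four matches at +2 each
print(smith_waterman("AGC", "AAC"))    # 3
```

Because each cell needs only its left, upper, and diagonal neighbors, an FPGA can compute an entire anti-diagonal of h per clock with a chain of processing elements, which is where the massive parallelism comes from.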

Craig Steffen provided a good overview of NCSA's mission, which is to bring new computational capabilities to scientists. He provided good input on the steps necessary for a successful deployment of new computing technologies, including:
  • Making it easy to keep heterogeneous components (object files and bitstreams) together
  • Making decisions at run time on how the application problem will be solved
  • Making documentation available and consistent
  • Providing access to the latest versions (even pre-release), which is useful when trying to work around compiler bugs in early releases

Mike Giles from Oxford presented his experiences in financial modeling using GPGPUs and showed good success. He commented that standards for GPGPUs are many years off, but that OpenFPGA is a good sign from the RC community that standards are emerging. Mike also identified examples, tools and libraries, student projects, and more conferences as important to getting started with new technologies. For those experienced with parallel programming, it's a 2-4 week learning curve to use CUDA.

Greg Petersen (UT) talked about the cyber chemistry virtual center between UT and UIUC. The question for chemists is how to use these machines; a whole research front addresses this aspect in order to get to petascale systems. He talked about kernels for QMC applications, including a general interpolation framework, and looked at efforts using Monte Carlo stochastic methods with random numbers and a Markov process. There is significant work on the numerical analysis underlying the chemistry results.

Overall there are many applications using heterogeneous acceleration, many in the life sciences, ranging from MD to drug docking and Monte Carlo techniques, and nearly all referencing image processing and financial applications that performed well with accelerators. There was overlap in the life sciences space, with nearly every accelerator type demonstrating acceleration for at least one application in this space.

Another significant time block was for the OpenFPGA forum. A show of hands indicated that only about 20% of the audience was aware of OpenFPGA, so I spent 30 minutes on an OpenFPGA overview before moving to the discussion of the general API. Part of the presentation included gauging interest in assuring open interoperability for accelerators. There were no responses in the negative, many in the affirmative, and some undecided.
The GenAPI discussion went pretty well. In short, there were no showstoppers indicating a wrong direction, but there was more discussion on technical details of argument specification: what is included and what is not specified. There was strong interest in having more direction for new areas such as inter-FPGA communication, inter-node accelerator communication, etc., although all agreed it is too early to standardize because even the basics have yet to become standard.

There were some comments from those with a lot of history in HPC that the GenAPI looked similar to the model used by Floating Point Systems. There was general consensus that a simple first standard is best, allowing common use and then looking to emerging patterns as the basis for future standards. It appeared the application community would accept the standard if it were available.

Summarizing, the conference provided a good overview of the work moving computational science applications to new accelerator technologies, which are becoming the new mainstream way to get higher computing performance. The tools have matured enough that applications are being developed more broadly, and are beginning to be deployed.

Tuesday, April 1, 2008

Eric: Day1 at mrsc Belfast

Greetings reader(s)... I'm posting on behalf of Eric, whose email works, but web doesn't.

Here's a quick synopsis of the first day of workshops.
Attended the workshops for SGI and Mitrion. It seems that SGI has aligned strongly with Intel and is embracing Quick Assist as the portable accelerator platform. It's working for portable applications, but has some room to evolve to match the functionality of the RASC implementation. Exciting to see that customers are now approaching SGI following the demonstration of a 70-FPGA system running an application of mainstream interest.
There was a discussion of the RC300, which will merge the RC100 and RC200 product lines. The vendor for the FPGA component was not disclosed.

There were a number of informal discussions on OpenFPGA with attendees from new areas of potential interest for developing industry standards.

Mitrion presented their Mitrion-C programming language and illustrated the changes needed, about 1300 lines of code, to make BLAST accelerate with Mitrion-C. 1300 lines out of a million-plus lines of code isn't a bad percentage for porting a large application.
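To put that in perspective (taking "a million plus" as exactly one million, which is only the order of magnitude given):

```python
# Fraction of the BLAST code base touched for the Mitrion-C port.
ported_lines = 1300
total_lines = 1_000_000   # assumed round figure for "a million plus"

pct = 100 * ported_lines / total_lines
print(pct)   # 0.13 percent of the code base
```

A touch rate of around a tenth of a percent is the kind of number that makes accelerator porting look tractable for large legacy codes.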

Opening of the conference comes tomorrow. Looking forward to an exciting set of presentations.

Getting Value from Advanced Academic

This is the afternoon session for the Media Lab, including Schlumberger, Steelcase, and NEC.

My notes are...
First, you need to articulate the underlying ideas, trends or technology that drive your competitive differentiation.
Second, understand how your culture and mission match the vision of the institution.

Third, dedicate an impassioned champion for this work who has access to high-level executives in your company.

Fourth, enable your champion to reach out well beyond the corporate silos with internal tools, support and leverage.

Claude from Schlumberger says:
  • You get ideas & IP, not prototypes
  • Interact frequently
  • Bring lots of people to visit (your academic partners can show off)
  • Learn from 'demo or die'
  • Be prepared to constantly advocate the relationship
Joe from Steelcase says:
  • Look to your future (based upon the Alchemy of Growth)
  • Co-creation as a principle
  • Learn, learn, learn...especially from the students
Kayato from NEC was a resident scientist. He enjoyed being a student and strongly recommends it.

Toyz & more... MIT Media Lab Day 1 Lab Projects

The keynote was the magician and professional skeptic James Randi, along with Seth (of MagicSeth). The bottom line: the more you think you know, the easier you are to fool.

Scratch... tile-based programming for kids. The cool item was the ability to also send it to your mobile phone. I've enabled my kids already.

Sticky Notes... that are smart. Linking physical notes & books with computer storage. I couldn't find the video online, but the research summary is online.

Sociable Robots for Weight Loss... A partner to help you achieve your long-term goals. Can't convince a friend to get you to the gym? Ask Autom to help. There is a video of this one. And it will be commercialized.

Detecting Group Dynamics... (I think there was a catchy title for this as well, but I was blogging.) Using sensors to tell you if you are running Good Meetings or Bad Meetings. The cool part was the simple feedback model. I swear 90% of everything is the UI, and again, they used a simple display on the cell phone.

Cognitive Machines... Searching for the highlights of the game: "show me a video clip of Ortiz hitting a home run." The current state of the art uses the announcer, who is often just filling in the dull spots. Therefore, you need a machine that can "look" at the video to understand the patterns within it. THIS IS COMPUTATIONALLY INTENSIVE. (4th floor)

Information Spaces... What's this virtual world stuff anyway? I liked this one because of my personal interest in using online spaces more effectively for the things we really do as humans in meetings: reading social cues, building consensus, recognizing where people stand.

Active Sensor Nodes... Small, fast, real-time data on movement via wearable sensors. This is what Jacoby means when he says "Show me & I can learn"

Common Sense Toolkit... Yes, you too can have common sense via C code. Available online as a repository of simple statements, plus a library for semantic analysis called DIVISI (a potentially cool little library).

Tangible Media... You live in the real world, so why can't your computer interfaces act more like real items (paintbrushes, clothes and more)? This lab is also doing Gesture Object Interfaces, which is a really cool idea. Throwing your phone on the table is much better than voice control.

Zero Energy Home... a project of Changing Places... And they are really building this house in Maine. Nice ideas that can be used today. I need to re-read this stuff.

Smart Cities and Roboscooter... No American wants this, but everyone else does! It's what Dean Kamen thought we should target with the Segway, but he hasn't made the leap.