I am having a very mixed reaction to the recent public announcements of AMD's Heterogeneous System Architecture (HSA) roadmaps. As anyone who stumbles across this blog should know, I did a lot of work in that area. I really do (or is it "did"?) think it represents a real future of computing silicon. It should reduce overall power and is the roadmap to create truly powerful single-chip computers that do everything. Building hotter FPUs or more cache or more generic cores is a race of diminishing returns.
We already have significant sections of modern x86 CPUs dedicated to special-purpose functions. Vector instructions with SSE started us down the path, but now we have instruction sets and silicon that speed up encryption and other low-level functions. The advantages of this model are a single instruction set and, most importantly, a flat memory space.
It is the overhead of memory copies that actually eats up most of the advantage of dedicated silicon. The problem has to be large enough to be worth shipping the data around. Phil Rogers, Mark Hummel and others at AMD know this well. But so do the teams at NVIDIA, Intel and every other silicon company in the world.
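To put rough numbers on "large enough", here is a back-of-the-envelope sketch of the break-even point for offloading work to dedicated silicon. It is only a toy model under assumptions of my own: the function names, the fixed launch latency, and every throughput figure are hypothetical, the result is assumed to be the same size as the input, and nothing here is measured from real AMD, NVIDIA or Intel hardware.

```python
def host_time(n_bytes, host_throughput):
    """Seconds to process n_bytes in place on the general-purpose cores."""
    return n_bytes / host_throughput


def offload_time(n_bytes, copy_bandwidth, accel_throughput, fixed_latency):
    """Seconds to copy n_bytes out, process them, and copy the result back.

    Assumes (hypothetically) that the result is as large as the input,
    so the data crosses the link twice.
    """
    copy = 2 * n_bytes / copy_bandwidth
    compute = n_bytes / accel_throughput
    return fixed_latency + copy + compute


def break_even_bytes(copy_bandwidth, accel_throughput, host_throughput, fixed_latency):
    """Smallest problem size at which offloading wins, or None if it never does."""
    # Time saved per byte by offloading, after paying for both copies.
    per_byte_gain = 1 / host_throughput - (2 / copy_bandwidth + 1 / accel_throughput)
    if per_byte_gain <= 0:
        return None  # the copies cost more than the faster silicon saves
    return fixed_latency / per_byte_gain


# Illustrative, made-up figures (bytes per second and seconds).
HOST = 1e9      # general-purpose core
ACCEL = 10e9    # dedicated silicon, 10x faster
LINK = 8e9      # copy bandwidth between the two memory spaces
LATENCY = 20e-6 # fixed cost of launching an offload

threshold = break_even_bytes(LINK, ACCEL, HOST, LATENCY)
slow_link = break_even_bytes(1e9, ACCEL, HOST, LATENCY)
```

In this toy model a 10x faster accelerator never pays off once the link is no faster than the host cores (`slow_link` is `None`). A flat, coherent memory space attacks exactly the copy term.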
Unfortunately, the silicon industry has also shown that stamping out small, power-efficient general-purpose cores is easy. The difference in power consumption between onloading a function such as a NIC onto dedicated general-purpose cores and building purpose-built silicon for it is shrinking. However, the design overhead of developing that silicon is not. There are only a handful of compute problems worth solving in silicon, but there should be thousands of compute functions to which it is worth dedicating a core.
AMD has an approach to coherent memory across the different silicon environments. I know enough of the people involved to be confident the solution is elegant and functional. I have deep confidence that it can be game changing in the HPC space where uber-FLOPS still matter and adoption is a matter of compiling in the libraries.
Unfortunately, I don't think it is industry changing. Current software trends are not toward getting more from a single program, but toward dividing the problem up into smaller, general-purpose compute elements. Software architects realized it was too hard to copy the data, so they moved the compute to the data. Yes, I'm talking about Map/Reduce.
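For anyone who has not seen the pattern, here is a minimal single-process sketch of the Map/Reduce idea using word counting. It is an illustration only, not how Hadoop or any real framework is implemented: the names `map_partition`, `reduce_counts` and `word_count` are my own, and the "partitions" are plain Python lists standing in for data that would live on separate nodes.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Runs where the data lives: turn one partition into a small partial result."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Merge two partial results; only these small summaries cross the network."""
    return left + right


def word_count(partitions):
    """Map over every partition, then fold the partial results together."""
    partials = [map_partition(p) for p in partitions]
    return reduce(reduce_counts, partials, Counter())


# Two hypothetical partitions, each imagined to sit on a different node.
partitions = [
    ["the duck flies", "the duck"],
    ["shoot ahead of the duck"],
]
totals = word_count(partitions)
```

The point of the design is that the bulky input never moves. Each general-purpose core chews on its local slice, and only the compact `Counter` summaries get shipped and merged.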
AMD's HSA and Map/Reduce represent two directions for software's use of silicon. In my opinion, the need to speed up specific algorithms has been circumvented already by software architecture. AMD is shooting where the duck was, not where it is going. That is the problem with silicon engineering in the current age. It doesn't move fast enough.
That's why Dan Reed, Burton Smith and others are working for MSFT these days and Peter Ungaro is giving the keynote at a Big Data conference. They are trying to shoot ahead of the duck.
If AMD can improve memory efficiency (e.g. garbage collection) and/or messaging primitives (e.g. collectives), they may not change the industry, but they will certainly have a competitive advantage in the modern age of distributed software architectures.