[Sidefx-houdini-list] Intel's Larrabee Delayed Indefinitely

malexander malexander at sidefx.com
Tue Dec 8 14:04:36 EST 2009

John Coldrick wrote:
> 	I'm really hoping/expecting this whole GPU bugaboo will die off sooner rather
> than later.  Thread more apps, thread them better, throw more cores at them.  
> Don't do what autodesk does to our poor, overfed compositors and talk them 
> into buying inferno render nodes with nvidia cards and then announce 6 months 
> later they need to upgrade the cards because the new release won't run on 
> them.  It's a thief's market.
> 	And inferno hardly even threads...not surprising I guess, considering they've 
> been tied to GPUs since their start.

Unless the previous GPUs were the discrete vertex/fragment shader (DX9) 
variety, this seems pretty fishy to me (on Autodesk's part). There is 
not a huge difference between the original compute capability 1.0 
(8800 GTX) and the current version, at least not one that can't be 
creatively worked around. Especially for a compositor...

Re: Larrabee itself, using the complex x86 instruction set for a GPU and 
trying to sell it as a feature seemed strange from both a development 
and an architectural standpoint. Complex instructions take longer to 
decode and use more die area, more power, or both. Also, software 
developers don't code in assembly; we use a compiler, which hides the 
underlying instruction set anyway (whether it be x86, Nvidia's GPU 
language or whatnot).  I would much prefer Nvidia and ATI to come out 
with a compiler that supports C++ and pthreads (supposedly the latest 
version of CUDA does support C++) than see a power-hungry, slower x86 
implementation.

The 16-wide vector unit also seemed like overkill - keeping ATI's 
4-wide + 1 special function unit fed on its series of GPUs is hard 
enough. There's still a lot of scalar code to process, and it's 
difficult to vectorize it effectively for SIMD. So you end up with a 
unit with most of its lanes idling, unless the code for it is very 
specifically and meticulously targeted to the unit. This probably means 
that existing code, even x86-compiled, would still have to be modified 
to run well on such an architecture (compilers still aren't that smart).
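
To put rough numbers on the idle-lane problem, here is a minimal sketch 
in Python. It is a toy model, not a description of Larrabee or any ATI 
part: it assumes every operation is either fully scalar (one lane busy 
for one cycle) or perfectly vectorized (all lanes busy), and the 
function name and the scalar fractions are made up for illustration.

```python
def lane_utilization(scalar_fraction: float, width: int) -> float:
    """Fraction of SIMD lane slots doing useful work (toy model).

    Assumptions (illustrative only):
      - total useful work is normalized to 1 operation
      - scalar ops issue one per cycle and occupy a single lane
      - vector ops issue `width` per cycle and occupy every lane
    """
    if not 0.0 <= scalar_fraction <= 1.0:
        raise ValueError("scalar_fraction must be in [0, 1]")
    if width < 1:
        raise ValueError("width must be at least 1")

    scalar_cycles = scalar_fraction                  # 1 op per cycle
    vector_cycles = (1.0 - scalar_fraction) / width  # `width` ops per cycle
    # Every cycle offers `width` lane slots, whether they are used or not.
    total_lane_slots = (scalar_cycles + vector_cycles) * width
    return 1.0 / total_lane_slots


if __name__ == "__main__":
    for width in (4, 16):
        for scalar_fraction in (0.1, 0.25, 0.5):
            util = lane_utilization(scalar_fraction, width)
            print(f"width={width:2d} scalar={scalar_fraction:.2f} "
                  f"utilization={util:.1%}")
```

Under these assumptions, code that is 25% scalar keeps a 4-wide unit 
about 57% busy but a 16-wide unit only about 21% busy, which is the 
"most of its resources idling" effect in miniature.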

Finally, you can't just take code that was designed for 2-4 cores and 
run it on 20+ cores effectively - diminishing returns on the speedup 
set in pretty quickly.  A different mindset is needed when dealing with 
dozens of threads (bandwidth and compute oriented, not latency and 
cache/memory oriented). These points made me rather skeptical of 
Intel's claims about the ease of developing for Larrabee and its 
benefits.
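
One common way to illustrate the diminishing returns is Amdahl's law, 
which is my choice of model here rather than anything Intel published. 
The sketch below assumes a fixed fraction of the work parallelizes 
perfectly and the rest stays serial; the 90% figure and the core counts 
are assumed values for illustration only.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores under Amdahl's law.

    speedup = 1 / ((1 - p) + p / n)
    where p is the fraction of runtime that parallelizes perfectly
    and n is the number of cores. Ignores bandwidth, cache and
    synchronization costs, so real results would be worse.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")

    serial_time = 1.0 - parallel_fraction
    parallel_time = parallel_fraction / cores
    return 1.0 / (serial_time + parallel_time)


if __name__ == "__main__":
    p = 0.9  # assumed: 90% of the runtime is parallelizable
    for cores in (2, 4, 8, 16, 24, 32):
        s = amdahl_speedup(p, cores)
        # Efficiency is the speedup achieved per core.
        print(f"cores={cores:2d} speedup={s:5.2f}x "
              f"efficiency={s / cores:.0%}")
```

With 90% of the work parallel, 4 cores give roughly a 3.1x speedup but 
24 cores only about 7.3x, so code tuned for a handful of cores leaves 
most of a 20+ core part unused unless the serial portion is reworked.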

Unfortunately, I really _was_ looking forward to the death of the Intel 
GMA by Larrabee. Oh well.


ps - my views, not SESI's.
