[Sidefx-houdini-list] Intel's Larrabee Delayed Indefinitely
malexander at sidefx.com
Tue Dec 8 14:04:36 EST 2009
John Coldrick wrote:
> I'm really hoping/expecting this whole GPU bugaboo will die off sooner
> rather than later. Thread more apps, thread them better, throw more cores at them.
> Don't do what autodesk does to our poor, overfed compositors and talk them
> into buying inferno render nodes with nvidia cards and then announce 6 months
> later they need to upgrade the cards because the new release won't run on
> them. It's a thief's market.
> And inferno hardly even threads...not surprising I guess, considering they've
> been tied to GPUs since their start.
Unless the previous GPUs were of the discrete vertex/fragment shader (DX9)
variety, this seems pretty fishy to me (on Autodesk's part). There is
not a huge difference between the original compute capability 1.0 (8800GTX)
and the current version, at least not one that can't be creatively
worked around. Especially for a compositor...
Re: Larrabee itself, using the complex x86 instruction set for a GPU and
trying to sell it as a feature seemed strange from a development and
architectural standpoint. Complex instructions take longer to decode and
use more die area and/or more power. Also, software developers don't code
in assembly; we use a compiler, which hides the underlying instruction set
anyway (whether it's x86, Nvidia's GPU language, or whatnot). I would much
rather Nvidia and ATI came out with a compiler that supports C++ and
pthreads (supposedly the latest version of CUDA does support C++) than
see a power-hungry, slower x86 implementation.
The 16-wide vector unit also seemed like overkill - trying to keep ATI's
4-wide/1 special function unit fed on its series of GPUs is hard enough.
There's still a lot of scalar code to process, and it's difficult to
vectorize it effectively. So you end up with a unit with most of its resources
idling, unless the code for it is very specifically and meticulously
targeted to the unit. This probably means that existing code, even
x86-compiled, would still have to be modified to support such an
architecture (compilers still aren't that smart).
Finally, you can't just take code that was designed for 2-4 cores and
run it on 20+ cores effectively - diminishing returns on the speedup
rack up pretty quickly. A different mindset is needed when dealing with
dozens of threads (bandwidth and compute oriented, not latency and
cache/memory oriented). These points made me rather skeptical of Intel's
claims about how easy Larrabee would be to develop for, and of its benefits.
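The diminishing returns follow directly from Amdahl's law. A quick sketch (the 90% parallel fraction is an assumption for illustration, not a measurement of any real application):

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: overall speedup when only parallel_fraction of the
    runtime scales across the given number of cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

# Code tuned for a few cores, 90% parallel:
#   4 cores  -> ~3.1x
#   24 cores -> ~7.3x  (6x the cores for only ~2.4x the extra gain)
few  = amdahl_speedup(0.9, 4)
many = amdahl_speedup(0.9, 24)
```

Past a handful of cores the serial 10% dominates, which is why getting real use out of dozens of threads means restructuring the code around bandwidth and throughput, not just recompiling it.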
Unfortunately, I really _was_ looking forward to the death of the Intel
GMA by Larrabee. Oh well.
ps - my views, not SESI's.