[Sidefx-houdini-list] render farms, pipelines and file systems

Edward Lam edward at sidefx.com
Thu Jan 8 09:14:06 EST 2009


Chip Collier wrote:
> And while I haven't implemented this myself, Mercurial or GIT could be
> an interesting alternative to the more traditional rsync, ftp, etc..

FWIW, last I read in the Mercurial documentation (or wiki?), they 
recommended using Subversion instead if you primarily store binary data.

-Edward

Chip Collier wrote:
> There is a free implementation of GFS via CentOS I think; as well as
> Hadoop. So that may help you out a bit. As far as performance goes,
> it's like anything else.. depends on the setup and machines
> participating.
> 
> And while I haven't implemented this myself, Mercurial or GIT could be
> an interesting alternative to the more traditional rsync, ftp, etc..
> for staging your data if you have a relatively non-volatile cache
> space on each node. It's not less complicated, but could possibly save
> some time and bandwidth.
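[Editor's note: not from the thread, but the bandwidth-saving idea above can be sketched in a few lines. This is a minimal hash-based staging step with hypothetical paths; a DVCS like Git/Mercurial, or rsync, performs essentially this comparison, just more cleverly and at sub-file granularity.]

```python
"""Sketch: copy only files whose content differs from what is already
in the node-local cache.  Paths and extensions are hypothetical."""
import hashlib
import os
import shutil


def file_hash(path):
    """Content hash of a file, read in chunks to handle large textures."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def stage(src_dir, cache_dir):
    """Stage assets into the node-local cache; return files actually copied."""
    os.makedirs(cache_dir, exist_ok=True)
    copied = []
    for name in os.listdir(src_dir):
        src = os.path.join(src_dir, name)
        dst = os.path.join(cache_dir, name)
        if not os.path.isfile(src):
            continue
        if os.path.exists(dst) and file_hash(dst) == file_hash(src):
            continue  # unchanged -- skip the transfer entirely
        shutil.copy2(src, dst)
        copied.append(name)
    return copied
```

On a second run with an unchanged source directory, `stage()` copies nothing, which is where the time and bandwidth saving comes from.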
> 
> What scheduler are you using? Condor has an extension for
> automatically handling data transfers. Could save you some time, but I
> seem to recall it was a non-trivial task to set up. However,
> http://www.cyclecomputing.com specialize in Condor and develop CloudFS
> for Hadoop, which is cross-platform.
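[Editor's note: for reference, Condor's (now HTCondor's) built-in file-transfer mechanism is driven from the submit description file. A hypothetical minimal example, with made-up file names:]

```
# Ship inputs to the execute node and bring results back, no shared FS.
universe                = vanilla
executable              = render_frame.sh
transfer_input_files    = scene.ifd, textures.tar
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
queue
```

With `should_transfer_files = YES`, output files created in the job's scratch directory are transferred back to the submit machine when the job exits.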
> 
> At the end of the day, even with a fast shared centralized NFS server,
> you will easily blow away its ability to serve all the data you'll
> need. It's all about finding the balance for whatever jobs you're
> throwing at it.
> 
> 
> On Tue, Jan 6, 2009 at 7:28 PM, Drew Whitehouse
> <Drew.Whitehouse at anu.edu.au> wrote:
>> Hi all,
>>
>> I'm interested in hearing about experiences people have had
>> implementing render farms where you *don't* have a shared file system
>> between workstations and render farms. I currently have a solution
>> that involves shipping resources back and forth. However, it requires
>> parsing IFD files and heuristically determining what resources are
>> needed for a render and what results need to be transported back
>> post-render. I'd like to make this more robust, and I'm sure this
>> is something that large facilities have to solve sooner or later. The
>> same problems must crop up whatever renderer is used (PRMan,
>> mental ray, etc.). Following is a bunch of questions that come to mind,
>> some Houdini-specific, others more general.
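[Editor's note: the heuristic resource scan described above might look something like this minimal sketch. The extensions and the assumption of an ASCII .ifd with double-quoted paths are illustrative only, not a complete IFD grammar.]

```python
"""Sketch: pull quoted, path-like strings with known asset extensions
out of IFD text so they can be shipped to the farm ahead of a render.
The extension list is a guess, not exhaustive."""
import re

# Quoted strings ending in a recognized asset extension (illustrative).
ASSET_RE = re.compile(r'"([^"]+\.(?:rat|tx|exr|pic|bgeo|geo|otl))"')


def scan_ifd(text):
    """Return the unique, sorted asset paths referenced in IFD text."""
    return sorted(set(ASSET_RE.findall(text)))
```

A real implementation would also need to handle relative paths, environment-variable expansion, and binary IFDs, which is presumably where the "heuristic" part comes in.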
>>
>> Do you generate IFDs out on the farm or on artists' workstations, and
>> if so, how are resources synced across to where they are needed?
>>
>> Is it the case that you purchase expensive global file systems and
>> maintain a single shared file space? If so, are they robust and
>> performant?
>>
>> Obviously there are significant performance issues to consider, e.g.
>> you are working with GBs of texture per frame which you don't want to
>> ship across to a render node every time a frame is rendered. There are
>> many possible solutions to this problem, but I'd love to hear about
>> experiences in the trenches ...
>>
>> At what point do you find that single NFS servers grind to a halt? Are
>> scalability and locality a solved problem, or something you're always
>> having to tweak?
>>
>> Are you restricted to certain workstation architectures?
>>
>> Does your solution restrict you to certain applications?
>>
>> Any feedback appreciated,
>> Drew
>>
>> --
>> Drew Whitehouse
>> ANU Supercomputer Facility Vizlab
>> _______________________________________________
>> Sidefx-houdini-list mailing list
>> Sidefx-houdini-list at sidefx.com
>> https://lists.sidefx.com:443/mailman/listinfo/sidefx-houdini-list
>>
