Houdini is becoming one of the most popular and highly regarded CG and VFX applications today. It can be used to model, render, animate and simulate objects and scenes non-destructively, with a great deal of precision and control. A key characteristic of Houdini is its step-wise, procedural use of nodes, networks and assets: modeling, lighting, rendering and visual effects are all composed of networks of nodes that the artist builds to complete the object, animation or scene. Networks can even reference other networks to add nuance and depth of effect to the scene in question.

The question we’re out to answer today is this: if every action is stored in a node, and nodes are then connected into networks of additional nodes, how does node and network complexity affect compute speed? In other words, what are the optimal system requirements for Houdini? In this, the first of a three-part series about the best hardware for Houdini, we’ll benchmark different hardware configurations using Houdini’s native renderer, Mantra. But first, we’d like to break down Houdini’s mechanics a bit more so you can see which steps and processes affect compute speed.

Particles and dynamics are perhaps Houdini’s most powerful draw for VFX artists, VFX supervisors and their teams, because a non-destructive procedural workflow is ideal for creating such effects. A procedural solution is also well suited to effects that must react “intelligently” to actions and variables in a shot, since it supports iteration and process control. In addition, Houdini provides content creation tools for modeling, rendering, character work and game development.

Houdini’s procedural workflow supports you as you create all of your CG content. Because you can explore different iterations many levels deep within a node or node network, you are free to experiment as you work. Viewport and shelf tools add to your ability to work interactively while Houdini builds the networks.

Moreover, using formats such as Alembic, FBX or EXR, artists can easily work back and forth with a wide variety of DCC apps. Houdini Engine plug-ins can be used to bring Houdini Digital Assets into other apps such as Maya, 3ds Max or Cinema 4D, or into game engines like Unity or Unreal Engine, while maintaining the asset’s procedural controls.

Because Houdini works effectively with large data sets, this three-part series looks at the different steps that make up a complete Houdini workflow in order to determine:

  1. What’s the recommended hardware configuration for Houdini?
  2. How important are a fast cache drive and RAM for Houdini?
  3. What CPU is best for Houdini?
  4. What GPU is best for Houdini?

Solids destruction, fluid and particle sims, and grain and fire effects are all computationally demanding to produce, so we first want to look at how different Mediaworkstations perform using Houdini’s built-in render engine, Mantra.

Rendering with Mantra

Mantra renders by dividing the image into tiles and rendering each tile individually. Smaller tiles make the computer more responsive during a render, but can decrease overall performance. You do have an option to ameliorate this, however: while a render is updating in MPlay or the render view, you can tell Mantra which tile to render next. This lets you preview the areas you’re interested in sooner.
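To make the tile trade-off concrete, here is a minimal Python sketch of how an image can be split into square tiles. It is only an illustration of the general idea, not Mantra’s actual implementation; the function names and the 1920 x 1080 resolution are assumptions chosen for the example.

```python
import math

def tile_grid(width, height, tile_size):
    """Return (x0, y0, x1, y1) pixel bounds for every tile covering the image."""
    tiles = []
    for y0 in range(0, height, tile_size):
        for x0 in range(0, width, tile_size):
            # Tiles on the right and bottom edges are clipped to the image bounds.
            x1 = min(x0 + tile_size, width)
            y1 = min(y0 + tile_size, height)
            tiles.append((x0, y0, x1, y1))
    return tiles

def tile_count(width, height, tile_size):
    """Number of tiles needed, without building the full list."""
    return math.ceil(width / tile_size) * math.ceil(height / tile_size)

if __name__ == "__main__":
    # Hypothetical HD frame; the tile sizes are illustrative values only.
    for size in (16, 32, 64):
        print(f"{size}px tiles -> {tile_count(1920, 1080, size)} tiles")
```

For this hypothetical frame, 16-pixel tiles produce 8160 work units while 64-pixel tiles produce 510. More, smaller units give finer-grained progress updates, but each unit carries some fixed scheduling cost, which is one plausible reason small tiles can lower overall throughput.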

Mantra is a renderer that supports scanline, raytracing and physically based rendering. Physically based rendering (PBR), as many of you probably know, refers to the use of realistic shading and lighting models, along with measured surface values, to represent real-world materials accurately. Because PBR is more a concept than a strict set of rules, PBR implementations (and their results) tend to vary.

Also, like RenderMan and Arnold, Mantra is a CPU-based renderer. As a result, if Mantra is your primary render engine, choosing the fastest CPU for Houdini is the top priority when making configuration choices. Houdini also supports renderers from many third-party ISVs, including RenderMan, Arnold, OctaneRender, Redshift and V-Ray, which we will explore in part three of this series on Houdini.

Mantra is deeply integrated into Houdini for rendering geometry, instances and volumes. As you might expect, rendering in Houdini depends on a camera, which defines the vantage point from which to render, and on lights to illuminate the scene. You also set up a render node, which represents the renderer and its render settings, although you can run preview renders as well.

Shaders & Materials

Lastly, to render objects in a Houdini scene, you assign materials (AKA shaders) to your geometry. In Houdini, materials and shaders are created in the mat and vex builder networks. One of Houdini’s strengths is that you build up the materials that define the look of your shot from nodes. At the same time, the complexity, size and settings of those networks, along with steps like caching nodes, directly affect compute speed.

Benchmarking

To determine the optimum Houdini system requirements, we tested our i-X, a-X and i-X2 Mediaworkstations in configurations that were matched except for the CPU (the dual-CPU i-X2 also uses 2933 MHz ECC memory):

i-X Mediaworkstation
(Intel CPUs listed in charts)
64GB DDR4 2666 MHz
500GB Samsung 970 Pro
NVIDIA RTX 2080 Ti 11GB GPU

a-X Mediaworkstation
(AMD CPUs listed in charts)
64GB DDR4 2666 MHz
500GB Samsung 970 Pro
NVIDIA RTX 2080 Ti 11GB GPU

i-X2 Mediaworkstation
2 x Intel Xeon Gold 6254 CPUs (36 Total Cores)
64GB DDR4 2933 MHz ECC
500GB Samsung 970 Pro
NVIDIA RTX 2080 Ti 11GB GPU

For our comparison, we rendered three different scenes, FLIP, GRAIN and PYRO (thanks to VFX Arabia for providing test samples), with Mantra on the above configurations to determine the fastest CPU for Houdini. NOTE: Temp caching and OpenCL acceleration, which would affect RAM and GPU usage, were disabled in the Houdini files. Our results are detailed below.

GRAIN

In the grain solver test we see that a higher core count is an advantage, but also that the results do not scale absolutely (i.e., 20% greater core count ≠ 20% better performance):
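One common, simplified way to reason about this kind of sub-linear scaling is Amdahl’s law, which caps speedup according to the portion of the work that cannot run in parallel. The sketch below is only a model: we did not measure the parallel fraction of these scenes, so the 0.95 value is a purely hypothetical assumption, and real results also depend on clock speed, memory latency and architecture.

```python
def amdahl_speedup(cores, parallel_fraction):
    """Ideal speedup over a single core when only part of the work scales.

    parallel_fraction is the share of the workload (0.0 to 1.0) that can
    be spread across cores; the remainder runs serially.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

if __name__ == "__main__":
    # Hypothetical value, NOT measured from the benchmark scenes.
    PARALLEL_FRACTION = 0.95
    # Core counts of the CPUs discussed in this article.
    for cores in (18, 32, 36):
        speedup = amdahl_speedup(cores, PARALLEL_FRACTION)
        print(f"{cores} cores -> {speedup:.2f}x over one core")
```

Under this assumed fraction, doubling from 18 to 36 cores yields far less than double the speedup, which is the same qualitative pattern as the benchmark results.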

FLIP

The FLIP solver is a hybrid between a particle-based and a volume-based fluid sim. The volume-based part is what prevents the particles from all piling on top of each other and keeps them moving in similar directions. Again, we see that Houdini compute time is aided by a higher number of cores, but not in a linear fashion:

PYRO

The Pyro solver is an extension of the Smoke Solver, and the Mantra performance with the different configurations with the Pyro sample is given below:

Some interesting observations in hardware resource utilization:

  • When the FLIP point count and Pyro voxel count were decreased, CPU utilization decreased proportionately as well.
  • The 31 M point FLIP scene generated a sawtooth pattern in CPU usage, but the lower-resolution version (1 M points) did not.
  • As expected, GPU usage was unaffected, but RAM usage on the FLIP scene went from 28% at the 1 M point setting to 55% at 31 M points.
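As a rough back-of-the-envelope check on the RAM observation, the percentages can be converted into gigabytes. This sketch assumes the percentages refer to the 64GB of system memory listed in the configurations above, uses decimal gigabytes, and attributes the entire increase to the extra FLIP points; none of these assumptions were verified, so treat the per-point figure as an order-of-magnitude estimate only.

```python
TOTAL_RAM_GB = 64  # system memory listed for all three test configurations

def used_gb(percent, total_gb=TOTAL_RAM_GB):
    """Convert a RAM utilization percentage into gigabytes."""
    return total_gb * percent / 100.0

def bytes_per_extra_point(low_pct, high_pct, low_points, high_points):
    """Estimate the additional memory per additional point (decimal GB assumed)."""
    delta_bytes = (used_gb(high_pct) - used_gb(low_pct)) * 1e9
    return delta_bytes / (high_points - low_points)

if __name__ == "__main__":
    low = used_gb(28)   # usage at the 1 M point setting
    high = used_gb(55)  # usage at the 31 M point setting
    per_point = bytes_per_extra_point(28, 55, 1_000_000, 31_000_000)
    print(f"{low:.2f} GB -> {high:.2f} GB, about {per_point:.0f} bytes per extra point")
```

Under these assumptions, usage grows from roughly 17.9 GB to 35.2 GB, or on the order of a few hundred bytes per additional point, which gives a starting point for estimating how much RAM a larger sim might need.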

Conclusion

While the case for more cores delivering proportionately better performance with Houdini’s built-in renderer Mantra is clear-cut in our Pyro sample, our other test comparisons aren’t so clear. Generally, more cores are better in all three tests, but performance does not scale in proportion to core count. In the Grain example in particular, the i-X2 with dual 18-core Xeon Gold 6254 CPUs leads the pack, but the 18-core i9 9980XE outperforms the 32-core Threadripper 2990WX by about 20%. In the FLIP scene render, the i9 9980XE and TR 2990WX are more closely matched (the i9 9980XE is about 10% faster), even though the TR 2990WX has 14 more cores.

Potential contributing factors include Intel’s CPU architecture still leading AMD’s in efficiency (in AMD’s four-die design, not all four dies can communicate directly with the memory controllers, which increases latency) and the i9 9980XE’s slightly higher turbo frequency compared with the TR 2990WX.

What this means is that while there are cases where the fastest CPU is clear, it may be helpful to first determine the best CPU for your top two or three applications, be they Maya, 3ds Max, Cinema 4D or others, when weighing your decision. For most media professionals, those top applications matter most in determining which configuration (including CPU) is best for you.
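To put the Grain and FLIP comparisons on a per-core basis, the arithmetic can be sketched as follows. This assumes that “about 20% faster” means roughly 1.2 times the overall throughput (and likewise 1.1 times for FLIP); the function name is our own, and the figures are approximations derived from the rounded percentages above, not separate measurements.

```python
def per_core_ratio(cores_a, cores_b, overall_advantage_a):
    """How much more work each core of CPU A delivers than each core of CPU B.

    overall_advantage_a is A's overall throughput relative to B
    (for example, 1.2 if A is about 20% faster).
    """
    return overall_advantage_a * cores_b / cores_a

if __name__ == "__main__":
    # 18-core i9 9980XE versus 32-core TR 2990WX, using the rounded
    # percentages from the text (assumed to be throughput ratios).
    grain = per_core_ratio(18, 32, 1.20)
    flip = per_core_ratio(18, 32, 1.10)
    print(f"Grain: each i9 core does about {grain:.2f}x the work of a TR core")
    print(f"FLIP:  each i9 core does about {flip:.2f}x the work of a TR core")
```

Read this way, each i9 9980XE core delivered roughly twice the effective throughput of a TR 2990WX core in these two scenes, which is why core count alone is a poor predictor here.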

In our next two articles, we’ll continue our look at Houdini. First, we’ll explore the impact that faster cache drives have on Houdini workflows, as well as how different amounts of system RAM affect performance.

Special thanks to Curvin Huber for his help and expertise in testing and writing this article.

Houdini Software System Requirements – Houdini Recommended Configurations

Below are our recommended Houdini configurations: the price-for-performance king (our a-X Mediaworkstation) and the performance KING (the i-X2, our most cost-effective dual Xeon Scalable configuration).