Image: GPU Rendering, OctaneRender, by Cornelius Dämmrich

 

PART 2: OctaneRender Hardware

Christopher: Let’s pick back up on GPU rendering with OctaneRender, looking at workflow considerations and the best hardware for OctaneRender. As we both know, hardware choices depend on the user and their needs: the software, how you use that software, and how much of your total workflow that software makes up. OctaneRender is just the render engine piece.

Tom: Yes, you can use one program at a time or several at the same time. For example, you might run a GPU and a CPU render engine simultaneously, or render something in the background while you’re compositing or running a simulation. Some users stick with one task at a time because they prefer their system to stay responsive. A different user might need a much more powerful machine to handle everything at once. So it depends on your workflow as much as on what software you’re using.

Christopher: Looking at hardware, I suppose it makes sense to start with the case. For most, it is a workstation case. What’s important for OctaneRender users to know about the case?

Tom: Well, obviously the most important thing is for the case to hold all the hardware that you’re choosing for your work. The other thing is very good airflow, because if the components housed inside overheat, they will soon start to throttle down and eventually even cause stability issues.

Liquid cooling itself is not necessary, but it helps to keep component temperatures down. In most situations that allows you to get more performance, but then the case has to be bigger and you have to pay for the convenience of more performance, less noise and lower temperatures. Because of all those things, it’s not for everyone.

Christopher: This is something we test thoroughly on all systems – temps. Adequate airflow to keep the system cool is important for all components. Primarily CPU and GPU but it’s important for all.

Tom: Yes, if you do not cool the motherboard well enough, you might end up with stability issues, because the motherboard carries the chips that tie all the components together. Your GPU and CPU may be cool enough because they have active cooling, but if your motherboard isn’t, it might still overheat, causing the entire system to lose stability, especially under heavy workloads when you need your computer to be rock solid.

Christopher: What about the motherboard itself, aside from the need to keep it cool?

Tom: For OctaneRender users, it’s mostly about PCIe connectivity, meaning how many GPUs you can physically connect and how many of them the chosen motherboard can handle.

There are different options to expand GPU power for rendering. You can add splitters or external GPU units, but if your motherboard’s BIOS cannot handle more GPUs, it doesn’t matter that you can physically connect more of them. If the system doesn’t boot, you cannot do any work, so it’s not only the physical capability of the motherboard, but also whether the BIOS allows you to boot the system.

Christopher: We only use workstation-class motherboards for our i-X3, i-X2, i-X, i-XL and portable workstations, which will all be X299, with X399 from AMD following shortly. Is it more on Z170, Z270 type motherboards that you have experienced the BIOS problems you mentioned in our previous conversation? GPU rendering with 3 or 4 GPUs on these boards does not work.

Tom: It depends. Most gaming motherboards allow you to connect 2-3 PCIe devices such as GPUs, and some that have PLX chips on board might allow you to run up to four GPUs for rendering without any problems.

You can also choose to use external boxes or splitters with motherboards that have fewer physical PCIe connectors, but again your system might not POST because it was never designed to run with more GPUs in the first place.

Then there are workstation motherboards. They’re not server products; they sit somewhere in between gaming hardware and servers. These motherboards have different drivers and different BIOS configurations, and they allow more GPUs and more devices to be connected and handled. The third group would be server gear.

So it really depends on the use case: how many GPUs you need for rendering and how you’re going to use them. For only two or three GPUs, you can get a lower-end system and save yourself a bit of money. For five, seven or eight GPUs, you can get a workstation motherboard. If you want to go beyond that and maybe have multiple CPUs, then you can choose server-grade components. There is no good or bad.

Christopher: I’d like to look at the CPU next. Different CPUs can only “see” a certain number of PCI Express lanes. Consumer chips like the i5 and i7, the 7600 or 7700 series, typically can only see 16 PCI Express lanes, whereas workstation chips can see 28, 40 or 44. Same for Xeon. What should users know about the best CPU for OctaneRender?

Tom: Regarding PCIe lanes, I would say you will have a hard time noticing any difference between x8 and x16. Once you get down to x4, you might notice some slowdowns, especially in loading times with heavy projects, because there will be much more information going between CPU and GPU.

However, there is no need for a 40-lane (six or more core) CPU if you’re going to use just a couple of GPUs. If you plan on building a rendering machine with four or more cards, you might be forced into a better platform that is compatible only with 40-lane CPUs and has the BIOS capabilities to handle more devices.

Christopher: Might be good to break this down for some readers. When you say x16, x8 and x4, to use a consumer example: if you have an i7 7700K, which can only see 16 PCI Express lanes, and you have one GPU on board, it’s going to see it at x16, assuming no other PCIe devices are attached. If you have two GPUs, each card will drop to x8 bandwidth, and if you have four, it’ll be four at x4, again assuming there are no other PCIe devices present. Your point: with PCIe lanes at x4 speeds, you might start to notice a slowdown in performance.

Tom: Yes, because the GPUs might start waiting for information from the CPU before rendering, so they would not be performing at maximum capability all the time, or there would be gaps in utilization. When building a system, balance is key; it makes no sense to put in far too much GPU power and save on the CPU and motherboard.

Even at x1 speed it would still work, so technically, if you are upgrading, you can add a high-end GPU to a rather old computer with a slower interface and still get nearly the full speed of that GPU. The only problem is that once you go as low as x1, stability might become an issue as well, in addition to the already slower workflow and rendering.

For the best experience, I would say you need at least eight lanes (x8). That would be the logical move if you are building or looking for a new system where GPU rendering is important.

Regarding lanes on motherboards, it’s not so simple. Some of them can be paired with a 16-lane CPU, but if they have a PLX chip on board that “multiplies” lanes, each GPU would still get x8. Those PLX-equipped motherboards are far from the cheapest due to the additional complexity, but they are well worth the money.
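The lane arithmetic discussed above can be sketched in a few lines of Python. This is a simplified model, not a description of any particular board: it assumes the CPU's lanes are split evenly between GPUs and that each link falls back to the nearest standard width. Real motherboards may split lanes unevenly depending on slot wiring, and PLX switches change the picture entirely. The function names are made up for illustration.

```python
# Simplified model of sharing CPU PCIe lanes between GPUs.
# Assumption: an even split, with each link rounded down to a standard
# width. Slot wiring and PLX switches on real boards are ignored here.

VALID_WIDTHS = (16, 8, 4, 2, 1)


def link_width_per_gpu(cpu_lanes, gpu_count):
    """Return the link width each GPU gets under an even split."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    share = cpu_lanes // gpu_count
    for width in VALID_WIDTHS:
        if width <= share:
            return width
    raise ValueError("not enough lanes for that many GPUs")


def meets_x8_guideline(cpu_lanes, gpu_count):
    """Check the 'at least x8 per GPU' rule of thumb from the interview."""
    return link_width_per_gpu(cpu_lanes, gpu_count) >= 8


for gpus in (1, 2, 4):
    width = link_width_per_gpu(16, gpus)
    print(f"16-lane CPU, {gpus} GPU(s): x{width} each")
```

Under this model a 16-lane CPU stops meeting the x8 guideline at three or more GPUs, while a 44-lane CPU still gives four GPUs x8 each.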

Christopher: Right. So the other points about the CPU are well taken, because OctaneRender is just the render engine. It’s one aspect of most people’s workflows, and if they’re using Cinema 4D or Maya or 3ds Max, those applications will often dictate the CPU choice.

Tom: Exactly. With OctaneRender, once you open an OctaneRender file and start loading a scene, the CPU does all the texture loading, scene loading and all of the other bits. It converts the information, and once it’s converted, that information is pushed to the GPUs. Only once that information is inside the GPU’s video memory (VRAM) does the GPU kick in and start to render. It renders out the result and sends that rendered information back to the CPU. So even though the CPU is not rendering, it is responsible for preparing all the information for rendering.

Christopher: So it is responsible for preparation, in other words.

Tom: Yes, and an important aspect here is that you want a CPU that not only has a lot of cores but is also as fast as possible. A fast CPU is going to prepare that information for OctaneRender and send it to the GPUs sooner than one that has more cores but is generally slower (or has the same clock speed but is based on an older architecture).

Christopher: We’ve seen situations where, let’s say, two systems have 4x GTX 1080 Ti on board, and one has an i7 7700K “consumer” CPU and the other has, let’s say, the i9 7900X workstation chip. In some situations the i7 7700K is faster, and in others the i9 wins easily.

Tom: The GPU rendering as a process is going to be more or less the same on either system, whether it has six core CPUs or four core CPUs. The thing is that the responsiveness of the system before it kicks into rendering depends a lot on the CPU speed, as mentioned.

Christopher: Quad-core CPUs can run at higher frequencies and so process single-threaded workloads faster. For workloads that need only a core or two, a quad-core CPU feels faster, while a 6, 8 or 10 core CPU would be slower until you throw more tasks at it in parallel than a quad core could handle, in which case you may need those extra cores, depending on your core applications.

Tom: Responsiveness also depends on the specific CPU, because you can find six-core or eight-core units from a new architecture that are faster than certain quad-core models. In the end it depends on how much money you want to invest and what makes sense for your needs and your specific workloads.

Christopher: When you say the newer architecture I think you’re referring to Skylake-X i7-7820X, i9-7900X LGA 2066 CPUs. New architectures from Intel usually provide a 5-10% speed increase generation on generation.

Tom: Yeah, basically every new product has some sort of improvement to do more work per cycle. So a CPU that is five generations old is not going to be as efficient. Even at the same clock speed, let’s say 4 GHz (old) vs 4 GHz (new), the new CPU is going to be a bit faster because of all the architectural improvements. But as you mentioned, with a new architecture you generally get only a 5-10% improvement per cycle.
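To put the "4 GHz old vs 4 GHz new" comparison into numbers, here is a small sketch. It assumes the 5-10% per-generation gain compounds from one generation to the next, which is a simplification; real gains vary by generation and by workload.

```python
# Rough estimate of how much faster a newer CPU is at the SAME clock speed.
# Assumption: the per-generation gain compounds; real results vary.

def relative_speed(generations, gain_per_generation):
    """Speed of the newer CPU relative to the older one at equal clocks."""
    return (1.0 + gain_per_generation) ** generations


low = relative_speed(5, 0.05)   # five generations at 5% each
high = relative_speed(5, 0.10)  # five generations at 10% each
print(f"Five generations newer at the same clock: {low:.2f}x to {high:.2f}x")
```

With these assumptions a CPU five generations newer comes out roughly 1.3x to 1.6x faster at the same 4 GHz.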

Christopher: That’s right, and for everything that’s currently released, at least with Intel Skylake-X, the i7 7800 series and i9 7900 series, the turbo is 4.3 GHz and then there’s the single-core turbo, which goes up to 4.5 GHz.

Tom: Regarding turbo speeds, there is actually one important thing to mention, and that is how OctaneRender itself works.

As the program prepares information for the GPUs, the load itself might not be big enough in most cases to trigger the CPU into turbo speed, so you end up with the CPU not running at 100%. In some situations it might be running at half or even a third of its regular speed. Because of this, if you’re using a CPU that is trying to save power, it might not wake up fast enough to do the work at full or even turbo speed.

Sometimes giving up some power savings and running the CPU pinned to a certain frequency might make sense for performance and for stability. A theoretical peak “turbo” frequency might sound good on paper and save some power while the system is underutilised, but in some cases it might not do any good.

Christopher: Yes, it really does depend on the application code. I think we’re all really interested in seeing, for example, what happens in different use scenarios. Doing modeling in Maya and then doing physical rendering in Cinema 4D; how efficiently is the code using the hardware? That’s what we’ve seen as well and this specific point is important to recognize about OctaneRender.

Ultimately the core applications (and operations) that make up anyone’s workflow end up dictating what their final CPU choice is.

I’d like to switch gears and look at RAM. What’s an adequate amount of system RAM and do you break it down—there’s an adage which says 4GB RAM per CPU core. What’s your experience with OctaneRender?

Tom: Our unwritten rule in the community is to have about three times as much system memory (RAM) as you have video memory (VRAM) on your GPU. So for instance, if you have an 11 GB GTX 1080 Ti, get at least 32 GB of RAM, or even 64 GB if you need more memory for other applications. If you have a card that has only 2 GB of VRAM, you can maybe get away with less RAM (for rendering).

However, there is one feature that is quite important. If you have less memory on the GPU, you can use a feature called out-of-core textures, which allows you to store some of your texture maps not in the GPU’s VRAM but in system memory (RAM). So when you go with a lower amount of VRAM on the GPU, it can be worth getting more system RAM.

As for RAM speed, I would always advise getting slower memory, as it is more stable. Fast memory might lead to stability problems in the end, so if you do not notice much of a difference between slower RAM and really, really fast RAM, there’s no need to pay more, and you’ll only do yourself a favor by choosing slightly slower RAM.
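The "three times VRAM" rule of thumb can be written down as a tiny helper. This is only a sketch: the list of common RAM capacities and the choice to snap to the nearest one (preferring the larger size on a tie) are assumptions made here so that the 11 GB example lands on 32 GB as in the interview. It does not account for out-of-core textures or for other applications running alongside.

```python
# Sketch of the community rule of thumb: system RAM of about 3x GPU VRAM.
# Assumptions: the capacity list and the "nearest size, larger on a tie"
# rounding are illustrative choices, not part of the rule itself.

COMMON_RAM_SIZES_GB = (8, 16, 32, 64, 128)


def suggested_system_ram_gb(vram_gb, multiplier=3.0):
    """Return the common RAM capacity closest to multiplier * VRAM."""
    target = vram_gb * multiplier
    return min(COMMON_RAM_SIZES_GB, key=lambda size: (abs(size - target), -size))


for vram in (2, 8, 11):
    print(f"{vram} GB VRAM -> about {suggested_system_ram_gb(vram)} GB system RAM")
```

Treat the result as a floor for rendering only; heavier use of out-of-core textures or other memory-hungry applications pushes the number up.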


Christopher: Our experience corroborates yours. Buying the RAM speed that matches the platform rating (LGA 2011-3: DDR4-2400, LGA 2066: DDR4-2666) works best. Can you explain a bit more about out-of-core texture use in the OctaneRender UI?

Tom: Go to File, then Preferences, and the fourth tab is going to be named out of core. If you enable out-of-core textures, it will allow you to choose the amount of RAM that you would like to dedicate to this function. You can enable this feature in the standalone version and in plugins (it’ll be a different path, but overall it’s under preferences).

Christopher: Have you seen instances where system RAM has been insufficient and is there any comment that you’d like to share about that? Like if someone has a four GPU system with 64 GB where you’ve seen maybe a certain model size where it just crashed. Many of our customers use After Effects with OctaneRender, and it can easily use 128GB in certain situations.

Tom: Overall OctaneRender is pretty good with RAM consumption. Here is an example of a pretty heavy scene, 6088AD by Cornelius Dämmrich (you can get the tutorial, walkthrough and scene files on his Gumroad page). In this case, while rendering a 15k x 10k output, RAM usage is only around 35 GB (plus a few GB for the operating system), with over 8 GB of VRAM in use. The scene statistics (32 high-resolution textures, 32.5 million triangles, etc.) are listed under the image being rendered, for reference regarding workstation rendering loads.

GPU Rendering with OctaneRender - 6088AD by Cornelius Dämmrich

GPU Rendering with OctaneRender – VIDEO: 6088AD by Cornelius Dämmrich

Christopher: Let’s talk about GPU, the key component as OctaneRender is all about fast GPU rendering.

Tom: The best place to start would be the OctaneBench page, to see how different GPUs and different systems perform. GPUs differ in rendering speed and in amount of VRAM, and there are also different models of the same GPU that differ mainly in their cooling solution.

The cards with the highest amount of VRAM are going to be Quadro and Tesla GPUs, because they are designed for larger jobs. Those cost far more than mainstream (gaming) cards. To put that in perspective, for the amount of money you would pay for a single such card, an OctaneRender user can buy a system with multiple GTX cards that would be a few times faster, although each card would have a bit less VRAM.

One of the important things about professional cards like Quadro is that they have ECC memory and higher performance in specific applications that can take advantage of double-precision or half-precision float performance. OctaneRender does not use those much and relies mostly on single precision. So if you’re buying professional cards, it is either because you need a lot of memory or because you need those features for other applications.

Don’t forget to check OctaneBench for the OctaneRender performance of quad GTX 1080 Ti systems. One thing to keep in mind is that GPU rendering performance can vary quite a bit depending on how good the cooling is. We talked about case requirements and the need for very good cooling, because GPUs under rendering load emit a lot of heat. If you do not remove that heat fast enough, the GPUs reach their limit of 80-something degrees Celsius. Once a GPU is at that point, it starts to downclock and slow itself down to prevent damage from the excessive heat being generated.

Christopher: That’s what we’ve seen. Can you say a bit more about this?

Tom: If GPUs are cool enough, they automatically overclock themselves, or let us crank a few sliders to run faster. GPUs are designed to run at 85 degrees or a bit lower (it depends on the specific GPU model), so if your GPU is running at 85 and the temperature still wants to climb, the GPU will reduce its speed in order to prevent damage to the card.
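One way to keep an eye on this during a long render is to poll GPU temperatures with Nvidia's `nvidia-smi` tool and flag cards that are getting close to the throttling range. The sketch below is a minimal example: the `index`, `name` and `temperature.gpu` query fields are standard `nvidia-smi` properties, while the 80 °C warning level is an assumption chosen to sit a little below the roughly 85 °C mentioned above, and the helper names are made up for illustration.

```python
import subprocess

# Assumed warning level, a little below the ~85 C limit mentioned above.
# The real throttle point depends on the specific GPU model.
WARNING_TEMP_C = 80

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temperatures(csv_text):
    """Parse 'index, name, temperature' lines into (int, str, int) tuples."""
    readings = []
    for line in csv_text.strip().splitlines():
        index, rest = line.split(",", 1)
        name, temp = rest.rsplit(",", 1)
        readings.append((int(index), name.strip(), int(temp)))
    return readings


def hot_gpus(readings, limit_c=WARNING_TEMP_C):
    """Return the readings at or above the warning temperature."""
    return [reading for reading in readings if reading[2] >= limit_c]


def read_temperatures():
    """Query the local GPUs; requires the Nvidia driver tools to be installed."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_temperatures(result.stdout)


# Sample output in the format nvidia-smi produces, used here so the
# example runs on machines without an Nvidia GPU.
sample = "0, GeForce GTX 1080 Ti, 84\n1, GeForce GTX 1080 Ti, 67\n"
for index, name, temp in hot_gpus(parse_temperatures(sample)):
    print(f"GPU {index} ({name}) is at {temp} C, check airflow")
```

On a rendering workstation, calling `read_temperatures()` in a loop every few seconds gives a quick picture of which cards in a multi-GPU stack run hottest.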

Christopher: Yes, and perhaps it’s good here to remind newcomers exploring the possibility of getting into OctaneRender that Nvidia GPUs are required, though OTOY is working on OpenCL card functionality.

Tom: For now it’s Nvidia GPUs only with OctaneRender, but with time I think we will see OctaneRender running on AMD cards. OTOY has already shown some work in progress in which they were able to run not only on AMD cards but on CPUs as well.

Christopher: Let’s look at a scenario. Customer has a card like GTX 1080 which has 8GB of VRAM, and they’re working with rendering large models. Let’s also just say for example, they have 128GB of RAM. They check the out of core texture box so they’re using system RAM. Does that delay the rendering, does that slow down the rendering process at all, by first loading textures using System RAM in that way? Rather than if you had, say, a P5000 Quadro or P6000, where you could load the entire scene or instance directly into VRAM?

Tom: It slows down a bit, but the difference is not that big. We’re talking 5-15% at most, and it depends a lot on how many of those textures you’re going to have out of core. It also depends on the interface the GPUs are connected through. So if your GPUs each have sixteen lanes, rendering will go faster than if you had them on one lane each, to take an extreme example.

However, price wise, if you can use those out of core textures and if you do not have very large meshes of geometry, it’s still better to get more system memory and GTX cards rather than professional cards for OctaneRender.

With professional cards, you’re going to pay three to four times more, and it’s only going to give you a few extra gigabytes, or twice the VRAM at most. But if you work with really heavy models and cannot optimize them, or doing so is too labor intensive, then yes, why not? It may make sense to buy a Quadro P5000 16GB or Quadro P6000 24GB for your GPU rendering.

Christopher: Good. On that note, I want to move to storage. With GPU rendering in applications like OctaneRender, the choice between a traditional disk, an SSD and an M.2 drive is not terribly important overall. What are your thoughts?

Tom: It matters very little. It again depends on your other needs. If you need to store a lot of information, maybe you have big libraries, it might make sense to have old hard drives.

Christopher: Platter drives, HDDs.

Tom: If you want a somewhat faster solution, just get a SATA SSD. It will be three or four times faster. Lastly, if you want the best performance you can get, go for M.2 drives, which are three to five times faster than regular SSDs.

Christopher: The Samsung 960 Pro m.2 SSD for example is 5-7x faster than SATA3 SSDs.

Tom: Yeah. I had two drives, each one terabyte. One was probably the best SATA drive and the other was one of the best M.2 drives. The SATA drive does around 500 MB/s in and out, and the other one is close to 2000 MB/s write and over 3000 MB/s read.

When you load something into OctaneRender, the information has to be loaded into the program (system RAM), so your storage matters, but maybe not that much, since the CPU needs not only to find the data but also to process it before you can use it. However, when you save a file from the program to disk, storage speed matters more, because it only needs to write the file. If you’re rendering small files (low resolution), that might not matter very much, but if your files are 8K or 16K, or you are rendering VR formats with 24-megapixel files, it matters a lot, because each of those is going to be maybe a few hundred MB.
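A quick back-of-the-envelope calculation shows why write speed starts to matter with large frames. The drive speeds below are the ones quoted above (about 500 MB/s for the SATA SSD, about 2000 MB/s write for the M.2 drive). The 300 MB frame size and the 1000-frame sequence are assumptions picked to stand in for "a few hundred MB" per file, and the model ignores caching, compression and file system overhead.

```python
# Back-of-the-envelope write times for rendered frames.
# Assumptions: 300 MB per frame and a 1000-frame sequence are illustrative;
# sustained sequential write speed is treated as constant.

SATA_SSD_MB_PER_S = 500.0    # figure quoted in the interview
M2_WRITE_MB_PER_S = 2000.0   # figure quoted in the interview


def write_seconds(file_mb, drive_mb_per_s):
    """Time to write one file at a sustained sequential speed."""
    return file_mb / drive_mb_per_s


frame_mb = 300.0
frames = 1000

for label, speed in (("SATA SSD", SATA_SSD_MB_PER_S), ("M.2", M2_WRITE_MB_PER_S)):
    per_frame = write_seconds(frame_mb, speed)
    total_minutes = per_frame * frames / 60.0
    print(f"{label}: {per_frame:.2f} s per frame, {total_minutes:.1f} min per sequence")
```

With these numbers the SATA drive spends about 10 minutes writing the sequence and the M.2 drive about 2.5 minutes, which is small next to render time for a single still but adds up over long animations.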

Christopher: I think the last thing I would like to cover is Windows and Linux. Let’s start with Windows, Windows 10. Are there any limitations for Octane users?

Tom: I think the operating system is a very subjective choice in most cases. Some people just like one over the other; Mac users prefer that system and that’s it. Regarding Windows, you need to be very cautious about updates and all of the other things being done in the background, because some updates, on Windows especially, might cause a lot of issues. By a lot of issues, I mean stability problems, system errors, blue screens, hangs and all the other things that will really annoy you when you’re working. You might leave a render running overnight, come in the morning and find it unfinished because your computer rebooted out of the blue.

If you want to connect more than, let’s say, seven GPUs, or go over ten for instance, then you actually need to go with Linux, because you start getting issues with Windows. Again, it depends on your usage, so if you are a Windows user and you want to use it for a quad GPU machine, do it. There’s nothing bad about it (apart from the issues mentioned).

Christopher: With 7 to 10 GPUs you can use Windows Server, with our GPUx, for example. As far as Linux goes—our most common distro requests are for either Ubuntu or RHEL. Have you done any testing with OctaneRender in either of those environments?

Tom: There are still some users who prefer Windows 7 over Windows 10, because there is no memory leakage there (on Windows 10, about 10% of total VRAM is unusable due to issues with Windows).

I have very little knowledge about anything other than Windows; I’m just starting to experiment. I can say one thing though. As long as the GPUs are seen by the operating system and you have drivers for that operating system, OctaneRender will work as well.

Christopher: Your other applications can dictate this quite a bit. Some drivers that work for one operating system won’t work for another. Depends how you’re going to use the computer, that’ll probably dictate what operating system you might want to choose.

Is there anything else that you would want to say about Octane that would be helpful to users? Final comments on recommended configurations for OctaneRender?

Tom: From the program side or even the technical side, there is not so much to add, but there are a few things to mention. A lot of times, people try to spend more money on a CPU, try to get better GPUs, get really fast RAM, etc. Do not save on cooling. If you want good performance, make sure you have good airflow. And another thing: do not save on the power supply. Get the best you can, and maybe even one a bit bigger than you need right now. If you have a power supply that is not stable, that has a lot of ripple and whose output is not steady, you’re going to get stability issues. In the end you’re going to blame the software, you’re going to blame all the other parts in the computer, when one of the most important things is to have good power supplied in the first place. The entire “puzzle” makes a good experience; if any of those parts fail, you will have a problem.
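The advice to buy a power supply "a bit bigger than you need" can be turned into a rough sizing sketch. The example uses the published TDP figures for the parts discussed in this interview (250 W for a GTX 1080 Ti, 140 W for an i9 7900X). The 100 W allowance for the rest of the system and the 25% headroom are assumptions for illustration only; actual draw depends on the workload, overclocking and the specific components.

```python
# Rough power supply sizing with headroom.
# Assumptions: 100 W for motherboard, drives and fans, and 25% headroom,
# are illustrative values, not a recommendation from the interview.

def recommended_psu_watts(gpu_watts, gpu_count, cpu_watts,
                          other_watts=100.0, headroom=0.25):
    """Estimated load plus headroom, in watts."""
    load = gpu_watts * gpu_count + cpu_watts + other_watts
    return load * (1.0 + headroom)


# Four GTX 1080 Ti cards (250 W TDP each) with an i9 7900X (140 W TDP).
watts = recommended_psu_watts(gpu_watts=250.0, gpu_count=4, cpu_watts=140.0)
print(f"Suggested power supply capacity: about {watts:.0f} W")
```

For the quad-GPU example this comes out at about 1550 W, which illustrates why multi-GPU rendering machines tend to need much larger power supplies than typical desktops.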

Christopher: Well Tom, on that note, I think we’re going to wrap. Thank you for your very specific feedback about GPU rendering with OctaneRender.

Tom: My pleasure.