
V-Ray GPU Benchmarks on Top-of-the-Line NVIDIA GPUs



 

Introduction

 

V-Ray’s GPU rendering and NVIDIA’s hardware are constantly improving. Recently, there have been major advances in both, so we thought now would be the perfect time to run new benchmarks and find out how much faster everything might be.

 

The hardware


With 40 logical CPU cores and 128GB of RAM, the Lenovo P900 is a powerful machine. It’s great for GPU tests, since there’s space for three double-slot GPUs and one single-slot GPU. Plus, the tool-less chassis makes it quick to pop cards in and out; the tests felt like an F1 pit stop for GPUs.

 

 

The GPUs we decided to test are as follows:

 

| GPU | Architecture | Cores | RAM type | RAM | Power | Slots | Street price* |
|---|---|---|---|---|---|---|---|
| GP100 | Pascal | 3584 | HBM2 | 16GB | 235W | 2 | N/A |
| P6000 | Pascal | 3840 | GDDR5X | 24GB | 250W | 2 | $4,699 |
| P5000 | Pascal | 2560 | GDDR5X | 16GB | 180W | 2 | $2,499 |
| P4000 | Pascal | 1792 | GDDR5X | 8GB | 105W | 1 | N/A |
| M6000 | Maxwell | 3072 | GDDR5 | 24GB | 250W | 2 | $4,539 |
| Titan X (Pascal) | Pascal | 3584 | GDDR5X | 12GB | 250W | 2 | $1,599 |

 

*Street prices are approximate, based on a quick search of Newegg and Amazon. The GP100 and P4000 are not publicly available yet, so no pricing is listed.

 

 

The benchmark test

 

Even before the benchmarks started, we were very interested to see NVIDIA’s new NVLink tech in action. Because NVLink allows cards to share memory, we were curious to see what sort of performance we could get using two new GP100s. More on this later.

Our lead GPU developer, Blago Taskov, and I set up the benchmarks. To get better data, we decided it would be best to test multiple scenes instead of just one. We batch rendered nine different scenes and recorded the time to complete each one. Then, we added up the total time for all nine.
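If you want to reproduce this kind of batch timing yourself, a minimal sketch looks something like the following. The scene names and the render command are placeholders, not the actual files or invocation we used; substitute your own V-Ray batch-render command.

```python
import subprocess
import time

# Placeholder scene list; the nine benchmark scenes aren't named in this post.
SCENES = [f"scene_{i:02d}.vrscene" for i in range(1, 10)]

# Placeholder command; substitute your actual V-Ray batch-render invocation.
RENDER_CMD = ["my_vray_render", "-scene"]

def time_scene(scene: str) -> float:
    """Render one scene and return the wall-clock time it took."""
    start = time.perf_counter()
    subprocess.run(RENDER_CMD + [scene], check=True)
    return time.perf_counter() - start

times = [time_scene(s) for s in SCENES]
for scene, t in zip(SCENES, times):
    print(f"{scene}: {t:.2f}")
print(f"Total: {sum(times):.2f}")
```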


Here are the results:

 

| GPU | Test 1 | Test 2 | Test 3 | Test 4 | Test 5 | Test 6 | Test 7 | Test 8 | Test 9 | Total time |
|---|---|---|---|---|---|---|---|---|---|---|
| GP100 x 2 | 46.49 | 130.36 | 156.69 | 29.43 | 112.99 | 39.88 | 40.21 | 107.75 | 19.94 | 683.74 |
| GP100 | 90.72 | 251.81 | 295.52 | 50.84 | 220.51 | 77.72 | 76.94 | 202.28 | 38.02 | 1304.36 |
| P6000 | 127.21 | 363.18 | 410.72 | 72.17 | 348.99 | 131.39 | 109.64 | 264.82 | 61.83 | 1889.95 |
| P5000 | 188.18 | 536.69 | | | | | | | | |
| P4000 | 212.54 | 636.84 | 724.22 | 131.86 | 565.83 | 207.79 | 178.60 | 455.61 | 104.62 | 3217.91 |
| M6000 | 140.13 | 483.71 | 538.86 | 97.59 | 423.11 | 159.04 | 134.79 | 351.91 | 73.15 | 2402.29 |
                     

This table compares the cards head to head. Each value shows how many times faster the row card is than the column card (the column card’s total time divided by the row card’s):

 

| GPU | GP100 x 2 | GP100 | P6000 | P5000 | P4000 | M6000 | Titan X (Pascal) |
|---|---|---|---|---|---|---|---|
| GP100 x 2 | 1 | 1.907684 | 2.764135 | 4.059496 | 4.706336 | 3.513455 | 2.7042882 |
| GP100 | 0.524196 | 1 | 1.448948 | 2.127971 | 2.467041 | 1.841738 | 1.4175764 |
| P6000 | 0.361777 | 0.690156 | 1 | 1.468631 | 1.702643 | 1.271087 | 0.9783486 |
| P5000 | 0.246336 | 0.469931 | 0.680906 | 1 | 1.15934 | 0.86549 | 0.6661635 |
| P4000 | 0.21248 | 0.405344 | 0.587322 | 0.86256 | 1 | 0.746537 | 0.5746059 |
| M6000 | 0.28462 | 0.542965 | 0.786728 | 1.155414 | 1.339518 | 1 | 0.7696947 |
| Titan X (Pascal) | 0.369783 | 0.705429 | 1.022131 | 1.501133 | 1.740323 | 1.299216 | 1 |
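These ratios are just quotients of the total times. Here is a quick sketch that recomputes the totals and the matrix from the results table (the P5000 and Titan X per-scene rows are incomplete above, so they’re left out of the check):

```python
# Per-scene times from the results table above.
results = {
    "GP100 x 2": [46.49, 130.36, 156.69, 29.43, 112.99, 39.88, 40.21, 107.75, 19.94],
    "GP100":     [90.72, 251.81, 295.52, 50.84, 220.51, 77.72, 76.94, 202.28, 38.02],
    "P6000":     [127.21, 363.18, 410.72, 72.17, 348.99, 131.39, 109.64, 264.82, 61.83],
    "P4000":     [212.54, 636.84, 724.22, 131.86, 565.83, 207.79, 178.60, 455.61, 104.62],
    "M6000":     [140.13, 483.71, 538.86, 97.59, 423.11, 159.04, 134.79, 351.91, 73.15],
}

totals = {gpu: round(sum(times), 2) for gpu, times in results.items()}
print(totals)  # e.g. "GP100 x 2": 683.74, "GP100": 1304.36, ...

# Each cell: the column card's total divided by the row card's, i.e. how many
# times faster the row card is than the column card.
for row in totals:
    print(row, [round(totals[col] / totals[row], 3) for col in totals])
# GP100 x 2 vs GP100: 1304.36 / 683.74 ≈ 1.908, matching the table above.
```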

 

A note about RAM

 

RAM plays a big part in the value of these cards. For example, the Titan X (Pascal) and P6000 posted similar times across all the tests. On some, the Titan X was faster; on others, the P6000 beat it outright. In overall time, the Titan X narrowly edged out the P6000. But that’s not the whole story. While the two cards were neck and neck on speed, the choice (and cost) comes down to RAM. The Titan X is significantly less expensive, but it carries only 12GB of RAM, while the P6000 can fit much more data in its 24GB. You might be able to give yourself a little more breathing room on the 12GB card with V-Ray 3.5’s On-demand Mip-mapping, which can dramatically reduce the RAM required for loading textures. Ultimately, it comes down to your budget and how much memory you really need.
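For a sense of what on-demand mip-mapping can save, compare a full 6K texture to a 1K mip level. A rough back-of-the-envelope sketch, assuming uncompressed 8-bit RGBA (an illustrative assumption; V-Ray’s actual memory use will differ):

```python
BYTES_PER_PIXEL = 4  # assuming uncompressed 8-bit RGBA

def texture_mb(side_px: int) -> float:
    """Approximate in-memory size of a square texture, ignoring mip chains."""
    return side_px * side_px * BYTES_PER_PIXEL / 1024**2

print(f"6K (6144 px): {texture_mb(6144):.0f} MB")  # ~144 MB
print(f"1K (1024 px): {texture_mb(1024):.0f} MB")  # ~4 MB
```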

But let’s say you want to render a huge scene with lots of geometry and textures, and you need more than 24GB. That’s where NVLink comes in. So what is NVLink?

Currently, GP100s are the only cards to support NVLink. They use special HBM2 memory that is so fast, it can be shared across cards. It may look similar to SLI, but it’s not the same. In our setup we connected two GP100s. In theory, with specialized hardware, it’s possible to link more. For example, NVIDIA’s DGX-1 does this with eight P100 GPUs. But at $129,000 it’s a little out of our price range. We’re looking forward to testing that one. When we do, we’ll be sure to share the results.
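To make the memory math concrete, here is a toy sketch of the decision NVLink changes. It assumes a simplified model (without NVLink, each card holds its own full copy of the scene; with it, memory pools across cards) and ignores out-of-core options:

```python
def rendering_mode(scene_gb: float, card_gb: float, cards: int, nvlink: bool) -> str:
    """Decide where a scene fits, under a simplified memory model."""
    if scene_gb <= card_gb:
        return "fits on a single card"
    if nvlink and scene_gb <= card_gb * cards:
        return "fits in NVLink-pooled memory"
    return "does not fit in GPU memory"

# The Dabarti scene tested below: ~26GB on two 16GB GP100s.
print(rendering_mode(26, 16, 2, nvlink=False))  # does not fit in GPU memory
print(rendering_mode(26, 16, 2, nvlink=True))   # fits in NVLink-pooled memory
```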

 

V-Ray and NVLink

 

We’ve enabled NVLink in the latest V-Ray nightly builds. To test it, we enlisted the help of our friends at Dabarti Studio, who created this torture test.

 

Model and assets courtesy of Dabarti: 169 million polygons and 150+ 6K textures

 

This scene contains 169 million polygons and over 150 6K images. The geometry alone won’t fit on a single card, not to mention all those high-resolution textures.
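A rough estimate shows why: at full resolution, the textures alone already exceed a single GP100’s 16GB. This assumes uncompressed 8-bit RGBA, purely for illustration:

```python
TEXTURES = 150       # "over 150" 6K images
TEX_SIDE = 6144      # assuming "6K" means 6144 px square
BYTES_PER_PIXEL = 4  # 8-bit RGBA, uncompressed (illustrative assumption)

tex_gb = TEXTURES * TEX_SIDE**2 * BYTES_PER_PIXEL / 1024**3
print(f"Textures alone: ~{tex_gb:.0f} GB")  # ~21 GB, more than one 16GB GP100
```

Add 169 million polygons of geometry on top, and the roughly 26GB the two linked cards actually used (reported below) looks about right.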

Time to render. First, we set all objects to Dynamic Geometry in the V-Ray Properties, which made it possible for the geometry to be shared across the cards. Then, we disabled On-demand Mip-mapping to force the full-resolution textures to load. Once the cards were fully loaded, each one used 13GB of its 16GB of RAM. That’s a total of 26GB across both cards, more than the 24GB a P6000 can hold.

It worked, and we noticed little or no performance loss with NVLink. It’s still early, but the initial results are positive. Maybe with a few driver updates and V-Ray tweaks, NVLink will perform even better in the future.

 

Conclusion

 

Moore’s Law is alive and well. The M6000 arrived about two years ago, and today the GP100 is almost twice as fast, right on schedule. The combination of NVIDIA’s latest tech and V-Ray’s most recent advances in GPU rendering seems to remove some of the early memory limitations, and that paints a bright future for GPU rendering. We’ll continue to test and share updates as new hardware arrives for benchmarking.

 

Special thanks

 

Thanks to NVIDIA for loaning us their latest and greatest hardware for stress testing. Also, thanks to Lenovo for supplying Chaos Group Labs with a workstation that can handle some serious computing. And thanks to Tomasz Wyszolmirski at Dabarti Studio for helping us continue to push GPU rendering to its limits.

About the author

Christopher Nichols

Chris is a CG industry veteran and Director of Chaos Labs. He can also be heard regularly as the host of the CG Garage podcast which attracts 20,000 weekly listeners. With a background in both VFX and Design, Chris has worked for Gensler, Digital Domain, Imageworks and Method Studios. His credits include Maleficent, Oblivion and Tron: Legacy.

Originally published: March 3, 2017.
