HotSort on a GK208
NVIDIA’s new GK208 GPU is a low-power device with very high-end sm_35 compute capabilities.
At this point, only the flagship GTX 780, GTX TITAN, TESLA K20 and Quadro K6000 GPUs support sm_35.
It’s also rumored that Tegra 5 (“Logan”) will be an sm_35 compute capability device. My guess is that it will run at ~300 MHz, have 192 cores (1 SMX) and 12.8 GB/s of bandwidth (64-bit LPDDR3 @ 800 MHz).
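As a sanity check on that bandwidth guess, here is a minimal sketch of the usual peak-bandwidth arithmetic. The function name and its parameters are made up for illustration, and it assumes a double-data-rate interface (two transfers per clock), which is how LPDDR3 and DDR3 operate.

```python
def peak_bandwidth_gbs(bus_width_bits, clock_mhz, transfers_per_clock=2):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes a double-data-rate bus by default: two transfers per clock.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# The guessed Tegra 5 configuration: 64-bit LPDDR3 at 800 MHz.
print(peak_bandwidth_gbs(64, 800))  # 12.8
```

This is a theoretical ceiling; sustained bandwidth on real hardware will be lower.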
Much of my work targets mobile GPUs, so I was excited to acquire a GK208 device and have been evaluating it over the last couple of days.
First, here are the GPU-Z specs:
I also updated the HotSort benchmarks for GK208. You can see the full GK208 benchmarks here and the combined benchmarks for all GPUs here.
As you can see below, the GK208’s two SMXs basically match the performance of a GTX 680 and TESLA K20c for 32K and 64K arrays of 32-bit elements. For larger arrays the algorithm is clearly bound by the very narrow 14.4 GB/s 64-bit DDR3 bus.
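To get a feel for why the narrow bus dominates at larger sizes, here is a rough back-of-the-envelope sketch. It is not a model of HotSort: it only computes the minimum time to read and write an array of 32-bit keys once at the 14.4 GB/s peak. The function name is hypothetical, and how many such passes a real sort needs is left open.

```python
def min_pass_time_ms(num_keys, bytes_per_key=4, bandwidth_gbs=14.4):
    """Lower bound, in milliseconds, for one read + one write of the array.

    Assumes every key crosses the off-chip bus twice (once in, once out)
    at the theoretical peak bandwidth. Real passes will be slower.
    """
    bytes_moved = 2 * num_keys * bytes_per_key
    return bytes_moved / (bandwidth_gbs * 1e9) * 1e3


# Compare the small arrays mentioned above with a larger one.
for num_keys in (32 * 1024, 64 * 1024, 1024 * 1024):
    print(num_keys, round(min_pass_time_ms(num_keys), 4), "ms per pass")
```

The bound scales linearly with array size, so every additional pass over off-chip memory gets proportionally more expensive as the arrays grow.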
With the GK208 we’re probably looking at where mobile GPUs will be in 2015. If you’re interested in GPU compute on mobile then I think it’s a great idea to familiarize yourself with the new sm_35 features as well as the reality of how much off-chip bandwidth will be available.
Let me know if you have any questions about the GK208!