An Idiom for SMEM Variables
It’s common to declare a number of shared memory variables at the top of a CUDA kernel file.
You often see these declarations with Hungarian-ish names that signal the variables reside in shared memory.
But as someone who likes to write highly readable code, I’ve found that naming, referencing and maintaining independently declared shared variables can turn into a chore.
Some example issues:
- I would like to use the same meaningful variable name for both the shared and in-register instances.
- I often want to know at compile-time how much shared memory I’m using in order to construct a proper launch bound.
I write most kernels in a C++ “lite” style that hews toward pure C99, so what can I do?
My simple solution is the “shared memory variable struct” idiom.
This idiom simplifies naming and interacting with shared variables by collecting all shared declarations into a single struct. It looks like this:
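A minimal sketch of such a declaration follows. The field names foo and bar, the constants WARP_SIZE and KERNEL_WARPS, and the array shapes are hypothetical placeholders chosen for illustration.

```cuda
#include <cuda_runtime.h>

// Hypothetical block geometry, for illustration only.
#define WARP_SIZE    32
#define KERNEL_WARPS 4

// Every shared memory variable used by the kernels in this file is a
// field of one tagless struct whose only instance is named `shared`.
__shared__ struct
{
  float  foo[KERNEL_WARPS][WARP_SIZE];
  float4 bar[KERNEL_WARPS][WARP_SIZE];
} shared;
```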
The idea is to declare a structure with no tag that contains all the shared variables in the kernel. The one variable of that tagless struct type is given the self-explanatory name shared and has file scope.
Now you can reference any shared memory variable with: shared.fieldName.
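As a rough illustration, the sketch below reads shared.foo and shared.bar into registers that reuse the same names foo and bar, which is the naming convenience mentioned in the first issue above. The kernel scale_kernel, its parameters, and the field names are all hypothetical. It assumes a launch with exactly KERNEL_THREADS threads per block and a grid that exactly covers the input, so there is no bounds check.

```cuda
#include <cuda_runtime.h>

// Hypothetical block geometry, for illustration only.
#define WARP_SIZE      32
#define KERNEL_WARPS   4
#define KERNEL_THREADS (KERNEL_WARPS * WARP_SIZE)

__shared__ struct
{
  float  foo[KERNEL_WARPS][WARP_SIZE];
  float4 bar[KERNEL_WARPS][WARP_SIZE];
} shared;

// Hypothetical kernel: each thread scales the float4 loaded by the
// next lane in its warp (wrapping around at the end of the warp).
__global__ void scale_kernel(const float4* in, float4* out, float scale)
{
  const unsigned warp = threadIdx.x / WARP_SIZE;
  const unsigned lane = threadIdx.x % WARP_SIZE;
  const unsigned gid  = blockIdx.x * KERNEL_THREADS + threadIdx.x;

  // Shared instances are always spelled shared.<field>.
  shared.foo[warp][lane] = scale;
  shared.bar[warp][lane] = in[gid];
  __syncthreads();

  // Register instances reuse the same meaningful names without clashing.
  const unsigned src = (lane + 1) % WARP_SIZE;
  const float    foo = shared.foo[warp][src];
  const float4   bar = shared.bar[warp][src];

  out[gid] = make_float4(foo * bar.x, foo * bar.y, foo * bar.z, foo * bar.w);
}
```

Because the shared copies are always qualified with the shared. prefix, the local names need no decoration to tell the two apart.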
It also makes it a little easier to declare struct-union combinations in kernels that non-trivially reuse shared memory.
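One way this might look is sketched below: a union nested inside the struct lets two phases of a kernel reuse the same bytes. The member names phase, staged and sums, and the kernel two_phase_kernel, are hypothetical. The sketch assumes KERNEL_THREADS threads per block and a grid that exactly covers the input.

```cuda
#include <cuda_runtime.h>

// Hypothetical block geometry, for illustration only.
#define WARP_SIZE      32
#define KERNEL_WARPS   4
#define KERNEL_THREADS (KERNEL_WARPS * WARP_SIZE)

__shared__ struct
{
  union
  {
    float4 staged[KERNEL_WARPS][WARP_SIZE]; // phase 1: staged inputs
    float  sums  [KERNEL_WARPS][WARP_SIZE]; // phase 2: reuses the same bytes
  } phase;
} shared;

__global__ void two_phase_kernel(const float4* in, float* out)
{
  const unsigned warp = threadIdx.x / WARP_SIZE;
  const unsigned lane = threadIdx.x % WARP_SIZE;
  const unsigned src  = (lane + 1) % WARP_SIZE;
  const unsigned gid  = blockIdx.x * KERNEL_THREADS + threadIdx.x;

  // Phase 1: stage inputs, then read the next lane's element.
  shared.phase.staged[warp][lane] = in[gid];
  __syncthreads();
  const float4 staged = shared.phase.staged[warp][src];
  __syncthreads(); // every read of `staged` finishes before the overwrite

  // Phase 2: the same shared bytes now hold per-thread sums.
  shared.phase.sums[warp][lane] = staged.x + staged.y + staged.z + staged.w;
  __syncthreads();
  out[gid] = shared.phase.sums[warp][src];
}
```

The barrier between the two phases matters: without it, one warp could overwrite bytes that another warp is still reading through the other union member.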
Furthermore, you can easily obtain the overall shared memory footprint at compile time with sizeof(shared) and use it to calculate launch bounds. Here’s a simple launch bound example that, assuming shared memory is the resource bottleneck, hints that as many blocks as possible should be squeezed into an SMX:
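A minimal sketch of that calculation, under stated assumptions: SMEM_PER_SMX is assumed to be 48 KB (the real figure depends on the GPU architecture and its shared memory configuration), and the kernel bounded_kernel, the macro KERNEL_MIN_BLOCKS, and the field shapes are hypothetical.

```cuda
#include <cuda_runtime.h>

// Hypothetical block geometry, for illustration only.
#define WARP_SIZE      32
#define KERNEL_WARPS   8
#define KERNEL_THREADS (KERNEL_WARPS * WARP_SIZE)

// Assumption: 48 KB of shared memory per SMX. Check your target device.
#define SMEM_PER_SMX   (48 * 1024)

__shared__ struct
{
  float  foo[KERNEL_WARPS][WARP_SIZE];    // 1 KB
  float4 bar[KERNEL_WARPS][WARP_SIZE][2]; // 8 KB
} shared;

// Number of copies of `shared` that fit in one SMX, known at compile time.
#define KERNEL_MIN_BLOCKS (SMEM_PER_SMX / sizeof(shared))

// Hint to the compiler: at most KERNEL_THREADS threads per block, and
// try to keep KERNEL_MIN_BLOCKS blocks resident per multiprocessor.
__global__ void
__launch_bounds__(KERNEL_THREADS, KERNEL_MIN_BLOCKS)
bounded_kernel(const float4* in, float4* out)
{
  const unsigned warp = threadIdx.x / WARP_SIZE;
  const unsigned lane = threadIdx.x % WARP_SIZE;
  const unsigned src  = (lane + 1) % WARP_SIZE;
  const unsigned gid  = blockIdx.x * KERNEL_THREADS + threadIdx.x;

  shared.foo[warp][lane]    = 0.5f;
  shared.bar[warp][lane][0] = in[2 * gid];
  shared.bar[warp][lane][1] = in[2 * gid + 1];
  __syncthreads();

  // Average the two float4 values staged by the next lane in the warp.
  const float  foo = shared.foo[warp][src];
  const float4 a   = shared.bar[warp][src][0];
  const float4 b   = shared.bar[warp][src][1];
  out[gid] = make_float4(foo * (a.x + b.x), foo * (a.y + b.y),
                         foo * (a.z + b.z), foo * (a.w + b.w));
}
```

With these hypothetical sizes, sizeof(shared) is 9216 bytes, so the bound asks for 5 resident blocks per SMX. Adding or resizing a field updates the bound automatically, since it is derived from the struct itself.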
In summary, this simple idiom has helped me keep my kernel code nice and clean!