Sometimes it is useful to look at the intermediate and assembly code for GPU programs. This can lead to some interesting performance insights, especially for compiler writers. Unfortunately, the AMD APP SDK is a bit limited on Linux: the AMD APP KernelAnalyzer, which conveniently dumps the AMDIL and Device ISA for an OpenCL kernel, is not available there. However, digging through the AMD APP OpenCL Programming Guide, one finds an environment variable that can be used for the same purpose: GPU_DUMP_DEVICE_KERNEL.

According to the programming guide, this environment variable can take one of three values:

- 1: Save intermediate IL files in local directory.
- 2: Disassemble ISA file and save in local directory.
- 3: Save both the IL and ISA files in local directory.

Therefore, if you run your OpenCL program with:

    $ GPU_DUMP_DEVICE_KERNEL=3 ./my-program

you will get two files in your local directory: [kernel-name]_[device-name].il and [kernel-name]_[device-name].isa, which contain AMDIL and Device ISA disassembly, respectively.
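If you dump kernels often, the same step can be scripted. The following is a minimal Python sketch, not part of the AMD tooling: the helper name run_with_kernel_dump and its parameters are made up for illustration. It assumes the dump files land in the directory the program is run from, as described above. It sets GPU_DUMP_DEVICE_KERNEL for a single run and reports which .il and .isa files appeared.

```python
import os
import subprocess
from pathlib import Path


def run_with_kernel_dump(cmd, dump_level=3, workdir="."):
    """Run cmd with GPU_DUMP_DEVICE_KERNEL set; return newly created dump files.

    Hypothetical helper: dump_level follows the three values listed in the
    programming guide (1 = IL, 2 = ISA, 3 = both).
    """
    if dump_level not in (1, 2, 3):
        raise ValueError("dump_level must be 1 (IL), 2 (ISA), or 3 (both)")

    workdir = Path(workdir)
    patterns = ("*.il", "*.isa")

    # Remember existing dump files so that only new ones are reported.
    # Note: files overwritten by this run will not show up as new.
    before = {p for pat in patterns for p in workdir.glob(pat)}

    # Copy the current environment and add the dump variable for this run only.
    env = dict(os.environ, GPU_DUMP_DEVICE_KERNEL=str(dump_level))
    subprocess.run(cmd, cwd=workdir, env=env, check=True)

    after = {p for pat in patterns for p in workdir.glob(pat)}
    return sorted(after - before)


# Example (assumes ./my-program exists in the current directory):
#     for path in run_with_kernel_dump(["./my-program"]):
#         print(path)
```

Because the variable is set only in the child process environment, the rest of your shell session is unaffected, just as with the one-line invocation above.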