Proceedings of the 2016 International Symposium on Code Generation and Optimization, ACM, New York, NY, pp. 105-116 Graphics Processing Units have emerged as powerful accelerators for massively parallel, numerically intensive workloads. The two dominant software models for these devices are NVIDIA’s CUDA and the cross-platform OpenCL standard. Until now, there has not been a fully open-source comp