by dar » Tue Aug 12, 2014 1:03 pm
I checked, and max compute units is set to 1. This is an error. It's easy to fix, and I will have this done.
-DAR
Edit: OK, it's early here ... I checked the device info parameter code, and the value is in fact correct: it is set to 1 for a reason. In OpenCL-speak, the Epiphany processor should be viewed as a single compute unit supporting a max work-group size of 16. The definition of CL_DEVICE_MAX_COMPUTE_UNITS is imprecise outside of the GPUs that inspired it. The 1 x 16 interpretation is the best one for helping programmers target the device efficiently. For optimal performance, use OpenCL to launch 16 parallel threads and have each thread perform 1/16th of the work for your problem. This is different from a GPU, where you are taught to launch thousands of threads: GPUs are massively multithreaded architectures; a non-hyper-threaded multi-core CPU is not. So you must use a different work distribution model for performance.
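To make the "16 threads, 1/16th of the work each" idea concrete, here is a rough sketch in plain Python of how a problem of size n might be split into 16 contiguous chunks. The names chunk_bounds, partition and WORKGROUP_SIZE are made up for illustration and are not part of any OpenCL or Epiphany API. In a real kernel the thread index would typically come from get_global_id(0), and the same arithmetic would run on the device.

```python
# Illustrative sketch only: these helper names and this chunking scheme
# are assumptions, not taken from any OpenCL runtime or Epiphany SDK.
WORKGROUP_SIZE = 16  # one compute unit x 16 work-items


def chunk_bounds(n, thread_id, num_threads=WORKGROUP_SIZE):
    """Return the half-open range [start, stop) that one thread handles."""
    base, extra = divmod(n, num_threads)
    # When n is not a multiple of num_threads, the first `extra` threads
    # each take one additional element so the whole range is covered.
    start = thread_id * base + min(thread_id, extra)
    stop = start + base + (1 if thread_id < extra else 0)
    return start, stop


def partition(n, num_threads=WORKGROUP_SIZE):
    """Return the list of (start, stop) ranges for all threads."""
    return [chunk_bounds(n, t, num_threads) for t in range(num_threads)]


if __name__ == "__main__":
    for thread_id, (start, stop) in enumerate(partition(100)):
        print(f"thread {thread_id:2d}: elements [{start}, {stop})")
```

Each thread then loops over its own range instead of handling a single element, which is the main difference from the one-element-per-thread style usually taught for GPUs.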
Last edited by dar on Tue Aug 12, 2014 1:27 pm, edited 1 time in total.