Notes on the OpenCl example


Postby aihaike » Mon Aug 11, 2014 5:12 am

Hey,

The OpenCL example runs on my board, but I'm wondering why only root can see the Epiphany device.
I added a function that prints information about the device in use, and it turns out that CL_DEVICE_MAX_COMPUTE_UNITS returns 1 for both the processor and the coprocessor.
Can you please tell me why?
To me, using OpenCL on the board only requires a specific library, and any OpenCL code should then be able to run on it.
Am I right here?
Thanks,

Éric.
aihaike
 
Posts: 31
Joined: Wed Aug 06, 2014 5:41 am
Location: Shanghai, China

Re: Notes on the OpenCl example

Postby 9600 » Mon Aug 11, 2014 4:23 pm

The reason you need to be root is that the currently supported interface is /dev/mem. However, there is a driver in development that provides access via /dev/epiphany, which is much safer to set more relaxed permissions on, e.g. via udev. If you build the eSDK from git HEAD it will expect this new interface, and if you wanted to experiment with it I believe there is a kernel branch that has it (but don't expect this to be fully tested).
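To see which of the two interfaces your current user can actually reach, a small check along these lines may help. This is only a sketch: the helper names can_open_rw and report_device_nodes are made up for illustration, and the check uses the standard POSIX access() call, which tests permissions for the real user ID and does not guarantee that a later open or mmap will succeed.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if the calling user may read and write
 * `path` according to access(), 0 otherwise (including "does not exist"). */
static int can_open_rw(const char *path)
{
    assert(path != NULL);
    return access(path, R_OK | W_OK) == 0;
}

/* Hypothetical helper: report on the two device nodes named in this thread.
 * /dev/epiphany only exists on kernels that include the new driver. */
static void report_device_nodes(void)
{
    const char *nodes[] = { "/dev/mem", "/dev/epiphany" };
    for (size_t i = 0; i < sizeof nodes / sizeof nodes[0]; ++i)
        printf("%-14s %s\n", nodes[i],
               can_open_rw(nodes[i]) ? "read/write OK" : "no access");
}
```

Calling report_device_nodes() once as a normal user and once as root should show the difference in permissions between the two nodes on a given setup.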

Regards,

Andrew
Andrew Back (a.k.a. 9600 / carrierdetect)
9600
 
Posts: 997
Joined: Mon Dec 17, 2012 3:25 am

Re: Notes on the OpenCl example

Postby bcxcube » Mon Aug 11, 2014 4:37 pm

Andrew,

You are correct - the branch "bcxcube_epiphany_driver" of the parallella-linux-adi repository contains the Epiphany driver. The sources at HEAD of the epiphany-sdk repository require the Epiphany driver (these sources use the epiphany driver for memory mapping rather than the /dev/mem driver).

Note that if you use the SDK modules from HEAD, you will need to use a kernel built from the bcxcube_epiphany_driver branch which includes the Epiphany driver by default.

Regards,
Ben
bcxcube
 
Posts: 7
Joined: Tue Jul 29, 2014 4:00 pm
Location: Nashua, NH

Re: Notes on the OpenCl example

Postby aihaike » Tue Aug 12, 2014 1:13 am

Thank you guys for your replies.
I'm going to wait for the official release of the driver.
By the way, what does it mean that clGetDeviceInfo with the parameter CL_DEVICE_MAX_COMPUTE_UNITS returns 1?
aihaike
 
Posts: 31
Joined: Wed Aug 06, 2014 5:41 am
Location: Shanghai, China

Re: Notes on the OpenCl example

Postby tincman » Tue Aug 12, 2014 11:59 am

aihaike wrote: Thank you guys for your replies.
I'm going to wait for the official release of the driver.
By the way, what does it mean that clGetDeviceInfo with the parameter CL_DEVICE_MAX_COMPUTE_UNITS returns 1?


From what I can tell on my tablet, CL_DEVICE_MAX_COMPUTE_UNITS should return the number of cores in the Epiphany; however, there is another place where it does get initialized to 1.

Are you sure you're selecting the Epiphany device (as coprthr also provides a CPU implementation)? If you are, perhaps you should file a bug report on their github page (http://github.com/browndeer/coprthr)
tincman
 
Posts: 8
Joined: Mon Dec 17, 2012 3:30 am
Location: CO, US

Re: Notes on the OpenCl example

Postby dar » Tue Aug 12, 2014 1:03 pm

I checked and max compute units is set to 1. This is an error. It's easy to fix, and I will have this done.

-DAR

Edit: OK, it's early here ... I checked the device info parameter code, and in fact this is correct and is set to 1 for a reason. In OpenCL-speak the Epiphany processor should be viewed as a single compute unit supporting a max workgroup size of 16. The definition of CL_DEVICE_MAX_COMPUTE_UNITS is imprecise outside of the GPUs that inspired it. This is the best interpretation (1 x 16) to help programmers target the device efficiently. For optimal performance you should use OpenCL to launch 16 parallel threads and have each thread perform 1/16th of the work for your problem. This is different from a GPU where you are taught to launch thousands of threads - GPUs are massively multithreaded architectures, a non-hyper-threaded multi-core CPU is not. So you must use a different work distribution model for performance.
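As a rough illustration of the "16 threads, each doing 1/16th of the work" model, here is a plain C sketch of the index arithmetic. The names chunk_bounds, scale_in_chunks and NUM_CORES are made up for this example and are not part of any SDK; the outer loop over tid only stands in for the 16 work-items, and in a real OpenCL kernel tid would instead come from get_global_id(0).

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CORES 16  /* assumption: one compute unit, work-group size 16 */

/* Compute the half-open range [*begin, *end) of items owned by thread `tid`
 * when `n` items are split as evenly as possible across `nthreads` threads.
 * The first (n % nthreads) threads each take one extra item. */
static void chunk_bounds(size_t n, size_t nthreads, size_t tid,
                         size_t *begin, size_t *end)
{
    assert(nthreads > 0 && tid < nthreads);
    size_t base = n / nthreads;
    size_t extra = n % nthreads;
    *begin = tid * base + (tid < extra ? tid : extra);
    *end = *begin + base + (tid < extra ? 1 : 0);
}

/* Host-side stand-in for the work done by the 16 work-items: each one
 * scales only its own contiguous slice of the array. */
static void scale_in_chunks(float *data, size_t n, float factor)
{
    for (size_t tid = 0; tid < NUM_CORES; ++tid) {
        size_t begin, end;
        chunk_bounds(n, NUM_CORES, tid, &begin, &end);
        for (size_t i = begin; i < end; ++i)
            data[i] *= factor;
    }
}
```

With 100 items, for instance, the first four threads get 7 items each and the remaining twelve get 6, so every item is handled exactly once. A typical GPU-style kernel would instead launch one work-item per element.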
Last edited by dar on Tue Aug 12, 2014 1:27 pm, edited 1 time in total.
dar
 
Posts: 90
Joined: Mon Dec 17, 2012 3:26 am

Re: Notes on the OpenCl example

Postby dar » Tue Aug 12, 2014 1:12 pm

aihaike wrote: To me, using OpenCL on the board only requires a specific library, and any OpenCL code should then be able to run on it.
Am I right here?


OpenCL was designed for GPUs. You can use it for other devices, but not perfectly, and not without considering the architecture. The Epiphany RISC array provides unique capabilities that are not accessible from OpenCL, and it also does not perform well if you try to run code written for GPUs without considering the architecture. OpenCL is a portable API, but it does not follow that OpenCL code is performance-portable. It generally is not. Most OpenCL code is designed for GPUs and even then it is not uncommon to find optimized code paths for each GPU vendor. Programming Epiphany with OpenCL is in many ways similar to programming the Xeon Phi with OpenCL - you must use different algorithms (compared with a GPU) if you expect performance. Also, note that OpenCL is provided here as a convenient means of programming the platform, one of several APIs, but the implementation is not conformant and the intent was not to allow all OpenCL code to run without modification.
dar
 
Posts: 90
Joined: Mon Dec 17, 2012 3:26 am

Re: Notes on the OpenCl example

Postby aihaike » Tue Aug 12, 2014 4:20 pm

@Dar,

Thank you so much for your replies.
That makes things much clearer now.
aihaike
 
Posts: 31
Joined: Wed Aug 06, 2014 5:41 am
Location: Shanghai, China

