Why is one eCore faster than the ARM?

Moderator: aolofsson

Post by etim » Thu Oct 01, 2015 7:59 pm

I did a little benchmark today running this code on a single eCore:

Code:
for (n = 0; n < TIMES; n++){
 
  //Clear Sum
  (*(c))=0x0;

  //Sum of product calculation
  for (i = 0; i < N/CORES; i++){
    (*(c)) += a[i] * b[i];
  }
}


Then I did the same on the ARM:

Code:
for (n = 0; n < TIMES; n++){

  // printf("j= %d\n", j);

  //Clear Sum
  sop = 0;

  //Sum of product calculation
  for (i = 0; i < N; i++){
    sop += a[i] * b[i];
  }
}


For TIMES=100,000 and N=4096, the eCore takes 11 seconds and the ARM takes 19 seconds.

Can anyone explain why the eCore is faster at this benchmark?

Re: Why is one eCore faster than the ARM?

Post by justsomeguy » Thu Oct 01, 2015 8:33 pm

etim wrote: I did a little benchmark today running this code on a single eCore:
...
Can anyone explain why the eCore is faster at this benchmark?


I suggest you take a look at the generated assembly code. The Epiphany can do a multiply and add in one cycle, and I don't think the basic ARM core can do that. Maybe the NEON unit can, but I expect the code you wrote won't be executed on NEON.

-jsg