etim wrote:

I did a little benchmark today running this code on a single eCore:

...

Can anyone explain why the eCore is faster at this benchmark?

I suggest you take a look at the generated assembly code. The Epiphany can do a multiply and add in one cycle, and I don't think the basic ARM core can do that. The NEON unit might be able to, but I expect the code you wrote won't be executed on NEON.
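To make that comparison concrete, here is a minimal sketch of the same sum-of-products loop as a standalone function, so the assembly for just that loop is easy to find. The function name `sum_of_products` and the `float` element type are my assumptions; the original post does not show how `a`, `b` and `c` are declared.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of products over n elements.
 * Assumption: the arrays hold floats; the original post does not say. */
float sum_of_products(const float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);

    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        /* One multiply and one add per element: this is the statement
         * a compiler may turn into a single multiply-add instruction. */
        sum += a[i] * b[i];
    }
    return sum;
}
```

Compiling this with GCC's `-S` option (for example `gcc -O2 -S` on the ARM, and the same flags with `e-gcc` for the eCore) writes the assembly to a `.s` file instead of an object file. In the inner loop, check whether the multiply and the add come out as one instruction or two on each target.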

-jsg

Statistics: Posted by justsomeguy — Thu Oct 01, 2015 8:33 pm

]]>

I did a little benchmark today running this code on a single eCore:

- Code:
```c
for (n = 0; n < TIMES; n++) {
    // Clear sum
    *c = 0;

    // Sum of product calculation
    for (i = 0; i < N/CORES; i++) {
        *c += a[i] * b[i];
    }
}
```

Then I did the same on the ARM:

- Code:
```c
for (n = 0; n < TIMES; n++) {
    // printf("j= %d\n", j);

    // Clear sum
    sop = 0;

    // Sum of product calculation
    for (i = 0; i < N; i++) {
        sop += a[i] * b[i];
    }
}
```

For TIMES=100,000 and N=4096, the eCore takes 11 seconds and the ARM takes 19 seconds.
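For reference, this is roughly how the ARM-side number could be measured. It is only a sketch: the function name `time_sum_of_products`, the `float` element type and the use of `clock()` are my assumptions, not the code that produced the timings above. Note also that the two loops do the same amount of work only if `CORES` is 1, since the eCore loop runs `N/CORES` iterations and the ARM loop runs `N`.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Runs the sum-of-products loop `times` times over n elements.
 * Returns elapsed CPU time in seconds and stores the last sum in *result.
 * Assumption: float data; the original post does not show the types. */
double time_sum_of_products(const float *a, const float *b,
                            size_t n, size_t times, float *result)
{
    assert(a != NULL && b != NULL && result != NULL);

    /* Writing each pass's sum to a volatile variable keeps the
     * compiler from discarding the computed result as unused. */
    volatile float sink = 0.0f;

    clock_t start = clock();
    for (size_t t = 0; t < times; t++) {
        float sop = 0.0f;               /* clear sum */
        for (size_t i = 0; i < n; i++) {
            sop += a[i] * b[i];         /* sum of products */
        }
        sink = sop;
    }
    clock_t end = clock();

    *result = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling it with `n = 4096` and `times = 100000` mirrors the parameters above. `clock()` reports CPU time used by the process, which is close to wall-clock time for a single-threaded loop like this one.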

Can anyone explain why the eCore is faster at this benchmark?

Statistics: Posted by etim — Thu Oct 01, 2015 7:59 pm

]]>