Parallella Community

by **smatthews** » Wed Aug 26, 2015 2:32 pm

Sorry, you're right. Each e-core has 32KB of local memory (not 8KB).

Thus, for dot product, we could theoretically place two 16KB arrays on each e-core. With 4 bytes per integer, that works out to 4,096 elements per array on each core. Over 16 cores, we could support arrays up to 65,536 in length without requiring a fetch to main memory. The DMA cost I was talking about was the DMA channel between the ARM and the Epiphany chip, not the elinks between cores. If we want to exceed array sizes of 65,536, we would have to write our program in such a way as to use the DMA channel to fetch new portions of the array from the 2GB main memory bank on the ARM chip. As the array gets large, I imagine this performance cost will get prohibitively high.
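To make the arithmetic above easy to check, here is a minimal C sketch of the capacity calculation. The constant and function names are made up for illustration, and it assumes all 32KB of local memory is available for data, which is why the figures are a theoretical upper bound (program code and stack also live in local memory).

```c
#include <stddef.h>
#include <stdint.h>

/* Figures taken from the post; the names are illustrative only. */
enum {
    LOCAL_MEM_BYTES = 32 * 1024, /* local memory per e-core */
    NUM_ARRAYS      = 2,         /* two input vectors for a dot product */
    NUM_CORES       = 16         /* 16-core Epiphany */
};

/* Elements of each array that fit on one core, assuming every byte of
 * local memory holds array data (an optimistic assumption). */
static size_t elems_per_core(size_t elem_size)
{
    return ((size_t)LOCAL_MEM_BYTES / NUM_ARRAYS) / elem_size;
}

/* Longest array that fits entirely in local memory across all cores. */
static size_t max_array_len(size_t elem_size)
{
    return elems_per_core(elem_size) * NUM_CORES;
}
```

With 4-byte integers, `elems_per_core(sizeof(int32_t))` gives 4,096 and `max_array_len(sizeof(int32_t))` gives 65,536, matching the numbers above.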

Back to the issue with array sizes greater than 4,096. The sum of products (SOP) for two integer arrays, each holding the values i = 0 ... n-1, for n = 4,096 is 22,898,104,320. This exceeds the capacity of a 4-byte int or long. There is a long long type that is 8 bytes, which should hold this value. However, I ran into trouble when I tried to change the type of the sop variable to unsigned long long. That's why I left it as an open problem.
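For reference, here is a small plain-C sketch of the overflow, written for an ordinary host compiler rather than the Epiphany toolchain. The function names are hypothetical, and it does not reproduce or explain the trouble I hit on the device. It only shows that a 64-bit accumulator holds the value, provided each product is widened before the multiply.

```c
#include <stdint.h>

#define N 4096 /* array length from the post */

/* Sum of products a[i] * b[i], accumulated in 64 bits. */
static uint64_t sum_of_products(const uint32_t *a, const uint32_t *b,
                                uint32_t n)
{
    uint64_t sop = 0;
    for (uint32_t i = 0; i < n; i++) {
        /* Cast one operand first so the multiply is done in 64 bits. */
        sop += (uint64_t)a[i] * b[i];
    }
    return sop;
}

/* The test case from the post: both arrays hold 0, 1, ..., N-1. */
static uint64_t sop_for_post_example(void)
{
    static uint32_t a[N], b[N];
    for (uint32_t i = 0; i < N; i++) {
        a[i] = i;
        b[i] = i;
    }
    return sum_of_products(a, b, N);
}
```

`sop_for_post_example()` returns 22,898,104,320, which is larger than `UINT32_MAX` (4,294,967,295), so a 4-byte accumulator would wrap around.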

-Suzanne

by **sebraa** » Wed Aug 26, 2015 7:44 pm

by **dobkeratops** » Thu Aug 27, 2015 11:38 am

by **smatthews** » Fri Aug 28, 2015 12:17 pm

by **sebraa** » Fri Aug 28, 2015 4:36 pm

by **smatthews** » Mon Aug 31, 2015 12:55 pm

Does the 32MB also apply to the 16-core Epiphany chips, or just the 64-core chips? The manuals sometimes do not make the distinction clear in their descriptions.

Is there a figure anywhere in the manuals that diagrams this memory block? I haven't seen one yet, and I think seeing it pictorially would really add to people's understanding.

by **sebraa** » Tue Sep 01, 2015 12:06 pm

dot product