Parallella Community

by **grzeskob** » Mon Dec 15, 2014 1:13 pm

Could someone give me some advice?
I am stuck on this problem.
I have spent a long time searching the documentation and trying different changes to the benchmark app. My questions:

- Is my calculation wrong, so that I cannot expect 2.4 GB/s?
or
- Is there a SW bug in the benchmark that I can fix to get 2.4 GB/s?

by **aolofsson** » Mon Dec 15, 2014 1:20 pm

Sorry for the slow reply!

The cMesh can transfer 8 bytes/cycle at each node in each direction. At 600 MHz this implies a peak bandwidth of 4.8 GB/s. However, due to two errata items, the DMA bandwidth out of one core is limited to ~25% of this, i.e. about 1.2 GB/s. This is documented in the datasheets of the E16G301 and E64G401 processors. As regrettable as this is, we have found the existing on-chip bandwidth to be the least of our problems (see the FFT and matmul benchmarks on GitHub for examples showing effective on-chip communication patterns). The 1.2 GB/s is still much higher than the off-chip bandwidth.
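For anyone checking the numbers, here is a rough sketch of the arithmetic in Python. It only uses the figures quoted in this reply (8 bytes/cycle, 600 MHz, ~25%). The helper `measured_bandwidth` and its sample byte and cycle counts are hypothetical, added to show how a benchmark result could be compared against these limits. It is not taken from the benchmark app. GB is assumed to mean 10^9 bytes.

```python
# Back-of-the-envelope check of the bandwidth figures quoted in the reply.
BYTES_PER_CYCLE = 8          # cMesh width per node, per direction (from the post)
CLOCK_HZ = 600e6             # 600 MHz core clock (from the post)
DMA_ERRATA_FRACTION = 0.25   # "~25%" of peak due to errata (approximate)
GB = 1e9                     # assumption: GB means 10^9 bytes


def peak_bandwidth(bytes_per_cycle, clock_hz):
    """Theoretical peak in bytes/second for one link direction."""
    return bytes_per_cycle * clock_hz


def measured_bandwidth(bytes_moved, cycles, clock_hz):
    """Hypothetical helper: bytes/second from a byte count and a cycle count."""
    return bytes_moved * clock_hz / cycles


peak = peak_bandwidth(BYTES_PER_CYCLE, CLOCK_HZ)
dma_limit = peak * DMA_ERRATA_FRACTION

# Made-up sample measurement: 8192 bytes copied in 4096 cycles.
sample = measured_bandwidth(8192, 4096, CLOCK_HZ)

print(f"peak cMesh bandwidth : {peak / GB:.1f} GB/s")
print(f"DMA limit (~25%)     : {dma_limit / GB:.1f} GB/s")
print(f"sample measurement   : {sample / GB:.1f} GB/s "
      f"({sample / peak:.0%} of peak)")
```

With these inputs, an expectation of 2.4 GB/s would correspond to about half of the 4.8 GB/s peak, which is above the ~1.2 GB/s DMA limit described here.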

What are you trying to test?

Andreas

by **grzeskob** » Mon Dec 15, 2014 2:42 pm

Hi Andreas,

Thank you for your answer. It has helped me a lot.
I am doing my thesis on the Parallella board. The first step is to measure and validate peak performance between the different memory blocks on the board. I have already seen the posts about ERAM<->SRAM bandwidth problems. Later on I want to identify bottlenecks and possible congestion points and try to optimize apps with this knowledge.

BR
Bartek

Parallella Memory benchmark
