
Re: Iterating host code: Parallella restarts

Posted: Fri Jun 23, 2017 11:36 pm
by ninlar
sebraa wrote: Use writes: Reads incur high latency; read requests travel at only 12.5% speed; reads do not allow bursts.


That is something I need to remember: optimize the algorithm for remote writes but local reads. I remember it being mentioned in the documentation. There are three meshes: cMesh, rMesh, and xMesh. The cMesh is for on-chip writes and can do up to 8 bytes per cycle. While the rMesh is for reads and takes 8 cycles for one hop on the network. So if the core is multiple hops away it could take say 24 cycles to read that one byte. Whereas you could write 192 bytes in the same number of cycles.
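To make the arithmetic explicit, here is a rough sketch in C. It only encodes the figures quoted in this post (8 cycles per hop for a read, 8 bytes per cycle for writes); these are assumptions taken from the discussion, not measurements, and the function names are made up for illustration.

```c
/* Figures taken from the post above, not from measurement:
   8 cycles per hop for a read, 8 bytes per cycle for writes. */
enum { READ_CYCLES_PER_HOP = 8, WRITE_BYTES_PER_CYCLE = 8 };

/* Cycles one remote read waits under this simple per-hop model. */
static unsigned read_cycles(unsigned hops)
{
    return hops * READ_CYCLES_PER_HOP;
}

/* Bytes the write mesh could move in the given number of cycles. */
static unsigned bytes_writable(unsigned cycles)
{
    return cycles * WRITE_BYTES_PER_CYCLE;
}
```

For a core three hops away, read_cycles(3) gives 24 and bytes_writable(24) gives 192, which is where the numbers above come from.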

I should comment out the e_read prototypes in the header to discourage their use, or make them generate a warning when the warning level is increased, to make this more obvious.

From the doc:
Optimization of Write Transactions over Read Transactions.

Writes are approximately 16x more efficient than reads for on-chip transactions. Programs should use the high write transaction bandwidth and minimize inter-node, on-chip read transactions.
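One common way to follow that advice is to turn a "pull" into a "push": the core that owns the data writes it into the consumer's memory, and the consumer only ever reads locally. The sketch below is a plain C model of that pattern, not e-lib code. The mailbox_t type and both function names are hypothetical, and memcpy stands in for whatever remote store the real program would use. Write-ordering guarantees on real hardware are outside the scope of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a buffer living in the consumer's local
   memory: a payload area plus a flag the consumer polls locally. */
typedef struct {
    int data[4];
    volatile int ready;   /* set by the producer after the payload */
} mailbox_t;

/* Producer side: a remote *write* into the consumer's mailbox.
   Here it is modelled with memcpy into the consumer's struct. */
static void push_result(mailbox_t *consumer_box, const int *src, size_t n)
{
    memcpy(consumer_box->data, src, n * sizeof src[0]);
    consumer_box->ready = 1;   /* flag is written after the payload */
}

/* Consumer side: only *local* reads. Returns 1 and copies the
   payload out once the flag is set, 0 if nothing has arrived. */
static int poll_local(mailbox_t *my_box, int *dst, size_t n)
{
    if (!my_box->ready)
        return 0;
    memcpy(dst, my_box->data, n * sizeof dst[0]);
    my_box->ready = 0;         /* mark the mailbox as free again */
    return 1;
}
```

The point of the design is that the only traffic crossing the mesh is write traffic; the consumer never issues a read request to another core.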

Re: Iterating host code: Parallella restarts

Posted: Sat Jun 24, 2017 9:07 am
by sebraa
ninlar wrote: While the rMesh is for reads and takes 8 cycles for one hop on the network. So if the core is multiple hops away it could take say 24 cycles to read that one byte. Whereas you could write 192 bytes in the same number of cycles.
To be fair, the rMesh only carries the read requests. The actual data is transmitted over the cMesh, like regular writes. Still, the overhead compared to writes is substantial.

Re: Iterating host code: Parallella restarts

Posted: Tue Jun 27, 2017 7:25 am
by gordon
The speed computation was done for a single e_write call, in which I passed the address of the first element of the array. So I do think all elements have consecutive addresses here. Also, how do I replace e_read with e_write calls to reduce the time?