
Re: Is there anything like e_load() which loads from memory?

Posted: Mon Feb 23, 2015 2:22 am
by rowan194
aolofsson wrote: rowan194,
Writing directly to the cores is always going to be very slow. The current memcpy() implementation inside e_write() is just using a store in a loop from the registers. [...]

So you're saying that e_write() is horribly inefficient? ;) I may try a little experimenting to see if a DMA "pull" would be faster. One advantage of doing it this way is that each Epiphany core should be able to run in a continuous loop, rather than relying on the host to push new code and signal start of execution.

Rough back of the envelope calcs...

8 GB/s = 2^33 bytes/s (host <-> Epiphany bandwidth - would shared memory run at this speed?)
32 KB = 2^15 bytes (sample program size)

So you could transfer a theoretical maximum of 262144 (2^18) 32k programs per second, or 16384 per core per second (16-core CPU), which works out to about 0.06 ms per program per core when the bandwidth is shared across all cores. If the real world were even a half or a quarter as good as that, I think my application may still be viable. Look plausible?
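The arithmetic above can be checked with a short script. This is only a sketch under the post's own assumptions: the 8 GB/s bandwidth figure is the unverified number from the estimate, and the function and constant names are made up for illustration.

```python
# Back-of-envelope estimate using the figures assumed in the post above.
BANDWIDTH_BYTES_PER_S = 2**33  # 8 GB/s host <-> Epiphany (assumed, unverified)
PROGRAM_BYTES = 2**15          # 32 KB sample program
NUM_CORES = 16                 # 16-core Epiphany


def transfer_estimate(bandwidth, program_bytes, num_cores):
    """Return (programs/s overall, programs/s per core, ms per program per core)."""
    programs_per_s = bandwidth / program_bytes
    per_core_per_s = programs_per_s / num_cores
    ms_per_program = 1000.0 / per_core_per_s
    return programs_per_s, per_core_per_s, ms_per_program


total, per_core, ms = transfer_estimate(BANDWIDTH_BYTES_PER_S, PROGRAM_BYTES, NUM_CORES)
print(f"{total:.0f} programs/s, {per_core:.0f} per core/s, {ms:.3f} ms each")
# prints: 262144 programs/s, 16384 per core/s, 0.061 ms each
```

Swapping in a measured bandwidth for `BANDWIDTH_BYTES_PER_S` gives a quick feel for how far the real figure can drop before the per-core time becomes a problem.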

Re: Is there anything like e_load() which loads from memory?

Posted: Mon Feb 23, 2015 9:55 am
by piotr5
Shouldn't it be 0.48 ms transfer time per core? Also, I suspect there is some latency: a period where you cannot run any program because you're waiting for the data to arrive. If you budgeted 16k per program instead, you could probably avoid that latency, since the next program could arrive while the current one runs. As far as I know, there is no way to protect that local memory from local writes while keeping it accessible to DMA transfers.

To actually get a 0.16 ms transfer time for each 32k core, you would need to attach the FPGA to the north and south eLinks. And then you would need to program it...
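One reading that reproduces both figures (an assumption, not something stated in the thread) is that a single eLink carries about 1 GB/s (2^30 bytes/s) in one direction, and that the 0.16 ms case uses three such links. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def ms_per_program_per_core(link_bytes_per_s, num_links,
                            program_bytes=2**15, num_cores=16):
    """Milliseconds between program deliveries to one core when the
    combined link bandwidth is shared evenly across all cores."""
    total_bandwidth = link_bytes_per_s * num_links
    return 1000.0 * program_bytes * num_cores / total_bandwidth


ONE_GB_PER_S = 2**30  # assumed per-link rate; not confirmed in the thread

one_link_ms = ms_per_program_per_core(ONE_GB_PER_S, 1)
three_links_ms = ms_per_program_per_core(ONE_GB_PER_S, 3)
print(f"1 link: {one_link_ms:.3f} ms, 3 links: {three_links_ms:.3f} ms")
# prints: 1 link: 0.488 ms, 3 links: 0.163 ms
```

Neither number accounts for the latency or contention mentioned in this post, so they are best treated as lower bounds.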

Finally, you should consider where data is sent and when, so that it isn't stalled by other cores doing a transfer at the same time, especially if you actually get input from 3 connections...