In other words, the core program fetches a chunk of the "problem" from external RAM, processes it, and shuffles it back to external RAM (the Adapteva matrix multiplication demo is an example of this technique, even if the C code is a bit mind-bending to read).
Since the Epiphany cores have two DMA channels, it is feasible (if the chunks are sufficiently small) to have fresh input data being read and results being written while calculation is in progress. There will probably be practical pitfalls, as the approach takes one further and further away from the original code.
The ideal is for each core to have enough internal RAM that back-and-forth shuffling is not needed, but I suspect that is not a realistic expectation. 20+ years ago, 256KB was a tight fit for an SPMD-parallelised 3D Navier-Stokes CFD code (half code + buffers, half data). On Parallella, much of that code could go into the external RAM, but that still leaves 128KB of data to find a home for. Increasing the amount of internal RAM and/or increasing the speed and parallelism of external RAM would make Epiphany systems more generally useful.

Statistics: Posted by mhonman — Thu Oct 17, 2013 8:22 pm