I should make it clear that each candidate program is executed multiple times (once for each item of training/validation data), and the multiple runs can all be done on-core, independent of the host. So it's more like transfer from host, run, run, run, run, run, run (etc.) rather than transfer, run, transfer, run. The latter would be very inefficient as the Parallella would spend half the time just shifting data between the two chips.
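To make the "transfer once, run many times" pattern concrete, here's a rough sketch in plain C. It doesn't touch the Epiphany SDK at all: the `memcpy` just stands in for the single host-to-core transfer, and `dataset_t`, `candidate_fn`, `evaluate_all` and `transfer_once_and_run` are names I made up for illustration. A real candidate would be generated code rather than a C function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 64

/* Hypothetical layout of what the host copies into a core's memory once. */
typedef struct {
    size_t n_items;
    int inputs[MAX_ITEMS];
    int expected[MAX_ITEMS];
} dataset_t;

/* A candidate program, modelled here as a plain function of one input. */
typedef int (*candidate_fn)(int);

/* Example candidate: some arithmetic and a conditional branch. */
static int example_candidate(int x) {
    return x > 10 ? 2 * x : x + 1;
}

/* Core side: run the candidate on every item already held locally and
 * accumulate the absolute error. No host involvement inside the loop. */
static long evaluate_all(const dataset_t *local, candidate_fn candidate) {
    long total_error = 0;
    for (size_t i = 0; i < local->n_items; i++) {
        long diff = (long)candidate(local->inputs[i]) - local->expected[i];
        total_error += diff < 0 ? -diff : diff;
    }
    return total_error;
}

/* Host side stand-in: one copy into "core memory", then run, run, run... */
static long transfer_once_and_run(const dataset_t *host_data,
                                  dataset_t *core_mem,
                                  candidate_fn candidate,
                                  unsigned *transfers) {
    assert(host_data->n_items <= MAX_ITEMS);
    memcpy(core_mem, host_data, sizeof *core_mem); /* the only transfer */
    (*transfers)++;
    return evaluate_all(core_mem, candidate);
}
```

The point is just that the transfer count stays at one no matter how many data items there are; only the final error figure needs to go back to the host.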
I think I may have a good read of the Epiphany datasheet, and a peek at the SDK internals, to see if I can roll my own from first principles. Even a 'blank' program, int main() {};, ends up as a 6428-byte SREC file. Because the cores are running a simple program that reads input data, does some calculations and conditional branches, then exits, there's no need to use any C library functions.
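As a starting point for poking at what's inside one of those SREC files, here's a small sketch that checks a single Motorola S-record data line and reports how many payload bytes it carries. It's based only on the generic S-record format (type, byte count, address, data, one's-complement checksum); I haven't verified what the Epiphany tools actually emit, and `srec_data_bytes` is my own name, not anything from the SDK.

```c
#include <ctype.h>
#include <stddef.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    c = (char)toupper((unsigned char)c);
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse two hex characters into a byte value, or -1 on bad input. */
static int hex_byte(const char *s) {
    int hi = hex_digit(s[0]);
    if (hi < 0) return -1;
    int lo = hex_digit(s[1]);
    if (lo < 0) return -1;
    return hi * 16 + lo;
}

/* For an S1/S2/S3 data record, return the number of payload bytes
 * (excluding address and checksum). Returns -1 if the line is not a
 * data record, is malformed, or fails its checksum. */
static int srec_data_bytes(const char *line) {
    int addr_len;
    if (line[0] != 'S') return -1;
    switch (line[1]) {
        case '1': addr_len = 2; break; /* 16-bit address */
        case '2': addr_len = 3; break; /* 24-bit address */
        case '3': addr_len = 4; break; /* 32-bit address */
        default: return -1;
    }
    /* Byte count covers address + data + checksum. */
    int count = hex_byte(line + 2);
    if (count < addr_len + 1) return -1;

    /* Sum the count byte plus every address and data byte. */
    unsigned sum = (unsigned)count;
    for (int i = 0; i < count - 1; i++) {
        int b = hex_byte(line + 4 + 2 * i);
        if (b < 0) return -1;
        sum += (unsigned)b;
    }
    /* Checksum is the one's complement of the low byte of that sum. */
    int checksum = hex_byte(line + 4 + 2 * (count - 1));
    if (checksum < 0) return -1;
    if (((~sum) & 0xFFu) != (unsigned)checksum) return -1;

    return count - addr_len - 1;
}
```

Since every payload byte takes two hex characters plus per-record overhead, the SREC text is always more than twice the size of the binary image it describes, so the file size on its own overstates how much actually lands in core memory.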
Anyway, sorry for rambling, it's really just thinking aloud. I'd be curious to know if anyone has bypassed the SREC/e_load() system.

Posted by rowan194, Thu Jan 29, 2015 10:37 am
]]>