Would it be possible to get an approach like this working on the Epiphany with 'Boost.Compute'? (It seems to use tricks like Boost.Lambda to create OpenCL kernel code; basically it uses template magic to roll 'function objects' and compile them as OpenCL kernels.)
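For reference, this is roughly what the standard Boost.Compute pattern looks like on the host side (a minimal sketch from memory of the Boost.Compute docs; the lambda placeholder `_1` is what gets turned into OpenCL C source and compiled at runtime):

```cpp
// Minimal Boost.Compute sketch: the expression built from _1 is compiled
// into an OpenCL kernel at runtime and run on whatever device is selected.
#include <vector>
#include <boost/compute/system.hpp>
#include <boost/compute/algorithm/copy.hpp>
#include <boost/compute/algorithm/transform.hpp>
#include <boost/compute/container/vector.hpp>
#include <boost/compute/lambda.hpp>

namespace compute = boost::compute;

int main()
{
    using compute::lambda::_1;

    compute::device device = compute::system::default_device();
    compute::context ctx(device);
    compute::command_queue queue(ctx, device);

    std::vector<float> host = {1.0f, 2.0f, 3.0f, 4.0f};

    // copy input data to the device
    compute::vector<float> dev(host.size(), ctx);
    compute::copy(host.begin(), host.end(), dev.begin(), queue);

    // the 'function object' _1 * 2.0f + 1.0f becomes an OpenCL kernel
    compute::transform(dev.begin(), dev.end(), dev.begin(),
                       _1 * 2.0f + 1.0f, queue);

    // copy the result back to the host
    compute::copy(dev.begin(), dev.end(), host.begin(), queue);
    return 0;
}
```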
The appeal would be writing portable code (CPU, GPU, e-cores...); you could write iterators that work on both the GPU and the Epiphany, and on the Epiphany they'd do something smarter with the dataflow where possible.
It's not as nice as real lambda functions, but with enough 'iterators' it might cover a lot of useful cases.
(e.g. one could write a 'convolution iterator' and pass in a function to apply to the result, like the activation function for a neural net; or rely on some expression-template magic to combine 'convolve' & 'map' operations. See the sketch below.)
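Here's a sketch of the "pass a function into the pipeline" half of that idea, using only real Boost.Compute pieces: `BOOST_COMPUTE_FUNCTION` generates OpenCL source for a user-supplied function and `compute::transform` applies it on the device. The 'convolution iterator' itself is hypothetical (it doesn't exist in Boost.Compute today); it would slot in where `conv_out.begin()` is, so that convolve + activate fuse into one kernel.

```cpp
// Sketch: user-defined activation function applied on the device.
// A hypothetical convolution iterator would replace conv_out.begin(),
// fusing the convolution with the map into a single generated kernel.
#include <boost/compute/system.hpp>
#include <boost/compute/function.hpp>
#include <boost/compute/algorithm/transform.hpp>
#include <boost/compute/container/vector.hpp>

namespace compute = boost::compute;

// activation function for the net, compiled to OpenCL C at runtime
BOOST_COMPUTE_FUNCTION(float, relu, (float x),
{
    return x > 0.0f ? x : 0.0f;
});

void activate(compute::vector<float>& conv_out, compute::command_queue& queue)
{
    // apply the activation to the (already computed) convolution output;
    // an expression-template front end could fuse this with the convolution
    compute::transform(conv_out.begin(), conv_out.end(), conv_out.begin(),
                       relu, queue);
}
```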
I think 'Boost.Compute' still relies on the ability to invoke the OpenCL compiler from within an application.
I envisage the framework rolling stub code in the Epiphany kernels for synchronization/DMA, and inlining the code generated by the lambda function objects (see the hypothetical sketch below).
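A purely hypothetical illustration of what such a generated per-core stub might look like, using the `e_dma_copy` / `e_barrier` calls from the Epiphany e-lib (names from memory; the header name, buffer addresses and TILE size are placeholders). The inner loop stands in for the code the lambda/function-object would have produced:

```cpp
/* Hypothetical generated e-core stub: DMA a tile in, run the inlined user
 * expression (here x * 2 + 1, i.e. what the lambda _1 * 2 + 1 would emit),
 * DMA the result out, then synchronize with the other cores. */
#include <e-lib.h>   /* Epiphany e-lib; header name may vary by SDK version */

#define TILE 256

static float local_in[TILE];
static float local_out[TILE];

static e_barrier_t  bar[16];
static e_barrier_t *tgt_bar[16];

int main(void)
{
    /* placeholder addresses for shared-DRAM staging buffers */
    float *src = (float *)0x8f000000;
    float *dst = (float *)0x8f100000;

    e_barrier_init(bar, tgt_bar);

    /* stub: DMA the input tile into local memory */
    e_dma_copy(local_in, src, TILE * sizeof(float));

    /* inlined user code (what the lambda/function-object expands to) */
    for (int i = 0; i < TILE; i++)
        local_out[i] = local_in[i] * 2.0f + 1.0f;

    /* stub: DMA the result out, then sync with the other cores */
    e_dma_copy(dst, local_out, TILE * sizeof(float));
    e_barrier(bar, tgt_bar);

    return 0;
}
```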
(Other inspiration along these lines is Microsoft's C++ AMP. I had some more complex ideas in mind originally: writing an LLVM preprocessor to extract functions during the build process, but I'm not sure there's LLVM support for the Epiphany.)