Parallella Community

by **fredd** » Wed Jan 16, 2013 5:45 pm

Just some thoughts on the subject (from one mostly intrested in numerical Python code):

- Two different models for the parallel environment:
-- Each core is a indepentent python interpreter and can only access it's own set of python objects. Shared SDRAM is divided into chunks belonging to a single interpreter at each time. Can communicate with other cores with a message passing interface (and maybe even transfer ownership of python objects, but each object belongs to a single interpreter all time). Could have shared access to blocks of raw memory (as arrays of ints or floats)
-- OR: the cores are seen as threads in a shared python object environment. An object on core A could have a direct reference to a object residing on core B, and objects in SDRAM are completely shared. Need per object locks instead of GIL. Still needs a way to transfer objects between two cores and between corses and SDRAM (because indirect memory access is relative expensive), or some smart caching system.

Personally I prefer the former model (much simpler IMHO), but ideally the system should support both. Explicitly distguish between "local" and "global" objects similar to OpenCL, maybe?

- cython-style compilation of kernels. By adding a few type annotations to Python code and compiling to C, Cython often gives big speedups for (especially numerical) python code. But cython produces pretty big C code (like 10-15 lines of C code for a single line of Python), and have long compilation times, so we probably don't want to use Cython itself, but we could reuse many of its concepts. Maybe one could translate Python bytecode to Epiphany assembler, which calls functions in PyMite to manipulate python objects, and which also can manipulate int/float-arrays directy. I will look into PyMite and see if that's a resonable idea.

by **fredd** » Thu Jan 17, 2013 9:26 pm

by **markd** » Mon Jan 21, 2013 8:05 am

Thanks, fredd, this is exactly the kind of example I was looking for.

This example uses two-sided communication (send/receive).

In general, the calls are
send_array(<target node>, <source data>)
which must be paired with
receive_array(<source node>, <target data>)

Just thinking about some alternate models/syntax:

One-sided communication
put_array(grid[0,:], LEFT, right_edge)
put_array(grid[-1:], RIGHT, left_edge)

in general: put_array(<source data>, <target node>, <target data>)

Or alternately, the get version
get_array(right_edge, LEFT, grid[0,:])

in general get_array(<target data>, <source node>, <source data>)

Or coarrays:
right_edge = grid(LEFT)[0,:]
left_edge = grid(RIGHT)[-1,:]

And all of these methods would require the synchronization call at the end.

by **tnt** » Mon Jan 21, 2013 8:50 am

Just in case you missed it: Writing data to a neighbor core is _much_ faster than reading from a neighbor core.

by **fredd** » Mon Jan 21, 2013 3:54 pm

by **ysapir** » Mon Jan 21, 2013 4:10 pm

by **camara** » Sat Jan 26, 2013 11:55 pm

Mark,

Have you considered talking to the pypy folks (pypy.org or #pypy on freenode). They already have a backend for arm I believe for v7 and on going work to support v6. They have started work on STM (software transaction memory) for parallel processing which potential could be helpful for the Parallella project. A backend for Epiphany would need to be created which could take a couple of man months and then some work would need to be done to bridge between the arm and epiphany cores which I'm sure would be doable given the pypy architecture.

Any way, its just something to consider.

John M. Camara

by **henwee** » Wed Mar 06, 2013 12:46 pm

Ahh now I get it! So latency is effecting the throughput so by controlling data dma we can increase this. I was wondering why writing data to a neighbor core is _much_ faster than reading from a neighbor core.

Parallella Community

Parallel python design

Parallel python design

Re: Parallel python design

Re: Parallel python design

Re: Parallel python design

Re: Parallel python design

Re: Parallel python design

Re: Parallel python design

Re: Parallel python design

Re: Parallel python design

Who is online