[Paper] A Distributed Shared Memory Model and C++ Templated

Announcements of academic papers and technical reports based on Parallella or the Epiphany architecture.

[Paper] A Distributed Shared Memory Model and C++ Templated

Postby jar » Fri Apr 28, 2017 12:59 am

https://arxiv.org/abs/1704.08343
Title: A Distributed Shared Memory Model and C++ Templated Meta-Programming Interface for the Epiphany RISC Array Processor
Abstract: The Adapteva Epiphany many-core architecture comprises a scalable 2D mesh Network-on-Chip (NoC) of low-power RISC cores with minimal uncore functionality. Whereas such a processor offers high computational energy efficiency and parallel scalability, developing effective programming models that address the unique architecture features has presented many challenges. We present here a distributed shared memory (DSM) model supported in software transparently using C++ templated metaprogramming techniques. The approach offers an extremely simple parallel programming model well suited for the architecture. Initial results are presented that demonstrate the approach and provide insight into the efficiency of the programming model and also the ability of the NoC to support a DSM without explicit control over data movement and localization.

Comments and discussion are appreciated.
User avatar
jar
 
Posts: 222
Joined: Mon Dec 17, 2012 3:27 am

Re: [Paper] A Distributed Shared Memory Model and C++ Templ

Postby dobkeratops » Thu May 04, 2017 11:35 pm

Just reading through it... I see 'parallel_for' taking a lambda... looks very interesting. I need to read it again more closely.

If I've understood correctly, this is exactly the sort of thing I was after in earlier brainstorming posts?

I notice details re: how you map the calculations onto the grid... I see a mention of offsets in array indexing.

The application of the CLETE-2 package requires a compiler that correctly implements the C++17 standard specification and also correctly optimizes C++ template partial specializations to produce efficient code. In this work we utilize the GCC 5.4 compiler for targeting the Epiphany processor. We additionally rely on the COPRTHR-2 SDK which provides run-time support for the Epiphany processor including support for fast SPMD direct co-processor execution, without requiring offload semantics or co-design with the ARM CPU on the Parallella platform. As a result, the compilation and run-time environment used in this work resembles that of an ordinary Linux platform with a multi-core processor.


So basically that's the holy grail as I see it: the ability to write portable code that can run on the e-cores, but also on other parallel processors, so long as they're not too dissimilar (write once; run on e-cores, clusters, GPUs...).

What I had in mind was building more elaborate 'higher-order functions' (various combinations of map/gather/filter, etc.) which could express the dataflow, to give the Epiphany implementation more opportunity to leverage the scratchpads/DMA; if I've understood correctly, perhaps those could be built directly (as helper code) on top of what you demonstrate here.

But perhaps this technique is doing all that already through templated types for the indices, with a lot of TMP magic to compile to something efficient.


Is this proprietary (I see 'U.S. Army Research Laboratory')... or can it appear in the SDK? Are you able to put any of this on GitHub?


I still don't have a Parallella myself... I continue to mess with regular GPUs and OpenCL. Knowing the 1024-core chip exists does dramatically increase the motivation to write suitable code for it.
dobkeratops
 
Posts: 159
Joined: Fri Jun 05, 2015 6:42 pm
Location: uk

Re: [Paper] A Distributed Shared Memory Model and C++ Templ

Postby jar » Fri May 05, 2017 5:15 am

I thought you'd like this. Yes, this is similar to what you were brainstorming, but you were ahead of your time. GCC wasn't ready (at least version 4.8 with the older Linux image), nor was some of our software.

And it's not ready for prime time yet. This was an early experiment on Epiphany as a side project from the main effort. There is a lot left to improve, but the intention is to place this on GitHub at some point, and it won't just be for Epiphany. We would like to delay that as long as possible after witnessing what happened with Kokkos -- they released an unfinished product on the DOE in a panic to have some semblance of code portability between their next Xeon Phi and Power/GPU supercomputers. The end result was that many things are completely missing or unrefined, and properly fixing them would break existing codes.

I don't think we want to begin implementing 'higher-order functions', but rather enable expressions to be written that compile to efficient code. But I'll keep it in mind. It's not a library, though it might be considered a header-only library. It actually can't be pre-compiled and shipped as a proprietary package, so it must be open source if anyone is going to use it.

The memory layout accessors vary between architectures and platforms. It's a single line of code appearing in an application header that defines the memory layout, and it has a certain complexity to it. Each platform will have defaults, but it's a memory-layout-first approach to parallel computing. The parallel kernel code remains the same and the expression templates handle the rest. Each platform may have specific optimizations baked into the layout description.

Re: [Paper] A Distributed Shared Memory Model and C++ Templ

Postby dobkeratops » Fri May 05, 2017 2:34 pm

but rather enable expressions to be written that compile to efficient code.


If the underlying template 'magic' does exactly the same job, then great.
I could simply implement my own idea as helper code on top. It sounds like this library is actually more ambitious/general already.

I'm sure it would just take a few examples to make it clear how it works.
