Newbie clustering discussion

Postby Landy DeField » Sat May 18, 2013 12:46 pm

My IT manager and I are very interested in clustering with the Parallella board. I would love for this thread to be a place where people new to clustering Parallella boards can ask questions, post links to resources inside and outside of this forum, and share their experiences, successes and failures, with constructive input and without ridicule.
45GHz!!! :O
Landy DeField
 
Posts: 1
Joined: Sat May 18, 2013 12:36 pm
Location: Washington

Re: Newbie clustering discussion

Postby LamsonNguyen » Sun May 19, 2013 5:46 am

LamsonNguyen
 
Posts: 138
Joined: Sun Dec 16, 2012 7:09 pm

Parallella Cluster using Epiphany link (How does it work?)

Postby prodsn » Wed Aug 07, 2013 4:08 am

Would someone explain how an eLink-connected Parallella cluster will work? What is the architecture?
One master board and slaves, with all coprocessors showing as one coprocessor (16 × boards = total number of eCPUs) on the master?
Will the ARM processors, RAM and/or ports cluster too, or at least be accessible in some way?
prodsn
 
Posts: 1
Joined: Wed Aug 07, 2013 3:32 am

Re: Newbie clustering discussion

Postby optimaler » Mon Aug 19, 2013 7:54 pm

I'd also like to hear more about the eLink cables, since I presume the mini-cluster and full-cluster backers are going to be getting a few of these (we are getting those, right?). I'm mostly concerned with designing MPI-type software around the architecture provided by eLink to obviously get the best performance EVAR. I want my cluster to scream as loud as it can.

Here are some questions I have right now:

1) Let's say I have eight Parallella boards and eight eLink cables to go with them. A ring topology for message passing is the obvious option here. Is there a penalty for communicating with boards more than one hop away, or does eLink have some kind of fast pass-through mechanism that skips any processing? This kind of topology has caused me grief trying to get Xeon Phis to play nicely with some of my more serious code.

2) Is a tree topology possible with the eLink cables, or is there some kind of hardware limitation which prevents a board from having more than two connections? (A better question is: are these connecting to the GPIO pins? That obviously limits the hardware connections.)

3) How long are the eLink cables? In order to achieve any of the above-mentioned topologies, we need to be able to span at least two boards mounted on their standoff legs.

4) Related to prodsn's question, what is actually communicating via eLink cables? Parallella only, or will the ARM procs be able to communicate directly over eLink as well? (I actually don't really care that much, although it would be a nice perk).

5) Semi related question: when the Adapteva crew put together the 42-board cluster, did you do any benchmarks on ethernet saturation? I imagine 2-4 boards wouldn't be able to saturate a 1Gb switch, but it might be a concern for more than that (say, eight boards -wink-).*

I eagerly await the response of our fearless designers.

*A follow on to this thought: Thinking of the different MPI message passing styles (point-to-point vs broadcast vs all-to-all), did you test how the Parallella cluster performed in each circumstance? I would expect all-to-all to saturate the ethernet, but not necessarily a broadcast.
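To make the saturation worry concrete, here's the kind of back-of-envelope arithmetic I have in mind. The board count, message size, and interval are made-up example numbers, not Parallella measurements; the point is just how fast all-to-all traffic grows relative to the other patterns:

```python
# Back-of-envelope estimate of aggregate switch load for common MPI
# exchange patterns. All figures are assumed examples, not benchmarks.

def aggregate_gbps(pattern, n_boards, msg_mb, interval_s):
    """Total traffic crossing the switch per interval, in Gbit/s."""
    msgs = {
        "point-to-point": 1,                      # one pair talks
        "broadcast": n_boards - 1,                # root sends to everyone
        "all-to-all": n_boards * (n_boards - 1),  # every ordered pair
    }[pattern]
    return msgs * msg_mb * 8 / 1000 / interval_s

for p in ("point-to-point", "broadcast", "all-to-all"):
    load = aggregate_gbps(p, n_boards=8, msg_mb=10, interval_s=1.0)
    print(f"{p:>15}: {load:5.2f} Gbit/s aggregate")
```

With those toy numbers, all-to-all generates 56 messages per interval versus 7 for a broadcast, which is why I'd expect it to hit the Ethernet ceiling first.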
optimaler
 
Posts: 24
Joined: Mon Dec 17, 2012 3:29 am

Re: Newbie clustering discussion

Postby Janos » Sat Oct 05, 2013 12:14 pm

I would be keen to learn the answers to these questions as well.

Especially an answer on how the eLink cables join things together. Where n is the number of boards, is the resulting cluster n(2 ARM + 16 Epiphany) or 2 ARM + n(16 Epiphany)?
Janos of The Scottish BOINC Team
http://www.tsbt.co.uk
Janos
 
Posts: 51
Joined: Sun Feb 24, 2013 8:31 am

Re: Newbie clustering discussion

Postby timpart » Tue Oct 08, 2013 6:43 am

Here are some unofficial answers based on my understanding. Some terminology:

Epiphany = the chip with 16 (or 64) cores
Zynq = FPGA plus 2 ARM cores
Parallella = the whole board.

The eLink is specific to the Epiphany chip. Each chip has four: North, East, South and West. On the rev 1 Parallella board the North and South links are brought out directly to high-speed connectors and can be joined to other boards using the special eLink cables. The East link goes to the FPGA via some of the latter's GPIO pins and makes the memory on the Epiphany available to the ARMs, and the DRAM on the board usable by the Epiphany. The West eLink is not connected. Unused GPIO and other pins on the Zynq are brought out to other connectors and play no part in the eLink.

The board spec says an eLink has a peak bandwidth of 1.3 gigabytes per second (10.4 gigabits). You will only get this if you transfer a lot of data 64 bits at a time to consecutive memory addresses. For randomly changing addresses the speed is "less than 1/3 of peak". The link can send and receive simultaneously; I'm not sure if the figure quoted refers to the bidirectional rate.
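As a rough illustration of what those figures imply for transfer times (the buffer size is an arbitrary example, not a benchmark, and the random-address rate is taken optimistically at exactly one third of peak):

```python
# Rough transfer-time arithmetic for one eLink, using the 1.3 GB/s
# peak figure quoted above and the "less than 1/3 of peak" caveat
# for random-address traffic. Purely illustrative numbers.

PEAK_GBPS = 1.3              # gigabytes/s, sequential 64-bit bursts
RANDOM_GBPS = PEAK_GBPS / 3  # optimistic bound for random addresses

def transfer_ms(megabytes, gbytes_per_s):
    """Milliseconds to move a buffer of the given size at a given rate."""
    return megabytes / 1000 / gbytes_per_s * 1000

buf = 32  # MB, an arbitrary example buffer
print(f"sequential: {transfer_ms(buf, PEAK_GBPS):.1f} ms")
print(f"random:     {transfer_ms(buf, RANDOM_GBPS):.1f} ms (at best)")
```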

Because of the way the Epiphany routes messages the only possible topology for joining rev 1 Parallellas via eLinks is a linear arrangement, with a maximum of 16 boards (for 16 core Epiphanies). So you would have 256 Epiphany cores and 32 ARM cores. If you can obtain 64 core Epiphanies then the limit is 8 boards but the number of Epiphany cores doubles. I think it is not possible to mix 16 and 64 core boards in the chain without some kind of extra circuitry due to differing voltage levels. I suspect the eLink cables are specific to the board type as well because of pinout differences on the Epiphany generations.
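A minimal sketch of that chain arithmetic, assuming the per-board core counts above (16 Epiphany + 2 ARM) and the 16-board limit; the hop count is my own simplification of "a message passes through every intermediate board":

```python
# Sketch of the linear-chain topology: boards joined North-to-South,
# messages traversing every board in between. Core counts per board
# and the 16-board limit are from the discussion above; the hop model
# is a simplifying assumption.

def cluster_cores(n_boards, epiphany_per_chip=16, arm_per_board=2):
    """Total (Epiphany, ARM) cores in a chain of n boards."""
    assert n_boards <= 16, "chain limit for 16-core Epiphanies"
    return n_boards * epiphany_per_chip, n_boards * arm_per_board

def hops(src_board, dst_board):
    """Boards a message traverses between two positions in the chain."""
    return abs(dst_board - src_board)

eph, arm = cluster_cores(16)
print(eph, arm)     # 256 Epiphany cores, 32 ARM cores
print(hops(0, 15))  # worst case in a 16-board chain: 15 hops
```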

I don't think ARM core communication via eLink is supported. (If an ARM sent data into the mesh with the address of another ARM, the mesh routing protocol would send it back the way it had come.) The boards do have Gigabit Ethernet though.

There is no eLink shortcut. A message has to pass through all intermediate boards in the chain, though I don't know how long this takes. On the plus side there is a dedicated xMesh for off-chip writes, so data exchange within the chip isn't slowed. I can't comment on how saturated Ethernet communication gets, as I don't have access to that kind of setup.

Andreas did give some extra information to clarify the Architecture manual.

Hope this helps,

Tim
timpart
 
Posts: 302
Joined: Mon Dec 17, 2012 3:25 am
Location: UK

Re: Newbie clustering discussion

Postby Janos » Tue Oct 08, 2013 2:52 pm

Nice one, thank you Tim.
Janos of The Scottish BOINC Team
http://www.tsbt.co.uk
Janos
 
Posts: 51
Joined: Sun Feb 24, 2013 8:31 am

Re: Newbie clustering discussion

Postby optimaler » Wed Oct 09, 2013 7:25 pm

Yeah, indeed, thanks, Tim. Some of us are just too lazy to read the specs in the manual, but it also seems like it took the community a while to collect the info on the architecture. Since Adapteva is still working on the eLink connectors, it's actually somewhat irrelevant at the moment anyway (although I was told in email that the Epiphany cables could be added if they finish them in time to ship with the Kickstarter clusters).
optimaler
 
Posts: 24
Joined: Mon Dec 17, 2012 3:29 am

Re: Newbie clustering discussion

Postby manklee » Mon Feb 02, 2015 7:08 am

manklee
 
Posts: 24
Joined: Sun Jun 29, 2014 10:06 pm

