Inter board communication with Epiphany

Any technical questions about the Epiphany chip and Parallella HW Platform.

Moderator: aolofsson

Postby krmld » Tue Sep 22, 2015 12:53 pm

Hi, all!
I am beginning research on Epiphany chips and have the following questions:

  • Can Epiphany chips be used for "inter board" communication when connected as in the attached image? Can I send a message from board #1 to board #3, so that the CPU on board #3 will process the message?
  • If yes, what is the maximum number of boards that can be connected in this way?
Any pointers to related work are welcome. Thanks in advance!
Attachment: sample.png (13.35 KiB)
krmld
 
Posts: 5
Joined: Wed Sep 02, 2015 12:05 pm

Re: Inter board communication with Epiphany

Postby peteasa » Fri Oct 02, 2015 3:11 pm

I am planning to try this out, once my Porcupine boards plus my second Parallella arrive!
Read the architecture reference (http://www.adapteva.com/docs/epiphany_arch_ref.pdf) and you will see that the maximum grid size is 32 x 32 cores, or 8 x 8 Epiphany chips.
To get the inter-board elink working well you are likely to want to get data on demand from the Epiphany chip to the CPU. At the moment that is just a polled interface, so the ARM cores can poll a message buffer over elink into the Epiphany chip. But soon, if not already, you should be able to use https://github.com/parallella/oh/tree/master/emailbox, which will interrupt the CPU from the Epiphany side.
Also, because the Parallella CPU uses one of the four elink connections and the Parallella board does not have a connection to the second elink connection (http://www.parallella.org/docs/parallella_schematic.pdf), you will be limited to 8 Parallella boards in a row... but wait, there is an FPGA on the board with a lot of spare GPIO pins... so it is not beyond the wit of man to connect the CPU elink connection to a home-grown elink-like connection from the FPGA (perhaps 4 bits wide rather than 8 bits wide). With a bit of HDL work you might be able to get 16 Parallella boards connected up...
You are also likely to need to do a bit of Linux kernel driver work to get the whole thing up and running.
peteasa
 
Posts: 117
Joined: Fri Nov 21, 2014 7:04 pm

Re: Inter board communication with Epiphany

Postby peteasa » Sat Oct 10, 2015 7:09 am

I have just tried this out, read some more, and found that it's not quite as easy as it appears. Here is an old post about this: http://parallella.org/forums/viewtopic. ... gin+in+hdf. My simple test was to wire up two Parallella boards and change the hdf file as follows:
Code:
NUM_CHIPS                       2
CHIP                      E16G301
CHIP_ROW                       32
CHIP_COL                        8
CHIP                      E16G301
CHIP_ROW                       36
CHIP_COL                        8

I tried several different row and column numbers to see what effect they had. I ran the hello_world test application and found that with some combinations, running it on one Parallella would lock it up completely. Running hello_world on the second board then unlocked the first run, so both boards completed the hello_world scan of chips! I still have not got the e_hello_world program to load and run on the remote Epiphany, but at least I have proven that some sort of communication works, if not quite what I was expecting!

The next step is to wire up DSP_XID and DSP_YID on the PEC_POWER connector as suggested in the link. Now that the e-link redesign has been completed this has a chance of working. So I set DSP_YID bit 0 to 1.8 V with a jumper on the South board (i.e. the board with its North connector wired to the South connector of the (32,8) board). Now hello world on the South board uses eCores 0x908 -> 0x9cb and hello world on the North board uses eCores 0x808 -> 0x8cb.

Running a modified hello_world that has a 32 core sequence on the South board for the eCores on the North board results in the South board hanging until e-reset is run on the North board. The North board hello-world runs to completion but does not report anything from the South board....

Note this is different to the earlier behaviour, where the hello-world application only had to attempt to load the remote core (and load, I assume, includes a core reset) to unlock the remote core; now I have to do an e-reset. Hmm, more reading required!

Looked at the character driver in the kernel source and found that the epiphany memory start is hard wired to use #define EPIPHANY_MEM_START 0x80800000UL so eCore values lower than 0x808 would not work.
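The relationship between that start address and eCore 0x808 can be checked with a little bit arithmetic. This is a minimal sketch, assuming the usual Epiphany address layout in which the upper 12 bits of a 32-bit address are the core ID (6 bits of row, 6 bits of column); the helper names are hypothetical and are not part of the kernel driver or e-hal.

```c
#include <assert.h>
#include <stdint.h>

/* Value quoted above from the kernel character driver. */
#define EPIPHANY_MEM_START 0x80800000UL

/* Assumption: the upper 12 bits of a global address are the core ID. */
static inline unsigned core_id_of_addr(uint32_t addr) { return addr >> 20; }

/* Assumption: core ID = 6-bit row followed by 6-bit column. */
static inline unsigned core_row(unsigned id) { return (id >> 6) & 0x3f; }
static inline unsigned core_col(unsigned id) { return id & 0x3f; }

/* Build a core ID from a mesh row and column (hypothetical helper). */
static inline unsigned make_core_id(unsigned row, unsigned col)
{
    assert(row < 64 && col < 64);
    return (row << 6) | col;
}

/* First byte of the 1 MiB window that belongs to a core. */
static inline uint32_t core_base_addr(unsigned id) { return (uint32_t)id << 20; }
```

With these helpers, EPIPHANY_MEM_START decodes to core 0x808, i.e. row 32, column 8, which matches CHIP_ROW 32 / CHIP_COL 8 in the hdf file, and CHIP_ROW 36 / CHIP_COL 8 gives 0x908. Any core with a lower ID would sit below the hard-wired start address.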

Also looking in the e-hal src epiphany-hal.c I can see that there is a function to disable the North, South and West connections run as part of resetting the epiphany system... perhaps they are never enabled after a reset?

If anyone has a suggestion please let me know!
peteasa
 
Posts: 117
Joined: Fri Nov 21, 2014 7:04 pm

Re: Inter board communication with Epiphany

Postby aolofsson » Fri Oct 16, 2015 4:59 pm

Sorry for the slow reply! I am very impressed by your bravery in tackling some of the FPGA and multiboard issues.

The registers touched by the sdk can be found here:
http://www.adapteva.com/docs/e16g301_datasheet.pdf

Notes:
-You will not be able to read a remote Epiphany from each ARM (i.e. the ARM on P#1 can't read from the Epiphany on P#2 in your picture). The return transaction will get lost due to the preference for east/west in the NOC.
-You should be able to do a write from the ARM on P#1 to the Epiphany on P#2 (but that is not very interesting?)
-You should reduce the frequency of the link (300MHz is too fast for most cables). This is done by writing to the IO register.
-You will need to enable the North/South links by writing to the proper IO registers (see SDK example)
-Run with the internal.ldf to simplify things further as a start, otherwise you would need to compile different programs for each board

Hope this helps?
aolofsson
 
Posts: 1005
Joined: Tue Dec 11, 2012 6:59 pm
Location: Lexington, Massachusetts,USA

Re: Inter board communication with Epiphany

Postby peteasa » Fri Oct 16, 2015 7:56 pm

Excellent! Thanks for the tips.

I have been modifying / extending sdk (int e_reset_system(void) --> int e_reset_connected_system(int syscfgrow, int txclkmode, int cclkdivider)) to allow prevention of disabling the North / South elink on the appropriate board and modifying the C-clock and the Tx clock mode for the eMesh. No luck so far but now I have some pointers for other places to read.

One interesting thing was that I thought I would be able to read ESYSCOREID and get the configured core ID row and column. However, this always seemed to show row 32 column 8 when I read it from within e_reset_system(). The modified hello-world application correctly prints out the two different sets of core IDs, so I know that my wire links on the Epiphany chip and the hdf files are correct.
peteasa
 
Posts: 117
Joined: Fri Nov 21, 2014 7:04 pm

Re: Inter board communication with Epiphany

Postby aolofsson » Fri Oct 16, 2015 8:04 pm

The ESYSCOREID is in the FPGA link and is hard coded and not affected by the strap pins. (this is why the printf works)

Remind me which elink code you are using (pointer?)
aolofsson
 
Posts: 1005
Joined: Tue Dec 11, 2012 6:59 pm
Location: Lexington, Massachusetts,USA

Re: Inter board communication with Epiphany

Postby peteasa » Fri Oct 16, 2015 9:38 pm

Ok I missed the ESYSCOREID hard coding bit.. thanks.

A second interesting thing is that the table in the referenced epiphany_arch_ref appears to have a typo. You can see there is a problem because, according to the table, when the Address-Row Tag matches the Mesh-Node Column(!) and the Address-Column Tag matches the Mesh-Node Row(!), the routing direction is Into Mesh Node. Spot the typo: Row and Column appear to have been transposed in the table heading.

The text in paragraph 5.2 Routing Protocol has a similar typo, because it states that the transaction is routed East if the destination-address column tag is less than the column ID of the current router node. This implies that moving East will decrease the node column ID. However, Figure 6 and Figure 8 show that moving East will increase the node column ID.

Assuming that these two observations are indeed typos, I do not understand why the return transaction will get lost. A message from one Parallella board will move to the required destination column, then pass down the South connector to the North connector of the Parallella with the higher row numbers. The reply will pass back to the required source column and then back up out of the North connector into the South connector of the originating Parallella board.

I am using elink - redesign fpga (https://github.com/parallella/parallell ... o/releases) and am using https://github.com/adapteva/epiphany-li ... af2ee5191c .. ie just before the fpga registers were changed.. It appears to work with the fpga that I have. I also plan to update soon to the latest of everything...
peteasa
 
Posts: 117
Joined: Fri Nov 21, 2014 7:04 pm

Re: Inter board communication with Epiphany

Postby aolofsson » Fri Oct 16, 2015 10:25 pm

Thanks for highlighting the typo. Clearly a mistake, can't believe nobody noticed (or they did and they told me and I forgot to fix it...)

Attaching pictures that hopefully show why it won't work. The arrows show the routing path followed.
The algorithm for routing is "go along the row first; when you get a column match, move up/down a column. When row and column match the destination address you are home."
Same algorithm for read requests, writes, and read-return data.
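The rule quoted above can be written down as a small next-hop function. This is a minimal sketch, not the actual router logic: it assumes the orientation discussed earlier in the thread (East means a higher column ID, South means a higher row ID), and the type and function names are hypothetical. It does not model chip edges or where the ARM/FPGA link attaches.

```c
#include <assert.h>

/* Direction in which a transaction leaves the current mesh node. */
typedef enum { HOP_EAST, HOP_WEST, HOP_NORTH, HOP_SOUTH, HOP_LOCAL } hop_t;

/* Column is resolved first (East/West), then row (North/South).
 * Assumed orientation: East = higher column, South = higher row. */
static inline hop_t next_hop(unsigned row, unsigned col,
                             unsigned dst_row, unsigned dst_col)
{
    if (dst_col > col) return HOP_EAST;
    if (dst_col < col) return HOP_WEST;
    if (dst_row > row) return HOP_SOUTH;
    if (dst_row < row) return HOP_NORTH;
    return HOP_LOCAL;   /* row and column match: deliver to this node */
}

/* Follow the rule hop by hop and count the hops taken. */
static inline unsigned route_length(unsigned row, unsigned col,
                                    unsigned dst_row, unsigned dst_col)
{
    unsigned hops = 0;
    assert(row < 64 && col < 64 && dst_row < 64 && dst_col < 64);
    for (;;) {
        switch (next_hop(row, col, dst_row, dst_col)) {
        case HOP_EAST:  col++; break;
        case HOP_WEST:  col--; break;
        case HOP_SOUTH: row++; break;
        case HOP_NORTH: row--; break;
        case HOP_LOCAL: return hops;
        }
        hops++;
    }
}
```

Because East/West always wins while the columns differ, a transaction whose destination column lies outside the current chip leaves through an East or West edge before it ever turns North or South. That is one way to read the remark that the return transaction gets lost.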

Attachment: drawing.png (23.72 KiB)
aolofsson
 
Posts: 1005
Joined: Tue Dec 11, 2012 6:59 pm
Location: Lexington, Massachusetts,USA

Re: Inter board communication with Epiphany

Postby peteasa » Sat Oct 17, 2015 7:15 am

Ahhh! Obvious when you put it like that!

And it works! I used my memorymap test, which gets each core to write its core address into the local memory of every core in the group; the CPU then reads the local memory of each core it can see:
Code:
main:   0: Message from eCore 0x908 ( 0, 0): "0x908: 0x90b 0x909 0xac8 0x948 "
0x908 0x909 0x90a 0x90b 0x948 0x949 0x94a 0x94b 0x988 0x989 0x98a 0x98b 0x9c8 0x9c9 0x9ca 0x9cb 0xa08 0xa09 0xa0a 0xa0b 0xa48 0xa49 0xa4a 0xa4b 0xa88 0xa89 0xa8a 0xa8b 0xac8 0xac9 0xaca 0xacb

So core 0x908 has cores 0x948, 0x90b, 0x909 and 0xac8 next to it, and all the cores in the second Parallella have written their core ID to the appropriate location in core 0x908's local memory. Once I have the interrupt from the Epiphany core to the ARM CPU working I should be able to route any traffic I want from one ARM core to the other ARM core via the Epiphany NOC connections!
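The neighbour list in that message can be reproduced with a little arithmetic. This is a minimal sketch, assuming the 32 cores form one group of 8 rows by 4 columns with its origin at (36, 8) and that neighbours wrap around at the group edges. The helper names are hypothetical and this is not the memorymap test itself.

```c
#include <assert.h>

/* Core ID from mesh row and column (6 bits each). */
static inline unsigned core_id(unsigned row, unsigned col)
{
    assert(row < 64 && col < 64);
    return (row << 6) | col;
}

/* Wrap-around neighbours of group-relative core (r, c) in a group of
 * rows x cols cores whose origin is mesh coordinate (row0, col0).
 * out[] order: previous column, next column, previous row, next row. */
static inline void wrap_neighbours(unsigned row0, unsigned col0,
                                   unsigned rows, unsigned cols,
                                   unsigned r, unsigned c,
                                   unsigned out[4])
{
    assert(r < rows && c < cols);
    out[0] = core_id(row0 + r, col0 + (c + cols - 1) % cols);
    out[1] = core_id(row0 + r, col0 + (c + 1) % cols);
    out[2] = core_id(row0 + (r + rows - 1) % rows, col0 + c);
    out[3] = core_id(row0 + (r + 1) % rows, col0 + c);
}
```

For group-relative core (0, 0) this gives 0x90b, 0x909, 0xac8 and 0x948, in the same order as the message printed for eCore 0x908. The 0xac8 entry is the wrap to the last row of the South board (row 43).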

I am still experimenting but the flow that I use at the moment to control all 32 cores is as follows:
Start a normal 32 core access from the North Parallella board (cores 0x908..., i.e. row 36 in the eMesh). Note that this hangs the first time you run it.
Code:
e_init(NULL);
e_reset_connected_system(36); // to configure the South connector at full rate - I found this is ok with my short ribbon cable
e_get_platform_info(&platform);
e_alloc(&emem, _BufOffset, _BufSize*32);
e_open(&dev, 0, 0, platform.rows, platform.cols);
e_reset_group(&dev);
load_epiphany();// note all cores North and South can be configured by North Parallella
e_start_group(&dev);
read_epiphany(); // note only the local North Epiphany cores can be read by North Parallella
e_close(&dev);
e_free(&emem);
e_finalize();


Once at startup on the South Parallella (cores 0xA08..., i.e. row 40 in the eMesh):
Code:
e_init(NULL);
e_reset_connected_system(40);  // enables North connector at full rate - I found this is ok with my short ribbon cable
e_get_platform_info(&platform);
e_alloc(&emem, _BufOffset, _BufSize*32);
e_open(&dev, 0, 0, platform.rows, platform.cols);
e_reset_group(&dev);
load_epiphany();


Note that once you run this once-only application on the South Parallella, it unblocks the North Parallella and allows it to complete its run.

Then whenever the South Parallella needs to read the South Epiphany chip it can run:
Code:
e_init(NULL);
e_get_platform_info(&platform);
e_alloc(&emem, _BufOffset, _BufSize*32);
e_open(&dev, 0, 0, platform.rows, platform.cols);
read_epiphany(); // note only the local South Epiphany cores can be read by South Parallella
e_free(&emem);


Once this first run is completed the North Parallella is in full control of all 32 cores and can load and run as many times as necessary. Still experimenting but it seems like a useful system.

Job (almost) done!

Thanks,

Peter
peteasa
 
Posts: 117
Joined: Fri Nov 21, 2014 7:04 pm

Re: Inter board communication with Epiphany

Postby aolofsson » Sun Oct 18, 2015 1:24 am

Fantastic! I really enjoy hearing about your progress. We have tested some of this stuff before and Bittware even had a 4 chip product on the market for a while, but still... it's always great to get more validation that this stuff actually works! By the way, how are you hooking the two boards together? Did you make your own cables?
aolofsson
 
Posts: 1005
Joined: Tue Dec 11, 2012 6:59 pm
Location: Lexington, Massachusetts,USA
