Questions about the Matmul16 example

Any technical questions about the Epiphany chip and Parallella HW Platform.

Moderator: aolofsson

Postby Richardye » Sun Nov 15, 2015 9:55 pm

Hi guys,

Recently I have been reading several code examples, but I cannot understand the purpose of one statement in the Matmul16 example.

Code:
do {
   if (me.corenum == 0)
   {
      // Wait for matmul() call from host. When a rising
      // edge is detected in the mailbox, the loop is
      // terminated and a call to the actual matmul()
      // function is initiated.
      while (Mailbox.pCore->go == 0) {};

      Mailbox.pCore->ready = 0;
   }

   // Sync with all other cores
   e_barrier(barriers, tgt_bars);

   // Calculate. During this time, the host polls the
   // shared mailbox, waiting for a falling edge that
   // indicates the end of the calculation.
   bigmatmul();

   // Sync with all other cores
   e_barrier(barriers, tgt_bars);

   if (me.corenum == 0)
   {
      // Signal End-Of-Calculation to the host.
      Mailbox.pCore->go    = 0;
      Mailbox.pCore->ready = 1;
   }
} while (0);


This is the main function code for the Epiphany side. Why do we need the if (me.corenum == 0) check at the very beginning? I hope someone can help me with this. Thank you very much!

Best Regards,
Richard
Richardye
 
Posts: 8
Joined: Mon Sep 07, 2015 6:30 pm

Re: Questions about the Matmul16 example

Postby sebraa » Mon Nov 16, 2015 1:44 pm

You can easily synchronize all cores on the Epiphany chip. But since the bandwidth between the host and the Epiphany chip is limited, it makes more sense to synchronize only a single core with the host, and have that core then synchronize with all the other cores inside the Epiphany system.
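
To illustrate the handshake described above, here is a minimal sketch in plain C. It only models the go/ready flags in ordinary memory; it does not use the Epiphany SDK, and the names mailbox_t, host_start, host_done, sync_core_accept and sync_core_finish are made up for this illustration. The real Matmul16 mailbox struct may look different.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mailbox layout, modelled on the `go` and `ready`
 * fields seen in the example. Not the actual Matmul16 definition. */
typedef struct {
    volatile int go;     /* host sets this to 1 to request a run      */
    volatile int ready;  /* device sets this to 1 when it is finished */
} mailbox_t;

/* Host side: raise `go` (the "rising edge" the device waits for). */
static void host_start(mailbox_t *mb)
{
    mb->go = 1;
}

/* Host side: true once the device has dropped `go` and raised `ready`. */
static bool host_done(const mailbox_t *mb)
{
    return mb->go == 0 && mb->ready == 1;
}

/* Device side, executed by the ONE designated core only.
 * Returns false while no request is pending (the real code spins here). */
static bool sync_core_accept(mailbox_t *mb)
{
    if (mb->go == 0)
        return false;
    mb->ready = 0;       /* results are no longer valid */
    return true;
}

/* Device side, designated core only, called after the second barrier. */
static void sync_core_finish(mailbox_t *mb)
{
    assert(mb->go == 1); /* a run must have been requested */
    mb->go    = 0;       /* the "falling edge" the host polls for */
    mb->ready = 1;
}
```

All other cores never touch the mailbox. They simply wait in e_barrier() until the designated core has seen the request, so only one core generates traffic towards the host.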
sebraa
 
Posts: 495
Joined: Mon Jul 21, 2014 7:54 pm

Re: Questions about the Matmul16 example

Postby Richardye » Wed Nov 18, 2015 9:32 am

Hi sebraa,

Thank you very much! So you mean that it would work the same if we used me.corenum == 1 or 2 or 3 ... instead of me.corenum == 0, since we only need one core to be synchronized with the host and that core doesn't need to be core 0? That is my understanding. If anything is wrong, I hope you can tell me. Thank you all the same!

Best Regards,
Richard

Re: Questions about the Matmul16 example

Postby sebraa » Thu Nov 19, 2015 2:56 pm

Richardye wrote: Since we only need one core to be synchronized with the host and that core doesn't need to be core 0?
Exactly.
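
As a small sketch of that idea, the core number could be pulled out into a single constant so it is easy to change. HOST_SYNC_CORE, NUM_CORES and the helper functions are hypothetical names for this illustration, and the value 5 is arbitrary; none of this is taken from the Matmul16 sources.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CORES      16u  /* assumed: 16-core Epiphany, as in Matmul16 */
#define HOST_SYNC_CORE  5u  /* arbitrary choice; any value 0..15 works   */

/* True only for the single core that talks to the host mailbox. */
static bool is_host_sync_core(unsigned corenum)
{
    assert(corenum < NUM_CORES);
    return corenum == HOST_SYNC_CORE;
}

/* Sanity helper: how many cores consider themselves the sync core. */
static unsigned count_host_sync_cores(void)
{
    unsigned count = 0;
    for (unsigned c = 0; c < NUM_CORES; ++c)
        if (is_host_sync_core(c))
            ++count;
    return count;
}
```

In the example, both if (me.corenum == 0) checks would then become if (is_host_sync_core(me.corenum)). The important part is that both checks pick the same core and that exactly one core matches.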

Re: Questions about the Matmul16 example

Postby cmcconnell » Thu Nov 19, 2015 6:10 pm

I guess it makes sense to pick one of the cores 0 - 3, to avoid the traffic having to pass through other cores on the way to its destination.
Colin.
cmcconnell
 
Posts: 99
Joined: Thu May 22, 2014 6:58 pm

Re: Questions about the Matmul16 example

Postby sebraa » Fri Nov 20, 2015 12:53 am

As far as I understand it, the external memory is located next to core 15 (bottom right).

Re: Questions about the Matmul16 example

Postby cmcconnell » Fri Nov 20, 2015 3:11 am

sebraa wrote:As far as I understand it, the external memory is located next to core 15 (bottom right).

Oops. Sorry. Looks like you're right. (Described pictorially in this post - viewtopic.php?f=13&t=1226&start=10#p7770 )

For some reason I had it in my head that the off-chip traffic went North, whereas in fact it goes East (and South before it leaves the Epiphany chip).

I've never entirely got my head around this subject. Was my basic premise a sound one? (i.e. that there could be an efficiency gain to be made by making the right choice of core to do the off-chip comms, which would be core 15.)

I'm not specifically thinking of this matmul startup example, but the general case of an application which designates one Epiphany core as the 'boss' core which handles all the off-chip comms, at the same time as inter-core comms within the chip may be occurring.
Colin.

Re: Questions about the Matmul16 example

Postby sebraa » Fri Nov 20, 2015 3:33 pm

Take what I write now with a grain of salt. I haven't tested it.

Inside the Epiphany, bandwidth is generally sufficient (our problem is memory size and, as a consequence of too little buffering, latency, not data throughput). To my understanding, the NoC routers queue at most one transaction per direction. So, if your top-left core writes many packets to shared memory, these transactions will queue up at each affected intersection and basically kill all other traffic on that path. So it is probably beneficial to designate the bottom-right core for off-chip (shared memory) transactions.

Re: Questions about the Matmul16 example

Postby aolofsson » Fri Nov 20, 2015 4:48 pm

All 4 cores on the right side are equidistant from DRAM. They all go through one 4:1 mux before going out on the eLink to the FPGA.
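
A rough way to picture this is to count how many mesh hops a core's off-chip traffic needs before it reaches the east edge. The sketch below assumes a 4x4 mesh with row-major core numbering (core 0 top-left, core 15 bottom-right), which matches how the cores are described in this thread; the function names are hypothetical, and this is not a model of actual latency.

```c
#include <assert.h>

#define MESH_ROWS 4u  /* assumed 4x4 mesh for the 16-core chip */
#define MESH_COLS 4u

/* Row of a core under the assumed row-major numbering. */
static unsigned core_row(unsigned corenum)
{
    assert(corenum < MESH_ROWS * MESH_COLS);
    return corenum / MESH_COLS;
}

/* Column of a core under the assumed row-major numbering. */
static unsigned core_col(unsigned corenum)
{
    assert(corenum < MESH_ROWS * MESH_COLS);
    return corenum % MESH_COLS;
}

/* Eastward hops needed to reach the right-hand column, where all
 * four cores feed the same 4:1 mux towards the eLink. */
static unsigned hops_to_east_edge(unsigned corenum)
{
    return (MESH_COLS - 1u) - core_col(corenum);
}
```

Under these assumptions, cores 3, 7, 11 and 15 all need zero eastward hops, while cores 0, 4, 8 and 12 need three, and each of those hops crosses a router that other cores' traffic may also be using.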
aolofsson
 
Posts: 1005
Joined: Tue Dec 11, 2012 6:59 pm
Location: Lexington, Massachusetts,USA

Re: Questions about the Matmul16 example

Postby sebraa » Mon Nov 23, 2015 2:14 pm

Ah, thanks for the clarification!
