32K per Epiphany Core.

Any technical questions about the Epiphany chip and Parallella HW Platform.

Moderator: aolofsson

Postby keithsloan52 » Sat Mar 08, 2014 3:43 pm

For the 16-core Epiphany III, the memory per core works out at 32K (512K / 16). The same goes for the upcoming 64-core Epiphany IV (2048K / 64 = 32K). Would I be right in assuming that if 32K is not enough for your application, you could run with fewer processors and share out the total memory (512K on Epiphany III, 2048K on Epiphany IV) between the processors that do run?

It would be interesting to know which applications, if any, hit the 32K per core limit and need a future Epiphany chip with more memory per core.
keithsloan52
 
Posts: 17
Joined: Fri Mar 07, 2014 9:22 am

Re: 32K per Epiphany Core.

Postby timpart » Sat Mar 08, 2014 4:35 pm

Each core has 32K of local memory and can also access the memory of another core, but with a time penalty: writes take 1.5 cycles per hop and reads take 9.5 cycles per hop (more if there is congestion on the network). It is possible to execute code held on another core; the instructions are fetched 8 bytes at a time and split up into individual instructions. (An instruction is two or four bytes long.)
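To put rough numbers on that penalty, here is a minimal sketch that multiplies the per-hop figures quoted above by a hop count. It assumes the hop count is the Manhattan distance between two cores' (row, col) positions in the mesh and ignores congestion and any fixed overhead; the function names are made up for illustration and are not part of any SDK.

```python
# Rough remote-access cost model using the per-hop figures from the post above.
# Assumption: hop count = Manhattan distance between (row, col) mesh positions.
WRITE_CYCLES_PER_HOP = 1.5
READ_CYCLES_PER_HOP = 9.5


def hops(src, dst):
    """Number of mesh hops between two (row, col) core positions (assumed)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def remote_access_cycles(src, dst, is_read):
    """Estimated cycles for one remote access, ignoring network congestion."""
    per_hop = READ_CYCLES_PER_HOP if is_read else WRITE_CYCLES_PER_HOP
    return hops(src, dst) * per_hop


if __name__ == "__main__":
    # Opposite corners of a 4x4 grid such as the 16-core part.
    a, b = (0, 0), (3, 3)
    print("hops:", hops(a, b))
    print("write cycles:", remote_access_cycles(a, b, is_read=False))
    print("read cycles:", remote_access_cycles(a, b, is_read=True))
```

Under these assumptions a corner-to-corner read on a 4x4 grid costs several times more than the equivalent write, which is why pushing data to the core that needs it tends to beat pulling it.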

It will always be quicker to execute code from the local memory. I understand there may be a code overlay system under development to load chunks of code into the local core when needed.

As for data, it is important to note that the 32K areas of memory are not contiguous in the memory map. So you couldn't define an array that was 40K of consecutive locations and expect it to work properly.
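The gap between the 32K areas can be illustrated with a small address model. This is only a sketch: it assumes each core owns a 1 MB window of the global address space, that the window is selected by a core ID built from a 6-bit row and a 6-bit column, and that the first core of the 16-core chip sits at row 32, column 8. Check those values against the architecture reference before relying on them; the helper names are hypothetical.

```python
# Sketch of why a 40K array cannot span the local memories of adjacent cores.
# Assumptions (verify against the Epiphany architecture reference):
#   - each core owns a 1 MB window of the global address space
#   - core ID = (row << 6) | col, placed in the upper address bits
#   - the first core of the 16-core chip is at row 32, column 8
LOCAL_MEM_BYTES = 32 * 1024
CORE_WINDOW_BYTES = 1 << 20


def core_base(row, col):
    """Global base address of a core's window under the assumed layout."""
    core_id = (row << 6) | col
    return core_id << 20


def in_local_memory(addr):
    """True if addr falls inside the 32K local memory of the core that owns it."""
    return (addr & (CORE_WINDOW_BYTES - 1)) < LOCAL_MEM_BYTES


if __name__ == "__main__":
    start = core_base(32, 8)
    last_byte_of_40k = start + 40 * 1024 - 1
    print(hex(start), hex(core_base(32, 9)))
    print("first 32K backed by local memory:", in_local_memory(start + LOCAL_MEM_BYTES - 1))
    print("last byte of a 40K array backed:", in_local_memory(last_byte_of_40k))
```

In this model the bytes past the first 32K do not spill into the neighbouring core's memory; they land in the unused part of the same core's window, because the neighbour's memory starts a full 1 MB later.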

Each core can run a different program, so personally I'd try to split a big problem up that way if possible.

Tim
timpart
 
Posts: 302
Joined: Mon Dec 17, 2012 3:25 am
Location: UK

Re: 32K per Epiphany Core.

Postby bobdvb » Tue Mar 11, 2014 2:31 pm

I was thinking about this. The speed of each processor is comparatively fast, so while remote reads are slow, you just have to accept the risk of sitting idle. Would it not be possible to create a bitwise mask of the memory locations and divide the entire on-chip memory in half, with one half being system memory and the other half being chip memory? That way, with 256kB of RAM, you could have something actually running on the chipset rather than just dispatched from the host board.

Another angle: couldn't one of the CPUs run a control node, dispatcher and MMU from the shared RAM, and then treat the CPU-dedicated RAM as cache/registers? It could then ask the host FPGA to share its MMU using something like RDMA (but not actually RDMA). This way the CPUs could have access to much more memory and resources, acting as a host instead of a slave.
bobdvb
 
Posts: 3
Joined: Mon Dec 17, 2012 3:25 am

