Memory transfer benchmark

Moderator: aolofsson

Postby tnt » Thu May 02, 2013 6:43 pm

Has anyone benchmarked transfers between a core and external DRAM?

I thought I had seen some a few months ago but can't find them ...

Cheers,

Sylvain
tnt
 
Posts: 408
Joined: Mon Dec 17, 2012 3:21 am

Re: Memory transfer benchmark

Postby ysapir » Thu May 02, 2013 8:19 pm

Here's the output of my memory access speed test, for E64G4:

Code:
Testing SRAM speed.
Host -> SRAM: Write speed =   17.12 MBps
Host <- SRAM: Read speed  =   20.93 MBps

Testing ERAM speed.
Host -> ERAM: Write speed =  100.83 MBps
Host <- ERAM: Read speed  =  136.66 MBps

Testing chip speed (@ 600Mz)
Core -> SRAM: Write speed = 1949.88 MBps   clocks = 2404
Core <- SRAM: Read speed  =  480.82 MBps   clocks = 9749
Core -> ERAM: Write speed =  304.05 MBps   clocks = 15417
Core <- ERAM: Read speed  =  153.31 MBps   clocks = 30576



and here's for E16G3:

Code:
Testing SRAM speed.
Host -> SRAM: Write speed =   14.62 MBps
Host <- SRAM: Read speed  =   17.85 MBps

Testing ERAM speed.
Host -> ERAM: Write speed =  100.71 MBps
Host <- ERAM: Read speed  =  135.42 MBps

Testing chip speed (@ 600Mz)
Core -> SRAM: Write speed = 1286.01 MBps   clocks = 3645
Core <- SRAM: Read speed  =  406.80 MBps   clocks = 11523
Core -> ERAM: Write speed =  235.88 MBps   clocks = 19872
Core <- ERAM: Read speed  =   85.99 MBps   clocks = 54514
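
The MBps and clocks columns in the core-side results are tied together by a simple relation. As a rough sketch, the figures above are reproduced if one assumes an 8192-byte transfer and 1 MB = 2^20 bytes. Both values are inferred from the numbers and are not stated in the thread, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions inferred from the posted numbers, not stated in the thread. */
#define ASSUMED_TRANSFER_BYTES 8192u
#define BYTES_PER_MB (1024.0 * 1024.0)

/* Convert an elapsed clock count into MB/s for a transfer of `bytes` bytes
 * measured on a core running at `core_hz`. */
double clocks_to_mbps(uint32_t bytes, uint64_t clocks, double core_hz)
{
    assert(clocks > 0 && core_hz > 0.0);
    double seconds = (double)clocks / core_hz;
    return ((double)bytes / seconds) / BYTES_PER_MB;
}
```

For example, clocks_to_mbps(ASSUMED_TRANSFER_BYTES, 2404, 600e6) comes out at roughly 1949.88, which matches the E64G4 "Core -> SRAM" write line.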
ysapir
 
Posts: 393
Joined: Tue Dec 11, 2012 7:05 pm

Re: Memory transfer benchmark

Postby tnt » Thu May 02, 2013 9:30 pm

Thanks, that's consistent with what I get (87 MB/s read, 234 MB/s write).

But that's pretty low; the interface peak is supposed to be around 900 MB/s, right?

Re: Memory transfer benchmark

Postby ysapir » Thu May 02, 2013 10:02 pm

Please note that the host transfer speeds were measured using memcpy() calls (which is how the e_read() and e_write() APIs are implemented). You can probably get better performance using DMA.

Re: Memory transfer benchmark

Postby tnt » Thu May 02, 2013 10:05 pm

I assumed that these:

Core -> ERAM: Write speed = 235.88 MBps clocks = 19872
Core <- ERAM: Read speed = 85.99 MBps clocks = 54514

are done with DMA?

At least they match the speed that I get when doing DMA (assuming MBps is 'megabytes per second' and not 'megabits per second').

Re: Memory transfer benchmark

Postby ysapir » Thu May 02, 2013 11:10 pm

Yes.

Re: Memory transfer benchmark

Postby shodruk » Fri May 03, 2013 1:57 pm

Hmm... Host <-> ERAM bandwidth is strangely slow.

What is the Zynq's DDR configuration (operating frequency, DRAM bus width)?
Shodruky
shodruk
 
Posts: 464
Joined: Mon Apr 08, 2013 7:03 pm

Re: Memory transfer benchmark

Postby ysapir » Fri May 03, 2013 8:56 pm

I added a section to the test that measures the memcpy() speed within application space (virtual memory). It turns out that memcpy() between buffers within the host application (i.e., virtual to virtual space, inside the O/S DRAM segment) achieves speeds of 240 MBps.
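
A measurement of this kind could look roughly like the sketch below. This is not the actual test code from the post; the function name, buffer handling, and use of clock_gettime() are assumptions made for illustration, and MB is taken as 2^20 bytes.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Time `iterations` memcpy() calls of `size` bytes between two heap buffers
 * and return the average throughput in MB/s (1 MB = 2^20 bytes).
 * Returns a negative value if allocation or timing fails. */
double memcpy_mbps(size_t size, unsigned iterations)
{
    assert(size > 0 && iterations > 0);
    unsigned char *src = malloc(size);
    unsigned char *dst = malloc(size);
    if (!src || !dst) { free(src); free(dst); return -1.0; }
    memset(src, 0xA5, size);  /* touch the pages so they are mapped before timing */
    memset(dst, 0, size);

    struct timespec t0, t1;
    volatile unsigned char sink = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned i = 0; i < iterations; i++) {
        memcpy(dst, src, size);
        sink ^= dst[i % size];  /* read back so the copy is not optimized away */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double seconds = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    free(src);
    free(dst);
    if (seconds <= 0.0) return -1.0;
    return ((double)size * iterations / seconds) / (1024.0 * 1024.0);
}
```

Calling memcpy_mbps(1u << 20, 64) copies 1 MB sixty-four times; with buffers larger than the caches the result approximates DRAM-to-DRAM copy speed rather than cache speed.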

This is about 2x the speed of DRAM to/from ERAM transfers (reminder: ERAM here is the segment of the board's DRAM that is dedicated to the Epiphany and not seen by Linux). I am open to explanations of *why* the two operations differ so much in speed.

Looking at the memcpy() disassembly, the copy appears to be done via register reads/writes and not DMA.

Regarding the ZedBoard's spec: according to Roman, the default ZedBoard configuration is a 533 MHz operating frequency and a 32-bit effective DRAM bus width, which means ~2 GBps. However, we will look further into the documentation to see whether our actual settings are different.
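
The ~2 GBps figure follows from multiplying the rate by the bus width in bytes. The sketch below spells out that arithmetic; the function name and the transfers_per_cycle parameter are hypothetical, and the thread does not say whether 533 MHz is a clock rate or a transfer rate, so that factor is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Theoretical peak bandwidth in bytes per second for a memory bus.
 * rate_hz:             operating frequency in Hz
 * bus_bits:            effective bus width in bits (multiple of 8)
 * transfers_per_cycle: 1 for one transfer per cycle, 2 for double data rate */
uint64_t peak_bytes_per_sec(uint64_t rate_hz, unsigned bus_bits,
                            unsigned transfers_per_cycle)
{
    assert(bus_bits > 0 && bus_bits % 8 == 0);
    return rate_hz * (bus_bits / 8u) * transfers_per_cycle;
}
```

With one transfer per cycle, peak_bytes_per_sec(533000000, 32, 1) is 2,132,000,000 bytes/s, i.e. the ~2 GBps quoted above; if 533 MHz were the clock of a double-data-rate interface, passing 2 would double that.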

Re: Memory transfer benchmark

Postby tnt » Fri May 03, 2013 9:17 pm

ysapir wrote: This is about 2x the speed of DRAM to/from ERAM transfers (reminder: ERAM here is the segment of the board's DRAM that is dedicated to the Epiphany and not seen by Linux). I am open to explanations of *why* the two operations differ so much in speed.


The ERAM zone is most likely mapped as non-cacheable, with no prefetch, no write combining, or any of the other things that optimize data access. But having those same things disabled is also what makes it "easy", and why you don't have any cache issues when talking to the Epiphany :)


ysapir wrote: Looking at the memcpy() disassembly, the copy appears to be done via register reads/writes and not DMA.


Yes, userspace wouldn't have any way to control a DMA peripheral anyway, and libc would have to know about the hardware specifics ...

Cheers,

Sylvain

Re: Memory transfer benchmark

Postby shodruk » Thu Jun 20, 2013 4:57 am

Is this understanding of eLink correct?

The eLink packet size is always 104 bits
(data: 32, src_address: 32, dst_address: 32, control: 8).

Writing 32 bits costs 104 bits of bandwidth.

Reading 32 bits costs 208 bits of bandwidth
(104 bits for the request, 104 bits for the response).
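
If that understanding is right, the overhead can be worked out as in the sketch below. It only encodes the packet model described in this post, which is a question and is not confirmed anywhere in the thread; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Packet layout as described in the post (unconfirmed): field widths in bits. */
enum {
    ELINK_DATA_BITS = 32,
    ELINK_SRC_BITS  = 32,
    ELINK_DST_BITS  = 32,
    ELINK_CTRL_BITS = 8,
    ELINK_PACKET_BITS = ELINK_DATA_BITS + ELINK_SRC_BITS
                      + ELINK_DST_BITS + ELINK_CTRL_BITS  /* 104 */
};

/* Link bits consumed by writing `words` 32-bit words: one packet each. */
uint64_t elink_write_bits(uint64_t words)
{
    return words * ELINK_PACKET_BITS;
}

/* Link bits consumed by reading `words` 32-bit words:
 * one request packet plus one response packet each. */
uint64_t elink_read_bits(uint64_t words)
{
    return 2u * words * ELINK_PACKET_BITS;
}

/* Fraction of the link bits that carry payload data. */
double elink_payload_fraction(uint64_t payload_bits, uint64_t link_bits)
{
    assert(link_bits > 0);
    return (double)payload_bits / (double)link_bits;
}
```

Under this model a 32-bit write spends about 31% of its link bits on payload (32 of 104), and a 32-bit read about 15% (32 of 208).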
Shodruky