timpart wrote:

tnchan wrote: Do we have newer FFT algorithm or codes for E16 by now? Cheers, TN

I think some kind of large FFT algorithm is under development because of the Birthday Beer Challenge. It might not be general enough for your individual requirements though.

Tim

Interesting, the "Birthday Beer challenge" was actually inspired by a radio astronomy application as well (Einstein@Home Binary Pulsar Search). I'm pretty confident that any code that would come from this challenge should be applicable in principle to tnchan's problem as well, after some modification. I think the problem that the challenge will address is how to efficiently combine smaller FFTs (computed very fast on the Epiphany with 16 cores in parallel) to compute longer FFTs with a minimum of overhead caused by memory transfers.

HBE

Statistics: Posted by Bikeman — Mon Jun 02, 2014 3:01 pm


tnchan wrote:

Do we have newer FFT algorithm or codes for E16 by now? Cheers, TN

I think some kind of large FFT algorithm is under development because of the Birthday Beer Challenge. It might not be general enough for your individual requirements though.

Tim

Statistics: Posted by timpart — Mon Jun 02, 2014 12:40 pm


Hi Yaniv, thanks again for the explanation. My intended application of the Epiphany 16 cores is radio astronomy signal channelisation by FFT (from time domain to spectrum). The data set for the FFT in my application is one-dimensional: 2^18 (262,144) points of 8-bit + 8-bit data. Applying the formula from your white paper on 2D image filtering, I found the max point size for my application to be 2^16 = 65,536 without using external memory, so I would need to wait for the release of a 4096-core Epiphany (assuming 8KB per memory bank in each core) to do the job entirely on-chip. Can you confirm that my calculations are correct? If I use external memory and accept the memory latency, how do I estimate the time taken? And where can I find a sample FFT for real-life testing on the Epiphany 16?
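As a rough sanity check of that calculation, here is a back-of-the-envelope sketch in Python. The figures are assumptions, not confirmed Epiphany specs: 32 KB of local memory per core (4 banks of 8 KB, as in the post above), 2 bytes per complex sample (8-bit real + 8-bit imaginary), and half of each core's memory reserved for code, stacks and twiddle tables. The function name is illustrative only.

```python
# Hypothetical on-chip memory check. All constants are assumptions:
# 32 KB local memory per core, 2 bytes per complex sample, and an
# arbitrary 50% of memory left usable for data after code and stacks.
def fits_on_chip(n_points, n_cores, mem_per_core=32 * 1024,
                 bytes_per_sample=2, usable_fraction=0.5):
    usable = n_cores * mem_per_core * usable_fraction
    return n_points * bytes_per_sample <= usable

print(fits_on_chip(2**16, 16))    # 2^16 points on 16 cores  -> True
print(fits_on_chip(2**18, 16))    # 2^18 points on 16 cores  -> False
print(fits_on_chip(2**18, 4096))  # same data, 4096 cores    -> True
```

Under these assumptions the 2^16 on-chip limit and the need for either external memory or many more cores at 2^18 points both come out as stated, but the usable fraction in particular is a guess and should be replaced with real numbers from the white paper.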

Statistics: Posted by tnchan — Thu May 08, 2014 7:31 am


It was quite some time ago, but IIRC the current Epiphany chip can do up to a 1024-point FFT. Thus, if you allow DRAM access, you could process a 1024x1024 image in (1024/16 =) 64 batches. Furthermore, one can perform a 1D FFT using a 2D FFT engine. This means that, by adding a simple intermediate stage, you could use the given method to perform a 128x128 = 16K-point FFT in similar time.
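To make that "simple intermediate stage" concrete, here is a hedged NumPy sketch of the standard four-step decomposition: a long 1D FFT of length N = n1*n2 is computed as n2 column FFTs of length n1, a twiddle-factor multiply, and n1 row FFTs of length n2, which is exactly the shape of work a row/column 2D FFT engine performs. The function name and parameters are illustrative, not actual Epiphany code.

```python
import numpy as np

def fft_via_2d(x, n1, n2):
    # Four-step algorithm: an N-point 1D FFT (N = n1 * n2) built from
    # two passes of shorter FFTs plus a twiddle multiply in between.
    a = x.reshape(n1, n2)               # row r holds x[r*n2 : (r+1)*n2]
    a = np.fft.fft(a, axis=0)           # n2 FFTs of length n1 (columns)
    rows = np.arange(n1).reshape(n1, 1)
    cols = np.arange(n2).reshape(1, n2)
    a *= np.exp(-2j * np.pi * rows * cols / (n1 * n2))  # twiddle factors
    a = np.fft.fft(a, axis=1)           # n1 FFTs of length n2 (rows)
    return a.T.reshape(-1)              # transpose, read out row-major
```

Checked against `np.fft.fft`, this reproduces the full-length transform; on real hardware the transpose/read-out step is where the memory-transfer overhead discussed in this thread would live.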

Statistics: Posted by ysapir — Sun Apr 27, 2014 7:55 pm


theover wrote:

you could use FPGA-fabric, or Xilinx' FPGA IP blocks which implement various efficient FFT blocks, also available with the Free Webpack design tools

Unless you really know what you are doing, I highly discourage this option.

Statistics: Posted by Gravis — Sun Apr 27, 2014 5:05 pm


Essentially you have a number of options, depending on what type of FFT you want to fit in the chip's limited real estate. I was looking for a replacement for the PC power that's available in the open-source FFTW3 library, which compiles on modern Intel CPUs and is also available for CUDA (on NVIDIA's GPUs).

The ARM cores can do FFT computations, possibly accelerated by NEON parallel processing (I don't know whether both cores have it). You could use FPGA fabric, or Xilinx's FPGA IP blocks, which implement various efficient FFT blocks and are also available with the free WebPack design tools. And of course, the 16 or 64 Parallella cores should be usable to lightly parallelize FFT computations of various dimensionality, speed and size.

T.V.

Statistics: Posted by theover — Sun Apr 27, 2014 3:31 pm


Hi Yaniv, your paper is very informative and a pleasure to read, even though I do not have any computer science training or background. Basically I understand the flow of the information and logic, and am impressed with the speed of 1.5 ms for filtering a 128x128-pixel picture with a 2D FFT on 16 cores. Regarding the formula for working out the max FFT point size (the picture or image size) for a given hardware configuration, I presume the formula applies to 2D FFT. Is this true? And will the max FFT point count for 1D be the square of the 2D size? That is, 128 x 128 for 2D = 2^7 x 2^7 = 2^14 points for 1D. Thank you for your help. TN

Statistics: Posted by tnchan — Fri Apr 18, 2014 5:45 am


http://www.adapteva.com/white-papers/us ... hancement/

Statistics: Posted by ysapir — Fri Apr 11, 2014 1:13 pm


I was looking through FFT libraries a while back, and the Fastest Fourier Transform in the South caught my eye, as it has a BSD license and can already generate code for ARM as well as Intel. Perhaps it would be a relatively small step to do Epiphany as well. The algorithms used split the task down recursively, so I'm hoping the work could be split between cores relatively easily.

There are of course many other fine FFT libraries around. I'm not aware of any library being ported to the Epiphany yet. (Corrections welcome!)
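To illustrate the kind of recursive splitting meant here, this is a minimal radix-2 Cooley-Tukey sketch in Python (purely illustrative; it is not FFTS code or Epiphany code):

```python
import cmath

def fft_recursive(x):
    # Radix-2 Cooley-Tukey FFT; len(x) must be a power of two.
    n = len(x)
    if n == 1:
        return list(x)
    # The even- and odd-indexed halves are independent sub-problems,
    # which is the property that could let sub-transforms be farmed
    # out to separate cores.
    even = fft_recursive(x[0::2])
    odd = fft_recursive(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

The two recursive calls at each level share no data until the final butterfly loop, so in principle each could run on a different core, with the combining stage paying the inter-core communication cost.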

Tim

Statistics: Posted by timpart — Fri Apr 11, 2014 11:44 am
