Parallella Community • View topic - Very Fast Fourrier Transform

Very Fast Fourrier Transform

Postby tnt » Mon Aug 03, 2015 8:37 pm

tnt
 
Posts: 408
Joined: Mon Dec 17, 2012 3:21 am

Re: Very Fast Fourrier Transform

Postby aolofsson » Mon Aug 03, 2015 9:13 pm

So your routine on a simple scalar Epiphany core at 600MHz runs 39% faster than FFTW running on a 4-way A9 core at 667MHz?
I'd say that's pretty darn impressive! :D

Andreas
aolofsson
 
Posts: 1005
Joined: Tue Dec 11, 2012 6:59 pm
Location: Lexington, Massachusetts,USA

Re: Very Fast Fourrier Transform

Postby tnt » Tue Aug 04, 2015 7:43 am

Yeah, I'm pretty happy about it.

The big advantages of the Epiphany in this case are:
- Large register file: except for FFT data load/store, there is no memory access for temporary results. Despite pipelining the loop and processing 4 data points per iteration (2 radix-2 ops in parallel), I only ever use registers, and only the caller-saved ones at that, so I don't even need to save/restore them.
- BITR opcode: infinitely useful for this :p
- Easy-to-predict low-level behavior: because I can understand exactly how the CPU will execute stuff, I can tailor the operations manually much better. Optimizing for ARM (or even worse, Intel) has so many rules to follow that I can't keep them all in my head ...
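For readers wondering why a bit-reverse opcode matters for an FFT: a radix-2 FFT needs its data in bit-reversed index order at the input or the output, and a hardware bit reverse turns that index computation into one instruction plus a shift. The sketch below is a portable C stand-in and not tnt's code. The function names `bitrev32` and `fft_bitrev_index` are made up for illustration, and it assumes the hardware instruction reverses all 32 bits of a register.

```c
#include <stdint.h>

/* Reverse all 32 bits of x in software.
 * Portable stand-in for what a hardware bit-reverse instruction
 * is assumed to do in a single operation. */
static uint32_t bitrev32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);  /* swap adjacent bits */
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);  /* swap bit pairs */
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);  /* swap nibbles */
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);  /* swap bytes */
    return (x >> 16) | (x << 16);                             /* swap half-words */
}

/* Bit-reversed position of index i in an FFT of 2^log2n points.
 * log2n must be in the range 1..32 (a shift by 32 is undefined in C). */
static uint32_t fft_bitrev_index(uint32_t i, unsigned log2n)
{
    return bitrev32(i) >> (32u - log2n);
}
```

For an 8-point FFT (`log2n` = 3), indices 0..7 map to 0, 4, 2, 6, 1, 5, 3, 7. Without a dedicated opcode, each index costs the five mask-and-shift steps above or a lookup table in memory.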

The next step will probably be to extend this to larger FFTs using multiple cores. (The current one is local-memory only, so you can do at most 2048 points, but more realistically 1024 when using double-buffering.)
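The point limits can be sanity-checked with some quick arithmetic. This is a rough sketch under two assumptions that are not stated in the thread: each core has 32 KB of local memory, and each FFT point is a single-precision complex value (two 4-byte floats). The names `LOCAL_MEM_BYTES`, `cfloat` and `fft_buffer_bytes` are hypothetical.

```c
#include <stddef.h>

/* Assumption: 32 KB of local memory per core. */
enum { LOCAL_MEM_BYTES = 32 * 1024 };

/* Assumption: one FFT point is a single-precision complex number. */
typedef struct { float re, im; } cfloat;

/* Bytes needed to hold `buffers` data buffers of `points` points each. */
static size_t fft_buffer_bytes(size_t points, size_t buffers)
{
    return points * buffers * sizeof(cfloat);
}
```

Under those assumptions, one 2048-point buffer takes 16 KB, which is half the local memory, with code, twiddle factors and stack sharing the rest. Double-buffering 2048 points would take all 32 KB, while two 1024-point buffers fit in the same 16 KB as a single 2048-point one.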
tnt
 
Posts: 408
Joined: Mon Dec 17, 2012 3:21 am

Re: Very Fast Fourrier Transform

Postby aolofsson » Tue Aug 04, 2015 12:26 pm

That's great to hear! Looking forward to your input in the following topic.

viewtopic.php?f=23&t=3127
aolofsson
 
Posts: 1005
Joined: Tue Dec 11, 2012 6:59 pm
Location: Lexington, Massachusetts,USA

