
IALU2 instruction latency

Posted: Sun Feb 08, 2015 3:56 pm
by mikebell
What cycle separation is necessary after an IALU2 instruction (when the FPU is in integer mode) before the next IALU2 or IALU instruction that operates on its result? I can't find a definitive answer to this anywhere.

Re: IALU2 instruction latency

Posted: Sun Feb 08, 2015 11:41 pm
by cmcconnell
I believe it's 3 cycles in both cases. (That's if you set the FPU into truncate mode, making the IALU2 instructions complete one cycle earlier than would otherwise be the case; otherwise the cycle separation would be 4, as described in table 20 of the architecture reference - http://www.adapteva.com/docs/epiphany_arch_ref.pdf.)

I was enquiring about the same subject in a previous thread, which you might find helpful - viewtopic.php?f=23&t=860&p=10474&hilit=truncate#p10434

Re: IALU2 instruction latency

Posted: Sat Feb 21, 2015 8:25 pm
by mikebell
I haven't got around to doing explicit cycle-counting tests, but from general performance it looks like you're right: even in integer mode, where there isn't any rounding, the latency is 3 or 4 cycles depending on the rounding mode.

Could I suggest that the compiler be changed to set truncate mode in the CONFIG register when -mfp-mode=int is specified, unless that has some implication I'm missing? Passing -mfp-mode=truncate as well as -mfp-mode=int doesn't appear to do the right thing.
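
For anyone wanting to set both modes by hand, here is a minimal sketch of the bit manipulation involved. The bit positions are my assumption from reading the architecture reference (RMODE in bit 0, ARITHMODE in bits 19:17 with 0b100 selecting signed integer), and config_int_truncate is just a hypothetical helper name, so please check both against the CONFIG register description before relying on this.

```c
#include <stdint.h>

/* Assumed CONFIG register layout (verify against the architecture reference):
 *   bit 0       RMODE      0 = round to nearest, 1 = truncate
 *   bits 19:17  ARITHMODE  0b000 = float, 0b100 = signed integer
 */
#define CONFIG_RMODE_TRUNCATE  (UINT32_C(1) << 0)
#define CONFIG_ARITHMODE_MASK  (UINT32_C(7) << 17)
#define CONFIG_ARITHMODE_INT   (UINT32_C(4) << 17)

/* Hypothetical helper: given the current CONFIG value, return a value that
 * selects integer arithmetic and truncate rounding, leaving every other
 * field unchanged. */
static uint32_t config_int_truncate(uint32_t config)
{
    config &= ~CONFIG_ARITHMODE_MASK;  /* clear the old arithmetic mode */
    config |= CONFIG_ARITHMODE_INT;    /* select signed integer mode    */
    config |= CONFIG_RMODE_TRUNCATE;   /* select truncate rounding      */
    return config;
}
```

On the device the idea would be to read CONFIG, pass the value through this helper, and write it back once before the integer-heavy code runs (for example with the register read/write routines in e-lib, assuming your SDK provides them). I haven't measured whether this gives the 3-cycle separation in practice.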