Parallella Community

by **upcFrost** » Mon Jul 03, 2017 4:27 pm

Spent quite a lot of time getting the Load/Store optimization pass to work as it should. A bunch of bugs might still pop up.

Not exactly sure how to proceed with volatile memory references. On one hand, the compiler should not touch those. On the other hand, merging two loads or stores does not change the logic flow itself (it should be preserved in any case, volatile or not). Maybe I'll just add an extra flag.
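To illustrate the "does not change the logic flow" argument for ordinary (non-volatile) memory, here is a minimal sketch in C. It is not code from the compiler; the function names are made up for illustration, and it assumes a little-endian target (as Epiphany is). It only shows that two adjacent 32-bit stores and one 64-bit store leave the same bytes in memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two separate 32-bit stores to adjacent words (hypothetical helper). */
static void store_two_words(unsigned char *buf, uint32_t lo, uint32_t hi)
{
    assert(buf != NULL);
    memcpy(buf, &lo, sizeof lo);
    memcpy(buf + 4, &hi, sizeof hi);
}

/* One 64-bit store covering both words. Assumes little-endian byte order,
 * so the low word lands at the lower address. */
static void store_one_dword(unsigned char *buf, uint32_t lo, uint32_t hi)
{
    assert(buf != NULL);
    uint64_t both = ((uint64_t)hi << 32) | lo;
    memcpy(buf, &both, sizeof both);
}

/* Returns 1 when both store sequences leave identical bytes in memory. */
static int same_final_memory(uint32_t lo, uint32_t hi)
{
    unsigned char a[8], b[8];
    store_two_words(a, lo, hi);
    store_one_dword(b, lo, hi);
    return memcmp(a, b, sizeof a) == 0;
}
```

This equivalence covers only the final memory contents. It says nothing about how many accesses happen or how wide they are, which is the part that matters for volatile.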

by **jar** » Wed Jul 05, 2017 3:39 pm

by **upcFrost** » Thu Jul 06, 2017 2:19 pm

by **sebraa** » Thu Jul 06, 2017 3:15 pm

I don't think you can merge two word-writes into a dword-write.
For volatile variables, each access must happen exactly in the way specified.

by **GreggChandler** » Thu Jul 06, 2017 3:40 pm

by **jar** » Thu Jul 06, 2017 3:56 pm

by **GreggChandler** » Thu Jul 06, 2017 5:22 pm

by **jar** » Thu Jul 06, 2017 9:09 pm

by **upcFrost** » Tue Jul 11, 2017 3:50 pm

Callee-saved regs now use paired loads and stores. Optimized the scheduler a bit. Overall, matmul-16 currently takes 158 ms against 130 ms with gcc, purely because of suboptimal scheduling. Will work on it tomorrow.
Also found a couple of bugs in the Load/Store optimization pass and fixed all but one.

Gregg, about volatile access - that's exactly as @jar said. It is possible to implement (basically skipping one "if-then" case), but it might, and in some cases will, break the code.
Still, in some cases it can improve performance quite a bit. I'm planning to add a flag to allow it, but it will be set to false by default.
So the default behavior is not to touch volatile ldr/str at all, just as in the spec.

by **upcFrost** » Mon Sep 11, 2017 8:53 pm

Took a long vacation to switch jobs and move to a different country.

On the compiler side - working on vectorization, or strd/ldrd to be precise, plus i64/f64 support. At the moment most tests build fine, and arithmetic actually runs a bit faster than with GCC (not by much, as I'm still using precompiled libs from the gcc bundle). I want to fix the matmul-16 test, which fails on memory mapping (sections overlap), before moving on.

Another big question is CI and distribution. Travis has a build time limit of 1 hour, which is not enough to build the full LLVM stack, and a ccache size limit of 512 MB, which is not enough to use ccache with an LLVM build. Semaphore has a disk quota, which is also too small to fit the build. Will try to look for a solution, or maybe I'll just mail Semaphore or Travis and ask them to raise the quota for my build (IIRC they're OK with doing that for FOSS projects).

Current status
