Current status

Moderator: Hoernchen

Re: Current status

Postby upcFrost » Mon Jul 03, 2017 4:27 pm

Spent quite a lot of time trying to make the Load/Store optimization pass work as it should. Still a bunch of bugs might pop up.

Not exactly sure how to proceed with volatile memory references - on one hand, the compiler should not touch those. On the other hand, merging two loads or stores does not change the logic flow itself (it should be preserved in any case, volatile or not). Maybe I'll just make an additional flag
Current LLVM backend for Epiphany: https://github.com/upcFrost/Epiphany. Commits and code reviews are welcome
upcFrost
 
Posts: 28
Joined: Wed May 28, 2014 6:37 am

Re: Current status

Postby jar » Wed Jul 05, 2017 3:39 pm

upcFrost wrote:Not exactly sure how to proceed with volatile memory references - on one hand, the compiler should not touch those. On the other hand, merging two loads or stores does not change the logic flow itself (it should be preserved in any case, volatile or not). Maybe I'll just make an additional flag


Code: Select all
volatile int* p = ...;
*p = 1;
*p = 2;


IMO, in the code above, the compiler should not attempt to remove either line. Otherwise, I'm not sure what you mean.
User avatar
jar
 
Posts: 257
Joined: Mon Dec 17, 2012 3:27 am

Re: Current status

Postby upcFrost » Thu Jul 06, 2017 2:19 pm

Here's what I mean (the code is a bit different)

Code: Select all
volatile int* p = ...;
p[0] = 1;
p[1] = 2;

This code can be translated into (don't mind %p, there's a reg ofc)
Code: Select all
mov r0, 1
str r0, %p
mov r1, 2
str r1, %p+4

Or both stores can be merged into a single strd
Code: Select all
mov r0, 1
mov r1, 2
strd r0, %p


There is a bunch of checks if such conversion is possible, of course, but still that's a question if the compiler is eligible to do it or not.
Current LLVM backend for Epiphany: https://github.com/upcFrost/Epiphany. Commits and code reviews are welcome
upcFrost
 
Posts: 28
Joined: Wed May 28, 2014 6:37 am

Re: Current status

Postby sebraa » Thu Jul 06, 2017 3:15 pm

I don't think you can merge two word-writes into a dword-write.
For volatile variables, each access must happen exactly in the way specified.
sebraa
 
Posts: 487
Joined: Mon Jul 21, 2014 7:54 pm

Re: Current status

Postby GreggChandler » Thu Jul 06, 2017 3:40 pm

You might find this link to be an informative read: http://en.cppreference.com/w/c/language/volatile

Among the rest of the read, specifically:
Every access (both read and write) made through an lvalue expression of volatile-qualified type is considered an observable side effect for the purpose of optimization and is evaluated strictly according to the rules of the abstract machine (that is, all writes are completed at some time before the next sequence point). This means that within a single thread of execution, a volatile access cannot be optimized out or reordered relative to another visible side effect that is separated by a sequence point from the volatile access.


The following link speaks to the interpretation of "sequence points": http://en.cppreference.com/w/c/language/eval_order

Again, consider:
4) There is a sequence point after the evaluation of a full expression (an expression that is not a subexpression: typically something that ends with a semicolon or a controlling statement of if/switch/while/do) and before the next full expression.


An example where combining sequenced writes to "volatile" memory would be in the case of I/O. The sequence of I/O to registers is often significant. Converting two 32-bit writes to a 64-bit write could break the sequencing depending upon the "endianness" of the architecture.
GreggChandler
 
Posts: 66
Joined: Sun Feb 12, 2017 1:56 am

Re: Current status

Postby jar » Thu Jul 06, 2017 3:56 pm

upcFrost wrote:Here's what I mean (the code is a bit different)
Code: Select all
volatile int* p = ...;
p[0] = 1;
p[1] = 2;



That question makes more sense and it shouldn't be done according to what Gregg posted. I can't think of a real code that this might break, but there is probably an edge case somewhere. If you do this, make it an optional flag.

However, if it's NOT a pointer to volatile data, I would appreciate the compiler optimizing it to a single double-word store (if 'p' is 8-byte aligned). This would improve unrolled code where two consecutive results or two inputs can be stored or read in a single instruction.
User avatar
jar
 
Posts: 257
Joined: Mon Dec 17, 2012 3:27 am

Re: Current status

Postby GreggChandler » Thu Jul 06, 2017 5:22 pm

jar wrote:it shouldn't be done according to what Gregg posted


@jar, are you suggesting that the implementor ignore the specs?

The referenced document clearly articulates a relationship to "sequence points", and as I read the quoted pages, statements/expressions separated via semi-colons define sequence points. Optimizing across a "sequence point" does not appear to be per the spec when volatile data is involved. Did I really mis-interpret or mis-read that?

The purpose of specifications is to avoid statements such as
jar wrote: I can't think of ...
GreggChandler
 
Posts: 66
Joined: Sun Feb 12, 2017 1:56 am

Re: Current status

Postby jar » Thu Jul 06, 2017 9:09 pm

GreggChandler wrote:
jar wrote:it shouldn't be done according to what Gregg posted

@jar, are you suggesting that the implementor ignore the specs?

I think you misunderstood me. I meant:
jar wrote:[combining two stores to volatile memory into one double word operation] shouldn't be done


I also meant if upcFrost does implement this behavior that he should make it an optional flag so that it doesn't break code.

I feel like you're trying to misrepresent what I said...
jar wrote:there is probably an edge case somewhere


I meant that the spec should be followed because there will be a code that is broken even though I can't think of a specific real code that would break.

I believe we're in agreement, Gregg.
User avatar
jar
 
Posts: 257
Joined: Mon Dec 17, 2012 3:27 am

Re: Current status

Postby upcFrost » Tue Jul 11, 2017 3:50 pm

Callee-saved regs now use paired loads and stores. Optimized scheduler a bit. In general, matmul-16 currently gives out 158ms against 130ms on gcc, purely because of suboptimal scheduling. Will work on it tomorrow.
Also found a couple of bugs in Load/Store optimization pass, fixed all except one.

Gregg, about volatile access - that's exactly as @jar said. It is possible to implement (basically skipping one "if-then" case), but as it might, and it will in some cases, break the code.
Still, in some cases it can improve performance quite a bit. I'm planning to make additional flag to allow it, but it will be set to false by default.
So the default behavior is not to touch volatile ldr/str at all, just as in spec
Current LLVM backend for Epiphany: https://github.com/upcFrost/Epiphany. Commits and code reviews are welcome
upcFrost
 
Posts: 28
Joined: Wed May 28, 2014 6:37 am

Previous

Return to LLVM Compiler

Who is online

Users browsing this forum: No registered users and 1 guest