Parallella Community • View topic - _vsnprintf_r() problems with double?

_vsnprintf_r() problems with double?

Discussion about Parallella (and Epiphany) Software Development

Moderators: amylaar, jeremybennett, simoncook

Postby GreggChandler » Fri Apr 28, 2017 5:26 pm

I wrote some benchmark code that uses _vsnprintf_r() to print the statistical results of my memory test on the Epiphany III. The benchmarks run reliably; however, when I print the results, _vsnprintf_r() appears to hang in the midst of roughly every thousandth invocation, but only when printing floating point values. Integer values print fine. The calls are made from core memory, although _vsnprintf_r() itself resides in external memory, placed there by a custom link script based upon the standard fast script. The only modification to fast.ldf is to also load some of my framework code in external memory along with the library. The application code exercises the mesh pretty extensively as it benchmarks core and external memory on each core individually, and on all cores simultaneously.

Obviously, my first thought was that there was some bug in my code, and there still may be. Once I isolated the hang to _vsnprintf_r(), I began to experiment. When I remove the offending call using a double and replace it with two calls using int (one for each part of the double, using floor(), round(), etc.), or just don't print the results, the code runs for hours, i.e. hundreds of thousands to millions of invocations of the library function. I spent a fair amount of time ensuring that enough stack space was available, moving code to external memory, etc., as I would presume _vsnprintf_r() necessarily makes calls to a double simulation library. (My recollection is that the C standard requires promotion of a float to double when passed to a function such as _vsnprintf_r(), but that the E3 hardware only supports float.) Output buffer space is more than sufficient, and this function should never overwrite the buffers anyway. There shouldn't be any memory leaks that accumulate, as the code is reloaded after approximately ten or so results are printed. Each invocation runs a different test via a command line parameter. There are twenty-five different tests, run at about three loops per minute: that is 75 tests per minute, 4.5K tests per hour, and about 30K results/invocations (of _vsnprintf_r()) per hour. I have trapped the relevant interrupts, E_SW_EXCEPTION and E_MEM_FAULT, and nothing is being generated. Nested bash scripts let the system run for hours at a time, and tmux lets me reconnect via ssh to monitor progress. Temperature is 51.5 degrees Celsius as measured by ztemp.sh. My other applications don't generate floating point results, and all seem to run fine.
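For readers who want to try the integer workaround described above, here is a minimal sketch. It is not the code from the benchmark: the function name format_fixed3 and the choice of three decimal places are assumptions made for illustration, and it uses integer casts rather than floor()/round(). It assumes the scaled value fits in a long.

```c
#include <stdio.h>

/* Hypothetical helper: print a double with three decimals using only
   integer conversions, so the formatter never sees a %f argument.
   Assumes |value| * 1000 fits in a long. */
int format_fixed3(char *buf, size_t size, double value)
{
    const char *sign = "";
    long scaled;

    if (value < 0.0) {
        sign = "-";
        value = -value;
    }

    /* Round to the nearest thousandth. */
    scaled = (long)(value * 1000.0 + 0.5);
    if (scaled == 0) {
        sign = "";              /* avoid printing "-0.000" */
    }

    /* Whole part and zero-padded fractional part, both as integers. */
    return snprintf(buf, size, "%s%ld.%03ld", sign, scaled / 1000, scaled % 1000);
}
```

Calling format_fixed3(buf, sizeof buf, 3.25) writes "3.250" into buf. The multiplication still goes through the double support routines, but the double-to-string conversion inside the formatter is avoided.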

I generally don't believe in statistical debugging or crowd sourcing, however, I was curious as to whether anyone else has seen anything similar?
GreggChandler
 
Posts: 66
Joined: Sun Feb 12, 2017 1:56 am

Re: _vsnprintf_r() problems with double?

Postby jar » Fri Apr 28, 2017 7:11 pm

It's hard to help without being able to reproduce your issue. You're not calling _vsnprintf_r() directly, are you? I have a suspicion that a double word load or store on misaligned memory is the issue. All doubles must be 8-byte aligned. I don't know why your code only breaks sometimes.
jar
 
Posts: 295
Joined: Mon Dec 17, 2012 3:27 am

Re: _vsnprintf_r() problems with double?

Postby GreggChandler » Sat Apr 29, 2017 4:40 pm

I doubt that the issue is a mis-aligned memory access. As I wrote, the benchmarks are complete at the point of failure, and have been complete for quite a few cycles. That would be the likely place for such a failure. After the benchmarks, two "long long" values are printed successfully. Then eight "double" values are printed. (I double checked the number; I previously thought only six.) Sometimes the first double fails to print, other times the failure is on the last double, and I have seen failure anywhere in between. For an alignment problem, it would need to be a compiler bug, and the compiler looks pretty sound in this regard. The arguments to _vsnprintf_r() are two byte arrays, an unsigned int, and a pointer. It would be difficult to mis-align the byte arrays (grin). The uint is passed in as a constant buffer size. Again, the compiler is responsible for passing it, likely in a register per the ABI, although I didn't examine the generated code. Lastly, the pointer is from library calls referencing parameters on the stack. A problem there would also indicate a problem with the library, which is also not likely.

As anything is possible, I created a program to purposely create a mis-aligned data access. The standard casting hacks from the PDP-11 days didn't work. The compiler appeared to be clever enough to mask the low address bits when cast. Ultimately, I resorted to a union. Only then would the compiler let me generate a mis-aligned access. It was at this point that I ran into another E3 erratum: although the software exception is generated at mis-alignment, it is not processed. The processor, per the errata, halts in/at the interrupt vector. Then I wrote code on the host end to detect this condition. For fun, I also wrote a core (memory and register) dump for when the condition occurs. I added the code to dmatest, added a command line option to e-dmatest (the Epiphany code) that lets me optionally generate a purposeful memory fault on the core, verified that it all worked and that dmatest detected the halt on the core, and then re-ran the dmatest test suite. The host application did not detect a halt condition.
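The union trick can be sketched roughly as follows. This is an illustration only, not the code added to e-dmatest: the names probe, is_aligned, misaligned_slot and load_double_portably are made up for the example. The sketch only computes a deliberately mis-aligned address and checks it; on the core, the fault would come from loading a double directly through such an address, which this host-safe version avoids by copying the bytes with memcpy().

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical probe: the byte view lets us pick an address inside the
   storage that cannot be aligned for a double. */
union probe {
    double        aligned[2];
    unsigned char raw[2 * sizeof(double)];
};

/* Returns nonzero when p is a multiple of alignment. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Address one byte past the start of the storage: never double-aligned. */
unsigned char *misaligned_slot(union probe *u)
{
    return &u->raw[1];
}

/* Portable way to read a double from any address. A direct load through a
   cast pointer is what would trigger the alignment exception on the core. */
double load_double_portably(const unsigned char *p)
{
    double value;
    memcpy(&value, p, sizeof value);
    return value;
}
```

A check such as is_aligned(ptr, sizeof(double)) can also be dropped into suspect code paths to rule the alignment theory in or out before resorting to a core dump.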

When I get more time, I intend to add external memory to the core dump. (The irony of repurposing the term is somewhat amusing.) I already have code that parses the .elf file for symbols, so dumping sections would be easy to add.
GreggChandler
 
Posts: 66
Joined: Sun Feb 12, 2017 1:56 am

Re: _vsnprintf_r() problems with double?

Postby olajep » Fri May 12, 2017 3:38 pm

_start = 266470723;
olajep
 
Posts: 140
Joined: Mon Dec 17, 2012 3:24 am
Location: Sweden

Re: _vsnprintf_r() problems with double?

Postby GreggChandler » Sat May 20, 2017 3:57 am

@Ola, thank you for your consideration on this matter. I agree that the issues were probably not bugs in newlib or gcc, however, I have found little documentation for the use of the *_r() routines, and none for their use on Epiphany. I have also read of bugs in dtoa_r(), which obviously is foundational to the formatted output of doubles. As you also mentioned, Epiphany III has bugs with burst mode, bus errors, etc.

I believe that I have solved my issue. I sat down this evening and read much of the critical newlib source. Although your code might work in the simulator, it probably would not work well in a more general context. I have built more general foundational class libraries, so I am after the broader solution.

The difference between my code and yours is that I am using *_r() routines. I chose these versions of routines because they allow me to allocate "struct _reent" in core memory. This is to avoid the "weak memory model" problems you reference. "struct _reent" claims to put all of the context in the structure, and putting the structure in core memory should avoid memory model issues if my understanding of the architecture is correct. So far, this appears to be the case. However, I can only be sure after hours of successful testing which I hope to complete overnight.

The key to making *_r() work lies in initialization of the _reent structure stored in core memory. I mistakenly copied some bad code that merely initialized it to 0. A better approach is to use the _REENT_INIT() macro defined in the newlib headers. It sets up many of the pointers--which I don't need--and some static buffers which dtoa_r() does need. I additionally used the _REENT_CHECK_MP() macro. This initializes the limited context used by dtoa_r(), but in my case does nothing as I am not using _REENT_SMALL. Careful coding here can prevent the use of malloc().
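A minimal sketch of the initialization described above, for anyone following along. It is an approximation under stated assumptions, not the code from my framework: the names fmt_reent and format_with_private_reent are hypothetical, placement of the structure in core memory (via the link script or a section attribute) is left out, and the #ifdef assumes that newlib's <stdio.h> defines __NEWLIB__. On a non-newlib host the sketch falls back to plain vsnprintf() so that it still builds.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>   /* assumption: on newlib this also defines __NEWLIB__ */

#ifdef __NEWLIB__
#include <reent.h>

/* Private reentrancy context, initialized with the newlib macro instead
   of being zero-filled. Core-memory placement is omitted in this sketch. */
static struct _reent fmt_reent = _REENT_INIT(fmt_reent);
#endif

/* Hypothetical wrapper: formats into buf using the private context. */
int format_with_private_reent(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
#ifdef __NEWLIB__
    /* Sets up the dtoa context when _REENT_SMALL is in use; otherwise
       it expands to nothing. */
    _REENT_CHECK_MP(&fmt_reent);
    n = _vsnprintf_r(&fmt_reent, buf, size, fmt, ap);
#else
    /* Host fallback so the sketch compiles without newlib. */
    n = vsnprintf(buf, size, fmt, ap);
#endif
    va_end(ap);
    return n;
}
```

A call such as format_with_private_reent(buf, sizeof buf, "%.2f", 1.5) exercises the double formatting path that was failing with the zero-filled structure.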

I also wrote test code to verify that va_arg() and friends work correctly. They appear to; however, I only tested up to 8 variable arguments after an initial first argument. My reading of your EABI suggests that should be enough to verify that the transition from register args to stack args is handled correctly. My recollection is that a maximum of 4 32-bit values would be passed in registers. I also verified that "float" is correctly promoted to double per the C specification, etc. I further incorporated host code that examines registers to catch alignment errors and similar bugs.
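The va_arg() check can be approximated with something like the sketch below. The function names sum_ints and sum_promoted are made up for illustration and this is not the original test code. The first function walks a list of int arguments long enough to spill from registers to the stack; the second reads float arguments back as double, which only works if the default argument promotion happened.

```c
#include <stdarg.h>

/* Sum `count` int arguments. With eight of them after `count`, some are
   expected to arrive in registers and the rest on the stack. */
long sum_ints(int count, ...)
{
    va_list ap;
    long total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}

/* Sum `count` floating point arguments. Callers may pass float, but the
   values must be fetched as double because of default promotion. */
double sum_promoted(int count, ...)
{
    va_list ap;
    double total = 0.0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++) {
        total += va_arg(ap, double);
    }
    va_end(ap);
    return total;
}
```

For example, sum_ints(8, 1, 2, 3, 4, 5, 6, 7, 8) should return 36, and sum_promoted(2, 1.5f, 2.5f) should return 4.0.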

I suspect that the rarity of failure was due to the complexity of my foundational class libraries and the randomness of the errors resulting from the incompletely/incorrectly initialized _reent structure. With so much code and so many buffers, random memory modification was less likely to cause problems than in a single purpose eSDK application. I rewrote my original application as an eSDK app, which was considerable work; however, that resulted in a program that failed consistently. It works consistently now. Originally, this code was in my memory/dma benchmark. The original failures only seemed to occur with code that exercised the mesh quite vigorously.

I have documented this for anyone who may also try to make newlib work on an Epiphany III, where there isn't really enough room to load the code in core.
GreggChandler
 
Posts: 66
Joined: Sun Feb 12, 2017 1:56 am

