-------
Hello all,
I am still having problems with my mini-cluster of Parallellas, but I have made some progress.
To make sure that power was not the issue I have been powering Parallellas one board at a time and have also tried a different PSU to power a single Parallella - a TDK Lambda 5V/10A affair - with no change in stability for any of the boards.
Finally I acted on a vague observation that the most stable runs seemed to be on very hot days (for the UK!) in the mid to late afternoon when it was hottest (note: Parallellas are in a quite small room with only a door and windows for air conditioning).
I let a board get hot by restricting the air flow around it. I have so far concentrated on the board that locks up the system (or possibly just crashes the network) when it fails as this is the more serious problem.
I find that if I let the board warm up so that the ztemp.sh script reports temperatures above 60 degrees C (but below the 70 degrees C recommended maximum) then this board magically starts to repeatedly pass all the Epiphany tests listed in the LIST.E16 file run by /home/linaro/epiphany-examples/scripts/TestEpiphany.pl.
I have been running this board hot over the last few days (and cooler, to get an idea of the range of temperatures involved). Today I have been running the board at 64 to 67 degrees C for over four and a half hours and 265+ iterations of the tests with no lock-ups, no failures and no interruptible "getting stuck" incidents. Note that I had no luck with an earlier setup that cooled the Zynq chip to ~57 degrees C.
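To put numbers on the temperature dependence, one option is to log the temperature alongside each test run and then look at the pass rate per temperature band. The following is only a rough sketch under several assumptions: that ztemp.sh prints the temperature as the first number in its output, that TestEpiphany.pl can be run with no arguments and exits non-zero on failure, and that a run exceeding the timeout counts as a failure. The ztemp.sh location, the function names and the 5 degree band width are all made up for illustration.

```python
import re
import subprocess
from collections import defaultdict

# Assumed locations and invocations; adjust for your own board.
TEMP_CMD = ["./ztemp.sh"]
TEST_CMD = ["perl", "/home/linaro/epiphany-examples/scripts/TestEpiphany.pl"]


def parse_temp(text):
    """Return the first number in text as a float, or None if there is none.

    Assumption: the first number printed by the temperature script is the
    temperature in degrees C.
    """
    match = re.search(r"-?\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None


def run_once(timeout_s=600):
    """Read the temperature, run the tests once, return (temp, passed)."""
    temp_out = subprocess.run(TEMP_CMD, capture_output=True, text=True).stdout
    temp = parse_temp(temp_out)
    try:
        result = subprocess.run(
            TEST_CMD, capture_output=True, text=True, timeout=timeout_s
        )
        # Assumption: a zero exit status means every test passed.
        passed = result.returncode == 0
    except subprocess.TimeoutExpired:
        # A run that gets stuck is recorded as a failure.
        passed = False
    return temp, passed


def summarise(samples, bin_width=5):
    """Group (temp, passed) samples into bands of bin_width degrees.

    Returns {band_start: (passes, total)} ordered by band_start.
    Samples with no temperature reading are skipped.
    """
    bins = defaultdict(lambda: [0, 0])
    for temp, passed in samples:
        if temp is None:
            continue
        start = int(temp // bin_width) * bin_width
        bins[start][0] += int(passed)
        bins[start][1] += 1
    return {start: (p, n) for start, (p, n) in sorted(bins.items())}
```

Calling run_once() in a loop, collecting the results in a list and passing that list to summarise() would give a pass count per band, which should show whether there really is a threshold somewhere around 60 degrees C. Note that a hard lock-up of the board would still stop the script itself, so the log would need to be written to disk (or over the network) after every iteration.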
I am not convinced that it is the Zynq chip that needs to be this hot, nor necessarily the Epiphany chip; it might well be some other component or components that of course get warm as well.
Any ideas as to what could cause such behaviour (i.e. running Epiphany tests and examples causes the board to fail unless something, or several things, are warm enough) would be very much appreciated, as would any suggestions as to other things I could check, such as voltage levels or frequencies (I have a multi-meter and an old analogue CRT 20MHz oscilloscope available).
Regards
Ralph
Statistics: Posted by ralphmcardell - Tue Jul 22, 2014 2:36 pm