Nbody simulations on the Parallella


Re: Nbody simulations on the Parallella

Postby capnrob97 » Wed Jul 08, 2015 2:12 pm

Andreas, I am not really up to speed on using GitHub. I can email you the 2 C files I changed in the US Army Nbody project if you want to play with them and possibly create a parallella-examples project.

If you have the latest libcoprthr and MPI already set up, it should just be a quick compile and run.
capnrob97
 
Posts: 74
Joined: Fri Feb 01, 2013 1:11 pm

Re: Nbody simulations on the Parallella

Postby capnrob97 » Wed Jul 08, 2015 3:39 pm

Just added 2 command line arguments, -t and -r.

-t lets you change the timestep, so you can run the sim fast or slow without a recompile. -r does a run based on the command line settings, displays the Gflop and time-taken info for 4 seconds when the run finishes, then resets the stars randomly on the starting sphere and starts again. That way you can run it as a demo that keeps going for as long as you want.

Also, when I use the libcoprthr library, it likes to spit warnings out all over the screen, which ruins the star display. There might be a more elegant way to suppress them, but what I did was find those warnings in the library source code and convert them from WARNING to DEBUG. Brute force, but it got the job done for a clean display.
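One possible alternative that avoids editing the library source is sketched below. It assumes the warnings are written to stderr, which I have not verified for libcoprthr, and it also hides any other error output from the program. The function name is made up for illustration.

```c
#include <stdio.h>

/* Redirect stderr to /dev/null so messages printed there no longer
 * reach the terminal. ASSUMPTION: the library warnings go to stderr.
 * Returns 0 on success, -1 if the redirect failed. */
int silence_stderr(void)
{
    return freopen("/dev/null", "w", stderr) != NULL ? 0 : -1;
}
```

Calling this once at startup, before the library is initialized, would keep the display clean at the cost of losing real error messages too.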
capnrob97
 
Posts: 74
Joined: Fri Feb 01, 2013 1:11 pm

Re: Nbody simulations on the Parallella

Postby capnrob97 » Thu Jul 09, 2015 11:24 am

Interesting fact about these sims:

Since they are brute force, with every star compared to every other star for the gravity interactions, a run with 3456 stars (216 per core) means each frame is the result of 11,943,936 comparisons (3456^2) plus the underlying calculations. The numbers grow quadratically as the star count increases.
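To make the arithmetic concrete, here is a small sketch that computes the comparison counts. The function names are made up, and the per-core split simply assumes the stars are divided evenly across the cores (16 cores follows from 3456 / 216).

```c
/* Total star-to-star comparisons per frame for a brute-force
 * all-pairs N-body step: every star against every star. */
unsigned long long comparisons_per_frame(unsigned long long n_stars)
{
    return n_stars * n_stars;
}

/* Comparisons handled by one core per frame, assuming the stars
 * are split evenly: each core owns n_stars / n_cores stars and
 * compares each of them against all n_stars. */
unsigned long long comparisons_per_core(unsigned long long n_stars,
                                        unsigned long long n_cores)
{
    return (n_stars / n_cores) * n_stars;
}
```

For 3456 stars on 16 cores this works out to 746,496 comparisons per core per frame, and doubling the star count quadruples the total.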
capnrob97
 
Posts: 74
Joined: Fri Feb 01, 2013 1:11 pm

Re: Nbody simulations on the Parallella

Postby aolofsson » Thu Jul 09, 2015 7:51 pm

capnrob,

Let me know how you want to handle the code. Feel free to email me at andreas at adapteva any time.

Based on how many changes you are making, it seems like working with git would benefit a lot of people, since they could then follow your progress in real time.

I put together a blog post to try to simplify the process (not that you need it, based on your ferocious progress with nbody, MPI, and coprthr!) :D

https://www.parallella.org/2015/07/09/a ... nd-github/

What do you think of the prospect of reaching 4096 stars? It's a nice "round number". ;)

Andreas
aolofsson
 
Posts: 1005
Joined: Tue Dec 11, 2012 6:59 pm
Location: Lexington, Massachusetts,USA

Re: Nbody simulations on the Parallella

Postby capnrob97 » Thu Jul 09, 2015 8:28 pm

OK, I will get up to speed on GitHub and create 2 projects: one with my first nBody code, which doesn't use MPI, and one based on the US Army MPI nBody version.

I will see what I can do to get to 4096 stars as well.
capnrob97
 
Posts: 74
Joined: Fri Feb 01, 2013 1:11 pm

Re: Nbody simulations on the Parallella

Postby capnrob97 » Fri Jul 10, 2015 12:53 am

4096 stars. Those little cores are getting a workout.

capnrob97
 
Posts: 74
Joined: Fri Feb 01, 2013 1:11 pm

Re: Nbody simulations on the Parallella

Postby capnrob97 » Fri Jul 10, 2015 11:19 am

To get to 4096 stars, I had to make the Epiphany code smaller so everything fits in the core local memory.

The original authors had unrolled a loop by a factor of 8 for speed. That is faster, but it makes the code larger as well.

I cut the unrolling down from 8 to 4.

4096 stars with a faster timestep.

16,777,216 star compares (4096^2) per frame of the animation.
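For readers who have not seen manual unrolling, here is a generic sketch of the trade-off using a simple sum. This is not the actual Epiphany kernel, and the function names are made up. The unrolled body repeats the work 4 times per iteration, which cuts loop overhead but produces more instructions; unrolling by 8 would double the body again, and that extra code size is what has to fit in local memory.

```c
#include <stddef.h>

/* Plain loop: smallest code, one add per iteration. */
float sum_rolled(const float *v, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Unrolled by 4: four independent accumulators per iteration,
 * followed by a cleanup loop for the leftover elements. */
float sum_unrolled4(const float *v, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; i++)  /* remainder when n is not a multiple of 4 */
        s0 += v[i];

    return (s0 + s1) + (s2 + s3);
}
```

Note that with floating point the two versions can differ slightly in rounding for general inputs, because the additions happen in a different order.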

capnrob97
 
Posts: 74
Joined: Fri Feb 01, 2013 1:11 pm

Re: Nbody simulations on the Parallella

Postby jar » Sat Jul 11, 2015 1:33 am

jar
 
Posts: 295
Joined: Mon Dec 17, 2012 3:27 am

Re: Nbody simulations on the Parallella

Postby jar » Sat Jul 11, 2015 1:52 am

I would also like to mention that I have a version of the code that goes beyond ~5000 particles, which is roughly the on-chip limit. That version is limited instead by the available off-chip RAM (32 MB), but because the algorithm is O(N^2), a single time step with a large number of particles would take a very long time. The performance loss for going off-chip is less than 5%. I also had a version with visualization in X that does not need to dump to the frame buffer. We'll have more interesting stuff later.

The threaded MPI package has been improved recently, but my colleague is working on that. I'm curious about your thoughts on the software.

You can read a little bit more about this code and others in our paper:
http://arxiv.org/abs/1506.05442
jar
 
Posts: 295
Joined: Mon Dec 17, 2012 3:27 am

Re: Nbody simulations on the Parallella

Postby capnrob97 » Sat Jul 11, 2015 9:31 am

capnrob97
 
Posts: 74
Joined: Fri Feb 01, 2013 1:11 pm

