Introduction to parallel programming with 'foreach'


Introduction to parallel programming with 'foreach'

Postby David.S » Sun Sep 01, 2013 12:23 pm

This blog post may be interesting to some of you. It's an introduction to parallel computation with the 'foreach' package.
It looks like going parallel in R can be pretty straightforward once we have Epiphany support.

http://www.exegetic.biz/blog/2013/08/the-wonders-of-foreach/
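For the curious, the basic pattern from the post looks something like this — a minimal sketch, assuming the 'foreach' and 'doParallel' packages are installed:

```r
library(foreach)
library(doParallel)

registerDoParallel(cores = 2)   # register a parallel backend for %dopar%

# run the loop body once per element of 1:4, in parallel;
# .combine = c flattens the per-iteration results into one vector
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}
# squares is c(1, 4, 9, 16)

stopImplicitCluster()
```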
David.S
 
Posts: 1
Joined: Mon Dec 17, 2012 4:00 am

Re: Introduction to parallel programming with 'foreach'

Postby censix » Tue Sep 03, 2013 2:34 pm

You are right, 'foreach' is certainly one way to parallelize R functions.

Also keep in mind that, since about R version 2.14 (certainly since 2.15), the standard R install contains the 'parallel' package and functions such as 'mclapply', a parallel version of 'lapply' that enables concurrent function evaluation. The tricky bit, however, is in the detail: out of the box, 'mclapply' can only run as many concurrent tasks as there are processor cores, e.g. 4.
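For anyone who wants to try it, the base-R route looks like this (a minimal sketch; 'mclapply' works by forking, so on Windows it silently falls back to serial unless mc.cores = 1):

```r
library(parallel)

detectCores()  # how many cores are available on this machine

# parallel version of lapply: each element is handled in a forked worker
res <- mclapply(1:8, function(i) i^2, mc.cores = 2)
# res is a list: 1, 4, 9, 16, 25, 36, 49, 64
```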

Making this work on an E16 would be great!
censix
 
Posts: 49
Joined: Sun Dec 16, 2012 7:54 pm
Location: europe

Re: Introduction to parallel programming with 'foreach'

Postby cozymandias » Tue Sep 10, 2013 7:42 pm

The simplicity of being able to use foreach and plug in a couple of Parallellas whenever I want more power was a big motivator in my decision to purchase some boards. The best part about foreach is that it scales very easily, since the same code runs on a single core or on many. That parallels well (pun intended) with the scalability that Parallella allows.

As censix mentions, R has done a good bit to build/integrate core parallel functions for more flexibility and use in tailored applications, so there are several options.

I haven't received my boards yet, but I'm hoping they come in soon and Epiphany support isn't too far behind.
cozymandias
 
Posts: 6
Joined: Sat Aug 03, 2013 5:04 pm

Re: Introduction to parallel programming with 'foreach'

Postby 9600 » Wed Sep 11, 2013 6:59 am

cozymandias wrote: I haven't received my boards yet, but I'm hoping they come in soon and Epiphany support isn't too far behind.


Let me know if you (or anyone else for that matter) would like to try this out on a prototype, as I can easily provide network access to one.

Cheers,

Andrew
Andrew Back (a.k.a. 9600 / carrierdetect)
9600
 
Posts: 997
Joined: Mon Dec 17, 2012 3:25 am

Re: Introduction to parallel programming with 'foreach'

Postby censix » Fri Sep 13, 2013 11:59 am

@cozymandias

I agree that 'foreach' is a nice wrapper for parallelization; however, the package itself is just that: a wrapper for parallel functionality that has to be implemented elsewhere, either by the native 'parallel' package, the 'doSNOW' package, or the 'doMC' package.

So what is needed is an R package, maybe called 'doEpiphany' or similar, that provides the parallelization, i.e. instantiates the workers ...
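A hypothetical 'doEpiphany' package would plug in exactly where today's backends register themselves. For comparison, this is the pattern with 'doParallel':

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)     # a 'doEpiphany' package would instead spin up Epiphany workers here
registerDoParallel(cl)   # tell foreach which backend executes %dopar%

getDoParWorkers()        # 2

total <- foreach(i = 1:10, .combine = `+`) %dopar% i
# total is 55

stopCluster(cl)
```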
censix
 
Posts: 49
Joined: Sun Dec 16, 2012 7:54 pm
Location: europe

Re: Introduction to parallel programming with 'foreach'

Postby gtg302v » Tue Sep 24, 2013 10:21 pm

I've been using foreach for the last few days, and on the surface it's very easy, but I've run into a few cautions that I'll share for anyone interested:

All the examples I've seen compute a single quantity within the foreach loop that can be combined with a built-in .combine argument (such as rbind). In some analysis I'm running now, what I need to happen in the loop is more complex and requires the storage of several variables of different data types. This is still possible, either by writing your own combine function or by returning all objects you want to preserve as a list in the last line of code within the foreach loop, e.g.:

result <- foreach(i = iterator, .combine = ...) %dopar% {

  a <- ...  # some code
  b <- ...  # some code
  c <- ...  # some code

  list(a, b, c)  # the last expression is the iteration's return value
}

Then result will be a list of lists that you can iterate over and recombine as you desire.

The bigger sticking point for me was memory management. Without some tweaking, the parallel backend packages basically duplicate everything required by the code within the foreach loop on each worker. In my case I have two lists with training and validation data, each containing about 30 million records. My original iterator was just an index on a data frame with an id from the test set and an id from the validation set (the objective is to return the probability that the data in each set is from the same source). The workspace consumes about 6 GB of memory just with all the objects loaded to start with, so the first time I did this it blew up pretty quickly. The 'iterators' package helps alleviate this.
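A sketch of the hand-written combine route mentioned above (the names 'comb', 'acc' etc. are just illustrative; %do% is used so it runs without a registered backend — swap in %dopar% for real work):

```r
library(foreach)

# combiner that merges each iteration's list into growing parallel vectors
comb <- function(acc, x) {
  list(a = c(acc$a, x$a), b = c(acc$b, x$b), c = c(acc$c, x$c))
}

result <- foreach(i = 1:5,
                  .combine = comb,
                  .init = list(a = NULL, b = NULL, c = NULL)) %do% {
  list(a = i, b = i^2, c = letters[i])
}
# result$b is c(1, 4, 9, 16, 25); result$c is c("a", "b", "c", "d", "e")
```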

So my conclusion: foreach is definitely a nice wrapper to make parallel processing accessible, but for some applications it does require a bit of digging, and probably restructuring your code, to make it perform well, as with pretty much all packages :)

jon
gtg302v
 
Posts: 5
Joined: Mon Dec 17, 2012 3:27 am
Location: Huntsville, AL

Re: Introduction to parallel programming with 'foreach'

Postby censix » Thu Sep 26, 2013 6:13 pm

Useful to know, thanks for those insights.
How long does it take when you run this over the 30 million test + 30 million validation records? (And with how many processor cores?)
I suppose you have at least 16 GB of RAM?
censix
 
Posts: 49
Joined: Sun Dec 16, 2012 7:54 pm
Location: europe

Re: Introduction to parallel programming with 'foreach'

Postby gtg302v » Fri Sep 27, 2013 3:18 pm

Through a series of unfortunate late-night Amazon sorties I have 24 GB of memory on my machine :)

The performance gain I got using foreach is about a factor of 3 with 4 processor cores supporting 4 workers.

In each iteration of the loop I'm pulling a sample out of the test set that's five columns and 300 rows, and a set of data out of the training set that's 5 columns and anywhere from a few thousand to many hundreds of thousands of rows, and computing several quantities. Both sets are stored in lists where each item is either a data set for a specified device (in the training data) or a test sequence id (in the test data). Indexing this way, or some other way, maybe through a database connection, is essential, since subsetting a huge data frame is slow.

I compute the times between samples of the test sequence, pull the precomputed times between samples for the training device data, and compare their distributions with a KS test; compute the correlation of three of the columns in both data sets and their differences; compute the difference in means for one column in each; and compute the difference in modal sampling rate between the two sets.

Then I pull a precomputed glm object from a list (built solely on the training data) and get predictions from those computed quantities.
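Roughly, one iteration's comparison could look like this. Everything here is simulated stand-in data, and the logistic model is fit on the spot purely for illustration (Jon's real glm is precomputed):

```r
set.seed(42)

# stand-ins for the inter-sample times of a test sequence and a training device
t_test  <- rexp(300,  rate = 1.0)
t_train <- rexp(5000, rate = 1.2)

ks     <- ks.test(t_test, t_train)      # compare the two distributions
d_mean <- mean(t_test) - mean(t_train)  # difference in means

# a toy logistic model standing in for the precomputed glm object
train <- data.frame(d = rnorm(200), same = rbinom(200, 1, 0.5))
fit   <- glm(same ~ d, family = binomial, data = train)

# predicted probability that the two sets come from the same source
p <- predict(fit, newdata = data.frame(d = d_mean), type = "response")
```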

The actual length of the iteration is only 90024 (each iteration gets a chunk from the 30 million + 30 million data sets), and it takes about 2.5 hours in parallel with 4 workers on my machine (AMD A8 @ 3.0 GHz). Sorry if that was misleading in my first post.

The first time I tried to do it I was just subsetting the data frames, with 30-million-ish rows in each, and a rough guess is that it would have taken 3 or 4 days to run subsetting that way. So I have a preprocessing script that just subsets the data frames and stores the chunks in a list. This takes an hour or so to run, but then subsetting is quick. There are probably better ways to do it, but this is my first time really messing with large data sets and it's the way I got it to work :)
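The preprocessing step Jon describes can be done with split(); the column names here are hypothetical:

```r
# a big frame: 100,000 rows across 1,000 devices
big <- data.frame(device_id = rep(1:1000, each = 100),
                  value     = rnorm(100000))

# one-off cost: break it into a per-device list
by_device <- split(big, big$device_id)

# afterwards, one device's rows are a cheap list lookup
# instead of scanning all 100,000 rows with big[big$device_id == 42, ]
chunk <- by_device[["42"]]
nrow(chunk)  # 100
```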

jon
gtg302v
 
Posts: 5
Joined: Mon Dec 17, 2012 3:27 am
Location: Huntsville, AL

Re: Introduction to parallel programming with 'foreach'

Postby cozymandias » Mon Oct 14, 2013 3:32 pm

Sorry, I quit getting updates on this thread for some reason and just now thought to check back. Hopefully you've gotten things to work the way you need them to by now, gtg302v, but the duplication you noted is something to look out for with these sorts of tasks. It sounds like you handled it by pre-chunking, which makes sense. You might consider adjusting the size of your chunks a little more. I would expect a speedup of more than 3x when moving from 1 to 16 workers (if I've understood correctly), and even with 24 GB of RAM, that could still be an issue with 16 workers.

@censix, agreed. My intention was just to show some support for foreach; I've been using it a bit lately and I appreciate its simplicity. I think a "doEpiphany" package is a great way to frame things. Let's talk about what it would take to develop such an approach; I would be very glad to help. I'm still waiting on my boards, but maybe there's something that can be done before they get here.
cozymandias
 
Posts: 6
Joined: Sat Aug 03, 2013 5:04 pm

Re: Introduction to parallel programming with 'foreach'

Postby gtg302v » Tue Oct 15, 2013 1:55 am

Just 4 workers for the time being (four cores on one CPU), so a speedup of a factor of three with four workers sounds about right based on other examples I've seen. The key to handling the memory issue for me was changing the iterator from a typical (i in 1:90024), using i to index the lists, to using the list itself as the iterator (this needs the 'iterators' package).
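The two iterator styles side by side (%do% here for brevity; with a real parallel backend the index version ships the whole 'chunks' list to every worker, while the list iterator sends each worker only its own chunk):

```r
library(foreach)

chunks <- split(1:100, rep(1:4, each = 25))  # pre-chunked data as a list

# index iterator: the loop body closes over all of 'chunks'
r1 <- foreach(i = 1:4, .combine = c) %do% sum(chunks[[i]])

# list iterator: each iteration receives only its own chunk
r2 <- foreach(ch = chunks, .combine = c) %do% sum(ch)

# r1 and r2 are both c(325, 950, 1575, 2200)
```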

Thanks,
Jon
gtg302v
 
Posts: 5
Joined: Mon Dec 17, 2012 3:27 am
Location: Huntsville, AL
