Thanks for that, censix. There's been a huge amount of work done in this area since I left uni many years ago. Do you know of any papers that compare the performance of deep networks with shallower ones? I'm interested to see what all of that extra work buys you at the end of the day.
As for the implementation, I'm going to start with a simple one-hidden-layer network. There are a number of key areas that I'd like to know more about before attempting anything more ambitious. For example, with larger networks I think that local memory constraints might be a problem. I'm also uncertain whether my overall architecture is the best one for the job.
nick