Object Detection

Postby notzed » Fri Aug 09, 2013 11:42 am

So now i've got a parallella to poke at, I've had a little read of the sdk and am starting to think about how to approach some problems.

BTW my initial thought on making it do work is ... 'mmmm, tricky ...'.

This isn't necessarily the first bit of code I want to work on but i was just thinking about it and here are some thoughts on it. I know one of the first available demos was a face detector but i'm ignoring that atm, even if it's because i didn't fully grok it when i looked at it 6 months ago. It's a problem i've looked at fairly deeply against OpenCL/GPU.

One issue with the haar cascade algorithm (i.e. the opencv face detection code) is that it requires a great deal of meta-data to process a small amount of visual data. Even with some fairly tight packing a simple face detector is upwards of 60K - and that must be walked (although usually not in full) for every (say) 20x20 pixel window of image input. It also requires either a lot of scaling, or many sparse lookups.

But since in the vast majority of cases it only goes down a few levels of the cascade before aborting, it should be possible to just keep those in LDS and leave the upper levels either in system memory or double-buffered from system memory.
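A rough sketch of that split, in Python for readability rather than Epiphany C. The stage layout (a weight list plus a threshold) and the names `run_stages` and `detect` are made up for illustration; a real Haar cascade sums weak classifiers over rectangle features, so treat this only as the control flow.

```python
# Hypothetical two-tier cascade: the first few stages are assumed to sit in
# fast local memory, the remaining stages in slower global memory.

def run_stages(stages, window_features):
    """Return how many stages the window passes before its first rejection."""
    passed = 0
    for weights, threshold in stages:
        score = sum(w * f for w, f in zip(weights, window_features))
        if score < threshold:
            break  # early abort: most windows are expected to stop here
        passed += 1
    return passed


def detect(local_stages, global_stages, window_features):
    """Touch the global stages only if every local stage accepted the window."""
    if run_stages(local_stages, window_features) < len(local_stages):
        return False
    return run_stages(global_stages, window_features) == len(global_stages)
```

With this shape the slow memory is only read for the few windows that survive the local stages, which is the whole point of keeping the early stages close to the core.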

It might be possible to use some of the many registers to process multiple windows at once (and save on cascade lookups), although one can fall into the SIMD trap of wasted cycles if not all windows are still active. Potentially as the inner loop is so simple it may be possible to have multiple implementations for different window counts and still remain within the tight memory constraints.
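To make the trade-off concrete, here is a small Python sketch (not Epiphany code) that pushes a batch of windows through the stages together and counts both the stage lookups and the slots wasted on windows that were already rejected. The stage format and the name `run_batch` are assumptions for illustration.

```python
def run_batch(stages, windows):
    """Evaluate a batch of windows stage by stage.

    Returns (survivors, stage_fetches, idle_slots). idle_slots counts the
    window slots that did nothing because their window was already rejected.
    """
    active = [True] * len(windows)
    stage_fetches = 0
    idle_slots = 0
    for weights, threshold in stages:
        if not any(active):
            break  # whole batch rejected, stop walking the cascade
        stage_fetches += 1  # one lookup of this stage serves every window
        for i, features in enumerate(windows):
            if not active[i]:
                idle_slots += 1
                continue
            score = sum(w * f for w, f in zip(weights, features))
            if score < threshold:
                active[i] = False
    survivors = [i for i, alive in enumerate(active) if alive]
    return survivors, stage_fetches, idle_slots
```

The batch only pays for as many stage lookups as its longest-surviving window needs, but every rejected window keeps occupying a slot until the batch finishes. Having separate loops for different window counts would be one way to claw some of that back.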

From some GPU work I did I found that scaling the features and using a single SAT for the feature resolution turned out to be 'pretty bad' in terms of efficiency because of the lack of locality of reference in the feature lookups. This may not necessarily be the case for parallella (does it even access global memory via a cache line?), but i'll just assume it is for this post ...
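For reference, a minimal Python sketch of the lookup in question. It assumes a SAT padded with one zero row and column, and an integer scale factor; `build_sat`, `rect_sum` and `scaled_rect_sum` are illustrative names, not anything from the SDK or OpenCV.

```python
def build_sat(image):
    """Summed area table with a zero row and column of padding at the top/left."""
    h, w = len(image), len(image[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def rect_sum(sat, x, y, w, h):
    """Sum of the w*h rectangle at (x, y) using four table lookups."""
    return sat[y + h][x + w] - sat[y][x + w] - sat[y + h][x] + sat[y][x]


def scaled_rect_sum(sat, x, y, w, h, scale):
    """Same feature rectangle stretched by an integer scale on one fixed SAT."""
    return rect_sum(sat, x * scale, y * scale, w * scale, h * scale)
```

Each rectangle costs four reads regardless of size, but as `scale` grows the four corners land further apart in memory (the top and bottom rows are `h * scale` table rows apart), which is where the locality goes.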

So here's one possible way of approaching it. Probably the ARM could be dragged in to do more of the work, not to mention the FPGA (although let's not get carried away).

The code required for the complete algorithm:
  • A couple of small routines for implementing image scaling.
  • A couple of small routines to implement the summed area table generation.
  • A small routine to execute the cascade.

Data in LDS:
  • 2 buffers to store sampled windows of the input data - each would contain several probe windows worth and the filling of them from the scaled SAT is double-buffered via dma. Minimum 20x20x4 bytes.
  • As much of the cascade as will fit / or at least some statistically significant number of stages
  • Small stack.
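
A back-of-envelope budget for the list above, as a Python sketch. The 32 KB of local memory per core is an assumption about the board, and the code and stack sizes in the example call are pure guesses; only the 20x20x4 window size and the two buffers come from the list.

```python
LOCAL_BYTES = 32 * 1024   # assumption: 32 KB of local memory per core
WINDOW = 20               # probe window edge, from the list above
BYTES_PER_ENTRY = 4       # int or float SAT entries


def cascade_budget(code_bytes, stack_bytes, windows_per_buffer=1):
    """Bytes left for cascade stages after code, stack and two window buffers."""
    buffer_bytes = WINDOW * WINDOW * BYTES_PER_ENTRY * windows_per_buffer
    used = code_bytes + stack_bytes + 2 * buffer_bytes
    return LOCAL_BYTES - used


# Hypothetical sizes: 8 KB of code, 1 KB of stack, minimum-size buffers.
print(cascade_budget(8 * 1024, 1024))  # 20352
```

Under those guesses roughly 20K is left, so only about a third of a 60K cascade could sit next to the core, and that shrinks as the window buffers grow to hold several probe windows.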

Data in global:
  • Source image (greyscale, mip-map)
  • Rescaling buffer (float or byte)
  • Summed area table buffer (int or float, probably float)
  • Summed^2 area table (strictly should be long or double, but float would probably suffice).
  • Rest of the cascade.

edit: nearly forgot about that SAT^2 table.

And the processing would be something like:
  • Scale image in X/Y (double-buffer input)
  • barrier
  • Scale image in Y/X - generate SAT and SAT^2 row-as-you-go (double buffer input)
  • barrier
  • Generate SAT in X/Y (double-buffer input)
  • barrier
  • Run detector at all window positions:
    • Load in a rectangle of SAT (double-buffer)
    • Load in SAT^2 bounds (scatter-gather), calculate variance (requires sqrt)
    • Process one or more windows against LDS cascade
    • If any survivors, process one or more windows against global cascade.
    • Output raw hits to per-core global buffer.
  • barrier
  • do it again for the next scale, until done.

Then tally/process results on host.
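
The variance step in the detector loop can be sketched like this. It is plain Python rather than Epiphany C, it builds both tables in one pass instead of the separate X and Y passes listed above, and `build_sats`, `corner_sum` and `window_stddev` are illustrative names only.

```python
import math


def build_sats(image):
    """Return (SAT, SAT^2), each padded with a zero row and column."""
    h, w = len(image), len(image[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    sat2 = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row = 0
        row2 = 0
        for x in range(w):
            p = image[y][x]
            row += p
            row2 += p * p
            sat[y + 1][x + 1] = sat[y][x + 1] + row
            sat2[y + 1][x + 1] = sat2[y][x + 1] + row2
    return sat, sat2


def corner_sum(table, x, y, size):
    """Sum over a size*size window from the four bounding table entries."""
    return (table[y + size][x + size] - table[y][x + size]
            - table[y + size][x] + table[y][x])


def window_stddev(sat, sat2, x, y, size):
    """Standard deviation of a window: sqrt(E[p^2] - E[p]^2)."""
    n = size * size
    mean = corner_sum(sat, x, y, size) / n
    variance = corner_sum(sat2, x, y, size) / n - mean * mean
    return math.sqrt(max(variance, 0.0))  # clamp tiny negative rounding error
```

Only the four bounding entries of SAT^2 are needed per window, which is why the list above fetches them scatter-gather rather than as a rectangle. On the storage type: with 8-bit pixels and, say, a 640x480 image, the worst-case SAT^2 total is 255^2 * 640 * 480, about 2.0e10, which is more than a 32-bit integer holds. That is presumably why long or double is the strictly correct choice, with float as a lossy compromise.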

Anyway I need to think about it some more, and obviously also try running some actual code on the thing.
notzed
 
Posts: 331
Joined: Mon Dec 17, 2012 12:28 am
Location: Australia

Re: Object Detection

Postby Gravis » Fri Aug 09, 2013 12:05 pm

i think a more effective approach would be to port the GPU module of OpenCV to use Epiphany. then you have all the resources of OpenCV which includes object detection... or you can just rip the part you want for your project and be done with it.
Gravis
 
Posts: 445
Joined: Mon Dec 17, 2012 3:27 am
Location: East coast USA.

Re: Object Detection

Postby notzed » Sun Aug 11, 2013 1:37 am

Gravis wrote:i think a more effective approach would be to port the GPU module of OpenCV to use Epiphany. then you have all the resources of OpenCV which includes object detection... or you can just rip the part you want for your project and be done with it.


"effective" in terms of what?

I want to learn how to push the platform, not port horrible libraries I have no interest in ever using.
notzed
 
Posts: 331
Joined: Mon Dec 17, 2012 12:28 am
Location: Australia

Re: Object Detection

Postby Gravis » Sun Aug 11, 2013 2:35 am

notzed wrote:"effective" in terms of what?

as in that it would be easily reusable by many people with existing code.

notzed wrote:not port horrible libraries I have no interest in ever using.

you think OpenCV is horrible? harsh bro but if you can improve upon it, please do because it would really be great, not just for me but for everyone.

i'm doing what i can to help the Epiphany architecture, i hope you will too.
Gravis
 
Posts: 445
Joined: Mon Dec 17, 2012 3:27 am
Location: East coast USA.

Re: Object Detection

Postby 9600 » Mon Aug 12, 2013 6:53 am

I'm personally not qualified to comment on the merits of one approach versus another, but note that when Adapteva put together a face detection demo in October of last year, they said:

In our face detection example, we chose to leverage the high level OpenCV functions for high level application functions while completely rewriting the inner loop LBP based tile processing kernel in ANSI-C and bypassing the OpenCV framework.


Cheers,

Andrew
Andrew Back (a.k.a. 9600 / carrierdetect)
9600
 
Posts: 997
Joined: Mon Dec 17, 2012 3:25 am

Re: Object Detection

Postby notzed » Wed Aug 14, 2013 1:28 am

Last night I blogged about the story so far.

http://a-hackers-craic.blogspot.com.au/2013/08/progress-on-object-detection.html

If all things go well, the next time i have a few hours to hack i should get something going on the epiphany.
notzed
 
Posts: 331
Joined: Mon Dec 17, 2012 12:28 am
Location: Australia

Re: Object Detection

Postby aolofsson » Wed Aug 14, 2013 2:18 pm

This is really incredible! Can't wait to see it running on the Epiphany. Hopefully developing software on the device won't be too painful. :D

Note that there has been some success getting the USB working on the gen0. @tnt can probably comment.
You will need to ground the ID pin and make sure you have a device tree that supports USB.

Andreas
aolofsson
 
Posts: 1005
Joined: Tue Dec 11, 2012 6:59 pm
Location: Lexington, Massachusetts,USA

Re: Object Detection

Postby notzed » Sun Aug 18, 2013 12:05 pm

I managed to finally get the object detector code "working" today on a single EPU (epiphany processing unit/core). Well by working, I mean it runs to completion without crashing or hanging. I'm not getting the same results as my laptop yet so i need to do some debugging and also make an ARM version for comparison.
notzed
 
Posts: 331
Joined: Mon Dec 17, 2012 12:28 am
Location: Australia

Re: Object Detection

Postby Gravis » Sun Aug 18, 2013 1:10 pm

notzed wrote:a single EPU (epiphany processing unit/core)

it's called an eCore
Gravis
 
Posts: 445
Joined: Mon Dec 17, 2012 3:27 am
Location: East coast USA.

Re: Object Detection

Postby notzed » Mon Aug 19, 2013 5:36 am

Gravis wrote:
notzed wrote:a single EPU (epiphany processing unit/core)

it's called an eCore


Well i kinda like epu so i'll stick with that, actually.
notzed
 
Posts: 331
Joined: Mon Dec 17, 2012 12:28 am
Location: Australia
