DMA and mutex problems - is it worth persevering

Posted: Tue Oct 23, 2018 10:38 am
by nickoppen

I'm having a final push to get my image processing framework written. I want each core to process one sixteenth of the image, broken down into work units that fit inside the core's memory. While one work unit is being processed, I want to use DMA to transfer the next one so that it is available as soon as processing of the previous one is done. I'm trying to use e_mutex_lock and e_mutex_unlock to coordinate the processing and the DMA. My code is here:

I've posted some questions previously about issues I'm having with DMA, and there have been some vague mentions of "hardware errata" possibly being responsible for the DMA stalling. Now that I'm using e_mutex_lock and e_mutex_unlock, the problems seem to be getting worse.

I have also noticed that there do not seem to be any examples of e_mutex_lock on GitHub or in the Parallella examples.

Is it worth persevering with DMA and e_mutex?


Re: DMA and mutex problems - is it worth persevering

Posted: Mon Oct 29, 2018 3:47 pm
by jar
From personal experience, using the DMA engine directly in code is technically challenging. There's latency in kicking off a DMA operation compared to simply dereferencing a remote address. An optimized remote memory copy often performs better than DMA, partly because a DMA hardware erratum limits performance to less than half of what it should be. Data must also be carefully located to take advantage of asynchronous compute and memory movement.

So, I would avoid DMA.
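
As a rough illustration of the non-DMA approach, here is a minimal sketch of a work-unit loop that uses plain loads and stores instead of DMA and a mutex. It is ordinary C, not tested on Epiphany hardware. UNIT_WORDS, copy_words, process_unit and process_slice are hypothetical names made up for this example, the copy loop is deliberately simple rather than optimized, and the "processing" is a placeholder.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical work-unit size; the real limit depends on how much
 * core-local memory is free. */
#define UNIT_WORDS 256

/* Copy n 32-bit words with plain loads and stores. On Epiphany, src or dst
 * could be a pointer to a remote (global) address; here it is ordinary memory. */
static void copy_words(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

/* Placeholder per-unit processing: invert every word in place. */
static void process_unit(uint32_t *unit, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        unit[i] = ~unit[i];
}

/* Process one core's slice of the image, one work unit at a time:
 * copy in, process, copy out. The copy is synchronous, so there is
 * no completion interrupt to wait for and nothing to lock. */
static void process_slice(uint32_t *slice, size_t total_words)
{
    static uint32_t local[UNIT_WORDS];  /* stands in for core-local memory */

    for (size_t off = 0; off < total_words; off += UNIT_WORDS) {
        size_t n = total_words - off;
        if (n > UNIT_WORDS)
            n = UNIT_WORDS;             /* the last unit may be shorter */

        copy_words(local, slice + off, n);
        process_unit(local, n);
        copy_words(slice + off, local, n);
    }
}
```

The trade-off is that the copy no longer overlaps with the processing, but the data is guaranteed to be complete when copy_words returns, including the short unit at the end of the slice.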

Re: DMA and mutex problems - is it worth persevering

Posted: Tue Oct 30, 2018 5:56 am
by nickoppen
Thanks James,

That's a pity, because the combination would have been great.

Your post actually explains something else I've seen: a small chunk of data missing at the end of the array after the DMA interrupt has fired.

I'll give your OpenSHMEM a go. I've been (slowly) writing an image processing system for a blog post about COPRTHR-2, but I was reluctant to talk about code that doesn't always work and to add a caveat like "ignore the DMA - it's flaky".