@jar: Yes, I would also be interested in how the multicast scales. But I only have the Kickstarter Parallella with 16 cores, which isn't a large enough coarse-grained processing array to show realistic scaling values. As a side note: maybe Adapteva wants to supply research chairs with the 64-core version. I would be willing to do some work at my university.
So if I understood the manual right, you can run 2048 different multicasts in parallel, as there is an 11-bit wide field for the multicast identifier. Any core willing to receive a multicast can just program its field to the specific ID it is interested in.
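To make the arithmetic concrete, here is a minimal host-side sketch in plain C. It is only a model of my reading of the manual, not Epiphany SDK code: the names `core_model_t`, `core_subscribe` and `core_accepts` are made up for illustration, and the 11-bit width is the assumption stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption from the post: the multicast identifier field is 11 bits wide. */
#define MCAST_ID_BITS 11u
#define MCAST_ID_MASK ((1u << MCAST_ID_BITS) - 1u)

/* Number of distinct identifiers an 11-bit field can encode (2^11). */
static inline uint32_t mcast_id_count(void)
{
    return 1u << MCAST_ID_BITS;
}

/* Hypothetical stand-in for one core's multicast ID field. */
typedef struct {
    uint32_t mcast_id;
} core_model_t;

/* "Register" a core for an identifier by programming its field. */
static inline void core_subscribe(core_model_t *core, uint32_t id)
{
    assert(id < mcast_id_count()); /* the ID must fit into the field */
    core->mcast_id = id & MCAST_ID_MASK;
}

/* In this model a core accepts a multicast only if the ID matches its field. */
static inline bool core_accepts(const core_model_t *core, uint32_t id)
{
    return (id & MCAST_ID_MASK) == core->mcast_id;
}
```

With this model, `mcast_id_count()` gives the 2048 identifiers, and a core subscribed to ID 5 accepts a multicast with ID 5 but ignores one with ID 6. On the real chip the field would of course be a hardware register written from the core itself.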
The only issue I'm having right now has been mentioned by aolofsson (). A multicast reaches all of the cores in the coarse-grained array only if it is issued by a core in the first row. If a core outside the first row issues it, not all of the cores receive it, even if all of them registered for it. So this is clearly an issue, and I hope it could potentially be solved in the Zynq's PL?
Regarding the transfer of larger buffers: that should be possible, but the DMA is, first, quite fast and, second, relieves the issuing core. Still, as long as the core issues e.g. stores directly back-to-back, it should be at least sufficiently fast.
Statistics: Posted by psiegl, Fri Nov 06, 2015 10:52 pm