Originally Posted by matthey
Parallel blitters would have been good for 256-color planar graphics but it would be a waste of resources now that even FPGA hardware has enough bandwidth for moderate to high resolution in at least 16-bit chunky. The way to work on chunky is with SIMD. One proposal for problems with a high speed blitter was to trap to the CPU/SIMD which would handle the work. A more versatile and standard SIMD would allow other code to be accelerated also. Grond hinted at this being used by SAGA/Apollo, which I believe to be true if and when it is possible.
I think Mrs Beanbag means the logical progression for the blitting architecture, not for you guys.
Parallel blitters actually allow superlinear acceleration compared to plane-by-plane operations because they make the mask channel unnecessary: the mask can be computed from the combined inputs. And this is interesting for any number of planes.
A cookie-cut operation would use 25% less DMA when conducted with parallel blitters because one of four channels is made unnecessary.
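To make the "mask computed from the combined inputs" idea concrete, here is a minimal sketch in Python. It assumes colour 0 is the transparent colour, so a pixel is opaque whenever any source plane has its bit set; that convention and the names `derive_mask` and `cookie_cut` are made up for illustration, not taken from any real hardware or library.

```python
WORD = 0xFFFF  # model one 16-bit blitter word per plane


def derive_mask(src_planes):
    """OR all source planes together.

    Assumption: colour 0 is transparent, so a pixel is opaque
    if at least one plane has its bit set.
    """
    mask = 0
    for word in src_planes:
        mask |= word
    return mask & WORD


def cookie_cut(src_planes, dst_planes, mask=None):
    """Per plane: D = (mask AND source) OR (NOT mask AND destination).

    With mask=None the mask is derived from the source planes,
    which only works when all planes are visible at once.
    """
    if mask is None:
        mask = derive_mask(src_planes)
    return [((mask & s) | (~mask & d)) & WORD
            for s, d in zip(src_planes, dst_planes)]


# Two planes, one word each: the sprite occupies the top 8 pixels.
src = [0xF000, 0x0F00]
dst = [0xAAAA, 0x5555]
result = cookie_cut(src, dst)
```

A plane-by-plane blitter cannot do this, because it only ever sees one source plane at a time and so has to fetch a precomputed mask from memory for every plane.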
Even putting just two blitters alongside one another allows sharing the mask (seven channel accesses for two planes instead of eight) and thus saves 1/8 = 12.5% of DMA cycles. This is significant.
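The arithmetic behind those two figures can be checked with a small sketch. It assumes a cookie-cut costs four channel accesses per plane word (mask, source, destination read, destination write) when done plane by plane, and that every channel access costs the same; the function name `dma_saving` is hypothetical.

```python
from fractions import Fraction

CHANNELS_PER_PLANE = 4  # mask, source, destination read, destination write


def dma_saving(planes_in_parallel, mask_derived=False):
    """Fraction of DMA accesses saved versus plane-by-plane blitting.

    mask_derived=False: the group fetches one mask and shares it.
    mask_derived=True:  the mask is computed from the source planes,
                        so no mask fetch is needed at all.
    """
    baseline = CHANNELS_PER_PLANE * planes_in_parallel
    mask_fetches = 0 if mask_derived else 1
    parallel = mask_fetches + 3 * planes_in_parallel
    return Fraction(baseline - parallel, baseline)


two_shared = dma_saving(2)                       # 7 accesses instead of 8
all_derived = dma_saving(8, mask_derived=True)   # 24 accesses instead of 32
```

Under these assumptions the shared-mask saving grows with the number of blitters side by side and approaches the 25% of the derived-mask case without ever reaching it.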
But yup, not too useful for youse.