Thanks for your thoughts, Beany. Appreciated. But by my math, your cube example actually argues for doing three single-bitplane fills.
Doing three separate one-bitplane fills I get:
14 words x 316 lines = 4424 words filled (red area)
15 words x 241 lines = 3615 words filled (green area)
15 words x 243 lines = 3645 words filled (blue area)
Total words to be filled = 11684
Total DMA cycles for fill = 11684*3 = 35052 DMA cycles
For a five plane display, cutting each polygon into the new frame with ACD blit = 3 x 5 x 11684 = 175260 DMA cycles.
Total DMA cycles = 210312
Filling all three cube faces at once in two planes, I get:
25 words x 400 lines x 2 planes = 20000 words to be filled.
Total DMA cycles for fill = 20000*3 = 60000 DMA cycles
ABCD blit into a five plane frame = 4 x 5 x 10000 (25 words x 400 lines per plane) = 200000 DMA cycles.
Total DMA cycles = 260000
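For anyone who wants to plug in a different object, here is a rough Python sketch of the same arithmetic. The function names are made up for illustration, and the per-word costs (3 cycles for a fill, 3 for an ACD cut, 4 for an ABCD blit) are simply the figures used above, not measured values.

```python
def separate_fill_cycles(boxes, frame_planes=5, fill_cycles=3, cut_cycles=3):
    """DMA cycles for filling each polygon in its own single bitplane,
    then cutting each one into every plane of the frame (ACD blit).

    boxes: list of (words_per_line, lines) bounding rectangles, one per polygon.
    """
    words = sum(w * h for w, h in boxes)
    fill = words * fill_cycles
    cut = words * cut_cycles * frame_planes
    return fill + cut


def combined_fill_cycles(box, fill_planes=2, frame_planes=5,
                         fill_cycles=3, blit_cycles=4):
    """DMA cycles for filling all polygons at once in fill_planes bitplanes,
    then one ABCD blit of the overall bounding rectangle into every frame plane.

    box: (words_per_line, lines) rectangle covering the whole object.
    """
    w, h = box
    words = w * h
    fill = words * fill_planes * fill_cycles
    blit = words * blit_cycles * frame_planes
    return fill + blit


if __name__ == "__main__":
    # The cube: three face rectangles versus one 25 x 400 rectangle.
    cube_faces = [(14, 316), (15, 241), (15, 243)]
    print("separate:", separate_fill_cycles(cube_faces))
    print("combined:", combined_fill_cycles((25, 400)))
```

Swapping in the bounding rectangles of some other object shows quickly which way the comparison goes for that object.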
Perhaps which method is better simply depends heavily on the pattern of the polygons that make up the object. For filling, the cube leaves too much empty area that must be touched unnecessarily. Some other shape might be faster (perhaps a polyhedron with tessellated faces and few colors?).
The ABCD blit is a nice trick, but it too requires that the blitter touch much empty space in the case of the cube. Again, which is better must be object dependent.
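One way to put a number on "object dependent", again only as a sketch using the same assumed per-word costs: with a five plane frame, the separate fills cost 3 + 3 x 5 = 18 cycles per word of polygon bounding rectangle, while the all-at-once method costs 2 x 3 + 4 x 5 = 26 cycles per word of the overall rectangle. So the all-at-once method should only win when the polygon rectangles add up to more than about 1.44 times the overall rectangle, and the cube's faces fall short of that. The function names below are hypothetical.

```python
def break_even_ratio(fill_planes=2, frame_planes=5,
                     fill_cycles=3, cut_cycles=3, blit_cycles=4):
    """Ratio of (sum of polygon rectangle words) to (overall rectangle words)
    above which the all-at-once fill becomes cheaper than separate fills."""
    per_word_separate = fill_cycles + cut_cycles * frame_planes
    per_word_combined = fill_planes * fill_cycles + blit_cycles * frame_planes
    return per_word_combined / per_word_separate


def combined_wins(polygon_words, overall_words, fill_planes=2, frame_planes=5,
                  fill_cycles=3, cut_cycles=3, blit_cycles=4):
    """True if the all-at-once method needs fewer DMA cycles.
    Integer comparison, so there is no rounding at the break-even point."""
    separate = polygon_words * (fill_cycles + cut_cycles * frame_planes)
    combined = overall_words * (fill_planes * fill_cycles
                                + blit_cycles * frame_planes)
    return combined < separate


if __name__ == "__main__":
    print("break-even ratio:", round(break_even_ratio(), 3))
    # Hypothetical object whose polygon rectangles overlap heavily:
    print(combined_wins(polygon_words=15000, overall_words=10000))
```

An object with many overlapping or tightly packed faces pushes the ratio up, which is where I would expect the two-plane fill plus ABCD blit to pay off.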
With respect to calculating away hidden surfaces: I used the debugger to examine several 3D polygon games running on an emulated A500, looking at the ratio of time spent on calculation versus polygon drawing. Except for demos with a very small number of objects, the ratio of calculation time to polygon fill/render time is usually greater than 1:1. In larger worlds, many calculations are done that result in nothing being drawn, because many polygons are clipped away or are located behind the viewer. This suggests that it might be better to let the blitter draw over more distant polygons than to add further calculation for hidden surface removal, provided the CPU and blitter can be made to run concurrently.