Originally Posted by Mrs Beanbag
One can render an entire 3-colour convex polyhedron and fill it with a single blit, and then copy it to the screen with another single blit. All the ships in Elite, for instance, are convex polyhedra!
That seems like it would often take more time: there's a larger off-screen buffer to clear, and twice as much memory to cover with the fill. I can see how it might be competitive for small polys, but I'm not sure there's a big win with that method over simply filling one plane and then blitting to multiple planes, especially if you need more than three colors.
Originally Posted by Mrs Beanbag
I don't understand why it makes a difference. Whether you fill a polygon with the CPU or the Blitter, you need to have calculated it first.
It is easy enough to sort all the objects by Z-distance first, then one can multiply the points of the next object by the rotation matrix and calculate screen co-ordinates, while the previous object is still being drawn.
Unless your rotation matrix calculation time exactly matches blitter render time, either the CPU or the blitter will be left waiting at some point. For a small poly, the blitter will wait for the CPU to finish. For a large poly, the CPU will have to wait for the blitter to finish.
And there typically are plenty of other calculations that must be done in a program like Elite. It isn't all rotation matrix calculations. It would be desirable to overlap rendering with all those other calculations, too.
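To put rough numbers on the waiting argument, here is a minimal sketch in C of a one-object-deep overlap. The cost model, the tick values, and the names overlap_stats and simulate_overlap are all made up for illustration; nothing here is measured on real hardware.

```c
#include <assert.h>

/* Hypothetical cost model; all times are in arbitrary ticks. */
typedef struct {
    long total;        /* elapsed ticks for all objects */
    long cpu_wait;     /* ticks the CPU sat waiting on the blitter */
    long blitter_wait; /* ticks the blitter sat waiting on the CPU */
} overlap_stats;

/* cpu[i]:  ticks to transform object i.
   blit[i]: ticks to draw object i.
   The CPU transforms object i+1 while the blitter draws object i, and the
   next draw starts only once both have finished. */
overlap_stats simulate_overlap(const long *cpu, const long *blit, int n)
{
    overlap_stats s = {0, 0, 0};
    assert(n >= 0);
    if (n == 0)
        return s;

    s.total = cpu[0];               /* first transform has nothing to overlap */
    for (int i = 0; i + 1 < n; i++) {
        long c = cpu[i + 1];
        long b = blit[i];
        if (c > b) {                /* small poly: blitter waits for the CPU */
            s.blitter_wait += c - b;
            s.total += c;
        } else {                    /* large poly: CPU waits for the blitter */
            s.cpu_wait += b - c;
            s.total += b;
        }
    }
    s.total += blit[n - 1];         /* last draw has nothing to overlap */
    return s;
}
```

With three objects that each take 10 ticks to transform and 4, 25 and 6 ticks to draw, this model gives 51 ticks in total, with the blitter idle for 6 and the CPU idle for 15. Any other work the CPU could do during those 15 ticks is what a deeper queue would win back.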
Here's what I would try if I had the skills:
The Amiga hardware permits the copper to run a list that programs the blitter, so I'd use the copper for most of the rendering. With some cleverness, it's possible to get the copper list to run across vblanks without misprogramming the blitter, which allows for CPU/blitter concurrency. Of course, the CPU has to reprogram the bitplane pointers at some point before the top of the display using an interrupt (VBlank, or even a timer based on HSync), but getting CPU/blitter concurrency is probably more than worth it.
There's even an opportunity to use the blitter together with CPU for occlusion culling by using BZero together with an occlusion mask.
Going all out, one could even use a tile-based (16x16 pixels) system together with occlusion culling to minimize the amount of rendering done by the blitter of scenes that have large polys obscuring most of the background polys.
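One possible reading of the tile idea, sketched in C on the CPU side: since a 16-pixel-wide tile is exactly one 16-bit word per row, checking whether a tile is already completely covered comes down to testing 16 words of the occlusion buffer. The buffer layout (1 = still uncovered, 0 = covered, rows stored one after another) and the name tile_fully_covered are assumptions for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TILE 16  /* tile edge in pixels; exactly one 16-bit word wide */

/* occl:          single-bitplane occlusion buffer, 1 = uncovered, 0 = covered
   words_per_row: plane width in 16-bit words
   rows:          plane height in pixels
   tx, ty:        tile coordinates (in tiles, not pixels)
   Returns 1 if every pixel of the tile is already covered, so nothing
   further behind it needs to be rendered into that tile. */
int tile_fully_covered(const uint16_t *occl, size_t words_per_row,
                       size_t rows, size_t tx, size_t ty)
{
    assert(tx < words_per_row && (ty + 1) * TILE <= rows);
    for (size_t y = 0; y < TILE; y++) {
        if (occl[(ty * TILE + y) * words_per_row + tx] != 0)
            return 0;   /* at least one pixel is still uncovered */
    }
    return 1;
}
```

A renderer could skip every tile for which this returns 1 when drawing the remaining polys, which is where scenes with one large foreground poly would save most of their blits.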
The render pipeline (running concurrent with calculations for the next frame) would look something like:
Clear poly buffer (will hold multiple single bitplane filled polys)
Clear occlusion buffer to all 1s (zeroed areas will indicate an area covered by a poly)
Clear render buffer
Draw polys/fill to multiple poly buffer (holds multiple single plane filled polys)
Sort polys front to back
Occlusion cull by using an A/B blit with D=A*B (poly AND occlusion buffer) to test whether the new poly would add to the scene; if the result is nonzero (i.e. the BZero flag is clear), draw the poly as black (zeroed) into the single-bitplane occlusion buffer (to reduce unnecessary blits to the frame buffer)
Sort non-occluded polys back to front
Render tiles of non-occluded polys using LF to proper plane for correct color
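The occlusion test in the steps above can be tried out in software first. This is only a sketch in C that mimics the blitter's behaviour on plain word arrays: and_blit stands in for a D = A AND B blit and returns a zero flag the way the blitter reports an all-zero result, and occlusion_cull walks the polys front to back. Both function names and the flat buffer layout are assumptions, not anything from the hardware or an existing library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software stand-in for an A/B blit with D = A AND B.
   Returns 1 if every result word was zero (blitter-style zero flag),
   0 otherwise. d may be NULL to test without storing the result. */
static int and_blit(const uint16_t *a, const uint16_t *b,
                    uint16_t *d, size_t words)
{
    int all_zero = 1;
    for (size_t i = 0; i < words; i++) {
        uint16_t r = (uint16_t)(a[i] & b[i]);
        if (r != 0)
            all_zero = 0;
        if (d != NULL)
            d[i] = r;
    }
    return all_zero;
}

/* polys:   npolys filled single-bitplane masks, 'words' words each,
            stored back to back and already sorted front to back
   occl:    caller-provided occlusion buffer of 'words' words
   visible: receives 1 for each poly that uncovers at least one pixel
   Returns the number of polys that survive the cull. */
size_t occlusion_cull(const uint16_t *polys, size_t npolys, size_t words,
                      uint16_t *occl, int *visible)
{
    size_t kept = 0;
    for (size_t i = 0; i < words; i++)
        occl[i] = 0xFFFF;                 /* all 1s: nothing covered yet */

    for (size_t p = 0; p < npolys; p++) {
        const uint16_t *mask = polys + p * words;
        visible[p] = !and_blit(mask, occl, NULL, words);
        if (visible[p]) {
            /* draw the poly as zeroes into the occlusion buffer */
            for (size_t i = 0; i < words; i++)
                occl[i] &= (uint16_t)~mask[i];
            kept++;
        }
    }
    return kept;
}
```

On real hardware the test and the zeroing would each be a blit, with the CPU only reading the zero flag in between, which is the part that can't be left entirely to a copper list.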
The key to making this fast is to allow the CPU to calculate a frame ahead while much of the blitting is done concurrently. The CPU would have to be diverted to help with occlusion culling and sorting from time to time, but most of the other blits can be done with copper lists. If the CPU is allowed to begin calculating one frame ahead, a great deal of concurrency can be taken advantage of.
For instance, once a copper list is built to clear buffers and draw lines/fill polys in the poly buffer, the CPU can immediately move to calculating the coordinates of the next set of polys for the next frame and building the next copper list while the previous copper list is rendering.
It's all a great deal of work, and the additional rendering steps would tend to make scenes with very few polys render more slowly. But as the number of polys grows, I think the scheme should quickly overtake simpler methods. The idea is to make the cases with many polys fast. Occlusion culling might even make 640x400 in 8 colors usable.