Originally Posted by pandy71
Game or demo, no difference - code is code (assuming a skilled developer who uses the blitter).
HW data says that on 1 plane up to 1 million pixels can be drawn on screen in line mode (and up to 16 million pixels fill rate). Count four planes (16 colors), so divide those figures by 4, and to be realistic by another 4 (25% efficiency, setup needs time, short vectors may be drawn more efficiently by the CPU).
Probably some work can be done in a hybrid mode - CPU + Blitter + Copper - and this can provide some gain.
But you can't usually render a poly directly to a frame buffer unless it's the only poly to be rendered, unless I'm missing something.
The best way, IMO, is to first render/fill the poly to a single plane in a cleared buffer using an AD copy blit with fill on. Then blit the poly to each plane using an ACD blit, setting LF for each plane to control color.
This means that a generic poly fill and render (ignoring the line draw stage) takes (3*(planes+1)+2)*2 = 34 CPU cycles per 16 pixels on a 16-color screen (ignoring interference from other DMA).
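To make it easy to plug in other screen depths, here is the same arithmetic as a tiny Python sketch. The function name is mine, and it only evaluates the formula exactly as written above; it doesn't model the blitter or any DMA contention.

```python
def blit_cycles_per_word(planes):
    """CPU cycles per 16-pixel word, straight from the formula in the post.

    This just evaluates (3*(planes+1)+2)*2; it is not a hardware model.
    """
    return (3 * (planes + 1) + 2) * 2


if __name__ == "__main__":
    for planes in range(1, 6):
        cycles = blit_cycles_per_word(planes)
        print(f"{planes} plane(s): {cycles} cycles per 16 pixels, "
              f"{cycles / 16:.3f} cycles per pixel")
```

For 4 planes this gives the 34 cycles quoted above, which works out to a little over 2 CPU cycles per pixel.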
Poly fill on the Amiga comes with the bonus that interior areas can be unfilled/filled. Very irregular shapes can be handled. A star shape can be quickly filled, for example.
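The reason irregular shapes come for free is that the fill works on parity: each edge pixel on a row toggles "inside" on or off. Here is a simplified Python sketch of that idea for one row. It is only an illustration under my own assumptions: it works on a list of single bits rather than 16-bit words, it shows only an inclusive-style fill, and the function name is made up.

```python
def parity_fill_row(bits):
    """Fill between pairs of edge pixels on one row (even-odd rule).

    bits: list of 0/1 values where 1 marks an outline pixel.
    Scans right to left, toggling a carry at every set bit.
    Simplified illustration only, not a model of the real hardware.
    """
    out = [0] * len(bits)
    carry = 0
    for i in range(len(bits) - 1, -1, -1):
        if bits[i]:
            carry ^= 1           # crossing an edge flips inside/outside
        out[i] = carry | bits[i]  # keep the edge pixel itself set
    return out


if __name__ == "__main__":
    # Two separate spans on one row, as a star or concave shape would give.
    print(parity_fill_row([0, 1, 0, 0, 1, 0, 1, 0, 1, 0]))
```

With four edge pixels on the row, the gap between the second and third edge stays empty, which is exactly what a concave shape needs.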
That doesn't happen that often, though, in a 3d poly game. Typically triangles or quadrilaterals need to be rendered.
These simpler shapes can be rendered by the CPU directly to the frame buffer. A clever technique I've seen documented involves using the CPU to draw horizontal lines between two points, moving line by line through a portion of the frame buffer. A table-driven approach allows the ends of the lines to be rendered quickly using masks, and MOVEM is used to fill in the middle of the line.
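Roughly, the span drawing looks like the sketch below. It's Python rather than 68000 code, it handles a single bitplane of 16-bit words, and the table and function names are my own. The plain loop over the middle words stands in for the MOVEM burst.

```python
# Mask tables indexed by pixel position within a word (pixel 0 = MSB).
LEFT_MASK = [0xFFFF >> i for i in range(16)]                      # pixels i..15
RIGHT_MASK = [(0xFFFF << (15 - i)) & 0xFFFF for i in range(16)]   # pixels 0..i


def hline(row, x0, x1):
    """Set pixels x0..x1 (inclusive, x0 <= x1) in row, a list of 16-bit words."""
    w0, w1 = x0 >> 4, x1 >> 4
    left = LEFT_MASK[x0 & 15]
    right = RIGHT_MASK[x1 & 15]
    if w0 == w1:
        # Span starts and ends in the same word: combine both masks.
        row[w0] |= left & right
        return
    row[w0] |= left                 # partial word at the left end
    for w in range(w0 + 1, w1):
        row[w] = 0xFFFF             # full words: where MOVEM would be used
    row[w1] |= right                # partial word at the right end


if __name__ == "__main__":
    row = [0] * 4                   # 64 pixels wide
    hline(row, 3, 40)
    print([f"{w:04X}" for w in row])
```

Only the two end words need a read-modify-write; everything in between is a straight store, which is why long spans get so cheap.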
On the ST, the best case is when a horizontal line is drawn entirely with a MOVEM. In that case it takes 16+(line_length) CPU cycles. The ST's otherwise odd interleaved bitmap scheme actually helps here, making long runs of writes to the frame buffer more likely.
Of course calculating a new horizontal line length and position for each scanline is time consuming.
With a fill rate that asymptotically approaches 1 CPU cycle per pixel (for simple polys only, of course), I can see how a programmer might be inclined to stick with the CPU-based solution rather than explore an unknown gain from the Amiga's specialized hardware. And considering that more time is spent calculating the 3d world than filling/rendering polys, the investment of time might not seem worth it. To an expert Amiga programmer it might, but not to a programmer writing for multiple platforms.
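To put numbers on "asymptotically", here is the best-case ST figure of 16+(line_length) cycles turned into a per-pixel cost. The function name is mine, and this covers only the all-MOVEM best case; the per-scanline edge calculations are not included.

```python
def st_cycles_per_pixel(line_length):
    """Best-case cycles per pixel for a span drawn entirely with MOVEM.

    Uses the 16 + line_length figure from the post; edge setup is ignored.
    """
    return (16 + line_length) / line_length


if __name__ == "__main__":
    for n in (16, 32, 64, 160, 320):
        print(f"{n:3d} pixels: {st_cycles_per_pixel(n):.2f} cycles per pixel")
```

A 16-pixel span costs 2 cycles per pixel, while a full 320-pixel span is down to about 1.05, so the fixed 16-cycle overhead only matters for short spans.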
This whole discussion of speed ignores one important point, however, and that's the ability of the Amiga to perform operations concurrently with the CPU. For 3d games where many multiplies and divides are performed, many cycles are left free for the blitter. Huge gains are possible.
But again there's a problem. To gain maximum concurrency, poly fill/render must happen with data that has already been computed, since poly render order can only be known after all calculations have been performed. This means that calculations for the next frame must occur while the previous frame is being built. That greatly complicates the design. A programmer used to a simple calculate/render loop would have to make serious changes to the program's logic to take advantage of that, possibly using interrupts or the copper to render.
I hate to be down on the Amiga here. Watching simple 3d demos on both the Amiga and ST suggests that, for simple scenes, the Amiga is several times faster rendering filled polys than is the ST.
But for a large world and complicated game logic, if the programmer can't find a way to take advantage of concurrency, the gains probably aren't that great.
If the programmer can find some way to pipeline blitter operations while doing calculations, then I think the gains can still be large.
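As a back-of-the-envelope illustration of why the pipelining matters, here is a toy Python model. It assumes ideal overlap with no bus contention between CPU and blitter, which is optimistic, and the function name and the example numbers are made up purely for illustration.

```python
def frame_time(calc, render, overlapped):
    """Time per frame in arbitrary units.

    calc:       CPU time spent on 3d math and game logic
    render:     blitter time spent filling/rendering polys
    overlapped: True if frame N is rendered while frame N+1 is calculated
    Assumes perfect overlap and no bus contention (a big simplification).
    """
    if overlapped:
        return max(calc, render)   # the slower stage sets the pace
    return calc + render           # simple calculate-then-render loop


if __name__ == "__main__":
    calc, render = 30, 20          # hypothetical numbers
    print("serial:   ", frame_time(calc, render, overlapped=False))
    print("pipelined:", frame_time(calc, render, overlapped=True))
```

In this idealized model the render cost disappears completely once calculation dominates, which is the case described above for a large world with complicated game logic.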