Quote:
Originally Posted by Adropac2
I recall Jez San saying something about the blitter and Starglider1 but it's not something you can use for filled in polygons? and so that's why the ST had ever so slightly faster 3d generally which frankly the differences there i still don't see

Many multiplications are needed each frame to determine the positions of all vertices. These calculations dominate when rendering is wireframe, and the Amiga's blitter doesn't help much there. Calculations on the ST are going to be about 12% faster, since its 68000 runs at 8 MHz against roughly 7.1 MHz on the Amiga.
Filled polygons are another story. The Amiga's blitter definitely makes a difference there.
Of course, if someone could come up with a series of blits that used a variety of logic functions to perform binary multiplies, that might really help. Sounds ridiculous, but consider Gerald Hull, who used a 9-blit combination to perform additions in a blitter-accelerated implementation of Conway's Life. Maybe using the blitter to accelerate 3D calculations isn't impossible.
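To make the idea of adding with logic functions concrete, here is a minimal Python sketch. It models a blit as an 8-bit truth table applied to every bit position of three source words, which is only a simplified stand-in for the real hardware (no shifting, masking or DMA timing). The names `blit`, `XOR3`, `MAJ3` and `full_adder` are made up for illustration. One truth table gives the sum bit and another gives the carry bit, so a full adder across all bit lanes costs two passes.

```python
# Simplified model: one "blit" applies a 3-input truth table (8-bit value)
# to every bit position of the source words a, b and c.
def blit(a, b, c, minterm, width=16):
    out = 0
    for bit in range(width):
        index = ((((a >> bit) & 1) << 2)
                 | (((b >> bit) & 1) << 1)
                 | ((c >> bit) & 1))
        out |= ((minterm >> index) & 1) << bit
    return out

XOR3 = 0x96  # 1 where an odd number of inputs are set (sum bit)
MAJ3 = 0xE8  # 1 where at least two inputs are set (carry bit)

def full_adder(a, b, c):
    """Two passes: per-lane sum bits, then per-lane carry bits."""
    return blit(a, b, c, XOR3), blit(a, b, c, MAJ3)

a, b, c = 0b1100, 0b1010, 0b0110
s, carry = full_adder(a, b, c)
# Carry-save identity: the carries weigh twice as much as the sums.
print(s + 2 * carry == a + b + c)
```

The result is in carry-save form: the carry word still has to be shifted left one place and added in, which is why reducing a whole array of partial products takes many such steps.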
Start with 16 blits to create a 16x16 bit array of partial products for each pair of values. That's easy. Coming up with a quick way to sum them properly is more challenging. Maybe a Wallace tree. I think I remember 64 full adders were needed with a Wallace tree for a 16x16 mult. A full adder with the blitter takes two blits. So we're up to at least 16+64*2=144 blits.
At least nine MULs are required per vertex for handling rotations and maybe a couple of DIVs when mapping from 3D space to screen coordinates. That's a lot of cycles. The nine MULs on the 68000 are going to cost around 360 to 630 cycles. That's for each vertex, but most of it is internal processing, which leaves the bus free for the blitter. If the blitter were calculating in parallel, even inefficiently, it might still speed things up overall. If the 144 blits number is correct, then the total number of cycles per multiply is slightly over 1000. Nine times that for each vertex and it's near 10000 cycles. So the blitter would be about 1/20th the speed of the CPU. That's disappointing. It would only be a tiny increase in speed and probably wiped out by the overhead of setting up the blitter.
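The back-of-envelope numbers can be laid out as a short Python calculation. The constants come straight from the estimates in this post (64 full adders, two blits each, roughly 1000 blitter cycles per multiply, 40 to 70 CPU cycles per MUL); they are rough guesses, not measurements, and the variable names are just for illustration.

```python
# Rough estimate only: every constant below is a guess taken from the post.
PARTIAL_PRODUCT_BLITS = 16
FULL_ADDERS = 64              # remembered figure for a 16x16 Wallace tree
BLITS_PER_FULL_ADDER = 2      # one blit for the sum, one for the carry

blits_per_multiply = PARTIAL_PRODUCT_BLITS + FULL_ADDERS * BLITS_PER_FULL_ADDER

MULS_PER_VERTEX = 9
CPU_CYCLES_PER_MUL = (40, 70)          # assumed best and worst case
BLITTER_CYCLES_PER_MULTIPLY = 1000     # "slightly over 1000", rounded down

cpu_low = MULS_PER_VERTEX * CPU_CYCLES_PER_MUL[0]
cpu_high = MULS_PER_VERTEX * CPU_CYCLES_PER_MUL[1]
blitter_per_vertex = MULS_PER_VERTEX * BLITTER_CYCLES_PER_MULTIPLY

print("blits per multiply:", blits_per_multiply)
print("CPU cycles per vertex:", cpu_low, "to", cpu_high)
print("blitter cycles per vertex:", blitter_per_vertex)
print("blitter slower by a factor of about",
      round(blitter_per_vertex / cpu_high), "to",
      round(blitter_per_vertex / cpu_low))
```

With these guesses the blitter comes out somewhere between 14 and 25 times slower than the CPU per vertex, which is where the "about 1/20th" figure sits.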
Maybe there are some tricks that can be used to reduce the total number of blits. It's possible the area fill logic can be exploited to help with the additions (the high bit of every nibble can do the mod 2 sum of the previous three bits).
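The area fill idea can be sketched the same way. The Python below assumes that exclusive fill behaves like a running XOR that toggles on every set bit as it scans across a word. That is a simplification for illustration and not a register-accurate model of the fill hardware, and `exclusive_fill_model` and `parity_of_low_three` are hypothetical names. In this model the running parity is not reset between nibbles, so the sketch only looks at a single nibble.

```python
# Assumed model: the fill state toggles on each set bit, scanning from the
# low bit upward, and each output bit is the state after that toggle.
def exclusive_fill_model(word, width=16):
    out = 0
    carry = 0
    for bit in range(width):
        carry ^= (word >> bit) & 1
        out |= carry << bit
    return out

def parity_of_low_three(nibble):
    """Read the mod 2 sum of bits 0..2 from bit 3 of a filled nibble."""
    filled = exclusive_fill_model(nibble & 0b0111, width=4)
    return (filled >> 3) & 1

for n in range(8):
    print(format(n, "03b"), parity_of_low_three(n))
```

If the real fill logic can be coaxed into behaving like this, one fill pass would produce a three-input XOR for every nibble, which is the sum half of a full adder.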
Computers are so fast now maybe the optimal sequence of blits can be determined by brute force.