English Amiga Board


Go Back   English Amiga Board > Coders > Coders. General

 
 
Thread Tools
Old 09 August 2018, 16:42   #1
deimos
Registered User

 
Join Date: Jul 2018
Location: Londonish / UK
Posts: 126
How did games like Starglider 2 get such a high frame rate?

I'm playing around with some 3D graphics code at the moment, thinking about maybe doing something more with it, if only I can make it fast enough.

I'm animating a familiar looking chequered ball made up of 128 polygons. it's just moving around the screen while rotating around its centre. About half of the polygons get back face culled and so aren't rendered, leaving me with with around 64 to draw to the screen. I'm not currently doing any hidden surface removal apart from the backface culling, not even depth sorting. And it looks ok, but it's just way too slow.

I'm getting less than 2 fps. I want more like 8.

I'm only doing rotate and translate of the object at the moment, the camera is fixed. I'm doing a 3x3 matrix translation for the rotation, 2.14 fixed point maths (in assembly language for the multiplies), then translating it and doing the division for the perspective view. I'm not doing 4x4 matrices at the moment because they're not needed with a fixed camera.

From the timings I've done, I know I spend 1/3 of my time doing the moving the ball and doing the 3D calculations, and about 2/3 of the time drawing the polygons. Drawing the polygons roughly splits into half of the time tracking the edges and half the time drawing horizontal lines from one side of the polygon to the other, making the whole split approx 1/3 3D calculations, 1/3 2D calculations 1/3 writing to screen memory.

So, even if I continue optimising my code and get it running twice as fast, I still won't be anywhere near my goal.

Does anyone know what tricks games like Starglider 2 did back in the day to get such a high framerate?
Attached Thumbnails
Click image for larger version

Name:	001.png
Views:	240
Size:	66.7 KB
ID:	59169  

Last edited by deimos; 09 August 2018 at 18:11.
deimos is offline  
Old 09 August 2018, 18:04   #2
AChristian
Registered User

 
Join Date: Aug 2018
Location: england
Posts: 23
how are you doing the calculations that involve sin cos and pi?
using lookup tables?
i was chatting with the programmer of carrier command on youtube years ago, he mentioned something about doing very imprecise but fast calculations involving pi.
i told him he should write a book or something but dont think he did.
AChristian is offline  
Old 09 August 2018, 18:07   #3
deimos
Registered User

 
Join Date: Jul 2018
Location: Londonish / UK
Posts: 126
Quote:
Originally Posted by AChristian View Post
how are you doing the calculations that involve sin cos and pi?
using lookup tables?
Yes, lookup tables:

Code:
WORD SinTable_14[2048] = {
    0x0000, 0x0032, 0x0064, 0x0096, 0x00c9, 0x00fb, 0x012d, 0x015f, 
    0x0192, 0x01c4, 0x01f6, 0x0228, 0x025b, 0x028d, 0x02bf, 0x02f1, 
    0x0323, 0x0356, 0x0388, 0x03ba, 0x03ec, 0x041e, 0x0451, 0x0483, 
    0x04b5, 0x04e7, 0x0519, 0x054b, 0x057d, 0x05af, 0x05e1, 0x0613, 
    0x0645, 0x0677, 0x06a9, 0x06db, 0x070d, 0x073f, 0x0771, 0x07a3, 
    0x07d5, 0x0807, 0x0839, 0x086b, 0x089c, 0x08ce, 0x0900, 0x0932, ...
deimos is offline  
Old 09 August 2018, 18:40   #4
AChristian
Registered User

 
Join Date: Aug 2018
Location: england
Posts: 23
ok can't think of anything else

Did you create your 3d engine yourself or from a book?

There is a book for the zx spectrum on how to create a 3d engine, it was written in the 80's but is freely available online. I could direct you to that if you would like.

(because it could contain speed advice, even though its for a different processor)

I assume are you drawing the polygons using the amiga's ability to draw horizontal lines until it detects a black pixel?
AChristian is offline  
Old 09 August 2018, 19:32   #5
deimos
Registered User

 
Join Date: Jul 2018
Location: Londonish / UK
Posts: 126
Quote:
Originally Posted by AChristian View Post
are you drawing the polygons using the amiga's ability to draw horizontal
I have tested with the Blitter's area fill and settled on the non-Blitter polygon fill that I'm currently using. The Blitter did make things slightly faster (2.2fps vs 1.8fps), but adds complexity, and you need to clip, which you don't with a scanline fill.

My numbers (1/3 1/3 1/3) tell me that even if I could make drawing to the screen free, I'd still only get 5fps. That's why I think the Starglider guys might have had some tricks.

And besides, Starglider 2 was available for the ST, so the Blitter can't be the secret.
deimos is offline  
Old 09 August 2018, 20:02   #6
AChristian
Registered User

 
Join Date: Aug 2018
Location: england
Posts: 23
interesting, thanks I wish I was in contact with this carrier command bloke still. I guess he could actually help you out
AChristian is offline  
Old 09 August 2018, 20:09   #7
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 745
I'm not an expert in 3D gaphics, but there are a few things these games did to speed things up I do know:

1) almost all of them ran in low colour modes & resolutions.
2) almost all of them didn't run full screen.
3) a number of them used the blitter to clear the screen while doing the calculations for the next frame to draw.
4) the ones that did use the blitter (as I understand it) made a 1BPL version of any filled polygon in a seperate buffer first (using blitter line/blitter fill) and then copied it over to whatever bitplanes where needed.

And 5) they tended to draw simple shapes only (no nice looking spheres for instance), meaning less math was needed for drawing the polygons and each of the polygons was a simple shape.

Edit: I'm also pretty sure that any game showing circles just used 2D circles instead of 3D representations.

However, there is probably more going on than just this
roondar is offline  
Old 09 August 2018, 20:31   #8
grond
Registered User

 
Join Date: Jun 2015
Location: Germany
Posts: 537
MULs and DIVs can be replaced by ADDs and SUBs if you do log() and exp() on them via LUTs. With the relatively low precision required for low screen resolutions and low polygon objects, the tables will be reasonably small.
grond is offline  
Old 09 August 2018, 20:33   #9
Amigajay
醉拳
Amigajay's Avatar
 
Join Date: Jan 2010
Location: >
Posts: 2,037
Yes only half the screen is used for the playing area for starters, then less than half of that is the actual chequered grid, plus if you look when you fly up high the draw distance is very close.
Coupled with the 3D objects pop up very close so they arent drawn in from the horizon like some fly simulators.

I guess more tricks are used here than flight sims to fool the eye the 3D world is larger than it actually is.
Amigajay is offline  
Old 09 August 2018, 20:44   #10
deimos
Registered User

 
Join Date: Jul 2018
Location: Londonish / UK
Posts: 126
Thanks for your answer. I think these are all valid points, but I think I've already considered all of them.

1) SG2 looks like it's in 16 colour mode (I've tried counting colours on screen). I've tried dialling down the depth I'm using and I can see that the last 1/3 of my "pipeline" is directly proportional to the number of planes I have to update, which is as you would expect, but there's not enough savings here.
2) True, SG2 and others use half the screen for the view outside and half for the cockpit / dials, etc. But the object I'm drawing fits in that half, so right now there's no savings by cutting half the screen off.
3) Yes, I'm doing that - blitting the screen memory full of zeros takes less time than my calculations stage so is effectively free.
4) Yes, and if I go back to blitter filling I'll double check all this, but I'm not seeing huge benefits from the blitter at the moment. Bigger polygons might make the difference more significant. A blitter queue would be like magic, but like magic, they don't seem to exist in reality.
5) Yes, the ball is not realistic for a game object, but I've tried counting the polygons in a busy SG2 scene, and I think my ball has about the same, so I think it's probably a good test.

Quote:
Originally Posted by roondar View Post
I'm not an expert in 3D gaphics, but there are a few things these games did to speed things up I do know:

1) almost all of them ran in low colour modes & resolutions.
2) almost all of them didn't run full screen.
3) a number of them used the blitter to clear the screen while doing the calculations for the next frame to draw.
4) the ones that did use the blitter (as I understand it) made a 1BPL version of any filled polygon in a seperate buffer first (using blitter line/blitter fill) and then copied it over to whatever bitplanes where needed.

And 5) they tended to draw simple shapes only (no nice looking spheres for instance), meaning less math was needed for drawing the polygons and each of the polygons was a simple shape.

Edit: I'm also pretty sure that any game showing circles just used 2D circles instead of 3D representations.

However, there is probably more going on than just this
deimos is offline  
Old 09 August 2018, 20:50   #11
deimos
Registered User

 
Join Date: Jul 2018
Location: Londonish / UK
Posts: 126
I will look into this, though I've been able to see differences in the end result when I've changed the number of fractional bits used, so I'm wary of losing accuracy.

Quote:
Originally Posted by grond View Post
MULs and DIVs can be replaced by ADDs and SUBs if you do log() and exp() on them via LUTs. With the relatively low precision required for low screen resolutions and low polygon objects, the tables will be reasonably small.
deimos is offline  
Old 09 August 2018, 20:53   #12
deimos
Registered User

 
Join Date: Jul 2018
Location: Londonish / UK
Posts: 126
I know SG2's draw distance is small, and less than the (fake?) chequered grid, but I think I'm trying to process around the same the number of polygons as you'd get if you count all the enemies you see on a busy screen.

Quote:
Originally Posted by Amigajay View Post
Yes only half the screen is used for the playing area for starters, then less than half of that is the actual chequered grid, plus if you look when you fly up high the draw distance is very close.
Coupled with the 3D objects pop up very close so they arent drawn in from the horizon like some fly simulators.

I guess more tricks are used here than flight sims to fool the eye the 3D world is larger than it actually is.
deimos is offline  
Old 09 August 2018, 22:46   #13
jotd
This cat is no more
jotd's Avatar
 
Join Date: Dec 2004
Location: FRANCE
Age: 47
Posts: 2,861
one trick is the fast inverse square root function. Used in DOOM or Quake to compute some 3D stuff, probably not in Starglider. Precomputed tables/masks are probably the key.
jotd is offline  
Old 09 August 2018, 23:22   #14
coder76
Registered User

 
Join Date: Dec 2016
Location: Finland
Posts: 73
It isnt clear from your writings how you fill your polygons, but you should fill only 1 bitplane per surface, so if you have a 4 bitplane screen, then you have only 4 colors available (0001, 0010, 0100, 1000). Fast filled fullscreen polygons that run in 50 FPS on an A500 usually have only 3-4 colors and are done with blitter.
coder76 is offline  
Old 09 August 2018, 23:28   #15
chb
Registered User

 
Join Date: Dec 2014
Location: germany
Posts: 110
Quote:
Originally Posted by jotd View Post
one trick is the fast inverse square root function. Used in DOOM or Quake to compute some 3D stuff, probably not in Starglider. Precomputed tables/masks are probably the key.
No, that one relies on some properties of the binary representation of floating point numbers and makes only sense when there's a reasonably fast fpu present. On an 68000 it would be much slower than directly computing it with fixed point arithmetic, let alone using a table-based approach.
The Wikipedia article on this topic is actually quite good:
https://en.wikipedia.org/wiki/Fast_inverse_square_root

A possible way to speed up coordinate calculations is to exploit object symmetries, e.g. your sphere is highly symmetric, so that there's a lot less independent coordinate values than for a set of random points. This allows you to precompute some of the products in the matrix multiplication: If your coordinate vectors (x,y,z) are e.g. (a,a,a),(a,-a,a),(-a,-a,a),(-a,a,a),(a,a,-a),(a,-a,-a),(-a,-a,-a),(-a,a,-a) for a simple cube, you only have to precompute the product of every rotation matrix element with 'a' and carry out the additions and subtractions. So in that case only nine multiplications instead of 216 72 for random coordinates. For a generic 3D-engine you'd probably need a bit of clever coordinate transformation ping pong to make it work.

Last edited by chb; 10 August 2018 at 11:34.
chb is offline  
Old 10 August 2018, 00:11   #16
deimos
Registered User

 
Join Date: Jul 2018
Location: Londonish / UK
Posts: 126
I'm not sure I understand how to exploit symmetry when arbitrary rotations of the object are allowed. I guess the mirror YZ plane rotates through the same raw pitch roll as the points, but then what's the maths to reflect across this now arbitrary plane?

Quote:
Originally Posted by chb View Post
No, that one relies on some properties of the binary representation of floating point numbers and makes only sense when there's a reasonably fast fpu present. On an 68000 it would be much slower than directly computing it with fixed point arithmetic, let alone using a table-based approach.
The Wikipedia article on this topic is actually quite good:
https://en.wikipedia.org/wiki/Fast_inverse_square_root

A possible way to speed up coordinate calculations is to exploit object symmetries, e.g. your sphere is highly symmetric, so that there's a lot less independent coordinate values than for a set of random points. This allows you to precompute some of the products in the matrix multiplication: If your coordinate vectors (x,y,z) are e.g. (a,a,a),(a,-a,a),(-a,-a,a),(-a,a,a),(a,a,-a),(a,-a,-a),(-a,-a,-a),(-a,a,-a) for a simple cube, you only have to precompute the product of every rotation matrix element with 'a' and carry out the additions and subtractions. So in that case only nine multiplications instead of 216 for random coordinates. For a generic 3D-engine you'd probably need a bit of clever coordinate transformation ping pong to make it work.
deimos is offline  
Old 10 August 2018, 00:13   #17
deimos
Registered User

 
Join Date: Jul 2018
Location: Londonish / UK
Posts: 126
Sure, but I'm after something more general than that. A game, not a demo.

Quote:
Originally Posted by coder76 View Post
It isnt clear from your writings how you fill your polygons, but you should fill only 1 bitplane per surface, so if you have a 4 bitplane screen, then you have only 4 colors available (0001, 0010, 0100, 1000). Fast filled fullscreen polygons that run in 50 FPS on an A500 usually have only 3-4 colors and are done with blitter.
deimos is offline  
Old 10 August 2018, 00:57   #18
chb
Registered User

 
Join Date: Dec 2014
Location: germany
Posts: 110
Quote:
Originally Posted by deimos View Post
I'm not sure I understand how to exploit symmetry when arbitrary rotations of the object are allowed. I guess the mirror YZ plane rotates through the same raw pitch roll as the points, but then what's the maths to reflect across this now arbitrary plane?
My idea was mostly about reducing the number of values the coordinates can take. I was assuming you are working always with the original coordinates, or do you rotate the rotated coordinates again, instead of computing only new rotation matrices? In that case this trick does not work, of course. But I do not see a need to do so, as you can always multiply your old rotation matrix with a new one to get the same effect?

Matrix multiplications are distributive, so R (p1 +p2) = R p1 + R p2 and vice versa (R rotation matrix). So have your object points in a coordinate space where it has maximal symmetry, and which can be transformed to your world system by a translation (addition of the vector to the origin): p_world = p_object + p_origin. To rotate your object in the world system, p'_world = R p_world = R (p_object + p_origin) = R p_object + R p_origin. The last term is constant for all points of your object, so needs to be calculated only once for every rotation matrix.

Having your objects in separate coordinate spaces has also other benefits, like quick translation (change only the origin vector), and you might get higher precision/faster math: if you have a large world where 2.14 is not sufficient, do e.g. the rotation in the object system in 2.14 precision or whatever you like and only the rotation of the translation vectors and the translation in 18.14.
chb is offline  
Old 10 August 2018, 09:12   #19
grond
Registered User

 
Join Date: Jun 2015
Location: Germany
Posts: 537
If you really want to do vector multiplications, you'll run out of registers. Hence, use MULs with constants for the camera vectors (like MULS.w #cam,Dn) and replace the dummy value with your updated camera values for each frame (self-modifying code). In this way you can do point projection without having to juggle with registers.
grond is offline  
Old 10 August 2018, 10:21   #20
deimos
Registered User

 
Join Date: Jul 2018
Location: Londonish / UK
Posts: 126
I always start with original coordinates in object space. I fill in my rotation matrix and apply this to my coordinates and add the translation for the object's position to get coordinates in world space. The camera is fixed at 0,0,0 for now, otherwise I'd move to 4x4 matrices and combine the move to camera coordinates into one matrix.

I can see advantages in modelling my objects by taking advantage of their symmetry, they'll be easier to design by hand that way, but unless I'm missing something, I don't think I can use it to reduce the computations I need to do.

Quote:
Originally Posted by chb View Post
My idea was mostly about reducing the number of values the coordinates can take. I was assuming you are working always with the original coordinates, or do you rotate the rotated coordinates again, instead of computing only new rotation matrices? In that case this trick does not work, of course. But I do not see a need to do so, as you can always multiply your old rotation matrix with a new one to get the same effect?

Matrix multiplications are distributive, so R (p1 +p2) = R p1 + R p2 and vice versa (R rotation matrix). So have your object points in a coordinate space where it has maximal symmetry, and which can be transformed to your world system by a translation (addition of the vector to the origin): p_world = p_object + p_origin. To rotate your object in the world system, p'_world = R p_world = R (p_object + p_origin) = R p_object + R p_origin. The last term is constant for all points of your object, so needs to be calculated only once for every rotation matrix.

Having your objects in separate coordinate spaces has also other benefits, like quick translation (change only the origin vector), and you might get higher precision/faster math: if you have a large world where 2.14 is not sufficient, do e.g. the rotation in the object system in 2.14 precision or whatever you like and only the rotation of the translation vectors and the translation in 18.14.
deimos is offline  
 


Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)
 
Thread Tools

Similar Threads
Thread Thread Starter Forum Replies Last Post
Games that are Full Frame Rate or Slower - Limitations or Choice? Foebane Retrogaming General Discussion 35 08 April 2018 14:22
F1 grand prix frame rate universale support.Games 18 13 July 2015 22:45
The First Person Shooter frame rate tolerance poll... DDNI Retrogaming General Discussion 41 30 June 2011 04:32
Vsync Fullscreen and Double Buffer, incorrect frame rate? rsn8887 support.WinUAE 1 07 April 2011 21:43
Propper speed request when recording with "Disable frame rate" turned on. Ironclaw request.UAE Wishlist 9 02 August 2006 08:21

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT +2. The time now is 18:27.


Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2019, vBulletin Solutions Inc.
Page generated in 0.12629 seconds with 14 queries