Amiga Games I'm willing to fund the development of - Page 5

VladR · 22 June 2022, 17:56

Quote:

Originally Posted by a/b

Then I would suggest that you also record a bitmap ptr for each star as you draw them (maybe overwrite x/y to save space), so the clearing is then simply: read a bitmap ptr, set whole byte to 0 for each bitplane.

Yes, exactly. I was thinking of having just an array of word offsets (updated during drawing the pixels, once the offset is computed) for each star, then adding it to a1 and just doing 6x clr.b.
That's 5*16c + 12c = 92c (plus tiny loop overhead). The threshold where it would be faster to use the Blitter to clear all 6 BPLs is probably unreasonably high anyway (64,440c / ~120c = ~537 pixels). Cutscenes with generic 3D meshes would still need the Blitter codepath, though...

Code:

	clr.b (-$4000,a1)
	clr.b (-$2000,a1)
	clr.b (a1)
	clr.b ($2000,a1)
	clr.b ($4000,a1)
	clr.b ($6000,a1)
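
For anyone following along, here is a rough C model of the record-then-clear idea above. It is only a sketch, not the actual 68000 code: the names (`screen`, `star_offset`, `plot_star`, `erase_stars`), the 40-byte row width and the `MAX_STARS` capacity are all assumptions made for illustration. The plane stride of $2000 and the base pointer sitting in the third plane are taken from the clr.b offsets, and the cycle figures are the ones quoted in the post.

```c
#include <stdint.h>

#define PLANE_STRIDE 0x2000   /* distance between bitplanes, from the clr.b offsets */
#define NUM_PLANES   6
#define MAX_STARS    256      /* hypothetical capacity */

static uint8_t  screen[NUM_PLANES * PLANE_STRIDE];
static uint16_t star_offset[MAX_STARS];   /* one word offset per drawn star */
static int      star_count;

/* Draw one star and remember the byte offset that was computed for it. */
static void plot_star(int x, int y, int bytes_per_row, uint8_t colour)
{
    uint16_t offset = (uint16_t)(y * bytes_per_row + (x >> 3));
    uint8_t  mask   = (uint8_t)(0x80 >> (x & 7));

    for (int p = 0; p < NUM_PLANES; p++)
        if (colour & (1 << p))
            screen[p * PLANE_STRIDE + offset] |= mask;

    star_offset[star_count++] = offset;
}

/* Erase every recorded star by zeroing the whole byte in all six planes. */
static void erase_stars(void)
{
    for (int i = 0; i < star_count; i++) {
        /* a1 points into the third plane, as the -$4000..$6000 offsets imply */
        uint8_t *a1 = screen + 2 * PLANE_STRIDE + star_offset[i];
        a1[-0x4000] = 0;
        a1[-0x2000] = 0;
        a1[0]       = 0;
        a1[0x2000]  = 0;
        a1[0x4000]  = 0;
        a1[0x6000]  = 0;
    }
    star_count = 0;
}

/* Cycle estimate from the post: five clr.b with displacement, one without. */
enum { CLR_B_DISP_CYCLES = 16, CLR_B_IND_CYCLES = 12 };

static int erase_cycles_per_star(void)
{
    return 5 * CLR_B_DISP_CYCLES + CLR_B_IND_CYCLES;
}
```

Clearing the whole byte wipes up to 8 neighbouring pixels as well, which is harmless here only because everything on screen is redrawn from the star array each frame.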

VladR · 03 July 2022, 16:34

Well, I finally implemented a run-time switch between 2BPL, 4BPL and 6BPL and can switch between all 3 at any point, each calling its own separate set of routines (thus, every method is duplicated 3 times, but that's alright).

I also realized that always clearing all 6 (or 4 or 2) bitplanes is quite wasteful, since about 50% of pixels only use 3 (or 2 or 1), but the alternative would be to have 64+16+4 unrolled versions (via jumptable), so that will have to wait for final benchmarks.
Still faster than full-screen clear, for sure...

I have a separate switch to choose between the full-screen ClearScreen and erasing individual stars, both working from the array (not like before, where the stars were rendered upon transformation without any storage whatsoever).
This approach does allow for a few additional effects to be done on the stars, as they can simply be traversed in the array, so the cost of the additional RAM access (write+read+loop) is worth it.

Unfortunately, there's a complication from the 3D perspective - depending on how you rotate the view, some stars end up sharing the same screen-space position.

So, I still have to clear the target bits when drawing the pixel (via AND), otherwise the overdraw will [obviously] result in glitch colors for those pixels. Of course, the more stars there are, the worse it is (the higher chance of it happening).
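
To make the glitch concrete, here is a small C model of a single 8-pixel cell across 6 bitplanes. It is a sketch under assumptions: `cell`, `put_pixel`, `get_pixel` and the `clear_first` flag are hypothetical names, and the real routine is of course 68000 assembly. With the AND step the last star drawn wins; without it the colour bits of both stars get ORed together.

```c
#include <stdint.h>

#define NUM_PLANES 6

/* One byte per plane is enough to model a single 8-pixel cell. */
static uint8_t cell[NUM_PLANES];

/* Plot a pixel; clear_first selects whether the target bit is ANDed out first. */
static void put_pixel(int bit, uint8_t colour, int clear_first)
{
    uint8_t mask = (uint8_t)(0x80 >> bit);

    for (int p = 0; p < NUM_PLANES; p++) {
        if (clear_first)
            cell[p] &= (uint8_t)~mask;    /* the AND step */
        if (colour & (1 << p))
            cell[p] |= mask;              /* set this plane's bit for the colour */
    }
}

/* Read back the colour index of a pixel by gathering one bit per plane. */
static uint8_t get_pixel(int bit)
{
    uint8_t mask = (uint8_t)(0x80 >> bit);
    uint8_t colour = 0;

    for (int p = 0; p < NUM_PLANES; p++)
        if (cell[p] & mask)
            colour |= (uint8_t)(1 << p);
    return colour;
}
```

For example, drawing colour 5 and then colour 2 on the same pixel without the AND reads back as colour 7, while with the AND it reads back as 2.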

I briefly checked the cost of comparing each newly inserted (xp,yp) pair against the array to avoid duplicates, but that's prohibitively expensive on a 7 MHz 68000 - I definitely lose more cycles than I gain by not doing the AND.

The best alternative I can think of right now would be to have all 64 codepaths unrolled and then I could get rid of the conditions during drawing and simply set all target bits to either 0 or 1 (hence solving the problem and saving additional cycles since I first clear the bit to 0, no matter what, only to reset it to 1 later).
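
As a scaled-down illustration of that idea, here is a C sketch for the 2BPL case only, so 4 routines instead of 64. All names (`plot_colour0` to `plot_colour3`, `plot_table`, `put_pixel`) are made up for the example, and the function-pointer table merely stands in for a 68000 jumptable. Each routine writes every plane unconditionally, either clearing or setting the bit, so there are no per-plane conditions and no separate AND pass.

```c
#include <stdint.h>

/* One byte per plane: a single 8-pixel cell in a 2-bitplane screen. */
static uint8_t plane0, plane1;

typedef void (*plot_fn)(uint8_t mask);

/* One routine per colour; every plane is written, either cleared or set. */
static void plot_colour0(uint8_t m) { plane0 &= (uint8_t)~m; plane1 &= (uint8_t)~m; }
static void plot_colour1(uint8_t m) { plane0 |= m;           plane1 &= (uint8_t)~m; }
static void plot_colour2(uint8_t m) { plane0 &= (uint8_t)~m; plane1 |= m;           }
static void plot_colour3(uint8_t m) { plane0 |= m;           plane1 |= m;           }

/* Stand-in for the jumptable: index by colour, call the matching routine. */
static const plot_fn plot_table[4] = {
    plot_colour0, plot_colour1, plot_colour2, plot_colour3
};

static void put_pixel(int bit, unsigned colour)
{
    plot_table[colour & 3]((uint8_t)(0x80 >> bit));
}
```

Overdraw then simply leaves the colour of the last star plotted, since each routine fully overwrites the target bit in both planes.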

Of course, there are ways to sort upon insertion, thus not running in O(N^2), but I doubt those would result in significant savings compared to the cost of doing the AND. At least it doesn't seem like it's worth the effort.

Hopefully I will have some benchmark within a week...

paraj · 03 July 2022, 18:54

As usual there are lots of different ways to approach it, but it seems like it would be easiest to just unconditionally store the byte offset (or address) into the first bitplane when you "PutPixel". Later on you can run through the list and clear whole bytes in all the planes. Since overdraw doesn't seem to be a huge concern, just ignore it.

hitm4n · 03 July 2022, 18:57

Has anyone started coding a chess game or platformer yet?

