Star Raiders port to Amiga? - Page 4

VladR · 22 August 2020, 08:08

Quote:

Originally Posted by roondar

To illustrate the effect of this, here's a simple calculation:

Code:

Total DMA slots/cycles available (NTSC): 59212 (226 slots * 262 lines)
320x200x4 DMA slots used: 16000 ((320/16)*200*4)
640x200x4 DMA slots used: 32000 ((640/16)*200*4)

           CPU speed (full frame/NTSC)
320x200x4: ~100% (due to interleaving)
640x200x4:  ~46% (1 - 32000/59212)

Now there are obviously some caveats here because 3D calculations use a lot of multiplications which have a lot of idle cycles on the bus, but the majority of each rasterline where the bitplane DMA is being done will still be unusable by the CPU so as a general rule of thumb the above figures should be close enough.

I'm trying to compare the costs of these 2 approaches:

Code:

1. Computing the 3D transformation
    - Advantage: Can use the DMA lock-out during scanline productively
    - Advantage: Zero RAM consumption for large LUTs
    - Disadvantage: Much slower than look up
2. Using look-up tables
    - Advantage: much faster
    - Disadvantage: needs a lot of DMA access (which we don't have)
    - Disadvantage: needs a lot of RAM

I want to fill registers and do computation in parallel with DMA lock-out and only write to RAM roughly in sync with beam.

But I don't know how many DMA cycles there are during HBlank. 8-12, perhaps ?

Here's a quick computation of cycles available on target NTSC system:

Code:

262.5 scanlines per frame
  7.15909 MHz
 59.94 Hz
 20 scanlines for VBLANK

7,159,090 / 59.94 = 119,437 cycles of raw CPU time per frame
119,437 / 262.5 = 455 cycles per scanline (including vblank ones without DMA lock-out)

In 368 cycles I can do 3D transform of one star into on-screen position.
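For anyone who wants to re-run these numbers, here is a minimal Python sketch of the same arithmetic. The constants are simply the figures quoted in this post; the function names are made up for illustration and none of this has been measured on hardware.

```python
# NTSC figures as quoted in the post above (assumptions, not measurements).
CPU_HZ = 7_159_090          # 7.15909 MHz 68000 clock
FRAME_HZ = 59.94            # frames per second
LINES_PER_FRAME = 262.5     # scanlines per frame
TRANSFORM_CYCLES = 368      # position-only 3D transform of one star


def cycles_per_frame(cpu_hz=CPU_HZ, frame_hz=FRAME_HZ):
    """Raw CPU cycles in one video frame."""
    return cpu_hz / frame_hz


def cycles_per_scanline(lines=LINES_PER_FRAME):
    """Raw CPU cycles in one scanline, ignoring any DMA lock-out."""
    return cycles_per_frame() / lines


def scanlines_per_star(cycles=TRANSFORM_CYCLES):
    """How many scanlines one star transform spans."""
    return cycles / cycles_per_scanline()


if __name__ == "__main__":
    print(int(cycles_per_frame()))           # cycles per frame
    print(round(cycles_per_scanline()))      # cycles per scanline
    print(round(scanlines_per_star(), 2))    # fraction of a scanline per star
```

So a position-only transform fits inside one scanline with some room to spare, before any DMA lock-out is taken into account.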

I will need DMA to write the result (two words) into RAM via

Code:

move.w d0,(a0)+
move.w d1,(a0)+

This should still happen just fine within the HBlank. But since I don't know how many DMA cycles there are, I don't know if I will have enough DMA slots to read next vertex:

Code:

move.w (a1)+,d0
move.w (a1)+,d1
move.w (a1)+,d2

Of course, adding rotation and color computation will increase the cost per star to at least 3 scanlines (very rough estimate, could be much more) - but it's still very important to know just how many DMA slots I will have during HBlank to properly interleave DMA with HiRes@4bpl.

roondar · 22 August 2020, 12:08

Quote:

Originally Posted by VladR

But I don't know how many DMA cycles there are during HBlank. 8-12, perhaps ?

Here's a quick computation of cycles available on target NTSC system:

Code:

262.5 scanlines per frame
  7.15909 MHz
 59.94 Hz
 20 scanlines for VBLANK

7,159,090 / 59.94 = 119,437 cycles of raw CPU time per frame
 119,437 / 262.5 = 455 cycles per scanline (including vblank ones without DMA lock-out)

In 368 cycles I can do 3D transform of one star into on-screen position.

The 68000 effectively gets 454 CPU cycles per scanline if no DMA is on the bus (or if there is DMA that can be interleaved with), so that's pretty accurate. Note that this is equal to 226 DMA cycles, as they're counted at 3.5MHz (i.e. 2 CPU cycles = 1 DMA cycle).

However, when we're talking about HBLANK on the Amiga it's often not really important to consider the real HBLANK ("the interval during which the beam resets for the next scanline"), but rather it's usually more important to consider the interval during which no bitplane DMA is active on any given scanline where there is bitplane DMA. This is a much larger timeslice than the HBLANK itself and represents the amount of time you have available between two scanlines to do stuff that can't run when bitplane DMA is running.

With that in mind: the number of DMA cycles available between scanlines is 226-number of cycles used by bitplane DMA*. For the given screenmode of 640x200x4 we know that the number of DMA cycles used per scanline is equal to (640/16)*4, or 160 DMA slots. This leaves us 66 DMA cycles per scanline, which translates to 132 CPU cycles available per scanline.

*) Note: I'm assuming no other DMA is present on the bus to simplify calculations. If the Blitter is active, or Copper+any other DMA is active during a particular scanline it gets a bit more complicated. I'm also ignoring memory refresh as that interleaves with the CPU.
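The per-scanline arithmetic above can be sketched in a few lines of Python. This is only a rough model under the same simplifying assumptions as the footnote (no Blitter, Copper or other DMA), and it only makes sense for modes such as 640x200x4 where bitplane DMA occupies every slot while it is fetching. The function names are invented for the example.

```python
DMA_SLOTS_PER_LINE = 226     # slots usable by chipset DMA per scanline
CPU_CYCLES_PER_SLOT = 2      # DMA slots are counted at half the CPU clock


def bitplane_slots_per_line(width, planes):
    """Bitplane DMA slots per scanline: one 16-bit fetch per 16 pixels per plane."""
    return (width // 16) * planes


def free_cpu_cycles_per_line(width, planes):
    """CPU cycles left on a display line once bitplane DMA has taken its slots."""
    free_slots = DMA_SLOTS_PER_LINE - bitplane_slots_per_line(width, planes)
    return free_slots * CPU_CYCLES_PER_SLOT


def free_share_of_frame(width, planes, display_lines=200, total_lines=262):
    """Fraction of all DMA slots in a frame that bitplane DMA leaves untouched."""
    used = bitplane_slots_per_line(width, planes) * display_lines
    return 1 - used / (DMA_SLOTS_PER_LINE * total_lines)


if __name__ == "__main__":
    print(bitplane_slots_per_line(640, 4))     # 160 slots
    print(free_cpu_cycles_per_line(640, 4))    # 132 CPU cycles
```

Over a whole frame this leaves a bit under half of the slots untouched for 640x200x4.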

VladR · 22 August 2020, 18:18

Thanks for the info, roondar - it's very helpful

Quote:

Originally Posted by roondar

I'm also ignoring memory refresh as that interleaves with the CPU.

I was wondering why you didn't account for that, as the illustration depicts it at the start of the scanline, followed by bitplane DMA. I guess I missed that it's interleaved with the CPU. That's great news!

Quote:

Originally Posted by roondar

*) Note: I'm assuming no other DMA is present on the bus to simplify calculations. If the Blitter is active, or Copper+any other DMA is active during a particular scanline it gets a bit more complicated.

Not sure about Blitter or Copper now, but for sure we will have sprites. Easily 4-8.

How many DMA slots does a sprite scanline take ? Just 1 ? It's just 2 words per scanline, so I presume HW can read both of them in one DMA cycle ?

Quote:

Originally Posted by roondar

The 68000 effectively gets 454 CPU cycles per scanline if no DMA is on the bus (or if there is DMA that can be interleaved with), so that's pretty accurate. Note that this is equal to 226 DMA cycles, as they're counted at 3.5MHz (i.e. 2 CPU cycles = 1 DMA cycle).

However, when we're talking about HBLANK on the Amiga it's often not really important to consider the real HBLANK ("the interval during which the beam resets for the next scanline"), but rather it's usually more important to consider the interval during which no bitplane DMA is active on any given scanline where there is bitplane DMA. This is a much larger timeslice than the HBLANK itself and represents the amount of time you have available between two scanlines to do stuff that can't run when bitplane DMA is running.

With that in mind: the number of DMA cycles available between scanlines is 226-number of cycles used by bitplane DMA*. For the given screenmode of 640x200x4 we know that the number of DMA cycles used per scanline is equal to (640/16)*4, or 160 DMA slots. This leaves us 66 DMA cycles per scanline, which translates to 132 CPU cycles available per scanline.

So, 132c / scanline. For 200 scanlines, that's 200 * 132 = 26,400c
The remaining 62.5 scanlines (262.5 - 200:Visible) I get full 454c / scanline, right ? Thus, 454*62.5 = 28,375. Correct ?
Here's my current summary of cycles available per frame:

Code:

Cycle Budget

Scanlines  Cycles/line  Total    Description
----------------------------------------------------------------
  200         132       26,400   DMA On:  CPU can reach RAM (outside bitplane fetch)
  200         322       64,400   DMA Off: CPU locked out of RAM (bitplane DMA)
  62.5        454       28,375   DMA On:  CPU can reach RAM (VBLANK)

54,775 : cycles with RAM access (26,400 + 28,375)
64,400 : cycles without RAM access

Because we want 30 fps lock, we have two frames, so total scene cycle budget is double (2*54,775 ; 2*64,400). But, that must also account for CPU spikes to avoid framedrops...

In reality, I will have to benchmark [in scanlines], because the moment I need RAM, I will burn through the rest of the current scanline (up to 322c!) until I reach the HBLANK area. So, the budget will go fast, but at least I have a good idea of the boundaries involved.
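Here is the same budget as a small Python sketch, in case the numbers need to be redone for another screen mode. It uses the figures from this post as they stand (132 free cycles per display line, 454 cycles per line in total); the names are hypothetical.

```python
DISPLAY_LINES = 200
TOTAL_LINES = 262.5
CPU_CYCLES_PER_LINE = 454          # full scanline, no lock-out
FREE_CYCLES_PER_DISPLAY_LINE = 132 # outside the bitplane fetch


def frame_budget(frames=1):
    """Split the CPU cycles of `frames` video frames into bus-free and locked-out."""
    free_display = DISPLAY_LINES * FREE_CYCLES_PER_DISPLAY_LINE
    locked = DISPLAY_LINES * (CPU_CYCLES_PER_LINE - FREE_CYCLES_PER_DISPLAY_LINE)
    outside = (TOTAL_LINES - DISPLAY_LINES) * CPU_CYCLES_PER_LINE
    return {
        "bus_free": frames * (free_display + outside),
        "bus_locked": frames * locked,
    }


if __name__ == "__main__":
    print(frame_budget())          # one 60 Hz frame
    print(frame_budget(frames=2))  # one game frame at a 30 fps lock
```

Calling `frame_budget(frames=2)` gives the doubled budget for a 30 fps lock.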

roondar · 22 August 2020, 18:48

Quote:

Originally Posted by VladR

Thanks for the info, roondar - it's very helpful

You're welcome

Quote:

I was wondering why you didn't account for that, as the illustration depict it at the start of scanline, followed by bitplane DMA. I guess I missed that it's interleaved with CPU. That's great news!

Yeah, the Amiga designers really thought this through. There's a lot of this interleaving "trickery" going on. Speaking of which...

Quote:

Not sure about Blitter or Copper now, but for sure we will have sprites. Easily 4-8.

How many DMA slots does a sprite scanline take ? Just 1 ? It's just 2 words per scanline, so I presume HW can read both of them in one DMA cycle ?

The Sprites take 2 DMA slots per sprite per scanline. Amiga DMA on OCS machines never reads more than 16 bits per access (on AGA both bitplanes and sprites can do more than 16 bits per access, but that's a whole different topic).

However, the CPU is normally not impacted by Sprite DMA because it interleaves with Sprite, Audio and Disk DMA. So effectively no cycles are lost from the perspective of the CPU. Again, I'm assuming no Blitter/Copper is on the bus as that can and often will cause cycles to be stolen from the CPU.

Quote:

So, 132c / scanline. For 200 scanlines, that's 200 * 132 = 26,400c
The remaining 62.5 scanlines (262.5 - 200:Visible) I get full 454c / scanline, right ? Thus, 454*62.5 = 28,375. Correct ?
Here's my current summary of cycles available per frame:

Code:

Cycle Budget

Scanlines  Cycles/line  Total    Description
----------------------------------------------------------------
  200         132       26,400   DMA On:  CPU can reach RAM (outside bitplane fetch)
  200         322       64,400   DMA Off: CPU locked out of RAM (bitplane DMA)
  62.5        454       28,375   DMA On:  CPU can reach RAM (VBLANK)

54,775 : cycles with RAM access (26,400 + 28,375)
64,400 : cycles without RAM access

Because we want 30 fps lock, we have two frames, so total scene cycle budget is double (2*54,775 ; 2*64,400). But, that must also account for CPU spikes to avoid framedrops...

In reality, I will have to benchmark [in scanlines], because the moment I need RAM, I will burn through the rest of the current scanline (up to 322c!) until I reach the HBLANK area. So, the budget will go fast, but at least I have a good idea of the boundaries involved.

Yes, you get the full 454 CPU cycles per scanline during the VBLANK area (though VBLANK isn't really the correct term here either, as the 62 scanlines includes the border area outside of the 200 pixel high window).

About your cycle budget, the numbers are more or less accurate. Note however that the bitplane DMA does not take 322 cycles per scanline, but rather 320. The 2 cycles you're now missing don't get to be used by the chipset (the HRM points out that out of the 227.5 cycles per scanline only 226 can be used by chipset DMA, the rest are not available). The CPU however keeps running during those "missing" cycles, leading to it getting 454 cycles per scanline effectively.

I'm sure it's possible I didn't quite get all the details right in the above paragraph as to why the CPU gets 2 more cycles per scanline than the chipset does, but it's close enough for what you're doing here

VladR · 22 August 2020, 20:06

Quote:

Originally Posted by roondar

Yeah, the Amiga designers really thought this through. There's a lot of this interleaving "trickery" going on. Speaking of which...

Yeah, I just noticed that HRM mentions the following:

Quote:

If a device (disk/audio/sprite/bitplane) does not request one of its allocated time slots, the slot is open for other uses.

So, since I won't be using disk during frame, I should have 3 additional DMA cycles available per scanline, right ? Meaning, if there is no sprite, 68000 will get those 3 additional DMA slots (assuming no other higher-priority device wants access) ?

Quote:

Originally Posted by roondar

The Sprites take 2 DMA slots per sprite per scanline. Amiga DMA on OCS machines never reads more than 16 bits per access

Re-reading HRM, I noticed the table:

Code:

 4 slots: Memory Refresh at odd-numbered slots - this is the hint for interleaving :)
 3 slots: Disk DMA
 4 slots: Audio DMA
16 slots: Sprite DMA
80 slots: BitPlane DMA

So, even at HiRes:4bpl, looks like I still get 16 Sprite slots, meaning I can still have 8 sprites at any scanline ?
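As a quick sanity check of that reading, here is a tiny Python sketch. It takes the slot counts from the table above and the "2 DMA slots per sprite per scanline" figure from roondar's post; the dictionary and function names are only for illustration.

```python
# Fixed per-scanline slot allocations as listed in the table above.
FIXED_SLOTS = {"refresh": 4, "disk": 3, "audio": 4, "sprite": 16}
SLOTS_PER_SPRITE = 2   # two 16-bit words per sprite per scanline


def max_sprites_per_line():
    """Sprites that fit into the sprite slots of one scanline."""
    return FIXED_SLOTS["sprite"] // SLOTS_PER_SPRITE


def fixed_slots_per_line():
    """Slots reserved for refresh, disk, audio and sprites together."""
    return sum(FIXED_SLOTS.values())


if __name__ == "__main__":
    print(max_sprites_per_line())   # sprites per scanline
    print(fixed_slots_per_line())   # reserved non-bitplane slots
```

This only counts slots; whether all of them stay available with hardware scrolling or overscan is a separate question.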

Quote:

Originally Posted by roondar

About your cycle budget, the numbers are more or less accurate. Note however that the bitplane DMA does not take 322 cycles per scanline, but rather 320. The 2 cycles you're now missing don't get to be used by the chipset (the HRM points out that out of the 227.5 cycles per scanline only 226 can be used by chipset DMA, the rest are not available). The CPU however keeps running during those "missing" cycles, leading to it getting 454 cycles per scanline effectively.

Thanks, I will adjust it to 320 in my table now.

Quote:

Originally Posted by roondar

I'm sure it's possible I didn't quite get all the details right in the above paragraph as to why the CPU gets 2 more cycles per scanline than the chipset does, but it's close enough for what you're doing here

Yeah, I was looking for a ballpark figure, and this is definitely way more precise than a ballpark

Not to mention that it will be a feat if I even get an average of over 50% of CPU cycles during DMA lock-out

Besides, since I want a 30fps lock, I will have to reserve a CPU Spike buffer of at least 20-25% of a frame time - e.g. somewhere around ~40,000c, which will be burned by WaitTOF () to make sure the game is always smooth.
I could probably accept a very occasional framedrop, like one framedrop in 5-10 seconds, but no more than that.

But, the engine absolutely must handle 30 fps with :
- HUD
- Audio
- stars
- old enemy explosion
- player lasershots
- new enemy
- new enemy lasershots
- 3 other enemies in sector processing their AI (finite state machine)

If there's a second explosion in parallel with first one, it will eat into the Spike Buffer. Now, the Galaxy update cost will depend on how many squadrons are in the system, so it will take most of time at the start of the game. Perhaps that component will have to be spread across multiple frames (like, just one squadron processed per game frame) as updating 30 squadrons each frame would certainly kill a lot of cycles...
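The "one squadron per game frame" idea could look roughly like the Python sketch below. It is only a scheduling illustration: `squadrons_for_frame` and its parameters are hypothetical names, and the real engine would of course do this in 68000 assembly.

```python
def squadrons_for_frame(frame, squadrons, per_frame=1):
    """Pick which squadrons to update on a given game frame (round-robin).

    With per_frame=1 and 30 squadrons, every squadron is updated once
    every 30 game frames instead of all 30 being updated each frame.
    """
    count = len(squadrons)
    if count == 0:
        return []
    start = (frame * per_frame) % count
    return [squadrons[(start + i) % count] for i in range(min(per_frame, count))]


if __name__ == "__main__":
    galaxy = list(range(30))                 # 30 squadron ids
    for frame in (0, 1, 29, 30):
        print(frame, squadrons_for_frame(frame, galaxy))
```

The trade-off is latency: at a 30 fps lock with 30 squadrons, each squadron would only move once per second, so `per_frame` may need tuning.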

roondar · 22 August 2020, 20:19

Quote:

Originally Posted by VladR

Yeah, I just noticed that HRM mentions the following:
So, since I won't be using disk during frame, I should have 3 additional DMA cycles available per scanline, right ? Meaning, if there is no sprite, 68000 will get those 3 additional DMA slots (assuming no other higher-priority device wants access) ?

Well... Yes and no.

Yes, those cycles are now technically available to the CPU
And no, the CPU could already interleave with disk or sprite DMA in such a way that no cycles appear to be lost (i.e. the disk DMA cycles slot into the cycles where the CPU is not on the bus in the first place).

Sorry to harp on about it, but it's all about interleaving. The 68000 normally only accesses the bus every other DMA cycle (this is not quite 100% true, but it's true enough for our purposes). The cycle it doesn't access the bus, it's doing internal stuff. It looks a bit like this:

Code:

68000 solo:
C-C-C-C

68000 with refresh, disk, sprite or audio DMA:
CDCDCD

- = no memory access
C = CPU access
D = DMA access

So in these cases the CPU technically can't access the bus all the time because other DMA is active on certain slots. But, due to the CPU limits for accessing the bus in the first place it doesn't actually slow down due to these forms of DMA access.

Perhaps it's helpful to remember that Sprite, audio, disk and refresh DMA cycles occur at set times during the scanline and never overlap one another, so CPU interleaving stays possible all the time.
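To make the interleaving idea concrete, here is a toy Python model. It assumes, as a simplification, that the CPU wants the bus at most once every two DMA slots and simply waits when a slot is taken; real 68000 bus timing is more varied than that, and `cpu_accesses` is a made-up name.

```python
def cpu_accesses(n_slots, dma_slots):
    """Count CPU bus accesses completed within n_slots DMA slots.

    dma_slots is the set of slot numbers taken by chipset DMA.
    The CPU needs one free slot per access and then stays off the
    bus for the following slot (internal work).
    """
    done = 0
    next_wanted = 0
    for slot in range(n_slots):
        if slot < next_wanted:
            continue            # CPU is busy internally, does not need the bus
        if slot in dma_slots:
            continue            # bus is taken, CPU has to wait
        done += 1
        next_wanted = slot + 2
    return done


if __name__ == "__main__":
    print(cpu_accesses(8, set()))            # 68000 solo: C-C-C-C-
    print(cpu_accesses(8, {1, 3, 5, 7}))     # DMA on the odd slots: CDCDCDCD
    print(cpu_accesses(8, set(range(6))))    # DMA on every slot for 6 slots
```

In this model DMA on alternating slots costs the CPU nothing, while DMA that takes every slot (as hires 4-bitplane fetching does) stalls it until the run of DMA ends.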

Quote:

So, even at HiRes:4bpl, looks like I still get 16 Sprite slots, meaning I can still have 8 sprites at any scanline ?

Yes, assuming you're not using horizontal hardware scrolling or horizontal overscan this is indeed correct - you get to use all 8 sprites.

Quote:

Yeah, I was looking for a ballpark figure, and this is definitely way more precise than a ballpark

Not to mention that it will be a feat if I even get an average of over 50% of CPU cycles during DMA lock-out

Yeah, sorry.. I'm kinda like that

Quote:

Besides, since I want a 30fps lock, I will have to reserve a CPU Spike buffer of at least 20-25% of a frame time - e.g. somewhere around ~40,000c, which will be burned by WaitTOF () to make sure the game is always smooth.
I could probably accept a very occasional framedrop, like one framedrop in 5-10 seconds, but no more than that.
..snip..
updating 30 squadrons each frame would certainly kill a lot of cycles...

It's certainly going to be a challenge. Will be interesting to see how viable it ends up being. Would be pretty awesome if it worked!

Rotareneg · 22 August 2020, 20:19

VladR, in case you didn't know, WinUAE has a handy visual DMA debugger that will nicely visualize this stuff. Use shift-F12 to open the debugger, use v -2 for visual DMA debugging, and then x to exit the debugger. v -3 doubles the horizontal width, v -4 doubles it vertically as well.

Here are the default colors: [screenshot not preserved]

Here's F-18 Interceptor: [screenshot not preserved]

And here's Shadow of the Beast: [screenshot not preserved]

VladR · 22 August 2020, 21:03

Quote:

Originally Posted by roondar

Well... Yes and no.

Yes, those cycles are now technically available to the CPU
And no, the CPU could already interleave with disk or sprite DMA in such a way that no cycles appear to be lost (i.e. the disk DMA cycles slot into the cycles where the CPU is not on the bus in the first place).

Well, but that begs the question - as 68000 is lowest priority device, how come it can get those DMA cycles, instead of BitPlane ?
Imagine we executed an op at the last possible slot in the Mem Refresh area and 68000 now waits for the DMA (due to accessing (a0)+).
Shouldn't BitPlane get the DMA cycle (being highest priority) then ? At which point, 68000 will have to wait for whole 80/160 DMA cycles ?

Quote:

Originally Posted by roondar

Sorry to harp on about it, but it's all about interleaving. The 68000 normally only accesses the bus every other DMA cycle (this is not quite 100% true, but it's true enough for our purposes). The cycle it doesn't access the bus, it's doing internal stuff. It looks a bit like this:

Code:

68000 solo:
C-C-C-C

68000 with refresh, disk, sprite or audio DMA:
CDCDCD

- = no memory access
C = CPU access
D = DMA access

So in these cases the CPU technically can't access the bus all the time because other DMA is active on certain slots. But, due to the CPU limits for accessing the bus in the first place it doesn't actually slow down due to these forms of DMA access.

I presume you mean Effective Address calculation, right ? Each op takes vastly different amount of cycles, so it's hard to predict where exactly will the RAM access fall (in terms of DMA slots).
Also, each 68000 op, depending on addressing mode, takes different amount of cycles even before it gets to the RAM access (EA calculation), so that will certainly result in missing the DMA slot often, I would presume.

Code:

          (an) (an)+  -(an)  d(an)  d(an,dn)
---------------------------------------------
.b.w/.l   4/8   4/8   6/10   8/12   10/14

          abs.s  abs.l  d(pc)  d(pc,dn) Imm
---------------------------------------------
.b.w/.l   8/12   12/16   8/12   10/14   4/8
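The table above can be turned into a small lookup for rough estimates. In this Python sketch the EA figures are copied from the table; the 4-cycle base cost for `move` into a data register is my assumption from the standard 68000 timing tables, and the result ignores any waiting for a free DMA slot.

```python
# Effective-address calculation times in CPU cycles: (byte/word, long).
EA_CYCLES = {
    "(an)": (4, 8), "(an)+": (4, 8), "-(an)": (6, 10),
    "d(an)": (8, 12), "d(an,dn)": (10, 14),
    "abs.s": (8, 12), "abs.l": (12, 16),
    "d(pc)": (8, 12), "d(pc,dn)": (10, 14), "imm": (4, 8),
}
MOVE_TO_DN_BASE = 4   # assumed base cost of move <ea>,dn


def ea_cycles(mode, size="w"):
    """EA calculation time for an addressing mode; size is 'b', 'w' or 'l'."""
    short, long_ = EA_CYCLES[mode]
    return long_ if size == "l" else short


def move_to_dn_cycles(mode, size="w"):
    """Estimated cycles for move.<size> <ea>,dn without bus contention."""
    return MOVE_TO_DN_BASE + ea_cycles(mode, size)


if __name__ == "__main__":
    # Reading one vertex with three move.w (a1)+,dn instructions:
    print(3 * move_to_dn_cycles("(an)+"))
```

Under those assumptions a three-word vertex read costs 24 CPU cycles, i.e. 12 DMA slots' worth of time, if the bus is free.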

Quote:

Originally Posted by roondar

Yes, assuming you're not using horizontal hardware scrolling or horizontal overscan this is indeed correct - you get to use all 8 sprites.

Thanks for confirming !

Quote:

Originally Posted by roondar

It's certainly going to be a challenge. Will be interesting to see how viable it ends up being. Would be pretty awesome if it worked!

Well, if I find I need at least 30,000c per frame, there's always a PAL-only back-up plan

VladR · 22 August 2020, 21:05

Quote:

Originally Posted by Rotareneg

VladR, in case you didn't know, WinUAE has a handy visual DMA debugger that will nicely visualize this stuff. Use shift-F12 to open the debugger, use v -2 for visual DMA debugging, and then x to exit the debugger. v -3 doubles the horizontal width, v -4 doubles it vertically as well.

Wow. Nope, I didn't know that !

That certainly beats me putzing around with crayons, scribbling on shards of paper around the house after each build

Thanks !

roondar · 22 August 2020, 23:33

Quote:

Originally Posted by VladR

Well, but that begs the question - as 68000 is lowest priority device, how come it can get those DMA cycles, instead of BitPlane ?

I'd love to continue this discussion but I feel it's sidetracking from the actual discussion about the game (this is getting very in depth about Amiga hardware). So, I've created a new thread in the asm/hardware section about the topic of 68000 interleaving with DMA and posted my reply to you there.

You can find it and my reply here: http://eab.abime.net/showthread.php?t=103669

TroyWilkins · 25 August 2020, 14:47

I don't mind the slightly off-topic talk, I found it very interesting. Looks like you've got some great ideas for this VladR, well done!

VladR · 27 August 2020, 13:02

Quote:

Originally Posted by TroyWilkins

I don't mind the slightly off-topic talk, I found it very interesting. Looks like you've got some great ideas for this VladR, well done!

Well, ideas are a dime a dozen compared to the effort in coding all this...

That being said, being a huge fan of Galactica, it's impossible to watch the series without getting dozens of ideas suitable to plug into a Star Raiders-type game...

In our case, the sooner you destroy the Resurrection ships, the sooner you can eradicate all Cylons.

If you don't destroy it ASAP, they will just keep spawning again and again and merely provide great target practice

TroyWilkins · 28 August 2020, 03:37

Quote:

Originally Posted by VladR

Well, ideas are a dime a dozen compared to the effort in coding all this...

That being said, being a huge fan of Galactica, it's impossible to watch the series without getting dozens of ideas suitable to plug into a Star Raiders-type game...

In our case, the sooner you destroy the Resurrection ships, the sooner you can eradicate all Cylons.

If you don't destroy it ASAP, they will just keep spawning again and again and merely provide great target practice

Yes, fair point about ideas vs effort.

Having agreed with that, the idea you described does sound good, and could add an extra layer of strategy, as in which one do you go after first: one further from where you are but closer to a starbase you need to defend, or one closer to where you currently are?

VladR · 28 August 2020, 18:28

Quote:

Originally Posted by TroyWilkins

Having agreed with that, the idea you described does sound good, and could add an extra layer of strategy, as in which one do you go after first: one further from where you are but closer to a starbase you need to defend, or one closer to where you currently are?

This would only be enabled on higher difficulty setting to preserve the classic gameplay.

The thing with the Resurrection ship is that there would be only one. But shooting down Cylons while they are in its range doesn't accomplish anything, as they merely download again.

Actually, it's a bit more complex than that. There was a whole episode on Galactica where they explained how Death is merely a learning experience for Cylons.
Meaning, the more a Cylon dies, the better he becomes as they keep their memories.

Which points back to my basic RPG system with stats like {Level, HP, Armor, Attack, ShotsPerSecond, Accuracy, Rage}.
Meaning when they die, and download into new body, they keep their stats.
But, each time they die, their Rage stat goes up, meaning they become more aggressive. At a certain Rage level they could turn kamikaze and simply ram your ship (from up close and personal), which would prove a great learning experience for a player

So, in theory, you could keep killing Cylons, and they would simply keep becoming better until they are so strong you can't do much damage to them anymore as they will kill you in one shot (after, say, 5-10 Downloads).
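A rough Python sketch of that stat block and the download rule, just to pin the idea down. The field names follow the stat list above; the kamikaze threshold, the default values and the `max_hp` field are placeholders I made up, and balancing is left entirely open.

```python
from dataclasses import dataclass

KAMIKAZE_RAGE = 5   # hypothetical Rage level at which a Cylon starts ramming


@dataclass
class Cylon:
    level: int = 1
    hp: int = 100
    armor: int = 0
    attack: int = 10
    shots_per_second: float = 1.0
    accuracy: float = 0.5
    rage: int = 0
    max_hp: int = 100   # placeholder, not in the original stat list

    def download(self):
        """Respawn in a new body: all stats are kept, Rage goes up by one."""
        self.rage += 1
        self.hp = self.max_hp

    def is_kamikaze(self):
        """True once Rage is high enough to switch to ramming behavior."""
        return self.rage >= KAMIKAZE_RAGE


def on_death(cylon, in_resurrection_range):
    """Return True if the Cylon respawns, False if it is gone for good."""
    if not in_resurrection_range:
        return False
    cylon.download()
    return True


if __name__ == "__main__":
    raider = Cylon(level=3, attack=25)
    for _ in range(5):
        on_death(raider, in_resurrection_range=True)
    print(raider.level, raider.rage, raider.is_kamikaze())
```

Leveling up the other stats on each download would slot into `download()` as well.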

On the other hand, it would be ridiculous if you could just show up, at Level 1, and destroy the Resurrection Ship right after the mission starts. So, there will have to be some balance in how much you need to upgrade yourself and how many times you can realistically respawn the same Cylons.

I'm debating whether to simply show the Cylon's level in the HUD or not. It would certainly create more tension if you could only indirectly infer their level from their behavior - but coming up with a different recognizable attack pattern after each Death Upgrade is a ton of work and debugging...

But, that'd add yet another strategy layer - imagine you encounter a Cylon and don't know their level, so you attempt to come closer and notice kamikaze behavior. You'd have to hyperspace instantly and hope to survive. Of course, you would have to keep some tylium (refilled upon docking, but easily reusable [and used up] for engine boost when hunting) on board for these scenarios, otherwise...

Obviously, to keep it fair, no Download for Player - only InstaDeath

sandruzzo · 30 August 2020, 16:51

Why not shrink the actual screen area a little bit to gain some DMA slots? Instead of 640*200 you could go for 640-64 = 576*200. This will give you some edge....
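A back-of-the-envelope Python sketch of what a 576-pixel-wide display would buy, using the same slot arithmetic as earlier in the thread (one 16-bit fetch per 16 pixels per bitplane, 2 CPU cycles per DMA slot). The function names are invented, and the estimate ignores everything except bitplane DMA.

```python
CPU_CYCLES_PER_SLOT = 2


def bitplane_slots(width, planes=4):
    """Bitplane DMA slots per scanline for a given display width."""
    return (width // 16) * planes


def cpu_cycles_gained_per_frame(old_width, new_width, planes=4, lines=200):
    """CPU cycles freed per frame by narrowing the display."""
    saved_slots = bitplane_slots(old_width, planes) - bitplane_slots(new_width, planes)
    return saved_slots * CPU_CYCLES_PER_SLOT * lines


if __name__ == "__main__":
    print(bitplane_slots(576))                     # slots per line at 576 wide
    print(cpu_cycles_gained_per_frame(640, 576))   # cycles gained per frame
```

That works out to 16 slots (32 CPU cycles) per scanline, or 6,400 cycles per frame over 200 lines.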

VladR · 01 September 2020, 19:08

Quote:

Originally Posted by sandruzzo

Why not shrink the actual screen area a little bit to gain some DMA slots? Instead of 640*200 you could go for 640-64 = 576*200. This will give you some edge....

This surely would help a lot, but isn't this practice heavily frowned upon ? On Atari some games were in narrow mode (32 Bytes/scanline instead of standard 40), but this did look quite weird even back then.

Today, with ultra-wide monitors, this might look somewhat like Doom in a postage-stamp-size window...

Benchmarks will tell what's doable and what is not...

VladR · 01 September 2020, 19:22

I've implemented basic view-point rotation. On top of Star Raiders' Axis X and Axis Y, I figured it wouldn't hurt to also have rotation along Axis Z - could be used to enhance the Hyperspace jump...

I did about 5 versions, the last being probably the fastest possible without using additional look-up tables, but I suspect there will be some additional cycle savings possible once I merge this stage with the 3D transform.

sandruzzo · 02 September 2020, 07:01

@VladR

Think how many great games on Amiga used this "trick": Turrican series, Elf Mania, Shadow of the Beast....

VladR · 02 September 2020, 09:21

Didn't know that. I will keep it in mind.

Number one thing that is in my control is number of stars processed and how many days I wanna burn on optimizing

The rest (AI+sprite handling+audio+input+HUD) has a fixed cost, though some of those components can have a different update frequency (not everything has to happen each frame).

VladR · 02 September 2020, 16:55

I did a bit of work on rotation and tied it to a keyboard input, so stars now rotate properly along X&Y Axis, including clipping.

Next thing will be experimenting with star entry/exit. I currently assume that the star keeps the original direction vector that it had upon entry to the viewport (inverted view vector, basically) - which is what I believe gives the stars their look and feel (needs some tweaking).

Since I'm testing against WinUAE, I keep 24 stars per layer and have 16 layers. That many certainly won't be realistic on OCS at a target framerate (but should certainly be fine on V4).
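To put a number on "won't be realistic", here is a quick Python estimate. It reuses the 368-cycle position-only transform and the 119,437 cycles per NTSC frame from earlier in the thread, and ignores rotation, color, drawing and DMA lock-out entirely, so the real cost would be higher. The names are for illustration only.

```python
STARS_PER_LAYER = 24
LAYERS = 16
CYCLES_PER_STAR = 368            # position-only transform, from earlier in the thread
CYCLES_PER_NTSC_FRAME = 119_437  # raw 68000 cycles per 59.94 Hz frame


def starfield_cost(stars_per_layer=STARS_PER_LAYER, layers=LAYERS,
                   cycles_per_star=CYCLES_PER_STAR):
    """Raw transform cost in CPU cycles for the whole starfield."""
    return stars_per_layer * layers * cycles_per_star


def share_of_budget(video_frames=2):
    """Starfield cost as a fraction of the raw cycles in video_frames frames."""
    return starfield_cost() / (video_frames * CYCLES_PER_NTSC_FRAME)


if __name__ == "__main__":
    print(STARS_PER_LAYER * LAYERS)          # total stars
    print(starfield_cost())                  # cycles for the transform alone
    print(round(share_of_budget(2), 2))      # share of a 30 fps game frame
```

So 384 stars would take more than a full 60 Hz frame for the transform alone, and well over half of the raw budget at a 30 fps lock.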
