English Amiga Board


Old 01 August 2020, 20:55   #1
mcgeezer
Registered User
 
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
Mega Typhoon Deconstruction

I've been meaning to take a look at Mega Typhoon for a while now and got around to doing it today for the first time.

While I did guess correctly that the game was running in 16 colours, I didn't anticipate that it was running it in Dual Playfield mode.

At first glance there is certainly a lot of sprite multiplexing going on with the player ship and enemy bullets. But what is interesting is the way they drive the blitter. From what I can gather, the game has a copper list containing CMOVEs that write to the chip registers for the Blitter.

Are they really gaining that much speed by driving it this way?

More importantly - and the reason for my post, has anyone else done a deep dive on this game? (I fell into this trap when I did Xenon 2 only to find Galahad had done a load of work on it already).

It's certainly impressive for an ECS game.

Geezer

Update: I just found Roondar's site and am reading it now... it doesn't reference the original EAB thread though.

http://web.archive.org/web/201808290...-fastbobs.html

Last edited by mcgeezer; 01 August 2020 at 21:10.
Old 01 August 2020, 22:38   #2
Tigerskunk
Inviyya Dude!
 
 
Join Date: Sep 2016
Location: Amiga Island
Posts: 2,770
I thought there was a discussion about it somewhere here on the EAB a few years ago...

The speed comes from using dual playfield and blitting directly into the front playfield, without the need to save and restore the background, or something like that...
Old 01 August 2020, 23:04   #3
Asman
68k
 
 
Join Date: Sep 2005
Location: Somewhere
Posts: 828
@mcgeezer
Check this post and other related.
http://eab.abime.net/showpost.php?p=...&postcount=167

It would be really great to have source code or resourced version of MegaTyphoon.
Old 02 August 2020, 00:27   #4
lmimmfn
Registered User
 
Join Date: May 2018
Location: Ireland
Posts: 672
Quote:
Originally Posted by Asman View Post
@mcgeezer
Check this post and other related.
http://eab.abime.net/showpost.php?p=...&postcount=167

It would be really great to have source code or resourced version of MegaTyphoon.
I'm curious. I understand that using dual playfield removes the need for the blitter to store the underlying bitmap, restore it and blit at the new location. However, what's the difference compared with, say, 5-bitplane scrolling where objects are blitted only to the first 3 bitplanes (with a 4-colour, 2-bitplane scrolling background, so there is no overlap in colours)? Of course I understand that the scrolling with dual playfield is an advantage; I'm just wondering what the advantage of dual playfield is in that scenario.

Funnily enough, the Amstrad CPC's Mission Genocide uses a similar technique, i.e. it reduces the total colours from 16 to 8 as 2x4 colour playfields, saving on the remove and restore when sprites move.

Last edited by lmimmfn; 02 August 2020 at 00:37.
Old 02 August 2020, 01:41   #5
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,408
Quote:
Originally Posted by mcgeezer View Post
Minor question: that article is still online on my website, did you have problems reaching it? Or is there another reason you linked through web.archive.org?
Quote:
Originally Posted by mcgeezer
Are they really gaining that much speed by driving it this way?
Quote:
Originally Posted by Steril707 View Post
The speed comes from using dual playfield and direct blitting into the front playfield without the need to save and restore the background then or something like that....
AFAIK/IIRC they use at least three "tricks" (plus the Copper Blitting/Sprite multiplexing already pointed out). The first is that they use Blitter clear operations to avoid a full restore, which also gains them some extra CPU DMA slots over using a copy-based restore. The second is that they do not have background music, which saves them the ProTracker player overhead (which AFAIK was still significant at that time, as stuff like "the player" didn't exist yet).

The last is that they wrote their entire animation & movement system in such a way that all coordinates & other values are always already in the correct form for blitting (so they never have to translate X,Y & animation frame numbers to Blitter Addresses/Shifts). They were quite proud of that one, they pointed it out in either the documentation or in an interview (I forget which).

Quote:
Originally Posted by lmimmfn View Post
...however whats the difference with having say 5 blitplane scrolling but blitting objects only on the first 3 bitplanes( say with 4 colour 2 bitplane scrolling bk so no overlap in colours)...
There are two things. The first is simply convenience (not needing to muck about with setting up a very specific palette for it to work, etc). The second is more serious: using the method you suggest means quite a few of the palette entries normally reserved for sprites now need to be used as doubled colours to allow the foreground to scroll over the background without odd colours showing up.

Which can be a problem for using sprites.
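
To make the palette cost concrete, here is a small Python sketch of the 5-bitplane layout described in the question. It assumes planes 1-3 hold the foreground objects and planes 4-5 hold the background; the colour names are placeholders for illustration, not values from any real game.

```python
# Placeholder colour names; only the index layout matters here.
FG = [None] + [f"fg{i}" for i in range(1, 8)]  # 3 planes: transparent + 7 colours
BG = [f"bg{i}" for i in range(4)]              # 2 planes: 4 background colours

def build_palette():
    """Return the 32 colour registers for a 5-bitplane 'fake dual playfield'."""
    palette = []
    for index in range(32):
        fg = index & 0b00111   # assumed: planes 1-3 carry the foreground objects
        bg = index >> 3        # assumed: planes 4-5 carry the scrolling background
        # A non-zero foreground pixel must look the same over every background
        # colour, so each foreground colour is repeated once per background value.
        palette.append(FG[fg] if fg else BG[bg])
    return palette

palette = build_palette()
sprite_range = palette[16:32]  # COLOR16-COLOR31 are also the sprite colours
print(sprite_range)
```

With this layout all 32 registers hold only 11 distinct colours, and every one of COLOR16-COLOR31 is dictated by the playfield, which leaves the hardware sprites no colours of their own.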

Last edited by roondar; 02 August 2020 at 02:20. Reason: Combined both my replies into one post instead of making two ;)
Old 02 August 2020, 11:31   #6
dlfrsilver
CaptainM68K-SPS France
 
 
Join Date: Dec 2004
Location: Melun nearby Paris/France
Age: 46
Posts: 10,412
Quote:
.......The last is that they wrote their entire animation & movement system in such a way that all coordinates & other values are always already in the correct form for blitting (so they never have to translate X,Y & animation frame numbers to Blitter Addresses/Shifts). They were quite proud of that one, they pointed it out in either the documentation or in an interview (I forget which).
This is exactly what a coin-op does. Capcom CPS1 games and others work like this.
Old 02 August 2020, 12:54   #7
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,408
I'm not too sure of that actually.

Coin-ops generally use sprites, which tend to have easy-to-use coordinate systems already. For example: looking at the small amount of CPS1 documentation that is out there, it seems that CPS1 sprites simply have a 2-byte X value and a 2-byte Y value for their on-screen location.

This means that on at least CPS-1 arcade hardware you don't have to use address translation and shift calculation like the Amiga's Blitter requires. You simply write the desired X & Y coordinates to the proper location in memory. It's precisely this address/shift translation of X&Y coordinates that Mega Typhoon optimises.
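
For reference, this is roughly the translation a conventional bob routine performs for every object on every frame. It is a Python sketch under stated assumptions: a non-interleaved bitplane 56 bytes wide and a standard cookie-cut blit ($0FCA in the low 12 bits of BLTCON0). The function names are hypothetical, and the thread does not document how Mega Typhoon actually stores its values.

```python
ROW_BYTES = 56  # assumption: one 448-pixel-wide bitplane row, 448 / 8 bytes

def blit_params(x, y, row_bytes=ROW_BYTES):
    """Translate pixel coordinates into blitter-ready values."""
    offset = y * row_bytes + (x >> 4) * 2  # byte offset of the destination word
    shift = x & 15                         # pixel shift inside that word
    bltcon0 = (shift << 12) | 0x0FCA       # A shift + channels A,B,C,D + cookie-cut minterm
    bltcon1 = shift << 12                  # B shift
    return offset, bltcon0, bltcon1

def move_down(offset, lines=1, row_bytes=ROW_BYTES):
    """If an object keeps the offset itself, vertical movement is one addition."""
    return offset + lines * row_bytes

print([hex(v) for v in blit_params(37, 10)])
```

An object that stores the offset and shift directly, instead of X and Y, can skip `blit_params` entirely; that is the kind of saving described above.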
Old 02 August 2020, 15:22   #8
Master484
Registered User
 
 
Join Date: Nov 2015
Location: Vaasa, Finland
Posts: 525
I ran the German readme file on the Mega Typhoon game disk through Google Translate, and it says this about the "copper controlled blitter" method:

Quote:
All blitter operations are progressively controlled by copper, independent of the CPU in the background
( with increasing number of BOBs, this procedure achieves a performance increase of up to 30% compared
to the conventional "blitter finished interrupt chaining" due to the relatively slow interrupt processing.)
Also it talks about how the program has 3 independently running processes, and how CPU calculations don't slow the game down:

Quote:
animation control process separation: Thanks to this completely new technology, in which the entire program
process is divided into 3 independent, asynchronously communicating processes, the visible frame rate (50 hz)
is in no way limited by the CPU calculations, but exclusively by the blitter. That means: even complex calculations,
like 50 simultaneously running target search algorithms do not slow down the refresh rate!
Also the readme says that the game maps are large bitmaps, up to 448 * 1684 pixels in size, and that it uses "graphics brushes of any size" that can be put to any pixel position on the virtual background. So I think this means that it doesn't use normal tiles to build the levels.

And when you play the game, at regular intervals the scrolling slows down for a moment, and there aren't that many enemies...I think that at these spots it builds the next background area by blitting those graphics brushes into the whole 448 * 1684 bitmap. And thanks to this method it doesn't need to blit new tile rows every 16 pixels like other shoot'em ups do, which brings another small speed increase.
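
As a rough sense of scale, the following Python sketch works out what a 448 * 1684 bitmap costs in memory. The plane count of 3 is an assumption (one dual-playfield layer can have at most 3 bitplanes); the readme excerpt above does not state how many planes the map uses.

```python
MAP_WIDTH, MAP_HEIGHT = 448, 1684  # largest map size quoted from the readme

def bitmap_bytes(width, height, planes):
    """Memory needed for a planar bitmap, one bit per pixel per plane."""
    return (width // 8) * height * planes

per_plane = bitmap_bytes(MAP_WIDTH, MAP_HEIGHT, 1)
three_planes = bitmap_bytes(MAP_WIDTH, MAP_HEIGHT, 3)  # assumed plane count
tile_rows_avoided = MAP_HEIGHT // 16  # 16-pixel tile rows a tile scroller would blit

print(per_plane, three_planes, tile_rows_avoided)
```

That is 94,304 bytes per bitplane, or 282,912 bytes for three planes, traded against roughly 105 tile-row refreshes over the length of the map.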

---

Also here is a screenshot of how the game looks in the WinUAE visual debugger:





The cyan blitter operations always have those yellow dots before them, which I guess is the "copper controlling the blitter" thing. This is the only game where I have seen this sort of thing happening in the debugger window. In other games there are usually long "empty sections" between the cyan "blitter activity" zones. But here all those individual "blitter activity" zones sort of join together into one big chunk, that increases and decreases in size as the game goes on.
Attached thumbnail: MegaTyphoonDebugger.png (48.5 KB)
Old 02 August 2020, 18:03   #9
ross
Defendit numerus
 
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by roondar View Post
The last is that they wrote their entire animation & movement system in such a way that all coordinates & other values are always already in the correct form for blitting (so they never have to translate X,Y & animation frame numbers to Blitter Addresses/Shifts). They were quite proud of that one, they pointed it out in either the documentation or in an interview (I forget which).
More on what roondar wrote here.

In this game the blitter objects would be better defined, in the broad sense, as "copper objects".
The CPU constructs the copper list dynamically as a queue in which the next object is already defined as a series of copper instructions that write to the blitter registers.
Construction is fast because 'copper objects' are simply indexed by copper jumps. The jumps are 'near', so only COP1LCL is changed.
The queue is sorted to make the blitter setup as fast as possible, so that as few registers as possible change between one blit and the next.
A very clever use of the copper.
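
A minimal Python sketch of what one such "copper object" could look like as raw copper words. The register offsets and the CMOVE/CWAIT bit layouts follow the standard chipset documentation, but the choice of which 11 blitter registers are written, their values, and the helper names are assumptions for illustration, not a dump from the game.

```python
# Custom chip register offsets (relative to $DFF000).
BLTCON0, BLTCON1 = 0x040, 0x042
BLTCPTH, BLTCPTL = 0x048, 0x04A
BLTBPTH, BLTBPTL = 0x04C, 0x04E
BLTAPTH, BLTAPTL = 0x050, 0x052
BLTDPTH, BLTDPTL = 0x054, 0x056
BLTSIZE = 0x058
COP1LCL, COPJMP1 = 0x082, 0x088

def cmove(reg, value):
    """Copper MOVE: register offset (bit 0 clear), then the data word."""
    return [reg & 0x01FE, value & 0xFFFF]

def cwait_blitter():
    """Copper WAIT at position 0,0 with BFD clear, so it only waits for the blitter."""
    bfd, v_mask, h_mask = 0, 0x7F, 0x7F
    return [0x0001, (bfd << 15) | (v_mask << 8) | (h_mask << 1)]

def copper_object(blit_regs, next_object_low, waits=3):
    """Blitter wait(s), register writes, then a near jump to the next object."""
    words = []
    for _ in range(waits):                    # the game reportedly uses three waits
        words += cwait_blitter()
    for reg, value in blit_regs:              # BLTSIZE goes last: it starts the blit
        words += cmove(reg, value)
    words += cmove(COP1LCL, next_object_low)  # only the low word changes
    words += cmove(COPJMP1, 0)                # strobe: restart copper at COP1LC
    return words

# Hypothetical register values for a single blit.
example = copper_object(
    [(BLTCON0, 0x5FCA), (BLTCON1, 0x5000),
     (BLTCPTH, 0x0002), (BLTCPTL, 0x0234), (BLTBPTH, 0x0003), (BLTBPTL, 0x1000),
     (BLTAPTH, 0x0003), (BLTAPTL, 0x2000), (BLTDPTH, 0x0002), (BLTDPTL, 0x0234),
     (BLTSIZE, 0x0041)],
    next_object_low=0x1240,
)
print(" ".join(f"{w:04X}" for w in example))
```

With 11 register writes this gives 13 copper moves per object plus the waits. Linking or unlinking an object in the queue then only means rewriting the data word of one COP1LCL move.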

There is one thing that I didn't properly understand...
Why slow down the copper list by using three consecutive CWAIT_BFD?
There is a thread somewhere on EAB (where I also wrote some code) where it was reported that there were problems with a single wait
(but in the end it was never clarified whether it was a hardware problem specific to that machine).
Theoretically a single wait should be sufficient on the Agnus from the A500 onwards (but maybe Toni should step in here to clarify).
Given how much they have optimized everything, it would seem strange to me if there were no reason for this.

Another curiosity: instead of the $1FE register, the $190 register (COLOR08) is used for a CNOP.
For all intents and purposes, because of the way the bitplanes are used, there is no substantial difference, but it is quite unusual.
Old 02 August 2020, 18:22   #10
Tigerskunk
Inviyya Dude!
 
 
Join Date: Sep 2016
Location: Amiga Island
Posts: 2,770
That's how my Sprite Multiplexer operates as well.
Sorting Y position, and then writing a dynamic copperlist. Works like a charm.

I am not sure though, what the benefit would be using this system with BOBs?

Is there really a speed advantage doing this?

Sounds more like a pain in the ass for something that you can have much less complicated as well.
Old 02 August 2020, 18:43   #11
ross
Defendit numerus
 
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by Steril707 View Post
I am not sure though, what the benefit would be using this system with BOBs?

Is there really a speed advantage doing this?
I really think so, especially if you have code in chip or slow RAM.
The copper is very fast at writing to blitter registers, with only two bus/memory accesses.
The quick way with 68000 code, move.w #const,blitter_reg(custom_base), takes 4 accesses to the internal bus/memory...
To match the copper you would have to use move.w dx,(ax), but how often can you use that? Very, very rarely.
Old 02 August 2020, 19:30   #12
DanScott
Lemon. / Core Design
 
 
Join Date: Mar 2016
Location: Tier 5
Posts: 1,211
Quote:
Originally Posted by Steril707 View Post
Is there really a speed advantage doing this?
CPU not having to wait for blitter.. all blitting taken care of by copper, and CPU totally free to update game logic, and build all render info for the next frame.
Old 02 August 2020, 19:46   #13
ross
Defendit numerus
 
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by DanScott View Post
CPU not having to wait for blitter.. all blitting taken care of by copper, and CPU totally free to update game logic, and build all render info for the next frame.
Unfortunately there is a big disadvantage in doing this: you cannot do sync effects with the copper (well, there is a method to alleviate this, but it is a bit complex), because you cannot know where the beam position is during the blitter wait in the copper list.
But for a vertical shooter this is not usually a problem.
Old 02 August 2020, 21:46   #14
dlfrsilver
CaptainM68K-SPS France
 
 
Join Date: Dec 2004
Location: Melun nearby Paris/France
Age: 46
Posts: 10,412
Quote:
Originally Posted by DanScott View Post
CPU not having to wait for blitter.. all blitting taken care of by copper, and CPU totally free to update game logic, and build all render info for the next frame.
This is again exactly what the Capcom CPS1 does. When the graphics GPU processes the graphics, during the VBL, the 68000 is completely free to update the game logic.

Mega Typhoon is very clever
Old 03 August 2020, 03:02   #15
FSizzle
Registered User
 
Join Date: Nov 2017
Location: Los Angeles
Posts: 49
Quote:
Originally Posted by ross View Post
Copper is very fast at writing in blitter registers, with only two bus/mem accesses.
I think that's actually 3 accesses (2 reads and 1 write).

That doesn't detract from your general point though - while it's possible to write more efficiently from the CPU, it's difficult and you sort of have to ignore all the other overheads (like the blitter waits).

I read somewhere (in another thread I think?) that the player bullets and ship were re-using the same sprites with a palette change - that would be somewhat tricky to have working efficiently in this system.
Old 03 August 2020, 08:27   #16
ross
Defendit numerus
 
ross's Avatar
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by FSizzle View Post
I think that's actually 3 accesses (2 reads and 1 write).
Not from a speed perspective (even if you are correct from a 'physical' accesses perspective; I simplified to get the concept across )

Take this CMOVE, which starts a blitter operation: dc.w $0058,$0041
How many cycles does it require? Only 4 cck cycles/8 68k cycles (for the two memory fetches of the copper command/param), because the write is 'hidden' and happens only on the RGA bus.

The same MOVE with CPU code, using the aforementioned move.w #const,blitter_reg(custom_base): dc.w $3d7c,$0041,$0058
Here you have 3 reads (for the opcode/param fetches) and 1 write (to the mem/custom), so 8 cck cycles/16 68k cycles, because the write is 'external'!

Well, this is also oversimplified; the copper/CPU/blitter concurrency on the buses is not so trivial. It is only meant to make clear that in most real cases, at least on an architecture like a bare A500, the copper is much faster for this kind of operation (obviously only if you already have the copper list built in memory, as in the case of the game in question).
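
The comparison above can be restated as a tiny Python calculation. It uses the same simplification as the post: every visible bus access costs 2 cck (4 68k cycles), and contention between the copper, CPU and blitter is ignored. The variable and function names are made up for illustration.

```python
CCK_PER_ACCESS = 2      # simplification from the post: one access = 2 colour clocks
CPU_CYCLES_PER_CCK = 2  # the 68000 runs at twice the colour clock

COPPER_WORDS = (0x0058, 0x0041)       # CMOVE #$0041 to BLTSIZE
CPU_WORDS = (0x3D7C, 0x0041, 0x0058)  # move.w #$0041,$58(a6)

def cost(words_fetched, external_writes):
    """Return (cck, 68k cycles) for a register write under this simple model."""
    accesses = words_fetched + external_writes
    cck = accesses * CCK_PER_ACCESS
    return cck, cck * CPU_CYCLES_PER_CCK

copper_cost = cost(len(COPPER_WORDS), external_writes=0)  # write hidden on the RGA bus
cpu_cost = cost(len(CPU_WORDS), external_writes=1)        # write is a real bus access

print(copper_cost, cpu_cost)  # (4, 8) (8, 16)
```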
Old 03 August 2020, 13:20   #17
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,408
Quote:
Originally Posted by ross View Post
Well this is also oversimplified, the copper/CPU/blitter concurrency on buses is not so trivial, but it is only to make it clear that in most real cases, at least on an architecture like a bare A500, copper is much faster for this kind of operations (obviously if you already have the copper list built in memory, as in the case of the game in question).
One thing I've never really been clear on is the cost of updating the Copperlist.

It seems to me that any discussion on Copper blitting (as a way to gain performance) kind of glosses over the updates to the Copper instructions needed between frames.

Even if we assume the Copperlist itself can stay 100% static and no queue entries are ever added or removed (which doesn't seem quite right to me if we want to be able to cover every situation), you still have to change the data in the Copperlist in order to move or animate Bobs. Obviously you won't need to update all of the Copper instructions every frame, but it still does cost some CPU time to update the Copperlist itself between frames.

The question then becomes: how much raster time is lost doing this?
I've never really seen a good answer to this so I'd love to hear your (or anyone else's ) view on this.
Old 03 August 2020, 15:16   #18
ross
Defendit numerus
 
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by roondar View Post
One thing I've never really been clear on is the cost of updating the Copperlist.

It seems to me that any discussion on Copper blitting (as a way to gain performance) kind of glosses over the updates to the Copper instructions needed between frames.

Even if we assume the Copperlist itself can stay 100% static and no queue entries are ever added or removed (which doesn't seem quite right to me if we want to be able to cover every situation), you still have to change the data in the Copperlist in order to move or animate Bobs. Obviously you won't need to update all of the Copper instructions every frame, but it still does cost some CPU time to update the Copperlist itself between frames.

The question then becomes: how much raster time is lost doing this?
I've never really seen a good answer to this so I'd love to hear your (or anyone else's ) view on this.
You are absolutely right: the list cannot be completely static, and some of its fields have to be modified.

So this is a difficult question to answer; real tests should be made and statistics collected.
In any case, I will try to make some rough calculations.

I take the image from message #8 and observe the DMA slots used for the generic blitting* of an object (very simple to spot because it is preceded by an unmistakable multiple wait sequence, really 3 CWAIT_BFD).
[*this is a 'bad case'; for clear/other ops I can optimize it]
There are 11 writes to the blitter registers, plus the setting of the new copper code execution position (the next object in the queue) and the CJMP, for a total of 13 copper 'operations'.

Now I weigh a write with the CPU as 100 and one with the copper as 50 (based on the assumptions in my previous messages).
To the weight of the copper ops I must add the CPU's writes to the clist (therefore 100 each).
But how many writes to the list do I need? I can only estimate an average: certainly 1 for the jump setting, plus those for the registers to be updated. If I am lucky 0 (totally static object for the frame), otherwise even 5 or 6... Let's say that on average a couple per frame are enough. I will have: CPU = 11*100 = 1100, COPPER = 50*13 + 100*(2+1) = 950.
I have gained around 15%.
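
The same estimate as a parameterised Python sketch, so the effect of the guessed number of list updates is easy to see. The weights and counts are the ones from the paragraph above; the function and parameter names are made up for illustration.

```python
def estimate(blitter_writes=11, copper_ops=13, register_updates=2,
             jump_updates=1, cpu_weight=100, copper_weight=50):
    """Return (CPU-only cost, copper-driven cost) per object in arbitrary units."""
    cpu_only = blitter_writes * cpu_weight
    copper_driven = (copper_ops * copper_weight
                     + (register_updates + jump_updates) * cpu_weight)
    return cpu_only, copper_driven

cpu_only, copper_driven = estimate()
saving = (cpu_only - copper_driven) / cpu_only

print(cpu_only, copper_driven, round(saving * 100, 1))  # 1100 950 13.6
print(estimate(register_updates=0))  # static object: (1100, 750)
print(estimate(register_updates=6))  # worst case mentioned: (1100, 1350)
```

Measured as cost saved this is about 14%, and measured as a speed-up (1100/950) about 16%, so "around 15%" is a fair summary. The result is very sensitive to the update count: with 6 register updates per object the copper-driven path comes out slower under these weights.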

How, then, did the programmers report a gain that can reach 30%?
Well, if you read closely, they wrote 'compared to the conventional blitter finished interrupt chaining', which is not the method we would use for optimized blitter usage .

Correct me if I wrote too much nonsense; these are just quick guesses without thinking much about it.
Old 03 August 2020, 15:51   #19
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,408
Right, it is indeed hard to be exact without measurements. Perhaps I should try it myself once to see how it goes. Thanks for your views though, they certainly sound reasonable.
Old 03 August 2020, 16:25   #20
pink^abyss
Registered User
 
Join Date: Aug 2018
Location: Untergrund/Germany
Posts: 408
Quote:
Originally Posted by ross View Post
There is one thing that I didn't properly understand ..
Why slow down the copper list using three consecutive CWAIT_BFD?
There is a thread somewhere on EAB (where I also wrote some code) where it was reported that with a single wait there were problems
(but at the end it has not been clarified if it was an hw problem specific to that machine).
Theoretically only one should be sufficient on the Agnus from the A500 onwards (but maybe here Toni should intervene to clarify).
It seems strange to me, considering how much they have optimized everything, that there is no reason for this.

For my current game I also use blitting from the copper list, and I ran into these 'single wait' blit issues on real hardware. My observations:

- A single wait doesn't always work on my A500, but two waits always work.
- On an A1200 a single wait is enough.
- Using a single wait always runs fine in WinUAE.

My experience:

A500 + single wait does work in certain situations (as stated in the EAB forums), but it depends on channel usage and the number of blits. I never needed 3 waits.