English Amiga Board


Old 18 July 2017, 18:01   #221
frank_b
Registered User
 
Join Date: Jun 2008
Location: Boston USA
Posts: 466
Quote:
Originally Posted by roondar View Post
I'd say option b) is not acceptable and option a) sounds like a non-optimal approach. However, I personally feel your take on option c) is rather negative. While it is true that transferring data from fast to chip is not going to be very fast, I am also pretty certain you don't actually need to transfer all that much per frame.
You're talking about prefetching. It may or may not work depending on how the game assets are used. I'm aware of the parallelism possible with the chipset.

Generally though you only really get parallelism on the last bit plane of a bob. If you're doing a load of blits in sequence you have to serialise access to the blitter.
The rest of the time you're waiting on the blitter. Polling the blitter's status is a chip ram bus access btw. If you want true parallelism you either:

a) use interleaved bitplanes which limits the vertical size of your bob and costs you duplicate mask data (ouch)

b) Use an interrupt restart and suck up the cost of the interrupt handler (ouch)

c) Limit it to one VBL, use the blitter finished disable bit and do it via the copper.

Many slots usable for the blitter typically get consumed by higher priority DMA.
Bitplanes, copper, audio (!) and sprites all steal cycles usable by the Amiga blitter.

I'd rather have a true bus master blitter which can access everything. I'd prefer that even if it only uses 68k bus cycles.
That way system fonts can be blitted from ROM, workbench background pictures don't consume any precious chip memory and bitmap fonts can be rendered from fast RAM.

Swings and roundabouts....

Last edited by frank_b; 18 July 2017 at 18:15.
frank_b is offline  
Old 18 July 2017, 18:01   #222
idrougge
Registered User
 
Join Date: Sep 2007
Location: Stockholm
Posts: 4,332
Sounds to me like the true Amiga solution is a lot of talking and a lot less doing.
idrougge is online now  
Old 18 July 2017, 18:08   #223
frank_b
Registered User
 
Join Date: Jun 2008
Location: Boston USA
Posts: 466
Quote:
Originally Posted by dlfrsilver View Post
The CPU clock speed is mostly a "myth" (remember what Mcoder said on Atari Forum, that it was totally irrelevant).
I'm not talking about the clock speed. I'm talking about the cycle count per masked word on screen. 8 cycles is best case on the Amiga using the $CA cookie cut method. That's during horizontal and vertical blank when there's no competing DMA activity. What do you think the CPU cycle count is when say 4 low res bitplanes are used on the Amiga during a blit? Hint: It's way more than 8.

Anima's method on the STe doesn't mask at all. It's a line by line copy taking advantage of the intelligent end masks. It's 8 cycles best case and 12 cycles worst case per word. That's the same speed as the best case on the Amiga and it's considerably faster than the worst case when there's competing DMA activity. That's before you take into account the 11% faster clock speed.
There's blitter restart overhead on the STe, but it still probably runs about 50% faster than the brute force method I used in my ST blitter intro.
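
For anyone following along: the $CA value mentioned above is simply the truth table of the cookie-cut function D = (A AND B) OR (NOT A AND C). Here is a minimal Python sketch that derives it. The function names are my own, and I'm assuming the usual channel convention (A = mask, B = bob data, C = background) and a minterm bit index of A*4 + B*2 + C.

```python
def minterm(func):
    """Build the 8-bit minterm byte for a boolean function of A, B and C.

    Assumed bit ordering: bit index = A*4 + B*2 + C.
    """
    value = 0
    for index in range(8):
        a = (index >> 2) & 1
        b = (index >> 1) & 1
        c = index & 1
        if func(a, b, c):
            value |= 1 << index
    return value


def cookie_cut(a, b, c):
    # Where the mask (A) is set take the bob (B), elsewhere keep the background (C).
    return (a & b) | ((1 - a) & c)


def blit_word(mask, bob, background):
    """Apply the same function to one 16-bit word, bit by bit in parallel."""
    return ((mask & bob) | (~mask & background)) & 0xFFFF


print(hex(minterm(cookie_cut)))  # 0xca
```

This only models the logic function. It says nothing about the cycle counts, which depend on how many DMA channels the blit uses and what else is on the bus.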
frank_b is offline  
Old 18 July 2017, 18:12   #224
dlfrsilver
CaptainM68K-SPS France
 
 
Join Date: Dec 2004
Location: Melun nearby Paris/France
Age: 46
Posts: 10,412
Quote:
Originally Posted by roondar View Post
While I'm sure it's possible that all tile & sprite data can be accessed at any time, this doesn't actually seem to happen in the game itself - certainly not in the space of say a couple of hundred frames.
Agreed, even in this game, and even considering the huge amount of memory it's eating up.

Quote:
I'd say option b) is not acceptable and option a) sounds like a non-optimal approach.
Options B and C are clearly not what you want to do.

Quote:
However, I personally feel your take on option c) is rather negative. While it is true that transferring data from fast to chip is not going to be very fast, I am also pretty certain you don't actually need to transfer all that much per frame.
Indeed.

Quote:
And after transfer, you'd obviously not use the CPU for drawing.
Yes, the Amiga CPU is freed as soon as the data has been sent to chip RAM.

Quote:
As a quick back-of-the-envelope kind of thing:

The maximum number of tiles you'd need to transfer the definition of at any time would be roughly 30 (assuming 8x8 tiles). At 16 colours, that is all of 960 bytes and that is assuming there are no tiles at all available in chipmemory at the time that can satisfy the new tiles that need to be drawn (which is exceedingly unlikely).
In CPS1 games, the sprites are all made of 16x16 pixel tiles. A common sprite list is something like 32 KB ($8000) maximum for the sprites with the most frames, but it can be way, way less than that.

Quote:
The maximum number of sprite frames is harder to estimate, but a single sprite frame (assuming 32x32 and 16 colours) is roughly 512 bytes.
Let's see: the skeleton from level 1 uses something like 15-20 frames at most.

Its sprite list is 561 bytes according to my notes (I gathered it from inside the program).

The most complicated sprite to move, I think, is the zombie worm rising from the ground in level 1. It's a linked sprite made of many tiles, and the tiles inside each piece change to make the player believe it is animated piece by piece. That must be really processing intensive.

Quote:
Assuming 8 new sprite frames are needed every frame (which I again, find exceedingly unlikely), you are looking at transferring 4096 bytes of data.
Well, I think the bosses should have big sprite lists, but since you face them alone, it should be no problem at all.

Quote:
So in a scenario that most likely will never actually happen, you still only need roughly 5K of bandwidth.
Count between 5K and 32K to be sure.

Quote:
And again, that does make assumptions that may not be true. In reality, if tiles are 8 pixels in size, you basically have multiple frames to transfer them. For sprites, you only need to transfer what is needed newly - and that's not going to be hundreds of frames of data. Most of the animations don't seem to actually show a new image every frame, lowering the required bandwidth further.
Sprites and some of the tiles are 16x16. The other tiles are 8x8 (mostly HUD) and 32x32 for the bulk of the level tiles.
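
As a rough sketch of the back-of-the-envelope numbers quoted above, here is the arithmetic in Python. The helper name is purely illustrative, and the figures assume 4 bitplanes (16 colours) and no mask data, which would add to the totals.

```python
def planar_bytes(width, height, bitplanes):
    """Bytes for one planar image: one bit per pixel per bitplane."""
    return width * height * bitplanes // 8


BITPLANES = 4  # 16 colours

# Worst case from the quoted estimate: 30 new 8x8 tiles plus 8 new 32x32 sprite frames.
tile_bytes = 30 * planar_bytes(8, 8, BITPLANES)
sprite_bytes = 8 * planar_bytes(32, 32, BITPLANES)
total_bytes = tile_bytes + sprite_bytes

# For comparison, a single 16x16 tile at the same depth.
tile_16x16_bytes = planar_bytes(16, 16, BITPLANES)

print(tile_bytes, sprite_bytes, total_bytes, tile_16x16_bytes)  # 960 4096 5056 128
```

So the "roughly 5K" figure holds for those assumptions. Larger tiles or more frames per update scale it linearly.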
dlfrsilver is offline  
Old 18 July 2017, 18:13   #225
frank_b
Registered User
 
Join Date: Jun 2008
Location: Boston USA
Posts: 466
Quote:
Originally Posted by idrougge View Post
Sounds to me like the true Amiga solution is a lot of talking and a lot less doing.
Hopefully it doesn't turn out like the Amiga port of Cho Jen Ra.
frank_b is offline  
Old 18 July 2017, 18:17   #226
DamienD
Banned
 
 
Join Date: Aug 2005
Location: London / Sydney
Age: 47
Posts: 20,420
Quote:
Originally Posted by idrougge View Post
Sounds to me like the true Amiga solution is a lot of talking and a lot less doing.
Hahahahaha

...seems true in this case

@sandruzzo; stop wasting time constantly replying to this thread and get back to Rygar
DamienD is offline  
Old 18 July 2017, 18:22   #227
frank_b
Registered User
 
Join Date: Jun 2008
Location: Boston USA
Posts: 466
Quote:
Originally Posted by DamienD View Post
Hahahahaha

...seems true in this case

@sandruzzo; stop wasting time constantly replying to this thread and get back to Rygar
Thanks. Now I'll have the Lynx Rygar game music in my head all day ;P
frank_b is offline  
Old 18 July 2017, 19:08   #228
kovacm
Banned
 
Join Date: Jan 2012
Location: Serbia
Posts: 275
Quote:
Originally Posted by roondar View Post
However, to be clear: this kind of full CPU/Blitter concurrency would only work if you can fit enough data in fastram and actually have fastram available. Plus, it would need to be coded specifically for the Amiga.
Why not try it, if anyone has the time/motivation? I am eager to see what the Amiga could do with FastRAM. Sandruzzo's idea of adding only a small chunk of FastRAM is great! Maybe the Amiga could have been even better if Commodore had added this small chunk of FastRAM to every Amiga! They missed the opportunity to take advantage of the Amiga design. We (Atari ppl) had to wait _two decades_ to see what the DSP in the Falcon is capable of.


EDIT:
Reading the last few posts, I conclude that only coders should have the right to write in this thread,
and we non-coders should only ask a question or two so they can clarify what they wrote.

Last edited by kovacm; 18 July 2017 at 19:18.
kovacm is offline  
Old 18 July 2017, 19:32   #229
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 2,281
Quote:
Originally Posted by DamienD View Post
Hahahahaha

...seems true in this case

@sandruzzo; stop wasting time constantly replying to this thread and get back to Rygar
sandruzzo is offline  
Old 18 July 2017, 19:40   #230
NorthWay
Registered User
 
Join Date: May 2013
Location: Grimstad / Norway
Posts: 839
Quote:
Originally Posted by sandruzzo View Post
In my opinion the OCS chipset was an unfinished project.
The opinion of the original designers was that it was more important to get it out the door.
I can't disagree with them, but it certainly would have been nice to have some nips and tucks in there (apart from how it screams for being 32-bit so you don't do half address writes).
NorthWay is offline  
Old 18 July 2017, 22:38   #231
AnimaInCorpore
Registered User
 
Join Date: Nov 2012
Location: Willich/Germany
Posts: 232
Quote:
Originally Posted by dlfrsilver View Post
In CPS1 games, the sprites are all made of 16x16 pixel tiles. A common sprite list is something like 32 KB ($8000) maximum for the sprites with the most frames, but it can be way, way less than that.
In fact, sprites can be bigger than 16 x 16 pixels, as defined by special width and height "concatenation" bits, but they are still composed of 16 x 16 pixel tiles.
However, you need to add the flipped sprites/tiles to your memory calculation, as well as some differently coloured alternate versions. This will increase your memory requirement by a good margin.
AnimaInCorpore is offline  
Old 18 July 2017, 22:56   #232
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,408
Quote:
Originally Posted by idrougge View Post
Sounds to me like the true Amiga solution is a lot of talking and a lot less doing.
Ah, sorry - this thread is about discussions (see title), not actually making stuff. That'd be silly

Quote:
You're talking about prefetching. It may or may not work depending on how the game assets are used. I'm aware of the parallelism possible with the chipset.

Generally though you only really get parallelism on the last bit plane of a bob. If you're doing a load of blits in sequence you have to serialise access to the blitter.
The rest of the time you're waiting on the blitter. Polling the blitter's status is a chip ram bus access btw. If you want true parallelism you either:

a) use interleaved bitplanes which limits the vertical size of your bob and costs you duplicate mask data (ouch)
Interleaved bitplanes are pretty much the standard for quick blitting on the Amiga. Yeah, the extra mask data kinda sucks - but the vertical size is never really a problem (for a 4 bitplane screen, max blit vertical size is ~250 pixels when using interleaved mode. I'd say that's enough, really).

Blitting separate bitplanes is not the way to go if you want speed.
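
To show where the ~250 pixel figure comes from, here is a small Python sketch. It assumes the classic BLTSIZE register layout (10-bit height in the upper bits, 6-bit width in words in the lower bits, with 0 meaning the maximum) and that an interleaved blit covers height x bitplanes lines in one go. The function names are mine.

```python
MAX_BLIT_LINES = 1024  # 10-bit height field in BLTSIZE (assumed OCS-style register)
MAX_BLIT_WORDS = 64    # 6-bit width field, counted in 16-bit words


def max_interleaved_height(bitplanes):
    """Tallest bob that fits in a single interleaved blit."""
    return MAX_BLIT_LINES // bitplanes


def bltsize(height_lines, width_words):
    """Encode a BLTSIZE value; the maximum in each field wraps to 0."""
    if not 1 <= height_lines <= MAX_BLIT_LINES:
        raise ValueError("height out of range")
    if not 1 <= width_words <= MAX_BLIT_WORDS:
        raise ValueError("width out of range")
    return ((height_lines & 0x3FF) << 6) | (width_words & 0x3F)


print(max_interleaved_height(4))   # 256
print(hex(bltsize(32 * 4, 3)))     # 32-line bob, 4 planes, 3 words wide: 0x2003
```

With 4 bitplanes that gives 256 lines, which matches the "~250" above closely enough.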

Quote:
b) Use an interrupt restart and suck up the cost of the interrupt handler (ouch)
Indeed, that'll cost some 100 or so CPU cycles per blit. But when compared to the ~3000 CPU cycles your blit is going to take it's not really that much.

Quote:
c) Limit it to one VBL, use the blitter finished disable bit and do it via the copper.
Blitter controlled by Copper is pretty much the way to go when using fastram and was what I was getting at. The CPU can spend whatever portion of a frame it needs to create an optimised Copper list (copying over just the result or even just the delta if you want to be really clever), which then does all the blitting on the next frame. And because you'd create such a copperlist at runtime you can even do blits that cross the VBL.

Moreover, because you are blitting using the copper, the CPU needs only touch chipram to either write the relevant parts of the copperlist to update or to transfer GFX over from fastram. Meaning that the rest of the available DMA* can be used pretty much just for the blitter - instead of using 10-30% of the DMA time to run CPU instructions.

*) Of course, deducting bitplane, sprite, etc DMA.

In essence, such a method would give the 68000 all ~141000 cycles to use per frame for game logic and setting up the copperlists and give the blitter a bunch more cycles than it would normally have.

Again, not exactly easy - but definitely possible and definitely much quicker than running code & blitter from chipmemory/slowmemory. Which is exactly why I lament the lack of fastram expansions for the most common Amiga.
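
For reference, a quick sketch of where the per-frame cycle figure and the interrupt overhead estimate come from. It assumes a PAL machine (7.09379 MHz CPU clock) and a nominal 50 Hz frame rate; the 100 and 3000 cycle numbers are the rough estimates from this post, not measurements.

```python
PAL_CPU_HZ = 7_093_790  # PAL 68000 clock (assumption: PAL machine)
FRAME_HZ = 50           # nominal PAL frame rate


def cycles_per_frame(cpu_hz=PAL_CPU_HZ, frame_hz=FRAME_HZ):
    """CPU cycles available in one frame when the CPU is never held off the bus."""
    return cpu_hz // frame_hz


def interrupt_overhead(handler_cycles=100, blit_cycles=3000):
    """Fraction of a blit's duration spent in the restart interrupt handler."""
    return handler_cycles / blit_cycles


print(cycles_per_frame())                 # 141875
print(round(interrupt_overhead() * 100, 1))  # 3.3 (percent)
```

The exact frame length depends on the real line count and frame rate, so treat the result as a ballpark in line with the ~141000 above.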

Quote:
Many slots usable for the blitter typically get consumed by higher priority DMA.
Bitplanes, copper, audio (!) and sprites all steal cycles usable by the Amiga blitter.
This happens regardless of running the CPU in fastram or not. So, running the CPU in fastram does let you use more chip DMA cycles than normal. Plus, you're no longer slowing the CPU - regardless of screenmode.

Quote:
I'd rather have a true bus master blitter which can access everything. I'd prefer that even if it only uses 68k bus cycles.
That way system fonts can be blitted from ROM, workbench background pictures don't consume any precious chip memory and bitmap fonts can be rendered from fast RAM.

Swings and roundabouts....
Well, I agree that a bus master blitter would've been nicer, but the Amiga does not have such a Blitter. The one it does have is pretty nice though
roondar is offline  
Old 18 July 2017, 23:15   #233
frank_b
Registered User
 
Join Date: Jun 2008
Location: Boston USA
Posts: 466
Quote:
Originally Posted by AnimaInCorpore View Post
In fact, sprites can be bigger than 16 x 16 pixels, as defined by special width and height "concatenation" bits, but they are still composed of 16 x 16 pixel tiles.
However, you need to add the flipped sprites/tiles to your memory calculation, as well as some differently coloured alternate versions. This will increase your memory requirement by a good margin.
Hi Anima. How big is the sprite data excluding background tiles? How many sprites get rendered for the average, best and worst case on the STe? What sizes are they? How many can your engine draw in 1 vbl / 2 vbl?
How much faster is your scan line renderer vs brute force masking? What's the cost for masking a single word?
frank_b is offline  
Old 18 July 2017, 23:19   #234
frank_b
Registered User
 
Join Date: Jun 2008
Location: Boston USA
Posts: 466
Quote:
Originally Posted by roondar View Post
Well, I agree that a bus master blitter would've been nicer, but the Amiga does not have such a Blitter. The one it does have is pretty nice though
No argument there. The Amiga is a work of art. The ST blitter is pretty nice though and criminally underrated. You know it can do four pass hflipping and 2x/4x/8x/16x scaling of raster images? It has an indirect addressing HOP smudge mode and more flexible masking registers.

BTW running code from fast RAM on a 4 bitplane screen is actually faster than using chip memory. Branches and shifts don't need to be aligned on 4 cycle boundaries.
frank_b is offline  
Old 19 July 2017, 00:40   #235
saimon69
J.M.D - Bedroom Musician
 
Join Date: Apr 2014
Location: los angeles,ca
Posts: 3,516
Quote:
Originally Posted by frank_b View Post
No argument there. The Amiga is a work of art. The ST blitter is pretty nice though and criminally underrated. You know it can do four pass hflipping and 2x/4x/8x/16x scaling of raster images? It has an indirect addressing HOP smudge mode and more flexible masking registers.

BTW running code from fast RAM on a 4 bitplane screen is actually faster than using chip memory. Branches and shifts don't need to be aligned on 4 cycle boundaries.

The Powder way, if I remember correctly
saimon69 is offline  
Old 19 July 2017, 05:32   #236
Miggy4eva
Amiga warrior
 
 
Join Date: Jul 2017
Location: Australia
Posts: 64
With the huge 14MB memory requirement for all sprites at any time, I refuse to believe this can't be streamlined. Why hasn't anybody played the arcade game through and written down what sprites appear at what stages of the game? I'm sure you only need to hold a few in memory for each level.
Like I have said several times earlier, I'm sure the entire game could be run in 2MB of Chipram anyway if you only load the sprites into memory for that level.

Quote:
Originally Posted by idrougge View Post
The difference being that the STE didn't use a RAM board at all.




EDIT - OK, now I see some details online. The STE had SIMM slots on the motherboard for 30 pin SIMMs. So what? My point is still that an STE with fully populated RAM slots is modified. It is considered modified, therefore you must also allow an Amiga modified with a RAM board, or with the trace cut on the motherboard that enables 2MB of chipram. Most Amiga RAM boards required very little soldering anyway, and once the board was installed you could use SIMMs to change memory as much as you wanted with zero effort.

A 4MB STE isn't a "stock" STE, so to have an equal, fair chance in a retro style battle of ports, the A500 must be allowed to compete with a RAM upgrade as well. That is my point.
ST users were budget people and not many of them could afford 4MB in the 80's anyway. Yes, in the early 90's when 30 pin SIMMs started to get cheap, maybe every die-hard STE owner who hadn't upgraded to Amiga or PC by then might have had 4MB, but it's not a common configuration from the "golden era", so you can't compare a 4MB STE to a 1MB A500 in these battles. It should be a 4MB STE vs a 4MB A500 with the 2MB Chipram motherboard mod to be an equal, oranges vs oranges comparison. And in that situation the Amiga will defeat the ST yet again.

Last edited by Miggy4eva; 19 July 2017 at 05:34. Reason: Error for STE simm slots
Miggy4eva is offline  
Old 19 July 2017, 05:49   #237
AnimaInCorpore
Registered User
 
Join Date: Nov 2012
Location: Willich/Germany
Posts: 232
Quote:
Originally Posted by roondar View Post
Blitter controlled by Copper is pretty much the way to go when using fastram and was what I was getting at. The CPU can spend whatever portion of a frame it needs to create an optimised Copper list (copying over just the result or even just the delta if you want to be really clever), which then does all the blitting on the next frame. And because you'd create such a copperlist at runtime you can even do blits that cross the VBL.
[...]

Again, not exactly easy - but definitely possible and definitely much quicker than running code & blitter from chipmemory/slowmemory. Which is exactly why I lament the lack of fastram expansions for the most common Amiga.
I think this is a very good point. FastRAM seems the way to go. What are the options here for the Amiga 500 (except for turbo cards)?
AnimaInCorpore is offline  
Old 19 July 2017, 05:55   #238
AnimaInCorpore
Registered User
 
Join Date: Nov 2012
Location: Willich/Germany
Posts: 232
Quote:
Originally Posted by frank_b View Post
Hi Anima. How big is the sprite data excluding background tiles? How many sprites get rendered for the average, best and worst case on the STe? What sizes are they? How many can your engine draw in 1 vbl / 2 vbl?
How much faster is your scan line renderer vs brute force masking? What's the cost for masking a single word?
I haven't checked that in detail yet. The problem is that the generated Blitter routines highly depend on the data itself and the sprite usage is highly dynamic as well. So far I guess the speed is roughly comparable to the raw Amiga Blitter "cookie-cut" method.
AnimaInCorpore is offline  
Old 19 July 2017, 06:26   #239
AnimaInCorpore
Registered User
 
Join Date: Nov 2012
Location: Willich/Germany
Posts: 232
Quote:
Originally Posted by Miggy4eva View Post
With the huge 14MB memory requirement for all sprites at any time, I refuse to believe this can't be streamlined. Why hasn't anybody played the arcade game through and written down what sprites appear at what stages of the game? I'm sure you only need to hold a few in memory for each level.
Like I have said several times earlier I'm sure the entire game could be run in 2MB of Chipram anyway if you only load the sprites into memory for that level.
The aim is the lowest possible memory requirement. The current 14 MB requirement is a result of emulating the whole game in RAM, which is generally a bad idea and not a realistic target for the available hardware, so this can be considered a first test. So far the problem is that the program code runs at its original location, so the OS functions for loading, etc. are completely "disabled".
AnimaInCorpore is offline  
Old 19 July 2017, 10:27   #240
zero
Registered User
 
Join Date: Jun 2016
Location: UK
Posts: 428
Quote:
Originally Posted by AnimaInCorpore View Post
you need to add the flipped sprites/tiles to your memory calculation
I remember reading in the startup-sequence rant of Final Fight that the coder claimed to be doing flipping in real-time. I always wondered about that... Maybe he meant just vertical flipping, which is easy and cheap, because I can't really see any good way to do horizontal flipping while blitting without it being really slow.
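
To illustrate why the two cases differ so much, here is a rough Python sketch of what each flip means at the data level for one bitplane stored as rows of 16-bit words. This is only an illustration of the data transformation, not how Final Fight or any particular engine does it, and the names are made up. A vertical flip only reorders rows, while a horizontal flip has to reverse the word order and the bits inside every word, which as far as I know the Amiga blitter has no operation for.

```python
def reverse_bits16(word):
    """Mirror the 16 bits of a word (leftmost pixel becomes rightmost)."""
    result = 0
    for _ in range(16):
        result = (result << 1) | (word & 1)
        word >>= 1
    return result


# A 256-entry table lets a CPU routine mirror a word with two byte lookups.
REVERSE_BYTE = [reverse_bits16(b) >> 8 for b in range(256)]


def hflip_row(words):
    """Mirror one row: reverse the word order, then mirror each word."""
    return [(REVERSE_BYTE[w & 0xFF] << 8) | REVERSE_BYTE[w >> 8]
            for w in reversed(words)]


def vflip(rows):
    """Vertical flip: only the row order changes, the words are untouched."""
    return rows[::-1]


image = [[0xF000, 0x0000],
         [0x8000, 0x0001]]
print([[hex(w) for w in hflip_row(row)] for row in image])
print(vflip(image))
```

Pre-flipping the graphics, as mentioned in the quote, trades this per-word work for extra memory.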
zero is offline  
 

