English Amiga Board


Old 24 July 2020, 22:58   #121
agermose
Registered User

 
Join Date: Nov 2019
Location: Odense / Denmark
Posts: 72
Quote:
Originally Posted by redblade View Post
How much memory do the tiles take up? Are they 16 colours?

Great work Agermose, looks very smooth
The tiles are 32 colours, but they don't use all of them: mainly the lower 16, plus a few from the top 16.

The tiles take up 112 KB.
Old 25 July 2020, 11:16   #122
agermose
With most parts in place it is time to optimize the code, and get it to run on an A500.

Initial testing shows everything runs fine on the A500 but, as expected, uses more raster time on the less powerful machine. I have a list of optimizations ready.

First up is the tile blit when constructing a new row of tile blocks. Currently the restore buffer is created tile by tile, then blitted to the two screen buffers. I had expected to save time by starting the blit to one screen buffer while using the CPU to copy to the other, but it actually ran slower than the blitter-only version. I assume this is due to the CPU and blitter competing for chip mem access. Any ideas are welcome.

A bigger problem: there is not enough memory to run everything. The tilemap is bigger than I had planned for when I decided to go with 32 colours and interleaved gfx.
If I don't free up memory elsewhere it may be necessary to make the unique tiles load per layout.
This would require major rework of the tile engine, so best avoided.
There is another list of memory optimizations that I want to test out.
Some of the speed optimizations require jump tables, which take up even more memory.


Using HW sprites instead of blitted chars for the character overlay will save both raster time and memory, so the next task will probably be to convert the char set to sprite data.

Last edited by agermose; 25 July 2020 at 11:57.
Old 25 July 2020, 13:16   #123
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 2,174
I love a good puzzle, so here are some ideas I had. Obviously I haven't seen your code, so I may be saying things that are impractical or already being done. Still, I hope they might be helpful anyway:
Quote:
Originally Posted by agermose View Post
First up is the tile blit when constructing a new row of tile blocks. Currently the restore buffer is created tile by tile, then blitted to the two screen buffers. I had expected to save time by starting the blit to one screen buffer while using the CPU to copy to the other, but it actually ran slower than the blitter-only version. I assume this is due to the CPU and blitter competing for chip mem access. Any ideas are welcome.
Well, on the A500, the Blitter is basically always faster for tile blits than using the CPU to do them.

What I personally do for adding new tiles is split the work over as many frames as I have tiles to draw per row. I draw them to all three buffers one after the other (so I draw up to three tiles per frame, using the tile to draw as source for every blit). I don't copy over the restore buffer to the main buffers. Doing it this way can potentially save calculating the offset to the buffer several times. To do this, I allocate enough space in the bitmap to be able to scroll for at least as many frames as it takes to build the new row. If I'm really tight on raster time, I sometimes just create a table for the correct offsets and use a lookup instead of calculating it at all.
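
The offset lookup mentioned above could be sketched roughly like this. It is only a model in Python, not anyone's actual engine code: the buffer width, tile size, left margin and interleaved 5-bitplane layout are all illustrative assumptions, and the names are hypothetical.

```python
# Illustrative values only; a real engine's layout may differ.
BUFFER_WIDTH_PX = 256   # assumed buffer width
BITPLANES = 5           # assumed interleaved bitplanes
TILE_SIZE_PX = 16       # assumed 16x16 tiles
TILES_PER_ROW = 14      # 14 tiles = 224 pixels
TILE_ROWS = 17          # 272 lines / 16
LEFT_MARGIN_PX = 16     # assumed margin so the 224-pixel area is centred

BYTES_PER_PLANE_LINE = BUFFER_WIDTH_PX // 8          # 32 bytes
BYTES_PER_LINE = BYTES_PER_PLANE_LINE * BITPLANES    # 160 bytes when interleaved


def build_offset_table():
    """Precompute the byte offset of every tile's top-left corner once."""
    table = []
    for row in range(TILE_ROWS):
        row_start = row * TILE_SIZE_PX * BYTES_PER_LINE
        for col in range(TILES_PER_ROW):
            table.append(row_start + (LEFT_MARGIN_PX + col * TILE_SIZE_PX) // 8)
    return table


TILE_OFFSETS = build_offset_table()


def tile_offset(row, col):
    """Look the offset up instead of multiplying every time a tile is drawn."""
    return TILE_OFFSETS[row * TILES_PER_ROW + col]
```

The same offset is valid for every buffer that shares this layout, so one lookup can feed all the blits for a tile.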

Quote:
A bigger problem: there is not enough memory to run everything. The tilemap is bigger than I had planned for when I decided to go with 32 colours and interleaved gfx.
If I don't free up memory elsewhere it may be necessary to make the unique tiles load per layout.
This would require major rework of the tile engine, so best avoided.
I'm not sure this will help in your case, but you might be able to eke out some extra memory by not storing any vertically flipped versions of tiles (as vflipping can be done relatively easily using the Blitter). Whether this helps would of course depend on whether any tiles are exact vflipped versions of other tiles.
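
As a rough illustration of why a vertical flip is cheap for the Blitter: the source pointer can start on the last row and use a negative modulo so that each row steps backwards. The Python below only models that pointer arithmetic for a single, non-interleaved bitplane; the function name is hypothetical and interleaved data would need extra care.

```python
def vflip_copy(src, row_bytes, rows):
    """Copy `rows` rows of `row_bytes` bytes, reading the source bottom-up.

    Models an ascending copy where the source starts on the last row and
    the source modulo is -2 * row_bytes, so after each row the pointer
    lands on the start of the row above.
    """
    dst = bytearray(row_bytes * rows)
    src_ptr = row_bytes * (rows - 1)   # start of the last source row
    src_modulo = -2 * row_bytes        # step back over this row and the one above
    dst_ptr = 0
    for _ in range(rows):
        for _ in range(row_bytes):
            dst[dst_ptr] = src[src_ptr]
            src_ptr += 1
            dst_ptr += 1
        src_ptr += src_modulo          # modulo is applied at the end of each row
    return bytes(dst)
```

For example, three 2-byte rows `01 02 / 03 04 / 05 06` come out as `05 06 / 03 04 / 01 02`.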

Another option might be to create the blitting masks 'on the fly'. If not all enemy objects are required for a section of the game, you might get away by storing only a single bitplane for the masks permanently and having a small buffer where you create the required masks for the next section "Just In Time". Not done this myself though, so I have no idea how feasible it is or how difficult the code would end up being.

A last idea would be to not permanently store any horizontally flipped versions of tiles/sprites, doing the same as above and creating them on the fly ahead of sections that need them. Again, not done this myself so no idea about code complexity/performance cost.
Quote:
Using HW sprites instead of blitted chars on the character overlay will save both raster time, and memory, so next task will probably be to convert the char set to sprite data.
If you're OK with loading stuff in between the game and the title, you might even consider only keeping the digits and whatever words the game actually uses in the game and ditching the rest during runtime.

Probably won't save a lot of memory, but still.

I've also experimented with only ever having a 1 bitplane charset and creating the other colours (or the "two colour font") when drawing. This does cost extra raster time, but also saves 4/5 of your character set memory use.
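
A minimal sketch of the 1 bitplane charset idea, assuming 8x8 glyphs, a 64-character set and a 5-bitplane display (all hypothetical numbers). Each plane of the drawn character either gets the glyph bits or stays empty, depending on the bits of the chosen colour index.

```python
BITPLANES = 5       # assumed display depth
GLYPH_ROWS = 8      # assumed 8x8 glyphs, one byte per row
CHAR_COUNT = 64     # hypothetical character set size


def expand_glyph(glyph_rows, colour):
    """Return one list of row bytes per bitplane for a single-colour glyph."""
    planes = []
    for plane in range(BITPLANES):
        if colour & (1 << plane):
            planes.append(list(glyph_rows))        # this plane shows the glyph
        else:
            planes.append([0] * len(glyph_rows))   # this plane stays clear
    return planes


one_plane_bytes = CHAR_COUNT * GLYPH_ROWS
five_plane_bytes = one_plane_bytes * BITPLANES
saved_bytes = five_plane_bytes - one_plane_bytes   # 4/5 of the 5-plane size
```

With these assumed numbers the charset shrinks from 2560 to 512 bytes; the expansion work moves to draw time, which is the raster-time cost mentioned above.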

Swings and roundabouts...

---
P.S. my intention here is to be helpful. If you'd rather I didn't post these kinds of things, just let me know. I know some of us just want to figure it out themselves. I'm definitely not trying to tell you how to do things.

Last edited by roondar; 25 July 2020 at 13:35. Reason: Made one point a bit more clear
Old 25 July 2020, 14:53   #124
agermose
I do prefer to figure things out myself, but ideas and help are always welcome, and your input is appreciated. Some good suggestions here, thank you.
I know the blitter is faster than the CPU, but the idea was to run the tilecopy in parallel.
Old 25 July 2020, 16:12   #125
roondar
Quote:
Originally Posted by agermose View Post
I do prefer to figure things out myself, but ideas and help are always welcome, and your input is appreciated. Some good suggestions here, thank you.
I know the blitter is faster than the CPU, but the idea was to run the tilecopy in parallel.
Ah, the reason this isn't faster on OCS is because both the Blitter and CPU are limited to 16 bits per bus access*, so if you run both at the same time you're effectively just switching which chip gets the time slots on the bus but the overall throughput can't increase for drawing tiles (though there are some exceptions to this, such as clearing blits).

*) though the CPU isn't fast enough to saturate the bus, while the Blitter often can.

---
I also wanted to say that I'm really liking what I'm seeing so far - the video of the full stage 32 looks great. It seems to be very close to the Arcade to me

Makes me very happy as 1942 was one of my favourite games back in the day, even when playing the in hindsight rather poor C64 port.
Old 03 August 2020, 12:45   #126
agermose
You're correct about the uselessness of running in parallel. I've been developing on an A1200, and forgot the limitations on OCS.
1942 was also one of my favourite arcade games, that's why I chose it. Thanks for the kind words about stage 32. There is still a lot of tweaking to do, to get it even closer to the original.

Back from vacation, and into optimisation mode. Lots of ideas, will focus on biggest wins first.

About spreading the construction of new tile data over multiple frames. Currently I calculate the next 14 tiles (224 pix) for the restore buffer, in one frame, then copy the entire tile row to the two screen buffers the next frame. 1942 scrolls 0.5 pix/frame, which makes this simple implementation possible.
Everything has to run fast enough for the worst case frame to stay within 1/50 sec. The restore buffer calculation is the worst case. Spreading the calculation over multiple frames requires additional buffer space, so the classic speed vs memory tradeoff.
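
To put numbers on the trade-off: at 0.5 pix/frame, and assuming 16-pixel-high tiles (which 14 tiles = 224 pix suggests), a new tile row is only needed every 32 frames. The sketch below shows one hypothetical way to spread the 14 tiles over that window; the function name and the one-tile-per-frame default are illustrative, not the actual engine.

```python
from fractions import Fraction

SCROLL_PX_PER_FRAME = Fraction(1, 2)   # from the post: 0.5 pix/frame
TILE_HEIGHT_PX = 16                    # assumed tile height
TILES_PER_ROW = 14                     # from the post: 14 tiles (224 pix)

# Frames available before the next tile row has to be complete.
FRAMES_PER_ROW = int(TILE_HEIGHT_PX / SCROLL_PX_PER_FRAME)


def tiles_for_frame(frame_in_cycle, tiles_per_frame=1):
    """Return the tile columns to build during this frame of the cycle."""
    start = frame_in_cycle * tiles_per_frame
    end = min(start + tiles_per_frame, TILES_PER_ROW)
    return list(range(start, end))
```

With one tile per frame the row is finished after 14 of the 32 frames, so the worst-case frame only pays for a single tile. The cost is that the partly built row has to live somewhere, which is the extra buffer space mentioned above.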

About memory usage.
Initially I had coded everything for 320x272 buffers, only copying tiles to the center 14 tiles, and just set the display window to 224x256.
During the weekend I changed everything to run in 256x272, and use the same display window.
This saved a big chunk of memory, and probably also improved performance, since the display data fetch only happens for 256 pixels now. Note: the 256 pixel buffer width is needed to have the enemies appear from either side, in the 224 pix window.
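
A quick back-of-the-envelope check of that saving, assuming 5 bitplanes and three buffers of equal size (two screen buffers plus the restore buffer). The buffer count is an assumption based on the earlier posts.

```python
BITPLANES = 5
BUFFER_COUNT = 3   # assumed: two screen buffers plus one restore buffer


def buffer_bytes(width_px, height_px, planes=BITPLANES):
    """Size in bytes of one bitmap buffer."""
    return (width_px // 8) * height_px * planes


old_size = buffer_bytes(320, 272)   # 54400 bytes per buffer
new_size = buffer_bytes(256, 272)   # 43520 bytes per buffer
saved_total = (old_size - new_size) * BUFFER_COUNT
```

Under these assumptions the change frees 10880 bytes per buffer, or 32640 bytes across all three.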

The best candidates for saving memory are the tilemap and the interleaved masks.
Our tilemap is 700 tiles, but the arcade only has 512. The difference is probably mirrored tiles. Will have to investigate some more. Big potential save here. Flipping the sprites is also an option to investigate.

The game runs in interleaved bitplanes, for speed. Switching to consecutive bitplane mode, would save 4/5 of the blitmasks, at the expense of speed.
Originally I had the idea of using the lower 16 colours for the software sprites, and then blitting only to 4 bitplanes, and clearing the 5th plane. But the addition of the black colour got the number of sprite colours to 17 plus transparent. If I can get it down to 16 colours again, then the speed increase of just clearing the 5th plane can make up for the loss of blitting each plane individually.
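
The 4/5 figure follows from how masks are stored: with interleaved bitplanes the mask has to be repeated for every plane so it lines up with the image data, while with consecutive planes a single mask plane can be reused for each blit. A small sketch, using a hypothetical 32x32 object and ignoring any extra shift words:

```python
def mask_bytes(width_px, height_px, planes, interleaved):
    """Bytes needed for one object's blit mask."""
    plane_bytes = (width_px // 8) * height_px
    # Interleaved: one mask copy per plane. Consecutive: one shared copy.
    return plane_bytes * planes if interleaved else plane_bytes


interleaved_size = mask_bytes(32, 32, 5, interleaved=True)    # 640 bytes
consecutive_size = mask_bytes(32, 32, 5, interleaved=False)   # 128 bytes
```

The price is one blit per plane instead of one blit for the whole object, which is the speed loss described above.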

Back to work.

Last edited by agermose; 03 August 2020 at 13:11.
Old 03 August 2020, 13:24   #127
roondar
Yeah, it's always a question of optimising memory or speed. The Amiga is so flexible in terms of its display that it becomes a giant (but ultimately IMHO satisfying) puzzle to try and solve.

I'm looking forward to seeing more updates
Old 04 August 2020, 00:10   #128
agermose
Quote:
Originally Posted by roondar View Post
Yeah, it's always a question of optimising memory or speed. The Amiga is so flexible in terms of its display that it becomes a giant (but ultimately IMHO satisfying) puzzle to try and solve.

I'm looking forward to seeing more updates
Like you, I rather enjoy this stage of development.
Saved another 4k today by changing some movement tables from w to b. Been wanting to refactor the movement code for a long time.
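
The word-to-byte change only works when every table entry fits in a signed byte. A sketch of the check and the saving, using a made-up table of 4096 movement deltas (the real table contents and sizes are not given in the thread):

```python
from array import array


def pack_as_bytes(word_table):
    """Repack a table of signed words as signed bytes, if every value fits."""
    values = list(word_table)
    if any(v < -128 or v > 127 for v in values):
        raise ValueError("table contains values that do not fit in a byte")
    return array("b", values)


# Hypothetical data: 4096 small movement deltas stored as 16-bit words.
deltas_w = array("h", [-3, -2, -1, 0, 1, 2, 3, 4] * 512)
deltas_b = pack_as_bytes(deltas_w)

saved = len(deltas_w) * deltas_w.itemsize - len(deltas_b) * deltas_b.itemsize
```

With 4096 entries the saving is 4096 bytes, in line with the 4k mentioned. On the 68000 side, code reading the table then has to sign-extend each byte before using it as a word.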
Old 04 August 2020, 11:19   #129
mcgeezer
Registered User

 
Join Date: Oct 2017
Location: Sunderland, England
Posts: 1,923
Quote:
Originally Posted by agermose View Post
Like you, I rather enjoy this stage of development.
Saved another 4k today by changing some movement tables from w to b. Been wanting to refactor the movement code for a long time.
Funny you should say this because I do too....
Squeezing everything into memory and onto disk was something I really enjoyed and I learned a few techniques along the way doing it.
Old 10 August 2020, 16:30   #130
agermose
Quick update. Been working on the character layer.
Working with hardware sprites in a 5 bpl setup is a real pain in the a....
Old 10 August 2020, 16:47   #131
roondar
Yeah, Amiga HW Sprite colours being shared with the main palette entries has always been a bit unfortunate

Yes, I know there is a trick to lower colour register use for 3 colour Sprites but even then you're still looking at "losing" six colour registers to use for just the Sprite channels. That's a lot of colours to lose.
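
For reference, a small sketch of the standard OCS mapping between sprite channels and colour registers, which is why a 5 bitplane (32 colour) playfield and the sprites end up fighting over the same palette entries. The function name is hypothetical.

```python
PLAYFIELD_REGISTERS_5BPL = set(range(32))   # 5 bitplanes use COLOR00-COLOR31


def sprite_colour_registers(sprite, attached=False):
    """Colour registers used by the visible pixels of one sprite channel."""
    if not 0 <= sprite <= 7:
        raise ValueError("sprite channel must be 0-7")
    if attached:
        # An attached pair is 15 colours plus transparent: COLOR17-COLOR31.
        return list(range(17, 32))
    # Each pair of channels shares three colours; the fourth is transparent.
    base = 16 + 4 * (sprite // 2)
    return [base + 1, base + 2, base + 3]


shared = set(sprite_colour_registers(0)) & PLAYFIELD_REGISTERS_5BPL
```

Sprites 0 and 1 use COLOR17-19, sprites 2 and 3 use COLOR21-23, and so on, so every sprite colour is also one of the upper 16 playfield colours.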
Old 10 August 2020, 17:09   #132
agermose
Quote:
Originally Posted by roondar View Post
Yeah, Amiga HW Sprite colours being shared with the main palette entries has always been a bit unfortunate

Yes, I know there is a trick to lower colour register use for 3 colour Sprites but even then you're still looking at "losing" six colour registers to use for just the Sprite channels. That's a lot of colours to lose.
Cannot afford to lose any colours. Looking at attached sprites instead.

Bigger problem is that the DMA timing of 5 bitplanes does not allow enough time for the horizontal multiplexing I had planned. Too many slots are taken, and not enough time to reposition a single sprite. Trying to mix it with attached sprites. Looks good on paper, but I'll have to see.

And it gets worse with the copper split, since it has to happen at a varying line, which may or may not be the same line as some sprites. Initially I have parked that issue, to focus on the sprite multiplexer.
Old 12 August 2020, 23:34   #133
agermose
Almost there with the score panel. Using a combination of repeated sprites and attached sprites did the trick. A couple of glitches to fix. Attached sprites work except in one case, and I have yet to figure out why.
Copper split not as big a problem as I had feared. Screenshot coming, when the glitches are fixed.
Old 14 August 2020, 17:48   #134
agermose
Colours to be sorted out.
[Attached screenshot: 1320D1DB-9243-42EE-941A-BEB0ED40BEB4.jpg]
Old 16 August 2020, 12:44   #135
lmimmfn
Registered User

 
Join Date: May 2018
Location: Mullingar
Posts: 143
Looks great
Old 16 August 2020, 14:12   #136
roondar
Quote:
Originally Posted by agermose View Post
Colours to be sorted out.
Nice! Always good when you get complicated effects to work
Old 16 August 2020, 18:58   #137
pink^abyss
Registered User
 
Join Date: Aug 2018
Location: Untergrund/Germany
Posts: 136
Quote:
Originally Posted by agermose View Post
Cannot afford to lose any colours. Looking at attached sprites instead.

Bigger problem is that the DMA timing of 5 bitplanes does not allow enough time for the horizontal multiplexing I had planned. Too many slots are taken, and not enough time to reposition a single sprite. Trying to mix it with attached sprites. Looks good on paper, but I'll have to see.

And it gets worse with the copper split, since it has to happen at a varying line, which may or may not be the same line as some sprites. Initially I have parked that issue, to focus on the sprite multiplexer.
I had similar issues to solve with an upcoming A500 + 512kb OCS project. I skipped any copper tricks for scroll wrapping and simply used half of the chipmem to have 3 buffers of 256x512 at 5 planes. To wrap, I duplicate the screen. No blit split needed.

As chipmem is now precious, I keep all tiles in fast mem. With 0.5 pixel scroll I copy one tile from fast mem with the CPU to the restore buffer and blit from there to the screen buffers, so I can scroll forever.

For bobs I blit from the copper list, with a couple of predefined copper lists for certain bob widths and variable blit height. I also blit all HUD elements.

This setup frees me from any sprite multiplexing and other complexity that often has drawbacks with regard to colour fidelity. One issue could be not having enough chipmem. To solve this I unpack new chipmem chunks on the fly as needed. Players usually don't notice when a game slows down for a single frame every few minutes while decrunching new data (I recommend DOYNAX for speed).
Old 18 August 2020, 19:34   #138
agermose
Looking forward to seeing your project, sounds interesting.
The way you do the scrolling is how the 1942 arcade works. Using all that chipmem was a no go on this project, since a lot of graphics are reused across all levels.
Old 18 August 2020, 20:18   #139
aros-sg
Registered User

 
Join Date: Nov 2015
Location: Italy
Posts: 53
Quote:
Originally Posted by pink^abyss View Post
I had similar issues to solve with an upcoming A500 + 512kb OCS project. I skipped any copper tricks for scroll wrapping and simply used half of the chipmem to have 3 buffers of 256x512 at 5 planes. To wrap, I duplicate the screen. No blit split needed.

As mentioned once in another thread it should be possible to avoid copper split and blit split (*) while still using normal sized (not double height like in your case) buffers if you basically arrange the single buffers inside a 3xheight master buffer and make them all wrap at the bottom of the master buffer.


33333
11111
11111
11111


11111
22222
22222
22222


22222
33333
33333
33333


This way 2 of the buffers are always non-wrapping in memory (no copper split, no blit split). Only one of the buffers will have a split most of the time, and there (*) you would need blit splits.

The buffers must swap "role" at the correct time as the wrapping changes, so that the buffer which wraps is always the "restore buffer".
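
One way to read the diagram is as a master buffer of 3 x height rows, with the three buffers starting one buffer height apart and all sliding together as the screen scrolls. The sketch below models that reading only; the names and the direction of the scroll are assumptions, not code from any of the projects in this thread.

```python
def buffer_layout(scroll_row, height):
    """Describe where each of the three buffers sits inside the master buffer."""
    master_rows = 3 * height
    layout = []
    for k in range(3):
        start = (scroll_row + k * height) % master_rows
        rows_before_wrap = min(height, master_rows - start)
        layout.append({
            "buffer": k + 1,
            "start": start,
            "rows_before_wrap": rows_before_wrap,
            "wraps": rows_before_wrap < height,
        })
    return layout


def restore_buffer(scroll_row, height):
    """Return the buffer that currently wraps, or None if none does."""
    wrapping = [b["buffer"] for b in buffer_layout(scroll_row, height) if b["wraps"]]
    return wrapping[0] if wrapping else None
```

With `height=4` and `scroll_row=1` this reproduces the diagram: buffers 1 and 2 are contiguous and buffer 3 has its last row wrapped to the top of the master buffer, so buffer 3 takes the restore role. Because the start rows are always one buffer height apart, at most one buffer can wrap at any scroll position.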
Old 19 August 2020, 11:35   #140
pink^abyss
Quote:
Originally Posted by agermose View Post
Looking forward to seeing your project, sounds interesting.
The way you do the scrolling is how the 1942 arcade works. Using all that chipmem was a no go on this project, since a lot of graphics are reused across all levels.

That's of course an issue with that approach. I guess the arcade uses around 64kb for sprites and the rest is tiles (fast ram!). This would be around 160kb interleaved masked blits for sprites.
With 3 buffers you would need ~400kb of chipram. This is quite a lot, as you get only around 440kb from an A500 DOS startup, and you need chipram for audio too.
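
The ~400kb figure can be checked with some quick arithmetic, assuming 1 kb = 1024 bytes, three 256x512 buffers at 5 planes, and the estimates of 160kb for sprites and 440kb of available chipram given above:

```python
KB = 1024


def buffer_bytes(width_px, height_px, planes):
    """Size in bytes of one bitmap buffer."""
    return (width_px // 8) * height_px * planes


one_buffer = buffer_bytes(256, 512, 5)   # 81920 bytes = 80 kb
all_buffers = 3 * one_buffer             # 240 kb
sprite_data = 160 * KB                   # estimate from the post
available = 440 * KB                     # estimate from the post

needed_kb = (all_buffers + sprite_data) // KB
remaining_kb = (available - all_buffers - sprite_data) // KB
```

That gives 400 kb needed and roughly 40 kb left over for everything else in chipram, audio included.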