English Amiga Board


Old 09 January 2018, 20:56   #21
LaBodilsen
Registered User

 
Join Date: Dec 2017
Location: Gandrup / Denmark
Posts: 58
Quote:
Originally Posted by britelite View Post
I had some spare time and decided to implement mipmapping in the renderer, which gets rid of displacements when reading the texture. So, now the code is pretty much:
Code:
...
move.b (a1)+,0(a0)
move.b (a1),160(a0)
move.b (a1)+,320(a0)
...
and for really up close walls (zoom factor >2.0):
Code:
...
move.b (a1)+,d2
move.b d2,0(a0)
move.b d2,160(a0)
move.b d2,320(a0)
move.b (a1)+,d2
...
With the stream I've been using this gets the speed to around 1.7-2.5 frames (including blitter clear of buffer and c2p), while the previous version barely had a best case of under 2 frames.

The routine still only renders bytes, so next up would be to try out chb's method of rendering in pairs. With this approach I might also try turning the framebuffer 90 degrees, so I could do the writing with pre/post-increments instead of displacements, saving a few additional cycles, as an additional blitter pass will be required anyway for handling the byte-pairs.
Nice. So all the work I just did to try out this:
Quote:
Originally Posted by britelite View Post
It could also be a good idea to have a look at the cases for zoom factor <1.0, to see if a combination of post-increments and addq.l #value,a1 could speed up the rendering, like:
Code:
...
move.b (a1)+,0(a0)
move.b (a1)+,160(a0)
addq.l #1,a1 ; skipping an additional byte
move.b (a1)+,320(a0)
move.b (a1)+,480(a0)
...
is wasted. Just kidding: I tried optimizing by hand, just to test, and the result was positive, as it sped up the rendering so that more cases ran in under 2 frames. Of course, now that you have changed to mipmaps, this is not needed anymore.
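
For readers following along, here is a rough sketch of what the zoom factor <1.0 case does, written in C rather than 68k assembly purely for readability. It is not the code from the thread: `draw_column` and its parameters are hypothetical names, and the 16.16 fixed-point stepping is an assumed way of modelling the "copy, copy, skip" pattern. A step of 1.5 texels per pixel happens to reproduce the texel order of the quoted listing (texels 0, 1, 3, 4).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: draw one wall column by stepping through the
   texture in 16.16 fixed point. 'stride' is the byte distance between
   two rows of the destination buffer (160 in the quoted listing). */
static void draw_column(uint8_t *dst, size_t stride, const uint8_t *tex,
                        uint32_t step_16_16, size_t height)
{
    uint32_t pos = 0;                      /* texture position, 16.16 */

    assert(dst != NULL && tex != NULL);
    for (size_t y = 0; y < height; y++) {
        dst[y * stride] = tex[pos >> 16];  /* integer part selects the texel */
        pos += step_16_16;                 /* a step above 1.0 skips texels */
    }
}
```

The unrolled assembly version bakes the result of this loop into a fixed instruction sequence per zoom level, which is why the skipped texel shows up there as a single addq on the source pointer.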

Btw, in your code it seems the chunky buffer is segmented for every 4 pixels (0, 1600, 3200, 4800). Is that to make C2P with the blitter easier?
Old 10 January 2018, 08:01   #22
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 378
Quote:
Originally Posted by LaBodilsen View Post
Btw, in your code it seems the chunky buffer is segmented for every 4 pixels (0, 1600, 3200, 4800). Is that to make C2P with the blitter easier?
Yes, the way my blitter c2p works, it's easier to have the pixels segmented this way, basically having them spread out on four bitplanes from the start. For 16-bit pixels (for example when using HAM), I can have a linear framebuffer instead.
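
As a rough illustration only (this is not the actual routine), the segmented addressing could be sketched in C as below. The segment size of 1600 bytes comes from the offsets quoted in the question. The assumption that pixel x lands in segment x mod 4 is a reading of "segmented for every 4 pixels", and the layout inside a segment is not described in the thread, so the hypothetical `segmented_offset` leaves that part to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Values taken from the offsets 0, 1600, 3200, 4800 quoted above. */
enum { SEGMENT_COUNT = 4, SEGMENT_BYTES = 1600 };

/* Hypothetical helper: byte offset of a pixel in a chunky buffer that is
   split into four segments. Assumes pixel x belongs to segment x mod 4;
   'offset_in_segment' is supplied by the caller because the thread does
   not say how pixels are ordered inside a segment. */
static size_t segmented_offset(unsigned x, size_t offset_in_segment)
{
    assert(offset_in_segment < SEGMENT_BYTES);
    return (size_t)(x % SEGMENT_COUNT) * SEGMENT_BYTES + offset_in_segment;
}
```

With this model, four horizontally adjacent pixels end up 1600 bytes apart, which would match the displacements seen in the rendering code.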
Old 10 January 2018, 15:25   #23
LaBodilsen
Registered User

 
Join Date: Dec 2017
Location: Gandrup / Denmark
Posts: 58
Quote:
Originally Posted by britelite View Post
With the stream I've been using this gets the speed to around 1.7-2.5 frames (including blitter clear of buffer and c2p), while the previous version barely had a best case of under 2 frames.
Could this benefit from double buffering the chunky buffer, so you don't have to wait for the C2P and buffer clear, but instead do the wall drawing while the blitter is working (preferably using a blitter interrupt)? Or is the chip memory bandwidth already saturated enough that it would not make much of a difference in performance?

Or do you want to use that "Blitter time" to do other stuff like raycasting?
Old 10 January 2018, 15:48   #24
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 378
Quote:
Originally Posted by LaBodilsen View Post
Could this benefit from double buffering the chunky buffer, so you don't have to wait for the C2P and buffer clear, but instead do the wall drawing while the blitter is working (preferably using a blitter interrupt)? Or is the chip memory bandwidth already saturated enough that it would not make much of a difference in performance?

Or do you want to use that "Blitter time" to do other stuff like raycasting?
I would at least use the time allowed by the blitter clearing of the buffer for raycasting.

I also had a stab at chb's double pixel rendering, and the results are very promising. The best case didn't improve much (due to the extra blitter pass), but now the stream runs almost entirely in two frames, worst case being maybe 2.05-2.1 frames.
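
To put the "frames" figures in perspective, here is a small C sketch of the arithmetic, assuming a PAL display refreshing at 50 Hz. The name `frames_to_fps` is made up for this example.

```c
#include <assert.h>

/* Assumption: PAL machine, 50 display frames per second. */
#define PAL_FRAMES_PER_SECOND 50.0

/* Hypothetical helper: convert "display frames per rendered update"
   into rendered updates per second. */
static double frames_to_fps(double frames_per_update)
{
    assert(frames_per_update > 0.0);
    return PAL_FRAMES_PER_SECOND / frames_per_update;
}
```

Under that assumption, 2.0 frames corresponds to 25 updates per second and the 2.1-frame worst case to roughly 23.8.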
Old 11 January 2018, 01:29   #25
Photon
Moderator
 
Join Date: Nov 2004
Location: Hult / Sweden
Posts: 4,501
Looking forward to the demo. About time Amiga got a better port than other platforms!

What are some reasons a chunky buffer should not have Y horizontal? I haven't found one, whether truecolor or indexed.

Last edited by Photon; 11 January 2018 at 01:43.
Old 11 January 2018, 07:36   #26
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 378
Quote:
Originally Posted by Photon View Post
What are some reasons a chunky buffer should not have Y horizontal? I haven't found one, whether truecolor or indexed.
If you mean having the chunky buffer rotated 90 degrees, then currently mainly the fact that I haven't felt like tinkering with my c2p routine yet, but at some point I will.
Old 11 January 2018, 08:29   #27
LaBodilsen
Registered User

 
Join Date: Dec 2017
Location: Gandrup / Denmark
Posts: 58
Quote:
Originally Posted by britelite View Post
I also had a stab at chb's double pixel rendering, and the results are very promising. The best case didn't improve much (due to the extra blitter pass), but now the stream runs almost entirely in two frames, worst case being maybe 2.05-2.1 frames.
That is just incredible, and close to insane. Having something like this running mostly at 25 fps on a plain A500 would have been regarded as "magic" 25 years ago.

OT: I've coded my own non-textured wall renderer hacked into your "framework". It's not using unrolled loops per se, and is dog slow (worst case is over 4 frames), but it was just for learning purposes, and it's all starting to make a lot more sense now.

Next up is writing my own raycaster, which is mostly done in pseudocode atm, but hopefully I will have some time this weekend to try it out in assembler.
/OT

Quote:
Originally Posted by Photon View Post
What are some reasons a chunky buffer should not have Y horizontal? I haven't found one, whether truecolor or indexed.
If you're thinking about my question regarding the segmented chunky buffer, then it's only segmented for every 4 pixels on X, not on Y. Sorry if my question was not clear enough.
Old 11 January 2018, 08:59   #28
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 1,180
I don't think you'll hit 25 fps with game logic, music, collisions and so on. A demo is a demo, a game is another thing! Btw, Amiga is still incredible after 25 years!

Thanks for your efforts to show us what a simple 7 MHz machine can do!
Old 11 January 2018, 09:05   #29
chb
Registered User

 
Join Date: Dec 2014
Location: germany
Posts: 73
Wow, great result, almost steady 25fps on an A500? Very nice! Even if raycaster + game logic come on top of that - a smaller window or running in 3 frames would be perfectly acceptable for a game. Looking very much forward to the next demo(s).
Old 11 January 2018, 09:05   #30
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 378
Quote:
Originally Posted by sandruzzo View Post
I don't think you'll hit 25 fps with game logic, music, collisions and so on.
Indeed, I'd guess a full game would run at around 12-16 fps at the current resolution.

But my goal with this routine is to code it in a way that it's suitable for a game, without wasting memory. For demo purposes I could already make it run way faster.

Last edited by britelite; 11 January 2018 at 09:13. Reason: Clarification to fps estimate
Old 11 January 2018, 09:28   #31
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 1,180
Quote:
Originally Posted by britelite View Post
Indeed, I'd guess a full game would run at around 12-16 fps at the current resolution.

But my goal with this routine is to code it in a way that it's suitable for a game, without wasting memory. For demo purposes I could already make it run way faster.

Your resolution is 160*100 2x2?
Old 11 January 2018, 09:42   #32
LaBodilsen
Registered User

 
Join Date: Dec 2017
Location: Gandrup / Denmark
Posts: 58
Quote:
Originally Posted by sandruzzo View Post
I don't think you'll hit 25 fps with game logic, music, collisions and so on. A demo is a demo, a game is another thing!
Of course not, but for what it does at the moment, 25 fps is a very good result.
Old 11 January 2018, 09:47   #33
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 378
Quote:
Originally Posted by sandruzzo View Post
Your resolution is 160*100 2x2?
It's 160x80 (to match the aspect ratio of the original Wolf3D)
Old 12 January 2018, 14:57   #34
Master484
Registered User
 
Join Date: Nov 2015
Location: Vaasa, Finland
Posts: 303
Here is a random idea:

Could a system be used where the graphics of the previous frame are never cleared but preserved, and when drawing a new frame we only draw those pixels that have changed since the last frame, and therefore need to be updated?

And those pixels that remain the same color as they did last frame, are simply skipped.

So every frame we would go through the pixels one by one, and check the current color versus the color that it should be; and draw the pixel only in the case where the "should be color" is different from the current color.

I think in quite many cases the individual pixel colors in two consecutive frames would be the same, and also you would never need to totally "clear" the screen. So could this method work or boost the speed?

And I don't have any experience in raycasting or 3D coding, just thought to mention this idea.
Old 12 January 2018, 16:05   #35
ajk
Registered User
 
Join Date: May 2010
Location: Helsinki, Finland
Posts: 1,066
It's quite likely that doing such comparisons for each pixel will take more time than just redrawing the screen.
Old 12 January 2018, 20:12   #36
Dunny
Registered User

 
Join Date: Aug 2006
Location: Scunthorpe/United Kingdom
Posts: 974
Ask yourself - how will you know if those pixels have changed from the previous frame?
Old 12 January 2018, 22:18   #37
saimon69
J.M.D - Bedroom Musician

 
Join Date: Apr 2014
Location: los angeles,ca
Posts: 777
While the engine work goes on, I want to share my ideas for palettes: considering that the target is OCS/ECS, I expect the game will very likely run in 16 colors, so even without using a dual playfield approach I guess using 8 colors for the walls and 8 for enemies (actually 7) might be a decent approach. I did not paint anything, but I'm thinking of something like this:
- a dark neutral color for shadows (like a dark grey 10%) that might double as black
- two shades of grey in the 66% and 33%
- one brown - 50%
- one red
- one pink
- one white or very light grey like 12%

This should cover most of the W3D sprites. Some colors might change according to levels.
Old 13 January 2018, 00:45   #38
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 378
Quote:
Originally Posted by saimon69 View Post
While the engine work goes on, I want to share my ideas for palettes
Let's keep off-topic discussion out of this thread.
Old 13 January 2018, 10:52   #39
Master484
Registered User
Master484's Avatar
 
Join Date: Nov 2015
Location: Vaasa, Finland
Posts: 303
Quote:
Ask yourself - how will you know if those pixels have changed from the previous frame?
I would make a data table which holds the color values of each pixel in the screen.

And when drawing new pixels to screen, I would check the old values from this table, and in those cases where the old pixel color is different from the one that I'm going to draw, then I would update both the pixel on the screen and the value in the data table.

Also only those pixel values would need to be checked that are going to be updated this frame. So we wouldn't need to go through the whole data table.

Although of course doing this stuff would consume time...but if lots of pixel drawing and screen clearing every frame could be skipped, then could it be faster? I don't know, just throwing the idea around.

Last edited by Master484; 13 January 2018 at 11:00.
Old 13 January 2018, 11:16   #40
Megol
Registered User

 
Join Date: May 2014
Location: inside the emulator
Posts: 292
Quote:
Originally Posted by Master484 View Post
I would make a data table which holds the color values of each pixel in the screen.

And when drawing new pixels to screen, I would check the old values from this table, and in those cases where the old pixel color is different from the one that I'm going to draw, then I would update both the pixel on the screen and the value in the data table.

Also only those pixel values would need to be checked that are going to be updated this frame. So we wouldn't need to go through the whole data table.

Although of course doing this stuff would consume time...but if lots of pixel drawing and screen clearing every frame could be skipped, then could it be faster? I don't know, just throwing the idea around.
So you are rendering all the pixels anyway but adding a comparison requiring a memory read plus a branch.

If the texturing code was very fast but the c2p code was _extremely_ slow, that could be a good choice. However, here the texturing code is extremely slow and the c2p isn't.
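
A minimal C sketch of the two approaches, to make the cost argument concrete. The function names are hypothetical, and the array `wanted` stands in for the colors the texture mapper would have to compute for the new frame in either case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain redraw: one write per pixel, no reads of the old frame. */
static void redraw_all(uint8_t *screen, const uint8_t *wanted, size_t count)
{
    assert(screen != NULL && wanted != NULL);
    for (size_t i = 0; i < count; i++)
        screen[i] = wanted[i];
}

/* Compare-then-write: every pixel still needs its new color computed,
   plus a read of the old value and a branch; only the write is skipped
   when the color is unchanged. Returns the number of writes performed. */
static size_t redraw_changed(uint8_t *screen, const uint8_t *wanted,
                             size_t count)
{
    size_t written = 0;

    assert(screen != NULL && wanted != NULL);
    for (size_t i = 0; i < count; i++) {
        if (screen[i] != wanted[i]) {
            screen[i] = wanted[i];
            written++;
        }
    }
    return written;
}
```

The second loop only saves the final write, and it pays for that with an extra read and a branch on every pixel, which is the trade-off described above.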