English Amiga Board


Old 13 March 2018, 20:30   #1
Samurai_Crow
Total Chaos forever!
 
 
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,186
Blitter C2P? How?

I know from the Wolfenstein 3D port thread that turning the chunky buffer sideways in Chip RAM allows the blitter to somehow assist in chunky-to-planar conversion. I also know that, bitwise, doing so rotates the bits exactly 90 degrees about a central point of symmetry. How to use the blitter to do C2P escapes me, though. Is there any documentation, or are there any examples, of a blitter-based chunky-to-planar converter? My goal is to make an AGA 4-bit mode for dual playfields and sprites.
Old 13 March 2018, 21:48   #2
DanScott
Lemon. / Core Design
 
 
Join Date: Mar 2016
Location: Tier 5
Posts: 1,212
Quote:
Originally Posted by Samurai_Crow View Post
I know from the Wolfenstein 3D port thread that turning the chunky buffer sideways in Chip RAM allows the blitter to somehow assist in chunky-to-planar conversion. I also know that, bitwise, doing so rotates the bits exactly 90 degrees about a central point of symmetry. How to use the blitter to do C2P escapes me, though. Is there any documentation, or are there any examples, of a blitter-based chunky-to-planar converter? My goal is to make an AGA 4-bit mode for dual playfields and sprites.
I believe that turning the buffer sideways allows the CPU to write the vertical strips using (a0)+, which is quicker than writing to offset(a0).
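To make the addressing difference concrete, here is a minimal C sketch of writing one wall column into a sideways (column-major) buffer versus a conventional row-major one. The function names and buffer layout are assumptions for illustration only; they are not taken from any of the ports discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sideways (column-major) chunky buffer: pixel (x, y) lives at x * height + y,
   so one screen column is a contiguous run of bytes. */
static void draw_column_sideways(uint8_t *buffer, size_t height, size_t x,
                                 const uint8_t *texels, size_t count)
{
    assert(count <= height);
    uint8_t *dst = buffer + x * height;    /* top of column x */
    for (size_t y = 0; y < count; ++y) {
        *dst++ = texels[y];                /* comparable to move.b d0,(a0)+ */
    }
}

/* Conventional (row-major) buffer: pixel (x, y) lives at y * width + x,
   so every pixel of a column needs its own row offset. */
static void draw_column_rowmajor(uint8_t *buffer, size_t width, size_t x,
                                 const uint8_t *texels, size_t count)
{
    assert(x < width);
    uint8_t *dst = buffer + x;
    for (size_t y = 0; y < count; ++y) {
        *dst = texels[y];                  /* comparable to move.b d0,offset(a0) */
        dst += width;
    }
}
```

Both routines draw the same column; only the memory layout, and therefore the addressing mode a 68k inner loop could use, differs.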
Old 13 March 2018, 22:42   #3
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
There is a short blurb on it at the bottom of http://www.lysator.liu.se/~mikaelk/doc/c2ptut/ .

The gist of it is:

1. You can construct a "merge op" which exchanges bits 'diagonally' between a pair of CPU registers via some AND/shift/EOR operations

2. You can construct a similar "merge op" with the blitter, by doing one ascending and one descending blitter pass

3. Judicious use of the "merge op" with different shift factors and masks allows you to perform c2p conversion

1 + 3 => CPU C2P

2 + 3 => Blitter C2P
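
The two kinds of "merge op" can be modelled in plain C. The sketch below is an illustration under assumptions, not code from the tutorial: merge_cpu uses the AND/shift/EOR pattern on a pair of 32-bit values, and merge_two_pass produces the same result as two independent masked selects, one shifting right and one shifting left. That is the shape a blitter version takes, since the blitter shifts right in ascending mode and left in descending mode. Real blitter register setup (minterms, modulos, word-wise shifting across a buffer) is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* CPU-style merge: exchange the bits of *b selected by mask with the bits of
   *a that sit `shift` positions higher, using AND/shift/EOR. */
static void merge_cpu(uint32_t *a, uint32_t *b, unsigned shift, uint32_t mask)
{
    assert(shift < 32);
    uint32_t t = ((*a >> shift) ^ *b) & mask;
    *b ^= t;
    *a ^= t << shift;
}

/* The same exchange written as two separate "select" passes. Each pass reads
   both inputs and writes one output, which is how a blitter pass behaves. */
static void merge_two_pass(uint32_t *a, uint32_t *b, unsigned shift, uint32_t mask)
{
    assert(shift < 32);
    uint32_t high_mask = mask << shift;
    /* Pass with a left shift (descending mode on the blitter). */
    uint32_t new_a = (*a & ~high_mask) | ((*b << shift) & high_mask);
    /* Pass with a right shift (ascending mode on the blitter). */
    uint32_t new_b = (*b & ~mask) | ((*a >> shift) & mask);
    *a = new_a;
    *b = new_b;
}
```

With shift 4 and mask 0x0F0F0F0F, for example, the low nibbles of b move into the high-nibble positions of a, and the high nibbles of a move into the low-nibble positions of b. Repeating the op with other shift/mask pairs (such as 16/0x0000FFFF, 8/0x00FF00FF, 2/0x33333333, 1/0x55555555) is what point 3 refers to.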
Old 15 March 2018, 08:43   #4
chb
Registered User
 
Join Date: Dec 2014
Location: germany
Posts: 439
Quote:
Originally Posted by DanScott View Post
I believe that turning the buffer sideways allows the CPU to write the vertical strips using (a0)+, which is quicker than writing to offset(a0).
Yep, it's not related to blitter c2p. In fact, britelite's wolf3d demo was using blitter c2p from the beginning, and the rotated chunky buffer came later. AFAIK, he rotates it back to the normal layout with the blitter before doing c2p.
Old 15 March 2018, 10:18   #5
Samurai_Crow
Total Chaos forever!
 
 
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,186
It seems I'll have to give this some thought.

Does the 90 degree rotation work on a single tall bitplane?

Old 15 March 2018, 12:33   #6
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
If the rotation happens before c2p, it will work on chunky pixels. If the rotation happens afterward, it will work on individual bitplanes. If the rotation happens as part of the c2p, it... makes the c2p transform a bit more convoluted to describe.
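
As an illustration of the first case (rotation before c2p, operating on whole chunky pixels), here is a minimal C sketch that turns a sideways buffer back into a conventional one. It performs a plain transpose; a true 90 degree rotation would additionally reverse one axis. The function name and the one-byte-per-pixel layout are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Transpose a row-major chunky image of src_width x src_height bytes.
   The result is row-major with width src_height and height src_width. */
static void transpose_chunky(const uint8_t *src, size_t src_width,
                             size_t src_height, uint8_t *dst)
{
    assert(src != dst);                     /* not an in-place transpose */
    for (size_t y = 0; y < src_height; ++y) {
        for (size_t x = 0; x < src_width; ++x) {
            dst[x * src_height + y] = src[y * src_width + x];
        }
    }
}
```

This is the CPU equivalent of the step; doing it with the blitter, or folding it into the c2p passes, moves the same bytes with different machinery.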
Old 15 March 2018, 17:12   #7
Samurai_Crow
Total Chaos forever!
 
 
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,186
Quote:
Originally Posted by Kalms View Post
If the rotation happens before c2p, it will work on chunky pixels. If the rotation happens afterward, it will work on individual bitplanes. If the rotation happens as part of the c2p, it... makes the c2p transform a bit more convoluted to describe.
Still, rotating the bits 90 degrees clockwise with the CPU would combine the depth and X dimension loops into one loop for simpler chunky to planar conversions on the 020+. I'll have to try writing such a routine and see how it clocks out.
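
One way to picture this: for 8 pixels of 8 bits each, chunky-to-planar is an 8x8 bit-matrix transpose (a rotation plus a mirror). The C sketch below is an assumption-laden illustration, not a tuned 020+ routine: it packs 8 chunky bytes into a 64-bit value, applies the well-known three-step 8x8 bit transpose, and reads one byte per bitplane back out. The names transpose8x8 and c2p_8pixels are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Transpose an 8x8 bit matrix stored as 8 bytes in a 64-bit value
   (top row in the most significant byte, left column in bit 7 of each byte).
   Each step swaps off-diagonal blocks: 1x1, then 2x2, then 4x4. */
static uint64_t transpose8x8(uint64_t x)
{
    x = (x & 0xAA55AA55AA55AA55ULL)
      | ((x & 0x00AA00AA00AA00AAULL) << 7)
      | ((x >> 7) & 0x00AA00AA00AA00AAULL);
    x = (x & 0xCCCC3333CCCC3333ULL)
      | ((x & 0x0000CCCC0000CCCCULL) << 14)
      | ((x >> 14) & 0x0000CCCC0000CCCCULL);
    x = (x & 0xF0F0F0F00F0F0F0FULL)
      | ((x & 0x00000000F0F0F0F0ULL) << 28)
      | ((x >> 28) & 0x00000000F0F0F0F0ULL);
    return x;
}

/* Convert 8 chunky pixels into one byte per bitplane.
   planes[p] holds bit p of every pixel, leftmost pixel in bit 7. */
static void c2p_8pixels(const uint8_t chunky[8], uint8_t planes[8])
{
    assert(chunky != planes);
    uint64_t x = 0;
    for (int i = 0; i < 8; ++i) {
        x = (x << 8) | chunky[i];          /* pixel 0 ends up in the top byte */
    }
    x = transpose8x8(x);
    for (int p = 0; p < 8; ++p) {
        planes[p] = (uint8_t)(x >> (8 * p));
    }
}
```

For a 4-bit mode only planes[0] to planes[3] would be used. Each masked step above is the same kind of exchange as the "merge op" described earlier in the thread, applied within one wide register instead of between two.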
Old 22 April 2018, 23:47   #8
AndNN
Registered User
 
Join Date: Oct 2016
Location: Australia
Posts: 58
Quote:
Originally Posted by Kalms View Post
If the rotation happens before c2p, it will work on chunky pixels. If the rotation happens afterward, it will work on individual bitplanes. If the rotation happens as part of the c2p, it... makes the c2p transform a bit more convoluted to describe.
I'm actually curious about the 90 degree rotation idea. I wrote a paragraph about this technique on 16 Nov 2016, in a thread I started about how to do a Wolf 3D port. I came up with the idea pretty much independently, and I'm wondering whether any demos used this trick before my post. I'm asking Kalms because he is an expert on C2P routines and should know the history of blitter C2P conversions. It would be good to know the history of all this, just to clarify a few things.
Old 23 April 2018, 13:26   #9
LaBodilsen
Registered User
 
Join Date: Dec 2017
Location: Denmark
Posts: 179
Well it seems Tony McGarry got the same basic idea for rotating the buffer in 1994, when talking about porting Doom.

https://groups.google.com/d/msg/comp...o/D9XpzynemsQJ

Quote:
To take advantage of planar you really need a routine that
works by scaling rows ( so you can get 8/16 consecutive
pixels ) . Unfortunatley wall tmapping usually involves
scaling columns so planar wouldn't be a good idea .
Out of interest does anyone know the fastest way to rotate
a screen 90 degrees . Then you could scale rows and just
rotate to give the walls , making planar tmapping just about
possible . My method only needs each column to be shifted
vertically , but optimising it is proving difficult .

Basically I think chunky emulation is the easiest ( and
currently most effective ) way to tmap .

Tony
Old 23 April 2018, 13:40   #10
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
When I saw [YouTube video] I thought "huh, smart use of a wolfenstein/doom style wall renderer". Not only do they render quicker by drawing horizontally, they also get a different visual impact. It's not exactly a huge step from there to "perhaps it is faster to render horizontally + transpose, than to render vertically".

I believe that the Amiga game "Trapped" renders walls horizontally and transposes / performs c2p. Based on second hand info, sourced from this ooold thread ( https://groups.google.com/forum/#!searchin/comp.sys.amiga.programmer/doom$20trapped|sort:date/comp.sys.amiga.programmer/2Sl744igESk/4JajbJvdPCoJ )

The primary reason why I never made any c2p routine which also included a transpose is because its use case is rather narrow. When people were making wolf/doom clones in the late-90s everyone was drawing both walls and textured floors. Transpose in the c2p will speed up wall rendering but make floor/ceiling rendering slower.

In the case of Trapped I expect (but don't know) that TTS/Oxyron measured the performance difference and either did all rendering in transposed mode, or rendered walls transposed, then performed transpose, then continued with drawing walls/floors in non-transposed mode, all depending on which came out faster in the general case.
Old 23 April 2018, 15:34   #11
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 2,281
Is it possible to do 1x1 c2p using only the blitter, or something close to that?
Old 23 April 2018, 21:47   #12
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 818
Quote:
Originally Posted by sandruzzo View Post
Is it possible to do 1x1 c2p using only the blitter, or something close to that?
Yes, 1x1 c2p is possible with the blitter. But I'm not really sure if there would be any benefit of doing it completely with the blitter compared to a cpu+blitter version.
Old 24 April 2018, 09:13   #13
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 2,281
Quote:
Originally Posted by britelite View Post
Yes, 1x1 c2p is possible with the blitter. But I'm not really sure if there would be any benefit of doing it completely with the blitter compared to a cpu+blitter version.
I need it for a 256*192 screen, so that the CPU is left free to do all the texture mapping.
Old 24 April 2018, 09:20   #14
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
The major problem with doing it entirely with the blitter is that it caps your max framerate.

For example, if you have a normal chunky-pixel format as input, with 1 byte per pixel, and you want to convert that to 320 x 256 x 4 bitplanes output, that will in the ideal case require approx. 4 frames of processing. This means that the blitter will hog chipmem for 4 frames. You can therefore expect that any visual effect will take at least 5 frames to run = 10fps.

If you want to go quicker than 10fps at that resolution, you need to reduce the amount of blitter work. You can do that by modifying the format of the chunkybuffer. When you do that, you are effectively moving work from the blitter side to the CPU side. Moving more work to the CPU side means more complicated, and sometimes slower, CPU code.

You can also move format changes into the source graphics data (i.e. reformat your textures). This can reduce both CPU and blitter work. When you do this, it often results in more memory usage, and sometimes more complicated CPU code.

For many "chunky" A500 effects, people balance these three parameters to find a sweet spot. People often end up doing 0, 1 or 2 passes with the blitter. People often scramble the pixel format in textures. People sometimes use 16 bits per pixel in the texture. People often unroll the CPU logic and make custom code for processing 4 pixels at a time in order to support the pixel format in their textures.
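
The frame-rate arithmetic above can be written out as a small C sketch. Only the figure of roughly 4 frames for 320 x 256 x 4 bitplanes comes from this post; the 50 Hz PAL refresh, the linear scaling with pixel count, and the "blitter frames plus one" rule are simplifying assumptions for illustration, and the function names are made up.

```c
#include <assert.h>

enum {
    PAL_HZ = 50,            /* assumed display refresh rate */
    REF_WIDTH = 320,        /* reference case from the post */
    REF_HEIGHT = 256,
    REF_BLIT_FRAMES = 4     /* approx. blitter time for the reference case */
};

/* Estimated whole frames of blitter work, assuming the cost scales
   linearly with the number of pixels (rounded up). */
static int blitter_frames(int width, int height)
{
    assert(width > 0 && height > 0);
    long ref_pixels = (long)REF_WIDTH * REF_HEIGHT;
    long work = (long)REF_BLIT_FRAMES * width * height;
    return (int)((work + ref_pixels - 1) / ref_pixels);
}

/* Upper bound on frame rate, assuming the effect needs at least one
   frame more than the blitter occupies chip memory. */
static double fps_cap(int width, int height)
{
    return (double)PAL_HZ / (blitter_frames(width, height) + 1);
}
```

Under these assumptions, fps_cap(320, 256) gives 10.0, matching the figure above, and fps_cap(256, 192) gives 12.5, so shrinking the screen alone does not change the picture much.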
Old 24 April 2018, 09:22   #15
sandruzzo
Registered User
 
Join Date: Feb 2011
Location: Italy/Rome
Posts: 2,281
Quote:
Originally Posted by Kalms View Post
The major problem with doing it entirely with the blitter is that it caps your max framerate.

Thanks a lot for your explanation. But I don't need 320*256; 256*192 will suffice. How is the texture arranged in scrambled mode?
Old 24 April 2018, 09:46   #16
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
Quote:
Originally Posted by sandruzzo View Post
Thanks a lot for your explanation. But I don't need 320*256; 256*192 will suffice. How is the texture arranged in scrambled mode?
That depends on what blitter logic you choose to use.

My suggestion: Before you try to use a CPU/blitter combination, try making a CPU version of an effect - any effect - and in-line the C2P conversion into it. In other words, your effect should have an inner loop which reads from the texture, does all the processing needed, and then writes out the result directly to the bitplanes.

Once you have done that, you will have a better understanding of how to shuffle around bits when converting from chunky to planar. You are in a much better position to then move some of the work onto the blitter when you have that understanding.
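
As a starting point for that exercise, here is a minimal C sketch of an effect with the C2P in-lined: a horizontal texture stretch whose inner loop reads texels, assembles one byte per bitplane, and writes straight to the planes. Everything in it is an assumption for illustration (4 bitplanes, 8.8 fixed-point stepping, the name draw_row); it is deliberately naive and not how an optimised 68k routine would look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PLANES = 4 };

/* Draw one row of width_pixels pixels (a multiple of 8) into 4 bitplanes.
   u and du are 8.8 fixed-point texture coordinates; texels use 4 bits. */
static void draw_row(uint8_t *const planes[PLANES], size_t byte_offset,
                     int width_pixels, const uint8_t *texture_row,
                     uint32_t u, uint32_t du)
{
    assert(width_pixels % 8 == 0);
    for (int x = 0; x < width_pixels; x += 8) {
        uint8_t bits[PLANES] = {0};
        for (int i = 0; i < 8; ++i) {
            uint8_t color = texture_row[u >> 8] & 0x0F;   /* texel fetch */
            u += du;
            for (int p = 0; p < PLANES; ++p) {
                /* Shift in bit p of this pixel; the leftmost pixel ends in bit 7. */
                bits[p] = (uint8_t)((bits[p] << 1) | ((color >> p) & 1u));
            }
        }
        for (int p = 0; p < PLANES; ++p) {
            planes[p][byte_offset + (size_t)x / 8] = bits[p];   /* planar write */
        }
    }
}
```

Watching which bits end up where in this loop shows what a scrambled texture format or a blitter pass would later be taking over.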
Old 24 April 2018, 10:19   #17
hooverphonique
ex. demoscener "Bigmama"
 
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,624
Quote:
Originally Posted by Kalms View Post
When I saw [YouTube video] I thought "huh, smart use of a wolfenstein/doom style wall renderer". Not only do they render quicker by drawing horizontally, they also get a different visual impact. It's not exactly a huge step from there to "perhaps it is faster to render horizontally + transpose, than to render vertically".
If you mean the "roller coaster" effect in the beginning, it might as well be done with the copper and 102-scaling..
Old 24 April 2018, 10:31   #18
DanScott
Lemon. / Core Design
 
 
Join Date: Mar 2016
Location: Tier 5
Posts: 1,212
Quote:
Originally Posted by hooverphonique View Post
If you mean the "roller coaster" effect in the beginning, it might as well be done with the copper and 102-scaling..
No, there's a full wall / ceiling doom style renderer later on in the demo

Unless Kalms is referring to that first part (which is most definitely the 102 trick, plus a lot of prescaled bitmaps).
Old 24 April 2018, 10:34   #19
DanScott
Lemon. / Core Design
 
 
Join Date: Mar 2016
Location: Tier 5
Posts: 1,212
Quote:
Originally Posted by sandruzzo View Post
How is the texture arranged in scrambled mode?
Maybe best if you start to experiment... you can look at each iteration of your code and see where optimisations can be made in terms of how you store your source data etc...
Old 24 April 2018, 11:15   #20
Kalms
Registered User
 
Join Date: Nov 2006
Location: Stockholm, Sweden
Posts: 237
Quote:
Originally Posted by DanScott View Post
No, there's a full wall / ceiling doom style renderer later on in the demo

Unless Kalms is referring to that first part (most definitely 102 trick for sure..and a lot of prescaled bitmaps)
oh, duh. Yeah, I was referring to that first part. Never realized that it was 102 based