31 May 2018, 23:58 | #1 |
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
Fast tile flipping on CD32
Hi all,
Does anyone know if there are any system calls I can make on the CD32 that will take a graphical tile and flip it on the X axis, the Y axis, or both? I have the CD32 developer documentation, but for some reason I can't find anything specific to the Akiko chip and how to access it. I want to take a 32x32 pixel tile and flip it. Any help, as always, is really appreciated. Geezer |
01 June 2018, 08:06 | #2 | ||
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Quote:
I do not know if one exists, and in any case I imagine it would be slow.
Quote:
Flipping around the x axis is very simple with both the CPU and the blitter (modulo is your friend). The alternative is to work completely in chunky, forgetting bitplanes and working only in bytes, and then do the conversion with Akiko (which I have no idea how to program) or with one of the many chunky-to-planar routines available. Obviously the fastest option is to use double the memory (or triple, if you also want the composite flips) and keep different copies of the same tile. |
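As a rough illustration of the "modulo is your friend" remark: a flip around the x axis only reorders rows, so no bit manipulation is needed. The C sketch below models that idea by writing the destination rows in reverse order, much like a blitter copy with a negative destination modulo would. The function name, parameters, and the layout (one bitplane stored as consecutive 32-bit words) are assumptions made for this example, not something taken from the thread.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: copy one bitplane of a tile upside down.
 * src and dst must not overlap.
 * rows          = number of pixel rows in the tile
 * words_per_row = 32-bit words per row (1 for a 32-pixel-wide tile) */
void flip_plane_vertical(const uint32_t *src, uint32_t *dst,
                         size_t rows, size_t words_per_row)
{
    for (size_t y = 0; y < rows; y++) {
        const uint32_t *s = src + y * words_per_row;
        /* Destination walks backwards, like a negative modulo. */
        uint32_t *d = dst + (rows - 1 - y) * words_per_row;
        for (size_t x = 0; x < words_per_row; x++)
            d[x] = s[x];
    }
}
```

For a multi-plane tile the same call would be repeated once per plane.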
||
01 June 2018, 09:42 | #3 | |
ex. demoscener "Bigmama"
Join Date: Jun 2012
Location: Fyn / Denmark
Posts: 1,624
|
Quote:
|
|
01 June 2018, 16:31 | #4 | |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
|
Quote:
This code is wrong, see my post below for the corrected version.
Code:
    move.l #$55555555,d1
    eor.l d0,d1
    eor.l d1,d0
    add.l d1,d1
    lsr.l #1,d0
    or.l d1,d0
    move.l #$33333333,d1
    eor.l d0,d1
    eor.l d1,d0
    lsl.l #2,d1
    lsr.l #2,d0
    or.l d1,d0
    move.l #$0f0f0f0f,d1
    eor.l d0,d1
    eor.l d1,d0
    lsl.l #4,d1
    lsr.l #4,d0
    or.l d1,d0
    rol.w #8,d0
    swap d0
    rol.w #8,d0
Last edited by Thorham; 03 June 2018 at 00:10. |
|
01 June 2018, 16:53 | #5 |
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
Thanks for the suggestions guys, I guess I'm looking at doing it with the CPU.
There are a couple of reasons I can't spend memory on this: one is capacity, the other is the added complication. On the plus side, I only need to do the flip when the Side Arms tile map calls for it. The other plus is that the scrolling only runs at 25 FPS, so I should have plenty of time. I'll write the scroll routine over the next week or so; the challenge is getting all of the palettes to mesh together during scrolling without having to alter the arcade ROM tile map - ugh. |
01 June 2018, 18:07 | #6 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
If there isn't enough memory for keeping mirrored copies of the same tile, then you might use some kind of graphical cache holding the last few ones that were used.
If you want to do it purely dynamically, then the 256-byte table seems the best compromise. |
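For readers unfamiliar with the 256-byte table idea, here is a minimal C sketch of it: one table entry per possible byte value, holding that byte with its bits mirrored, and a 32-bit flip built from four lookups with the byte order reversed. The names `flip8`, `init_flip8` and `flip32_lut8` are made up for this example.

```c
#include <stdint.h>

/* flip8[b] holds byte b with its 8 bits in mirrored order (256 bytes). */
static uint8_t flip8[256];

/* Build the table once at startup. */
void init_flip8(void)
{
    for (int i = 0; i < 256; i++) {
        uint8_t r = 0;
        for (int bit = 0; bit < 8; bit++)
            if (i & (1 << bit))
                r |= (uint8_t)(0x80 >> bit);
        flip8[i] = r;
    }
}

/* Mirror 32 pixels of one bitplane row: flip each byte through the
 * table and reverse the order of the four bytes. */
uint32_t flip32_lut8(uint32_t v)
{
    return ((uint32_t)flip8[v & 0xff] << 24) |
           ((uint32_t)flip8[(v >> 8) & 0xff] << 16) |
           ((uint32_t)flip8[(v >> 16) & 0xff] << 8) |
            (uint32_t)flip8[(v >> 24) & 0xff];
}
```

The table costs 256 bytes regardless of how many tiles exist, which is why it is a compromise between precomputed copies and pure computation.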
01 June 2018, 21:08 | #7 |
J.M.D - Bedroom Musician
Join Date: Apr 2014
Location: los angeles,ca
Posts: 3,519
|
The good ol' Side Arms! So underrated, but also with some playability problems. I would like to see it ported decently and improved over its original incarnation...
|
01 June 2018, 21:41 | #8 | |
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
Quote:
However I will get a nice 8 way scrolling routine out of it supporting 16 or 32 pixel tile sets that I could use on other projects. |
|
01 June 2018, 23:14 | #9 |
J.M.D - Bedroom Musician
Join Date: Apr 2014
Location: los angeles,ca
Posts: 3,519
|
Powder was using a scrolling technique similar to Side Arms and some tricks to run a lot of stuff with 16 colors. I have the source code if you want to give it a look.
|
01 June 2018, 23:20 | #10 |
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
|
02 June 2018, 21:58 | #11 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
|
|
02 June 2018, 23:07 | #12 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Quote:
This is a correct version (I have not thought much about whether it can be optimized):
Code:
    ; swap adjacent bits
    move.l d0,d1
    move.l #$55555555,d2
    lsr.l #1,d0
    add.l d1,d1
    and.l d2,d0
    add.l d2,d2
    and.l d2,d1
    or.l d1,d0
    ; swap adjacent bit pairs
    move.l d0,d1
    move.l #$33333333,d2
    lsr.l #2,d0
    lsl.l #2,d1
    and.l d2,d0
    lsl.l #2,d2
    and.l d2,d1
    or.l d1,d0
    ; swap nibbles within each byte
    move.l d0,d1
    move.l #$0f0f0f0f,d2
    lsr.l #4,d0
    lsl.l #4,d1
    and.l d2,d0
    lsl.l #4,d2
    and.l d2,d1
    or.l d1,d0
    ; reverse the order of the four bytes
    rol.w #8,d0
    swap d0
    rol.w #8,d0
|
|
03 June 2018, 00:09 | #13 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
|
Thanks for pointing that out. Some of the eors have to be ands. Remind me to test code before posting it.
Code:
    ; swap adjacent bits
    move.l #$55555555,d1
    and.l d0,d1
    eor.l d1,d0
    add.l d1,d1
    lsr.l #1,d0
    or.l d1,d0
    ; swap adjacent bit pairs
    move.l #$33333333,d1
    and.l d0,d1
    eor.l d1,d0
    lsl.l #2,d1
    lsr.l #2,d0
    or.l d1,d0
    ; swap nibbles within each byte
    move.l #$0f0f0f0f,d1
    and.l d0,d1
    eor.l d1,d0
    lsl.l #4,d1
    lsr.l #4,d0
    or.l d1,d0
    ; reverse the order of the four bytes
    rol.w #8,d0
    swap d0
    rol.w #8,d0
|
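For anyone who wants to check the logic off the Amiga, the corrected routine can be modelled in C roughly as follows. This is only a sketch that mirrors the instruction sequence step by step; the function name `flip32` is invented here, and `v` plays the role of d0 while `t` plays the role of d1.

```c
#include <stdint.h>

/* C model of the corrected mask-and-shift routine: v is d0, t is d1. */
uint32_t flip32(uint32_t v)
{
    uint32_t t;

    t = v & 0x55555555u;   /* and.l d0,d1 : low bit of each pair   */
    v ^= t;                /* eor.l d1,d0 : high bit of each pair  */
    v = (t << 1) | (v >> 1);

    t = v & 0x33333333u;   /* low pair of each nibble              */
    v ^= t;                /* high pair of each nibble             */
    v = (t << 2) | (v >> 2);

    t = v & 0x0f0f0f0fu;   /* low nibble of each byte              */
    v ^= t;                /* high nibble of each byte             */
    v = (t << 4) | (v >> 4);

    /* rol.w #8 / swap / rol.w #8 : reverse the four bytes */
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}
```

The and/eor pair splits the value into two disjoint halves, which is why the first broken version (eor instead of and) produced garbage: its d1 did not contain only the masked bits.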
03 June 2018, 00:18 | #14 | |
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
Quote:
I haven't debugged or tried it yet but a short explanation of source data/dest would be really useful. Cheers, Geezer |
|
03 June 2018, 00:25 | #15 | |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Quote:
Basically it is like a SIMD approach, because there is no carry between operations. On input, D0 contains 32 bits from a bitplane; on output, D0 holds the same bits flipped.
Last edited by ross; 03 June 2018 at 00:35. Reason: typo... bitplane not bitblane :) |
|
03 June 2018, 00:25 | #16 | |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
|
Quote:
D0 is both source and destination. |
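To make the source/destination question concrete: for a 32-pixel-wide tile, each row of each bitplane is exactly one longword, so a horizontal flip amounts to running every longword of the tile through the routine. A hedged C sketch of that loop follows. The names `mirror_row32` and `flip_tile_horizontal`, and the assumption that the tile is stored as `planes * 32` consecutive longwords with the leftmost pixel in bit 31, are mine and not from the thread.

```c
#include <stdint.h>
#include <stddef.h>

/* Mirror the 32 bits of one bitplane row (same trick as the asm). */
static uint32_t mirror_row32(uint32_t v)
{
    v = ((v >> 1) & 0x55555555u) | ((v & 0x55555555u) << 1);
    v = ((v >> 2) & 0x33333333u) | ((v & 0x33333333u) << 2);
    v = ((v >> 4) & 0x0f0f0f0fu) | ((v & 0x0f0f0f0fu) << 4);
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical helper: flip a 32x32 tile horizontally, in place.
 * tile holds planes * 32 longwords, one per row per bitplane. */
void flip_tile_horizontal(uint32_t *tile, size_t planes)
{
    for (size_t i = 0; i < planes * 32; i++)
        tile[i] = mirror_row32(tile[i]);
}
```

Because every longword is treated independently, the loop should work the same whether the planes are stored interleaved or one after another, as long as the tile is exactly 32 pixels wide.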
|
03 June 2018, 00:31 | #17 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
|
03 June 2018, 00:32 | #18 | ||
Registered User
Join Date: Oct 2017
Location: Sunderland, England
Posts: 2,702
|
Quote:
Quote:
I like this because I can fit this in 68020 cache so it will go full speed. Appreciate it. |
||
03 June 2018, 01:02 | #19 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
|
03 June 2018, 11:31 | #20 |
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
|
Some non-scientific and quick tests.
Pure code seems slightly faster than this lazy bfextu 8-bit LUT implementation:
Code:
_lut8flip:
    lea _8lut(pc),a0
    move.l d0,d1
    ; each step flips one source byte into the low byte of d0,
    ; then rotates the result into place
    bfextu d1{8:8},d2
    move.b (a0,d2.w),d0
    ror.l #8,d0
    bfextu d1{16:8},d2
    move.b (a0,d2.w),d0
    ror.l #8,d0
    bfextu d1{24:8},d2
    move.b (a0,d2.w),d0
    ror.l #8,d0
    bfextu d1{0:8},d2
    move.b (a0,d2.w),d0
    rts
Simple as:
Code:
_lut16flip:
    ; a0 points at the middle of the table because d0.w is a signed index
    lea _16lut+65536,a0
    move.w (a0,d0.w*2),d0
    swap d0
    move.w (a0,d0.w*2),d0
    rts
Suppose you have a lot of big AGA sprites (64x64, 4 planes) and also a lot of tiles (32x32, 4/8 planes), for a grand total of 1MB of data, all to be flipped. In this case it may be useful (the waste becomes proportionally less and less significant, and CPU time is precious on the 020). But surely pure code, like Thorham suggested, is a great deal!
[EDIT, PS] Why non-scientific? I do not have a CD32, nor an Amiga for that matter. So it's all based on WinUAE emulation, which for the 020 is not perfectly cycle-exact (or is it for this simple code? Well, it's not that important). Also, I had no will to write code other than bfextu, and anyway the difference in speed between pure code and the 8-bit LUT does not seem significant enough to justify the exclusive use of a LUT.
Last edited by ross; 03 June 2018 at 12:04. Reason: PS |
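The 16-bit LUT variant can be sketched in C as below, purely to show the idea: a table with one 16-bit entry per possible word value (128 KB in total), and a 32-bit flip made of two lookups with the halves exchanged. The names `flip16`, `init_flip16` and `flip32_lut16` are assumptions for this example. The C version indexes with an unsigned value, so it does not need the trick of pointing at the middle of the table that the assembly uses to cope with the signed d0.w index.

```c
#include <stdint.h>

/* flip16[w] holds word w with its 16 bits mirrored (65536 * 2 bytes). */
static uint16_t flip16[65536];

/* Build the table once at startup. */
void init_flip16(void)
{
    for (uint32_t i = 0; i < 65536u; i++) {
        uint16_t r = 0;
        for (int bit = 0; bit < 16; bit++)
            if (i & (1u << bit))
                r |= (uint16_t)(0x8000u >> bit);
        flip16[i] = r;
    }
}

/* Mirror 32 bits: flip each half through the table, then swap halves. */
uint32_t flip32_lut16(uint32_t v)
{
    return ((uint32_t)flip16[v & 0xffffu] << 16) |
            (uint32_t)flip16[v >> 16];
}
```

Whether the 128 KB is worth it depends, as the post says, on how much graphics data has to be flipped in total.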