09 January 2022, 22:23 | #21 |
Total Chaos forever!
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,187
|
When an image is shifted, part of the destination has to be treated as transparent pixels, because at the left and right edges one side has no source pixels shifted in at all, while the other side receives only part of a source word. Meynaf is referring to those as corner cases, in part because shifting bits in 68000 assembly needs special care when shifting by more than 16 bits in either direction. The C compiler generates that code for you.
An interleaved bitplane display uses the horizontal modulo registers so that the rows of all bitplanes can be stacked vertically in memory. On OCS that severely limits the display width, because the modulo registers can skip a maximum of 1024 bits from one row to the next. That means that with a 5-bitplane display, the maximum display width is 256 pixels. Using a shallower depth helps, because it reduces the number of bitplane rows the modulo has to skip. Also, ECS has a 15-bit modulo instead of a 10-bit one, so it can handle much wider displays in this configuration.
In memory, an interleaved display looks like this: row 0 of bitplane 0 is followed by row 0 of bitplane 1, then row 0 of bitplane 2, up to row 0 of bitplane d-1, where d is the screen depth. After that, the sequence starts over with row 1 for all bitplanes, then row 2 for all bitplanes, all the way up to row h-1, where h is the display height.
The reason for using interleaved bitplanes is blitting speed: all bitplanes can be processed as one tall bitplane. The disadvantage is that an interleaved "cookie-cutter" masked blit needs the mask plane to be repeated for every bitplane to get that speed advantage, which costs a lot of chip memory. |
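The memory ordering described above can be written as simple address arithmetic. This is a minimal C sketch; the function names and the `bytes_per_row` parameter are made up for illustration and do not come from any Amiga header.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of row y of bitplane p in an interleaved bitmap:
   row 0 plane 0, row 0 plane 1, ..., row 1 plane 0, row 1 plane 1, ... */
static size_t interleaved_offset(size_t y, size_t p,
                                 size_t depth, size_t bytes_per_row)
{
    assert(p < depth);
    return (y * depth + p) * bytes_per_row;
}

/* The same row in a non-interleaved layout, where every plane is one
   contiguous block of `height` rows. */
static size_t planar_offset(size_t y, size_t p,
                            size_t height, size_t bytes_per_row)
{
    assert(y < height);
    return (p * height + y) * bytes_per_row;
}
```

With 5 planes and 40 bytes per row, consecutive rows of the same plane are 200 bytes apart in the interleaved layout, but only 40 bytes apart in the non-interleaved one.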
09 January 2022, 22:49 | #22 | ||||
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
Quote:
However, when i asked about handling the edges separately, i did not mean this. Currently my code iterates over all the rows, but within a row it handles the left and right edges outside the inner loop, because unlike the rest, they need one read/write. I don't know whether that is the correct way to handle it or whether it can be handled another way. Quote:
SA_Interleaved in the manual of OpenScreen(). Thanks for shedding the light. Quote:
Quote:
|
||||
09 January 2022, 23:01 | #23 |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,411
|
You can blit on screens up to 1008 pixels wide on OCS systems and using the new ECS/AGA Blitter registers up to 32768 pixels wide on those. Interleaved vs non-interleaved blitting does not change these maximums, so the screen can be pretty much as wide as you like.
|
09 January 2022, 23:09 | #24 | |
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
If the type of blitting does not change that, then what did this mean?
Quote:
|
|
09 January 2022, 23:32 | #25 |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,411
|
Ah, I hadn't read that part. Well, that info is not correct: the Blitter and bitplane modulo values are signed 16-bit values measured in bytes, meaning you can skip up to 32768 bytes per line (on OCS), which is far more than 1024 bits/pixels.
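As a rough illustration of what a byte-based modulo means for an interleaved display, here is a small C sketch. It assumes the usual arrangement where the modulo is added after each fetched row; `fetch_bytes` (the bytes fetched per displayed line) and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bitplane modulo in bytes for an interleaved bitmap: after fetching one
   row of a plane, skip the rest of that row plus the rows of the other
   planes to land on the same plane's next row. */
static long interleaved_bpl_modulo(long bytes_per_row, long fetch_bytes,
                                   long depth)
{
    assert(depth >= 1 && fetch_bytes <= bytes_per_row);
    return bytes_per_row * depth - fetch_bytes;
}

/* Does a modulo fit in a signed 16-bit register value? */
static int fits_signed16(long v)
{
    return v >= INT16_MIN && v <= INT16_MAX;
}
```

For a 320-pixel, 5-plane display (40 bytes per row) this gives 160; even a 1024-pixel, 5-plane bitmap (128 bytes per row) needs only 512, well inside the signed 16-bit range.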
|
10 January 2022, 00:53 | #26 |
Total Chaos forever!
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,187
|
Are you sure about that? It was one significant thing that was added in ECS that wasn't in OCS. Bitmaps bigger than 1024 pixels horizontally require ECS.
|
10 January 2022, 07:48 | #27 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
It's simply about altering the transparency mask to remove pixels outside the target region, so that they are not blitted (if they were, part of the graphic would be visible outside the wanted region, i.e. overflow).
Quote:
Something like this:
- init phase: set up for the first word
- main loop
- exit phase: set up for the last word, return to the main loop for 1 more iteration
Perhaps this is just easier to do in asm than in C. |
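The three phases above might look roughly like this in C. It is only a sketch under simplifying assumptions: source and destination are word-aligned to each other (no shifting), pixel 0 is the most significant bit of a word, and all names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for the first word: keeps the pixels from position (x & 15) on. */
static uint16_t first_word_mask(unsigned x)
{
    return (uint16_t)(0xFFFFu >> (x & 15));
}

/* Mask for the last word of a span ending at pixel x_end (exclusive). */
static uint16_t last_word_mask(unsigned x_end)
{
    unsigned r = x_end & 15;
    return r ? (uint16_t)(0xFFFFu << (16 - r)) : (uint16_t)0xFFFFu;
}

/* Copy pixels [x0, x1) of one row from src to dst. */
static void copy_span(uint16_t *dst, const uint16_t *src,
                      unsigned x0, unsigned x1)
{
    assert(x0 < x1);
    unsigned w    = x0 >> 4;              /* index of the first word */
    unsigned last = (x1 - 1) >> 4;        /* index of the last word  */
    uint16_t m    = first_word_mask(x0);  /* init phase              */

    for (;;) {
        if (w == last)                    /* exit phase */
            m &= last_word_mask(x1);
        dst[w] = (uint16_t)((dst[w] & ~m) | (src[w] & m));
        if (w == last)
            break;
        m = 0xFFFFu;                      /* main loop: full words */
        w++;
    }
}
```

Because the exit phase only adjusts the mask and then falls into the same store, a span that starts and ends in the same word needs no special case.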
|
10 January 2022, 10:40 | #28 | |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,411
|
Quote:
So you can blit a maximum of 1024 pixels wide on OCS, but that can be on a bitmap wider than 1024 pixels due to use of modulos. |
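To illustrate how a modulo lets a narrow blit land on a wider bitmap, here is a tiny C sketch. The helper name is hypothetical; the only thing it relies on is that the modulo is the number of bytes skipped after each blitted row.

```c
#include <assert.h>

/* Blitter modulo in bytes: what is left of a bitmap row after the
   blitted words, so the pointer ends up at the same column one row down. */
static int blit_modulo(int bitmap_bytes_per_row, int blit_width_words)
{
    assert(blit_width_words >= 1);
    assert(blit_width_words * 2 <= bitmap_bytes_per_row);
    return bitmap_bytes_per_row - blit_width_words * 2;
}
```

For example, a 64-pixel-wide blit (4 words) on a 2048-pixel-wide bitmap (256 bytes per row) uses a modulo of 248.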
|
10 January 2022, 11:05 | #29 | ||||
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
Quote:
Quote:
Also, you've mentioned utilizing the CPU for "clearing" on the previous page; what did you mean by that? Zeroing out everything in a square or AND-ing the mask there? (Also, does it gain performance in DPF mode only or does it in SPF mode too?) Quote:
Quote:
@topic: So, to sum it up, the blitter is faster with interleaved blitting than continuous, even if it needs a trick to "duplicate" the mask for it? Are any benchmarks, sources or tutorials available about that? Last edited by TCH; 10 January 2022 at 11:14. |
||||
10 January 2022, 11:12 | #30 | |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,411
|
Quote:
It is faster to do it this way in any display mode. |
|
10 January 2022, 11:21 | #31 | |
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
Quote:
|
|
10 January 2022, 11:26 | #32 | |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,411
|
Quote:
Basically, on the 68000, any form of masking/copying is always much slower with the CPU than the Blitter. The clearing of data is an exception because you can take advantage of both the fact that the Blitter in that specific case only uses half the available cycles (for copy/cookie-cut this is not the case) and the fact that the 68000 can write constant values to memory using move.l or movem.l fairly quickly. |
|
10 January 2022, 11:37 | #33 |
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
The Blitter doing the same means that it is also clearing? So, just like in your tutorial, one half of the clearing is done by the Blitter and the other half by the CPU?
Is movem faster? Because for that, one would need to save a lot of registers. For instance this 68k code: Code:
; a0 = pointer in bitplane #0 at x, y
; d0 = line length in longwords (at least 1)
; d1 = number of lines * number of bitplanes (at least 1)
; d2 = line modulo in bytes
; trashes: d1, d3, d4, a0
zero_out:
        moveq   #0, d3          ; the constant to write
        subq.w  #1, d1          ; dbra loops until the counter reaches -1
zero_out_0:
        move.w  d0, d4
        subq.w  #1, d4
zero_out_1:
        move.l  d3, (a0)+       ; clear one longword
        dbra    d4, zero_out_1
        adda.l  d2, a0          ; skip to the start of the next line
        dbra    d1, zero_out_0
        rts
movem, even counting the register saving? (Also, is this the reason an interleaved approach is faster: only one modulo, not two?) |
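For comparison, the same clearing loop can be sketched in C. This is not a faster version, only an equivalent one; it assumes the modulo is a multiple of 4 bytes so the pointer stays longword-aligned, and the parameter names are chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clear `rows` rows of `longs_per_row` longwords each. `modulo_bytes` is
   the gap between the end of one cleared row and the start of the next. */
static void zero_out(uint32_t *p, size_t longs_per_row, size_t rows,
                     size_t modulo_bytes)
{
    assert(modulo_bytes % sizeof(uint32_t) == 0);
    size_t stride = longs_per_row + modulo_bytes / sizeof(uint32_t);

    for (size_t r = 0; r < rows; r++) {
        uint32_t *row = p + r * stride;
        for (size_t i = 0; i < longs_per_row; i++)
            row[i] = 0;
    }
}
```

For an interleaved bitmap, `rows` would be the number of lines times the number of bitplanes, exactly as in the d1 register of the assembly version.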
10 January 2022, 11:52 | #34 | ||
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,411
|
Quote:
Quote:
In general though, on the 68000, performance is mostly gained by (partially) unrolling loops. For instance, if you know the area cleared will always be a multiple of 16 lines, it's normally best for performance to unroll the loop 16 times so that the number of loop instructions (dbne/dbra) executed is as small as possible.
About interleaving: the reason an interleaved approach is faster is normally that you only need to set up the expensive parts of the blit (calculate address/shift values, set up pointers) once for all the planes, rather than once per plane. Normally you'd not need an additional modulo for non-interleaved blitting, though. In fact, the modulo values for interleaved and non-interleaved blitting are normally the same. It's the height that changes. |
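The "only the height changes" point can be shown with the blit size value. The sketch below assumes the OCS BLTSIZE layout (height in the upper 10 bits, width in words in the lower 6 bits, with 1024 and 64 encoded as 0); the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* OCS BLTSIZE value: height in bits 15-6, width in words in bits 5-0. */
static uint16_t bltsize(unsigned height, unsigned width_words)
{
    assert(height >= 1 && height <= 1024);
    assert(width_words >= 1 && width_words <= 64);
    return (uint16_t)(((height & 1023u) << 6) | (width_words & 63u));
}

/* Interleaved: one tall blit that covers every plane of the object. */
static uint16_t bltsize_interleaved(unsigned height, unsigned depth,
                                    unsigned width_words)
{
    return bltsize(height * depth, width_words);
}
```

A 16-line, 2-word object in 5 planes becomes one blit of 80 lines when interleaved, instead of five blits of 16 lines each. Under this layout the combined height (lines times planes) has to stay within 1024 on OCS.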
||
10 January 2022, 12:27 | #35 | |||
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
Quote:
Quote:
Quote:
AFAIK, for continuous blitting, depending on the approach, we would need two modulos: Code:
Continuous #1:
==============
******************************************************************
*                          Bitplane #0                           *
******************************************************************
|                                                                |
|                                                                |
|        00000000 < + line modulo                                |
|        00000000 < + line modulo                                |
|        00000000 < + line modulo                                |
|        00000000 < + plane modulo                               |
|                                                                |
|                                                                |
******************************************************************
*                          Bitplane #1                           *
******************************************************************
|                                                                |
|                                                                |
|        11111111 < + line modulo                                |
|        11111111 < + line modulo                                |
|        11111111 < + line modulo                                |
|        11111111 < + plane modulo                               |
|                                                                |
|                                                                |
******************************************************************
*                          Bitplane #2                           *
******************************************************************
|                                                                |
|                                                                |
|        22222222 < + line modulo                                |
|        22222222 < + line modulo                                |
|        22222222 < + line modulo                                |
|        22222222                                                |
|                                                                |
|                                                                |
******************************************************************
Code:
Continuous #2:
==============
******************************************************************
*                          Bitplane #0                           *
******************************************************************
|                                                                |
|                                                                |
|        00000000 < + plane modulo                               |
|        00000000 < + plane modulo                               |
|        00000000 < + plane modulo                               |
|        00000000 < + plane modulo                               |
|                                                                |
|                                                                |
******************************************************************
*                          Bitplane #1                           *
******************************************************************
|                                                                |
|                                                                |
|        11111111 < + plane modulo                               |
|        11111111 < + plane modulo                               |
|        11111111 < + plane modulo                               |
|        11111111 < + plane modulo                               |
|                                                                |
|                                                                |
******************************************************************
*                          Bitplane #2                           *
******************************************************************
|                                                                |
|                                                                |
|        22222222 < = next line / - modulo B                     |
|        22222222 < = next line / - modulo B                     |
|        22222222 < = next line / - modulo B                     |
|        22222222                                                |
|                                                                |
|                                                                |
******************************************************************
Code:
Interleaved:
============
******************************************************************
|                                                                |
|                                                                |
|                                                                |
|                                                                |
|                                                                |
|                                                                |
|        00000000 < + line modulo                                |
|        11111111 < + line modulo                                |
|        22222222 < + line modulo                                |
|        00000000 < + line modulo                                |
|        11111111 < + line modulo                                |
|        22222222 < + line modulo                                |
|        00000000 < + line modulo                                |
|        11111111 < + line modulo                                |
|        22222222 < + line modulo                                |
|        00000000 < + line modulo                                |
|        11111111 < + line modulo                                |
|        22222222                                                |
|                                                                |
|                                                                |
|                                                                |
|                                                                |
|                                                                |
|                                                                |
******************************************************************
Last edited by TCH; 11 January 2022 at 21:41. Reason: padding |
|||
10 January 2022, 14:51 | #36 |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,411
|
Right, I see what you mean...
The thing is, the Blitter doesn't have two modulos per channel, only one. So my translation into a soft-blitting approach didn't either. Instead, when blitting non-interleaved, you normally update the Blitter pointers between planes (i.e. you blit all lines of plane 1, set the pointers for plane 2, blit that, etc). I was assuming your soft-blitting code worked in a similar way.
But yes, on the CPU you could use a second modulo to achieve the same, but you can also use separate calls with recalculated pointers. |
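A soft-blitting sketch of the two options in C, for a plain copy without masking. It assumes source and destination share the same layout and row stride, which keeps the example short; all names, including `plane_words` (the size of one plane in words), are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `rows` rows of `words` 16-bit words. `stride` is the distance in
   words between the starts of two rows (row width plus modulo). */
static void copy_rows(uint16_t *dst, const uint16_t *src,
                      size_t words, size_t rows, size_t stride)
{
    assert(words <= stride);
    for (size_t r = 0; r < rows; r++)
        for (size_t i = 0; i < words; i++)
            dst[r * stride + i] = src[r * stride + i];
}

/* Non-interleaved: one call per plane, with recalculated pointers. */
static void copy_planar(uint16_t *dst, const uint16_t *src, size_t words,
                        size_t rows, size_t stride, size_t depth,
                        size_t plane_words)
{
    for (size_t p = 0; p < depth; p++)
        copy_rows(dst + p * plane_words, src + p * plane_words,
                  words, rows, stride);
}

/* Interleaved: a single call that treats all planes as one tall bitmap. */
static void copy_interleaved(uint16_t *dst, const uint16_t *src,
                             size_t words, size_t rows, size_t stride,
                             size_t depth)
{
    copy_rows(dst, src, words, rows * depth, stride);
}
```

Both variants use the same single stride; the non-interleaved one pays for the pointer setup once per plane, the interleaved one only once.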
10 January 2022, 21:35 | #37 |
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
A-ha, okay, now i get it, thanks.
No, my algorithm was actually the "continuous #2" approach, based on what i read in this forum topic. So, either i use a continuous display and then blit each bitplane with a separate call, or i use an interleaved one, but with height x bitplanes as the number of lines.
A last stupid question: for the "stacked" mask with the "tall" interleaved blit, if i have this mask: the "stacked" mask itself needs to be in interleaved format too, right? So it will look like this: I hope i got that correctly. |
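One way to picture the "stacked" mask is as the single-plane mask with every row repeated once per bitplane, so that it lines up row by row with the interleaved image data. A minimal C sketch of building it, with an invented function name and parameters:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Expand a one-plane mask of `height` rows into an interleaved mask of
   height * depth rows: mask row y is repeated for each of the planes. */
static void stack_mask(uint16_t *out, const uint16_t *mask,
                       size_t words_per_row, size_t height, size_t depth)
{
    assert(depth >= 1);
    for (size_t y = 0; y < height; y++)
        for (size_t p = 0; p < depth; p++)
            memcpy(out + (y * depth + p) * words_per_row,
                   mask + y * words_per_row,
                   words_per_row * sizeof *out);
}
```

The output buffer needs `words_per_row * height * depth` words, which is where the extra chip memory mentioned earlier in the thread goes.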
10 January 2022, 22:21 | #38 | ||
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,411
|
Quote:
Quote:
|
||
11 January 2022, 18:33 | #39 | ||
Newbie Amiga programmer
Join Date: Jun 2012
Location: Front of my A500+
Age: 38
Posts: 372
|
Quote:
I don't want to use two modulos, i'm just trying to figure out the fastest approach. Quote:
This is a bit off here (hardblit question), but related: i've read that the Blitter does the masking blit with the following formula:
DEST = (DEST & ~MASK) | (SRC & MASK)
I suspect that the answer is no, but is the double masking mandatory? If i have my source already masked and the mask already inverted, then can it be just simply:
DEST = (DEST & MASK) | SRC? |
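The two formulas can be compared directly in C. The sketch also derives the Blitter minterm byte for each, assuming the common channel assignment A = mask, B = source, C = destination and the usual bit order where bit 7 stands for A, B and C all set; that assignment is an assumption of this example, not something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Classic cookie-cut: keep DEST where MASK is 0, take SRC where it is 1. */
static uint16_t cookie_cut(uint16_t dest, uint16_t src, uint16_t mask)
{
    return (uint16_t)((dest & ~mask) | (src & mask));
}

/* Variant with the source already masked and the mask already inverted. */
static uint16_t cookie_cut_premasked(uint16_t dest, uint16_t src_masked,
                                     uint16_t inv_mask)
{
    return (uint16_t)((dest & inv_mask) | src_masked);
}

typedef int (*logic_fn)(int a, int b, int c);

/* Minterm byte: bit (a*4 + b*2 + c) is set when f(a, b, c) is true. */
static uint8_t minterm(logic_fn f)
{
    uint8_t lf = 0;
    for (int i = 0; i < 8; i++)
        if (f((i >> 2) & 1, (i >> 1) & 1, i & 1))
            lf |= (uint8_t)(1u << i);
    return lf;
}

static int f_cookie(int a, int b, int c)    { return (a && b) || (!a && c); }
static int f_premasked(int a, int b, int c) { return b || (a && c); }
```

Both variants give the same pixels when the source is premasked and the mask inverted. Under the stated assumptions, the first formula corresponds to minterm 0xCA and the second to 0xEC. On the CPU the second form saves two operations per word; on the Blitter both are a single minterm over the same three source channels.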
||
11 January 2022, 20:18 | #40 | |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,411
|
Quote:
|
|