18 December 2020, 16:00 | #1 |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
Arithmetic shift right with the blitter
I have a large number of 48 bit wide numbers that I need to shift right arithmetically, i.e. preserving the sign, by 14 bits.
At some point, if I can fiddle it, the blitter will become the best option to do this, right? Does anyone want to make a guess as to where this point might be? Does anyone have any insight into good ways to achieve this? I'm guessing that I'll need to sign extend to a 64 bit value in order to get the right ones or zeros to fill in the top bits. Tricks like right shifting by 2 will lose me 1/4 of my number space, so I think I should discount them. Any smart ideas out there? |
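For checking any blitter result against a known-good answer, a plain C reference on the CPU side is handy. This is only a sketch: the names `sign_extend48`, `asr64` and `asr48_by14` are made up for illustration, and it assumes the 48-bit value sits in the low 48 bits of a 64-bit integer. It uses unsigned arithmetic throughout so it does not rely on how a given compiler shifts negative signed values.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a 48-bit two's-complement value (held in the low 48 bits
   of a uint64_t) to a full 64-bit bit pattern. */
static uint64_t sign_extend48(uint64_t raw)
{
    raw &= 0x0000FFFFFFFFFFFFULL;        /* keep only the 48 payload bits */
    if (raw & 0x0000800000000000ULL)     /* bit 47 is the sign bit */
        raw |= 0xFFFF000000000000ULL;    /* copy it into bits 48..63 */
    return raw;
}

/* Arithmetic shift right of a 64-bit two's-complement bit pattern,
   built from a logical shift plus an explicit sign fill. */
static uint64_t asr64(uint64_t v, unsigned n)
{
    assert(n >= 1 && n <= 63);
    /* top n bits set when the value is negative, otherwise zero */
    uint64_t fill = (v >> 63) ? ~(~UINT64_C(0) >> n) : 0;
    return (v >> n) | fill;
}

/* 48 bits in, 48 bits out, shifted right arithmetically by 14. */
static uint64_t asr48_by14(uint64_t raw48)
{
    return asr64(sign_extend48(raw48), 14) & 0x0000FFFFFFFFFFFFULL;
}
```

For example, `asr48_by14(0x800000000000)` (the most negative 48-bit value, -2^47) gives `0xFFFE00000000`, which is -2^33 in 48 bits.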
18 December 2020, 16:42 | #2 |
Registered User
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,430
|
Depending on how the numbers are organised in memory, this should definitely be possible using the Blitter. Compared to using the 68000 it should be significantly faster even when not doing that many numbers. Shifting 48 bit numbers on 68000 using shift instructions takes around 70 cycles per number (table based shifting may help here but that'll take a lot of memory and likely won't be more than about 1.5-2x as fast). Shifting 64 bit numbers using the Blitter should take around 16 cycles per number.
The overhead of setting up the Blitter is obviously still there, but with a simple blit like a copy & shift it shouldn't be a very large percentage. However, this all is only true if you can keep the numbers in memory such that you can do this in a single blit. If this is not possible, the overhead could spiral out of control (worst case of one blit for one number is way slower). |
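To put a rough number on "where this point might be", the per-number estimates above (about 70 cycles on the CPU, about 16 on the Blitter) can be turned into a break-even count. The setup cost is not given anywhere in this thread, so it is a parameter here, and `break_even_count` is a hypothetical helper name. Treat the output as a ballpark, not a measurement.

```c
#include <assert.h>

/* Smallest count n for which
       setup_cycles + blit_per_number * n  <  cpu_per_number * n
   i.e. the first batch size where one blit beats the CPU loop. */
static unsigned long break_even_count(unsigned long setup_cycles,
                                      unsigned long cpu_per_number,
                                      unsigned long blit_per_number)
{
    assert(cpu_per_number > blit_per_number);
    unsigned long saved_per_number = cpu_per_number - blit_per_number;
    /* need n > setup / saved, so take the floor and add one */
    return setup_cycles / saved_per_number + 1;
}
```

With an assumed setup cost of 200 cycles, `break_even_count(200, 70, 16)` returns 4, so under those assumptions the blit is ahead from the fourth number onwards.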
18 December 2020, 17:05 | #3 | |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
Quote:
|
|
19 December 2020, 09:53 | #4 |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
Can we play "Spot the mistake?"
Code:
void BlitRight14Bits(volatile WORD * buffer, UWORD sizeInWords) {
    KPrintF("Waiting...");
    WaitForBlitter();
    KPrintF("Blitting...");
    custom->bltcon0 = 14 << ASHIFTSHIFT | SRCA | DEST | A_TO_D;
    custom->bltcon1 = BLITREVERSE;
    custom->bltapt = (WORD *) buffer + sizeInWords - 1;
    custom->bltdpt = (WORD *) buffer + sizeInWords - 1;
    custom->bltdmod = 0;
    custom->bltsize = sizeInWords;
    KPrintF("Waiting...");
    WaitForBlitter();
    KPrintF("Blitted!");
}
Anyone? I'm testing with sizeInWords = 6.
Edit: My BLTSIZE is wrong, but if I try my next guess of
custom->bltsize = (1 << 6) + sizeInWords;
I get the same results.
Last edited by Ernst Blofeld; 19 December 2020 at 10:01. |
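On the BLTSIZE point: on OCS the register packs the height in lines into bits 15-6 and the width in words into bits 5-0, and a zero in either field means the maximum (1024 lines or 64 words). A small helper makes that explicit. `make_bltsize` is an illustrative name, not something from the Amiga headers, and the sketch is plain host-side C.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a blit size into the OCS BLTSIZE layout:
   bits 15-6 = height in lines, bits 5-0 = width in words.
   A field value of 0 stands for the maximum (1024 lines / 64 words),
   which the masking below produces automatically. */
static uint16_t make_bltsize(unsigned height_lines, unsigned width_words)
{
    assert(height_lines >= 1 && height_lines <= 1024);
    assert(width_words >= 1 && width_words <= 64);
    return (uint16_t)(((height_lines & 0x3FFu) << 6) | (width_words & 0x3Fu));
}
```

Seen this way, writing a bare `6` to BLTSIZE leaves the height field at zero, which asks for 1024 lines of 6 words rather than one line. One line of 6 words is `make_bltsize(1, 6)`, i.e. `0x0046`.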
19 December 2020, 10:11 | #5 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
|
Where do you set the masks and modulo for channel A?
|
19 December 2020, 10:19 | #6 |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
I forgot BLTAMOD, thanks.
I convinced myself that the first and last word masks didn't matter for my use case, but I'm going to think about that again. And I now have it blitting, after moving the test code to a point in my program where DMA is enabled, but it doesn't look like it's doing anything. Going to check those masks and a few other things. |
19 December 2020, 10:26 | #7 |
OCS forever!
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
|
Code:
custom->bltapt = (WORD *) buffer + sizeInWords - 1;
custom->bltdpt = (WORD *) buffer + sizeInWords - 1;
Edit: NVM, saw the WORD * cast. |
19 December 2020, 10:27 | #8 |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
C pointers should mean the 1 is really a 2, if I've done it right.
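The pointer arithmetic can be checked on any host compiler. This sketch uses `uint16_t` in place of the Amiga `WORD` type, and `last_word_byte_offset` is a made-up name for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the last word in a buffer of size_in_words 16-bit words.
   Adding to a uint16_t pointer steps in units of sizeof(uint16_t),
   so "- 1" moves back one word, which is two bytes. */
static size_t last_word_byte_offset(const uint16_t *buffer, size_t size_in_words)
{
    assert(size_in_words >= 1);
    const uint16_t *last = buffer + size_in_words - 1;
    return (size_t)((const char *)last - (const char *)buffer);
}
```

For a 6-word buffer this returns 10, the address descending mode needs as its starting pointer.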
|
19 December 2020, 10:30 | #9 |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
Guess what? The data wasn't in chip memory.
|
19 December 2020, 10:33 | #10 |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
Updated code, I'm now verifying the results of this, but I'm blitting so I can't be far wrong...
Code:
__attribute__((section("buffers.MEMF_CHIP"))) volatile WORD foo [] = {
    0x8000, 0x0000, 0x1234, 0x5678,
    0xffff, 0x0fff, 0xffff, 0x0fff,
    0x1010, 0x1010, 0x1010, 0x1010
};
Code:
for (UWORD i = 0; i < 3 * 4; i++) {
    KPrintF("0x%04lx", (LONG) foo[i] & 0x0000ffff);
}
BlitRight14Bits(foo, 3);
for (UWORD i = 0; i < 3 * 4; i++) {
    KPrintF("0x%04lx", (LONG) foo[i] & 0x0000ffff);
}
Code:
void BlitRight14Bits(volatile WORD * buffer, UWORD sizeInLong64s) {
    KPrintF("Waiting...");
    WaitForBlitter();
    KPrintF("Blitting...");
    custom->bltcon0 = 14 << ASHIFTSHIFT | SRCA | DEST | A_TO_D;
    custom->bltcon1 = BLITREVERSE;
    custom->bltafwm = 0xffff;
    custom->bltalwm = 0xffff;
    custom->bltapt = (WORD *) buffer + 4 * sizeInLong64s - 1;
    custom->bltdpt = (WORD *) buffer + 4 * sizeInLong64s - 1;
    custom->bltamod = 0;
    custom->bltdmod = 0;
    custom->bltsize = (sizeInLong64s << 6) + 4;
    KPrintF("Waiting...");
    WaitForBlitter();
    KPrintF("Blitted!");
}
|
19 December 2020, 10:49 | #11 |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
Right then, it's not shifting by 14 bits. I guess that was too much to hope for.
I have to do something funky like set BLTAPT one word lower and shift by 2 bits instead? |
19 December 2020, 10:55 | #12 |
OCS forever!
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
|
DESC mode shifts left. Is that your intention as the function is called BlitRight ? What's your expected output?
If this is your input and I'm reading your ask correctly
Code:
0x8000, 0x0000, 0x1234, 0x5678, 0xffff, 0x0fff, 0xffff, 0x0fff, 0x1010, 0x1010, 0x1010, 0x1010
Edit: If that's the case, you'll need to treat each 64-bit (4-word) chunk as a single "line", and not use DESC mode. Your blit is then 4 words wide and xx lines high (xx being however many groups of 4 words you have). The last word mask (BLTALWM) will need to mask off the low 14 bits so they don't get shifted into the next line, so $c000.
Last edited by Antiriad_UK; 19 December 2020 at 11:04. |
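The "4 words wide, one number per line, last word mask $c000" idea can be tried out on a host machine with a small software model. This is a simplified sketch of an ascending A-to-D copy with an A shift, not a cycle-exact emulation. It assumes zero modulos, that the bits shifted out of one word carry into the next, and that this carry is not cleared between lines, which is why the last word mask matters. `model_blit_shift` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of an ascending A->D blit with a right shift.
   Each destination word is built from the previous (masked) source word
   in the upper half and the current (masked) source word in the lower
   half, shifted right by `shift` bits. */
static void model_blit_shift(const uint16_t *src, uint16_t *dst,
                             size_t lines, size_t width_words,
                             unsigned shift, uint16_t fwm, uint16_t lwm)
{
    assert(shift < 16 && width_words >= 1);
    uint16_t prev = 0;                    /* carry-in; NOT reset per line */
    for (size_t y = 0; y < lines; y++) {
        for (size_t x = 0; x < width_words; x++) {
            size_t i = y * width_words + x;
            uint16_t a = src[i];          /* read before write: in-place is safe */
            if (x == 0)               a &= fwm;   /* first word mask */
            if (x == width_words - 1) a &= lwm;   /* last word mask */
            uint32_t pair = ((uint32_t)prev << 16) | a;
            dst[i] = (uint16_t)(pair >> shift);
            prev = a;                     /* masked word feeds the next one */
        }
    }
}
```

Run over the test data from post #10 with a shift of 14, a first word mask of 0xffff and a last word mask of 0xc000, the first number comes out as 0x0002, 0x0000, 0x0000, 0x48d1, which is 0x8000000012345678 shifted right logically by 14. With a last word mask of 0xffff instead, the low 14 bits of each number leak into the top word of the next one.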
19 December 2020, 11:10 | #13 | |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
Quote:
I'm using DESC as I'm writing over the top of the data, shifting it in place, so I need to fetch the data before it's overwritten. But if that's not going to work there's no reason why I have to do it in place. Would it be better to use ascending mode and a separate destination buffer? Edit: Yes, that's exactly what you're saying. Or can I still do it in place with ascending? |
|
19 December 2020, 11:19 | #14 |
OCS forever!
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
|
Yes you can do in place with ascending.
Rereading again, you mentioned an arithmetic shift. Set up this way it will be a logical shift only, so 0x8000000012345678 will end up with the sign bit shifting down as well. Not sure it's possible to handle the sign bit... hmm, I guess you could mask the sign bit and then OR the result back over the top rather than doing a normal A-D copy. Edit: You would have to use a separate source/dest buffer if ORing. |
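To make the logical versus arithmetic difference concrete: after a logical shift right by 14, only the top 14 bits of each number's first word are wrong, and only for negative inputs. The sketch below patches them up on the CPU. It is an illustration of which bits need filling, not the Blitter OR approach suggested above, and `fill_sign_bits` is a made-up name. It assumes 4-word big-endian numbers and needs the original data alongside the shifted copy, hence separate buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* in:  the original numbers, 4 big-endian words each
   out: the same numbers after a LOGICAL shift right by `shift` bits
   Sets the vacated top `shift` bits of each negative number to 1,
   turning the logical shift into an arithmetic one. */
static void fill_sign_bits(const uint16_t *in, uint16_t *out,
                           size_t count, unsigned shift)
{
    assert(shift >= 1 && shift <= 15);
    /* top `shift` bits of a word, e.g. 0xFFFC for shift == 14 */
    uint16_t fill = (uint16_t)(0xFFFFu << (16 - shift));
    for (size_t n = 0; n < count; n++) {
        if (in[4 * n] & 0x8000u)          /* sign bit of the original */
            out[4 * n] |= fill;
    }
}
```

If the top word is only a sign extension of a 48-bit value, the lower three words of the logically shifted result already hold the correct 48-bit answer, so this fix-up only matters when the full 64 bits are needed.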
19 December 2020, 11:21 | #15 | |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
Quote:
|
|
19 December 2020, 11:27 | #16 |
OCS forever!
Join Date: Mar 2019
Location: Birmingham, UK
Posts: 418
|
|
19 December 2020, 11:31 | #17 | |
<optimized out>
Join Date: Sep 2020
Location: <optimized out>
Posts: 321
|
Quote:
It shows 4 words, or 64 bits, which are the result of a calculation. The calculation is currently a 16 bit word multiplied by a 32 bit long giving a 48 bit value, which I'm sign extending into the upper blue word to make it 64 bits. I will have the need for a full 64 bit calculation without this sign extension. I want to discard the lower yellow 14 bits of this 64 bit value, doing lots of them at a time. Edit: This seems to work for me:
Code:
void BlitRight14Bits(volatile WORD * input, volatile WORD * output, UWORD sizeInLong64s) {
    WaitForBlitter();
    custom->bltcon0 = 14 << ASHIFTSHIFT | SRCA | DEST | A_TO_D;
    custom->bltcon1 = 0;
    custom->bltafwm = 0xffff;
    custom->bltalwm = 0xc000;
    custom->bltapt = (WORD *) input;
    custom->bltdpt = (WORD *) output;
    custom->bltamod = 0;
    custom->bltdmod = 0;
    custom->bltsize = (sizeInLong64s << 6) + 4;
    WaitForBlitter();
}
Last edited by Ernst Blofeld; 19 December 2020 at 13:28. |
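One thing to watch with "lots of them at a time": the OCS BLTSIZE height field is 10 bits, so a single blit covers at most 1024 lines, and `sizeInLong64s << 6` no longer fits in 16 bits beyond that. A possible way to split a larger batch is sketched below. `plan_blits` and `MAX_BLIT_LINES` are hypothetical names, and the function only computes the BLTSIZE values on the host side without touching hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BLIT_LINES 1024u   /* OCS BLTSIZE height limit (field value 0) */

/* Fill `sizes` with one BLTSIZE value per blit needed to cover `total`
   numbers of 4 words each, and return how many blits that is.
   `capacity` is the number of entries available in `sizes`. */
static size_t plan_blits(size_t total, uint16_t *sizes, size_t capacity)
{
    size_t used = 0;
    while (total > 0) {
        size_t lines = total < MAX_BLIT_LINES ? total : MAX_BLIT_LINES;
        assert(used < capacity);
        /* height in bits 15-6 (1024 wraps to 0), width of 4 words in bits 5-0 */
        sizes[used++] = (uint16_t)(((lines & 0x3FFu) << 6) | 4u);
        total -= lines;
    }
    return used;
}
```

For 2500 numbers this yields three blits of 1024, 1024 and 452 lines. Between blits the source and destination pointers would need to advance by 8 bytes per line already processed.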
|