26 August 2014, 10:01 | #1 |
XoXo/Tasko Developer
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
|
Optimized Alpha-Blending in asm?
Hi,
I am trying to calculate alpha blending in asm. I have a color value in the destination (the screen) ranging from 0 to 255, a color value in the source (my image) also ranging from 0 to 255, and an alpha value in my image data, again ranging from 0 to 255. The question is how to calculate the final destination color value from these three. I have the following formula:
pixel = alpha * x + (1 - alpha) * y
However, this is meant for values ranging from 0 to 1, and I have only a vague idea of how to change the formula to fit my needs. A lookup table for the multiplications would also be good. Can anybody help me with the right asm code?
greets, A |
26 August 2014, 10:38 | #2 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
|
This code is straight from my demosystem and shouldn't be too hard to understand.
Code:
*******************************************
*** 8 bit Alpha blending                ***
*******************************************

; performs alpha blending/crossfade between
; 2 8bit grey-scaled pictures
; v1.o, 2o-Mar-2oo4 by StingRay/[S]carab^Scoopex
;
; a0.l: source 1
; a1.l: source 2
; a2.l: destination
; d0.w: blending value 0-255, 255=100%
; d1.w: size/4-1
;
; trashes: d1-d6,a0-a2

FX_Alpha_Blend
        ext.l   d0
        move.l  #$00ff00ff,d5
.loop   move.l  (a0)+,d6        ; c1c2c3c4
        move.l  (a1)+,d2
        move.l  d6,d3
        move.l  d2,d4
        and.l   d5,d3           ; 00c200c4
        and.l   d5,d4
        sub.l   d3,d4
        muls.l  d0,d4
        lsr.l   #8,d4
        add.l   d3,d4
        and.l   d5,d4

        lsr.l   #8,d6           ; 00c1c2c3
        lsr.l   #8,d2
        and.w   d5,d6           ; 00c100c3
        and.w   d5,d2
        sub.l   d6,d2
        muls.l  d0,d2
        lsr.l   #8,d2
        add.l   d6,d2
        and.l   d5,d2
        lsl.l   #8,d2
        or.l    d4,d2
        move.l  d2,(a2)+
        dbf     d1,.loop
        rts
|
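For readers who find the packed routine hard to follow, here is a rough C sketch of the same idea: two 8-bit lanes of a longword are blended with a single multiply, then the other two lanes are handled after a shift by 8. The function names `blend_even_lanes` and `blend4` are made up for this illustration, and unsigned 32-bit wrap-around stands in for the register arithmetic; it is a sketch of the technique, not a translation that has been run against the original.

```c
#include <stdint.h>

/* Blend the lanes in bits 0-7 and 16-23 of two packed longwords at once.
   blend is 0..255 and weights p2, as d0 does in the routine above. */
static uint32_t blend_even_lanes(uint32_t p1, uint32_t p2, uint32_t blend)
{
    const uint32_t mask = 0x00ff00ffu;
    uint32_t a = p1 & mask;               /* 00c200c4 of source 1 */
    uint32_t b = p2 & mask;               /* 00c200c4 of source 2 */
    uint32_t d = (b - a) * blend;         /* both differences scaled, modulo 2^32 */
    return ((d >> 8) + a) & mask;         /* drop 8 fraction bits, add base, clean up */
}

/* Blend all four 8-bit pixels of a longword. */
static uint32_t blend4(uint32_t p1, uint32_t p2, uint32_t blend)
{
    uint32_t even = blend_even_lanes(p1, p2, blend);
    uint32_t odd  = blend_even_lanes(p1 >> 8, p2 >> 8, blend);
    return (odd << 8) | even;
}
```

A negative difference in the low lane borrows from the high lane, but the carry produced when the base value is added back cancels that borrow, so each lane ends up as c1 + floor((c2 - c1) * blend / 256).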
26 August 2014, 10:46 | #3 |
XoXo/Tasko Developer
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
|
For me this is hard to understand. Any more comments?
|
26 August 2014, 10:54 | #4 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,764
|
The formula for color components as bytes is:
((alpha * component) + ((255 - alpha) * component)) / 255
Sadly, this means that if you want it to be exact, you can't divide by 256 with a simple shift. If you don't care, shifting will be much faster. And, yes, 256 is WRONG. Don't let anyone tell you any different. |
26 August 2014, 11:02 | #5 |
XoXo/Tasko Developer
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
|
Thanks. Does it matter which "component" is the source and which one is the destination?
|
26 August 2014, 11:15 | #6 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,764
|
It's like this:
component3 = ((alpha * component1) + ((255 - alpha) * component2)) / 255 |
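As a plain reference, the formula above could be written in C roughly like this. The name `blend255` is made up for the example, and which of `c1` and `c2` is the picture or the screen is left to the caller.

```c
#include <stdint.h>

/* component3 = (alpha * component1 + (255 - alpha) * component2) / 255 */
static uint8_t blend255(uint8_t alpha, uint8_t c1, uint8_t c2)
{
    /* The sum is at most 255 * 255 = 65025, so it fits in 16 bits. */
    unsigned sum = (unsigned)alpha * c1 + (unsigned)(255 - alpha) * c2;
    return (uint8_t)(sum / 255);    /* truncating unsigned division */
}
```

With alpha = 255 the result is exactly `c1`, and with alpha = 0 it is exactly `c2`. For example, `blend255(191, 200, 100)` gives 44600 / 255 = 174.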
26 August 2014, 11:22 | #7 |
XoXo/Tasko Developer
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
|
Yes, I meant, is "component1" my picture or the screen?
|
26 August 2014, 11:34 | #8 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
|
Here is an unoptimised version of my code above which should be much easier to understand. Please note that the code is untested as I've just made it as an example for you to understand the routine. It should work though [famous last words].
Code:
        lea     SRCPIC1,a0
        lea     SRCPIC2,a1
        lea     DEST,a2
        move.l  #SIZE,d7
        moveq   #0,d1
.loop   moveq   #0,d2
        move.b  (a0)+,d1        ; c1
        move.b  (a1)+,d2        ; c2
        sub.w   d1,d2           ; c2-c1
        muls.w  d0,d2           ; (c2-c1)*blend
        lsr.w   #8,d2           ; fixed point -> integer
        add.w   d1,d2           ; +c1
        move.b  d2,(a2)+
        subq.l  #1,d7
        bne.b   .loop
        rts
Last edited by StingRay; 26 August 2014 at 11:40. |
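The per-pixel step of the loop above can be sketched in C as follows. The function name `lerp_shift8` is invented for the illustration, and the unsigned wrap-around is only a stand-in for the two's complement register contents; as in the routine, only the low byte of the result is kept.

```c
#include <stdint.h>

/* c1 + (((c2 - c1) * blend) >> 8), keeping the low byte as move.b does. */
static uint8_t lerp_shift8(uint8_t c1, uint8_t c2, uint8_t blend)
{
    int diff = (int)c2 - (int)c1;                    /* -255..255 */
    uint32_t prod = (uint32_t)(diff * (int)blend);   /* signed product, wrapped */
    return (uint8_t)(c1 + (prod >> 8));              /* shift, add base, low byte */
}
```

With blend = 0 the result is `c1`. With blend = 255 the shift divides by 256, so `lerp_shift8(0, 255, 255)` gives 254, while `lerp_shift8(255, 0, 255)` gives 0.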
26 August 2014, 12:03 | #9 |
XoXo/Tasko Developer
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
|
Looks good. In the meantime I wrote this (tested):
Code:
calc_pixels
        move.w  #255,d6
.loop   moveq   #0,d2
        move.b  3(a0),d2        ; alpha
        moveq   #0,d0
        moveq   #0,d1
        move.b  (a0)+,d0
        move.b  (a1),d1
        bsr     .one
        move.b  d0,(a1)+
        moveq   #0,d0
        moveq   #0,d1
        move.b  (a0)+,d0
        move.b  (a1),d1
        bsr     .one
        move.b  d0,(a1)+
        moveq   #0,d0
        moveq   #0,d1
        move.b  (a0)+,d0
        move.b  (a1),d1
        bsr     .one
        move.b  d0,(a1)+
        addq.l  #1,a0           ; skip alpha in source
        dbf     d4,.loop
        rts

.one    ; d2 alpha, d0 pic, d1 screen
        mulu.w  d2,d0
        move.w  d6,d5
        sub.w   d2,d5
        mulu.w  d5,d1
        add.w   d1,d0
        divu.w  d6,d0
        rts
I wonder how you got rid of the divu by using lsr #8,d2. Thorham wrote that this is not exact. |
26 August 2014, 12:37 | #10 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
|
I am using simple linear interpolation and my routine IS exact! The lsr.w #8 is to shift down to integer as I'm using fixed point.
|
26 August 2014, 12:56 | #11 |
XoXo/Tasko Developer
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
|
@StingRay
You multiply by d0, but nothing in your code loads a value into d0. |
26 August 2014, 13:06 | #12 |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
Trying to understand that as
a*x + (1-a)*y = a*x + y - a*y = a*(x-y) + y
or
(a*x + (255-a)*y)/255 = (a*x + 255*y - a*y)/255 = a*(x-y)/255 + y
StingRay I can't see how your routine is exact as you are dividing by 256 and not 255.
@AGS: d0 is the input alpha value (constant for whole image) |
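The rearrangement above is exact over the reals, but with integer division the two forms do not always agree. The C sketch below (function names invented for the example) compares them, assuming a signed division that truncates toward zero, as C's `/` does.

```c
/* Two-multiply form: (a*x + (255-a)*y) / 255 */
static int blend_two_mul(int a, int x, int y)
{
    return (a * x + (255 - a) * y) / 255;
}

/* Rearranged form: a*(x-y)/255 + y, one multiply, signed division */
static int blend_one_mul(int a, int x, int y)
{
    return a * (x - y) / 255 + y;
}
```

When x >= y both give the same result. When x < y the product a*(x-y) is negative, truncation rounds it toward zero instead of down, and the one-multiply form can come out one higher: `blend_two_mul(1, 0, 1)` is 0 while `blend_one_mul(1, 0, 1)` is 1.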
26 August 2014, 13:13 | #13 |
Registered User
Join Date: Jan 2002
Location: Germany
Posts: 7,001
|
|
26 August 2014, 13:14 | #14 | |
Registered User
Join Date: Jan 2012
Location: USA
Posts: 372
|
Quote:
Correct me if I'm wrong, though, but isn't there a difference between 1/255 and 1/256 only if alpha is >=128? Including a single rounding at the end with an ADDX should take care of any inaccuracy (over the input domain) when using lsr #8 for the divide. |
|
26 August 2014, 13:20 | #15 |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
A mathematical fact that some people find surprising is that 0.999999... = 1.
Yes, that is equals, not approximates. If there are infinitely many 9s after the decimal point, it is just another way of writing the same number.
We can use this fact. In the same way, 1/255 = 1/256 + 1/256^2 + 1/256^3 + ... Given an alpha in the range 0..255 we can make it into the range 0..65535 simply by doing (256a + a), which is a multiply by 257/256 compared with a plain shift left by 8. Then a right shift of 16 bits is better than our original 8 bit shift. We could repeat this until the desired precision was reached. If we carried the expansion on forever (impossible in practice, of course) it would be exact. But if our result is only 8 bits, who cares? Probably the first approximation is more than good enough.
This same trick is used in the Amiga's hardware to convert OCS 12 bit colour values into 24 bit when they are written into the hardware registers. |
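A minimal C sketch of the first expansion step described above: alpha is widened to 16 bits as 256a + a, multiplied, and shifted right by 16. The function name `scale_257` is invented here, and the `+ 32768` rounding term is an addition of this sketch rather than something stated in the post.

```c
#include <stdint.h>

/* Approximate x * alpha / 255 with a 16-bit alpha and a 16-bit shift. */
static uint8_t scale_257(uint8_t alpha, uint8_t x)
{
    uint32_t a16 = (uint32_t)alpha * 257u;        /* 256a + a, range 0..65535 */
    return (uint8_t)((a16 * x + 32768u) >> 16);   /* round, drop 16 fraction bits */
}
```

With this rounding, alpha = 255 returns `x` unchanged and alpha = 0 returns 0. In between, the result is either the correctly rounded value of x * alpha / 255 or one below it, so it is close but still not exact.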
26 August 2014, 13:57 | #16 |
XoXo/Tasko Developer
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
|
Please help, I need a v45 picture.datatype that returns alpha information. This must exist, even though cybergraphics does not support alpha.
|
26 August 2014, 13:57 | #17 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,764
|
Perhaps a bit late, but...
Yeah, sorry about that. Doesn't really matter, just pick one convention, and stick to it. Here's an example: If you say that component1 comes from the screen, and component2 comes from the bitmap you want to blend over the screen, and you want the screen to be 75 percent and the bitmap 25 percent, then you set alpha to 191 (75 percent of 256 = 192 -> 192 - 1 = 191). |
26 August 2014, 14:03 | #18 |
XoXo/Tasko Developer
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
|
@Mrs Beanbag
You are probably a mathematician. If StingRay isn't right with that addx, I would use a division table. Would that be possible? |
26 August 2014, 14:08 | #19 |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
You could use a multiplication table, it would only take 64k memory and map (x,y) -> x*y/255
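One way such a table might look in C is sketched below. The names `mul255`, `init_mul255` and `blend_table` are invented for the example, and the choice that alpha weights the source is an assumption, not something fixed by the post.

```c
#include <stdint.h>

/* 256 * 256 bytes = 64 KB: mul255[x][y] = x * y / 255 */
static uint8_t mul255[256][256];

static void init_mul255(void)
{
    for (unsigned x = 0; x < 256; x++)
        for (unsigned y = 0; y < 256; y++)
            mul255[x][y] = (uint8_t)(x * y / 255u);
}

/* Blend with two table lookups and one add, no multiply or divide. */
static uint8_t blend_table(uint8_t alpha, uint8_t src, uint8_t dst)
{
    return (uint8_t)(mul255[alpha][src] + mul255[255 - alpha][dst]);
}
```

Because each product is truncated on its own, the sum can be one lower than a single division of the combined sum would give. It never exceeds 255, and alpha = 255 or alpha = 0 reproduces `src` or `dst` exactly.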
|
26 August 2014, 17:10 | #20 |
XoXo/Tasko Developer
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
|
I did it, partially. What I get is not very good. It's getting too dark. Please see the examples. Picture one is the picture to draw, picture two is the result I get and three is how it should be. What's wrong? Here is my code as it is now:
d0 = component 1
d1 = component 2
d2 = alpha
d3 = 255
Code:
        sub.w   d1,d0
        muls.w  d2,d0
        divs.w  d3,d0
        add.w   d1,d0
Last edited by AGS; 26 August 2014 at 17:17. |