Optimized Alpha-Blending in asm?

AGS · 26 August 2014, 10:01

Hi,

I am trying to calculate alpha blending in asm. I have a color value in the destination (the screen) ranging from 0 to 255, a color value in the source (my image) also ranging from 0 to 255, and an alpha value in my image data, which also ranges from 0 to 255.

The question is how to calculate the final destination color value from these three. I have the following formula:

pixel = alpha * x + (1 - alpha) * y

However, this is meant for values ranging from 0 to 1, and I have only a vague idea of how to change it to fit my needs. A lookup table for the multiplications would also be good.

Can anybody help me with the right asm code?

greets,
A

StingRay · 26 August 2014, 10:38

This code is straight from my demosystem and shouldn't be too hard to understand.

Code:

*******************************************
*** 8 bit Alpha blending		***
*******************************************

; performs alpha blending/crossfade between
; 2 8bit grey-scaled pictures
; v1.o, 2o-Mar-2oo4 by StingRay/[S]carab^Scoopex
;
; a0.l: source 1
; a1.l: source 2
; a2.l: destination
; d0.w: blending value 0-255, 255=100%
; d1.w: size/4-1
;
; trashes: d1-d6,a0-a2


FX_Alpha_Blend
	ext.l	d0
	move.l	#$00ff00ff,d5
.loop	move.l	(a0)+,d6	; c1c2c3c4
	move.l	(a1)+,d2

	move.l	d6,d3
	move.l	d2,d4
	and.l	d5,d3		; 00c200c4
	and.l	d5,d4	
	sub.l	d3,d4
	muls.l	d0,d4
	lsr.l	#8,d4
	add.l	d3,d4
	and.l	d5,d4

	lsr.l	#8,d6		; 00c1c2c3
	lsr.l	#8,d2
	and.w	d5,d6		; 00c100c3
	and.w	d5,d2
	sub.l	d6,d2
	muls.l	d0,d2
	lsr.l	#8,d2
	add.l	d6,d2
	and.l	d5,d2
	lsl.l	#8,d2

	or.l	d4,d2
	move.l	d2,(a2)+
	dbf	d1,.loop
	rts
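
For readers who find the packed arithmetic easier to follow in C, the routine above can be sketched roughly as below. This is only a model of what the asm appears to do, not StingRay's code: the name `blend_packed` is made up here, and unsigned 32-bit wraparound stands in for the `muls.l`/`lsr.l` pair. Two bytes are blended per multiply by masking them into `00xx00xx` form; after the final mask, the kept bits come out the same as blending each byte separately with `c1 + (((c2 - c1) * alpha) >> 8)`.

```c
#include <stdint.h>

/* Hypothetical C model of the packed blend (names are assumptions).
   p1 and p2 each hold four 8-bit pixels, alpha is 0..255.
   alpha = 0 returns p1; larger alpha moves each byte towards p2. */
uint32_t blend_packed(uint32_t p1, uint32_t p2, uint32_t alpha)
{
    const uint32_t mask = 0x00ff00ffu;

    /* the two bytes at bits 0-7 and 16-23 (c4 and c2 in the asm comments) */
    uint32_t a1 = p1 & mask;
    uint32_t a2 = p2 & mask;
    uint32_t lo = ((((a2 - a1) * alpha) >> 8) + a1) & mask;

    /* the two bytes at bits 8-15 and 24-31 (c3 and c1), shifted down first */
    uint32_t b1 = (p1 >> 8) & mask;
    uint32_t b2 = (p2 >> 8) & mask;
    uint32_t hi = ((((b2 - b1) * alpha) >> 8) + b1) & mask;

    return lo | (hi << 8);
}
```

Note that with alpha = 255 a blend from 0 towards 255 stops at 254 in this model, because the shift divides by 256.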

AGS · 26 August 2014, 10:46

For me this is hard to understand. Any more comments?

Thorham · 26 August 2014, 10:54

The formula for color components as bytes is:

((alpha * component) + ((255 - alpha) * component)) / 255

Sadly, this means that if you want it to be exact, you can't divide by 256 with a simple shift. If you don't care, shifting will be much faster. And, yes, 256 is WRONG. Don't let anyone tell you any different

AGS · 26 August 2014, 11:02

Thanks.

Does it matter which "component" is the source and which one is the destination?

Thorham · 26 August 2014, 11:15

It's like this:

component3 = ((alpha * component1) + ((255 - alpha) * component2)) / 255
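
As a small sketch of this formula in C (the function name `blend_exact` is an assumption for illustration; the arithmetic is the formula above with a truncating integer divide):

```c
#include <stdint.h>

/* Exact 0..255 blend: alpha = 255 returns component1,
   alpha = 0 returns component2. */
uint8_t blend_exact(uint8_t alpha, uint8_t component1, uint8_t component2)
{
    /* the sum is at most 255 * 255, so it fits in 16 bits */
    uint32_t sum = (uint32_t)alpha * component1
                 + (uint32_t)(255 - alpha) * component2;
    return (uint8_t)(sum / 255);   /* truncating divide */
}
```

For example, with alpha = 191, component1 = 100 and component2 = 200 the sum is 19100 + 12800 = 31900, and 31900 / 255 truncates to 125.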

AGS · 26 August 2014, 11:22

Yes, I meant, is "component1" my picture or the screen?

StingRay · 26 August 2014, 11:34

Quote:

Originally Posted by AGS

For me this is hard to understand. Any more comments?

Here is an unoptimised version of my code above which should be much easier to understand. Please note that the code is untested as I've just made it as an example for you to understand the routine. It should work though [famous last words].

Code:

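    lea    SRCPIC1,a0
    lea    SRCPIC2,a1
    lea    DEST,a2
    move.l    #SIZE,d7        ; number of pixels (bytes)
    moveq    #0,d1            ; clear upper bits for the byte load
.loop    moveq    #0,d2
    move.b    (a0)+,d1        ; c1 = pixel from source 1
    move.b    (a1)+,d2        ; c2 = pixel from source 2
    sub.w    d1,d2            ; c2 - c1 (may be negative)
    muls.w    d0,d2            ; (c2 - c1) * blending value in d0
    lsr.w    #8,d2            ; /256: fixed point -> integer
    add.w    d1,d2            ; + c1
    move.b    d2,(a2)+        ; only the low byte is stored

    subq.l    #1,d7
    bne.b    .loop
    rts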
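
The per-pixel arithmetic of this loop can be modelled in C as a rough sketch. The name `lerp_shift` is an assumption, and the blending value is taken as a parameter since the asm expects it in d0 on entry. Unsigned wraparound is used on purpose: like the asm, only the low byte of the result matters, so a negative difference still comes out right.

```c
#include <stdint.h>

/* Hypothetical model of one loop iteration:
   result = c1 + (((c2 - c1) * alpha) >> 8), low byte only. */
uint8_t lerp_shift(uint8_t c1, uint8_t c2, uint8_t alpha)
{
    /* (c2 - c1) may be negative; converting to uint32_t wraps it,
       and the final cast to uint8_t keeps the correct low byte */
    uint32_t scaled = (uint32_t)(c2 - c1) * alpha;
    return (uint8_t)(c1 + (scaled >> 8));
}
```

In this model `lerp_shift(0, 255, 255)` gives 254 rather than 255, while `lerp_shift(255, 0, 255)` does reach 0.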

AGS · 26 August 2014, 12:03

Looks good. In the meantime I wrote this (tested):

Code:

calc_pixels	move.w	#255,d6

.loop		moveq	#0,d2
		move.b	3(a0),d2		; alpha

		moveq	#0,d0
		moveq	#0,d1
		move.b	(a0)+,d0
		move.b	(a1),d1
		bsr	.one
		move.b	d0,(a1)+

		moveq	#0,d0
		moveq	#0,d1
		move.b	(a0)+,d0
		move.b	(a1),d1
		bsr	.one
		move.b	d0,(a1)+
		
		moveq	#0,d0
		moveq	#0,d1
		move.b	(a0)+,d0
		move.b	(a1),d1
		bsr	.one
		move.b	d0,(a1)+
		
		addq.l	#1,a0		; skip alpha in source

		dbf	d4,.loop
		rts

.one		; d2 alpha, d0 pic, d1 screen 

		mulu.w	d2,d0
		move.w	d6,d5
		sub.w	d2,d5
		mulu.w	d5,d1
		add.w	d1,d0
		divu.w	d6,d0
		rts

Here a0 points to the image as an array of RGBA pixels, and a1 points to the screen contents behind the picture to draw, in RGB format.
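
The same loop, sketched in C under the layout just described (4 bytes per source pixel with alpha last, 3 bytes per destination pixel). The function name and the `pixels` parameter are assumptions made for this example, not part of the asm:

```c
#include <stddef.h>
#include <stdint.h>

/* Blend an RGBA image over an RGB buffer in place.
   Per channel: dst = (alpha * src + (255 - alpha) * dst) / 255 */
void blend_rgba_over_rgb(const uint8_t *src_rgba, uint8_t *dst_rgb, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++) {
        uint32_t alpha = src_rgba[3];            /* alpha is the 4th byte */
        for (int ch = 0; ch < 3; ch++) {
            uint32_t sum = alpha * src_rgba[ch]
                         + (255u - alpha) * dst_rgb[ch];
            dst_rgb[ch] = (uint8_t)(sum / 255u);
        }
        src_rgba += 4;                           /* skip R, G, B and alpha */
        dst_rgb += 3;
    }
}
```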

I wonder how you got rid of the divu by using lsr #8,d2. Thorham wrote that this is not exact.

StingRay · 26 August 2014, 12:37

I am using simple linear interpolation and my routine IS exact! The lsr.w #8 is to shift down to integer as I'm using fixed point.

AGS · 26 August 2014, 12:56

@StingRay

You multiply by d0, but nothing is loaded into d0 in your code.

Mrs Beanbag · 26 August 2014, 13:06

Trying to understand that as

a*x + (1-a)*y = a*x + y - a*y = a*(x-y) + y

or

(a*x + (255-a)*y)/255 = (a*x + 255*y - a*y)/255 = a*(x-y)/255 + y

StingRay I can't see how your routine is exact as you are dividing by 256 and not 255.

@AGS: d0 is the input alpha value (constant for whole image)

thomas · 26 August 2014, 13:13

Quote:

Originally Posted by StingRay

I am using simple linear interpolation and my routine IS exact! The lsr.w #8 is to shift down to integer as I'm using fixed point.

It's not. Your routine assumes an alpha value (in D0) between 0 and 256. But the requirement of the OP was 0..255.
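
One common way to bridge the two ranges, shown here only as a sketch and not taken from anyone's routine in this thread, is to stretch a 0..255 alpha to 0..256 by adding its top bit before the multiply. The name `lerp_shift_full` is made up for the example:

```c
#include <stdint.h>

/* Shift-based blend that accepts alpha 0..255 but still reaches both ends.
   alpha + (alpha >> 7) leaves 0..127 unchanged and maps 128..255 to 129..256. */
uint8_t lerp_shift_full(uint8_t c1, uint8_t c2, uint8_t alpha)
{
    uint32_t a = (uint32_t)alpha + (alpha >> 7);
    /* unsigned wraparound handles a negative (c2 - c1); only the low
       byte of the final sum is kept */
    uint32_t scaled = (uint32_t)(c2 - c1) * a;
    return (uint8_t)(c1 + (scaled >> 8));
}
```

With this remap alpha = 0 returns c1 and alpha = 255 returns c2 exactly; values in between are still an approximation of the /255 formula.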

mc6809e · 26 August 2014, 13:14

Quote:

Originally Posted by Thorham

The formula for color components as bytes is:

((alpha * component) + ((255 - alpha) * component)) / 255

Sadly, this means that if you want it to be exact, you can't divide by 256 with a simple shift. If you don't care, shifting will be much faster. And, yes, 256 is WRONG. Don't let anyone tell you any different

Nice catch!

Correct me if I'm wrong, though, but isn't there a difference between 1/255 and 1/256 only if alpha is >=128?

Including a single rounding at the end with an ADDX should take care of any inaccuracy (over the input domain) when using lsr #8 for the divide.

Mrs Beanbag · 26 August 2014, 13:20

A mathematical fact that some people find surprising is that 0.999999... = 1.

Yes, that is equals, not approximates. If there are infinitely many 9s after the decimal point, it is just another way of writing the same number. We can use this fact.

Given an alpha in the range 0..255, we can expand it to the range 0..65535 simply by computing 256a + a = 257a, which, read as a fixed-point fraction, is a multiply by 257/256. A right shift of 16 bits is then a better approximation than our original 8-bit shift. We could repeat this until the desired precision was reached. If we carried the expansion on forever (impossible in practice, of course) it would be exact. But if our result is only 8 bits, who cares? The first approximation is probably more than good enough. This same trick is used in the Amiga's hardware to convert OCS 12-bit colour values into 24-bit when they are written into the hardware registers.
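
A rough C sketch of this idea, with made-up function names. `mul_approx` is the first approximation described above (multiply alpha by 257, shift by 16). `div255` goes one step further and is not from the thread: since 1/255 = 1/256 + 1/256^2 + ..., keeping the first two terms plus a +1 correction gives the truncated quotient for every product of two bytes.

```c
#include <stdint.h>

/* First approximation: (c * alpha * 257) >> 16.
   Equals c * alpha / 255, or one less when c * alpha is a
   non-zero multiple of 255. */
uint32_t mul_approx(uint32_t alpha, uint32_t c)
{
    return (c * (alpha * 257u)) >> 16;
}

/* Truncated t / 255 using shifts and adds only, valid for 0 <= t <= 255 * 255. */
uint32_t div255(uint32_t t)
{
    return (t + 1u + (t >> 8)) >> 8;
}
```

On a 68k both would avoid `divu`, at the cost of an extra shift and add for the second version.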

AGS · 26 August 2014, 13:57

Please help, I need a v45 picture.datatype that returns alpha information. This must exist, even though cybergraphics does not support alpha.

Thorham · 26 August 2014, 13:57

Perhaps a bit late, but...

Quote:

Originally Posted by AGS

Yes, I meant, is "component1" my picture or the screen?

Yeah, sorry about that. Doesn't really matter, just pick one convention, and stick to it.

Here's an example:

If you say that component1 comes from the screen, and component2 comes from the bitmap you want to blend over the screen, and you want the screen to be 75 percent and the bitmap 25 percent, then you set alpha to 191 (75 percent of 256 = 192 -> 192 - 1 = 191).

AGS · 26 August 2014, 14:03

@Mrs Beanbag

You are possibly a mathematician. If StingRay isn't right with that addx, I would use a division table. Would that be possible?

Mrs Beanbag · 26 August 2014, 14:08

You could use a multiplication table. It would only take 64 KB of memory and map (x,y) -> x*y/255
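
A minimal C sketch of such a table, assuming truncating division as in the expression above. The names `mul255`, `init_mul255` and `blend_table` are invented for this example:

```c
#include <stdint.h>

/* 256 * 256 bytes = 64 KB: mul255[x][y] = x * y / 255 (truncated) */
uint8_t mul255[256][256];

void init_mul255(void)
{
    for (uint32_t x = 0; x < 256; x++)
        for (uint32_t y = 0; y < 256; y++)
            mul255[x][y] = (uint8_t)((x * y) / 255u);
}

/* Blend with two table lookups and one add, no multiply or divide.
   alpha = 255 returns c1, alpha = 0 returns c2. */
uint8_t blend_table(uint8_t alpha, uint8_t c1, uint8_t c2)
{
    return (uint8_t)(mul255[alpha][c1] + mul255[255 - alpha][c2]);
}
```

Because each product is truncated separately, the result can be one lower than dividing the full sum by 255; it never exceeds 255.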

AGS · 26 August 2014, 17:10

I did it, partially. What I get is not very good: it's getting too dark. Please see the examples. Picture one is the picture to draw, picture two is the result I get, and picture three is how it should look. What's wrong? Here is my code as it is now:

d0 = component 1
d1 = component 2
d2 = alpha
d3 = 255

Code:

		sub.w	d1,d0
		muls.w	d2,d0
		divs.w	d3,d0
		add.w	d1,d0

Note: I use pngalpha.library to get the alphachannel data and dtimage.library to get the color data. Then I insert the alpha bytes into the image data and draw.
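
For comparison, the four instructions above can be modelled in C as follows. This is a sketch under assumptions the snippet does not show: the function name is made up, and d0, d1 and d2 are taken to hold zero-extended bytes (the `.w` instructions work on 16 bits, so stale upper bytes would change the result).

```c
#include <stdint.h>

/* Model of: sub.w d1,d0 / muls.w d2,d0 / divs.w d3,d0 / add.w d1,d0
   c1 = component 1 (d0), c2 = component 2 (d1), alpha in d2, 255 in d3. */
uint8_t blend_signed(uint8_t alpha, uint8_t c1, uint8_t c2)
{
    int32_t diff = (int32_t)c1 - (int32_t)c2;        /* sub.w  d1,d0 */
    int32_t scaled = (diff * (int32_t)alpha) / 255;  /* muls.w + divs.w,
                                                        truncates towards zero */
    return (uint8_t)(scaled + c2);                   /* add.w  d1,d0 */
}
```

In this model alpha = 255 returns component 1 and alpha = 0 returns component 2, so if the picture still comes out too dark it may be worth checking how the registers are loaded and whether the alpha bytes are what is expected; neither is visible in the snippet.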
