English Amiga Board


Old 02 September 2014, 14:14   #81
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
Ah! And is this new method as accurate as the divs-version?
Old 02 September 2014, 14:23   #82
Mrs Beanbag
Glastonbridge Software
 
 
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
with the multiplication by 258 it should be accurate to 2 parts in 65535 (255*255*258/256 = 65533.0078...)

that should be good enough for anyone
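The figure above can be checked by brute force. This is a minimal sketch in C, assuming the method under discussion replaces `x / 255` with `(x * 258) >> 16` for products `x` of two 8-bit values; the function names are hypothetical and not from the thread.

```c
#include <stdint.h>

/* Approximation: d * a / 255 computed as (d * a * 258) >> 16, for d and a in 0..255. */
uint32_t scale_approx(uint32_t d, uint32_t a)
{
    return (d * a * 258u) >> 16;
}

/* Reference: exact integer division by 255. */
uint32_t scale_exact(uint32_t d, uint32_t a)
{
    return (d * a) / 255u;
}

/* The full-scale value quoted in the post: 255 * 255 * 258 / 256, truncated. */
uint32_t full_scale_258(void)
{
    return (255u * 255u * 258u) / 256u;
}

/* Largest absolute difference between the two over all 8-bit inputs. */
uint32_t max_abs_error(void)
{
    uint32_t worst = 0;
    for (uint32_t d = 0; d < 256; d++) {
        for (uint32_t a = 0; a < 256; a++) {
            uint32_t x = scale_approx(d, a);
            uint32_t y = scale_exact(d, a);
            uint32_t err = x > y ? x - y : y - x;
            if (err > worst)
                worst = err;
        }
    }
    return worst;
}
```

For non-negative inputs the two results never differ by more than one step of the 8-bit output.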
Old 02 September 2014, 14:27   #83
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
But it's more accurate and faster than the method with the lsr #8 (instead of the div by 255), right? An lsr takes twice as many cycles as the number of bits that are shifted.
Old 02 September 2014, 15:11   #84
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,810
Quote:
Originally Posted by AGS
But it's more accurate and faster than the method with the lsr #8 (instead of the div by 255), right? An lsr takes twice as many cycles as the number of bits that are shifted.
On the 68020 and 68030, shifts of up to eight bits and swaps all take four cycles when executed from the cache. For shifting by more than eight bits you have to add two cycles (swap is faster when you can use it, but you can't always, because it's essentially a cheap rotate).

On the 68000, shifts take more cycles the more bits you shift.
Old 02 September 2014, 15:13   #85
Mrs Beanbag
Glastonbridge Software
 
 
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
on 68000, lsr takes 6+2n cycles for a byte or word shift (8+2n for a long), swap takes 4 cycles, so yes it is both more accurate and faster.

on 68020+, however, lsr and swap take the same time. But then 68000 doesn't have muls.l anyway. muls.l is much slower than muls.w, too. We could instead multiply alpha by 129 and use a muls.w followed by a left shift.
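To illustrate why the 16-bit route gives the same product, here is a small C model. It is a sketch under assumptions: `scale_258` stands in for the 32-bit multiply, `scale_129_shift` for a `muls.w` followed by a one-bit left shift, and both names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: a 32-bit multiply by alpha * 258. */
int32_t scale_258(int16_t diff, uint8_t alpha)
{
    return (int32_t)diff * ((int32_t)alpha * 258);
}

/* 16-bit variant: alpha * 129 fits a signed word for alpha <= 254,
   so a 16x16->32 multiply can be used and the product doubled afterwards. */
int32_t scale_129_shift(int16_t diff, uint8_t alpha)
{
    assert(alpha <= 254);                     /* 255 * 129 would not fit in a signed word */
    int16_t scaled = (int16_t)(alpha * 129);
    int32_t prod = (int32_t)diff * scaled;    /* models muls.w */
    return prod * 2;                          /* models lsl.l #1; written as * 2 so negative values stay well defined in C */
}

/* Returns 1 if both variants agree for every difference of two bytes and every alpha 0..254. */
int variants_agree(void)
{
    for (int diff = -255; diff <= 255; diff++)
        for (int alpha = 0; alpha <= 254; alpha++)
            if (scale_258((int16_t)diff, (uint8_t)alpha) !=
                scale_129_shift((int16_t)diff, (uint8_t)alpha))
                return 0;
    return 1;
}
```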
Old 02 September 2014, 15:20   #86
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
Quote:
Originally Posted by Mrs Beanbag
We could instead multiply alpha by 129 and use a muls.w followed by a left shift.
Please give me the example. Will it still work with swap then or ... ?
Old 02 September 2014, 15:31   #87
Mrs Beanbag
Glastonbridge Software
 
 
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
well it would be either lsl #1 followed by swap, or lsr #7 and lsr #8. We'd need to shift right 15 bits instead of 16.
Old 02 September 2014, 15:40   #88
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450

Ok, I tried it this way:

Code:
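	moveq	#0,d0		; clear d0
	moveq	#0,d2		; clear d2 (the alpha register)

	move.b	d2,d0		; d0 = alpha
	lsl.w	#7,d2		; d2 = alpha * 128
	add.w	d0,d2		; d2 = alpha * 129

	moveq	#0,d0		; clear d0 again before loading the source byte

	move.b	(a0)+,d0	; d0 = source pixel
	move.b	(a1),d1		; d1 = destination pixel (upper bits of d1 are not cleared here)
	sub.l	d1,d0		; d0 = source - destination
	muls.w	d2,d0		; d0 = (source - destination) * alpha * 129
	lsl.l	#1,d0		; double it, so * 258 in total
	swap	d0		; high word into low word, i.e. divide by 65536
	add.w	d1,d0		; add the destination back
	move.b	d0,(a1)+	; store the blended pixel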
Doesn't work though.
Old 02 September 2014, 15:57   #89
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450

Quote:
Originally Posted by Mrs Beanbag
well it would be either lsl #1 followed by swap, or lsr #7 and lsr #8. We'd need to shift right 15 bits instead of 16.
Doesn't the highest bit disappear in an lsl #1,d0?
Old 02 September 2014, 15:59   #90
Mrs Beanbag
Glastonbridge Software
 
 
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
can't see why it wouldn't work. it would help to have a more detailed description of what isn't working.

those first two moveq #0s are inside the loop, right?

Quote:
Originally Posted by AGS
Doesn't the highest bit disappear in an lsl #1,d0?
Multiplying by 258 or multiplying by 129 followed by a left shift should be the same. I hope that seems obvious. Lsl is no different to Asl, the sign is only important in right shifts.

254*129 = 32766 < 32768 so that shouldn't overflow. 255 is handled elsewhere. hmm...
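The overflow bound is easy to confirm. A minimal sketch in C, with hypothetical helper names, that checks which alpha values keep `alpha * 129` inside a signed 16-bit word:

```c
#include <stdint.h>

/* Returns 1 if alpha * 129 still fits in a signed 16-bit word. */
int scaled_alpha_fits_word(unsigned alpha)
{
    long scaled = (long)alpha * 129;
    return scaled <= INT16_MAX;
}

/* Largest 8-bit alpha for which the scaled value fits. */
unsigned largest_safe_alpha(void)
{
    unsigned best = 0;
    for (unsigned alpha = 0; alpha <= 255; alpha++)
        if (scaled_alpha_fits_word(alpha))
            best = alpha;
    return best;
}
```

With these definitions 254 is the last safe value, which is why alpha 255 has to be treated as a separate case.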

Last edited by Mrs Beanbag; 02 September 2014 at 16:05.
Old 02 September 2014, 16:09   #91
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
Yes they are inside. Here's what it looks like. None of the transparent pixels are drawn.
Attached thumbnail: error.png
Old 02 September 2014, 16:13   #92
Mrs Beanbag
Glastonbridge Software
 
 
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
and the correct result?
Old 02 September 2014, 16:23   #93
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
Here:
Attached thumbnail: correct.jpg
Old 02 September 2014, 16:35   #94
JimDrew
Registered User
 
Join Date: Dec 2013
Location: Lake Havasu City, AZ
Posts: 741
If you guys don't mind burning 64K of memory, this would be way faster using a simple lookup table.
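The post does not say how such a table would be laid out. One possible layout, offered only as a sketch, is a 256 x 256 byte table of `alpha * value / 255`, so that a blend becomes two lookups and an add. The names `mul_table`, `blend_lut` and the rounding choice are assumptions, not something from the thread.

```c
#include <stdint.h>

static uint8_t mul_table[256][256];   /* 256 * 256 bytes = 64 KB */
static int table_ready = 0;

/* Fill the table once: entry [a][v] holds a * v / 255, rounded to nearest. */
static void init_mul_table(void)
{
    for (int a = 0; a < 256; a++)
        for (int v = 0; v < 256; v++)
            mul_table[a][v] = (uint8_t)((a * v + 127) / 255);
    table_ready = 1;
}

/* Size of the table in bytes. */
unsigned long mul_table_size(void)
{
    return (unsigned long)sizeof mul_table;
}

/* Blend as src * alpha / 255 + dst * (255 - alpha) / 255, using lookups only. */
uint8_t blend_lut(uint8_t src, uint8_t dst, uint8_t alpha)
{
    if (!table_ready)
        init_mul_table();
    return (uint8_t)(mul_table[alpha][src] + mul_table[255 - alpha][dst]);
}
```

Whether this actually beats the multiply on a given CPU depends on cache behaviour, so it would need measuring.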
Old 02 September 2014, 17:07   #95
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
How can I measure the speed? It's running in an OS app. Reading the Seconds and Micros fields in IntuitionBase did not reveal much. It seems to be too fast for Intuition to update those fields.
Old 02 September 2014, 17:19   #96
alkis
Registered User
 
Join Date: Dec 2010
Location: Athens/Greece
Age: 53
Posts: 720
call it a gazillion times
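In other words, repeat the routine until the total run time is long enough for a coarse timer to resolve. A minimal sketch using only the standard C `clock()` function; `time_calls` and `dummy_workload` are hypothetical names, and the real blit routine would take the place of the dummy.

```c
#include <time.h>

static volatile unsigned char sink;

/* Placeholder for the routine being measured. */
void dummy_workload(void)
{
    sink = (unsigned char)(sink + 1);
}

/* Calls fn 'iterations' times and returns the elapsed CPU time in seconds. */
double time_calls(void (*fn)(void), unsigned long iterations)
{
    clock_t start = clock();
    for (unsigned long i = 0; i < iterations; i++)
        fn();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Dividing the returned time by the iteration count gives an average per call, provided the loop overhead is small compared to the routine itself.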
Old 02 September 2014, 17:38   #97
Mrs Beanbag
Glastonbridge Software
 
 
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
so where is D2 being read from memory?
Old 02 September 2014, 17:41   #98
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
AH! It was cleared after the read! Thus all partially transparent pixels became totally transparent. I am now measuring the speed.
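For reference, the per-pixel arithmetic of the 129-multiplier variant can be modelled in C once alpha is actually loaded before it is scaled. This is only a sketch: `blend_pixel` and `floor_shift15` are hypothetical names, and alpha 255 is assumed to be handled by a separate path, as mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Floor division by 2^15, written without relying on how C shifts negative values. */
static int32_t floor_shift15(int32_t x)
{
    return x >= 0 ? x / 32768 : -((-x + 32767) / 32768);
}

/* One 8-bit pixel component; alpha must be 0..254. */
uint8_t blend_pixel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    assert(alpha <= 254);
    int16_t scaled = (int16_t)(alpha + (alpha << 7)); /* alpha * 129: move.b, lsl.w #7, add.w */
    int16_t diff = (int16_t)(src - dst);              /* sub */
    int32_t prod = (int32_t)diff * scaled;            /* muls.w */
    int32_t delta = floor_shift15(prod);              /* lsl.l #1 followed by swap */
    return (uint8_t)(dst + delta);                    /* add.w, move.b */
}
```

With alpha 0 the destination is returned unchanged; with alpha 254 a brighter source lands one step below its own value, because the high word is a floor.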
Old 02 September 2014, 18:01   #99
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
It works, but it turns out to be 5 seconds slower (66 vs 71) at 3000 iterations, at least here in FS-UAE.

PS: I think this is because the emulation makes no difference between muls.w and muls.l, so the additional lsl.l just takes more time. @Toni Wilen: What do you think?

Last edited by AGS; 02 September 2014 at 18:09.
Old 02 September 2014, 18:38   #100
AGS
XoXo/Tasko Developer
 
 
Join Date: Dec 2013
Location: Munich
Age: 48
Posts: 450
@Mrs Beanbag

I compared speed with the original muls/divs version and found that the muls/divs version is faster than the new optimized variant by 3 seconds. And additionally, when we draw the same alpha picture onto the screen over and over again, the result is surprising. Left is the variant with muls and divs, and right the new optimized variant. Seems like something is going wrong?
Attached thumbnail: compare.jpg
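One possible contributor to a visible difference, which nothing in the thread confirms, is rounding direction: divs truncates the quotient toward zero, while taking the high word after a multiply rounds toward minus infinity. The C sketch below simulates drawing the same value repeatedly with both roundings. It assumes the divs version divides the signed difference times alpha by 255; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Floor division by 65536 without relying on how C shifts negative values. */
static int32_t floor_shift16(int32_t x)
{
    return x >= 0 ? x / 65536 : -((-x + 65535) / 65536);
}

/* Division variant: C's '/' truncates toward zero, as divs does. */
int blend_div(int src, int dst, int alpha)
{
    return dst + (src - dst) * alpha / 255;
}

/* Multiply-and-take-high-word variant: negative deltas round down. */
int blend_mul(int src, int dst, int alpha)
{
    return dst + (int)floor_shift16((int32_t)(src - dst) * alpha * 258);
}

/* Draw the same source value over the destination 'passes' times. */
int repeat_blend(int (*blend)(int, int, int), int src, int dst, int alpha, int passes)
{
    assert(passes >= 0);
    for (int i = 0; i < passes; i++)
        dst = blend(src, dst, alpha);
    return dst;
}
```

In this model, darkening from 100 toward 0 with alpha 10 stalls at 25 with truncation but runs all the way to 0 with the floor, while brightening from 0 toward 100 stalls at 75 in both. Repeated draws would therefore drift darker in the multiply variant only.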

Last edited by AGS; 02 September 2014 at 18:59.
 

