English Amiga Board


Old 21 March 2021, 01:40   #61
DanScott
Lemon. / Core Design
Join Date: Mar 2016
Location: Tier 5
Posts: 1,209
Now 54 bytes. Shaved off 4 bytes by removing the loop counter.

Code:
                ; Entry a0 = Beginning of 2048 byte sine table buffer

                lea     512(a0),a0  ; a0 = 2nd Quadrant Start
                lea     2(a0),a1    ; a1 = 1st Quadrant End + 2
                moveq   #11,d0      ; d0.l = x = 11
                moveq   #1,d1
                ror.w   #2,d1       ; d1.l = y = 16384
                move.w  #163,d2     ; d2 = Q = magic division value = 512/PI (162.97466)
.Loop
                move.l  d1,d3
                divu    d2,d3
                add.w   d3,d0       ; x = x + (y / Q)
                move.l  d0,d3
                divu    d2,d3
                sub.w   d3,d1       ; y = y - (x / Q)
                move.w  d1,d3
                neg.w   d3
                move.w  d3,1024(a0) ; Write 4th Quadrant
                move.w  d3,1022(a1) ; Write 3rd Quadrant
                move.w  d1,(a0)+    ; Write 2nd Quadrant
                move.w  d1,-(a1)    ; Write 1st Quadrant
                addq.w  #1,d3
                blt     .Loop
                clr.w   (a1)        ; Set SinTable index 0 to 0
                clr.w   -(a0)       ; Set SinTable index 512 to 0
.End
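For readers following along without an assembler, here is a rough Python model of the loop above. It is a sketch rather than DanScott's code: the function name and the dict standing in for memory are illustrative, indices are word indices into the table, and it assumes `divu` acts as plain integer division here (x and y are non-negative whenever a division happens). Writes are recorded by index, so anything landing outside 0..1023 shows up instead of being hidden.

```python
def minsky_sine_table(q=163, x=11, y=16384, entries=1024):
    """Model of the 68000 loop; returns {word index: value} for every write."""
    quarter, half = entries // 4, entries // 2
    mem = {}
    a0 = quarter                   # lea 512(a0),a0 (as a word index)
    a1 = a0 + 1                    # lea 2(a0),a1
    for _ in range(entries):       # safety bound only; the real loop has no counter
        x += y // q                # x = x + (y / Q)
        y -= x // q                # y = y - (x / Q)
        mem[a0 + half] = -y        # move.w d3,1024(a0)
        mem[a1 + half - 1] = -y    # move.w d3,1022(a1)
        mem[a0] = y                # move.w d1,(a0)+
        a0 += 1
        a1 -= 1                    # move.w d1,-(a1)
        mem[a1] = y
        if 1 - y >= 0:             # addq.w #1,d3 / blt .Loop falls through
            break
    mem[a1] = 0                    # clr.w (a1)
    a0 -= 1                        # clr.w -(a0)
    mem[a0] = 0
    return mem

table = minsky_sine_table()
outside = sorted(i for i in table if not 0 <= i < 1024)
print("peak:", table[256], "writes outside 0..1023:", outside)
```

The loop ends once y drops to 1 or below, which is how the routine gets away without a counter. The `outside` list is there to check the table bounds.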
Old 21 March 2021, 02:03   #62
ross
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
As I guessed:
Code:
    moveq   #97,d0
    ror.l   #7,d0
Max AbsError=1.56%, Max RelError=4.50%
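The two instructions build a 32-bit constant in four bytes: `moveq` loads 97, and rotating right by 7 moves those seven bits to the top of the register. A small Python check of the arithmetic (the helper `ror32` is illustrative, not something from the thread):

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits, like ror.l."""
    value &= 0xFFFFFFFF
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

d0 = ror32(97, 7)
print(hex(d0))  # 0xc2000000
```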

Old 21 March 2021, 02:35   #63
DanScott
For mine (assuming I have the error calculation correct):

Max AbsError = 0.19%, Max RelError = 0.47%
Old 21 March 2021, 05:07   #64
Jobbo
Registered User
Join Date: Jun 2020
Location: Druidia
Posts: 386
Quote:
Originally Posted by DanScott View Post
Now 54 bytes Shaved off 4 bytes by

The precision on this is great!


However I think there might be a buffer overrun happening somewhere because it's causing a problem in my framework where none of the others do. I couldn't find it right away.
Old 21 March 2021, 05:07   #65
Jobbo
Quote:
Originally Posted by ross View Post
As I guessed:
Code:
    moveq   #97,d0
    ror.l   #7,d0
Max AbsError=1.56%, Max RelError=4.50%


Added this one to the spreadsheet.
Old 21 March 2021, 07:11   #66
a/b
Registered User
Join Date: Jun 2016
Location: europe
Posts: 1,039
All in one package (24 bytes speedy lower accuracy, and 38 bytes 0.9% accuracy):
Code:
	moveq	#0,d0			; amp=16384, length=1024
	move.w	#511+2,a1
Loop5	subq.l	#2,a1
	move.l	d0,d1

; extra accuracy begin
	move.w	d1,d2
	not.w	d2
	mulu.w	d1,d2
	divu.w	#75781/2,d2		; 16384/0.2162
	lsr.w	#3,d2			; can't do a 32-bit divu
	sub.w	d2,d1
; extra accuracy end

	asr.l	#2,d1
	move.w	d1,(a0)+
	neg.w	d1
	move.w	d1,(1024-2,a0)
	add.l	a1,d0
	bne.b	Loop5
My fixed-point test code reports 9/1000 (= 0.9%); I guess floats shouldn't be far off.
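Without the extra-accuracy block, the loop adds the odd numbers 511, 509, ... into d0, so after n steps d0 = n(512 - n): a parabola through zero at n = 0 and n = 512, which `asr.l #2` scales to a peak of 16384. A minimal Python sketch of that core, with the correction block left out (names are illustrative):

```python
def parabola_sine_table():
    """Model of the short loop: entry n is n*(512-n)/4, second half negated."""
    table = [0] * 1024
    d0, a1, n = 0, 513, 0
    while True:
        a1 -= 2                       # subq.l #2,a1
        table[n] = d0 >> 2            # asr.l #2,d1 / move.w d1,(a0)+
        table[n + 512] = -(d0 >> 2)   # neg.w d1 / move.w d1,(1024-2,a0)
        n += 1
        d0 += a1                      # add.l a1,d0
        if d0 == 0:                   # bne.b Loop5 falls through
            break
    return table

table = parabola_sine_table()
print(table[1], table[128], table[256])
```

The accumulator returning to zero doubles as the exit test, so no separate loop counter is needed.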

Last edited by a/b; 21 March 2021 at 07:48. Reason: more opt
Old 21 March 2021, 09:44   #67
ross
Quote:
Originally Posted by a/b View Post
All in one package (24 bytes speedy lower accuracy, and 38 bytes 0.9% accuracy):
This one wins hands down.

Max AbsError=0.24%, Max RelError=0.53%, 138174 CPU cycles

Absolutely superb
Old 21 March 2021, 11:01   #68
DanScott
Quote:
Originally Posted by Jobbo View Post
The precision on this is great!


However I think there might be a buffer overrun happening somewhere because it's causing a problem in my framework where none of the others do. I couldn't find it right away.
You're right, it's writing an extra value to the end at the 1025th index :/
Old 21 March 2021, 11:10   #69
ross
Quote:
Originally Posted by DanScott View Post
You're right, it's writing an extra value to the end at the 1025th index :/

That's what made me add that damn beq in my code.
Old 21 March 2021, 11:53   #70
DanScott
I'm going to investigate the 2 quadrant method a bit more.. see if I can get the accuracy into the 2nd quadrant, and that should shave off some bytes.

I'm still well happy with the accuracy of mine, though. I didn't know it was called the Minsky circle... I just remembered someone showing me a fast way to rotate 2D coordinates back in the early 90's using just shifts, so I derived my algorithm from that. I wonder if Bresenham's circle formula could also be used to calculate a sine?

a/b's method of subtracting another scaled down parametric curve to remove the bulge is also pretty neat!!!

Last edited by DanScott; 21 March 2021 at 14:58.
Old 21 March 2021, 16:20   #71
ross
46 bytes..

Code:
    lea SinEntries*2(a0),a1
    moveq   #0,d1
.1  moveq   #97,d0
    ror.l   #7,d0
    move.l  d1,d2
    mulu.w  d2,d2
    sub.l   d2,d0
    swap    d0
    mulu.w  d1,d0
    swap    d0
    move.w  d0,(a0)+
    move.w  d0,-2-SinEntries(a1)
    neg.w   d0
    beq.b   .2
    move.w  d0,-(a1)
.2  move.w  d0,-2+SinEntries(a0)
    addi.w  #1<<7,d1
    bpl.b   .1
    bvs.b   .1
Thanks to a/b!
My usual obvious two-byte optimization that I missed.
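Read as arithmetic, the loop evaluates a cubic in the angle using two 16x16 multiplies. A rough Python model of one evaluation, assuming the angle in d1 runs from 0 to $8000 across the first quadrant in steps of 128 (the function name is illustrative):

```python
def ross_sine(angle):
    """angle: 0..0x8000 covers 0..90 degrees; returns the word written to the table."""
    d0 = 0xC2000000 - angle * angle   # moveq/ror.l constant, then sub.l d2,d0
    d0 >>= 16                         # swap d0 (keep the high word)
    d0 = angle * d0                   # mulu.w d1,d0
    return d0 >> 16                   # swap d0 (keep the high word again)

quadrant = [ross_sine(step << 7) for step in range(257)]
print(quadrant[0], quadrant[256])
```

In this model the peak comes out as 16640 rather than 16384, which is 256/16384 = 1.56% high and agrees with the Max AbsError ross quoted for this constant.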
Old 21 March 2021, 22:38   #72
ross
Finally the 2+2 quadrant version.

40 bytes, or 38 bytes if a table of 1025 sine entries is accepted.
Algorithm and precision are the same as the previous one.

The only advantage over a/b's one is speed.

Code:
    move.w  #1<<7,d3
    moveq   #0,d1
.1  moveq   #97,d0
    ror.l   #7,d0
    move.l  d1,d2
    mulu.w  d2,d2
    sub.l   d2,d0
    swap    d0
    mulu.w  d1,d0
    swap    d0
    move.w  d0,(a0)+
    neg.w   d0
    move.w  d0,-2+SinEntries(a0)
    add.w   d3,d1
    beq.b   .e
    bpl.b   .1
    neg.w   d3
    bmi.b   .1
.e
I don't like how the exit is handled.
Those who can suggest possible optimizations are welcome.

I also made a version that directly calculates all quadrants, but I would need a different entry point for a0, so I discarded it.
Old 22 March 2021, 08:25   #73
a/b
Nice, that's more like it ;P. 4x was too much trouble.

I'm using the official measurements now (float sine as a reference), and I've slightly improved the accuracy. It's still 0.528 but the sum of all errors is about 30% lower. I'll post the update later in case there are any other changes, don't want to create too many 'official' versions.
Whoever is updating the docs, muchas thanks.
Old 22 March 2021, 23:04   #74
DanScott
a/b's method of "adjusting" the error with another curve got me thinking about the possibility of making a fast atan2 function, using a similar curve to adjust the result of a simple divide (x/y). I wonder if that's possible?
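One well-known approximation has roughly this shape: for |x| <= 1, atan(x) is taken as a straight line plus a small parabola that vanishes at 0 and at ±1. The sketch below is not from the thread. The 0.273 coefficient is a commonly quoted value, the function name is illustrative, and a full atan2 would still need octant handling around it.

```python
import math

def atan_approx(x):
    """Approximate atan(x) for -1 <= x <= 1: linear term plus a parabolic correction."""
    return (math.pi / 4) * x + 0.273 * x * (1 - abs(x))

worst = max(abs(atan_approx(i / 1000) - math.atan(i / 1000))
            for i in range(-1000, 1001))
print(f"max error on [-1, 1]: {worst:.4f} rad")
```

In fixed point the ratio would come from one divide, and the correction costs one or two multiplies, much like the extra-accuracy block in a/b's sine loop.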
Old 22 March 2021, 23:46   #75
mr.spiv
Registered User
Join Date: Aug 2006
Location: Finland
Age: 51
Posts: 241
Quote:
Originally Posted by ross View Post
This one wins hands down.

Max AbsError=0.24%, Max RelError=0.53%, 138174 CPU cycles

Absolutely superb
Ditto!
Old 23 March 2021, 00:59   #76
a/b
OK, I'm gonna give it a rest now (have to do some actual work ><)...
A request to the maintainer: could you please remove my previous versions, or eventually leave "A/B 2" as a short+inaccurate version so we can also have a collection of those? The rest is obsolete (yeah, rm -rf is my favorite command).

My code rates this as max_err=0.528 avg_err=0.159%, not sure what the sheet is going to say. Max absolute is still 0.528 since the "first" value is 100.528, so if you start with a 100 you get 0.528 and if you start with a 101 you get 0.472, and basically that's about it.

Code:
	moveq	#0,d0		; amp=16384, len=1024
	move.w	#511+2,a1
Loop	subq.l	#2,a1
	move.l	d0,d1

; extra accuracy begin (comment in/out as needed)
	move.w	d1,d2
	neg.w	d2
	mulu.w	d1,d2
	divu.w	#74504/2,d2	; 74504=amp/scale
	lsr.w	#2+1,d2
	sub.w	d2,d1
; extra accuracy end

	asr.l	#2,d1
	move.w	d1,(a0)+
	neg.w	d1
	move.w	d1,(1024-2,a0)
	add.l	a1,d0
	bne.b	Loop
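A rough Python model of the loop with the extra-accuracy block enabled may help when checking values by hand. It is a sketch under stated assumptions: the names are illustrative, word-sized operations are modelled with 16-bit masks, and the divisor and shift are taken from the listing above (74504/2, then `lsr.w #2+1`). The correction takes the low word w of the parabola, forms w times its 16-bit negation, scales that down, and subtracts it from w before the final shift.

```python
def ab_sine_table(divisor=74504 // 2, shift=3):
    """Model of the loop with the extra-accuracy block enabled."""
    table = [0] * 1024
    d0, a1, n = 0, 513, 0
    while True:
        a1 -= 2                                      # subq.l #2,a1
        d1 = d0                                      # move.l d0,d1
        lo = d1 & 0xFFFF                             # move.w d1,d2
        d2 = lo * (-lo & 0xFFFF)                     # neg.w d2 / mulu.w d1,d2
        d2 = ((d2 // divisor) & 0xFFFF) >> shift     # divu.w / lsr.w #2+1,d2
        d1 = (d1 & ~0xFFFF) | ((lo - d2) & 0xFFFF)   # sub.w d2,d1
        table[n] = d1 >> 2                           # asr.l #2,d1 / move.w d1,(a0)+
        table[n + 512] = -(d1 >> 2)                  # neg.w d1 / move.w d1,(1024-2,a0)
        n += 1
        d0 += a1                                     # add.l a1,d0
        if d0 == 0:                                  # bne.b Loop falls through
            break
    return table

table = ab_sine_table()
print(table[128], table[256])
```

At the peak the parabola value is $10000, whose low word is zero, so the correction vanishes there and the entry stays at exactly 16384.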
Old 23 March 2021, 06:15   #77
Jobbo
Spreadsheet is updated with A/B and Ross' latest.
Old 23 March 2021, 09:45   #78
a/b
Quote:
Originally Posted by Jobbo View Post
Spreadsheet is updated with A/B and Ross' latest.
Thanks!

Hmmm, it gives 0.07%. Not that I'm complaining, but I guess it's using 32-bit floats. I hadn't thought about it before: I'm using 64-bit floats (*don't* laugh, JavaScript) and that's the reason I get a larger error (for the previous version I got 0.23% vs. the sheet's 0.21%).
Old 23 March 2021, 13:15   #79
DanScott
Quote:
Originally Posted by Jobbo View Post
Spreadsheet is updated with A/B and Ross' latest.
If it's not too much trouble, could you add a RoundToInt version of the accurate sine table (as that is the most accurate integer representation of -16384 to +16384)?

Perhaps also an error for each routine calculated against that rounded int ?
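A minimal sketch of what such a comparison could look like in Python. The spreadsheet's exact formulas are not given in the thread, so the error metric here (largest absolute difference as a percentage of the 16384 amplitude) is an assumption, and all names are illustrative. The plain parabola is only a stand-in for a routine under test.

```python
import math

AMPLITUDE = 16384
ENTRIES = 1024

def rounded_reference():
    """Sine table rounded to the nearest integer."""
    return [round(AMPLITUDE * math.sin(2 * math.pi * i / ENTRIES))
            for i in range(ENTRIES)]

def max_abs_error_percent(table, reference):
    """Largest absolute difference, as a percentage of the amplitude."""
    return 100 * max(abs(t - r) for t, r in zip(table, reference)) / AMPLITUDE

reference = rounded_reference()

# Stand-in table under test: a plain parabola per half period.
parabola = [(n * (512 - n)) // 4 if n < 512 else -(((n - 512) * (1024 - n)) // 4)
            for n in range(ENTRIES)]
print(f"{max_abs_error_percent(parabola, reference):.2f}%")
```

Measuring against the rounded table instead of the float sine means a routine that hits every rounded value exactly scores zero error.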
Old 23 March 2021, 14:37   #80
Jobbo
Quote:
Originally Posted by DanScott View Post
If it's not too much trouble, could you add a RoundToInt version of the accurate sine table (as that is the most accurate integer representation of -16384 to +16384)?

Perhaps also an error for each routine calculated against that rounded int ?
I added a little checkbox so you can toggle between the two and added min/max errors up at the top. To change between the two you need to make a copy of the document or request access.

I've put way more work into the spreadsheet than coding a nice sin routine!

Do you have a fix for your accurate version? It's still in there despite the bug.

Last edited by Jobbo; 23 March 2021 at 14:44.