23 April 2017, 23:35   #141
paraj
Registered User
 
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,107
Ye olde amycoders website has a tutorial on avoiding multiplications. Looking at the "using squares" part, I'm having trouble finding an immediate use. All the other tips are more or less "drop-in" replacements for their slower counterparts. While I can certainly think of some uses if one is willing to sacrifice precision, it seems more of an advanced building block compared to the other tricks. Am I missing something?

I'm talking ~7MHz 68000 here. I know the 020+ has (aX,dY*4) and probably other stuff that helps, but doesn't it multiply fast enough for this trick not to matter?

Below is my attempt at doing this in code. As usual there are probably errors, inefficiencies and wrong cycle counts, but in general the target of 38 (or even 50) cycles seems well out of reach.

EDIT: Hmm.. testing shows that folding the shift into the table doesn't lose precision. It's probably easy to see why, but that'll have to wait, and it's still not fast.
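
A possible explanation for the edit above, offered as a sketch rather than a proof: A+B and A-B differ by 2*B, so they are always both even or both odd. If both are even, their squares are multiples of 4 and the shift drops nothing. If both are odd, each square leaves remainder 1 when divided by 4, so each table entry is short by exactly 1/4 and the two shortfalls cancel in the subtraction. The Python below is a host-side check, not 68000 code. The names quarter_square, table_mul and find_mismatch and the tested range LIMIT are made up for this illustration.

```python
# Host-side check (not 68000 code). Table entries hold n*n >> 2,
# mirroring the "dc.l n*n>>2" layout of the listing.
LIMIT = 255  # assumed operand range for this check: -LIMIT..LIMIT


def quarter_square(n):
    """Table entry for index n: floor(n*n / 4)."""
    return (n * n) >> 2


def table_mul(a, b):
    """a*b from two table entries and one subtraction, with no final shift."""
    return quarter_square(a + b) - quarter_square(a - b)


def find_mismatch(limit=LIMIT):
    """Return the first (a, b) where table_mul differs from a*b, else None."""
    for a in range(-limit, limit + 1):
        for b in range(-limit, limit + 1):
            if table_mul(a, b) != a * b:
                return (a, b)
    return None
```

If find_mismatch() returns None, every pair in the tested range multiplies exactly, which would match the test result reported in the edit. It says nothing about speed.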

Code:
; d0 *= d1, assumes a5=square table (entries pre-shifted by 2), d6 used as scratch
; A+B and A-B must stay within -8192..8191 so that 4*index fits in a signed word
; Instruction to beat: muls.w d1, d0 38-70(1/0)
        move.w  d0, d6          ;  4(1/0) d6 = A
        add.w   d1, d0          ;  4(1/0) d0 = A+B
        add.w   d0, d0          ;  4(1/0)
        add.w   d0, d0          ;  4(1/0) d0 = 4*(A+B), longword table offset
        move.l  (a5,d0.w), d0   ; 18(4/0) d0 = (A+B)^2/4
        sub.w   d1, d6          ;  4(1/0) d6 = A-B
        add.w   d6, d6          ;  4(1/0)
        add.w   d6, d6          ;  4(1/0) d6 = 4*(A-B)
        sub.l   (a5,d6.w), d0   ; 20(4/0) d0 = ((A+B)^2 - (A-B)^2)/4
        ; asr.l #2, d0          ; 12(1/0) not needed, see edit: the shift is folded into the table
                                ;--------
                                ; 66(15/0)

    ; ...
    dc.l 9>>2
    dc.l 4>>2
    dc.l 1>>2
squares: ; <- a5 points here
    dc.l 0>>2
    dc.l 1>>2
    dc.l 4>>2
    ; ...
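
The elided parts of the table could be generated on the host instead of typed by hand. The following is only a sketch of one way to do that: the function name square_table_source and its parameters are made up for this example, and the output simply follows the dc.l layout shown above with the shift already applied.

```python
def square_table_source(max_index, label="squares"):
    """Return assembler source for table entries -max_index..max_index.

    The label is emitted just before entry 0, so a5 can point at the
    middle of the table and negative sums or differences index backwards.
    """
    lines = []
    for n in range(-max_index, max_index + 1):
        if n == 0:
            lines.append(f"{label}: ; <- a5 points here")
        # Pre-shifted entry: floor(n*n / 4)
        lines.append(f"    dc.l {(n * n) >> 2}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Small demo range; a real table would have to cover every A+B and A-B used.
    print(square_table_source(3))
```

Since the routine scales a signed word index by 4, the largest index it can reach is 8191 and the smallest is -8192, so a table covering that whole range would take 64 KB.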

Last edited by paraj; 24 April 2017 at 00:00.