Old 28 April 2012, 23:37   #37
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,647
Quote:
Originally Posted by pmc
Code:
	lea	vals(pc),An
	movem.w	(An),Dn-Dn
gets you the ext.l's for free
Indeed it does, and
Code:
	movem.w Vals(PC),d0-d1
saves a further 4 cycles. (Yeye I know it's obvious.)

Mainly replied to say that movem has a fixed overhead, so it only breaks even with plain moves at a count of 3 registers. Here, 2 registers are faster only because of the desired sign extension, which movem.w gives you for free.
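
For reference, a rough tally behind that break-even claim, going by the standard 68000 timing tables and assuming no wait states or DMA contention. The registers and the pre-loaded a0 are just assumptions for illustration; the two variants are alternatives, not one sequence.
Code:
	; Assumed: a0 -> Vals in both variants.
	; Variant A, plain moves plus explicit sign extension: 24 cycles
	move.w	(a0)+,d0	; 8
	move.w	(a0)+,d1	; 8
	ext.l	d0		; 4
	ext.l	d1		; 4

	; Variant B, movem.w sign-extends each word to 32 bits on load: 20 cycles
	movem.w	(a0)+,d0-d1	; 12 + 4*2 = 20
Without the ext.l's, two plain moves take 16 cycles and beat movem's 20. At three registers both come to 24 (3*8 versus 12 + 4*3), which is the break-even point.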

The instructions take the cycles they take, and there are no instruction-reordering optimizations on the 68000, apart from the prefetch after the write to BLTSIZE and the hard-to-predict odd-cycle alignment wait of instructions that take 6/10/14 etc. cycles.

A simple trick for when you have a loop loading registers from memory is to back up the value you're about to overwrite, then pre-poke a magic exit value (say, zero or negative) at the end of the data, instead of checking the end address or a loop counter with DBF. Since you're loading the registers anyway, a simple bmi.s Done instead of dbf Dn,KeepOn saves 2 cycles.
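
A minimal sketch of how that could look in an unrolled loop. This is not the actual loop I optimized: the register choices and the layout (a0 at the first word, a1 at the slot just past the last word, all real data words non-negative) are assumptions for illustration.
Code:
	; Assumed: a0 -> first data word, a1 -> word just past the data,
	; real data words are all >= 0, d7 is free.
	move.w	(a1),d7		; back up the word the exit value overwrites
	move.w	#-1,(a1)	; pre-poke a negative exit value
KeepOn:
	move.w	(a0)+,d0	; the load itself sets N, no tst/cmp needed
	bmi.s	Done		; 8 cycles while not taken
	; ... work on d0 ...
	move.w	(a0)+,d0	; second unrolled copy
	bmi.s	Done
	; ... work on d0 ...
	bra.s	KeepOn
Done:
	move.w	d7,(a1)		; restore the original word
The 2 cycles come from bmi.s costing 8 cycles while not taken, against 10 for a dbf that loops. Note that this relies on move setting the flags; movem leaves the condition codes untouched, so a movem load would need a separate test.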

The same is true for other branches inside loops; you may save 4 cycles for 50% of the branches if all branches jump outside the loop. You save more if there is a bias toward either true or false.

Optimized an unrolled loop yesterday from 64 cycles to 51.75 cycles on average.

Last edited by Photon; 28 April 2012 at 23:45.