English Amiga Board Amiga Lore


Old 03 May 2012, 02:00   #61
Photon
Moderator
Join Date: Nov 2004
Location: Hult / Sweden
Age: 100
Posts: 4,024
Quote:
Originally Posted by TheDarkCoder
may you explain to me these two optimizations?
Well, the prefetch feature of the 68000 is no secret. My theory was that it causes an internal stage, plus either another internal stage or a word memory access (2x 4 cycles in total), to be executed before a blit starts. The theory hasn't been tested for much more than DIVSing perspective while drawing lines 23 years ago, basically because I couldn't find anything more useful that wasn't a normal sub-12-cycle instruction. Sometimes the lines would be shorter than the DIVS cycle time of course, but it was rare. The point was that the DIVS was started immediately and so was calculated internally in parallel with the blit, finishing faster.
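
A minimal sketch of what that ordering might look like, under the same untested theory. The register assignments are assumptions made for the example: a6 = $dff000, every other blitter register already set up for the line, d7 = the BLTSIZE value, d0.l = dividend, d1.w = non-zero divisor.
Code:
BLTSIZE equ     $058            ; offset of BLTSIZE from $dff000

        move.w  d7,BLTSIZE(a6)  ; writing BLTSIZE starts the blit
        divs.w  d1,d0           ; long, mostly internal instruction issued straight after
; wait for the blitter to finish before setting up the next line
The DIVS takes well over 100 cycles with very few bus accesses, which is why it was the candidate for overlapping with a short line blit.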

The other one is aligning table lookups (usually 14c) or a taken branch (10c) and similar with the alternating 4-cycle CMA/DMA memory access. In the vblank period, when no DMA is active, it's "as written": just sum up the cycles. But while actually displaying something, some of them would just be out of luck and have their CMA execute in the NEXT 4-cycle slot that bitplane DMA wasn't hogging.

Normally this is too much work really (really!) since you can't really go "oh, I'll halve the number of colors on screen and I'll be able to fit one or two CMA's between bitplane accesses" cos you'd have ruined the original idea (by making it look shit) and also would have gained much more already, by removing a bitplane's DMA, both for blitter and CPU.

So I haven't tried this, unless I got some routine right by accident, so I expect someone is going to debunk it instantly (and thunderously!).

But it's basically the last straw to grip when a frameful of effect is a sequence of perfectly and godlikely optimized instructions (according to yourself of course). Considering you'd likely have to sync with raster at the start of something, you'd probably lose more by that sync than you gained! But there might be a situation. Not one that wouldn't be completely 'surpassed' by precalc or infinite bobs or whatever, of course. Hah.
Old 20 May 2012, 17:26   #62
meynaf
68k wisdom
Join Date: Nov 2007
Location: Lyon (France)
Age: 41
Posts: 1,118
A few coding tricks, not specific to the 68000 but still OK there, I think:
Code:
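; sgn - returns d1=0 if d0=0, d1=1 if d0>0, or d1=-1 if d0<0 (d0 is trashed)
 add.l d0,d0     ; X = sign bit of d0
 subx.l d1,d1    ; d1 = -X (0 or -1)
 negx.l d0       ; X = 1 if d0 was non-zero
 addx.l d1,d1    ; d1 = 2*d1 + X

; quick-test to check if one byte of d0 is 00 (d0 and d1 are trashed)
 move.l d0,d1
 not.l d0              ; d0 = ~d0
 sub.l #$01010101,d1   ; d1 = original d0 - $01010101
 and.l #$80808080,d0   ; d0 = ~d0 & $80808080
 and.l d0,d1           ; non-zero if and only if some byte of the original d0 was 00
 bne null_byte_found

; check if a=b, (d0=a, d1=b, range 0000-7FFF), but true in all cases if b=$ffff
 eor.w d1,d0     ; 0 if equal, negative if b=$ffff
 bgt not_equal   ; taken only if the result is > 0

; instead of (valid when X holds the same value as C, e.g. after add/sub/shift) :
 scs d0
 ext.w d0  ; \ or extb.l d0 on 68020+
 ext.l d0  ; /
; write :
 subx.l d0,d0    ; d0 = -X, i.e. 0 or -1

; to see if a value is between $FFFF8000 and $7FFF, better put it in An reg :
 cmpa.w a0,a0  ; compares a0.w sign-extended to .l against a0.l : equal (beq) if in range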
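
As one possible use of the null-byte test, here is a rough sketch of scanning a string four bytes at a time. The register assignments and the label are made up for the example. It assumes a0 is word-aligned (needed for move.l on a plain 68000) and that reading up to three bytes past the terminator is harmless.
Code:
        move.l  #$01010101,d2   ; constants kept in registers for the loop
        move.l  #$80808080,d3
scan    move.l  (a0)+,d0        ; fetch the next 4 bytes
        move.l  d0,d1
        not.l   d0
        sub.l   d2,d1
        and.l   d3,d0
        and.l   d0,d1           ; non-zero if any of the 4 bytes was 00
        beq.s   scan
        subq.l  #4,a0           ; a0 = start of the longword holding the 00
A short byte-wise loop from a0 then finds the exact position of the terminator within that longword.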
Old 25 May 2012, 08:36   #63
Photon
Moderator
Join Date: Nov 2004
Location: Hult / Sweden
Age: 100
Posts: 4,024
Trying to optimize often leads to a few lines of strange new code that does it faster or shorter but looks irrelevant to the task. Even though this looks more like good code for implementing variable typing in a higher level language, I liked it.
Old 25 May 2012, 10:28   #64
pmc
rebooting...
Join Date: Apr 2007
Location: Elsewhere
Posts: 1,593
Quote:
Originally Posted by Photon
Trying to optimize often leads to a few lines of strange new code that does it faster or shorter but looks irrelevant to the task
So true. Leading to the weird experience of looking at some of your very own code and thinking: huh? what the hell was I doing that for?

Followed by a few minutes of groping through hazy memories and realising: oh, yeah, that's why.

It's another reason why I personally find it difficult to nigh on impossible to figure out other people's demo code. They did so many weird little things and optimisations that only they understood the reason for that I've got no chance.

Much easier to code your own effects from scratch than figure out how some other coder did it their way.
Old 25 May 2012, 12:58   #65
phx
Registered User

Join Date: Nov 2009
Location: Herford / Germany
Posts: 487
That's why most programming languages allow comments.
Old 28 May 2012, 09:40   #66
meynaf
68k wisdom
Join Date: Nov 2007
Location: Lyon (France)
Age: 41
Posts: 1,118
Quote:
Originally Posted by phx
That's why most programming languages allow comments.
I second that. Whenever I use a "trick" in asm, I add comments for each line.

But, of course, when you re-source a program, comments are gone.
Old 31 May 2012, 22:31   #67
Lonewolf10
AMOS Extensions Developer
Join Date: Jun 2007
Location: near Cambridge, UK
Age: 35
Posts: 1,300
Some great tips here

Anyone else think this thread is worthy of being made "sticky"?


Regards,
Lonewolf10
Old 04 June 2012, 09:25   #68
TheDarkCoder
Registered User
 
Join Date: Dec 2007
Location: Dark Kingdom
Posts: 141
I second that. But I would suggest moving the thread to the ASM section.
Old 07 June 2012, 08:47   #69
meynaf
68k wisdom
Join Date: Nov 2007
Location: Lyon (France)
Age: 41
Posts: 1,118
Perhaps there is a little bit too much OT here to do that, huh ?
Old 15 June 2012, 17:26   #70
pmc
rebooting...
Join Date: Apr 2007
Location: Elsewhere
Posts: 1,593
Instead of this:
Code:
                    cmpi.l              #num,Dn
this:
Code:
                    moveq.l             #num,Dn
                    cmp.l               Dn,Dn
where (to suit the moveq.l) num is in the range -128 to +127

Last edited by pmc; 16 June 2012 at 11:36. Reason: got Stung
Old 15 June 2012, 21:21   #71
Photon
Moderator
Photon's Avatar
 
Join Date: Nov 2004
Location: Hult / Sweden
Age: 100
Posts: 4,024
Yes, and one I use a lot is masking or subtracting stuff "to another register". When you do gfx stuff you often go

Code:
moveq #15,d1
and.w d0,d1
which retains the value in d0 should you need to mirror it or something.
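
To spell out the saving with a made-up case (x coordinate in d0, register choice is an assumption for the example): getting the bit position within a 16-pixel word while keeping x around for the word offset. Cycle counts are the standard 68000 timings.
Code:
; copy first, then mask: 12 cycles, 6 bytes
        move.w  d0,d1           ; 4 cycles
        and.w   #15,d1          ; 8 cycles
; masking "to another register": 8 cycles, 4 bytes
        moveq   #15,d1          ; 4 cycles
        and.w   d0,d1           ; 4 cycles, d1 = x & 15, d0 still = x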

Thread moved to asm forum and made sticky 8)
Old 15 June 2012, 23:23   #72
StingRay
move.l #$c0ff33,throat

Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 5,054
Quote:
Originally Posted by pmc
Instead of this:
Code:
                    cmpi.l              #num,Dn
this:
Code:
                    moveq.l             #num,Dn
                    cmp.l               Dn,Dn
where (to suit the moveq.l) num is in the range -127 to +127
moveq range is -128 to +127.
Old 16 June 2012, 11:35   #73
pmc
rebooting...
Join Date: Apr 2007
Location: Elsewhere
Posts: 1,593
Doh! Stung again!

Original post now edited.
Old 16 June 2012, 16:55   #74
TheDarkCoder
Registered User
 
Join Date: Dec 2007
Location: Dark Kingdom
Posts: 141
@pmc: you probably meant to keep the small immediate value in a register different to the one against which you do the cmp:

Code:
moveq.l #num,Dx
cmp.l Dx,Dn
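
For reference, the 68000 timings behind this, taken from the standard instruction timing tables (#100, d0 and d1 are just example choices):
Code:
        cmpi.l  #100,d0         ; 14 cycles, 6 bytes
; versus
        moveq   #100,d1         ; 4 cycles, 2 bytes
        cmp.l   d1,d0           ; 6 cycles, 2 bytes
Inside a loop the moveq can sit outside, leaving only the 6-cycle cmp.l per iteration, at the cost of a spare data register.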
Old 16 June 2012, 17:21   #75
pmc
rebooting...
Join Date: Apr 2007
Location: Elsewhere
Posts: 1,593
Yes. That was supposed to be implied but, as you say, it reads rather ambiguously. It's much clearer written your way.
Old 20 June 2012, 08:41   #76
pmc
rebooting...
Join Date: Apr 2007
Location: Elsewhere
Posts: 1,593
Here's an optimisation I worked out ages ago using shifting and adding instead of multiplying.

Maybe obvious or well known to others, or maybe not, but well... it might be useful to someone.

As an example, the eight binary digits in a byte represent the decimal numbers:

128 64 32 16 8 4 2 1

So, multiplying a number by shifting is pretty easy: shift left once to multiply a number by 2 for example.

Code:
                    lsl.w               #1,d0
or shift left by three to multiply a number by 8 etc. etc.

But what about multiplying by other numbers?

I've found that it's possible to do a couple of shifts and an add to multiply by other numbers.

For example, if I wanted to multiply a number by 40: in the binary digits above there's no 40, but there are a 32 and an 8.

And, as luck would have it, 32 + 8 = 40

So, I shift a number left by five (multiply by 32), take the same number and shift it left by three (multiply by 8), and then add the two results:

Code:
                    move.w              d0,d1
                    lsl.w               #3,d0
                    lsl.w               #5,d1
                    add.w               d1,d0
This gives the original number in d0 multiplied by 40, without having to use a comparatively slow mulu.w.

It can also work with more than two additions - if I wanted to multiply a number by 56:

Code:
                    move.w              d0,d1
                    move.w              d0,d2
                    lsl.w               #3,d0
                    lsl.w               #4,d1
                    lsl.w               #5,d2
                    add.w               d2,d1
                    add.w               d1,d0
This works for other numbers too - try it and see what works. Obviously there'll be a cut off somewhere where all the shifting and adding will mount up and it might be quicker or no different to use a mulu.w instead.

Also, you do have to watch out for shifting digits "off the end" of the size the registers in use can hold.
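
To illustrate the "off the end" caveat: 2000 * 40 = 80000 no longer fits in 16 bits, so the .w version above would silently truncate it. One possible way round it, assuming the input is an unsigned word in d0 and a 32-bit result is wanted, is to zero-extend first and work in longs (slower, since long shifts and adds cost more):
Code:
        and.l   #$ffff,d0       ; zero-extend the unsigned word
        lsl.l   #3,d0           ; d0 = number * 8
        move.l  d0,d1
        lsl.l   #2,d1           ; d1 = number * 32
        add.l   d1,d0           ; d0 = number * 40, full 32-bit result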
Old 20 June 2012, 08:52   #77
StingRay
move.l #$c0ff33,throat

Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 5,054
On 68000, instead of doing shifts/adds, you could also use a multiplication table. Disadvantage is that you might need a spare address register (depending on where in memory your table is) and that the table needs some memory of course. Advantage is that it is faster than lots of shifts+adds.

So, for example, if you want to multiply a number in d0 by 56 you'd do this:

Code:
lea multab(pc),a0       ; a0 = table of word products
add.w d0,d0             ; index * 2 (word entries)
move.w (a0,d0.w),d0     ; d0 = number * 56
...
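
The table itself isn't shown above. One way it could be built at startup, as a sketch with a made-up size of 256 entries (so it only covers inputs 0-255; word entries would overflow from input 1171 upwards). TABSIZE, the fill label and the register choices are assumptions for the example:
Code:
TABSIZE equ     256             ; hypothetical number of entries

        lea     multab(pc),a0
        moveq   #0,d0           ; running product, starts at 0 * 56
        moveq   #56,d1
        move.w  #TABSIZE-1,d7
fill    move.w  d0,(a0)+        ; multab[i] = i * 56
        add.w   d1,d0
        dbf     d7,fill

multab  ds.w    TABSIZE         ; kept in the same section so multab(pc) can reach it
The same 512 bytes could equally be assembled in with dc.w if startup time matters more than file size.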
Old 20 June 2012, 09:16   #78
pmc
rebooting...
Join Date: Apr 2007
Location: Elsewhere
Posts: 1,593
Oh yes, definitely - pre multiplying your values and just dragging them out of a table by index is better than doing calculations in the code but you know, it's nice to have options
Old 20 June 2012, 09:48   #79
StingRay
move.l #$c0ff33,throat

Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 5,054
Of course it is. Sometimes you don't even have the memory required for the table, for example. I just mentioned it for the sake of completeness.
Old 20 June 2012, 10:24   #80
Codetapper
Moderator

Join Date: May 2001
Location: Auckland / New Zealand
Age: 39
Posts: 2,856
If you want to multiply by "nice" numbers like 40 on a 68000 and not use a multiplication table, you should do the following to avoid a second slow lsl operation rather than pmc's method:

Code:
        lsl.w   #3,d0   ;d0 = Number * 8
        move.w  d0,d1   ;d1 = Number * 8
        add.w   d1,d1   ;d1 = Number * 16
        add.w   d1,d1   ;d1 = Number * 32
        add.w   d1,d0   ;d0 = Number * 40
If shifting left by 2 or less, it's quicker to use adds than a shift (on 68000 only!). If you wanted to multiply by 48, for example, it's actually quicker than multiplying by 40:

Code:
        lsl.w   #4,d0   ;d0 = Number * 16
        move.w  d0,d1   ;d1 = Number * 16
        add.w   d1,d1   ;d1 = Number * 32
        add.w   d1,d0   ;d0 = Number * 48
To multiply by 56:

Code:
        lsl.w   #3,d0   ;d0 = Number * 8
        move.w  d0,d1   ;d1 = Number * 8
        add.w   d1,d1   ;d1 = Number * 16
        move.w  d1,d2   ;d2 = Number * 16
        add.w   d2,d2   ;d2 = Number * 32
        add.w   d2,d1   ;d1 = Number * 48
        add.w   d1,d0   ;d0 = Number * 56
The other alternative is to use the fact that 56 is 64 - 8:

Code:
        lsl.w   #3,d0   ;d0 = Number * 8
        move.w  d0,d1   ;d1 = Number * 8
        lsl.w   #3,d0   ;d0 = Number * 64
        sub.w   d1,d0   ;d0 = Number * 56
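For shifts of 3 or more, use a shift operation rather than a run of adds: with word operands a shift by 3 takes the same 12 cycles as 3 adds but is shorter, and beyond that the shift is faster.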
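
The rule of thumb follows from the 68000 timings: lsl.w #n,Dn takes 6+2n cycles, while add.w Dn,Dn takes 4. As a further sketch along the same lines (an extra example, not one of the cases discussed above), a multiply by 10 needs no shift at all and comes to 20 cycles:
Code:
        add.w   d0,d0   ;d0 = Number * 2
        move.w  d0,d1   ;d1 = Number * 2
        add.w   d0,d0   ;d0 = Number * 4
        add.w   d0,d0   ;d0 = Number * 8
        add.w   d1,d0   ;d0 = Number * 10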

Last edited by Codetapper; 20 June 2012 at 10:44.