adda / suba Vs. lea

pmc · 13 May 2010, 09:50

Watcha boys

Very quick question:

On 68000 is it quicker to do, for example, this:

Code:

lea 10(a0),a0

instead of this:

Code:

 
adda.w #10,a0

...?

Asman · 13 May 2010, 10:27

Hi

lea 10(a0),a0 is faster than adda.w #10,a0.

Regards

StingRay · 13 May 2010, 10:48

Quote:

Originally Posted by Asman

lea 10(a0),a0 is faster than adda.w #10,a0.

Not true, both instructions are equally fast (8 cycles AFAIR).

Cosmos · 13 May 2010, 10:55

>both instructions are equally fast (8 cycles AFAIR)

And addq.w #8,a0 / addq.w #2,a0 on 68000 ?

StingRay · 13 May 2010, 11:04

4 cycles for both instructions, i.e. there is no difference (unlike the shift instructions, where you have to take the shift count into account, e.g. lsl.w #2,dx is faster than lsl.w #6,dx).

hitchhikr · 13 May 2010, 11:26

lea 10(a0),a0 is the fastest alternative; the 2 addqs are slightly slower & adda is the slowest.

Toni Wilen · 13 May 2010, 11:27

addq.l #x,An, addq.w #x,An (and subq) are 8 cycles. Data register destination: 4 cycles.

Word sized address register calculations are slower than data registers because of internal sign extension to long.

lea x(An),An is also 8 cycles.

StingRay · 13 May 2010, 11:36

Quote:

Originally Posted by Toni Wilen

addq.l #x,An, addq.w #x,An (and subq) are 8 cycles. Data register destination: 4 cycles.

addq.w #x,ax = 4 cycles according to my 68000 reference manual.

Quote:

Originally Posted by hitchhikr

lea 10(a0),a0 is the fastest alternative; the 2 addqs are slightly slower & adda is the slowest.

There is no difference between lea x(ax),ax and add.w #x,ax.

hitchhikr · 13 May 2010, 11:38

See for yourselves boys:

Quote:

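start:	move.l	4.w,a6			; a6 = ExecBase
	jsr	-132(a6)		; Forbid()
	sub.l	a0,a0
loop:	cmp.b	#80,$dff006		; wait for raster line 80
	bne.b	loop
	move.w	#$f00,$dff180		; background red: timed block starts
	rept	100
;	addq.w	#8,a0
;	addq.w	#2,a0
	lea	10(a0),a0
;	add.w	#10,a0
	endr
	move.w	#0,$dff180		; background black: timed block ends
	btst	#6,$bfe001		; left mouse button pressed?
	bne.w	loop			; no: run the test again
	move.l	4.w,a6
	jsr	-138(a6)		; Permit()
	moveq	#0,d0
	rts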

Toni Wilen · 13 May 2010, 11:40

Quote:

Originally Posted by StingRay

addq.w #x,ax = 4 cycles according to my 68000 reference manual.

That is wrong: it should be the same as subq (which is correct on the same page of the manual). A logic analyzer confirmed it too.

Asman · 13 May 2010, 11:42

Quote:

Originally Posted by StingRay

There is no difference between lea x(ax),ax and add.w #x,ax.

Hi

Hm... That means the documentation where I found that lea is faster than add.w probably has a bug. So what I need is a real test on a real 68000 machine.

Regards

Toni Wilen · 13 May 2010, 11:48

Quote:

Originally Posted by hitchhikr

See for yourselves boys:

Did you run it in real fast RAM? Chip RAM is a bad idea when counting cycles because of DMA cycles (especially the refresh slots, which can never be disabled).

addq/subq = 4 cpu cycles for instruction prefetch, 4 cpu idle cycles
lea = 2 * 4 cpu cycles for instruction prefetch, 0 idle cycles

You get very different results if other things use the bus, even if the instruction cycle length is exactly the same.

EDIT: addq/subq is "faster" on an Amiga because it uses less bus time.

pmc · 13 May 2010, 11:50

And there was me thinking there would be a simple answer...

So, basically, the official answer seems to be that there's no difference between the two (ie. 8 cycles each...)

Toni Wilen · 13 May 2010, 11:55

Nothing Amiga related is simple

addq/subq is the better choice because it only needs a single read cycle, vs lea that needs 2 read cycles (1 read, 1 idle vs 2 read, 0 idle).

"Faster" execution when there is simultaneous DMA activity.

hitchhikr · 13 May 2010, 11:58

Toni: same running conditions, and I don't think there is much difference between CHIP & FAST RAM on a plain 68000 based machine, is there? Stable results show that lea is faster than adda.

Cosmos · 13 May 2010, 12:01

>EDIT: addq/subq is "faster" on an Amiga because it uses less bus time
>addq/subq is better choice because it only needs single read cycle vs lea that needs 2 read cycles (1 read, 1 idle vs 2 read, 0 idle)

On 68000 ?

And on 68060 ??

Toni Wilen · 13 May 2010, 12:06

Quote:

Originally Posted by hitchhikr

Toni: same running conditions, and I don't think there is much difference between CHIP & FAST RAM on a plain 68000 based machine, is there? Stable results show that lea is faster than adda.

You mean adda.w or addq.w? I only talked about addq #x,An vs lea x(An),An. adda.w might be slower; I didn't check.

EDIT: adda.w #x,An (not 'q') seems to need 4 cycles more than lea x(An),An. (and if it is true, this is wrong in emulation)

With no DMA, chip RAM vs fast RAM has a tiny 4-cycle difference per scan line, but it can still cause surprising results in specific situations.

Quote:

On 68000 ?

Yes

Quote:

And on 68060 ??

No one cares because of big caches

TheDarkCoder · 15 May 2010, 09:49

Quote:

Originally Posted by Toni Wilen

You mean adda.w or addq.w? I only talked about addq #x,An vs lea x(An),An. adda.w might be slower; I didn't check.

EDIT: adda.w #x,An (not 'q') seems to need 4 cycles more than lea x(An),An. (and if it is true, this is wrong in emulation)

Indeed, in my copy of the Motorola 68000 manual:
adda.w #d,Ax is 12 cycles (2 bus reads)
lea d(Ax),Ax is 8 cycles (2 bus reads)

addq.w #x,Ax is 4 cycles (1 bus read), whilst subq is 8 cycles (1 bus read),
but then there is a handwritten note (by a version of myself who lived a long time ago) saying that the timing for addq #d,Ax is wrong, because experimentally I discovered that it is the same as subq #d,Ax.

Interestingly, the manual says that for the 68010 both addq.w #x,Ax and subq.w #x,Ax are 4 cycles (1 bus read). I don't have a 68010 to test if this is true.

The 68010 is the only main CPU model used on the Amiga that is missing from my collection!
(not counting different versions like LC or EC etc.)
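
Pulling the figures quoted in this thread together, the candidates for adding 10 to a0 on a plain 68000 can be annotated as in the sketch below. The cycle counts are the ones reported by the posters above, with addq.w to an address register taken as 8 cycles (the logic analyzer result) rather than the 4 printed in the manual. Nothing here is re-measured, and DMA contention is ignored.

```m68k
; Three alternative ways to add 10 to a0 on a 68000.
; Use ONE of them; timings are the figures quoted in this thread.

	lea	10(a0),a0	; 8 cycles, 2 bus reads

	adda.w	#10,a0		; 12 cycles, 2 bus reads

	addq.w	#8,a0		; 8 cycles, 1 bus read
	addq.w	#2,a0		; 8 cycles, 1 bus read (16 cycles for the pair)
```

On these figures lea comes out ahead once the constant needs more than one addq. For constants from 1 to 8, a single addq or subq takes the same 8 cycles as lea but with one bus read fewer.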

Hewitson · 04 June 2010, 13:37

Got one here if you would like me to do some tests.

Thorham · 04 June 2010, 17:28

If this has to be done in a loop, then how about this?

Code:

	moveq	#10,d0
.loop
	add.w	d0,a0
	...
	...
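
A minimal sketch of how that suggestion might look as a complete loop. The label stride_loop, the counter register d7, the iteration count of 16 and the move.w loop body are all placeholders made up for illustration; only the moveq/add.w pair comes from the post above. The timing in the comment is the Motorola table figure for adda.w Dn,An as far as I recall, not something measured in this thread.

```m68k
stride_loop:
	moveq	#10,d0		; stride, loaded once outside the loop
	moveq	#16-1,d7	; placeholder count (dbf runs count+1 times)
.loop
	move.w	(a0),d1		; placeholder loop body
	adda.w	d0,a0		; a0 += 10 (8 cycles, 1 bus read)
	dbf	d7,.loop
	rts
```

The trade-off is that this ties up a data register for the whole loop, in exchange for an add that needs no extension word.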


