13 May 2010, 09:50 | #1 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
adda / suba Vs. lea
Watcha boys
Very quick question: On 68000 is it quicker to do, for example, this:
Code:
lea 10(a0),a0
or this:
Code:
adda.w #10,a0 |
13 May 2010, 10:27 | #2 |
68k
Join Date: Sep 2005
Location: Somewhere
Posts: 828
|
Hi
lea 10(a0),a0 is faster than adda.w #10,a0. Regards |
13 May 2010, 10:48 | #3 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
|
|
13 May 2010, 10:55 | #4 |
Banned
Join Date: Jan 2007
Location: France
Posts: 655
|
>both instructions are equally fast (8 cycles AFAIR)
And addq.w #8,a0 / addq.w #2,a0 on 68000 ? |
13 May 2010, 11:04 | #5 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
|
4 cycles for both instructions, i.e. there is no difference (unlike the shift instructions where you have to take the shift count into account, i.e. lsl.w #2,dx is faster than lsl.w #6,dx).
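To put numbers on the shift example: going by the 68000 timing tables, a word-sized register shift with an immediate count n takes 6 + 2n cycles, so the count does matter. A minimal illustration (d0 is just an example register, not from the post):
Code:
lsl.w #2,d0 ; 6 + 2*2 = 10 cycles
lsl.w #6,d0 ; 6 + 2*6 = 18 cycles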
|
13 May 2010, 11:26 | #6 |
Registered User
Join Date: Jun 2008
Location: somewhere else
Posts: 511
|
lea 10(a0),a0 is the fastest alternative, the 2 addqs are slightly slower, and adda is the slowest.
|
13 May 2010, 11:27 | #7 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,506
|
addq.l #x,An, addq.w #x,An (and subq) are 8 cycles. Data register destination: 4 cycles.
Word sized address register calculations are slower than data registers because of internal sign extension to long. lea x(An),An is also 8 cycles. |
13 May 2010, 11:36 | #8 | |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
|
Quote:
There is no difference between lea x(ax),ax and add.w #x,ax. |
|
13 May 2010, 11:38 | #9 | |
Registered User
Join Date: Jun 2008
Location: somewhere else
Posts: 511
|
See for yourselves boys:
Quote:
|
|
13 May 2010, 11:40 | #10 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,506
|
|
13 May 2010, 11:42 | #11 |
68k
Join Date: Sep 2005
Location: Somewhere
Posts: 828
|
|
13 May 2010, 11:48 | #12 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,506
|
Did you run it in real fast RAM? Chip RAM is a bad idea when calculating cycles because of DMA cycles (especially refresh slots, which can never be disabled).
addq/subq = 4 CPU cycles for instruction prefetch, 4 CPU idle cycles
lea = 2 * 4 CPU cycles for instruction prefetch, 0 idle cycles
Very different results if other things use the bus, even if the instruction cycle length is exactly the same.
EDIT: addq/subq is "faster" on an Amiga because it uses less bus time. |
13 May 2010, 11:50 | #13 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
And there was me thinking there would be a simple answer...
So, basically, the official answer seems to be that there's no difference between the two (i.e. 8 cycles each...) |
13 May 2010, 11:55 | #14 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,506
|
Nothing Amiga related is simple
addq/subq is better choice because it only needs single read cycle vs lea that needs 2 read cycles (1 read, 1 idle vs 2 read, 0 idle).
"Faster" execution when there is simultaneous DMA activity. |
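A minimal sketch of that comparison, assuming a step of 8 (addq/subq only accept immediates from 1 to 8) and using the cycle figures quoted in this thread:
Code:
addq.w #8,a0 ; one opcode word: 1 bus read, then 4 idle cycles (8 in total)
lea 8(a0),a0 ; opcode + displacement word: 2 bus reads, 0 idle cycles (8 in total)
The cycle count is the same, but addq keeps the CPU off the bus for half of that time, which is what matters when DMA is competing for chip RAM.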
13 May 2010, 11:58 | #15 |
Registered User
Join Date: Jun 2008
Location: somewhere else
Posts: 511
|
Toni: same running conditions, and I don't think there is much difference between chip and fast RAM on a plain 68000-based machine, is there? Stable results show that lea is faster than adda.
|
13 May 2010, 12:01 | #16 |
Banned
Join Date: Jan 2007
Location: France
Posts: 655
|
>EDIT: addq/subq is "faster" on an Amiga because it uses less bus time
>addq/subq is better choice because it only needs single read cycle vs lea that needs 2 read cycles (1 read, 1 idle vs 2 read, 0 idle)
On 68000? And on 68060? |
13 May 2010, 12:06 | #17 | |||
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,506
|
Quote:
EDIT: adda.w #x,An (not 'q') seems to need 4 cycles more than lea x(An),An (and if that is true, this is wrong in emulation).
With no DMA, chip RAM vs fast RAM has a tiny 4 cycle difference per scan line, but it can still cause surprising results in specific situations.
Quote:
Quote:
Last edited by Toni Wilen; 13 May 2010 at 12:17. |
|||
15 May 2010, 09:49 | #18 | |
Registered User
Join Date: Dec 2007
Location: Dark Kingdom
Posts: 213
|
Quote:
adda.w #d,Ax is 12 cycles (2 bus read)
lea d(Ax),Ax is 8 cycles (2 bus read)
addq #x,Ax is 4 cycles (1 bus read), whilst subq is 8 cycles (1 bus read)
But then there is a handwritten note (by a version of myself who lived a long time ago) saying that the timing for addq #d,Ax is wrong, because experimentally I discovered that it is the same as subq #d,Ax.
Interestingly, the manual says that for the 68010 both addq #x,Ax and subq #x,Ax are 4 cycles (1 bus read). I don't have a 68010 to test if this is true. The 68010 is the only main CPU model used on the Amiga that is missing from my collection (not counting different versions like LC or EC etc.)! |
|
04 June 2010, 13:37 | #19 |
Registered User
Join Date: Feb 2007
Location: Melbourne, Australia
Age: 41
Posts: 3,772
|
Got one here if you would like me to do some tests.
|
04 June 2010, 17:28 | #20 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
|
If this has to be done in a loop, then how about this?
Code:
moveq #10,d0 ; load the step once, outside the loop
.loop
add.w d0,a0 ; advance a0 by the step held in d0
...
... |
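A slightly fuller sketch of the same idea, assuming a hypothetical loop counter in d7 driven by dbf (neither is in the post above). With the step preloaded in a data register, the add is a single opcode word, which by the 68000 timing tables is 8 cycles with 1 bus read:
Code:
moveq #10,d0 ; step value, loaded once outside the loop
moveq #99,d7 ; hypothetical: 100 iterations (dbf runs until d7 reaches -1)
.loop
adda.w d0,a0 ; a0 += 10: 8 cycles, 1 bus read
; ... loop body goes here ...
dbf d7,.loop ; decrement d7 and branch back while it is not -1
The trade-off is that d0 stays tied up for the duration of the loop.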