English Amiga Board


Old 19 June 2016, 13:48   #1
Galahad/FLT
Going nowhere
 
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 8,987
68000 optimisation

Need to optimize something, but don't want to waste a lot of effort on parts that won't yield proper benefits.

Code in question is lots of ADDA.L #$X,Ax, especially in loops.

Am I going to get much of a saving for the processor if I change them all to
LEA x(Ax),Ax instead?
Old 19 June 2016, 14:46   #2
ReadOnlyCat
Code Kitten
 
Join Date: Aug 2015
Location: Montreal/Canadia
Age: 52
Posts: 1,178
Quote:
Originally Posted by Galahad/FLT
Need to optimize something, but don't want to waste a lot of effort on parts that won't yield proper benefits.

Code in question is lots of ADDA.L #$X,Ax, especially in loops.

Am I going to get much of a saving for the processor if I change them all to
LEA x(Ax),Ax instead?
According to http://oldwww.nvg.ntnu.no/amiga/MC68...timjmpetc.HTML, LEA would take the same number of cycles in this case, that is 8.
Even ADDQ is not faster on address kittens but it would be on data registers (4 cycles).

What kind of values do you add to these registers? Maybe there is a way to use (Ax)+ instead? It would help if you had an example with the initial setting of the data registers and the loop.
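
To illustrate the (Ax)+ idea, here is a minimal sketch, assuming the constant being added is just the size of the element that was accessed. The registers (a0 = source, a1 = destination, d7 = word count minus one) and both labels are made up for the example, and the cycle counts are the ones listed in Motorola's 68000 timing tables with no wait states or DMA contention.

```asm
; Variant A: access, then bump each pointer with a separate instruction.
slow_loop:
        move.w  (a0),d0         ; 8 cycles
        addq.l  #2,a0           ; 8 cycles
        move.w  d0,(a1)         ; 8 cycles
        addq.l  #2,a1           ; 8 cycles
        dbf     d7,slow_loop

; Variant B (use instead of A): post-increment bumps the pointers for free.
fast_loop:
        move.w  (a0)+,(a1)+     ; 12 cycles, both pointers advanced by 2
        dbf     d7,fast_loop
```

This only helps when the stride matches the operand size (1, 2 or 4 bytes); for larger strides the pointer still needs an explicit add.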
Old 19 June 2016, 15:03   #3
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,960
Quote:
Originally Posted by ReadOnlyCat
According to http://oldwww.nvg.ntnu.no/amiga/MC68...timjmpetc.HTML, LEA would take the same number of cycles in this case, that is 8.
Even ADDQ is not faster on address kittens but it would be on data registers (4 cycles).

What kind of values do you add to these registers? Maybe there is a way to use (Ax)+ instead? It would help if you had an example with the initial setting of the data registers and the loop.
Lea is faster than adda.l on the 68000.
He can use add.w, which is the same speed as lea, or addq.l, which is the shortest code.
Old 19 June 2016, 15:19   #4
ReadOnlyCat
Code Kitten
 
Join Date: Aug 2015
Location: Montreal/Canadia
Age: 52
Posts: 1,178
Quote:
Originally Posted by Don_Adan
Lea is faster than adda.l on the 68000.
He can use add.w, which is the same speed as lea, or addq.l, which is the shortest code.
Are you sure? Both are listed as 8 cycles on the site I linked to.
Old 19 June 2016, 15:38   #5
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,507
add.l #x,reg can't be 8 cycles. Count the number of memory fetches needed.
Old 19 June 2016, 16:06   #6
ReadOnlyCat
Code Kitten
 
Join Date: Aug 2015
Location: Montreal/Canadia
Age: 52
Posts: 1,178
Quote:
Originally Posted by Toni Wilen
add.l #x,reg can't be 8 cycles. Count the number of memory fetches needed.
Damn, what a dummy. I read the base cycle count but forgot to add the effective address calculation time on top of the operation timing.
Indeed, just fetching the 32-bit immediate operand takes 8 cycles by itself.

Lesson of the day: do not surf the EAB before a morning shower.
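
For reference, a rough side-by-side of the candidates once the operand fetch is counted. The timings are cycles(reads/writes) as listed in Motorola's 68000 timing tables, assuming no wait states and no DMA contention. The register a0 and the constants are arbitrary picks for illustration.

```asm
; Adding a small constant to a0 on a plain 68000.
        adda.l  #32,a0          ; 16(3/0), 6 bytes: opcode + two immediate words
        adda.w  #32,a0          ; 12(2/0), 4 bytes: word immediate, sign-extended
        lea     32(a0),a0       ;  8(2/0), 4 bytes: d16 covers -32768..32767
        addq.l  #8,a0           ;  8(1/0), 2 bytes: constants 1..8 only
```

So lea halves the cost of adda.l #imm whenever the constant fits in a signed 16-bit displacement, and addq matches lea in cycles while being shorter and needing one memory read fewer.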
Old 19 June 2016, 21:38   #7
Asman
68k
 
Join Date: Sep 2005
Location: Somewhere
Posts: 828
@Galahad/FLT
Please post more code lines.
If you have a spare dx register, then I would use moveq #x,dx before the loop and, inside the loop,
add.l dx,a0
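
A minimal sketch of that idea, assuming the stride fits moveq's range of -128..127. STRIDE, d1, d7 and the move.w in the loop body are placeholders, not taken from Galahad's code.

```asm
STRIDE  equ     40              ; hypothetical stride in bytes

        moveq   #STRIDE,d1      ; 4 cycles, paid once outside the loop
        moveq   #16-1,d7        ; dbf counter for 16 iterations
.loop:
        move.w  d0,(a0)         ; placeholder for the real per-step work
        adda.l  d1,a0           ; 8(1/0), 2 bytes: no extension word to fetch
        dbf     d7,.loop
```

Per the Motorola timing tables this costs the same 8 cycles as lea d16(An),An, but with a single memory read and a 2-byte encoding, at the price of tying up a data register.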
Old 17 July 2016, 17:57   #8
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
On 68000, there's no faster way than lea d16(An),An.

Not even addq (but you will save 2 bytes of code).
Old 17 July 2016, 19:03   #9
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,507
addq.l #x,an can be faster than lea because it is a memory cycle + 2x idle cycle combination (lea is 2x memory cycle), so DMA can use the second cycle without slowing down the CPU.

Last edited by Toni Wilen; 17 July 2016 at 19:10.
Old 20 August 2016, 00:29   #10
Photon
Moderator
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,602
Quote:
Originally Posted by Toni Wilen
addq.l #x,an can be faster than lea because it is a memory cycle + 2x idle cycle combination (lea is 2x memory cycle), so DMA can use the second cycle without slowing down the CPU.
Of course. However, you will need more than 4 bitplanes on, or a combination of bitplanes and the eccentric BLTPRI=0 mode (or equally eccentric minterms). Otherwise you won't get memory access (MA) cycles untimely stolen-/granted-or-not, and the CPU is simply either locked out or not.

Lea is a normal MA for the instruction and an MA for the offset, while addq is a normal MA for the instruction (barring prefetch), followed by a 2 cycle internal operation which affects nothing but the CPU internal state. Naturally you should cut down on MA where possible, but correct blits and not hampering the CPU is the larger optimization.

I was about to post something about this doublespeed addq anomaly in WinUAE vs. real Amiga, but I tried it in emu now and saw you fixed that. Good work!

Last edited by Photon; 20 August 2016 at 00:35.